feat: log cluster module IPC via onelogger#119
Conversation
BREAKING CHANGE: drop Node.js < 18.19.0 support part of eggjs/egg#3644 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes for @eggjs/cluster v3.0.0-beta.4 - **New Features** - Migrated project to TypeScript. - Added support for Node.js 18.19.0, 20, 22, and 23. - Enhanced type safety and module exports. - Improved worker thread and process management. - Introduced new error handling classes for better debugging. - **Breaking Changes** - Renamed package from `egg-cluster` to `@eggjs/cluster`. - Updated import/export syntax to ES modules. - Minimum Node.js version is now 18.19.0. - **Performance Improvements** - Refactored cluster and worker management. - Optimized error handling and logging. - **Bug Fixes** - Resolved various edge cases in worker initialization. - Improved graceful shutdown mechanisms. - **Documentation** - Updated README with new package name and usage examples. - Added TypeScript and ESM import examples. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
[skip ci] ## [3.0.0](v2.4.0...v3.0.0) (2024-12-28) ### ⚠ BREAKING CHANGES * drop Node.js < 18.19.0 support part of eggjs/egg#3644 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes for @eggjs/cluster v3.0.0-beta.4 - **New Features** - Migrated project to TypeScript. - Added support for Node.js 18.19.0, 20, 22, and 23. - Enhanced type safety and module exports. - Improved worker thread and process management. - Introduced new error handling classes for better debugging. - **Breaking Changes** - Renamed package from `egg-cluster` to `@eggjs/cluster`. - Updated import/export syntax to ES modules. - Minimum Node.js version is now 18.19.0. - **Performance Improvements** - Refactored cluster and worker management. - Optimized error handling and logging. - **Bug Fixes** - Resolved various edge cases in worker initialization. - Improved graceful shutdown mechanisms. - **Documentation** - Updated README with new package name and usage examples. - Added TypeScript and ESM import examples. <!-- end of auto-generated comment: release notes by coderabbit.ai --> ### Features * support cjs and esm both by tshy ([#117](#117)) ([e15a4bf](e15a4bf))
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes - **Dependencies** - Updated `@eggjs/utils` package to version 4.2.1 - **Enhancements** - Improved module loading mechanism with new `baseDir` option - Added support for specifying module search paths during import - **Testing** - Enhanced agent worker test suite with additional module injection - Added module execution verification in tests - **Deprecation Notice** - HTTPS configuration now recommends object-based format with `key` and `cert` <!-- end of auto-generated comment: release notes by coderabbit.ai -->
[skip ci] ## [3.0.1](v3.0.0...v3.0.1) (2024-12-30) ### Bug Fixes * require support paths ([#118](#118)) ([b74872c](b74872c))
Add structured logging on every master<->app worker IPC point in Node.js cluster mode (process mode only). Covers both application-layer messages and the internal NODE_CLUSTER fd dispatch / fd ack. Application layer (always on): - master->app send (AppProcessWorker#send) - master->app sticky-session Socket direct send (master.ts) - master<-app recv (cluster.on 'fork' -> worker.on 'message') - app->master send (static AppProcessWorker.send) - app<-master recv (process.on 'message' in app_worker.ts) Internal cluster layer (opt-in via EGG_CLUSTER_IPC_LOG=1, very verbose): - master<-app internal (worker.process.on 'internalMessage'): online / listening / queryServer / accepted (fd ack) / close - app<-master internal (process.on 'internalMessage'): newconn (fd dispatch, with handle) / disconnect / suicide Users can replace the default console sink via onelogger.setLogger(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis pull request migrates the Changes
Sequence DiagramsequenceDiagram
participant Client
participant startCluster as startCluster()
participant Master
participant OptionParser
participant WorkerManager
participant Messenger
participant ProcessWorker as App/Agent Worker
Client->>startCluster: startCluster(options)
startCluster->>Master: new Master(options)
Master->>OptionParser: parseOptions(options)
OptionParser-->>Master: ParsedClusterOptions (with framework, baseDir, etc.)
Master->>Master: `#start`() async initialization
Master->>WorkerManager: create WorkerManager instance
Master->>Messenger: create Messenger(master, workerManager)
par Worker Spawning
Master->>ProcessWorker: fork/spawn AppWorker(s)
Master->>ProcessWorker: fork/spawn AgentWorker
end
ProcessWorker->>ProcessWorker: importModule(framework)
ProcessWorker->>ProcessWorker: instantiate Application/Agent
ProcessWorker->>ProcessWorker: app.ready() / agent.ready()
ProcessWorker->>Messenger: send agent-start / app-start message
Messenger->>Master: deliver IPC message
Master->>WorkerManager: track worker status
Master->>Master: emit ready()
Master-->>startCluster: Promise resolved
startCluster-->>Client: startup complete
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request migrates the egg-cluster package to TypeScript, renames it to @eggjs/cluster, and updates the minimum Node.js version requirement to 18.19.0. The refactoring transitions the codebase to ES modules and organizes cluster management logic into TypeScript classes. Review feedback highlights a regression in AppThreadWorker where the exitCode is hardcoded to 0, which could mask actual failure states. Additionally, the review suggests several technical improvements, including specifying a radix for parseInt calls to ensure consistent behavior, addressing uninitialized properties in WorkerManager to satisfy strict TypeScript rules, and optimizing digit detection logic in the stickyWorker method.
| return 0; | ||
| // return this.instance.exitCode; | ||
| } | ||
|
|
There was a problem hiding this comment.
| agent: BaseAgentWorker | null; | ||
| workers = new Map<number, BaseAppWorker>(); | ||
| exception = 0; | ||
| timer: NodeJS.Timeout; |
There was a problem hiding this comment.
| let s = ''; | ||
| for (let i = 0; i < ip.length; i++) { | ||
| if (!isNaN(ip[i])) { | ||
| if (!isNaN(parseInt(ip[i]))) { |
| const appTimeout = parseInt(process.env.EGG_APP_CLOSE_TIMEOUT || legacyTimeout); | ||
| const agentTimeout = parseInt(process.env.EGG_AGENT_CLOSE_TIMEOUT || legacyTimeout); |
There was a problem hiding this comment.
Always specify a radix (e.g., 10) when using parseInt to avoid potential ambiguity and ensure consistent behavior across different environments.
| const appTimeout = parseInt(process.env.EGG_APP_CLOSE_TIMEOUT || legacyTimeout); | |
| const agentTimeout = parseInt(process.env.EGG_AGENT_CLOSE_TIMEOUT || legacyTimeout); | |
| const appTimeout = parseInt(process.env.EGG_APP_CLOSE_TIMEOUT || legacyTimeout, 10); | |
| const agentTimeout = parseInt(process.env.EGG_AGENT_CLOSE_TIMEOUT || legacyTimeout, 10); |
| } | ||
|
|
||
| if (options.port && typeof options.port === 'string') { | ||
| options.port = parseInt(options.port); |
| } | ||
|
|
||
| if (options.workers && typeof options.workers === 'string') { | ||
| options.workers = parseInt(options.workers); |
| let childrenPids: number[] = []; | ||
| try { | ||
| const children = await pstree(pid); | ||
| childrenPids = children!.map(c => parseInt(c.PID)); |
There was a problem hiding this comment.
Pull request overview
This PR adds structured, opt-in/always-on IPC logging for master↔worker communication in cluster process mode, using onelogger as the logging facade, and updates the codebase/tooling toward TS/ESM + refreshed tests/fixtures to support the new behavior.
Changes:
- Introduce
ipc_loggerutilities and instrument master/app-worker IPC send/recv paths (plus optional internalNODE_CLUSTERtraffic viaEGG_CLUSTER_IPC_LOG). - Refactor core cluster runtime to TypeScript/ESM-oriented structure (new
src/*entrypoints, base worker abstractions, updated terminate/options). - Migrate/adjust tests and fixtures to TS + new mocks/utilities and updated behaviors.
Reviewed changes
Copilot reviewed 62 out of 64 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tsconfig.json | Adds TS config (strict, NodeNext) for the repo. |
| test/worker_threads.test.ts | Migrates test to ESM/TS and new cluster() helper. |
| test/utils.ts | New TS test helper for cluster + fixture paths. |
| test/utils.js | Removes legacy CJS test helper. |
| test/options.test.ts | Migrates options tests to TS/ESM and async parseOptions(). |
| test/options.test.js | Removes legacy CJS options tests. |
| test/master.test.ts | Migrates master tests to TS/ESM and updates timings/requests. |
| test/https.test.ts | Migrates HTTPS tests to TS/ESM and updated cluster helper usage. |
| test/fixtures/egg/index.js | Adjusts fixture framework exports to satisfy new framework validation. |
| test/fixtures/apps/worker-close-timeout/app.js | Replaces custom sleep with timers/promises wait. |
| test/fixtures/apps/worker-close-timeout/agent.js | Replaces custom sleep with timers/promises wait. |
| test/fixtures/apps/script-start/start-server.js | Replaces custom sleep with timers/promises wait. |
| test/fixtures/apps/options/config/plugin.js | Adds fixture plugin disables for deterministic tests. |
| test/fixtures/apps/options/app.js | Removes strict mode boilerplate; keeps fixture behavior. |
| test/fixtures/apps/options/agent.js | Removes strict mode boilerplate; keeps fixture behavior. |
| test/fixtures/apps/framework/lib/framework.js | Updates fixture loader to async loadConfig(). |
| test/fixtures/apps/framework/lib/agent.js | Simplifies fixture agent import pattern. |
| test/fixtures/apps/framework/index.js | Updates fixture framework exports. |
| test/fixtures/apps/before-close/app.js | Replaces custom sleep with timers/promises wait. |
| test/fixtures/apps/before-close/agent.js | Replaces custom sleep with timers/promises wait. |
| test/fixtures/apps/app-listen-without-port/config/config.default.js | Removes strict mode boilerplate in fixture config. |
| test/fixtures/apps/app-listen-port/config/config.default.js | Removes strict mode boilerplate in fixture config. |
| test/fixtures/apps/app-listen-port/app/router.js | Updates fixture to read ctx.app.options.port. |
| test/fixtures/apps/agent-debug-port/inject1.js | Adds injection fixture used by updated tests. |
| test/fixtures/apps/agent-debug-port/agent.js | Removes strict mode boilerplate. |
| test/app_worker.test.ts | Migrates app worker tests to TS/ESM and updates expectations/timing. |
| test/agent_worker.test.ts | Migrates agent worker tests to TS/ESM and updates expectations. |
| src/utils/worker_manager.ts | Refactors manager to typed WorkerManager with worker listing helpers. |
| src/utils/terminate.ts | New TS terminate helper using @fengmk2/ps-tree. |
| src/utils/options.ts | New async TS parseOptions() + typings + validation. |
| src/utils/mode/impl/worker_threads/app.ts | TS/ESM refactor of worker_threads app worker utils. |
| src/utils/mode/impl/worker_threads/agent.ts | TS/ESM refactor of worker_threads agent worker utils. |
| src/utils/mode/impl/process/app.ts | New process-mode app worker utils + IPC logging hooks. |
| src/utils/mode/impl/process/agent.ts | New process-mode agent worker utils. |
| src/utils/mode/base/app.ts | New base abstractions for app workers/utils. |
| src/utils/mode/base/agent.ts | New base abstractions for agent workers/utils. |
| src/utils/messenger.ts | TS refactor + message typing + routing tweaks. |
| src/utils/ipc_logger.ts | Implements onelogger-backed IPC formatter + safe stringification. |
| src/master.ts | Major TS/ESM refactor; integrates new worker utils and IPC logging for sticky dispatch. |
| src/index.ts | New TS entrypoint exporting startCluster and public types. |
| src/error/index.ts | Central export for new error types. |
| src/error/ClusterWorkerExceptionError.ts | Introduces typed exception error for unhealthy cluster state. |
| src/error/ClusterAgentWorkerError.ts | Introduces typed wrapper for agent worker errors. |
| src/dirname.ts | Helper to resolve src dirname under ESM/CJS contexts. |
| src/app_worker.ts | New TS app worker runtime + process-mode IPC recv logging and internalMessage logging. |
| src/agent_worker.ts | New TS agent worker runtime. |
| package.json | Renames package, updates engines/exports, adds onelogger, updates scripts/build. |
| lib/utils/timer.js | Removes legacy CJS sleep helper. |
| lib/utils/terminate.js | Removes legacy CJS terminate implementation. |
| lib/utils/options.js | Removes legacy CJS options parser. |
| lib/utils/mode/impl/process/app.js | Removes legacy CJS process app utils. |
| lib/utils/mode/impl/process/agent.js | Removes legacy CJS process agent utils. |
| lib/utils/mode/base/app.js | Removes legacy CJS base app abstractions. |
| lib/utils/mode/base/agent.js | Removes legacy CJS base agent abstractions. |
| lib/app_worker.js | Removes legacy CJS app worker runtime. |
| lib/agent_worker.js | Removes legacy CJS agent worker runtime. |
| index.js | Removes legacy CJS entrypoint. |
| README.md | Updates docs for new package name and async/ESM usage. |
| CHANGELOG.md | Adds v3.x release notes content. |
| .gitignore | Ignores new build artifacts/cache output. |
| .github/workflows/pkg.pr.new.yml | Adds pkg-pr-new workflow to publish preview builds. |
| .github/workflows/nodejs.yml | Updates Node test matrix to 18.19+ and newer versions. |
| .github/PULL_REQUEST_TEMPLATE.md | Removes old PR template. |
| .eslintrc | Switches to TS-focused eslint config + node: prefix rule. |
Comments suppressed due to low confidence (6)
src/master.ts:421
- In onAppExit(), the log/error message uses
worker.exitCodeinstead of thecodeprovided in the exit event payload (data.code). For process workers,worker.instance.process.exitCodecan benullor differ from the actualcodeargument, which makes the reported exit code unreliable. Usedata.code(and consider falling back toworker.exitCodeonly if needed).
src/utils/mode/impl/worker_threads/agent.ts:85 worker_threadsWorker emitsexitwith only the exit code; it does not provide asignalargument. The current listener signature(code, signal)and forwardingsignalto master will produceundefinedat runtime and misleads the type system. Update the listener to accept onlycodeand either omitsignalin the message or set it explicitly (e.g.,null/ empty string) and adjust the master-side types accordingly.
src/utils/mode/impl/worker_threads/app.ts:129- The worker_threads
exitevent only provides an exit code;signalis not available. Theapp-exitmessage emitted here omitssignal, but master currently typessignalas required and uses it for debug logic/log formatting. Consider includingsignal: null/''explicitly (or updating the master handler/types to makesignaloptional) to avoidundefinedvalues and inconsistent behavior between modes.
src/utils/mode/impl/worker_threads/app.ts:75 gracefulExit()hooksprocess.on('exit', async ...)and awaitsbeforeExit(). Node'sexitevent does not allow asynchronous work to complete (the event loop is already shutting down), sobeforeExit()promises will not reliably run to completion. Consider usingbeforeExit/signal handlers (orgraceful-process) instead, and avoid callingprocess.exit()from inside anexithandler.
src/utils/mode/impl/worker_threads/agent.ts:35gracefulExit()usesprocess.on('exit', async ...)and awaitsbeforeExit(), but theexitevent cannot perform async cleanup reliably. This can causebeforeExit()(e.g.agent.close()) to be skipped/truncated in worker_threads mode. PreferbeforeExit/signal handlers (or reusegraceful-processif supported) and avoid callingprocess.exit()from inside anexithandler.
src/utils/mode/impl/worker_threads/app.ts:39AppThreadWorker.exitCodeis currently hard-coded to0, which means master logs and any exit handling will always report a successful exit even when the thread exits with a non-zero code. Persist the actual exit code from the Workerexitevent (e.g., store it on the wrapper) and return that here.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (options.port && typeof options.port === 'string') { | ||
| options.port = parseInt(options.port); | ||
| } | ||
| if (options.port === null) { | ||
| options.port = undefined; | ||
| } |
There was a problem hiding this comment.
parseOptions() parses string ports with parseInt(), but does not handle NaN results. If a non-numeric string is provided, options.port becomes NaN (type number), which will bypass typeof port !== 'number' checks later and can cause server.listen(NaN) failures. After parsing, normalize NaN to undefined and/or validate that the port is a finite integer.
| function getUnterminatedProcesses(pids: number[]) { | ||
| return pids.filter(pid => { | ||
| try { | ||
| // success means it's still alive | ||
| process.kill(pid, 0); | ||
| return true; | ||
| } catch (err) { | ||
| // error means it's dead | ||
| debug('kill %s error: %s, it still alive', pid, err); | ||
| return false; | ||
| } |
There was a problem hiding this comment.
In getUnterminatedProcesses(), the comment says an error from process.kill(pid, 0) means the process is dead, but the debug log message says "it still alive". Also, kill(pid, 0) can throw EPERM even when the process is alive; treating all errors as "dead" can leave processes unterminated. Fix the debug message and consider handling EPERM as "still alive" (and other errors like ESRCH as "dead").
| "engines": { | ||
| "node": ">= 14.0.0" | ||
| } | ||
| "node": ">= 18.19.0" | ||
| }, | ||
| "type": "module", | ||
| "tshy": { |
There was a problem hiding this comment.
PR description/test plan mentions running builds/tests on Node 10/16, but this change sets engines.node to >= 18.19.0 and CI matrix to 18+/23. Please update the PR description/test plan to match the supported Node versions (or adjust engines/CI if Node 10/16 support is still intended).
There was a problem hiding this comment.
Actionable comments posted: 15
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
test/master.test.ts (1)
318-324:⚠️ Potential issue | 🟡 MinorType mismatch:
alive()expectsnumberbut receivesstring | undefined.The
alivefunction signature at line 984 expects anumber, but it's called with values extracted from regex matches which arestring | undefined. This will cause the function to always returnfalse(or throw) becauseprocess.kill()coerces its first argument.🔧 Proposed fix
-function alive(pid: number) { +function alive(pid: string | number | undefined) { + if (pid === undefined) return false; try { // success means it's still alive - process.kill(pid, 0); + process.kill(Number(pid), 0); return true; } catch (err) { // error means it's dead return false; } }Also applies to: 352-357, 984-993
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/master.test.ts` around lines 318 - 324, The calls to alive() are passing values extracted from regex matches (pid1, pid2 from app.stdout.match) which are string | undefined; convert/validate these to numbers before calling alive(). Update the test to parse the captured PID strings (e.g. Number(pid1) or parseInt(pid1, 10)) and guard against undefined (skip assertion or fail the test with a clear message) so alive() receives a number; reference the regex match usage for worker1/worker2 and the alive() function to locate where to change the calls.src/utils/messenger.ts (1)
100-109:⚠️ Potential issue | 🟡 MinorPotential null dereference when agent is not available.
Line 104 uses non-null assertion
getAgent()!.workerId, butgetAgent()can returnnullif no agent worker has been set yet. This could throw during early message routing.🛡️ Proposed defensive fix
const receiverWorkerId = data.receiverWorkerId ?? data.receiverPid; if (receiverWorkerId) { if (receiverWorkerId === String(process.pid)) { data.to = 'master'; - } else if (receiverWorkerId === String(this.#workerManager.getAgent()!.workerId)) { + } else if (this.#workerManager.getAgent() && + receiverWorkerId === String(this.#workerManager.getAgent()!.workerId)) { data.to = 'agent'; } else { data.to = 'app'; } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/messenger.ts` around lines 100 - 109, The message routing code risks a null dereference by using getAgent()!; modify the block that computes receiverWorkerId and sets data.to to first retrieve const agent = this.#workerManager.getAgent(); then compare receiverWorkerId to String(process.pid) first, then to String(agent?.workerId) only if agent is non-null (or use agent?.workerId and ensure the equality check handles undefined), and otherwise fall back to 'app'; ensure you do not use the non-null assertion getAgent()! and handle the case where agent is null so no exception is thrown during early routing.src/utils/mode/impl/worker_threads/agent.ts (2)
27-35:⚠️ Potential issue | 🟠 MajorSame async
beforeExitissue as in app.ts.The
process.on('exit')callback cannot perform async operations - theawaithas no effect and the process will exit immediately. See the fix suggested forsrc/utils/mode/impl/worker_threads/app.ts.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/mode/impl/worker_threads/agent.ts` around lines 27 - 35, The current gracefulExit in gracefulExit(options) incorrectly registers an async handler on process.on('exit') where await has no effect; change it to listen for the 'beforeExit' event instead and perform the async call there (i.e., process.on('beforeExit', async code => { if (typeof beforeExit === 'function') await beforeExit(); })), and remove the call to process.exit from that handler to avoid re-entrancy; keep references to the gracefulExit method and the beforeExit option when making this change.
75-85:⚠️ Potential issue | 🟡 MinorRemove incorrect
signalparameter from Worker exit event handler.The
exitevent on Node.jsworker_threads.Workeronly providesexitCode, not a signal parameter. Thesignalparameter in the handler will always beundefined, causing confusion when this undefined value is sent to the master process.🔧 Proposed fix
worker.once('exit', (code: number, signal: string) => { + worker.once('exit', (code: number) => { this.messenger.send({ action: 'agent-exit', data: { code, - signal, + signal: null, }, to: 'master', from: 'agent', }); });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/mode/impl/worker_threads/agent.ts` around lines 75 - 85, The worker exit handler is using a non-existent `signal` parameter; update the `worker.once('exit', ...)` callback in agent.ts to accept the correct `exitCode` parameter (e.g., `(exitCode: number)`) and remove `signal` from the message payload—send only the exit code (use `exitCode` in the data object) when calling `this.messenger.send` for the 'agent-exit' action.
🧹 Nitpick comments (13)
.gitignore (1)
15-15: Remove duplicate ignore entry for coverage artifacts.
coverageis already listed at Line 2, so Line 15 is redundant and can be removed to keep the ignore file clean.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.gitignore at line 15, The .gitignore contains a duplicate entry for "coverage" (already present earlier); remove the redundant "coverage" line shown in the diff to keep the file clean and avoid duplicate ignore entries.test/utils.ts (1)
14-20: Preserve defaultexecArgvwhenoptions.optis provided.
...optionsat Line 19 can overwrite theoptobject from Lines 14-17 entirely. Mergeoptexplicitly to keep defaults stable.♻️ Proposed refactor
export function cluster(baseDir: string, options: MockClusterOptions = {}) { + const { opt, ...rest } = options; return mm.cluster({ baseDir, framework: path.join(__dirname, 'fixtures/egg'), // eggPath: path.join(__dirname, '../node_modules/egg'), cache: false, opt: { // clear execArgv from egg-bin - execArgv: [], + execArgv: [], + ...opt, }, // override `@eggjs/mock` default port 17001 - ...options, + ...rest, }); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/utils.ts` around lines 14 - 20, The current spread of ...options can overwrite the default opt object (which sets execArgv: [])—to preserve defaults merge opt explicitly: build the call so opt is constructed as opt: { execArgv: [], ...(options?.opt || {}) } and then spread the rest of ...options without allowing options.opt to replace it; ensure you remove opt from the top-level options spread (or spread options after extracting opt) so execArgv default remains unless explicitly overridden.src/utils/terminate.ts (1)
91-93: Debug message text is inverted.In this catch path, the process is considered dead, but Line 92 logs “it still alive”.
📝 Suggested tweak
- debug('kill %s error: %s, it still alive', pid, err); + debug('kill %s error: %s, process is already dead', pid, err);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/terminate.ts` around lines 91 - 93, The debug log in the catch path is inverted: update the debug call that currently reads debug('kill %s error: %s, it still alive', pid, err) to accurately reflect that the process is dead (e.g., debug('kill %s error: %s, it is not alive', pid, err) or 'it is dead'), keeping the same parameters (pid, err) so the message and error details remain logged; locate the debug invocation in terminate.ts and replace the text accordingly.src/app_worker.ts (1)
155-157: Falsy check onstickyWorkerPortmay reject port 0.Using
options.stickyWorkerPortin a boolean context will treat port0as falsy. While port 0 is rarely used intentionally (it means "pick any available port"), this could cause unexpected behavior.Consider using explicit undefined check if port 0 should be valid:
if (options.sticky && options.stickyWorkerPort !== undefined) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/app_worker.ts` around lines 155 - 157, The current conditional uses a falsy check on options.stickyWorkerPort which will reject port 0; change the condition around the server.listen invocation to explicitly check for undefined (e.g. use options.sticky && options.stickyWorkerPort !== undefined) so port 0 is allowed, and then call server.listen(options.stickyWorkerPort, '127.0.0.1') as before; update the check near the server.listen call that references options.sticky and options.stickyWorkerPort to use an explicit undefined comparison.test/master.test.ts (1)
467-494: Commented-out test should be removed or documented.This entire test block is commented out without explanation. If it's intentionally disabled due to a known issue, add a
TODOorFIXMEcomment explaining why. Otherwise, consider removing dead code.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/master.test.ts` around lines 467 - 494, The commented-out test block labelled "egg-script exit" in test/master.test.ts is dead code; either remove the entire commented block or re-enable it with a clear rationale by adding a TODO/FIXME explaining why it's disabled and linking to an issue/PR; if you choose to document it, add a concise comment above the block mentioning the reason (e.g., failing on CI, platform-specific behavior) and reference the test components like appDir, start-server.js, errLogPath and the asserted stderr check so future readers know the context.test/options.test.ts (1)
221-233: Remove malformed commented-out test block.This commented-out code has a nested
it()inside anotherit()which is invalid Mocha syntax. This appears to be leftover from incomplete refactoring and should be removed.♻️ Proposed fix
- // it('should exist when specify baseDir', () => { - // it('should get egg by default but not exist', () => { - // const baseDir = path.join(__dirname, 'noexist'); - // try { - // parseOptions({ - // baseDir, - // }); - // throw new Error('should not run'); - // } catch (err) { - // assert(err.message === `${path.join(baseDir, 'package.json')} should exist`); - // } - // }); - // });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/options.test.ts` around lines 221 - 233, Remove the malformed commented-out Mocha block that nests an `it()` inside another `it()` in the tests around `parseOptions`; specifically delete the entire commented section containing the nested `it()`s (the block that references `baseDir`, `parseOptions`, and the `${path.join(baseDir, 'package.json')} should exist` assertion) so the test file no longer contains invalid commented test syntax, or replace it with a single well-formed test that calls `parseOptions({ baseDir })` and asserts the expected error.src/utils/messenger.ts (1)
112-122: Consider using else-if chain for mutually exclusive routing.The default routing logic uses separate
ifstatements. While this works because only onefromvalue matches at a time, anelse ifchain would make the mutual exclusivity explicit and slightly more efficient.♻️ Suggested refactor
// default from -> to rules if (!data.to) { if (data.from === 'agent') { data.to = 'app'; - } - if (data.from === 'app') { + } else if (data.from === 'app') { data.to = 'agent'; - } - if (data.from === 'parent') { + } else if (data.from === 'parent') { data.to = 'master'; } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/messenger.ts` around lines 112 - 122, The routing block that sets data.to based on data.from uses three separate if statements which are mutually exclusive; update that block to an if / else if / else if chain so only one branch executes (e.g., if (data.from === 'agent') { data.to = 'app'; } else if (data.from === 'app') { data.to = 'agent'; } else if (data.from === 'parent') { data.to = 'master'; }) to make the intent explicit and slightly more efficient—modify the code around the data.to assignment in src/utils/messenger.ts accordingly.src/utils/mode/impl/process/agent.ts (1)
46-51: Consider adding IPC logging for agent worker messages.The PR objective mentions adding IPC logging for master↔app worker communication via
ipcLogger. For consistency, agent worker IPC (AgentProcessWorker#send,process.send, and message handlers at lines 77-86) could also benefit from IPC logging. Currently,src/utils/mode/impl/process/app.tsimports and usesipcLoggerfor app workers but agent workers don't have equivalent logging.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/mode/impl/process/agent.ts` around lines 46 - 51, Add IPC logging for agent worker messages by importing and using the existing ipcLogger in src/utils/mode/impl/process/agent.ts; instrument AgentProcessWorker#send (the method that wraps child.send) to log outgoing messages and instrument the agent process message handlers (the process.on('message') callbacks around lines 77-86) and any direct process.send calls to log incoming messages, mirroring the pattern used in app.ts (use the same log format/context) so master↔agent IPC is recorded consistently.src/utils/options.ts (2)
115-122: Deprecation warning may be suppressed by later code.The deprecation warning at line 117 is helpful, but lines 161-163 set
process.env.NO_DEPRECATION = '*'in production, which would suppress this warning anyway. Consider whether this warning should be logged via the logger instead to ensure it's visible in production.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/options.ts` around lines 115 - 122, The console.warn in options handling (the options.https === true branch) can be suppressed by process.env.NO_DEPRECATION set later; replace the console.warn with a logger warning so it always appears in production logs: locate the options.https block in src/utils/options.ts and change the deprecation message to use the existing application logger (or export/accept a processLogger) to emit a warning (e.g., logger.warn or processLogger.warn) instead of console.warn, preserving the same message and leave the options.https assignment (options.key, options.cert) intact.
159-163: SettingNO_DEPRECATION='*'globally may have unintended side effects.Setting this environment variable globally suppresses all deprecation warnings from Node.js and all npm packages in the process. This could hide important deprecation notices from dependencies. Consider limiting the scope or documenting this behavior.
💡 Alternative: use a more targeted approach
If the intent is only to suppress egg-cluster's own deprecation messages, consider using a package-specific value:
// don't print deprecated message in production env. // it will print to stderr. if (process.env.NODE_ENV === 'production') { - process.env.NO_DEPRECATION = '*'; + process.env.NO_DEPRECATION = process.env.NO_DEPRECATION || '@eggjs/cluster'; }Or document clearly that this affects all packages.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/options.ts` around lines 159 - 163, The current code in options.ts sets process.env.NO_DEPRECATION = '*' under the NODE_ENV === 'production' check which suppresses all deprecation warnings globally; change this to a targeted approach by either removing the global assignment or replacing '*' with a package-specific value (e.g., the egg-cluster identifier) and/or gating it behind a more specific flag so only egg-cluster deprecations are suppressed; update the logic around the NODE_ENV check and the assignment to process.env.NO_DEPRECATION and add a brief comment documenting the scope and reason for the suppression.src/utils/mode/impl/process/app.ts (1)
157-163: Non-null assertion oncluster.workersand worker lookup.The
cluster.workers!andcluster.workers![id]!non-null assertions assume workers exist. While this is likely safe during shutdown (workers should exist), consider adding a guard for defensive coding.💡 Defensive check
async kill(timeout: number) { - await Promise.all(Object.keys(cluster.workers!).map(id => { - const worker = cluster.workers![id]!; + const workers = cluster.workers ?? {}; + await Promise.all(Object.keys(workers).map(id => { + const worker = workers[id]; + if (!worker) return; Reflect.set(worker, 'disableRefork', true); return terminate(worker.process, timeout); })); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/mode/impl/process/app.ts` around lines 157 - 163, The kill method assumes cluster.workers and each worker entry are non-null; change it to defensively handle missing workers by first checking if cluster.workers is truthy and returning early (or using an empty object) and then verifying each worker is defined before calling Reflect.set and terminate; reference the kill function, cluster.workers, the per-worker lookup (worker), Reflect.set(..., 'disableRefork', true), and terminate(worker.process, timeout) so you only call these on confirmed non-null worker objects.src/master.ts (2)
602-604: Consider adding10radix toparseIntcalls.While modern JavaScript defaults to base 10 for decimal strings, explicitly passing the radix is a best practice for clarity and to avoid edge cases with leading zeros.
💡 Add explicit radix
- const appTimeout = parseInt(process.env.EGG_APP_CLOSE_TIMEOUT || legacyTimeout); - const agentTimeout = parseInt(process.env.EGG_AGENT_CLOSE_TIMEOUT || legacyTimeout); + const appTimeout = parseInt(process.env.EGG_APP_CLOSE_TIMEOUT || legacyTimeout, 10); + const agentTimeout = parseInt(process.env.EGG_AGENT_CLOSE_TIMEOUT || legacyTimeout, 10);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/master.ts` around lines 602 - 604, The parseInt calls for appTimeout and agentTimeout should explicitly pass a radix to avoid ambiguity: update the parseInt invocations that set appTimeout (currently parseInt(process.env.EGG_APP_CLOSE_TIMEOUT || legacyTimeout)) and agentTimeout (currently parseInt(process.env.EGG_AGENT_CLOSE_TIMEOUT || legacyTimeout)) to include 10 as the second argument (e.g., parseInt(..., 10)); ensure both occurrences are changed and leave legacyTimeout assignment unchanged.
279-291: Sticky worker selection logic is correct but fragile.The algorithm extracts digits from the IP address to determine worker assignment. The non-null assertion on
getWorker()at line 290 assumes the calculatedpidalways exists in the worker map. If workers are being killed/restarted, this could fail.💡 Add fallback for missing worker
stickyWorker(ip: string) { const workerNumbers = this.options.workers; const ws = this.workerManager.listWorkerIds(); + if (ws.length === 0) { + return null; + } let s = ''; for (let i = 0; i < ip.length; i++) { if (!isNaN(parseInt(ip[i]))) { s += ip[i]; } } - const pid = ws[Number(s) % workerNumbers]; - return this.workerManager.getWorker(pid)!; + const index = ws.length > 0 ? Number(s) % ws.length : 0; + const pid = ws[index]; + return this.workerManager.getWorker(pid); }Note: The caller at line 268 would also need to handle null case.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/master.ts` around lines 279 - 291, The stickyWorker method currently computes a pid from digits in the IP and force-asserts a worker with this.workerManager.getWorker(pid)! which can throw if that pid was removed; change stickyWorker to check the result of getWorker(pid) and return null (or a defined fallback worker from this.workerManager.listWorkerIds(), e.g., round-robin or first available) when getWorker returns undefined, and remove the non-null assertion; update callers (the caller that invokes stickyWorker) to handle a null return. Ensure references: stickyWorker, this.options.workers, this.workerManager.listWorkerIds(), this.workerManager.getWorker(pid) are used to locate and implement the safe fallback.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Around line 21-48: Remove the auto-generated "## Summary by CodeRabbit" block
and the verbose beta-oriented release notes beneath it (the section starting
with "## Release Notes for `@eggjs/cluster` v3.0.0-beta.4" through the closing
auto-generated comment), and replace it with a concise, final-release changelog
entry that omits beta wording and redundant bullets; locate the block by the
header "## Summary by CodeRabbit" and the subsection title "Release Notes for
`@eggjs/cluster` v3.0.0-beta.4" and trim or rewrite those lines to a short,
trustable summary.
In `@README.md`:
- Around line 40-41: The README’s startup section uses “callback” and awkward
grammar (“when application has started”, “will exit when catch an error”);
change it to explicit Promise-based wording that is grammatically correct and
clear. Replace references to a “callback” with a Promise-based description such
as: the startup function returns a Promise that resolves when the application
has started and rejects if startup fails; update the sentence “master process
will exit when catch an error” to something like “the master process will exit
if the Promise rejects due to a startup error,” and ensure examples and verbs
are present-tense and grammatically correct.
In `@src/agent_worker.ts`:
- Around line 54-71: The agent.ready callback currently swallows an error;
modify it to route startup errors through the existing startErrorHandler: inside
the agent.ready((err?: Error) => { ... }) callback, if err is truthy call
startErrorHandler(err) and return (so the same cleanup and exit path is taken),
otherwise proceed to removeListener('error', startErrorHandler) and call
AgentWorker.send({ action: 'agent-start', to: 'master' }); this ensures errors
reported via agent.ready are handled the same as agent.once('error',
startErrorHandler').
In `@src/app_worker.ts`:
- Around line 117-124: Move the port validation to occur before sending the
"realport" message: ensure the code that validates the port variable (the
numeric check currently around the port validation logic) runs before calling
AppWorker.send({ action: 'realport', data: { port, protocol } }). Replace or
reuse the existing validation logic (the block that checks port is a
number/valid range) to guard the send; if validation fails, do not call
AppWorker.send and handle the error/early return accordingly so master never
receives invalid port data.
In `@src/index.ts`:
- Around line 17-18: The public entrypoint startCluster should accept an
optional options parameter to match Master; change the function signature of
startCluster to take options?: ClusterOptions (instead of required) and pass
that value into new Master(options).ready(), ensuring callers can call
startCluster() with no args; update any related exported types if necessary to
reflect the optional parameter.
In `@src/master.ts`:
- Around line 462-467: onAppStart currently dereferences the result of
this.workerManager.getWorker(data.workerId) without checking for undefined (same
as onAppExit); update onAppStart to guard against a missing worker: call const
worker = this.workerManager.getWorker(data.workerId); if (!worker) { debug or
this.logger.warn(...) and return; } then use worker safely (or use optional
chaining for non-critical logs). Ensure you reference the getWorker call and the
onAppStart method so the guard covers all subsequent uses of worker.
- Around line 407-414: The onAppExit handler currently uses a non-null assertion
when fetching the worker with this.workerManager.getWorker(data.workerId) which
can return undefined; update onAppExit to guard against a missing worker: call
const worker = this.workerManager.getWorker(data.workerId) and if (!worker)
return (or log and return) before any use of worker so you avoid a null
dereference in onAppExit and related logic that assumes the worker exists.
In `@src/utils/ipc_logger.ts`:
- Around line 12-16: Replace the current truthy check for EGG_CLUSTER_IPC_LOG
used by internalIpcLogEnabled with an explicit boolean parse: implement a small
parser (e.g., normalize with toLowerCase().trim()) that treats "1", "true",
"yes" as true and "0", "false", "no", "" as false, and set internalIpcLogEnabled
from that parser instead of using !!process.env.EGG_CLUSTER_IPC_LOG; reference
the internalIpcLogEnabled export and any new helper (e.g., parseEnvBool) so the
behavior is deterministic and avoids treating "0" or "false" strings as true.
In `@src/utils/mode/base/app.ts`:
- Around line 66-78: Update the error messages in the BaseAppWorker static
methods so they reference the correct class name: change the thrown Error
messages in static send, static kill, and static gracefulExit to say
"BaseAppWorker should implement ..." instead of "BaseAgentWorker should
implement ..."; locate the three methods (send, kill, gracefulExit) in the
BaseAppWorker class and correct the string constants passed to new Error(...)
accordingly.
In `@src/utils/mode/impl/worker_threads/app.ts`:
- Around line 141-144: The code uses a non-null assertion on this.options.port
when adding to ports; instead, check for undefined and handle it explicitly:
after const ports = this.options.ports ?? []; if (!ports.length) { if
(this.options.port !== undefined) { ports.push(this.options.port); } else {
throw new Error("No port specified: options.port is undefined (parseOptions may
have returned undefined)"); } } — reference symbols: ports, this.options.ports,
this.options.port, parseOptions.
- Around line 68-75: The current gracefulExit implementation registers a
process.on('exit') handler in gracefulExit(gracefulExitOptions) which attempts
to await options.beforeExit() but cannot because 'exit' handlers run
synchronously; change the implementation to listen for the 'beforeExit' event
(or explicitly handle termination signals like SIGINT/SIGTERM and SIGQUIT) so
the async options.beforeExit() can be awaited before calling process.exit(code).
Update the listener registration in gracefulExit to use process.on('beforeExit',
async (code) => { if (typeof options.beforeExit === 'function') await
options.beforeExit(); process.exit(code); }) or add signal handlers that await
options.beforeExit() and then call process.exit, referencing the gracefulExit
function and the options.beforeExit callback.
- Around line 36-39: The exitCode getter in app.ts currently returns 0; change
it to return a stored worker exit code by adding a private field (e.g.,
this._exitCode) and updating that field inside the worker 'exit' event handler
(the handler already receives the code). Then make get exitCode() return
this._exitCode (falling back to 0 or undefined if unset). Update any references
to this.instance if needed, but ensure the 'exit' event handler for the worker
sets this._exitCode so master.ts reading worker.exitCode gets the real value.
In `@src/utils/terminate.ts`:
- Around line 49-58: The code can skip collecting live child PIDs when timeout
<= checkInterval because unterminated is initialized empty and the loop may
never run; before calling kill(...) ensure you compute the current list of live
children by invoking getUnterminatedProcesses(childrenPids) one final time (or
initialize unterminated with that result before the loop) so that
kill(unterminated, 'SIGKILL') is only called with the actual remaining PIDs;
update the logic around unterminated, the while loop using
start/timeout/checkInterval, and the final kill call to use that fresh result.
- Around line 25-36: Replace the unreliable check of subProcess.killed with an
explicit "didExit" flag set from the 'exit' event promise: attach a listener or
await once(subProcess, 'exit') to set didExit = true before sending SIGKILL,
then after the Promise.race check didExit to decide whether to escalate to
(subProcess.process ?? subProcess).kill('SIGKILL');; reference the existing
subProcess, once(subProcess, 'exit'), sleep(timeout), kill, and killed symbols
and ensure the 'exit' event is the authoritative source for termination rather
than subProcess.killed.
In `@test/worker_threads.test.ts`:
- Around line 18-19: The test "should exit when emit error during agent worker
boot" no longer exercises the worker_threads path because the cluster invocation
omits the startMode override; update the cluster call in the test (the line
assigning app = cluster('apps/agent-worker-threads-error')) to include the
option { startMode: 'worker_threads' } so the agent runs in worker_threads mode
(or switch to the correct fixture that explicitly uses worker_threads), ensuring
the test actually triggers the worker_threads code path.
---
Outside diff comments:
In `@src/utils/messenger.ts`:
- Around line 100-109: The message routing code risks a null dereference by
using getAgent()!; modify the block that computes receiverWorkerId and sets
data.to to first retrieve const agent = this.#workerManager.getAgent(); then
compare receiverWorkerId to String(process.pid) first, then to
String(agent?.workerId) only if agent is non-null (or use agent?.workerId and
ensure the equality check handles undefined), and otherwise fall back to 'app';
ensure you do not use the non-null assertion getAgent()! and handle the case
where agent is null so no exception is thrown during early routing.
In `@src/utils/mode/impl/worker_threads/agent.ts`:
- Around line 27-35: The current gracefulExit in gracefulExit(options)
incorrectly registers an async handler on process.on('exit') where await has no
effect; change it to listen for the 'beforeExit' event instead and perform the
async call there (i.e., process.on('beforeExit', async code => { if (typeof
beforeExit === 'function') await beforeExit(); })), and remove the call to
process.exit from that handler to avoid re-entrancy; keep references to the
gracefulExit method and the beforeExit option when making this change.
- Around line 75-85: The worker exit handler is using a non-existent `signal`
parameter; update the `worker.once('exit', ...)` callback in agent.ts to accept
the correct `exitCode` parameter (e.g., `(exitCode: number)`) and remove
`signal` from the message payload—send only the exit code (use `exitCode` in the
data object) when calling `this.messenger.send` for the 'agent-exit' action.
In `@test/master.test.ts`:
- Around line 318-324: The calls to alive() are passing values extracted from
regex matches (pid1, pid2 from app.stdout.match) which are string | undefined;
convert/validate these to numbers before calling alive(). Update the test to
parse the captured PID strings (e.g. Number(pid1) or parseInt(pid1, 10)) and
guard against undefined (skip assertion or fail the test with a clear message)
so alive() receives a number; reference the regex match usage for
worker1/worker2 and the alive() function to locate where to change the calls.
---
Nitpick comments:
In @.gitignore:
- Line 15: The .gitignore contains a duplicate entry for "coverage" (already
present earlier); remove the redundant "coverage" line shown in the diff to keep
the file clean and avoid duplicate ignore entries.
In `@src/app_worker.ts`:
- Around line 155-157: The current conditional uses a falsy check on
options.stickyWorkerPort which will reject port 0; change the condition around
the server.listen invocation to explicitly check for undefined (e.g. use
options.sticky && options.stickyWorkerPort !== undefined) so port 0 is allowed,
and then call server.listen(options.stickyWorkerPort, '127.0.0.1') as before;
update the check near the server.listen call that references options.sticky and
options.stickyWorkerPort to use an explicit undefined comparison.
In `@src/master.ts`:
- Around line 602-604: The parseInt calls for appTimeout and agentTimeout should
explicitly pass a radix to avoid ambiguity: update the parseInt invocations that
set appTimeout (currently parseInt(process.env.EGG_APP_CLOSE_TIMEOUT ||
legacyTimeout)) and agentTimeout (currently
parseInt(process.env.EGG_AGENT_CLOSE_TIMEOUT || legacyTimeout)) to include 10 as
the second argument (e.g., parseInt(..., 10)); ensure both occurrences are
changed and leave legacyTimeout assignment unchanged.
- Around line 279-291: The stickyWorker method currently computes a pid from
digits in the IP and force-asserts a worker with
this.workerManager.getWorker(pid)! which can throw if that pid was removed;
change stickyWorker to check the result of getWorker(pid) and return null (or a
defined fallback worker from this.workerManager.listWorkerIds(), e.g.,
round-robin or first available) when getWorker returns undefined, and remove the
non-null assertion; update callers (the caller that invokes stickyWorker) to
handle a null return. Ensure references: stickyWorker, this.options.workers,
this.workerManager.listWorkerIds(), this.workerManager.getWorker(pid) are used
to locate and implement the safe fallback.
In `@src/utils/messenger.ts`:
- Around line 112-122: The routing block that sets data.to based on data.from
uses three separate if statements which are mutually exclusive; update that
block to an if / else if / else if chain so only one branch executes (e.g., if
(data.from === 'agent') { data.to = 'app'; } else if (data.from === 'app') {
data.to = 'agent'; } else if (data.from === 'parent') { data.to = 'master'; })
to make the intent explicit and slightly more efficient—modify the code around
the data.to assignment in src/utils/messenger.ts accordingly.
In `@src/utils/mode/impl/process/agent.ts`:
- Around line 46-51: Add IPC logging for agent worker messages by importing and
using the existing ipcLogger in src/utils/mode/impl/process/agent.ts; instrument
AgentProcessWorker#send (the method that wraps child.send) to log outgoing
messages and instrument the agent process message handlers (the
process.on('message') callbacks around lines 77-86) and any direct process.send
calls to log incoming messages, mirroring the pattern used in app.ts (use the
same log format/context) so master↔agent IPC is recorded consistently.
In `@src/utils/mode/impl/process/app.ts`:
- Around line 157-163: The kill method assumes cluster.workers and each worker
entry are non-null; change it to defensively handle missing workers by first
checking if cluster.workers is truthy and returning early (or using an empty
object) and then verifying each worker is defined before calling Reflect.set and
terminate; reference the kill function, cluster.workers, the per-worker lookup
(worker), Reflect.set(..., 'disableRefork', true), and terminate(worker.process,
timeout) so you only call these on confirmed non-null worker objects.
In `@src/utils/options.ts`:
- Around line 115-122: The console.warn in options handling (the options.https
=== true branch) can be suppressed by process.env.NO_DEPRECATION set later;
replace the console.warn with a logger warning so it always appears in
production logs: locate the options.https block in src/utils/options.ts and
change the deprecation message to use the existing application logger (or
export/accept a processLogger) to emit a warning (e.g., logger.warn or
processLogger.warn) instead of console.warn, preserving the same message and
leave the options.https assignment (options.key, options.cert) intact.
- Around line 159-163: The current code in options.ts sets
process.env.NO_DEPRECATION = '*' under the NODE_ENV === 'production' check which
suppresses all deprecation warnings globally; change this to a targeted approach
by either removing the global assignment or replacing '*' with a
package-specific value (e.g., the egg-cluster identifier) and/or gating it
behind a more specific flag so only egg-cluster deprecations are suppressed;
update the logic around the NODE_ENV check and the assignment to
process.env.NO_DEPRECATION and add a brief comment documenting the scope and
reason for the suppression.
In `@src/utils/terminate.ts`:
- Around line 91-93: The debug log in the catch path is inverted: update the
debug call that currently reads debug('kill %s error: %s, it still alive', pid,
err) to accurately reflect that the process is dead (e.g., debug('kill %s error:
%s, it is not alive', pid, err) or 'it is dead'), keeping the same parameters
(pid, err) so the message and error details remain logged; locate the debug
invocation in terminate.ts and replace the text accordingly.
In `@test/master.test.ts`:
- Around line 467-494: The commented-out test block labelled "egg-script exit"
in test/master.test.ts is dead code; either remove the entire commented block or
re-enable it with a clear rationale by adding a TODO/FIXME explaining why it's
disabled and linking to an issue/PR; if you choose to document it, add a concise
comment above the block mentioning the reason (e.g., failing on CI,
platform-specific behavior) and reference the test components like appDir,
start-server.js, errLogPath and the asserted stderr check so future readers know
the context.
In `@test/options.test.ts`:
- Around line 221-233: Remove the malformed commented-out Mocha block that nests
an `it()` inside another `it()` in the tests around `parseOptions`; specifically
delete the entire commented section containing the nested `it()`s (the block
that references `baseDir`, `parseOptions`, and the `${path.join(baseDir,
'package.json')} should exist` assertion) so the test file no longer contains
invalid commented test syntax, or replace it with a single well-formed test that
calls `parseOptions({ baseDir })` and asserts the expected error.
In `@test/utils.ts`:
- Around line 14-20: The current spread of ...options can overwrite the default
opt object (which sets execArgv: [])—to preserve defaults merge opt explicitly:
build the call so opt is constructed as opt: { execArgv: [], ...(options?.opt ||
{}) } and then spread the rest of ...options without allowing options.opt to
replace it; ensure you remove opt from the top-level options spread (or spread
options after extracting opt) so execArgv default remains unless explicitly
overridden.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 720a8da4-1985-411a-a004-b718cb9ed576
📒 Files selected for processing (64)
.eslintrc.github/PULL_REQUEST_TEMPLATE.md.github/workflows/nodejs.yml.github/workflows/pkg.pr.new.yml.gitignoreCHANGELOG.mdREADME.mdindex.jslib/agent_worker.jslib/app_worker.jslib/utils/mode/base/agent.jslib/utils/mode/base/app.jslib/utils/mode/impl/process/agent.jslib/utils/mode/impl/process/app.jslib/utils/options.jslib/utils/terminate.jslib/utils/timer.jspackage.jsonsrc/agent_worker.tssrc/app_worker.tssrc/dirname.tssrc/error/ClusterAgentWorkerError.tssrc/error/ClusterWorkerExceptionError.tssrc/error/index.tssrc/index.tssrc/master.tssrc/utils/ipc_logger.tssrc/utils/messenger.tssrc/utils/mode/base/agent.tssrc/utils/mode/base/app.tssrc/utils/mode/impl/process/agent.tssrc/utils/mode/impl/process/app.tssrc/utils/mode/impl/worker_threads/agent.tssrc/utils/mode/impl/worker_threads/app.tssrc/utils/options.tssrc/utils/terminate.tssrc/utils/worker_manager.tstest/agent_worker.test.tstest/app_worker.test.tstest/fixtures/apps/agent-debug-port/agent.jstest/fixtures/apps/agent-debug-port/inject1.jstest/fixtures/apps/app-listen-port/app/router.jstest/fixtures/apps/app-listen-port/config/config.default.jstest/fixtures/apps/app-listen-without-port/config/config.default.jstest/fixtures/apps/before-close/agent.jstest/fixtures/apps/before-close/app.jstest/fixtures/apps/framework/index.jstest/fixtures/apps/framework/lib/agent.jstest/fixtures/apps/framework/lib/framework.jstest/fixtures/apps/options/agent.jstest/fixtures/apps/options/app.jstest/fixtures/apps/options/config/plugin.jstest/fixtures/apps/script-start/start-server.jstest/fixtures/apps/worker-close-timeout/agent.jstest/fixtures/apps/worker-close-timeout/app.jstest/fixtures/egg/index.jstest/https.test.tstest/master.test.tstest/options.test.jstest/options.test.tstest/utils.jstest/utils.tstest/worker_threads.test.tstsconfig.json
💤 Files with no reviewable changes (18)
- test/fixtures/apps/options/agent.js
- .github/PULL_REQUEST_TEMPLATE.md
- test/fixtures/apps/app-listen-without-port/config/config.default.js
- test/fixtures/apps/agent-debug-port/agent.js
- test/fixtures/apps/options/app.js
- test/fixtures/apps/app-listen-port/config/config.default.js
- lib/utils/timer.js
- index.js
- test/utils.js
- lib/utils/options.js
- lib/utils/terminate.js
- test/options.test.js
- lib/agent_worker.js
- lib/utils/mode/impl/process/app.js
- lib/utils/mode/impl/process/agent.js
- lib/utils/mode/base/app.js
- lib/utils/mode/base/agent.js
- lib/app_worker.js
| ## Summary by CodeRabbit | ||
|
|
||
| ## Release Notes for @eggjs/cluster v3.0.0-beta.4 | ||
|
|
||
| - **New Features** | ||
| - Migrated project to TypeScript. | ||
| - Added support for Node.js 18.19.0, 20, 22, and 23. | ||
| - Enhanced type safety and module exports. | ||
| - Improved worker thread and process management. | ||
| - Introduced new error handling classes for better debugging. | ||
|
|
||
| - **Breaking Changes** | ||
| - Renamed package from `egg-cluster` to `@eggjs/cluster`. | ||
| - Updated import/export syntax to ES modules. | ||
| - Minimum Node.js version is now 18.19.0. | ||
|
|
||
| - **Performance Improvements** | ||
| - Refactored cluster and worker management. | ||
| - Optimized error handling and logging. | ||
|
|
||
| - **Bug Fixes** | ||
| - Resolved various edge cases in worker initialization. | ||
| - Improved graceful shutdown mechanisms. | ||
|
|
||
| - **Documentation** | ||
| - Updated README with new package name and usage examples. | ||
| - Added TypeScript and ESM import examples. | ||
| <!-- end of auto-generated comment: release notes by coderabbit.ai --> |
There was a problem hiding this comment.
Trim the auto-generated summary block from the changelog.
This section is very verbose and includes beta-oriented wording inside final release notes, which can make the changelog harder to trust/scan.
Suggested cleanup
-## Summary by CodeRabbit
-
-## Release Notes for `@eggjs/cluster` v3.0.0-beta.4
-
-- **New Features**
- - Migrated project to TypeScript.
- - Added support for Node.js 18.19.0, 20, 22, and 23.
- - Enhanced type safety and module exports.
- - Improved worker thread and process management.
- - Introduced new error handling classes for better debugging.
-
-- **Breaking Changes**
- - Renamed package from `egg-cluster` to `@eggjs/cluster`.
- - Updated import/export syntax to ES modules.
- - Minimum Node.js version is now 18.19.0.
-
-- **Performance Improvements**
- - Refactored cluster and worker management.
- - Optimized error handling and logging.
-
-- **Bug Fixes**
- - Resolved various edge cases in worker initialization.
- - Improved graceful shutdown mechanisms.
-
-- **Documentation**
- - Updated README with new package name and usage examples.
- - Added TypeScript and ESM import examples.🧰 Tools
🪛 LanguageTool
[grammar] ~30-~30: Use a hyphen to join words.
Context: ...ess management. - Introduced new error handling classes for better debugging. ...
(QB_NEW_EN_HYPHEN)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@CHANGELOG.md` around lines 21 - 48, Remove the auto-generated "## Summary by
CodeRabbit" block and the verbose beta-oriented release notes beneath it (the
section starting with "## Release Notes for `@eggjs/cluster` v3.0.0-beta.4"
through the closing auto-generated comment), and replace it with a concise,
final-release changelog entry that omits beta wording and redundant bullets;
locate the block by the header "## Summary by CodeRabbit" and the subsection
title "Release Notes for `@eggjs/cluster` v3.0.0-beta.4" and trim or rewrite those
lines to a short, trustable summary.
| You can specify a callback that will be invoked when application has started. | ||
| However, master process will exit when catch an error. |
There was a problem hiding this comment.
Clarify Promise wording in startup description.
Current text says “callback” and has awkward grammar; this section reads clearer if it explicitly describes Promise-based completion/error handling.
Suggested wording
-You can specify a callback that will be invoked when application has started.
-However, master process will exit when catch an error.
+You can handle startup completion from the returned Promise.
+The master process exits if startup throws an error.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| You can specify a callback that will be invoked when application has started. | |
| However, master process will exit when catch an error. | |
| You can handle startup completion from the returned Promise. | |
| The master process exits if startup throws an error. |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` around lines 40 - 41, The README’s startup section uses “callback”
and awkward grammar (“when application has started”, “will exit when catch an
error”); change it to explicit Promise-based wording that is grammatically
correct and clear. Replace references to a “callback” with a Promise-based
description such as: the startup function returns a Promise that resolves when
the application has started and rejects if startup fails; update the sentence
“master process will exit when catch an error” to something like “the master
process will exit if the Promise rejects due to a startup error,” and ensure
examples and verbs are present-tense and grammatically correct.
| function startErrorHandler(err: Error) { | ||
| consoleLogger.error(err); | ||
| consoleLogger.error('[agent_worker] start error, exiting with code:1'); | ||
| AgentWorker.kill(); | ||
| } | ||
|
|
||
| agent.ready((err?: Error) => { | ||
| // don't send started message to master when start error | ||
| if (err) { | ||
| return; | ||
| } | ||
|
|
||
| agent.removeListener('error', startErrorHandler); | ||
| AgentWorker.send({ action: 'agent-start', to: 'master' }); | ||
| }); | ||
|
|
||
| // exit if agent start error | ||
| agent.once('error', startErrorHandler); |
There was a problem hiding this comment.
Route agent.ready(err) through the same failure path.
The callback on Lines 60-64 drops err on the floor. If a framework reports startup failure via ready(err) without also emitting 'error', this worker never sends agent-start and never exits, so startup hangs until the master timeout.
Suggested fix
+ let startupFailed = false;
+
function startErrorHandler(err: Error) {
+ if (startupFailed) return;
+ startupFailed = true;
consoleLogger.error(err);
consoleLogger.error('[agent_worker] start error, exiting with code:1');
AgentWorker.kill();
}
agent.ready((err?: Error) => {
// don't send started message to master when start error
if (err) {
- return;
+ startErrorHandler(err);
+ return;
}
agent.removeListener('error', startErrorHandler);
AgentWorker.send({ action: 'agent-start', to: 'master' });
});🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/agent_worker.ts` around lines 54 - 71, The agent.ready callback currently
swallows an error; modify it to route startup errors through the existing
startErrorHandler: inside the agent.ready((err?: Error) => { ... }) callback, if
err is truthy call startErrorHandler(err) and return (so the same cleanup and
exit path is taken), otherwise proceed to removeListener('error',
startErrorHandler) and call AgentWorker.send({ action: 'agent-start', to:
'master' }); this ensures errors reported via agent.ready are handled the same
as agent.once('error', startErrorHandler').
| AppWorker.send({ | ||
| to: 'master', | ||
| action: 'realport', | ||
| data: { | ||
| port, | ||
| protocol, | ||
| }, | ||
| }); |
There was a problem hiding this comment.
Port validation occurs after realport message is sent.
The worker sends the realport message to master (lines 117-124) before validating that port is a number (lines 172-177). If port validation fails, master has already received potentially invalid port information.
♻️ Proposed fix - validate port before sending realport message
const port = app.options.port = options.port || listenConfig.port;
const debugPort = options.debugPort;
+
+ // Validate port early before notifying master
+ if (!listenConfig.path && !options.sticky && typeof port !== 'number') {
+ consoleLogger.error('[app_worker:%s] port should be number, but got %s(%s)',
+ process.pid, port, typeof port);
+ exitProcess();
+ return;
+ }
+
const protocol = (httpsOptions.key && httpsOptions.cert) ? 'https' : 'http';
debug('[app_worker:%s] listenConfig: %j, real port: %o, protocol: %o, debugPort: %o',
process.pid, listenConfig, port, protocol, debugPort);
AppWorker.send({Also applies to: 172-177
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/app_worker.ts` around lines 117 - 124, Move the port validation to occur
before sending the "realport" message: ensure the code that validates the port
variable (the numeric check currently around the port validation logic) runs
before calling AppWorker.send({ action: 'realport', data: { port, protocol } }).
Replace or reuse the existing validation logic (the block that checks port is a
number/valid range) to guard the send; if validation fails, do not call
AppWorker.send and handle the error/early return accordingly so master never
receives invalid port data.
| export async function startCluster(options: ClusterOptions) { | ||
| await new Master(options).ready(); |
There was a problem hiding this comment.
Keep options optional in the public entrypoint.
Master already accepts options?: ClusterOptions, but startCluster now requires an argument. That unnecessarily narrows the root API and will break existing TypeScript callers that rely on default startup behavior.
Suggested fix
-export async function startCluster(options: ClusterOptions) {
+export async function startCluster(options?: ClusterOptions) {
await new Master(options).ready();
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/index.ts` around lines 17 - 18, The public entrypoint startCluster should
accept an optional options parameter to match Master; change the function
signature of startCluster to take options?: ClusterOptions (instead of required)
and pass that value into new Master(options).ready(), ensuring callers can call
startCluster() with no args; update any related exported types if necessary to
reflect the optional parameter.
| static gracefulExit(options: gracefulExitOptions) { | ||
| process.on('exit', async code => { | ||
| if (typeof beforeExit === 'function') { | ||
| await beforeExit(); | ||
| if (typeof options.beforeExit === 'function') { | ||
| await options.beforeExit(); | ||
| } | ||
| process.exit(code); | ||
| }); | ||
| } |
There was a problem hiding this comment.
Async beforeExit callback not properly awaited before process.exit.
The gracefulExit handler calls await options.beforeExit() but then immediately calls process.exit(code). The process.on('exit') callback runs synchronously, so the await has no effect - the process will exit before beforeExit completes.
🐛 Proposed fix
static gracefulExit(options: gracefulExitOptions) {
- process.on('exit', async code => {
- if (typeof options.beforeExit === 'function') {
- await options.beforeExit();
- }
- process.exit(code);
- });
+ // Use SIGTERM/SIGINT handlers for async cleanup, not 'exit' event
+ const cleanup = async (signal: string) => {
+ if (typeof options.beforeExit === 'function') {
+ await options.beforeExit();
+ }
+ process.exit(0);
+ };
+ process.on('SIGTERM', () => cleanup('SIGTERM'));
+ process.on('SIGINT', () => cleanup('SIGINT'));
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| static gracefulExit(options: gracefulExitOptions) { | |
| process.on('exit', async code => { | |
| if (typeof beforeExit === 'function') { | |
| await beforeExit(); | |
| if (typeof options.beforeExit === 'function') { | |
| await options.beforeExit(); | |
| } | |
| process.exit(code); | |
| }); | |
| } | |
| static gracefulExit(options: gracefulExitOptions) { | |
| // Use SIGTERM/SIGINT handlers for async cleanup, not 'exit' event | |
| const cleanup = async (signal: string) => { | |
| if (typeof options.beforeExit === 'function') { | |
| await options.beforeExit(); | |
| } | |
| process.exit(0); | |
| }; | |
| process.on('SIGTERM', () => cleanup('SIGTERM')); | |
| process.on('SIGINT', () => cleanup('SIGINT')); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/utils/mode/impl/worker_threads/app.ts` around lines 68 - 75, The current
gracefulExit implementation registers a process.on('exit') handler in
gracefulExit(gracefulExitOptions) which attempts to await options.beforeExit()
but cannot because 'exit' handlers run synchronously; change the implementation
to listen for the 'beforeExit' event (or explicitly handle termination signals
like SIGINT/SIGTERM and SIGQUIT) so the async options.beforeExit() can be
awaited before calling process.exit(code). Update the listener registration in
gracefulExit to use process.on('beforeExit', async (code) => { if (typeof
options.beforeExit === 'function') await options.beforeExit();
process.exit(code); }) or add signal handlers that await options.beforeExit()
and then call process.exit, referencing the gracefulExit function and the
options.beforeExit callback.
| const ports = this.options.ports ?? []; | ||
| if (!ports.length) { | ||
| ports.push(this.options.port); | ||
| ports.push(this.options.port!); | ||
| } |
There was a problem hiding this comment.
Non-null assertion on potentially undefined options.port.
Line 143 uses this.options.port! but the port may be undefined based on parseOptions behavior (which returns undefined when no port is specified and HTTPS is not enabled).
🛡️ Proposed fix
const ports = this.options.ports ?? [];
if (!ports.length) {
- ports.push(this.options.port!);
+ if (this.options.port !== undefined) {
+ ports.push(this.options.port);
+ } else {
+ throw new Error('Either ports or port must be specified for worker_threads mode');
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const ports = this.options.ports ?? []; | |
| if (!ports.length) { | |
| ports.push(this.options.port); | |
| ports.push(this.options.port!); | |
| } | |
| const ports = this.options.ports ?? []; | |
| if (!ports.length) { | |
| if (this.options.port !== undefined) { | |
| ports.push(this.options.port); | |
| } else { | |
| throw new Error('Either ports or port must be specified for worker_threads mode'); | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/utils/mode/impl/worker_threads/app.ts` around lines 141 - 144, The code
uses a non-null assertion on this.options.port when adding to ports; instead,
check for undefined and handle it explicitly: after const ports =
this.options.ports ?? []; if (!ports.length) { if (this.options.port !==
undefined) { ports.push(this.options.port); } else { throw new Error("No port
specified: options.port is undefined (parseOptions may have returned
undefined)"); } } — reference symbols: ports, this.options.ports,
this.options.port, parseOptions.
| (subProcess.process ?? subProcess).kill('SIGTERM'); | ||
| await Promise.race([ | ||
| once(subProcess, 'exit'), | ||
| sleep(timeout), | ||
| ]); | ||
| if (subProcess.killed) { | ||
| return; | ||
| } | ||
| // SIGKILL: http://man7.org/linux/man-pages/man7/signal.7.html | ||
| // worker: https://github.com/nodejs/node/blob/master/lib/internal/cluster/worker.js#L22 | ||
| // subProcess.kill is wrapped to subProcess.destroy, it will wait to disconnected. | ||
| (subProcess.process ?? subProcess).kill('SIGKILL'); |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In Node.js child_process, does ChildProcess.killed mean the process has exited, or only that a signal was successfully sent?
💡 Result:
ChildProcess.killed means that a signal was successfully sent via subprocess.kill, not that the process has exited.
Citations:
- 1:
childProcess.killedshould betrueafterprocess.kill(childProcess.pid)is called nodejs/node#27490 - 2: https://nodejs.org/dist/latest-v19.x/docs/api/child_process.html
🏁 Script executed:
cat -n src/utils/terminate.ts | sed -n '20,40p'Repository: eggjs/cluster
Length of output: 983
subProcess.killed cannot determine if the process actually exited; it only indicates a signal was sent.
At line 30, the code incorrectly uses killed to skip SIGKILL escalation. After the Promise.race() times out, the process may still be alive even though killed is true. The 'exit' event must be explicitly checked to confirm termination; otherwise, SIGKILL may never be sent.
🐛 Proposed fix
async function killProcess(subProcess: SubProcess, timeout: number) {
- // https://github.com/nodejs/node/pull/34312
- (subProcess.process ?? subProcess).kill('SIGTERM');
- await Promise.race([
- once(subProcess, 'exit'),
- sleep(timeout),
- ]);
- if (subProcess.killed) {
- return;
- }
+ const target = subProcess.process ?? subProcess;
+ const targetPid = target.pid;
+ if (!targetPid) return;
+
+ try {
+ target.kill('SIGTERM');
+ } catch {
+ return;
+ }
+
+ const exited = await Promise.race([
+ once(subProcess, 'exit').then(() => true),
+ sleep(timeout).then(() => false),
+ ]);
+ if (exited) return;
+
+ try {
+ process.kill(targetPid, 0);
+ } catch {
+ return;
+ }
// SIGKILL: http://man7.org/linux/man-pages/man7/signal.7.html
// worker: https://github.com/nodejs/node/blob/master/lib/internal/cluster/worker.js#L22
// subProcess.kill is wrapped to subProcess.destroy, it will wait to disconnected.
- (subProcess.process ?? subProcess).kill('SIGKILL');
+ target.kill('SIGKILL');
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/utils/terminate.ts` around lines 25 - 36, Replace the unreliable check of
subProcess.killed with an explicit "didExit" flag set from the 'exit' event
promise: attach a listener or await once(subProcess, 'exit') to set didExit =
true before sending SIGKILL, then after the Promise.race check didExit to decide
whether to escalate to (subProcess.process ?? subProcess).kill('SIGKILL');;
reference the existing subProcess, once(subProcess, 'exit'), sleep(timeout),
kill, and killed symbols and ensure the 'exit' event is the authoritative source
for termination rather than subProcess.killed.
| let unterminated: number[] = []; | ||
|
|
||
| while (Date.now() - start < timeout - checkInterval) { | ||
| await sleep(checkInterval); | ||
| unterminated = getUnterminatedProcesses(childrenPids); | ||
| if (unterminated.length === 0) { | ||
| return; | ||
| } | ||
| } | ||
| kill(unterminated, 'SIGKILL'); |
There was a problem hiding this comment.
Short timeout path can skip child SIGKILL entirely.
unterminated starts empty, so if the loop never runs (e.g., timeout <= check interval), Line 58 sends SIGKILL to an empty list even when children are still alive.
🛠️ Proposed fix
- let unterminated: number[] = [];
+ let unterminated: number[] = childrenPids;
- while (Date.now() - start < timeout - checkInterval) {
+ while (unterminated.length > 0 && Date.now() - start < timeout - checkInterval) {
await sleep(checkInterval);
unterminated = getUnterminatedProcesses(childrenPids);
if (unterminated.length === 0) {
return;
}
}
- kill(unterminated, 'SIGKILL');
+ if (unterminated.length > 0) {
+ kill(unterminated, 'SIGKILL');
+ }
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/utils/terminate.ts` around lines 49 - 58, The code can skip collecting
live child PIDs when timeout <= checkInterval because unterminated is
initialized empty and the loop may never run; before calling kill(...) ensure
you compute the current list of live children by invoking
getUnterminatedProcesses(childrenPids) one final time (or initialize
unterminated with that result before the loop) so that kill(unterminated,
'SIGKILL') is only called with the actual remaining PIDs; update the logic
around unterminated, the while loop using start/timeout/checkInterval, and the
final kill call to use that fresh result.
| it('should exit when emit error during agent worker boot', () => { | ||
| app = utils.cluster('apps/agent-worker-threads-error'); | ||
| app = cluster('apps/agent-worker-threads-error'); |
There was a problem hiding this comment.
This case no longer exercises the worker_threads path.
Unlike the previous test, this cluster boot omits { startMode: 'worker_threads' }, so the suite can pass while the default process-mode agent path is what actually ran.
Suggested fix
- app = cluster('apps/agent-worker-threads-error');
+ app = cluster('apps/agent-worker-threads-error', { startMode: 'worker_threads' } as any);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| it('should exit when emit error during agent worker boot', () => { | |
| app = utils.cluster('apps/agent-worker-threads-error'); | |
| app = cluster('apps/agent-worker-threads-error'); | |
| it('should exit when emit error during agent worker boot', () => { | |
| app = cluster('apps/agent-worker-threads-error', { startMode: 'worker_threads' } as any); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/worker_threads.test.ts` around lines 18 - 19, The test "should exit when
emit error during agent worker boot" no longer exercises the worker_threads path
because the cluster invocation omits the startMode override; update the cluster
call in the test (the line assigning app =
cluster('apps/agent-worker-threads-error')) to include the option { startMode:
'worker_threads' } so the agent runs in worker_threads mode (or switch to the
correct fixture that explicitly uses worker_threads), ensuring the test actually
triggers the worker_threads code path.
|
Wrong base branch — this was created against master (v3+ TypeScript), but the active branch is 1.x. Re-creating against 1.x. |
Summary
Add structured logging for every master↔app worker IPC point under Node.js
clustermode (process mode only), using onelogger as the logger facade.Covers both layers of cluster IPC:
Application layer (always on)
AppProcessWorker#send()master.tssticky-sessionworker.instance.send('sticky-session:connection', connection)cluster.on('fork')→worker.on('message', (msg, handle) => ...)static AppProcessWorker.send()process.on('message', (msg, handle) => ...)inapp_worker.tscluster 内部层 (opt-in via
EGG_CLUSTER_IPC_LOG=1; very verbose —newconnfires once per HTTP request)online/listening/queryServer/accepted(fd ack) /close— viaworker.process.on('internalMessage')newconn(fd dispatch, handle=<TCP>) /disconnect/suicide— viaprocess.on('internalMessage')Note:
internalMessageis an undocumented Node.js event but has been stable across major versions. E/F hook onworker.process(ChildProcess) rather thancluster.Worker, since the latter does not forwardinternalMessage.Agent worker (
child_process.fork) andworker_threadsmode are intentionally out of scope — they are not theclustermodule.Users can replace the default console sink via
onelogger.setLogger()/setCustomLogger().Sample output
With
EGG_CLUSTER_IPC_LOG=1and a real HTTP request against theegg-readyfixture:formatIpcMessagehas a safe replacer:net.Socket/net.Server/ Buffer / circular refs are collapsed to tokens; long strings are truncated at 200 chars; whole JSON payload is capped at 1024 chars. Handles get a+handle=<Socket fd=N>/<TCP>/<Server>suffix.Test plan
tshydual-build +attw --packall green (node10 / node16-CJS / node16-ESM / bundler)test/master.test.ts— 43/43 passing (core IPC lifecycle)test/https.test.ts— 3/3 passingtest/agent_worker.test.ts— 12/12 passingtest/worker_threads.test.ts— 2/2 passingegg-readyfixture withEGG_CLUSTER_IPC_LOG=1+ real HTTP request; all 7 points (A/A'/B/C/D/E/F) print as expected, no Socket serialization errorstest/app_worker.test.ts— blocked by a pre-existing TS18048 error (let app2;+await app2.close();in thefinallyblock); unrelated to this changetest/options.test.ts— 19 passing / 2 pre-existing failures about fixture framework packages; unchanged by this commit (verified by re-running againstgit stashed code)🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
Breaking Changes
egg-clusterto@eggjs/clusterstartCluster()from callback-based to promise-based APINew Features
Bug Fixes