fix(core): graceful process tree shutdown for continuous tasks#1
Open
fix(core): graceful process tree shutdown for continuous tasks#1
Conversation
Building on the Rust-based process tree killer, this adds a
`killProcessTreeGraceful` async function that implements
SIGTERM → wait → SIGKILL escalation, preventing two issues:
1. Orphaned processes: the tree walker sends signals to all
descendants regardless of process group, so deep child processes
(e.g. doppler → bunx → bun → app) all receive SIGTERM.
2. Premature cleanup interruption: the main process now stays alive
during a configurable grace period (default 5s), keeping the PTY
master open and preventing the kernel from sending SIGHUP to
children before they finish their cleanup handlers.
The fire-and-forget `killProcessTree` is kept for synchronous
`process.on('exit')` handlers as a last resort.
Fixes nrwl#32438
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- forked-process-task-runner: await kill promises before process.exit() - running-tasks: per-child SIGINT handlers no longer call process.exit() to prevent first-child-exits race - pseudo-terminal: shutdown() uses sync killProcessTree for exit handler - pseudo-terminal: kill() uses killProcessTreeGraceful directly - process_killer: graceful kill detects reparented descendants Co-Authored-By: Claude Opus 4.6 <[email protected]>
Co-Authored-By: Claude Opus 4.6 <[email protected]>
TaskOrchestrator.cleanup() now awaits the forked process runner's cleanup promise alongside other running tasks, ensuring all process trees are fully shut down before the orchestrator continues. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Co-Authored-By: Claude Opus 4.6 <[email protected]>
…reeGraceful Tests cover: - Killing a simple process - Killing a multi-level process tree (parent + children) - Graceful SIGTERM with cleanup handler verification - Slow cleanup within grace period (verifies process isn't killed prematurely) - Multi-level tree graceful shutdown - Already-dead process (no-op fast path) Skipped on Windows where Unix signals don't apply. Co-Authored-By: Claude Opus 4.6 <[email protected]>
1d68855 to
888a97c
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This builds on top of Leo's
nxc-3480branch (PR nrwl#33655) to fix the graceful shutdown regression that was blocking the PR from landing.Problem: PR nrwl#33655 added
killProcessTreeusingsysinfoto kill all descendants on Ctrl+C, but it was fire-and-forget — processes were killed so fast that cleanup handlers (e.g.server.close()) never had a chance to run.Solution: Added
killProcessTreeGraceful()— an async Rust function that:All TypeScript
kill()methods across the codebase now use the graceful variant, and signal handlers properlyawaitthe kill promises before callingprocess.exit().Key changes
process_killer/mod.rs— AddedkillProcessTreeGraceful,collect_descendants_of, and helper functionsrunning-tasks.ts— Per-child SIGINT handlers no longer callprocess.exit()(prevents first-child-exits race); parent-level handler awaits all killsforked-process-task-runner.ts—cleanup()is now async, SIGINT handler awaits ittask-orchestrator.ts— Awaits forked runner cleanup alongside other running taskspseudo-terminal.ts—kill()uses graceful tree kill directly;shutdown()uses synckillProcessTreefor exit handlernode-child-process.ts— Both kill methods usekillProcessTreeGracefulrun-script.impl.ts— Exit handler uses graceful kill with.finally()Test results
Tested against the reproduction repo (agcty/nx-32438-continuous-tasks-repro) with 3 concurrent services (service-a:3001, service-b:3002, frontend):
Added 6 integration tests in
kill_process_tree.spec.tscovering tree killing, graceful SIGTERM with cleanup verification, slow cleanup within grace period, and already-dead process fast path.Review by OpenAI Codex (gpt-5.3-codex, xhigh reasoning)
The fix was reviewed by Codex CLI which found and helped address:
Related
Test plan
npx jest packages/nx/src/native/tests/kill_process_tree.spec.ts(6/6 pass)nx affected -t build,test,linton nrwl/nx monorepo🤖 Generated with Claude Code