fix: kill ghost agent processes with two-stage Ctrl+C #38

Merged
kasperjunge merged 8 commits into main from ghost-agent-cleanup
Mar 24, 2026

Conversation

@kasperjunge
Collaborator

Summary

Fixes #26. Supersedes #27.

  • Adds a _kill_process_group helper that sends SIGTERM, then SIGKILL after a 3s grace period
  • Uses start_new_session=True to isolate agent subprocesses in their own process group
  • Two-stage Ctrl+C: First Ctrl+C finishes the current iteration naturally (no interrupted writes). Second Ctrl+C force-kills the agent process group immediately.
  • Clear UX messaging: after first Ctrl+C, shows "Finishing current iteration… (Ctrl+C again to force stop)"

Based on @malpou's work in #27 (process group isolation and kill helper), with the addition of two-stage Ctrl+C to avoid corrupting work in progress.
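For readers skimming the description, the two-stage behaviour can be sketched roughly as follows. This is not the code from this PR: `TwoStageInterrupt`, `notify`, `force_stop`, and `install` are hypothetical names chosen for illustration, and the real handler may be structured differently.

```python
import signal
from typing import Callable


class TwoStageInterrupt:
    """Hypothetical SIGINT handler: first call is graceful, second forces a stop."""

    def __init__(
        self,
        force_stop: Callable[[], None],
        notify: Callable[[str], None] = print,
    ) -> None:
        self.stop_requested = False
        self.force_stopped = False
        self._force_stop = force_stop
        self._notify = notify

    def __call__(self, signum, frame) -> None:
        if not self.stop_requested:
            # First Ctrl+C: only set a flag so the current iteration can finish.
            self.stop_requested = True
            self._notify("Finishing current iteration… (Ctrl+C again to force stop)")
            return
        # Second Ctrl+C: stop waiting and kill the agent's process group now.
        self.force_stopped = True
        self._force_stop()


def install(handler: TwoStageInterrupt):
    """Register the handler for SIGINT and return the previous handler."""
    return signal.signal(signal.SIGINT, handler)
```

In this sketch the run loop would check `handler.stop_requested` between iterations and exit cleanly once it is set. Because the agent is started in its own session, a terminal Ctrl+C does not reach it directly, which is what lets the first stage leave the running iteration alone.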

Test plan

  • All 445 tests pass
  • New tests verify process group isolation (start_new_session=True)
  • New tests verify cleanup kills the process group on cancellation/timeout
  • New tests verify two-stage signal handler (first Ctrl+C → graceful stop, second → force kill)
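As an illustration of what "process group isolation" means in practice, here is a small POSIX-only sketch. The helper names `spawn_isolated` and `is_isolated` are hypothetical, and the PR's actual tests are not shown on this page and may check this differently.

```python
import os
import subprocess
from typing import Sequence


def spawn_isolated(cmd: Sequence[str]) -> subprocess.Popen:
    """Start cmd as the leader of a new session, and therefore a new process group."""
    return subprocess.Popen(cmd, start_new_session=True)


def is_isolated(proc: subprocess.Popen) -> bool:
    """True if the child leads its own process group, distinct from the caller's."""
    child_pgid = os.getpgid(proc.pid)
    return child_pgid == proc.pid and child_pgid != os.getpgid(0)
```

A child started this way can later be signalled as a group with `os.killpg`, without the signal also reaching the parent process.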

🤖 Generated with Claude Code

malpou and others added 8 commits March 24, 2026 14:08
Add _kill_process_group() and _SESSION_KWARGS to support killing agent
processes and all their children when the ralph loop is cancelled or
times out. This is the foundation for fixing #26.

Co-authored-by: Ralphify <noreply@ralphify.co>

Use start_new_session=True in _run_agent_streaming so the agent and its
children form a separate process group. Replace proc.kill() calls with
_kill_process_group() for proper group-wide cleanup on timeout and in
the finally block. Also harden _kill_process_group with a pgid==pid
guard and SIGTERM-before-SIGKILL strategy for safer WSL behavior.

Co-authored-by: Ralphify <noreply@ralphify.co>

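The guard and escalation described in this commit message could look roughly like the sketch below. It is an assumption-laden stand-in, not the PR's `_kill_process_group`: the name `kill_group_sketch`, the `grace` parameter, and the exact fallback behaviour are illustrative only, and it is POSIX-only.

```python
import os
import signal
import subprocess


def kill_group_sketch(proc: subprocess.Popen, grace: float = 3.0) -> None:
    """Hypothetical stand-in: SIGTERM the child's group, then SIGKILL after `grace` seconds."""
    if proc.poll() is not None:
        return  # leader already exited and was reaped; its pid may be reused
    try:
        pgid = os.getpgid(proc.pid)
    except ProcessLookupError:
        return
    if pgid != proc.pid:
        # Guard: the child is not a group leader, so signalling its group
        # could hit our own group. Fall back to killing just the child.
        proc.kill()
        proc.wait()
        return
    try:
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        return
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # The leader ignored SIGTERM: escalate to SIGKILL for the whole group.
        try:
            os.killpg(pgid, signal.SIGKILL)
        except ProcessLookupError:
            pass
        proc.wait()
```

Note that this sketch only waits on the group leader. Descendants that ignore SIGTERM after the leader has exited would need extra handling, which is left out here.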
Use subprocess.Popen with start_new_session=True in the blocking path
so agent subprocesses form their own process group. On timeout or
KeyboardInterrupt, the entire group is killed via _kill_process_group,
preventing ghost processes from surviving Ctrl+C.

Co-authored-by: Ralphify <noreply@ralphify.co>

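A rough sketch of the blocking path described in this commit message, under stated assumptions: `run_blocking_sketch` and `_kill_group_now` are hypothetical names, and the cleanup here is simplified to an immediate SIGKILL rather than the SIGTERM-then-SIGKILL escalation the PR uses. POSIX-only.

```python
import os
import signal
import subprocess
from typing import Optional, Sequence


def _kill_group_now(proc: subprocess.Popen) -> None:
    # Simplified cleanup: SIGKILL the whole group at once, with no grace period.
    # proc.pid doubles as the group id because the child was started with
    # start_new_session=True and has not been reaped yet.
    try:
        os.killpg(proc.pid, signal.SIGKILL)
    except ProcessLookupError:
        pass
    proc.wait()


def run_blocking_sketch(cmd: Sequence[str], timeout: Optional[float]) -> Optional[int]:
    """Run cmd in its own process group; return its exit code, or None on timeout."""
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        _kill_group_now(proc)
        return None
    except KeyboardInterrupt:
        # Clean up the agent and its children before propagating the interrupt.
        _kill_group_now(proc)
        raise
```

Using `Popen` plus `wait` instead of `subprocess.run` keeps a handle on the process, so the caller can signal the whole group rather than only the direct child.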
Cover _kill_process_group behavior (SIGTERM/SIGKILL escalation, session
leader guard, fallbacks) and verify both streaming and blocking modes
use start_new_session and call _kill_process_group on timeout/interrupt.

Co-authored-by: Ralphify <noreply@ralphify.co>

Remove redundant _kill_process_group integration tests that duplicated
unit test coverage, merge similar fallback tests, and trim helper
docstrings. Reduces test additions by ~100 lines without losing coverage.

Co-authored-by: Ralphify <noreply@ralphify.co>

Co-authored-by: Ralphify <noreply@ralphify.co>

Merge TestKillProcessGroup and TestProcessGroupIsolation into a single
TestProcessGroupCleanup class with a class-level pytestmark for the
win32 skipif. Remove redundant docstrings and unused os import.

Co-authored-by: Ralphify <noreply@ralphify.co>

First Ctrl+C finishes the current iteration naturally, avoiding
interrupted writes. Second Ctrl+C force-kills the agent process
group (SIGTERM with 3s grace, then SIGKILL).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@kasperjunge merged commit 34592f3 into main on Mar 24, 2026
4 checks passed
Development

Successfully merging this pull request may close these issues.

Ghost agent processes continue editing files after Ctrl+C

2 participants