feat(shell): persist background-task output to disk + completion notifications #140
Merged
Fixes eval-loss bug. Previously, `is_background: true` shell commands
detached via `&` and the OS sent their stdout/stderr to nowhere — protoCLI's
data callback only fires while the wrapper is attached. Long-running
processes like eval suites would print results that nothing ever saw.
Mirrors cc-2.18's task framework, scoped to local shells:

- packages/core/src/backgroundShells/: new module
  * registry.ts: BackgroundShellRegistry, an in-memory map of taskId →
    {status, command, outputPath, pid, exitCode, ...}, with
    drainPendingNotifications() for the next-turn injection.
  * diskOutput.ts: path helpers + readBackgroundTaskOutput / Exit / Pid.
    Files live at <projectTempDir>/<sessionId>/tasks/<taskId>.{output,exit,pid}.
  * watcher.ts: polls the .exit sentinel (with PID-liveness fallback),
    marks the task complete/failed in the registry once the bg process
    exits.
  * notifications.ts: builds <task_notification> blocks (task_id,
    output_file, status, exit_code, summary).
- shell.ts: when is_background=true on non-Windows, generate a taskId,
  redirect stdout/stderr at the shell level into the per-task output
  file, capture the bg PID and exit code via sentinel files, register
  the task, and start the watcher. Tool result returns the file path
  + task ID so the model can `Read` it later.
- core/client.ts: drain completed-but-unnotified tasks at the start of
  each user query and prepend <task_notification> blocks to the request,
  matching how plan/subagent/arena reminders are added.
- bg-stop.ts: new bg_stop tool. SIGTERM the process group, escalate to
  SIGKILL after 3s, mark the task killed in the registry. Registered
  as a core tool alongside the existing task-* family.
- bgCommand.ts: new /bg slash command listing running and recent
  background tasks with id, status, age, command, output path, pid.
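
The registry and its drain step described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in registry.ts: the `notified` flag, the `markFinished` method, and the rule that exit code 0 means "completed" are assumptions made for the example; only the class name, the listed fields, and `drainPendingNotifications()` come from the description.

```typescript
// Hypothetical sketch of an in-memory task registry.
type TaskStatus = "running" | "completed" | "failed" | "killed";

interface BackgroundTask {
  taskId: string;
  command: string;
  outputPath: string;
  pid?: number;
  status: TaskStatus;
  exitCode?: number;
  notified: boolean; // assumed flag: has this completion been reported yet?
}

class BackgroundShellRegistry {
  private tasks = new Map<string, BackgroundTask>();

  register(task: { taskId: string; command: string; outputPath: string; pid?: number }): void {
    this.tasks.set(task.taskId, {
      taskId: task.taskId,
      command: task.command,
      outputPath: task.outputPath,
      pid: task.pid,
      status: "running",
      notified: false,
    });
  }

  // Assumed helper: called by a watcher once the process has exited.
  markFinished(taskId: string, exitCode: number): void {
    const task = this.tasks.get(taskId);
    if (!task || task.status !== "running") return;
    task.exitCode = exitCode;
    task.status = exitCode === 0 ? "completed" : "failed";
  }

  // Returns finished tasks that have not been reported yet and marks them
  // as reported, so each completion is handed out exactly once.
  drainPendingNotifications(): BackgroundTask[] {
    const pending: BackgroundTask[] = [];
    this.tasks.forEach((task) => {
      if (task.status !== "running" && !task.notified) {
        task.notified = true;
        pending.push(task);
      }
    });
    return pending;
  }
}
```

Draining marks each task as reported, so a second call in the same turn returns nothing and a completion is injected into only one request.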
Tests: shell.test.ts updated for new wrapper format; 55/55 pass. Full
core suite (5,337 tests) and cli suite (3,767 tests) pass.
Deferred (worth follow-up PRs):
- Auto-background of long foreground commands and Ctrl+B keybinding.
Both need foreground shell execution rearchitected so output tees
to disk from second 1 and the await can be interrupted mid-flight
without killing the process.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Walkthrough
This PR adds a comprehensive background shell task management system spanning CLI commands, core infrastructure, tools, and integrations. It enables tracking, monitoring, and stopping long-running background tasks with disk-based output capture and user notifications.
Sequence Diagram(s)
sequenceDiagram
participant User
participant Shell as Shell Tool
participant Registry as Background<br/>Shell Registry
participant Watcher as Background<br/>Shell Watcher
participant Disk as Disk Output<br/>Layer
participant Client as Client
User->>Shell: Execute background command
Shell->>Disk: Create task dir & output files
Shell->>Registry: Register task (id, command, outputPath)
Shell->>Watcher: Start async polling watcher
Shell-->>User: Task started (id & output path)
Note over Watcher: Poll loop (check sentinel files)
Watcher->>Disk: Check for .exit sentinel
Watcher->>Registry: Update task status & exit code
Note over Registry: Task marked completed/failed
Registry->>Disk: (no action, output persisted)
User->>Client: Send next message
Client->>Registry: Drain pending notifications
Registry-->>Client: List of completed tasks
Client->>Client: Build notification blocks
Client-->>User: System reminders with task summaries
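
The poll step in the diagram (check the .exit sentinel, fall back to PID liveness) might look like the sketch below. The `Probes` interface and `pollOnce` are hypothetical names; in a Node implementation the probes would presumably wrap a file read of the `.exit` file and a `process.kill(pid, 0)` liveness check, but they are injected here so the decision logic stands alone.

```typescript
// Hypothetical probes supplied by the caller.
interface Probes {
  readExitFile: () => string | null; // null while the sentinel does not exist
  isPidAlive: () => boolean;
}

type PollResult =
  | { done: false }
  | { done: true; status: "completed" | "failed"; exitCode: number | null };

// Parses the sentinel contents; returns null if missing or not a number.
function parseExitCode(raw: string | null): number | null {
  if (raw === null) return null;
  const code = parseInt(raw.trim(), 10);
  return isNaN(code) ? null : code;
}

function finished(exitCode: number): PollResult {
  return { done: true, status: exitCode === 0 ? "completed" : "failed", exitCode };
}

function pollOnce(probes: Probes): PollResult {
  const code = parseExitCode(probes.readExitFile());
  if (code !== null) return finished(code);

  if (!probes.isPidAlive()) {
    // The process may have exited between the two checks, so re-read the
    // sentinel once before concluding that it died without writing one.
    const late = parseExitCode(probes.readExitFile());
    if (late !== null) return finished(late);
    return { done: true, status: "failed", exitCode: null };
  }
  return { done: false };
}
```

A watcher would call `pollOnce` on a timer and update the registry the first time `done` is true.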
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
const commandToExecute = (() => {
  if (!shouldRunInBackground || !useDiskCapture) return finalCommand;
  // Strip trailing & (we'll re-add it on the subshell wrapper).
  const inner = finalCommand.trim().replace(/\s*&\s*$/, '');
mabry1985 added a commit that referenced this pull request on Apr 26, 2026
…147)

CodeQL flagged two new alerts on the background-shell wrapper code landed in #140:
- js/polynomial-redos at shell.ts:292 ('&+$' on a trimmed string)
- js/polynomial-redos at shell.ts:305 ('\\s*&\\s*$' on a trimmed string)

Both are low practical risk (the inputs are bounded model-emitted shell commands) but the alert is blocking the dev → main promotion in PR #141. Swap each for a plain string-op equivalent: same intent, no quantifier-on-quantifier shape for the analyzer to flag.

Co-authored-by: Automaker <automaker@localhost>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
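
For illustration, a plain string-op equivalent of the second pattern could look like this. The function name is hypothetical and this is not necessarily the code that landed in the follow-up commit; it only shows that the same stripping can be done without a regex.

```typescript
// Removes a single trailing "&" and the whitespace around it, using only
// trim/slice so there is no regex for the analyzer to flag.
function stripTrailingAmpersand(command: string): string {
  const trimmed = command.trim();
  if (trimmed.charAt(trimmed.length - 1) !== "&") return trimmed;
  // Leading whitespace is already gone, so trim() only affects the end.
  return trimmed.slice(0, -1).trim();
}
```

Like the regex it stands in for, this removes only one `&`, so a command ending in `&&` would still end in a single `&` afterwards.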
Why
Background shell tasks lose their output. When the shell tool runs a command with `is_background: true`, it appends `&` and the OS detaches the process. ShellExecutionService's data callback only fires while the wrapper is attached; once it exits, all subsequent stdout/stderr from the detached process flows into the void. Long-running evals (and similar) print results that nothing ever sees, and the agent goes looking for files the runner never wrote.
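
One way to avoid the loss described above is to redirect at the shell level before detaching, so the output file stays open after the wrapper exits. The sketch below builds such a command string. It is an assumption about the general shape, not the wrapper shell.ts actually emits; `shellQuote`, `buildBackgroundWrapper`, and the `paths` parameter are hypothetical names.

```typescript
// Single-quotes a string for POSIX shells, escaping embedded single quotes.
function shellQuote(s: string): string {
  return "'" + s.split("'").join("'\\''") + "'";
}

interface TaskPaths {
  output: string; // receives stdout and stderr
  exit: string;   // sentinel holding the exit code
  pid: string;    // holds the PID of the backgrounded subshell
}

// Assumed wrapper shape:
//   ( ( <cmd> ) > out 2>&1; echo $? > exit ) & echo $! > pid
// The inner subshell runs the command with both streams redirected, the
// outer subshell records the exit code, and $! captures the PID of the
// outer subshell once it is in the background.
function buildBackgroundWrapper(inner: string, paths: TaskPaths): string {
  const out = shellQuote(paths.output);
  const exit = shellQuote(paths.exit);
  const pid = shellQuote(paths.pid);
  return `( ( ${inner} ) > ${out} 2>&1; echo $? > ${exit} ) & echo $! > ${pid}`;
}
```

Because the redirect belongs to the subshell rather than to the parent's pipe, output keeps landing in the file after the parent stops listening.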
What
Mirrors cc-2.18's task framework, scoped to local shells:
Files
Tool result before/after
```
before — model has no idea where output went
Background command started. PIDs: 54322 (Use kill to stop)
after — explicit path, model can Read it
Background command started.
Task ID: 7f9c…
Output file: /tmp/proto///tasks/7f9c…output
PID: 54322
Read the output file at any time to check progress. You will
be notified via <task_notification> when the task completes.
Stop early with the bg_stop tool (task_id="7f9c…").
```
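
The `<task_notification>` block mentioned in the tool result could be assembled along these lines. The field list (task_id, output_file, status, exit_code, summary) comes from the commit description; the exact layout, the `TaskNotification` type, and the "unknown" placeholder for a missing exit code are assumptions for the example.

```typescript
interface TaskNotification {
  taskId: string;
  outputFile: string;
  status: "completed" | "failed" | "killed";
  exitCode: number | null; // null when no exit code was recorded
  summary: string;
}

// Builds one notification block; the layout here is illustrative only.
function buildTaskNotification(n: TaskNotification): string {
  const exitCode = n.exitCode === null ? "unknown" : String(n.exitCode);
  return [
    "<task_notification>",
    `<task_id>${n.taskId}</task_id>`,
    `<output_file>${n.outputFile}</output_file>`,
    `<status>${n.status}</status>`,
    `<exit_code>${exitCode}</exit_code>`,
    `<summary>${n.summary}</summary>`,
    "</task_notification>",
  ].join("\n");
}
```

A client would prepend one such block per drained task to the next request, so the model learns about the completion and where to read the output.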
Tests
Deferred (follow-up PRs)
Auto-background of long foreground commands and the Ctrl+B keybinding. Both need foreground shell execution rearchitected so output tees to disk from second 1 and the await can be interrupted mid-flight without killing the process, a bigger blast-radius change worth its own PR.
🤖 Generated with Claude Code
Summary by CodeRabbit
- `/bg` slash command to list and monitor long-running background shell tasks, displaying task ID, status, runtime, command, output path, and process ID.
- `bg_stop` command to gracefully stop running background tasks.
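
The graceful stop described for `bg_stop` (SIGTERM to the process group, SIGKILL after 3s) can be sketched as below. The function and parameter names are hypothetical, and `kill` and `isAlive` are injected so the example does not depend on a runtime; in Node, `kill` would presumably wrap `process.kill`, where a negative PID targets the process group, assuming the task was started as a group leader.

```typescript
type StopSignal = "SIGTERM" | "SIGKILL";
type KillFn = (pid: number, signal: StopSignal) => void;

// Sends SIGTERM to the group, then SIGKILL if it is still alive after the
// grace period. Returns false if the first signal could not be delivered
// (for example because the group has already exited).
function stopProcessGroup(
  pid: number,
  kill: KillFn,
  isAlive: () => boolean,
  graceMs: number = 3000,
): boolean {
  try {
    kill(-pid, "SIGTERM"); // negative PID addresses the process group
  } catch (_err) {
    return false;
  }
  setTimeout(() => {
    if (!isAlive()) return;
    try {
      kill(-pid, "SIGKILL");
    } catch (_err) {
      // The group exited between the liveness check and the signal.
    }
  }, graceMs);
  return true;
}
```

After a successful call, the caller would mark the task as killed in the registry, as the description states.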