🤖 feat: notify on terminal background work #3632

Open
ThomasK33 wants to merge 36 commits into main from agent-tasks-zgxs

ThomasK33 commented Jun 25, 2026

Summary

Refactors background work orchestration so intentionally backgrounded tasks, workspace turns, and workflow runs no longer force broad active task_await prompts at parent turn-end. Background launches now persist an internal notify-on-terminal attention policy, keep the parent free to finish its turn, and wake the owner only when terminal output is ready to integrate.

Background

Previously, work launched with run_in_background: true could still be re-forced through broad parent stream-end await prompts, which made background execution behave like foreground execution and caused agents to spend extra turns polling. The accepted implementation plan called for separating true blocking dependencies from background work that should notify only on terminal completion.

Implementation

  • Added a shared internal BackgroundWorkAttentionPolicy with legacy-safe blocking defaults.
  • Persisted attention policy on child task workspaces, workspace-turn handles, and workflow run records.
  • Made parent stream-end auto-resume policy-aware so only blocking work is included in active task_await prompts.
  • Added a persisted terminal attention store and idle-only drain for coalesced terminal wake-ups.
  • Routed workspace-turn terminal output through a one-shot task_await(..., timeout_secs: 0) wake-up.
  • Routed sub-agent success/failure handoffs through the same terminal notifier while preserving already-injected report/failure context.
  • Threaded notify-on-terminal behavior from background task/workflow launch and from detached foreground waits, including queued-message and timeout detach paths.
  • Updated built-in skills/tool descriptions, including new background monitor guidance for condition-driven CI/mergeability/review watchers.

Validation

  • MUX_ESLINT_CONCURRENCY=1 make static-check
  • bun test src/node/services/taskService.test.ts -t "background"
  • bun test src/node/services/taskService.test.ts -t "workspace-turn"
  • bun test src/node/services/taskService.test.ts -t "workflow"
  • bun test src/node/services/tools/task_await.test.ts src/node/services/tools/workflow_run.test.ts src/node/services/tools/workflow_resume.test.ts
  • Browser dogfooding evidence captured with agent-browser against Storybook synthetic auto-resume messages and a dev-server sandbox. Evidence included screenshots/videos/report in /tmp/mux-notify-dogfood and was attached in the workspace chat.

Risks

This touches task lifecycle orchestration, workspace-turn settlement, workflow run records, and model-facing tool guidance. Regression risk is mainly around incorrectly classifying active work as non-blocking or duplicating terminal wake-ups; mitigations include persisted legacy-default blocking behavior, explicit terminal notification idempotency, queued/streaming owner deferral, restart-safe tests, and targeted task/workflow integration coverage.

Pains

The refactor crossed several durable-state seams (workspace config, handle store, workflow store, generated skill bundles, tool docs). Electron desktop GUI dogfooding was unavailable in this headless workspace, so browser evidence was captured through Storybook and a dev-server sandbox instead.


📋 Implementation Plan

Implementation Plan: Replace Broad Background-Task Force-Awaits with Terminal Wake-Ups

Objective

Refactor Mux's background-work orchestration so agents are not broadly forced to await every active background task at turn end. Instead:

  1. True blocking dependencies still force task_await before the agent can conclude.
  2. Intentionally backgrounded work lets the current turn finish.
  3. When that background work reaches a terminal state, Mux sends a targeted synthetic wake-up so the agent can integrate the completed output.

This preserves correctness for dependency-driven parallel work while making run_in_background behave like real background execution.

Evidence gathered

Verified through repo inspection and Explore sub-agent reports:

  • src/node/services/taskService.ts
    • buildBackgroundAwaitPrompt() currently emits the MUST NOT end your turn prompt and instructs the agent to call task_await until all listed work is terminal.
    • handleStreamEnd() currently auto-resumes parent workspaces whenever active descendant tasks, active workflow runs, or active workspace-turn handles exist.
    • Queue-backgrounded foreground waiters currently get a one-shot exemption via userBackgroundedTaskIds, but a later stream-end can force the await again.
    • Successful sub-agent completion already appends a synthetic <mux_subagent_report> and uses COMPLETED_BACKGROUND_SUBAGENT_HANDOFF_PROMPT, which tells the parent to integrate injected reports without task_await.
    • Terminal sub-agent failure similarly appends <mux_subagent_failure> and uses FAILED_BACKGROUND_SUBAGENT_HANDOFF_PROMPT.
    • Workspace-turn handles (wst_...) persist terminal output in TaskHandleStore; they do not inject output into parent history, so completed output must still be retrieved with task_await.
  • src/node/services/taskHandleStore.ts
    • WorkspaceTurnTaskHandleRecord is persisted per owner workspace and is the right persisted seam for workspace-turn attention policy and terminal wake-up state.
  • src/common/schemas/project.ts
    • WorkspaceConfigSchema persists child task metadata (parentWorkspaceId, taskStatus, workflowTask, bestOf, model/thinking settings) and is the right seam for sub-agent attention policy.
  • src/common/orpc/schemas/workflow.ts and src/common/types/workflow.ts
    • WorkflowRunRecordSchema persists workflow run state and is the right seam for workflow attention policy.
  • src/node/services/aiService.ts
    • onBackgroundRunTerminal already feeds terminal workflow results back as a synthetic continuation using requireIdle: true and startStreamInBackground: true.
    • Therefore workflow work mainly needs to stop being forced through active task_await while it is still running; terminal wake-up already exists for top-level background workflow runs.
  • Existing tests to adapt/add live primarily in:
    • src/node/services/taskService.test.ts
    • src/node/services/tools/task_await.test.ts
    • workflow tool tests under src/node/services/tools/workflow_run.test.ts / workflow_resume.test.ts if schema/result notes change.
  • Synthetic message story copy lives in src/browser/features/Messages/MessageRenderer.stories.tsx and currently still contains older force-await / task_await wording.

Recommended approach

Approach A — Recommended: internal attention policy derived from launch intent

Net product LoC estimate: +850 to +1,150 product LoC.

Introduce a persisted internal attention policy for background work, but do not expose a new model-visible tool parameter in the first implementation. Derive policy from existing intent:

  • run_in_background: false / foreground waits → blocking_until_terminal.
  • run_in_background: true → notify_on_terminal.
  • Foreground waits that are backgrounded because the user queued another message → mark the handle/task notify_on_terminal instead of relying on a one-shot in-memory exemption.
  • Workflow-owned child tasks keep their existing workflow-owned behavior and are not generically woken through the parent task path.

This keeps the public tool interface small while aligning behavior with the existing meaning of run_in_background.
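
The derivation rules above can be sketched as a single pure function. This is a minimal illustration, not the PR's code: `LaunchIntent`, `detachedByQueuedMessage`, and `deriveAttentionPolicy` are hypothetical names introduced here.

```typescript
// Sketch only: LaunchIntent and deriveAttentionPolicy are hypothetical names.
type BackgroundWorkAttentionPolicy =
  | "blocking_until_terminal"
  | "notify_on_terminal";

interface LaunchIntent {
  // Mirrors the existing tool argument; null/undefined means the foreground default.
  runInBackground?: boolean | null;
  // Assumed flag: a foreground wait was detached because the user queued a message.
  detachedByQueuedMessage?: boolean;
}

function deriveAttentionPolicy(
  intent: LaunchIntent,
): BackgroundWorkAttentionPolicy {
  if (intent.runInBackground === true || intent.detachedByQueuedMessage === true) {
    return "notify_on_terminal";
  }
  // Foreground launches and anything unspecified stay conservative.
  return "blocking_until_terminal";
}
```

Keeping the derivation in one place means no new model-visible argument is needed; the policy follows from intent the tools already capture.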

Explicit v1 semantic tradeoff: in this plan, run_in_background: true is intentionally redefined as “non-blocking with terminal wake-up.” Agents that need a result before answering should either use the foreground/default mode or explicitly call task_await before finalizing. This is a behavioral change from today's effective force-await safety net, so the tool descriptions and tests must make the new contract unambiguous.

Alternatives considered

Approach B — Expose attention_policy on task/workflow tools

Net product LoC estimate: +1,050 to +1,450 product LoC.

Add a public attention_policy field to task, workflow_run, and possibly workflow_resume tool schemas. This is more flexible (blocking_until_terminal, notify_on_terminal, silent_background) but increases model-facing interface complexity and creates new misuse modes. Defer until users need explicit silent/background semantics beyond run_in_background.

If implemented later, optional tool schema fields must use .nullish() per repo convention, not .optional() alone.

Approach C — Remove the force-await guard globally

Net product LoC estimate: +80 to +160 product LoC.

Simply stop auto-resuming on active descendants/workflows. This is too risky: agents could finalize answers before dependency results arrive, breaking best-of-N, foreground task calls, workflow-owned work, and task-local background workflows.

Reject this approach.

Target design

Attention policy domain model

Add a shared internal type/schema, e.g. src/common/types/backgroundWorkAttention.ts:

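import { z } from "zod";

// Internal policy only: not exposed as a model-visible tool parameter in v1.
export const BackgroundWorkAttentionPolicySchema = z.enum([
  "blocking_until_terminal",
  "notify_on_terminal",
]);

// Union of the two literal policy names, inferred from the schema.
export type BackgroundWorkAttentionPolicy = z.infer<
  typeof BackgroundWorkAttentionPolicySchema
>;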

Phase 1 intentionally omits silent_background from persisted/public state unless implementation shows it is nearly free. The product need is notify_on_terminal, and avoiding unused policy variants keeps behavior auditable.

Policy semantics:

  • blocking_until_terminal
    • Active work blocks the owner agent from ending its turn.
    • Current buildBackgroundAwaitPrompt() behavior remains appropriate.
  • notify_on_terminal
    • Active work does not block owner turn-end.
    • On terminal success/failure/interruption, Mux wakes the owner if appropriate.
    • Wake-up prompt depends on whether output is already in context or requires task_await.

Output-delivery rules

| Work kind | Output on terminal | Wake-up behavior |
| --- | --- | --- |
| Sub-agent task | Success/failure is already injected into parent history as synthetic report/failure messages. | Wake parent with existing completed/failed handoff prompts. Prompt must tell agent not to call task_await. |
| Workspace-turn handle (wst_...) | Terminal result is persisted in TaskHandleStore, not injected into parent history. | Wake parent with IDs and instruct task_await({ task_ids: [...], timeout_secs: 0 }) to retrieve terminal output. Do not instruct repeated waiting. |
| Top-level workflow run (wfr_...) | Existing AIService.onBackgroundRunTerminal injects workflow result context and starts a continuation when current and idle. | Do not force active task_await while running. Preserve existing terminal continuation. |
| Workflow-owned child task | Workflow runner owns result delivery through workflow journals/step results. | Keep current skip behavior; do not generic-wake parent for workflow-owned children. |
| Task-local background workflow inside a child task | Child task's final report may depend on it. | Keep blocking by default unless a later design explicitly propagates terminal output safely to the parent. |

Terminal notification delivery seam

Add a small delivery seam (helper/module) so terminal wake-ups are not ad-hoc sendMessage() calls from settlement paths. Suggested name: TerminalAttentionNotifier plus a tiny TerminalAttentionNotificationStore.

Responsibilities:

  1. Persist pending terminal notifications in the owner workspace session directory, keyed by source kind + source ID, e.g.:

    type TerminalAttentionNotification = {
      id: string;
      ownerWorkspaceId: string;
      sourceKind: "agent_task" | "workspace_turn";
      sourceId: string;
      outputDelivery: "already_injected" | "requires_task_await";
      terminalOutcome: "completed" | "failed" | "interrupted" | "error";
      promptKind:
        | "subagent_report"
        | "subagent_failure"
        | "workspace_turn_terminal";
      terminalStatus?: string;
      title?: string;
      status: "pending" | "delivered" | "superseded";
      createdAt: string;
      deliveredAt?: string;
    };

    Top-level workflow terminal continuation already has a specialized path in AIService.onBackgroundRunTerminal; do not route workflows through this notifier in the first PR unless doing so avoids duplication and preserves isWorkflowInvocationCurrent().

  2. Coalesce by owner workspace before sending:

    • Combine multiple injected sub-agent reports/failures into one “integrate injected reports/failures” wake-up.
    • Combine multiple terminal workspace-turn handles into one “call task_await with these terminal IDs and timeout_secs: 0” wake-up.
    • If both injected reports and workspace-turn handles are pending, compose one prompt with two clear sections.
  3. Drain only when the owner is idle:

    • If the owner is streaming, queued, or preparing, leave notifications pending.
    • Chosen queued-turn behavior: persist pending + drain after the queued/user turn finishes. Do not enqueue the synthetic wake-up ahead of the queued turn.
    • Trigger drains from owner stream-end, task/workspace-turn terminal settlement scheduling, and startup/recovery scans.
  4. Never send while holding settlement locks:

    • Settlement paths should persist artifacts/results, enqueue pending notifications, release locks, then schedule an async drain.
    • No workspaceService.sendMessage() calls or retry loops while holding workspaceTurnSettlementLocks, workspaceEventLocks, the broad TaskService mutex, or task-finalization locks.
  5. Preserve existing parent resume/send behavior:

    • Resolve model, agentId, thinking level, and relevant resume metadata/options through the same path as existing TaskService parent auto-resumes (e.g. resolveParentAutoResumeOptions() or a small injected equivalent).
    • Send with { synthetic: true, agentInitiated: true, skipAutoResumeReset: true, requireIdle: true }.
    • Preserve parent agent selection behavior currently covered by tests such as “auto-resume preserves parent agentId from stream-end event metadata/history”.
  6. Mark delivered only after accepted send:

    • Only set deliveredAt / terminalAttentionNotifiedAt after sendMessage() succeeds.
    • If sendMessage() returns busy, leave pending and retry on the next drain trigger.
    • If currentness/supersession says the notification is stale, mark superseded with a logged reason and leave the underlying report/handle retrievable.

This notifier is the deep module for terminal attention delivery: settlement code should not need to know retry/coalescing/currentness details beyond enqueueing the right notification.
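
As a rough illustration of responsibilities 1, 2, 3, and 6, the sketch below models the store as an in-memory array and splits the drain into two pure steps around the actual send. All names here (`PendingNotification`, `enqueue`, `planDrain`, `recordSendResult`) are assumptions for illustration; persistence, locking, and the real sendMessage() call are deliberately left out.

```typescript
// Sketch only: an in-memory array stands in for the persisted notification store.
type SourceKind = "agent_task" | "workspace_turn";

interface PendingNotification {
  ownerWorkspaceId: string;
  sourceKind: SourceKind;
  sourceId: string;
  status: "pending" | "delivered" | "superseded";
  deliveredAt?: string;
}

// Keyed by source kind + source ID, so a duplicate settlement is a no-op.
function enqueue(
  store: PendingNotification[],
  n: Omit<PendingNotification, "status">,
): boolean {
  const exists = store.some(
    (e) => e.sourceKind === n.sourceKind && e.sourceId === n.sourceId,
  );
  if (exists) return false;
  store.push({ ...n, status: "pending" });
  return true;
}

// Coalesce everything pending for one owner into a single prompt.
// Returns null when the owner is not idle or nothing is pending.
function planDrain(
  store: PendingNotification[],
  ownerWorkspaceId: string,
  ownerIsIdle: boolean,
): { batch: PendingNotification[]; prompt: string } | null {
  if (!ownerIsIdle) return null;
  const batch = store.filter(
    (n) => n.ownerWorkspaceId === ownerWorkspaceId && n.status === "pending",
  );
  if (batch.length === 0) return null;
  const ids = (kind: SourceKind) =>
    batch.filter((n) => n.sourceKind === kind).map((n) => n.sourceId);
  const injected = ids("agent_task");
  const turns = ids("workspace_turn");
  const sections: string[] = [];
  if (injected.length > 0) {
    sections.push(`Integrate the injected reports/failures for: ${injected.join(", ")}`);
  }
  if (turns.length > 0) {
    sections.push(`Call task_await with task_ids [${turns.join(", ")}] and timeout_secs: 0`);
  }
  return { batch, prompt: sections.join("\n\n") };
}

// Mark delivered only after the send was accepted; a busy owner leaves the batch pending.
function recordSendResult(
  batch: PendingNotification[],
  accepted: boolean,
  nowIso: string,
): void {
  if (!accepted) return;
  for (const n of batch) {
    n.status = "delivered";
    n.deliveredAt = nowIso;
  }
}
```

A caller would run planDrain, perform the send outside any settlement lock, and then pass the outcome to recordSendResult, so a rejected send is retried on the next drain trigger.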

Explicit out of scope: background bash handles

The current active turn-end guard under discussion covers sub-agent tasks, workspace-turn handles, and workflow runs. Background bash(run_in_background=true) processes are surfaced by the bash/task await tooling but are not part of the TaskService.handleStreamEnd() descendant/workflow force-await inventory. This plan does not add attention policy for background bash handles. If implementation discovers a separate bash force-await path, treat that as a follow-up design rather than silently folding it into this refactor.

Implementation phases

Phase 0 — Baseline and invariants

  1. Run targeted existing tests before code changes to establish baseline:
    • bun test src/node/services/taskService.test.ts -t "auto-resumes a parent workspace until background workflow runs finish"
    • bun test src/node/services/taskService.test.ts -t "does not auto-resume for queue-backgrounded descendants"
    • bun test src/node/services/taskService.test.ts -t "tasks-completed auto-resume preserves parent agentId"
    • bun test src/node/services/tools/task_await.test.ts -t "backgrounded"
  2. Confirm current UI copy/story expectations:
    • src/browser/features/Messages/MessageRenderer.stories.tsx
  3. Document pre-change behavior in test names/comments where useful, but avoid tautological prose tests.

Quality gate: baseline targeted tests either pass or failures are recorded as unrelated pre-existing failures before implementation proceeds.

Phase 1 — Persist attention policy on each work-handle kind

1. Add shared policy schema/type

Create a small shared type module for BackgroundWorkAttentionPolicy or colocate the schema in an existing common task/workflow type module if a better existing home is obvious.

Defensive checks:

  • Use an exhaustive Record<BackgroundWorkAttentionPolicy, ...> anywhere prompts/labels are mapped.
  • Treat missing persisted policy as blocking_until_terminal for backward compatibility.
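
Both defensive checks can be shown in a few lines. This is a sketch under assumptions: `POLICY_BLOCKS_TURN_END` and `resolvePersistedPolicy` are hypothetical names, and the policy type is restated as a plain union so the example stands alone.

```typescript
type BackgroundWorkAttentionPolicy =
  | "blocking_until_terminal"
  | "notify_on_terminal";

// Record<...> makes the mapping exhaustive: adding a policy variant without
// adding an entry here is a compile error.
const POLICY_BLOCKS_TURN_END: Record<BackgroundWorkAttentionPolicy, boolean> = {
  blocking_until_terminal: true,
  notify_on_terminal: false,
};

// Legacy records were written before the field existed, so a missing value
// resolves to the conservative blocking default.
function resolvePersistedPolicy(
  persisted: BackgroundWorkAttentionPolicy | null | undefined,
): BackgroundWorkAttentionPolicy {
  return persisted ?? "blocking_until_terminal";
}
```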

2. Sub-agent task workspaces

Update WorkspaceConfigSchema in src/common/schemas/project.ts with optional persisted metadata such as:

taskAttentionPolicy: BackgroundWorkAttentionPolicySchema.optional().meta({
  description: "How owner workspace stream-end handles this child task while it is active.",
})

Implementation notes:

  • Use taskAttentionPolicy rather than generic attentionPolicy to avoid confusion with unrelated workspace metadata.
  • Populate it from TaskService.create() based on the call's launch mode.
  • Default legacy/missing values to blocking_until_terminal.

3. Workspace-turn handles

Update WorkspaceTurnTaskHandleRecord and WorkspaceTurnTaskHandleRecordSchema in src/node/services/taskHandleStore.ts with:

  • attentionPolicy?: BackgroundWorkAttentionPolicy
  • terminalAttentionNotifiedAt?: string (mandatory for restart-safe one-shot terminal wake-up dedupe).

Populate attentionPolicy in TaskService.createWorkspaceTurn().

4. Workflow runs

Update WorkflowRunRecordSchema in src/common/orpc/schemas/workflow.ts with optional attentionPolicy.

Thread policy through:

  • src/node/services/tools/workflow_run.ts
  • src/node/services/tools/workflow_resume.ts whenever workflow_resume({ run_in_background: true }) resumes or continues a run in the background; resumed background runs must persist/retain notify_on_terminal so they do not re-enter active force-await behavior.
  • src/node/services/workflows/WorkflowService.ts create/start input and run creation.

Default existing runs to blocking_until_terminal unless the runtime can infer they were started via a background tool invocation. For first implementation, explicit new runs are enough; legacy runs can remain conservative.

Quality gate: schema/type tests parse old records without policy and new records with policy; make typecheck should catch every place that needs threading.

Phase 2 — Derive policy from launch/wait intent

1. task tool launch intent

In src/node/services/tools/task.ts and TaskService.create() / TaskService.createWorkspaceTurn():

  • run_in_background: true → notify_on_terminal.
  • run_in_background: false → blocking_until_terminal.

Keep the model-facing task tool schema unchanged in Phase 1/2. Update only descriptions/notes to say Mux will wake the workspace when intentionally backgrounded work completes.

2. Detached foreground waiters

Replace the one-shot userBackgroundedTaskIds mental model with durable policy escalation for every foreground-wait detachment path:

  • Queued-message detachment: when waitForAgentReport() or waitForWorkspaceTurn() is backgrounded because a queued message arrives (ForegroundWaitBackgroundedError path), persist that task/handle as notify_on_terminal.
  • Foreground wait timeout/limit detachment: when a foreground task/workflow/workspace-turn wait exceeds its foreground wait budget and continues in the background, persist or retain notify_on_terminal for that handle/run as well.
  • Any future “sent to background but still running” path must route through the same helper, e.g. markBackgroundWorkNotifyOnTerminal(...), rather than adding another in-memory exemption.

Keep durable attention policy separate from markTaskForegroundRelevant():

  • markTaskForegroundRelevant() may still manage transient waiter bookkeeping, but it must not clear durable notify_on_terminal policy.
  • An explicit later task_await on a notify handle is allowed and should resolve/reject normally, but it must not automatically re-promote the persisted policy to blocking_until_terminal.
  • If a future design wants explicit re-promotion, it needs a separate user/model-visible action and tests; do not do it implicitly in this refactor.
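
A minimal sketch of the durable escalation, assuming a Map in place of the persisted store. The plan only names markBackgroundWorkNotifyOnTerminal(...); the signature below, along with `WorkRecord` and `policyAfterExplicitAwait`, is a guess made for illustration.

```typescript
// Sketch only: a Map stands in for the persisted handle/task store.
type BackgroundWorkAttentionPolicy =
  | "blocking_until_terminal"
  | "notify_on_terminal";

interface WorkRecord {
  id: string;
  attentionPolicy?: BackgroundWorkAttentionPolicy;
}

// Single escalation helper that every "sent to background" path would call.
// The signature is assumed; the plan does not specify it.
function markBackgroundWorkNotifyOnTerminal(
  store: Map<string, WorkRecord>,
  id: string,
): void {
  const record = store.get(id);
  if (record === undefined) throw new Error(`unknown background work: ${id}`);
  store.set(id, { ...record, attentionPolicy: "notify_on_terminal" });
}

// An explicit later await only reads the record; it never writes the policy
// back, so notify work is not re-promoted to blocking.
function policyAfterExplicitAwait(
  store: Map<string, WorkRecord>,
  id: string,
): BackgroundWorkAttentionPolicy {
  return store.get(id)?.attentionPolicy ?? "blocking_until_terminal";
}
```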

3. Workflow launches

For workflow_run({ run_in_background: true }) and workflow_resume({ run_in_background: true }):

  • Persist or retain attentionPolicy: "notify_on_terminal" on the workflow run.
  • Preserve existing AIService.onBackgroundRunTerminal continuation for top-level background runs.
  • Update workflow tool descriptions: if backgrounded, Mux may wake with the terminal workflow result; call task_await only when the current request depends on immediate output.

Quality gate: targeted unit tests prove policy is stored for new sub-agent tasks, workspace-turn handles, and background workflow runs.

Phase 3 — Filter active work at parent stream-end

Refactor the parent branch of TaskService.handleStreamEnd() so it builds an inventory of active work:

type ActiveBackgroundWorkInventory = {
  blockingTaskIds: string[];
  notifyTaskIds: string[];
  blockingWorkflowRunIds: string[];
  notifyWorkflowRunIds: string[];
};

Implementation details:

  1. Add helpers on TaskService:
    • resolveAgentTaskAttentionPolicy(taskId, cfg)
    • resolveWorkspaceTurnAttentionPolicy(ownerWorkspaceId, handleId)
    • resolveWorkflowRunAttentionPolicy(workspaceId, runId)
    • listBlockingActiveDescendantTaskIds(ownerWorkspaceId, cfg)
    • listBlockingActiveWorkflowRunIds(ownerWorkspaceId, runIds)
  2. Keep workflow-owned descendants excluded from generic parent prompts.
  3. Keep active task-local background work inside child tasks blocking by default.
  4. For kind="workspace" task workspaces, do not finalize the wst_... handle while that workspace still has active descendants, workspace-turn handles, or task-local workflow dependencies that would make its final output incomplete.
  5. Replace current blockingTaskIds = getBlockingTaskIds(activeTaskIds) logic with policy-aware filtering.
  6. Only call buildBackgroundAwaitPrompt() for blocking_until_terminal work.
  7. If only notify_on_terminal work remains active:
    • finalize normal stream-end/workspace-turn handling as appropriate;
    • clear parent auto-resume counters;
    • do not send a synthetic active-work prompt.

Important behavior changes:

  • Existing tests for default active descendants should still pass because missing/default policy is blocking.
  • Queue-backgrounded tests should change: a queue-backgrounded task remains non-blocking across future stream-ends, not just one stream-end.
  • Mixed active work should prompt only for blocking IDs; notify IDs must not appear in the task_await prompt.

Quality gate: taskService.test.ts proves default blocking behavior remains, notify_on_terminal active work does not force await, and mixed blocking/notify prompts include only blocking IDs.
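
The policy-aware filtering can be sketched as a partition over active work. The `ActiveWork` shape, `buildInventory`, and `idsForAwaitPrompt` are hypothetical; the sketch also assumes workspace-turn handles are counted alongside tasks, which the plan does not spell out.

```typescript
type BackgroundWorkAttentionPolicy =
  | "blocking_until_terminal"
  | "notify_on_terminal";

// Assumed input shape for illustration.
interface ActiveWork {
  id: string;
  kind: "task" | "workflow_run";
  attentionPolicy?: BackgroundWorkAttentionPolicy;
  workflowOwned?: boolean;
}

type ActiveBackgroundWorkInventory = {
  blockingTaskIds: string[];
  notifyTaskIds: string[];
  blockingWorkflowRunIds: string[];
  notifyWorkflowRunIds: string[];
};

function buildInventory(active: ActiveWork[]): ActiveBackgroundWorkInventory {
  const inv: ActiveBackgroundWorkInventory = {
    blockingTaskIds: [],
    notifyTaskIds: [],
    blockingWorkflowRunIds: [],
    notifyWorkflowRunIds: [],
  };
  for (const w of active) {
    // Workflow-owned descendants stay out of generic parent prompts.
    if (w.workflowOwned === true) continue;
    // Missing policy is treated as blocking for backward compatibility.
    const blocking =
      (w.attentionPolicy ?? "blocking_until_terminal") === "blocking_until_terminal";
    if (w.kind === "task") {
      (blocking ? inv.blockingTaskIds : inv.notifyTaskIds).push(w.id);
    } else {
      (blocking ? inv.blockingWorkflowRunIds : inv.notifyWorkflowRunIds).push(w.id);
    }
  }
  return inv;
}

// Only blocking IDs may appear in the active task_await prompt; an empty
// result means no synthetic active-work prompt is sent.
function idsForAwaitPrompt(inv: ActiveBackgroundWorkInventory): string[] {
  return [...inv.blockingTaskIds, ...inv.blockingWorkflowRunIds];
}
```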

Phase 4 — Terminal wake-ups for notify-on-terminal work

1. Reuse existing sub-agent terminal handoffs

For successful sub-agent tasks:

  • Preserve deliverReportToParent() and the existing injected <mux_subagent_report> artifact.
  • Instead of directly sending COMPLETED_BACKGROUND_SUBAGENT_HANDOFF_PROMPT from finalization, enqueue a TerminalAttentionNotification with outputDelivery: "already_injected" and let TerminalAttentionNotifier drain it when the owner is idle.
  • Continue to skip generic parent handoff for workflow-owned child tasks.

For terminal sub-agent failures:

  • Preserve <mux_subagent_failure> injection.
  • Enqueue a failure-shaped TerminalAttentionNotification with outputDelivery: "already_injected" and let the notifier compose the failure handoff prompt when it drains.

Foreground waiter consumption rule:

  • Preserve existing hadForegroundWaiters skip semantics. If an active foreground task_await / waitForAgentReport() consumed the sub-agent report or failure, do not enqueue a terminal notification for that source.
  • Apply the same rule to workspace-turn settlement: if waitForWorkspaceTurn() has an active foreground waiter that receives the terminal result, do not enqueue a duplicate terminal wake-up.
  • This rule is independent of durable attention policy; notify_on_terminal work can still be explicitly awaited, but a successful explicit await suppresses the later synthetic wake-up for that terminal result.

Coalescing recommendation:

  • Keep existing “wake only once the last active descendant settles” behavior, but compute “active descendants” as active owner-visible descendants, excluding workflow-owned children and already-terminal records.
  • If multiple notify tasks complete at nearly the same time, one parent wake-up should integrate all injected reports.

2. Add workspace-turn terminal wake-up

Workspace-turn output is not injected, so add a new prompt constant near existing background handoff prompts:

const COMPLETED_BACKGROUND_WORKSPACE_TURN_PROMPT =
  "Background workspace turn(s) have completed: ... " +
  "Call task_await now with task_ids: [...] and timeout_secs: 0 to retrieve their terminal output, then integrate it. " +
  "These handles are already terminal; do not repeatedly wait if task_await returns terminal status.";

Implementation details:

  • Trigger from settleWorkspaceTurn() on first active → terminal transition when attentionPolicy === "notify_on_terminal".
  • While inside workspaceTurnSettlementLocks, persist the terminal record and enqueue a pending TerminalAttentionNotification; do not call sendMessage() from inside the lock.
  • Let TerminalAttentionNotifier send to record.ownerWorkspaceId after the lock is released with:
    • synthetic: true
    • agentInitiated: true
    • skipAutoResumeReset: true
    • requireIdle: true
  • If owner workspace is busy, leave the notification pending and drain after the owner reaches idle.
  • Include exact wst_... IDs. Use timeout_secs: 0 in prompt because the handle is terminal and this should be retrieval, not waiting.

Workspace-turn settlement may need a small interface adjustment so waiter consumption is observable:

  • Change settleWorkspaceTurnWaiters() (or its caller) to return whether any foreground waiter consumed the terminal result.
  • Only enqueue workspace_turn_terminal notifications when no foreground waiter consumed the result.

Workspace-turn wake-up idempotency is mandatory:

  • Persist terminalAttentionNotifiedAt (or an equivalent notification token) on the WorkspaceTurnTaskHandleRecord only after the synthetic wake-up is accepted/sent.
  • Re-reading a terminal handle during stale recovery or duplicate settlement must not send a second wake-up if the marker is present.
  • If send fails because the owner is busy, do not mark notified; retry/defer according to the busy/queued-turn policy below.
  • Add a restart-style unit test that writes a terminal notify handle with/without the marker and verifies duplicate wake-ups are prevented.
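
The marker-based dedupe might look roughly like the following. This is a sketch: `isTerminal` is an assumed simplification of the real status field, and `needsTerminalWakeUp` and `afterSendAttempt` are hypothetical helpers, not functions from the PR.

```typescript
type BackgroundWorkAttentionPolicy =
  | "blocking_until_terminal"
  | "notify_on_terminal";

interface WorkspaceTurnHandle {
  id: string;
  isTerminal: boolean; // assumed simplification of the persisted status
  attentionPolicy?: BackgroundWorkAttentionPolicy;
  terminalAttentionNotifiedAt?: string;
}

// A re-read after restart or duplicate settlement sends nothing once the
// marker is present.
function needsTerminalWakeUp(h: WorkspaceTurnHandle): boolean {
  return (
    h.isTerminal &&
    h.attentionPolicy === "notify_on_terminal" &&
    h.terminalAttentionNotifiedAt === undefined
  );
}

// The marker is written only when the send was accepted; a busy owner leaves
// the record unchanged so a later drain can retry.
function afterSendAttempt(
  h: WorkspaceTurnHandle,
  accepted: boolean,
  nowIso: string,
): WorkspaceTurnHandle {
  return accepted ? { ...h, terminalAttentionNotifiedAt: nowIso } : h;
}
```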

3. Preserve workflow terminal continuation

For top-level background workflows:

  • Do not introduce a second workflow wake-up path if AIService.onBackgroundRunTerminal already handles it.
  • Ensure active parent stream-end filtering treats workflow runs with attentionPolicy === "notify_on_terminal" as non-blocking.
  • Keep isWorkflowInvocationCurrent() and requireIdle: true semantics unchanged.

4. Busy/queued-turn gating, currentness, and supersession

Terminal wake-ups must have an explicit owner-workspace gating policy:

  • If aiService.isStreaming(ownerWorkspaceId) is true, do not interrupt the active stream.
  • If workspaceService.hasPendingQueuedOrPreparingTurn(ownerWorkspaceId) is true, do not inject ahead of the user's queued/preparing turn.
  • Preferred behavior for notify terminal wake-ups is defer-until-idle, not drop, when the terminal output is still relevant and a durable notification marker is absent.
  • If a wake-up is deliberately skipped as superseded/currentness-failed, log the reason and leave the terminal artifact/handle retrievable via task_list / task_await.

Work-kind-specific currentness:

  • Workflows already check isWorkflowInvocationCurrent() before terminal continuation; preserve that behavior.
  • Sub-agent terminal handoff currently wakes idle parent after report/failure injection. While touching the path, add queued/preparing-turn protection equivalent to the parent active auto-resume guard; do not inject a sub-agent completion handoff ahead of a queued user turn.
  • Workspace-turn terminal wake-up should use requireIdle: true through the notifier drain. If requireIdle fails because the workspace is busy, leave the notification pending without marking terminalAttentionNotifiedAt; a later idle/stream-end drain trigger should retry it.

If implementation needs stronger stale-result handling, add helper(s) modeled on workflow currentness:

isWorkspaceTurnInvocationCurrent(ownerWorkspaceId, handleId)
isAgentTaskInvocationCurrent(parentWorkspaceId, childTaskId)

Do not silently broaden wake-ups beyond existing currentness semantics; add tests for whichever skip/defer policy is implemented.
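
One way to make the gating policy explicit is a single decision function over the owner's state. The types and the ordering of checks below are assumptions for illustration; in particular, checking currentness before busyness is a choice of this sketch, not something the plan mandates.

```typescript
type WakeUpDecision = "send" | "defer" | "supersede" | "skip_already_notified";

// Assumed snapshot of the owner workspace at drain time.
interface OwnerState {
  isStreaming: boolean;
  hasPendingQueuedOrPreparingTurn: boolean;
}

interface WakeUpCandidate {
  isCurrent: boolean; // result of a currentness check for this work kind
  alreadyNotified: boolean; // durable notification marker is present
}

function decideWakeUp(
  owner: OwnerState,
  candidate: WakeUpCandidate,
): WakeUpDecision {
  if (candidate.alreadyNotified) return "skip_already_notified";
  // Stale results are superseded; the artifact stays retrievable via task_await.
  if (!candidate.isCurrent) return "supersede";
  // Never interrupt an active stream or jump ahead of a queued user turn.
  if (owner.isStreaming || owner.hasPendingQueuedOrPreparingTurn) return "defer";
  return "send";
}
```

Returning a named decision rather than a boolean keeps the defer and supersede cases distinguishable in logs and tests.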

Quality gate: terminal wake-up tests prove sub-agent reports/failures still wake once, workspace-turn terminal handles wake with task_await timeout_secs: 0, and workflow terminal continuation is not duplicated.

Phase 5 — Update tool descriptions, built-in skills, and synthetic story copy

Update wording so model instructions match behavior:

  • src/common/utils/tools/toolDefinitions.ts
    • task tool: run_in_background: true returns immediately; Mux may wake the workspace when the task completes. task_await is for when the current request depends on the output or the agent chooses to inspect progress.
    • task_await tool: remove or soften the line that says synthetic follow-ups with active background work are always blocking, or narrow it to “blocking dependency follow-ups”.
    • workflow_run / workflow_resume: background runs may wake on terminal result; task_await remains available for explicit progress checks.
  • Built-in agent skills:
    • $workflow-authoring: explain the internal persisted attention policy for workflow runs and workflow-owned steps.
    • $loop: explain how loop/orchestration authors should choose foreground/default execution vs run_in_background: true based on whether the next decision depends on the result.
    • These skill docs must explicitly say attentionPolicy is internal in v1 and is not a field authors/agents pass directly.
  • src/browser/features/Messages/MessageRenderer.stories.tsx
    • Replace hardcoded MUST NOT end examples with updated blocking-only wording and terminal wake-up examples.
    • Ensure completed sub-agent story no longer says to call task_await for already-injected reports.

Built-in skill policy guidance to add

Add equivalent guidance to $workflow-authoring and $loop so agents understand the inferred policy without needing a new tool argument:

| Case | Public author/agent action | Internal persisted policy |
| --- | --- | --- |
| Foreground/default task, workflow_run, workflow_resume | Omit run_in_background or set it to false. | blocking_until_terminal |
| Background task, workflow_run, workflow_resume | Set run_in_background: true only when unrelated work can proceed. | notify_on_terminal |
| Foreground wait detached by queued user message | System persists detachment automatically. | notify_on_terminal |
| Foreground wait timeout/limit continues in background | System persists detachment automatically. | notify_on_terminal |
| Explicit task_await on notify work | Allowed when a later decision needs the result. | Remains notify_on_terminal; if the waiter consumes the terminal result, no duplicate wake-up is sent. |
| Workflow-owned agent(), parallel(), pipeline(), nested workflow() | Workflow conductor waits/replays through durable workflow journals. | Workflow-owned/blocking inside the workflow; no generic parent wake-up for workflow-owned child tasks. |
| Background bash handles | Use existing bash process guidance. | Out of scope; no new attention policy in v1. |
| Silent/no-wake background work | Not supported in v1. | Future design only; do not document as settable. |
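
The case table above can be read as a small inference function. The following is a minimal sketch, not the PR's implementation: `LaunchContext` and `inferAttentionPolicy` are hypothetical names, and only the two policy values and the `run_in_background` flag come from the plan.

```typescript
// The two policy values named in the plan; everything else here is illustrative.
type BackgroundWorkAttentionPolicy = "blocking_until_terminal" | "notify_on_terminal";

// Hypothetical description of how a piece of work was launched or detached.
interface LaunchContext {
  runInBackground?: boolean; // mirrors the public run_in_background argument
  detachedByQueuedMessage?: boolean; // foreground wait detached by a queued user message
  detachedByWaitLimit?: boolean; // foreground wait timed out or hit the wait limit
}

// Infer the internal policy. Callers never pass the policy directly in v1.
function inferAttentionPolicy(ctx: LaunchContext): BackgroundWorkAttentionPolicy {
  if (ctx.runInBackground === true) return "notify_on_terminal";
  if (ctx.detachedByQueuedMessage === true || ctx.detachedByWaitLimit === true) {
    return "notify_on_terminal";
  }
  // Foreground/default launches stay blocking.
  return "blocking_until_terminal";
}

console.log(inferAttentionPolicy({})); // prints "blocking_until_terminal"
console.log(inferAttentionPolicy({ runInBackground: true })); // prints "notify_on_terminal"
```

Keeping the inference in one place is one way to make sure tool descriptions, skill docs, and persistence code agree on the same mapping.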

Suggested $workflow-authoring wording:

Mux persists an internal attention policy for background work. Workflow scripts do not set this field directly in v1. Use foreground/default workflow runs when the caller needs the result before continuing. Use run_in_background: true only when unrelated work can proceed; Mux treats that as non-blocking and will wake the owning workspace when the run reaches a terminal state. Workflow-owned agent(), parallel(), pipeline(), and nested workflow() steps are blocking from the workflow conductor’s perspective because their outputs are durable step results.

Suggested $loop wording:

Before dispatching work, decide whether the loop’s next decision depends on the result. If yes, use foreground/default mode or explicitly task_await the returned ID. If no, use run_in_background: true; Mux assigns notify-on-terminal behavior and wakes the workspace later. Do not use background work to hide an unbounded polling loop; record IDs and convergence conditions.

Avoid tautological tests that assert exact prompt prose unless the behavior depends on a specific machine-readable ID list.

Quality gate: tool descriptions, built-in skill docs, and Storybook stories all describe the same inferred-policy behavior without advertising an unsupported public attentionPolicy / attention_policy tool argument; Storybook renders the new synthetic-message states without layout regressions.

Test plan

Unit tests to add/update

src/node/services/taskService.test.ts

  1. Default blocking remains unchanged
    • Active child task with missing/blocking_until_terminal policy still triggers parent auto-resume with buildBackgroundAwaitPrompt().
  2. Notify sub-agent does not force active await
    • Active child task with taskAttentionPolicy: "notify_on_terminal" does not trigger parent auto-resume at stream-end.
  3. Queue-backgrounded foreground wait persists notify policy
    • A foreground wait backgrounded by a queued message is not forced again on later stream-end.
    • Replace/retire the current one-shot exemption expectation.
  4. Foreground wait timeout/limit persists notify policy
    • A foreground wait that times out or exceeds the foreground wait limit and continues in the background is not forced again on later stream-end.
  5. Explicit task_await on notify work does not re-promote to blocking or duplicate wake
    • The await may resolve/reject normally, but persisted policy remains notify_on_terminal.
    • If the foreground waiter consumes the terminal result, no pending terminal notification is enqueued for the same source.
  6. Mixed blocking + notify active tasks prompts only blocking IDs
    • Prompt includes blocking task IDs.
    • Prompt excludes notify task IDs.
  7. Terminal notifier preserves parent resume options
    • Completed sub-agent wake-up still preserves parent agentId, model, and thinking-level behavior from stream-end metadata/history/defaults.
    • Synthetic send options include synthetic, agentInitiated, skipAutoResumeReset, and requireIdle.
  8. Notify workspace-turn does not force active await
    • Active wst_... with notify policy does not trigger parent active-await prompt.
  9. Notify workspace-turn wakes on terminal completion
    • settleWorkspaceTurn() terminal transition enqueues a pending notification, then the notifier sends a synthetic prompt to the owner workspace after locks are released and the owner is idle.
    • Prompt includes exact handle ID and timeout_secs: 0 guidance.
    • Duplicate terminal settlement does not enqueue or send a duplicate prompt.
    • Restart-style stale recovery does not send a duplicate prompt when terminalAttentionNotifiedAt is already present.
  10. Parent streaming / queued turn defers terminal wake-up
    • Notify sub-agent completion while the parent is streaming creates a pending notification, sends no immediate wake-up, then drains once the parent stream ends and is idle.
    • Notify completion while a user turn is queued/preparing does not inject ahead of the user turn; it persists pending and drains after that turn completes.
  11. Pending terminal notification restart safety
    • A pending notification written before restart/recovery is not lost.
    • Delivered/superseded notifications do not send again after restart.
  12. Terminal notification coalescing
    • Multiple pending terminal notifications for one owner drain as one synthetic wake-up where possible.
  13. Workspace-turn target with active dependencies does not finalize prematurely
    • A kind="workspace" task workspace with active descendants/workflows does not settle its wst_... handle before those dependencies are terminal or explicitly non-blocking under the target workspace's own policy.
  14. Top-level background workflow does not force active await
    • Active workflow run with notify policy does not trigger buildBackgroundAwaitPrompt() at parent stream-end.
    • Existing onBackgroundRunTerminal continuation remains responsible for terminal wake-up.
  15. Best-of / variants coalesce notify wake-ups
    • Multiple notify sibling tasks spawned via n or variants do not cause repeated wake-up spam; the idle parent wakes once with all injected terminal reports available.
  16. Task-local workflows remain blocking
    • Child task with active task-local workflow still receives promptTaskForBackgroundAwait() before final report unless explicitly handled by workflow ownership.
  17. Workflow-owned child tasks remain excluded
    • Existing generic parent handoff skip behavior remains.
  18. Legacy/missing policy remains blocking
    • Old records without policy continue to use current force-await behavior.
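
Several of the cases above (default blocking, mixed blocking + notify, legacy/missing policy) reduce to one filtering step at stream-end. The sketch below illustrates that step under assumptions: `ActiveWork`, `partitionActiveWork`, and `shouldForceAwait` are hypothetical names, and the real handleStreamEnd() logic may be structured differently.

```typescript
type AttentionPolicy = "blocking_until_terminal" | "notify_on_terminal";

// Hypothetical record for any active task, workspace-turn handle, or workflow run.
interface ActiveWork {
  id: string; // e.g. task_..., wst_..., wfr_...
  attentionPolicy?: AttentionPolicy; // absent on legacy records
}

// Legacy/missing policy resolves to blocking, matching the plan's default.
function resolvePolicy(work: ActiveWork): AttentionPolicy {
  return work.attentionPolicy ?? "blocking_until_terminal";
}

// Split active work so that only blocking IDs reach the active-await prompt.
function partitionActiveWork(active: readonly ActiveWork[]): {
  blockingIds: string[];
  notifyIds: string[];
} {
  const blockingIds: string[] = [];
  const notifyIds: string[] = [];
  for (const work of active) {
    if (resolvePolicy(work) === "blocking_until_terminal") {
      blockingIds.push(work.id);
    } else {
      notifyIds.push(work.id);
    }
  }
  return { blockingIds, notifyIds };
}

// Stream-end decision: force an await only when at least one blocking item remains.
function shouldForceAwait(active: readonly ActiveWork[]): boolean {
  return partitionActiveWork(active).blockingIds.length > 0;
}
```

Tests written against a helper like this can assert on ID lists and send/no-send decisions rather than on prompt prose.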

src/node/services/tools/task_await.test.ts

  1. Backgrounded foreground wait returns running/backgrounded status as before.
  2. Completed workspace-turn retrieval with timeout_secs: 0 returns terminal report/error without blocking.
  3. Tool notes are updated if wording changes.

Workflow tests

  1. workflow_run({ run_in_background: true }) persists attentionPolicy: "notify_on_terminal".
  2. workflow_resume({ run_in_background: true }) persists or retains attentionPolicy: "notify_on_terminal".
  3. Legacy/missing policy workflow records still parse and default to blocking in active-await filtering.
  4. Existing workflow terminal continuation tests still pass and are not duplicated by TaskService wake-ups.

Schema tests

  1. WorkspaceConfigSchema parses old child workspaces without taskAttentionPolicy.
  2. WorkspaceTurnTaskHandleRecordSchema parses old handles without attentionPolicy.
  3. WorkflowRunRecordSchema parses old runs without attentionPolicy.
  4. Invalid attention policy values fail schema parse.
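
The schema expectations can be illustrated without any schema library. This is a hand-rolled sketch under assumptions: `parseAttentionPolicy` and `parseRunRecord` are hypothetical helpers, and the actual WorkspaceConfigSchema, WorkspaceTurnTaskHandleRecordSchema, and WorkflowRunRecordSchema definitions are not shown in this PR description.

```typescript
type AttentionPolicy = "blocking_until_terminal" | "notify_on_terminal";

const ATTENTION_POLICIES: readonly string[] = [
  "blocking_until_terminal",
  "notify_on_terminal",
];

// Parse an optional policy field: a missing value stays undefined (legacy record),
// a known value passes through, anything else is rejected.
function parseAttentionPolicy(raw: unknown): AttentionPolicy | undefined {
  if (raw === undefined) return undefined;
  if (typeof raw === "string" && ATTENTION_POLICIES.indexOf(raw) !== -1) {
    return raw as AttentionPolicy;
  }
  throw new Error(`Invalid attention policy: ${String(raw)}`);
}

// Hypothetical persisted record with an optional policy field.
function parseRunRecord(raw: { id: string; attentionPolicy?: unknown }): {
  id: string;
  attentionPolicy?: AttentionPolicy;
} {
  return { id: raw.id, attentionPolicy: parseAttentionPolicy(raw.attentionPolicy) };
}
```

Leaving the parsed field undefined for old records, and resolving the blocking default at read time, keeps persisted legacy data untouched.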

Built-in skill documentation checks

  1. $workflow-authoring describes internal inferred attentionPolicy behavior for foreground/default workflow runs, background workflow runs, and workflow-owned conductor steps.
  2. $loop describes how to choose foreground/default execution vs run_in_background: true based on whether the next loop decision depends on the result.
  3. Neither skill tells agents/authors to pass a public attentionPolicy or attention_policy field in v1.
  4. Neither skill documents unsupported silent_background behavior as available.

Terminal notification store tests

  1. Pending notifications persist and reload from the owner workspace session directory.
  2. Enqueueing the same source kind/source ID is idempotent.
  3. Drain coalesces pending notifications by owner workspace and output-delivery kind.
  4. Delivered/superseded notifications are not redelivered.
  5. Notification prompt composition uses persisted promptKind / terminalOutcome rather than guessing from history.
  6. Drain does not call workspaceService.sendMessage() when the owner is streaming, queued, or preparing.
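
The store behaviors listed above (idempotent enqueue, coalesced drain, no redelivery, idle gating) can be sketched with an in-memory stand-in. This is only an illustration: the real store is persisted in the owner workspace session directory, and the class name, record shape, and `drain` signature here are assumptions. Only `enqueueIfAbsent` is a name that appears elsewhere in this PR.

```typescript
type SourceKind = "subagent" | "workspace_turn";

// Hypothetical notification record; field names are illustrative.
interface TerminalNotification {
  ownerWorkspaceId: string;
  sourceKind: SourceKind;
  sourceId: string;
  terminalOutcome: "success" | "failure" | "interrupted";
  status: "pending" | "delivered";
}

class InMemoryTerminalAttentionStore {
  private readonly records = new Map<string, TerminalNotification>();

  private key(kind: SourceKind, id: string): string {
    return `${kind}:${id}`;
  }

  // Idempotent per source kind + source ID; returns true only for a new record.
  enqueueIfAbsent(n: Omit<TerminalNotification, "status">): boolean {
    const key = this.key(n.sourceKind, n.sourceId);
    if (this.records.has(key)) return false;
    this.records.set(key, { ...n, status: "pending" });
    return true;
  }

  // Drain every pending record for one owner as a single batch, but only when
  // the owner is idle. Delivered records are never returned again.
  drain(ownerWorkspaceId: string, ownerIsIdle: boolean): TerminalNotification[] {
    if (!ownerIsIdle) return [];
    const batch: TerminalNotification[] = [];
    this.records.forEach((record) => {
      if (record.ownerWorkspaceId === ownerWorkspaceId && record.status === "pending") {
        record.status = "delivered";
        batch.push(record);
      }
    });
    return batch;
  }
}
```

A caller would send one synthetic wake-up per non-empty batch, which is what keeps sibling completions from producing repeated wake-ups.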

Validation commands

Run after implementation:

bun test src/node/services/taskService.test.ts -t "background"
bun test src/node/services/taskService.test.ts -t "workspace-turn"
bun test src/node/services/taskService.test.ts -t "workflow"
bun test src/node/services/tools/task_await.test.ts
bun test src/node/services/tools/workflow_run.test.ts
bun test src/node/services/tools/workflow_resume.test.ts
make typecheck
make lint

Before declaring done, run the broader static gate if local resources allow:

MUX_ESLINT_CONCURRENCY=1 make static-check

If make static-check is too heavy or fails for known environmental reasons, record the exact blocker and the targeted passing checks.

Dogfooding plan

Dogfooding must happen between implementation phases and before PR-ready claims. Collect screenshots and a short video so reviewers can verify behavior.

Setup

Use an isolated desktop sandbox so the test does not affect the developer's real Mux state:

MUX_E2E=1 make dev-desktop-sandbox

If provider/project state should be clean:

MUX_E2E=1 make dev-desktop-sandbox DEV_DESKTOP_SANDBOX_ARGS="--clean-projects"

Use agent-browser for interaction and evidence capture. At dogfood time, load current CLI instructions:

agent-browser skills get core
agent-browser skills get electron
agent-browser skills get dogfood

Scenario 1 — Background sub-agent no longer forces active await

  1. In the sandbox Mux app, open a trusted test project/workspace.
  2. Ask the agent to start a background sub-agent, e.g. “Spawn an explore sub-agent in the background to inspect src/node/services/taskService.ts, then continue with a short acknowledgement.”
  3. Verify the parent turn ends naturally while the child task remains active.
  4. Verify there is no synthetic MUST NOT end your turn / active task_await prompt.
  5. While the child remains active, send an unrelated short prompt and verify the parent can answer without being forced to await the child.

Evidence:

  • Screenshot of the task tool card showing background task metadata and active/running status.
  • Screenshot of the parent answering an unrelated follow-up while the task is still running.
  • Video segment showing no forced task_await turn appears.

Scenario 2 — Terminal sub-agent wake-up integrates injected report

  1. Let the background sub-agent complete while the parent workspace is idle.
  2. Verify the parent wakes automatically.
  3. Verify the parent response integrates the injected <mux_subagent_report>.
  4. Verify the parent does not call task_await just to retrieve an already-injected sub-agent report.

Evidence:

  • Screenshot of the terminal wake-up response integrating report content.
  • Screenshot or exported chat.jsonl excerpt showing a synthetic <mux_subagent_report> in parent history.
  • Short video from child completion to parent wake-up.

Scenario 3 — Workspace-turn terminal wake-up retrieves output with task_await

  1. Ask the agent to start a background workspace turn (task(kind="workspace", run_in_background=true)) that will produce a short final answer.
  2. Verify the parent can end its current turn without active forced await while the wst_... handle runs.
  3. When the workspace turn completes, verify the parent gets a synthetic wake-up that includes the exact wst_... ID and instructs task_await with timeout_secs: 0.
  4. Verify the parent calls task_await, receives terminal output, and integrates it.

Evidence:

  • Screenshot of wake-up prompt/tool call with wst_... and timeout_secs: 0.
  • Screenshot of final integrated response.
  • Video of terminal handle completion to retrieval.

Scenario 4 — Background workflow continues via existing workflow result path

  1. Start a background workflow run from a workspace.
  2. Verify parent turn is not forced into active task_await while the workflow is still running.
  3. Let the workflow complete.
  4. Verify existing workflow result continuation appears and includes workflow result context.

Evidence:

  • Screenshot of active workflow card without forced await prompt.
  • Screenshot/video of workflow terminal result continuation.

Mobile/narrow-width visual check

If synthetic message copy or tool-card layout changes are visible in the renderer:

  1. Validate Storybook or app at ~375px width.
  2. Confirm long ID lists (task_..., wst_..., wfr_...) wrap/truncate without right-edge overflow.
  3. Capture a narrow viewport screenshot if UI changed.

Acceptance criteria

  • run_in_background: true sub-agent tasks do not trigger active force-await prompts at parent stream-end.
  • run_in_background: true workspace-turn handles do not trigger active force-await prompts while running.
  • run_in_background: true top-level workflow runs do not trigger active force-await prompts while running.
  • Foreground/blocking tasks and workflows still force await when the current turn truly depends on them.
  • A foreground wait backgrounded by a queued message remains non-blocking on future stream-ends and wakes on terminal completion.
  • Foreground waits that time out or exceed foreground wait limits and continue in the background persist notify_on_terminal.
  • Explicit task_await on notify_on_terminal work does not clear durable notify policy or re-promote the work to blocking.
  • Terminal sub-agent success/failure wake-ups continue to integrate already-injected synthetic report/failure messages and do not tell the agent to call task_await unnecessarily.
  • Terminal workspace-turn wake-ups instruct one-shot retrieval with task_await and timeout_secs: 0.
  • Workflow-owned child tasks continue through workflow-owned result delivery and do not trigger generic parent handoffs.
  • Task-local background workflows inside child tasks remain blocking unless a deliberate output-propagation design is added.
  • Mixed active work prompts include only blocking IDs.
  • Terminal wake-ups are one-shot/coalesced enough to avoid repeated auto-resume spam.
  • Missing attention policy on old persisted records defaults to blocking_until_terminal.
  • Terminal notifications preserve existing parent resume option resolution (agentId, model, thinking level) and synthetic send flags.
  • If a foreground waiter consumes a terminal result, no duplicate pending wake-up is enqueued for that source.
  • Notification records include enough terminal outcome metadata to compose success/failure/interrupted prompts without guessing from history.
  • Terminal wake-ups are enqueued through a persisted notifier and are not sent directly from settlement locks.
  • Pending sub-agent and workspace-turn terminal notifications survive restart and drain once the owner is idle.
  • Terminal wake-ups do not inject ahead of queued/preparing user turns or active owner streams.
  • Workspace-turn wake-up dedupe is restart-safe via persisted notification state.
  • workflow_resume({ run_in_background: true }) retains/persists notify policy.
  • $workflow-authoring and $loop are updated to document inferred internal attention policy cases and explicitly state the field is not directly settable in v1.
  • Built-in skill guidance matches tool descriptions and does not advertise unsupported silent_background or a public attentionPolicy / attention_policy argument.
  • Tests, typecheck, lint, and dogfood evidence support the behavior.

Risks and mitigations

| Risk | Impact | Mitigation |
| --- | --- | --- |
| Premature final answers | Agent may answer without work it actually needed. | Keep foreground/default work blocking_until_terminal; only derive notify from explicit run_in_background or queued-message backgrounding. |
| Stale wake-ups after user changes direction | Background result may interrupt or confuse a newer task. | Require idle, respect queued/preparing turns, preserve workflow currentness, and consider follow-up currentness helpers if tests expose stale wake-ups. |
| Wake-up spam from many siblings | Multiple synthetic turns can waste tokens. | Coalesce by waking only when no other owner-visible active notify descendants remain, and persist workspace-turn notification state if needed. |
| Output retrieval mismatch | Agent may call/skip task_await incorrectly. | Prompt by output delivery kind: injected sub-agent/workflow reports = do not await; workspace-turn handles = task_await with terminal IDs. |
| Restart inconsistency | In-memory exemptions disappear after restart. | Persist policy on workspace config, task handles, and workflow run records. |
| Overexposed model interface | New tool parameter may be misused. | Do not expose attention_policy in first PR; derive from existing run_in_background. |
| Test tautology | Tests could assert prose instead of behavior. | Assert IDs, send/no-send decisions, policy persistence, terminal statuses, and tool call args; avoid exact prompt copy unless necessary. |
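
The "output retrieval mismatch" mitigation, prompting by output delivery kind, could look roughly like the sketch below. All names (`OutputDelivery`, `TerminalItem`, `composeWakeUpGuidance`) and the guidance strings are hypothetical; only the rule itself (injected reports are not awaited, workspace-turn handles are retrieved with task_await and timeout_secs: 0) comes from the plan.

```typescript
// Hypothetical classification of how a terminal result reaches the owner.
type OutputDelivery = "injected_report" | "workspace_turn_handle";

interface TerminalItem {
  id: string; // e.g. task_... or wst_...
  delivery: OutputDelivery;
}

// Build guidance lines for one coalesced wake-up. The wording is illustrative.
function composeWakeUpGuidance(items: readonly TerminalItem[]): string[] {
  const lines: string[] = [];
  const handleIds = items
    .filter((item) => item.delivery === "workspace_turn_handle")
    .map((item) => item.id);
  const hasInjected = items.some((item) => item.delivery === "injected_report");

  if (hasInjected) {
    lines.push("Integrate the already-injected reports; do not call task_await for them.");
  }
  if (handleIds.length > 0) {
    lines.push(`Call task_await with timeout_secs: 0 for: ${handleIds.join(", ")}`);
  }
  return lines;
}
```

Consistent with the test-tautology row, a test for such a helper would check that the exact handle IDs appear and that injected reports produce no await instruction, not the surrounding prose.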

Suggested implementation order

  1. Add policy schema/type and persistence fields with legacy defaults.
  2. Add TerminalAttentionNotifier / notification store and tests before wiring terminal sends to it.
  3. Thread policy from task/workflow launch and resume paths.
  4. Persist notify policy when foreground waits detach via queued messages or foreground wait timeout/limit.
  5. Refactor handleStreamEnd() active-work filtering to separate blocking and notify work.
  6. Convert sub-agent terminal handoffs and workspace-turn terminal wake-ups to enqueue/drain through the notifier, with mandatory persisted idempotency marker and busy/queued-turn gating.
  7. Update tool descriptions, $workflow-authoring, $loop, and story copy to define run_in_background: true as non-blocking with terminal wake-up while keeping attentionPolicy internal/not directly settable in v1.
  8. Add/update targeted tests and documentation checks.
  9. Run validation commands.
  10. Dogfood in desktop sandbox with screenshots/video.
  11. If a PR is requested later, include plan contents in PR body per repo convention.

Advisor review log

  • Draft prepared for advisor review.
  • Advisor review round 1: not approved. Required changes addressed in this revision:
    • explicitly defined v1 run_in_background: true semantics as non-blocking with terminal wake-up;
    • added foreground wait timeout/limit detachment paths;
    • separated durable attention policy from transient markTaskForegroundRelevant() bookkeeping;
    • made workspace-turn terminal wake-up idempotency mandatory and restart-safe;
    • tightened busy/queued-turn gating;
    • included workflow_resume({ run_in_background: true });
    • added workspace-turn active-dependency invariants;
    • expanded required tests;
    • explicitly scoped background bash handles out.
  • Advisor review round 2: not approved. Required changes addressed in this revision:
    • added a concrete TerminalAttentionNotifier / persisted notification store seam;
    • made sub-agent busy/queued deferral restart-safe through pending notifications;
    • chose concrete queued-turn behavior: persist pending and drain after the queued/user turn finishes;
    • prohibited sendMessage() / retry loops while holding TaskService settlement/event locks;
    • added notifier store/coalescing/restart/busy-deferral tests.
  • Advisor review round 3: not approved. Required changes addressed in this revision:
    • notifier must preserve existing parent resume option resolution and send flags;
    • foreground waiter consumption suppresses duplicate terminal notification enqueue;
    • notification records include terminal outcome/prompt metadata for success/failure/interrupted prompt composition;
    • added tests for parent option preservation and explicit task_await without duplicate wake-up.
  • Advisor review round 4: APPROVED — no remaining required changes.
  • User requested an amendment to include $workflow-authoring and $loop guidance. Amendment added:
    • built-in skills document inferred internal attentionPolicy behavior;
    • v1 explicitly does not expose a public attentionPolicy / attention_policy argument;
    • skill docs include a case table for foreground/default, background, detached foreground waits, workflow-owned steps, background bash, and unsupported silent background.
  • Advisor amendment review: APPROVED — no required changes. Minor non-blocking wording suggestion: prefer “internal attention policy” in actual skill text so agents do not infer a public tool parameter exists.
  • Advisor approval: complete.

Generated with mux • Model: openai:gpt-5.5 • Thinking: xhigh • Cost: $128.29

@ThomasK33

Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 83b1c999cb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/taskService.ts Outdated
Comment thread src/node/services/taskService.ts
@mintlify

mintlify Bot commented Jun 25, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| Mux | 🟢 Ready | View Preview | Jun 25, 2026, 10:03 AM |

@ThomasK33

Member Author

@codex review

Please take another look. I addressed both P2 findings in 3f55f57 and added regression tests for the workspace-turn terminal race and self-backgrounded foreground workflow policy persistence.

@ThomasK33

Member Author

@codex review

Re-requesting on the latest head after resolving the previous threads and retriggering checks.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 55d64c4ca2

Comment thread src/node/services/taskService.ts Outdated
Comment thread src/node/services/workflows/WorkflowService.ts
@ThomasK33

Member Author

Addressed the latest Codex P2 findings in 8dbd050:

  • taskService.ts: task-owned stream-end recovery now filters active descendants, task-local workflow runs, and workspace-turn handles through the persisted notify_on_terminal policy before deciding whether to force a background await. Added regression coverage for all three task-owned active-work kinds.
  • WorkflowService.ts: checkpoint retries dispatched in the background now persist attentionPolicy: "notify_on_terminal" before the background runner starts, with regression coverage.

Validation on the pushed head:

  • bun test src/node/services/taskService.test.ts -t "requests agent_report while task-owned notify_on_terminal"
  • bun test src/node/services/taskService.test.ts -t "active descendants|active background workflow|task-local background workflow|workspace turns are active|notify_on_terminal|stream end while task has active descendants|awaiting_report tasks|stream-error"
  • bun test src/node/services/workflows/WorkflowService.test.ts
  • MUX_ESLINT_CONCURRENCY=1 make static-check

@ThomasK33

Member Author

@codex review

Please review the latest pushed fixes for the two P2 findings.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8dbd0506f2

Comment thread src/node/services/taskService.ts
Comment thread src/node/services/taskService.ts Outdated
@ThomasK33

Member Author

Addressed the second round of Codex P2 findings in c9b877b:

  • Startup recovery now scans the terminal attention store for owner workspaces with pending notifications and schedules drains from TaskService.initialize(), so crash windows after enqueueIfAbsent() are recovered without waiting for another enqueue/stream-end.
  • Task-owned non-blocking background work no longer forces task_await, but live child tasks/workflow runs/workspace turns still prevent agent_report finalization. This keeps nested background results from being lost while preserving non-blocking ordinary turn-end behavior.
  • Fixed the unit-test expectation for the newly added built-in background-monitors skill.

Validation on the pushed head:

  • bun test src/node/services/terminalAttentionStore.test.ts
  • bun test src/node/services/taskService.test.ts -t "initialize drains persisted terminal wake-ups|notify_on_terminal descendants are active|notify_on_terminal workspace turns are active|notify_on_terminal workflow runs are active|workspace turns are still active|active descendants"
  • bun test src/node/services/agentSkills/agentSkillsService.test.ts
  • bun test src/node/services/workflows/WorkflowService.test.ts
  • MUX_ESLINT_CONCURRENCY=1 make static-check

@ThomasK33

Member Author

@codex review

Please review the latest pushed fixes for the second P2 round.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c9b877b8fa

Comment thread src/node/services/taskService.ts
Comment thread src/node/services/taskService.ts Outdated
@ThomasK33

Member Author

Addressed the latest Codex findings in 0a62cb0:

  • Terminal attention drains now re-check owner-visible blocking active work (blocking descendants, workspace-turn handles, and active blocking workflow runs) after the idle/queued checks and defer while any exist, so a terminal wake-up cannot race ahead of a required active-work task_await nudge.
  • Completion-tool recovery now blocks on all active task-owned work, including notify_on_terminal descendants/workspace turns/workflow runs, so it no longer asks for premature agent_report while handleStreamEnd would still reject finalization.

Validation on the pushed head:

  • bun test src/node/services/taskService.test.ts -t "initialize drains persisted terminal wake-ups|initialize defers terminal wake-up|initialize does not request agent_report|notify_on_terminal descendants are active|notify_on_terminal workspace turns are active|notify_on_terminal workflow runs are active|keeps agent_report blocked|workspace turns are still active|active descendants"
  • bun test src/node/services/terminalAttentionStore.test.ts
  • bun test src/node/services/agentSkills/agentSkillsService.test.ts
  • bun test src/node/services/workflows/WorkflowService.test.ts
  • MUX_ESLINT_CONCURRENCY=1 make static-check

@ThomasK33

Member Author

@codex review

Please review the latest pushed fixes for the P1/P2 drain and completion-recovery findings.

@ThomasK33

Member Author

Addressed the auto-retry-start waiter gap in bdc522f:

  • WorkspaceService.waitForIdleAndNoQueuedMessages() now wakes on auto-retry-starting as well as auto-retry-abandoned, then re-checks the outer busy/queued/retry loop. This prevents terminal-attention drain waiters from hanging after the retry timer clears and the retry begins.

Validation on the pushed head:

  • MUX_ESLINT_CONCURRENCY=1 make static-check

@ThomasK33

Member Author

@codex review

Please review the latest pushed fix for waking terminal-attention drain waiters when auto-retry starts.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bdc522fc7c

Comment thread src/node/services/workspaceService.ts Outdated
@ThomasK33

Member Author

@codex review

Pushed one more retry-state hardening fix: hasPendingAutoRetry() now remains true while an auto-retry resume attempt is in its auto-retry-starting window, so terminal attention drains keep deferring until the retry either starts streaming or settles. Added targeted coverage in agentSession.startupAutoRetry.test.ts.

@ThomasK33

Member Author

@codex review

Resolved the retry-startup drain thread with the pushed fix in 7b2cb3d. Please re-review the latest head.

@ThomasK33

Member Author

@codex review

The retry-startup thread is now resolved after the pushed fix. Please review the latest head.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7b2cb3d899

Comment thread src/node/services/workspaceService.ts Outdated
@ThomasK33

Member Author

@codex review

Resolved the auto-retry idle-wait thread with the pushed fix in 30c6685. Please re-review the latest head.

@ThomasK33

Member Author

@codex review

The auto-retry idle-wait thread is now resolved after 30c6685. Please review the latest head.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 30c6685b0c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/taskService.ts Outdated
@ThomasK33

Member Author

@codex review

Resolved the pending terminal wake-up sibling thread with 54ee67e. Please review the latest head.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 54ee67e98b

Comment thread src/node/services/workflows/WorkflowService.ts
@ThomasK33

Member Author

@codex review

Resolved the durable background workflow wake-up thread with 8363d18. Please review the latest head.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8363d18f03

Comment thread src/node/services/aiService.ts
@ThomasK33

Member Author

@codex review

Resolved the duplicate workflow task_await wake-up thread with 7ac8f15. Please review the latest head.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7ac8f15a8c

Comment thread src/node/services/taskService.ts
@ThomasK33

Member Author

@codex review

Resolved the consumed-workflow wake-up tombstone thread with 3621cb5. Please review the latest head.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3621cb51a4

Comment thread src/node/services/terminalAttentionStore.ts
@ThomasK33

Member Author

@codex review

Resolved the retried-workflow wake-up reset thread with 34be8c8. Please review the latest head.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 34be8c8b07

Comment thread src/node/services/taskService.ts
@ThomasK33

Member Author

@codex review

Resolved the workspace-turn nonblocking-descendant deferral thread with 312000d. Please review the latest head.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Keep them coming!

Reviewed commit: 312000d413
