36 commits
cef2fe8
feat: persist background-work attention policy + policy-aware stream-…
ThomasK33 Jun 25, 2026
0e965f2
feat: derive notify_on_terminal from run_in_background + detach paths…
ThomasK33 Jun 25, 2026
d2dc9ca
feat: TerminalAttentionStore + workspace-turn terminal wake-up via id…
ThomasK33 Jun 25, 2026
165f2ee
feat: route sub-agent terminal handoffs through TerminalAttentionNoti…
ThomasK33 Jun 25, 2026
616b68b
test: schema parse coverage for attentionPolicy (workspace config + w…
ThomasK33 Jun 25, 2026
476a5b6
docs: update tool descriptions, workflow-authoring/loop skills, and s…
ThomasK33 Jun 25, 2026
cc9353b
chore: fix lint (optional chain + redundant assertions)
ThomasK33 Jun 25, 2026
ae7d4e1
chore: prettier format
ThomasK33 Jun 25, 2026
eb2561c
fix: enqueue workspace-turn wake-up after lock + mark notified post-a…
ThomasK33 Jun 25, 2026
83b1c99
docs: add background monitor guidance for notify-on-terminal tasks
ThomasK33 Jun 25, 2026
3f55f57
fix: close notify-on-terminal detach races
ThomasK33 Jun 25, 2026
55d64c4
chore: retrigger checks after resolving codex threads
ThomasK33 Jun 25, 2026
8dbd050
fix: honor notify policy in task-owned background checks
ThomasK33 Jun 25, 2026
c9b877b
fix: recover pending terminal wakeups
ThomasK33 Jun 25, 2026
0a62cb0
fix: gate terminal wakeups behind blocking work
ThomasK33 Jun 25, 2026
b182459
fix: keep notify task subtrees nonblocking
ThomasK33 Jun 25, 2026
0ed6d35
fix: default background workflows to notify
ThomasK33 Jun 25, 2026
324fc40
fix: ignore nested workflow drain blockers
ThomasK33 Jun 25, 2026
2b11e59
fix: serialize workspace turn notify writes
ThomasK33 Jun 25, 2026
9ffefa5
fix: recover terminal turn wakeups
ThomasK33 Jun 25, 2026
5490cc7
fix: retry terminal drains after idle race
ThomasK33 Jun 25, 2026
93a8a99
fix: keep queued terminal wakeups pending
ThomasK33 Jun 25, 2026
ffeb4a5
fix: retry canceled terminal wakeups
ThomasK33 Jun 25, 2026
fdbda56
fix: retry terminal wakeups after stream failure
ThomasK33 Jun 25, 2026
c886082
fix: retry terminal wakeups after idle
ThomasK33 Jun 25, 2026
01943e7
fix: retry terminal drains after queued turns
ThomasK33 Jun 25, 2026
796969d
fix: retry terminal drains after auto retry
ThomasK33 Jun 25, 2026
bdc522f
fix: wake terminal drains on retry start
ThomasK33 Jun 25, 2026
7b2cb3d
fix: keep retry startup pending for terminal drains
ThomasK33 Jun 25, 2026
30c6685
fix: wake idle waits when retry starts streaming
ThomasK33 Jun 25, 2026
54ee67e
fix: keep terminal wakeups pending across siblings
ThomasK33 Jun 25, 2026
8363d18
fix: persist workflow terminal wakeups
ThomasK33 Jun 25, 2026
7ac8f15
fix: clear workflow wakeups consumed by await
ThomasK33 Jun 25, 2026
3621cb5
fix: tombstone consumed workflow wakeups
ThomasK33 Jun 25, 2026
34be8c8
fix: reset consumed workflow wakeups on retry
ThomasK33 Jun 25, 2026
312000d
fix: avoid deferring nonblocking workspace turns
ThomasK33 Jun 25, 2026
14 changes: 7 additions & 7 deletions docs/hooks/tools.mdx

Large diffs are not rendered by default.

10 changes: 6 additions & 4 deletions src/browser/features/Messages/MessageRenderer.stories.tsx
@@ -318,9 +318,9 @@ export const SyntheticAutoResumeMessages: AppStory = {
),
createUserMessage(
"msg-3",
- "You have active background sub-agent task(s) (task-abc123). " +
- "You MUST NOT end your turn while any sub-agent tasks are queued/running/awaiting_report. " +
- "Call task_await now to wait for them to finish.",
+ "You have active background task handle(s) (task-abc123). " +
+ "You MUST NOT end your turn while any listed task handles are queued/starting/running/awaiting_report. " +
+ 'Call task_await now with task_ids: ["task-abc123"] to wait for them.',
{
historySequence: 3,
timestamp: STABLE_TIMESTAMP - 290000,
@@ -329,7 +329,9 @@ export const SyntheticAutoResumeMessages: AppStory = {
),
createUserMessage(
"msg-4",
- "Your background sub-agent task(s) have completed. Use task_await to retrieve their reports and integrate the results.",
+ "Background sub-agent task(s) have completed. Their accepted reports and any structured outputs " +
+ "are already injected into this workspace context as task tool results or synthetic user report " +
+ "messages. Write the final response now, integrating those results.",
{
historySequence: 4,
timestamp: STABLE_TIMESTAMP - 285000,
27 changes: 27 additions & 0 deletions src/common/orpc/schemas/workflow.test.ts
@@ -58,6 +58,33 @@ describe("workflow domain schemas", () => {
expect(run.events.map((event) => event.sequence)).toEqual([1, 2, 3]);
});

test("workflow run records default to no attentionPolicy and accept notify_on_terminal", () => {
const baseRun = {
id: "wfr_123",
workspaceId: "workspace-1",
workflow: { name: "deep-research", description: "x", scope: "built-in", executable: true },
source: "export default async function workflow() { return null; }",
sourceHash: "sha256:abc123",
args: {},
status: "running",
createdAt: "2026-05-29T00:00:00.000Z",
updatedAt: "2026-05-29T00:00:01.000Z",
events: [],
steps: [],
};
// Legacy record without the field still parses.
expect(WorkflowRunRecordSchema.parse(baseRun).attentionPolicy).toBeUndefined();
// Background runs persist notify_on_terminal.
expect(
WorkflowRunRecordSchema.parse({ ...baseRun, attentionPolicy: "notify_on_terminal" })
.attentionPolicy
).toBe("notify_on_terminal");
// Invalid policy values are rejected.
expect(
WorkflowRunRecordSchema.safeParse({ ...baseRun, attentionPolicy: "bogus" }).success
).toBe(false);
});

test("accepts plan file path metadata on structured task output", () => {
const parsed = StructuredTaskOutputSchema.parse({
taskId: "task-plan",
5 changes: 5 additions & 0 deletions src/common/orpc/schemas/workflow.ts
@@ -1,5 +1,7 @@
import { z } from "zod";

import { BackgroundWorkAttentionPolicySchema } from "@/common/types/backgroundWorkAttention";

export const WorkflowNameSchema = z
.string()
.min(1)
@@ -269,6 +271,9 @@ export const WorkflowRunRecordSchema = z.object({
agentOutputSchemaRequired: z.boolean().optional(),
agentTypeAliasAllowed: z.boolean().optional(),
parentWorkflow: WorkflowRunParentSchema.optional(),
// How the owner workspace's stream-end treats this run while active. Background
// runs are "notify_on_terminal"; missing/legacy records default to blocking.
attentionPolicy: BackgroundWorkAttentionPolicySchema.optional(),
status: WorkflowRunStatusSchema,
createdAt: IsoDateTimeSchema,
updatedAt: IsoDateTimeSchema,
26 changes: 26 additions & 0 deletions src/common/schemas/project.attentionPolicy.test.ts
@@ -0,0 +1,26 @@
import { describe, expect, test } from "bun:test";

import { WorkspaceConfigSchema } from "@/common/schemas/project";

describe("WorkspaceConfigSchema taskAttentionPolicy", () => {
const base = { path: "/repo/ws", id: "ws-1" };

test("parses legacy child workspaces without taskAttentionPolicy", () => {
const parsed = WorkspaceConfigSchema.parse(base);
expect(parsed.taskAttentionPolicy).toBeUndefined();
});

test("accepts a persisted notify_on_terminal policy", () => {
const parsed = WorkspaceConfigSchema.parse({
...base,
taskAttentionPolicy: "notify_on_terminal",
});
expect(parsed.taskAttentionPolicy).toBe("notify_on_terminal");
});

test("rejects an invalid attention policy value", () => {
expect(WorkspaceConfigSchema.safeParse({ ...base, taskAttentionPolicy: "bogus" }).success).toBe(
false
);
});
});
7 changes: 7 additions & 0 deletions src/common/schemas/project.ts
@@ -12,6 +12,7 @@ import {
WorkspaceAISettingsSchema,
} from "@/common/orpc/schemas/workspaceAiSettings";
import { ThinkingLevelSchema } from "@/common/types/thinking";
import { BackgroundWorkAttentionPolicySchema } from "@/common/types/backgroundWorkAttention";
import { z } from "zod";

import { RuntimeEnablementIdSchema } from "./ids";
@@ -193,6 +194,12 @@ export const WorkspaceConfigSchema = z.object({
"checkout (no fork): its `path` points at the parent's checkout, init is skipped, and removal " +
'must not delete that shared directory. Absent/"fork" is the isolated default.',
}),
taskAttentionPolicy: BackgroundWorkAttentionPolicySchema.optional().meta({
description:
"How the owner workspace's stream-end treats this child task while it is active. " +
'"notify_on_terminal" (background launches) does not force the owner to await; ' +
'"blocking_until_terminal" (foreground/default, and missing/legacy records) does.',
}),
mcp: WorkspaceMCPOverridesSchema.optional().meta({
description:
"LEGACY: Per-workspace MCP overrides (migrated to <workspace>/.mux/mcp.local.jsonc)",
37 changes: 37 additions & 0 deletions src/common/types/backgroundWorkAttention.ts
@@ -0,0 +1,37 @@
import { z } from "zod";

/**
* Internal attention policy for background work (sub-agent tasks, workspace-turn
* handles, workflow runs). Determines how the owning workspace's stream-end
* treats the work while it is still active.
*
* - `blocking_until_terminal`: active work forces the owner to call `task_await`
* before ending its turn (the historical force-await behavior). This is the
* default for foreground/default launches and for any legacy record missing a
* persisted policy (backward compatibility).
* - `notify_on_terminal`: active work does NOT block the owner's turn-end. When
* the work reaches a terminal state Mux sends a targeted synthetic wake-up so
* the owner can integrate the completed output. This is derived from
* `run_in_background: true` and from foreground waits detached by a queued
* message or a foreground-wait timeout.
*
* This is internal in v1: it is not exposed as a model-visible tool parameter.
*/
export const BACKGROUND_WORK_ATTENTION_POLICIES = [
"blocking_until_terminal",
"notify_on_terminal",
] as const;

export type BackgroundWorkAttentionPolicy = (typeof BACKGROUND_WORK_ATTENTION_POLICIES)[number];

export const BackgroundWorkAttentionPolicySchema = z.enum(BACKGROUND_WORK_ATTENTION_POLICIES);

/** Missing/legacy persisted policy is treated as blocking for backward compatibility. */
export const DEFAULT_BACKGROUND_WORK_ATTENTION_POLICY: BackgroundWorkAttentionPolicy =
"blocking_until_terminal";

export function resolveBackgroundWorkAttentionPolicy(
policy: BackgroundWorkAttentionPolicy | undefined | null
): BackgroundWorkAttentionPolicy {
return policy ?? DEFAULT_BACKGROUND_WORK_ATTENTION_POLICY;
}
8 changes: 8 additions & 0 deletions src/common/types/tasks.ts
@@ -25,6 +25,14 @@ export const DEFAULT_TASK_SETTINGS: TaskSettings = {
preserveSubagentsUntilArchive: false,
};

export {
BACKGROUND_WORK_ATTENTION_POLICIES,
BackgroundWorkAttentionPolicySchema,
DEFAULT_BACKGROUND_WORK_ATTENTION_POLICY,
resolveBackgroundWorkAttentionPolicy,
type BackgroundWorkAttentionPolicy,
} from "./backgroundWorkAttention";

const AGENT_DEFAULT_IDS_EXCLUDED_FROM_LEGACY_SUBAGENTS: ReadonlySet<string> = new Set([
"plan",
"exec",
6 changes: 4 additions & 2 deletions src/common/utils/tools/toolDefinitions.ts
@@ -320,7 +320,7 @@ export function buildTaskToolDescription(runtimeMode: RuntimeMode | undefined):
"Avoid telling the sub-agent to read your plan file; child workspaces do not automatically have access to it. " +
"\n\nIf run_in_background is false, waits for the sub-agent to finish and returns the completed report. When grouped sibling tasks are requested via n or variants, the completed result includes one report per spawned task. " +
"If the foreground wait times out, returns queued/starting/running task metadata with a note (the task continues running); use task_await to monitor progress. " +
- "If run_in_background is true, returns immediately with queued/starting/running task metadata; use task_await to wait for completion, task_list to rediscover active tasks, and task_terminate to stop it. " +
+ "If run_in_background is true, returns immediately with queued/starting/running task metadata and the task runs non-blocking: you may end your turn without awaiting it, and Mux wakes this workspace when the task reaches a terminal state so you can integrate its result. Use task_await only when the current request depends on the output before you can answer, or to inspect progress. " +
"Prefer run_in_background: false when spawning a single task — it is equivalent to spawning background + immediately awaiting, but saves a round-trip. " +
"Use run_in_background: true when launching multiple tasks in parallel so you can act on each as it completes via task_await (which returns on the first completion by default); a foreground grouped spawn (run_in_background: false) instead blocks until every sibling finishes and returns all reports at once. " +
"Do not call task_await in the same parallel tool-call batch; wait for the returned task metadata first. " +
@@ -1350,6 +1350,7 @@ export const TOOL_DEFINITIONS = {
"List active tasks with task_list. " +
"Process persists until timeout_secs expires, terminated, or workspace is removed." +
"\\n\\nFor long-running tasks like builds or compilations, prefer background mode to continue productive work in parallel. " +
"Raw background bash does not automatically wake the parent workspace when it prints output or exits; use task_await when you need output, or wrap the script in a background task/workflow monitor when wake-on-condition behavior is required. " +
"Do not call task_await in the same parallel tool-call batch; wait for the returned taskId first. " +
"When you actually need the output, read it with task_await; do not poll task_await just because the process is still running."
),
@@ -1892,6 +1893,7 @@ export const TOOL_DEFINITIONS = {
"\n\nWHEN TO USE: only call task_await when the current user request depends on a task's output, or when synthesis/integration of a previously-spawned task is the next logical step. " +
"Do not call task_await solely because active tasks exist; for unrelated user messages, respond directly and let tasks continue in the background. " +
"If a synthetic/system follow-up explicitly says active background tasks or workflow runs block your turn, treat that as a dependency and await the listed IDs. " +
"When a terminal wake-up says a sub-agent report or failure is already injected into context, integrate it directly — do NOT call task_await for it. When a wake-up asks you to retrieve a workspace turn's terminal output, call task_await with the listed IDs and timeout_secs: 0 (a one-shot retrieval, not a wait). " +
"\n\nIMPORTANT: Do not call task_await in the same parallel tool-call batch as task, bash, or workflow_run — " +
"the taskId/runId is not available until the spawning tool returns. " +
"Always wait for the task/bash/workflow_run tool result first, then call task_await in a subsequent step. " +
@@ -1937,7 +1939,7 @@ export const TOOL_DEFINITIONS = {
"Use agent_skill_read / agent_skill_read_file to discover and inspect skill-packaged workflows; non-skill workflow files must be addressed by an explicit known path and can be inspected with normal file tools. " +
"Prefer the default foreground mode (`run_in_background` omitted or false) so completed workflows return their result without an extra task_await round-trip. " +
"If workflow_run returns status=running or status=backgrounded, await the returned runId with task_await before using or reporting the workflow output. " +
- "Use background mode only when you intend to start another workflow/task or do independent work while the workflow runs.",
+ "Use background mode only when you intend to start another workflow/task or do independent work while the workflow runs; a background run is non-blocking and Mux wakes this workspace with the terminal workflow result, so call task_await only when the current request depends on the output before you can answer.",
schema: WorkflowRunToolArgsSchema,
},
workflow_resume: {
2 changes: 2 additions & 0 deletions src/common/utils/tools/tools.ts
@@ -1,5 +1,6 @@
import { type LanguageModel, type Tool } from "ai";
import type { LanguageModelV2Usage } from "@ai-sdk/provider";
import type { BackgroundWorkAttentionPolicy } from "@/common/types/backgroundWorkAttention";
import { cloneToolPreservingDescriptors } from "@/common/utils/tools/cloneToolPreservingDescriptors";
import { createFileReadTool } from "@/node/services/tools/file_read";
import { createAttachFileTool } from "@/node/services/tools/attach_file";
@@ -192,6 +193,7 @@ export interface ToolConfiguration {
workspaceId: string;
projectTrusted: boolean;
args: unknown;
attentionPolicy?: BackgroundWorkAttentionPolicy;
onRunCreated?: (event: {
runId: string;
status: "pending";
91 changes: 91 additions & 0 deletions src/node/builtinSkills/background-monitors.md
@@ -0,0 +1,91 @@
---
name: background-monitors
description: Run bounded background monitors that wake the agent only when a condition changes or the monitor finishes
---

# Background monitors

Use this skill when you need a long-running watcher for CI, mergeability, PR review, deployments, queue state, logs, or any condition where the agent can safely end its turn and be woken when the watcher finishes.

## What wakes the parent

Mux wakes the owning workspace when a background **task** or **workflow** reaches a terminal state (`completed`, `failed`, `interrupted`, or `error`). Use one of these forms for monitors:

- `task({ run_in_background: true, ... })` for an ad-hoc monitor implemented by a sub-agent.
- `workflow_run({ run_in_background: true, ... })` for durable/reusable monitors.

Raw `bash({ run_in_background: true })` is different: it keeps a process running and you can retrieve output with `task_await`, but it does **not** by itself send an automatic terminal wake-up to the parent. If you need wake-on-finish or wake-on-condition, wrap the shell polling inside a background task or workflow and have that task/workflow finish when the condition is reached.

## Monitor contract

Every monitor must be bounded and idempotent. Before launching one, define:

- **Condition:** exact event that should complete the monitor (for example, all required CI checks passed, mergeability changed, Codex left a review, deployment became healthy).
- **Actual-state read:** exact command/API used to check state (`gh pr view`, `gh run list`, project CLI, HTTP endpoint, log command).
- **Cadence:** sleep interval between checks; use one blocking loop in the monitor, not repeated parent turns.
- **Bound:** max attempts or wall-clock deadline, and what the terminal report says on timeout.
- **Idempotency key:** PR number, deployment id, run id, or another stable identifier so duplicate monitors are recognizable in the report/title.
- **Output policy:** report only state transitions, convergence, or blockers; do not stream noisy logs into the parent.
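
The contract above can be sketched as one bounded loop: read the actual state, decide, then sleep for the cadence. This is only an illustration under stated assumptions. `CheckState`, `MonitorBound`, `decide`, and `runMonitor` are hypothetical names, not Mux APIs, and the state read and sleep are injected so the loop can wrap any command or endpoint.

```typescript
// Hypothetical types for illustration; none of these names come from Mux.
type CheckState = "pending" | "passed" | "failed";

interface MonitorBound {
  maxAttempts: number; // hard cap on the number of state reads
  deadlineMs: number; // wall-clock budget in milliseconds
}

type MonitorOutcome =
  | { kind: "continue" }
  | { kind: "converged"; state: CheckState }
  | { kind: "timeout"; lastState: CheckState };

// Pure decision step: what to do after a single actual-state read.
function decide(
  state: CheckState,
  attempt: number,
  elapsedMs: number,
  bound: MonitorBound
): MonitorOutcome {
  // Any non-pending state satisfies the condition and completes the monitor.
  if (state !== "pending") return { kind: "converged", state };
  // Either bound ends the monitor with a timeout outcome carrying the last state.
  if (attempt >= bound.maxAttempts || elapsedMs >= bound.deadlineMs) {
    return { kind: "timeout", lastState: state };
  }
  return { kind: "continue" };
}

// One blocking loop inside the monitor: read, decide, sleep for the cadence.
async function runMonitor(
  readState: () => Promise<CheckState>,
  sleep: (ms: number) => Promise<void>,
  cadenceMs: number,
  bound: MonitorBound,
  now: () => number = Date.now
): Promise<MonitorOutcome> {
  const startedAt = now();
  for (let attempt = 1; ; attempt++) {
    const outcome = decide(await readState(), attempt, now() - startedAt, bound);
    if (outcome.kind !== "continue") return outcome;
    await sleep(cadenceMs);
  }
}
```

Keeping `decide` pure makes the bound easy to test without real sleeps; the monitor would then put the returned outcome into its single terminal report instead of reporting every iteration.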

## Preferred patterns

### Ad-hoc task monitor

Use a background `exec` task when the watch is specific to the current conversation:

```ts
task({
agentId: "exec",
title: "Monitor PR #123 CI",
run_in_background: true,
prompt: `
Task: Monitor PR #123 CI until it converges.

Loop guards:
- Desired state: all required checks pass, or a required check fails terminally.
- Actual-state read: gh pr checks 123 --watch=false --json name,state,bucket,link.
- Cadence: sleep 60 seconds between checks.
- Bound: stop after 60 minutes or 60 attempts.
- Idempotency key: pr-123-ci.

Instructions:
1. Poll with a bounded shell loop.
2. Do not edit files or push commits.
3. When checks pass, call agent_report with a concise success summary and notable links.
4. If a required check fails, call agent_report with the failing check names and links.
5. If the bound expires, call agent_report with the last observed state and the next human decision needed.
`,
})
```

The parent may end its turn after the `task` tool returns. Mux will wake the parent when the monitor task calls `agent_report` or settles terminally.

### Parallel PR monitors

For PR readiness, run independent monitors in parallel when their state reads are independent:

- CI/checks monitor: required checks pass or fail.
- Mergeability monitor: merge state becomes clean/blocked/dirty.
- Review monitor: Codex/coder-agents review arrives, approves, or requests changes.
- Deployment monitor: preview/deployment health converges.

Each monitor should have a distinct title and idempotency key. Do not make the parent poll all monitors manually; let each monitor finish and wake the parent with a focused report.
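
As a rough sketch of the distinct title and idempotency key rule, the helper below derives one spec per monitor kind from the PR number and flags repeated keys before anything is launched. `MonitorSpec`, `buildPrMonitors`, and `findDuplicateKeys` are hypothetical names for illustration, and the `pr-<number>-<kind>` key format is an assumption modeled on the `pr-123-ci` example above.

```typescript
// Hypothetical shape for illustration; not a Mux type.
interface MonitorSpec {
  title: string;
  idempotencyKey: string;
}

// Build one spec per monitor kind, deriving a stable key from the PR number.
function buildPrMonitors(prNumber: number, kinds: readonly string[]): MonitorSpec[] {
  return kinds.map((kind) => ({
    title: `Monitor PR #${prNumber} ${kind}`,
    idempotencyKey: `pr-${prNumber}-${kind}`,
  }));
}

// Return every key that appears more than once so duplicates are caught before launch.
function findDuplicateKeys(specs: readonly MonitorSpec[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const spec of specs) {
    if (seen.has(spec.idempotencyKey)) duplicates.add(spec.idempotencyKey);
    seen.add(spec.idempotencyKey);
  }
  return Array.from(duplicates);
}
```

Each resulting spec would map to its own background `task` or `workflow_run` launch; a non-empty result from `findDuplicateKeys` signals that two monitors would watch the same thing.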

### Durable workflow monitor

Use a workflow when monitoring must be reusable, resumable, or composed with other phases. A workflow can run in the background and own multiple bounded monitor steps. Workflow-owned child agents report through the workflow journal; the parent wakes when the workflow reaches a terminal result.

## Heartbeat fallback

Heartbeat is still useful as a coarse fallback reminder, but it should not replace a condition-driven monitor:

- Use the monitor to wake promptly when the condition changes.
- Use heartbeat only for periodic reconciliation if a monitor is interrupted, times out, or misses an external event.

## Avoid these traps

- Do not create unbounded `while true` monitors. Every monitor needs a deadline.
- Do not launch a raw background bash process and assume the parent will be woken automatically.
- Do not have multiple monitors watch the same idempotency key unless you intentionally want duplicate reports.
- Do not report every polling iteration. Report convergence, state transitions, failures, or timeout.
- Do not use monitors to hide work that the current answer depends on; use foreground/default mode or `task_await` when the next decision requires the result.