Commit 96d2dc0

NagyVikt authored
Expose Colony-ready agent session surfaces (#487)
Agent orchestration needs machine-readable start, status, and finish evidence so Colony and cockpit consumers can pick up branches without scraping terminal prose. This records metadata, changed files, claims, PR evidence, and dry-run JSON while keeping existing human output and multi-account dry-run rendering.

Constraint: Existing agents status and cockpit tests still need stable text output.
Rejected: Replace multi-account dry-run rendering with single-plan JSON logic | would regress the launcher panel flow.
Confidence: high
Scope-risk: moderate
Directive: Keep JSON payloads additive and schemaVersioned for downstream Colony consumers.
Tested: node --test test/agents-start-dry-run.test.js test/agents-status.test.js test/agents-finish.test.js test/cockpit-actions.test.js test/cockpit-render.test.js
Not-tested: Live Colony consumer integration

Co-authored-by: NagyVikt <nagy.viktordp@gmail.com>
1 parent 8f056c7 commit 96d2dc0

17 files changed

Lines changed: 537 additions & 24 deletions

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+## Why
+
+- Colony takeover and Queen-plan lanes need machine-readable Guardex surfaces so another agent can resume, inspect, and finish work without scraping human text.
+
+## What Changes
+
+- Extend `gx agents start --dry-run --json` with branch, worktree, launch command, claims, tmux, and Colony metadata.
+- Extend `gx agents status --json` and cockpit state with activity, claims, changed files, metadata, launch command, and PR evidence.
+- Extend `gx agents finish --json` with PR, merge, cleanup, and status evidence written back to session metadata.
+
+## Impact
+
+- Existing text output remains human-readable.
+- JSON output is additive and versioned with `schemaVersion: 1`.
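
Because the JSON output is described as additive and versioned with `schemaVersion: 1`, a downstream consumer could gate on the version and pass unknown fields through untouched. The following is a minimal sketch under that assumption; `readVersionedPayload` and `SUPPORTED_SCHEMA_VERSION` are hypothetical consumer-side names, not part of Guardex.

```javascript
'use strict';

// Hypothetical consumer-side constant; the proposal only states that payloads
// carry `schemaVersion: 1`.
const SUPPORTED_SCHEMA_VERSION = 1;

// Parse JSON text emitted by a `--json` command and reject payloads whose
// schemaVersion this consumer does not understand. Extra fields are returned
// as-is, which is compatible with an additive contract.
function readVersionedPayload(jsonText) {
  const payload = JSON.parse(jsonText);
  if (payload === null || typeof payload !== 'object' || Array.isArray(payload)) {
    throw new TypeError('Expected a JSON object payload');
  }
  if (payload.schemaVersion !== SUPPORTED_SCHEMA_VERSION) {
    throw new Error(`Unsupported schemaVersion: ${String(payload.schemaVersion)}`);
  }
  return payload;
}
```

Rejecting an unknown version outright is only one possible policy; a consumer might instead log a warning and continue with the fields it recognizes.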
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+## ADDED Requirements
+
+### Requirement: Colony-ready agent start planning
+`gx agents start <task> --dry-run --json` SHALL emit a versioned JSON plan that can be consumed by Colony or a cockpit integration without parsing human text.
+
+#### Scenario: Previewing a Colony handoff lane
+- **WHEN** a user runs `gx agents start "fix handoff" --agent codex --claim README.md --meta colony.plan=queen-plan --dry-run --json`
+- **THEN** Guardex emits `schemaVersion`, `dryRun`, `task`, `agent`, `base`, `branch`, `worktree`, `worktreePath`, `claimedFiles`, `launchCommand`, `tmuxSession`, `tmuxTarget`, and `metadata`
+- **AND** the command does not create a branch, worktree, session, file claim, tmux session, or agent process.
+
+### Requirement: Colony-ready agent status
+`gx agents status --json` SHALL expose session metadata needed to inspect, adopt, or finish active agent lanes.
+
+#### Scenario: Inspecting a Queen-plan lane
+- **WHEN** a session stores Colony metadata, claimed files, changed files, launch command, and PR evidence
+- **THEN** the status payload includes `activity`, `claimedFiles`, `changedFiles`, `metadata`, `launchCommand`, `tmux`, `prUrl`, `prState`, and `pr`
+- **AND** cockpit state preserves the same fields for rendering.
+
+### Requirement: Finish evidence JSON
+`gx agents finish --json` SHALL emit versioned completion evidence and persist that evidence to the session record.
+
+#### Scenario: Finishing a merged lane
+- **WHEN** a finish command completes with PR output and cleanup enabled
+- **THEN** Guardex emits `schemaVersion`, `sessionId`, `branch`, `prUrl`, `mergeState`, `cleanupResult`, and `status`
+- **AND** the matching agent session records the PR state and finish evidence for later status and handoff surfaces.
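
The field lists in the scenarios above lend themselves to a simple presence check on the consumer side. This is a sketch only: `missingFields` is a hypothetical helper, the two field lists are copied from the start and finish scenarios, and the sample evidence object uses made-up values.

```javascript
'use strict';

// Field names copied from the "Previewing a Colony handoff lane" scenario.
const START_PLAN_FIELDS = [
  'schemaVersion', 'dryRun', 'task', 'agent', 'base', 'branch', 'worktree',
  'worktreePath', 'claimedFiles', 'launchCommand', 'tmuxSession', 'tmuxTarget',
  'metadata',
];

// Field names copied from the "Finishing a merged lane" scenario.
const FINISH_EVIDENCE_FIELDS = [
  'schemaVersion', 'sessionId', 'branch', 'prUrl', 'mergeState',
  'cleanupResult', 'status',
];

// Return the required keys that are absent from the payload. A key whose value
// is null still counts as present.
function missingFields(payload, requiredFields) {
  return requiredFields.filter(
    (field) => !Object.prototype.hasOwnProperty.call(payload, field),
  );
}

// Illustrative evidence object with `cleanupResult` deliberately left out.
const sampleEvidence = {
  schemaVersion: 1,
  sessionId: 'example-session',
  branch: 'agent/example',
  prUrl: '',
  mergeState: 'OPEN',
  status: 'pr-opened',
};
console.log(missingFields(sampleEvidence, FINISH_EVIDENCE_FIELDS)); // [ 'cleanupResult' ]
```

Checking presence rather than truthiness matters here because an empty `prUrl` or a null tmux field is still a valid, present value.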
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+## Definition of Done
+
+This change is complete only when all of the following are true:
+
+- Every checkbox below is checked.
+- The agent branch reaches `MERGED` state on `origin` and the PR URL + state are recorded in the completion handoff.
+- If any step blocks, append a `BLOCKED:` line under section 4 explaining the blocker and stop.
+
+## Handoff
+
+- Handoff: change=`agent-codex-colony-queen-agent-json-surface-2026-04-30-00-03`; branch=`agent/codex/colony-queen-agent-json-surface-2026-04-30-00-03`; scope=`agents start/status/finish JSON surface plus cockpit state`; action=`finish cleanup after quota takeover`.
+- Copy prompt: Continue `agent-codex-colony-queen-agent-json-surface-2026-04-30-00-03` on branch `agent/codex/colony-queen-agent-json-surface-2026-04-30-00-03`. Work inside the existing sandbox, review this file, continue from the current state instead of creating a new sandbox, and when the work is done run `gx branch finish --branch agent/codex/colony-queen-agent-json-surface-2026-04-30-00-03 --base main --via-pr --wait-for-merge --cleanup`.
+
+## 1. Specification
+
+- [x] 1.1 Record proposal scope and acceptance criteria.
+- [x] 1.2 Define normative JSON surface requirements.
+
+## 2. Implementation
+
+- [x] 2.1 Preserve Colony metadata through `gx agents start`.
+- [x] 2.2 Add status/cockpit fields for claims, changed files, metadata, launch command, and PR evidence.
+- [x] 2.3 Add finish JSON evidence and persist it on the session.
+
+## 3. Verification
+
+- [x] 3.1 Run focused agent/cockpit tests.
+- [x] 3.2 Run `openspec validate agent-codex-colony-queen-agent-json-surface-2026-04-30-00-03 --type change --strict`.
+- [x] 3.3 Run `openspec validate --specs`.
+
+## 4. Cleanup
+
+- [ ] 4.1 Run `gx branch finish --branch agent/codex/colony-queen-agent-json-surface-2026-04-30-00-03 --base main --via-pr --wait-for-merge --cleanup`.
+- [ ] 4.2 Record PR URL and final merge state (`MERGED`) in the completion handoff.
+- [ ] 4.3 Confirm sandbox worktree is gone and no local/remote branch refs remain.
+- BLOCKED: `gx branch finish --branch agent/codex/colony-queen-agent-json-surface-2026-04-30-00-03 --base main --via-pr --wait-for-merge --cleanup` auto-synced onto `origin/main` and hit rebase conflicts in `src/agents/start.js`, `src/cli/args.js`, and `test/agents-start-dry-run.test.js`; branch is 11 commits behind `origin/main`.

src/agents/finish.js

Lines changed: 96 additions & 9 deletions
@@ -43,10 +43,78 @@ function sessionStatusAfterFinish(finishArgs) {
   return finishArgs.includes('--no-wait-for-merge') && !directMode ? 'pr-opened' : 'finished';
 }

+function cleanupResultAfterFinish(finishArgs, status) {
+  if (status === 'failed') return 'failed';
+  if (finishArgs.includes('--no-cleanup')) return 'skipped';
+  if (finishArgs.includes('--cleanup')) return status === 'finished' ? 'completed' : 'pending';
+  return 'unknown';
+}
+
+function firstMatch(text, patterns) {
+  for (const pattern of patterns) {
+    const match = text.match(pattern);
+    if (match && match[1]) return match[1].trim();
+  }
+  return '';
+}
+
+function finishOutputText(result, captured = {}) {
+  return [
+    captured.stdout,
+    captured.stderr,
+    result?.stdout,
+    result?.stderr,
+  ].map((value) => String(value || '')).join('\n');
+}
+
+function buildFinishEvidence(session, finishArgs, status, result, captured = {}) {
+  const outputText = finishOutputText(result, captured);
+  const prUrl = firstMatch(outputText, [
+    /\[agent-branch-finish\] (?:Merged PR|PR):\s+(https?:\/\/\S+)/,
+    /\b(https?:\/\/\S+\/pull\/\d+)\b/,
+  ]);
+  const mergeState = status === 'finished' ? 'MERGED' : status === 'pr-opened' ? 'OPEN' : status.toUpperCase();
+  return {
+    schemaVersion: 1,
+    sessionId: session.id || '',
+    branch: session.branch || '',
+    prUrl,
+    mergeState,
+    cleanupResult: cleanupResultAfterFinish(finishArgs, status),
+    status,
+  };
+}
+
+function captureProcessOutput(fn) {
+  let stdout = '';
+  let stderr = '';
+  const originalStdoutWrite = process.stdout.write;
+  const originalStderrWrite = process.stderr.write;
+  process.stdout.write = function captureStdout(chunk, encoding, callback) {
+    stdout += Buffer.isBuffer(chunk) ? chunk.toString(encoding || 'utf8') : String(chunk || '');
+    if (typeof encoding === 'function') encoding();
+    if (typeof callback === 'function') callback();
+    return true;
+  };
+  process.stderr.write = function captureStderr(chunk, encoding, callback) {
+    stderr += Buffer.isBuffer(chunk) ? chunk.toString(encoding || 'utf8') : String(chunk || '');
+    if (typeof encoding === 'function') encoding();
+    if (typeof callback === 'function') callback();
+    return true;
+  };
+  try {
+    return { result: fn(), captured: { stdout, stderr } };
+  } finally {
+    process.stdout.write = originalStdoutWrite;
+    process.stderr.write = originalStderrWrite;
+  }
+}
+
 function finishAgentSession(repoRoot, options, deps = {}) {
   const finishRunner = deps.finishRunner || finishCommands.finish;
   const output = deps.output || process.stdout;
   const session = resolveAgentSessionForFinish(repoRoot, options);
+  const jsonMode = Boolean(options.json);

   if (!session.branch) {
     throw new Error(`Agent session '${session.id}' has no branch metadata.`);
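
The capture helper in this hunk swaps `process.stdout.write` for the duration of a synchronous call and restores it in `finally`. A stdout-only sketch of the same technique follows; `captureStdout` is a hypothetical name, and the sketch differs from the diff in one respect: it only passes a string `encoding` to `Buffer#toString`, since `write(chunk, callback)` places a function in the `encoding` position. Because the original writer is restored as soon as `fn` returns, anything written later by asynchronous work started inside `fn` would not be captured by either version.

```javascript
'use strict';

// Simplified, stdout-only variant of the capture technique, for illustration.
function captureStdout(fn) {
  let stdout = '';
  const originalWrite = process.stdout.write;
  process.stdout.write = function capture(chunk, encoding, callback) {
    // write(chunk, callback) is a valid call shape, so only treat a string as
    // an encoding and treat a function in that position as the callback.
    const enc = typeof encoding === 'string' ? encoding : 'utf8';
    stdout += Buffer.isBuffer(chunk) ? chunk.toString(enc) : String(chunk);
    const done = typeof encoding === 'function' ? encoding : callback;
    if (typeof done === 'function') done();
    return true;
  };
  try {
    const result = fn();
    return { result, stdout };
  } finally {
    // Restore the real writer even when fn throws.
    process.stdout.write = originalWrite;
  }
}

const writeBefore = process.stdout.write;
const demo = captureStdout(() => {
  process.stdout.write('hello from the runner\n');
  process.stdout.write(Buffer.from('buffered line\n'), () => {});
  return 0;
});
```

After the call, `demo.stdout` holds both lines, `demo.result` is `0`, and `process.stdout.write` is the original function again.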
@@ -62,24 +130,43 @@ function finishAgentSession(repoRoot, options, deps = {}) {
     ...options.finishArgs,
   ];

-  output.write(`[${TOOL_NAME}] Agent session: ${session.id}\n`);
-  output.write(`[${TOOL_NAME}] Branch: ${session.branch}\n`);
-  output.write(`[${TOOL_NAME}] Worktree: ${session.worktreePath || '(unknown)'}\n`);
+  if (!jsonMode) {
+    output.write(`[${TOOL_NAME}] Agent session: ${session.id}\n`);
+    output.write(`[${TOOL_NAME}] Branch: ${session.branch}\n`);
+    output.write(`[${TOOL_NAME}] Worktree: ${session.worktreePath || '(unknown)'}\n`);
+  }

   try {
-    const result = finishRunner(finishArgs);
+    const runnerResult = jsonMode
+      ? captureProcessOutput(() => finishRunner(finishArgs))
+      : { result: finishRunner(finishArgs), captured: {} };
+    const result = runnerResult.result;
     const status = sessionStatusAfterFinish(finishArgs);
-    updateAgentSession(repoRoot, session.id, { status });
-    output.write(`[${TOOL_NAME}] Finish result: ${status}\n`);
-    return { session, status, result, finishArgs };
+    const evidence = buildFinishEvidence(session, finishArgs, status, result, runnerResult.captured);
+    updateAgentSession(repoRoot, session.id, {
+      status,
+      pr: { url: evidence.prUrl, state: evidence.mergeState },
+      finishEvidence: evidence,
+    });
+    if (!jsonMode) {
+      output.write(`[${TOOL_NAME}] Finish result: ${status}\n`);
+    }
+    return { session, status, result, finishArgs, evidence };
   } catch (error) {
-    updateAgentSession(repoRoot, session.id, { status: 'failed' });
-    output.write(`[${TOOL_NAME}] Finish result: failed\n`);
+    const evidence = buildFinishEvidence(session, finishArgs, 'failed', null);
+    updateAgentSession(repoRoot, session.id, {
+      status: 'failed',
+      finishEvidence: evidence,
+    });
+    if (!jsonMode) {
+      output.write(`[${TOOL_NAME}] Finish result: failed\n`);
+    }
     throw error;
   }
 }

 module.exports = {
+  buildFinishEvidence,
   finishAgentSession,
   resolveAgentSessionForFinish,
 };
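
The finish evidence depends on two pieces of logic that are easy to exercise in isolation: extracting a PR URL from runner output, and mapping the session status to a merge state. The sketch below copies both regular expressions and the status mapping from `buildFinishEvidence`; `mergeStateFor` is a hypothetical name for the inline ternary, and the sample output lines and URLs are made up. Note that, as far as this diff shows, the merge state is derived from the session status rather than queried from the hosting service.

```javascript
'use strict';

// Patterns copied from buildFinishEvidence in the diff above.
const PR_URL_PATTERNS = [
  /\[agent-branch-finish\] (?:Merged PR|PR):\s+(https?:\/\/\S+)/,
  /\b(https?:\/\/\S+\/pull\/\d+)\b/,
];

// Return the first capture group of the first pattern that matches, or ''.
function firstMatch(text, patterns) {
  for (const pattern of patterns) {
    const match = text.match(pattern);
    if (match && match[1]) return match[1].trim();
  }
  return '';
}

// Same mapping as the inline ternary in buildFinishEvidence.
function mergeStateFor(status) {
  if (status === 'finished') return 'MERGED';
  if (status === 'pr-opened') return 'OPEN';
  return status.toUpperCase();
}

// Illustrative output lines; the URLs are placeholders.
const tagged = '[agent-branch-finish] Merged PR: https://github.com/example/repo/pull/12\n';
const untagged = 'Opened https://github.com/example/repo/pull/34.';

console.log(firstMatch(tagged, PR_URL_PATTERNS)); // https://github.com/example/repo/pull/12
console.log(firstMatch(untagged, PR_URL_PATTERNS)); // https://github.com/example/repo/pull/34
console.log(firstMatch('no url here', PR_URL_PATTERNS)); // prints an empty line
console.log(mergeStateFor('failed')); // FAILED
```

The second pattern acts as a fallback for output that mentions a pull request URL without the `[agent-branch-finish]` prefix; its trailing `\b` keeps sentence punctuation out of the captured URL.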

src/agents/sessions.js

Lines changed: 7 additions & 0 deletions
@@ -10,6 +10,13 @@ const SESSION_FIELDS = [
   'worktreePath',
   'base',
   'status',
+  'activity',
+  'claims',
+  'metadata',
+  'launchCommand',
+  'tmux',
+  'pr',
+  'finishEvidence',
   'claimFailure',
   'createdAt',
   'updatedAt',
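
The diff does not show how `SESSION_FIELDS` is consumed. If it serves as an allowlist when session records are read or written (an assumption), then fields such as `pr` and `finishEvidence` would be dropped unless they are listed, which would explain this hunk. A sketch of that pattern, with a hypothetical `pickFields` helper and only the entries visible in the hunk:

```javascript
'use strict';

// Only the entries visible in the hunk above. The beginning of the real list
// lies outside the diff context and is not reproduced here.
const VISIBLE_SESSION_FIELDS = [
  'worktreePath', 'base', 'status', 'activity', 'claims', 'metadata',
  'launchCommand', 'tmux', 'pr', 'finishEvidence', 'claimFailure',
  'createdAt', 'updatedAt',
];

// Hypothetical allowlist filter: copy only known fields from a record.
function pickFields(record, fields) {
  const picked = {};
  for (const field of fields) {
    if (Object.prototype.hasOwnProperty.call(record, field)) {
      picked[field] = record[field];
    }
  }
  return picked;
}

// Illustrative record with one field that is not in the allowlist.
const stored = pickFields(
  { status: 'finished', pr: { url: '', state: 'MERGED' }, scratch: 'dropped' },
  VISIBLE_SESSION_FIELDS,
);
console.log(Object.keys(stored)); // [ 'status', 'pr' ]
```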

src/agents/start.js

Lines changed: 42 additions & 0 deletions
@@ -81,6 +81,9 @@ function buildStartPlan(options, repoRoot, env = process.env) {
     base,
     branchName,
     worktreePath,
+    claimedFiles: Array.isArray(options.claims) ? [...options.claims] : [],
+    metadata: options.metadata && typeof options.metadata === 'object' ? { ...options.metadata } : {},
+    tmux: options.tmux && typeof options.tmux === 'object' ? { ...options.tmux } : null,
     launchCommand: buildAgentLaunchCommand({ agentId: agent.id, prompt: requestedTask, worktreePath }),
   };
 }
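
The three added plan fields copy their inputs instead of holding references to the caller's `options`. The sketch below isolates that normalization to show its effect; `copyPlanInputs` is a hypothetical wrapper around the expressions from the diff, and the metadata shape is illustrative only. The copies are shallow, so nested objects inside `metadata` would still be shared with the caller.

```javascript
'use strict';

// Expressions copied from buildStartPlan; the wrapper itself is hypothetical.
function copyPlanInputs(options) {
  return {
    claimedFiles: Array.isArray(options.claims) ? [...options.claims] : [],
    metadata: options.metadata && typeof options.metadata === 'object' ? { ...options.metadata } : {},
    tmux: options.tmux && typeof options.tmux === 'object' ? { ...options.tmux } : null,
  };
}

const options = { claims: ['README.md'], metadata: { plan: 'queen-plan' } };
const plan = copyPlanInputs(options);

// Later mutation of the caller's options does not leak into the plan.
options.claims.push('src/agents/start.js');
options.metadata.plan = 'changed';

console.log(plan.claimedFiles); // [ 'README.md' ]
console.log(plan.metadata.plan); // queen-plan
console.log(plan.tmux); // null
```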
@@ -114,6 +117,26 @@ function buildLaunchOptions(options) {
   return launchOptions;
 }

+function buildDryRunPayload(plan) {
+  const tmux = plan.tmux || {};
+  return {
+    schemaVersion: 1,
+    dryRun: true,
+    task: plan.task,
+    prompt: plan.requestedTask,
+    agent: plan.agent.id,
+    base: plan.base,
+    branch: plan.branchName,
+    worktree: plan.worktreePath,
+    worktreePath: plan.worktreePath,
+    claimedFiles: plan.claimedFiles,
+    launchCommand: plan.launchCommand,
+    tmuxSession: tmux.session || null,
+    tmuxTarget: tmux.target || null,
+    metadata: plan.metadata,
+  };
+}
+
 function renderDryRunPlan(plan) {
   return [
     '[gitguardex] Agents start dry-run:',
@@ -132,6 +155,16 @@ function renderDryRunPlan(plan) {
 function dryRunStart(options, repoRoot) {
   const launchOptions = buildLaunchOptions(options);
   const plans = launchOptions.map((launchOption) => buildStartPlan(launchOption, repoRoot));
+  if (options.json) {
+    if (plans.length === 1) {
+      return `${JSON.stringify(buildDryRunPayload(plans[0]), null, 2)}\n`;
+    }
+    return `${JSON.stringify({
+      schemaVersion: 1,
+      dryRun: true,
+      launches: plans.map(buildDryRunPayload),
+    }, null, 2)}\n`;
+  }
   if (plans.length === 1 && !options.panel) {
     return renderDryRunPlan(plans[0]);
   }
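
In JSON mode `dryRunStart` emits a bare plan object when there is exactly one plan and an envelope with a `launches` array otherwise, so a consumer has to handle both shapes. A minimal normalizing sketch follows; `launchesFromDryRun` is a hypothetical consumer-side helper and the sample payloads are trimmed to a few illustrative fields.

```javascript
'use strict';

// Accept either dry-run shape and always return an array of launch plans.
function launchesFromDryRun(payload) {
  if (payload.schemaVersion !== 1 || payload.dryRun !== true) {
    throw new Error('Not a schemaVersion 1 dry-run payload');
  }
  return Array.isArray(payload.launches) ? payload.launches : [payload];
}

// Illustrative payloads; real plans carry more fields.
const single = { schemaVersion: 1, dryRun: true, agent: 'codex', branch: 'agent/example-a' };
const multi = {
  schemaVersion: 1,
  dryRun: true,
  launches: [
    { schemaVersion: 1, dryRun: true, agent: 'codex', branch: 'agent/example-a' },
    { schemaVersion: 1, dryRun: true, agent: 'codex', branch: 'agent/example-b' },
  ],
};

console.log(launchesFromDryRun(single).map((launch) => launch.branch)); // [ 'agent/example-a' ]
console.log(launchesFromDryRun(multi).map((launch) => launch.branch)); // [ 'agent/example-a', 'agent/example-b' ]
```

Keying on the presence of `launches` rather than on the number of requested agents keeps the consumer independent of how the launcher expands its options.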
@@ -183,6 +216,14 @@ function buildSessionPayload(options, metadata, status, extra = {}) {
     branch: metadata.branch,
     worktreePath: path.resolve(metadata.worktreePath),
     base: options.base || null,
+    claims: Array.isArray(options.claims) ? [...options.claims] : [],
+    metadata: options.metadata && typeof options.metadata === 'object' ? { ...options.metadata } : {},
+    launchCommand: buildAgentLaunchCommand({
+      agentId: options.agent || 'codex',
+      prompt: options.task,
+      worktreePath: path.resolve(metadata.worktreePath),
+    }),
+    tmux: options.tmux && typeof options.tmux === 'object' ? { ...options.tmux } : null,
     status,
     ...extra,
   };
@@ -343,6 +384,7 @@ function startAgentLane(repoRoot, options, deps = {}) {

 module.exports = {
   buildBranchStartArgs,
+  buildDryRunPayload,
   buildLaunchOptions,
   buildStartPlan,
   buildRecoveryLines,
