feat(run-action): forceReload flag acknowledges human edits as new baseline (closes #173 sub-issue 3) by Lykhoyda · Pull Request #176 · Lykhoyda/rn-dev-agent

Lykhoyda · 2026-05-20T10:33:50Z

Summary

Closes the third sub-issue in #173 (fresh session feedback) — cdp_run_action aborting with "yaml at ... has been edited externally since the in-memory action was loaded" during active YAML composition.

Root cause

cdp_repair_action's Phase 129 guardrail (actionWasEditedExternally() check) was designed to protect offline human edits from auto-repair clobber. But it has no way to distinguish:

Protect-me case: human edited offline, doesn't want repair to overwrite
Active-composition case: human is in the loop right now, wants repair to operate on their latest YAML

In active composition, the guardrail fires constantly (every YAML save bumps mtime past the sidecar's lastSeenMtimeMs) and forces fallback to raw maestro_run, losing auto-repair telemetry.

Fix

New forceReload: boolean arg on cdp_run_action (default true).

forceReload: true (default) — the orchestrator stats the YAML at load time and persists its current mtime to the sidecar's lastSeenMtimeMs before running. Downstream actionWasEditedExternally() returns false; repair proceeds normally on top of the human's latest edit.
forceReload: false — opt-in to the strict Phase 129 behavior. Refuses to persist over an externally-modified YAML. The right choice for CI replays of fixed baselines.

Why default `true`

The active-composition case is high-frequency in the user's reported session. The protection case is the minority. Defaulting true fixes the reported pain without requiring callers to remember a flag every call.

Worth flagging: codex-pair raised this design call as a HIGH during review ("default forceReload bypasses external-edit protection before auto-repair"). It's a deliberate user-approved tradeoff — the strict behavior is opt-in for callers who specifically need offline-edit protection, and the tool description spells out the choice. Reviewers please weigh in if you disagree with the default.

Implementation

New acknowledgeExternalEdit(action) helper in domain/action-store.ts — stats YAML, persists markSeen(state, currentMtime) to sidecar, returns refreshed ReusableAction. No-op when the YAML mtime is already current.
cdp_run_action handler invokes it iff forceReload !== false, between loadAction() and the first maestroRun call.
New Zod field + extended tool description so MCP clients see the flag.

Test plan

1471/1471 cdp-bridge unit tests passing (+4 net new in test/unit/gh-173-run-action-force-reload.test.js)
acknowledgeExternalEdit helper: refreshes sidecar to YAML's current stat-mtime
acknowledgeExternalEdit helper: no-op when sidecar already current (returns same object reference, no disk write)
Discriminating handler test for forceReload=true: the sidecar advance happens BEFORE the maestro run starts (snapshot captured INSIDE the fake maestroRun). A regression that skipped pre-run acknowledgment would still see the post-run RunRecord save advance the mtime, so this stronger assertion catches the exact failure mode.
Discriminating handler test for forceReload=false: at the moment maestro runs, the sidecar is still at baseline (no pre-run bump). The strict mode run fails with isError: true on stale YAML — proves the user-facing pain is real and that opting out preserves it.
codex-pair MED on weak assertions fixed by switching to mid-run snapshots
CI green
Live verification: composing a YAML, calling cdp_run_action without explicitly passing forceReload, observing the run + repair complete without STALE_TARGET

Out of scope

The other 4 friction items in Session feedback (IX-2997, ~6h): 5 wins + 5 friction items #173 (multi-sim recording ambiguity, runFlow.when race, cdp_run_action CAS auto-reload-and-retry was THIS PR's surface, Expo Dev Client tutorial modal trap).
Splitting Session feedback (IX-2997, ~6h): 5 wins + 5 friction items #173 into separate child issues. Worth doing as project housekeeping — the parent issue stays open after this merges.

Refs

Session feedback (IX-2997, ~6h): 5 wins + 5 friction items #173 — fresh session feedback (5 wins + 5 friction items)
scripts/cdp-bridge/src/domain/action-store.ts:acknowledgeExternalEdit
scripts/cdp-bridge/src/tools/run-action.ts (handler wiring)
Phase 129 guardrail context: scripts/cdp-bridge/src/tools/repair-action.ts:194

🤖 Generated with Claude Code

…seline Closes #173 (sub-issue 3). Symptom: cdp_run_action refuses to complete with "yaml at ... has been edited externally since the in-memory action was loaded" every time the human touches the YAML during composition. The Phase 129 guardrail in cdp_repair_action — added to protect offline human edits from auto-repair clobber — can't distinguish "human edited offline, protect" from "human is actively composing, use latest". Forces users to fall back to raw maestro_run, losing auto-repair telemetry. Fix: new `forceReload: boolean` arg on cdp_run_action (default true). When true, the orchestrator stats the YAML on load and persists its current mtime to the sidecar's lastSeenMtimeMs BEFORE running. Any downstream actionWasEditedExternally() check returns false, so repair proceeds normally on top of the human's latest edit. Pass `forceReload: false` to opt back into the strict Phase 129 behavior (refuse to persist over an externally-modified YAML). The right choice for CI replays of fixed baselines and any flow where offline-edit protection matters more than active-composition ergonomics. Why default true: the high-frequency friction is the active-composition case. The protection case is the minority. Defaulting true fixes the reported pain without requiring callers to remember a flag every call. codex-pair flagged this design call as a HIGH during review; it's a deliberate user-approved tradeoff (the strict behavior is opt-in for callers who specifically need it). Implementation: - new acknowledgeExternalEdit(action) helper in action-store.ts — stats YAML, persists markSeen(state, currentMtime) to sidecar, returns refreshed ReusableAction. No-op when mtime is already current. - cdp_run_action handler invokes it iff forceReload !== false, between loadAction() and the maestroRun call. - new Zod field + extended tool description so MCP clients see the flag and understand the tradeoff. Tests (1471/1471 passing, +4 net new): - acknowledgeExternalEdit: refreshes sidecar lastSeenMtimeMs to YAML's current stat-mtime - acknowledgeExternalEdit: no-op when sidecar already current (returns same object reference, no disk write) - cdp_run_action forceReload=true (default): the sidecar advance happens BEFORE the run starts (snapshot captured inside fake maestroRun) - cdp_run_action forceReload=false: the sidecar is still at baseline at the moment maestro runs (snapshot proves no pre-run acknowledgment), and the strict mode run fails with isError=true on stale YAML The post-run mtime check was strengthened to a mid-run snapshot per codex-pair's MED review — the prior assertion could pass even if acknowledgment never happened (the RunRecord save would advance mtime anyway). The new snapshot capture proves the pre-run acknowledgment specifically. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Lykhoyda merged commit a18669f into main May 20, 2026
7 checks passed

Lykhoyda deleted the fix/gh-173-run-action-force-reload branch May 20, 2026 10:53

Lykhoyda mentioned this pull request May 20, 2026

Session feedback (IX-2997, ~6h): 5 wins + 5 friction items #173

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(run-action): forceReload flag acknowledges human edits as new baseline (closes #173 sub-issue 3)#176

feat(run-action): forceReload flag acknowledges human edits as new baseline (closes #173 sub-issue 3)#176
Lykhoyda merged 1 commit into
mainfrom
fix/gh-173-run-action-force-reload

Lykhoyda commented May 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Lykhoyda commented May 20, 2026

Summary

Root cause

Fix

Why default true

Implementation

Test plan

Out of scope

Refs

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Why default `true`