fix(isolation): close all 4 subissues of tracker #228
Lands v1 scope of every subissue under tracker #228. Friction observed
across two campaigns on 2026-05-27 (paper-burst + cross-account-signal-
pooling) where the experiment worktree's isolation guarantee was
silently undermined: executors `cd` to the parent repo to use
gitignored deps, write Python modules into the worktree without
declaring them as code_changes, and on resume occasionally hit
unreplaced-placeholder errors.
Closes #229 — `target_system.worktree_extras` schema field +
auto-symlink gitignored deps from main into each experiment
worktree on creation. Each entry must be a non-empty relative path
resolving under repo_path; absolute paths and ../ traversal are
rejected at creation time. Source must exist. On extras failure,
the half-built worktree + branch are cleaned up before re-raising
(no leak). Symlinks live inside the worktree dir, so
`git worktree remove --force` reaps them with the rest of the dir.
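
The validation and symlinking rules for #229 can be sketched roughly as follows. This is a minimal illustration, not the code that landed: the function names `validate_extra` and `link_extras` are hypothetical, and the cleanup of a half-built worktree and branch on failure is not shown.

```python
import os
from pathlib import Path


def validate_extra(repo_path: Path, entry: str) -> Path:
    """Return the source path for one worktree_extras entry, or raise ValueError.

    Hypothetical helper; it mirrors the rules stated in the commit message.
    """
    if not entry or not entry.strip():
        raise ValueError("worktree_extras entry must be non-empty")
    rel = Path(entry)
    if rel.is_absolute():
        raise ValueError(f"absolute path rejected: {entry!r}")
    if ".." in rel.parts:
        raise ValueError(f"'..' traversal rejected: {entry!r}")
    source = repo_path / rel
    if not source.exists():
        raise ValueError(f"source does not exist: {entry!r}")
    return source


def link_extras(repo_path: Path, worktree: Path, extras: list[str]) -> list[Path]:
    """Symlink each validated entry from repo_path into worktree.

    Every entry is validated before any link is created, so a bad entry
    leaves the worktree untouched.
    """
    sources = [(entry, validate_extra(repo_path, entry)) for entry in extras]
    links = []
    for entry, source in sources:
        link = worktree / entry
        link.parent.mkdir(parents=True, exist_ok=True)
        os.symlink(source.resolve(), link, target_is_directory=source.is_dir())
        links.append(link)
    return links
```

Because the links are created inside the worktree directory, removing that directory removes them too, which matches the cleanup behavior described above.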
Closes #230 — Pre-cleanup `git status --porcelain` warns on
undeclared writes; surfaced in `findings.json` under a new optional
`worktree_uncommitted_writes` key (schema-allowed). Filters
symlinks (orchestrator-managed inputs from #229) and any path
declared in `bundle.arms[].code_changes[].file`. Tripwire only —
never blocks cleanup.
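
A rough sketch of the #230 tripwire, assuming porcelain v1 output and a bundle shaped like `bundle.arms[].code_changes[].file`. The function names here are hypothetical, and the quote handling is a simplification (git C-quotes paths containing special characters).

```python
import subprocess
from pathlib import Path


def git_status(worktree: Path) -> str:
    """Run `git status --porcelain` in the worktree and return its output."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=worktree, capture_output=True, text=True, check=True,
    )
    return result.stdout


def declared_paths(bundle: dict) -> set[str]:
    """Collect every path declared under arms[].code_changes[].file."""
    return {
        change["file"]
        for arm in bundle.get("arms", [])
        for change in arm.get("code_changes", [])
    }


def undeclared_writes(porcelain: str, worktree: Path, declared: set[str]) -> list[str]:
    """Return sorted paths that are neither symlinks nor declared code changes."""
    found = set()
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        # Porcelain v1: two status characters, one space, then the path.
        path = line[3:]
        # Renames are reported as "old -> new"; keep the new path.
        if " -> " in path:
            path = path.split(" -> ", 1)[1]
        path = path.strip('"').rstrip("/")
        if (worktree / path).is_symlink():
            continue  # orchestrator-managed input from worktree_extras
        if path in declared:
            continue
        found.add(path)
    return sorted(found)
```

The result is only reported; nothing in this sketch raises or blocks, in keeping with the tripwire-only behavior.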
Closes #231 — "Worktree discipline" guidance added to
`execute_analyze.md` (full) and `execute_analyze_thin.md`. Tells
the executor: stay in the worktree, reference parent assets via
`worktree_extras` symlinks (relative paths, not absolute paths
into main), declare any new files via `code_changes`. Prose-only
change — no new placeholders, no behavior change to dispatch.
Closes #232 — Forensic logging in `prompt_loader.py`: when
rendering fails on unreplaced placeholders, log
`template`, `resolved_path`, `missing_placeholders`, and
`context_keys` at ERROR before raising. Diagnostic only — no
speculative fix for the resume-time bug. A fix for the intermittent
unreplaced-placeholders failure on iter-2 EXECUTE_ANALYZE is
gated on reproduction; this PR ships only the net to catch it.
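
The diagnostic for #232 might look something like the sketch below. The `{{name}}` placeholder syntax, the `render` signature, and the exception type are assumptions for illustration; only the four logged fields and the ERROR level come from the description above.

```python
import logging
import re

logger = logging.getLogger("prompt_loader")

# Assumed placeholder syntax: {{name}}. The real template syntax may differ.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, resolved_path: str, text: str, context: dict) -> str:
    """Substitute placeholders in text; log forensics and raise if any are missing."""
    missing = sorted({name for name in _PLACEHOLDER.findall(text) if name not in context})
    if missing:
        # Log everything needed to reconstruct the failure before raising.
        logger.error(
            "unreplaced placeholders: template=%s resolved_path=%s "
            "missing_placeholders=%s context_keys=%s",
            template, resolved_path, missing, sorted(context),
        )
        raise ValueError(f"Unreplaced placeholders in {template}: {missing}")
    return _PLACEHOLDER.sub(lambda m: str(context[m.group(1)]), text)
```

Logging `context_keys` alongside `missing_placeholders` shows whether a key was never supplied or was supplied under a different name, which is the kind of evidence a later fix would need.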
Refs #228 (tracker).
## Test plan
- 1167 passed, 1 skipped (was 1133 + 1 in #227 baseline; +34 new
behavioral tests).
- All tests use real on-disk fixtures or seam-injected fakes per
CLAUDE.md (no live LLM calls).
- New behavioral coverage:
  - test_worktree.py: TestWorktreeExtras (9), TestDetectUndeclaredWrites (6),
    TestDeclaredCodeChangePaths (5), TestRecordUndeclaredWritesInFindings (5)
  - test_prompt_loader.py: TestWorktreeDisciplineGuidance (3),
    TestPlaceholderDiagnosticLogging (2)
## Out of scope (v2 — explicit, in tracker #228)
- Post-execution `git -C <main> diff` fail-loud check on main's
working tree (defense in depth for #229)
- `auto_capture_writes` synthetic `code_changes` flag (alternative
for #230)
- Actual fix for the resume-time placeholder bug (gated on
reproduction; #232 ships the diagnostic)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
orchestrator/schemas/findings.schema.json
6 additions & 0 deletions
@@ -24,6 +24,12 @@
       "minimum": 0,
       "maximum": 100,
       "description": "Percentage of total effect from single dominant component, if detected."
+    },
+    "worktree_uncommitted_writes": {
+      "type": "array",
+      "items": { "type": "string", "minLength": 1 },
+      "uniqueItems": true,
+      "description": "#230 — paths the executor wrote inside the experiment worktree without declaring them in the bundle's `code_changes`. Surfaced just before worktree cleanup; logged at WARNING. Empty array (or absent) means the executor declared everything it wrote, or wrote nothing untracked. The orchestrator does not block cleanup on undeclared writes — this is a tripwire, not a gate."
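
The constraints this property places on a findings document can be sketched as a plain-Python check. This is an illustration only: `check_uncommitted_writes` is a hypothetical helper, not part of the orchestrator, and a real deployment would more likely rely on a JSON Schema validator.

```python
def check_uncommitted_writes(findings: dict) -> list[str]:
    """Return violations of the worktree_uncommitted_writes constraints."""
    key = "worktree_uncommitted_writes"
    if key not in findings:
        return []  # the property is optional
    value = findings[key]
    if not isinstance(value, list):
        return [f"{key} must be an array"]
    errors = []
    for i, item in enumerate(value):
        # Mirrors "type": "string" with "minLength": 1.
        if not isinstance(item, str) or len(item) < 1:
            errors.append(f"{key}[{i}] must be a non-empty string")
    # Mirrors "uniqueItems": true; repr() keeps unhashable items comparable.
    if len(set(map(repr, value))) != len(value):
        errors.append(f"{key} items must be unique")
    return errors
```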
prompts/methodology/execute_analyze.md
8 additions & 0 deletions
@@ -2,6 +2,14 @@ You are a scientific executor for the Nous hypothesis-driven experimentation fra
 
 You have **shell access**. You are running inside an isolated git worktree of the target system. You own this worktree — reset it yourself with `git checkout -- .` between conditions.
 
+## Worktree discipline (#228)
+
+Your `cwd` is an experiment worktree forked from the target repo's main branch. It contains the tracked source tree plus any symlinks the orchestrator created from `target_system.worktree_extras` (#229) — typically virtualenvs, prefetched data dirs, prior-iteration outputs, build artifacts.
+
+- **Stay in your worktree.** Do not `cd` to the parent repo to "use the real venv" or "read prior-iter results from main." Reference parent assets via the `worktree_extras` symlinks. They appear as ordinary paths inside the worktree and resolve to main automatically.
+- **Reference parent assets through symlinks, not absolute paths into main.** If `worktree_extras` includes `.nous/<campaign>`, read prior-iter files at `.nous/<campaign>/runs/iter-{N-1}/...` (relative — resolves through the symlink), NOT `/Users/<user>/.../<repo>/.nous/<campaign>/...`. Absolute paths into main work today by accident; they break under harness-managed isolation (#123).
+- **Code you write must be declared.** Any new file you create in the worktree must appear in your bundle arm's `code_changes[]` to survive cleanup. Files you don't declare get listed in `findings.worktree_uncommitted_writes` (#230) and lost when the worktree is removed. If you write code outside the worktree (e.g., into a `worktree_extras`-symlinked parent dir), that code persists by virtue of living in main — but think twice before doing it; you're outside the experiment's isolation.
+
 Your job has FIVE phases — all in one session with full context:
 1. **Prepare** — build, create patches, validate ALL commands
 2. **Execute** — run all conditions across seeds, capture results