Skip to content

Commit b6dfa08

Browse files
sriumcpclaude
andcommitted
fix(isolation): close all 4 subissues of tracker #228
Lands v1 scope of every subissue under tracker #228. Friction observed across two campaigns on 2026-05-27 (paper-burst + cross-account-signal- pooling) where the experiment worktree's isolation guarantee was silently undermined: executors `cd` to the parent repo to use gitignored deps, write Python modules into the worktree without declaring them as code_changes, and on resume occasionally hit unreplaced-placeholder errors. Closes #229 — `target_system.worktree_extras` schema field + auto-symlink gitignored deps from main into each experiment worktree on creation. Each entry must be a non-empty relative path resolving under repo_path; absolute paths and ../ traversal are rejected at creation time. Source must exist. On extras failure, the half-built worktree + branch are cleaned up before re-raising (no leak). Symlinks live inside the worktree dir, so `git worktree remove --force` reaps them with the rest of the dir. Closes #230 — Pre-cleanup `git status --porcelain` warns on undeclared writes; surfaced in `findings.json` under a new optional `worktree_uncommitted_writes` key (schema-allowed). Filters symlinks (orchestrator-managed inputs from #229) and any path declared in `bundle.arms[].code_changes[].file`. Tripwire only — never blocks cleanup. Closes #231 — "Worktree discipline" guidance added to `execute_analyze.md` (full) and `execute_analyze_thin.md`. Tells the executor: stay in the worktree, reference parent assets via `worktree_extras` symlinks (relative paths, not absolute paths into main), declare any new files via `code_changes`. Prose-only change — no new placeholders, no behavior change to dispatch. Closes #232 — Forensic logging in `prompt_loader.py`: when rendering fails on unreplaced placeholders, log `template`, `resolved_path`, `missing_placeholders`, and `context_keys` at ERROR before raising. Diagnostic only — no speculative fix for the resume-time bug. The intermittent Unreplaced-placeholders failure on iter-2 EXECUTE_ANALYZE is gated on reproduction; this PR ships the seine. Refs #228 (tracker). ## Test plan - 1167 passed, 1 skipped (was 1133 + 1 in #227 baseline; +34 new behavioral tests). - All tests use real on-disk fixtures or seam-injected fakes per CLAUDE.md (no live LLM calls). - New behavioral coverage: - test_worktree.py: TestWorktreeExtras (9), TestDetectUndeclared Writes (6), TestDeclaredCodeChangePaths (5), TestRecord UndeclaredWritesInFindings (5) - test_prompt_loader.py: TestWorktreeDisciplineGuidance (3), TestPlaceholderDiagnosticLogging (2) ## Out of scope (v2 — explicit, in tracker #228) - Post-execution `git -C <main> diff` fail-loud check on main's working tree (defense in depth for #229) - `auto_capture_writes` synthetic `code_changes` flag (alternative for #230) - Actual fix for the resume-time placeholder bug (gated on reproduction; #232 ships the diagnostic) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 0ae90a9 commit b6dfa08

9 files changed

Lines changed: 693 additions & 5 deletions

File tree

orchestrator/iteration.py

Lines changed: 60 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -48,6 +48,46 @@ class IterationOutcome(str, Enum):
4848

4949
TEMPLATES_DIR = Path(__file__).resolve().parent / "templates"
5050
SCHEMAS_DIR = Path(__file__).resolve().parent / "schemas"
51+
52+
53+
def _declared_code_change_paths(bundle_path: Path) -> set[str]:
54+
"""Read ``bundle.yaml`` and return every ``arms[].code_changes[].file``
55+
relative path declared on any arm. Returns an empty set if the file
56+
is missing, unparseable, or declares no ``code_changes`` (#230)."""
57+
if not bundle_path.exists():
58+
return set()
59+
try:
60+
bundle = yaml.safe_load(bundle_path.read_text()) or {}
61+
except yaml.YAMLError:
62+
return set()
63+
arms = bundle.get("arms") or []
64+
declared: set[str] = set()
65+
for arm in arms:
66+
if not isinstance(arm, dict):
67+
continue
68+
for change in arm.get("code_changes") or []:
69+
if isinstance(change, dict) and isinstance(change.get("file"), str):
70+
declared.add(change["file"])
71+
return declared
72+
73+
74+
def _record_undeclared_writes_in_findings(
75+
findings_path: Path,
76+
undeclared: list[str],
77+
) -> None:
78+
"""Merge ``worktree_uncommitted_writes`` into ``findings.json`` (#230).
79+
80+
No-op if findings.json is missing or unparseable — the schema
81+
validator downstream will already complain and we don't want to
82+
block worktree cleanup on a malformed findings artifact."""
83+
if not undeclared or not findings_path.exists():
84+
return
85+
try:
86+
findings = json.loads(findings_path.read_text())
87+
except json.JSONDecodeError:
88+
return
89+
findings["worktree_uncommitted_writes"] = sorted(set(undeclared))
90+
atomic_write(findings_path, json.dumps(findings, indent=2) + "\n")
5191
DEFAULTS_PATH = Path(__file__).resolve().parent / "defaults.yaml"
5292
_ARM_TYPE_RE = re.compile(r"^[a-zA-Z0-9_-]+$")
5393

@@ -960,8 +1000,9 @@ def _max_turns_for(phase_key: str) -> int:
9601000
create_experiment_worktree,
9611001
remove_experiment_worktree,
9621002
)
1003+
extras = campaign.get("target_system", {}).get("worktree_extras") or []
9631004
experiment_dir, experiment_id = create_experiment_worktree(
964-
Path(repo_path), iteration,
1005+
Path(repo_path), iteration, extras=extras,
9651006
)
9661007
(iter_dir / ".experiment_id").write_text(experiment_id)
9671008
print(f" Experiment worktree: {experiment_dir}")
@@ -1031,6 +1072,24 @@ def _max_turns_for(phase_key: str) -> int:
10311072
"Executor artifacts failed post-check validation: %s",
10321073
result["errors"],
10331074
)
1075+
# #230: surface undeclared writes that would be lost when the
1076+
# worktree is removed below. Persist into findings.json so the
1077+
# design agent on iter-N+1 can see what to declare in
1078+
# ``code_changes``. Tripwire only — never blocks cleanup.
1079+
if repo_path and experiment_dir is not None:
1080+
from orchestrator.worktree import detect_undeclared_writes
1081+
declared = _declared_code_change_paths(iter_dir / "bundle.yaml")
1082+
undeclared = detect_undeclared_writes(experiment_dir, declared)
1083+
if undeclared:
1084+
logger.warning(
1085+
"Executor wrote %d files in the experiment worktree "
1086+
"without declaring them in bundle.arms[].code_changes; "
1087+
"they will be lost on cleanup: %s",
1088+
len(undeclared), undeclared[:20],
1089+
)
1090+
_record_undeclared_writes_in_findings(
1091+
iter_dir / "findings.json", undeclared,
1092+
)
10341093
# Clean up worktree only on success
10351094
if repo_path and experiment_id:
10361095
remove_experiment_worktree(Path(repo_path), experiment_id)

orchestrator/prompt_loader.py

Lines changed: 13 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -60,9 +60,21 @@ def load(self, template_name: str, context: dict[str, str]) -> str:
6060

6161
remaining = _PLACEHOLDER_RE.findall(text)
6262
if remaining:
63+
missing = sorted(set(remaining))
64+
# #232: forensic logging on the resume-time placeholder bug.
65+
# The error message names the missing placeholders; we add
66+
# what keys WERE present in the context so the next
67+
# occurrence produces evidence pointing at the actual cause
68+
# (phase string mismatch, stale loader, resume-path field
69+
# uninitialized, ...).
70+
logger.error(
71+
"prompt render failed: template=%s resolved_path=%s "
72+
"missing_placeholders=%s context_keys=%s",
73+
template_name, path, missing, sorted(context.keys()),
74+
)
6375
raise ValueError(
6476
f"Unreplaced placeholders in {template_name}.md: "
65-
f"{', '.join(sorted(set(remaining)))}"
77+
f"{', '.join(missing)}"
6678
)
6779

6880
logger.debug("Loaded prompt %s (%d chars)", template_name, len(text))

orchestrator/schemas/campaign.schema.yaml

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -84,6 +84,23 @@ properties:
8484
type: ["string", "null"]
8585
minLength: 1
8686
description: "Path to target system git repo. Used by CLIDispatcher for code-access agents. If set, experiments run in isolated worktrees."
87+
worktree_extras:
88+
type: array
89+
items:
90+
type: string
91+
minLength: 1
92+
uniqueItems: true
93+
description: >
94+
Paths (relative to repo_path) to symlink into each experiment
95+
worktree on creation (#229). Use this for gitignored assets
96+
the executor needs from main — virtualenvs, pre-fetched data
97+
dirs, prior-iteration outputs, build artifacts. Without this
98+
mechanism, executors discover that the worktree is missing
99+
their tooling and `cd` back to the parent repo, silently
100+
breaking isolation. Each entry must be a relative path that
101+
resolves under repo_path; absolute paths and ``..`` traversal
102+
are rejected at worktree creation. Source must exist in main
103+
at the time of worktree creation.
87104
88105
models:
89106
type: object

orchestrator/schemas/findings.schema.json

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,12 @@
2424
"minimum": 0,
2525
"maximum": 100,
2626
"description": "Percentage of total effect from single dominant component, if detected."
27+
},
28+
"worktree_uncommitted_writes": {
29+
"type": "array",
30+
"items": { "type": "string", "minLength": 1 },
31+
"uniqueItems": true,
32+
"description": "#230 — paths the executor wrote inside the experiment worktree without declaring them in the bundle's `code_changes`. Surfaced just before worktree cleanup; logged at WARNING. Empty array (or absent) means the executor declared everything it wrote, or wrote nothing untracked. The orchestrator does not block cleanup on undeclared writes — this is a tripwire, not a gate."
2733
}
2834
},
2935
"$defs": {

orchestrator/worktree.py

Lines changed: 143 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@
1717
import time
1818
import uuid
1919
from pathlib import Path
20-
from typing import Callable
20+
from typing import Callable, Sequence
2121

2222
logger = logging.getLogger(__name__)
2323

@@ -26,9 +26,28 @@
2626
_DEFAULT_ORPHAN_AGE_SECONDS = 60 * 60 # 1 hour
2727

2828

29-
def create_experiment_worktree(repo_path: Path, iteration: int) -> tuple[Path, str]:
29+
def create_experiment_worktree(
30+
repo_path: Path,
31+
iteration: int,
32+
*,
33+
extras: Sequence[str] | None = None,
34+
) -> tuple[Path, str]:
3035
"""Create a git worktree for running an experiment in isolation.
3136
37+
Args:
38+
repo_path: Target repo root.
39+
iteration: 1-based iteration index. Used to name the experiment.
40+
extras: Paths (relative to ``repo_path``) to symlink into the
41+
worktree. Each entry is symlinked as ``<worktree>/<entry>``
42+
pointing at ``<repo_path>/<entry>`` (#229). Use this for
43+
gitignored deps the executor needs from main — venvs, large
44+
data dirs, prior-iteration outputs — so the executor doesn't
45+
have to ``cd`` to the parent repo and silently break
46+
isolation. Each path must be relative and resolve under
47+
``repo_path``; absolute paths and ``..`` traversal are
48+
rejected. Source must exist; missing source raises
49+
``FileNotFoundError``.
50+
3251
Returns:
3352
Tuple of (worktree_path, experiment_id).
3453
"""
@@ -50,9 +69,131 @@ def create_experiment_worktree(repo_path: Path, iteration: int) -> tuple[Path, s
5069
text=True,
5170
)
5271
logger.info("Created experiment worktree: %s (branch: %s)", worktree_dir, branch_name)
72+
73+
if extras:
74+
try:
75+
_link_worktree_extras(repo_path, worktree_dir, extras)
76+
except BaseException:
77+
# If symlinking fails (bad extras config), don't leak the
78+
# half-built worktree + branch — clean up before re-raising.
79+
remove_experiment_worktree(repo_path, experiment_id)
80+
raise
81+
5382
return worktree_dir, experiment_id
5483

5584

85+
def _link_worktree_extras(
86+
repo_path: Path,
87+
worktree_dir: Path,
88+
extras: Sequence[str],
89+
) -> None:
90+
"""Symlink each entry in ``extras`` from ``repo_path`` into ``worktree_dir``.
91+
92+
Each entry must be a relative path (no leading ``/``, no ``..``
93+
traversal that escapes ``repo_path``). The source path must exist in
94+
``repo_path``. Parent directories under the worktree are created as
95+
needed.
96+
97+
If a path under the worktree already exists at the target location
98+
(typically because the entry refers to a tracked path that the
99+
worktree checkout already populated), the existing path is left
100+
untouched and a warning is logged — do not silently overwrite the
101+
real checkout with a symlink.
102+
"""
103+
for entry in extras:
104+
if not entry or os.path.isabs(entry):
105+
raise ValueError(
106+
f"worktree_extras entries must be non-empty relative paths; "
107+
f"got {entry!r}"
108+
)
109+
source = (repo_path / entry).resolve()
110+
try:
111+
source.relative_to(repo_path.resolve())
112+
except ValueError as exc:
113+
raise ValueError(
114+
f"worktree_extras entry {entry!r} resolves outside repo_path "
115+
f"({repo_path}); refusing to symlink across the repo boundary."
116+
) from exc
117+
if not source.exists():
118+
raise FileNotFoundError(
119+
f"worktree_extras source not found: {source} "
120+
f"(declared as {entry!r} in target_system.worktree_extras)"
121+
)
122+
123+
link_path = worktree_dir / entry
124+
if link_path.exists() or link_path.is_symlink():
125+
logger.warning(
126+
"worktree_extras: %s already present in worktree; leaving "
127+
"the existing path untouched (entry was %r)",
128+
link_path, entry,
129+
)
130+
continue
131+
link_path.parent.mkdir(parents=True, exist_ok=True)
132+
os.symlink(source, link_path)
133+
logger.info("worktree_extras: linked %s -> %s", link_path, source)
134+
135+
136+
def detect_undeclared_writes(
137+
worktree_path: Path,
138+
declared_paths: set[str] | None = None,
139+
) -> list[str]:
140+
"""Return paths in ``worktree_path`` that the executor wrote without
141+
declaring them via the bundle's ``code_changes`` (#230).
142+
143+
Parses ``git -C <worktree_path> status --porcelain`` and reports
144+
every untracked file (``??``) and modified-tracked file (``M``) that
145+
is not already covered by ``declared_paths``. Each returned path is
146+
relative to ``worktree_path``.
147+
148+
Symlinks (typically created by ``worktree_extras``, #229) are
149+
excluded — they're orchestrator-managed inputs, not undeclared
150+
executor writes.
151+
152+
The intent is to surface silent loss of work: an executor that
153+
writes a Python module via the ``Write`` tool but forgets to add a
154+
``code_changes`` entry will lose the file when the worktree is
155+
cleaned up. Reporting it loudly turns the silent loss into an
156+
auditable trail.
157+
"""
158+
declared_paths = declared_paths or set()
159+
worktree_path = Path(worktree_path)
160+
161+
if not worktree_path.exists():
162+
return []
163+
164+
result = subprocess.run(
165+
["git", "-C", str(worktree_path), "status", "--porcelain"],
166+
capture_output=True,
167+
text=True,
168+
check=False,
169+
)
170+
if result.returncode != 0:
171+
# Worktree may have been removed already, or git is unhappy —
172+
# don't make cleanup fail because diagnostics failed.
173+
logger.debug(
174+
"git status --porcelain on %s exited %s: %s",
175+
worktree_path, result.returncode, result.stderr.strip(),
176+
)
177+
return []
178+
179+
undeclared: list[str] = []
180+
for line in result.stdout.splitlines():
181+
if len(line) < 4:
182+
continue
183+
# Porcelain v1 prefix: 2 status chars + space + path.
184+
status = line[:2]
185+
path = line[3:].strip()
186+
if not (status.startswith("??") or "M" in status):
187+
continue
188+
if path in declared_paths:
189+
continue
190+
full = worktree_path / path
191+
if full.is_symlink():
192+
continue
193+
undeclared.append(path)
194+
return undeclared
195+
196+
56197
def remove_experiment_worktree(repo_path: Path, experiment_id: str) -> None:
57198
"""Remove a previously created experiment worktree and its branch.
58199

prompts/methodology/execute_analyze.md

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,14 @@ You are a scientific executor for the Nous hypothesis-driven experimentation fra
22

33
You have **shell access**. You are running inside an isolated git worktree of the target system. You own this worktree — reset it yourself with `git checkout -- .` between conditions.
44

5+
## Worktree discipline (#228)
6+
7+
Your `cwd` is an experiment worktree forked from the target repo's main branch. It contains the tracked source tree plus any symlinks the orchestrator created from `target_system.worktree_extras` (#229) — typically virtualenvs, prefetched data dirs, prior-iteration outputs, build artifacts.
8+
9+
- **Stay in your worktree.** Do not `cd` to the parent repo to "use the real venv" or "read prior-iter results from main." Reference parent assets via the `worktree_extras` symlinks. They appear as ordinary paths inside the worktree and resolve to main automatically.
10+
- **Reference parent assets through symlinks, not absolute paths into main.** If `worktree_extras` includes `.nous/<campaign>`, read prior-iter files at `.nous/<campaign>/runs/iter-{N-1}/...` (relative — resolves through the symlink), NOT `/Users/<user>/.../<repo>/.nous/<campaign>/...`. Absolute paths into main work today by accident; they break under harness-managed isolation (#123).
11+
- **Code you write must be declared.** Any new file you create in the worktree must appear in your bundle arm's `code_changes[]` to survive cleanup. Files you don't declare get listed in `findings.worktree_uncommitted_writes` (#230) and lost when the worktree is removed. If you write code outside the worktree (e.g., into a `worktree_extras`-symlinked parent dir), that code persists by virtue of living in main — but think twice before doing it; you're outside the experiment's isolation.
12+
513
Your job has FIVE phases — all in one session with full context:
614
1. **Prepare** — build, create patches, validate ALL commands
715
2. **Execute** — run all conditions across seeds, capture results

prompts/methodology/execute_analyze_thin.md

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,15 @@ This iteration's mode is: **{{iteration_mode}}**
1111

1212
{{mode_guidance}}
1313

14+
## Worktree discipline (#228)
15+
16+
Your `cwd` is an experiment worktree, not the target repo. Stay in it.
17+
Parent-repo assets (venvs, data dirs, prior-iter outputs) appear as
18+
symlinks declared in `target_system.worktree_extras` (#229) — reference
19+
them via relative paths, never `cd` to the parent repo. Any file you
20+
write that isn't in a bundle arm's `code_changes[]` gets listed in
21+
`findings.worktree_uncommitted_writes` and lost on cleanup (#230).
22+
1423
## Active principles
1524
{{active_principles}}
1625

0 commit comments

Comments
 (0)