Commit c6a6ee6
post-SDK-migration UX + architecture cleanup (AI-native-Systems-Research#189) (AI-native-Systems-Research#192)
* post-SDK-migration UX + architecture cleanup (AI-native-Systems-Research#189)
Closes AI-native-Systems-Research#183, AI-native-Systems-Research#184, AI-native-Systems-Research#185, AI-native-Systems-Research#186, AI-native-Systems-Research#187, AI-native-Systems-Research#188, AI-native-Systems-Research#190. Refs AI-native-Systems-Research#189.
Six paper-burst attempts on `mechanismdesign` failed before nous
produced a complete DESIGN→EXECUTE_ANALYZE flow, surfacing seven
distinct trip-hazards. This PR retires the legacy `claude -p`
subprocess path, fixes the last SDK-dispatcher path bug, widens
several authoring surfaces, and adds two operator-facing tools
(`nous stop`, `nous schema`) so agents and humans can run nous
without grepping the source.
Behaviour changes
-----------------
* `AI-native-Systems-Research#190` — SDK dispatcher writes `executor_log.jsonl` under
`runs/iter-N/inputs/` so the design-phase validator's iter-root
whitelist is preserved. Status reader falls back to the legacy
iter-root path so older campaigns keep rendering.
* `AI-native-Systems-Research#183` (BREAKING) — removed `--agent api` (the legacy claude -p
subprocess). `--agent sdk` is now the default and the only
user-facing code-access path. `claude-agent-sdk` and `anyio`
moved from `[project.optional-dependencies]` to required
`dependencies`. Programmatic `agent="api"` callers raise
`ValueError` with a migration message.
* `AI-native-Systems-Research#184` — `nous create-campaign` defaults `target_system.repo_path`
to CWD at scaffold time and exposes `--target-repo-path` to
override. Closes the silent "wrong work_dir" trap when
`nous run` is invoked from a different CWD later.
* `AI-native-Systems-Research#185` — campaign schema accepts top-level `ground_truth` (the
pre-registration use case) and `theory_references` items as
strings or full objects. New helpers
`_format_campaign_ground_truth` and `_normalize_theory_references`
in `llm_dispatch.py` surface ground_truth into the DESIGN prompt.
* `AI-native-Systems-Research#186` — campaign-level `max_turns` block overrides
`defaults.yaml` per phase. Resolution order: campaign > defaults
> hardcoded fallback (25).
* `AI-native-Systems-Research#187` — `DesignIncompleteError` fires before schema validation
when `bundle.yaml` / `problem.md` / `handoff_snapshot.md` are
missing after dispatch. The error names the missing files and
lists four common causes (`max_turns` exhausted, experiment run
during DESIGN, API stall, transport failure), each pointing at a
concrete artifact. A `failure_type: "design_incomplete"`
retry_log entry is also written.
* `AI-native-Systems-Research#188` — new `--bundle <path>` (with optional `--problem-md`
and `--handoff-md`) skips DESIGN by copying a pre-authored bundle.
Bundle is schema-validated, hashed, and recorded in
`iter-1/bundle_manifest.json` with `bundle_source: pre_authored`,
`bundle_path`, and `bundle_sha256` for reviewer-defensible
provenance.
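
As a rough illustration of the `max_turns` resolution order described for AI-native-Systems-Research#186 above (campaign > defaults > hardcoded fallback), here is a minimal sketch. The function name `resolve_max_turns` and the assumption that both sources are plain phase-to-integer mappings are hypothetical and not taken from the nous source; only the precedence and the fallback value of 25 come from this commit message.

```python
from typing import Mapping, Optional

# Fallback value stated in this commit message.
HARDCODED_MAX_TURNS = 25


def resolve_max_turns(
    phase: str,
    campaign_max_turns: Optional[Mapping[str, int]] = None,
    defaults_max_turns: Optional[Mapping[str, int]] = None,
) -> int:
    """Hypothetical helper: campaign > defaults > hardcoded fallback.

    Both mappings are assumed to be keyed by phase name; the real
    campaign and defaults.yaml layouts may differ.
    """
    for source in (campaign_max_turns, defaults_max_turns):
        # Skip a source that is absent or has no entry for this phase.
        if source and phase in source:
            return source[phase]
    return HARDCODED_MAX_TURNS
```

Under these assumptions, a campaign that sets only `{"design": 40}` would get 40 turns for the `design` phase and would fall through to the defaults (or to 25) for every other phase.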
New tooling for agents/humans
-----------------------------
* `nous stop <target> [--reason ...]` — writes a STOP sentinel at
the campaign work_dir. The orchestrator checks before each
iteration and exits cleanly with a `stopped_by_user` ledger row.
Mid-iteration interrupt is still Ctrl-C; this is the
agent-friendly handle.
* `nous schema [campaign|bundle|findings] [--format md|json|yaml]`
— pure deterministic Python (no LLM) that renders the schema
YAML/JSON as a Markdown reference. Walks `properties` once,
groups required vs optional, surfaces descriptions verbatim.
* README — added a "Quick reference" table and an "Observability"
section pointing at `executor_log.jsonl`, `retry_log.jsonl`,
`llm_metrics.jsonl`, `state.json`, and the design-incomplete
diagnostic. CLI flag help strings are now exhaustive.
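
The STOP-sentinel behaviour behind `nous stop` can be pictured with the sketch below. It is only an approximation under stated assumptions: the sentinel filename `STOP`, its JSON payload, the in-memory ledger, and the names `write_stop_sentinel` and `run_campaign` are all hypothetical. The commit message confirms only that a sentinel is written at the campaign work_dir, that it is checked before each iteration, and that a `stopped_by_user` ledger row is recorded.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List, Optional

STOP_FILENAME = "STOP"  # assumed name; the real sentinel filename may differ


def write_stop_sentinel(work_dir: Path, reason: Optional[str] = None) -> Path:
    """Hypothetical equivalent of `nous stop <target> --reason ...`."""
    sentinel = work_dir / STOP_FILENAME
    sentinel.write_text(json.dumps({"reason": reason}))
    return sentinel


def run_campaign(
    work_dir: Path,
    max_iterations: int,
    run_iteration: Callable[[int], None],
) -> List[Dict[str, object]]:
    """Check for the sentinel before each iteration, never in the middle of one."""
    ledger: List[Dict[str, object]] = []
    for n in range(1, max_iterations + 1):
        if (work_dir / STOP_FILENAME).exists():
            # Exit cleanly and record why the loop ended.
            ledger.append({"iteration": n, "status": "stopped_by_user"})
            break
        run_iteration(n)
        ledger.append({"iteration": n, "status": "completed"})
    return ledger
```

Because the check happens only at iteration boundaries, a sentinel written while iteration 2 is running lets that iteration finish and stops the loop before iteration 3, which matches the note that mid-iteration interrupts still need Ctrl-C.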
Tests
-----
939 passed, 1 skipped, 0 failed. New behavioural tests:
- `test_validate.py` — SDK log under inputs/ passes; iter-root
log still rejected (AI-native-Systems-Research#190 contract pinned).
- `test_inline_dispatch.py` — `agent="api"` raises migration
ValueError; sdk routing still works.
- `test_create_campaign.py` — `--target-repo-path` lands as real
value; CWD default works.
- `test_theory_references.py` + `test_campaign_ground_truth.py` —
new schema shapes accepted, helpers render correctly.
- `test_max_turns_resolution.py` — campaign > defaults > hardcoded.
- `test_design_artifact_assertion.py` — DesignIncompleteError
shape, hint coverage, retry_log entry.
- `test_pre_authored_bundle.py` — --bundle artifact copy, manifest
shape, schema-invalid bundle fails fast, validator accepts the
pre-authored iter dir.
- `test_nous_stop.py` — sentinel helpers, CLI handler, campaign
loop honours sentinel.
- `test_nous_schema_command.py` — Markdown / JSON / YAML output;
pinned that `nous schema` never invokes a dispatcher.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* ci: nudge to retrigger Tests workflow on PR AI-native-Systems-Research#191
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 3c26611 commit c6a6ee6
24 files changed
Lines changed: 2274 additions & 137 deletions
File tree
- docs
- orchestrator
- schemas
- tests