You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
A focused initiative to elevate Nous along four axes — user experience, campaign success and quality, speed, and token budget — by leaning hard into Claude Code primitives that Nous currently re-implements (or doesn't use) in plain Python. This issue tracks 15 child issues; each is independently shippable but they compose into a coherent rewrite plan.
Why now
Real-world friction (mined from ~3 days of recent Claude sessions across the inference-sim, well-baked, and saturation projects):
Visibility: the user typed "report progress" / "where is the campaign" / "how is this proceeding" dozens of times in a single afternoon (5/18). The agent answered every one by re-running the same five-line bash pipeline — sometimes mis-reading the live state because results files appeared between two ls calls.
Token bloat: handoff.md files in .nous/ range 8–18 KB and grow monotonically; principles.json reaches 26 entries on mech-design-enforcement. The 266-line design.md and 199-line execute_analyze.md are re-sent every call uncached.
Cross-campaign work: 33 campaigns on inference-sim alone. Asking "all campaigns about saturation detection, with results and patches" requires find … -name findings.json plumbing.
Recently shipped (this initiative builds on, does not duplicate)
The 15 sub-issues below are explicitly complementary to the work above — none re-litigate it.
Strategic shape
Nous today shells out to claude -p and rebuilds, in Python, capabilities that Claude Code already provides natively: parallel subagents, prompt caching, deterministic Stop hooks, MCP-mediated context, asynchronous human-in-the-loop, scheduled routines. The shape of this initiative is therefore delete code while gaining capabilities: most of cli_dispatch.py and parts of engine.py / worktree.py go away once the orchestrator is rebuilt on the Claude Agent SDK.
The single highest-leverage change is #1 (Agent SDK port) because it makes #2, #3, #4, #6, #7 from "lift" to "configure."
What this is
A focused initiative to elevate Nous along four axes — user experience, campaign success and quality, speed, and token budget — by leaning hard into Claude Code primitives that Nous currently re-implements (or doesn't use) in plain Python. This issue tracks 15 child issues; each is independently shippable but they compose into a coherent rewrite plan.
Why now
Real-world friction (mined from ~3 days of recent Claude sessions across the
inference-sim,well-baked, andsaturationprojects):lscalls.state.jsonhand-editing and repeated full re-designs (now partly fixed by fix: resume mid-flight campaign at correct iteration after timeout #91, but the parallel-worktree race remained).--max-cli-retries 10flag caused a second worktree to spawn while the first was still alive — two executors writing to the sameiter-N/results/directory. Solved partly by feat: retry transient claude -p subprocess errors with exponential backoff #71 + feat: pre-flight check + retry everything with failure persistence #111, but the architectural fragility (one giant session) remains..nous/range 8–18 KB and grow monotonically; principles.json reaches 26 entries onmech-design-enforcement. The 266-linedesign.mdand 199-lineexecute_analyze.mdare re-sent every call uncached.inference-simalone. Asking "all campaigns about saturation detection, with results and patches" requiresfind … -name findings.jsonplumbing.Recently shipped (this initiative builds on, does not duplicate)
nousCLInous validateCLI; executor writes artifacts directlynous replayruns deterministic plan, no LLMThe 15 sub-issues below are explicitly complementary to the work above — none re-litigate it.
Strategic shape
Nous today shells out to
claude -pand rebuilds, in Python, capabilities that Claude Code already provides natively: parallel subagents, prompt caching, deterministic Stop hooks, MCP-mediated context, asynchronous human-in-the-loop, scheduled routines. The shape of this initiative is therefore delete code while gaining capabilities: most ofcli_dispatch.pyand parts ofengine.py/worktree.pygo away once the orchestrator is rebuilt on the Claude Agent SDK.The single highest-leverage change is #1 (Agent SDK port) because it makes #2, #3, #4, #6, #7 from "lift" to "configure."
Suggested ship order
/goal-driven loopSub-issues
Sub-issues
cli_dispatch.pyfromsubprocess(claude -p)to the Claude Agent SDK #121 — Port to Claude Agent SDK (foundation)cache_control: ephemeral) #122 — Cache static methodology prompts (cache_control: ephemeral)/goal-driven loop (replace parts ofengine.py) #124 —/goal-driven campaign loopnous-mcpexposing campaigns as resources + tools #126 — MCP servernous-mcpfor campaigns--output-format stream-json;nous status --watchTUI #127 — Stream-json +nous status --watchTUIexperiment_plan.yaml#128 — PreToolUse hook to enforceexperiment_plan.yamlnous-execute-stop) #129 — Deterministic Stop hook for iteration completion.claude/settings.json) #135 — Per-campaign permission policy (kill--dangerously-skip-permissions)