Nous — Usage Story, Architecture, and Phase Plan #20
mtoslalibu started this conversation in General
What Is Nous?
Nous turns Claude into a systematic scientist for your codebase. Instead of trial-and-error ("try this, did it work?"), Nous makes Claude predict before acting, test properly, learn from failures, and remember what it learned.
One sentence: The scientific method, implemented as a Claude Code plugin, so your system investigations compound instead of starting fresh every time.
The Gap We're Closing
Engineers have intuitions about their systems but no structured way to turn those intuitions into testable hypotheses, run controlled experiments, analyze why results happened (not just what happened), and carry those lessons forward to the next idea. Today, knowledge lives in people's heads and evaporates between sessions. Nous makes it accumulate.
The Problem
Engineers use Claude Code every day to explore and optimize their systems. It works — Claude is smart. But it has three blind spots:
Nous fixes all three.
How You Use It
Install
Three Commands
/nous:init
/nous:investigate "question"
/nous:status
Walkthrough: Investigating a Scheduler
You're working on an LLM serving system. You think priority scheduling might reduce latency. Here's what happens.
1. Set up: /nous:init
You open Claude Code in your repo and run /nous:init. Nous explores the codebase and presents:
You add "swap manager" as a knob, remove "memory pressure" (not relevant here), and approve. Nous writes a config file and creates a .nous/ directory next to your code. Nothing in your repo changes.
2. Investigate: /nous:investigate
Nous designs a hypothesis bundle:
You review and approve. You see the bundle and AI review summaries. If something looks off, you reject and Nous revises.
Nous runs the experiment in an isolated git worktree — your branch is untouched. Three seeds per arm. Results:
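The "three seeds per arm" design can be sketched roughly as below. This is an illustration only, not Nous's actual runner: `run_arm`, the arm names, and the simulated latency numbers are hypothetical stand-ins for a real experiment executed in the worktree.

```python
import random
import statistics

SEEDS = (0, 1, 2)  # three seeds per arm, as in the walkthrough


def run_arm(arm: str, seed: int) -> float:
    """Hypothetical stand-in for one experiment run; returns a latency in ms."""
    rng = random.Random(f"{arm}-{seed}")  # deterministic per (arm, seed)
    base = {"baseline": 100.0, "treatment": 95.0}[arm]  # made-up numbers
    return base + rng.uniform(-2.0, 2.0)


def summarize(arm: str) -> dict:
    """Run every seed for one arm and report the mean and spread."""
    samples = [run_arm(arm, seed) for seed in SEEDS]
    return {
        "arm": arm,
        "n": len(samples),
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples),
    }


if __name__ == "__main__":
    for arm in ("baseline", "treatment"):
        print(summarize(arm))
```

Repeating each arm across seeds lets the comparison rest on a mean and a spread rather than a single, possibly lucky, run.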
Nous learns from the failure:
This is the key insight: the prediction was wrong, but we learned exactly why. With plain Claude, you'd just say "didn't work, try something else."
3. Next iteration
Nous is now constrained by RP-1. It won't propose priority scheduling at high load. Instead, it explores admission control — a different mechanism entirely.
4. Check progress: /nous:status
The prediction accuracy trend shows Nous building a better model of your system over time.
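One plausible way to compute a prediction accuracy trend is the fraction of predictions per iteration whose direction matched the observed outcome. The sketch below assumes a hypothetical `Prediction` record; the real Nous schema and the exact metric shown by the status command may differ.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    iteration: int
    predicted: str  # hypothetical labels, e.g. "improves" or "regresses"
    observed: str


def accuracy_by_iteration(predictions: list[Prediction]) -> dict[int, float]:
    """Return, per iteration, the fraction of predictions that matched."""
    hits: dict[int, int] = {}
    totals: dict[int, int] = {}
    for p in predictions:
        totals[p.iteration] = totals.get(p.iteration, 0) + 1
        hits[p.iteration] = hits.get(p.iteration, 0) + int(p.predicted == p.observed)
    return {i: hits[i] / totals[i] for i in sorted(totals)}


if __name__ == "__main__":
    log = [
        Prediction(1, "improves", "regresses"),
        Prediction(1, "improves", "improves"),
        Prediction(2, "improves", "improves"),
        Prediction(2, "regresses", "regresses"),
    ]
    print(accuracy_by_iteration(log))
```

An upward trend across iterations would suggest that accumulated principles are improving the model of the system; a flat or falling trend would suggest they are not.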
Walkthrough: A Different System
Nous isn't tied to LLM serving. Here's the same flow on a transaction scheduling system:
Same methodology, same orchestrator, same schemas. Only the configuration changes.
How It Compares
Nous vs Plain Claude
Nous vs Evolutionary Search (OpenEvolve/ADRS)
These are complementary. Evolutionary search finds solutions in large parameter spaces. Nous builds understanding.
What's Underneath: The Architecture
The three commands above are the UX layer. Underneath is a four-layer stack:
Why the layers are separate
AI reasons, Python enforces.
Key Mechanisms
Human gates — Two hard stops per iteration. You approve the experiment plan before any code runs, and you approve the results before principles are extracted.
Fast-fail rules — Main hypothesis refuted? Skip remaining arms, go straight to learning. No wasted compute.
Worktree isolation — Every experiment runs in an isolated git worktree. Your branch is never touched.
Prediction error taxonomy — When a prediction is wrong:
Compounding knowledge — Principles from iteration N constrain iteration N+1. The system gets smarter. Prediction accuracy trends upward.
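How a principle from one iteration might constrain the next can be sketched as a simple filter over candidate proposals. The `Principle` and `Proposal` shapes, and the encoding of RP-1 as "no priority scheduling at high load", are assumptions made for illustration; they are not the actual Nous schemas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principle:
    id: str
    mechanism: str  # mechanism the principle rules out
    condition: str  # regime in which it is ruled out


@dataclass(frozen=True)
class Proposal:
    mechanism: str
    condition: str


def allowed(proposal: Proposal, principles: list[Principle]) -> bool:
    """A proposal is allowed unless a principle rules out its mechanism in its regime."""
    return not any(
        p.mechanism == proposal.mechanism and p.condition == proposal.condition
        for p in principles
    )


if __name__ == "__main__":
    # Hypothetical encoding of RP-1 from the walkthrough.
    learned = [Principle("RP-1", "priority_scheduling", "high_load")]
    candidates = [
        Proposal("priority_scheduling", "high_load"),
        Proposal("admission_control", "high_load"),
    ]
    print([c.mechanism for c in candidates if allowed(c, learned)])
```

Keeping the check in plain Python rather than in a prompt fits the "AI reasons, Python enforces" split: the model proposes, and deterministic code decides whether the proposal respects what was already learned.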
How It Generalizes
Nous works on any system where you can measure something, change something, run it again, and reason about its parts. The generalization happens in three places:
campaign.yaml): describes the target system. Generated by /nous:init for any repo.
Why a Top Engineer Would Use This
A top engineer already thinks scientifically about their system. They have intuitions, mental models, informal hypotheses. What they lack:
Nous doesn't replace the engineer's thinking. It structures it — and makes Claude operate at the same level of rigor the engineer aspires to but rarely achieves under time pressure.
Phase Plan
Phase 1: Schemas + Orchestrator Skeleton — DONE
Issue: #11 | PR: #14
Built the foundation:
What you can do after this phase: Run the full orchestrator loop with stub agents. Validate that the state machine, gates, and fast-fail rules work correctly.
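A minimal sketch of what such a loop with stub agents could look like, covering the two human gates and the fast-fail rule. Every name here (`stub_design`, `stub_run`, `run_iteration`, the state strings) is hypothetical and not taken from the Nous codebase.

```python
from typing import Callable

Gate = Callable[[str, dict], bool]  # (gate name, payload) -> approved?


def stub_design(question: str) -> dict:
    """Stub agent: returns a fixed hypothesis bundle with a main arm and two others."""
    return {"question": question, "arms": ["main", "ablation", "control"]}


def stub_run(arm: str) -> str:
    """Stub agent: pretends to run one arm; the main arm is always refuted."""
    return "refuted" if arm == "main" else "confirmed"


def run_iteration(question: str, gate: Gate) -> dict:
    bundle = stub_design(question)

    # Gate 1: the plan must be approved before any experiment runs.
    if not gate("plan", bundle):
        return {"state": "plan_rejected", "results": {}, "principles": []}

    results: dict[str, str] = {}
    for arm in bundle["arms"]:
        results[arm] = stub_run(arm)
        # Fast-fail: if the main hypothesis is refuted, skip the remaining arms.
        if arm == "main" and results[arm] == "refuted":
            break

    # Gate 2: results must be approved before principles are extracted.
    if not gate("results", results):
        return {"state": "results_rejected", "results": results, "principles": []}

    principles = [f"learned from {arm}: {outcome}" for arm, outcome in results.items()]
    return {"state": "done", "results": results, "principles": principles}


if __name__ == "__main__":
    approve_all: Gate = lambda name, payload: True
    print(run_iteration("Does priority scheduling reduce latency?", approve_all))
```

Passing the gate in as a callable makes it easy to test both approval and rejection paths without a human in the loop.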
Phase 2: Agent Prompts + LLM Dispatch
Issue: #8 | Status: Not started | Depends on: Phase 1
Replace stubs with real LLM agents.
What you can do after this phase: Run a real single-iteration experiment on BLIS via the Python API.
Phase 3: Plugin UX — Init, Investigate, Status
Issue: #19 | Status: Not started | Depends on: Phase 2
Make it easy to use. Three Claude Code plugin skills.
What you can do after this phase: Install Nous as a plugin, run /nous:init on any repo, and start investigating.
Phase 4: Multi-Iteration Campaigns + Observability
Issue: #9 | Status: Not started | Depends on: Phases 1–3
Scale from single iterations to sustained campaigns.
What you can do after this phase: Run 3–5 iteration campaigns on BLIS. Compare discovered principles to the known BLIS principle catalog.
Phase 5: Outer Loop + Co-Evolution
Issue: #10 | Status: Not started | Depends on: Phases 1–4
Real-world validation, VoI-governed experiment selection, and self-improvement.
Summary: Path to "Easy to Use"