From 795edd991998650aaf619c488a64fd3472e4ec56 Mon Sep 17 00:00:00 2001
From: bntvllnt <32437578+bntvllnt@users.noreply.github.com>
Date: Wed, 10 Jun 2026 17:53:36 +0200
Subject: [PATCH 1/2] chore: gitignore analysis artifacts, local agent state,
 and init outputs

---
 .gitignore | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/.gitignore b/.gitignore
index e36a717..a51c03d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -10,3 +10,17 @@ next-env.d.ts
 .vllnt/
 test-results/
 scripts/
+
+# analysis + coverage artifacts
+.code-visualizer/
+coverage/
+
+# local agent state
+.claude/
+
+# self-dogfood `init` outputs stay local (PR #35 closed unmerged)
+.cursor/
+.pi/
+GEMINI.md
+CONVENTIONS.md
+.github/copilot-instructions.md

From 761fee8e324f14e909db27872da5523f0d024f05 Mon Sep 17 00:00:00 2001
From: bntvllnt <32437578+bntvllnt@users.noreply.github.com>
Date: Wed, 10 Jun 2026 17:53:36 +0200
Subject: [PATCH 2/2] chore(specs): archive shipped specs to specs/shipped, add
 backlog specs

- mcp-only (shipped 2026-03-03, PR #3), cli-mode (shipped 2026-03-12, PR #16),
  config-rules-engine (shipped 2026-06-10, PR #42) moved to specs/shipped
  with status frontmatter + history.log entries
- add three 2026-04-29 backlog specs (module-depth, find-seams, forces signals)
---
 .../2026-04-29-add-analyze-module-depth.md    |  45 +++
 .../2026-04-29-add-find-seams-and-adapters.md |  50 +++
 ...end-analyze-forces-architecture-signals.md |  40 ++
 specs/history.log                             |   3 +
 .../2026-03-03-mcp-only.md                    |   3 +-
 specs/shipped/2026-03-11-cli-mode.md          | 376 ++++++++++++++++++
 .../2026-06-02-config-rules-engine.md         |   2 +-
 7 files changed, 517 insertions(+), 2 deletions(-)
 create mode 100644 specs/backlog/2026-04-29-add-analyze-module-depth.md
 create mode 100644 specs/backlog/2026-04-29-add-find-seams-and-adapters.md
 create mode 100644 specs/backlog/2026-04-29-extend-analyze-forces-architecture-signals.md
 rename specs/{active => shipped}/2026-03-03-mcp-only.md (99%)
 create mode 100644 specs/shipped/2026-03-11-cli-mode.md
 rename specs/{backlog => shipped}/2026-06-02-config-rules-engine.md (98%)

diff --git a/specs/backlog/2026-04-29-add-analyze-module-depth.md b/specs/backlog/2026-04-29-add-analyze-module-depth.md
new file mode 100644
index 0000000..56d8a85
--- /dev/null
+++ b/specs/backlog/2026-04-29-add-analyze-module-depth.md
@@ -0,0 +1,45 @@
+---
+title: Add analyze_module_depth tool with CLI parity
+status: backlog
+created: 2026-04-29
+estimate: 1d
+tier: standard
+---
+
+# Add analyze_module_depth tool with CLI parity
+
+## Goal
+Add a focused tool for ranking modules by architectural depth.
+
+## Scope
+Add both interfaces:
+- MCP tool: `analyze_module_depth`
+- CLI command: `module-depth`
+
+## Output
+For each module return:
+- `path`
+- `interfaceSize`
+- `implementationSize`
+- `depthScore`
+- `leverageScore`
+- `localityScore`
+- `verdict`: `DEEP | SHALLOW | MIXED`
+- short `evidence[]`
+
+## Heuristics
+- **Interface size**: exported symbols, public entry points, config surface
+- **Implementation size**: internal files/symbols/LOC hidden behind the public surface
+- **Depth score**: hidden behavior divided by interface cost
+- **Leverage score**: useful behavior reused by many callers
+- **Locality score**: change and knowledge concentrated in one place
+
+## Acceptance Criteria
+- MCP tool returns ranked modules with stable JSON shape
+- CLI command supports table output and `--json`
+- CLI command supports `--limit`; MCP tool supports the equivalent `limit` parameter
+- docs updated in `docs/mcp-tools.md` and `docs/cli-reference.md`
+- tests cover one deep-module example and one shallow-module example
+
+## Notes
+This should be a narrow, high-signal tool. Avoid mixing in seam detection or force-analysis summary.
diff --git a/specs/backlog/2026-04-29-add-find-seams-and-adapters.md b/specs/backlog/2026-04-29-add-find-seams-and-adapters.md
new file mode 100644
index 0000000..623d4d3
--- /dev/null
+++ b/specs/backlog/2026-04-29-add-find-seams-and-adapters.md
@@ -0,0 +1,50 @@
+---
+title: Add find_seams tool and adapter detection with CLI parity
+status: backlog
+created: 2026-04-29
+estimate: 1d
+tier: standard
+---
+
+# Add find_seams tool and adapter detection with CLI parity
+
+## Goal
+Help agents find where behavior can vary and which concrete adapters exist.
+
+## Scope
+Add both interfaces:
+- MCP tool: `find_seams`
+- CLI command: `seams`
+
+## Output
+Return:
+- `seams[]`
+  - `name`
+  - `contract`
+  - `callers`
+  - `adapters[]`
+  - `status`: `HYPOTHETICAL | REAL | LEAKY`
+  - `evidence[]`
+- `adapterPatterns[]`
+- `summary`
+
+## Rules
+- **Hypothetical seam**: one adapter behind a contract
+- **Real seam**: two or more adapters behind a contract
+- **Leaky seam**: callers know adapter-specific details they should not need to know
+
+## Detection Signals
+- shared interface/type used by multiple implementations
+- sibling files with same role but different backing dependencies
+- injected dependencies or provider-style construction
+- duplicated exported API shape across files
+
+## Acceptance Criteria
+- MCP tool returns seam/adapter analysis in JSON
+- CLI command prints concise seam summaries and supports `--json`
+- every seam includes evidence for why it was detected
+- docs updated in `docs/mcp-tools.md` and `docs/cli-reference.md`
+- tests cover one hypothetical seam and one real seam case
+
+## Notes
+Prefer confidence + evidence over aggressive detection. False positives will reduce trust quickly.
diff --git a/specs/backlog/2026-04-29-extend-analyze-forces-architecture-signals.md b/specs/backlog/2026-04-29-extend-analyze-forces-architecture-signals.md
new file mode 100644
index 0000000..700976b
--- /dev/null
+++ b/specs/backlog/2026-04-29-extend-analyze-forces-architecture-signals.md
@@ -0,0 +1,40 @@
+---
+title: Extend analyze_forces with architecture signals
+status: backlog
+created: 2026-04-29
+estimate: 1d
+tier: standard
+---
+
+# Extend analyze_forces with architecture signals
+
+## Goal
+Make `analyze_forces` more useful for architecture review by adding simple derived signals instead of only cohesion/tension/extraction output.
+
+## Scope
+Update both interfaces:
+- MCP: extend existing `analyze_forces`
+- CLI: extend existing `forces` command
+
+## Add
+- `shallowModules[]`
+- `deepModules[]`
+- `seamCandidates[]`
+- `localityRisks[]`
+
+## Heuristics
+- **Shallow module**: interface surface is large relative to hidden behavior
+- **Deep module**: small public surface, large hidden behavior, reused by many callers
+- **Seam candidate**: clear interface point where behavior could vary
+- **Locality risk**: understanding or changing one concept requires bouncing across many files
+
+## Acceptance Criteria
+- `analyze_forces` returns the new fields in MCP JSON
+- `forces --json` returns the same fields
+- human CLI output prints short sections for shallow/deep/seam/locality
+- existing force analysis output remains intact
+- docs updated in `docs/mcp-tools.md` and `docs/cli-reference.md`
+- tests cover at least one shallow-module and one locality-risk case
+
+## Notes
+Keep heuristics simple and explainable. Every flagged item should include evidence, not just a score.
diff --git a/specs/history.log b/specs/history.log
index 38699ac..bce9c99 100644
--- a/specs/history.log
+++ b/specs/history.log
@@ -1,7 +1,10 @@
 2026-02-18 | shipped | 3d-code-mapper-v1 | 10h→2h | 1d | 3D codebase visualizer with 6 views, 6 MCP tools, 75 tests
 2026-03-02 | shipped | mcp-parity-readme-sync | 3h→2h | 1d | 100% MCP-REST parity: +2 tools, enhanced 3 tools, 15 tool descriptions, README sync, 21 new tests
+2026-03-03 | shipped | mcp-only | 3h→- | - | Dropped web UI & REST API — MCP stdio only. PR #3. (archived retroactively)
 2026-03-11 | shipped | fix-dead-export-false-positives | 2h→1.5h | 1d | Fix 33% false positive rate: merge duplicate imports, include same-file calls, call graph consumption. 8 regression tests.
 2026-03-11 | shipped | fix-error-handling | 1h→0.5h | 1d | Consistent impact_analysis error handling, LOC off-by-one fix, empty file guard. 17 regression tests.
 2026-03-11 | shipped | fix-metrics-test-files | 2h→1.5h | 1d | Exclude test files from coverage/coupling metrics, isTestFile detection, coupling formula fix. 19 regression tests.
 2026-03-11 | shipped | feat-metric-quality | 3h→2h | 1d | LEAF verdict for single-file modules, tension suppression for type hubs/entry points, file_context path normalization. 20 regression tests.
+2026-03-12 | shipped | cli-mode | 8h→- | - | CLI parity with MCP: 15 commands, shared core extraction, AGENTS.md + llms.txt + cli-reference docs. PR #16. (archived retroactively)
 2026-05-30 | shipped | feat-agent-adoption-init | 3h→1h | 1d | `init` command: idempotent managed-block instructions for 6 agents + portable skill + registry SKILL.md. 18 tests, docs + CHANGELOG. PR #34.
+2026-06-10 | shipped | config-rules-engine | -→- | - | Declarative config + ESLint-style rules engine + `check` command CI gate. PR #42. (archived retroactively)
diff --git a/specs/active/2026-03-03-mcp-only.md b/specs/shipped/2026-03-03-mcp-only.md
similarity index 99%
rename from specs/active/2026-03-03-mcp-only.md
rename to specs/shipped/2026-03-03-mcp-only.md
index b71f55c..95e55b6 100644
--- a/specs/active/2026-03-03-mcp-only.md
+++ b/specs/shipped/2026-03-03-mcp-only.md
@@ -1,6 +1,7 @@
 ---
 title: Drop Web UI & REST API — MCP Only
-status: active
+status: shipped
+shipped: 2026-03-03
 created: 2026-03-03
 estimate: 3h
 tier: standard
diff --git a/specs/shipped/2026-03-11-cli-mode.md b/specs/shipped/2026-03-11-cli-mode.md
new file mode 100644
index 0000000..d1e0345
--- /dev/null
+++ b/specs/shipped/2026-03-11-cli-mode.md
@@ -0,0 +1,376 @@
+---
+title: CLI Output Mode + AI-Friendly Docs
+status: shipped
+shipped: 2026-03-12
+created: 2026-03-11
+estimate: 8h
+tier: standard
+---
+
+# CLI Output Mode + AI-Friendly Docs
+
+## Context
+
+Tool only exposes results via MCP stdio — usable by LLMs but invisible to humans and CI. Industry trend confirms CLI over MCP for local dev tools: Nx deleted most MCP tools, Google Workspace CLI removed 1,151 lines of MCP code, benchmarks show 28% higher task completion + 33% token efficiency for CLI vs MCP (Mario Zechner, Aug 2025). A direct CLI output mode serves humans + CI. AI agents are already well-served by MCP — they need better docs discovery, not CLI wrappers.
+
+**Three audiences, three interfaces:**
+
+```
+Humans (terminal)  → CLI formatted output (this spec)
+CI/Scripts         → CLI --json + exit codes (this spec)
+AI Agents (LLMs)   → MCP (already works) + llms.txt/AGENTS.md (this spec)
+```
+
+## Codebase Impact (MANDATORY)
+
+| Area | Impact | Detail |
+|------|--------|--------|
+| `src/cli.ts` | MODIFY | Add subcommand-first routing (`ci overview <path>`). Bare `<path>` = MCP mode (backward compat). Progress messages to stderr. |
+| `src/core/index.ts` | CREATE | Extract shared result-building logic from `mcp/index.ts` — `normalizeFilePath`, `resolveFilePath`, `suggestSimilarPaths`, `getSearchIndex`, and result computation functions. Both MCP + CLI import from here. |
+| `src/cli/formatters.ts` | CREATE | Human-readable terminal formatters — tables, color (picocolors), ASCII output |
+| `src/cli/commands.ts` | CREATE | 5 core command handlers: overview, hotspots, file, search, changes |
+| `src/mcp/index.ts` | MODIFY | Extract shared helpers to `src/core/index.ts`. MCP handlers import from core. No behavior change. |
+| `src/types/index.ts` | MODIFY | Add `CliResult` interface for structured command results |
+| `AGENTS.md` | CREATE | Cross-tool AI agent discovery file (Google/OpenAI/Cursor/Factory standard). Points to docs, describes CLI + MCP interfaces |
+| `llms.txt` | CREATE | AI-consumable doc index following llmstxt.org spec |
+| `llms-full.txt` | CREATE | Full documentation concatenated for LLM context injection |
+| `docs/cli-reference.md` | CREATE | CLI command reference with examples |
+| `package.json` | MODIFY | Add `picocolors` dep. Add `"docs"`, `"llms.txt"`, `"llms-full.txt"`, `"AGENTS.md"` to `files`. Add `"llms"` + `"llmsFull"` fields. |
+| `tests/cli-commands.test.ts` | CREATE | Integration tests for CLI commands (real pipeline) |
+
+**Files:** 8 create | 4 modify | 0 affected (MCP behavior unchanged but file modified for extraction)
+**Reuse:** MCP handler computation logic extracted to `src/core/` — both MCP and CLI consume it. No duplication.
+**Breaking changes:** None. `codebase-intelligence <path>` still starts MCP. New subcommand syntax is additive.
+**New dependencies:** `picocolors` (terminal colors) — 3x smaller than chalk, zero-dep, ESM, auto-strips in non-TTY.
+
+## User Journey (MANDATORY)
+
+### Primary Journey
+
+ACTOR: Developer analyzing a codebase from terminal
+GOAL: Get instant architectural insights without LLM round-trip
+PRECONDITION: `codebase-intelligence` installed, TypeScript codebase exists, index cached (auto or via `--index`)
+
+1. User runs `codebase-intelligence overview ./myproject`
+   → System loads cached index (or parses with stderr progress), prints formatted overview
+   → User sees file count, function count, dependency count, top modules, hotspots, circular deps
+
+2. User runs `codebase-intelligence hotspots ./myproject --metric coupling --limit 5`
+   → System prints top 5 files ranked by coupling score
+   → User sees table: file, score, fan-in, fan-out, reason
+
+3. User runs `codebase-intelligence file ./myproject src/auth/login.ts`
+   → System prints file context: LOC, exports, imports, dependents, metrics
+   → User sees full file profile with blast radius
+
+4. User runs `codebase-intelligence search ./myproject "auth"`
+   → System runs BM25 search, prints matching files + symbols
+   → User sees ranked results with relevance scores
+
+5. User runs `codebase-intelligence changes ./myproject`
+   → System detects git changes, prints affected files + risk metrics
+   → User sees what changed, what's affected, risk level
+
+POSTCONDITION: User has architectural insights without MCP/LLM, can script/pipe output
+
+### Secondary Journey: JSON output for scripting/CI
+
+ACTOR: CI pipeline or script consumer
+GOAL: Machine-readable structured output
+
+1. User runs `codebase-intelligence overview ./myproject --json`
+   → System prints stable JSON schema to stdout (progress to stderr)
+   → User pipes to `jq`, scripts, or dashboards
+
+2. User chains commands in CI: `ci overview ./src --json && ci hotspots ./src --json`
+   → System auto-caches index on first invocation, second command uses cache
+   → Both complete without re-parsing
+
+POSTCONDITION: Structured JSON available for automation, no re-parse overhead
+
+### Secondary Journey: AI Agent Discovery
+
+ACTOR: AI coding agent (Claude Code, Cursor, Copilot)
+GOAL: Discover tool capabilities and docs without web access
+
+1. Agent reads `AGENTS.md` at project root (auto-loaded by Cursor, Copilot, Codex)
+   → Agent learns available MCP tools + CLI commands + workflow patterns
+2. Agent reads `llms.txt` / `llms-full.txt` for full documentation
+   → Agent has complete reference without web fetching
+3. Agent uses MCP tools directly (preferred) or falls back to CLI `--json` if MCP not configured
+
+POSTCONDITION: AI agent has full tool knowledge from local files
+
+### Error Journeys
+
+E1. Invalid subcommand
+   Trigger: User types `codebase-intelligence foobar ./src`
+   1. User runs invalid subcommand
+      → System prints to stderr: "Unknown command: foobar\n\nAvailable commands:\n  overview   High-level codebase snapshot\n  hotspots   Rank files by metric\n  file       Detailed file context\n  search     Keyword search\n  changes    Git diff analysis\n\nRun codebase-intelligence <command> --help for details."
+      → Exit code 2
+   Recovery: User picks correct command
+
+E2. File not found
+   Trigger: User runs `codebase-intelligence file ./src nonexistent.ts`
+   1. User requests file context for non-existent file
+      → System prints to stderr: error + top 3 similar paths (reuse `suggestSimilarPaths` from core)
+      → Exit code 1
+   Recovery: User retries with correct path
+
+E3. No codebase at path
+   Trigger: User points at empty or non-TS directory
+   1. User runs `codebase-intelligence overview ./empty-dir`
+      → System prints to stderr: "No TypeScript files found at ./empty-dir"
+      → Exit code 1
+   Recovery: User provides correct path
+
+### Edge Cases
+
+EC1. Large codebase parse time: Progress to stderr ("Parsing... 1247 files found") — MANDATORY, not optional
+EC2. No git repo: Churn/changes degrade gracefully (churn=0, changes="not a git repo")
+EC3. Auto-caching: CLI mode auto-writes the index to `.code-visualizer/` on first run and reads it on subsequent runs if HEAD is unchanged. No explicit `--index` needed.
+EC4. Piped output: picocolors auto-strips ANSI when !isTTY. Explicit `--no-color` flag also available. `NO_COLOR` env var respected.
+EC5. Subcommand without path: `codebase-intelligence overview` (no path) → helpful error: "Missing path. Usage: codebase-intelligence overview <path>"
+
+## Acceptance Criteria (MANDATORY)
+
+### Must Have (BLOCKING — all must pass to ship)
+
+- [ ] AC-1: GIVEN a TS codebase WHEN user runs `codebase-intelligence overview <path>` THEN stdout shows file count, function count, dep count, modules with LOC, top 5 hotspots, circular dep count. Progress on stderr.
+- [ ] AC-2: GIVEN a TS codebase WHEN user runs `codebase-intelligence hotspots <path>` THEN stdout shows ranked table of files by metric (default: coupling) with scores and reasons
+- [ ] AC-3: GIVEN a TS codebase WHEN user runs `codebase-intelligence file <path> <file>` THEN stdout shows file LOC, exports, imports, dependents, all FileMetrics
+- [ ] AC-4: GIVEN a TS codebase WHEN user runs `codebase-intelligence search <path> <query>` THEN stdout shows BM25 results with file, score, matching symbols
+- [ ] AC-5: GIVEN a git-tracked codebase WHEN user runs `codebase-intelligence changes <path>` THEN stdout shows changed files, affected files, risk metrics
+- [ ] AC-6: GIVEN any CLI command WHEN `--json` flag present THEN stdout is valid JSON with stable CLI schema. No progress messages on stdout.
+- [ ] AC-7: GIVEN no subcommand WHEN user runs `codebase-intelligence <path>` (path exists) THEN MCP stdio mode starts (backward compatible)
+- [ ] AC-8: GIVEN `codebase-intelligence` WHEN user runs with no args THEN help text shows available subcommands with one-line descriptions + "Try: codebase-intelligence overview ./src"
+- [ ] AC-9: GIVEN any CLI command WHEN it completes THEN exit code is 0 (success), 1 (runtime error), or 2 (bad args/usage)
+- [ ] AC-10: GIVEN a codebase already indexed WHEN user runs any CLI command THEN cached index is used (no re-parse). Progress shows "Using cached index (HEAD: abc1234)"
+- [ ] AC-11: GIVEN all progress/error messages WHEN output is generated THEN progress goes to stderr, results go to stdout
+
+### Error Criteria (BLOCKING — all must pass)
+
+- [ ] AC-E1: GIVEN invalid subcommand WHEN user runs it THEN stderr shows error + available commands, exit code 2
+- [ ] AC-E2: GIVEN non-existent file path WHEN user runs `file <path> <file>` THEN stderr shows error + 3 similar path suggestions, exit code 1
+- [ ] AC-E3: GIVEN non-TS directory WHEN user runs `overview <path>` THEN stderr shows "No TypeScript files found", exit code 1
+
+### Should Have (ship without, fix soon)
+
+- [ ] AC-12: GIVEN remaining commands (modules, forces, dead-exports, groups, symbol, dependents, impact, rename, processes, clusters) WHEN implemented in v2 THEN each prints formatted output following same patterns
+- [ ] AC-13: GIVEN `AGENTS.md` + `llms.txt` + `llms-full.txt` WHEN published in npm package THEN AI agents discover docs locally
+- [ ] AC-14: GIVEN `docs/cli-reference.md` WHEN a developer reads it THEN every v1 command, flag, and output format documented with examples
+- [ ] AC-15: GIVEN `hotspots` command WHEN run without `--metric` THEN defaults to `coupling` metric with note: "Showing coupling (default). Use --metric to change."
+
+## Scope
+
+### v1 — Ship (this spec)
+
+- [ ] 1. Extract shared logic from `mcp/index.ts` to `src/core/index.ts` → AC-7 (MCP still works)
+- [ ] 2. Add subcommand-first routing to `src/cli.ts` — 5 commands + MCP fallback → AC-7, AC-8, AC-9, AC-E1
+- [ ] 3. Implement `--json` flag with stable CLI JSON schema + stderr/stdout separation → AC-6, AC-11
+- [ ] 4. Implement auto-caching for CLI mode (write `.code-visualizer/` on first run) → AC-10
+- [ ] 5. Create `src/cli/formatters.ts` — table/color formatters using picocolors → AC-1 through AC-5
+- [ ] 6. Implement 5 core commands: overview, hotspots, file, search, changes → AC-1 through AC-5, AC-15
+- [ ] 7. Implement error handling: invalid command, file not found, no TS files, missing path → AC-E1, AC-E2, AC-E3
+- [ ] 8. Create `AGENTS.md` + `llms.txt` + `llms-full.txt` + update package.json fields → AC-13
+- [ ] 9. Create `docs/cli-reference.md` → AC-14
+- [ ] 10. Create integration tests (TDD — tests first) → AC-1 through AC-E3
+
+### v2 — Later (separate spec)
+
+- [ ] Remaining 10 commands (modules, forces, dead-exports, groups, symbol, dependents, impact, rename, processes, clusters)
+- [ ] `--describe` subcommand for runtime schema introspection
+- [ ] Named workflow patterns in `codebase://setup` MCP resource
+- [ ] `--fields` flag for field masks (protect AI agent context windows)
+
+### Out of Scope
+
+- Interactive/TUI mode (curses, blessed, ink)
+- Watch mode / file system watcher
+- Config file (`.codebase-intelligence.json`)
+- Custom output format templates
+- Streaming output during analysis (beyond stderr progress)
+- Claude Code skill file (replaced by AGENTS.md — cross-tool, no install friction)
+
+## Quality Checklist
+
+### Blocking (must pass to ship)
+
+- [ ] All Must Have ACs passing
+- [ ] All Error Criteria ACs passing
+- [ ] All v1 scope items implemented
+- [ ] No regressions in existing tests (~186 tests)
+- [ ] MCP mode backward compatible — `codebase-intelligence <path>` still starts MCP
+- [ ] Progress/errors to stderr, results to stdout — never mixed
+- [ ] `--json` output valid JSON for all 5 commands (snapshot tests)
+- [ ] Exit codes: 0 success, 1 runtime error, 2 bad args
+- [ ] Colors stripped when stdout not TTY + `--no-color` flag + `NO_COLOR` env var
+- [ ] Auto-caching works: first CLI invocation writes index, subsequent reads it
+- [ ] No hardcoded secrets or credentials
+- [ ] Error messages are actionable (not just "Error occurred")
+
+### Advisory (should pass, not blocking)
+
+- [ ] Code follows existing project patterns (ESM, `.js` extensions, strict types)
+- [ ] Docs accurate and complete
+- [ ] AGENTS.md follows agents.md spec
+- [ ] llms.txt follows llmstxt.org spec
+- [ ] `hotspots` defaults to `coupling` when no `--metric` given
+
+## Test Strategy (MANDATORY)
+
+### Test Environment
+
+| Component | Status | Detail |
+|-----------|--------|--------|
+| Test runner | detected | vitest (vitest.config.ts) |
+| E2E framework | not configured | No Playwright/Cypress for CLI |
+| Test DB | N/A | No database |
+| Mock inventory | 0 existing mocks | All tests use real pipeline |
+
+### AC → Test Mapping
+
+| AC | Test Type | Test Intention |
+|----|-----------|----------------|
+| AC-1 | Integration | Real fixture codebase → `overview` → verify output contains file count, modules, hotspots |
+| AC-2 | Integration | Real fixture → `hotspots` → verify ranked table format, default metric=coupling |
+| AC-3 | Integration | Real fixture → `file <known-path>` → verify all FileMetrics present |
+| AC-4 | Integration | Real fixture → `search <term>` → verify BM25 results returned |
+| AC-5 | Integration | Real git fixture → `changes` → verify changed files listed |
+| AC-6 | Integration | Each command + `--json` → verify valid JSON + snapshot schema |
+| AC-7 | Integration | Bare `<path>` → verify startMcpServer called (spy/mock boundary) |
+| AC-9 | Integration | Various invalid inputs → verify correct exit codes (0, 1, 2) |
+| AC-10 | Integration | Run command twice → verify second uses cache (no "Parsing..." on stderr) |
+| AC-11 | Integration | Capture stdout + stderr separately → verify no mixing |
+| AC-E1 | Integration | Invalid subcommand → verify stderr + exit code 2 |
+| AC-E2 | Integration | Bad file path → verify error + 3 suggestions on stderr |
+| AC-E3 | Integration | Empty dir → verify "No TypeScript files" on stderr + exit code 1 |
+
+### Failure Mode Tests (MANDATORY)
+
+| Source | ID | Test Intention | Priority |
+|--------|----|----------------|----------|
+| Error Journey | E1 | Invalid subcommand shows available commands, exit 2 | BLOCKING |
+| Error Journey | E2 | Bad file path shows suggestions from real graph | BLOCKING |
+| Error Journey | E3 | Empty dir exits cleanly | BLOCKING |
+| Edge Case | EC2 | Non-git dir → changes returns graceful message | Advisory |
+| Edge Case | EC4 | Piped output has no ANSI escape codes | Advisory |
+| Edge Case | EC5 | Subcommand without path gives helpful error | BLOCKING |
+| Failure Hypothesis | FH-1 | MCP mode still works after routing refactor | BLOCKING |
+| Failure Hypothesis | FH-2 | MCP tool responses unchanged after core extraction (snapshot) | BLOCKING |
+| Failure Hypothesis | FH-3 | Auto-cache: first run writes, second reads without re-parse | BLOCKING |
+| Failure Hypothesis | FH-4 | `--json` output is valid JSON for all 5 commands (snapshot) | BLOCKING |
+
+### Mock Boundary
+
+| Dependency | Strategy | Justification |
+|------------|----------|---------------|
+| TypeScript Compiler API | Real | Used as library, fixture files on disk |
+| git | Real | Tests run in real git repo |
+| File system | Real | Fixture directories with real .ts files |
+| picocolors | Real | No reason to mock terminal colors |
+| startMcpServer | Spy only | Verify it's called for backward compat test (AC-7). Don't mock internals |
+
+### TDD Commitment
+
+Tests written BEFORE implementation (RED → GREEN → REFACTOR).
+Scope item 10 (tests) runs first — test stubs for all ACs created before any command implementation.
+
+## Risks
+
+| Risk | Impact | Likelihood | Mitigation |
+|------|--------|------------|------------|
+| Subcommand-first routing ambiguity: `ci overview` with no path | MEDIUM | HIGH | If arg count < 2, show usage error: "Missing path." Commander `.argument('<path>')` on each subcommand enforces this |
+| Auto-caching writes to user's project dir implicitly | MEDIUM | MEDIUM | `.code-visualizer/` already used by `--index`. Add to `.gitignore` suggestion in help output. Users expect this (similar to `.next/`, `.turbo/`) |
+| Core extraction breaks MCP tool responses | HIGH | LOW | Snapshot tests: capture MCP tool responses before extraction, assert identical after. Run existing 186 tests. |
+| `picocolors` missing feature needed later | LOW | LOW | picocolors covers bold, dim, red, green, yellow, cyan, which is sufficient for tables. If more is needed, swap to chalk (same call style for basic color functions) |
+| Parse time on first CLI invocation still slow | MEDIUM | HIGH | Progress on stderr ("Parsing 1247 files..."). Auto-cache means only first run is slow. Subsequent runs instant. |
+
+**Kill criteria:** If subcommand-first routing can't cleanly distinguish `ci <subcommand> <path>` from `ci <path>` (MCP fallback) → use `--mode cli` flag instead of subcommands.
+
+## State Machine
+
+**Status**: N/A — Stateless feature
+
+**Rationale**: CLI commands are pure request→response. No persistent state transitions. Input → (cache check → parse) → compute → print → exit.
+
+## Analysis
+
+### Assumptions Challenged (Post-Review)
+
+| # | Assumption | Verdict | Resolution |
+|---|------------|---------|------------|
+| 1 | Path-first syntax (`ci ./src overview`) works with Commander | **WRONG** — Commander routes path to default action, subcommand lost | Fixed: subcommand-first (`ci overview ./src`) |
+| 2 | "No changes to mcp/index.ts" | **SELF-CONTRADICTORY** — 4 helpers must be extracted | Fixed: explicit `src/core/index.ts` extraction in scope |
+| 3 | `--json` should match MCP response shape exactly | **RISKY** — couples CLI API to MCP internals | Fixed: stable CLI JSON schema, independent of MCP shape. Snapshot tests prevent drift |
+| 4 | chalk auto-strips colors reliably | **PARTIALLY WRONG** — `NO_COLOR`, `tee` edge cases | Fixed: switched to picocolors + explicit `--no-color` flag + `NO_COLOR` env var |
+| 5 | AI agents benefit from CLI + skill file | **WRONG** for agents | Fixed: CLI for humans/CI, AGENTS.md + llms.txt for AI agents, MCP unchanged |
+| 6 | 6h estimate for 15 commands + skill + docs + tests | **UNDERESTIMATED** | Fixed: 5 core commands + 8h estimate |
+| 7 | Skill file at `skill/` works for Claude Code | **BROKEN** — no install mechanism | Fixed: replaced with AGENTS.md (cross-tool, auto-loaded) |
+| 8 | Each CLI invocation can cold-parse | **WRONG** for CI | Fixed: auto-caching mandatory for CLI mode |
+
+### Blind Spots (Post-Review)
+
+1. **[Resolved: stdout/stderr]** Progress to stderr unconditionally. `console.log` in cli.ts replaced with `console.error` for progress, `process.stdout.write` for results.
+2. **[Resolved: exit codes]** 0=success, 1=runtime error, 2=bad args/usage.
+3. **[Resolved: parse latency]** Auto-caching eliminates re-parse for subsequent commands.
+4. **[Resolved: scope]** Reduced to 5 core commands. Remaining 10 deferred to v2.
+5. **[Resolved: AI discovery]** AGENTS.md + llms.txt replace skill file.
+
+### Failure Hypotheses (Post-Review)
+
+| # | IF | THEN | BECAUSE | Severity | Mitigation |
+|---|-----|------|---------|----------|------------|
+| FH-1 | Subcommand routing added | MCP fallback breaks | Commander treats bare `<path>` as unknown subcommand | HIGH | Mitigated: subcommand-first + `.action()` catch-all for bare path → MCP |
+| FH-2 | Core extraction from MCP | MCP responses change subtly | Rounding, clamping, normalization done inline in handlers | HIGH | Mitigated: snapshot tests on all 15 MCP tools before/after extraction |
+| FH-3 | Auto-caching writes index implicitly | User surprised by `.code-visualizer/` appearing | No explicit opt-in | MEDIUM | Mitigated: first run prints "Index saved to .code-visualizer/ (add to .gitignore)" |
+| FH-4 | `--json` + piped output | Malformed JSON if progress leaks to stdout | Missed stderr redirect | HIGH | Mitigated: AC-11 + explicit test that stdout-only contains valid JSON |
+
+### The Real Question
+
+**Confirmed after 4-perspective review + industry research:** CLI is the right move for humans + CI. MCP is the right interface for AI agents. The spec now correctly serves each audience through their optimal interface:
+- CLI formatted output → humans
+- CLI `--json` + exit codes → CI/scripts
+- MCP tools (unchanged) + AGENTS.md + llms.txt → AI agents
+
+### Open Items
+
+- [resolved] Commander routing → subcommand-first architecture
+- [resolved] `chalk` vs `picocolors` → picocolors (3x smaller)
+- [resolved] Skill file → replaced with AGENTS.md + llms.txt
+- [resolved] Cold parse per invocation → auto-caching
+- [resolved] stdout/stderr separation → progress to stderr
+- [explore] `--describe` subcommand for runtime schema introspection → v2
+- [explore] Named workflow patterns in `codebase://setup` MCP resource → v2
+- [explore] `--fields` flag for field masks → v2
+
+## Notes
+
+**Spec review applied: 2026-03-11**
+- 4 perspectives: Developer Advocate, Systems Engineer, AI Agent Consumer, Skeptic
+- 2 research threads: MCP→CLI industry trends, embedded docs for AI agents
+- 8 assumptions challenged, 5 blind spots resolved, 4 failure hypotheses mitigated
+- Scope reduced from 15 to 5 commands, estimate increased from 6h to 8h
+- AI skill replaced with industry-standard AGENTS.md + llms.txt
+
+## Progress
+
+| # | Scope Item | Status | Iteration |
+|---|-----------|--------|-----------|
+| 1 | Extract core from mcp/index.ts | done | 1 |
+| 2 | Subcommand routing in cli.ts | done | 1 |
+| 3 | --json + stderr/stdout separation | done | 1 |
+| 4 | Auto-caching for CLI mode | done | 1 |
+| 5 | Formatters (inline in cli.ts) | done | 1 |
+| 6 | 5 core commands | done | 1 |
+| 7 | Error handling | done | 1 |
+| 8 | AGENTS.md + llms.txt + llms-full.txt | done | 1 |
+| 9 | docs/cli-reference.md | done | 1 |
+| 10 | Integration tests | done | 1 |
+
+## Timeline
+
+| Action | Timestamp | Duration | Notes |
+|--------|-----------|----------|-------|
+| plan | 2026-03-11T00:00:00Z | - | Created |
+| spec-review | 2026-03-11T00:00:00Z | - | 4 perspectives + 2 research threads. 8 assumptions fixed, scope reduced 15→5 commands |
diff --git a/specs/backlog/2026-06-02-config-rules-engine.md b/specs/shipped/2026-06-02-config-rules-engine.md
similarity index 98%
rename from specs/backlog/2026-06-02-config-rules-engine.md
rename to specs/shipped/2026-06-02-config-rules-engine.md
index 3b0de86..ddbef56 100644
--- a/specs/backlog/2026-06-02-config-rules-engine.md
+++ b/specs/shipped/2026-06-02-config-rules-engine.md
@@ -1,6 +1,6 @@
 # Spec: Config, Rules Engine & CI Gate
 
-Status: backlog · Created: 2026-06-02 · Depends on: `schema.json`, `codebase-intelligence.json`
+Status: shipped (2026-06-10, PR #42) · Created: 2026-06-02 · Depends on: `schema.json`, `codebase-intelligence.json`
 
 Adds a declarative config, an ESLint-style rules engine, and a CI gate so the tool can fail builds on policy violations (including a `no-comments` rule).