Commit 8a82839

sohmn and claude committed
Merge origin/main (v1.52.1.0 brain-aware planning) into feat/fanout-skill
Third post-ship merge. PR garrytan#1742 (brain-aware planning) landed at v1.52.1.0. Our v1.53.0.0 claim still clean per queue check.

Conflicts resolved: VERSION (kept 1.53.0.0), package.json (synced), CHANGELOG.md (our 1.53.0.0 entry above main's new 1.52.1.0).

Regenerated fanout/SKILL.md against merged preamble state. Tests: 6/6 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2 parents 3b49cc7 + 070722a commit 8a82839

43 files changed

Lines changed: 5366 additions & 54 deletions


CHANGELOG.md

Lines changed: 78 additions & 0 deletions
@@ -33,6 +33,84 @@ If your team or your single instance of Claude Code is sitting on a finished des
3333
- Design doc at [`docs/designs/FANOUT.md`](docs/designs/FANOUT.md) documents the 4-layer slab detection heuristic, Slab 0 promotion logic, conflict resolution rules, and edge cases.
3434
- No new infrastructure: skill is auto-discovered by `setup` via the existing top-level-directory glob at [setup:620-633](setup).
3535

36+
## [1.52.1.0] - 2026-05-27
37+
38+
## **Brain-aware planning lands. Five planning skills read structured context from any personal gbrain before asking — same questions, smarter answers, no token tax.**
39+
40+
`/office-hours`, `/plan-ceo-review`, `/plan-eng-review`, `/plan-design-review`, and `/plan-devex-review` now preflight a typed entity model from your gbrain (Wintermute, local PGLite, or any thin-client MCP) before their first AskUserQuestion. Reviews stop asking "what's the product?" / "who's the target user?" / "what was your prior scope call?" — that context loads from cached digests of typed `gstack/product`, `gstack/goal`, `gstack/developer-persona`, `gstack/brand`, `gstack/competitive-intel`, `gstack/skill-run`, `gstack/user-profile`, and `gstack/take` pages. The brain becomes a structured model of your product and your judgment patterns, not just a search index.
41+
42+
The unlock: every planning skill filters its recommendations through "what does the user actually want right now, what is this product, what have we decided before." That's the qualitative shift codex outside-voice argued for — the brain telling reviews "this contradicts your January CEO plan" or "your developer persona digest says first-time CLI users; this plan adds 3 setup commands."
43+
44+
### The numbers that matter
45+
46+
Source: `bun test test/brain-cache-spec.test.ts test/skill-preflight-budget.test.ts` (verifies budgets statically) and `bin/gstack-brain-cache get product` smoke (verifies warm-hit latency).
47+
48+
| Surface | Before | After | Δ |
|---|---|---|---|
| Planning-skill cold-start tokens (preflight context) | 0 (asked everything) | 500–1500 tokens (warm hit) / 5–15 KB once-per-day (cold miss) | brain-as-model, not just search |
| MCP calls per skill invocation (warm hit) | n/a (no integration) | 0 (single disk read) | 95% path |
| MCP calls per skill invocation (cold miss) | n/a | 4–8 parallel calls, ~1–2s once | bounded |
| Autoplan (4 sequential skills) preflight cost | n/a | 1 cold-miss + 3 warm-hits via lockfile dedup | concurrent dedup saves 4× |
| New typed brain page kinds | 0 | 8 (`gstack-core@1.0.0` schema pack) | first-class entity model |
| Per-endpoint trust policies | 0 (sync mode global only) | 1 per `sha8(MCP URL)` namespace, hash collision → sha16 | shared-brain safe |
| New gate-tier tests | 0 | 10 files / 111 assertions | every correctness path covered |
57+
58+
The cache layer keeps the brain integration honest: 95% of invocations are a single disk read at ~10–30ms; cold-miss pays a one-time ~1–2s tax that's deduplicated across concurrent autoplan dispatches via a project-scoped lockfile. Salience is filtered by an allowlist (`projects/`, `concepts/`, `gstack/`) before write so personal pages — family, therapy, reflection — never leak into work-flow planning prompts. The trust-policy primitive makes personal-brain auto-push safe and shared-brain reads conservative by default.
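The allowlist step described above amounts to a prefix check applied before anything is written to the cache. The following is a minimal sketch under stated assumptions, not the shipped code: only the three prefixes come from this changelog, while `SaliencePage`, `filterByAllowlist`, and the sample slugs are hypothetical names chosen for illustration.

```typescript
// Prefixes quoted in the changelog. The real constant lives in
// scripts/brain-cache-spec.ts and may be shaped differently.
const SALIENCE_DEFAULT_ALLOWLIST: readonly string[] = ["projects/", "concepts/", "gstack/"];

// Hypothetical page shape, for illustration only.
interface SaliencePage {
  slug: string;
  title: string;
}

// Keep a page only if its slug starts with an allowlisted prefix.
function filterByAllowlist(
  pages: readonly SaliencePage[],
  allowlist: readonly string[] = SALIENCE_DEFAULT_ALLOWLIST,
): SaliencePage[] {
  return pages.filter((page) => allowlist.some((prefix) => page.slug.startsWith(prefix)));
}

const candidatePages: SaliencePage[] = [
  { slug: "gstack/product", title: "Product" },
  { slug: "projects/fanout", title: "Fanout skill" },
  { slug: "family/holiday-plans", title: "Holiday plans" },
  { slug: "therapy/notes", title: "Notes" },
];

// Only the first two pages survive; personal pages are dropped before any write.
const cacheablePages = filterByAllowlist(candidatePages);
console.log(cacheablePages.map((page) => page.slug));
```

Filtering at write time, rather than at read time, means a page that fails the check never reaches disk, so a later prompt cannot pick it up by accident.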
59+
60+
### What this means for you
61+
62+
If you use planning skills today: every invocation gets sharper without you doing anything different. The skills ask fewer redundant questions and surface "this contradicts your Jan plan" / "your Feb TTHW benchmark was 2:15 vs the 5:30 baseline" / "tendency to under-expand on infra plans" — the brain doing the bookkeeping that your memory shouldn't have to.
63+
64+
If you use a remote MCP brain (Wintermute or your own): `/setup-gbrain` Step 9.5 asks the trust-policy question once per endpoint. Personal endpoint → `~/.gstack/` artifacts auto-push and calibration takes write back to your brain. Shared/team endpoint → reads only, prompts before writes, user-namespaced via federation sources or `users/<slug>/gstack/` prefix.
65+
66+
If you use local PGLite: auto-detected as personal; no question fires. The cache lives at `~/.gstack/{,projects/<slug>/}brain-cache/` with per-entity TTLs.
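A per-entity TTL with a stale-but-usable fallback can be sketched as a small decision function. This is an illustrative sketch only: `CachedDigest`, `readDigest`, the `refresh` callback, and the 24-hour TTL are assumptions, and the actual `bin/gstack-brain-cache` logic may differ.

```typescript
// Hypothetical shape of one cached digest.
interface CachedDigest {
  body: string;
  writtenAtMs: number; // when the digest was written
  ttlMs: number;       // per-entity TTL; real values live in the cache spec
}

type DigestSource = "warm" | "refreshed" | "stale" | "none";

interface DigestResult {
  body: string | null;
  source: DigestSource;
}

// `refresh` stands in for the cold-miss fetch; it returns null when the fetch fails.
function readDigest(
  cached: CachedDigest | undefined,
  nowMs: number,
  refresh: () => string | null,
): DigestResult {
  if (cached !== undefined && nowMs - cached.writtenAtMs <= cached.ttlMs) {
    return { body: cached.body, source: "warm" }; // fresh: disk read only
  }
  const refreshed = refresh();
  if (refreshed !== null) {
    return { body: refreshed, source: "refreshed" };
  }
  if (cached !== undefined) {
    return { body: cached.body, source: "stale" }; // stale-but-usable fallback
  }
  return { body: null, source: "none" };
}

const HOUR_MS = 60 * 60 * 1000;
// Assumed 24h TTL, chosen only for the example.
const productDigest: CachedDigest = { body: "product digest", writtenAtMs: 0, ttlMs: 24 * HOUR_MS };

const warmRead = readDigest(productDigest, 1 * HOUR_MS, () => "new digest");
const staleRead = readDigest(productDigest, 48 * HOUR_MS, () => null);
console.log(warmRead.source, staleRead.source); // prints "warm stale"
```

In this sketch a failed refresh never blocks the skill: the caller gets the old digest and can tell from `source` that it is stale.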
67+
68+
If you're a contributor: the new resolver pattern (`{{BRAIN_PREFLIGHT}}` / `{{BRAIN_CACHE_REFRESH}}` / `{{BRAIN_WRITE_BACK}}`) is the template seam for the brain integration. Empty string for any skill not in `SKILL_DIGEST_SUBSETS` — drop the placeholders anywhere with zero cost.
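The "empty string for skills outside `SKILL_DIGEST_SUBSETS`" behaviour can be illustrated with a tiny resolver. This is a hedged sketch: the names `generateBrainPreflight` and `SKILL_DIGEST_SUBSETS` appear in this changelog, but the map contents, the rendered text, `renderSkillTemplate`, and the `qa` skill name are invented here and do not reflect the real implementation in `scripts/resolvers/gbrain.ts`.

```typescript
// Hypothetical subset table: skill name -> digest files to load.
const SKILL_DIGEST_SUBSETS = new Map<string, string[]>([
  ["office-hours", ["product", "goal", "user-profile"]],
  ["plan-eng-review", ["product", "developer-persona", "take"]],
]);

// Returns the preflight block for a skill, or "" when the skill has no subset.
function generateBrainPreflight(skillName: string): string {
  const digests = SKILL_DIGEST_SUBSETS.get(skillName);
  if (digests === undefined || digests.length === 0) {
    return "";
  }
  const lines = digests.map((entity) => `- load cached digest: ${entity}`);
  return ["Brain preflight:", ...lines].join("\n");
}

// Hypothetical template renderer handling a single placeholder.
function renderSkillTemplate(template: string, skillName: string): string {
  // A replacer function keeps any `$` in the block from being read as a replacement pattern.
  return template.replace("{{BRAIN_PREFLIGHT}}", () => generateBrainPreflight(skillName));
}

const skillTemplate = "{{BRAIN_PREFLIGHT}}\n# Skill body";
console.log(renderSkillTemplate(skillTemplate, "office-hours"));
console.log(JSON.stringify(renderSkillTemplate(skillTemplate, "qa")));
```

Because an unlisted skill resolves to an empty string, the placeholder can sit in any template without adding tokens to skills that do not use the brain.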
69+
70+
Phase 2 calibration write-back is gated behind the `BRAIN_CALIBRATION_WRITEBACK` feature flag (default off) until upstream gbrain ships `takes_add` / `takes_resolve` MCP ops (filed in TODOS.md as P2). When the flag flips, the existing skill templates pick up the write-back behavior with no template changes.
71+
72+
### Itemized changes
73+
74+
**Added**
75+
- `scripts/brain-cache-spec.ts` — single source of truth for `BRAIN_CACHE_ENTITIES` (8 entities × TTL + budget + invalidation rules), `SKILL_DIGEST_SUBSETS` (per-skill which files to load), `SALIENCE_DEFAULT_ALLOWLIST`, `SKILL_CALIBRATION_WEIGHTS`, trust-policy + schema-pack constants.
76+
- `scripts/gstack-schema-pack.ts` — `gstack-core@1.0.0` schema pack with 8 typed page kinds: `user-profile`, `product`, `goal`, `developer-persona`, `brand`, `competitive-intel`, `skill-run`, `take`. Frontmatter shapes, retention policies, link verbs for `mcp__gbrain__schema_graph`.
77+
- `bin/gstack-brain-cache` — three-tier cache CLI: `get` / `refresh` / `invalidate` / `digest` / `meta` / `bootstrap` / `list` / `purge` subcommands. Atomic writes, TTL staleness, schema-version full-rebuild on mismatch, stale-but-usable fallback, concurrent-refresh lockfile dedup.
78+
- `scripts/resolvers/gbrain.ts` — three new resolver functions: `generateBrainPreflight`, `generateBrainCacheRefresh`, `generateBrainWriteBack`. Empty-string for non-preflight skills (defensive).
79+
- `bin/gstack-config` — `brain_trust_policy@<endpoint-hash>` namespace, `endpoint-hash` subcommand (sha8 with collision → sha16 escalation), `resolve-user-slug` subcommand (D4 A3 identity resolution chain: `whoami` → `$USER` → `sha8(git email)` → `anonymous-<sha8(hostname)>`).
80+
- `setup-gbrain` Step 9.5 — brain trust policy question per-endpoint. Local auto-set personal; remote-ambiguous asks; personal flips `artifacts_sync_mode=full`.
81+
- `sync-gbrain` — `--refresh-cache` flag (replaces planned `/brain-refresh-context` skill per D1 fold), `--audit` flag (gstack-owned page summary + salience leak check), Step 1 trust-policy gate.
82+
- 10 new gate-tier test files (111 assertions): `brain-cache-spec`, `gstack-schema-pack`, `brain-cache-roundtrip`, `cache-concurrent-refresh`, `salience-allowlist`, `brain-preflight`, `user-slug-fallback`, `schema-version-migration`, `takes-fence-fallback`, `skill-preflight-budget`.
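The `endpoint-hash` escalation listed above (sha8, moving to sha16 on collision) can be sketched as follows. This is not the `bin/gstack-config` implementation: the sketch assumes that "shaN" means the first N hex characters of a SHA-256 digest, and `shaPrefix`, `endpointHash`, the `claimed` map, and the example URLs are all hypothetical.

```typescript
import { createHash } from "node:crypto";

// Assumption: "shaN" means the first N hex characters of a SHA-256 digest.
function shaPrefix(input: string, hexChars: number): string {
  return createHash("sha256").update(input).digest("hex").slice(0, hexChars);
}

// `claimed` maps an existing namespace key to the endpoint URL that owns it.
function endpointHash(url: string, claimed: ReadonlyMap<string, string>): string {
  const short = shaPrefix(url, 8);
  const owner = claimed.get(short);
  if (owner === undefined || owner === url) {
    return short; // no collision, or the key already belongs to this endpoint
  }
  return shaPrefix(url, 16); // key is held by a different endpoint: escalate
}

const endpoint = "https://brain.example.com/mcp"; // placeholder URL
const noCollision = endpointHash(endpoint, new Map());
const collision = endpointHash(
  endpoint,
  new Map([[noCollision, "https://other.example.com/mcp"]]),
);
console.log(`brain_trust_policy@${noCollision}`, `brain_trust_policy@${collision}`);
```

The longer key shares its first eight characters with the short one, so an escalated namespace is still recognisable next to the entry it collided with.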
83+
84+
**Changed**
85+
- 5 planning SKILL.md.tmpl files wired with `{{BRAIN_PREFLIGHT}}` (top of skill body) and `{{BRAIN_CACHE_REFRESH}}` / `{{BRAIN_WRITE_BACK}}` (end of skill) placeholders.
86+
- `scripts/resolvers/index.ts` registers `BRAIN_PREFLIGHT`, `BRAIN_CACHE_REFRESH`, `BRAIN_WRITE_BACK`.
87+
88+
**For contributors**
89+
- Three follow-ups deferred to `TODOS.md` (P2 / P3): `/gstack-reflect` nightly synthesis, cross-machine brain-cache sync, dedicated `/gstack-onboarding` skill.
90+
- Upstream gbrain dependency for Phase 2: `takes_add` + `takes_resolve` MCP ops in `~/git/gbrain/` (filed as P2 in TODOS.md). Phase 2 wiring already exists behind `BRAIN_CALIBRATION_WRITEBACK` flag; flag flips when upstream lands.
91+
- Plan / CEO + eng review record: `~/.claude/plans/hm-interesting-well-why-dapper-eagle.md` (Approach B + 5 cherry-picks + 11 D-decisions from full eng review + codex outside-voice synthesis).
92+
93+
### Save-results path: works under any CLI when gbrain is on PATH
94+
95+
Brain-aware planning saves the actual review document to gbrain, not just preflight digests and calibration takes. Setup detects gbrain at install time and, if present, the planning skills emit compressed `gbrain put "<prefix>/<feature-slug>"` instructions for `office-hours/`, `ceo-plans/`, `eng-reviews/`, `design-reviews/`, and `devex-reviews/` slug spaces. If gbrain is not detected, the save-results block is suppressed entirely. Zero token overhead for users without gbrain. If you install gbrain after running `./setup`, run `gstack-config gbrain-refresh` to pick up the change.
96+
97+
Token cost stays tight: the inline save-results block is ~150 tokens per planning skill (down from the ~1000 tokens a naive un-suppression would have added). The full save template (heredoc body, entity-stub instructions, throttle handling, backlinks) lives in `docs/gbrain-write-surfaces.md` §Save Template, and the agent reads it on demand, only when it actually saves. The brain-context-load block follows the same compression discipline: ~115 tokens, with a skip-header pointing to §Context Load.
98+
99+
| Detection state | Per-planning-skill token overhead | What the agent does on save |
|---|---|---|
| gbrain on PATH + `gstack-config gbrain-refresh` says `local_status: "ok"` | ~250 tokens (CONTEXT_LOAD + SAVE_RESULTS, compressed) | reads `docs/gbrain-write-surfaces.md` on demand, calls `gbrain put <prefix>/<slug>` |
| gbrain not on PATH | 0 tokens | block suppressed at gen-time, nothing rendered |
| GBrain or Hermes host adapter | full inline render (unchanged) | calls `gbrain put` always |
104+
105+
Wired for all five planning skills uniformly: `office-hours`, `plan-ceo-review`, `plan-eng-review`, `plan-design-review`, `plan-devex-review`. The last two gained the `{{GBRAIN_SAVE_RESULTS}}` placeholder in their templates (previously only the first three had it, so design-review and devex-review produced no retrievable page even under GBrain CLI).
106+
107+
Coverage spans four tests:

- A free resolver-level unit test pins per-skill slug + tag metadata + the compressed token budget (`test/resolvers-gbrain-save-results.test.ts`, 10 tests / 53 assertions).
- A free override-mechanism test asserts the detection file gates resolver rendering correctly across `detected: true`, `detected: false`, and `no file` states (`test/gbrain-detection-override.test.ts`, 4 tests).
- A periodic-tier fake-CLI E2E drives `/office-hours` against a stub `gbrain` on PATH and asserts the agent actually calls `gbrain put office-hours/<slug>` with valid YAML frontmatter (`test/skill-e2e-office-hours-brain-writeback.test.ts`, ~$0.50-1/run).
- A periodic-tier real-CLI round-trip drives `gbrain init --pglite` + `gbrain put` + `gbrain get` against an isolated temp HOME and asserts the body survives (`test/skill-e2e-gbrain-roundtrip-local.test.ts`, ~$0.001/run, skips if `VOYAGE_API_KEY` is unset).

Together: the agent obeys the resolver instruction, the resolver emits a valid CLI shape, and the CLI persists the page on the local engine. Remote/Supabase routing is gbrain's contract to honor: the same CLI shape covers all engines, so gstack stops at local round-trip coverage.
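The three detection states and the host-adapter row of the table above can be modelled as one gating function. This is a rough sketch under assumptions: it collapses the detection file to a single `detected` boolean, treats a missing file like "not detected" (the changelog does not state that), and `renderSaveResults`, `Host`, and the rendered strings are hypothetical.

```typescript
type Host = "claude" | "gbrain" | "hermes";

// null models the "no file" state; the single-boolean JSON shape is an assumption.
type Detection = { detected: boolean } | null;

function renderSaveResults(host: Host, detection: Detection, slugPrefix: string): string {
  if (host === "gbrain" || host === "hermes") {
    // Host adapters always get the full inline block, regardless of detection.
    return `Save the review with: gbrain put "${slugPrefix}/<feature-slug>" (full inline template)`;
  }
  if (detection !== null && detection.detected) {
    // Compressed block that points at the on-demand template.
    return `Save: gbrain put "${slugPrefix}/<feature-slug>" (see docs/gbrain-write-surfaces.md)`;
  }
  // Assumption: a missing detection file is treated like "not detected".
  return "";
}

console.log(JSON.stringify(renderSaveResults("claude", { detected: true }, "eng-reviews")));
console.log(JSON.stringify(renderSaveResults("claude", { detected: false }, "eng-reviews")));
console.log(JSON.stringify(renderSaveResults("claude", null, "eng-reviews")));
```

Returning an empty string when gbrain is absent is what gives installs without gbrain a zero-token overhead in this model.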
108+
109+
**For contributors (save-results layer):**
110+
- `bin/gstack-config gbrain-refresh` re-runs `bin/gstack-gbrain-detect` and writes `~/.gstack/gbrain-detection.json`. `./setup` runs this at the end of install and conditionally regenerates Claude-host SKILL.md with `bun run gen:skill-docs:user` (added package.json script) so detected installs get the brain blocks immediately.
111+
- The default `bun run gen:skill-docs` (CI canonical) ignores the detection file. Committed SKILL.md stays reproducible regardless of any developer's local gbrain state. Use `bun run gen:skill-docs:user` for user-local installs.
112+
- Two follow-ups deferred to `TODOS.md` (P2): re-verify calibration takes when gbrain v0.42+ ships `takes_add` (the `BRAIN_CALIBRATION_WRITEBACK` flag flips); extend the brain-writeback E2E to the other 4 planning skills.
113+
36114
## [1.52.0.0] - 2026-05-27
37115

38116
## **`/plan-tune` settings actually do something now. Hooks make capture deterministic, preferences binding, and free-text answers loop back as memory.**

TODOS.md

Lines changed: 162 additions & 0 deletions
@@ -2070,3 +2070,165 @@ Shipped in v0.6.5. TemplateContext in gen-skill-docs.ts bakes skill name into pr
20702070
### Auto-upgrade mode + smart update check
20712071
- Config CLI (`bin/gstack-config`), auto-upgrade via `~/.gstack/config.yaml`, 12h cache TTL, exponential snooze backoff (24h→48h→1wk), "never ask again" option, vendored copy sync on upgrade
20722072
**Completed:** v0.3.8
2073+
2074+
---

## Brain-aware planning follow-ups (filed v1.48.0.0 via /plan-ceo-review + /plan-eng-review)

These are the deferred cherry-picks (E2/E3/E4) from the v1.48 brain-aware
planning plan at `~/.claude/plans/hm-interesting-well-why-dapper-eagle.md`.
The foundation (Phase 0 entity model + Phase 0.5 cache + Phase 1 preflight +
Phase 1.5 trust policy + Phase 2 write-back scaffolding) ships in
v1.48.0.0. These follow-ups extend it.

### P2: /gstack-reflect nightly synthesis skill (E2)

**What:** Scheduled skill that reads weekly `gstack/skill-run` + takes +
`get_recent_salience` and synthesizes a `gstack/insight` page surfaced at
next skill preflight.

**Why:** Cross-time pattern detection is the compounding move. "You ran 4
plan-ceo on infra this week, 0 on product; is product work getting
starved?" surfaces patterns the user wouldn't notice.

**Pros:** Brain compounds across TIME, not just across skills. Patterns
become actionable.

**Cons:** "You're starving product work" is high-judgment territory; needs
opt-out per project, careful insight templates.

**Context:** Deferred from v1.48.0.0 cherry-pick (D4): wait 4-6 weeks for
real `gstack/skill-run` data to accumulate before designing the reflection
layer against real patterns instead of imagined ones.

**Effort:** L (human ~1-2 days, CC ~4-6h)

**Depends on:** Phase 0 (gstack/skill-run page type from v1.48.0.0) +
~6 weeks of accumulated data

### P3: Cross-machine brain-cache sync (E3)

**What:** Push compressed digests through the gstack-brain-sync git pipeline
so the brain-cache survives moving between Macs / Conductor workspaces.

**Why:** Eliminates the cold-miss tax on every new machine (~1-2s once per
machine per day).

**Pros:** Instant warm cache on new machines.

**Cons:** Cache poisoning risk if not designed carefully (hash invariants,
endpoint-binding, conflict resolution).

**Context:** Deferred from v1.48.0.0 cherry-pick (D5): single-machine
cache is fine for V1; correctness risk needs its own design pass.

**Effort:** M (human ~4h, CC ~30min)

**Depends on:** Brain-cache layer from v1.48.0.0

### P3: /gstack-onboarding dedicated skill (E4)

**What:** Guided 5-minute setup skill for new gstack installs: walks user
through reading CLAUDE.md + README + recent commits to build `gstack/product`
and active goals with explicit AUQs.

**Why:** Better UX than the inline bootstrap (which only fires when a
planning skill is invoked).

**Pros:** Cleaner cold-start, explicit ceremony.

**Cons:** Inline bootstrap (in scope for v1.48) already covers the
cold-start path adequately.

**Context:** Deferred from v1.48.0.0 cherry-pick (D6): observe inline
bootstrap performance first; add dedicated skill if friction is real.

**Effort:** S (human ~2h, CC ~15min)

**Depends on:** Inline bootstrap subcommand from v1.48.0.0

### P2: Upstream gbrain takes_add + takes_resolve MCP ops

**What:** Add `mcp__gbrain__takes_add` and `mcp__gbrain__takes_resolve`
ops in `~/git/gbrain/src/core/operations.ts`. Extract the markdown-fence
mirror logic from `commands/takes.ts:570` into a reusable
`engine.resolveTake()` helper.

**Why:** Unlocks Phase 2 calibration write-back without the fence-block
fallback. ~150 LOC. Already on gbrain's v0.31.x roadmap.

**Pros:** Clean Phase 2 path, removes the "fall back to put_page" smell.

**Cons:** Lives in upstream gbrain repo, not helsinki; separate PR.

**Context:** Phase 2 write-back is already wired in v1.48.0.0 behind the
BRAIN_CALIBRATION_WRITEBACK feature flag (default off). Flag flips to
true once upstream gbrain ships these ops. ~50 LOC follow-up in
helsinki to swap the fallback for the preferred op.

**Effort:** S (human ~1d, CC ~1h) in gbrain repo; trivial wire-up in
helsinki.

**Depends on:** None (parallel-track from v1.48.0.0)

### P3: Background-refresh hook supervision

**What:** Codex outside-voice raised that "background refresh at skill END"
is hand-wavy. Add proper process supervision: PID file, timeout, failure
log, cross-platform spawn.

**Why:** Current implementation backgrounds with `&` which works but
leaves no observability when a refresh fails.

**Context:** Deferred from v1.48.0.0 codex tension T3. Stays low priority
until users report stale digests where a background refresh silently
failed.

**Effort:** S (human ~2h, CC ~20min)

### P2: Re-verify calibration takes when gbrain v0.42+ lands

**What:** When upstream gbrain ships `takes_add` MCP op and we flip
`BRAIN_CALIBRATION_WRITEBACK` from FALSE to TRUE, re-run the manual
probe in `docs/gbrain-write-surfaces.md` against `/office-hours` and
confirm `gbrain takes_list` surfaces a `kind=bet` entry with the
expected weight (0.9 for office-hours, per
`scripts/brain-cache-spec.ts:151-157`).

**Why:** Today the calibration take path falls back to writing inside a
`gbrain put` fence block because `takes_add` isn't available yet. Once
v0.42+ ships, the agent will call `takes_add` directly, so we should
confirm the new path actually persists a queryable take.

**Context:** v1.50.0.0 plan §"NOT in scope". The fence-block fallback
test (`test/takes-fence-fallback.test.ts`) covers wiring for both paths;
this TODO is about live verification of the preferred path when it
becomes available.

**Effort:** XS (human ~15min, CC ~5min)

**Depends on:** Upstream gbrain v0.42+ release shipping `takes_add` MCP
op (separate TODO above).

### P2: Extend brain-writeback E2E to the other 4 planning skills

**What:** `test/skill-e2e-office-hours-brain-writeback.test.ts` covers
the brain-writeback path for `/office-hours` only. Adding parallel
tests for `/plan-ceo-review`, `/plan-eng-review`, `/plan-design-review`,
and `/plan-devex-review` would bring per-skill agent-obedience coverage
to parity with the resolver unit test
(`test/resolvers-gbrain-save-results.test.ts`, which covers wiring for
all 5).

**Why:** The resolver test proves the right instructions get emitted;
the E2E proves the agent actually obeys. Today we only have that
end-to-end signal for one of five planning skills.

**Context:** v1.50.0.0 plan §"NOT in scope". Extract `makeFakeGbrain`
into `test/helpers/fake-gbrain.ts` when the second consumer arrives
(YAGNI for one consumer today).

**Effort:** S (human ~1d, CC ~1h). Periodic-tier (~$2-4 total for 4
runs).

**Depends on:** None.
