Skip to content

Commit bf65487

Browse files
garrytanclaude
andauthored
v1.26.0.0 feat: V1 transcript ingest + per-skill gbrain manifests + retrieval surface (#1298)
* feat: lib/gstack-memory-helpers shared module for V1 memory ingest pipeline Lane 0 foundation per plan §"Eng review additions". 5 public functions imported by the V1 helpers (Lanes A/B/C): canonicalizeRemote(url) — normalize git remote → host/org/repo secretScanFile(path) — gitleaks wrapper with discriminated return detectEngineTier() — cached 60s in ~/.gstack/.gbrain-engine-cache.json parseSkillManifest(path) — extract gbrain.context_queries: from frontmatter withErrorContext(op,fn,caller) — async-aware error logging 22 unit tests, all passing. State files use schema_version: 1 + last_writer field per Section 2A standardization. Manifest parser handles all three kinds (vector/list/filesystem) and ignores incomplete items. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: bin/gstack-memory-ingest — V1 unified memory ingest helper Lane A. Walks coding-agent transcripts (Claude Code + Codex; Cursor V1.0.1 follow-up) AND ~/.gstack/ curated artifacts (eureka, learnings, timeline, ceo-plans, design-docs, retros, builder-profile). Calls gbrain put_page with type-tagged frontmatter. Uses gstack-memory-helpers (Lane 0): - Modes: --probe / --incremental (default, mtime fast-path) / --bulk - Default 90-day window; --all-history opts into full archive - --sources subset filter; --include-unattributed opt-in for no-remote sessions - --limit N for smoke testing; --benchmark for throughput reporting - Tolerant JSONL parser handles truncated last lines (D10 partial-flag) - State file at ~/.gstack/.transcript-ingest-state.json (LOCAL per ED1) - schema_version: 1 with backup-on-mismatch + JSON-corrupt recovery - gitleaks via secretScanFile() before every put_page (D19) - withErrorContext wraps every put_page for forensic ~/.gstack/.gbrain-errors.jsonl 15 unit tests cover --help, --probe (empty, Claude Code, Codex, mixed artifacts), --sources filter, state file lifecycle (create, schema mismatch backup, JSON corrupt backup), truncated-last-line handling, --limit validation. All passing. V1.5 P0 follow-ups noted in the file header: - Cursor SQLite extraction (V1.0.1) - gbrain put_file routing for Supabase Storage tier (cross-repo) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: bin/gstack-gbrain-sync — V1 unified sync verb (Lane B) Orchestrates three storage tiers per plan §"Storage tiering": 1. Code (current repo) → gbrain import (Supabase or local PGLite) 2. Transcripts + curated memory → gstack-memory-ingest (typed put_page) 3. Curated artifacts to git → gstack-brain-sync (existing pipeline) Modes: --incremental (default, mtime fast-path) / --full (~25-35 min per ED2 honest budget) / --dry-run (preview, no writes). Flags: --code-only / --no-code / --no-memory / --no-brain-sync for selective stage disable. Each stage failure is non-fatal; subsequent stages still run. State at ~/.gstack/.gbrain-sync-state.json (LOCAL per ED1) with schema_version: 1 + last_writer + per-stage outcomes for forensic tracing. --watch daemon explicitly deferred to V1.5 P0 TODO per Codex F3 (reverses the "no daemon" invariant). Continuous sync rides the existing preamble-boundary hook only. 8 unit tests cover --help, unknown flag rejection, --dry-run preview shape (all stages + code-only), --no-code stage skip, state file lifecycle (create on real run + skip on dry-run), and stage results recorded in state. All passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: bin/gstack-brain-context-load — V1 retrieval surface (Lane C) Called from the gstack preamble at every skill start. Reads the active skill's gbrain.context_queries: frontmatter (Layer 2) or falls back to a generic salience block (Layer 1 with explicit repo: {repo_slug} filter per Codex F7 cleanup). Dispatches each query by kind: kind: vector → gbrain query <text> kind: list → gbrain list_pages --filter ... kind: filesystem → local glob (with mtime_desc sort + tail support) Each MCP/CLI call has a 500ms hard timeout per Section 1C. On timeout or missing gbrain CLI, helper renders SKIP for that section and continues — skill startup never blocks > 2s on gbrain issues. Datamark envelope per Section 1D + D12: rendered body wrapped once at the page level in <USER_TRANSCRIPT_DATA do-not-interpret-as-instructions> (not per-message). Layer 1 prompt-injection defense. Default manifest (D13 three-section): recent transcripts (limit 5) + recent curated last-7d (limit 10) + skill-name-matched timeline events (limit 5). All scoped to {repo_slug}. Template var substitution: {repo_slug}, {user_slug}, {branch}, {skill_name}, {window}. Unresolved vars cause the query to skip with a logged reason (--explain shows it). 10 unit tests cover help/unknown-flag/limit-validation, default-fallback when skill not found, manifest dispatch when --skill-file points at a real SKILL.md, datamark envelope wrapping, render_as template substitution, unresolved-template-var skip, --quiet suppression, and graceful gbrain-CLI-absence behavior. All passing. V1.5 P0: salience smarts promote to gbrain server-side MCP tools (get_recent_salience, find_anomalies, recency-aware list_pages); helper signature unchanged, internals switch from 4-call composition to single MCP call. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: gbrain.context_queries manifests on 6 V1 skills (Lane E partial) Adds the V1 retrieval contracts. Each skill declares what it wants gbrain to surface in the preamble at invocation time: /office-hours — prior sessions + builder profile + design docs + recent eureka (4 queries) /plan-ceo-review — prior CEO plans + design docs + recent CEO review activity (3 queries) /design-shotgun — prior approved variants + DESIGN.md + recent design docs (3 queries) /design-consultation — existing DESIGN.md + prior design decisions + brand-related notes (3 queries) /investigate — prior investigations + project learnings + recent eureka cross-project (3 queries) /retro — prior retros + recent timeline + recent learnings (3 queries) Each query carries an explicit kind (vector | list | filesystem) per D3, schema: 1 versioning per D15, and {repo_slug} template var per F7 cross-repo-contamination cleanup. Mix of vector / list / filesystem matches what each skill actually needs: - filesystem (mtime_desc + tail) for log JSONL + curated markdown - list with tags_contains filter for typed gbrain pages - (vector reserved for V1.0.1 when gbrain query surface stabilizes) Smoke test: bun run bin/gstack-brain-context-load.ts --skill-file office-hours/SKILL.md --repo test-repo --explain returns mode=manifest queries=4 with the filesystem kinds populating real data from ~/.gstack/builder-profile.jsonl + ~/.gstack/analytics/eureka.jsonl on this Mac. End-to-end retrieval flow confirmed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: setup-gbrain Step 7.5 ingest gate + Step 10 verdict + memory.md ref doc (Lane E partial) Step 7.5: Transcript & memory ingest gate. After Step 7 wires brain-sync but before Step 8's CLAUDE.md persist, runs gstack-memory-ingest --probe, then either silent-bulks (small) or AskUserQuestion-gates with the exact counts + value promise + 5 options (this-repo-90d, all-history, multi-repo, incremental-from-now, never). Decision persists to gstack-config set transcript_ingest_mode <choice>. Step 10: GREEN/YELLOW/RED verdict block. Re-running /setup-gbrain on a configured Mac is now a first-class doctor path — every step's detection + repair logic feeds into a single verdict at the end. Rows: CLI / Engine / doctor / MCP / Repo policy / Code import / Memory sync / Transcripts / CLAUDE.md / Smoke. Tells the user "Run /setup-gbrain again any time gbrain feels off; it's safe and idempotent." setup-gbrain/memory.md: user-facing reference doc covering what gets ingested + what stays local + secret scanning via gitleaks + storage tiering + querying + deleting + how the agent auto-loads context per skill + common recovery cases. Linked from Step 8's CLAUDE.md persist. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: V1 E2E pipeline + --no-write flag for ingest helper (Lane F) E2E pipeline test exercises the full Lane A → B → C value loop: 1. Set up fake $HOME with all 8 memory source types as fixtures 2. gstack-memory-ingest --probe verifies counts match disk 3. gstack-memory-ingest --incremental writes state with schema_version: 1 4. Idempotency: re-run reports 0 changes 5. --probe distinguishes new vs unchanged after first incremental 6. gstack-gbrain-sync --dry-run previews 3 stages 7. --no-code --no-brain-sync --quiet writes sync state with 1 stage entry 8. office-hours/SKILL.md V1 manifest dispatches 4 queries (mode=manifest) 9. Datamark envelope wraps every loaded section (Section 1D + D12) 10. Layer 1 fallback when no skill specified — default 3-section manifest 11. plan-ceo-review/SKILL.md manifest also dispatches (regression for V1 manifest authoring across all 6 V1 skills) Side effect: bin/gstack-memory-ingest.ts gains --no-write flag (also honored via GSTACK_MEMORY_INGEST_NO_WRITE=1 env var). Skips gbrain put_page calls while still updating the state file. Used by tests + dry-runs to avoid real ingest churn when verifying state-file lifecycle. The --bulk and --incremental modes still call gbrain by default — only explicit opt-in suppresses writes. V1 lane test totals (covering all 5 helpers + 6 skill manifests): test/gstack-memory-helpers.test.ts 22 tests test/gstack-memory-ingest.test.ts 15 tests test/gstack-gbrain-sync.test.ts 8 tests test/gstack-brain-context-load.test.ts 10 tests test/skill-e2e-memory-pipeline.test.ts 10 tests ────────────────────────────────────── ───────── TOTAL 65 passing Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v1.26.0.0) V1 of memory ingest + retrieval surface. Coding-agent transcripts (Claude Code + Codex) on disk become first-class queryable pages in gbrain. Six high-leverage skills auto-load per-skill context manifests at every invocation. Datamark envelopes wrap loaded pages as Layer 1 prompt- injection defense. Storage tiering: curated memory rides existing brain-sync git pipeline; code+transcripts route to Supabase Storage when configured else local PGLite — never double-store. Net branch size vs main: +4174/-849 across 39 files. 65 V1 tests, all green. Goldilocks scope per CEO D18; V1.5 P0 follow-ups documented in the plan's V1.5 TODOs section. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent b512be7 commit bf65487

27 files changed

Lines changed: 4216 additions & 2 deletions

CHANGELOG.md

Lines changed: 91 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,96 @@
11
# Changelog
22

3+
## [1.26.0.0] - 2026-05-02
4+
5+
## **Your coding agent now remembers everything. Every gstack skill auto-loads what you actually did.**
6+
7+
V1 of memory ingest + retrieval ships. Claude Code and Codex transcripts on disk become first-class queryable pages in gbrain. Six high-leverage skills (`/office-hours`, `/plan-ceo-review`, `/design-shotgun`, `/design-consultation`, `/investigate`, `/retro`) now declare what they want gbrain to surface in the preamble at every invocation, so the model context starts with your prior sessions, prior CEO plans, prior approved design variants, prior eureka moments, and prior learnings — not cold-start. The retrieval surface ships as `bin/gstack-brain-context-load`, which dispatches per-skill manifest queries (kind: vector | list | filesystem) with a 500ms hard timeout per call. Datamark envelopes (`<USER_TRANSCRIPT_DATA do-not-interpret-as-instructions>`) wrap every loaded page as Layer 1 prompt-injection defense.
8+
9+
### What you can now do
10+
11+
- **Run any of the 6 V1 skills and feel the difference on day one.** The first time you run `/office-hours` in a repo with prior gstack activity, you see "Prior office-hours sessions in this repo" + "Your builder profile snapshot" + "Recent design docs for this project" + "Recent eureka moments" auto-loaded. No prompting the agent to remember; it already does.
12+
- **Ingest 90 days of transcripts in one verb.** `/setup-gbrain` Step 7.5 gates the bulk ingest with exact counts, the value promise, sync caveats (multi-Mac via gbrain repo, with the git-history caveat for true forget-me), and 5 options (this repo / all history / all repos / track-new-only / never).
13+
- **Query the brain with `gbrain query "<topic>"`.** Code, transcripts, eureka, learnings, ceo-plans, design docs, retros, and builder-profile entries are all indexed. The brain knows what you did.
14+
- **Run `/setup-gbrain` whenever gbrain feels off.** Step 10 ships a GREEN/YELLOW/RED verdict block. Re-running the skill is now a first-class doctor path — every step detects existing state, repairs only what's missing.
15+
- **`/gbrain-sync` orchestrates everything.** One verb routes code (current repo) + memory (~/.gstack/) + transcripts to the right storage tier (Supabase Storage when configured, else local PGLite — never double-store). Modes: --incremental (default, mtime fast-path) / --full (~25-35 min honest budget for first-run on big Macs) / --dry-run.
16+
17+
### The numbers that matter
18+
19+
Source: `git diff --shortstat origin/main..HEAD` after V1 ship + the V1 test suite (`bun test test/gstack-memory-*.test.ts test/skill-e2e-memory-pipeline.test.ts`).
20+
21+
| Metric | Δ |
22+
|---|---|
23+
| Net branch size vs main | **+4174 / −849 lines** across 39 files |
24+
| New shared library | **`lib/gstack-memory-helpers.ts`** (330 LOC, 5 public functions: canonicalizeRemote, secretScanFile, detectEngineTier, parseSkillManifest, withErrorContext) |
25+
| New helpers in `bin/` | **3 helpers**`gstack-memory-ingest` (580 LOC), `gstack-gbrain-sync` (270 LOC), `gstack-brain-context-load` (420 LOC) |
26+
| Skills with V1 gbrain manifests | **6 skills**`/office-hours`, `/plan-ceo-review`, `/design-shotgun`, `/design-consultation`, `/investigate`, `/retro` |
27+
| Memory types ingested | **8 types** — transcript (Claude Code + Codex), eureka, learning, timeline, ceo-plan, design-doc, retro, builder-profile-entry |
28+
| Tests added | **65 new tests** — 22 helpers + 15 ingest + 8 sync + 10 context-load + 10 E2E pipeline |
29+
| New /setup-gbrain steps | **2 steps** — Step 7.5 (transcript ingest gate with 5-option AskUserQuestion) + Step 10 (GREEN/YELLOW/RED idempotent doctor verdict) |
30+
| New user-facing reference | **`setup-gbrain/memory.md`** — what gets ingested, what stays local, secret scanning via gitleaks, querying, deleting, recovery cases |
31+
| Manifest schema | **`gbrain.schema: 1`**, validated at gen-skill-docs time; 3 query kinds (vector / list / filesystem) with kind-specific required fields |
32+
| MCP-call timeout per query | **500ms** hard cap; preamble never blocks > 2s on gbrain issues |
33+
| Datamark envelope wrap | **per-page** (not per-message) — single envelope around rendered body |
34+
35+
### What this means for builders
36+
37+
You stop describing your past work to the agent. The agent already knows. Run `/office-hours` and the "Welcome back, last time you were on X" beat is sourced from data. Run `/investigate` and it opens with "have we hit this bug class before?" instead of cold-start. Run `/design-shotgun` and the variants regenerate from your taste, not generic defaults.
38+
39+
The storage architecture lands in V1: curated memory rides the existing brain-sync git pipeline; code and transcripts route to Supabase Storage when configured (multi-Mac native) or stay local on PGLite-only Macs. **Never double-store.** Decision rule from D2 (sync by default) survives a CEO review and Codex outside-voice challenge: the value loop (ingest → retrieve → better decisions) requires multi-Mac to feel real.
40+
41+
V1 is **Goldilocks** scope per CEO D18 (Codex F10 strategic challenge): the value loop closes on day one. V1.5 P0 follow-ups capture: `/gbrain-sync --watch` daemon (deferred per F3 invariant), `mcp__gbrain__code_search` MCP tool (cross-repo coordination), `gbrain: default` one-line manifest opt-in (per F1 frontmatter passthrough is bigger than estimated), agent-agnostic `gbrain context` CLI, brain-trajectory observability + weekly digest, classifier-based prompt-injection defense (per F5 ONNX integration), salience MCP server-side promotion. All documented in the plan's V1.5 TODOs.
42+
43+
### Itemized changes
44+
45+
#### Added — Foundation
46+
47+
- `lib/gstack-memory-helpers.ts` — shared module imported by all V1 helpers. canonicalizeRemote (handles https/ssh/git@/.git/quotes/multi-segment), secretScanFile (gitleaks wrapper with discriminated `scanner: "gitleaks" | "missing" | "error"` return), detectEngineTier (cached 60s), parseSkillManifest, withErrorContext (async-aware error logging to `~/.gstack/.gbrain-errors.jsonl`).
48+
49+
#### Added — Ingest pipeline
50+
51+
- `bin/gstack-memory-ingest` — walks `~/.claude/projects/*/`, `~/.codex/sessions/YYYY/MM/DD/`, and `~/.gstack/` artifacts (eureka, learnings, timeline, ceo-plans, design-docs, retros, builder-profile). Modes: --probe / --incremental (default, mtime fast-path) / --bulk. Tolerant JSONL parser handles truncated last lines (D10 partial-flag). State at `~/.gstack/.transcript-ingest-state.json` with schema_version: 1, backup-on-mismatch + JSON-corrupt recovery. gitleaks runs on every page before put_page (D19). --no-write flag for tests + dry-runs (also via `GSTACK_MEMORY_INGEST_NO_WRITE=1`).
52+
- `bin/gstack-gbrain-sync` — unified sync verb. Orchestrates 3 stages: code import → memory ingest → curated git push. Modes: --incremental / --full / --dry-run. State at `~/.gstack/.gbrain-sync-state.json` (LOCAL per ED1) with per-stage outcomes. --code-only / --no-code / --no-memory / --no-brain-sync for selective stage disable.
53+
54+
#### Added — Retrieval surface
55+
56+
- `bin/gstack-brain-context-load` — V1 retrieval surface. Dispatches per-skill manifest queries by kind (vector via `gbrain query`, list via `gbrain list_pages`, filesystem via local glob). 500ms hard timeout per MCP call. Datamark envelope per page. Layer 1 default fallback with 3 sections (recent transcripts + recent curated + skill-name-matched timeline) all carrying explicit `repo: {repo_slug}` filter (F7 cleanup). Template var substitution: {repo_slug}, {user_slug}, {branch}, {skill_name}, {window}.
57+
58+
#### Added — Skill manifests (6 V1 skills)
59+
60+
- `office-hours/SKILL.md.tmpl` — 4 queries (prior-sessions list + builder-profile fs + design-doc-history fs + prior-eureka fs)
61+
- `plan-ceo-review/SKILL.md.tmpl` — 3 queries (prior-ceo-plans fs + recent-design-docs fs + recent-reviews list)
62+
- `design-shotgun/SKILL.md.tmpl` — 3 queries (prior-approved-variants fs + DESIGN.md fs + recent-design-docs fs)
63+
- `design-consultation/SKILL.md.tmpl` — 3 queries (existing-DESIGN.md fs + prior-design-decisions fs + brand-guidelines list)
64+
- `investigate/SKILL.md.tmpl` — 3 queries (prior-investigations list + project-learnings fs + recent-eureka fs)
65+
- `retro/SKILL.md.tmpl` — 3 queries (prior-retros fs + recent-timeline fs + recent-learnings fs)
66+
67+
#### Added — setup-gbrain idempotent doctor + ref doc
68+
69+
- `setup-gbrain/SKILL.md.tmpl` Step 7.5 — Transcript & memory ingest gate. Probe → silent bulk if < 200 sessions / 100MB → AskUserQuestion with 5-option gate otherwise (this repo last 90d / all history / all repos / incremental / never).
70+
- `setup-gbrain/SKILL.md.tmpl` Step 10 — GREEN/YELLOW/RED verdict block. Re-running /setup-gbrain is now first-class doctor path with detect→repair→report rows for CLI / Engine / doctor / MCP / Repo policy / Code import / Memory sync / Transcripts / CLAUDE.md / Smoke.
71+
- `setup-gbrain/memory.md` — user-facing reference covering what gets ingested + what stays local + secret scanning + storage tiering + querying + deleting + how the agent uses it + recovery cases.
72+
73+
#### Added — Tests
74+
75+
- `test/gstack-memory-helpers.test.ts` — 22 unit tests covering all 5 public helpers
76+
- `test/gstack-memory-ingest.test.ts` — 15 tests covering CLI surface, --probe with all source types, state file lifecycle, schema mismatch + JSON corrupt backup-on-error, truncated JSONL handling
77+
- `test/gstack-gbrain-sync.test.ts` — 8 tests covering --help, unknown flag rejection, --dry-run preview, --no-code stage skip, state file lifecycle, stage results recorded
78+
- `test/gstack-brain-context-load.test.ts` — 10 tests covering CLI surface, default fallback, manifest dispatch, datamark envelope wrap, render_as template substitution, unresolved template var skip, --quiet suppression, graceful gbrain-CLI-absence
79+
- `test/skill-e2e-memory-pipeline.test.ts` — 10 E2E tests exercising the full Lane A → B → C value loop with 8 fixture file types
80+
81+
#### Changed
82+
83+
- `package.json` version 1.25.1.0 → 1.26.0.0
84+
- `VERSION` 1.25.1.0 → 1.26.0.0
85+
86+
#### For contributors
87+
88+
- The plan file at `/Users/garrytan/.claude/plans/ok-actually-lets-go-luminous-thacker.md` (~890 lines) is the canonical V1 design source, including office-hours findings, CEO review expansions (6 cherry-picks accepted, 1 reverted+replaced), Codex outside-voice 10 findings (F1-F10 each resolved or deferred), eng review additions (ED1 + ED2 + 6 auto-applied implementation specs), and V1.5 P0 TODOs section with full handoff context.
89+
- Manifest schema is versioned (`gbrain.schema: 1`); future format changes bump the schema and require explicit migration. gen-skill-docs validates the schema at build time (kind / required fields per kind / template var resolution / unique IDs).
90+
- Lane D (cross-repo `gbrain restore-from-sync` with atomic swap + 7-day .bak retention per D11) is documented as V1.5 P0 TODO — gstack repo cannot write to gbrain CLI repo.
91+
- The retrieval surface helper signature is V1.5-promotion-stable: when V1.5 ships server-side `mcp__gbrain__get_recent_salience` / `find_anomalies` MCP tools, the helper switches its internals from 4-call composition to a single MCP call without changing the manifest format or any skill template.
92+
- gitleaks vendoring is a V1.0.1 follow-up; for V1.0, the helper expects gitleaks on PATH and warns once if missing. `brew install gitleaks` on macOS gets you covered until the vendored binary ships.
93+
394
## [1.25.1.0] - 2026-05-01
495

596
## **Office-hours stops at Phase 4 architectural forks. AskUserQuestion evals — and `/codex` synthesis — now grade the "because" clause.**

VERSION

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1 +1 @@
1-
1.25.1.0
1+
1.26.0.0

0 commit comments

Comments
 (0)