Commit 54e3a2c

docs(research): non-goals reassessment + cohort positioning + ship sequence (2026-05) (#58)
* docs(research): non-goals reassessment + fallow clone deep-dive map (2026-05) Companion to research/fallow.md (capability tracker — what to adopt FROM fallow). This new doc inventories what THIS codebase already unlocks that the current Non-goals (v1) list forbids, post-C.11. User observation: many non-goals were defensive choices made when the project was 1/10th its current size, then carried forward unchallenged as the surface grew (15+ recipes, 12+ tables, 3 engines, watch mode, coverage, audit, impact). The reframe: stop asking "what should we not do?" and start asking "what does the SQL-index-with-three-transports actually unlock that no other tool does?" Findings: §1 — 10 first-class agent capabilities sitting in unwritten JOINs / formatters / verbs (components-touching-deprecated, unimported-exports, complexity per symbol, refactor-risk-ranking, boundary violations, unused type members, Mermaid output, MCP file/symbol resources, recipe usage telemetry, rename --dry-run preview). §2 — Five non-goals worth challenging: - "No FTS5 / use ripgrep" — SQLite ships FTS5; ripgrep loses JOIN composition (TODOs inside @deprecated functions in <50% covered files is one query, vs three tools today). - "No visualisation" — conflates rendering pixels with shaping render- ready data; Mermaid / D2 are JSON-shaped formatters (sibling of SARIF). - "No static analysis" — we already ship deprecated-symbols, untested- and-dead, barrel-files, fan-in/out; the line was rhetorical. Real boundary is "no opinionated rule engine, no fix mutation". - "No persistent daemon" — we have one (mcp --watch, serve --watch, watch); non-goal preserves a constraint that no longer exists. - "No LSP replacement" — show + impact + watch is 80% of LSP read-side; ship a thin shim consuming existing engines, don't write an LSP. §3 — Real architectural limits worth keeping (sub-100ms cold-start CLI, no LLM in box, no fix engine, no runtime tracing, no JS exec at index time). 
§4 — Map of /Users/sutusebastian/Developer/OSS/fallow clone deep-dive points: which crates / docs / configs to inspect before each shipped feature so we adopt patterns rather than reinvent. Cite-the-source-path discipline mirrors the existing research/fallow.md cite-the-PR habit. §5 — Recommended sequence: (a) FTS5 + Mermaid one-PR non-goal flip → (c) complexity column → (b) C.9 plugin layer (multi-tracer big surface) → (d) LSP shim. (a) is the cheapest non-goal flip; ships a confidence move before the bigger surfaces. §6 — 5 open questions (daemon-by-default for MCP/HTTP, FTS5 opt-in, LSP shim vs standalone, plugin contract scope, history table shape). Doc-governance compliance: - Goes in docs/research/ per Rule 3 (research-class doc). - Cross-references roadmap, why-codemap, fallow.md, competitive-scan per Rule 5. - Doesn't duplicate non-goals (Rule 1) — proposes amendments to be applied when § 2 items ship, in lockstep with why-codemap per the Single source of truth table. - No inventory counts in narrative (Rule 6) — uses qualitative "15+ recipes / 12+ tables" only. * docs(research): triangulate non-goals reassessment vs descriptive baseline User cross-checked my prescriptive doc (non-goals-reassessment-2026-05.md) against composer-2-fast's descriptive baseline (codemap-capability- surface-2026-05.md) plus the codebase as source of truth. Found three factual errors in mine; baseline doc held up clean. Corrections applied: 1. § 1.2 (Exports never imported): codebase has `exports.re_export_source` column — original doc missed it. Re-exports require a JOIN through that column to avoid false positives on barrel-only exports. Effort bumped XS → S. 2. § 1.3 (Cyclomatic complexity): claimed "AST walker already counts nodes during parse" — false. `rg 'complexity|node_count|nodeCount' src/` returns zero matches. Node-counting is NOT in place; needs an extension to the AST walker in src/parser.ts. Effort bumped S → M. 3. 
§ 2.3 ("no static analysis" non-goal): listed `fan-in` and `fan-out` as "static analysis we already ship" — too loose. Per `fan-in.sql` (`ORDER BY fan_in DESC LIMIT 15`) they're hotspot rankers, not orphan / dead-code detectors. They don't cover the closed-dead- subgraph case from research/fallow.md § 0 (8-file pack with non- zero fan-in via self-import). That gap motivates C.9 framework plugin layer, not the "no static analysis" flip. Caveat now spelled out in the doc. Header updated: this doc is the **prescriptive** lens; the **descriptive baseline** lives in codemap-capability-surface-2026-05.md (read first). Cross-references list and § 8 errata block document the diff between v1 and v2 so future reviewers can see what changed and why. Process lesson encoded in § 8: every prescriptive research note should triangulate against a descriptive baseline (own doc or peer model) before recommending a ship sequence. Caught all three errors before they propagated into a plan PR. * docs(research): scrub local user paths from non-goals doc + new lesson User caught absolute-path leaks in the research note pointing at the fallow clone on the maintainer's machine. Three references replaced with the public upstream URL (https://github.com/fallow-rs/fallow): - Header "Local clone for deep-dives" → "Source for deep-dives" - § 4 heading "What to inspect in the local fallow clone" → "...in the fallow source tree" - § 7 cross-references "Local fallow clone — /Users/..." → "fallow upstream" Also adds a new general-purpose lesson to .agents/lessons.md: Never commit absolute local user paths — no /Users/<name>/…, /home/<name>/…, ~/…, or file:/// URIs in any tracked doc, code, comment, or PR body. Pattern: cite https://github.com/<org>/<repo> for upstream sources; repo-relative paths for in-tree references. Sibling to the existing "PR bodies via temp file" lesson — same family (committed strings need to be portable + non-leaking), different surface. 
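The path lesson above lends itself to a mechanical check. A minimal sketch in TypeScript, assuming a hypothetical helper `findLocalPathLeaks` (not part of the codebase, and the patterns are illustrative rather than exhaustive) that scans text destined for a tracked doc or PR body:

```typescript
// Hypothetical helper, not part of codemap: flags absolute local user paths
// in text before it lands in a tracked doc, comment, or PR body.
interface PathLeak {
  line: number; // 1-based line number within the scanned text
  kind: string; // which pattern matched
  snippet: string; // the offending line, trimmed
}

// Illustrative patterns for the four families named in the lesson.
const LOCAL_PATH_PATTERNS: ReadonlyArray<[string, RegExp]> = [
  ["macOS home", /\/Users\/[^/\s]+\//],
  ["Linux home", /\/home\/[^/\s]+\//],
  ["tilde home", /(^|[\s"'(=])~\//],
  ["file URI", /file:\/\/\//],
];

function findLocalPathLeaks(text: string): PathLeak[] {
  const leaks: PathLeak[] = [];
  text.split("\n").forEach((content, index) => {
    for (const [kind, pattern] of LOCAL_PATH_PATTERNS) {
      if (pattern.test(content)) {
        leaks.push({ line: index + 1, kind, snippet: content.trim() });
      }
    }
  });
  return leaks;
}
```

A public upstream URL such as https://github.com/fallow-rs/fallow passes the scan; a `/Users/<name>/…` reference is reported with its line number so the author can swap in the upstream URL or a repo-relative path.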
* docs(lessons): add 'never commit local user paths' lesson (PR #58 catch)

* docs(research): delete codemap-capability-surface-2026-05.md (existence test)

Per docs/README.md existence test, this doc fails 3 of 4 criteria:

- ❌ Doesn't document durable policy unavailable elsewhere — every fact
  reproducible from db.ts / builtin.ts / audit-engine.ts / --recipes-json
- ❌ Doesn't track open work — pure snapshot
- ❌ No unique historical context git log + architecture.md can't reconstruct
- ✅ Cited by another doc (only because non-goals-reassessment cited it)

Plus Rule 1 violation (duplicates architecture.md § Schema) and Rule 6 violation (hardcodes "15 recipes" / "9 of 15 ship actions" inventory counts in narrative).

The real value the doc delivered was the **triangulation discipline** — catching 3 errors in non-goals-reassessment v1. That discipline is the durable artifact, not the doc. Codified in two places:

1. non-goals-reassessment § 8 errata + process lesson (kept)
2. .agents/lessons.md — new lesson explicitly bans the "dual descriptive + prescriptive doc" pattern as a Rule 1 violation. Right discipline: pin every concrete claim in the prescriptive doc itself, or self-audit against the canonical home before committing. Don't ship a parallel descriptive doc.

non-goals-reassessment header + § 7 + § 8 updated to drop the now-deleted companion-doc references and point at canonical sources directly (architecture.md § Schema, db.ts, builtin.ts, audit-engine.ts V1_DELTAS).

* docs(research): align § 5 (c) effort with § 1.3 / § 8 (M, not S)

CodeRabbit caught § 5 row (c) "Cyclomatic complexity column" listing effort S, while § 1.3 + § 8 errata both list M (the v1→v2 bump after `rg 'complexity|node_count|nodeCount' src/` returned zero — node-counting isn't already in place; the AST walker in src/parser.ts has to be extended). Effort propagation gap from the v2 errata pass.
§ 5 row (c) updated to M; "Why" cell now spells out the AST-walker dependency inline so future readers don't re-litigate the figure. * docs(research): split § 3 into moat (load-bearing) vs ergonomic limits Grill-me Q1 outcome (under "extract max from SQL-index + equal/surpass fallow" mission): the original § 3 list conflated ergonomic floors (sub-100ms cold-start, no LLM, no JS at index time) with the actual moats. Most of the original entries are floors fallow also follows; they're not differentiators. The two real moats that needed naming as load-bearing limits: A. SQL is the API — every capability is a recipe (saved query) or a primitive recipes can compose. Verdicts are an OUTPUT mode (--format sarif, audit deltas), never a primitive. Reviewer test: "is this verb also expressible as query --recipe <id>?" B. Extracted structure ≥ verdicts — schema breadth (CSS, markers, type_members, calls.caller_scope, components.hooks_used) is what equals/surpasses fallow on agent-facing capability per fallow.md § 5. Reviewer test for any "drop column X" PR: "what recipe (bundled or hypothetical) does this kill?" Both are now load-bearing rows above the ergonomic ones. The original five preferences are kept verbatim but annotated with their relation to the moat (floor / convergent / adjacent / rivalrous / safety). Eroding either A or B is the most likely path from "codemap" to "fallow with extra steps" — § 3 now equips a reviewer to spot it. * docs(research): § 5 ship sequence — parallel plan-PR for (b) at T+0 Grill-me Q2 outcome (under "equal/surpass fallow" mission): the "cheapest non-goal flip first" ordering was a small-team confidence move, but the § 3 moat rewrite already paid that confidence cost. The real risk under the actual mission is the deferral trap — XL items become "next quarter" while every new recipe inherits the noisy substrate (untested-and-dead's Next.js page.tsx false-positive class). Hybrid resolved: - Shipping cadence stays (a) → (c) → (b) impl → (d). 
- (b) plan PR opens at T+0, iterates in parallel during (a)+(c). - Plan opens with ~30% of decisions pre-locked: entry-point hints only per Grill Q4, static config only per § 3 "no JS exec at index time" ergonomic limit. Not a blank-slate plan — structured from day 1. Added a 5-row T-table in § 5 spelling out the parallel tracks. (b)'s "Why" cell now names the deferral trap explicitly; (d)'s "Why" pins its dep on (b) impl (not just (b)). Rationale list updated to flag that the moat rewrite paid the confidence move so (a) doesn't pay it again. Cost-if-abandoned escape hatch: plan PR can close as "Status: Rejected (YYYY-MM-DD)" per docs/README.md Rule 8. Design surface captured either way. * docs(research): § 2 reframed via § 3 moats (taxonomy + verdict cross-refs) Grill-me Q3 outcome: § 2's five flips inherited their shape from "original non-goals worth challenging" — but after § 3 locked in the moats, that shape conflated three different categories: - Moat-extending flips (2.1 FTS5, 2.3 static analysis) — substrate growth inside moat B - Moat-aligned flip (2.2 output formatters) — verdicts as output mode per moat A - Moat-orthogonal transport flips (2.4 daemon, 2.5 LSP shim) — neither moat is touched; flipping just re-exposes existing substrate Anchors preserved (2.1-2.5 stay) — anchor-preservation discipline per docs-governance § 3 / docs/README.md Rule 7. No cascading link updates needed in § 3 / § 4 / § 5 / § 8. Changes per section: - § 2 header — added a reading note naming the three categories and pointing each flip at the moat row it relates to. - § 2.3 — verdict no longer restates "no opinionated rule engine + no fix engine" (now canonical in § 3 moat A + ergonomic row); instead cross-references and names the static-analysis category as in-scope. Closed-dead-subgraph caveat preserved (it's the C.9 motivator). - § 2.4 — added "Moat relation: orthogonal" subsection naming the transport / process-model framing. 
AST-caching capability claim preserved + cross-linked to § 6 Q1. Verdict points the daemon-default question at § 6 Q1 explicitly (single canonical home). - § 2.5 — replaced the unmeasured "80% of LSP read-side" claim with a structural argument: shim wraps shipped engines (show / impact / watch) via stdio without re-extracting structure; an LSP *engine* would duplicate moat B substrate (the actual reason not to build one). Cited application/show-engine.ts + application/impact-engine.ts as the substrate the shim wraps. - § 6 Q1 — enriched with the AST-caching downstream measurement note lifted from § 2.4 (single canonical home for the daemon-default decision; § 2.4 cross-refs here). Vital-info preservation audit: - ✅ Closed-dead-subgraph caveat (8-file widget pack via fallow.md § 0) — kept verbatim in § 2.3 caveat block. - ✅ AST-caching capability claim — kept in § 2.4 "Capability unlocked" + cross-linked from § 6 Q1. - ✅ Watch-mode receipts (codemap watch / mcp --watch / serve --watch) — kept verbatim in § 2.4 "What's actually true". - ✅ Fan-in/fan-out hotspot-rankers framing — kept verbatim in § 2.3 caveat (with errata cross-ref to § 8). - ✅ Fallow `crates/lsp/` cross-ref — kept in § 2.5. Dropped (intentional): - "80% of LSP read-side" — unmeasured; replaced with structural argument that doesn't need a measurement. * docs(research): § 1.7 Mermaid — bounded-input contract (moat A) Grill-me Q4 outcome: § 1.7's "What's needed" cell was loose ("new --format mermaid formatter") — true but underspecified. Real-project edge counts on dependencies / calls are 1k-10k+; rendering them is either Mermaid-choking or a hairball, and silently auto-truncating (or "best-effort") would be a verdict-shaped affordance masquerading as an output mode — violates moat A. Locked in: - Allow on: impact engine output (depth-bounded), LIMIT N-shipped recipes (fan-in / fan-out), ad-hoc SQL with explicit LIMIT ≤ 50. - Reject (with scope-suggestion message) on unbounded inputs. 
- No auto-truncation — that's a verdict (recipe author's job to scope). Threshold (50 edges) is configurable; chosen as a default-readable upper bound for chat-client rendering. Calibrate during (a) impl PR against fixtures/golden / external corpus. DX framing: hairballed Mermaid in MCP / Cursor / Slack chat clients renders as garbage; a clear error naming knobs (LIMIT / --via / WHERE from_path LIKE) is the better consumer signal. This keeps Mermaid an output mode (moat A clean) and forces recipe authors to scope graphs — correct because they own the structural meaning of the result set. * docs(research): § 1.10 rename — recipe-shape (moat A) + parametrised recipes Grill-me Q5 outcome: § 1.10's verb-shape ("codemap rename <old> <new> --dry-run") was downstream of the OLD § 3 ("no fix engine" as a top- level non-goal). After the moat reframe, the actual test is moat A: verdict-shape vs recipe-shape. Verb hides every implicit rename choice (visibility filter, type-only re-exports, test files, aliases) inside argv parsing — not auditable. Recipe-shape puts those choices in reviewable SQL. Locked in: - Bundled recipe rename-preview.sql with --params key=value substitution (?-placeholder binding via db.ts prepared statements). - --format diff output mode (sibling of --format mermaid per item 1.7; same "rows in, renderable text out" pattern). - No new verb / engine / MCP tool / HTTP route. SQL stays the API. - Effort drops M → S. Cross-cutting infrastructure unlocked: parametrised recipes is net-new plumbing but pays for itself on the first downstream use. Already- visible follow-ons captured in the new "Cross-cutting infrastructure unlocked by item 1.10" paragraph at the end of § 1: - delete-symbol-preview, extract-function-preview, inline-symbol- preview — same recipe-shape pattern; all gated on the same plumbing. 
- Parametrising existing static recipes (untested-and-dead --params min_coverage=80 instead of hardcoded < 80) — cleanup opportunity the same plumbing enables. This is the second moat-A demonstration in two adjacent grill rounds (after § 1.7's bounded-input contract on Mermaid). Both prove the "verdicts are output mode, recipes are the API" framing on real capabilities — exactly what the (a) plan-PR will need to point at when reviewers ask "what changed?". * docs(research): § 6 — close Q1 (daemon-default), Q3 (LSP shape), Q4 (plugin scope) Grill-me Q6 outcome (and accounting cleanup): three of five § 6 open questions are now resolved by prior grill outcomes — § 6 needs to reflect that, not pretend they're still open. Resolutions captured: - Q1 (daemon-default for mcp/serve) — RESOLVED THIS GRILL TURN. Default --watch ON for both modes; opt-out via --no-watch / CODEMAP_WATCH=0. One-shot CLI defaults preserved (no watcher on query/show/snippet). Receipts: stale-index = #1 agent UX complaint (fallow.md § 6); chokidar lazy startup validated tiny by PR #46 6-watcher audit. Flip is a small follow-up PR (flag default + test + patch changeset + agent rule update per docs/README.md Rule 10). AST-caching measurement parked downstream of the flip. - Q3 (LSP shim vs standalone) — RESOLVED in § 2.5 reframe earlier this grill (commit 0b9d878). Thin shim wrapping shipped engines; no engine (would duplicate moat B substrate). Standalone deferred to "if VSCode-extension demand emerges." - Q4 (C.9 plugin contract scope) — RESOLVED via § 5 (b) plan-PR pre-locked decisions (commit 6f845ba). Entry-point hints only for v1; arbitrary edge injection deferred to v2. Static config only per § 3 ergonomic "no JS exec at index time" floor. § 6 restructured: "Resolved (2026-05)" subsection at top with full rationale + receipts; "Still open" subsection below with Q2 (FTS5 default) and Q5 (history table) — the only two genuinely-open questions left. 
§ 2.4 verdict updated to point at the resolved § 6 Q1 anchor instead of the open-question wording. Anchor preservation: external links (#6-open-questions) still resolve to the section heading. New internal anchor (#resolved-2026-05) used by § 2.4 verdict — single inbound link, no external citations to break. * docs(research): § 6 Q2 closed — FTS5 default-OFF, both config + CLI Grill-me Q7 outcome: § 6 Q2 (FTS5 opt-in vs default-on) resolved. Locked in: - Toggle: BOTH codemap.config.ts `fts5: true` AND --with-fts CLI flag at index time. Config-only forces CI / ephemeral workflows to commit fts5: true to a config file; CLI-only forces long-term users to remember the flag on every --full. Cheap to support both. - Default: OFF. Backwards-compat — existing users wouldn't see .codemap/index.db grow ~30-50% silently on next --full. - Re-evaluate default in v2 once external-corpus size measurements land (bun run benchmark:query shape). Bug fix in § 2.1: the "off by default to keep cold-start sub-100ms" framing was a WRONG REASON. FTS5 is index-time cost only; cold-start reads existing DB and the virtual table doesn't slow startup. Real reason for default-OFF is index size growth. § 2.1 verdict updated to reflect this; § 6 Q2 resolution explicitly calls out the wrong-reason correction so future readers see the diff. Principle pinned: default-ON is reserved for capabilities without disk-size tax (Mermaid output, parametrised recipes, complexity column). FTS5 is the disk-tax exception. Tree state after this commit: - § 6 Q1 (daemon-default) — resolved - § 6 Q2 (FTS5 default) — resolved - § 6 Q3 (LSP shape) — resolved - § 6 Q4 (plugin scope) — resolved - § 6 Q5 (history table) — STILL OPEN (defer-bias confirmed by doc) * docs(research): § 6 Q5 closed — history table deferred + full grill findings Grill-me Q8 outcome: § 6 Q5 (history table) resolved as DEFERRED, with the full grill analysis preserved inline so the next reviewer doesn't have to re-derive why we said no. 
Findings captured: - WHAT it would do — point-in-time index gains a temporal dimension ("when did symbol X get @deprecated?", "coverage trend over 50 commits", "files that became dead this week"). - WHAT audit --base <ref> already covers — pairwise diff serves the most-common temporal question (PR-scoped delta) with no schema growth. Longitudinal "evolved over commits 1..N" is the unfilled gap. - TWO SHAPES table — per-commit snapshots (~25 GB on 500-commit retention; trivial query cost) vs append-only event log (~5-25 MB deltas; heavy recursive-CTE query cost). - BACKFILL COST — N reindexes (~30s each = ~4 hrs first-run for 500 commits) is the same for both shapes; deal-breaker today. - ARCHITECTURE IMPACT — schema bump (minor per pre-v1 lesson), db.ts + indexer hooks, retention policy config, deeper git integration. - WHY DEFER — anti-bloat meta-rule (no recipe demands it); audit --base covers common case; backfill prohibitive without paying use case; shape-decision wasted without empirical access patterns. - REVISIT TRIGGERS — TWO consumers shipping jq-based "audit runs over time" workflows (mirrors B.5 verdict-threshold deferral pattern), OR query_baselines evolution becoming a recurring agent need. The full analysis is now inline in § 6 Q5 (~30 lines + cost table). Per user request: don't lose vital information; document grilling findings for fuller context. Future reviewers see the full reasoning, not just "deferred" — same posture as § 8 errata's "future readers can see the diff between v1 and v2." § 6 status after this commit: ALL FIVE OPEN QUESTIONS RESOLVED. Q1 (daemon-default), Q2 (FTS5 default), Q3 (LSP shape), Q4 (plugin scope), Q5 (history table) — every decision the doc was authored to force is now pinned with rationale and revisit triggers (where applicable). * docs(research): § 1.9 reframe + § 3 "No telemetry upload" floor Grill-me Q9 outcome: § 1.9's "Recipe usage telemetry" framing was a gotcha. 
The word "telemetry" carries upload / aggregation / surveillance connotations that don't match the actual capability (purely local recency tracking) — and would either get the feature rejected sight-unseen by privacy-conscious users / corp installations OR silently set up substrate for a future "phone home" PR without an explicit non-goal saying we won't. Renamed + tightened § 1.9: - "Recipe usage telemetry" → "Local recipe-recency tracking". - Table renamed recipe_usage → recipe_recency (named after the value, not the act). - Added 90-day retention bound (caps unbounded growth via per-reindex pruning). - Added opt-out config (`recipe_recency: false` skips the reconciler). - --recipes-json surface spec'd: {recipe_id, last_run_at, run_count_90d}. - Naming-note paragraph explains why "telemetry" was rejected. New § 3 ergonomic floor row "No telemetry upload": - Locks in the privacy posture explicitly. No HTTP-out primitive in codebase today (grep-able), but the floor exists to resist accumulation pressure — a future "anonymous opt-in usage stats to help prioritize recipes" PR would look reasonable without an explicit floor. - Convergent with fallow (probably also doesn't upload) — floor, not moat. - Cross-references item 1.9 as the only usage-data feature; consumers can audit the .codemap/index.db location + retention bound. Lockstep update needed when item 1.9 ships: docs/why-codemap.md "What Codemap is not" gains "Codemap never uploads usage data" per docs/README.md Rule 10. Already cross-referenced in § 7 of this doc. * docs(research): drop all fallow framing — codemap is structurally unique User reframe: codemap is the only SQL-based code index in the market; inspiration comes from the free and open internet (LSP spec, SQLite docs, AST tooling), not code-by-code cloning of any peer tool. Drop fallow as a yardstick throughout. 
Vital information preserved (per "don't lose any vital information that is used to execute the plan"): - Closed-dead-subgraph motivator for C.9 — kept as an abstract pattern description in § 2.3 caveat (N-file packs with self-imports, non- zero fan-in, none reachable from real entry). Was previously cited to fallow.md § 0; now stands on its own merit. - LSP read-side capabilities (show / impact / watch) — kept; LSP spec upstream is now the protocol authority instead of fallow's crates/lsp/. - Runtime-tracing scope distinction — § 3 floor reframed to anchor on "different product class entirely" (live process data vs static analysis) instead of "fallow's paid moat." - Predicate-as-API moat (A) — kept; justification now anchors on intrinsic merit (SQL is durable, agents compose any predicate) rather than "fallow ships verdicts; we don't." - Schema-breadth moat (B) — kept; justification now "codemap-specific extractions; their richness directly determines what JOINs are expressible" rather than "fallow has none of these." Section-by-section changes: - HEADER — "Companion docs / Source for deep-dives" replaced with "Companion doc" (competitive-scan only) + "Positioning" paragraph declaring structural uniqueness. - § 2.3 original-framing quote — paraphrased to drop the "(e.g. fallow, knip, jscpd)" parenthetical; pointers to roadmap.md for the full original wording. (roadmap.md itself still has the parenthetical; separate-PR scope.) - § 2.3 caveat — closed-dead-subgraph case described abstractly; no source citation needed. - § 2.5 LSP shim — "fallow has crates/lsp/" → "LSP spec upstream is the protocol authority." - § 3 intro — mission framing rewritten; "equal/surpass fallow" language replaced with "extract maximum value from the SQL-index architecture; grow the ecosystem" + "only SQL-based code index in the market" positioning. - § 3 Moat A — anchored on intrinsic merit (SQL durable + agent composability) instead of fallow comparison. 
- § 3 Moat B — anchored on "substrate every recipe layers on; richness determines JOIN expressivity" instead of "fallow has none of these." - § 3 ergonomic floors — dropped all "fallow is also fast" / "Convergent with fallow" annotations; reframed runtime-tracing as "different product class entirely (live process data, not static analysis)" + reframed telemetry-upload as standalone safety promise. - § 4 — DELETED ENTIRELY ("What to inspect in the fallow source tree"). Replaced with "Inspiration sources for plan-PR authoring" table listing open specs / primitive sources only (LSP spec, SQLite docs, oxc node reference, Lightning CSS, JSON-RPC + MCP spec, TC39 proposals, existing codemap surface, internal third-party graph audits). Discipline statement preserved: every plan PR cites the spec / primitive source it took inspiration from. - § 5 (d) row + T-table T+5w → +7w cell — dropped fallow crates/lsp/ refs; LSP spec is now the named authority. - § 6 Q1 — dropped fallow.md § 6 citation; stale-index frequency now anchored on PR #46 + PR #56 internal evidence. - § 6 Q4 — dropped fallow.md § 0 + § 6 citations; closed-dead-subgraph case cross-refs § 2.3 caveat instead. - § 7 cross-references — removed research/fallow.md and fallow upstream entries. Added § 4 inspection list as a self-reference. - § 8 errata § 2.3 row — dropped fallow.md citation; pattern described inline. Net effect: the doc stands on codemap's intrinsic structural properties. No peer-tool framing remains. The mission is now self-coherent: extract max value from the SQL-index architecture + grow the ecosystem, anchored on the unique-in-market positioning. * docs(research): retract uniqueness claim — honest cohort positioning Fact-check finding: the "structurally unique — only SQL-based code index in the market" claim doesn't hold. 
Web search + verification surfaced a real cohort of SQLite-backed code indexers for AI agents: - srclight (29 stars) — SQLite FTS5 + tree-sitter + embeddings + MCP, 42 tools, 11 langs. Pitch identical to codemap's ("AI agents spend 40-60% tokens on orientation; we eliminate this"). - Sverklo (30 stars) — local-first MCP, symbol graph, blast-radius, open-source alternative to Greptile/Sourcegraph. - ctxpp / ctx++ (17 stars) — Go MCP, tree-sitter, SQLite + FTS + vector, blast-radius analysis (= codemap's impact). - KotaDB (99 stars) — TS + Bun + SQLite — IDENTICAL stack to codemap. - codemogger (2026) — MCP, tree-sitter, SQLite + FTS + vector, semantic search. - @squirrelsoft/code-index, QuickAST, code-scale-mcp, CodeAgent Indexing Engine, Polyglot Indexer MCP, Continue's CodeSnippetsIndex — all SQLite-backed code indexers with overlapping surface. Codemap is one of ~10+, NOT unique. Retracting the claim. Honest differentiation (after verification): 1. Predicate-as-API — peers ship pre-baked verbs / MCP tools; codemap exposes raw SQL + recipes. Genuinely rare in the cohort. 2. Pure structural — no embeddings, no LLM in box. Most peers add vector search by default. Genuine differentiation. 3. JS/TS/CSS-ecosystem-deep extraction — CSS variables/classes/ keyframes, React components.hooks_used, type_members, markers. Peers focus on cross-language symbol+call surface via tree-sitter. The depth axis (3) is structurally enabled by parser choice — oxc (JS/TS) and lightningcss (CSS) are Rust-based and ecosystem- specialized; peers using tree-sitter trade depth for breadth. Where codemap is BEHIND the cohort (not hidden): multi-language support (codemap = TS/JS/CSS only; peers = 10-15 langs), star count, embeddings/semantic search, market traction. 
Edits applied: - HEADER positioning paragraph — retracted "structurally unique"; named the cohort explicitly (srclight, Sverklo, ctxpp, KotaDB, codemogger, etc.); spelled out the three differentiation axes; added the parser-choice rationale (oxc + lightningcss as the structural enabler of axis 3). - § 3 moat-intro line — replaced "the only SQL-based code index in the market" with "specific niche in the SQLite-backed-code-index cohort" + the three axes. Reviewer test reframed: eroding either moat turns codemap into "yet-another-tool-in-the-cohort instead of the predicate-shaped specialist." Moats A and B themselves required no rewrite — their justifications (predicate-as-API durability + extracted-structure substrate) hold under the corrected positioning. The peer cohort discovery actually sharpens both moats: A is the specialty (raw SQL surface) and B is the depth axis (richer extraction than tree-sitter cohort). * docs(research): § 1.4 refactor-risk formula — orphan + NULL fixes + caveat Grill-me Q12 outcome: § 1.4's "fan_in × (100 - coverage_pct)" formula had two correctness bugs and one accepted modeling limitation: CORRECTNESS FIXES (must ship): - Orphans (fan_in=0) scored 0 → "no risk" → wrong (orphans are high-risk: dead code or hidden-import targets we don't track). Fix: `fan_in + 1` so orphans score on coverage alone. - NULL coverage_pct propagated through the formula → 100 - NULL = NULL → row dropped from ORDER BY → unmeasured-coverage symbols silently vanished from the ranking. Fix: COALESCE(coverage_pct, 0) treats unmeasured as 0% (high risk). ACCEPTED v1 TRADE-OFF: - Linear-in-fan_in (fan_in 100 with 99% coverage = fan_in 1 with 0% coverage in the score). Real, but not worth fixing in the bundled recipe — users tune via project-local override. 
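The two correctness fixes above can be illustrated outside SQL. The sketch below mirrors the corrected formula in TypeScript, assuming hypothetical names (`SymbolRow`, `refactorRisk`, `rankByRisk`) and sample rows; the bundled recipe itself is SQL and is not reproduced here.

```typescript
// Hypothetical row shape: fan_in and coverage_pct as described in § 1.4.
interface SymbolRow {
  name: string;
  fanIn: number;
  coveragePct: number | null; // null = coverage never measured
}

// Corrected score: (fan_in + 1) * (100 - COALESCE(coverage_pct, 0)).
function refactorRisk(row: SymbolRow): number {
  const coverage = row.coveragePct ?? 0; // unmeasured coverage counts as 0%
  return (row.fanIn + 1) * (100 - coverage); // +1 so orphans score on coverage alone
}

// Highest risk first, without mutating the input.
function rankByRisk(rows: SymbolRow[]): SymbolRow[] {
  return [...rows].sort((a, b) => refactorRisk(b) - refactorRisk(a));
}

// Made-up sample rows for illustration only.
const sample: SymbolRow[] = [
  { name: "orphanHelper", fanIn: 0, coveragePct: 20 },
  { name: "unmeasuredUtil", fanIn: 3, coveragePct: null },
  { name: "coveredHub", fanIn: 10, coveragePct: 90 },
];

for (const row of rankByRisk(sample)) {
  console.log(row.name, refactorRisk(row));
}
```

Under the original `fan_in × (100 - coverage_pct)` form, `orphanHelper` would score 0 and `unmeasuredUtil` would drop out of the ordering; with both fixes every row stays in the ranking.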
Caveat block in refactor-risk-ranking.md (will accompany the recipe when (a) ships) names tuning axes for project-local overrides:

- Log-scale fan_in (LOG(fan_in + 1) * 30) for hub-heavy codebases
- Visibility weight (if @public / @internal / @beta JSDoc tags are used consistently)
- LOC weight (if test-density varies across files)

Why ship-with-caveat instead of multi-axis composite (Option B):

- Moat A says recipes are saved queries (starting points), not authoritative verdicts. Bundled formula gets 80% right; users iterate.
- Anti-bloat meta-rule — every additional axis encodes more opinions; shipping minimal forces explicit thought during tuning.
- Ecosystem-specific axes (visibility weight, LOC weight) shouldn't be in the bundled default.

Effort stays XS. The .md caveat block lands in the (a) plan PR / impl PR alongside the .sql; not part of THIS research-note PR's scope.

* docs(research): § 1.5 boundary violations — Shape A directional rules

Grill-me Q13 outcome: § 1.5 was underspecified ("--boundaries <config> flag on audit OR recipe consuming the config"). Three real questions needed answering: where the config lives, what shape, recipe-or-flag.

Shape A (directional rules) locked in for v1:

    boundaries: [
      {
        name: "no-cross-feature",
        from_glob: "src/features/*/**",
        to_glob: "src/features/*/**",
        action: "deny",
        except_self: true,
      },
      ...
    ]

Why A over B (element-types) over C (layers) — honest discriminator: A and B have IDENTICAL expressiveness (B compiles to A at index time). The real question is ergonomics-at-scale vs forward-compat / smallest-viable-config:

- A wins 5 of 6 dimensions: smallest-viable-config (one entry); Zod schema simplest; mental-model load (one concept); forward-compat (B layers on top later as sugar); backwards-compat (never paint into a corner; primitives are durable).
- B wins only "ergonomics at scale" (5+ rules with element reuse) — exactly the dimension that can be added later as a sugar layer without breaking A.
- C (layer ordering) is most opinionated; only fits layered architectures. Not a v1 default.

Decision rule (ship the smallest primitive that doesn't paint into a corner; layer ergonomics on top later) mirrors § 6 Q5 history-table defer logic.

Implementation reuses every shipped or in-flight piece of plumbing:

- Zod config slot (existing src/config.ts substrate)
- Index-time reconciler (mirrors recipe_recency from item 1.9)
- New boundary_rules table (moat-B-aligned schema growth)
- Bundled recipe boundary-violations.sql via SQLite GLOB operator
- SARIF output formatter (already shipped) for CI gate

NO new CLI flag (moat-A clean). The verb is query --recipe boundary-violations --format sarif. Recipe consumes config-as-data; SARIF output mode handles verdict-shaped CI consumers.

Effort stays S. Element-types / layer sugar deferred to v1.x with explicit "demand-driven" trigger (mirrors fallow.md B.5 verdict-threshold deferral pattern, kept in this doc as the recurring deferral idiom).

* docs(research): § 1.1, 1.6, 1.8 sanity sharpening (gotchas + envelopes)

Grill-me Q14 outcome: three remaining § 1 rows had implicit gotchas the recipe author would otherwise have to discover during impl. Each row gets a small clarification; substrate unchanged, effort unchanged.

§ 1.1 components-touching-deprecated:

- Was: "One bundled recipe (components-touching-deprecated)"
- Now: explicit two-path UNION
  - HOOK PATH: components.hooks_used JSON overlap with @deprecated symbols (catches deprecated hooks like useDeprecatedThing)
  - CALL PATH: calls.caller_name IN (SELECT name FROM components) × @deprecated symbols by callee_name (catches regular deprecated functions called inside components)
- Hook-only variants would ship false-negatives; recipe author needs the explicit UNION to avoid the trap.

§ 1.6 unused-type-members:

- Was: "Recipe (unused-type-members) — needs JSON-extraction predicate"
- Now: ADVISORY recipe with explicit caveat block in .md.
  Output is "review these" candidates, NEVER "safe to delete", because TS has multiple indirect-usage classes codemap's substrate doesn't track:
  - Indexed access: T['fieldName']
  - keyof T
  - Type spreads: type X = T & {...}
  - Mapped types: {[K in keyof T]: ...}
  These produce false-positives. Recipe is useful as a candidate surfacer; agents must verify before deletion.

§ 1.8 more MCP resources:

- Was: hand-wave "add codemap://files/{path} and codemap://symbols/{name}"
- Now: spell out disambiguation envelope (reuses {matches, disambiguation?} pattern from PR #39 show/snippet): symbols with duplicate names across files (Component, index, default, util-name collisions) return all matches with by_kind / files / hint metadata. Plus ?in=<path-prefix> query parameter mirroring show --in <path>.
- Without spelling this out, the implementation would have to invent disambiguation OR ship a "first match wins" gotcha.

Net: each row's What's-needed cell now contains enough detail that the recipe / resource author can implement without re-deriving the JOIN structure or envelope shape. Tactical clarity layered on top of the structural decisions made in earlier grills.
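The § 1.8 disambiguation envelope can be sketched as a small pure function. Everything beyond the envelope keys is an assumption made for illustration: the `SymbolRow` fields, the value types of `by_kind` / `files` / `hint`, the hint wording, and the name `resolveSymbol` are hypothetical; the note fixes only the `{matches, disambiguation?}` shape, the all-matches behaviour, and the `?in=<path-prefix>` filter.

```typescript
// Hypothetical row shape; the real symbols table may carry different columns.
interface SymbolRow {
  name: string;
  kind: string;
  file_path: string;
  line_start: number;
}

// Envelope keys follow the note; the value types are assumed.
interface Envelope {
  matches: SymbolRow[];
  disambiguation?: {
    by_kind: Record<string, number>;
    files: string[];
    hint: string;
  };
}

// Resolve a symbol name, optionally narrowed by a path prefix (the ?in= parameter).
function resolveSymbol(rows: SymbolRow[], name: string, inPrefix?: string): Envelope {
  const matches = rows.filter(
    (r) => r.name === name && (inPrefix === undefined || r.file_path.indexOf(inPrefix) === 0),
  );
  // Zero or one match is unambiguous: return the bare envelope.
  if (matches.length <= 1) return { matches };

  const by_kind: Record<string, number> = {};
  const files: string[] = [];
  for (const m of matches) {
    by_kind[m.kind] = (by_kind[m.kind] || 0) + 1;
    if (files.indexOf(m.file_path) === -1) files.push(m.file_path);
  }
  return {
    matches, // every match is returned, never "first match wins"
    disambiguation: {
      by_kind,
      files,
      hint: matches.length + " matches; narrow with ?in=<path-prefix>",
    },
  };
}

const rows: SymbolRow[] = [
  { name: "Component", kind: "function", file_path: "src/a/Component.tsx", line_start: 3 },
  { name: "Component", kind: "class", file_path: "src/b/Component.tsx", line_start: 10 },
];
console.log(resolveSymbol(rows, "Component"));
console.log(resolveSymbol(rows, "Component", "src/a/"));
```

Making `disambiguation` optional keeps the common single-match response small, while duplicate names still surface every candidate.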
1 parent b5679a6 commit 54e3a2c

2 files changed

Lines changed: 266 additions & 0 deletions


.agents/lessons.md

Lines changed: 2 additions & 0 deletions
@@ -13,3 +13,5 @@ Each entry is a single bullet: `- **<topic>** — <lesson>`. Newest entries at t
- **backticks inside SQL or help-text template literals** — never put a literal backtick inside a `` `...` `` template-literal string. `db.ts` SQL DDL strings (multi-line CREATE TABLE templates) and `printQueryCmdHelp()` (multi-line help text) are both `` `...` `` template literals; an inner backtick — typically a Markdown-style code-fence around a flag like `` `--full` `` — terminates the literal early and the parser blows up several lines later with cryptic "expected `,` or `)`" errors. **Use plain prose in those strings** (`--full` not `` `--full` ``), or escape (`` \` ``) if you really need the character. Hit twice (B.7 + B.6 PR #30); the lesson is general — applies to any TS template literal that gets pasted prose later, not just SQL / help text.
- **STOP-before-Grep applies to symbol lookups too** — `Grep` for symbol names like `printQueryResult`, `getCurrentCommit`, `dropAll` violates the [`codemap` rule](rules/codemap.md). The codemap query `SELECT file_path, line_start FROM symbols WHERE name = '<X>'` answers it faster and without scanning. Reach for `Grep` only when the question is content-shaped (regex over file bodies, finding pattern usages inside function bodies, etc.) — not when it's "where is X defined / who calls X / what does file Y export." This was a PR #30 self-correction.
- **PR / issue / comment bodies always go through a temp file** — never pass markdown bodies via shell heredoc to `gh pr create --body "$(cat <<'EOF'…)"` / `gh pr edit --body …` / `gh pr comment --body …` / `gh issue create --body …` / `gh api` `--field body=…`. Backticks inside the heredoc (every code span and code fence) get shell-escaped to `` \` `` and render literally on GitHub — every recipe id, file path, flag, SQL fragment, and code fence in the rendered body comes out as `` \`coverage\` `` instead of `coverage`. Pattern: write the body to a temp file (`Write` to `/tmp/pr-<n>-body.md`), pass `--body-file /tmp/pr-<n>-body.md`, then delete the temp file. Cost is one extra tool call; saves redoing every PR body that has more than a few backticks. Hit on PR #57 — final body was a wall of `` \` `` artifacts until rewritten via temp file.
- **Never commit absolute local user paths** — no `/Users/<name>/…`, `/home/<name>/…`, `~/…`, or `file:///` URIs in any tracked doc, code, comment, or PR body. Reasons: (1) leaks the maintainer's directory structure / username to public mirrors; (2) every other contributor's paths differ — the reference is dead on their machine; (3) a `git clone` of someone else's machine isn't a fact we can cite as a "source for deep-dives" — public upstream URLs are. Pattern: cite `https://github.com/<org>/<repo>` (with optional `/tree/<sha>/<path>`) for upstream sources; use repo-relative paths (`docs/foo.md`, `src/bar.ts`) for in-tree references. Hit on PR #58 first draft — referenced the local fallow clone path in the research note before the user caught it.
- **Prescriptive research notes pin every concrete claim before recommending a ship sequence** — when a research/plan-shape doc proposes work (effort estimates, capability inventories, "we already do X" framing), every concrete claim needs a `file:line` / `codemap query` / `rg` / `--recipes-json` reference a reviewer can re-run. Reasoning-from-substrate intuition without pinning ships errors: "the AST walker already counts nodes" / "fan-in detects orphans" / "the `re_export_source` column doesn't exist" — all real errors caught on PR #58 by triangulating against the codebase. Don't ship a peer / parallel "descriptive baseline" doc to triangulate against (Rule 1 violation — it duplicates `architecture.md` / `db.ts` / `--recipes-json`); instead, either (a) pin claims in the prescriptive doc itself, or (b) self-audit by re-running every claim against the canonical home before committing. Either path beats the "dual descriptive + prescriptive doc" pattern on docs-governance grounds.
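
The absolute-path lesson lends itself to a mechanical check. The sketch below is one possible pre-commit helper, not something this repository is known to ship: `LOCAL_PATH_PATTERNS` and `findLocalPaths` are hypothetical names, and the regexes are an assumed approximation of the four path shapes the lesson lists (`/Users/<name>/…`, `/home/<name>/…`, `~/…`, `file:///`).

```typescript
// Hypothetical patterns approximating the path shapes named in the lesson.
const LOCAL_PATH_PATTERNS: RegExp[] = [
  /\/Users\/[^\/\s]+\//, // macOS home directories
  /\/home\/[^\/\s]+\//, // Linux home directories
  /(^|[\s"'(])~\//, // tilde-relative paths at line start or after a delimiter
  /file:\/\/\//, // file:/// URIs
];

// Return "<line number>: <line>" for every line that contains a local path.
function findLocalPaths(text: string): string[] {
  const hits: string[] = [];
  const lines = text.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    if (LOCAL_PATH_PATTERNS.some((re) => re.test(line))) {
      hits.push(i + 1 + ": " + line);
    }
  }
  return hits;
}

const sample = [
  "see /Users/alice/dev/clone for details",
  "see docs/foo.md and https://github.com/<org>/<repo>",
  "open file:///tmp/x",
].join("\n");
console.log(findLocalPaths(sample));
```

Repo-relative paths and public upstream URLs pass through untouched, which matches the citation pattern the lesson recommends. A URL that happens to contain a `/home/<name>/` segment would be flagged, so treat hits as candidates for review.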
