feat: scanPerformance + analyzeStructuralPerf MCP tools, Cartographer FFI expansion, perf optimizations #205
…ndings)
- internal/cartographer/bridge.go: full CGo bindings (11 FFI functions); RankedSkeleton() and UnreferencedSymbols() added for v1.6.0 support
- internal/cartographer/bridge_stub.go: stubs for all 13 functions incl. RankedSkeleton and UnreferencedSymbols; builds cleanly without Rust toolchain
- internal/cartographer/types.go: complete Go type set incl. new RankedSkeletonResult/File and UnreferencedSymbolsResult/File
- internal/query/status.go: Cartographer added to getBackendStatuses(); shows availability, version, and capabilities in `ckb status`
- internal/query/review_layers.go: new checkLayerViolations() check; runs Cartographer layer analysis on PR-changed files; skips gracefully when not compiled in; tier-2 findings (architecture/warning)
- internal/query/review.go: wire layers check into the parallel check loop; add "layers" to tier-2 in findingTier
- Makefile: build/build-cartographer/build-fast/test/lint/clean targets; documents the -tags cartographer build flag
All binaries build clean with and without -tags cartographer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wire five Cartographer FFI calls into the review pipeline so CKB uses the Rust library in-process, not via MCP:
- review_coupling.go: HiddenCoupling() augments coupling check with co-change pairs that have no import edge; deduped against SCIP gaps; rule: ckb/coupling/hidden
- review_deadcode.go: UnreferencedSymbols() adds phase 3 to dead-code check; catches public exports with no callers project-wide; rule: ckb/dead-code/unreferenced-export
- review_arch_health.go (new): checkArchitecturalHealth() reports cycles (≥3 → error), god modules, and layer violations from cartographer.Health(); tier-2; rules: ckb/arch-health/*
- review_blastradius.go: GitChurn() loaded at check start; files with ≥15 commits escalate blast-radius findings from info → warning
- review.go: arch-health goroutine wired into parallel check loop; arch-health added to tier-2 in findingTier()
All calls guarded by cartographer.Available(); both build paths clean.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Addresses the remaining memory and cold-start bottlenecks for large repos.
mmap (mmap_unix.go / mmap_other.go):
- On Unix, memory-map the .scip file instead of os.ReadFile. The OS manages paging; raw bytes never hit the Go heap. Falls back to ReadFile on non-Unix platforms.
Streaming protobuf (loader.go):
- Replace proto.Unmarshal into a full scippb.Index (which materialises all documents simultaneously) with a protowire stream parse. A producer goroutine emits one *scippb.Document at a time to a buffered channel; nWorkers consumers convert+index each doc and release it.
- Peak memory drops from ~3× .scip file size (raw bytes + scippb.Index + CKB structs) to ~1× (CKB structs only; raw bytes are mmap pages managed by the OS).
- Fixed: SCIP Index.Documents is field 2, not 3 (ExternalSymbols is 3).
DefinitionIndex (loader.go, symbols.go):
- Add DefinitionIndex map[string]*OccurrenceRef (first definition occurrence per symbol), built for free during the parallel doc phase.
- findSymbolLocationFast now hits DefinitionIndex in O(1) instead of scanning all RefIndex entries for the symbol (was O(k) per symbol, expensive for high-cardinality symbols during ConvertedSymbols build).
NameIndex + SearchSymbols (loader.go, symbols.go):
- Add NameIndex []NameEntry (sorted by name) built after ConvertedSymbols.
- SearchSymbols iterates the compact sorted slice instead of the ConvertedSymbols map. Cache-line-friendly access pattern vs scattered map bucket pointers; also enables early-exit prefix search.
Gob cache (cache.go, adapter.go):
- After the first full build, saveDerivedCache writes ConvertedSymbols, ContainerIndex, and NameIndex to .ckb/scip_derived.gob (async).
- On subsequent startups, loadDerivedCache validates mtime+size against the .scip file and, if fresh, restores all three via applyCachedDerived, skipping the entire parallel symbol-conversion phase.
- Cache file is written atomically (tmp + rename).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
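The gob cache bullets above (freshness check on mtime+size, atomic tmp + rename write) can be sketched roughly as follows. This is a minimal illustration, not the code in cache.go: the `derivedCache` struct, `saveCache`, `loadCache`, and `demo` are hypothetical names, and the payload is a placeholder string slice rather than the real derived indexes.

```go
package main

import (
	"encoding/gob"
	"fmt"
	"os"
	"path/filepath"
)

// derivedCache is a hypothetical stand-in for the cached derived indexes.
type derivedCache struct {
	SrcModTime int64    // source file mtime (UnixNano) when the cache was built
	SrcSize    int64    // source file size when the cache was built
	Names      []string // placeholder payload
}

// saveCache writes the cache to a temp file in the target directory and
// renames it into place, so readers never observe a half-written file.
func saveCache(cachePath, srcPath string, names []string) error {
	st, err := os.Stat(srcPath)
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(cachePath), "derived-*.tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // harmless once the rename has succeeded
	c := derivedCache{SrcModTime: st.ModTime().UnixNano(), SrcSize: st.Size(), Names: names}
	if err := gob.NewEncoder(tmp).Encode(&c); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), cachePath)
}

// loadCache returns the payload only if the recorded mtime and size still
// match the source file; any mismatch or error is treated as a cache miss.
func loadCache(cachePath, srcPath string) ([]string, bool) {
	st, err := os.Stat(srcPath)
	if err != nil {
		return nil, false
	}
	f, err := os.Open(cachePath)
	if err != nil {
		return nil, false
	}
	defer f.Close()
	var c derivedCache
	if err := gob.NewDecoder(f).Decode(&c); err != nil {
		return nil, false
	}
	if c.SrcModTime != st.ModTime().UnixNano() || c.SrcSize != st.Size() {
		return nil, false
	}
	return c.Names, true
}

// demo builds a cache, reads it back, then changes the source file size
// and checks that the cache is rejected as stale.
func demo() (fresh, stale bool, err error) {
	dir, err := os.MkdirTemp("", "cache-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	src := filepath.Join(dir, "index.scip")
	cache := filepath.Join(dir, "scip_derived.gob")
	if err = os.WriteFile(src, []byte("v1"), 0o644); err != nil {
		return false, false, err
	}
	if err = saveCache(cache, src, []string{"Load", "Save"}); err != nil {
		return false, false, err
	}
	_, fresh = loadCache(cache, src)
	if err = os.WriteFile(src, []byte("v2-longer"), 0o644); err != nil {
		return fresh, false, err
	}
	_, stale = loadCache(cache, src)
	return fresh, stale, nil
}

func main() {
	fresh, stale, err := demo()
	fmt.Println(fresh, stale, err)
}
```

Creating the temp file in the same directory as the final cache keeps the rename on one filesystem, which is what makes the swap atomic on typical Unix systems.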
Wire Cartographer into existing query endpoints. All are guarded by cartographer.Available(), with zero impact on non-Cartographer builds:
- getArchitecture: MapProject adds arch health score, cycles, god modules, and bridge nodes to the response
- analyzeImpact: SimulateChange adds predicted affected modules, cycle risk, layer violations, and health delta (ArchImpact field)
- getModuleOverview: GetModuleContext adds skeleton (signatures + imports + deps) for the requested module
- summarizeDiff: Semidiff adds function-level added/removed signatures per file for commit-range selectors (FunctionChanges field)
- getHotspots: GitCochange adds co-change partners (top 3) to each hotspot file (CochangePartners field)
- exportForLLM: SkeletonMap / RankedSkeleton injected into response; tokenBudget param triggers personalized PageRank skeleton
- review_layers: auto-detects .cartographer/layers.toml instead of always passing empty string
- status: capabilities list expanded to enumerate all 11 active Cartographer integrations
- skeleton.go (new): GetSkeleton/GetRankedSkeleton engine helpers used by exportForLLM and future callers
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ce tests
scanPerformance (internal/perf + MCP tool):
- Whole-repo hidden coupling scan in a single git log pass; filters
out pairs with a static import edge via path-fragment heuristic
- Exposed as MCP tool scanPerformance with minCorrelation, minCoChanges,
windowDays, limit, and scope params
- cmd/ckb/perf — CLI command wrapping the same analyzer with table and
JSON output formats
internal/cicheck (CI workflow compliance tests):
- TestWorkflowActionsPinned — all uses: must be SHA-pinned
- TestWorkflowActionsVersionComments — SHA pins must have a version comment
- TestWorkflowJobsHaveTimeout — every job must declare timeout-minutes
- TestWorkflowNoDirectInputInterpolation — no ${{ inputs.* }} in run: blocks
- TestWorkflowNoLatestDockerTag — no docker://...:latest
- TestWorkflowConsistentActionVersions — same action must use same SHA everywhere
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
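As an illustration of the kind of check TestWorkflowActionsPinned describes, the sketch below flags `uses:` references that are not pinned to a 40-character commit SHA. It is a line-based approximation under stated assumptions: the function name `unpinnedUses` is hypothetical, local actions (`./…`) and `docker://` references are skipped, and the real test may well parse the YAML properly.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// shaPinned matches "owner/repo[/path]@<40 lowercase hex chars>".
var shaPinned = regexp.MustCompile(`^[^@\s]+@[0-9a-f]{40}$`)

// unpinnedUses returns every `uses:` reference in the workflow text that is
// not pinned to a full commit SHA. Line-based on purpose; not a YAML parser.
func unpinnedUses(workflow string) []string {
	var bad []string
	for _, line := range strings.Split(workflow, "\n") {
		line = strings.TrimSpace(line)
		line = strings.TrimPrefix(line, "- ")
		if !strings.HasPrefix(line, "uses:") {
			continue
		}
		ref := strings.TrimSpace(strings.TrimPrefix(line, "uses:"))
		// Drop a trailing version comment such as "# v5.0.0".
		if i := strings.Index(ref, " #"); i >= 0 {
			ref = strings.TrimSpace(ref[:i])
		}
		// Assumption: local and docker references are handled by other checks.
		if strings.HasPrefix(ref, "./") || strings.HasPrefix(ref, "docker://") {
			continue
		}
		if !shaPinned.MatchString(ref) {
			bad = append(bad, ref)
		}
	}
	return bad
}

func main() {
	wf := "steps:\n" +
		"  - uses: actions/checkout@v4\n" +
		"  - uses: actions/setup-go@0123456789abcdef0123456789abcdef01234567 # v5.0.0\n"
	fmt.Println(unpinnedUses(wf))
}
```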
- scipadapter_test.go: populate DocumentsByPath alongside Documents in test fixture; GetDocument now uses the map, not the slice
- presets_test.go, token_budget_test.go: update expected tool counts to 99 (v8.5 adds analyzeStructuralPerf)
- perf/types.go: add StructuralPerfOptions and LoopCallSite types for the upcoming structural performance analysis feature
- typescript fixtures: sync expected search output
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ixes
analyzeStructuralPerf (v8.5):
- New MCP tool: detects loop call sites in high-churn files using tree-sitter. Complements scanPerformance (cross-file hidden coupling) with intra-file O(n)/O(n²) structural signals.
- tool_impls_perf.go: implementation wired to internal/perf
- tools.go: tool definition with windowDays, minChurnCount, limit, scope, entrypointFiles params
- presets.go: added to full preset (total 99 tools)
navigation.go (explore):
- Use Cartographer.MapProject when available for directory overview; gains ignore-aware file list and per-language file counts (Languages field added to ExploreResult)
- Falls back to OS walk when Cartographer is unavailable
compound.go: minor query engine updates
scanner_bench_test.go, tool_impls_batch2_test.go: test fixes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lit, GetOwnership
PrepareChange (compound.go):
- getPrepareCoChanges now uses cartographer.GitCochange(): single git pass (bot-filtered) instead of O(n) per-file subprocess spawns
- Cross-references with HiddenCoupling() to mark each pair IsHidden when there is no import edge; callers can surface implicit risk to the LLM
- Falls back to coupling.Analyzer when Cartographer is not compiled in
- Added IsHidden bool to PrepareCoChange struct
PlanRefactor (compound_refactor.go):
- Collects hidden-coupling files from PrepareChange CoChangeFiles where IsHidden=true and surfaces them as HiddenCouplingFiles in CouplingAnalysis
- Fallback: calls HiddenCoupling() directly when PrepareChange ran without Cartographer but it became available by assembly time
- Added HiddenCouplingFiles []string to PlanRefactorCoupling struct
suggestPRSplit (review_split.go):
- addCartographerEdges: builds adjacency from static import graph (MapProject) + temporal coupling (GitCochange ≥ 0.5) in two single-pass calls; no per-file subprocess limit
- Replaced 200-file skipCoupling heuristic with Cartographer path; fallback to addCouplingEdges for non-Cartographer builds (200-file cap kept)
GetOwnership (ownership.go):
- CoChangePartners field added to response: top 5 files that co-change with the queried path (noise-filtered); implicit co-owners invisible to CODEOWNERS but visible in git history
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…2.1)
SCIP determinism:
- loader.go: sort parallel-load results by document path before merging so RefIndex/DefinitionIndex construction is goroutine-schedule-independent
- loader.go: fix NameIndex sort to total order (Name, ID); map iteration produced non-deterministic output when two symbols share a name
- adapter.go: SetCacheRoot() lets tests redirect the derived-index cache so concurrent tests don't race on the shared fixture scip_derived.gob
FTS determinism:
- fts.go: sort symbol IDs before inserting into FTS5; map iteration produced random BM25 scores and flaky golden test rankings
- engine.go: extract StartBgTasks() from NewEngine(); production entry points call it explicitly; tests skip it and call PopulateFTSFromSCIP synchronously to get deterministic state
- engine_helper.go + server.go: call StartBgTasks() after engine init
Test isolation:
- golden_test.go: call DisableBgFTS() + SetCacheRoot(tmpDir) to prevent background goroutines racing with synchronous FTS population
- Update golden fixtures to reflect deterministic symbol ordering
Bump to v8.2.1.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
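The NameIndex fix above comes down to sorting on a total order, so that symbols sharing a name always land in the same relative position regardless of map iteration order. A minimal sketch, assuming a `NameEntry` with just `Name` and `ID` fields (the real struct may carry more) and a hypothetical `buildNameIndex` helper; `prefixSearch` shows the early-exit prefix lookup a sorted slice allows:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// NameEntry is a simplified, assumed shape for one NameIndex row.
type NameEntry struct {
	Name string
	ID   string
}

// buildNameIndex turns an id->name map into a slice sorted by (Name, ID).
// The ID tie-break makes the result independent of map iteration order.
func buildNameIndex(symbols map[string]string) []NameEntry {
	idx := make([]NameEntry, 0, len(symbols))
	for id, name := range symbols {
		idx = append(idx, NameEntry{Name: name, ID: id})
	}
	sort.Slice(idx, func(i, j int) bool {
		if idx[i].Name != idx[j].Name {
			return idx[i].Name < idx[j].Name
		}
		return idx[i].ID < idx[j].ID
	})
	return idx
}

// prefixSearch binary-searches for the first name >= prefix, then walks
// forward and stops at the first entry that no longer has the prefix.
func prefixSearch(idx []NameEntry, prefix string) []NameEntry {
	start := sort.Search(len(idx), func(i int) bool { return idx[i].Name >= prefix })
	end := start
	for end < len(idx) && strings.HasPrefix(idx[end].Name, prefix) {
		end++
	}
	return idx[start:end]
}

func main() {
	idx := buildNameIndex(map[string]string{
		"s2": "Load", "s1": "Load", "s3": "Loader", "s4": "Save",
	})
	fmt.Println(idx)
	fmt.Println(prefixSearch(idx, "Load"))
}
```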
compliance/scanner.go:
- Reuse identifier buffer and seen-set across lines to eliminate per-line map/slice allocations in scanFile and CheckPIIInLogs
- Replace unicode import with sync (buffer reset pattern)
- Add compliance testdata fixtures
internal/perf/structural.go (cgo build):
- AnalyzeStructural: detects calls-inside-loops using tree-sitter; identifies O(n)/O(n²) structural anti-patterns in high-churn files via three-stage pipeline: git churn → tree-sitter parse → annotation
internal/perf/structural_stub.go (!cgo build):
- Stub for non-cgo builds; returns ErrUnavailable
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Restructures the perf CLI from a single flat command into a parent with two explicit subcommands so each analysis mode has its own flags and help. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… tests
- Add SearchContent and FindFiles FFI calls with stub implementations
- Add structural perf tests split across CGO and non-CGO build tags
- Add perf_bench_test.go: 22 benchmarks covering recordCommit O(n²), co-change pipeline simulation, importCouldReferTo, shouldIgnore
- Add structural_bench_test.go (CGO): 11 benchmarks for call-site annotation pipeline with documented baselines
- Expand FindFiles to accept FindOptions (filter by mtime, size, depth)
- Expand SearchContentOptions with full ripgrep-parity fields
- Add FileCount, FilesWithMatches/WithoutMatch fields to SearchResult
- ReplaceOptions/ReplaceResult/FileChange/DiffLine types for sed-like replace
- ExtractOptions/ExtractResult/ExtractMatch/CountEntry types for awk-like extract
- ReplaceContent() and ExtractContent() in real bridge + stubs
- Matches cartographer v1.8.0 FFI surface
…epoRoot field
- recordCommit: allocate seen map once in buildCoChangePairs, clear with range-delete instead of make() per commit → −97% allocs at 1k commits
- buildExplanation: strings.Builder + strconv replaces fmt.Sprintf → −40% latency, allocs halved (6→3 per site)
- Remove ScanOptions.RepoRoot (dead field; Analyzer.repoRoot used throughout)
- Update bench baselines and document results in docs/performance_log.md
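The buildExplanation change replaces formatted printing with direct appends. The sketch below shows the general pattern only; the signature, the parameter names, and the message wording are all hypothetical, since the PR does not show the real function body.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// buildExplanation assembles a message with strings.Builder and strconv
// instead of fmt.Sprintf. Signature and wording are illustrative only.
func buildExplanation(callee string, line, depth int) string {
	var b strings.Builder
	b.Grow(len(callee) + 48) // one up-front allocation for the common case
	b.WriteString("call to ")
	b.WriteString(callee)
	b.WriteString(" at line ")
	b.WriteString(strconv.Itoa(line))
	b.WriteString(" inside loop depth ")
	b.WriteString(strconv.Itoa(depth))
	return b.String()
}

func main() {
	fmt.Println(buildExplanation("db.Query", 42, 2))
}
```

fmt.Sprintf has to box its arguments into interfaces and parse the format string at run time, which is the usual reason a Builder-based version allocates less on a hot path.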
Replace cmd.Output()+strings.Split with StdoutPipe+bufio.Scanner so the
full git log is never loaded into memory as a bulk string. On a repo with
10k commits the old path double-copied ~500 KB ([]byte→string→[]string);
now each line is processed in-place from a 64 KB ring buffer.
Also switch from TrimSpace to bytes.TrimRight("\r") — precise, no leading
space scan — and fix BenchmarkCoChangePipelineSimulated's pairs map hint
from commits×files (20k) to files*(files-1)/2 (190), the actual unique
pair ceiling.
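A rough sketch of the streaming approach described in this commit, under stated assumptions: the log format (a `commit ` prefix line followed by file paths) and the names `scanLog`, `scanLogString`, and `uniquePairCeiling` are invented for illustration. In the real code the reader would be the pipe returned by `cmd.StdoutPipe()`; here a string reader stands in so the example runs without git.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// scanLog reads a log stream line by line without loading it in bulk.
// Assumed format: "commit <sha>" starts a commit, other non-empty lines
// are file paths.
func scanLog(r io.Reader) (commits, files int, err error) {
	sc := bufio.NewScanner(r)
	// 64 KB initial buffer; lines up to 1 MB are tolerated.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	prefix := []byte("commit ")
	for sc.Scan() {
		// sc.Bytes() aliases the scanner's buffer, so no per-line copy.
		line := bytes.TrimRight(sc.Bytes(), "\r")
		switch {
		case len(line) == 0:
			// blank separator line
		case bytes.HasPrefix(line, prefix):
			commits++
		default:
			files++
		}
	}
	return commits, files, sc.Err()
}

// scanLogString is a convenience wrapper for in-memory input.
func scanLogString(s string) (int, int, error) {
	return scanLog(strings.NewReader(s))
}

// uniquePairCeiling is the maximum number of distinct unordered pairs
// among n files: n*(n-1)/2.
func uniquePairCeiling(n int) int {
	return n * (n - 1) / 2
}

func main() {
	c, f, err := scanLogString("commit a1\r\nx.go\r\ny.go\r\n\r\ncommit b2\nz.go\n")
	fmt.Println(c, f, err)
	fmt.Println(uniquePairCeiling(20)) // 190, the map hint mentioned above
}
```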
🟢 Change Impact Analysis
Blast Radius: 0 modules, 0 files, 0 unique callers
📝 Changed Symbols (132)
Recommendations
Generated by CKB |
CKB Analysis
Risk factors: Large PR with 158 files • High churn: 35964 lines changed • Touches 30 hotspot(s) 👥 Suggested: @lisa.welsch1985@gmail.com (6%), @talantyyr@gmail.com (1%), @lisa@tastehub.io (1%)
🎯 Change Impact Analysis · 🟢 LOW · 132 changed → 0 affected
Symbols changed in this PR:
Recommendations:
💣 Blast radius · 0 symbols · 20 tests · 0 consumers
Tests that may break:
🔥 Hotspots · 30 volatile files
📦 Modules · 6 at risk
📊 Complexity · 8 violations
💡 Quick wins · 10 suggestions
📚 Stale docs · 175 broken references
Generated by CKB
Codecov Report
❌ Patch coverage is …
@@ Coverage Diff @@
## develop #205 +/- ##
=========================================
- Coverage 43.0% 42.8% -0.2%
=========================================
Files 507 525 +18
Lines 78022 80751 +2729
=========================================
+ Hits 33614 34639 +1025
- Misses 42045 43676 +1631
- Partials 2363 2436 +73
CKB review failed to generate output. |
… below MinCoChanges early
- ScanFiles now allocates seen+identBuf once per scan instead of once per file; eliminates N make() calls for N-file scans
- ScanOptions.MaxCommitFiles: skip commits above the file-count threshold (mass renames / formatting sweeps); 0 = unlimited
- buildCoChangePairs prunes pairCounts entries with count < MinCoChanges after parsing; lossless, since Scan() would filter them anyway, but doing it early cuts the O(N²/2) correlation iteration for large repos
- structural.go call-site updated to pass new params (0, 1 = no pruning)
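The three ideas in this commit and the earlier recordCommit change (one reusable seen map, a MaxCommitFiles cutoff, early pruning below MinCoChanges) fit together roughly as in the sketch below. It is an approximation: the `pair` type and this `buildCoChangePairs` signature are assumptions, and the real function parses git output rather than taking pre-split commits.

```go
package main

import (
	"fmt"
	"sort"
)

// pair is an unordered file pair stored with A < B.
type pair struct{ A, B string }

// buildCoChangePairs counts how often two files appear in the same commit.
// maxCommitFiles == 0 means unlimited; pairs seen fewer than minCoChanges
// times are dropped before returning.
func buildCoChangePairs(commits [][]string, maxCommitFiles, minCoChanges int) map[pair]int {
	counts := make(map[pair]int)
	seen := make(map[string]bool) // allocated once, cleared per commit
	var uniq []string             // reused backing array
	for _, files := range commits {
		if maxCommitFiles > 0 && len(files) > maxCommitFiles {
			continue // mass rename or formatting sweep
		}
		for k := range seen {
			delete(seen, k) // range-delete instead of a fresh make()
		}
		uniq = uniq[:0]
		for _, f := range files {
			if !seen[f] {
				seen[f] = true
				uniq = append(uniq, f)
			}
		}
		sort.Strings(uniq) // canonical order so (a,b) and (b,a) share a key
		for i := 0; i < len(uniq); i++ {
			for j := i + 1; j < len(uniq); j++ {
				counts[pair{uniq[i], uniq[j]}]++
			}
		}
	}
	// Early pruning: later stages never look at pairs below the threshold.
	for p, n := range counts {
		if n < minCoChanges {
			delete(counts, p)
		}
	}
	return counts
}

func main() {
	commits := [][]string{
		{"a.go", "b.go"},
		{"b.go", "a.go", "a.go"},
		{"a.go", "c.go"},
		{"w.go", "x.go", "y.go", "z.go"}, // skipped when maxCommitFiles is 3
	}
	fmt.Println(buildCoChangePairs(commits, 3, 2))
}
```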
Adds synthetic scale benchmarks for the 50k-file indexing bottleneck (1h SCIP + 10h+ ckb index timeout on customer repo):
- backends/scip/scale_bench_test.go: LoadSCIPIndex at 1k/10k/50k docs
  baseline: ~8s/op, 6.9 GB alloc at 50k docs
- incremental/scale_bench_test.go: ApplyDelta, PopulateFullIndex, UpdateFileDepsHotPath, GetDependenciesPerFile at scale
  baseline: ~50s/op at 50k files × 40 syms × 200 refs
- bench/baselines/v8.2.1.txt: benchstat-compatible before-fix reference
…r, scip alloc fix
- cartographer: expose BM25Search and QueryContext FFI bindings with full options/result types; stub no-ops added to bridge_stub.go
- incremental: split UpdateFileDeps into one-off and bulk paths (updateFileDepsWithStmt) to share prepared statements across batch inserts
- scip/loader: pre-allocate backing slice for OccurrenceRefs; cuts allocs from O(total_occs) to O(docs) at load time
- .gitignore: exclude /ckb-bench binary, registry token files, marketing zips
- docs: add cartographer integration docs, cognitive vault spec v1.1, roadmap v8.1, ckb-bench command skeleton, cartographer bench test
PopulateFromFullIndex rewrite for the 50k-file / 10h timeout case:
- Phase 2 extractFileDelta now runs in parallel (GOMAXPROCS workers); CPU-bound SCIP document parsing was fully sequential before
- Single giant transaction split into 1000-file batches, keeping WAL bounded and allowing incremental checkpointing
- PRAGMA synchronous=OFF for the bulk load duration (safe: failed full index is always re-run from scratch)
- bulkInsertFileSymbols: batched multi-row VALUES (499 rows/stmt) instead of one Exec per symbol
- applyStmts struct: file_symbols, callgraph, and file_deps statements all prepared once per transaction; eliminates 3× 50k Prepare/Close round-trips that were the dominant cost on large repos
- insertCallEdgesWithStmt: callgraph inserts use the shared stmt too
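The batched multi-row VALUES idea can be sketched without a database by generating the statements and their arguments. Everything named here is an assumption for illustration: the `batch` type, `batchInserts`, and the two-column layout of `file_symbols` are not taken from the PR, which only states the 499-rows-per-statement figure.

```go
package main

import (
	"fmt"
	"strings"
)

// batch is one multi-row INSERT plus its flattened arguments.
type batch struct {
	SQL  string
	Args []any
}

// batchInserts groups (file_path, symbol_id) rows into statements of at
// most maxRows rows each. Column names are assumed for illustration.
func batchInserts(rows [][2]string, maxRows int) []batch {
	if maxRows <= 0 {
		maxRows = 1
	}
	var out []batch
	for start := 0; start < len(rows); start += maxRows {
		end := start + maxRows
		if end > len(rows) {
			end = len(rows)
		}
		chunk := rows[start:end]
		var sb strings.Builder
		sb.WriteString("INSERT INTO file_symbols (file_path, symbol_id) VALUES ")
		args := make([]any, 0, 2*len(chunk))
		for i, r := range chunk {
			if i > 0 {
				sb.WriteString(", ")
			}
			sb.WriteString("(?, ?)")
			args = append(args, r[0], r[1])
		}
		out = append(out, batch{SQL: sb.String(), Args: args})
	}
	return out
}

func main() {
	rows := make([][2]string, 1000)
	for i := range rows {
		rows[i] = [2]string{"file.go", fmt.Sprintf("sym%d", i)}
	}
	batches := batchInserts(rows, 499)
	fmt.Println(len(batches), len(batches[0].Args), len(batches[2].Args))
}
```

Each `batch` would then be executed once inside the surrounding transaction, so 1000 rows cost three statement executions instead of a thousand.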
…act format
- scip: build CallerIndex at load time (phase 4); FindCallers is now O(1) via map lookup instead of O(docs×syms×occs) scan; buildCallerIndex uses sorted interval scan with early-break and per-doc edge dedup
- envelope: add Backend/Accuracy fields to Meta; AccuracyForBackend helper maps scip→high, lsp→medium, tree-sitter/fallback→low
- query/engine: ActiveBackendName() returns active backend tier
- mcp: WithBackend on ToolResponse; analyzeImpact and prepareChange emit backend+accuracy in envelope; prepareChange supports format=compact for token-budget-constrained callers
- mcp/tool_impls: best-effort LIP nyx-agent-lock check on affected files
- cmd: `ckb impact prepare` subcommand with --format=compact
- bench: v8.4.0 baseline (scip alloc -15%, ApplyDelta/large -20%)
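The backend-to-accuracy mapping listed above is small enough to show in full. The sketch assumes plain string inputs and outputs and treats any unrecognised backend name as low accuracy; the real helper's signature and its handling of unknown names are not shown in the PR.

```go
package main

import "fmt"

// AccuracyForBackend maps a backend name to an accuracy tier:
// scip -> high, lsp -> medium, everything else -> low.
// Treating unknown names as "low" is an assumption of this sketch.
func AccuracyForBackend(backend string) string {
	switch backend {
	case "scip":
		return "high"
	case "lsp":
		return "medium"
	default: // "tree-sitter", "fallback", or anything unrecognised
		return "low"
	}
}

func main() {
	for _, b := range []string{"scip", "lsp", "tree-sitter", "fallback"} {
		fmt.Println(b, AccuracyForBackend(b))
	}
}
```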
…lation
- Add queryContext tool: Cartographer PKG retrieval pipeline (BM25 search → personalized PageRank skeleton → context health) in a single MCP call. Returns ready-to-inject context bundle with token count and A–F grade.
- Add contextHealth tool: scores a context bundle on 6 research-backed metrics (signal density, compression density, position health, entity density, utilisation headroom, dedup ratio) with composite 0–100 score and actionable recommendations.
- prepareChange: add parallel Cartographer SimulateChange goroutine; result surfaces as archImpact on PrepareChangeResponse. Cycle risk and layer violations feed into calculatePrepareRisk as explicit risk factors.
- Bump full preset tool count to 101 (was 99).
git-subtree-dir: vendor/cartographer
git-subtree-split: 7e8fd8e8d9d29a0453d29dff436e2a65b61bbda9
… v8.5.0
- Move Cartographer from fragile sibling-repo CGo path to third_party/cartographer/ via git subtree (vendor/ conflicted with Go toolchain vendor consistency checks)
- Fix 6 CGo path directives in bridge.go: ../../../../Cartographer → ../../third_party/cartographer
- Wire three previously unexposed C exports as Go FFI + stubs: ShotgunSurgery, Evolution, BlastRadius
- Expose as MCP tools: detectShotgunSurgery, getArchitecturalEvolution, getBlastRadius (registered in tool_impls_v86.go)
- buildCallerIndex: pool ivs slice and replace per-doc docSeen map with generation counter; saves ~6k allocs on small SCIP load
- Bump version to 8.5.0
- CallerIndex is now built on the first FindCallers call (sync.Once)
instead of at LoadIndex time — removes ~22k persistent heap objects
from small SCIP loads, restoring alloc count to v8.4.0 baseline
- proto.UnmarshalOptions{DiscardUnknown: true} on both document stream
unmarshal calls; skips reflection-based unknown-field accumulator
- buildCallerIndex: reuse ivs slice + generation-counter deduplication
(already landed; documented in CHANGELOG)
- lip.GetEmbedding: request TurboQuant-quantized embeddings from LIP
daemon; same silent-degradation pattern as GetAnnotation
- CHANGELOG: full v8.5.0 section covering all perf improvements
including v8.4.0 incremental wins not previously documented
PopulateFromFullIndexStreaming: two-pass proto-native path that never materialises the full SCIPIndex in RAM. Uses extractFileDeltaFromProto to skip all intermediate scip.Document allocations. Adaptive threshold in PopulateAfterFullIndex selects streaming for indexes > 200 MB, old single-pass path for smaller ones. Large-repo cold-run: 485s → 83s.
LIP client: embedding_batch (single round-trip for RerankWithLIP), nearest / nearest_by_text (HNSW-backed semantic file search), symbol_embedding, index_status, file_status. Generic lipRPC transport consolidates all simple request→response calls.
Fast-tier search: RerankWithLIP now uses GetEmbeddingsBatch instead of N serial GetEmbedding calls. SemanticSearchWithLIP supplements sparse FTS results (<3) via nearest_by_text → symbolsForFile resolution. lipRanked flag prevents a redundant second embedding_batch pass.
LIP symbol annotations: annotationSet/Get/List MCP tools with local SQLite backing. Watcher gains Seq + DeltaAck for LIP delta protocol.
Add scipLargeRepoThreshold (50k source files). When exceeded, ckb index prints what's available without SCIP (FTS + LSP + LIP semantic search), what requires SCIP (call graph, analyzeImpact), and the exact indexer command to generate SCIP manually. --scip flag overrides the gate; --force also proceeds with a duration warning.
Also update MCP full-preset tool count bounds to 107 (+3 Cartographer, +3 LIP annotation tools added in previous commit).
PopulateFromFullIndex and PopulateFromFullIndexStreaming now set wal_autocheckpoint=0 during the batch loop (eliminates WAL checkpoint I/O interruptions between the 50 batch transactions on a 50k-file repo) and double the page cache to 128 MB (vs startup's 64 MB) to keep more B-tree nodes warm during unique-key checks. A single PRAGMA wal_checkpoint(TRUNCATE) runs on defer after all batches.
FTS BulkInsert replaces 2M individual stmt.ExecContext calls with batched 499-row multi-row INSERTs (~4k INSERT statements for a 50k-file repo). Triggers are already dropped before the bulk; FTS5 rebuild still runs once at the end. No change to the trigger/rebuild logic.
Also updates SARIF golden to v8.5.0.
ckb doctor now shows a 'lip' check: pass with indexed file count when daemon is running, warn when it's not (semantic search disabled).
checkScip detects large repos (>50k source files) and shows a 'pass' with a clear explanation: active tier is FTS+LSP+LIP, call graph requires --scip to opt in. Replaces the misleading "not found" warn.
PopulateFTSFromSCIP: drop sort.Strings(symIDs). FTS5's 'rebuild' produces deterministic BM25 output regardless of insert order; the sort was allocating a 2M-element string slice on every full populate for no benefit.
For a 50k-doc repo buildCallerIndex takes ~6.7s and allocates 489 MB. Previously that work blocked the first getCallGraph / traceUsage call. Now a background goroutine starts it immediately after s.index is set; callerIndexOnce guarantees no duplicate work if FindCallers races it.
Benchmark (BenchmarkBuildCallerIndex, Apple M4 Pro):
small_1k_docs → 4.6 ms, 5 MB
medium_10k_docs → 83 ms, 75 MB
large_50k_docs → 6.7 s, 489 MB ← now absorbed in background
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
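The pre-warm pattern described here (a background goroutine plus a sync.Once so a racing FindCallers never duplicates the build) might look like the sketch below. The `callIndex` type, its fields, and the `builds` counter are hypothetical and exist only to make the single-build guarantee observable.

```go
package main

import (
	"fmt"
	"sync"
)

// callIndex is a hypothetical, simplified index of (caller, callee) edges.
type callIndex struct {
	edges   [][2]string
	once    sync.Once
	callers map[string][]string // callee -> callers, built lazily
	builds  int                 // how many times the build actually ran
}

// ensure builds the caller map exactly once, no matter how many goroutines
// call it concurrently; late callers block until the build has finished.
func (ix *callIndex) ensure() {
	ix.once.Do(func() {
		m := make(map[string][]string, len(ix.edges))
		for _, e := range ix.edges {
			m[e[1]] = append(m[e[1]], e[0])
		}
		ix.callers = m
		ix.builds++
	})
}

// Prewarm starts the build in the background right after load.
func (ix *callIndex) Prewarm() { go ix.ensure() }

// FindCallers is safe to call before, during, or after the pre-warm.
func (ix *callIndex) FindCallers(sym string) []string {
	ix.ensure()
	return ix.callers[sym]
}

func main() {
	ix := &callIndex{edges: [][2]string{
		{"main", "load"}, {"serve", "load"}, {"load", "parse"},
	}}
	ix.Prewarm()
	fmt.Println(ix.FindCallers("load"), ix.builds)
}
```

Because `sync.Once.Do` does not return until the first invocation has completed, reads of `callers` after `ensure()` are ordered after the build without any extra locking.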
BulkInsert required the caller to materialise the full []SymbolFTSRecord
slice before the transaction started (~400 MB for a 50k-file repo).
BulkInsertFunc calls a user-provided fn(flush) callback instead, letting
the caller feed records in chunks and never holding more than one chunk.
PopulateFTSFromSCIP now streams in 10k-record chunks. The ftsDropTriggers
/ ftsCreateTriggers DDL slices are extracted to package-level vars and
shared by both BulkInsert and BulkInsertFunc.
Benchmark (BenchmarkBulkInsertVsFunc, 500k symbols):
BulkInsert: 6.6 s, 493 MB
BulkInsertFunc: 6.3 s, 439 MB (−55 MB caller alloc; further savings
in practice where no full slice exists)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
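The `fn(flush)` callback shape described in this commit can be sketched as follows. The names `Record`, `bulkInsertFunc`, and `streamRecords` are illustrative, and the `sink` function stands in for the transactional batched INSERT, which is omitted here.

```go
package main

import (
	"fmt"
	"strconv"
)

// Record is a hypothetical stand-in for one FTS row.
type Record struct {
	ID   string
	Name string
}

// bulkInsertFunc hands the producer a flush function. Each flush call
// passes one chunk to sink, so the full record set never exists at once.
// In real code sink would run batched INSERTs inside one transaction.
func bulkInsertFunc(sink func([]Record) error, produce func(flush func([]Record) error) error) error {
	return produce(func(chunk []Record) error {
		if len(chunk) == 0 {
			return nil
		}
		return sink(chunk)
	})
}

// streamRecords generates total records and flushes them chunkSize at a
// time, reusing one buffer. It reports rows inserted and flush count.
func streamRecords(total, chunkSize int) (inserted, flushes int, err error) {
	sink := func(chunk []Record) error {
		inserted += len(chunk)
		flushes++
		return nil
	}
	err = bulkInsertFunc(sink, func(flush func([]Record) error) error {
		buf := make([]Record, 0, chunkSize)
		for i := 0; i < total; i++ {
			buf = append(buf, Record{ID: strconv.Itoa(i), Name: "sym"})
			if len(buf) == chunkSize {
				if err := flush(buf); err != nil {
					return err
				}
				buf = buf[:0] // reuse the backing array for the next chunk
			}
		}
		return flush(buf) // final partial chunk, if any
	})
	return inserted, flushes, err
}

func main() {
	fmt.Println(streamRecords(25, 10))
}
```

One caveat of reusing the buffer: the sink must finish with a chunk before flush returns, since the next chunk overwrites the same memory.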
SemanticSearchWithLIP fired one WHERE file_path = ? query per LIP hit (up to 20 for topK=20). Replace with a single WHERE file_path IN (…) query via the new symbolsForFiles method.
SemanticSearchWithLIP signature changes from per-URI callback:
func(fileURI string) []SearchResultItem
to batch callback:
func(fileURIs []string) map[string][]SearchResultItem
The call site in symbols.go calls symbolsForFiles once for the full URI list instead of looping.
Benchmark (BenchmarkSymbolsForFileVsBatch, Apple M4 Pro):
5 files: 118 µs → 84 µs (1.4×)
10 files: 281 µs → 168 µs (1.7×)
20 files: 756 µs → 301 µs (2.5×)
Also adds BenchmarkBuildCallerIndex to scale_bench_test.go so the CallerIndex pre-warm cost is directly measurable.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
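A minimal sketch of the batch-query shape, assuming a `symbols` table with `file_path` and `symbol_id` columns (both names are guesses) and two hypothetical helpers: `buildInQuery` produces one parameterised IN query, and `groupByFile` regroups the returned rows into the per-file map the batch callback hands back.

```go
package main

import (
	"fmt"
	"strings"
)

// buildInQuery returns a single query with one placeholder per URI.
// Table and column names are assumptions for illustration.
func buildInQuery(uris []string) (string, []any) {
	if len(uris) == 0 {
		return "", nil
	}
	placeholders := strings.TrimSuffix(strings.Repeat("?,", len(uris)), ",")
	query := "SELECT file_path, symbol_id FROM symbols WHERE file_path IN (" + placeholders + ")"
	args := make([]any, len(uris))
	for i, u := range uris {
		args[i] = u
	}
	return query, args
}

// groupByFile turns flat (file_path, symbol_id) rows into a per-file map,
// mirroring the map[string][]... return shape of the batch callback.
func groupByFile(rows [][2]string) map[string][]string {
	out := make(map[string][]string)
	for _, r := range rows {
		out[r[0]] = append(out[r[0]], r[1])
	}
	return out
}

func main() {
	q, args := buildInQuery([]string{"a.go", "b.go", "c.go"})
	fmt.Println(q, len(args))
	fmt.Println(groupByFile([][2]string{{"a.go", "s1"}, {"b.go", "s2"}, {"a.go", "s3"}}))
}
```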
Go Mod Tidy
- Bump go directive 1.26.1 → 1.26.2 (fixes 4 crypto/tls + crypto/x509 stdlib vulns: GO-2026-4866/4870/4946/4947; all fixed in go1.26.2)
- go mod tidy: promote golang.org/x/sys to direct dependency
Lint (gofmt)
- Format 20 files that gofmt disagreed with (pre-existing, not introduced by recent perf commits)
Lint (govet)
- cmd/ckb/impact.go:152: remove shadowed err via var+assign pattern
- internal/query/impact.go:471: drop tautological symbolInfo != nil check (symbolInfo is provably non-nil after the nil-return guard at line 259)
Lint (unused)
- internal/backends/scip/loader.go: remove unused convertDocuments func (only convertDocument singular is called; my bench test added that call)
- internal/cartographer/types.go: move ffiResponse out of types.go into bridge.go where it is used under //go:build cartographer; drop now-empty encoding/json import from types.go
- internal/query/fts.go: remove symbolsForFile (superseded by the new symbolsForFiles batch method; call site updated in the previous commit)
Lint (errcheck / check-type-assertions: true)
- internal/compliance/scanner.go:335: use two-value type assertion for sync.Pool.Get with nil fallback
- internal/query/golden_test.go:459,463,467: use _, ok form for map type assertions inside sort.Slice
- internal/watcher/watcher.go:276: use ok2 form for chan type assertion
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
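The errcheck fix for sync.Pool.Get mentioned above follows a common pattern: use the two-value type assertion and fall back to a fresh value if it fails. The pool contents (`*[]byte`) and the helper names below are assumptions; the PR only says the scanner reuses an identifier buffer.

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool holds reusable byte buffers. Storing a pointer avoids an extra
// allocation when the slice header is boxed into an interface.
var bufPool = sync.Pool{New: func() any { return new([]byte) }}

// getBuf uses the two-value assertion so an unexpected or nil pool entry
// cannot panic; it falls back to a fresh buffer instead.
func getBuf() *[]byte {
	b, ok := bufPool.Get().(*[]byte)
	if !ok || b == nil {
		b = new([]byte)
	}
	*b = (*b)[:0] // reset length, keep capacity
	return b
}

// putBuf returns a buffer to the pool for reuse.
func putBuf(b *[]byte) { bufPool.Put(b) }

func main() {
	b := getBuf()
	*b = append(*b, "ident"...)
	fmt.Println(len(*b))
	putBuf(b)
	fmt.Println(len(*getBuf()))
}
```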
detectShotgunSurgery, getArchitecturalEvolution, and getBlastRadius now use s.engine().GetRepoRoot() instead of requiring an explicit repo_path parameter, consistent with every other Cartographer-backed tool. analyzeCoupling suggests detectShotgunSurgery when high coupling is found.
Summary
- internal/perf package: new hidden-coupling scanner (Scan) and structural perf analyzer (AnalyzeStructural); they detect files that co-change without static import edges, and find call expressions inside loop bodies in high-churn files
- scanPerformance + analyzeStructuralPerf MCP tools: expose both scan modes via MCP and CLI (ckb perf coupling / ckb perf structural)
- SearchContent options (before/after context, invert, word-regexp, files-with-matches, count-only, no-ignore), FindFiles with FindOptions (modified-since, size bounds, max-depth), ReplaceContent and ExtractContent bindings
- ckb index skips SCIP above 50k source files and guides the user to the FTS + LSP + LIP tier; ckb doctor reports which tier is active and whether the LIP daemon is running
Perf optimizations (with before/after numbers)
All measured on Apple M4 Pro, arm64, -count=3 -benchmem unless noted.
1. Lift seen map out of recordCommit
One make(map[string]bool) per commit → allocate once, range-delete to clear.
Benchmark rows: CoChangePipeline/500c_10f, CoChangePipeline/1kc_20f, CoChangePipeline/1kc_20f (B/op), CoChangePipeline/1kc_20f (ns/op)
2. buildExplanation: fmt.Sprintf → strings.Builder + strconv
Benchmark row: CallSitePipeline/500sites
3. Stream git output via bufio.Scanner
Replaced cmd.Output() + bulk string split with StdoutPipe + bufio.Scanner: one 64 KB ring buffer, zero bulk copy.
4. SQLite bulk PRAGMA tuning
synchronous=OFF, cache_size=-131072 (128 MB), wal_autocheckpoint=0 during PopulateFromFullIndex, with a single PRAGMA wal_checkpoint(TRUNCATE) on completion. Eliminates WAL checkpoint interruptions across the 50 batch transactions on large repos.
5. FTS batched INSERT
Replaced ~2M individual stmt.Exec calls in BulkInsert with 499-row multi-row INSERTs inside one transaction. Measured at 50k-doc scale (2M symbols):
Benchmark row: large_50k_docs
6. CallerIndex background pre-warm
buildCallerIndex blocks the first getCallGraph / traceUsage call. Now starts in a background goroutine immediately after LoadIndex returns; callerIndexOnce prevents duplicate work if the call races the goroutine.
7. BulkInsertFunc streaming API for FTS
PopulateFTSFromSCIP previously built a full []SymbolFTSRecord slice (~400 MB for 50k-file repo) before the transaction started. BulkInsertFunc takes a fn(flush) callback so the caller can stream in 10k-record chunks, never materialising the full slice.
8. symbolsForFiles batch query in SemanticSearchWithLIP
Per-URI WHERE file_path = ? queries replaced with a single WHERE file_path IN (…) call via the new symbolsForFiles method. SemanticSearchWithLIP signature updated to take a batch callback.
Test results
Race detector clean on ./internal/backends/scip/... (covers new background goroutine).
Test plan
- go test ./... : all packages green
- go test -race ./internal/backends/scip/... : no data races
- go test -bench=BenchmarkBulkInsertVsFunc -benchmem ./internal/storage/...
- go test -bench=BenchmarkSymbolsForFileVsBatch -benchmem ./internal/storage/...
- go test -bench=BenchmarkBuildCallerIndex -benchmem ./internal/backends/scip/...
🤖 Generated with Claude Code