SimplyLiz
diff --git a/.claude/commands/audit.md b/.claude/commands/audit.md
Lines changed: 117 additions & 0 deletions
diff --git a/.claude/commands/review.md b/.claude/commands/review.md
Lines changed: 45 additions & 11 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 128 additions & 0 deletions
diff --git a/CLAUDE.md b/CLAUDE.md
Lines changed: 9 additions & 1 deletion
@@ -0,0 +1,117 @@
+Run a CKB-augmented compliance audit optimized for minimal token usage.
+
+## Input
+$ARGUMENTS - Optional: framework(s) to audit (default: auto-detect from repo context). Examples: "gdpr", "gdpr,pci-dss,hipaa", "all"
+
+## Philosophy
+
+CKB already ran deterministic checks across 20 regulatory frameworks, mapped every finding
+to a specific regulation article, and assigned confidence scores. The LLM's job is ONLY what
+CKB can't do: assess whether findings are real compliance risks or false positives given the
+repo's actual purpose, and prioritize remediation by business impact.
+
+### Available frameworks (20 total)
+
+**Privacy:** gdpr, ccpa, iso27701
+**AI:** eu-ai-act
+**Security:** iso27001, nist-800-53, owasp-asvs, soc2, hipaa
+**Industry:** pci-dss, dora, nis2, fda-21cfr11, eu-cra
+**Supply chain:** sbom-slsa
+**Safety:** iec61508, iso26262, do-178c
+**Coding:** misra, iec62443
+
+### CKB's blind spots (what the LLM must catch)
+
+CKB maps code patterns to regulation articles using AST + regex + tree-sitter. It is
+structurally correct but contextually blind:
+
+- **Business context**: CKB flags PII patterns in a healthcare app and a game engine equally
+- **Architecture awareness**: a finding in dead/test code vs production code has different weight
+- **Compensating controls**: CKB can't see infrastructure-level encryption, WAFs, or IAM policies
+- **Regulatory applicability**: CKB flags HIPAA in a repo that doesn't handle PHI
+- **Risk prioritization**: 50 findings need ordering by actual business/legal exposure
+- **Cross-reference noise**: the same hardcoded credential maps to 6 frameworks — that's 1 fix, not 6
+
+## Phase 1: Structural scan (~2k tokens into context)
+
+```bash
+ckb audit compliance --framework=$ARGUMENTS --format=json --min-confidence=0.7 2>/dev/null
+```
+
+For large repos, scope to a specific path to reduce noise:
+```bash
+ckb audit compliance --framework=$ARGUMENTS --scope=src/api --format=json --min-confidence=0.7 2>/dev/null
+```
+
+If no framework specified, pick based on repo context:
+- Has health/patient/medical code → `hipaa,gdpr`
+- Has payment/billing/card code → `pci-dss,soc2`
+- EU company or processes EU data → `gdpr,dora,nis2`
+- AI/ML code → `eu-ai-act`
+- Safety-critical/embedded → `iec61508,iso26262,misra`
+- General SaaS → `iso27001,soc2,owasp-asvs`
+- If unsure → `iso27001,owasp-asvs` (broadest applicability)
+
+From the JSON output, extract:
+- `score`, `verdict` (pass/warn/fail)
+- `coverage[]` — per-framework scores with passed/warned/failed/skipped check counts
+- `findings[]` — with check, severity, file, startLine, message, suggestion, confidence, CWE
+- `checks[]` — per-check status and summary
+- `summary` — total findings by severity, files scanned
+
+Note:
+- **Per-framework scores**: which frameworks are clean vs problematic
+- **Finding count by severity**: errors are your priority
+- **CWE references**: cross-reference with known vulnerability databases
+- **Confidence scores**: findings below 0.7 are likely false positives; the `--min-confidence=0.7` flag above should already exclude them
+
+**Early exit**: If verdict=pass and all framework scores ≥ 90, write a one-line summary and stop.
+
+## Phase 2: Triage findings (targeted reads only)
+
+Do NOT read every flagged file. Group findings by root cause first:
+
+1. **Deduplicate cross-framework findings** — a hardcoded secret flagged by GDPR, PCI DSS, HIPAA, and ISO 27001 is one fix
+2. **Check for dominant category** — if > 50% of findings are one category (e.g., "sql-injection"), investigate that category systemically (is the pattern matching too broad?) rather than checking each file individually
+3. **Check applicability** — does this repo actually fall under the flagged framework? (e.g., HIPAA findings in a non-healthcare repo)
+4. **Read only error-severity files** — warnings and info can wait
+5. **For each error finding**, read just the flagged lines (not the whole file) and assess:
+   - Is this a real compliance risk or a pattern false positive?
+   - Are there compensating controls elsewhere? (check imports, config, middleware)
+   - What's the remediation effort: one-liner fix vs architectural change?
+
+## Phase 3: Write the audit summary (be terse)
+
+```markdown
+## [COMPLIANT|NEEDS REMEDIATION|NON-COMPLIANT] — CKB score: [N]/100
+
+[One sentence: what frameworks were audited and overall posture]
+
+### Critical findings (must remediate)
+1. **[framework]** `file:line` Art. [X] — [issue + remediation in one sentence]
+2. ...
+
+### Not applicable (false positives from context)
+[List findings CKB flagged but that don't apply to this repo, with one-line reason]
+
+### Cross-framework deduplication
+[N findings deduplicated to M root causes]
+
+### Framework scores
+| Framework | Score | Status | Checks |
+|-----------|-------|--------|--------|
+| [name]    | [N]   | [pass/warn/fail] | [passed]/[total] |
+```
+
+If fully compliant: just the header + framework scores. Nothing else.
+
+## Anti-patterns (token waste)
+
+- Reading every flagged file → waste (group by root cause, read only errors)
+- Treating cross-framework duplicates as separate issues → waste (1 code fix = 1 issue)
+- Explaining what each regulation requires → waste (CKB already mapped articles)
+- Re-checking frameworks CKB scored at 100 → waste
+- Auditing frameworks that don't apply to this repo → waste
+- Reading low-confidence findings (< 0.7) → waste (likely false positives)
+- Suggesting infrastructure controls for code-level findings → out of scope
+- Using wrong framework IDs (use pci-dss not pcidss, owasp-asvs not owaspasvs) → CKB error
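
The early-exit rule and the cross-framework deduplication described in audit.md could be sketched roughly as follows. This is a minimal illustration, not CKB's implementation. The key names `file`, `startLine`, `confidence`, `verdict`, `coverage`, and `findings` come from the command text. The per-finding `framework` key, the per-coverage `score` and `framework` keys, the check names, and the file paths in the sample are assumptions made for the example.

```python
from collections import defaultdict


def can_exit_early(report, threshold=90):
    """Early exit: verdict is 'pass' and every framework score meets the threshold."""
    # Assumption: each coverage entry carries a numeric "score" key.
    scores = [c["score"] for c in report.get("coverage", [])]
    return (
        report.get("verdict") == "pass"
        and bool(scores)
        and all(s >= threshold for s in scores)
    )


def group_by_root_cause(findings, min_confidence=0.7):
    """Collapse findings that point at the same file and line into one root cause."""
    groups = defaultdict(list)
    for finding in findings:
        if finding.get("confidence", 1.0) < min_confidence:
            continue  # likely false positive, not worth a read
        groups[(finding["file"], finding["startLine"])].append(finding)
    return groups


# Hypothetical report: one hardcoded secret flagged by two frameworks,
# plus one low-confidence finding that is dropped.
report = {
    "verdict": "fail",
    "coverage": [
        {"framework": "gdpr", "score": 72},
        {"framework": "pci-dss", "score": 95},
    ],
    "findings": [
        {"framework": "gdpr", "check": "hardcoded-secret", "severity": "error",
         "file": "config/db.go", "startLine": 14, "confidence": 0.95},
        {"framework": "pci-dss", "check": "hardcoded-secret", "severity": "error",
         "file": "config/db.go", "startLine": 14, "confidence": 0.95},
        {"framework": "gdpr", "check": "pii-logging", "severity": "warning",
         "file": "api/user.go", "startLine": 88, "confidence": 0.5},
    ],
}

roots = group_by_root_cause(report["findings"])
for (path, line), group in roots.items():
    frameworks = sorted(f["framework"] for f in group)
    print(f"{path}:{line} -> 1 fix, flagged by {', '.join(frameworks)}")
```

Keying on file and line is only one possible notion of "root cause". Keying on the check name or CWE instead would merge repeated instances of the same pattern across files.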
@@ -19,6 +19,7 @@ It is structurally sound but semantically blind:
 - **Design fitness**: wrong abstraction, leaky interface, coupling that metrics miss
 - **Input validation**: missing bounds checks, nil guards outside AST patterns
 - **Race conditions**: concurrency issues, mutex ordering, shared state
+- **Resource leaks**: file handles, goroutines, connections not closed on all paths
 - **Incomplete refactoring**: callers missed across module boundaries
 - **Domain edge cases**: error paths, boundary conditions tests don't cover
 
@@ -29,19 +30,36 @@ so pre-existing issues interacting with new code won't surface.
 ## Phase 1: Structural scan (~1k tokens into context)
 
 ```bash
-ckb review --base=main --format=json --compact 2>/dev/null
+ckb review --base=main --format=json 2>/dev/null
 ```
 
 If a PR number was given:
 ```bash
 BASE=$(gh pr view $ARGUMENTS --json baseRefName -q .baseRefName)
-ckb review --base=$BASE --format=json --compact 2>/dev/null
+ckb review --base=$BASE --format=json 2>/dev/null
 ```
 
-From the output, build three lists:
+If "staged" was given:
+```bash
+ckb review --staged --format=json 2>/dev/null
+```
+
+Parse the JSON output to extract:
+- `score`, `verdict` — overall quality
+- `checks[]` — status + summary per check (15 checks: breaking, secrets, tests, complexity,
+  coupling, hotspots, risk, health, dead-code, test-gaps, blast-radius, comment-drift,
+  format-consistency, bug-patterns, split)
+- `findings[]` — severity + file + message + ruleId (top-level, separate from check details)
+- `narrative` — CKB AI-generated summary (if available)
+- `prTier` — small/medium/large
+- `reviewEffort` — estimated hours + complexity
+- `reviewers[]` — suggested reviewers with expertise areas
+- `healthReport` — degraded/improved file counts
+
+From checks, build three lists:
 - **SKIP**: passed checks — don't touch these files or topics
 - **INVESTIGATE**: warned/failed checks — these are your review scope
-- **READ**: hotspot files + files with warn/fail findings — the only files you'll read
+- **READ**: files with warn/fail findings — the only files you'll read
 
 **Early exit**: Skip LLM ONLY when ALL conditions are met:
 1. Score ≥ 90 (not 80 — per-check caps hide warnings at 80)
@@ -56,14 +74,20 @@ the code is semantically correct.
 
 Do NOT read the full diff. Do NOT read every changed file.
 
-Read ONLY:
-1. Files that appear in INVESTIGATE findings (just the changed hunks via `git diff main...HEAD -- <file>`)
-2. New files (CKB has no history for these) — but only if <500 lines each
-3. Skip generated files, test files for existing tests, and config/CI files
+**For files CKB flagged (INVESTIGATE list):**
+Read only the changed hunks via `git diff main...HEAD -- <file>`.
+
+**For new files** (CKB has no history — these are your biggest blind spot):
+- If it's a new package/module: read the entry point and types/interfaces first,
+  then follow references to understand the architecture before reading individual files
+- If < 500 lines: read the file
+- If ≥ 500 lines: read the first 100 lines (types/imports) + functions CKB flagged
+- Skip generated files, test files for existing tests, and config/CI/docs files
 
-For each file you read, look for exactly:
+**For each file you read, look for exactly:**
 - Logic errors (wrong condition, off-by-one, nil deref, race condition)
-- Security issues (injection, auth bypass, secrets CKB's 26 patterns missed)
+- Resource leaks (file handles, connections, goroutines not closed on error paths)
+- Security issues (injection, auth bypass, secrets CKB's patterns missed)
 - Design problems (wrong abstraction, leaky interface, coupling metrics don't catch)
 - Missing edge cases the tests don't cover
 - Incomplete refactoring (callers that should have changed but didn't)
@@ -78,6 +102,11 @@ CKB already checked these structurally.
 
 [One sentence: what the PR does]
 
+[If CKB provided narrative, include it here]
+
+**PR tier:** [small/medium/large] | **Review effort:** [N]h ([complexity])
+**Health:** [N] degraded, [N] improved
+
 ### Issues
 1. **[must-fix|should-fix]** `file:line` — [issue in one sentence]
 2. ...
@@ -87,6 +116,9 @@ CKB already checked these structurally.
 
 ### CKB flagged (verified above)
 [for each warn/fail finding: confirmed/false-positive + one-line reason]
+
+### Suggested reviewers
+[reviewer — expertise area]
 ```
 
 If no issues found: just the header line + CKB passed list. Nothing else.
@@ -95,10 +127,12 @@ If no issues found: just the header line + CKB passed list. Nothing else.
 
 - Reading files CKB marked as pass → waste
 - Reading generated files → waste
-- Summarizing what the PR does in detail → waste (git log exists)
+- Summarizing what the PR does in detail → waste (git log exists, CKB has narrative)
 - Explaining why passed checks passed → waste
 - Running MCP drill-down tools when CLI already gave enough signal → waste
 - Reading test files to "verify test quality" → waste unless CKB flagged test-gaps
 - Reading hotspot-only files with no findings → high churn ≠ needs review right now
 - Trusting score >= 80 as "safe to skip" → dangerous (per-check caps hide warnings)
 - Skipping new files because CKB didn't flag them → CKB has no SCIP data for new files
+- Reading every new file in a large new package → read entry point + types first, then follow refs
+- Ignoring reviewEffort/prTier → these tell you how thorough to be
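
The SKIP / INVESTIGATE / READ split from Phase 1 of review.md might be derived from the JSON along these lines. This is a sketch under assumptions: `checks`, `findings`, `severity`, `file`, `message`, and `ruleId` are named in the command text, while the `name` and `status` keys on each check, the status values `pass`/`warn`/`fail`, the severity values `error`/`warning`/`info`, and all sample paths and rule IDs are hypothetical.

```python
def triage(report):
    """Split a review report into the three lists used to scope the LLM review."""
    checks = report.get("checks", [])
    # SKIP: passed checks, do not revisit these topics.
    skip = [c["name"] for c in checks if c["status"] == "pass"]
    # INVESTIGATE: warned or failed checks define the review scope.
    investigate = [c["name"] for c in checks if c["status"] in ("warn", "fail")]
    # READ: only files that carry a warning- or error-level finding.
    read = sorted(
        {f["file"] for f in report.get("findings", [])
         if f["severity"] in ("warning", "error")}
    )
    return {"skip": skip, "investigate": investigate, "read": read}


# Hypothetical report shaped like the fields listed in Phase 1.
report = {
    "score": 74,
    "verdict": "warn",
    "checks": [
        {"name": "secrets", "status": "pass"},
        {"name": "bug-patterns", "status": "warn"},
        {"name": "breaking", "status": "fail"},
    ],
    "findings": [
        {"severity": "warning", "file": "internal/query/engine.go",
         "message": "possible discarded error", "ruleId": "discarded-error"},
        {"severity": "error", "file": "internal/api/handler.go",
         "message": "exported signature changed", "ruleId": "breaking-change"},
        {"severity": "info", "file": "docs/usage.md",
         "message": "comment drift", "ruleId": "comment-drift"},
    ],
}

lists = triage(report)
print(lists)
```

Hotspot-only files never enter the READ list here, which matches the change in this diff that drops "hotspot files" from that list.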
@@ -2,6 +2,134 @@
 
 All notable changes to CKB will be documented in this file.
 
+## [8.3.0] - 2026-03-27
+
+### Added
+
+#### Compliance Audit (`ckb audit compliance`)
+Full regulatory compliance auditing with 131 checks across 20 frameworks:
+
+```bash
+ckb audit compliance --framework=gdpr,iso27001    # Specific frameworks
+ckb audit compliance --framework=all              # All 20 frameworks
+ckb audit compliance --recommend                  # Auto-detect applicable frameworks
+ckb audit compliance --framework=gdpr --ci        # CI mode with exit codes
+```
+
+**20 frameworks:** GDPR, CCPA, ISO 27701, EU AI Act, ISO 27001, NIST 800-53, OWASP ASVS, SOC 2, PCI DSS, HIPAA, DORA, NIS2, FDA 21 CFR Part 11, EU CRA, SBOM/SLSA, DO-178C, IEC 61508, ISO 26262, MISRA C, IEC 62443.
+
+**Cross-framework mapping:** A single finding (e.g., hardcoded credential) automatically surfaces all applicable regulations with specific clause references and CWE IDs.
+
+**Framework recommendation (`--recommend`):** Scans codebase for indicators (HTTP handlers, PII fields, database imports, payment SDKs) and recommends applicable frameworks with confidence scores.
+
+**Output formats:** human, json, markdown, sarif.
+
+**MCP tool:** `auditCompliance` — runs compliance audit via MCP using the persistent SCIP index.
+
+#### MCP Tools: `listSymbols` and `getSymbolGraph`
+
+**`listSymbols`** — Bulk symbol listing without search query:
+```
+listSymbols(scope: "src/services/", kinds: ["function"], minLines: 30, sortBy: "complexity")
+```
+Returns complete symbol inventory with body ranges (`lines`, `endLine`) and complexity metrics (`cyclomatic`, `cognitive`). Replaces exploring 40 files one-by-one.
+
+**`getSymbolGraph`** — Batch call graph for multiple symbols:
+```
+getSymbolGraph(symbolIds: [...30], depth: 1, direction: "callers")
+```
+Returns deduplicated nodes and edges with complexity per node. One call replaces 30 serial `getCallGraph` calls.
+
+#### `searchSymbols` Enhancements
+
+- **Complexity metrics:** Results now include `lines`, `cyclomatic`, `cognitive` per symbol via tree-sitter enrichment
+- **Server-side filtering:** `minLines`, `minComplexity`, `excludePatterns` params — filter 80% of noise server-side instead of client-side
+- **`batchGet` with `includeCounts`:** Returns `referenceCount`, `callerCount`, `calleeCount` per symbol (parallel SCIP lookups)
+
+#### Symbol Body Ranges (`startLine`, `endLine`, `lines`)
+
+`searchSymbols`, `explore` keySymbols, and `getSymbolGraph` now return full body ranges via tree-sitter enrichment. Consumers no longer need to read source files for brace-matching.
+
+#### Explore keySymbols Improvements
+
+- Functions rank above struct fields (behavioral analysis priority)
+- Tree-sitter supplement fills in functions when SCIP returns only types
+- Per-symbol `cyclomatic` and `cognitive` complexity
+
+#### `getFileComplexity` in Refactor Preset
+
+Previously only available in `full` preset (96 tools). Now in `refactor` (39 tools).
+
+### Fixed
+
+#### Bug-Pattern False Positives (42 → 0)
+- **defer-in-loop:** Recognize `func(){}()` closure pattern as correct (defer fires per iteration)
+- **discarded-error:** Skip closure bodies in IIFE patterns; add `singleReturnNew` allowlist (NewScanner, NewReader, etc.); add `noErrorMethods` (Scan, WriteHeader, WriteJSON, WriteError, BadRequest, NotFound, InternalError)
+- **missing-defer-close:** Remove NewReader/NewWriter from resource-opening functions (bufio wrappers don't need Close)
+- **nil-after-deref:** 30-line gap threshold filters cross-scope false matches
+- **shadowed-err:** Only flag when outer `err` is standalone function-body-level `:=`; treat if/for/switch initializer `:=` as scoped
+
+All fixes use `FindNodesSkipping` — scope-aware tree-sitter node search that stops recursion at `func_literal` boundaries.
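
The idea behind a scope-aware search that stops at `func_literal` boundaries can be illustrated on a toy tree. The sketch below is not CKB's `FindNodesSkipping` (which is Go code operating on tree-sitter nodes); the `Node` class and the Python function `find_nodes_skipping` are hypothetical stand-ins. Only the node type names (`func_literal`, `defer_statement`, and so on) follow the tree-sitter Go grammar.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy syntax-tree node: a type name plus child nodes."""
    type: str
    children: list = field(default_factory=list)


def find_nodes_skipping(root, want, skip):
    """Collect descendants of type `want`, never descending into `skip` subtrees."""
    found = []

    def visit(node):
        for child in node.children:
            if child.type in skip:
                continue  # a nested closure is a separate scope, so ignore it
            if child.type == want:
                found.append(child)
            visit(child)

    visit(root)
    return found


# A function with one top-level defer and one defer inside a closure in a loop.
inner_defer = Node("defer_statement")
closure = Node("func_literal", [Node("block", [inner_defer])])
loop = Node("for_statement", [Node("block", [Node("call_expression", [closure])])])
outer_defer = Node("defer_statement")
func = Node("function_declaration", [Node("block", [outer_defer, loop])])

scoped = find_nodes_skipping(func, "defer_statement", {"func_literal"})
unscoped = find_nodes_skipping(func, "defer_statement", set())
print(len(scoped), len(unscoped))
```

With the skip set in place, the defer inside the closure is not attributed to the enclosing loop, which is the kind of cross-scope match the defer-in-loop and discarded-error fixes are described as avoiding.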
+
+#### Secrets Scanner
+- Shell variable interpolation (`${VAR:-default}`, `${VAR:?error}`) in Docker Compose URLs no longer flagged as password_in_url
+- Shell environment leak: `env -i` wrapper prevents user profile (.zshrc) from corrupting subprocess output
+
+#### Test-Gap Detection
+- `vi.mock`/`jest.mock` module-level mocking recognized — functions covered by module mocks no longer flagged
+- Barrel/re-export files (`export * from '...'`) skipped — pure re-exports have no logic to test
+
+#### Coupling Check
+- Expanded noise filter: test files, dependency manifests (go.mod, package.json), documentation, generated directories (dist/, build/, l10n/, __generated__/)
+- Generated file suffixes: .pb.go, .pb.h, .pb.cc, .pb.ts, _grpc.pb.go, _pb2.py, .g.dart, .freezed.dart, .mocks.dart, _string.go, wire_gen.go, _mock.go, .bundle.js, .arb, .d.ts
+- Flutter l10n false positive fixed (#185): .arb files excluded from coupling analysis
+
+#### Compliance Audit FP Reduction (11,356 → ~50 findings)
+- Deep-nesting: threshold 4→6, reset at function boundaries, 3-per-file cap
+- Dead-code: skip Go files (handled by AST-based bug-patterns)
+- Dynamic-memory: skip garbage-collected languages
+- Global-state: exclude regexp.MustCompile, errors.New, sync primitives
+- Swallowed-errors: remove overly broad `_ = obj.Method()` pattern
+- Eval-injection: skip Go and .github/ directories
+- Insecure-random: inline import scanning for crypto/rand vs math/rand; skip import lines
+- Path-traversal: skip filepath.Join, HasPrefix comparisons, testdata/
+- Non-FIPS-crypto: skip strings.Contains pattern matching
+- SQL injection (PCI DSS): add parameterized query detection, #nosec support
+- TODO detection: case-sensitive TEMP, skip "Stub:/Placeholder:/Note:" comments, require comment context
+
+#### FTS Empty Query Bug
+`FTS.Search("")` returned empty results (early return for empty query). Added `listAll()` method that queries `symbols_fts_content` directly. Fixes `listSymbols` and `searchSymbols("")` returning 0 on MCP.
+
+#### MCP Server Warmup
+Changed warmup from `SearchSymbols("", 1)` (cached empty results before SCIP loaded) to `RefreshFTS()` (populates FTS from SCIP without caching search results).
+
+#### IEC 61508 Tree-Sitter Crash
+`complexityExceededCheck` bypassed thread-safe `AnalyzeFileComplexity()` wrapper, calling `ComplexityAnalyzer.AnalyzeFile()` directly — SIGABRT when concurrent checks hit CGO.
+
+#### Daemon API Endpoints (7 stubs → implementations)
+- Schedule list/detail/cancel via scheduler.ListSchedules()
+- Repo list/detail via repos.LoadRegistry()
+- Federation list/detail via federation.List()/LoadConfig()
+- CLI daemon status: HTTP health query with version/uptime display
+
+#### Query Engine Stubs (4 → implementations)
+- Ownership refresh: CODEOWNERS parsing + git-blame analysis
+- Hotspot refresh: git churn data with 90-day window
+- Responsibility refresh: module responsibility extraction
+- Ownership history: storage table query
+
+### Changed
+- Score calculation: floor is 0 (not 20), per-rule deduction cap of 10 documented
+- `LikelyReturnsError`: removed "Scan" from error patterns, added `singleReturnNew` and `noErrorMethods` maps
+- Generated file detection: 20+ new patterns (protobuf, Go generators, Dart/Flutter, GraphQL, bundlers)
+- Per-check findings cap (50 max) in compliance engine
+- Compliance config: `DefaultDaemonPort` constant replaces hardcoded 9120
+
+### Performance
+- `batchGet` with `includeCounts`: parallel reference/caller/callee lookups (10-concurrent semaphore)
+- FTS multiplier: 2x → 10x when filters active (handles SCIP struct field flooding)
+- MCP index warmup: background `RefreshFTS()` on engine init
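
The "10-concurrent semaphore" pattern mentioned for `batchGet` is a common way to bound parallel lookups. A rough Python analogue using `asyncio.Semaphore` is shown below. CKB itself is Go, so this is only an illustration of the pattern: `batch_counts`, `fake_lookup`, and the symbol IDs are hypothetical, and only the limit of 10 and the `referenceCount` field name come from the changelog.

```python
import asyncio


async def batch_counts(symbol_ids, lookup, limit=10):
    """Run `lookup` for every symbol, with at most `limit` lookups in flight."""
    sem = asyncio.Semaphore(limit)

    async def one(symbol_id):
        async with sem:  # blocks while `limit` lookups are already running
            return symbol_id, await lookup(symbol_id)

    pairs = await asyncio.gather(*(one(s) for s in symbol_ids))
    return dict(pairs)


# Fake lookup that records the peak number of concurrent calls.
state = {"in_flight": 0, "peak": 0}


async def fake_lookup(symbol_id):
    state["in_flight"] += 1
    state["peak"] = max(state["peak"], state["in_flight"])
    await asyncio.sleep(0.001)  # stand-in for an index query
    state["in_flight"] -= 1
    return {"referenceCount": len(symbol_id)}


result = asyncio.run(batch_counts([f"sym{i}" for i in range(30)], fake_lookup))
print(len(result), "symbols, peak concurrency", state["peak"])
```

The semaphore caps load on the index while still letting independent lookups overlap, which is the trade-off the changelog entry describes.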
+
 ## [8.2.0] - 2026-03-21
 
 ### Added
 
@@ -53,6 +53,13 @@ golangci-lint run
 ./ckb review --base=develop --format=markdown
 ./ckb review --checks=breaking,secrets,health --ci
 
+# Run compliance audit (131 checks across 20 regulatory frameworks)
+./ckb audit compliance --framework=gdpr
+./ckb audit compliance --framework=gdpr,iso27001,owasp-asvs
+./ckb audit compliance --framework=all --min-confidence=0.7 --format=sarif
+./ckb audit compliance --framework=pci-dss,hipaa --ci --fail-on=error
+./ckb audit compliance --framework=iec61508 --sil-level=3
+
 # Auto-configure AI tool integration (interactive)
 ./ckb setup
 
@@ -92,7 +99,7 @@ ckb setup --tool=cursor --global
 claude mcp add ckb -- npx @tastehub/ckb mcp
 ```
 
-`ckb setup --tool=claude-code` also installs the `/ckb-review` slash command for Claude Code, which orchestrates CKB's structural analysis with LLM semantic review.
+`ckb setup --tool=claude-code` also installs the `/ckb-review` and `/ckb-audit` slash commands for Claude Code, which orchestrate CKB's structural analysis with LLM semantic review.
 
 ### Key MCP Tools
 
@@ -162,6 +169,7 @@ Storage Layer (internal/storage/) - SQLite for caching and symbol mappings
 - **internal/coupling/**: Co-change analysis from git history.
 - **internal/streaming/**: SSE streaming infrastructure for long-running MCP operations.
 - **internal/envelope/**: Response metadata (ConfidenceFactor, CacheInfo) for AI transparency.
+- **internal/compliance/**: Regulatory compliance auditing (131 checks, 20 frameworks). Each framework is a subpackage (gdpr/, iso27001/, owaspasvs/, etc.) with checks that map findings to regulation articles.
 
 ### Data Flow