CLAUDE.md (+39 lines changed: 39 additions & 0 deletions)
@@ -184,6 +184,44 @@ Published to GHCR (`ghcr.io/nextlevelbuilder/goclaw`) and Docker Hub (`digitop/g
 - **Tool gating:** `TeamActionPolicy` in `internal/tools/team_action_policy.go`: lite blocks comment/review/approve/reject/attach/ask_user. `skill_manage`/`publish_skill` not registered in lite
 - **File serving:** 2-layer path isolation in `internal/http/files.go`: workspace boundary (all editions) + tenant scope (standard only with RBAC)

+## Plan Verification Rules
+
+Apply before finalizing any multi-phase plan. Trust-but-verify between scout → planner → final plan.
+
+### Verification discipline (what to verify)
+
+1. **Verify factual claims against code**: re-grep/re-count every number, path, endpoint. Don't copy from scout summaries.
+2. **Trace semantics, not just cite lines**: when plan references existing/upstream code, identify WHEN each field mutates and under WHAT conditions. Line-range citation without control-flow trace = how ports silently invert behavior. Check: every call, or specific branches only?
+3. **No fabricated identifiers / API families**: every symbol in plan must cite `file:line`. RED FLAGS: plausible-sounding wrappers (`Keyring`, `Validator`, `Manager`), centralized packages (`internal/security`, `internal/auth`) that may be scattered, OTel-style (`StartSpan/EndSpan`) when codebase is emit-based. When unsure, `go doc <pkg>` lists actual exported surface. Apply especially when plan says "reuse existing X".
+4. **Struct scope audit before adding state**: verify lifetime (per-request/session/agent/process) before adding a field to an existing struct. "Plausibly per-X" is a red flag: grep construction + ownership. Shared-instance state leaks across isolation boundaries.
+5. **Gate-premise test math**: before asserting "feature X triggers independently of Y", list all early-returns from function entry to X. Math-verify any fixture claiming "X without Y".
+6. **Port = config-shape match**: "faithful port" divergences in config field name/type are silent breaking changes for users copying upstream config. Match upstream shape, or explicitly flag each divergence with rationale in the phase file.
+7. **Verify external API endpoints via `docs-seeker`** before writing endpoint into plan. Sibling APIs often use different roots.
+
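Rule 5 is easier to apply with a concrete shape in mind. The following is a minimal sketch, not code from this repository: `pruneConfig`, `stagesRun`, and the two limits are hypothetical names chosen for illustration. It writes out every early return between function entry and the "hard-clear" stage, then checks a fixture that claims hard-clear can fire without the soft limit being reached.

```go
package main

import "fmt"

// pruneConfig is a hypothetical config shape used only for this illustration.
type pruneConfig struct {
	SoftTrim  bool
	HardClear bool
}

// stagesRun reports which stages would fire for a given token count.
// Each early return is listed explicitly so the gates can be enumerated.
func stagesRun(c pruneConfig, tokens, softLimit, hardLimit int) []string {
	if !c.SoftTrim && !c.HardClear {
		return nil // gate 1: nothing enabled
	}
	if tokens < softLimit {
		return nil // gate 2: below the soft limit, so hard-clear is unreachable too
	}
	var ran []string
	if c.SoftTrim {
		ran = append(ran, "soft-trim")
	}
	if c.HardClear && tokens >= hardLimit {
		ran = append(ran, "hard-clear")
	}
	return ran
}

func main() {
	cfg := pruneConfig{HardClear: true}
	// Fixture claiming "hard-clear without reaching the soft limit":
	// tokens=50 passes hardLimit=40 but not softLimit=100, so gate 2 returns first.
	fmt.Println(stagesRun(cfg, 50, 100, 40))
	// Once the soft limit is reached, hard-clear can fire on its own.
	fmt.Println(stagesRun(cfg, 120, 100, 40))
}
```

In this sketch the first fixture yields no stages, which is the kind of result that would contradict a plan stating that hard-clear "triggers independently" of the soft limit.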
+### Scope & coverage (where to look)
+
+8. **Grep delete scope deep**: `grep -rn '<symbol>' .` across the whole repo. Stubs often have refs in catalogs/routing/switch cases. Enumerate ALL sites in todo.
+9. **Signature-change callers enumeration**: grep + list all callers explicitly. "Update all callers" is insufficient.
+10. **Alias/shim coverage**: enumerate ALL exported symbols via `go doc <pkg>`. Add compile-time signature guards.
+11. **Scout desktop and web separately**: `ui/desktop/frontend/` ≠ `ui/web/`. Different structure, i18n namespaces, test framework presence.
+
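For rule 10, one common Go idiom for a compile-time signature guard is a blank-identifier assignment. The sketch below assumes a hypothetical shim (`Join` aliasing `legacyJoin`) and a hypothetical `Store` interface; none of these names come from the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// legacyJoin stands in for the original exported function (hypothetical).
func legacyJoin(parts []string, sep string) string {
	return strings.Join(parts, sep)
}

// Join is the alias/shim kept so that old call sites continue to compile (hypothetical).
var Join = legacyJoin

// Compile-time signature guard: the build fails if Join ever stops
// having exactly this function type.
var _ func([]string, string) string = Join

// Store is a hypothetical interface that a shim type must keep satisfying.
type Store interface {
	Get(key string) (string, bool)
}

// memStore is a hypothetical implementation backed by a map.
type memStore map[string]string

func (m memStore) Get(key string) (string, bool) {
	v, ok := m[key]
	return v, ok
}

// Compile-time interface guard: the build fails if memStore drops or changes Get.
var _ Store = memStore(nil)

func main() {
	fmt.Println(Join([]string{"a", "b"}, "-"))
}
```

The guards cost nothing at runtime; they only turn a silent signature drift into a build error.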
+### Phasing & ordering (when)
+
+12. **Re-scout on scope change**: if phase promotes from deferred → active, re-scout. Don't reuse brainstorm summary.
+13. **Cross-phase gates explicit**: "Phase N-1 merged + tests green" in phase Context. Execution order alone ≠ enforcement.
+14. **Zero-coverage characterization test = blocker step**: write byte/request-body fixture test BEFORE migration. Not "recommended".
+15. **i18n keys ordering**: add key + 3 catalogs as explicit todo step BEFORE handler code. Missing key = runtime crash.
+
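A characterization test in the sense of rule 14 can be as small as a byte comparison against a captured fixture. The sketch below is an assumption-laden illustration: `chatRequest`, `buildBody`, and the fixture contents are hypothetical, and the only real API used is `encoding/json` and `bytes` from the standard library.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// chatRequest is a hypothetical request shape, not taken from the repository.
type chatRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
}

// buildBody is the code path about to be migrated (hypothetical).
func buildBody(model, prompt string) ([]byte, error) {
	return json.Marshal(chatRequest{Model: model, Prompt: prompt})
}

// goldenBody is the byte fixture captured from the pre-migration behavior.
var goldenBody = []byte(`{"model":"m1","prompt":"hi"}`)

// matchesGolden reports whether the produced body is byte-identical to the fixture.
func matchesGolden(got []byte) bool {
	return bytes.Equal(got, goldenBody)
}

func main() {
	got, err := buildBody("m1", "hi")
	if err != nil {
		panic(err)
	}
	fmt.Println("byte-identical to fixture:", matchesGolden(got))
}
```

Pinning the bytes before the migration starts means any change in field order, naming, or escaping shows up as a failed comparison rather than as a behavior change discovered later.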
+### Conventions & finalization
+
+16. **Context key style convention**: check existing `context.go` pattern before introducing new key types. Mixed = code smell.
+17. **Verify pass MANDATORY after rewrite**: spawn fresh Explore/grep to audit planner output. Don't trust self-validation.
+
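Rule 16 does not say which key style the existing `context.go` uses, so the following only illustrates one widely used Go convention (an unexported key type with accessor functions). `ctxKey`, `tenantIDKey`, `WithTenantID`, `TenantIDFrom`, and `rootCtx` are hypothetical names; check the real file before copying the pattern.

```go
package main

import (
	"context"
	"fmt"
)

// ctxKey is an unexported key type, so other packages cannot collide with it.
// (Hypothetical: the repository's context.go may use a different style.)
type ctxKey string

const tenantIDKey ctxKey = "tenant_id"

// rootCtx is the base context used by this example.
var rootCtx = context.Background()

// WithTenantID returns a child context carrying the tenant ID.
func WithTenantID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, tenantIDKey, id)
}

// TenantIDFrom reads the tenant ID back; ok is false when it was never set.
func TenantIDFrom(ctx context.Context) (id string, ok bool) {
	id, ok = ctx.Value(tenantIDKey).(string)
	return id, ok
}

func main() {
	ctx := WithTenantID(rootCtx, "t-1")
	id, ok := TenantIDFrom(ctx)
	fmt.Println(id, ok)

	// A plain string key with the same text has a different type,
	// so it does not find the value stored under ctxKey.
	fmt.Println(ctx.Value("tenant_id") == nil)
}
```

Mixing this style with bare string keys in the same package is the kind of inconsistency the rule flags, because the two never see each other's values.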
+**Pattern to avoid:** user asks → planner writes → report "done".
+**Red-team practice:** After planner completes, run `code-reviewer`/`brainstormer` in audit mode: "spot-check 15+ claims vs live codebase". Past catches: fabricated `crypto.Keyring`/`tracing.StartSpan` (agent-hooks plan); inverted TS-port semantics + wrong struct scope + misread early-return gate (context-pruning plan). See `plans/*/reports/audit-*.md` for concrete examples.
+
 ## Post-Implementation Checklist

 After implementing or modifying Go code, run these checks:
@@ -207,6 +245,7 @@ Go conventions to follow:
 - **DB query reuse:** Before adding a new DB query for key entities (teams, agents, sessions, users), check if the same data is already fetched earlier in the current flow/pipeline. Prefer passing resolved data through context, event payloads, or function params rather than re-querying. Duplicate queries waste DB resources and add latency
 - **Solution design:** When designing a fix or feature, identify the root cause first; don't just patch symptoms. Think through production scenarios (high concurrency, multi-tenant isolation, failure cascades, long-running sessions) to ensure the solution holds up. Prefer explicit configuration over runtime heuristics. Prefer the simplest solution that addresses the root cause directly
 - **Tenant-scope guards on admin writes:** `RoleAdmin` is not a tenant check. Writes to **global** tables (no `tenant_id` column, e.g. `builtin_tools`, disk config, package mgmt) must gate with `http.requireMasterScope` / WS `requireMasterScope(requireOwner(...))`. Writes to **tenant-scoped** tables must gate with `http.requireTenantAdmin` + SQL `WHERE tenant_id = $N`. Shared predicate: `store.IsMasterScope(ctx)`. See `CONTRIBUTING.md` → "Tenant-scope guards" for the full decision table and anti-patterns.
+- **Skip load / stress / benchmark tests.** Do NOT write throughput benchmarks, p95/p99 latency assertions, or `runtime.ReadMemStats`-based memory-leak tests for regular feature work. They flake on shared CI runners, waste runner time, and rarely catch real bugs. Only add load tests when explicitly requested for a specific investigation. For normal "prove it works" coverage, use unit + integration + chaos tests.
-.PHONY: build build-full build-tui run clean version up down logs reset test vet check-web dev migrate setup ci desktop-dev desktop-build desktop-dmg
+.PHONY: build build-full build-tui run clean version up down logs reset test vet check-web dev migrate setup ci desktop-dev desktop-build desktop-dmg test-hooks test-hooks-unit test-hooks-e2e test-hooks-chaos test-hooks-rbac test-hooks-tracing

 # Build backend only (API-only, no embedded web UI)
 build:
@@ -94,6 +94,26 @@ test-scenarios:
 # Critical tests (P0 + P1) - run before merge
 test-critical: test-invariants test-contracts

+# ── Agent Hooks targets (phase 4) ──
+# Requires TEST_DATABASE_URL pointing at a pgvector:pg18 container on :5433
+test-hooks-unit:
+	go test -race ./internal/hooks/... ./internal/gateway/methods/
+
+test-hooks-e2e:
+	go test -race -timeout=180s -tags integration -run "TestHooksE2E" ./tests/integration/
+
+test-hooks-chaos:
+	go test -race -timeout=180s -tags integration -run "TestHooksChaos" ./tests/integration/
+
+test-hooks-rbac:
+	go test -race -timeout=90s -tags integration -run "TestHooksRBAC" ./tests/integration/
+
+test-hooks-tracing:
+	go test -race -timeout=90s -tags integration -run "TestHooksTracing" ./tests/integration/
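The comment above says the hooks targets need `TEST_DATABASE_URL`. How the repository's tests react when it is missing is not shown in this diff, so the following is only a sketch of one possible guard: `integrationDSN` is a hypothetical helper, and the lookup function is injected so the check can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// integrationDSN returns the database URL for integration tests and whether
// it is usable. The lookup parameter has the same signature as os.LookupEnv.
// (Hypothetical helper; not taken from the repository.)
func integrationDSN(lookup func(string) (string, bool)) (string, bool) {
	dsn, ok := lookup("TEST_DATABASE_URL")
	if !ok || dsn == "" {
		return "", false
	}
	return dsn, true
}

func main() {
	if dsn, ok := integrationDSN(os.LookupEnv); ok {
		fmt.Println("integration tests would run against", dsn)
	} else {
		fmt.Println("TEST_DATABASE_URL not set; integration tests would be skipped")
	}
}
```

In a test file built with `-tags integration`, the same check could feed `t.Skip` so that a missing database yields a skipped test instead of a connection error.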
{Name: "stt", DisplayName: "Speech-to-Text", Description: "Transcribe voice/audio messages to text using ElevenLabs Scribe or a proxy service", Category: "media", Enabled: true,