diff --git a/.gitignore b/.gitignore
Lines changed: 1 addition & 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 29 additions & 0 deletions
diff --git a/CLAUDE.md b/CLAUDE.md
Lines changed: 39 additions & 0 deletions
diff --git a/Dockerfile b/Dockerfile
Lines changed: 4 additions & 2 deletions
diff --git a/Makefile b/Makefile
Lines changed: 21 additions & 1 deletion
diff --git a/cmd/gateway.go b/cmd/gateway.go
Lines changed: 23 additions & 6 deletions
diff --git a/cmd/gateway_agents.go b/cmd/gateway_agents.go
Lines changed: 57 additions & 0 deletions
diff --git a/cmd/gateway_announce_queue.go b/cmd/gateway_announce_queue.go
Lines changed: 2 additions & 0 deletions
diff --git a/cmd/gateway_builtin_tools.go b/cmd/gateway_builtin_tools.go
Lines changed: 4 additions & 0 deletions
@@ -88,4 +88,5 @@ compose.d/*
 /mise.*.toml
 /mise
 /.mise
+*.test
 goclaw-patched-linux-amd64
@@ -2,6 +2,35 @@
 
 All notable changes to GoClaw are documented here. For full documentation, see [docs.goclaw.sh](https://docs.goclaw.sh).
 
+## Unreleased
+
+### Breaking Changes
+
+- **Context pruning now opt-in.** Previously tool-result trimming ran by default
+  for all providers; now requires explicit `contextPruning.mode: "cache-ttl"` in
+  `config.agents.defaults` to enable. Matches upstream TS design and prevents
+  silent prompt-cache invalidation on Anthropic.
+
+  Migration — add to `config.json5`:
+  ```json5
+  agents: {
+    defaults: {
+      contextPruning: { mode: "cache-ttl" }
+    }
+  }
+  ```
+
+### Improvements
+
+- **Context pruning cleanup.** Removed redundant Pass 0 (per-result 30% guard),
+  deduplicated double prune call per iteration, added SanitizeHistory to
+  PruneStage for broken tool_use/tool_result pair cleanup.
+- **Context pruning config backfill (migration).** Agents with existing custom
+  `context_pruning` config (e.g., `softTrimRatio`, `keepLastAssistants`) but
+  missing a `mode` field get auto-backfilled with `mode: "cache-ttl"` to
+  preserve their intent after the opt-in flip. Rows with NULL config stay
+  NULL (new opt-in default applies). PG migration 51; SQLite schema v19.
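The backfill rule in the entry above can be sketched in Go as follows. This is an illustration only, not the migration's actual code (the real change ships as PG migration 51 and SQLite schema v19). The helper name `backfillMode`, the generic JSON map, and the decision to leave empty objects untouched are all assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// backfillMode is a hypothetical helper illustrating the backfill rule:
//   - a nil (NULL) config stays nil, so the new opt-in default applies;
//   - a config that already has "mode" is returned unchanged;
//   - a custom config without "mode" gets mode "cache-ttl".
func backfillMode(raw []byte) ([]byte, error) {
	if raw == nil {
		return nil, nil
	}
	var cfg map[string]any
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	// Assumption: JSON null and empty objects carry no custom intent.
	if len(cfg) == 0 {
		return raw, nil
	}
	if _, ok := cfg["mode"]; ok {
		return raw, nil
	}
	cfg["mode"] = "cache-ttl"
	return json.Marshal(cfg) // map keys are emitted in sorted order
}

func main() {
	out, err := backfillMode([]byte(`{"softTrimRatio":0.5}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"mode":"cache-ttl","softTrimRatio":0.5}
}
```

Rows that already declare a `mode` pass through byte-for-byte, which keeps the sketch idempotent if it were run twice.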
+
 ## Project Status
 
 ### Implemented & Tested in Production
 
@@ -184,6 +184,44 @@ Published to GHCR (`ghcr.io/nextlevelbuilder/goclaw`) and Docker Hub (`digitop/g
 - **Tool gating:** `TeamActionPolicy` in `internal/tools/team_action_policy.go` — lite blocks comment/review/approve/reject/attach/ask_user. `skill_manage`/`publish_skill` not registered in lite
 - **File serving:** 2-layer path isolation in `internal/http/files.go` — workspace boundary (all editions) + tenant scope (standard only with RBAC)
 
+## Plan Verification Rules
+
+Apply before finalizing any multi-phase plan. Trust-but-verify between scout → planner → final plan.
+
+### Verification discipline (what to verify)
+
+1. **Verify factual claims against code** — re-grep/re-count every number, path, endpoint. Don't copy from scout summaries.
+2. **Trace semantics, not just cite lines** — when plan references existing/upstream code, identify WHEN each field mutates and under WHAT conditions. Line-range citation without control-flow trace = how ports silently invert behavior. Check: every call, or specific branches only?
+3. **No fabricated identifiers / API families** — every symbol in plan must cite `file:line`. RED FLAGS: plausible-sounding wrappers (`Keyring`, `Validator`, `Manager`), centralized packages (`internal/security`, `internal/auth`) that may be scattered, OTel-style (`StartSpan/EndSpan`) when codebase is emit-based. When unsure, `go doc <pkg>` lists actual exported surface. Apply especially when plan says "reuse existing X".
+4. **Struct scope audit before adding state** — verify lifetime (per-request/session/agent/process) before adding a field to an existing struct. "Plausibly per-X" is a red flag — grep construction + ownership. Shared-instance state leaks across isolation boundaries.
+5. **Gate-premise test math** — before asserting "feature X triggers independently of Y", list all early-returns from function entry to X. Math-verify any fixture claiming "X without Y".
+6. **Port = config-shape match** — "faithful port" divergences in config field name/type are silent breaking changes for users copying upstream config. Match upstream shape, or explicitly flag each divergence with rationale in the phase file.
+7. **Verify external API endpoints via `docs-seeker`** — before writing endpoint into plan. Sibling APIs often use different roots.
+
+### Scope & coverage (where to look)
+
+8. **Grep delete scope deep** — `grep -rn '<symbol>' .` whole repo. Stubs often have refs in catalogs/routing/switch cases. Enumerate ALL sites in todo.
+9. **Signature-change callers enumeration** — grep + list all callers explicitly. "Update all callers" insufficient.
+10. **Alias/shim coverage** — enumerate ALL exported symbols via `go doc <pkg>`. Add compile-time signature guards.
+11. **Scout desktop and web separately** — `ui/desktop/frontend/` ≠ `ui/web/`. Different structure, i18n namespaces, test framework presence.
+
+### Phasing & ordering (when)
+
+12. **Re-scout on scope change** — if phase promotes from deferred → active, re-scout. Don't reuse brainstorm summary.
+13. **Cross-phase gates explicit** — "Phase N-1 merged + tests green" in phase Context. Execution order alone ≠ enforcement.
+14. **Zero-coverage characterization test = blocker step** — write byte/request-body fixture test BEFORE migration. Not "recommended".
+15. **i18n keys ordering** — add key + 3 catalogs as explicit todo step BEFORE handler code. Missing key = runtime crash.
+
+### Conventions & finalization
+
+16. **Context key style convention** — check existing `context.go` pattern before introducing new key types. Mixed = code smell.
+17. **Verify pass MANDATORY after rewrite** — spawn fresh Explore/grep to audit planner output. Don't trust self-validation.
+
+**Pattern to avoid:** user asks → planner writes → report "done".
+**Safer pattern:** user asks → scout → planner writes → audit-verify → report.
+
+**Red-team practice:** After planner completes, run `code-reviewer`/`brainstormer` in audit mode: "spot-check 15+ claims vs live codebase". Past catches: fabricated `crypto.Keyring`/`tracing.StartSpan` (agent-hooks plan); inverted TS-port semantics + wrong struct scope + misread early-return gate (context-pruning plan). See `plans/*/reports/audit-*.md` for concrete examples.
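The "compile-time signature guards" mentioned in rule 10 can be as small as a blank-identifier assignment. A minimal sketch follows; `Parse` and `legacyParse` are hypothetical names chosen for illustration, not symbols from this repository.

```go
package main

import (
	"fmt"
	"strconv"
)

// Parse stands in for the new implementation (hypothetical).
func Parse(s string) (int, error) { return strconv.Atoi(s) }

// legacyParse is the alias/shim kept for existing callers (hypothetical).
var legacyParse = Parse

// Compile-time guard: the build fails if the shim's signature drifts
// from what callers expect. It has no runtime cost.
var _ func(string) (int, error) = legacyParse

func main() {
	n, err := legacyParse("42")
	fmt.Println(n, err) // 42 <nil>
}
```

If `Parse` later gains a parameter, the guard line stops compiling, which surfaces the break at build time instead of at a distant call site.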
+
 ## Post-Implementation Checklist
 
 After implementing or modifying Go code, run these checks:
@@ -207,6 +245,7 @@ Go conventions to follow:
 - **DB query reuse:** Before adding a new DB query for key entities (teams, agents, sessions, users), check if the same data is already fetched earlier in the current flow/pipeline. Prefer passing resolved data through context, event payloads, or function params rather than re-querying. Duplicate queries waste DB resources and add latency
 - **Solution design:** When designing a fix or feature, identify the root cause first — don't just patch symptoms. Think through production scenarios (high concurrency, multi-tenant isolation, failure cascades, long-running sessions) to ensure the solution holds up. Prefer explicit configuration over runtime heuristics. Prefer the simplest solution that addresses the root cause directly
 - **Tenant-scope guards on admin writes:** `RoleAdmin` is not a tenant check. Writes to **global** tables (no `tenant_id` column — e.g. `builtin_tools`, disk config, package mgmt) must gate with `http.requireMasterScope` / WS `requireMasterScope(requireOwner(...))`. Writes to **tenant-scoped** tables must gate with `http.requireTenantAdmin` + SQL `WHERE tenant_id = $N`. Shared predicate: `store.IsMasterScope(ctx)`. See `CONTRIBUTING.md` → "Tenant-scope guards" for the full decision table and anti-patterns.
+- **Skip load / stress / benchmark tests.** Do NOT write throughput benchmarks, p95/p99 latency assertions, or `runtime.ReadMemStats`-based memory-leak tests for regular feature work. They flake on shared CI runners, waste runner time, and rarely catch real bugs. Only add load tests when explicitly requested for a specific investigation. For normal "prove it works" coverage, use unit + integration + chaos tests.
 
 ## Mobile UI/UX Rules
 
 
@@ -144,15 +144,17 @@ RUN chmod +x /app/docker-entrypoint.sh && \
 # while pip/npm subdirs are goclaw-owned (runtime installs by the app process).
 # Symlink .claude → data volume so Claude CLI credentials persist across container recreates.
 RUN mkdir -p /app/workspace /app/data/.runtime/pip /app/data/.runtime/npm-global/lib \
-        /app/data/.runtime/pip-cache /app/data/.claude /app/skills /app/tsnet-state /app/.goclaw \
+        /app/data/.runtime/pip-cache /app/data/.runtime/bin /app/data/.claude /app/skills \
+        /app/tsnet-state /app/.goclaw \
     && ln -s /app/data/.claude /app/.claude \
     && touch /app/data/.runtime/apk-packages \
     && chown -R goclaw:goclaw /app/workspace /app/skills /app/tsnet-state /app/.goclaw \
     && chown goclaw:goclaw /app/bundled-skills /app/data \
     && chown root:goclaw /app/data/.runtime /app/data/.runtime/apk-packages \
     && chmod 0750 /app/data/.runtime \
     && chmod 0640 /app/data/.runtime/apk-packages \
-    && chown -R goclaw:goclaw /app/data/.runtime/pip /app/data/.runtime/npm-global /app/data/.runtime/pip-cache /app/data/.claude
+    && chown -R goclaw:goclaw /app/data/.runtime/pip /app/data/.runtime/npm-global /app/data/.runtime/pip-cache /app/data/.runtime/bin /app/data/.claude \
+    && chmod 0755 /app/data/.runtime/bin
 
 # Default environment
 ENV GOCLAW_CONFIG=/app/config.json \
 
@@ -2,7 +2,7 @@ VERSION ?= $(shell git describe --tags --abbrev=0 --match "v[0-9]*" 2>/dev/null
 LDFLAGS  = -s -w -X github.com/nextlevelbuilder/goclaw/cmd.Version=$(VERSION)
 BINARY   = goclaw
 
-.PHONY: build build-full build-tui run clean version up down logs reset test vet check-web dev migrate setup ci desktop-dev desktop-build desktop-dmg
+.PHONY: build build-full build-tui run clean version up down logs reset test vet check-web dev migrate setup ci desktop-dev desktop-build desktop-dmg test-hooks test-hooks-unit test-hooks-e2e test-hooks-chaos test-hooks-rbac test-hooks-tracing
 
 # Build backend only (API-only, no embedded web UI)
 build:
@@ -94,6 +94,26 @@ test-scenarios:
 # Critical tests (P0 + P1) - run before merge
 test-critical: test-invariants test-contracts
 
+# ── Agent Hooks targets (phase 4) ──
+# Requires TEST_DATABASE_URL pointing at a pgvector:pg18 container on :5433
+test-hooks-unit:
+	go test -race ./internal/hooks/... ./internal/gateway/methods/
+
+test-hooks-e2e:
+	go test -race -timeout=180s -tags integration -run "TestHooksE2E" ./tests/integration/
+
+test-hooks-chaos:
+	go test -race -timeout=180s -tags integration -run "TestHooksChaos" ./tests/integration/
+
+test-hooks-rbac:
+	go test -race -timeout=90s -tags integration -run "TestHooksRBAC" ./tests/integration/
+
+test-hooks-tracing:
+	go test -race -timeout=90s -tags integration -run "TestHooksTracing" ./tests/integration/
+
+# Full hook test suite (unit + integration)
+test-hooks: test-hooks-unit test-hooks-e2e test-hooks-chaos test-hooks-rbac test-hooks-tracing
+
 vet:
 	go vet ./...
 
 
@@ -33,6 +33,7 @@ import (
 	"github.com/nextlevelbuilder/goclaw/internal/edition"
 	"github.com/nextlevelbuilder/goclaw/internal/gateway"
 	"github.com/nextlevelbuilder/goclaw/internal/gateway/methods"
+	"github.com/nextlevelbuilder/goclaw/internal/hooks"
 	httpapi "github.com/nextlevelbuilder/goclaw/internal/http"
 	mcpbridge "github.com/nextlevelbuilder/goclaw/internal/mcp"
 	"github.com/nextlevelbuilder/goclaw/internal/media"
@@ -133,7 +134,7 @@ func runGateway() {
 		tools.DetectServerIPs(context.Background())
 	}
 
-	toolsReg, execApprovalMgr, mcpMgr, sandboxMgr, browserMgr, webFetchTool, ttsTool, permPE, toolPE, dataDir, agentCfg := setupToolRegistry(cfg, workspace, providerRegistry)
+	toolsReg, execApprovalMgr, mcpMgr, sandboxMgr, browserMgr, webFetchTool, ttsTool, audioMgr, permPE, toolPE, dataDir, agentCfg := setupToolRegistry(cfg, workspace, providerRegistry)
 	if browserMgr != nil {
 		defer browserMgr.Close()
 	}
@@ -263,6 +264,9 @@ func runGateway() {
 
 	// Create all agents — resolved lazily from database by the managed resolver.
 	agentRouter := agent.NewRouter()
+	if traceCollector != nil {
+		agentRouter.SetTraceCollector(traceCollector)
+	}
 	slog.Info("agents will be resolved lazily from database")
 
 	// Create gateway server and wire enforcement
@@ -307,6 +311,7 @@ func runGateway() {
 		workspace:        workspace,
 		dataDir:          dataDir,
 		domainBus:        domainBus,
+		audioMgr:         audioMgr,
 	}
 
 	gatewayAddr := loopbackAddr(cfg.Gateway.Host, cfg.Gateway.Port)
@@ -375,6 +380,18 @@ func runGateway() {
 	server.SetLogTee(logTee)
 	pairingMethods, heartbeatMethods, chatMethods := registerAllMethods(server, agentRouter, pgStores.Sessions, pgStores.Cron, pgStores.Pairing, cfg, cfgPath, workspace, dataDir, msgBus, execApprovalMgr, pgStores.Agents, pgStores.Skills, pgStores.ConfigSecrets, pgStores.Teams, contextFileInterceptor, logTee, pgStores.Heartbeats, pgStores.ConfigPermissions, pgStores.SystemConfigs, pgStores.Tenants, pgStores.SkillTenantCfgs)
 
+	// Phase 3: Agent hooks RPC methods (hooks.list/create/update/delete/toggle/test/history).
+	if hs, ok := pgStores.Hooks.(hooks.HookStore); ok && hs != nil {
+		hm := methods.NewHookMethods(hs, edition.Current())
+		// Reuse dispatcher handlers for dry-run test runner so UI test panel
+		// exercises the exact code that will run in production.
+		if sharedHookHandlers != nil {
+			hm.SetTestRunner(methods.NewDispatcherTestRunner(sharedHookHandlers))
+		}
+		hm.Register(server.Router())
+		slog.Info("registered hooks RPC methods")
+	}
+
 	// Wire post-turn processor for team task dispatch (WS chat.send + HTTP API paths).
 	if postTurn != nil {
 		chatMethods.SetPostTurnProcessor(postTurn)
@@ -422,12 +439,12 @@ func runGateway() {
 		instanceLoader = channels.NewInstanceLoader(pgStores.ChannelInstances, pgStores.Agents, channelMgr, msgBus, pgStores.Pairing)
 		instanceLoader.SetProviderRegistry(providerRegistry)
 		instanceLoader.SetPendingCompactionConfig(cfg.Channels.PendingCompaction)
-		instanceLoader.RegisterFactory(channels.TypeTelegram, telegram.FactoryWithStores(pgStores.Agents, pgStores.ConfigPermissions, pgStores.Teams, pgStores.SubagentTasks, pgStores.PendingMessages))
-		instanceLoader.RegisterFactory(channels.TypeDiscord, discord.FactoryWithStores(pgStores.Agents, pgStores.ConfigPermissions, pgStores.PendingMessages))
-		instanceLoader.RegisterFactory(channels.TypeFeishu, feishu.FactoryWithPendingStore(pgStores.PendingMessages))
+		instanceLoader.RegisterFactory(channels.TypeTelegram, telegram.FactoryWithStoresAndAudio(pgStores.Agents, pgStores.ConfigPermissions, pgStores.Teams, pgStores.SubagentTasks, pgStores.PendingMessages, audioMgr))
+		instanceLoader.RegisterFactory(channels.TypeDiscord, discord.FactoryWithStoresAndAudio(pgStores.Agents, pgStores.ConfigPermissions, pgStores.PendingMessages, audioMgr))
+		instanceLoader.RegisterFactory(channels.TypeFeishu, feishu.FactoryWithPendingStoreAndAudio(pgStores.PendingMessages, audioMgr))
 		instanceLoader.RegisterFactory(channels.TypeZaloOA, zalo.Factory)
 		instanceLoader.RegisterFactory(channels.TypeZaloPersonal, zalopersonal.FactoryWithPendingStore(pgStores.PendingMessages))
-		instanceLoader.RegisterFactory(channels.TypeWhatsApp, whatsapp.FactoryWithDB(pgStores.DB, pgStores.PendingMessages, "pgx"))
+		instanceLoader.RegisterFactory(channels.TypeWhatsApp, whatsapp.FactoryWithDBAudio(pgStores.DB, pgStores.PendingMessages, "pgx", audioMgr, pgStores.BuiltinTools))
 		instanceLoader.RegisterFactory(channels.TypeSlack, slackchannel.FactoryWithPendingStore(pgStores.PendingMessages))
 		instanceLoader.RegisterFactory(channels.TypeFacebook, facebook.Factory)
 		instanceLoader.RegisterFactory(channels.TypePancake, pancake.Factory)
@@ -437,7 +454,7 @@ func runGateway() {
 	}
 
 	// Register config-based channels as fallback when no DB instances loaded.
-	registerConfigChannels(cfg, channelMgr, msgBus, pgStores, instanceLoader)
+	registerConfigChannels(cfg, channelMgr, msgBus, pgStores, instanceLoader, audioMgr)
 
 	// Register channels/instances/links/teams RPC methods
 	wireChannelRPCMethods(server, pgStores, channelMgr, agentRouter, msgBus, workspace)
 
@@ -4,6 +4,8 @@ import (
 	"context"
 	"log/slog"
 
+	"github.com/nextlevelbuilder/goclaw/internal/audio/elevenlabs"
+	minimaxaudio "github.com/nextlevelbuilder/goclaw/internal/audio/minimax"
 	"github.com/nextlevelbuilder/goclaw/internal/bus"
 	"github.com/nextlevelbuilder/goclaw/internal/config"
 	"github.com/nextlevelbuilder/goclaw/internal/memory"
@@ -187,6 +189,9 @@ func setupSubagents(providerReg *providers.Registry, cfg *config.Config, msgBus
 		if sc.ArchiveAfterMinutes > 0 {
 			subCfg.ArchiveAfterMinutes = sc.ArchiveAfterMinutes
 		}
+		if sc.MaxRetries > 0 {
+			subCfg.MaxRetries = sc.MaxRetries
+		}
 		if sc.Model != "" {
 			subCfg.Model = sc.Model
 		}
@@ -274,3 +279,55 @@ func setupTTS(cfg *config.Config) *tts.Manager {
 
 	return mgr
 }
+
+// setupAudioExtras wires Music, SFX, and STT providers into the audio Manager.
+// ElevenLabs is registered for SFX, Music, and STT when an API key is present.
+// MiniMax music is registered when cfg.Audio.Music is configured with a key.
+// Per-request tenant STT overrides are deferred to Phase 5 (see the STT block).
+func setupAudioExtras(cfg *config.Config, mgr *tts.Manager) {
+	ellKey := cfg.Tts.ElevenLabs.APIKey
+	ellBase := cfg.Tts.ElevenLabs.BaseURL
+
+	// ElevenLabs SFX — reuse TTS credentials.
+	if ellKey != "" {
+		mgr.RegisterSFX(elevenlabs.NewSFXProvider(elevenlabs.Config{
+			APIKey:  ellKey,
+			BaseURL: ellBase,
+		}))
+		slog.Info("audio.sfx: elevenlabs registered")
+	}
+
+	// ElevenLabs Music — same credentials, uses /v1/music endpoint.
+	if ellKey != "" {
+		mgr.RegisterMusic(elevenlabs.NewMusicProvider(elevenlabs.Config{
+			APIKey:  ellKey,
+			BaseURL: ellBase,
+		}))
+		slog.Info("audio.music: elevenlabs registered")
+	}
+
+	// MiniMax Music — optional, from cfg.Audio.Music block.
+	if cfg.Audio != nil && cfg.Audio.Music != nil {
+		mc := cfg.Audio.Music
+		if mc.APIKey != "" {
+			mgr.RegisterMusic(minimaxaudio.NewMusicProvider(minimaxaudio.MusicConfig{
+				APIKey:  mc.APIKey,
+				APIBase: mc.BaseURL,
+				Model:   mc.Model,
+			}))
+			slog.Info("audio.music: minimax registered")
+		}
+	}
+
+	// ElevenLabs STT (Scribe v2) — reuse TTS credentials. Registered as tenant-scope
+	// default; per-request tenant override lands via builtin_tools[stt] in Phase 5
+	// channel migration. Legacy per-channel STTProxyURL is bridged separately.
+	if ellKey != "" {
+		mgr.RegisterSTT(elevenlabs.NewSTTProvider(elevenlabs.Config{
+			APIKey:  ellKey,
+			BaseURL: ellBase,
+		}))
+		mgr.SetSTTChain([]string{"elevenlabs", "proxy"})
+		slog.Info("audio.stt: elevenlabs registered")
+	}
+}
@@ -4,6 +4,7 @@ import (
 	"context"
 	"fmt"
 	"log/slog"
+	"path/filepath"
 	"strings"
 
 	"github.com/google/uuid"
@@ -103,6 +104,7 @@ func processAnnounceLoop(
 				req.ForwardMedia = append(req.ForwardMedia, bus.MediaFile{
 					Path:     mr.Path,
 					MimeType: mr.ContentType,
+					Filename: filepath.Base(mr.Path), // preserve sanitized stem from producer
 				})
 			}
 		}
 
@@ -71,6 +71,10 @@ func builtinToolSeedData() []store.BuiltinToolDef {
 			Requires: []string{"tts_provider"},
 			Metadata: json.RawMessage(`{"config_hint":"Config → TTS"}`),
 		},
+		{Name: "stt", DisplayName: "Speech-to-Text", Description: "Transcribe voice/audio messages to text using ElevenLabs Scribe or a proxy service", Category: "media", Enabled: true,
+			Requires: []string{"stt_provider"},
+			Metadata: json.RawMessage(`{"config_hint":"Config → Audio → STT"}`),
+		},
 
 		// browser
 		{Name: "browser", DisplayName: "Browser", Description: "Automate browser interactions: navigate pages, click elements, fill forms, take screenshots", Category: "browser", Enabled: true,