Commit 1aa591d

Merge pull request #929 from nextlevelbuilder/dev
Release: Hooks System, Audio/TTS Refactor, MCP Security & UI Improvements
2 parents c651cde + 7c5630c commit 1aa591d

604 files changed

Lines changed: 46810 additions & 2686 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -88,4 +88,5 @@ compose.d/*
 /mise.*.toml
 /mise
 /.mise
+*.test
 goclaw-patched-linux-amd64

CHANGELOG.md

Lines changed: 29 additions & 0 deletions
@@ -2,6 +2,35 @@
 
 All notable changes to GoClaw are documented here. For full documentation, see [docs.goclaw.sh](https://docs.goclaw.sh).
 
+## Unreleased
+
+### Breaking Changes
+
+- **Context pruning now opt-in.** Previously tool-result trimming ran by default
+  for all providers; now requires explicit `contextPruning.mode: "cache-ttl"` in
+  `config.agents.defaults` to enable. Matches upstream TS design and prevents
+  silent prompt-cache invalidation on Anthropic.
+
+  Migration — add to `config.json5`:
+  ```json5
+  agents: {
+    defaults: {
+      contextPruning: { mode: "cache-ttl" }
+    }
+  }
+  ```
+
+### Improvements
+
+- **Context pruning cleanup.** Removed redundant Pass 0 (per-result 30% guard),
+  deduplicated double prune call per iteration, added SanitizeHistory to
+  PruneStage for broken tool_use/tool_result pair cleanup.
+- **Context pruning config backfill (migration).** Agents with existing custom
+  `context_pruning` config (e.g., `softTrimRatio`, `keepLastAssistants`) but
+  missing a `mode` field get auto-backfilled with `mode: "cache-ttl"` to
+  preserve their intent after the opt-in flip. Rows with NULL config stay
+  NULL (new opt-in default applies). PG migration 51; SQLite schema v19.
+
 ## Project Status
 
 ### Implemented & Tested in Production
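
The backfill rule in the changelog entry above can be illustrated in Go. This is a minimal sketch under stated assumptions, not the migration itself (the changelog says that ships as PG migration 51 and SQLite schema v19): `backfillMode` is a hypothetical helper, the stored config is assumed to be a JSON object, and an empty object is assumed to carry no custom intent worth preserving.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// backfillMode is a hypothetical illustration of the changelog rule.
// It returns the config to store and whether it was modified.
func backfillMode(raw []byte) ([]byte, bool, error) {
	if raw == nil {
		// NULL config stays NULL, so the new opt-in default applies.
		return nil, false, nil
	}
	var cfg map[string]any
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, false, err
	}
	if len(cfg) == 0 {
		// Assumption: JSON null or {} holds no custom settings.
		return raw, false, nil
	}
	if _, ok := cfg["mode"]; ok {
		// An explicit mode is never overwritten.
		return raw, false, nil
	}
	// Custom settings without a mode: keep the old always-on behavior.
	cfg["mode"] = "cache-ttl"
	out, err := json.Marshal(cfg)
	if err != nil {
		return nil, false, err
	}
	return out, true, nil
}

func main() {
	inputs := [][]byte{
		nil,
		[]byte(`{"softTrimRatio":0.3}`),
		[]byte(`{"mode":"off","keepLastAssistants":2}`),
	}
	for _, in := range inputs {
		out, changed, err := backfillMode(in)
		fmt.Printf("in=%s out=%s changed=%v err=%v\n", in, out, changed, err)
	}
}
```

Only the second input is rewritten; the NULL row and the row with an explicit `mode` pass through unchanged.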

CLAUDE.md

Lines changed: 39 additions & 0 deletions
@@ -184,6 +184,44 @@ Published to GHCR (`ghcr.io/nextlevelbuilder/goclaw`) and Docker Hub (`digitop/g
 - **Tool gating:** `TeamActionPolicy` in `internal/tools/team_action_policy.go` — lite blocks comment/review/approve/reject/attach/ask_user. `skill_manage`/`publish_skill` not registered in lite
 - **File serving:** 2-layer path isolation in `internal/http/files.go` — workspace boundary (all editions) + tenant scope (standard only with RBAC)
 
+## Plan Verification Rules
+
+Apply before finalizing any multi-phase plan. Trust-but-verify between scout → planner → final plan.
+
+### Verification discipline (what to verify)
+
+1. **Verify factual claims against code** — re-grep/re-count every number, path, endpoint. Don't copy from scout summaries.
+2. **Trace semantics, not just cite lines** — when plan references existing/upstream code, identify WHEN each field mutates and under WHAT conditions. Line-range citation without control-flow trace = how ports silently invert behavior. Check: every call, or specific branches only?
+3. **No fabricated identifiers / API families** — every symbol in plan must cite `file:line`. RED FLAGS: plausible-sounding wrappers (`Keyring`, `Validator`, `Manager`), centralized packages (`internal/security`, `internal/auth`) that may be scattered, OTel-style (`StartSpan/EndSpan`) when codebase is emit-based. When unsure, `go doc <pkg>` lists actual exported surface. Apply especially when plan says "reuse existing X".
+4. **Struct scope audit before adding state** — verify lifetime (per-request/session/agent/process) before adding a field to an existing struct. "Plausibly per-X" is a red flag — grep construction + ownership. Shared-instance state leaks across isolation boundaries.
+5. **Gate-premise test math** — before asserting "feature X triggers independently of Y", list all early-returns from function entry to X. Math-verify any fixture claiming "X without Y".
+6. **Port = config-shape match** — "faithful port" divergences in config field name/type are silent breaking changes for users copying upstream config. Match upstream shape, or explicitly flag each divergence with rationale in the phase file.
+7. **Verify external API endpoints via `docs-seeker`** — before writing endpoint into plan. Sibling APIs often use different roots.
+
+### Scope & coverage (where to look)
+
+8. **Grep delete scope deep** — `grep -rn '<symbol>' .` whole repo. Stubs often have refs in catalogs/routing/switch cases. Enumerate ALL sites in todo.
+9. **Signature-change callers enumeration** — grep + list all callers explicitly. "Update all callers" insufficient.
+10. **Alias/shim coverage** — enumerate ALL exported symbols via `go doc <pkg>`. Add compile-time signature guards.
+11. **Scout desktop and web separately** — `ui/desktop/frontend/` ≠ `ui/web/`. Different structure, i18n namespaces, test framework presence.
+
+### Phasing & ordering (when)
+
+12. **Re-scout on scope change** — if phase promotes from deferred → active, re-scout. Don't reuse brainstorm summary.
+13. **Cross-phase gates explicit** — "Phase N-1 merged + tests green" in phase Context. Execution order alone ≠ enforcement.
+14. **Zero-coverage characterization test = blocker step** — write byte/request-body fixture test BEFORE migration. Not "recommended".
+15. **i18n keys ordering** — add key + 3 catalogs as explicit todo step BEFORE handler code. Missing key = runtime crash.
+
+### Conventions & finalization
+
+16. **Context key style convention** — check existing `context.go` pattern before introducing new key types. Mixed = code smell.
+17. **Verify pass MANDATORY after rewrite** — spawn fresh Explore/grep to audit planner output. Don't trust self-validation.
+
+**Pattern to avoid:** user asks → planner writes → report "done".
+**Safer pattern:** user asks → scout → planner writes → audit-verify → report.
+
+**Red-team practice:** After planner completes, run `code-reviewer`/`brainstormer` in audit mode: "spot-check 15+ claims vs live codebase". Past catches: fabricated `crypto.Keyring`/`tracing.StartSpan` (agent-hooks plan); inverted TS-port semantics + wrong struct scope + misread early-return gate (context-pruning plan). See `plans/*/reports/audit-*.md` for concrete examples.
+
 ## Post-Implementation Checklist
 
 After implementing or modifying Go code, run these checks:
@@ -207,6 +245,7 @@ Go conventions to follow:
 - **DB query reuse:** Before adding a new DB query for key entities (teams, agents, sessions, users), check if the same data is already fetched earlier in the current flow/pipeline. Prefer passing resolved data through context, event payloads, or function params rather than re-querying. Duplicate queries waste DB resources and add latency
 - **Solution design:** When designing a fix or feature, identify the root cause first — don't just patch symptoms. Think through production scenarios (high concurrency, multi-tenant isolation, failure cascades, long-running sessions) to ensure the solution holds up. Prefer explicit configuration over runtime heuristics. Prefer the simplest solution that addresses the root cause directly
 - **Tenant-scope guards on admin writes:** `RoleAdmin` is not a tenant check. Writes to **global** tables (no `tenant_id` column — e.g. `builtin_tools`, disk config, package mgmt) must gate with `http.requireMasterScope` / WS `requireMasterScope(requireOwner(...))`. Writes to **tenant-scoped** tables must gate with `http.requireTenantAdmin` + SQL `WHERE tenant_id = $N`. Shared predicate: `store.IsMasterScope(ctx)`. See `CONTRIBUTING.md` → "Tenant-scope guards" for the full decision table and anti-patterns.
+- **Skip load / stress / benchmark tests.** Do NOT write throughput benchmarks, p95/p99 latency assertions, or `runtime.ReadMemStats`-based memory-leak tests for regular feature work. They flake on shared CI runners, waste runner time, and rarely catch real bugs. Only add load tests when explicitly requested for a specific investigation. For normal "prove it works" coverage, use unit + integration + chaos tests.
 
 ## Mobile UI/UX Rules
 
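
Rule 10 above asks for compile-time signature guards on aliases and shims. One common Go idiom for that is sketched below; every name in it (`Formatter`, `plainFormatter`, `FormatMessage`, `LegacyFormat`) is hypothetical and does not come from the GoClaw codebase.

```go
package main

import "fmt"

// Formatter is a hypothetical interface that a shim must keep satisfying.
type Formatter interface {
	Format(msg string) string
}

type plainFormatter struct{}

func (plainFormatter) Format(msg string) string { return "[plain] " + msg }

// FormatMessage stands in for the relocated implementation.
func FormatMessage(f Formatter, msg string) string { return f.Format(msg) }

// LegacyFormat is the shim kept for existing callers.
func LegacyFormat(f Formatter, msg string) string { return FormatMessage(f, msg) }

// Compile-time guards: the build fails if either signature drifts.
// They cost nothing at runtime because the values are discarded.
var (
	_ Formatter                      = plainFormatter{}
	_ func(Formatter, string) string = LegacyFormat
)

func main() {
	fmt.Println(LegacyFormat(plainFormatter{}, "hello"))
}
```

If someone later changes `LegacyFormat` to take an extra parameter, the second guard stops compiling at the shim's own definition site instead of at some distant caller.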

Dockerfile

Lines changed: 4 additions & 2 deletions
@@ -144,15 +144,17 @@ RUN chmod +x /app/docker-entrypoint.sh && \
 # while pip/npm subdirs are goclaw-owned (runtime installs by the app process).
 # Symlink .claude → data volume so Claude CLI credentials persist across container recreates.
 RUN mkdir -p /app/workspace /app/data/.runtime/pip /app/data/.runtime/npm-global/lib \
-    /app/data/.runtime/pip-cache /app/data/.claude /app/skills /app/tsnet-state /app/.goclaw \
+    /app/data/.runtime/pip-cache /app/data/.runtime/bin /app/data/.claude /app/skills \
+    /app/tsnet-state /app/.goclaw \
     && ln -s /app/data/.claude /app/.claude \
     && touch /app/data/.runtime/apk-packages \
     && chown -R goclaw:goclaw /app/workspace /app/skills /app/tsnet-state /app/.goclaw \
     && chown goclaw:goclaw /app/bundled-skills /app/data \
     && chown root:goclaw /app/data/.runtime /app/data/.runtime/apk-packages \
     && chmod 0750 /app/data/.runtime \
     && chmod 0640 /app/data/.runtime/apk-packages \
-    && chown -R goclaw:goclaw /app/data/.runtime/pip /app/data/.runtime/npm-global /app/data/.runtime/pip-cache /app/data/.claude
+    && chown -R goclaw:goclaw /app/data/.runtime/pip /app/data/.runtime/npm-global /app/data/.runtime/pip-cache /app/data/.runtime/bin /app/data/.claude \
+    && chmod 0755 /app/data/.runtime/bin
 
 # Default environment
 ENV GOCLAW_CONFIG=/app/config.json \

Makefile

Lines changed: 21 additions & 1 deletion
@@ -2,7 +2,7 @@ VERSION ?= $(shell git describe --tags --abbrev=0 --match "v[0-9]*" 2>/dev/null
 LDFLAGS = -s -w -X github.com/nextlevelbuilder/goclaw/cmd.Version=$(VERSION)
 BINARY = goclaw
 
-.PHONY: build build-full build-tui run clean version up down logs reset test vet check-web dev migrate setup ci desktop-dev desktop-build desktop-dmg
+.PHONY: build build-full build-tui run clean version up down logs reset test vet check-web dev migrate setup ci desktop-dev desktop-build desktop-dmg test-hooks test-hooks-unit test-hooks-e2e test-hooks-chaos test-hooks-rbac test-hooks-tracing
 
 # Build backend only (API-only, no embedded web UI)
 build:
@@ -94,6 +94,26 @@ test-scenarios:
 # Critical tests (P0 + P1) - run before merge
 test-critical: test-invariants test-contracts
 
+# ── Agent Hooks targets (phase 4) ──
+# Requires TEST_DATABASE_URL pointing at a pgvector:pg18 container on :5433
+test-hooks-unit:
+	go test -race ./internal/hooks/... ./internal/gateway/methods/
+
+test-hooks-e2e:
+	go test -race -timeout=180s -tags integration -run "TestHooksE2E" ./tests/integration/
+
+test-hooks-chaos:
+	go test -race -timeout=180s -tags integration -run "TestHooksChaos" ./tests/integration/
+
+test-hooks-rbac:
+	go test -race -timeout=90s -tags integration -run "TestHooksRBAC" ./tests/integration/
+
+test-hooks-tracing:
+	go test -race -timeout=90s -tags integration -run "TestHooksTracing" ./tests/integration/
+
+# Full hook test suite (unit + integration)
+test-hooks: test-hooks-unit test-hooks-e2e test-hooks-chaos test-hooks-rbac test-hooks-tracing
+
 vet:
 	go vet ./...
 

cmd/gateway.go

Lines changed: 23 additions & 6 deletions
@@ -33,6 +33,7 @@ import (
 	"github.com/nextlevelbuilder/goclaw/internal/edition"
 	"github.com/nextlevelbuilder/goclaw/internal/gateway"
 	"github.com/nextlevelbuilder/goclaw/internal/gateway/methods"
+	"github.com/nextlevelbuilder/goclaw/internal/hooks"
 	httpapi "github.com/nextlevelbuilder/goclaw/internal/http"
 	mcpbridge "github.com/nextlevelbuilder/goclaw/internal/mcp"
 	"github.com/nextlevelbuilder/goclaw/internal/media"
@@ -133,7 +134,7 @@ func runGateway() {
 		tools.DetectServerIPs(context.Background())
 	}
 
-	toolsReg, execApprovalMgr, mcpMgr, sandboxMgr, browserMgr, webFetchTool, ttsTool, permPE, toolPE, dataDir, agentCfg := setupToolRegistry(cfg, workspace, providerRegistry)
+	toolsReg, execApprovalMgr, mcpMgr, sandboxMgr, browserMgr, webFetchTool, ttsTool, audioMgr, permPE, toolPE, dataDir, agentCfg := setupToolRegistry(cfg, workspace, providerRegistry)
 	if browserMgr != nil {
 		defer browserMgr.Close()
 	}
@@ -263,6 +264,9 @@ func runGateway() {
 
 	// Create all agents — resolved lazily from database by the managed resolver.
 	agentRouter := agent.NewRouter()
+	if traceCollector != nil {
+		agentRouter.SetTraceCollector(traceCollector)
+	}
 	slog.Info("agents will be resolved lazily from database")
 
 	// Create gateway server and wire enforcement
@@ -307,6 +311,7 @@ func runGateway() {
 		workspace: workspace,
 		dataDir: dataDir,
 		domainBus: domainBus,
+		audioMgr: audioMgr,
 	}
 
 	gatewayAddr := loopbackAddr(cfg.Gateway.Host, cfg.Gateway.Port)
@@ -375,6 +380,18 @@ func runGateway() {
 	server.SetLogTee(logTee)
 	pairingMethods, heartbeatMethods, chatMethods := registerAllMethods(server, agentRouter, pgStores.Sessions, pgStores.Cron, pgStores.Pairing, cfg, cfgPath, workspace, dataDir, msgBus, execApprovalMgr, pgStores.Agents, pgStores.Skills, pgStores.ConfigSecrets, pgStores.Teams, contextFileInterceptor, logTee, pgStores.Heartbeats, pgStores.ConfigPermissions, pgStores.SystemConfigs, pgStores.Tenants, pgStores.SkillTenantCfgs)
 
+	// Phase 3: Agent hooks RPC methods (hooks.list/create/update/delete/toggle/test/history).
+	if hs, ok := pgStores.Hooks.(hooks.HookStore); ok && hs != nil {
+		hm := methods.NewHookMethods(hs, edition.Current())
+		// Reuse dispatcher handlers for dry-run test runner so UI test panel
+		// exercises the exact code that will run in production.
+		if sharedHookHandlers != nil {
+			hm.SetTestRunner(methods.NewDispatcherTestRunner(sharedHookHandlers))
+		}
+		hm.Register(server.Router())
+		slog.Info("registered hooks RPC methods")
+	}
+
 	// Wire post-turn processor for team task dispatch (WS chat.send + HTTP API paths).
 	if postTurn != nil {
 		chatMethods.SetPostTurnProcessor(postTurn)
@@ -422,12 +439,12 @@ func runGateway() {
 		instanceLoader = channels.NewInstanceLoader(pgStores.ChannelInstances, pgStores.Agents, channelMgr, msgBus, pgStores.Pairing)
 		instanceLoader.SetProviderRegistry(providerRegistry)
 		instanceLoader.SetPendingCompactionConfig(cfg.Channels.PendingCompaction)
-		instanceLoader.RegisterFactory(channels.TypeTelegram, telegram.FactoryWithStores(pgStores.Agents, pgStores.ConfigPermissions, pgStores.Teams, pgStores.SubagentTasks, pgStores.PendingMessages))
-		instanceLoader.RegisterFactory(channels.TypeDiscord, discord.FactoryWithStores(pgStores.Agents, pgStores.ConfigPermissions, pgStores.PendingMessages))
-		instanceLoader.RegisterFactory(channels.TypeFeishu, feishu.FactoryWithPendingStore(pgStores.PendingMessages))
+		instanceLoader.RegisterFactory(channels.TypeTelegram, telegram.FactoryWithStoresAndAudio(pgStores.Agents, pgStores.ConfigPermissions, pgStores.Teams, pgStores.SubagentTasks, pgStores.PendingMessages, audioMgr))
+		instanceLoader.RegisterFactory(channels.TypeDiscord, discord.FactoryWithStoresAndAudio(pgStores.Agents, pgStores.ConfigPermissions, pgStores.PendingMessages, audioMgr))
+		instanceLoader.RegisterFactory(channels.TypeFeishu, feishu.FactoryWithPendingStoreAndAudio(pgStores.PendingMessages, audioMgr))
 		instanceLoader.RegisterFactory(channels.TypeZaloOA, zalo.Factory)
 		instanceLoader.RegisterFactory(channels.TypeZaloPersonal, zalopersonal.FactoryWithPendingStore(pgStores.PendingMessages))
-		instanceLoader.RegisterFactory(channels.TypeWhatsApp, whatsapp.FactoryWithDB(pgStores.DB, pgStores.PendingMessages, "pgx"))
+		instanceLoader.RegisterFactory(channels.TypeWhatsApp, whatsapp.FactoryWithDBAudio(pgStores.DB, pgStores.PendingMessages, "pgx", audioMgr, pgStores.BuiltinTools))
 		instanceLoader.RegisterFactory(channels.TypeSlack, slackchannel.FactoryWithPendingStore(pgStores.PendingMessages))
 		instanceLoader.RegisterFactory(channels.TypeFacebook, facebook.Factory)
 		instanceLoader.RegisterFactory(channels.TypePancake, pancake.Factory)
@@ -437,7 +454,7 @@ func runGateway() {
 	}
 
 	// Register config-based channels as fallback when no DB instances loaded.
-	registerConfigChannels(cfg, channelMgr, msgBus, pgStores, instanceLoader)
+	registerConfigChannels(cfg, channelMgr, msgBus, pgStores, instanceLoader, audioMgr)
 
 	// Register channels/instances/links/teams RPC methods
 	wireChannelRPCMethods(server, pgStores, channelMgr, agentRouter, msgBus, workspace)
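
The hooks wiring in this file gates registration on a type assertion of the form `hs, ok := x.(hooks.HookStore); ok && hs != nil`. The standalone sketch below shows how a guard of that shape behaves in Go; `HookStore`, `memStore`, and `hookStoreFrom` are hypothetical stand-ins, and nothing here is taken from the real `internal/hooks` package.

```go
package main

import "fmt"

// HookStore is a hypothetical stand-in for an optional capability.
type HookStore interface {
	List() []string
}

type memStore struct{ hooks []string }

// List tolerates a nil receiver so a typed-nil store cannot panic here.
func (m *memStore) List() []string {
	if m == nil {
		return nil
	}
	return m.hooks
}

// hookStoreFrom mirrors the guard shape used in the diff.
func hookStoreFrom(v any) (HookStore, bool) {
	hs, ok := v.(HookStore)
	return hs, ok && hs != nil
}

func main() {
	_, ok := hookStoreFrom(nil)
	fmt.Println("untyped nil:", ok) // assertion fails on a nil interface

	_, ok = hookStoreFrom(&memStore{hooks: []string{"pre-tool"}})
	fmt.Println("real store:", ok)

	var typedNil *memStore
	_, ok = hookStoreFrom(typedNil)
	fmt.Println("typed nil pointer:", ok) // interface value itself is non-nil
}
```

The third case is the one to keep in mind: an interface that wraps a nil pointer still passes both checks, so a guard like this only protects against a missing store, not against a nil concrete value.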

cmd/gateway_agents.go

Lines changed: 57 additions & 0 deletions
@@ -4,6 +4,8 @@ import (
 	"context"
 	"log/slog"
 
+	"github.com/nextlevelbuilder/goclaw/internal/audio/elevenlabs"
+	minimaxaudio "github.com/nextlevelbuilder/goclaw/internal/audio/minimax"
 	"github.com/nextlevelbuilder/goclaw/internal/bus"
 	"github.com/nextlevelbuilder/goclaw/internal/config"
 	"github.com/nextlevelbuilder/goclaw/internal/memory"
@@ -187,6 +189,9 @@ func setupSubagents(providerReg *providers.Registry, cfg *config.Config, msgBus
 	if sc.ArchiveAfterMinutes > 0 {
 		subCfg.ArchiveAfterMinutes = sc.ArchiveAfterMinutes
 	}
+	if sc.MaxRetries > 0 {
+		subCfg.MaxRetries = sc.MaxRetries
+	}
 	if sc.Model != "" {
 		subCfg.Model = sc.Model
 	}
@@ -274,3 +279,55 @@ func setupTTS(cfg *config.Config) *tts.Manager {
 
 	return mgr
 }
+
+// setupAudioExtras wires Music and SFX providers into the audio Manager.
+// ElevenLabs is registered for both SFX and Music when an API key is present.
+// MiniMax music is registered when cfg.Audio.Music is configured with a key.
+// Phase 4 will add STT providers here.
+func setupAudioExtras(cfg *config.Config, mgr *tts.Manager) {
+	ellKey := cfg.Tts.ElevenLabs.APIKey
+	ellBase := cfg.Tts.ElevenLabs.BaseURL
+
+	// ElevenLabs SFX — reuse TTS credentials.
+	if ellKey != "" {
+		mgr.RegisterSFX(elevenlabs.NewSFXProvider(elevenlabs.Config{
+			APIKey:  ellKey,
+			BaseURL: ellBase,
+		}))
+		slog.Info("audio.sfx: elevenlabs registered")
+	}
+
+	// ElevenLabs Music — same credentials, uses /v1/music endpoint.
+	if ellKey != "" {
+		mgr.RegisterMusic(elevenlabs.NewMusicProvider(elevenlabs.Config{
+			APIKey:  ellKey,
+			BaseURL: ellBase,
+		}))
+		slog.Info("audio.music: elevenlabs registered")
+	}
+
+	// MiniMax Music — optional, from cfg.Audio.Music block.
+	if cfg.Audio != nil && cfg.Audio.Music != nil {
+		mc := cfg.Audio.Music
+		if mc.APIKey != "" {
+			mgr.RegisterMusic(minimaxaudio.NewMusicProvider(minimaxaudio.MusicConfig{
+				APIKey:  mc.APIKey,
+				APIBase: mc.BaseURL,
+				Model:   mc.Model,
+			}))
+			slog.Info("audio.music: minimax registered")
+		}
+	}
+
+	// ElevenLabs STT (Scribe v2) — reuse TTS credentials. Registered as tenant-scope
+	// default; per-request tenant override lands via builtin_tools[stt] in Phase 5
+	// channel migration. Legacy per-channel STTProxyURL is bridged separately.
+	if ellKey != "" {
+		mgr.RegisterSTT(elevenlabs.NewSTTProvider(elevenlabs.Config{
+			APIKey:  ellKey,
+			BaseURL: ellBase,
+		}))
+		mgr.SetSTTChain([]string{"elevenlabs", "proxy"})
+		slog.Info("audio.stt: elevenlabs registered")
+	}
+}

cmd/gateway_announce_queue.go

Lines changed: 2 additions & 0 deletions
@@ -4,6 +4,7 @@ import (
 	"context"
 	"fmt"
 	"log/slog"
+	"path/filepath"
 	"strings"
 
 	"github.com/google/uuid"
@@ -103,6 +104,7 @@ func processAnnounceLoop(
 			req.ForwardMedia = append(req.ForwardMedia, bus.MediaFile{
 				Path:     mr.Path,
 				MimeType: mr.ContentType,
+				Filename: filepath.Base(mr.Path), // preserve sanitized stem from producer
 			})
 		}
 	}

cmd/gateway_builtin_tools.go

Lines changed: 4 additions & 0 deletions
@@ -71,6 +71,10 @@ func builtinToolSeedData() []store.BuiltinToolDef {
 			Requires: []string{"tts_provider"},
 			Metadata: json.RawMessage(`{"config_hint":"Config → TTS"}`),
 		},
+		{Name: "stt", DisplayName: "Speech-to-Text", Description: "Transcribe voice/audio messages to text using ElevenLabs Scribe or a proxy service", Category: "media", Enabled: true,
+			Requires: []string{"stt_provider"},
+			Metadata: json.RawMessage(`{"config_hint":"Config → Audio → STT"}`),
+		},
 
 		// browser
 		{Name: "browser", DisplayName: "Browser", Description: "Automate browser interactions: navigate pages, click elements, fill forms, take screenshots", Category: "browser", Enabled: true,
