Commit 5317903

docs: sync with goclaw source changes a47d7f9f..c388364d (EN+VI+ZH) (#44)
- Fix REST API: remove Sessions (WS-only, fixes #618), add 33 missing endpoints, add MCP Bridge + Edition sections, fix Tenants PUT→PATCH
- Expand WebSocket-Only section with 15 method groups
- Update schema version 33→34, add subagent_tasks table (migration 034)
- Fix MCP hybrid search mode description (first 40 inline, rest deferred)
- Fix KG extraction temp 0.0→0.2, add 3 entity types, bidirectional traversal
- Add subagent enhancements: WaitAll, auto-retry, token tracking, SubagentDenyAlways
- Add /subagents commands to Telegram, attachment URL + stop-typing to Discord
- Add structured compaction summary and tool output cap to context-pruning
- Add provider_type immutability, schema normalization, reasoning effort controls
- Update installation for embedded Web UI (embedui build tag, 2-file compose)
1 parent 57ed475 commit 5317903

36 files changed

Lines changed: 1110 additions & 257 deletions

advanced/context-pruning.md

Lines changed: 13 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -208,10 +208,22 @@ Pruning only acts on tool results. If long user messages or system prompt compon
208208

209209
---
210210

211+
## Pipeline Improvements
212+
213+
### Structured Compaction Summary
214+
215+
When context is compacted, the summary now preserves key identifiers — agent IDs, task IDs, and session keys — in a structured format. This ensures that agents can continue referencing their active tasks and sessions after compaction without losing critical tracking context.
216+
217+
### Tool Output Capping at Source
218+
219+
Tool output is now capped at the source before being added to context. Rather than waiting for the pruning pipeline to trim oversized results after the fact, GoClaw limits tool output size at ingestion time. This reduces unnecessary memory pressure and makes the pruning pipeline more predictable.
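The idea of capping at ingestion can be sketched as follows. This is a minimal illustration, assuming a character-based limit; `MAX_TOOL_OUTPUT_CHARS`, its value, and the `cap_tool_output` helper are hypothetical names and are not taken from the GoClaw source.

```python
MAX_TOOL_OUTPUT_CHARS = 8_000  # hypothetical limit, not a GoClaw default


def cap_tool_output(output: str, limit: int = MAX_TOOL_OUTPUT_CHARS) -> str:
    """Truncate a tool result before it is appended to the context."""
    if len(output) <= limit:
        return output
    dropped = len(output) - limit
    # Keep the head of the output and record how much was cut,
    # so the agent can tell the result is incomplete.
    return output[:limit] + f"\n[truncated {dropped} characters]"
```

Because the cap is applied once, before the result enters the context, a later pruning pass never has to deal with a single oversized entry.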
220+
221+
---
222+
211223
## What's Next
212224

213225
- [Sessions & History](/sessions-and-history) — session compaction, history limits
214226
- [Memory System](/memory-system) — persistent memory across sessions
215227
- [Configuration Reference](/config-reference) — full agent config reference
216228

217-
<!-- goclaw-source: e7afa832 | updated: 2026-03-30 -->
229+
<!-- goclaw-source: c388364d | updated: 2026-04-01 -->

advanced/knowledge-graph.md

Lines changed: 12 additions & 9 deletions
@@ -17,7 +17,7 @@ The graph is scoped per agent and per user — each agent builds its own graph f
1717

1818
After a conversation, GoClaw sends the text to an LLM with a structured extraction prompt. For long texts (over 12,000 characters), GoClaw splits the input into chunks, extracts from each, and merges results by deduplicating entities and relations. The LLM returns:
1919

20-
- **Entities** — People, projects, tasks, events, concepts, locations, organizations
20+
- **Entities** — People, organizations, projects, products, technologies, tasks, events, documents, concepts, locations
2121
- **Relations** — Typed connections between entities (e.g., `works_on`, `reports_to`)
2222

2323
Each entity and relation has a **confidence score** (0.0–1.0). Only items at or above the threshold (default **0.75**) are stored.
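The merge-and-filter step described above can be sketched roughly as below. The `Entity` type and the `merge_and_filter` helper are hypothetical, and keeping the higher-confidence copy on a duplicate ID is an assumption; the source only states that chunk results are deduplicated and that items below the threshold are not stored.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.75  # default threshold stated above


@dataclass(frozen=True)
class Entity:
    id: str            # lowercase with hyphens, e.g. "john-doe"
    type: str
    confidence: float  # 0.0 to 1.0


def merge_and_filter(
    chunks: list[list[Entity]],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> list[Entity]:
    """Merge per-chunk results, deduplicate by ID, drop low-confidence items."""
    best: dict[str, Entity] = {}
    for chunk in chunks:
        for entity in chunk:
            seen = best.get(entity.id)
            # Assumption: on a duplicate ID, keep the higher-confidence copy.
            if seen is None or entity.confidence > seen.confidence:
                best[entity.id] = entity
    # "At or above the threshold" means the comparison is inclusive.
    return [e for e in best.values() if e.confidence >= threshold]
```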
@@ -26,7 +26,7 @@ Each entity and relation has a **confidence score** (0.0–1.0). Only items at o
2626
- 3–15 entities per extraction, depending on text density
2727
- Entity IDs are lowercase with hyphens (e.g., `john-doe`, `project-alpha`)
2828
- Descriptions are one sentence maximum
29-
- Temperature 0.0 for deterministic results
29+
- Temperature 0.2 for consistent yet slightly flexible results
3030

3131
### Extract API
3232

@@ -155,15 +155,15 @@ Marks the pair as not-duplicate — it won't appear in future review queues.
155155
| Parameter | Type | Description |
156156
|-----------|------|-------------|
157157
| `query` | string | Entity name, keyword, or `*` to list all (required) |
158-
| `entity_type` | string | Filter: `person`, `project`, `task`, `event`, `concept`, `location`, `organization` |
158+
| `entity_type` | string | Filter: `person`, `organization`, `project`, `product`, `technology`, `task`, `event`, `document`, `concept`, `location` |
159159
| `entity_id` | string | Start point for relationship traversal |
160160
| `max_depth` | int | Traversal depth (default 2, max 3) |
161161

162162
### 3-Tier Search Fallback
163163

164164
The tool uses a 3-tier fallback strategy to ensure results are always returned:
165165

166-
1. **Traversal** (when `entity_id` provided) — BFS outgoing traversal up to `max_depth`, returns up to 20 results with path info and relation types
166+
1. **Traversal** (when `entity_id` provided) — Bidirectional multi-hop traversal up to `max_depth`, returns up to 20 results with path info and relation types
167167
2. **Direct connections** (fallback if traversal returns nothing) — Bidirectional 1-hop relations, capped at 10
168168
3. **Text search** (fallback if no connections) — Full-text search on entity names/descriptions, returns up to 10 results with their relations (5 per entity)
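The three tiers above can be read as a simple cascade. The sketch below is illustrative only: the function names and callback signatures are hypothetical, and it assumes that tiers 1 and 2 both require an `entity_id`, which the text implies but does not state outright.

```python
from typing import Callable, Optional

Results = list[dict]
Lookup = Callable[[str], Results]


def search_with_fallback(
    query: str,
    entity_id: Optional[str],
    traverse: Lookup,
    direct_connections: Lookup,
    text_search: Lookup,
) -> tuple[str, Results]:
    """Return (tier_name, results) from the first tier that yields anything."""
    if entity_id is not None:
        # Tier 1: multi-hop traversal, capped at 20 results.
        results = traverse(entity_id)[:20]
        if results:
            return "traversal", results
        # Tier 2: 1-hop relations in both directions, capped at 10.
        results = direct_connections(entity_id)[:10]
        if results:
            return "direct", results
    # Tier 3: full-text search on names and descriptions, capped at 10.
    return "text", text_search(query)[:10]
```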
169169

@@ -181,7 +181,7 @@ query: "John"
181181
query: "*"
182182
```
183183

184-
**Traverse relationships** — Start from an entity and follow outgoing connections:
184+
**Traverse relationships** — Start from an entity and follow connections in both directions:
185185
```
186186
query: "*"
187187
entity_id: "project-alpha"
@@ -266,12 +266,15 @@ Relations are directional: `source --relation_type--> target`. Deleting an entit
266266
| Type | Examples |
267267
|------|----------|
268268
| `person` | Team members, contacts, stakeholders |
269-
| `project` | Products, initiatives, codebases |
269+
| `organization` | Companies, teams, departments |
270+
| `project` | Initiatives, codebases, programs |
271+
| `product` | Software products, services, features |
272+
| `technology` | Languages, frameworks, platforms |
270273
| `task` | Action items, tickets, assignments |
271274
| `event` | Meetings, deadlines, milestones |
272-
| `concept` | Technologies, methodologies, ideas |
275+
| `document` | Reports, specs, wikis, runbooks |
276+
| `concept` | Methodologies, ideas, principles |
273277
| `location` | Offices, cities, regions |
274-
| `organization` | Companies, teams, departments |
275278

276279
---
277280

@@ -378,4 +381,4 @@ An agent can then answer questions like *"Who is working on Project Alpha?"* by
378381
- [Memory System](/memory-system) — Vector-based long-term memory
379382
- [Sessions & History](/sessions-and-history) — Conversation storage
380383

381-
<!-- goclaw-source: a47d7f9f | updated: 2026-03-31 -->
384+
<!-- goclaw-source: c388364d | updated: 2026-04-01 -->

advanced/mcp-integration.md

Lines changed: 4 additions & 4 deletions
@@ -24,7 +24,7 @@ graph LR
2424
Registry --> Agent
2525
```
2626

27-
GoClaw runs a health-check loop every 30 seconds and reconnects with exponential backoff (initial delay 2 s, up to 10 attempts, capped at 60 s between retries) if a server goes down.
27+
GoClaw runs a health-check loop every 30 seconds. A server is only marked disconnected after **3 consecutive ping failures** — transient network blips do not trigger a reconnect. When a server does go down, GoClaw reconnects with exponential backoff (initial delay 2 s, up to 10 attempts, capped at 60 s between retries).
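To make the numbers concrete, here is a small sketch of the failure counter and the retry schedule. It assumes the backoff doubles on each attempt (the source says "exponential" without giving the factor), and `HealthTracker` and `backoff_delays` are hypothetical names, not GoClaw identifiers.

```python
def backoff_delays(
    initial: float = 2.0, cap: float = 60.0, attempts: int = 10
) -> list[float]:
    """Delay in seconds before each reconnect attempt (assumes doubling)."""
    return [min(initial * 2**i, cap) for i in range(attempts)]


class HealthTracker:
    """Flags a server as disconnected only after N consecutive ping failures."""

    def __init__(self, failure_threshold: int = 3) -> None:
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def record_ping(self, ok: bool) -> bool:
        """Record one ping; return True if the server should be marked down."""
        if ok:
            # Any successful ping resets the streak, so isolated blips are ignored.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures >= self.failure_threshold
```

Under the doubling assumption the delays reach the 60 s cap on the sixth attempt and stay there for the remaining four.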
2828

2929
## Registering an MCP Server
3030

@@ -103,13 +103,13 @@ If no prefix is set and a name collision is detected, GoClaw logs a warning (`mc
103103

104104
## Search Mode (large tool sets)
105105

106-
When the total number of MCP tools across all servers exceeds **40**, GoClaw automatically enters **search mode**: tools are no longer registered inline in the tool registry. Instead, only the built-in `mcp_tool_search` tool is exposed. The agent uses `mcp_tool_search` to find and activate specific MCP tools on demand.
106+
When the total number of MCP tools across all servers exceeds **40**, GoClaw automatically enters **hybrid mode**: the first 40 tools remain registered inline in the tool registry, while the remainder are deferred to search mode. In hybrid mode, the built-in `mcp_tool_search` tool is also exposed so the agent can find and activate the deferred tools on demand.
107107

108108
This keeps the tool list manageable when connecting many MCP servers. There is no configuration required — the switch is automatic.
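The inline/deferred split can be pictured with a short sketch. It assumes "first 40" means the first 40 tools in registration order, which the source does not specify; `partition_tools` is a hypothetical helper.

```python
INLINE_LIMIT = 40  # threshold stated above


def partition_tools(
    tool_names: list[str], inline_limit: int = INLINE_LIMIT
) -> tuple[list[str], list[str]]:
    """Return (inline, deferred) tool names."""
    if len(tool_names) <= inline_limit:
        # At or below the threshold: everything stays inline, nothing is deferred.
        return list(tool_names), []
    # Above the threshold: hybrid mode keeps the first N inline, defers the rest.
    return tool_names[:inline_limit], tool_names[inline_limit:]
```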
109109

110110
### Lazy activation
111111

112-
In search mode, if an agent calls a deferred MCP tool directly by name (without searching first), GoClaw **auto-activates** it. The tool is resolved from the MCP server, registered on the fly, and executed — no extra search step needed. This enables compatibility with agents that already know the tool name from prior context.
112+
In hybrid mode, if an agent calls a deferred MCP tool directly by name (without searching first), GoClaw **auto-activates** it. The tool is resolved from the MCP server, registered on the fly, and executed — no extra search step needed. This enables compatibility with agents that already know the tool name from prior context.
113113

114114
## Per-Agent Access Grants
115115

@@ -297,4 +297,4 @@ Requires admin role. The credentials are encrypted at rest using `GOCLAW_ENCRYPT
297297
- [Custom Tools](../advanced/custom-tools.md) — build shell-backed tools without an MCP server
298298
- [Skills](../advanced/skills.md) — inject reusable knowledge into agent system prompts
299299

300-
<!-- goclaw-source: e7afa832 | updated: 2026-03-30 -->
300+
<!-- goclaw-source: c388364d | updated: 2026-04-01 -->

channels/discord.md

Lines changed: 3 additions & 3 deletions
@@ -76,15 +76,15 @@ In servers (channels), the bot requires being mentioned by default (`require_men
7676

7777
### Typing Indicator
7878

79-
While the agent processes, a typing indicator is shown (9-second keepalive).
79+
While the agent processes, a typing indicator is shown (9-second keepalive). The typing indicator stops automatically after successful message delivery.
8080

8181
### Thread Support
8282

8383
The bot automatically detects and responds in Discord threads. Responses stay in the same thread.
8484

8585
### Media from Replied-to Messages
8686

87-
When a user replies to a message that contains media attachments, GoClaw extracts those attachments and includes them in the inbound message context. This lets the agent see and process media even when it was originally shared in a previous turn.
87+
When a user replies to a message that contains media attachments, GoClaw extracts those attachments and includes them in the inbound message context. This lets the agent see and process media even when it was originally shared in a previous turn. Attachment source URLs are preserved in media tags, so agents can reference the original Discord CDN URL.
8888

8989
### Group Media History
9090

@@ -135,4 +135,4 @@ Per-guild/channel overrides are not yet supported in the Discord channel impleme
135135
- [Larksuite](/channel-feishu) — Larksuite integration with streaming cards
136136
- [Browser Pairing](/channel-browser-pairing) — Pairing flow
137137

138-
<!-- goclaw-source: 120fc2d | updated: 2026-03-18 -->
138+
<!-- goclaw-source: c388364d | updated: 2026-04-01 -->

channels/telegram.md

Lines changed: 3 additions & 1 deletion
@@ -243,6 +243,8 @@ Commands processed before message enrichment:
243243
| `/status` | Bot status + username | -- |
244244
| `/tasks` | Team task list | -- |
245245
| `/task_detail <id>` | View task | -- |
246+
| `/subagents` | List all active subagent tasks with status | -- |
247+
| `/subagent <id>` | Show a detailed view of one subagent task, read from the database | -- |
246248
| `/addwriter` | Add group file writer | Writers only |
247249
| `/removewriter` | Remove group file writer | Writers only |
248250
| `/writers` | List group writers | -- |
@@ -291,4 +293,4 @@ Each Telegram instance maintains an isolated HTTP transport — no shared connec
291293
- [Browser Pairing](/channel-browser-pairing) — Pairing flow
292294
- [Sessions & History](/sessions-and-history) — Conversation history
293295

294-
<!-- goclaw-source: a47d7f9f | updated: 2026-03-31 -->
296+
<!-- goclaw-source: c388364d | updated: 2026-04-01 -->

core-concepts/agents-explained.md

Lines changed: 36 additions & 1 deletion
@@ -169,10 +169,45 @@ After each conversation run, GoClaw evaluates whether to compact session history
169169

170170
Predefined agents have built-in protection against social engineering. If a user tries to convince the agent to ignore its SOUL.md or act outside its defined identity, the agent is designed to resist. Shared identity files are injected into the system prompt at a level that takes precedence over user instructions.
171171

172+
## Subagent Enhancements
173+
174+
When an agent spawns subagents via the `spawn` tool, the following capabilities apply:
175+
176+
### Per-Edition Rate Limiting
177+
178+
The `Edition` struct enforces two tenant-scoped limits on subagent usage:
179+
180+
| Field | Description |
181+
|-------|-------------|
182+
| `MaxSubagentConcurrent` | Max number of subagents running in parallel per tenant |
183+
| `MaxSubagentDepth` | Max nesting depth — prevents unbounded delegation chains |
184+
185+
These are set per edition and enforced at spawn time.
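A spawn-time check against these two limits might look like the sketch below. The Python `Edition` class only mirrors the two documented fields in snake case; `can_spawn`, its arguments, and the depth-counting convention (a top-level agent is depth 0) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Edition:
    max_subagent_concurrent: int  # mirrors MaxSubagentConcurrent
    max_subagent_depth: int       # mirrors MaxSubagentDepth


def can_spawn(edition: Edition, running: int, parent_depth: int) -> tuple[bool, str]:
    """Decide whether one more subagent may start for this tenant."""
    if running >= edition.max_subagent_concurrent:
        return False, "concurrency limit reached"
    # Assumption: the child would run one level below its parent.
    if parent_depth + 1 > edition.max_subagent_depth:
        return False, "depth limit reached"
    return True, ""
```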
186+
187+
### Token Cost Tracking
188+
189+
Each subagent accumulates per-call input and output token counts. Totals are persisted in the database and included in announce messages, giving the parent agent full visibility into delegation cost.
190+
191+
### WaitAll Orchestration
192+
193+
`spawn(action=wait, timeout=N)` blocks the parent until all previously spawned children complete. This enables fan-out/fan-in patterns without polling.
194+
195+
### Auto-Retry with Backoff
196+
197+
Configurable `MaxRetries` (default `2`) with linear backoff handles transient LLM failures automatically. The parent is only notified on permanent failure after all retries are exhausted.
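Linear backoff means the wait grows by a fixed step per retry rather than doubling. The sketch below shows the pattern, assuming a 1-second step; `run_with_retries`, `base_delay`, and the injectable `sleep` parameter are hypothetical and exist only to keep the example testable.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    call: Callable[[], T],
    max_retries: int = 2,       # default stated above
    base_delay: float = 1.0,    # hypothetical step size in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on failure with linearly growing delays."""
    attempt = 0
    while True:
        try:
            return call()
        except Exception:
            if attempt >= max_retries:
                # Retries exhausted: surface the permanent failure to the caller.
                raise
            attempt += 1
            sleep(base_delay * attempt)  # waits 1x, then 2x the base delay
```

With the default of two retries a call is attempted at most three times before the failure is reported.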
198+
199+
### SubagentDenyAlways
200+
201+
Subagents cannot spawn nested subagents — the `team_tasks` tool is blocked in subagent context. All delegation must originate from a top-level agent.
202+
203+
### Producer-Consumer Announce Queue
204+
205+
Staggered subagent results are queued and merged into a single LLM run announcement on the parent side. This reduces unnecessary parent wake-ups when multiple subagents finish at different times.
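The consumer side of such a queue can be sketched as a drain-and-merge step. This is only an illustration of the pattern using Python's standard `queue` module; `drain_and_merge` and the announcement format are hypothetical and do not reflect GoClaw's actual message layout.

```python
import queue
from typing import Optional


def drain_and_merge(results: "queue.Queue[str]") -> Optional[str]:
    """Collect every queued subagent result into one announcement."""
    items: list[str] = []
    while True:
        try:
            items.append(results.get_nowait())
        except queue.Empty:
            break
    if not items:
        return None  # nothing pending, so the parent is not woken
    # One merged announcement means one parent LLM run, not one per result.
    return "\n".join(f"[{i}] {text}" for i, text in enumerate(items, 1))
```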
206+
172207
## What's Next
173208

174209
- [Sessions and History](/sessions-and-history) — How conversations persist
175210
- [Tools Overview](/tools-overview) — What tools agents can use
176211
- [Memory System](/memory-system) — Long-term memory and search
177212

178-
<!-- goclaw-source: c70e50c9 | updated: 2026-03-28 -->
213+
<!-- goclaw-source: c388364d | updated: 2026-04-01 -->

core-concepts/tools-overview.md

Lines changed: 20 additions & 1 deletion
@@ -184,6 +184,25 @@ Admins can disable specific groups per agent:
184184

185185
The `tools.exec_approval` setting adds an additional approval layer (`full`, `light`, or `none`).
186186

187+
## spawn — Subagent Orchestration
188+
189+
The `spawn` tool (part of `group:sessions`) creates and runs subagents. Key capabilities:
190+
191+
| Capability | Detail |
192+
|-----------|--------|
193+
| **WaitAll** | `spawn(action=wait, timeout=N)` blocks the parent until all previously spawned children complete. Useful for fan-out/fan-in patterns. |
194+
| **Auto-retry** | Configurable `MaxRetries` (default `2`) with linear backoff on LLM failures. Transient errors are retried automatically. |
195+
| **Token tracking** | Each subagent accumulates per-call input/output token counts. Totals are included in announce messages so the parent can account for cost. |
196+
| **SubagentDenyAlways** | Subagents cannot spawn nested subagents — the `team_tasks` tool is blocked in subagent context. Prevents unbounded delegation chains. |
197+
| **Producer-consumer announce queue** | Staggered subagent results are queued and merged into a single LLM run announcement on the parent side, reducing unnecessary wake-ups. |
198+
199+
```jsonc
200+
// Example: fan-out then wait
201+
spawn(action=start, prompt="Summarize part A")
202+
spawn(action=start, prompt="Summarize part B")
203+
spawn(action=wait, timeout=120) // blocks until both finish
204+
```
205+
187206
## Session Tool Security
188207

189208
Session tools (`sessions_list`, `sessions_history`, `sessions_send`) are hardened with fail-closed validation:
@@ -234,4 +253,4 @@ All parameters are optional — defaults apply when not configured.
234253
- [Multi-Tenancy](/multi-tenancy) — Per-user tool access and isolation
235254
- [Custom Tools](/custom-tools) — Build your own tools
236255

237-
<!-- goclaw-source: 4d31fe0 | updated: 2026-03-28 -->
256+
<!-- goclaw-source: c388364d | updated: 2026-04-01 -->

deployment/upgrading.md

Lines changed: 3 additions & 2 deletions
@@ -9,7 +9,7 @@ A GoClaw upgrade has two parts:
99
1. **SQL migrations** — schema changes applied by `golang-migrate` (idempotent, versioned)
1010
2. **Data hooks** — optional Go-based data transformations that run after schema migrations (e.g. backfilling a new column)
1111

12-
The `./goclaw upgrade` command handles both in the correct order. It is safe to run multiple times — it is fully idempotent. The current required schema version is **33**.
12+
The `./goclaw upgrade` command handles both in the correct order. It is safe to run multiple times — it is fully idempotent. The current required schema version is **34**.
1313

1414
```mermaid
1515
graph LR
@@ -226,6 +226,7 @@ These five migrations are auto-applied on startup when upgrading to v2.x. No man
226226
| 031 | Adds `tsv tsvector` generated column + GIN index to `kg_entities` for full-text search; creates `kg_dedup_candidates` table for entity deduplication review |
227227
| 032 | Creates `secure_cli_user_credentials` for per-user CLI credential injection; adds `contact_type` column to `channel_contacts` |
228228
| 033 | Cron payload columns | Promotes `stateless`, `deliver`, `deliver_channel`, `deliver_to`, `wake_heartbeat` from `payload` JSONB to dedicated columns on `cron_jobs` |
229+
| 034 | `subagent_tasks` | Subagent task persistence for DB-backed task tracking |
229230

230231
### Breaking Changes in v2.x
231232

@@ -278,4 +279,4 @@ Before each upgrade, check the release notes for:
278279
- [Database Setup](/deploy-database) — PostgreSQL and pgvector setup
279280
- [Observability](/deploy-observability) — monitor your gateway post-upgrade
280281

281-
<!-- goclaw-source: a47d7f9f | updated: 2026-03-31 -->
282+
<!-- goclaw-source: c388364d | updated: 2026-04-01 -->
