Clarify voice-only scoping of maxTokens and SOUL.md config

sanchitmonga22 · claude · sanchitmonga22 · commit f5d659626d1f · 2026-02-12T16:05:24.000-08:00
Add inline comments and an explicit default agent to the config
examples, making it unmistakable that the 512-token cap applies only
to the voice-assistant channel. Other channels (Telegram, WhatsApp,
Discord) are unaffected and keep the standard 8192 default.
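
The scoping argument can be sketched as a small resolution function. This
is an illustrative model only: the `Config` shape, `resolveMaxTokens`, and
`GLOBAL_DEFAULT_MAX_TOKENS` are hypothetical names invented for this
sketch, not OpenClaw's actual API, and the lookup order (binding, then
agent, then model key) is an assumption drawn from the docs changed below.

```typescript
// Hypothetical model of the routing described in this commit.
// None of these types or functions are OpenClaw's real API.
type Agent = { id: string; default?: boolean; model?: string };
type Binding = { agentId: string; match: { channel: string } };
type Config = {
  models: Record<string, { params: { maxTokens: number } }>;
  agents: Agent[];
  bindings: Binding[];
};

// Standard default quoted in the commit message (assumed to be global).
const GLOBAL_DEFAULT_MAX_TOKENS = 8192;

const config: Config = {
  models: {
    // Catalog entry only; inert unless an agent references it.
    "anthropic/claude-sonnet-4-5-voice": { params: { maxTokens: 512 } },
  },
  agents: [
    { id: "main", default: true }, // no model override
    { id: "voice-agent", model: "anthropic/claude-sonnet-4-5-voice" },
  ],
  bindings: [{ agentId: "voice-agent", match: { channel: "voice-assistant" } }],
};

function resolveMaxTokens(cfg: Config, channel: string): number {
  // 1. A binding that matches the channel picks the agent;
  //    otherwise the channel falls through to the default agent.
  const binding = cfg.bindings.find((b) => b.match.channel === channel);
  const agent = binding
    ? cfg.agents.find((a) => a.id === binding.agentId)
    : cfg.agents.find((a) => a.default === true);

  // 2. An agent without a model override uses the global default.
  const modelKey = agent?.model;
  if (modelKey === undefined) return GLOBAL_DEFAULT_MAX_TOKENS;

  // 3. Otherwise the referenced model key supplies the cap.
  return cfg.models[modelKey]?.params.maxTokens ?? GLOBAL_DEFAULT_MAX_TOKENS;
}
```

Under these assumptions, `resolveMaxTokens(config, "voice-assistant")`
yields the 512 cap, while any unbound channel such as `"telegram"` falls
through to `main` and gets 8192.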

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
diff --git a/RASPBERRY-PI-SETUP.md b/RASPBERRY-PI-SETUP.md
@@ -477,11 +477,13 @@ Speech rules:
 
 Add the following to `~/.openclaw/openclaw.json`:
 
-```json
+```json5
 {
   "agents": {
     "defaults": {
       "models": {
+        // Voice-only model key — only used by voice-agent below.
+        // Other agents (Telegram, WhatsApp, etc.) are NOT affected.
         "anthropic/claude-sonnet-4-5-voice": {
           "params": {
             "maxTokens": 512
@@ -490,6 +492,14 @@ Add the following to `~/.openclaw/openclaw.json`:
       }
     },
     "list": [
+      // Default agent — used by Telegram, WhatsApp, Discord, etc.
+      // No model override → uses the global default (8192 maxTokens).
+      {
+        "id": "main",
+        "default": true
+      },
+      // Voice-only agent — ONLY used when channel matches "voice-assistant".
+      // Gets the 512-token cap via the dedicated model key above.
       {
         "id": "voice-agent",
         "workspace": "~/.openclaw/workspaces/voice-agent",
@@ -498,6 +508,8 @@ Add the following to `~/.openclaw/openclaw.json`:
     ]
   },
   "bindings": [
+    // This binding scopes voice-agent to the voice-assistant channel ONLY.
+    // All other channels fall through to the default "main" agent.
     {
       "agentId": "voice-agent",
       "match": {
@@ -508,11 +520,14 @@ Add the following to `~/.openclaw/openclaw.json`:
 }
 ```
 
-This routes all voice-assistant messages to the `voice-agent` (with the conversational SOUL.md), while Telegram and other channels continue using the default agent with normal rich-text responses.
+**Scoping:** The 512-token limit ONLY applies to the voice channel. Here's why:
 
-**Why a separate model key?** OpenClaw's `maxTokens` is set per-model, not per-agent. By creating a dedicated model key (`anthropic/claude-sonnet-4-5-voice`), the voice agent gets a hard 512-token ceiling while other channels keep their default limit (8192). Both keys route to the same underlying Anthropic model — the key is just OpenClaw's internal routing identifier. Combined with the SOUL.md conciseness instructions, this ensures voice responses stay short and natural for TTS.
+1. The model key `anthropic/claude-sonnet-4-5-voice` (with `maxTokens: 512`) is just an entry in the model catalog — it does nothing unless an agent explicitly references it.
+2. Only `voice-agent` sets `"model": "anthropic/claude-sonnet-4-5-voice"`.
+3. Only the `voice-assistant` channel is bound to `voice-agent` (via the binding).
+4. The default `main` agent (used by Telegram, WhatsApp, Discord, etc.) has no model override, so it uses the global default model with the standard 8192 maxTokens.
 
-> **Tip:** If 512 tokens feels too restrictive (responses getting cut off), bump it to `768` or `1024`. For most spoken responses, 512 tokens (~3–5 sentences) is the sweet spot.
+> **Tip:** If 512 tokens feels too restrictive (responses getting cut off), bump it to `768` or `1024`. For most spoken responses, 512 tokens (~3-5 sentences) is the sweet spot.
 
 ### 9c. Restart and Test
 
diff --git a/docs/channels/voice-assistant.md b/docs/channels/voice-assistant.md
@@ -117,12 +117,17 @@ Create a dedicated model key for the voice agent with a low `maxTokens` value. T
   agents: {
     defaults: {
       models: {
+        // Voice-only model key — only used by voice-agent below.
+        // Telegram, WhatsApp, Discord etc. are NOT affected.
         "anthropic/claude-sonnet-4-5-voice": {
           params: { maxTokens: 512 },
         },
       },
     },
     list: [
+      // Default agent — all non-voice channels (unaffected, keeps 8192 default).
+      { id: "main", default: true },
+      // Voice-only agent — scoped to voice-assistant channel via binding.
       {
         id: "voice-agent",
         workspace: "~/.openclaw/workspaces/voice-agent",
@@ -131,12 +136,13 @@ Create a dedicated model key for the voice agent with a low `maxTokens` value. T
     ],
   },
   bindings: [
+    // ONLY voice-assistant channel routes to voice-agent.
     { agentId: "voice-agent", match: { channel: "voice-assistant" } },
   ],
 }
 ```
 
-The dedicated model key (`-voice` suffix) inherits the same underlying model but gets its own `maxTokens`. Other channels keep their default limit. Start with 512 tokens and adjust up if responses feel cut off.
+The 512-token cap ONLY applies to the voice channel. The model key is just a catalog entry — it does nothing unless an agent explicitly references it. Only `voice-agent` does, and only the `voice-assistant` channel is bound to it. All other channels fall through to the default `main` agent with its standard 8192 maxTokens. Start with 512 tokens and adjust up if responses feel cut off.
 
 ## Configuration