fix(inworld-tts): don't poison receive stream with stale-context errors #539
Two related issues caused TTS calls to fail with `Context not found
during sendText` after >60 seconds of inactivity, leaving the agent
unable to speak its proactive responses.
1. **`_keepalive_loop` sends `send_text` without a `contextId`.**
When `_active_context_id` is `None`, the keepalive payload still
contains `{"send_text": {"text": ""}}` with no `contextId`. The
server cannot route this to a context and responds with an error
message. The error then sits in the WebSocket receive buffer and
surfaces on the next valid TTS call, breaking it. Fix: skip the
keepalive iteration when no active context exists. The websockets
library handles connection keepalive independently via WebSocket
PING/PONG frames, which operate at the protocol level rather than
the TCP level.
2. **`_receive_audio` did not filter errors by `contextId`.** Audio
chunks were filtered by `msg_context_id != context_id`, but the
error/status check ran first and raised regardless of which context
the message was for. Fix: pull the `contextId` mismatch check above
the status/error checks so messages addressed to a different (or
stale) context are dropped early. Server-wide errors with no
`contextId` (e.g. "max contexts limit reached") still pass through.
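The reordering described in the second fix can be sketched as a small filter function. This is only an illustration, not the code from the PR: the function name `filter_message` and the message shape (a dict with optional `contextId` and `error` keys) are assumptions made for the example, and the real Inworld message envelope may differ.

```python
from typing import Optional


def filter_message(msg: dict, context_id: str) -> Optional[dict]:
    """Return `msg` if it is relevant to `context_id`, otherwise None.

    Hypothetical helper: the keys "contextId" and "error" are assumed
    for this sketch and are not taken from the Inworld API reference.
    """
    msg_context_id = msg.get("contextId")

    # Step 1: drop anything addressed to a different (or stale) context
    # BEFORE looking at errors, so foreign errors cannot break this call.
    if msg_context_id is not None and msg_context_id != context_id:
        return None

    # Step 2: only now check for errors. These are either errors for our
    # own context or server-wide errors that carry no contextId at all.
    error = msg.get("error")
    if error is not None:
        raise RuntimeError(f"Inworld TTS websocket error: {error}")

    return msg
```

With this ordering, an error tagged with an old context is silently skipped, while an error without any `contextId` still raises.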
Walkthrough: The TTS streaming client refactors message handling for concurrent contexts. The receive loop now extracts …
Why
TTS calls fail with `RuntimeError: Inworld TTS websocket error: Context not found during sendText` after >60 seconds of inactivity. The LLM still generates a response, but the agent can never speak it. This reproduces reliably with any agent that goes quiet between turns, for example a face-aware agent that re-engages a distracted user with a proactive cue after a pause.
Changes
1. **`_keepalive_loop` sends `send_text` without a `contextId`.** When `_active_context_id` is `None`, the keepalive payload still contains `{"send_text": {"text": ""}}` with no `contextId`. The server cannot route this to a context and responds with an error message. The error then sits in the WebSocket receive buffer and surfaces on the next valid TTS call, breaking it. Fix: skip the keepalive iteration when no active context exists. The websockets library handles connection keepalive independently via WebSocket PING/PONG frames, which operate at the protocol level rather than the TCP level.
2. **`_receive_audio` did not filter errors by `contextId`.** Audio chunks were filtered by `msg_context_id != context_id`, but the error/status check ran first and raised regardless of which context the message was for. Fix: pull the `contextId` mismatch check above the status/error checks so messages addressed to a different (or stale) context are dropped early. Server-wide errors with no `contextId` (e.g. `"max contexts limit reached"`) still pass through.
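The keepalive guard from the first change might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the PR's implementation: the class name `KeepaliveSketch`, the helper `build_keepalive`, the `interval` parameter, and the placement of `contextId` inside the payload are all hypothetical. Only `_keepalive_loop`, `_active_context_id`, and the `{"send_text": {"text": ""}}` shape come from the description above.

```python
import asyncio
import json
from typing import Optional


def build_keepalive(context_id: Optional[str]) -> Optional[str]:
    """Return a keepalive payload, or None when there is nothing to target.

    Assumption: `contextId` sits at the top level of the payload.
    """
    if context_id is None:
        # No active context: the server could not route the message,
        # so sending it would only produce a stray error reply.
        return None
    return json.dumps({"send_text": {"text": ""}, "contextId": context_id})


class KeepaliveSketch:
    """Hypothetical holder for the loop; `ws` only needs an async send()."""

    def __init__(self, ws, interval: float) -> None:
        self._ws = ws
        self._interval = interval
        self._active_context_id: Optional[str] = None

    async def _keepalive_loop(self) -> None:
        while True:
            await asyncio.sleep(self._interval)
            payload = build_keepalive(self._active_context_id)
            if payload is None:
                continue  # skip this iteration, try again next interval
            await self._ws.send(payload)
```

Keeping the payload construction in a pure function makes the skip condition easy to unit test without a live WebSocket connection.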