fix: stabilize DeepSeek streaming output by avir4er · Pull Request #461 · aliasrobotics/cai

avir4er · 2026-07-03T03:15:05Z

Summary

This patch stabilizes DeepSeek/LiteLLM streaming in CAI by:

- avoiding duplicate litellm.acompletion() calls when stream=True
- fixing the DeepSeek/Claude thinking context construction path, which referenced an undefined name, panel
- making DeepSeek raw reasoning_content display opt-in via CAI_SHOW_REASONING=true (or legacy CAI_SHOW_THINKING=true) to prevent token-by-token terminal flooding
- keeping Claude reasoning display behavior unchanged
- adding bounded LiteLLM model request waits via CAI_MODEL_TIMEOUT (default 180 seconds; CAI_LLM_TIMEOUT is accepted as an alias; a value <=0 disables CAI's injected timeout)
- enforcing an outer asyncio timeout around LiteLLM calls, in case a provider/client stalls before LiteLLM raises
- enforcing a per-chunk idle timeout for streamed LiteLLM responses, so a returned stream cannot wait forever for the next SSE chunk
- retrying transient streaming provider/proxy disconnects, such as a DeepSeek "Server disconnected" surfaced as a LiteLLM InternalServerError
- routing the remaining runtime litellm.acompletion() call sites through the shared timeout wrapper, except an existing REPL recovery helper that already uses asyncio.wait_for
- converting streamed provider disconnects into typed LLMProviderUnavailable errors, so headless mode shows the existing concise provider-load message instead of a full traceback unless CAI_DEBUG=2
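
The timeout settings listed above could be resolved and applied roughly as follows. This is a minimal sketch, not the code in this PR: the function names resolve_model_timeout and call_with_timeout are hypothetical, and two behaviors are assumptions (CAI_MODEL_TIMEOUT takes precedence over the CAI_LLM_TIMEOUT alias, and an unparsable value falls back to the default).

```python
import asyncio
from typing import Awaitable, Mapping, Optional, TypeVar

T = TypeVar("T")
DEFAULT_TIMEOUT_S = 180.0  # default stated in the summary


def resolve_model_timeout(env: Mapping[str, str]) -> Optional[float]:
    """Return the timeout in seconds, or None when the timeout is disabled."""
    # Assumption: CAI_MODEL_TIMEOUT wins over the CAI_LLM_TIMEOUT alias.
    raw = env.get("CAI_MODEL_TIMEOUT") or env.get("CAI_LLM_TIMEOUT")
    if raw is None or not raw.strip():
        return DEFAULT_TIMEOUT_S
    try:
        value = float(raw)
    except ValueError:
        # Assumption: an unparsable value falls back to the default.
        return DEFAULT_TIMEOUT_S
    # A value <= 0 disables the injected timeout.
    return value if value > 0 else None


async def call_with_timeout(awaitable: Awaitable[T], timeout: Optional[float]) -> T:
    """Outer bound on a model call; a timeout of None waits without a limit."""
    return await asyncio.wait_for(awaitable, timeout=timeout)
```

In practice the awaitable would be the litellm.acompletion(...) call and env would be os.environ. The outer asyncio.wait_for still fires when the provider client stalls without raising.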

Why

During interactive Web App Pentester usage with deepseek/deepseek-v4-pro, long tasks could remain stuck "in flight", flood the terminal with token-by-token DeepSeek reasoning, or print a long traceback when DeepSeek/LiteLLM disconnected while opening a stream.

Four issues contributed:

1. The streaming LiteLLM adapter opened one stream, discarded it, and then opened a second stream that it actually returned, so every streamed request called litellm.acompletion() twice.
2. LiteLLM model calls did not receive a bounded timeout, so a slow or stalled provider or proxy could leave CAI waiting indefinitely.
3. Some LiteLLM/provider combinations can return a stream object and then stall while awaiting the next streamed chunk; the new stream-idle wrapper bounds that phase too.
4. Streaming high-level recovery retried timeouts and rate limits, but not transient provider disconnects such as LiteLLM InternalServerError: DeepseekException - Server disconnected.
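
The per-chunk idle timeout and the retry of transient disconnects can be illustrated with a short sketch. It is written under stated assumptions and is not the wrapper from this PR: iter_with_idle_timeout and retry_transient are hypothetical names, and ProviderUnavailable is a stand-in for the LLMProviderUnavailable type mentioned in the summary, whose real definition is not shown here.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, TypeVar

T = TypeVar("T")


class ProviderUnavailable(Exception):
    """Hypothetical stand-in for the typed LLMProviderUnavailable error."""


async def iter_with_idle_timeout(
    stream: AsyncIterator[T], idle_timeout: float
) -> AsyncIterator[T]:
    """Yield chunks, failing if the next one takes longer than idle_timeout."""
    iterator = stream.__aiter__()
    while True:
        try:
            # Bound the wait for each individual chunk, not the whole stream.
            chunk = await asyncio.wait_for(iterator.__anext__(), timeout=idle_timeout)
        except StopAsyncIteration:
            return
        except asyncio.TimeoutError as exc:
            raise ProviderUnavailable(f"no chunk within {idle_timeout}s") from exc
        yield chunk


async def retry_transient(
    call: Callable[[], Awaitable[T]], attempts: int = 3, base_delay: float = 0.0
) -> T:
    """Re-run call() when it fails with a transient provider error."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except ProviderUnavailable:
            if attempt == attempts:
                raise
            # Exponential backoff between attempts (zero delay by default).
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")
```

This sketch retries the whole call, which suits a disconnect that happens while the stream is being opened. Retrying after chunks have already been shown to the user would need extra care to avoid duplicated output.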

DeepSeek reasoning deltas were also printed by default, while the Rich thinking context path raised NameError: name 'panel' is not defined.
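
The opt-in gate for DeepSeek reasoning output might look like the sketch below. The function names and the set of accepted truthy spellings are assumptions; the PR description only states that CAI_SHOW_REASONING=true or the legacy CAI_SHOW_THINKING=true enables the display. The sketch covers only the DeepSeek reasoning_content gate, not the Claude path.

```python
from typing import Mapping, Optional

# Assumption: spellings other than "true" are accepted for convenience.
_TRUTHY = {"1", "true", "yes", "on"}


def show_reasoning(env: Mapping[str, str]) -> bool:
    """Return True when either the new or the legacy flag is switched on."""
    for name in ("CAI_SHOW_REASONING", "CAI_SHOW_THINKING"):
        if env.get(name, "").strip().lower() in _TRUTHY:
            return True
    return False


def text_to_print(delta: Mapping[str, Optional[str]], env: Mapping[str, str]) -> str:
    """Pick the printable text of one streamed delta (hypothetical shape)."""
    parts = []
    reasoning = delta.get("reasoning_content")
    if reasoning and show_reasoning(env):
        # Raw reasoning tokens are printed only when the user opted in.
        parts.append(reasoning)
    content = delta.get("content")
    if content:
        parts.append(content)
    return "".join(parts)
```

With the flags unset, a delta that carries only reasoning_content yields an empty string, so nothing floods the terminal.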

Tests

uv run --frozen pytest \
  tests/cli/test_cli_headless_cancellation.py \
  tests/core/test_openai_chatcompletions_stream.py::test_stream_response_retries_transient_provider_disconnect \
  tests/sdk/test_litellm_adapter_streaming.py \
  tests/util/test_thinking_display.py \
  -q

uv run --frozen python -m py_compile \
  src/cai/sdk/agents/models/chatcompletions/litellm_adapter.py \
  src/cai/sdk/agents/models/openai_chatcompletions.py \
  src/cai/cli_headless.py \
  src/cai/continuation.py \
  src/cai/ctr/digest.py \
  src/cai/tui/components/agent_creator_panel.py \
  tests/cli/test_cli_headless_cancellation.py \
  tests/core/test_openai_chatcompletions_stream.py \
  tests/sdk/test_litellm_adapter_streaming.py \
  tests/util/test_thinking_display.py

Result: 15 passed for the targeted pytest slice.

Additional audit:

rg -n "litellm\.acompletion" src/cai tests

Runtime direct call sites now go through the shared timeout wrapper; the remaining runtime exception-recovery call already has an explicit asyncio.wait_for, and the remaining direct references are tests.

avir4er added 4 commits July 3, 2026 11:12

fix: stabilize DeepSeek streaming output

c78b489

fix: bound LiteLLM model wait time

cdb6bce

fix: enforce LiteLLM completion timeout

3cfbf06

fix: retry transient streaming provider disconnects

0ac1c71

avir4er wants to merge 4 commits into aliasrobotics:main from avir4er:fix/deepseek-streaming-reasoning
