fix(llmcore): suppress intermediate error yield during retry and handle ChunkedEncodingError#648
Open
Kailigithub wants to merge 1 commit into
Open
Conversation
…le ChunkedEncodingError When the LLM stream encounters a transient network error (ConnectionError, Timeout, SSLError, ProxyError), the retry path yielded the error string back to the caller on every attempt. This caused the UI to display a wall of '!!!Error: ProxyError!!!Error: ConnectionError!!!Error: SSLError' strings during a network blip, even when the retry ultimately succeeded. Match the existing HTTP retry path (which sleeps without yielding) by removing the intermediate yield so the caller only sees the final error if retries are exhausted. Retry attempts remain visible via the existing '[LLM Retry] ...' stdout log. Also extend the exception tuple to include ChunkedEncodingError. Without it, a truncated chunked response falls through to the bare 'except Exception' branch and aborts the stream with no retry, even though the transport-level error is transient and recoverable. Closes lsdefine#571
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What problem this solves
When the LLM stream hits a transient network error (e.g. wifi blip walking to a meeting room, VPN reconnect, or proxy hiccup), the retry path yielded the raw
!!!Error: ...string back to the UI on every attempt. The user then saw a wall of stacked error messages like:even when the eventual retry succeeded and the assistant was still producing a response. The user reported this as issue #571 with the expected behavior: exponential-backoff retry rather than a visible fail.
There was also a secondary gap:
requests.exceptions.ChunkedEncodingErrorwas not in the retryable-exception tuple, so a truncated chunked response would fall through to the bareexcept Exceptionbranch and abort the stream with no retry, even though it is a transient transport-level error.What changed
llmcore.py:_stream_with_retry:requests.exceptions.ChunkedEncodingErrorto the connection-error exception tuple. SSLError and ProxyError already inherit fromrequests.ConnectionErrorand were already caught - only ChunkedEncodingError was actually missing.yield errfrom the intermediate-retry branch, matching the existing HTTP-retry path (whichtime.sleeps without yielding). The retry is still visible via the existing[LLM Retry] ...stdout log; the user-facing stream now stays silent during retries and only surfaces a final error if all attempts are exhausted.Evidence
Behavior before vs after, mocked retry sequence (max_retries=3):
!!!Error: ...chunks + content.Local validation:
Behavioral tests (unittest.mock) verified that on a 2-error-then-success sequence the user-visible stream contains zero intermediate error chunks, while on exhaustion exactly one final error chunk is produced.
Risk
Minimal: 2 lines changed in one function. The removed
yield erronly affected the connection-error retry path; the HTTP-retry path was already yield-free, so this is just bringing the two paths into consistency. All three call sites of_stream_with_retry(_openai_stream,_claude_stream,BaseSession.ask) consume the generator without depending on intermediate yield count.Closes #571