Fix: Decode single message over multiple stream reads#52
Closed
Phantom168 wants to merge 3 commits into
Closed
Conversation
… buffer and then decoded, once complete
Collaborator
|
Thanks. Closing in favor of #53 which properly handles the buffer accumulation without the error handling and edge case issues found here. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR aims to fix the issue claude code exiting due to parsing failure of a single json message over multiple stream reads Issue 6 Issue 15
The default maximum size of asyncio stream is 64KB. This limit can easily be breached when claude code invokes a tool to read a file of over 1000 lines.
Since a single message has been split over multiple stream reads, attempting to decode it as json leads to JSONDecodeError.
To demonstrate this, let's assume the asyncio stream limit is 5 characters, then the json message
{"a": "bc"}would be split into 3 consecutive stream reads of{"a":,"bc",}Attempting to parse each of these individually will lead to JSONDecodeError error.
This PR includes a test case in test_subprocess_buffering.py to demonstrate the issue.
The test case fails without the fix.
To fix the issue, a buffer (accumulator variable) is initialised as an empty string. If there is a JSONDecodeError, the stream read is appended to the buffer. This cumulative buffer will be parsed. This will continue until the buffer has the complete JSON and is parsed successfully.
To avoid the buffer increasing infinitely, an arbitrary limit of 1MB (for now) has been put on the buffer, post which it'll result in an OverflowError.