feat: implement streaming support for Claude model #1221
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
I have signed the CLA!
d856d37 to a450dee
Hi @serelli, thank you for your contribution! We appreciate you taking the time to submit this pull request. As per the contribution policy, please ensure your PR consists of a single commit. Could you please squash your commits accordingly?
Switch Claude.generateContent() to use the Anthropic SDK's createStreaming() API when stream=true, replacing the previous non-streaming fallback (which had a TODO: Switch to streaming API comment).

- Text deltas are emitted immediately as partial LlmResponses
- Tool use JSON is accumulated across inputJson deltas and emitted as a function call LlmResponse on contentBlockStop
- messageStop emits a turnComplete=true sentinel
- StreamResponse is always closed via Flowable.using()
- Extracted buildMessageCreateParams() to deduplicate param logic

Add 6 unit tests covering text streaming, tool call accumulation, mixed text+tool responses, empty streams, and invalid JSON fallback.
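The event handling described in this commit message can be sketched roughly as follows. This is a minimal, JDK-only illustration and not the code from the PR: `StreamingSketch`, `Kind`, and `Event` are hypothetical stand-ins for the Anthropic SDK's stream events, the string results stand in for `LlmResponse` objects, and the `Flowable.using()` resource handling is omitted entirely.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the streaming event loop; not the PR's actual code. */
final class StreamingSketch {
  /** Assumed event kinds, loosely mirroring the events named in the commit message. */
  enum Kind { TEXT_DELTA, TOOL_USE_START, INPUT_JSON_DELTA, CONTENT_BLOCK_STOP, MESSAGE_STOP }

  static final class Event {
    final Kind kind;
    final String payload; // text chunk, tool name, or partial JSON, depending on kind

    Event(Kind kind, String payload) {
      this.kind = kind;
      this.payload = payload;
    }
  }

  /** Maps events to response descriptions; strings stand in for LlmResponse objects. */
  static List<String> process(List<Event> events) {
    List<String> out = new ArrayList<>();
    String toolName = null;                   // non-null while a tool-use block is open
    StringBuilder json = new StringBuilder(); // accumulates partial tool-input JSON
    for (Event e : events) {
      switch (e.kind) {
        case TEXT_DELTA:
          out.add("partial-text:" + e.payload); // emitted immediately as a partial response
          break;
        case TOOL_USE_START:
          toolName = e.payload;
          json.setLength(0);
          break;
        case INPUT_JSON_DELTA:
          if (toolName != null) {
            json.append(e.payload); // buffer only; nothing is emitted yet
          }
          break;
        case CONTENT_BLOCK_STOP:
          if (toolName != null) { // a text block stopping emits nothing extra
            out.add("function-call:" + toolName + ":" + json);
            toolName = null;
          }
          break;
        case MESSAGE_STOP:
          out.add("turn-complete"); // the turnComplete=true sentinel
          break;
      }
    }
    return out;
  }
}
```

Buffering the tool input until the block stops means a consumer never sees a half-formed function call, while text can still be rendered chunk by chunk.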
01e12c7 to b85ef79
@hemasekhar-p Done! The PR now consists of a single commit.
@serelli, thank you for addressing the pre-checks. This PR is currently under review by our team; we will keep you posted if any additional information is required. Thank you.
@tilgalas, could you please review this PR?
Summary
- Switch Claude.generateContent() to use the Anthropic SDK's createStreaming() API when stream=true, replacing the previous non-streaming fallback (which had a TODO: Switch to streaming API comment)
- Text deltas are emitted immediately as LlmResponses with partial=true
- Tool use JSON is accumulated across inputJson deltas and emitted as a function call LlmResponse on contentBlockStop
- messageStop emits a turnComplete=true sentinel
- StreamResponse is always closed via Flowable.using() to prevent resource leaks
- Extracted buildMessageCreateParams() to deduplicate param-building logic shared by both streaming and non-streaming paths

Test plan
- testStreaming_textChunksEmittedAsPartialResponses: text deltas become partial=true responses in order
- testStreaming_messageStopEmitsTurnComplete: messageStop event sets turnComplete=true
- testStreaming_toolCallAccumulatedAndEmittedOnBlockStop: tool JSON accumulates across deltas, emits complete function call on block stop
- testStreaming_mixedTextAndToolCall: interleaved text + tool call in same stream produces correct ordered responses
- testStreaming_emptyStream_producesNoResponses: empty event stream produces no output
- testStreaming_toolCallWithInvalidJson_fallsBackToEmptyArgs: malformed JSON logs a warning and falls back to empty args map
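The invalid-JSON fallback exercised by the last test might look something like the sketch below. It is an illustration under stated assumptions, not the PR's implementation: `ToolArgsSketch` and `parseArgsOrEmpty` are hypothetical names, the JSON parser is passed in as a function because the JDK ships none (the real code presumably uses a JSON library), and treating blank input as empty args is also an assumption.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Function;
import java.util.logging.Logger;

/** Hypothetical helper illustrating the "fall back to empty args" behavior. */
final class ToolArgsSketch {
  private static final Logger logger = Logger.getLogger(ToolArgsSketch.class.getName());

  /**
   * Parses accumulated tool-input JSON with the supplied parser.
   * Returns an empty map if the input is blank or the parser throws.
   */
  static Map<String, Object> parseArgsOrEmpty(
      String json, Function<String, Map<String, Object>> parser) {
    if (json == null || json.trim().isEmpty()) {
      // Assumption: a tool call that streamed no input JSON has no arguments.
      return Collections.<String, Object>emptyMap();
    }
    try {
      Map<String, Object> parsed = parser.apply(json);
      return parsed == null ? Collections.<String, Object>emptyMap() : parsed;
    } catch (RuntimeException e) {
      // Malformed JSON: log a warning and degrade to an empty args map.
      logger.warning("Could not parse tool call arguments, using empty args: " + e.getMessage());
      return Collections.<String, Object>emptyMap();
    }
  }
}
```

Degrading to empty args keeps the stream alive when the accumulated JSON is malformed, at the cost of the function call arriving without its parameters; the logged warning is what makes that case visible.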