feat: use reasoning field in StreamingChunk for Google GenAI#2900

Merged
anakin87 merged 4 commits into deepset-ai:main from Br1an67:fix/issue-10478-genai-reasoning-field
Mar 2, 2026

Conversation

Contributor

@Br1an67 Br1an67 commented Mar 1, 2026

Related Issues

Proposed Changes:

Populate StreamingChunk.reasoning with ReasoningContent instead of storing reasoning deltas as dicts in meta["reasoning_deltas"]. This aligns the Google GenAI integration with the standard StreamingChunk.reasoning field, consistent with other integrations (e.g., Ollama in #2850).

Changes:

  • _convert_google_chunk_to_streaming_chunk(): Collect reasoning text from thought parts into a ReasoningContent object and pass it via the reasoning kwarg instead of meta["reasoning_deltas"]
  • _aggregate_streaming_chunks_with_reasoning(): Read from chunk.reasoning.reasoning_text instead of chunk.meta["reasoning_deltas"]
  • Ensure StreamingChunk mutual exclusivity constraint is respected (only one of content/tool_calls/reasoning set per chunk)
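
The changes listed above can be sketched roughly as follows. This is a minimal illustration, not the integration's actual code: `Part`, `ReasoningContent`, and `StreamingChunk` here are simplified stand-in dataclasses, and `convert_parts_to_chunk` and `aggregate_reasoning` are hypothetical names. The rule that regular text takes priority when a response carries both kinds of parts is an assumption made only to keep one field set per chunk.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Part:
    """Hypothetical stand-in for a Google GenAI response part."""
    text: str
    thought: bool = False


@dataclass
class ReasoningContent:
    """Simplified stand-in for Haystack's ReasoningContent."""
    reasoning_text: str
    extra: dict[str, Any] = field(default_factory=dict)


@dataclass
class StreamingChunk:
    """Simplified stand-in for Haystack's StreamingChunk."""
    content: str = ""
    reasoning: Optional[ReasoningContent] = None
    meta: dict[str, Any] = field(default_factory=dict)


def convert_parts_to_chunk(parts: list[Part]) -> StreamingChunk:
    """Build one chunk from the parts of a single streamed response."""
    text = "".join(p.text for p in parts if not p.thought)
    thoughts = "".join(p.text for p in parts if p.thought)
    # Assumption: regular text wins when both kinds are present, so that
    # only one of content/reasoning is ever set on a chunk.
    if text:
        return StreamingChunk(content=text)
    if thoughts:
        return StreamingChunk(reasoning=ReasoningContent(reasoning_text=thoughts))
    return StreamingChunk()


def aggregate_reasoning(chunks: list[StreamingChunk]) -> str:
    """Join reasoning deltas by reading chunk.reasoning, not chunk.meta."""
    return "".join(c.reasoning.reasoning_text for c in chunks if c.reasoning)


chunks = [
    convert_parts_to_chunk([Part("Let me think. ", thought=True)]),
    convert_parts_to_chunk([Part("2 + 2 = 4. ", thought=True)]),
    convert_parts_to_chunk([Part("The answer is 4.")]),
]
reasoning_text = aggregate_reasoning(chunks)
answer_text = "".join(c.content for c in chunks)
```

Reasoning chunks leave `content` empty and never write to `meta["reasoning_deltas"]`, so the aggregation step only needs to look at `chunk.reasoning`.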

How did you test it?

  • Added test_convert_google_chunk_to_streaming_chunk_with_thought to verify thought parts populate StreamingChunk.reasoning instead of meta
  • Updated test_aggregate_streaming_chunks_with_reasoning to use chunk.reasoning instead of meta["reasoning_deltas"]
  • All 19 non-image unit tests pass

Notes for the reviewer

The thought_signature_deltas remain in meta since they are Google-specific metadata for multi-turn context preservation, not standard reasoning content.

Checklist

@Br1an67 Br1an67 requested a review from a team as a code owner March 1, 2026 17:04
@Br1an67 Br1an67 requested review from anakin87 and removed request for a team March 1, 2026 17:04

CLAassistant commented Mar 1, 2026

CLA assistant check
All committers have signed the CLA.

Populate StreamingChunk.reasoning with ReasoningContent instead of
storing reasoning deltas as dicts in meta. Update aggregation to read
from chunk.reasoning instead of chunk.meta["reasoning_deltas"].
@Br1an67 Br1an67 force-pushed the fix/issue-10478-genai-reasoning-field branch from 480aa44 to 70af8fa Compare March 2, 2026 07:28
Member

anakin87 commented Mar 2, 2026

@Br1an67 thank you for the contribution!

Please sign the Contributor License Agreement, then I'll review this PR.

Contributor Author

Br1an67 commented Mar 2, 2026

Hi @anakin87, thanks for the reminder! The CLA has been signed — the CLAassistant bot confirmed "All committers have signed the CLA" above. Ready for your review whenever you have a chance. 🙏

Member

@anakin87 anakin87 left a comment

Nice work.

I left a few small comments

# Add thought signature deltas to meta if available (for multi-turn context)
if thought_signature_deltas:
    meta["thought_signature_deltas"] = thought_signature_deltas

Member

My impression is that we can use the extra field of ReasoningContent (code) to store thought_signature_deltas.
In my opinion, we did something similar in #2849 for Anthropic redacted thinking and thinking signature.

Since all this info is related to reasoning, I'd like to have it grouped into ReasoningContent.

But I haven't tried, and I don't know if this is simple to implement or if it would make the code much more complex. So please try and let me know.

Contributor Author

Thanks for the suggestion! I've moved thought_signature_deltas into ReasoningContent.extra for reasoning chunks, consistent with the Anthropic approach in #2849.

One caveat: StreamingChunk enforces mutual exclusivity between content and reasoning (raises ValueError in __post_init__), so for text/tool-call chunks that also carry thought signatures, the signatures still go in meta. The aggregation logic reads from both sources. This keeps all reasoning-related info grouped into ReasoningContent where possible without breaking the constraint.
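
To illustrate the constraint described above, here is a minimal sketch of a dataclass that rejects chunks with more than one of content, tool calls, or reasoning set. The classes are simplified stand-ins written for this example, and the error message and the `try_build` helper are hypothetical; the real StreamingChunk may check and word this differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReasoningContent:
    """Simplified stand-in for Haystack's ReasoningContent."""
    reasoning_text: str


@dataclass
class StreamingChunk:
    """Simplified stand-in that mimics the mutual exclusivity check."""
    content: str = ""
    tool_calls: Optional[list] = None
    reasoning: Optional[ReasoningContent] = None

    def __post_init__(self) -> None:
        # Count how many of the mutually exclusive fields are populated.
        populated = sum(bool(v) for v in (self.content, self.tool_calls, self.reasoning))
        if populated > 1:
            # Hypothetical message; the real wording is not shown in this PR.
            raise ValueError("Only one of content, tool_calls, or reasoning may be set.")


def try_build(**kwargs) -> str:
    """Hypothetical helper: report whether a chunk can be constructed."""
    try:
        StreamingChunk(**kwargs)
        return "ok"
    except ValueError as exc:
        return f"rejected: {exc}"


reasoning_only = try_build(reasoning=ReasoningContent("thinking..."))
text_and_reasoning = try_build(content="Hello", reasoning=ReasoningContent("thinking..."))
```

Under this check, a text chunk cannot also carry a ReasoningContent, which is why a signature attached to a text or tool-call part has nowhere to go inside `ReasoningContent.extra`.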

Member

Seeing the code, I now realize that Thought Signatures can be included in non-reasoning response parts.

For this reason, I recommend going back to the previous version of your code where thought_signature_deltas are always stored in meta. This would be simpler and consistent.
I'd just add a comment on top of the code line explaining that this data can be part of non-reasoning content parts.

Makes sense?

Contributor Author

Makes sense! Reverted in bd3d479: thought_signature_deltas are now always stored in meta, with a comment explaining that thought signatures can appear in both reasoning and non-reasoning response parts.
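
The "always in meta" approach might look roughly like the sketch below. `Part`, `build_meta`, and `collect_signatures` are hypothetical names, and the shape of each delta (a dict with `part_index` and `signature`) is an assumption made for illustration; the PR does not show the actual structure.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Part:
    """Hypothetical stand-in for a Google GenAI response part."""
    text: str = ""
    thought: bool = False
    thought_signature: Optional[bytes] = None


def build_meta(parts: list[Part]) -> dict[str, Any]:
    """Collect thought signatures from every part of one streamed response."""
    meta: dict[str, Any] = {}
    # Thought signatures can be attached to reasoning and non-reasoning
    # parts alike, so they are collected without checking part.thought.
    deltas = [
        {"part_index": i, "signature": p.thought_signature}  # hypothetical shape
        for i, p in enumerate(parts)
        if p.thought_signature
    ]
    if deltas:
        meta["thought_signature_deltas"] = deltas
    return meta


def collect_signatures(metas: list[dict[str, Any]]) -> list[bytes]:
    """Gather signatures across chunks from a single place: meta."""
    return [d["signature"] for m in metas for d in m.get("thought_signature_deltas", [])]


metas = [
    build_meta([Part("thinking", thought=True, thought_signature=b"sig-a")]),
    build_meta([Part("Hello", thought_signature=b"sig-b")]),
    build_meta([Part(" world")]),
]
signatures = collect_signatures(metas)
```

Because every chunk type has a `meta` dict, the aggregation step reads signatures from one source regardless of whether the chunk carried reasoning, text, or tool calls.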

        assert streaming_chunk.tool_calls[5].id is None
        assert streaming_chunk.tool_calls[5].index == 5

    def test_convert_google_chunk_to_streaming_chunk_with_thought(self, monkeypatch):
Member

Could you try using actual objects from Google API in this test?
See test_convert_google_chunk_to_streaming_chunk_real_example for an example

Contributor Author

Good call — updated the test to use actual types.Part, types.Content, types.Candidate, and types.GenerateContentResponse objects, following the pattern in test_convert_google_chunk_to_streaming_chunk_real_example.

Br1an67 and others added 3 commits March 3, 2026 00:18
….extra

Store thought_signature_deltas in ReasoningContent.extra instead of
StreamingChunk.meta when reasoning content is present, grouping all
reasoning-related info into ReasoningContent. For text/tool-call chunks
(where StreamingChunk mutual exclusivity prevents setting both content
and reasoning), signatures remain in meta. The aggregation logic reads
from both sources. Consistent with the Anthropic approach in PR deepset-ai#2849.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace Mock objects with actual types.Part, types.Content,
types.Candidate and types.GenerateContentResponse in the
test_convert_google_chunk_to_streaming_chunk_with_thought test,
following the pattern established in the existing real_example test.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Thought signatures can appear in both reasoning and non-reasoning response
parts, so storing them consistently in meta is simpler than splitting
between ReasoningContent.extra and meta.
Member

@anakin87 anakin87 left a comment

Looks good

@anakin87 anakin87 merged commit 650f26b into deepset-ai:main Mar 2, 2026
11 checks passed

Labels

integration:google-genai type:documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Google GenAI - use reasoning field in StreamingChunk

3 participants