RSPEED-3221: round-trip Gemini 3 thought signatures through Vertex AI converter #1896
Draft
major wants to merge 1 commit into
Gemini 3.x models (gemini-3-flash, gemini-3.5-flash) attach a thought_signature to the first functionCall part of a tool-calling turn and require it to be replayed verbatim on the next turn, or the request fails with HTTP 400. llama-stack converts Gemini responses into the OpenAI chat-completion shape, which has no field for the signature, so it is dropped and every multi-turn tool call against a Gemini 3 model fails.

Monkeypatch llama-stack's vertexai converter at app import time. Both wrappers defer entirely to the upstream originals and only smuggle the base64-encoded signature in and out through the opaque tool-call id (which llama-stack round-trips untouched and only ever compares for equality): the extract wrapper re-pairs each functionCall part with the tool call the original emitted and embeds the signature in its id; the assistant-message wrapper decodes it back onto the rebuilt Gemini part.

The patch is idempotent and a no-op when the Vertex AI provider is not installed. Remove it once the fix lands upstream.

Signed-off-by: Major Hayden <major@redhat.com>
Force-pushed from 291b8d5 to cff3b6b
Description
Gemini 3.x models (e.g. `gemini-3-flash`, `gemini-3.5-flash`) attach a `thought_signature` to the first `functionCall` part of a tool-calling turn and require it to be replayed verbatim on the next turn, or the request fails with HTTP 400. llama-stack converts Gemini responses into the OpenAI chat-completion shape, which has no field for the signature, so it gets dropped and every multi-turn tool call against a Gemini 3 model fails.

This monkeypatches llama-stack's Vertex AI converter at app import time. Both wrappers defer entirely to the upstream originals and only smuggle the base64-encoded signature in and out through the opaque tool-call `id` (which llama-stack round-trips untouched and only ever compares for equality):

- The extract wrapper re-pairs each `functionCall` part with the tool call the original emitted and embeds the signature in its id.
- The assistant-message wrapper decodes it back onto the rebuilt Gemini part.

The patch is idempotent and a no-op when the Vertex AI provider is not installed. This is a downstream shim; remove it once the fix lands upstream in llama-stack.
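For readers unfamiliar with the trick, here is a minimal sketch of carrying a signature inside an opaque tool-call id. It is not the code in this PR: the helper names `embed_signature` and `split_signature`, the `::ts::` marker, and the choice of URL-safe base64 are all assumptions made for illustration.

```python
import base64
from typing import Optional, Tuple

# Hypothetical marker separating the original id from the encoded signature;
# the actual patch may use a different delimiter or layout.
_SIG_MARKER = "::ts::"


def embed_signature(tool_call_id: str, signature: Optional[bytes]) -> str:
    """Return the id unchanged when there is no signature, else append it."""
    if not signature:
        return tool_call_id
    encoded = base64.urlsafe_b64encode(signature).decode("ascii")
    return f"{tool_call_id}{_SIG_MARKER}{encoded}"


def split_signature(tool_call_id: str) -> Tuple[str, Optional[bytes]]:
    """Recover (original_id, signature); signature is None when absent."""
    base_id, marker, encoded = tool_call_id.partition(_SIG_MARKER)
    if not marker:
        return tool_call_id, None
    try:
        return base_id, base64.urlsafe_b64decode(encoded.encode("ascii"))
    except ValueError:
        # Malformed payload: treat the whole string as a plain id.
        return tool_call_id, None
```

Because the id is only compared for equality downstream, the same tagged string has to be used consistently on both the tool call and its matching tool result; the sketch leaves that pairing to the caller.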
Type of change
Tools used to create PR
Related Tickets & Documents
Checklist before requesting a review
Testing
Unit tests cover the encode/decode helpers, idempotency of the patch, the no-op path when the Vertex AI provider is absent, signature embedding into the tool-call id, and the full extract -> assistant-message round trip.
uv run pytest tests/unit/utils/test_vertexai_thought_signature.py -q # 10 passed
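The idempotency and no-op paths covered by these tests could be guarded along the following lines. This is a sketch under assumptions, not the PR's implementation: `apply_patch`, the `_thought_signature_patched` sentinel attribute, and the `make_wrapper` callback are hypothetical names, and the module and function to patch are passed in rather than hard-coded because the real llama-stack import path is not shown here.

```python
import importlib
from typing import Any, Callable

# Hypothetical sentinel attribute marking an already-wrapped function.
_PATCHED_FLAG = "_thought_signature_patched"


def apply_patch(
    module_name: str,
    func_name: str,
    make_wrapper: Callable[[Callable[..., Any]], Callable[..., Any]],
) -> bool:
    """Wrap module.func_name once; return True only if a patch was applied."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Provider not installed: nothing to patch.
        return False

    original = getattr(module, func_name, None)
    if original is None:
        return False
    if getattr(original, _PATCHED_FLAG, False):
        # Already wrapped by an earlier call: stay idempotent.
        return False

    wrapper = make_wrapper(original)
    setattr(wrapper, _PATCHED_FLAG, True)
    setattr(module, func_name, wrapper)
    return True
```

Calling `apply_patch` a second time finds the sentinel on the installed wrapper and returns without wrapping again, so importing the app module repeatedly does not stack wrappers.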