RSPEED-3221: round-trip Gemini 3 thought signatures through Vertex AI converter#1896

Draft
major wants to merge 1 commit into
lightspeed-core:mainfrom
major:fix/vertexai-gemini3-thought-signature

@major major commented Jun 10, 2026

Contributor

Description

Gemini 3.x models (e.g. gemini-3-flash, gemini-3.5-flash) attach a thought_signature to the first functionCall part of a tool-calling turn and require it to be replayed verbatim on the next turn, or the request fails with HTTP 400. llama-stack converts Gemini responses into the OpenAI chat-completion shape, which has no field for the signature, so it gets dropped and every multi-turn tool call against a Gemini 3 model fails.

This monkeypatches llama-stack's Vertex AI converter at app import time. Both wrappers defer entirely to the upstream originals and only smuggle the base64-encoded signature in and out through the opaque tool-call id (which llama-stack round-trips untouched and only ever compares for equality):

  • The extract wrapper re-pairs each functionCall part with the tool call the original emitted and embeds the signature in its id.
  • The assistant-message wrapper decodes it back onto the rebuilt Gemini part.
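The id-smuggling step described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the helper names `embed_signature` and `split_signature` and the `::ts::` marker are hypothetical, and the real shim may use a different id layout.

```python
import base64
from typing import Optional, Tuple

# Hypothetical separator; the actual id format used by the shim is not shown here.
_SIG_MARKER = "::ts::"


def embed_signature(tool_call_id: str, signature: bytes) -> str:
    """Append the base64-encoded signature to an opaque tool-call id."""
    encoded = base64.urlsafe_b64encode(signature).decode("ascii")
    return f"{tool_call_id}{_SIG_MARKER}{encoded}"


def split_signature(tool_call_id: str) -> Tuple[str, Optional[bytes]]:
    """Return (original_id, signature), or (id, None) if nothing is embedded."""
    base_id, marker, encoded = tool_call_id.rpartition(_SIG_MARKER)
    if not marker:
        # No marker found: this is an ordinary id, pass it through untouched.
        return tool_call_id, None
    try:
        return base_id, base64.urlsafe_b64decode(encoded.encode("ascii"))
    except ValueError:
        # Malformed payload: treat the whole string as an ordinary id.
        return tool_call_id, None
```

Because the id is only round-tripped and compared for equality, the extended string stays a valid id from llama-stack's point of view, and ids that carry no marker come back unchanged.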

The patch is idempotent and a no-op when the Vertex AI provider is not installed. This is a downstream shim; remove it once the fix lands upstream in llama-stack.
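One way to get the idempotent, no-op-when-absent behavior is sketched below. It is an assumption-laden illustration rather than the shim itself: `apply_patch`, the `_thought_signature_patched` flag, and the caller-supplied module and function names are all hypothetical.

```python
import importlib
from typing import Callable

# Hypothetical marker attribute used to detect an already-installed wrapper.
_PATCHED_FLAG = "_thought_signature_patched"


def apply_patch(
    module_name: str,
    func_name: str,
    make_wrapper: Callable[[Callable], Callable],
) -> bool:
    """Wrap module.func_name once; return True only if a wrapper was installed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False  # provider not installed: nothing to patch

    original = getattr(module, func_name, None)
    if original is None or getattr(original, _PATCHED_FLAG, False):
        return False  # target missing, or already patched on an earlier call

    wrapper = make_wrapper(original)  # wrapper is expected to defer to the original
    setattr(wrapper, _PATCHED_FLAG, True)
    setattr(module, func_name, wrapper)
    return True
```

Calling `apply_patch` a second time finds the flag on the installed wrapper and returns without wrapping again, so repeated imports do not stack wrappers.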

Type of change

  • Bug fix

Tools used to create PR

  • Assisted-by: Claude
  • Generated by: N/A

Related Tickets & Documents

  • Related Issue # RSPEED-3221

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

Unit tests cover the encode/decode helpers, idempotency of the patch, the no-op path when the Vertex AI provider is absent, signature embedding into the tool-call id, and the full extract -> assistant-message round trip.

uv run pytest tests/unit/utils/test_vertexai_thought_signature.py -q
# 10 passed

coderabbitai Bot commented Jun 10, 2026

Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: c7f38dde-06b1-4d45-81e1-c4e6baa8743b

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Gemini 3.x models (gemini-3-flash, gemini-3.5-flash) attach a
thought_signature to the first functionCall part of a tool-calling turn and
require it to be replayed verbatim on the next turn, or the request fails
with HTTP 400. llama-stack converts Gemini responses into the OpenAI
chat-completion shape, which has no field for the signature, so it is
dropped and every multi-turn tool call against a Gemini 3 model fails.

Monkeypatch llama-stack's vertexai converter at app import time. Both
wrappers defer entirely to the upstream originals and only smuggle the
base64-encoded signature in and out through the opaque tool-call id (which
llama-stack round-trips untouched and only ever compares for equality):
the extract wrapper re-pairs each functionCall part with the tool call the
original emitted and embeds the signature in its id; the assistant-message
wrapper decodes it back onto the rebuilt Gemini part.

The patch is idempotent and a no-op when the Vertex AI provider is not
installed. Remove it once the fix lands upstream.

Signed-off-by: Major Hayden <major@redhat.com>
@major major force-pushed the fix/vertexai-gemini3-thought-signature branch from 291b8d5 to cff3b6b Compare June 10, 2026 22:48
@major major changed the title fix: round-trip Gemini 3 thought signatures through Vertex AI converter RSPEED-3221: round-trip Gemini 3 thought signatures through Vertex AI converter Jun 10, 2026