fix: default thinking config for Gemini 3+ Flash models #4067

Merged
kompfner merged 2 commits into pipecat-ai:main from omChauhanDev:fix-gemini3-flash-thinking-default on Apr 10, 2026

@omChauhanDev
Contributor

Please describe the changes in your PR. If it is addressing an issue, please reference that as well.

Fixes #3993

  • Gemini 2.5 Flash gets a default thinking_budget=0 for low latency, but Gemini 3+ Flash uses a different API surface (thinking_level) and was getting no default at all, so it fell back to the provider's defaults.

  • Now Gemini 3.x Flash models get thinking_level="minimal" automatically when no explicit thinking config is provided.
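
The selection rule described in the two points above can be sketched as a small standalone function. This is only an illustration, not the code in src/pipecat/services/google/llm.py: the helper name `default_thinking_config` and its `explicit` parameter are hypothetical, and "gemini-3-pro" is used purely as an example of a non-Flash model name.

```python
from typing import Optional


def default_thinking_config(model: str, explicit: Optional[dict] = None) -> Optional[dict]:
    """Pick a low-latency thinking default based on the model name.

    Hypothetical helper for illustration; the real service builds this
    inside its generation parameters.
    """
    # An explicit thinking config from the caller always wins.
    if explicit is not None:
        return explicit
    # Gemini 2.5 Flash: thinking is disabled via a zero budget.
    if model.startswith("gemini-2.5-flash"):
        return {"thinking_budget": 0}
    # Gemini 3+ Flash: uses thinking_level instead of thinking_budget.
    if model.startswith("gemini-3") and "flash" in model:
        return {"thinking_level": "minimal"}
    # Any other model: no default, so the provider's default applies.
    return None


print(default_thinking_config("gemini-2.5-flash"))
print(default_thinking_config("gemini-3-flash"))
print(default_thinking_config("gemini-3-pro"))
```

Under these assumptions, a caller who passes its own thinking config gets it back unchanged, and only Flash models receive a default.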

@codecov

codecov Bot commented Mar 18, 2026

Codecov Report

❌ Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/pipecat/services/google/llm.py | 0.00% | 5 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| src/pipecat/services/google/llm.py | 44.84% <0.00%> (-0.22%) ⬇️ |

model = self._settings.model
if model.startswith("gemini-2.5-flash"):
    generation_params["thinking_config"] = {"thinking_budget": 0}
elif model.startswith("gemini-3") and "flash" in model:
Contributor


Technically, this more general check (as opposed to explicitly checking for the string "gemini-3-flash") would cause us to set this thinking config unnecessarily for "gemini-3.1-flash-lite" (where minimal thinking is the default), but I think it's absolutely worth it for future-proofing. So, good call 👍.
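
To make the trade-off concrete, here is a small sketch comparing the two checks on the model names mentioned in this comment. The function names `general_check` and `explicit_check` are hypothetical and exist only for this comparison.

```python
def general_check(model: str) -> bool:
    # The check used in the PR: any Gemini 3.x model with "flash" in its name.
    return model.startswith("gemini-3") and "flash" in model


def explicit_check(model: str) -> bool:
    # The narrower alternative: only names starting with "gemini-3-flash".
    return model.startswith("gemini-3-flash")


for name in ["gemini-3-flash", "gemini-3.1-flash-lite"]:
    print(name, general_check(name), explicit_check(name))
```

The general check matches both names, while the explicit one matches only "gemini-3-flash", so the general form also covers later 3.x Flash variants at the cost of one redundant setting.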

Contributor

@kompfner left a comment


Thanks! Nice improvement.

@kompfner kompfner merged commit 8e5fe8a into pipecat-ai:main Apr 10, 2026
6 checks passed
markbackman pushed a commit to pipecat-ai/docs that referenced this pull request Apr 14, 2026
Updates documentation to reflect that GoogleLLMService now applies
a low-latency thinking default (thinking_level="minimal") for Gemini 3+
Flash models, while Gemini 2.5 Flash continues to use thinking_budget=0.

Related to pipecat-ai/pipecat#4067

Development

Successfully merging this pull request may close these issues.

GoogleLLMService: apply model-aware default thinking config for Gemini 3+

2 participants