fix(litellm): omit temperature for reasoning models by Sanderhoff-alt · Pull Request #2462 · vectorize-io/hindsight

Sanderhoff-alt · 2026-06-29T14:53:01Z

Summary

Do not send temperature when the in-process LiteLLM provider is configured with GPT-5, o1/o3, or supported DeepSeek reasoning models.
Apply the same rule to the in-process LiteLLM Router provider, but only for deployments that Hindsight can actually reach from the default router entrypoint.
Reuse the existing OpenAI-compatible reasoning-model detection so LiteLLM and OpenAI-compatible providers behave consistently.
Add regression tests for Azure GPT-5.5, o-series models, DeepSeek reasoning models, and Router fallback configs.

Why

Azure OpenAI GPT-5.5 rejects explicit temperature values. It only accepts the provider default.

Hindsight currently passes internal temperatures such as 0.0, 0.1, and 0.9 through its built-in LiteLLM SDK provider. That can break verification, retain, think, and related internal LLM calls when the configured LiteLLM model is Azure GPT-5.5.

This PR keeps the fix small: if Hindsight can tell from the configured LiteLLM model string that the target is a reasoning model that rejects temperature, it removes temperature before calling LiteLLM.
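The rule described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the names `is_reasoning_model` and `build_completion_kwargs` and the exact patterns are hypothetical, and the real shared detector in Hindsight may match differently.

```python
import re
from typing import Any, Dict, List, Optional

# Hypothetical patterns for illustration only; the actual detector may differ.
_REASONING_PATTERNS = (
    re.compile(r"^gpt-5"),             # GPT-5 family, e.g. gpt-5.5
    re.compile(r"^o[13](-|$)"),        # o1 / o3 series, e.g. o1, o3-mini
    re.compile(r"^deepseek-reasoner"), # DeepSeek reasoning route
)


def is_reasoning_model(model: str) -> bool:
    """Return True if the model string looks like a model that rejects temperature."""
    # Drop provider prefixes such as "azure/" or "openai/" before matching.
    name = model.rsplit("/", 1)[-1].lower()
    return any(pattern.search(name) for pattern in _REASONING_PATTERNS)


def build_completion_kwargs(
    model: str,
    messages: List[Dict[str, Any]],
    temperature: Optional[float] = None,
) -> Dict[str, Any]:
    """Build request kwargs, omitting temperature for reasoning models."""
    kwargs: Dict[str, Any] = {"model": model, "messages": messages}
    if temperature is not None and not is_reasoning_model(model):
        kwargs["temperature"] = temperature
    return kwargs
```

Because the check relies only on the configured model string, a deployment whose name hides the model family would not be detected by a sketch like this.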

For litellmrouter, Hindsight always calls the Router with the model name default. The check therefore inspects only the default deployment and any deployments reachable through Router fallbacks. A GPT-5 deployment in an unrelated model group does not change request parameters for calls that cannot reach it.
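One way to picture the reachability check is a breadth-first walk over the fallback configuration. The sketch below assumes the usual LiteLLM Router config shapes (a `model_list` of entries with `model_name` and `litellm_params.model`, and `fallbacks` as a list of single-key mappings). The function name `reachable_models` is hypothetical, and treating fallbacks as transitive is an assumption rather than something this PR states.

```python
from collections import deque
from typing import Any, Dict, List, Set


def reachable_models(
    model_list: List[Dict[str, Any]],
    fallbacks: List[Dict[str, List[str]]],
    entrypoint: str = "default",
) -> Set[str]:
    """Collect model strings of deployments reachable from the entrypoint group."""
    # Map each model group name to the groups it falls back to.
    fallback_map: Dict[str, List[str]] = {}
    for entry in fallbacks:
        for group, targets in entry.items():
            fallback_map.setdefault(group, []).extend(targets)

    # Breadth-first walk over fallback edges, starting at the entrypoint group.
    seen = {entrypoint}
    queue = deque([entrypoint])
    while queue:
        group = queue.popleft()
        for target in fallback_map.get(group, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)

    # Keep only deployments whose group was visited.
    return {
        deployment["litellm_params"]["model"]
        for deployment in model_list
        if deployment["model_name"] in seen
    }


model_list = [
    {"model_name": "default", "litellm_params": {"model": "azure/gpt-4o"}},
    {"model_name": "backup", "litellm_params": {"model": "azure/gpt-5.5"}},
    {"model_name": "unrelated", "litellm_params": {"model": "openai/o3"}},
]
fallbacks = [{"default": ["backup"]}]
reachable = reachable_models(model_list, fallbacks)
```

Here the `unrelated` group is never visited, so its o3 deployment would not cause temperature to be dropped for calls that enter through `default`.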

This PR only covers Hindsight's built-in LiteLLM SDK usage (litellm and litellmrouter). It does not try to solve arbitrary aliasing where the configured model name hides the real model family; that would require explicit capability metadata or configuration.

Fixes #2459.

Validation

uv run pytest tests/test_llm_router_provider.py tests/test_llm_extra_body.py tests/test_deepseek_tool_call_compat.py tests/test_openai_max_tokens_param.py -q

Share the OpenAI-compatible reasoning model detector between the native OpenAI-compatible provider and LiteLLM-backed providers. Skip forwarding explicit temperature values through LiteLLM for those models, including Azure GPT-5 deployments, o-series models, and DeepSeek reasoning routes. Limit the LiteLLM Router check to deployments reachable from the default entrypoint so unrelated model groups do not affect call parameters.

Sanderhoff-alt force-pushed the fix-litellm-gpt5-temperature branch 2 times, most recently from 8f7f6ca to 1202e33 June 29, 2026 15:10

Sanderhoff-alt changed the title ~~fix(litellm): omit temperature for GPT-5 models~~ fix(litellm): omit temperature for reasoning models Jun 29, 2026

Sanderhoff-alt force-pushed the fix-litellm-gpt5-temperature branch 2 times, most recently from 3809c74 to 7904d52 June 29, 2026 15:36

Sanderhoff-alt marked this pull request as ready for review June 29, 2026 15:52

r266-tech mentioned this pull request Jun 30, 2026

fix(llm): wire default_headers into LiteLLM-backed providers (#2458) #2466

Merged

Sanderhoff-alt force-pushed the fix-litellm-gpt5-temperature branch from 7904d52 to 90bcb8c June 30, 2026 08:43

Sanderhoff-alt force-pushed the fix-litellm-gpt5-temperature branch from 90bcb8c to 1f547b6 June 30, 2026 09:11

Sanderhoff-alt wants to merge 1 commit into vectorize-io:main from Sanderhoff-alt:fix-litellm-gpt5-temperature
