
fix(litellm): omit temperature for reasoning models#2462

Open
Sanderhoff-alt wants to merge 1 commit into vectorize-io:main from Sanderhoff-alt:fix-litellm-gpt5-temperature

Conversation

Sanderhoff-alt commented Jun 29, 2026

Contributor

Summary

  • Do not send temperature when the in-process LiteLLM provider is configured with GPT-5, o1/o3, or supported DeepSeek reasoning models.
  • Apply the same rule to the in-process LiteLLM Router provider, but only for deployments that Hindsight can actually reach from the default router entrypoint.
  • Reuse the existing OpenAI-compatible reasoning-model detection so LiteLLM and OpenAI-compatible providers behave consistently.
  • Add regression tests for Azure GPT-5.5, o-series models, DeepSeek reasoning models, and Router fallback configs.

Why

Azure OpenAI GPT-5.5 rejects explicit temperature values. It only accepts the provider default.

Hindsight currently passes internal temperatures such as 0.0, 0.1, and 0.9 through its built-in LiteLLM SDK provider. That can break verification, retain, think, and related internal LLM calls when the configured LiteLLM model is Azure GPT-5.5.

This PR keeps the fix small: if Hindsight can tell from the configured LiteLLM model string that the target is a reasoning model that rejects temperature, it removes temperature before calling LiteLLM.
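As a rough illustration of the rule described above, the following sketch drops temperature when the configured model string looks like a reasoning model. The helper names (`rejects_temperature`, `build_call_kwargs`) and the regex patterns are assumptions made for this example; they are not the detector this PR reuses, which may match different strings.

```python
import re

# Hypothetical patterns for illustration only; the real shared detector
# in Hindsight may recognise a different set of model names.
_REASONING_PATTERNS = (
    re.compile(r"^gpt-5"),             # GPT-5 family, e.g. "gpt-5.5"
    re.compile(r"^o[13](-|$)"),        # o1 / o3 and suffixed variants
    re.compile(r"^deepseek-reasoner"), # DeepSeek reasoning route
)


def rejects_temperature(model: str) -> bool:
    """Guess from the model string whether temperature should be omitted."""
    # LiteLLM model strings often carry a provider prefix such as "azure/",
    # so only the last path segment is inspected.
    name = model.rsplit("/", 1)[-1].lower()
    return any(p.match(name) for p in _REASONING_PATTERNS)


def build_call_kwargs(model, messages, temperature=None):
    """Assemble completion kwargs, leaving temperature out when unsupported."""
    kwargs = {"model": model, "messages": messages}
    # Compare against None explicitly: 0.0 is a valid temperature but falsy.
    if temperature is not None and not rejects_temperature(model):
        kwargs["temperature"] = temperature
    return kwargs
```

With this sketch, `build_call_kwargs("azure/gpt-5.5", msgs, temperature=0.1)` would contain no temperature key, while the same call for a non-reasoning model would keep it. A deployment alias such as "azure/my-deployment" would not be detected, which is the aliasing limit noted in this description.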

For litellmrouter, Hindsight always calls the Router with the model name "default". The check therefore only looks at the default deployment and any deployments reachable through Router fallbacks. A GPT-5 deployment in an unrelated model group will not change request parameters for calls that cannot reach it.
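The reachability check can be sketched as a small graph walk over the Router config. This assumes the usual LiteLLM Router shapes (a `model_list` of entries with `model_name` and `litellm_params.model`, and `fallbacks` as a list of single-key mappings); the function name `reachable_models` is hypothetical and the PR's actual traversal may differ.

```python
from collections import deque


def reachable_models(model_list, fallbacks, entry="default"):
    """Return underlying model strings reachable from `entry` via fallbacks."""
    # Build an adjacency map: model group -> groups it can fall back to.
    edges = {}
    for mapping in fallbacks:
        for group, targets in mapping.items():
            edges.setdefault(group, []).extend(targets)

    # Breadth-first walk starting at the entry group; `seen` guards
    # against fallback cycles.
    seen = {entry}
    queue = deque([entry])
    while queue:
        group = queue.popleft()
        for nxt in edges.get(group, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    # Collect the concrete model string of every deployment in a
    # reachable group.
    return {
        d["litellm_params"]["model"]
        for d in model_list
        if d["model_name"] in seen
    }
```

Temperature would then be omitted only if one of the returned model strings is detected as a reasoning model, so a GPT-5 deployment in a group that "default" never falls back to has no effect.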

This PR only covers Hindsight's built-in LiteLLM SDK usage (litellm and litellmrouter). It does not try to solve arbitrary aliasing where the configured model name hides the real model family; that would require explicit capability metadata or configuration.

Fixes #2459.

Validation

  • uv run pytest tests/test_llm_router_provider.py tests/test_llm_extra_body.py tests/test_deepseek_tool_call_compat.py tests/test_openai_max_tokens_param.py -q

Sanderhoff-alt force-pushed the fix-litellm-gpt5-temperature branch 2 times, most recently from 8f7f6ca to 1202e33, June 29, 2026 15:10
Sanderhoff-alt changed the title from "fix(litellm): omit temperature for GPT-5 models" to "fix(litellm): omit temperature for reasoning models", Jun 29, 2026
Sanderhoff-alt force-pushed the fix-litellm-gpt5-temperature branch 2 times, most recently from 3809c74 to 7904d52, June 29, 2026 15:36
Sanderhoff-alt marked this pull request as ready for review, June 29, 2026 15:52
Sanderhoff-alt force-pushed the fix-litellm-gpt5-temperature branch from 7904d52 to 90bcb8c, June 30, 2026 08:43
Share the OpenAI-compatible reasoning model detector between the native
OpenAI-compatible provider and LiteLLM-backed providers.

Skip forwarding explicit temperature values through LiteLLM for those
models, including Azure GPT-5 deployments, o-series models, and DeepSeek
reasoning routes.

Limit the LiteLLM Router check to deployments reachable from the default
entrypoint so unrelated model groups do not affect call parameters.
Sanderhoff-alt force-pushed the fix-litellm-gpt5-temperature branch from 90bcb8c to 1f547b6, June 30, 2026 09:11

Development

Successfully merging this pull request may close these issues.

Fix LiteLLM provider temperature handling for GPT-5.5

1 participant