Commit 2ef6b24

docs: investigation report for Opus 4.7 Haiku 4.5 Bedrock cost spike regression
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
1 parent 6c8485b commit 2ef6b24

1 file changed

Lines changed: 222 additions & 0 deletions

# Investigation: Opus 4.7 Day 0 Launch Regression - Haiku 4.5 Cost Spike on Bedrock

## Issue Summary

**Reported by:** User on Slack (#llms-engineering)
**Date:** April 17, 2026
**Versions affected:**
- **Working:** `v1.82.3-stable.patch.4`
- **Broken:** `v1.82.3-stable.opus-4.7`

### Symptoms
- Upgrade from `v1.82.3-stable.patch.4` to `v1.82.3-stable.opus-4.7` caused a **significant cost spike** on Haiku 4.5 on Bedrock
- Downgrade restored normal costs
- Errors reported: **"unsupported thinking metadata"**
- The issue affects Haiku 4.5 specifically, not Opus 4.7

---

## Investigation Findings

### 1. Opus 4.7 Day 0 Launch PRs

**PR #25867** - "Litellm day 0 opus 4.7 support" (merged into main via PR #25876)
- **Merged:** April 16, 2026
- **Changes:** Added Claude Opus 4.7 support across all providers (Anthropic direct, Bedrock, Vertex AI, Azure AI)

**PR #25876** - "Litellm hotfix opus 4.7" (merged April 16, 2026)
- **Purpose:** Hotfix for Opus 4.7 support
- **Key changes (per Greptile review):**
  - Added `_is_opus_4_7_model()` and `_is_claude_4_7_model()` functions
  - Extended `_is_adaptive_thinking_model()` to include Opus 4.7
  - Updated `is_claude_4_5_on_bedrock()` to include Opus 4.7 patterns
  - Added Opus 4.7 to `_supports_extended_thinking_on_bedrock()`
  - Added model entries to `model_prices_and_context_window.json`

### 2. Root Cause: Hardcoded Model Checks Violating Design Principles

#### The Architectural Problem

LiteLLM's own "Day 0 Model Release Gaps" write-up (GitHub Discussion #22555) explicitly identifies the problem:

**Gap G4 (Anthropic):**
> "Thinking/reasoning gated on hardcoded Claude version strings"
>
> File: `litellm/llms/anthropic/chat/transformation.py:172–210`
>
> `_is_claude_4_6_model()` and similar checks control whether `thinking` blocks and `budget_tokens` are sent. Any Claude 5.x model added only to JSON will silently get thinking stripped out.

**Gap G13 (Bedrock):**
> "Parallel tool use on Bedrock gated on hardcoded Claude 4.5/4.6 patterns"
>
> File: `litellm/llms/bedrock/common_utils.py:462–490`
>
> `is_claude_4_5_on_bedrock()` controls whether `parallel_tool_use` is enabled. Claude 5.x models on Bedrock will have features silently disabled until this list is updated.

#### What Happened with Opus 4.7

When Opus 4.7 support was added:

1. **Extended existing hardcoded checks** - Added the patterns `opus-4.7`, `opus_4_7`, and `opus-4-7` to functions like:
   - `is_claude_4_5_on_bedrock()` (which now covers 4.5, 4.6, AND 4.7 despite its name)
   - `_supports_extended_thinking_on_bedrock()`
   - `_is_adaptive_thinking_model()`

2. **Broad substring matching** - Some checks use broad patterns like:
   - `"claude-3-7" in model`
   - `"claude-sonnet-4" in model`
   - `"claude-opus-4" in model`
   - `"claude-haiku-4" in model`
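
To make the risk of broad substring matching concrete, here is a minimal sketch. It is a hypothetical reconstruction of such a gate, not LiteLLM's actual code: the function name `matches_broad_pattern` and the two model IDs are assumptions chosen for illustration.

```python
# Hypothetical reconstruction of a broad substring gate; NOT LiteLLM's real code.
BROAD_PATTERNS = ("claude-3-7", "claude-sonnet-4", "claude-opus-4", "claude-haiku-4")


def matches_broad_pattern(model: str) -> bool:
    """Return True if any broad family substring occurs in the model ID."""
    return any(pattern in model for pattern in BROAD_PATTERNS)


# Illustrative Bedrock-style model IDs (assumed for this sketch).
HAIKU_4_5 = "anthropic.claude-haiku-4-5-20251001-v1:0"
HAIKU_3_5 = "anthropic.claude-3-5-haiku-20241022-v1:0"

print(matches_broad_pattern(HAIKU_4_5))  # True: "claude-haiku-4" is a substring
print(matches_broad_pattern(HAIKU_3_5))  # False: no pattern matches
```

A pattern such as `"claude-haiku-4"` matches every Haiku 4.x variant, so any feature gated on it applies to Haiku 4.5 whether or not that was intended.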

### 3. The Haiku 4.5 Cost Spike Root Cause

#### Hypothesis 1: Thinking Metadata Incorrectly Sent to Haiku 4.5

**Most likely cause:** The broad substring checks or function extensions added for Opus 4.7 now cause thinking/reasoning metadata to be sent to Haiku 4.5 models that don't support it (or support it differently). The likely consequences are:

1. **API errors** - Bedrock rejecting requests with "unsupported thinking metadata"
2. **Retries** - LiteLLM retrying failed requests, increasing costs
3. **Fallback behavior** - Potentially using different (more expensive) code paths

#### Evidence from PR #24053 (Open, Not Merged)

PR #24053 attempted to add thinking support for all Claude 4+ models on Bedrock (including Haiku 4.5) but was flagged by Greptile as violating design principles:

> "The correct approach is to let `supports_reasoning()` (backed by `model_prices_and_context_window.json`) be the single source of truth... The real problem is that `supports_reasoning()` fails for Bedrock models due to a provider/cost-map key mismatch."

**The core bug:** `supports_reasoning()` doesn't work properly for Bedrock because:
- Model entries are registered under `litellm_provider: "bedrock_converse"` or `litellm_provider: "bedrock"`
- Lookups pass `custom_llm_provider: "bedrock"` or `"bedrock_converse"`, which does not always match the value the entry was registered under
- On a mismatch, `supports_reasoning()` returns `False` even when the model supports reasoning
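
The mismatch can be illustrated with a toy lookup. This is a simplified sketch under stated assumptions: the table keyed by `(model, provider)`, the model name `example-claude-model`, and both helper functions are hypothetical and do not reflect LiteLLM's actual data structures.

```python
from typing import Dict, Tuple

# Toy capability table keyed by (model, provider). Entries are illustrative only.
CAPABILITIES: Dict[Tuple[str, str], Dict[str, bool]] = {
    ("example-claude-model", "bedrock_converse"): {"supports_reasoning": True},
}

# Hypothetical alias groups: treat both Bedrock spellings as one provider family.
PROVIDER_ALIASES = {
    "bedrock": ("bedrock", "bedrock_converse"),
    "bedrock_converse": ("bedrock_converse", "bedrock"),
}


def naive_supports_reasoning(model: str, provider: str) -> bool:
    """Exact-key lookup: a provider spelling mismatch silently yields False."""
    return CAPABILITIES.get((model, provider), {}).get("supports_reasoning", False)


def normalized_supports_reasoning(model: str, provider: str) -> bool:
    """Try each alias of the provider before concluding the flag is absent."""
    for candidate in PROVIDER_ALIASES.get(provider, (provider,)):
        entry = CAPABILITIES.get((model, candidate))
        if entry is not None:
            return entry.get("supports_reasoning", False)
    return False


print(naive_supports_reasoning("example-claude-model", "bedrock"))       # False
print(normalized_supports_reasoning("example-claude-model", "bedrock"))  # True
```

The naive lookup fails without raising an error, which is why a mismatch of this kind tends to be papered over with hardcoded model checks rather than noticed and fixed at the source.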

#### Hypothesis 2: Function Name/Scope Mismatch

From Greptile review of PR #25876:

> "`is_claude_4_5_on_bedrock` now covers 4.5, 4.6, *and* 4.7 model families. The name and docstring are misleading to any caller."

**Risk:** If `is_claude_4_5_on_bedrock()` is used to gate Bedrock-specific thinking behavior, adding Opus 4.7 patterns might have inadvertently changed behavior for other Claude 4.x models (including Haiku 4.5).

### 4. Specific Code Locations to Investigate

Based on the analysis, these files likely contain the bug:

1. **`litellm/llms/bedrock/common_utils.py`**
   - `is_claude_4_5_on_bedrock()` - Extended to include Opus 4.7 patterns
   - Lines 565-597 per Greptile review

2. **`litellm/llms/anthropic/chat/transformation.py`**
   - `_is_opus_4_7_model()` and thinking/reasoning logic
   - Lines 172-210, 237-248 per Gap G4 and Greptile review

3. **`litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py`**
   - `_supports_extended_thinking_on_bedrock()` - Extended to include Opus 4.7
   - May have incorrect logic for Haiku 4.5

4. **`litellm/llms/bedrock/chat/converse_transformation.py`**
   - `get_supported_openai_params()` with hardcoded substring checks
   - Lines 565-580 per PR #24053 review

### 5. Why Downgrading Fixed It

The previous version (`v1.82.3-stable.patch.4`) likely had:
- **More conservative checks** - Thinking metadata only sent to models explicitly known to support it
- **Narrower pattern matching** - Only Opus 4.5/4.6 covered, not all Claude 4.x
- **No Opus 4.7 extensions** - The functions weren't modified to include new patterns

---

## Reproduction Scenario

1. User makes a request to Bedrock with a Haiku 4.5 model
2. LiteLLM routes the request through the Bedrock provider
3. **Bug trigger:** Thinking metadata is added to the request because one of these conditions holds:
   - `is_claude_4_5_on_bedrock()` returns `True` for Haiku 4.5 due to broad pattern matching
   - Substring check `"claude-haiku-4" in model` matches Haiku 4.5
4. Bedrock API returns error: "unsupported thinking metadata"
5. LiteLLM retries or uses fallback logic
6. **Cost spike:** Multiple retries or expensive fallback code path
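
The retry portion of this scenario can be sketched as a small simulation. Everything here is hypothetical: `send_once` stands in for the provider, `UnsupportedThinkingError` for its rejection, and the retry loop is a generic one rather than LiteLLM's. Whether rejected attempts are actually billed is not established by this report; the sketch only shows how one logical request can turn into several attempts.

```python
from typing import Optional, Tuple


class UnsupportedThinkingError(Exception):
    """Stand-in for the provider rejection described in the report."""


def send_once(payload: dict) -> str:
    # Simulated provider: rejects any payload that carries a thinking block.
    if "thinking" in payload:
        raise UnsupportedThinkingError("unsupported thinking metadata")
    return "ok"


def send_with_retries(payload: dict, max_retries: int) -> Tuple[Optional[str], int]:
    """Send the same payload up to max_retries + 1 times; return (result, attempts)."""
    attempts = 0
    for _ in range(max_retries + 1):
        attempts += 1
        try:
            return send_once(payload), attempts
        except UnsupportedThinkingError:
            continue  # The payload is unchanged, so every retry fails the same way.
    return None, attempts


print(send_with_retries({"messages": []}, max_retries=3))                  # ('ok', 1)
print(send_with_retries({"messages": [], "thinking": {}}, max_retries=3))  # (None, 4)
```

Because the rejection is deterministic, retrying cannot succeed; it only multiplies the number of attempts per user request.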

---

## Recommended Investigation Steps for LiteLLM Team

1. **Check git diff between versions:**
   ```bash
   git diff v1.82.3-stable.patch.4..v1.82.3-stable.opus-4.7 -- \
     litellm/llms/bedrock/common_utils.py \
     litellm/llms/anthropic/chat/transformation.py \
     litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py
   ```

2. **Test Haiku 4.5 on Bedrock with thinking parameters:**
   - Send a request with `thinking` or `reasoning_effort` params
   - Verify whether Bedrock rejects it with "unsupported thinking metadata"

3. **Review function scope:**
   - Audit `is_claude_4_5_on_bedrock()` call sites
   - Check what behavior is gated by this function
   - Verify whether the Opus 4.7 addition affected Haiku 4.5 logic

4. **Check model capability flags:**
   - Review `model_prices_and_context_window.json` entries for Haiku 4.5
   - Verify `supports_reasoning`, `supports_extended_thinking`, etc.
   - Ensure Haiku 4.5 is correctly marked (likely should NOT have these flags)
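
Step 4 can be partly automated. The sketch below assumes the cost map is a JSON object mapping model keys to entry dicts and that capability flags share the `supports_` prefix. The helper names and the inline sample data are hypothetical, and the sample flag values are made up.

```python
import json

FLAG_PREFIX = "supports_"


def capability_flags(cost_map: dict, needle: str) -> dict:
    """Map each model key containing `needle` to its supports_* flags."""
    report = {}
    for model_key, entry in cost_map.items():
        if needle in model_key and isinstance(entry, dict):
            report[model_key] = {
                key: value for key, value in entry.items() if key.startswith(FLAG_PREFIX)
            }
    return report


def load_cost_map(path: str) -> dict:
    """Load a cost-map JSON file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Inline sample so the sketch runs without the real file; values are made up.
SAMPLE_COST_MAP = {
    "example.claude-haiku-4-5-v1": {
        "litellm_provider": "bedrock_converse",
        "supports_reasoning": False,
        "supports_vision": True,
    },
    "example.claude-opus-4-7-v1": {
        "litellm_provider": "bedrock_converse",
        "supports_reasoning": True,
    },
}

print(capability_flags(SAMPLE_COST_MAP, "haiku-4-5"))
```

Against a real checkout, the same helper could be called as `capability_flags(load_cost_map("model_prices_and_context_window.json"), "haiku-4-5")` to list every Haiku 4.5 entry next to its flags.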

---

## Recommended Fixes

### Short-term Fix (Hotfix)
Ensure Haiku 4.5 is explicitly excluded from thinking/reasoning metadata logic:
- Add explicit checks: `if "haiku-4-5" in model: return False` before broad Claude 4.x checks
- Or narrow the patterns in `is_claude_4_5_on_bedrock()` to only match 4.5/4.6/4.7 Opus and Sonnet
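
A minimal sketch of the first option, assuming the exclusion runs before any broad family check. The function name `should_send_thinking`, the pattern tuples, and the model IDs are hypothetical and only illustrate the ordering.

```python
# Hypothetical hotfix-style guard; names and patterns are assumptions for illustration.
EXCLUDED_SUBSTRINGS = ("haiku-4-5",)
THINKING_FAMILY_SUBSTRINGS = ("claude-opus-4", "claude-sonnet-4", "claude-haiku-4")


def should_send_thinking(model: str) -> bool:
    """Decide whether thinking metadata may be attached for this model ID."""
    normalized = model.lower()
    # The explicit exclusion must run first so the broad patterns cannot override it.
    if any(excluded in normalized for excluded in EXCLUDED_SUBSTRINGS):
        return False
    return any(pattern in normalized for pattern in THINKING_FAMILY_SUBSTRINGS)


print(should_send_thinking("anthropic.claude-haiku-4-5-20251001-v1:0"))  # False
print(should_send_thinking("example.claude-opus-4-7-v1"))                # True
```

The guard only catches IDs spelled with `haiku-4-5`. A variant spelling such as `haiku-4.5` or `haiku_4_5` would bypass it, which is one reason this is a stopgap rather than a fix.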

### Long-term Fix (Architectural)
Implement the fix described in Discussion #22555:

1. **Fix `supports_reasoning()` for Bedrock:**
   - Normalize provider names in `_get_model_info_helper()` or `_supports_factory()`
   - Handle `"bedrock"` vs `"bedrock_converse"` mismatch
   - Make `model_prices_and_context_window.json` the single source of truth

2. **Remove hardcoded model checks:**
   - Delete `_is_claude_4_6_model()`, `_is_opus_4_7_model()`, etc.
   - Replace with `supports_reasoning(model, provider)` calls
   - Let JSON entries drive behavior

3. **Add proper capability flags to JSON:**
   - `supports_extended_thinking`
   - `supports_adaptive_thinking`
   - `supports_reasoning_effort`
   - `supports_xhigh_reasoning_effort`
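
The intent of letting JSON entries drive behavior can be sketched as follows. This is an illustration under assumptions: the flag names follow the proposal above, while `MODEL_FLAGS`, `select_thinking_mode`, the model names, and the flag values are hypothetical.

```python
from typing import Optional

# Illustrative capability entries; flag names follow the proposal, values are made up.
MODEL_FLAGS = {
    "example-opus": {"supports_extended_thinking": True, "supports_adaptive_thinking": True},
    "example-sonnet": {"supports_extended_thinking": True},
    "example-haiku": {"supports_extended_thinking": False},
}


def select_thinking_mode(model: str) -> Optional[str]:
    """Pick a thinking mode purely from capability flags, never from the model name."""
    flags = MODEL_FLAGS.get(model, {})
    if flags.get("supports_adaptive_thinking"):
        return "adaptive"
    if flags.get("supports_extended_thinking"):
        return "extended"
    return None  # Unknown or unflagged models get no thinking metadata.


for name in ("example-opus", "example-sonnet", "example-haiku", "unlisted-model"):
    print(name, select_thinking_mode(name))
```

With this shape, adding a new model means adding a data entry. No function needs a new version string, and a model without flags defaults to the conservative behavior.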

---

## Related Issues & Context

- **GitHub Discussion #22555** - "Day 0 Model Release Gaps" (documents the architectural problem)
- **PR #24053** (Open) - "fix: added thinking and reasoning support for all claude 4+ models on bedrock"
  - Flagged for violating no-hardcode policy
  - Shows the root cause in `supports_reasoning()` provider mismatch
- **PR #25867** - "Litellm day 0 opus 4.7 support" (merged)
- **PR #25876** - "Litellm hotfix opus 4.7" (merged April 16, 2026)
- **Anthropic Opus 4.7 Announcement** - April 16, 2026

---

## Conclusion

The Opus 4.7 day 0 launch PRs extended hardcoded model checks (a known anti-pattern) to support the new model. This likely caused unintended side effects for Haiku 4.5 on Bedrock:

1. **Thinking metadata incorrectly sent** to Haiku 4.5 models
2. **Bedrock API rejections** with "unsupported thinking metadata" errors
3. **Retry storms** or expensive fallback logic
4. **Cost spike** observed by users

The root cause is the architectural pattern of hardcoding model checks instead of using the model capability JSON as the single source of truth, combined with `supports_reasoning()` not working correctly for Bedrock providers.

**Immediate action needed:** Revert the Opus 4.7 changes, or patch them to explicitly exclude Haiku 4.5 from thinking/reasoning metadata logic.

**Follow-up action:** Implement the architectural fix described in Discussion #22555 to prevent this category of bugs in future model releases.
