Commit ac0a9c8

docs: add executive summary for team
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
1 parent 2ef6b24 commit ac0a9c8

1 file changed

Lines changed: 213 additions & 0 deletions

SUMMARY_FOR_TEAM.md

@@ -0,0 +1,213 @@
# Quick Summary: Opus 4.7 Haiku 4.5 Bedrock Regression

**For:** Ishaan, Krrish, Yuneng, LiteLLM Team
**Date:** April 17, 2026
**Issue:** Cost spike on Haiku 4.5 Bedrock after Opus 4.7 upgrade

---

## TL;DR

The Opus 4.7 day 0 launch (PRs #25867 and #25876, merged April 16) extended hardcoded model pattern checks. As a result, LiteLLM now incorrectly sends thinking/reasoning metadata to Haiku 4.5 models on Bedrock. Bedrock rejects these requests, which triggers retry storms and cost spikes.

---

## What Happened

```
User upgrades: v1.82.3-stable.patch.4 → v1.82.3-stable.opus-4.7

Haiku 4.5 requests now include thinking metadata

Bedrock API error: "unsupported thinking metadata"

Retries/fallback logic kicks in

Cost spike observed
```

---

## Root Cause

### The PRs
- **PR #25867** - "Litellm day 0 opus 4.7 support"
- **PR #25876** - "Litellm hotfix opus 4.7" (merged April 16, 2026 at 7:19pm)

### What Changed

1. **Extended `is_claude_4_5_on_bedrock()` function**
   - File: `litellm/llms/bedrock/common_utils.py`
   - Added Opus 4.7 patterns: `"opus-4.7"`, `"opus_4_7"`, `"opus-4-7"`
   - **Problem:** Function name says "4.5" but now covers 4.5, 4.6, AND 4.7
   - **Impact:** Every call site that gates behavior on this function now behaves differently for the newly matched models

2. **Added Opus 4.7 to `_supports_extended_thinking_on_bedrock()`**
   - File: `litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py`
   - Uses broad pattern matching: `"claude-haiku-4" in model`
   - **Problem:** Matches Haiku 4.5 too!

3. **Extended adaptive thinking checks**
   - File: `litellm/llms/anthropic/common_utils.py`
   - `_is_adaptive_thinking_model()` now includes Opus 4.7

### Why This Breaks Haiku 4.5

Haiku 4.5 **does not support** the same thinking metadata as Opus models. The broad pattern matching added for Opus 4.7 catches Haiku 4.5 models:

```python
# This substring check matches every Haiku 4.x model ID, including Haiku 4.5!
if "claude-haiku-4" in model:
    # Send thinking metadata
    pass
```

When thinking metadata is sent to Haiku 4.5:
- ❌ Bedrock API rejects the request
- 🔁 LiteLLM retries
- 💰 Cost multiplies
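
To make the over-match concrete, here is a minimal sketch. The helper `broad_thinking_gate` and the model ID strings are illustrative stand-ins rather than code copied from LiteLLM; the only piece taken from the summary above is the `"claude-haiku-4"` substring check. It shows that the check is true for a Haiku 4.5 style ID, so any feature gated this way is switched on for that model.

```python
# Illustrative stand-in for the broad gate described above (not LiteLLM's actual code).
def broad_thinking_gate(model: str) -> bool:
    """Return True when the substring check would enable thinking metadata."""
    return "claude-haiku-4" in model


# Example model identifiers, assumed for illustration only.
sample_models = [
    "bedrock/anthropic.claude-haiku-4-5-20251001-v1:0",
    "bedrock/anthropic.claude-3-haiku-20240307-v1:0",
]

# Map each sample ID to the gate's decision and print it.
matches = {m: broad_thinking_gate(m) for m in sample_models}
for model, matched in matches.items():
    print(f"{model}: {matched}")
```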

---

## Evidence

### From Greptile Review (PR #25876)
> "`is_claude_4_5_on_bedrock` now covers 4.5, 4.6, *and* 4.7 model families. The name and docstring are misleading to any caller."

### From GitHub Discussion #22555 (Day 0 Model Release Gaps)
> **Gap G13 - Bedrock:** "Parallel tool use on Bedrock gated on hardcoded Claude 4.5/4.6 patterns"
>
> `is_claude_4_5_on_bedrock()` controls feature flags. Adding new patterns affects all models that match.

### From PR #24053 (Open, addresses same issue)
> "The real bug is that `supports_reasoning()` fails for Bedrock because `_get_model_info_helper` / `_supports_factory` does not normalize the `'bedrock'` vs `'bedrock_converse'` provider name difference."

---

## Why CI Didn't Catch This

From Slack thread:
> **Yuneng:** "no not for opus 4.7"
> **Krrish:** "oh, why? i assume that would have prevented this"

The Opus 4.7 patch **did not go through full CI/CD**. A full run, including the integration tests, would likely have caught the Haiku 4.5 regression.

---

## Immediate Fix Needed

### Option 1: Explicit Exclusion (Fastest)
Add explicit checks to exclude Haiku 4.5 before broad Claude 4.x matching:

```python
def _supports_extended_thinking_on_bedrock(model: str) -> bool:
    # Explicitly exclude Haiku 4.5
    if "haiku-4-5" in model or "haiku_4_5" in model:
        return False

    # Then check for Opus/Sonnet 4.x
    if "claude-opus-4" in model or "claude-sonnet-4" in model:
        return True

    return False
```

### Option 2: Narrow Pattern Matching (Better)
Replace broad patterns with specific version checks:

```python
# Instead of: if "claude-haiku-4" in model
# Use an explicit list of versions (placeholder suffixes; list only verified versions):
if any(p in model for p in ["haiku-4-6", "haiku-4-7", "haiku-4-8"]):
    # Only specific versions, not 4.5
    pass
```
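
One way to narrow the match further is to parse the family and version out of the model ID and compare against an explicit allowlist, instead of chaining substring checks. The sketch below is an assumption-laden illustration: `parse_family_version`, `supports_thinking`, the regular expression, and the single allowlist entry are all hypothetical names and values, not LiteLLM code, and the model ID formats are assumed.

```python
import re
from typing import Optional, Tuple

# Hypothetical allowlist of (family, major, minor) tuples verified to accept
# thinking metadata. The single entry is a placeholder for illustration.
THINKING_ALLOWLIST = {("opus", 4, 7)}

# Matches e.g. "claude-opus-4-7" or "claude-haiku-4.5"; the minor version is
# limited to 1-2 digits so a date suffix such as "20250514" is not mistaken for it.
_MODEL_RE = re.compile(r"claude-(opus|sonnet|haiku)-(\d+)[-._](\d{1,2})(?!\d)")


def parse_family_version(model: str) -> Optional[Tuple[str, int, int]]:
    """Extract (family, major, minor) from a model ID, or None if absent."""
    match = _MODEL_RE.search(model.lower())
    if match is None:
        return None
    family, major, minor = match.groups()
    return family, int(major), int(minor)


def supports_thinking(model: str) -> bool:
    """Fail closed: only exact allowlisted versions enable thinking."""
    return parse_family_version(model) in THINKING_ALLOWLIST
```

Because unknown or unparseable IDs return `False`, a newly released model stays off the thinking path until someone adds it to the allowlist deliberately.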

### Option 3: Revert & Rework (Safest)
1. Revert PRs #25867 and #25876
2. Fix the root cause (provider name mismatch in `supports_reasoning()`)
3. Re-implement Opus 4.7 support using JSON capability flags
4. Run full CI/CD

---

## Long-term Fix

This is the **3rd time** this pattern has caused issues. An architectural fix is needed:

1. **Fix `supports_reasoning()` for Bedrock**
   - Normalize `"bedrock"` vs `"bedrock_converse"` provider names
   - Make `model_prices_and_context_window.json` work correctly

2. **Remove all hardcoded model checks**
   - Delete: `_is_claude_4_6_model()`, `_is_opus_4_7_model()`, etc.
   - Replace with: `supports_reasoning(model, provider)` JSON lookups

3. **Add proper JSON flags**
   - `supports_extended_thinking`
   - `supports_adaptive_thinking`
   - `supports_xhigh_reasoning_effort`

This is already documented in **Discussion #22555** and just needs implementation.
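
The combination of provider normalization and flag lookups could look roughly like the sketch below. Everything in it is hypothetical: the capability map, its keys, the alias table, and the helpers `normalize_provider` and `supports_flag` are stand-ins for illustration and do not reflect the real structure of `model_prices_and_context_window.json` or LiteLLM's lookup functions. The point is only that a lookup keyed on a normalized provider name returns the same answer for `"bedrock"` and `"bedrock_converse"`, and that unknown models default to `False`.

```python
import json

# Hypothetical capability map; keys, flags, and values are assumptions.
CAPABILITIES_JSON = """
{
  "bedrock/example-opus-model": {"supports_extended_thinking": true},
  "bedrock/example-haiku-model": {"supports_extended_thinking": false}
}
"""
CAPABILITIES = json.loads(CAPABILITIES_JSON)

# Assumed alias table: provider names that should share one set of entries.
PROVIDER_ALIASES = {"bedrock_converse": "bedrock"}


def normalize_provider(provider: str) -> str:
    """Map provider aliases onto a single canonical name."""
    return PROVIDER_ALIASES.get(provider, provider)


def supports_flag(model: str, provider: str, flag: str) -> bool:
    """Look up a capability flag; missing models or flags count as unsupported."""
    key = f"{normalize_provider(provider)}/{model}"
    return bool(CAPABILITIES.get(key, {}).get(flag, False))
```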

---

## Action Items

- [ ] **@Yuneng** - Pull logs from affected customers to confirm "unsupported thinking metadata" errors
- [ ] **@Sameer** - Test Haiku 4.5 on Bedrock with current main branch
  - Does it send thinking params?
  - Does Bedrock reject them?
- [ ] **@Ishaan** - Review `is_claude_4_5_on_bedrock()` call sites
  - What behavior is gated by this function?
  - Did Opus 4.7 changes affect Haiku 4.5 code paths?
- [ ] **@Krrish** - Implement hotfix (Option 1 or 2 above)
- [ ] **Team** - Schedule architectural fix (Discussion #22555)
- [ ] **Team** - Require full CI/CD for all model launches (no exceptions)

---

## Files to Check

Priority order for investigation:

1. `litellm/llms/bedrock/common_utils.py`
   - Lines 565-597: `is_claude_4_5_on_bedrock()`

2. `litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py`
   - `_supports_extended_thinking_on_bedrock()`

3. `litellm/llms/anthropic/chat/transformation.py`
   - Lines 172-210, 237-248: Thinking/reasoning logic

4. `litellm/llms/bedrock/chat/converse_transformation.py`
   - Lines 565-580: `get_supported_openai_params()`

---

## Test Commands

```bash
# See the exact changes
git diff v1.82.3-stable.patch.4..v1.82.3-stable.opus-4.7 -- \
  litellm/llms/bedrock/common_utils.py \
  litellm/llms/anthropic/chat/transformation.py

# Test Haiku 4.5 with thinking (should fail)
python test_haiku_bedrock_thinking.py  # Create this test

# Run full test suite
make test-unit
```
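
A possible starting point for `test_haiku_bedrock_thinking.py` is sketched below. It is a shape for the regression test, not the real thing: `thinking_gate` is a local stand-in that follows Option 1, and the model IDs are assumed for illustration. The actual test would import the gate under test from LiteLLM instead of defining it locally. The functions use plain `assert` statements, so they can be collected by pytest or called directly.

```python
# test_haiku_bedrock_thinking.py (sketch)

def thinking_gate(model: str) -> bool:
    """Local stand-in following Option 1: never enable thinking for Haiku 4.5."""
    name = model.lower()
    if "haiku-4-5" in name or "haiku_4_5" in name:
        return False
    return "claude-opus-4" in name or "claude-sonnet-4" in name


# Model IDs assumed for illustration only.
HAIKU_4_5_IDS = [
    "anthropic.claude-haiku-4-5-20251001-v1:0",
    "us.anthropic.claude-haiku-4-5-20251001-v1:0",
]
OPUS_IDS = ["anthropic.claude-opus-4-7"]


def test_haiku_4_5_never_gets_thinking():
    # Regression guard: Haiku 4.5 must stay off the thinking path.
    for model in HAIKU_4_5_IDS:
        assert not thinking_gate(model), model


def test_opus_still_gets_thinking():
    # The fix must not disable thinking for the models it was added for.
    for model in OPUS_IDS:
        assert thinking_gate(model), model
```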

---

## Timeline

- **April 16, 2026 5:19pm** - PR #25867 merged into main
- **April 16, 2026 7:19pm** - PR #25876 (hotfix) merged
- **April 17, 2026 7:32pm** - User reports cost spike on Slack
- **April 17, 2026 7:47pm** - Ishaan asks Cursor to investigate

---

**Bottom line:** Opus 4.7 launch used hardcoded patterns that broke Haiku 4.5. Need immediate hotfix + long-term architectural fix.
