Commit 234003a

docs: add investigation overview README
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
1 parent 7f2d1c5 commit 234003a

1 file changed

Lines changed: 256 additions & 0 deletions

INVESTIGATION_README.md

@@ -0,0 +1,256 @@
1+
# Investigation: Haiku 4.5 Bedrock Cost Spike After the Opus 4.7 Release
2+
3+
This investigation was conducted in response to a Slack report on April 17, 2026, regarding a cost spike on Haiku 4.5 Bedrock after upgrading from `v1.82.3-stable.patch.4` to `v1.82.3-stable.opus-4.7`.
4+
5+
---
6+
7+
## Documents in This Investigation
8+
9+
### 1. [SUMMARY_FOR_TEAM.md](./SUMMARY_FOR_TEAM.md) - **START HERE**
10+
Quick executive summary for the team with:
11+
- TL;DR and root cause
12+
- What happened and why
13+
- Immediate action items
14+
- Files to check
15+
16+
**Read this first if you need to understand the issue quickly.**
17+
18+
### 2. [INVESTIGATION_OPUS_4_7_HAIKU_REGRESSION.md](./INVESTIGATION_OPUS_4_7_HAIKU_REGRESSION.md)
19+
Full investigation report including:
20+
- Detailed issue summary
21+
- Investigation findings
22+
- Root cause analysis
23+
- Evidence from PRs and discussions
24+
- Recommended fixes (short-term and long-term)
25+
- References to related issues
26+
27+
**Read this for complete context and evidence.**
28+
29+
### 3. [CODE_PATTERNS_ANALYSIS.md](./CODE_PATTERNS_ANALYSIS.md)
30+
Technical deep-dive showing:
31+
- Specific code patterns that caused the bug
32+
- Before/after comparisons
33+
- Why each pattern breaks
34+
- Request flow diagrams
35+
- Detailed fix strategies
36+
37+
**Read this if you need to understand the exact code changes.**
38+
39+
---
40+
41+
## Quick Context
42+
43+
### The Problem
44+
After upgrading to the Opus 4.7 release, users saw **significant cost spikes** on Haiku 4.5 models on Bedrock, accompanied by **"unsupported thinking metadata"** errors.
45+
46+
### The Root Cause
47+
The Opus 4.7 day 0 launch PRs (#25867, #25876) extended hardcoded model pattern checks using **broad substring matching** that inadvertently matched Haiku 4.5 models. This caused LiteLLM to send thinking/reasoning metadata to Haiku 4.5, which Bedrock rejects, leading to retry storms and cost spikes.
48+
49+
### Key Pattern
50+
```python
# This pattern matches BOTH Haiku 4.7 AND Haiku 4.5!
if "claude-haiku-4" in model:
    add_thinking_metadata()  # ❌ Haiku 4.5 doesn't support this!
```
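To make the failure mode concrete, the following is a minimal sketch contrasting a broad substring check with an explicit version check. The helper names (`broad_match`, `explicit_match`), the regular expression, and the `THINKING_VERSIONS` allowlist are hypothetical and exist only for illustration; they are not LiteLLM functions. The model IDs are the ones used in the testing section of this README.

```python
import re

# Model IDs taken from the testing section of this README.
HAIKU_45 = "anthropic.claude-haiku-4-5-20251001-v1:0"
OPUS_47 = "anthropic.claude-opus-4-7-20260416-v1:0"


def broad_match(model: str) -> bool:
    """Substring check in the style described above (hypothetical helper)."""
    return "claude-haiku-4" in model


# Hypothetical parser: extracts family, major, and minor version from a model ID.
_VERSION_RE = re.compile(
    r"claude-(?P<family>opus|sonnet|haiku)-(?P<major>\d+)-(?P<minor>\d{1,2})(?=-|$)"
)

# Assumed allowlist for illustration only. Long term, this data would live in
# the capability JSON rather than in code.
THINKING_VERSIONS = {("opus", 4, 7)}


def explicit_match(model: str) -> bool:
    """Match only the exact (family, major, minor) tuples in the allowlist."""
    m = _VERSION_RE.search(model)
    if m is None:
        return False
    key = (m["family"], int(m["major"]), int(m["minor"]))
    return key in THINKING_VERSIONS


print(broad_match(HAIKU_45))     # substring check catches Haiku 4.5
print(explicit_match(HAIKU_45))  # explicit check does not
print(explicit_match(OPUS_47))
```

An explicit check like this is only a stopgap in the spirit of the hotfix; it still hardcodes versions, which is the pattern the strategic actions below aim to remove.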
55+
56+
---
57+
58+
## For LiteLLM Team
59+
60+
### Immediate Actions Needed
61+
62+
1. **Verify the issue**
63+
   ```bash
   # Check customer logs for "unsupported thinking metadata" errors
   # Correlate with Haiku 4.5 requests on Bedrock
   ```
67+
68+
2. **Test current behavior**
69+
   ```python
   # Test if Haiku 4.5 is getting thinking metadata
   # Test if Bedrock rejects it
   ```
73+
74+
3. **Deploy hotfix**
75+
- See [CODE_PATTERNS_ANALYSIS.md](./CODE_PATTERNS_ANALYSIS.md) → "Fix Strategy" → "1. Immediate Hotfix"
76+
- Explicitly exclude Haiku 4.5 from thinking metadata logic
77+
78+
4. **Run full CI/CD for the hotfix**
79+
- Include integration tests for all Claude 4.x models on Bedrock
80+
- Verify Haiku 4.5 behavior specifically
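Step 1 above (correlating the error with Haiku 4.5 requests) could be approached along the following lines. This is a rough sketch that assumes logs are available as JSON lines with `model` and `error` fields; that log shape and the function name `count_errors_by_model` are assumptions for illustration, not a description of LiteLLM's actual log format.

```python
import json
from collections import Counter
from typing import Iterable

ERROR_MARKER = "unsupported thinking metadata"


def count_errors_by_model(log_lines: Iterable[str]) -> Counter:
    """Count log records containing the error marker, grouped by model."""
    counts: Counter = Counter()
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(record, dict):
            continue
        error_text = record.get("error") or ""
        if ERROR_MARKER in str(error_text):
            counts[str(record.get("model", "unknown"))] += 1
    return counts


# Made-up sample records in the assumed log shape.
sample_logs = [
    '{"model": "anthropic.claude-haiku-4-5-20251001-v1:0", "error": "unsupported thinking metadata"}',
    '{"model": "anthropic.claude-haiku-4-5-20251001-v1:0", "error": null}',
    '{"model": "anthropic.claude-opus-4-7-20260416-v1:0", "error": ""}',
    "not json",
]

for model, n in count_errors_by_model(sample_logs).items():
    print(model, n)
```

If the errors cluster on Haiku 4.5 model IDs and begin at the upgrade time, that supports the root cause described above.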
81+
82+
### Strategic Actions
83+
84+
This is the **third occurrence** of hardcoded model checks causing regressions. It is time to fix the root cause:
85+
86+
1. **Implement provider name normalization** in `supports_reasoning()`
87+
2. **Remove all hardcoded model version checks**
88+
3. **Enforce JSON-based capability system**
89+
4. **Require full CI/CD for all model launches** (no exceptions)
90+
91+
See [Discussion #22555 "Day 0 Model Release Gaps"](https://github.com/BerriAI/litellm/discussions/22555) for the full architectural fix plan.
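As an illustration of items 1 through 3, a capability lookup driven by JSON with provider-name normalization might look roughly like the sketch below. Everything here is an assumption for illustration: the inline JSON stands in for `model_prices_and_context_window.json`, the capability values mirror this investigation's claims rather than verified data, and `normalize_model_name`, `supports_reasoning_from_json`, and the prefix list are hypothetical names, not LiteLLM's implementation.

```python
import json

# Stand-in for the capability JSON; values reflect this investigation's claims.
CAPABILITIES_JSON = """
{
  "anthropic.claude-opus-4-7-20260416-v1:0": {"supports_reasoning": true},
  "anthropic.claude-haiku-4-5-20251001-v1:0": {"supports_reasoning": false}
}
"""
CAPABILITIES = json.loads(CAPABILITIES_JSON)

# Assumed routing and region prefixes that may precede a Bedrock model ID.
KNOWN_PREFIXES = ("bedrock/", "converse/", "invoke/", "us.", "eu.", "apac.", "global.")


def normalize_model_name(model: str) -> str:
    """Strip known prefixes repeatedly until none remain."""
    changed = True
    while changed:
        changed = False
        for prefix in KNOWN_PREFIXES:
            if model.startswith(prefix):
                model = model[len(prefix):]
                changed = True
    return model


def supports_reasoning_from_json(model: str) -> bool:
    """Look up the capability flag; unknown models default to False."""
    entry = CAPABILITIES.get(normalize_model_name(model), {})
    return bool(entry.get("supports_reasoning", False))


print(supports_reasoning_from_json("bedrock/us.anthropic.claude-haiku-4-5-20251001-v1:0"))
print(supports_reasoning_from_json("bedrock/anthropic.claude-opus-4-7-20260416-v1:0"))
```

Defaulting unknown models to `False` is a deliberate choice in this sketch: a new model gets no thinking metadata until its JSON entry says otherwise, so a launch cannot silently change behavior for neighboring models.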
92+
93+
---
94+
95+
## For Reviewers
96+
97+
### What Went Wrong
98+
99+
1. **Design principle violation**
100+
- LiteLLM has a documented principle: model capabilities should be in JSON, not hardcoded
101+
- Opus 4.7 PRs violated this principle (acknowledged in Greptile reviews)
102+
103+
2. **Broad pattern matching**
104+
- Used `"claude-haiku-4" in model`, which matches 4.0, 4.5, 4.6, 4.7, 4.8, and any other 4.x version
105+
- Should have used explicit version checks
106+
107+
3. **Function scope creep**
108+
- `is_claude_4_5_on_bedrock()` now covers 4.5, 4.6, AND 4.7
109+
- Name no longer matches behavior
110+
111+
4. **Skipped CI/CD**
112+
- Opus 4.7 didn't go through full CI/CD (confirmed in Slack thread)
113+
- Integration tests would have caught this
114+
115+
### Why It Wasn't Caught
116+
117+
- **No Haiku 4.5 tests** for the new logic paths
118+
- **Hardcoded checks** bypassed the broken `supports_reasoning()` function
119+
- **Function name** (`is_claude_4_5_on_bedrock`) suggested it was only for 4.5
120+
- **Greptile flagged issues**, but the PRs were merged anyway (P2 severity, not blocking)
121+
122+
---
123+
124+
## Evidence Trail
125+
126+
### GitHub PRs
127+
- **PR #25867** - "Litellm day 0 opus 4.7 support" (merged April 16)
128+
- **PR #25876** - "Litellm hotfix opus 4.7" (merged April 16, 7:19pm)
129+
- **PR #24053** - "fix: added thinking and reasoning support for all claude 4+ models" (OPEN, documents root cause)
130+
131+
### GitHub Discussions
132+
- **Discussion #22555** - "Day 0 Model Release Gaps" (documents the architectural problem)
133+
- Gap G4: Thinking/reasoning gated on hardcoded strings
134+
- Gap G13: Parallel tool use on Bedrock hardcoded
135+
136+
### Slack Thread
137+
- **Channel:** #llms-engineering
138+
- **Date:** April 17, 2026
139+
- **Reporter:** User upgraded and saw cost spike on Haiku 4.5
140+
- **Key quote:** "unsupported thinking metadata" errors
141+
- **Confirmation:** Downgrade restored normal costs
142+
143+
### Greptile Reviews
144+
Both PRs were flagged by Greptile for:
145+
- Hardcoded model checks (P2)
146+
- Function name mismatch (P2)
147+
- Redundant checks (P2)
148+
149+
All flagged as violating the "no-hardcode" design principle.
150+
151+
---
152+
153+
## Files Changed in Opus 4.7 PRs
154+
155+
Based on Greptile review of PR #25876:
156+
157+
1. `litellm/llms/anthropic/chat/transformation.py`
158+
- Added `_is_opus_4_7_model()`
159+
- Extended thinking/reasoning logic
160+
161+
2. `litellm/llms/anthropic/common_utils.py`
162+
- Added `_is_claude_4_7_model()`
163+
- Extended `_is_adaptive_thinking_model()`
164+
165+
3. `litellm/llms/bedrock/common_utils.py`
166+
- Extended `is_claude_4_5_on_bedrock()` with 4.7 patterns
167+
168+
4. `litellm/llms/bedrock/chat/converse_transformation.py`
169+
- Added Opus 4.7 to computer-use beta header selection
170+
171+
5. `litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py`
172+
- Extended `_supports_extended_thinking_on_bedrock()`
173+
174+
6. `model_prices_and_context_window.json`
175+
- Added Opus 4.7 entries with pricing and capabilities
176+
177+
7. `litellm/setup_wizard.py`
178+
- Added claude-opus-4-7 to model list
179+
180+
---
181+
182+
## Testing Strategy (for Hotfix)
183+
184+
```python
# NOTE: the helpers used below (_supports_extended_thinking_on_bedrock,
# is_thinking_enabled, supports_extended_thinking_in_json, make_bedrock_request)
# are assumed to be importable in the test module.

# Test 1: Verify Haiku 4.5 does NOT get thinking metadata
def test_haiku_45_no_thinking():
    model = "anthropic.claude-haiku-4-5-20251001-v1:0"
    assert not _supports_extended_thinking_on_bedrock(model)
    assert not is_thinking_enabled(model)

# Test 2: Verify Opus 4.7 DOES get thinking metadata
def test_opus_47_has_thinking():
    model = "anthropic.claude-opus-4-7-20260416-v1:0"
    assert _supports_extended_thinking_on_bedrock(model)
    assert is_thinking_enabled(model)

# Test 3: Verify Haiku 4.6+ gets thinking (if supported)
def test_haiku_46_has_thinking():
    model = "anthropic.claude-haiku-4-6-20260101-v1:0"
    # Check JSON first!
    if supports_extended_thinking_in_json(model):
        assert _supports_extended_thinking_on_bedrock(model)

# Test 4: Verify no "unsupported thinking metadata" errors
def test_bedrock_haiku_45_request():
    model = "anthropic.claude-haiku-4-5-20251001-v1:0"
    response = make_bedrock_request(model, messages=[...])
    assert response.status_code == 200
    # error_message may be None on success, so fall back to an empty string
    assert "unsupported" not in (response.error_message or "")
```
211+
212+
---
213+
214+
## Related Resources
215+
216+
### LiteLLM Documentation
217+
- [Day 0 Model Release Gaps](https://github.com/BerriAI/litellm/discussions/22555)
218+
- [Contributing Guide](https://docs.litellm.ai/docs/extras/contributing_code)
219+
220+
### Anthropic Resources
221+
- [Claude Opus 4.7 Announcement](https://www.anthropic.com/news/claude-opus-4-7) (April 16, 2026)
222+
- [AWS Bedrock Opus 4.7](https://aws.amazon.com/blogs/aws/introducing-anthropics-claude-opus-4-7-model-in-amazon-bedrock/)
223+
224+
### External Analysis
225+
- [Dev.to: Claude Opus 4.7 Migration Guide](https://dev.to/lavellehatcherjr/anthropic-releases-claude-opus-47-key-changes-and-migration-guide-for-developers-3an4)
226+
227+
---
228+
229+
## Contact
230+
231+
This investigation was conducted by Cursor Cloud Agent on April 17, 2026.
232+
233+
**For questions about this investigation:**
234+
- Review the three documents linked above
235+
- Check the Slack thread in #llms-engineering
236+
- Review PRs #25867 and #25876 on GitHub
237+
238+
**For technical questions about the fix:**
239+
- See [CODE_PATTERNS_ANALYSIS.md](./CODE_PATTERNS_ANALYSIS.md) → "Fix Strategy"
240+
- See [Discussion #22555](https://github.com/BerriAI/litellm/discussions/22555) for architectural fix
241+
242+
---
243+
244+
## Status
245+
246+
- ✅ Investigation complete
247+
- ✅ Root cause identified
248+
- ✅ Fix strategies documented
249+
- ⏳ Awaiting LiteLLM team implementation
250+
- ⏳ Hotfix needed
251+
- ⏳ Architectural fix needed
252+
253+
---
254+
255+
**Last Updated:** April 17, 2026
256+
**Investigation Branch:** `cursor/investigate-opus-4-7-haiku-regression-5249`
