| 1 | +# Quick Summary: Haiku 4.5 Bedrock Regression After the Opus 4.7 Launch |
| 2 | + |
| 3 | +**For:** Ishaan, Krrish, Yuneng, LiteLLM Team |
| 4 | +**Date:** April 17, 2026 |
| 5 | +**Issue:** Cost spike on Haiku 4.5 Bedrock after Opus 4.7 upgrade |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## TL;DR |
| 10 | + |
| 11 | +The Opus 4.7 day 0 launch (PRs #25867, #25876 merged April 16) extended hardcoded model pattern checks that are now incorrectly sending thinking/reasoning metadata to Haiku 4.5 models on Bedrock. Bedrock rejects these requests, causing retry storms and cost spikes. |
| 12 | + |
| 13 | +--- |
| 14 | + |
| 15 | +## What Happened |
| 16 | + |
| 17 | +``` |
| 18 | +User upgrades: v1.82.3-stable.patch.4 → v1.82.3-stable.opus-4.7 |
| 19 | + ↓ |
| 20 | +Haiku 4.5 requests now include thinking metadata |
| 21 | + ↓ |
| 22 | +Bedrock API error: "unsupported thinking metadata" |
| 23 | + ↓ |
| 24 | +Retries/fallback logic kicks in |
| 25 | + ↓ |
| 26 | +Cost spike observed |
| 27 | +``` |
| 28 | + |
| 29 | +--- |
| 30 | + |
| 31 | +## Root Cause |
| 32 | + |
| 33 | +### The PRs |
| 34 | +- **PR #25867** - "Litellm day 0 opus 4.7 support" |
| 35 | +- **PR #25876** - "Litellm hotfix opus 4.7" (merged April 16, 2026 at 7:19pm) |
| 36 | + |
| 37 | +### What Changed |
| 38 | + |
| 39 | +1. **Extended `is_claude_4_5_on_bedrock()` function** |
| 40 | + - File: `litellm/llms/bedrock/common_utils.py` |
| 41 | + - Added Opus 4.7 patterns: `"opus-4.7"`, `"opus_4_7"`, `"opus-4-7"` |
| 42 | + - **Problem:** Function name says "4.5" but now covers 4.5, 4.6, AND 4.7 |
| 43 | + - **Impact:** Any code that gates behavior on this function now behaves differently for every model it matches |
| 44 | + |
| 45 | +2. **Added Opus 4.7 to `_supports_extended_thinking_on_bedrock()`** |
| 46 | + - File: `litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py` |
| 47 | + - Uses broad pattern matching: `"claude-haiku-4"` in model |
| 48 | + - **Problem:** Matches Haiku 4.5 too! |
| 49 | + |
| 50 | +3. **Extended adaptive thinking checks** |
| 51 | + - File: `litellm/llms/anthropic/common_utils.py` |
| 52 | + - `_is_adaptive_thinking_model()` now includes Opus 4.7 |
| 53 | + |
| 54 | +### Why This Breaks Haiku 4.5 |
| 55 | + |
| 56 | +Haiku 4.5 **does not support** the same thinking metadata as Opus models. The broad pattern matching added for Opus 4.7 catches Haiku 4.5 models: |
| 57 | + |
| 58 | +```python |
| 59 | +# This substring check matches every Haiku 4.x model ID, including Haiku 4.5. |
| 60 | +if "claude-haiku-4" in model: |
| 61 | +    # Thinking metadata gets attached here |
| 62 | +    pass |
| 63 | +``` |
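
To see how far that substring check reaches, the sketch below runs it against a few sample model IDs. The IDs are assumptions chosen for illustration (the Opus 4.7 ID in particular is hypothetical), and the helper name is not from the LiteLLM codebase.

```python
# Sample IDs are illustrative assumptions, not taken from LiteLLM.
SAMPLE_MODEL_IDS = [
    "anthropic.claude-haiku-4-5-20251001-v1:0",
    "us.anthropic.claude-haiku-4-5-20251001-v1:0",
    "anthropic.claude-opus-4-7",  # hypothetical ID
    "anthropic.claude-3-5-haiku-20241022-v1:0",
]


def matches_broad_haiku_pattern(model: str) -> bool:
    """Mirror of the broad substring check quoted above (hypothetical helper)."""
    return "claude-haiku-4" in model


for model_id in SAMPLE_MODEL_IDS:
    # Print each ID with the result of the broad check.
    print(f"{model_id}: {matches_broad_haiku_pattern(model_id)}")
```

Both Haiku 4.5 IDs pass the check, while the Opus ID does not, which suggests the check cannot distinguish Haiku 4.5 from any later Haiku 4.x release.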
| 64 | + |
| 65 | +When thinking metadata is sent to Haiku 4.5: |
| 66 | +- ❌ Bedrock API rejects the request |
| 67 | +- 🔁 LiteLLM retries |
| 68 | +- 💰 Cost multiplies |
| 69 | + |
| 70 | +--- |
| 71 | + |
| 72 | +## Evidence |
| 73 | + |
| 74 | +### From Greptile Review (PR #25876) |
| 75 | +> "`is_claude_4_5_on_bedrock` now covers 4.5, 4.6, *and* 4.7 model families. The name and docstring are misleading to any caller." |
| 76 | +
|
| 77 | +### From GitHub Discussion #22555 (Day 0 Model Release Gaps) |
| 78 | +> **Gap G13 - Bedrock:** "Parallel tool use on Bedrock gated on hardcoded Claude 4.5/4.6 patterns" |
| 79 | +> |
| 80 | +> `is_claude_4_5_on_bedrock()` controls feature flags. Adding new patterns affects all models that match. |
| 81 | +
|
| 82 | +### From PR #24053 (Open, addresses same issue) |
| 83 | +> "The real bug is that `supports_reasoning()` fails for Bedrock because `_get_model_info_helper` / `_supports_factory` does not normalize the `'bedrock'` vs `'bedrock_converse'` provider name difference." |
| 84 | +
|
| 85 | +--- |
| 86 | + |
| 87 | +## Why CI Didn't Catch This |
| 88 | + |
| 89 | +From Slack thread: |
| 90 | +> **Yuneng:** "no not for opus 4.7" |
| 91 | +> **Krrish:** "oh, why? i assume that would have prevented this" |
| 92 | +
|
| 93 | +The Opus 4.7 patch **did not go through full CI/CD**. A full run with integration tests would likely have caught the Haiku 4.5 regression. |
| 94 | + |
| 95 | +--- |
| 96 | + |
| 97 | +## Immediate Fix Needed |
| 98 | + |
| 99 | +### Option 1: Explicit Exclusion (Fastest) |
| 100 | +Add explicit checks to exclude Haiku 4.5 before broad Claude 4.x matching: |
| 101 | + |
| 102 | +```python |
| 103 | +def _supports_extended_thinking_on_bedrock(model: str) -> bool: |
| 104 | +    # Explicitly exclude Haiku 4.5 |
| 105 | +    if "haiku-4-5" in model or "haiku_4_5" in model: |
| 106 | +        return False |
| 107 | + |
| 108 | +    # Then check for Opus/Sonnet 4.x |
| 109 | +    if "claude-opus-4" in model or "claude-sonnet-4" in model: |
| 110 | +        return True |
| 111 | + |
| 112 | +    return False |
| 113 | +``` |
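
A table of model IDs with expected results is one way to guard Option 1 against regressions. The sketch below restates the function under a different, hypothetical name and normalizes separators first (an addition, so that `haiku-4.5` and `haiku_4_5` are treated alike). The model IDs are illustrative assumptions.

```python
def supports_extended_thinking_sketch(model: str) -> bool:
    """Hypothetical variant of Option 1 with separator normalization added."""
    # Treat "_", "." and "-" as the same separator before matching.
    normalized = model.lower().replace("_", "-").replace(".", "-")

    # Explicitly exclude Haiku 4.5 before any broad Claude 4.x match.
    if "haiku-4-5" in normalized:
        return False

    return "claude-opus-4" in normalized or "claude-sonnet-4" in normalized


# (model ID, expected result); IDs are assumptions for illustration.
EXPECTED = [
    ("anthropic.claude-haiku-4-5-20251001-v1:0", False),
    ("bedrock/us.anthropic.claude-haiku-4.5", False),
    ("anthropic.claude-opus-4-7", True),
    ("anthropic.claude-sonnet-4-5-20250929-v1:0", True),
    ("amazon.nova-pro-v1:0", False),
]

for model_id, expected in EXPECTED:
    assert supports_extended_thinking_sketch(model_id) is expected, model_id
```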
| 114 | + |
| 115 | +### Option 2: Narrow Pattern Matching (Better) |
| 116 | +Replace broad patterns with specific version checks: |
| 117 | + |
| 118 | +```python |
| 119 | +# Instead of: if "claude-haiku-4" in model |
| 120 | +# Use: |
| 121 | +if any(p in model for p in ("haiku-4-6", "haiku-4-7", "haiku-4-8")): |
| 122 | +    pass  # Only the listed (placeholder) versions match, not 4.5 |
| 123 | +``` |
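
A stricter form of Option 2 is to parse the family and version out of the model ID and compare against an explicit allow-list. The following is a minimal sketch under stated assumptions: the regex, the helper names, and the allow-list entries are all hypothetical placeholders, not LiteLLM code.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: family name, single-digit major, 1-2 digit minor.
# The lookahead keeps a date suffix such as "20250514" from being read as a minor version.
_MODEL_VERSION_RE = re.compile(
    r"claude-(opus|sonnet|haiku)[-_](\d)[-_.](\d{1,2})(?!\d)"
)

# Placeholder allow-list; real entries would come from provider documentation.
THINKING_ALLOWLIST = {("opus", 4, 6), ("opus", 4, 7)}


def parse_family_version(model: str) -> Optional[Tuple[str, int, int]]:
    """Return (family, major, minor), or None if the ID does not match."""
    match = _MODEL_VERSION_RE.search(model.lower())
    if match is None:
        return None
    family, major, minor = match.groups()
    return family, int(major), int(minor)


def is_allowlisted_for_thinking(model: str) -> bool:
    """Unknown or unparseable IDs default to False."""
    return parse_family_version(model) in THINKING_ALLOWLIST
```

Defaulting to `False` for anything the parser does not recognize means a newly launched model gets no thinking metadata until someone adds it to the list, which fails safe rather than open.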
| 124 | + |
| 125 | +### Option 3: Revert & Rework (Safest) |
| 126 | +1. Revert PRs #25867 and #25876 |
| 127 | +2. Fix the root cause (provider name mismatch in `supports_reasoning()`) |
| 128 | +3. Re-implement Opus 4.7 support using JSON capability flags |
| 129 | +4. Run full CI/CD |
| 130 | + |
| 131 | +--- |
| 132 | + |
| 133 | +## Long-term Fix |
| 134 | + |
| 135 | +This is the **3rd time** this pattern has caused issues, so an architectural fix is needed: |
| 136 | + |
| 137 | +1. **Fix `supports_reasoning()` for Bedrock** |
| 138 | + - Normalize `"bedrock"` vs `"bedrock_converse"` provider names |
| 139 | + - Make `model_prices_and_context_window.json` work correctly |
| 140 | + |
| 141 | +2. **Remove all hardcoded model checks** |
| 142 | + - Delete: `_is_claude_4_6_model()`, `_is_opus_4_7_model()`, etc. |
| 143 | + - Replace with: `supports_reasoning(model, provider)` JSON lookups |
| 144 | + |
| 145 | +3. **Add proper JSON flags** |
| 146 | + - `supports_extended_thinking` |
| 147 | + - `supports_adaptive_thinking` |
| 148 | + - `supports_xhigh_reasoning_effort` |
| 149 | + |
| 150 | +This is already documented in **Discussion #22555** - just needs implementation. |
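
As a rough illustration of the three steps above, the sketch below looks up capability flags from a table after normalizing the provider name. Everything here is hypothetical: the table stands in for `model_prices_and_context_window.json`, the entries and values are placeholders that follow this document's claims, and `supports_flag` is not LiteLLM's actual `supports_reasoning()` API.

```python
from typing import Dict

# Hypothetical capability table; keys and values are placeholders.
CAPABILITIES: Dict[str, Dict[str, bool]] = {
    "bedrock/anthropic.claude-opus-4-7": {
        "supports_extended_thinking": True,
        "supports_adaptive_thinking": True,
    },
    "bedrock/anthropic.claude-haiku-4-5": {
        "supports_extended_thinking": False,
    },
}

# Map provider aliases onto one canonical name before the lookup.
PROVIDER_ALIASES = {"bedrock_converse": "bedrock"}


def supports_flag(model: str, provider: str, flag: str) -> bool:
    """Return the flag for (provider, model); missing entries default to False."""
    canonical_provider = PROVIDER_ALIASES.get(provider, provider)
    entry = CAPABILITIES.get(f"{canonical_provider}/{model}", {})
    return entry.get(flag, False)
```

With a lookup like this, adding a new model becomes a data change in one file, and a model with no entry receives no thinking metadata by default.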
| 151 | + |
| 152 | +--- |
| 153 | + |
| 154 | +## Action Items |
| 155 | + |
| 156 | +- [ ] **@Yuneng** - Pull logs from affected customers to confirm "unsupported thinking metadata" errors |
| 157 | +- [ ] **@Sameer** - Test Haiku 4.5 on Bedrock with current main branch |
| 158 | + - Does it send thinking params? |
| 159 | + - Does Bedrock reject them? |
| 160 | +- [ ] **@Ishaan** - Review `is_claude_4_5_on_bedrock()` call sites |
| 161 | + - What behavior is gated by this function? |
| 162 | + - Did Opus 4.7 changes affect Haiku 4.5 code paths? |
| 163 | +- [ ] **@Krrish** - Implement hotfix (Option 1 or 2 above) |
| 164 | +- [ ] **Team** - Schedule architectural fix (Discussion #22555) |
| 165 | +- [ ] **Team** - Require full CI/CD for all model launches (no exceptions) |
| 166 | + |
| 167 | +--- |
| 168 | + |
| 169 | +## Files to Check |
| 170 | + |
| 171 | +Priority order for investigation: |
| 172 | + |
| 173 | +1. `litellm/llms/bedrock/common_utils.py` |
| 174 | + - Lines 565-597: `is_claude_4_5_on_bedrock()` |
| 175 | + |
| 176 | +2. `litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py` |
| 177 | + - `_supports_extended_thinking_on_bedrock()` |
| 178 | + |
| 179 | +3. `litellm/llms/anthropic/chat/transformation.py` |
| 180 | + - Lines 172-210, 237-248: Thinking/reasoning logic |
| 181 | + |
| 182 | +4. `litellm/llms/bedrock/chat/converse_transformation.py` |
| 183 | + - Lines 565-580: `get_supported_openai_params()` |
| 184 | + |
| 185 | +--- |
| 186 | + |
| 187 | +## Test Commands |
| 188 | + |
| 189 | +```bash |
| 190 | +# See the exact changes |
| 191 | +git diff v1.82.3-stable.patch.4..v1.82.3-stable.opus-4.7 -- \ |
| 192 | + litellm/llms/bedrock/common_utils.py \ |
| 193 | + litellm/llms/anthropic/chat/transformation.py |
| 194 | + |
| 195 | +# Test Haiku 4.5 with thinking (should fail) |
| 196 | +python test_haiku_bedrock_thinking.py # Create this test |
| 197 | + |
| 198 | +# Run full test suite |
| 199 | +make test-unit |
| 200 | +``` |
| 201 | + |
| 202 | +--- |
| 203 | + |
| 204 | +## Timeline |
| 205 | + |
| 206 | +- **April 16, 2026 5:19pm** - PR #25867 merged into main |
| 207 | +- **April 16, 2026 7:19pm** - PR #25876 (hotfix) merged |
| 208 | +- **April 17, 2026 7:32pm** - User reports cost spike on Slack |
| 209 | +- **April 17, 2026 7:47pm** - Ishaan asks Cursor to investigate |
| 210 | + |
| 211 | +--- |
| 212 | + |
| 213 | +**Bottom line:** Opus 4.7 launch used hardcoded patterns that broke Haiku 4.5. Need immediate hotfix + long-term architectural fix. |