Commit f3a2722
🤖 fix: avoid extra Anthropic cache breakpoints with explicit TTL (#3112)
Summary
This PR fixes a direct Anthropic regression where explicitly setting
`anthropic.cacheTtl` caused Mux to emit one extra cache-control
breakpoint, pushing tool-enabled requests over Anthropic's
four-breakpoint limit.
Background
Mux already applies Anthropic prompt caching through manual cache
markers on the cached system prompt, conversation tail, and last tool.
When `buildProviderOptions()` also emitted top-level
`anthropic.cacheControl`, the Anthropic SDK serialized an additional
top-level `cache_control` block on direct requests. That produced the
user-visible failure: `A maximum of 4 blocks with cache_control may be
provided. Found 5.`
Implementation
The fix stops emitting top-level Anthropic `cacheControl` from
`buildProviderOptions()` while preserving the existing manual
cache-marker flow. To guard against future regressions, the PR also adds
a helper that counts Anthropic cache breakpoints in shaped request
payloads and tests that pin the intended breakpoint budget. A targeted
StreamManager regression test verifies that explicit `1h` TTL values
still propagate through the manual cache path even without the top-level
provider option.
Validation
- `bun test src/common/utils/ai/providerOptions.test.ts src/node/services/providerModelFactory.test.ts src/common/utils/ai/cacheStrategy.test.ts src/node/services/streamManager.test.ts`
- `nix shell nixpkgs#hadolint -c make static-check`
- Dogfooded in an isolated `make dev-server-sandbox` instance using
  env-backed direct Anthropic credentials:
  - selected Anthropic in onboarding
  - set prompt cache TTL to `1 hour`
  - added the current repo as the first project
  - opened an Exec workspace and sent a tool-using request
  - verified the request completed successfully without the previous
    `Found 5` Anthropic error
  - verified the UI showed prompt-cache read/create stats for the
    successful request
Risks
The main regression risk is Anthropic request shaping across direct and
routed paths. This change is intentionally narrow: it removes the
redundant top-level direct-provider cache marker while keeping the
existing manual cache markers intact, and adds tests at both the
provider-options layer and the final shaped-request layer.
Pains
`make static-check` requires `hadolint`, which was not installed in the
workspace environment. I ran it through `nix shell nixpkgs#hadolint -c
make static-check` so the full required local validation still passed.
---
<details>
<summary>📋 Implementation Plan</summary>
# Fix plan: direct Anthropic cache-marker duplication when explicit cache TTL is set
## Recommendation
**Recommended approach: keep Mux's existing 3 manual Anthropic cache
breakpoints, and stop emitting the extra top-level Anthropic
`cacheControl` field from `buildProviderOptions()`.**
- **Net product-code LoC estimate:** **+20 to +55**
- Why this is the best fit:
  - It removes the only repo-visible behavior change that happens **only
    when `anthropic.cacheTtl` is explicitly set**.
  - It preserves the current manual breakpoint strategy already documented
    in `src/common/utils/ai/cacheStrategy.ts`:
    1. cached system prompt
    2. cached conversation tail / last message
    3. cached last tool
  - It avoids a wider refactor across `messagePipeline.ts`,
    `streamManager.ts`, and `providerModelFactory.ts` unless follow-up
    cleanup is still desired after the regression is fixed.
<details>
<summary>Evidence supporting the root-cause diagnosis</summary>
- The user hit Anthropic's runtime error: **"A maximum of 4 blocks with
  cache_control may be provided. Found 5."** on a **direct Anthropic**
  request.
- The repo already applies **3 manual Anthropic cache breakpoints**
  across these files:
  - `src/common/utils/ai/cacheStrategy.ts`
    - `createCachedSystemMessage()`
    - `applyCacheControl()`
    - `applyCacheControlToTools()`
  - `src/node/services/messagePipeline.ts` applies `applyCacheControl()`
    after message transforms.
  - `src/node/services/streamManager.ts` prepends the cached system
    message and marks the last tool.
- `src/common/utils/ai/cacheStrategy.ts` explicitly documents
  Anthropic's **4-breakpoint limit** and says the intended design is to
  use **3 total**.
- `src/common/utils/ai/providerOptions.ts` is the one place that adds an
  **extra top-level** Anthropic `cacheControl` field, and it does so
  **only when `muxProviderOptions.anthropic.cacheTtl` is explicitly set**.
- `src/node/services/aiService.ts` already passes the explicit TTL
  separately into both:
  - `prepareMessagesForProvider(...)` (`anthropicCacheTtl` argument)
  - `streamManager.startStream(...)` (`anthropicCacheTtlOverride`
    argument)
- That means the explicit TTL already reaches the manual cache-marker
  path **without needing** top-level
  `providerOptions.anthropic.cacheControl`.
- So the most conservative repo-backed explanation is:
  - **unset TTL** -> manual 3-breakpoint path
  - **explicit TTL** -> same manual 3-breakpoint path **plus** an extra
    top-level Anthropic cache-control path
  - Anthropic rejects the resulting request once the effective marker
    count reaches 5.
</details>
## Alternate approach (not recommended for the first fix)
**Centralize all Anthropic cache injection in
`src/node/services/providerModelFactory.ts` and remove the higher-level
cache-marker transforms.**
- **Net product-code LoC estimate:** **-40 to -110**
- Upside: one source of truth for the wire payload.
- Downside: materially larger behavior change, touches more call sites,
and increases regression surface for system prompts, tools, retries, and
gateway routing.
- Recommendation: defer this unless the surgical fix fails to cover
another hidden duplication path.
## Implementation plan
### Phase 1: Remove the redundant top-level Anthropic cache-control path
**Files/symbols**
- `src/common/utils/ai/providerOptions.ts`
- `src/common/utils/ai/providerOptions.test.ts`
**Changes**
1. Update `buildProviderOptions()` so Anthropic models **do not emit**
   top-level `anthropic.cacheControl`, even when
   `muxProviderOptions.anthropic.cacheTtl` is set to `"5m"` or `"1h"`.
2. Keep the rest of the Anthropic provider options intact:
   - `thinking`
   - `effort`
   - `disableParallelToolUse`
   - `sendReasoning`
3. Add a short code comment documenting why the top-level field is
   intentionally omitted:
   - explicit Anthropic TTL is already threaded through Mux's manual
     cache-marker helpers
   - sending an extra top-level cache-control field can create duplicate
     cache breakpoints and violate Anthropic's 4-breakpoint limit
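
The Phase 1 change can be pictured with a minimal sketch. It is not the code in `providerOptions.ts`: the type names, the function name `buildAnthropicProviderOptionsSketch`, and the reduced option set are assumptions made for illustration. The only point it shows is that the TTL is accepted as input but never copied into a top-level `cacheControl` field.

```typescript
// Hypothetical sketch: these types and names are illustrative and are not
// copied from src/common/utils/ai/providerOptions.ts.
type CacheTtl = "5m" | "1h";

interface MuxAnthropicOptionsSketch {
  cacheTtl?: CacheTtl;
  disableParallelToolUse?: boolean;
  sendReasoning?: boolean;
}

// The output type deliberately has no `cacheControl` field.
interface AnthropicProviderOptionsSketch {
  disableParallelToolUse?: boolean;
  sendReasoning?: boolean;
}

function buildAnthropicProviderOptionsSketch(
  input: MuxAnthropicOptionsSketch,
): { anthropic: AnthropicProviderOptionsSketch } {
  const anthropic: AnthropicProviderOptionsSketch = {};

  if (input.disableParallelToolUse !== undefined) {
    anthropic.disableParallelToolUse = input.disableParallelToolUse;
  }
  if (input.sendReasoning !== undefined) {
    anthropic.sendReasoning = input.sendReasoning;
  }

  // `input.cacheTtl` is intentionally not forwarded here. Per the plan, the
  // TTL already reaches the manual cache-marker helpers, and a top-level
  // cache-control field could add a duplicate breakpoint.
  return { anthropic };
}
```

A test at this layer would then assert that the `cacheControl` key is absent for both `"5m"` and `"1h"` inputs, while the other options pass through unchanged.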
**Quality gate after Phase 1**
- Update `src/common/utils/ai/providerOptions.test.ts` to assert that
explicit Anthropic TTL **no longer appears** in top-level provider
options.
- Cover both:
- standard Anthropic models
- effort/adaptive-thinking Anthropic models (for example Opus 4.6 /
Sonnet 4.6 cases already exercised in this test file)
### Phase 2: Add a narrow regression guard at the wire-shaping layer
**Files/symbols**
- `src/node/services/providerModelFactory.ts`
- `src/node/services/providerModelFactory.test.ts`
**Changes**
1. Extract or add a small pure helper near
   `wrapFetchWithAnthropicCacheControl()` that can **count Anthropic cache
   breakpoints in the final request body**.
2. Count all cache-bearing locations relevant to this repo's current
   shaping strategy, including:
   - cached system blocks/messages
   - cached tools
   - cached last-message content parts
   - gateway-style `providerOptions.anthropic.cacheControl` message markers
     if present
3. Reuse that helper in tests, and optionally add a defensive runtime
   assertion or warning right before sending the mutated request body.
   - Goal: fail loudly in development/tests if a future change pushes the
     request above Anthropic's limit again.
   - Keep the runtime behavior minimal; do not expand this into a broad
     fallback/rewrite mechanism in the first fix.
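
As a rough illustration of the helper described in these steps, the sketch below counts markers in the locations listed above and throws when the count exceeds the limit. The request-body type, the function names, and the choice to throw rather than warn are all assumptions; the helper actually added in `providerModelFactory.ts` may be shaped differently.

```typescript
// Hypothetical sketch: type and function names are illustrative, not the repo's.
type CacheControl = { type: "ephemeral"; ttl?: "5m" | "1h" };

interface CacheableBlock {
  cache_control?: CacheControl;
  // Gateway-style marker, as mentioned in the plan.
  providerOptions?: { anthropic?: { cacheControl?: CacheControl } };
  [key: string]: unknown;
}

interface RequestBodySketch {
  cache_control?: CacheControl; // top-level marker
  system?: string | CacheableBlock[];
  tools?: CacheableBlock[];
  messages?: Array<
    CacheableBlock & { role: string; content: string | CacheableBlock[] }
  >;
}

const ANTHROPIC_MAX_CACHE_BREAKPOINTS = 4;

function isMarked(block: CacheableBlock): boolean {
  return (
    block.cache_control !== undefined ||
    block.providerOptions?.anthropic?.cacheControl !== undefined
  );
}

function countMarked(blocks: string | CacheableBlock[] | undefined): number {
  // Plain-string system prompts or message contents cannot carry a marker.
  return Array.isArray(blocks) ? blocks.filter(isMarked).length : 0;
}

function countCacheBreakpoints(body: RequestBodySketch): number {
  let count = body.cache_control !== undefined ? 1 : 0;
  count += countMarked(body.system) + countMarked(body.tools);
  for (const message of body.messages ?? []) {
    if (isMarked(message)) count += 1; // message-level marker
    count += countMarked(message.content); // content-part markers
  }
  return count;
}

// Development/test guard: throws when the shaped body is over budget.
function assertCacheBreakpointBudget(body: RequestBodySketch): number {
  const count = countCacheBreakpoints(body);
  if (count > ANTHROPIC_MAX_CACHE_BREAKPOINTS) {
    throw new Error(
      `Found ${count} cache breakpoints; limit is ${ANTHROPIC_MAX_CACHE_BREAKPOINTS}`,
    );
  }
  return count;
}
```

Keeping the counter pure makes it reusable from tests without a network seam. Whether the runtime path should throw, warn, or do nothing is left open by the plan, so the throwing variant here is only one option.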
**Quality gate after Phase 2**
- Add direct-provider regression coverage in
`src/node/services/providerModelFactory.test.ts` that builds a
representative Anthropic request shape with:
- cached system prompt
- cached last tool
- cached last message
- explicit `cacheTtl: "1h"`
- Assert that the final shaped request stays at **<= 4** breakpoints,
and preferably at the intended **3**.
### Phase 3: Verify TTL still propagates through the manual cache-marker path
**Files/symbols**
- `src/node/services/aiService.ts`
- `src/node/services/messagePipeline.ts`
- `src/node/services/streamManager.ts`
- `src/common/utils/ai/cacheStrategy.ts`
- existing tests in:
  - `src/common/utils/ai/cacheStrategy.test.ts`
  - `src/node/services/streamManager.test.ts` or
    `src/node/services/aiService.test.ts` (only if a small targeted
    regression test is needed)
**Changes**
1. Leave the existing manual cache-marker plumbing intact for the first
   fix.
2. Add or update one targeted regression test proving that explicit
   Anthropic TTL still reaches the manual cache path even after top-level
   `cacheControl` is removed.
   - Best case: reuse an existing unit seam rather than adding a new
     integration harness.
   - Only expand into `aiService` / `streamManager` tests if
     `providerOptions` + `providerModelFactory` tests are not enough to pin
     the behavior down.
3. Preserve the documented 3-breakpoint strategy in `cacheStrategy.ts`;
   do not refactor that layer yet.
**Quality gate after Phase 3**
- Confirm the test suite still proves:
- system prompt caching works
- last tool caching works
- last message caching works
- explicit TTL values (`"1h"`) are preserved on the manual path
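
To make the "TTL is preserved on the manual path" check concrete, here is a small sketch of a manual marker that carries the TTL on the last message. It assumes a message-level `providerOptions.anthropic.cacheControl` marker of the form `{ type: "ephemeral", ttl }`. The message type and the function name `markLastMessageForCaching` are hypothetical and do not reproduce `applyCacheControl()` from `cacheStrategy.ts`.

```typescript
// Hypothetical sketch: not the repo's applyCacheControl() implementation.
type CacheTtl = "5m" | "1h";
type CacheControl = { type: "ephemeral"; ttl?: CacheTtl };

interface MessageSketch {
  role: "system" | "user" | "assistant";
  content: string;
  providerOptions?: { anthropic?: { cacheControl?: CacheControl } };
}

// Returns a new array in which only the last message carries a cache marker.
// When `ttl` is provided it is attached to the marker; otherwise the marker
// is emitted without a `ttl` field.
function markLastMessageForCaching(
  messages: MessageSketch[],
  ttl?: CacheTtl,
): MessageSketch[] {
  if (messages.length === 0) return messages;

  const cacheControl: CacheControl = ttl
    ? { type: "ephemeral", ttl }
    : { type: "ephemeral" };
  const lastIndex = messages.length - 1;

  return messages.map((message, index) =>
    index === lastIndex
      ? {
          ...message,
          providerOptions: {
            ...message.providerOptions,
            anthropic: { ...message.providerOptions?.anthropic, cacheControl },
          },
        }
      : message,
  );
}
```

A regression test in this style would pass `"1h"` and assert that the marker on the last message still reports `ttl: "1h"`, independent of anything in the top-level provider options.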
## Acceptance criteria
- Direct Anthropic requests with explicit `anthropic.cacheTtl: "1h"` no
longer exceed Anthropic's 4-breakpoint limit.
- The final direct-provider request shape remains at the intended **3
manual cache breakpoints** unless a future Anthropic-specific feature
intentionally adds another.
- Explicit TTL still applies to the existing manual cache markers;
removing top-level `providerOptions.anthropic.cacheControl` must **not**
silently disable 1-hour prompt caching.
- Anthropic models without explicit TTL continue to use the existing
manual cache-marker strategy.
- The change does not introduce a regression for gateway-routed
Anthropic models.
## Validation plan
1. **Targeted unit tests**
   - `bun test src/common/utils/ai/providerOptions.test.ts`
   - `bun test src/node/services/providerModelFactory.test.ts`
   - `bun test src/common/utils/ai/cacheStrategy.test.ts`
2. **Focused service regression test**
   - run the smallest relevant additional test file only if Phase 3 adds
     coverage in `aiService.test.ts` or `streamManager.test.ts`
3. **Static validation**
   - `make typecheck`
   - `make lint` if the touched files introduce new lint exposure
4. **Optional integration check**
   - If Anthropic credentials are available in the environment, run a
     narrow Anthropic integration exercise after the unit tests pass.
   - Prefer a direct-provider reproduction with explicit `cacheTtl: "1h"`
     and at least one tool-enabled request.
## Dogfooding plan
**Goal:** reproduce the original failure mode on the app path the user
actually hit, then verify the fix with evidence a reviewer can inspect.
### Setup
- Configure the **direct Anthropic provider** (not mux-gateway).
- Use an Anthropic model that supports the affected prompt-caching path.
- Enable explicit `anthropic.cacheTtl: "1h"`.
- Use **Exec** mode or any other tool-enabled flow that exercises tool
definitions in the request.
### Repro / verification flow
1. Start the app in a local dev session.
2. Select the direct Anthropic provider and confirm `cacheTtl` is
`"1h"`.
3. Run a simple tool-eligible request in Exec mode.
4. Verify the request completes **without** the Anthropic API error
about **5 cache-control blocks**.
5. If a debug request snapshot or local debug logging is available,
verify the final outgoing Anthropic payload is at **<= 4** breakpoints.
### Evidence to capture
- **Screenshot 1:** provider/model settings showing direct Anthropic +
explicit `1h` TTL
- **Screenshot 2:** successful Exec/tool-enabled response where the old
error no longer appears
- **Screenshot 3 (if available):** debug snapshot or log evidence
showing the final cache-breakpoint count
- **Video recording:** a short end-to-end repro/verification run
covering provider selection, request submission, and successful
completion
### Suggested tooling for verification
- In exec mode, use the repo's normal desktop/dev workflow to reproduce
the conversation flow.
- If automation is helpful during implementation review, use the
desktop/browser automation tools available in exec mode to drive the app
and capture screenshots/video artifacts.
## Risks / non-goals
- **Non-goal for this fix:** full cache-system centralization across
`messagePipeline`, `streamManager`, and `providerModelFactory`.
- **Risk:** if another hidden Anthropic SDK path also materializes extra
  cache markers, removing top-level `cacheControl` may not be sufficient
  by itself.
  - Mitigation: add the wire-level breakpoint counter test in Phase 2 so
    the final payload shape is asserted directly.
- **Risk:** some tests may currently treat top-level
  `anthropic.cacheControl` as the source of truth for TTL propagation.
  - Mitigation: update those tests to assert the new invariant: TTL is
    carried by the manual cache-marker path, not by top-level provider
    options.
</details>
---
_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` •
Cost: `$12.62`_
<!-- mux-attribution: model=openai:gpt-5.4 thinking=xhigh costs=12.62
-->
1 parent 78f6c78 commit f3a2722
File tree
5 files changed (+261, -26 lines changed)
- src
  - common/utils/ai
  - node/services