Cache Strategy

KV-cache hit rates depend on prompt prefix stability. Claude Code handles ordering automatically; this doc explains why, for anyone writing custom prompts or skill content.

Prefix Ordering

For maximum prefix cache reuse:

Stable elements first — system prompt, tool definitions, AGENTS.md rules
Semi-stable next — skill content, project context, loaded rules
Dynamic elements last — user messages, tool outputs, timestamps

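A prefix cache only matches up to the first point where two prompts differ. Placing dynamic content (timestamps, user-specific data) early in the context therefore invalidates the cache for everything after it, even if the rest of the prompt is unchanged.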
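The ordering above can be sketched in a few lines. This is only an illustration, not how Claude Code assembles its context: `build_prompt` and `shared_prefix_len` are hypothetical helpers, and the character-level comparison is a rough stand-in for the token-level matching a real prefix cache performs.

```python
from datetime import datetime, timezone


def build_prompt(system_prompt, tool_defs, rules, skills, project_context,
                 user_message, now=None):
    """Assemble a prompt with stable parts first and dynamic parts last."""
    now = now or datetime.now(timezone.utc)
    stable = [system_prompt, tool_defs, rules]       # rarely change
    semi_stable = [skills, project_context]          # change per project or task
    dynamic = [user_message, f"Current time: {now.isoformat()}"]  # change every turn
    return "\n\n".join(stable + semi_stable + dynamic)


def shared_prefix_len(a, b):
    """Count leading characters two prompts share (a proxy for cache reuse)."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n
```

Two prompts built this way that differ only in the user message share everything up to that message, so the stable and semi-stable sections stay reusable. Moving the timestamp to the front would cut the shared prefix to a few characters.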

TTL

Default: 5 minutes.
ENABLE_PROMPT_CACHING_1H=1 → 1 hour. Available on API key, Bedrock, Vertex, Foundry. Set in settings.json.
FORCE_PROMPT_CACHING_5M=1 → override back to 5 minutes.
DISABLE_PROMPT_CACHING=1 → disable entirely (warns at startup).
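To make the relationship between the three flags concrete, here is a minimal sketch of how they could resolve to an effective TTL. The function `resolve_cache_ttl` is hypothetical, and the precedence shown (disable wins, then the 5-minute override, then the 1-hour opt-in) is an assumption inferred from the descriptions above, not a documented rule.

```python
import os

DEFAULT_TTL_S = 5 * 60    # default: 5 minutes
EXTENDED_TTL_S = 60 * 60  # with ENABLE_PROMPT_CACHING_1H=1: 1 hour


def _flag(env, name):
    """Treat a variable as set only when its value is exactly "1"."""
    return env.get(name) == "1"


def resolve_cache_ttl(env=None):
    """Return the effective cache TTL in seconds, or None if caching is off.

    Assumed precedence (illustrative only): DISABLE > FORCE_5M > ENABLE_1H.
    """
    env = os.environ if env is None else env
    if _flag(env, "DISABLE_PROMPT_CACHING"):
        return None
    if _flag(env, "FORCE_PROMPT_CACHING_5M"):
        return DEFAULT_TTL_S
    if _flag(env, "ENABLE_PROMPT_CACHING_1H"):
        return EXTENDED_TTL_S
    return DEFAULT_TTL_S
```

For example, `resolve_cache_ttl({"ENABLE_PROMPT_CACHING_1H": "1"})` returns 3600, while adding `FORCE_PROMPT_CACHING_5M=1` to the same mapping brings it back to 300.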

Session Wake-Up Budget

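When a background agent sleeps, the sleep length decides whether the cache is still warm when it wakes (the figures below assume the default 5-minute TTL):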

< 5 min (60-270s) — cache stays warm. Use for active polling.
5 min-1 hour (300s-3600s) — pay the cache miss. Use when there's no point checking sooner.
Avoid 300s exactly — worst of both: cache miss without amortizing it. Pick 270s (warm) or 1200s+ (long wait).

Default for idle ticks with no specific signal: 1200-1800s (20-30 min).
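The budget above can be expressed as a small helper. This is a sketch under stated assumptions: `plan_sleep` is a hypothetical function, the 30-second safety margin is derived from the 270s figure above, and the 60s floor and 3600s ceiling simply mirror the ranges listed in this section.

```python
CACHE_TTL_S = 300      # default 5-minute TTL
WARM_MARGIN_S = 30     # assumed margin, giving the 270s warm ceiling
MIN_SLEEP_S = 60       # lower bound of the warm range
MAX_SLEEP_S = 3600     # upper bound of the long-wait range
IDLE_DEFAULT_S = 1200  # idle tick with no specific signal (20 min)


def plan_sleep(requested_s=IDLE_DEFAULT_S):
    """Return (sleep_seconds, cache_expected_warm) for a requested delay."""
    warm_max = CACHE_TTL_S - WARM_MARGIN_S  # 270
    if requested_s <= warm_max:
        # Short waits stay inside the TTL, so the cache is still warm on wake.
        return max(MIN_SLEEP_S, requested_s), True
    if requested_s <= CACHE_TTL_S:
        # At or just under the TTL is the worst of both; snap down to stay warm.
        return warm_max, True
    # Past the TTL the miss is paid regardless, so keep the long wait (capped).
    return min(requested_s, MAX_SLEEP_S), False
```

A request for exactly 300s is snapped to 270s, and calling `plan_sleep()` with no argument yields the 1200s idle default with a cold cache on wake.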
