中文 | English
How does Claude Code control context budget in long conversations while protecting prompt-cache hit rate?
python examples/l9_context_mgmt.pyutils/context.tsconstants/prompts.tsutils/api.tsmemdir/memdir.ts
SYSTEM_PROMPT_DYNAMIC_BOUNDARYcontext_1m_betaMEMORY.mdENTRYPOINT_NAME
- why the static/dynamic boundary matters
- how snapshotting and sticky latches protect cache stability
- what auto-compact solves versus what memory indexing solves
The demo only illustrates token budgeting and compaction. The real source also handles model window capability, cache boundaries, system-prompt assembly, and memory injection rules.
- Why does prompt-cache hit rate end up shaping architecture?
- Why must
MEMORY.mdstay short and index-like? - Auto-compact solves “history is too long” — what problem does memory solve?