Commit 26ca0dc
feat(server): prefix-aware inline prefix-cache eviction (#452)
The inline prefix cache evicted by pure LRU, which can drop a short,
broadly-shared ancestor prefix (e.g. system prompt plus first turn, reused by
many later branches) while keeping a long, conversation-specific leaf snapshot
that nothing else reuses. The next branch that needs the shared ancestor then
re-prefills it.
Make eviction prefix-aware: prefer evicting the oldest leaf (an entry whose
tokens are not a strict prefix of any other live entry) so shared ancestors stay
resident. Falls back to plain LRU when no ancestor structure exists.
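
A minimal Python sketch of the leaf-preferring selection described above. It is illustrative only: the commit names the function `select_inline_evict_victim` but its real signature and language are not shown here, so the argument shape (a list of token sequences ordered oldest-first) and the `is_strict_prefix` helper are assumptions.

```python
from typing import Optional, Sequence

Tokens = Sequence[int]


def is_strict_prefix(a: Tokens, b: Tokens) -> bool:
    """True when `a` is shorter than `b` and `b` starts with `a` (hypothetical helper)."""
    return len(a) < len(b) and list(b[: len(a)]) == list(a)


def select_inline_evict_victim(prefixes_lru: Sequence[Tokens]) -> Optional[int]:
    """Return the index of the entry to evict, or None for an empty cache.

    Assumption: `prefixes_lru` is ordered least-recently-used first, so the
    first leaf found is the oldest leaf.
    """
    if not prefixes_lru:
        return None
    for i, candidate in enumerate(prefixes_lru):
        # A leaf is not a strict prefix of any other live entry.
        is_ancestor = any(
            j != i and is_strict_prefix(candidate, other)
            for j, other in enumerate(prefixes_lru)
        )
        if not is_ancestor:
            return i
    # Defensive only: the longest entry can never be a strict prefix of
    # another, so the loop above always returns for a non-empty input.
    return 0
```

When no entry is a prefix of another, every entry is a leaf and the oldest one (index 0) is chosen, which is exactly the plain-LRU fallback; no separate code path is needed in this sketch.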
This is not strictly better than LRU. It keeps the frozen shallow ancestors plus
the current deepest entry, while LRU keeps the N most-recent entries. It wins
when later branches reuse an early shared root (the agentic system-prompt
pattern) and can lose when a branch reuses a recent but non-current prefix that
LRU would still hold. Linear conversations are unaffected: both keep the deepest
entry.
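
The trade-off can be checked with a toy simulation. This is a sketch under stated assumptions, not the server's code: the victim is chosen among resident entries before the new one is inserted, a cached entry is reusable only if it is a full prefix of the request, and the names `simulate`, `reusable_tokens`, `lru_victim`, and `oldest_leaf_victim` are hypothetical.

```python
from typing import Callable, List, Tuple

Prefix = Tuple[int, ...]


def lru_victim(entries: List[Prefix]) -> int:
    # Plain LRU: entries are ordered oldest-first, so evict index 0.
    return 0


def oldest_leaf_victim(entries: List[Prefix]) -> int:
    # Evict the oldest entry that is not a strict prefix of another entry.
    for i, cand in enumerate(entries):
        if not any(len(cand) < len(o) and o[: len(cand)] == cand for o in entries):
            return i
    return 0


def simulate(policy: Callable[[List[Prefix]], int], capacity: int,
             inserts: List[Prefix]) -> List[Prefix]:
    entries: List[Prefix] = []
    for p in inserts:
        if p in entries:
            entries.remove(p)            # refresh recency on re-insert
        elif len(entries) >= capacity:
            entries.pop(policy(entries))  # evict before inserting (assumption)
        entries.append(p)
    return entries


def reusable_tokens(entries: List[Prefix], request: Prefix) -> int:
    # Length of the longest cached entry that is a full prefix of the request.
    return max((len(e) for e in entries if request[: len(e)] == e), default=0)


# Shared root (system prompt plus first turn), then one conversation growing.
workload = [(1, 2), (1, 2, 3), (1, 2, 3, 4), (1, 2, 3, 4, 5)]
lru = simulate(lru_victim, 2, workload)            # [(1, 2, 3, 4), (1, 2, 3, 4, 5)]
aware = simulate(oldest_leaf_victim, 2, workload)  # [(1, 2), (1, 2, 3, 4, 5)]

# A new branch off the shared root: the prefix-aware cache wins.
print(reusable_tokens(lru, (1, 2, 9)), reusable_tokens(aware, (1, 2, 9)))  # 0 2
# A branch off a recent, non-current prefix: plain LRU wins.
print(reusable_tokens(lru, (1, 2, 3, 4, 7)), reusable_tokens(aware, (1, 2, 3, 4, 7)))  # 4 2
```

With a capacity of two, both policies end up holding the deepest entry `(1, 2, 3, 4, 5)`; they differ only in whether the second slot holds the shared root or the most recent intermediate snapshot.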
The policy is a pure free function select_inline_evict_victim over the cached
prefixes in LRU order; inline entries now carry their prefix tokens so the leaf
test can run. Contained to prefix_cache; no backend or request-protocol change.
Adds model-free unit tests for the policy.

1 parent cd8b065, commit 26ca0dc
3 files changed
Lines changed: 105 additions & 8 deletions
Diff line contents and file names were not preserved; only the per-file line counts and hunk positions survive:

| File (name not preserved) | Additions | Deletions | Added lines (new numbering) |
|---|---|---|---|
| 1 | 51 | 6 | 143-174, 206, 208, 270-277, 279-285, 325-326 |
| 2 | 15 | 2 | 48-59, 154-156 |
| 3 | 39 | 0 | 1245-1278, 4091-4095 |