Commit 7f9c905
PR-K1.G: memory usage tracking for K1.E NIAH validation
ADR 0008 §11.5 §'Five properties' item 1 — 'constant memory in
context length' — is a measurable claim, not a presumption. Until
this PR the K1.E harness emitted no memory data; the architectural
claim was rhetorical.
This PR adds memory measurement around each of the K1.E configs
(oracle / v0.3 / v0.4) and emits the per-config peak/current
allocation into the run's JSON evidence. Future analysis can plot
peak memory vs context length across the three configs and
empirically verify:
* v0.3 sustained memory ≈ constant (sink+window slab is small)
* v0.4 sustained memory ≈ constant (sink+window slab + transient
  proposer activations that get freed each step)
* oracle memory grows linearly with context (full KV cache)
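One way the later analysis could test "constant" versus "linear" is to fit a line through (context length, peak bytes) pairs and look at how much the fitted value grows across the measured range. The sketch below is illustrative only: the function names, the 5% tolerance, and the numbers are hypothetical placeholders, not values from any K1.E run.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def relative_growth(context_lengths: Sequence[float],
                    peak_bytes: Sequence[float]) -> float:
    """Fitted growth over the measured range, as a fraction of the smallest peak."""
    slope, _ = fit_line(context_lengths, peak_bytes)
    span = max(context_lengths) - min(context_lengths)
    return slope * span / min(peak_bytes)


def looks_constant(context_lengths: Sequence[float],
                   peak_bytes: Sequence[float],
                   tolerance: float = 0.05) -> bool:
    """True when fitted growth stays within `tolerance` (an arbitrary choice)."""
    return abs(relative_growth(context_lengths, peak_bytes)) <= tolerance


# Placeholder inputs for illustration only; these are NOT measurements.
CONTEXT_LENGTHS = [4096, 8192, 16384]
PLACEHOLDER_PEAKS = {
    "oracle_full_attention": [4.0e9, 6.0e9, 10.0e9],
    "v03_sink_window": [2.1e9, 2.1e9, 2.1e9],
}

for name, peaks in PLACEHOLDER_PEAKS.items():
    verdict = "constant" if looks_constant(CONTEXT_LENGTHS, peaks) else "growing"
    print(f"{name}: {verdict}")
```

In practice the pairs would come from the 'memory' block of several runs at different context lengths, and the tolerance would need to be chosen against observed run-to-run noise.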
Files changed:
inference_engine/v04/niah_eval.py (+172 lines)
  Three new public helpers:
    reset_memory_peak(device): resets the high-water mark on the
      device. CUDA: torch.cuda.reset_peak_memory_stats. MPS / CPU:
      no-op (no peak counter exposed).
    record_memory(device): captures a snapshot dict suitable for
      JSON serialisation. Per-device shape:
        cuda: {device_kind, device_name, device_total_bytes,
               current_allocated_bytes, current_reserved_bytes,
               peak_allocated_bytes, peak_reserved_bytes}
        mps:  {device_kind, current_allocated_bytes,
               driver_allocated_bytes, peak_allocated_bytes=None}
        cpu:  {device_kind, current_allocated_bytes=psutil RSS or
               None if psutil missing}
      All byte-count fields are int or None, so the snapshot is
      JSON-serialisable. The CUDA path synchronises before reading.
      psutil is an optional CPU dep; its absence does not raise.
    format_memory_summary(snapshot): one-line human-readable string
      for stderr printing. Per-device formatting includes peak +
      current GB plus a percent-of-total-VRAM indicator on CUDA.
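As a rough illustration of the helper contract described above, the sketch below mirrors the documented field names for each device kind. It is not the code in niah_eval.py: it takes a device-kind string rather than a torch device (an assumption made so the CPU path runs without torch installed), and it omits format_memory_summary.

```python
import json
import os
from typing import Any, Dict

try:
    import psutil  # optional dependency; only the CPU branch uses it
except ImportError:
    psutil = None


def reset_memory_peak(device_kind: str) -> None:
    """Reset the peak counter where one exists (CUDA); otherwise do nothing."""
    if device_kind == "cuda":
        import torch  # imported lazily so the CPU path works without torch
        torch.cuda.reset_peak_memory_stats()


def record_memory(device_kind: str) -> Dict[str, Any]:
    """Return a JSON-serialisable snapshot for the given device kind."""
    if device_kind == "cuda":
        import torch
        torch.cuda.synchronize()  # let pending kernels finish before reading
        props = torch.cuda.get_device_properties(torch.cuda.current_device())
        return {
            "device_kind": "cuda",
            "device_name": props.name,
            "device_total_bytes": int(props.total_memory),
            "current_allocated_bytes": int(torch.cuda.memory_allocated()),
            "current_reserved_bytes": int(torch.cuda.memory_reserved()),
            "peak_allocated_bytes": int(torch.cuda.max_memory_allocated()),
            "peak_reserved_bytes": int(torch.cuda.max_memory_reserved()),
        }
    if device_kind == "mps":
        import torch
        return {
            "device_kind": "mps",
            "current_allocated_bytes": int(torch.mps.current_allocated_memory()),
            "driver_allocated_bytes": int(torch.mps.driver_allocated_memory()),
            "peak_allocated_bytes": None,  # no peak counter exposed on MPS
        }
    # CPU: resident set size via psutil when available, otherwise None.
    rss = None
    if psutil is not None:
        rss = int(psutil.Process(os.getpid()).memory_info().rss)
    return {"device_kind": "cpu", "current_allocated_bytes": rss}


# Usage on the CPU path: the snapshot round-trips through JSON.
reset_memory_peak("cpu")
print(json.dumps(record_memory("cpu")))
```

Importing torch inside the GPU branches keeps the CPU path free of a hard torch dependency, which is a choice of this sketch rather than something the commit states.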
scripts/research/k1e_niah_validation.py (+55 lines)
  * After model+dataset load, a baseline_memory snapshot records
    the minimum sustained working set. Per-config peak is
    relative to the same baseline, so cross-config delta
    comparisons are direct.
  * Each config (oracle, v03, v04) runs:
      reset_memory_peak(device)
      result = evaluate(name, samples, decode_fn)
      memory = record_memory(device)
    and the memory snapshot is added to the JSON report.
  * Per-config stderr line now includes peak memory:
      oracle_full_attention recall=1.000 mean_latency=69.06s peak_mem=12.34GB
      v03_sink_window recall=0.000 mean_latency=67.54s peak_mem=2.10GB
      v04_dlm_restored recall=1.000 mean_latency=93.37s peak_mem=2.45GB
  * JSON report schema bumped 1 -> 2:
      new top-level 'memory' block:
        { 'baseline': <snapshot>,
          'per_config': { name: <snapshot>, ... } }
      Consumers reading a v1 report must default this block to {}.
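The per-config sequence and the schema bump described above can be sketched together as follows. This is a minimal illustration, not the script's code: the helpers are passed in as callables, and the 'schema_version' and 'results' keys are hypothetical names chosen for the example; only the 'memory' block layout comes from this commit message.

```python
import json
from typing import Any, Callable, Dict

SCHEMA_VERSION = 2  # 1 -> 2 adds the top-level 'memory' block


def run_and_report(
    configs: Dict[str, Callable[..., Any]],
    samples: Any,
    device: Any,
    evaluate: Callable[[str, Any, Callable[..., Any]], Any],
    reset_memory_peak: Callable[[Any], None],
    record_memory: Callable[[Any], Dict[str, Any]],
) -> Dict[str, Any]:
    """Run each config with a fresh peak counter and collect memory snapshots."""
    # Baseline is taken once, after model and dataset are already loaded.
    baseline = record_memory(device)
    results: Dict[str, Any] = {}
    per_config: Dict[str, Any] = {}
    for name, decode_fn in configs.items():
        reset_memory_peak(device)  # so the peak belongs to this config only
        results[name] = evaluate(name, samples, decode_fn)
        per_config[name] = record_memory(device)
    return {
        "schema_version": SCHEMA_VERSION,  # hypothetical key name
        "results": results,                # hypothetical key name
        "memory": {"baseline": baseline, "per_config": per_config},
    }


def read_memory_block(report: Dict[str, Any]) -> Dict[str, Any]:
    """Return the 'memory' block, defaulting to {} for reports without one."""
    return report.get("memory", {})


def dump_report(report: Dict[str, Any]) -> str:
    """Serialise the report; snapshots hold only JSON-friendly values."""
    return json.dumps(report, indent=2)
```

Resetting the peak immediately before evaluate is what makes each snapshot attributable to a single config; without it, a later config would inherit the high-water mark of an earlier one.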
tests/inference_engine/v04/test_niah_eval.py (+131 lines, 13 cases)
  * TestRecordMemoryCPU: CPU branch returns expected dict shape;
    peak fields are None on CPU; psutil-present and -absent both
    handled gracefully (no raises); snapshot is JSON serialisable.
  * TestResetMemoryPeak: CPU reset is no-op (no raise).
  * TestFormatMemorySummary: CUDA / MPS / CPU formatting; missing
    total / current / RSS handled with sensible fallbacks; output
    is single-line.
After this PR: tests/inference_engine/v04/ has 176 cases (163 K1.A-F
+ 13 K1.G), all <0.20 s on Linux CI.
Out of scope:
  * No K1.D smoke change (smoke runs 256 tokens, memory not the
    bottleneck there).
  * No reviewer-script change: they invoke the harness, which now
    emits memory itself.
  * No ADR 0008 update (deferred to §11.12 postscript when
    long-context evidence with memory data arrives).
Stacking notes:
  Logical base: PR #77 (K1.F SDPA fix). After #71 -> #72 -> #73 ->
  #74 -> #75 -> #77 land on main, this PR's diff shrinks to just
  the four modified files.
Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
1 parent 84d0636 commit 7f9c905
4 files changed
Lines changed: 362 additions & 3 deletions