Commit e1be645
feat(opt-pa01): PagedAttention — 4-10x memory efficiency, CoW block sharing, 14 tests
paged_attention.zig (947 lines): vLLM-style block-based KV cache manager
- BlockPool: pre-allocated block pool with LIFO free stack
- BlockTable: fixed-size per-sequence page mapping (no ArrayList)
- Copy-on-Write: ref-counted block sharing for beam search fork
- PagedKVCacheManager: multi-sequence lifecycle (create/append/fork/remove)
- Full attention: Q@K^T dot product, softmax, weighted V sum
- Memory analysis: 4x paged savings, 64x with ternary compression
- 14 tests: config, block lifecycle, CoW, fork, attention, exhaustion
- Zig 0.15.2 compatible (zero std.ArrayList usage)
- build.zig: test-paged-attention step wired
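
For readers unfamiliar with the design, the LIFO free stack and ref-counted copy-on-write sharing listed above can be sketched roughly as follows. This is a minimal illustration, not the code in paged_attention.zig: `NUM_BLOCKS`, the field names, and the `retain`/`release`/`makeWritable` helpers are all hypothetical, and the K/V payload copy is left as a comment.

```zig
const std = @import("std");

// Hypothetical pool size for illustration; the commit does not state its real configuration.
const NUM_BLOCKS: usize = 4;

const BlockPool = struct {
    free_stack: [NUM_BLOCKS]u32, // fixed-size stack of free block ids (no ArrayList)
    free_count: usize, // number of valid entries in free_stack
    ref_counts: [NUM_BLOCKS]u32, // how many sequences reference each block

    fn init() BlockPool {
        var pool = BlockPool{
            .free_stack = undefined,
            .free_count = NUM_BLOCKS,
            .ref_counts = [_]u32{0} ** NUM_BLOCKS,
        };
        // Push ids in reverse order so that block 0 is popped first.
        for (&pool.free_stack, 0..) |*slot, i| {
            slot.* = @intCast(NUM_BLOCKS - 1 - i);
        }
        return pool;
    }

    /// Pop a free block (LIFO). Returns null when the pool is exhausted.
    fn alloc(self: *BlockPool) ?u32 {
        if (self.free_count == 0) return null;
        self.free_count -= 1;
        const id = self.free_stack[self.free_count];
        self.ref_counts[id] = 1;
        return id;
    }

    /// Share an existing block (for example on a beam-search fork).
    fn retain(self: *BlockPool, id: u32) void {
        self.ref_counts[id] += 1;
    }

    /// Drop one reference; the block goes back on the free stack at zero.
    fn release(self: *BlockPool, id: u32) void {
        self.ref_counts[id] -= 1;
        if (self.ref_counts[id] == 0) {
            self.free_stack[self.free_count] = id;
            self.free_count += 1;
        }
    }

    /// Copy-on-write: return a block id the caller may safely write to.
    fn makeWritable(self: *BlockPool, id: u32) ?u32 {
        if (self.ref_counts[id] == 1) return id; // sole owner, write in place
        const fresh = self.alloc() orelse return null;
        // A real manager would copy the K/V payload from `id` to `fresh` here.
        self.release(id);
        return fresh;
    }
};

test "LIFO reuse and copy-on-write" {
    var pool = BlockPool.init();
    const a = pool.alloc().?;
    pool.retain(a); // fork: two sequences now share block a
    const b = pool.makeWritable(a).?; // the writer gets a private block
    try std.testing.expect(b != a);
    try std.testing.expectEqual(@as(u32, 1), pool.ref_counts[a]);
    pool.release(b);
    // LIFO: the most recently freed block is handed out next.
    try std.testing.expectEqual(b, pool.alloc().?);
}
```

Saved to a file, the sketch can be run with `zig test`. Keeping the free list as a fixed array plus a counter avoids any dependency on `std.ArrayList`, which matches the constraint the commit message mentions.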
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
1 parent fb6eb05 commit e1be645
3 files changed
Lines changed: 1030 additions & 6 deletions