Commit b571fe0
Honest limits: document divergence at ~120 tokens, V remains FP32
Analysis found:
- 1-bit KV: byte-identical up to ~117 tokens (context ~132 = TQ_BK)
- 3-bit KV: byte-identical up to ~140 tokens
- Beyond the divergence point: output differs but remains coherent English
- Root cause: only key vectors are quantized; values remain FP32
- Divergence is expected: quantized attention scores shift the softmax weighting
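The root-cause claim can be illustrated with a minimal sketch. This is hypothetical illustration code, not the repository's implementation: it assumes a simple sign-based 1-bit key quantizer and shows that attention weights computed against quantized keys drift away from the FP32 baseline, even though the value vectors are untouched.

```python
import math
import random

random.seed(0)
D = 64  # head dimension (illustrative value, not from the repo)

def sign_quantize(v):
    # 1-bit quantization: keep only the sign of each component,
    # rescaled by the mean absolute magnitude (a common 1-bit scheme).
    scale = sum(abs(x) for x in v) / len(v)
    return [scale if x >= 0 else -scale for x in v]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# A random query and a small cache of FP32 key vectors.
q = [random.gauss(0, 1) for _ in range(D)]
keys = [[random.gauss(0, 1) for _ in range(D)] for _ in range(8)]

exact = softmax([dot(q, k) / math.sqrt(D) for k in keys])
quant = softmax([dot(q, sign_quantize(k)) / math.sqrt(D) for k in keys])

# The two attention distributions differ slightly. Because the (still
# FP32) values are mixed with shifted weights, outputs eventually
# diverge token by token, matching the observed behavior.
drift = max(abs(a - b) for a, b in zip(exact, quant))
print(f"max attention-weight drift: {drift:.4f}")
```

The drift is small per step, which is consistent with outputs staying byte-identical for ~120 tokens before accumulated differences change a sampled token.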
README updated with honest scope note.
Benchmark script header clarified.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Parent: 1afb716
2 files changed: 12 additions, 2 deletions