Commit 63e45bb
★ debug(kv): probe chunking fix — turbo_kv_4b CLEAN across all tested arch
R33's "hybrid NaN" was a probe bug, not a production bug. GGUF metadata
confirms that Qwen3.5-4B and Qwen3.6-35B have key_length=256, but TQ_BK=128,
so the traits' quantize/dequantize calls clamp internally. Production
handles this by chunking (see tq_transformer.c:1937/2081/2204); my probe
did not.
Fix: match production by chunking probe calls into TQ_BK blocks.
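A minimal sketch of the chunking idea, not the actual probe or the tq_transformer.c code. `toy_roundtrip_block` is a hypothetical stand-in for the real quantize/dequantize traits. It assumes "clamp internally" means a call processes at most TQ_BK elements and leaves the rest untouched. The toy symmetric 4-bit scheme is illustrative only and is not turbo_kv_4b.

```c
#include <assert.h>
#include <stddef.h>

#define TQ_BK 128 /* block size named in the commit message */

/* Hypothetical per-block roundtrip (quantize, then dequantize).
 * Assumption: like the real traits, it clamps to TQ_BK elements,
 * so anything past the first block is silently left unwritten. */
static void toy_roundtrip_block(const float *src, float *dst, size_t n) {
    if (n > TQ_BK) n = TQ_BK; /* the internal clamp */
    float amax = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float a = src[i] < 0.0f ? -src[i] : src[i];
        if (a > amax) amax = a;
    }
    /* Toy symmetric 4-bit grid: integer codes in [-7, 7]. */
    float scale = amax > 0.0f ? amax / 7.0f : 1.0f;
    for (size_t i = 0; i < n; i++) {
        float x = src[i] / scale;
        int q = (int)(x + (x >= 0.0f ? 0.5f : -0.5f)); /* round to nearest */
        dst[i] = (float)q * scale;
    }
}

/* Chunked probe: walk one head in TQ_BK-sized pieces so that every
 * element is covered, including a final short chunk (head_dim=64). */
static void probe_roundtrip_chunked(const float *src, float *dst,
                                    size_t head_dim) {
    assert(src != NULL && dst != NULL);
    for (size_t off = 0; off < head_dim; off += TQ_BK) {
        size_t n = head_dim - off;
        if (n > TQ_BK) n = TQ_BK;
        toy_roundtrip_block(src + off, dst + off, n);
    }
}
```

With head_dim=256, a single unchunked call to `toy_roundtrip_block` writes only the first 128 outputs. Comparing against the unwritten tail is one plausible way a probe could report garbage for 256-wide heads while 64- and 128-wide heads look fine.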
Post-fix per-arch KV quantization error (cos = K roundtrip cosine similarity):

| Model | head_dim | cos | NaN | Previously |
|---|---|---|---|---|
| Llama-3.2-1B | 64 | 0.994-0.997 | 0/64 | |
| Qwen3-0.6B | 128 | 0.995-0.997 | 0/128 | |
| Qwen3.5-4B | 256 | 0.994-0.996 | 0/256 | "inf/nan" |
| Qwen3.6-35B | 256 | 0.994-0.997 | 0/256 | "inf/nan" |

turbo_kv_4b is now **uniformly clean across every tested architecture**.
Per-layer per-position cos ≥ 0.994 confirms the 7x compression / +0% PPL
claim is structurally preserved — not just aggregate-validated.
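A sketch of the per-vector acceptance check implied by the numbers above: no NaN entries and a roundtrip cosine similarity of at least 0.994. Both helper names are hypothetical and the probe's real implementation is not shown in this commit. The threshold test compares squared quantities so that it needs no square root.

```c
#include <assert.h>
#include <stddef.h>

/* Count NaN entries. NaN is the only float value that is not equal
 * to itself, so v != v detects it without any library call. */
static size_t count_nan(const float *v, size_t n) {
    size_t c = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] != v[i]) c++;
    }
    return c;
}

/* Return 1 if cos(a, b) >= min_cos, else 0.
 * Uses dot^2 >= min_cos^2 * |a|^2 * |b|^2 together with dot > 0,
 * which is equivalent for min_cos > 0. Zero vectors and NaN inputs
 * fail the check, because every comparison against NaN is false. */
static int cosine_at_least(const float *a, const float *b, size_t n,
                           double min_cos) {
    assert(min_cos > 0.0);
    double dot = 0.0, na = 0.0, nb = 0.0;
    for (size_t i = 0; i < n; i++) {
        dot += (double)a[i] * (double)b[i];
        na += (double)a[i] * (double)a[i];
        nb += (double)b[i] * (double)b[i];
    }
    if (!(dot > 0.0) || !(na > 0.0) || !(nb > 0.0)) return 0;
    return dot * dot >= min_cos * min_cos * na * nb;
}
```

Running both checks per layer and per position, with `a` the original K vector and `b` its roundtrip, is what separates a structural guarantee from an aggregate one: a single bad position fails on its own instead of being averaged away.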
Methodology lesson: refparity's methodology worked; my probe implementation
had a silent bug of its own (the ironic kind). Always verify that diagnostic
tooling matches the production code path for the PLUMBING too: chunking,
buffer sizes, stride assumptions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 4b6019e commit 63e45bb
3 files changed
Lines changed: 53 additions & 4 deletions
[Diff bodies not captured; only hunk positions survive.]
File 1 of 3: 42 lines added as new lines 6-47, after original line 5.
File 2 of 3: original line 22 replaced by one new line.
File 3 of 3: original lines 1844-1846 replaced by new lines 1844-1853.