Commit c09aa02
★★★ research(r45): llama.cpp HEAD-TO-HEAD proves 1000+ tok achievable — 23× gap exposed
User pushed: "There's no reason 1000+ shouldn't work. Don't stop at the session limit; break through."
Built llama.cpp from refs/ and ran an identical model + prompt head-to-head
(CPU-only, -ngl 0, same weights, T=0):
- Qwen3.5-4B llama.cpp: 1286 words (~1700 tok), coherent thinking+drafts
- Qwen3.5-4B ours: 185 tok (natural stop)
- Qwen3.6-35B llama.cpp: 1101 words (~1500 tok), COMPLETE fantasy story
- Qwen3.6-35B ours: ~65 tok before the "Sorry!" attractor
1000+ IS ACHIEVABLE on 35B. Not architectural. Not quant-inherent. Not
prompt-too-short. Our implementation has a ~23× gap vs llama.cpp on
the same CPU, same weights, same prompt.
This invalidates R41-R42's conclusion that "all patchable paths already
match reference". Point-checking individual operations doesn't prove
end-to-end correctness — numerical precision compounds differently over
40 layers × 1000 positions.
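To make the compounding argument concrete, here is a minimal standalone C++ demo (illustration only, not project code): two mathematically identical fp32 reductions of the same values typically disagree in the low-order bits, and each matmul row in the forward pass is exactly such a reduction, repeated across every layer and position.

```cpp
// error_compounding.cpp -- minimal illustration, not project code.
#include <cstdio>
#include <cmath>
#include <vector>

int main() {
    // Deterministic pseudo-random values in [-0.5, 0.5).
    std::vector<float> v(4096);
    unsigned s = 1234567u;
    for (float &x : v) {
        s = s * 1664525u + 1013904223u;           // LCG
        x = (float)(s >> 8) / 16777216.0f - 0.5f;
    }

    // Reduction A: plain left-to-right fp32 sum.
    float seq = 0.0f;
    for (float x : v) seq += x;

    // Reduction B: blocked sum (SIMD/tiled kernels effectively do this).
    float blocked = 0.0f;
    for (size_t i = 0; i < v.size(); i += 8) {
        float partial = 0.0f;
        for (size_t j = i; j < i + 8; ++j) partial += v[j];
        blocked += partial;
    }

    // Reference in double precision.
    double ref = 0.0;
    for (float x : v) ref += (double)x;

    std::printf("sequential = %.9g\nblocked    = %.9g\nreference  = %.9g\n",
                seq, blocked, ref);
    std::printf("|seq - blocked| = %.3g\n", std::fabs(seq - blocked));
    // Each matmul row is one such reduction; over 40 layers x 1000
    // positions these per-op differences can steer sampling to a
    // different token, with no single op being "wrong".
    return 0;
}
```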
Revised hunt priority:
1. Our tq_matmul_gguf on-the-fly dequant vs llama.cpp's fused Q4_K×FP32 (see the dot-product sketch after this list)
2. Attention softmax/normalize precision (softmax sketch below)
3. Matmul accumulator precision / reduction order (cf. the reduction demo above)
4. EOS logit margin (our sharper peaks may pull EOS earlier)
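For item 1, a hedged sketch of the structural difference. The block format here is a deliberate simplification (one fp32 scale per 32 signed 4-bit weights; the real Q4_K layout with 256-weight super-blocks and 6-bit sub-scales is omitted). The point is not which path is more accurate, but that the two paths round in different places, so logits can drift apart token by token even with identical weights:

```cpp
// quant_dot_sketch.cpp -- hedged sketch; NOT the real Q4_K layout.
#include <cstdio>
#include <cstdint>
#include <cmath>
#include <vector>

// Hypothetical simplified block: 32 signed 4-bit weights + one fp32 scale.
struct Block4 {
    float scale;
    int8_t q[32];   // each in [-8, 7]
};

// Path A: on-the-fly dequant, fp32 dot. Rounding enters once per element.
float dot_dequant(const Block4 &b, const float *x) {
    float acc = 0.0f;
    for (int i = 0; i < 32; ++i)
        acc += (b.scale * (float)b.q[i]) * x[i];
    return acc;
}

// Path B: fused, llama.cpp-style shape -- quantize the activation block
// to int8, dot in exact int32 arithmetic, apply both scales once at the end.
float dot_fused(const Block4 &b, const float *x) {
    float amax = 0.0f;
    for (int i = 0; i < 32; ++i) amax = std::fmax(amax, std::fabs(x[i]));
    float xs = amax / 127.0f;                     // activation scale
    int32_t acc = 0;
    for (int i = 0; i < 32; ++i) {
        int8_t xq = (int8_t)std::lround(x[i] / xs);
        acc += (int32_t)b.q[i] * (int32_t)xq;     // exact in int32
    }
    return b.scale * xs * (float)acc;             // one rounding at the end
}

int main() {
    Block4 b{0.0123f, {}};
    std::vector<float> x(32);
    for (int i = 0; i < 32; ++i) {
        b.q[i] = (int8_t)((i * 7) % 15 - 7);      // deterministic fake weights
        x[i]   = std::sin(0.37f * i);             // deterministic fake activations
    }
    float a = dot_dequant(b, x.data());
    float f = dot_fused(b, x.data());
    std::printf("dequant=%.9g fused=%.9g diff=%.3g\n", a, f, a - f);
    return 0;
}
```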
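For item 2, the usual max-subtracted softmax (what llama.cpp and most attention kernels use). A minimal sketch, not our actual kernel; the subtraction doesn't change the math, it only keeps exp() in range. The denominator's accumulator width is one concrete, low-risk knob to experiment with:

```cpp
// softmax_sketch.cpp -- minimal max-subtracted softmax sketch.
#include <cmath>
#include <cstdio>

// Softmax over n logits, in place. fp32 accumulator here; widening
// `denom` to double is one low-risk experiment for item 2.
void softmax_stable(float *p, int n) {
    float m = p[0];
    for (int i = 1; i < n; ++i) m = std::fmax(m, p[i]);  // row max
    float denom = 0.0f;
    for (int i = 0; i < n; ++i) {
        p[i] = std::exp(p[i] - m);                       // exp stays in (0, 1]
        denom += p[i];
    }
    float inv = 1.0f / denom;
    for (int i = 0; i < n; ++i) p[i] *= inv;
}

int main() {
    float logits[4] = {2.0f, 1.0f, 0.5f, -1.0f};
    softmax_stable(logits, 4);
    for (float p : logits) std::printf("%.6f ", p);
    std::printf("\n");
    return 0;
}
```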
Next: per-token output diff vs llama.cpp to find first divergence point,
then narrow to specific op. Goal clear, gap quantified, direction
actionable.
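A sketch of that per-token diff harness, assuming both runners can dump one token id per line at T=0 (the file names below are placeholders, not existing project artifacts):

```cpp
// first_divergence.cpp -- sketch of the planned per-token diff.
#include <algorithm>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// One token id per line, as dumped by each runner (assumed format).
static std::vector<long> load_ids(const char *path) {
    std::ifstream in(path);
    std::vector<long> ids;
    for (std::string line; std::getline(in, line); )
        if (!line.empty()) ids.push_back(std::stol(line));
    return ids;
}

int main(int argc, char **argv) {
    const char *ours = argc > 1 ? argv[1] : "ours_tokens.txt";      // placeholder
    const char *ref  = argc > 2 ? argv[2] : "llamacpp_tokens.txt";  // placeholder
    std::vector<long> a = load_ids(ours), b = load_ids(ref);
    size_t n = std::min(a.size(), b.size());
    for (size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) {
            std::printf("first divergence at position %zu: ours=%ld ref=%ld\n",
                        i, a[i], b[i]);
            return 0;   // narrow the hunt to the ops feeding this position
        }
    }
    std::printf("identical for %zu tokens (lengths %zu vs %zu)\n",
                n, a.size(), b.size());
    return 0;
}
```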
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>