Commit 562fa34
debug(logit): TQ_LOGIT_PROBE — diagnoses 35B long-gen residual collapse
The user set 1000+ coherent tokens on Qwen3.6-35B as the headline
breakthrough metric. Current ceiling (post-v0.28.0 T=2.0): ~170 coherent
tokens with Q5_K_M + T=2.0 (no rep-penalty); beyond that, output degrades
into the alphabet-walk failure mode.
Added TQ_LOGIT_PROBE=every=N: prints the top-5 logits, their token IDs,
the margin, and the softmax entropy every N positions.
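A minimal sketch of the statistics the probe reports, for readers who want to reproduce them offline. This is not the code from this commit: `probe_stats` is a hypothetical name, the margin is assumed to be the top-1 logit minus the top-2 logit, and the entropy is assumed to be in nats over the full softmax.

```python
import math

def probe_stats(logits, k=5):
    """Return (top-k (id, logit) pairs, margin, softmax entropy in nats).

    Assumptions (not confirmed by the commit): margin = top-1 logit minus
    top-2 logit; entropy is taken over the whole vocabulary.
    """
    # Token IDs sorted by descending logit.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    top = [(i, logits[i]) for i in order[:k]]
    margin = top[0][1] - top[1][1]

    # Numerically stable softmax: subtract the max logit before exponentiating.
    m = top[0][1]
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    # Skip terms that underflowed to zero; they contribute nothing to entropy.
    entropy = -sum((e / z) * math.log(e / z) for e in exps if e > 0.0)
    return top, margin, entropy
```

Low entropy together with a large margin corresponds to the "peaky" readings in the diagnosis; a flat distribution would show entropy close to the log of the vocabulary size.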
Diagnosis on 35B -n 300 T=2.0:
pos=25 entropy=1.02 margin=0.94 (normal)
pos=125 entropy=0.29 margin=2.98 (very peaky — "Sorry!" loop starts)
pos=200 entropy=1.24 margin=0.52 top5_ids=[87,68,86,85,83] ← consec IDs
pos=225 entropy=0.21 margin=3.29
Crucial insight: the logits are NOT flat at degradation positions.
At pos=125 and pos=225 they are extremely peaky (entropy 0.2-0.3,
margin ~3): the model is CONFIDENTLY outputting bad tokens.
At pos=200, four of the top-5 token IDs (83, 85, 86, 87) cluster in the
83-87 range of single-char BPE tokens; the fifth is 68. The model's
residual stream appears to have collapsed into a "single-character
subspace", and lm_head projects onto adjacent byte-tokens.
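The clustering of top-5 IDs can be checked mechanically. The sketch below is a hypothetical helper, not part of this commit: it reports the longest run of consecutive integers among a set of token IDs, which could serve as a cheap alphabet-walk indicator.

```python
def longest_consecutive_run(ids):
    """Length of the longest run of consecutive integers in ids (order ignored)."""
    s = set(ids)
    best = 0
    for i in s:
        # Only start counting at the lowest member of a run.
        if i - 1 not in s:
            n = 1
            while i + n in s:
                n += 1
            best = max(best, n)
    return best

# The pos=200 sample from the diagnosis: 85, 86, 87 form a run of 3.
print(longest_consecutive_run([87, 68, 86, 85, 83]))  # prints 3
```

Any threshold for flagging a position (for example, a run of 3 or more inside the top-5) would be a tuning assumption and is not taken from the commit.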
This reframes the remaining problem: NOT logit-space, but residual-space
collapse. Candidate drivers (none confirmed yet): KV attention focusing
on recent repetitive tokens, DeltaNet state saturating into a low-rank
attractor, or cumulative numerical drift through 40 layers × position.
Ablation sweep:
k-window=256: alphabet-walk → "Sorry?" repetition (less catastrophic)
k-window=64: too narrow, degrades faster
delta-reset=100: "2020 dragon" loop at 125 (too aggressive)
delta-reset=150: identical to default
Q5_K_M + T=2.0 (no rep-pen): ~170 coherent tokens (current peak)
rep-penalty 1.3: no effect on the math-loop (only TEMP=2.0 breaks it)
No fix this round — 170-tok ceiling stands. Next attack vectors noted
in state.md R38 entry: per-layer residual rms dump, attention-weight
dump at long positions, periodic "residual refresh" re-embedding.
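As a rough illustration of the first proposed vector, a per-layer residual RMS dump could look like the sketch below. The function names and the input layout (one hidden-state vector per layer at a single position) are assumptions for illustration; the engine's actual data structures are not shown in this commit.

```python
import math

def rms(vec):
    """Root-mean-square of one hidden-state vector."""
    return math.sqrt(sum(x * x for x in vec) / len(vec))

def layer_rms_report(residuals):
    """residuals: list of per-layer hidden-state vectors at one position.

    Returns one RMS value per layer, in layer order.
    """
    return [rms(h) for h in residuals]
```

Comparing these values between an early position and a degraded one would show whether the residual magnitude drifts with depth or with position, which bears on the numerical-drift hypothesis.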
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent f047447 commit 562fa34
3 files changed
Lines changed: 103 additions & 0 deletions
Diff hunks (file names and line contents not preserved):
- file 1: lines 6-66 added (61 lines)
- file 2: line 23 added (1 line)
- file 3: lines 3358-3398 added (41 lines)