Commit 1a3807f

unamedkr and claude committed
docs(tier): Qwen3.6-27B Q4_K_M as Tier 3 — fundamental forward-pass bug
Newly released Qwen3.6-27B (dense DeltaNet hybrid, arch "qwen35"):
- 64 layers, dim 5120, time_step_rank=48, ssm_inner=6144
- Loads without crash (config detected correctly)
- basin_compat shows max rel_diff 3.87 at L3 with SIGN FLIPS
- Sign flips at early layers = NOT basin drift, but fundamental bug

Compare:
- Qwen3.6-35B-A3B (Tier 2): max 0.41, sign preserved
- Qwen3.5-4B (Tier 1 quality): max 2.30, sign mostly preserved
- Qwen3.6-27B: max 3.87, sign flips at L3, L51-L63 (fundamental issue)

Suspected causes (uninvestigated):
- DeltaNet config variant (time_step_rank=48 vs A3B's 32) handling
- MoE code paths leaking into dense model
- 27B GGUF tensor layout differences

Practical issues:
- 16.8 GB Q4_K_M exceeds 16 GB RAM → constant swap → 0.3 tok/s
- Smaller quants (UD-IQ2_M 10.1 GB) would fit RAM but same bug applies

Marked Tier 3 in tier_benchmark_2026_04_25.md until engineering investigation.
Documented suspected root causes for next session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent c1d3d67 commit 1a3807f

1 file changed

Lines changed: 37 additions & 1 deletion

docs/tier_benchmark_2026_04_25.md

@@ -26,13 +26,49 @@ Standardized coherent-length measurement across 5 models, 3 prompts each. Run vi
 | Phi-3.5-mini-instruct Q8_0 | 299 (-n cap) | 299 (-n cap) | natural EOS | 1 |
 | **Qwen3.6-35B-A3B IQ4_XS** | 149 (EOS, thinking) | **73, rep loop** | **51, attractor** | **2** |
 | **Qwen3.6-35B-A3B Q5_K_M** | 169 (EOS, thinking) | **68, rep loop** | **69, rep loop** | **2** |
+| **Qwen3.6-27B Q4_K_M** (NEW) | not measured | not measured | not measured | **3** |

 **Key observations:**

 - Qwen2.5-0.5B (Tier 3): model is genuinely too small, attractor under any engine. Keep in table as lower-bound reference.
 - Qwen3.6-35B-A3B (both quants Tier 2): **quantization change doesn't help**. Both IQ4_XS and Q5_K_M show the same basin mismatch (attractor at ~70 tok on non-thinking prompts). Confirms basin issue is structural, not bit-width.
+- **Qwen3.6-27B Q4_K_M (Tier 3, 2026-04-25)**: newly released dense DeltaNet hybrid (arch `qwen35`, 64 layers, dim 5120, time_step_rank=48 / ssm_inner=6144, a different config from 35B-A3B). basin_compat shows max rel_diff **3.87 at L3**, with sign flips at L3 and again at L51-L63. This is NOT basin mismatch (which preserves sign); it is a fundamental forward-pass bug (likely in DeltaNet config variants we haven't validated, or an MoE-code-path leak into dense models). The 16.8 GB Q4_K_M file also exceeds 16 GB RAM, causing constant disk swap. Marked Tier 3 pending engineering investigation.
 - 10/14 models are clean Tier 1 across 3 prompts.
-- No model has been demoted from Tier 1 to 2 by our R63 cleanup. The DN_PORT fix + auto-preset cleanup was a net improvement.
+- No prior Tier 1 model was demoted by R63 cleanup. The DN_PORT fix + auto-preset cleanup was a net improvement.
+
+## Qwen3.6-27B Q4_K_M: full diagnostic (2026-04-25)
+
+basin_compat measurement (sums over 5120 elements per layer):
+
+| Layer | ours | llama | rel_diff | Notes |
+|-------|------|-------|----------|-------|
+| 0 | 23.5 | 6.8 | 2.47 | already off, same sign |
+| 1 | 28.1 | 6.5 | 3.35 | growing |
+| 3 | **-52.2** | **+18.2** | **3.87** | **sign flip (fundamental)** |
+| 7 | 117 | 33.5 | 2.49 | |
+| 33 | 14 | 268 | 0.95 | huge magnitude diff |
+| 51 | -196 | 84 | 3.33 | sign flip |
+| 57 | -339 | 138 | 3.46 | sign flip |
+| 59 | -481 | 219 | 3.20 | sign flip |
+| 63 | 160 | -121 | 2.32 | sign flip on FINAL layer |
+
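The `rel_diff` column in the diagnostic table is consistent with |ours - llama| / |llama| applied to the per-layer sums. That formula is inferred from the table values, not taken from the basin_compat source, so treat the following as a minimal sketch under that assumption. The names `rel_diff` and `worst_layer` are illustrative.

```python
# Assumption: rel_diff = |ours - llama| / |llama|, inferred from the table
# values (not taken from the basin_compat source).

# (layer, ours, llama, reported rel_diff), copied from the diagnostic table.
ROWS = [
    (0, 23.5, 6.8, 2.47),
    (1, 28.1, 6.5, 3.35),
    (3, -52.2, 18.2, 3.87),
    (7, 117.0, 33.5, 2.49),
    (33, 14.0, 268.0, 0.95),
    (51, -196.0, 84.0, 3.33),
    (57, -339.0, 138.0, 3.46),
    (59, -481.0, 219.0, 3.20),
    (63, 160.0, -121.0, 2.32),
]


def rel_diff(ours: float, llama: float) -> float:
    """Relative difference of our per-layer sum against the reference sum."""
    return abs(ours - llama) / abs(llama)


def worst_layer(rows):
    """Return (layer, recomputed rel_diff) for the row with the largest value."""
    pairs = ((layer, rel_diff(ours, llama)) for layer, ours, llama, _ in rows)
    return max(pairs, key=lambda pair: pair[1])


if __name__ == "__main__":
    for layer, ours, llama, reported in ROWS:
        recomputed = rel_diff(ours, llama)
        print(f"L{layer:<2} recomputed={recomputed:.2f} reported={reported:.2f}")
    print("worst layer:", worst_layer(ROWS))
```

Recomputed values agree with the reported column to within about 0.03. The small gaps at layers 0 and 1 are presumably due to the sums being rounded for display.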
+Max rel_diff: **3.87** (L3). For comparison:
+- Qwen3.6-35B-A3B (Tier 2): max 0.41, sign preserved everywhere
+- Qwen3.5-4B (Tier 1 quality, Tier 3 by basin): max 2.30, sign mostly preserved
+- Qwen3.6-27B (Tier 3): max 3.87, sign flips at L3 and L51-L63
+
+**Sign flips at early layers are the diagnostic signature of a fundamental forward-pass bug** (not FP32 basin drift). Recommended action: skip 27B until the bug is investigated.
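The sign-flip criterion can be checked mechanically from the same per-layer sums. The sketch below is illustrative only: the helper names and the two labels returned by `classify` are hypothetical and not part of basin_compat, and treating any flip as decisive is a simplification of the rule stated above.

```python
# Hypothetical helpers; the (layer, ours, llama) sums are copied from the
# Qwen3.6-27B Q4_K_M diagnostic table.
SUMS = [
    (0, 23.5, 6.8),
    (1, 28.1, 6.5),
    (3, -52.2, 18.2),
    (7, 117.0, 33.5),
    (33, 14.0, 268.0),
    (51, -196.0, 84.0),
    (57, -339.0, 138.0),
    (59, -481.0, 219.0),
    (63, 160.0, -121.0),
]


def sign_flipped(ours: float, llama: float) -> bool:
    """True when the two per-layer sums have opposite signs."""
    return ours * llama < 0


def flipped_layers(sums):
    """Layers whose sum has the opposite sign to the reference."""
    return [layer for layer, ours, llama in sums if sign_flipped(ours, llama)]


def classify(sums) -> str:
    """Simplified rule: any sign flip suggests a forward-pass bug, not drift."""
    if flipped_layers(sums):
        return "forward-pass bug suspected"
    return "sign preserved (basin drift at most)"


if __name__ == "__main__":
    print("flipped layers:", flipped_layers(SUMS))
    print("verdict:", classify(SUMS))
```

On the table data this flags layers 3, 51, 57, 59 and 63, while layers 0, 1, 7 and 33 differ in magnitude only.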
+
+**Suspected causes** (not validated this session):
+1. DeltaNet config variant (time_step_rank=48 vs A3B's 32, ssm_inner=6144 vs 4096) handling.
+2. MoE-only code paths firing on a dense model (load message "Fused MoE kernels ready" appeared on the dense model; likely harmless but worth confirming).
+3. Tensor layout interpretation differences in 27B GGUF.
+
+**Memory**: with the 16.8 GB Q4_K_M model on a 16 GB RAM Mac, evaluation is impractical (constant swap, ~0.3 tok/s, -n 30 test took 15+ min). For users wanting to test 27B, smaller quants are available:
+- UD-IQ2_M: 10.1 GB (recommended for 16 GB RAM)
+- UD-Q2_K_XL: 11.0 GB
+- Q3_K_S: 11.5 GB
+But the same Tier 3 forward-pass bug would apply regardless of bit-width.
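To make the memory comparison concrete, here is a small sketch that filters the quant sizes listed above against available RAM. The 2 GB headroom for the OS and runtime buffers is an illustrative assumption rather than a measured figure, and `quants_that_fit` is a hypothetical helper.

```python
# File sizes in GB as listed in this document for Qwen3.6-27B.
QUANT_SIZES_GB = {
    "Q4_K_M": 16.8,
    "UD-IQ2_M": 10.1,
    "UD-Q2_K_XL": 11.0,
    "Q3_K_S": 11.5,
}


def quants_that_fit(ram_gb: float, headroom_gb: float = 2.0, sizes=QUANT_SIZES_GB):
    """Return quant names whose file fits in RAM minus headroom, smallest first.

    headroom_gb is an assumed allowance for the OS and runtime buffers.
    """
    budget = ram_gb - headroom_gb
    fitting = [name for name, size in sizes.items() if size <= budget]
    return sorted(fitting, key=sizes.get)


if __name__ == "__main__":
    print(quants_that_fit(16.0))
```

Fitting in RAM only removes the swap penalty; it does not change the tier verdict.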

 ## Quality verdicts (first ~200 chars)
