Community validation suite: ASan clean, ablation, sampling, long context
Validation results:
- ASan + UBSan: 23/23 tests pass, zero memory errors
- Ablation: turbo_kv_3b matches uniform at 100 tokens, diverges at ~150 tokens
- Ablation: turbo_kv_1b matches uniform at 100 tokens, diverges at ~150 tokens
- Sampling (T=0.7): all KV types produce identical stochastic output
- V quantization: Q4 V output diverges from FP16 V output (expected, documented)
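
A divergence point like the ones reported above could be measured roughly as follows. This is a minimal Python sketch, not the logic in bench/ablation_test.sh: the function name first_divergence and the whitespace tokenization in the usage note are assumptions made for illustration.

```python
from itertools import zip_longest
from typing import Optional, Sequence


def first_divergence(reference: Sequence[str], candidate: Sequence[str]) -> Optional[int]:
    """Return the index of the first token where the sequences differ.

    Returns None when both sequences are identical. A length mismatch
    counts as a divergence at the index where the shorter sequence ends.
    """
    missing = object()  # marks positions past the end of the shorter sequence
    pairs = zip_longest(reference, candidate, fillvalue=missing)
    for index, (ref_tok, cand_tok) in enumerate(pairs):
        if ref_tok != cand_tok:
            return index
    return None
```

Usage, assuming each run's generated text was saved and is split on whitespace: first_divergence(baseline_text.split(), quantized_text.split()) gives the token index at which a quantized KV run stops matching the baseline, or None if the two outputs are identical.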
New scripts:
- bench/ablation_test.sh: divergence analysis at 50-300 tokens
- bench/long_quality_test.sh: coherence at 200-1000 tokens
- bench/sampling_test.sh: temperature sampling comparison
- scripts/sanitize.sh: ASan + UBSan build and test
README: add Benchmarks & Validation section with all test commands.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>