Commit ead9d8d
Phase A+B+C: Real model validation, Python bindings, llama.cpp integration
Phase A — Real Model KV Cache Validation:
- dump_real_kv_cache.py: Dumps KV cache from Qwen2.5-0.5B (or realistic synthetic)
- real_model_validation.cpp: Measures all 7 types on real LLM attention patterns
- Key finding: uniform_4b achieves cosine 0.991 on real data (A+ grade)
- Per-layer trend analysis: MSE scales with depth but cosine stays >0.98
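
For readers unfamiliar with the metric, the cosine figures above compare an original key vector with its quantize/dequantize round trip. The following is a minimal, standard-library sketch of that kind of measurement. It assumes a plain per-block min/max 4-bit uniform quantizer and a synthetic Gaussian vector. The function names are hypothetical and this is not the project's `uniform_4b` implementation or its validation harness.

```python
import math
import random

def quantize_uniform_4b(block):
    """Map floats to 4-bit codes (0..15) using a per-block offset and scale.

    Illustrative only: the real uniform_4b layout is not shown in the commit.
    """
    lo, hi = min(block), max(block)
    scale = (hi - lo) / 15 or 1.0  # 16 levels; fall back to 1.0 for constant blocks
    codes = [min(15, max(0, round((x - lo) / scale))) for x in block]
    return codes, lo, scale

def dequantize_uniform_4b(codes, lo, scale):
    """Reconstruct approximate floats from codes plus offset and scale."""
    return [lo + c * scale for c in codes]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Synthetic stand-in for one 128-dimensional key vector (an assumption,
# not data from the Qwen2.5-0.5B dump).
rng = random.Random(0)
key = [rng.gauss(0.0, 1.0) for _ in range(128)]

codes, lo, scale = quantize_uniform_4b(key)
restored = dequantize_uniform_4b(codes, lo, scale)
print(f"round-trip cosine: {cosine(key, restored):.4f}")
```

Cosine is insensitive to overall magnitude, which is one plausible reason it can stay high across layers even while MSE grows with depth.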
Phase B — Python Bindings Complete:
- ctypes-based TurboQuant class with NumPy support
- quantize_keys(), dequantize_keys(), attention() methods
- 22 Python tests passing
- pip install -e . works
- python_quickstart.py demonstrates 7.5x compression at 0.9954 cosine
Phase C — llama.cpp Integration:
- Complete GGML type registration (7 types, base offset 256)
- from_float/to_float/vec_dot wrappers for all types
- CLI parser with 21 name aliases (tq-turbo-3b, turbo_3b, turbo3)
- 10 integration tests (type mapping, roundtrip, end-to-end flow)
- Comprehensive README with 3-step quick start
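
As an illustration of the alias handling described above, the sketch below resolves the three example spellings to one canonical name. The table contents, the choice of `turbo_3b` as the canonical form, and the function name `resolve_type_name` are assumptions made for the example. The actual parser in `integrations/llamacpp` is not shown in this commit message and may work differently.

```python
# Hypothetical alias table: only the three spellings quoted in the commit
# message are listed; the other aliases are not known from this page.
ALIASES = {
    "tq-turbo-3b": "turbo_3b",
    "turbo_3b": "turbo_3b",
    "turbo3": "turbo_3b",
}

def resolve_type_name(name):
    """Return the canonical type name for a user-supplied alias.

    Matching is case-insensitive and ignores surrounding whitespace.
    Raises ValueError for names that are not in the table.
    """
    key = name.strip().lower()
    try:
        return ALIASES[key]
    except KeyError:
        raise ValueError(f"unknown quantization type: {name!r}") from None

for spelling in ("tq-turbo-3b", "turbo_3b", "Turbo3"):
    print(spelling, "->", resolve_type_name(spelling))
```

An explicit lookup table keeps every accepted spelling visible in one place, at the cost of listing each alias by hand.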
Test results: 13 C++ tests + 22 Python tests = 35 total, all passing
Score: 99.7% (4 dimensions at 100%)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 6548a12 commit ead9d8d
17 files changed
Lines changed: 1484 additions & 257 deletions
File tree
- bench
- bindings/python/turboquant
- docs
- examples
- integrations/llamacpp
- spec/test_vectors/real_kv
- tests/reference