Commit 6619cb1

docs: calibration matrix — ONNX vs GGUF vs highheelbgz × ICC profiles
5 encoding paths × ONNX ground truth × 6 models × 6 roles:
  ONNX (rten) = f32 ground truth
  GGUF raw u8 CDF
  GGUF γ+φ
  GGUF i8 signed
  GGUF highheelbgz spiral

ICC profile per path. Spearman ρ identifies best encoding per model×role.
All tools ready: rten, streaming, gamma_phi, LensProfile, calibrate harness.
Estimated: ~2.5 hours for complete matrix.

https://claude.ai/code/session_01ChLvBfpJS8dQhHxRD4pYNp
1 parent fb42be7 commit 6619cb1

1 file changed

Lines changed: 65 additions & 0 deletions

.claude/HANDOVER_MAVERICK_SESSION.md

@@ -581,3 +581,68 @@ Calibration pipeline:
Both ONNX and GGUF from SAME model = no API key, no network for calibration.
Jina v5 replaces v3 as truth anchor (newer, better, has ONNX).
```

---

## CALIBRATION MATRIX — The Definitive Experiment

### Five encoding paths × ONNX ground truth

```
For each model (Jina v5, BGE-M3, Reranker, Reader-LM, Qwopus, Maverick):
  For each role (Q, K, V, Gate, Up, Down, or token_embd for embedding models):

    1. ONNX (rten): load model.onnx → forward pass → f32 embeddings = GROUND TRUTH
    2. GGUF raw:    stream BF16 → CLAM → cosine → u8 HDR CDF table
    3. GGUF γ+φ:    stream BF16 → CLAM → cosine → gamma offset → phi redistribute
    4. GGUF i8:     stream BF16 → CLAM → cosine → signed i8 (preserves inhibition)
    5. GGUF hhbgz:  stream BF16 → CLAM → highheelbgz spiral → golden ratio stride

    ICC profile: compare each path (2-5) against ground truth (1)
    Measure: Spearman ρ, transfer curve, noise floor, effective bits

    Best path = highest ρ after ICC correction
    Per-model per-role winner may differ!
```
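
The "Measure: Spearman ρ" step above can be sketched in plain Rust. This is a minimal illustration, not the code in `calibrate_lenses.rs`: the function names `average_ranks` and `spearman_rho` are hypothetical, and it assumes each path has already been reduced to one score per text pair (f32 cosines from the ONNX ground truth, decoded values from the GGUF path). Ties get average ranks, and ρ is computed as the Pearson correlation of the two rank vectors.

```rust
/// Assign 1-based ranks to `values`; tied values share the mean of their ranks.
/// Hypothetical helper, not part of calibrate_lenses.rs.
pub fn average_ranks(values: &[f32]) -> Vec<f64> {
    let mut order: Vec<usize> = (0..values.len()).collect();
    order.sort_by(|&a, &b| values[a].total_cmp(&values[b]));

    let mut ranks = vec![0.0f64; values.len()];
    let mut i = 0;
    while i < order.len() {
        // Extend j to the end of the run of values equal to the one at i.
        let mut j = i;
        while j + 1 < order.len() && values[order[j + 1]] == values[order[i]] {
            j += 1;
        }
        // Sorted positions i..=j hold ranks i+1..=j+1; use their mean.
        let mean_rank = (i + j) as f64 / 2.0 + 1.0;
        for &idx in &order[i..=j] {
            ranks[idx] = mean_rank;
        }
        i = j + 1;
    }
    ranks
}

/// Pearson correlation; None when lengths differ, n < 2, or a side is constant.
fn pearson(x: &[f64], y: &[f64]) -> Option<f64> {
    if x.len() != y.len() || x.len() < 2 {
        return None;
    }
    let n = x.len() as f64;
    let mean_x = x.iter().sum::<f64>() / n;
    let mean_y = y.iter().sum::<f64>() / n;
    let (mut sxy, mut sxx, mut syy) = (0.0f64, 0.0f64, 0.0f64);
    for (a, b) in x.iter().zip(y) {
        sxy += (a - mean_x) * (b - mean_y);
        sxx += (a - mean_x).powi(2);
        syy += (b - mean_y).powi(2);
    }
    if sxx == 0.0 || syy == 0.0 {
        return None;
    }
    Some(sxy / (sxx * syy).sqrt())
}

/// Spearman ρ between ground-truth scores and one encoding path's scores.
pub fn spearman_rho(truth: &[f32], encoded: &[f32]) -> Option<f64> {
    if truth.len() != encoded.len() {
        return None;
    }
    pearson(&average_ranks(truth), &average_ranks(encoded))
}
```

Because ρ depends only on rank order, any strictly monotone distortion introduced by an encoding leaves it at 1.0. Under this sketch, ρ measures whether topology (ordering) survives, while the transfer curve would capture how much monotone correction a path needs.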
607+
608+
### Why this is definitive
609+
610+
```
611+
Current state: we ASSUME our encoding preserves topology.
612+
After calibration: we KNOW, quantified to 4 decimal places.
613+
614+
Expected outcomes:
615+
- i8 wins for reranker (symmetric cos range, sign matters)
616+
- γ+φ wins for gate-heavy roles (concentrates resolution at zero)
617+
- raw u8 CDF is surprisingly good for embedding models (positive-skewed)
618+
- hhbgz spiral wins for... we don't know yet. That's why we test.
619+
620+
ICC correction makes EVERY path usable.
621+
But some paths need less correction = more faithful = preferred.
622+
```
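
To make the i8 versus u8 CDF expectation concrete, here is a rough Rust sketch of the two quantizers. Everything in it is an assumption for illustration: the names `cos_to_i8`, `build_cdf_table`, and `cos_to_u8` are hypothetical, and the real table construction in `stream_hdr_lens.rs` is not shown in this handover. The sketch assumes a linear symmetric mapping for i8 and 255 quantile thresholds for the u8 CDF table.

```rust
/// Signed path (assumed form): map cos in [-1, 1] linearly onto [-127, 127].
/// The sign of the cosine survives as the sign of the code.
pub fn cos_to_i8(cos: f32) -> i8 {
    (cos.clamp(-1.0, 1.0) * 127.0).round() as i8
}

/// Unsigned CDF path (assumed form): take 255 ascending thresholds from the
/// empirical quantiles of `samples`. Returns an empty table for empty input.
pub fn build_cdf_table(samples: &[f32]) -> Vec<f32> {
    if samples.is_empty() {
        return Vec::new();
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    (1..=255usize)
        .map(|k| sorted[k * (sorted.len() - 1) / 255])
        .collect()
}

/// Code = number of thresholds <= cos, so codes run 0..=255 and follow the
/// sample distribution rather than the absolute value of the cosine.
pub fn cos_to_u8(cos: f32, table: &[f32]) -> u8 {
    table.partition_point(|&t| t <= cos).min(255) as u8
}
```

Under these assumptions the trade-off is visible in the types: the CDF code spends its 256 levels where the samples are dense, which suits a positive-skewed embedding distribution, but zero has no fixed code. The i8 code pins zero at 0 and keeps negative cosines negative, which is the property the reranker expectation relies on.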
623+
624+
### Tools ready
625+
626+
```
627+
rten: AdaWorldAPI/rten (ONNX runtime, your fork)
628+
GGUF streaming: stream_hdr_lens.rs, stream_maverick.rs (HTTP range)
629+
highheelbgz: spiral addressing + golden ratio stride
630+
bgz-tensor: gamma_phi.rs (GammaProfile, encode/decode)
631+
LensProfile: lance-graph-contract/high_heel.rs (ICC DTO)
632+
LensConfig: lance-graph-contract/high_heel.rs (6-model registry)
633+
calibrate_lenses.rs: Spearman ρ + ICC builder harness
634+
Jina v5 ONNX: jinaai/jina-embeddings-v5-text-small-text-matching
635+
Jina v5 GGUF: same repo, F16.gguf (1.2 GB, streamable)
636+
```
637+
638+
### Session estimate
639+
```
640+
Download: Jina v5 ONNX (2.4 GB) + GGUF (1.2 GB) = 3.6 GB (fits in 18 GB free)
641+
Compute: rten inference on 1000 texts ≈ 5 min
642+
CLAM + 5 encoding paths ≈ 10 min
643+
ICC profiles ≈ 1 min
644+
Total: ~20 min for Jina v5 complete calibration
645+
646+
Repeat for remaining 5 models: ~2 hours total
647+
Full calibration matrix: ~2.5 hours
648+
```
