@@ -581,3 +581,68 @@ Calibration pipeline:
 Both ONNX and GGUF from SAME model = no API key, no network for calibration.
 Jina v5 replaces v3 as truth anchor (newer, better, has ONNX).
 ```
+
+---
+
+## CALIBRATION MATRIX — The Definitive Experiment
+
+### Four GGUF encoding paths × ONNX ground truth
+
+```
+For each model (Jina v5, BGE-M3, Reranker, Reader-LM, Qwopus, Maverick):
+  For each role (Q, K, V, Gate, Up, Down — or token_embd for embedding models):
+
+    1. ONNX (rten): load model.onnx → forward pass → f32 embeddings = GROUND TRUTH
+    2. GGUF raw:    stream BF16 → CLAM → cosine → u8 HDR CDF table
+    3. GGUF γ+φ:    stream BF16 → CLAM → cosine → gamma offset → phi redistribute
+    4. GGUF i8:     stream BF16 → CLAM → cosine → signed i8 (preserves inhibition)
+    5. GGUF hhbgz:  stream BF16 → CLAM → highheelbgz spiral → golden ratio stride
+
+    ICC profile: compare each path (2-5) against ground truth (1)
+    Measure: Spearman ρ, transfer curve, noise floor, effective bits
+
+    Best path = highest ρ after ICC correction
+    Per-model per-role winner may differ!
+```
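
A minimal sketch of the Spearman ρ measurement named above, assuming each path has already been reduced to one score per pair (f32 cosines for the ground truth, decoded values for paths 2-5). `average_ranks` and `spearman_rho` are hypothetical names used for illustration; this is not the code in calibrate_lenses.rs.

```rust
/// Assign 1-based ranks to `values`; tied values share the mean of their ranks.
fn average_ranks(values: &[f32]) -> Vec<f64> {
    let mut order: Vec<usize> = (0..values.len()).collect();
    order.sort_by(|&a, &b| values[a].total_cmp(&values[b]));
    let mut ranks = vec![0.0; values.len()];
    let mut i = 0;
    while i < order.len() {
        // Extend j to the end of the run of values tied with position i.
        let mut j = i;
        while j + 1 < order.len() && values[order[j + 1]] == values[order[i]] {
            j += 1;
        }
        // Sorted positions i..=j hold ranks (i+1)..=(j+1); use their mean.
        let shared = (i + j) as f64 / 2.0 + 1.0;
        for &idx in &order[i..=j] {
            ranks[idx] = shared;
        }
        i = j + 1;
    }
    ranks
}

/// Spearman ρ = Pearson correlation of the two rank vectors.
/// Returns None for mismatched lengths, fewer than 2 samples,
/// or a constant input (ρ is undefined there).
fn spearman_rho(truth: &[f32], encoded: &[f32]) -> Option<f64> {
    if truth.len() != encoded.len() || truth.len() < 2 {
        return None;
    }
    let (rx, ry) = (average_ranks(truth), average_ranks(encoded));
    let n = rx.len() as f64;
    let mean_x = rx.iter().sum::<f64>() / n;
    let mean_y = ry.iter().sum::<f64>() / n;
    let (mut cov, mut var_x, mut var_y) = (0.0, 0.0, 0.0);
    for (x, y) in rx.iter().zip(&ry) {
        cov += (x - mean_x) * (y - mean_y);
        var_x += (x - mean_x).powi(2);
        var_y += (y - mean_y).powi(2);
    }
    if var_x == 0.0 || var_y == 0.0 {
        return None;
    }
    Some(cov / (var_x * var_y).sqrt())
}
```

Because ρ depends only on rank order, any strictly monotone transfer curve between truth and encoded values still scores 1.0. That is why the transfer curve, noise floor, and effective bits are listed as separate measures.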
+
+### Why this is definitive
+
+```
+Current state: we ASSUME our encoding preserves topology.
+After calibration: we KNOW, quantified to 4 decimal places.
+
+Expected outcomes:
+  - i8 wins for reranker (symmetric cos range, sign matters)
+  - γ+φ wins for gate-heavy roles (concentrates resolution at zero)
+  - raw u8 CDF is surprisingly good for embedding models (positive-skewed)
+  - hhbgz spiral wins for... we don't know yet. That's why we test.
+
+ICC correction makes EVERY path usable.
+But some paths need less correction = more faithful = preferred.
+```
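
A sketch of why the expected outcomes could differ between a signed linear code and an unsigned CDF code. Both functions are illustrative assumptions (hypothetical names, a plain linear i8 scale, an empirical-CDF u8 code); the actual tables in the repo may be built differently.

```rust
/// Hypothetical signed path: map a cosine in [-1, 1] linearly to [-127, 127].
/// The sign of the cosine survives as the sign of the code.
fn cos_to_i8(cos: f32) -> i8 {
    (cos.clamp(-1.0, 1.0) * 127.0).round() as i8
}

/// Hypothetical CDF path: each cosine becomes its empirical-CDF position
/// within the sample, scaled to 0..=255. Codes follow rank, not magnitude.
fn cos_to_u8_cdf(cosines: &[f32]) -> Vec<u8> {
    let mut sorted = cosines.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let n = sorted.len() as f32;
    cosines
        .iter()
        .map(|&c| {
            // Count of samples less than or equal to c (binary search).
            let at_or_below = sorted.partition_point(|&s| s <= c) as f32;
            ((at_or_below / n) * 255.0).round() as u8
        })
        .collect()
}
```

On a positive-skewed sample, for example cosines all in [0.6, 0.9], the linear i8 code would use only about 39 of its levels (76..=114), while the CDF code spreads the same sample over the full 0..=255 range. On a sample symmetric around zero, the CDF code no longer carries the sign directly, whereas the i8 code does.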
+
+### Tools ready
+
+```
+rten:                AdaWorldAPI/rten (ONNX runtime, your fork)
+GGUF streaming:      stream_hdr_lens.rs, stream_maverick.rs (HTTP range)
+highheelbgz:         spiral addressing + golden ratio stride
+bgz-tensor:          gamma_phi.rs (GammaProfile, encode/decode)
+LensProfile:         lance-graph-contract/high_heel.rs (ICC DTO)
+LensConfig:          lance-graph-contract/high_heel.rs (6-model registry)
+calibrate_lenses.rs: Spearman ρ + ICC builder harness
+Jina v5 ONNX:        jinaai/jina-embeddings-v5-text-small-text-matching
+Jina v5 GGUF:        same repo, F16.gguf (1.2 GB, streamable)
+```
+
+### Session estimate
+```
+Download: Jina v5 ONNX (2.4 GB) + GGUF (1.2 GB) = 3.6 GB (fits in 18 GB free)
+Compute:  rten inference on 1000 texts ≈ 5 min
+          CLAM + 4 GGUF encoding paths ≈ 10 min
+          ICC profiles ≈ 1 min
+Total:    ~20 min for Jina v5 complete calibration
+
+Repeat for remaining 5 models: ~2 hours
+Full calibration matrix: ~2.5 hours
+```
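
The estimate above as a quick arithmetic check. The figures are copied from the notes; the function and constant names are hypothetical.

```rust
/// Sum the per-step minutes of an itemized estimate.
fn itemized_minutes(steps: &[(&str, u32)]) -> u32 {
    steps.iter().map(|&(_, minutes)| minutes).sum()
}

/// Compute steps for one Jina v5 calibration run, as listed in the notes.
const JINA_V5_STEPS: [(&str, u32); 3] = [
    ("rten inference on 1000 texts", 5),
    ("CLAM + encoding paths", 10),
    ("ICC profiles", 1),
];

/// Download sizes in GB (ONNX, GGUF) and the free disk space quoted in the notes.
const DOWNLOAD_GB: [f32; 2] = [2.4, 1.2];
const FREE_DISK_GB: f32 = 18.0;

/// True if the combined download fits in the quoted free space.
fn download_fits() -> bool {
    DOWNLOAD_GB.iter().sum::<f32>() <= FREE_DISK_GB
}
```

The itemized compute steps sum to 16 min. The notes round the Jina v5 run up to ~20 min, and the remaining ~4 min is not itemized.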