| 1 | +# PR-X12 Tensor Container Expansion Capstone |
| 2 | + |
| 3 | +## Purpose |
| 4 | + |
| 5 | +Capture the capstone insight: the PR-X12 codec line is more than an x265 replacement and more than a future x266 / 3DGS scene codec. |
| 6 | + |
| 7 | +It also expands naturally into a universal compressed tensor substrate for: |
| 8 | + |
| 9 | +```text |
| 10 | +GGUF |
| 11 | +safetensors |
| 12 | +Lance / Arrow tensor chunks |
| 13 | +KV cache streams |
| 14 | +gradient streams |
| 15 | +3DGS scene anchors |
| 16 | +model-weight distribution |
| 17 | +``` |
| 18 | + |
| 19 | +This document connects the existing perspective docs: |
| 20 | + |
| 21 | +```text |
| 22 | +.claude/knowledge/pr-x12-x265-blasgraph-gemm.md |
| 23 | +.claude/knowledge/pr-x12-x266-3dgs-spacetime-upscaling.md |
| 24 | +.claude/knowledge/pr-x12-gguf-llm-weights-encoding.md |
| 25 | +.claude/knowledge/pr-x12-anti-neural-lookup-inversion.md |
| 26 | +``` |
| 27 | + |
| 28 | +## Capstone thesis |
| 29 | + |
| 30 | +```text |
| 31 | +PR-X12 is not a video codec. |
| 32 | +PR-X12 is a hierarchical block grammar for structured tensors. |
| 33 | +``` |
| 34 | + |
| 35 | +The video path is the first consumer: |
| 36 | + |
| 37 | +```text |
| 38 | +x265 / HEVC |
| 39 | + -> CTU blocks |
| 40 | + -> Skip / Merge / Delta / Escape |
| 41 | + -> BLAS/GEMM inner loops |
| 42 | + -> PR-X12 |
| 43 | +``` |
| 44 | + |
| 45 | +The 3DGS path is the second consumer: |
| 46 | + |
| 47 | +```text |
| 48 | +x266-style scene codec |
| 49 | + -> scene anchor |
| 50 | + -> EWA splat basis |
| 51 | + -> deterministic space-time rendering |
| 52 | + -> codec-native upscaling |
| 53 | +``` |
| 54 | + |
| 55 | +The model-weight path is the third consumer: |
| 56 | + |
| 57 | +```text |
| 58 | +GGUF / safetensors |
| 59 | + -> tensor CTUs |
| 60 | + -> activation-aware RDO |
| 61 | + -> basin codebooks |
| 62 | + -> decode-during-GEMM |
| 63 | +``` |
| 64 | + |
| 65 | +The datalake path is the fourth consumer: |
| 66 | + |
| 67 | +```text |
| 68 | +Lance / Arrow tensor chunks |
| 69 | + -> HHTL fragment traversal |
| 70 | + -> certified skip / refine / hydrate |
| 71 | + -> exact rows, vectors, weights, or graph edges only when needed |
| 72 | +``` |
| 73 | + |
| 74 | +## General grammar |
| 75 | + |
| 76 | +PR-X12 gives every structured payload this shape: |
| 77 | + |
| 78 | +```text |
| 79 | +hierarchical block |
| 80 | + -> mode: Skip / Merge / Delta / Escape |
| 81 | + -> basin or codebook pointer |
| 82 | + -> residual or bypass payload |
| 83 | + -> optional inter-tier reference |
| 84 | + -> entropy-coded tail |
| 85 | + -> certified decode / reconstruction path |
| 86 | +``` |
| 87 | + |
| 88 | +The payload differs, but the grammar remains stable. |
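The grammar above can be sketched as a small data structure. This is a minimal illustration, not the PR-X12 bitstream: the names `Mode`, `Block`, `validate`, and `count_modes`, the field layout, and the per-mode invariants (borrowed loosely from HEVC-style Skip / Merge semantics) are all assumptions. The entropy-coded tail and the certified decode path are left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    """The four block modes named in the grammar."""
    SKIP = 0    # assumed: reuse the prediction unchanged, no payload
    MERGE = 1   # assumed: inherit parameters from a neighbouring block
    DELTA = 2   # assumed: codebook entry plus a coded residual
    ESCAPE = 3  # assumed: bypass, store the raw payload


@dataclass(frozen=True)
class Block:
    """One node of the hierarchical block grammar (hypothetical layout)."""
    mode: Mode
    codebook_ptr: Optional[int] = None    # basin or codebook pointer
    payload: bytes = b""                  # residual or bypass payload
    inter_tier_ref: Optional[int] = None  # optional inter-tier reference
    children: tuple = ()                  # sub-blocks, giving the hierarchy


def validate(block: Block) -> None:
    """Check the per-mode invariants assumed in this sketch, recursively."""
    if block.mode is Mode.SKIP and block.payload:
        raise ValueError("Skip blocks carry no payload")
    if block.mode is Mode.DELTA and block.codebook_ptr is None:
        raise ValueError("Delta blocks need a codebook pointer")
    if block.mode is Mode.ESCAPE and not block.payload:
        raise ValueError("Escape blocks must carry a bypass payload")
    for child in block.children:
        validate(child)


def count_modes(block: Block) -> dict:
    """Count how often each mode occurs in the block tree."""
    counts = {m: 0 for m in Mode}
    stack = [block]
    while stack:
        node = stack.pop()
        counts[node.mode] += 1
        stack.extend(node.children)
    return counts
```

The same `Block` type would describe a video CTU, a splat block, or a tensor CTU; only the interpretation of `payload` and `codebook_ptr` changes per consumer.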
| 89 | + |
| 90 | +## Payload mapping |
| 91 | + |
| 92 | +| Payload | Block | Basis / kernel | RDO distortion | Decode consumer | |
| 93 | +|---|---|---|---|---| |
| 94 | +| Video | CTU / CU | DCT / SSD / deblock | PSNR / rate-distortion | frame decoder | |
| 95 | +| 3DGS scene | splat block / scene anchor | EWA splat basis | raster error / view error | scene renderer | |
| 96 | +| GGUF weights | tensor CTU | GEMM / codebook lookup | activation-weighted error | inference matmul | |
| 97 | +| safetensors | raw tensor block | GEMM / quant decode | task or tensor error | training/inference loader | |
| 98 | +| KV cache | temporal tensor block | merge/delta stream | attention-logit error | transformer runtime | |
| 99 | +| gradients | shard / bucket | delta / sketch / codebook | optimizer-step error | distributed training | |
| 100 | +| Lance tensor chunks | fragment / row group | HHTL / field kernel | query confidence error | datalake executor | |
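To make the "activation-weighted error" entry concrete, here is one plausible reading, stated as an assumption: each squared weight error is scaled by the mean squared activation of its input channel, and mode choice minimizes the usual Lagrangian cost J = D + λR. The function names and the candidate tuple layout are hypothetical.

```python
def activation_weighted_error(w, w_hat, act_energy):
    """Squared weight error, each input column scaled by its activation energy.

    w, w_hat:    lists of rows (original and reconstructed weights)
    act_energy:  one value per input column, e.g. mean squared activation
                 (an assumed calibration statistic)
    """
    total = 0.0
    for row, row_hat in zip(w, w_hat):
        for wj, wj_hat, a in zip(row, row_hat, act_energy):
            total += a * (wj - wj_hat) ** 2
    return total


def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def pick_mode(candidates, lam):
    """Return the name of the cheapest candidate.

    candidates: iterable of (name, distortion, rate_bits) tuples.
    """
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))[0]
```

With a small λ the encoder spends bits to lower distortion; with a large λ it prefers cheap modes such as Skip. Swapping `activation_weighted_error` for SSD or a view error gives the other rows of the table without touching `pick_mode`.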
| 101 | + |
| 102 | +## GGUF path |
| 103 | + |
| 104 | +GGUF is the deployment-facing path. |
| 105 | + |
| 106 | +Recommended compatibility strategy: |
| 107 | + |
| 108 | +```text |
| 109 | +GGUF v3/vNext |
| 110 | + -> add Q_PRX12 quantization type |
| 111 | + -> keep GGUF metadata / tokenizer / architecture fields |
| 112 | + -> store PR-X12 encoded tensor blocks as the tensor payload |
| 113 | +``` |
| 114 | + |
| 115 | +Advantages: |
| 116 | + |
| 117 | +```text |
| 118 | +llama.cpp ecosystem can adopt incrementally |
| 119 | +Ollama / LM Studio / Open WebUI can find models through existing GGUF discovery once their runtime supports the new type |
| 120 | +model distributors can ship one familiar container |
| 121 | +PR-X12 becomes a quantization variant, not a format war |
| 122 | +``` |
| 123 | + |
| 124 | +The codec layer should not become GGUF-specific. GGUF is an adapter. |
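The adapter boundary can be sketched as follows. Everything here is hypothetical: `TensorSource`, `DictSource`, `ESCAPE_TAG`, and the one-byte tag are illustration only, and the "codec" is a trivial Escape-only placeholder. The point is the dependency direction: the core sees named byte buffers, never GGUF structures.

```python
from typing import Dict, Iterable, Protocol, Tuple


class TensorSource(Protocol):
    """What the codec core needs from any container (hypothetical interface)."""

    def named_tensors(self) -> Iterable[Tuple[str, bytes]]:
        ...


class DictSource:
    """Stand-in adapter; a GGUF or safetensors reader would expose the same method."""

    def __init__(self, tensors: Dict[str, bytes]):
        self._tensors = tensors

    def named_tensors(self) -> Iterable[Tuple[str, bytes]]:
        return self._tensors.items()


ESCAPE_TAG = b"\x03"  # hypothetical one-byte mode tag, not a real PR-X12 value


def encode_all(source: TensorSource) -> Dict[str, bytes]:
    """Codec-core entry point: it knows nothing about the container format."""
    return {name: ESCAPE_TAG + raw for name, raw in source.named_tensors()}


def decode_block(block: bytes) -> bytes:
    """Inverse of the placeholder encoder above."""
    if block[:1] != ESCAPE_TAG:
        raise ValueError("unknown mode tag")
    return block[1:]
```

A GGUF adapter and a safetensors adapter would each implement `named_tensors` and then reuse `encode_all` unchanged.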
| 125 | + |
| 126 | +## Safetensors path |
| 127 | + |
| 128 | +Safetensors is the training/checkpoint-facing path. |
| 129 | + |
| 130 | +Recommended strategies: |
| 131 | + |
| 132 | +```text |
| 133 | +1. sidecar mode |
| 134 | + model.safetensors |
| 135 | + model.prx12.index |
| 136 | + model.prx12.blocks |
| 137 | + |
| 138 | +2. transcode mode |
| 139 | + model.safetensors |
| 140 | + -> |
| 141 | + model.prx12.safetensorpack |
| 142 | + |
| 143 | +3. Lance mode |
| 144 | + safetensors shards |
| 145 | + -> |
| 146 | + Lance / Arrow tensor chunks |
| 147 | + -> |
| 148 | + PR-X12 HHTL block encoding |
| 149 | +``` |
| 150 | + |
| 151 | +Safetensors should be treated as the clean source format. PR-X12 supplies compression, hierarchy, and decode-during-GEMM behavior. |
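Sidecar mode could start from the safetensors header alone. The sketch below relies on the published safetensors layout (an 8-byte little-endian header length, then a JSON header with `dtype`, `shape`, and `data_offsets` per tensor). The sidecar index layout, the `block_bytes` parameter, and the `src_begin` / `src_end` keys are assumptions; the source does not specify what `model.prx12.index` contains.

```python
import json
import struct


def read_safetensors_header(buf: bytes) -> dict:
    """Parse the JSON header that precedes the tensor data."""
    (header_len,) = struct.unpack_from("<Q", buf, 0)
    header = json.loads(buf[8:8 + header_len].decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return header


def build_sidecar_index(header: dict, block_bytes: int = 4096) -> dict:
    """Split each tensor's byte range into fixed-size blocks (hypothetical layout)."""
    index = {}
    for name, info in header.items():
        begin, end = info["data_offsets"]
        blocks = [
            {"src_begin": off, "src_end": min(off + block_bytes, end)}
            for off in range(begin, end, block_bytes)
        ]
        index[name] = {
            "dtype": info["dtype"],
            "shape": info["shape"],
            "blocks": blocks,
        }
    return index


def make_tiny_safetensors() -> bytes:
    """Build a minimal in-memory file: one 2x3 F16 tensor of zeros."""
    data = bytes(2 * 3 * 2)  # 6 elements, 2 bytes each
    header = {"w": {"dtype": "F16", "shape": [2, 3], "data_offsets": [0, 12]}}
    raw = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(raw)) + raw + data
```

Because the index only records byte ranges into the untouched `model.safetensors`, the original file stays the clean source and the block payloads can be regenerated at any time.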
| 152 | + |
| 153 | +## Lance / Arrow tensor chunk path |
| 154 | + |
| 155 | +This is the most native path for the AdaWorld stack. |
| 156 | + |
| 157 | +```text |
| 158 | +checkpoint shard or GGUF tensor |
| 159 | + -> |
| 160 | +Arrow tensor chunk metadata |
| 161 | + -> |
| 162 | +Lance block store |
| 163 | + -> |
| 164 | +PR-X12 encoded payload |
| 165 | + -> |
| 166 | +HHTL traversal and decode scheduling |
| 167 | +``` |
| 168 | + |
| 169 | +This enables: |
| 170 | + |
| 171 | +```text |
| 172 | +partial model loading |
| 173 | +layer-family codebook sharing |
| 174 | +tensor-block search |
| 175 | +activation-aware block retention |
| 176 | +streaming decode-during-GEMM |
| 177 | +model-datalake queries |
| 178 | +cross-model weight comparison |
| 179 | +``` |
| 180 | + |
| 181 | +## Decode-during-GEMM |
| 182 | + |
| 183 | +The most important runtime idea from the GGUF doc is preserved here: |
| 184 | + |
| 185 | +```text |
| 186 | +encoded tensor block |
| 187 | + -> decode a small cache-resident window |
| 188 | + -> immediately consume it in GEMM |
| 189 | + -> discard decoded window |
| 190 | +``` |
| 191 | + |
| 192 | +This avoids materializing a full-tensor dequantization buffer; scratch memory is bounded by the decoded window, not by the tensor. |
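The decode / consume / discard loop can be illustrated with a matrix-vector product in pure Python. This is a sketch under stated assumptions: rows are stored as codebook indices plus one scale per row, the window is a fixed number of columns, and a single scratch list stands in for a cache-resident buffer. None of these choices come from the PR-X12 docs.

```python
def decode_window(codes, codebook, scale, scratch):
    """Decode one window of codebook indices into the reusable scratch buffer."""
    for i, code in enumerate(codes):
        scratch[i] = codebook[code] * scale
    return len(codes)


def gemv_decode_during(encoded_rows, codebook, x, window=4):
    """Compute y = W @ x while only ever holding `window` decoded weights."""
    scratch = [0.0] * window  # the only decoded storage, reused for every window
    y = []
    for codes, scale in encoded_rows:
        acc = 0.0
        for start in range(0, len(codes), window):
            n = decode_window(codes[start:start + window], codebook, scale, scratch)
            for k in range(n):
                acc += scratch[k] * x[start + k]  # consume immediately
        y.append(acc)
    return y


def gemv_full_dequant(encoded_rows, codebook, x):
    """Reference path: dequantize a whole row first, then multiply."""
    y = []
    for codes, scale in encoded_rows:
        row = [codebook[c] * scale for c in codes]  # full-row buffer
        y.append(sum(w * xi for w, xi in zip(row, x)))
    return y
```

Both paths give the same result; the difference is that `gemv_decode_during` never holds more than `window` decoded values. A real kernel would fuse the decode into tiled GEMM and size the window to the cache, which is what the microbench in the implementation trajectory would measure.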
| 193 | + |
| 194 | +Target behavior: |
| 195 | + |
| 196 | +```text |
| 197 | +storage smaller than GGUF Q4_K_M |
| 198 | +perplexity near Q4_K_M / AWQ parity |
| 199 | +scratch memory near cache-window size |
| 200 | +GEMM overhead within a strict latency budget |
| 201 | +``` |
| 202 | + |
| 203 | +## Anti-neural lookup inversion |
| 204 | + |
| 205 | +The anti-neural doc supplies the runtime discipline: |
| 206 | + |
| 207 | +```text |
| 208 | +NNs may train tables. |
| 209 | +NNs must not sit in the codec hot loop. |
| 210 | +``` |
| 211 | + |
| 212 | +The runtime prefers: |
| 213 | + |
| 214 | +```text |
| 215 | +k-means basin codebook |
| 216 | +frozen lookup table |
| 217 | +Gaussian-tail rANS |
| 218 | +DCT / EWA / BLAS basis |
| 219 | +HHTL traversal |
| 220 | +``` |
| 221 | + |
| 222 | +Optional neural enhancement belongs above the deterministic base layer, not inside it. |
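The split between offline table training and a lookup-only hot loop can be shown with a scalar k-means codebook. This is a minimal sketch: the 1-D Lloyd iteration, the quantile initialization, and the function names are assumptions, and k-means stands in for whatever trainer (neural or not) produces the table.

```python
def train_codebook(samples, k, iters=20):
    """Offline step: 1-D Lloyd's k-means with deterministic quantile init."""
    s = sorted(samples)
    centers = [s[(2 * i + 1) * len(s) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in s:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            buckets[nearest].append(v)
        # Keep the old center when a bucket ends up empty.
        centers = [
            sum(b) / len(b) if b else centers[j] for j, b in enumerate(buckets)
        ]
    return tuple(centers)  # frozen: the runtime never updates it


def encode(values, codebook):
    """Encoder side: nearest-center search (not part of the decode hot loop)."""
    return [
        min(range(len(codebook)), key=lambda c: abs(v - codebook[c]))
        for v in values
    ]


def decode(codes, codebook):
    """Runtime hot loop: a plain table lookup, no model evaluation."""
    return [codebook[c] for c in codes]
```

Only `decode` runs at inference time, and it is a deterministic index into a frozen table, so results do not depend on which process trained the table.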
| 223 | + |
| 224 | +## Relationship to 3DGS plans |
| 225 | + |
| 226 | +The 3DGS work is not separate from tensor compression. |
| 227 | + |
| 228 | +The common shape is: |
| 229 | + |
| 230 | +```text |
| 231 | +3DGS splat block |
| 232 | + position / covariance / color / opacity |
| 233 | + -> EWA basis |
| 234 | + -> render reconstruction |
| 235 | + |
| 236 | +LLM tensor block |
| 237 | + weight / scale / residual / codebook |
| 238 | + -> GEMM basis |
| 239 | + -> logit reconstruction |
| 240 | +``` |
| 241 | + |
| 242 | +Both are basis-selected reconstruction from compressed blocks. |
| 243 | + |
| 244 | +## Proposed follow-up docs |
| 245 | + |
| 246 | +Potential future documents: |
| 247 | + |
| 248 | +```text |
| 249 | +PR-X12-GGUF-Q_PRX12-format-sketch.md |
| 250 | +PR-X12-safetensors-sidecar-format-sketch.md |
| 251 | +PR-X12-Lance-tensor-chunk-layout.md |
| 252 | +PR-X12-decode-during-GEMM-benchmark-plan.md |
| 253 | +PR-X12-activation-aware-RDO-for-weights.md |
| 254 | +``` |
| 255 | + |
| 256 | +## Implementation trajectory |
| 257 | + |
| 258 | +```text |
| 259 | +Phase 0: preserve existing PR-X12 docs and add this capstone |
| 260 | +Phase 1: prototype tensor block encoder over small f16 matrix |
| 261 | +Phase 2: add activation-weighted RDO input |
| 262 | +Phase 3: add decode-during-GEMM microbench |
| 263 | +Phase 4: transcode one small safetensors tensor into PR-X12 block payload |
| 264 | +Phase 5: wrap one GGUF tensor as Q_PRX12-like experimental payload |
| 265 | +Phase 6: Lance/Arrow tensor chunk storage and HHTL traversal |
| 266 | +``` |
| 267 | + |
| 268 | +## Acceptance criteria |
| 269 | + |
| 270 | +- PR-X12 remains domain-neutral. |
| 271 | +- GGUF and safetensors are adapters, not special cases in the codec core. |
| 272 | +- Decode-during-GEMM remains a first-class benchmark target. |
| 273 | +- The anti-neural lookup discipline remains intact. |
| 274 | +- 3DGS scene anchors and LLM tensor blocks are documented as sibling consumers of the same grammar. |
| 275 | + |
| 276 | +## Wall sentence |
| 277 | + |
| 278 | +```text |
| 279 | +PR-X12 starts as x265-through-BLAS, grows into x266-through-3DGS, and then becomes a universal compressed tensor substrate for model weights, scene fields, datalakes, and streaming inference. |
| 280 | +``` |