Commit acc4837

Add PR-X12 tensor container expansion capstone
1 parent 3daa4d7 commit acc4837

1 file changed

Lines changed: 280 additions & 0 deletions

@@ -0,0 +1,280 @@
# PR-X12 Tensor Container Expansion Capstone

## Purpose

Capture the capstone epiphany that the PR-X12 codec line is not only an x265 replacement and not only a future x266 / 3DGS scene codec.

It also expands naturally into a universal compressed tensor substrate for:

```text
GGUF
safetensors
Lance / Arrow tensor chunks
KV cache streams
gradient streams
3DGS scene anchors
model-weight distribution
```

This document connects the existing perspective docs:

```text
.claude/knowledge/pr-x12-x265-blasgraph-gemm.md
.claude/knowledge/pr-x12-x266-3dgs-spacetime-upscaling.md
.claude/knowledge/pr-x12-gguf-llm-weights-encoding.md
.claude/knowledge/pr-x12-anti-neural-lookup-inversion.md
```

## Capstone thesis

```text
PR-X12 is not a video codec.
PR-X12 is a hierarchical block grammar for structured tensors.
```

The video path is the first consumer:

```text
x265 / HEVC
-> CTU blocks
-> Skip / Merge / Delta / Escape
-> BLAS/GEMM inner loops
-> PR-X12
```

The 3DGS path is the second consumer:

```text
x266-style scene codec
-> scene anchor
-> EWA splat basis
-> deterministic space-time rendering
-> codec-native upscaling
```

The model-weight path is the third consumer:

```text
GGUF / safetensors
-> tensor CTUs
-> activation-aware RDO
-> basin codebooks
-> decode-during-GEMM
```

The datalake path is the fourth consumer:

```text
Lance / Arrow tensor chunks
-> HHTL fragment traversal
-> certified skip / refine / hydrate
-> exact rows, vectors, weights, or graph edges only when needed
```

## General grammar

PR-X12 gives every structured payload this shape:

```text
hierarchical block
-> mode: Skip / Merge / Delta / Escape
-> basin or codebook pointer
-> residual or bypass payload
-> optional inter-tier reference
-> entropy-coded tail
-> certified decode / reconstruction path
```

The payload differs, but the grammar remains stable.
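
The grammar above can be pictured as a small recursive record. The following is a minimal sketch, not a format definition: the names `Mode`, `Block`, `validate`, and `mode_histogram` are hypothetical, and the per-mode invariants (which modes carry a payload or a basin pointer) are assumptions borrowed from the usual video-codec meaning of Skip / Merge / Delta / Escape. The entropy-coded tail and the certified decode path are left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Mode(Enum):
    SKIP = 0    # assumed: reconstruct from the predictor alone, no payload
    MERGE = 1   # assumed: reuse a basin / codebook pointer, no residual
    DELTA = 2   # assumed: basin pointer plus a coded residual
    ESCAPE = 3  # assumed: raw bypass payload


@dataclass(frozen=True)
class Block:
    mode: Mode
    basin_ptr: Optional[int] = None      # basin or codebook pointer
    payload: bytes = b""                 # residual or bypass payload
    tier_ref: Optional[int] = None       # optional inter-tier reference
    children: Tuple["Block", ...] = ()   # the hierarchy


def validate(block: Block) -> None:
    """Check the assumed per-mode invariants on a block and its children."""
    if block.mode in (Mode.SKIP, Mode.MERGE) and block.payload:
        raise ValueError(f"{block.mode.name} must not carry a payload")
    if block.mode in (Mode.DELTA, Mode.ESCAPE) and not block.payload:
        raise ValueError(f"{block.mode.name} needs a payload")
    if block.mode in (Mode.MERGE, Mode.DELTA) and block.basin_ptr is None:
        raise ValueError(f"{block.mode.name} needs a basin pointer")
    for child in block.children:
        validate(child)


def mode_histogram(block: Block) -> dict:
    """Count how often each mode occurs in the block tree."""
    counts = {mode: 0 for mode in Mode}
    stack = [block]
    while stack:
        current = stack.pop()
        counts[current.mode] += 1
        stack.extend(current.children)
    return counts
```

The same `Block` type would describe a video CTU, a splat block, or a tensor CTU; only the interpretation of `basin_ptr` and `payload` changes per adapter.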

## Payload mapping

| Payload | Block | Basis / kernel | RDO distortion | Decode consumer |
|---|---|---|---|---|
| Video | CTU / CU | DCT / SSD / deblock | PSNR / rate-distortion | frame decoder |
| 3DGS scene | splat block / scene anchor | EWA splat basis | raster error / view error | scene renderer |
| GGUF weights | tensor CTU | GEMM / codebook lookup | activation-weighted error | inference matmul |
| safetensors | raw tensor block | GEMM / quant decode | task or tensor error | training/inference loader |
| KV cache | temporal tensor block | merge/delta stream | attention-logit error | transformer runtime |
| gradients | shard / bucket | delta / sketch / codebook | optimizer-step error | distributed training |
| Lance tensor chunks | fragment / row group | HHTL / field kernel | query confidence error | datalake executor |
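
The "RDO distortion" column only changes the distortion term of a standard rate-distortion decision, J = D + λR. As a rough illustration for the GGUF row, the sketch below picks a mode for one weight block using an activation-weighted squared error. The function names, the candidate tuples, and the per-weight `importance` vector (for example, a mean squared input activation per column) are assumptions made for this example, not part of any existing PR-X12 code.

```python
def weighted_sse(original, recon, importance):
    """Activation-weighted squared error: errors on 'hot' inputs cost more."""
    return sum(a * (w - r) ** 2 for w, r, a in zip(original, recon, importance))


def choose_mode(original, candidates, importance, lam):
    """Return (mode_name, cost) minimising J = D + lam * R.

    candidates: iterable of (mode_name, reconstruction, rate_in_bits).
    """
    best = None
    for name, recon, bits in candidates:
        cost = weighted_sse(original, recon, importance) + lam * bits
        if best is None or cost < best[1]:
            best = (name, cost)
    return best


# Toy block of four weights; the first input channel is assumed to be "hot".
weights = [0.5, -0.25, 0.0, 1.0]
importance = [4.0, 1.0, 0.1, 1.0]
candidates = [
    ("Skip", [0.0, 0.0, 0.0, 0.0], 1),        # predictor only, almost free
    ("Delta", [0.5, -0.25, 0.0, 0.875], 12),  # coarse residual
    ("Escape", weights, 64),                  # raw bypass, exact
]

low_lambda_choice = choose_mode(weights, candidates, importance, lam=0.05)
high_lambda_choice = choose_mode(weights, candidates, importance, lam=1.0)
```

With a small λ the encoder spends bits on the Delta residual; with a large λ the same block collapses to Skip. Swapping `weighted_sse` for a raster, attention-logit, or optimizer-step error would give the other rows of the table under the same decision rule.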

## GGUF path

GGUF is the deployment-facing path.

Recommended compatibility strategy:

```text
GGUF v3/vNext
-> add Q_PRX12 quantization type
-> keep GGUF metadata / tokenizer / architecture fields
-> store PR-X12 encoded tensor blocks as the tensor payload
```

Advantages:

```text
llama.cpp ecosystem can adopt incrementally
Ollama / LM Studio / Open WebUI can load through existing GGUF discovery
model distributors can ship one familiar container
PR-X12 becomes a quantization variant, not a format war
```

The codec layer should not become GGUF-specific. GGUF is an adapter.

## Safetensors path

Safetensors is the training/checkpoint-facing path.

Recommended strategies:

```text
1. sidecar mode
   model.safetensors
   model.prx12.index
   model.prx12.blocks

2. transcode mode
   model.safetensors
   ->
   model.prx12.safetensorpack

3. Lance mode
   safetensors shards
   ->
   Lance / Arrow tensor chunks
   ->
   PR-X12 HHTL block encoding
```

Safetensors should be treated as the clean source format. PR-X12 supplies compression, hierarchy, and decode-during-GEMM behavior.
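
For sidecar mode, the source file can stay untouched because the safetensors layout is simple to index: an 8-byte little-endian header length, a JSON header mapping tensor names to `dtype`, `shape`, and `data_offsets`, then the raw byte buffer. The sketch below reads that header and derives a per-tensor index. The index fields (`file_offset`, `nbytes`, `num_blocks`), the `block_bytes` parameter, and the `demo` helper are hypothetical; the actual `model.prx12.index` layout is not defined in this document.

```python
import json
import os
import struct
import tempfile


def read_safetensors_header(path):
    """Return (header_len, header_dict) from a safetensors file."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return header_len, header


def build_sidecar_index(path, block_bytes=4096):
    """Map each tensor to its byte range and a hypothetical block count."""
    header_len, header = read_safetensors_header(path)
    data_start = 8 + header_len  # data_offsets are relative to this point
    index = {}
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        begin, end = info["data_offsets"]
        nbytes = end - begin
        index[name] = {
            "dtype": info["dtype"],
            "shape": info["shape"],
            "file_offset": data_start + begin,
            "nbytes": nbytes,
            "num_blocks": -(-nbytes // block_bytes),  # ceiling division
        }
    return index


def demo():
    """Write a tiny one-tensor safetensors file and index it."""
    header = {"w": {"dtype": "F16", "shape": [2, 4], "data_offsets": [0, 16]}}
    raw = json.dumps(header).encode("utf-8")
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.safetensors")
        with open(path, "wb") as f:
            f.write(struct.pack("<Q", len(raw)) + raw + bytes(16))
        return build_sidecar_index(path, block_bytes=8)
```

An index built this way never rewrites the checkpoint, which fits the "clean source format" role: the encoded blocks live next to the file and can be regenerated from it.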
152+
153+
## Lance / Arrow tensor chunk path
154+
155+
This is the most native path for the AdaWorld stack.
156+
157+
```text
158+
checkpoint shard or GGUF tensor
159+
->
160+
Arrow tensor chunk metadata
161+
->
162+
Lance block store
163+
->
164+
PR-X12 encoded payload
165+
->
166+
HHTL traversal and decode scheduling
167+
```
168+
169+
This enables:
170+
171+
```text
172+
partial model loading
173+
layer-family codebook sharing
174+
tensor-block search
175+
activation-aware block retention
176+
streaming decode-during-GEMM
177+
model-datalake queries
178+
cross-model weight comparison
179+
```
180+
181+
## Decode-during-GEMM

The most important runtime idea from the GGUF doc is preserved here:

```text
encoded tensor block
-> decode a small cache-resident window
-> immediately consume it in GEMM
-> discard decoded window
```

This avoids full-tensor dequantization buffers.
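
The loop structure can be sketched in plain Python. This is an illustration only, assuming a codebook-indexed weight matrix with one scale per row; the function names, the `tile` parameter, and the storage layout are hypothetical, and a real kernel would be a SIMD or GPU GEMM rather than a Python loop. The point is that the streaming version only ever holds `tile` decoded weights, while the reference version materialises the whole matrix first.

```python
def matvec_decode_during_gemm(indices, codebook, scales, x, rows, cols, tile=4):
    """y = W @ x with W[r][c] = scales[r] * codebook[indices[r * cols + c]].

    Weights are decoded one small window at a time and consumed immediately.
    """
    y = [0.0] * rows
    for r in range(rows):
        acc = 0.0
        for c0 in range(0, cols, tile):
            c1 = min(c0 + tile, cols)
            # Decode a small window (at most `tile` weights).
            window = [codebook[i] for i in indices[r * cols + c0 : r * cols + c1]]
            # Consume it right away in the dot product.
            acc += sum(w * xv for w, xv in zip(window, x[c0:c1]))
            # `window` is dropped here; no full-tensor buffer exists.
        y[r] = scales[r] * acc
    return y


def matvec_full_dequant(indices, codebook, scales, x, rows, cols):
    """Reference path: dequantise the whole matrix, then multiply."""
    dense = [
        [scales[r] * codebook[indices[r * cols + c]] for c in range(cols)]
        for r in range(rows)
    ]
    return [sum(w * xv for w, xv in zip(row, x)) for row in dense]


# Toy 2x6 matrix stored as codebook indices, one byte per weight.
codebook = (-1.0, -0.25, 0.25, 1.0)
indices = bytes([0, 1, 2, 3, 0, 1,
                 3, 3, 2, 2, 1, 1])
scales = [0.5, 2.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

y_stream = matvec_decode_during_gemm(indices, codebook, scales, x, rows=2, cols=6)
y_dense = matvec_full_dequant(indices, codebook, scales, x, rows=2, cols=6)
```

Both paths produce the same result on this toy input; the difference a benchmark would measure is scratch memory and the decode overhead inside the inner loop.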

Target behavior:

```text
storage smaller than GGUF Q4_K_M
perplexity near Q4_K_M / AWQ parity
scratch memory near cache-window size
GEMM overhead within a strict latency budget
```

## Anti-neural lookup inversion

The anti-neural doc supplies the runtime discipline:

```text
NNs may train tables.
NNs must not sit in the codec hot loop.
```

The runtime prefers:

```text
k-means basin codebook
frozen lookup table
Gaussian-tail rANS
DCT / EWA / BLAS basis
HHTL traversal
```

Optional neural enhancement belongs above the deterministic base layer, not inside it.
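
The split between offline training and a frozen runtime table can be shown with a scalar k-means codebook. This is a minimal sketch under stated assumptions: one-dimensional weights, Lloyd's algorithm with a deterministic initialisation, and one index byte per weight. The names `train_codebook`, `encode`, and `decode` are hypothetical; the real basin codebooks may be vector-valued and entropy-coded.

```python
def train_codebook(values, k, iters=20):
    """Offline step: 1-D Lloyd's k-means, returned as a frozen (immutable) tuple."""
    ordered = sorted(values)
    # Deterministic init: evenly spaced order statistics.
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        sums = [0.0] * k
        counts = [0] * k
        for v in values:
            j = min(range(k), key=lambda i: abs(v - centers[i]))
            sums[j] += v
            counts[j] += 1
        # Keep the old center if a cluster ends up empty.
        centers = [sums[j] / counts[j] if counts[j] else centers[j] for j in range(k)]
    return tuple(sorted(centers))


def encode(values, codebook):
    """Encoder side: nearest-center index per value (k must be <= 256)."""
    return bytes(
        min(range(len(codebook)), key=lambda i: abs(v - codebook[i]))
        for v in values
    )


def decode(indices, codebook):
    """Runtime hot loop: a pure table lookup, no model evaluation."""
    return [codebook[i] for i in indices]


weights = [-1.25, -0.75, 0.75, 1.25, -1.0, 1.0]
codebook = train_codebook(weights, k=2)
packed = encode(weights, codebook)
restored = decode(packed, codebook)
```

Whatever produces the table (k-means here, possibly a neural trainer elsewhere), `decode` stays a deterministic lookup, which is the property the discipline above is protecting.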
223+
224+
## Relationship to 3DGS plans
225+
226+
The 3DGS work is not separate from tensor compression.
227+
228+
The common shape is:
229+
230+
```text
231+
3DGS splat block
232+
position / covariance / color / opacity
233+
-> EWA basis
234+
-> render reconstruction
235+
236+
LLM tensor block
237+
weight / scale / residual / codebook
238+
-> GEMM basis
239+
-> logit reconstruction
240+
```
241+
242+
Both are basis-selected reconstruction from compressed blocks.
243+
244+
## Proposed follow-up docs

Potential future documents:

```text
PR-X12-GGUF-Q_PRX12-format-sketch.md
PR-X12-safetensors-sidecar-format-sketch.md
PR-X12-Lance-tensor-chunk-layout.md
PR-X12-decode-during-GEMM-benchmark-plan.md
PR-X12-activation-aware-RDO-for-weights.md
```

## Implementation trajectory

```text
Phase 0: preserve existing PR-X12 docs and add this capstone
Phase 1: prototype tensor block encoder over small f16 matrix
Phase 2: add activation-weighted RDO input
Phase 3: add decode-during-GEMM microbench
Phase 4: transcode one small safetensors tensor into PR-X12 block payload
Phase 5: wrap one GGUF tensor as Q_PRX12-like experimental payload
Phase 6: Lance/Arrow tensor chunk storage and HHTL traversal
```

## Acceptance criteria

- PR-X12 remains domain-neutral.
- GGUF and safetensors are adapters, not special cases in the codec core.
- Decode-during-GEMM remains a first-class benchmark target.
- The anti-neural lookup discipline remains intact.
- 3DGS scene anchors and LLM tensor blocks are documented as sibling consumers of the same grammar.

## Wall sentence

```text
PR-X12 starts as x265-through-BLAS, grows into x266-through-3DGS, and then becomes a universal compressed tensor substrate for model weights, scene fields, datalakes, and streaming inference.
```
