|
51 | 51 | | ADR-021 | **Meta-hygiene**: always grep peer crates before copying manifest patterns (the `[lints] workspace = true` cascade lesson) | **Pinned** | OGAR PR #15 + PR #17/#18 follow-ups | |
52 | 52 | | ADR-022 | **The Firewall** — absolute inner/outer boundary; no serialization in hot path; inner = compile-time HHTL; outer = contract-trait pluggable | **Pinned** | OGAR (this PR); `docs/THE-FIREWALL.md` | |
53 | 53 | | ADR-023 | **IR-as-wire-truth** — the source-language AST is *input dialect*; the canonical `Class`/`Attribute`/`Association`/`EnumDecl`/`ActionDef` IR is *wire truth*. Adapters lift dialects into IR; the IR routes everything (registry key, actor mailbox, Lance version, audit-log dimension) | **Pinned** | OGAR (this PR); `crates/ogar-vocab/`; `bardioc/substrate-b-shadow::EdgeDecoder<E>` (PR #19) | |
| 54 | +| ADR-024 | **Palette256 + HHTL codec** — the substrate's universal compression primitive. HHTL prefix establishes a frame; within the frame, values cluster; clustered values quantize to 256-index palette + const-table lookup. Names an existing primitive (Binary16K perms + bgz-tensor attention + arm-discovery aerial codebook, ρ=0.9973 vs cosine) rather than proposing one | **Pinned** | OGAR (this PR); `MedCare-rs/crates/medcare-analytics/src/{graph_contract.rs,column_mask_bridge.rs}`; `bgz-tensor/examples/compare_stacked_vs_i16.rs`; `lance-graph-arm-discovery` | |
54 | 55 |
|
55 | 56 | ## ADR-001: `State = ActionState` (lifecycle), not domain state, for Rubicon binding |
56 | 57 |
|
@@ -1167,6 +1168,185 @@ lance-graph) before merge. |
1167 | 1168 | - `docs/RDF-OWL-ALIGNMENT.md` §3 (OGAR's position in L1-L5) — the |
1168 | 1169 | IR sits at the AR-pattern lift seam. |
1169 | 1170 |
|
| 1171 | +## ADR-024: Palette256 + HHTL codec — the substrate's universal compression primitive |
| 1172 | + |
| 1173 | +**Status:** Pinned (2026-06-05). Names an existing primitive (three |
| 1174 | +independent deployments + one empirical anchor) rather than proposing |
| 1175 | +one. Companion to ADR-022 (The Firewall — this ADR specifies one of |
| 1176 | +its inner-side primitives) and ADR-023 (IR-as-wire-truth — palette256 |
| 1177 | +is the codec on the IR's wire form). |
| 1178 | + |
| 1179 | +**Context.** The substrate has accumulated three independent |
| 1180 | +palette256 deployments developed for their own domains: |
| 1181 | + |
| 1182 | +- **Security mesh** — `Binary16K = [u64; 256]` in |
| 1183 | + `MedCare-rs/crates/medcare-analytics/src/graph_contract.rs:36` |
| 1184 | + (canonical home). The per-row `_effectiveReaders` bitmap; auth is |
| 1185 | + Hamming-popcount bit-intersection at the inner / hot path |
| 1186 | + (`HEALTHCARE-TRANSCODING.md §3.1`). Wired into production at |
| 1187 | + `MedCare-rs/crates/medcare-analytics/src/column_mask_bridge.rs` → |
| 1188 | + `medcare-server/state.rs:167, 265, 439`. |
| 1189 | +- **Attention** — `bgz-tensor` `WeightPalette::build(…, 256)` + |
| 1190 | + `AttentionTable::build`
| 1191 | + (`crates/bgz-tensor/examples/compare_stacked_vs_i16.rs:90-92`). Replaces dense FP weights with
| 1192 | + 256-index palette + precomputed distance table on the model's hot |
| 1193 | + path. |
| 1194 | +- **Distance** — `lance-graph-arm-discovery` aerial codebook — |
| 1195 | + measured **ρ = 0.9973 vs cosine**. The empirical anchor: palette256 |
| 1196 | + reproduces cosine distance with correlation 0.9973 (i.e. on a
| 1197 | + scale where 1.0 is perfect agreement, palette256 falls ~0.003 short).
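The security-mesh bullet describes auth as a popcount over the bit-intersection of two `Binary16K` values. A minimal sketch of that idea follows, assuming the alias `[u64; 256]` quoted above; the helper names `set_bit`, `intersection_popcount`, and `may_read` are hypothetical and are not taken from `graph_contract.rs`, and the "any shared bit grants access" rule is an illustrative assumption.

```rust
/// Assumption: mirrors the `Binary16K = [u64; 256]` alias named in the ADR
/// (256 words x 64 bits = 16,384 bits).
pub type Binary16K = [u64; 256];

/// Hypothetical helper: set bit `i` (0..16384) in a mask.
pub fn set_bit(mask: &mut Binary16K, i: usize) {
    mask[i / 64] |= 1u64 << (i % 64);
}

/// Hypothetical helper: number of bits set in both masks,
/// i.e. the popcount of the bitwise AND. No heap allocation.
pub fn intersection_popcount(a: &Binary16K, b: &Binary16K) -> u32 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x & y).count_ones())
        .sum()
}

/// Hypothetical auth rule (an assumption, not the production policy):
/// the principal may read the row if the two bitmaps share any bit.
pub fn may_read(row_readers: &Binary16K, principal: &Binary16K) -> bool {
    intersection_popcount(row_readers, principal) > 0
}
```

Both masks live on the stack and the loop touches 256 words, so the check stays allocation-free on the hot path.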
| 1198 | + |
| 1199 | +Cross-domain analysis revealed all three are instances of the *same |
| 1200 | +codec*: HHTL prefix establishes a frame; within the frame, values
| 1201 | +cluster; clustered values quantize to a 256-index palette; decode is |
| 1202 | +a const-table lookup. The runtime side's BindSpace dissolution work |
| 1203 | +(bardioc PR #18 / lance-graph PR #470) hinted at this with the |
| 1204 | +Quintenzirkel qualia codebook ("frozen set + circle-of-fifths |
| 1205 | +progression → 8 B → 1-2 B per row") — same compression strategy, |
| 1206 | +different domain. |
| 1207 | + |
| 1208 | +The proposal in the cross-session conversation (2026-06-05) was to |
| 1209 | +name the primitive explicitly so that:
| 1210 | +1. Future adopters don't reinvent it per domain. |
| 1211 | +2. New adopters report a falsifiable measurement (ρ-vs-reference) |
| 1212 | + at adoption time rather than after the fact. |
| 1213 | +3. The 256-ceiling escape hatches are documented before reviewers |
| 1214 | + ask. |
| 1215 | + |
| 1216 | +**Decision.** **The codec is:** |
| 1217 | + |
| 1218 | +```text |
| 1219 | +HHTL prefix (NiblePath / quadkey / class identity) |
| 1220 | + ↓ establishes spatial / semantic frame |
| 1221 | +within-frame values cluster |
| 1222 | + ↓ quantize to 256-index palette |
| 1223 | + ↓ const-table lookup (compile-time HHTL where possible) |
| 1224 | +1-byte index per element, sub-microsecond decode, zero heap allocation |
| 1225 | +``` |
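The quantize and lookup steps in the diagram above can be sketched as follows. This is an illustration only: the type `Palette256` and its methods are hypothetical names, and the uniform spacing is a stand-in for whatever clustering a real adopter would run on the frame's values.

```rust
/// Hypothetical 256-entry palette of f32 centroids for one frame.
pub struct Palette256 {
    table: [f32; 256],
}

impl Palette256 {
    /// Build a uniform palette over [lo, hi]. A real adopter would
    /// cluster the frame's values; uniform spacing is a placeholder.
    pub fn uniform(lo: f32, hi: f32) -> Self {
        let mut table = [0.0f32; 256];
        for (i, slot) in table.iter_mut().enumerate() {
            *slot = lo + (hi - lo) * (i as f32) / 255.0;
        }
        Self { table }
    }

    /// Encode: index of the nearest palette entry (linear scan),
    /// producing one byte per element.
    pub fn encode(&self, v: f32) -> u8 {
        let mut best = 0usize;
        let mut best_d = f32::INFINITY;
        for (i, &c) in self.table.iter().enumerate() {
            let d = (v - c).abs();
            if d < best_d {
                best_d = d;
                best = i;
            }
        }
        best as u8
    }

    /// Decode: a single table lookup, no allocation.
    pub fn decode(&self, idx: u8) -> f32 {
        self.table[idx as usize]
    }
}
```

Because the index type is `u8` and the table has exactly 256 entries, `decode` cannot index out of bounds.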
| 1226 | + |
| 1227 | +**Adoption checklist** for a new domain: |
| 1228 | +1. **Identify the prefix.** The NiblePath / quadkey / class identity |
| 1229 | + that establishes the frame the values live in. |
| 1230 | +2. **Identify the palette domain.** Which values cluster within the |
| 1231 | + frame? (Closed-keyspace tags, quantized continuous values, |
| 1232 | + enumerated state, etc.) |
| 1233 | +3. **Build the palette + measure ρ-vs-reference.** The reference is |
| 1234 | + the domain's full-precision metric (cosine for embeddings, L2 for |
| 1235 | + coordinates, exact-match for tags). Report ρ at adoption time as |
| 1236 | + the falsifiable property. Target: **ρ ≥ 0.99** to match the |
| 1237 | + arm-discovery anchor. |
| 1238 | +4. **Decode = const-table lookup.** Compile-time HHTL if the palette |
| 1239 | + is static; runtime const-table if the palette is per-frame / |
| 1240 | + per-tile. Either way the decode path is zero-allocation. |
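Step 3 of the checklist asks for a ρ-vs-reference number. The ADR does not say which correlation statistic ρ denotes; the sketch below assumes Pearson correlation between reference distances and palette-decoded distances over the same pairs. The function names `pearson` and `meets_target` are hypothetical.

```rust
/// Pearson correlation of two equal-length samples.
/// Returns None if the lengths differ, there are fewer than two
/// points, or either sample has zero variance.
pub fn pearson(xs: &[f64], ys: &[f64]) -> Option<f64> {
    if xs.len() != ys.len() || xs.len() < 2 {
        return None;
    }
    let n = xs.len() as f64;
    let mx = xs.iter().sum::<f64>() / n;
    let my = ys.iter().sum::<f64>() / n;
    let mut sxy = 0.0f64;
    let mut sxx = 0.0f64;
    let mut syy = 0.0f64;
    for (x, y) in xs.iter().zip(ys) {
        let dx = x - mx;
        let dy = y - my;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }
    if sxx == 0.0 || syy == 0.0 {
        return None;
    }
    Some(sxy / (sxx.sqrt() * syy.sqrt()))
}

/// The adoption target stated in the checklist: rho >= 0.99.
pub fn meets_target(rho: f64) -> bool {
    rho >= 0.99
}
```

An adopter would pass the full-precision distances as `xs` and the distances computed from decoded palette values as `ys`, then record the result alongside the adoption.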
| 1241 | + |
| 1242 | +**The 256-ceiling escape hatches** (documented to avoid the |
| 1243 | +predictable reviewer question): |
| 1244 | + |
| 1245 | +- **Per-tile / per-frame palettes** — the cheapest answer. Different |
| 1246 | + spatial-frame, different 256 entries. Used by Cesium tile codecs; |
| 1247 | + matches the quadkey-prefix discipline. Long-tail OSM tags inside a |
| 1248 | + zoom-21 tile rarely exceed 256. |
| 1249 | +- **Hierarchical palettes** — coarser palette at higher quadkey |
| 1250 | + levels, finer per leaf. Mirrors the standard tile pyramid; the SH |
| 1251 | + L0/L1 vs L2/L3 split in `splat-fit` is the same pattern. |
| 1252 | +- **Palette-64K upgrade** — 2-byte index instead of 1, for hot |
| 1253 | + palettes that genuinely exceed 256 distinct values (rare; reserve |
| 1254 | + for measured cases, not speculation). |
| 1255 | + |
| 1256 | +The escape hatches are part of the primitive, not exceptions to it. |
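The per-tile escape hatch can be pictured as one independent 256-entry palette per quadkey prefix. The sketch below is a hypothetical build-side structure (`TilePalettes` is not a type from any crate named in this ADR); it allocates while interning, and only the decode path is a plain lookup.

```rust
use std::collections::HashMap;

/// Hypothetical per-tile palette store: each quadkey owns its own
/// palette of at most 256 tag strings.
pub struct TilePalettes {
    tiles: HashMap<String, Vec<String>>,
}

impl TilePalettes {
    pub fn new() -> Self {
        Self { tiles: HashMap::new() }
    }

    /// Intern `tag` in the palette of `quadkey`. Returns None when
    /// that tile's palette already holds 256 distinct entries, which
    /// is the signal to reach for another escape hatch.
    pub fn intern(&mut self, quadkey: &str, tag: &str) -> Option<u8> {
        let pal = self.tiles.entry(quadkey.to_string()).or_default();
        if let Some(i) = pal.iter().position(|t| t.as_str() == tag) {
            return Some(i as u8);
        }
        if pal.len() >= 256 {
            return None;
        }
        pal.push(tag.to_string());
        Some((pal.len() - 1) as u8)
    }

    /// Decode a 1-byte index within the frame set by `quadkey`.
    pub fn decode(&self, quadkey: &str, idx: u8) -> Option<&str> {
        self.tiles.get(quadkey)?.get(idx as usize).map(String::as_str)
    }
}
```

The same index byte means different things in different tiles, which is why the prefix has to be resolved before the palette lookup.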
| 1257 | + |
| 1258 | +**Alternatives considered.** |
| 1259 | + |
| 1260 | +- *Continuous distributions that don't cluster* (e.g. timestamps in |
| 1261 | + microseconds, free-form text). Rejected as a counterargument to |
| 1262 | + the codec — these are out-of-domain. For them, use delta encoding |
| 1263 | + or VarInt or a different codec entirely. The codec applies to |
| 1264 | + *clustered* domains; the adoption checklist's step 2 is the filter. |
| 1265 | +- *Domain-specific codecs per domain.* Rejected. Three independent |
| 1266 | + re-derivations of the same primitive (security / attention / |
| 1267 | + distance) are the receipt that the abstraction is real, not the
| 1268 | + receipt that each domain should have its own. ADR-024 reduces |
| 1269 | + per-domain re-derivation. |
| 1270 | +- *Skip the ρ-vs-reference measurement.* Rejected. The arm-discovery |
| 1271 | + ρ = 0.9973 is the existing FINDING-grade stake; new domains |
| 1272 | + reporting at adoption time keeps the empirical floor honest as the |
| 1273 | + primitive spreads. |
| 1274 | + |
| 1275 | +**Consequences.** |
| 1276 | + |
| 1277 | +- **The primitive is named.** Cross-domain reuse is now load-bearing, |
| 1278 | + not coincidental. New domains adopt the codec instead of inventing |
| 1279 | + their own quantization. |
| 1280 | +- **ρ-vs-reference becomes the adoption contract.** Reported once at |
| 1281 | + adoption per domain. The arm-discovery 0.9973 is the existing |
| 1282 | + anchor; new adopters target ≥ 0.99 and document if they fall short. |
| 1283 | +- **Two next-domain adopters are queued** (planned, not yet wired): |
| 1284 | + - **D-OSM-2** (OSM tag palette + tile-local coordinate |
| 1285 | + quantization), per `lance-graph` PR #473
| 1286 | + (`cesium-osm-substrate-v1.md`). Reports ρ-vs-reference on first per-country PBF run per
| 1287 | + the runtime session's §11 follow-up commitment. |
| 1288 | + - **D-SPLAT-4** (SH-aware palette extension on the |
| 1289 | + `Gaussian3D` carrier) — per the splat-native arc. Same codec; SH |
| 1290 | + coefficients are the long-tail-budget challenger. |
| 1291 | +- **The 256-ceiling has three explicit escapes** in the ADR body |
| 1292 | + (per-tile / hierarchical / palette-64K). Reviewers don't need to |
| 1293 | + re-derive the answer. |
| 1294 | +- **Cross-arc reuse argument is sharpened.** The substrate-reuse |
| 1295 | + framing in `docs/RDF-OWL-ALIGNMENT.md §10` (geographic litmus |
| 1296 | + complements anatomical) cashes out as: FMA-bones and OSM-vectors |
| 1297 | + use *the same codec* (palette256 + HHTL prefix), not just the same |
| 1298 | + IR. The §6 callout in `DOMAIN-INSTANCES.md` (queued until
| 1299 | + lance-graph PR #473 lands) will reference ADR-024 as the falsifiable
| 1300 | + property. |
| 1301 | +- **The falsifiable property** that ties the substrate-reuse claim |
| 1302 | + down: *"the same compile-time HHTL prefix + palette256 codec |
| 1303 | + decodes (a) `_effectiveReaders` for row auth, (b) OSM way |
| 1304 | + attributes at zoom-21 tile, and (c) FMA-bone SH coefficients at |
| 1305 | + sub-microsecond per element with zero heap allocation."* If that |
| 1306 | + property holds across all three, the substrate is doing its job. |
| 1307 | + If it fails on one, the substrate is leaking dialect into the codec. |
| 1308 | + |
| 1309 | +**Change policy.** Adding a new palette256 adopter (new domain) is |
| 1310 | +routine — follow the adoption checklist + report ρ-vs-reference. |
| 1311 | +Changing the codec itself (e.g. making palette-64K the default, or
| 1312 | +adding a new escape hatch) is a substrate-wide concern and requires
| 1313 | +consultation with the runtime session. |
| 1314 | + |
| 1315 | +**References.** |
| 1316 | + |
| 1317 | +- `lance-graph/.claude/board/EPIPHANIES.md:28` — FINDING-grade |
| 1318 | + anchor for palette256 + Hamming popcount on `_effectiveReaders`. |
| 1319 | +- `lance-graph/.claude/knowledge/old-stack-capability-parity.md §3.39` |
| 1320 | + — knowledge-doc record of the same primitive. |
| 1321 | +- `MedCare-rs/crates/medcare-analytics/src/graph_contract.rs:36` — |
| 1322 | + `Binary16K = [u64; 256]` canonical home. |
| 1323 | +- `MedCare-rs/crates/medcare-analytics/src/column_mask_bridge.rs` — |
| 1324 | + production wire-up; `redaction_mode_for` (line 128), |
| 1325 | + `column_mask_policy_for_table` (line 165), |
| 1326 | + `build_medcare_column_mask_registry` (line 192). |
| 1327 | +- `MedCare-rs/crates/medcare-server/src/state.rs:167, 265, 439` — |
| 1328 | + F2-E install sites consuming the column-mask registry. |
| 1329 | +- `bgz-tensor/examples/compare_stacked_vs_i16.rs:90-92` — |
| 1330 | + `WeightPalette::build(…, 256)` + `AttentionTable::build`. |
| 1331 | +- `lance-graph-arm-discovery` — aerial codebook with ρ = 0.9973 vs |
| 1332 | + cosine measurement. |
| 1333 | +- ADR-022 (The Firewall) — the inner-side discipline this ADR |
| 1334 | + specifies a primitive for. |
| 1335 | +- ADR-023 (IR-as-wire-truth) — palette256 is the codec on the IR's |
| 1336 | + wire form. |
| 1337 | +- `docs/THE-FIREWALL.md` §3 (the inner/hot side) — palette256 + HHTL |
| 1338 | + is one of its load-bearing primitives. |
| 1339 | +- `docs/HEALTHCARE-TRANSCODING.md §3.1` — palette256 + Hamming |
| 1340 | + popcount on Binary16K named as the inner-side security mesh. |
| 1341 | +- `docs/RDF-OWL-ALIGNMENT.md §10` — the brutal-upgrade sequencing |
| 1342 | + context (Phase 2c geospatial adopts the codec). |
| 1343 | +- `bardioc` PR #18 + `lance-graph` PR #470 — Quintenzirkel qualia |
| 1344 | + codebook (8 B → 1-2 B per row) as the same compression strategy in |
| 1345 | + a different domain. |
| 1346 | +- `lance-graph` PR #473 (forthcoming) `cesium-osm-substrate-v1.md` |
| 1347 | + §11 — runtime-side commitment to a follow-up callout on this ADR |
| 1348 | + once D-OSM-2 / D-SPLAT-4 wire. |
| 1349 | + |
1170 | 1350 | ## Implementation receipts — ADR ↔ commit cross-reference |
1171 | 1351 |
|
1172 | 1352 | > **Added in follow-up addendum (2026-06-05).** Records the implementation |
|