Commit 5e9e55b

docs(adr): add ADR-024 — Palette256 + HHTL codec as universal compression primitive

Names an existing primitive (three independent deployments + one
empirical anchor) rather than proposing one. Companion to ADR-022
(The Firewall) and ADR-023 (IR-as-wire-truth).

# The codec

HHTL prefix (NiblePath / quadkey / class identity)
  ↓ establishes spatial / semantic frame
within-frame values cluster
  ↓ quantize to 256-index palette
  ↓ const-table lookup (compile-time HHTL where possible)
1-byte index per element, sub-µs decode, zero heap allocation

# Three already-deployed instances (verified, not narrative)

- Security mesh — Binary16K = [u64; 256] in
  MedCare-rs/crates/medcare-analytics/src/graph_contract.rs:36;
  Hamming-popcount on _effectiveReaders at the inner / hot path;
  wired into production at column_mask_bridge.rs →
  medcare-server/state.rs:167, 265, 439.
- Attention — bgz-tensor WeightPalette::build(…, 256) +
  AttentionTable::build
  (crates/bgz-tensor/examples/compare_stacked_vs_i16.rs:90-92).
- Distance — lance-graph-arm-discovery aerial codebook, measured
  ρ = 0.9973 vs cosine — the empirical anchor.

The runtime side's BindSpace dissolution work (bardioc PR #18 /
lance-graph PR #470) hinted at the same compression strategy via the
Quintenzirkel qualia codebook (8 B → 1-2 B per row).

# Adoption checklist for new domains

1. Identify the prefix (NiblePath / quadkey / class identity).
2. Identify the palette domain — which values cluster?
3. Build the palette + measure ρ-vs-reference. Target ≥ 0.99 to match
   the arm-discovery anchor.
4. Decode = const-table lookup (zero-allocation hot path).

# The 256-ceiling escape hatches (in the ADR body, not in callouts)

Per runtime-session's "name the escape hatch upfront" tightening so
reviewers don't spend the first session asking about long-tail:

- Per-tile / per-frame palettes — cheapest; different frame,
  different 256 entries.
- Hierarchical palettes — coarser at higher quadkey levels, finer per
  leaf. Mirrors the tile pyramid + SH L0/L1 vs L2/L3 split in
  splat-fit.
- Palette-64K upgrade — 2 bytes/index, for measured cases only (not
  speculation).

# Two next-domain adopters queued

- D-OSM-2 (OSM tag palette + tile-local coordinate quantization) —
  per lance-graph PR #473 (cesium-osm-substrate-v1.md). Reports
  ρ-vs-reference on first per-country PBF run per the runtime
  session's §11 follow-up commitment.
- D-SPLAT-4 (SH-aware palette extension on the Gaussian3D carrier).

# The falsifiable property

"The same compile-time HHTL prefix + palette256 codec decodes
(a) _effectiveReaders for row auth, (b) OSM way attributes at zoom-21
tile, and (c) FMA-bone SH coefficients at sub-microsecond per element
with zero heap allocation."

If that property holds across all three, the substrate is doing its
job. If it fails on one, the substrate is leaking dialect into the
codec.

# Cross-references named in the ADR

Three FINDING-grade anchors (per runtime-session's "name existing
primitive rather than propose one" tightening):

- lance-graph/.claude/board/EPIPHANIES.md:28
- lance-graph/.claude/knowledge/old-stack-capability-parity.md §3.39
- lance-graph-arm-discovery (ρ = 0.9973 vs cosine)

Plus the file:line citations for all three production deployments +
cross-references to ADR-022 / ADR-023 / THE-FIREWALL.md /
HEALTHCARE-TRANSCODING.md / RDF-OWL-ALIGNMENT.md.

PII abort-guard (word-boundary): CLEAN.

https://claude.ai/code/session_01PBTGaPCSnnt6u3pjXpbLwY
1 parent 9c41346 commit 5e9e55b

1 file changed

Lines changed: 180 additions & 0 deletions

docs/ARCHITECTURAL-DECISIONS-2026-06-04.md

@@ -51,6 +51,7 @@
| ADR-021 | **Meta-hygiene**: always grep peer crates before copying manifest patterns (the `[lints] workspace = true` cascade lesson) | **Pinned** | OGAR PR #15 + PR #17/#18 follow-ups |
| ADR-022 | **The Firewall** — absolute inner/outer boundary; no serialization in hot path; inner = compile-time HHTL; outer = contract-trait pluggable | **Pinned** | OGAR (this PR); `docs/THE-FIREWALL.md` |
| ADR-023 | **IR-as-wire-truth** — the source-language AST is *input dialect*; the canonical `Class`/`Attribute`/`Association`/`EnumDecl`/`ActionDef` IR is *wire truth*. Adapters lift dialects into IR; the IR routes everything (registry key, actor mailbox, Lance version, audit-log dimension) | **Pinned** | OGAR (this PR); `crates/ogar-vocab/`; `bardioc/substrate-b-shadow::EdgeDecoder<E>` (PR #19) |
| ADR-024 | **Palette256 + HHTL codec** — the substrate's universal compression primitive. HHTL prefix establishes a frame; within the frame, values cluster; clustered values quantize to 256-index palette + const-table lookup. Names an existing primitive (Binary16K perms + bgz-tensor attention + arm-discovery aerial codebook, ρ=0.9973 vs cosine) rather than proposing one | **Pinned** | OGAR (this PR); `MedCare-rs/crates/medcare-analytics/src/{graph_contract.rs,column_mask_bridge.rs}`; `bgz-tensor/examples/compare_stacked_vs_i16.rs`; `lance-graph-arm-discovery` |

## ADR-001: `State = ActionState` (lifecycle), not domain state, for Rubicon binding

@@ -1167,6 +1168,185 @@ lance-graph) before merge.
- `docs/RDF-OWL-ALIGNMENT.md` §3 (OGAR's position in L1-L5) — the
  IR sits at the AR-pattern lift seam.

## ADR-024: Palette256 + HHTL codec — the substrate's universal compression primitive

**Status:** Pinned (2026-06-05). Names an existing primitive (three
independent deployments + one empirical anchor) rather than proposing
one. Companion to ADR-022 (The Firewall — this ADR specifies one of
its inner-side primitives) and ADR-023 (IR-as-wire-truth — palette256
is the codec on the IR's wire form).

**Context.** The substrate has accumulated three independent
palette256 deployments, each developed for its own domain:

- **Security mesh** — `Binary16K = [u64; 256]` in
  `MedCare-rs/crates/medcare-analytics/src/graph_contract.rs:36`
  (canonical home). It holds the per-row `_effectiveReaders` bitmap;
  auth is a Hamming-popcount bit-intersection on the inner / hot path
  (`HEALTHCARE-TRANSCODING.md §3.1`). Wired into production at
  `MedCare-rs/crates/medcare-analytics/src/column_mask_bridge.rs` →
  `medcare-server/state.rs:167, 265, 439`.
- **Attention** — `bgz-tensor` `WeightPalette::build(…, 256)` +
  `AttentionTable::build`
  (`crates/bgz-tensor/examples/compare_stacked_vs_i16.rs:90-92`).
  Replaces dense FP weights with a 256-index palette + precomputed
  distance table on the model's hot path.
- **Distance** — `lance-graph-arm-discovery` aerial codebook,
  measured **ρ = 0.9973 vs cosine**. The empirical anchor: palette256
  reproduces cosine distance with correlation 0.9973 (i.e. on a
  scale where 1.0 = identical, palette256 is ~0.003 from cosine).
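
The security-mesh check described above can be sketched as follows. Only the `Binary16K = [u64; 256]` alias comes from the cited source; the function names (`set_bit`, `overlap_count`, `may_read`) and the "any shared bit grants access" rule are assumptions made for illustration, not the MedCare-rs API.

```rust
/// 256 words x 64 bits = 16,384 reader slots per row (alias taken from the ADR).
pub type Binary16K = [u64; 256];

/// Hypothetical helper: mark reader slot `slot` (must be < 16_384) as present.
pub fn set_bit(mask: &mut Binary16K, slot: usize) {
    mask[slot / 64] |= 1u64 << (slot % 64);
}

/// Popcount of the bitwise intersection of two masks.
/// Works on fixed-size arrays only, so it performs no heap allocation.
pub fn overlap_count(a: &Binary16K, b: &Binary16K) -> u32 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x & y).count_ones())
        .sum()
}

/// Assumed auth rule for this sketch: the caller may read the row
/// if it shares at least one reader bit with the row's mask.
pub fn may_read(row_readers: &Binary16K, caller: &Binary16K) -> bool {
    overlap_count(row_readers, caller) > 0
}
```

The whole check is 256 AND + popcount operations over 2 KiB per mask, which is why it fits on a hot path.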

Cross-domain analysis revealed all three are instances of the *same
codec*: HHTL prefix establishes a frame; within frame, values
cluster; clustered values quantize to a 256-index palette; decode is
a const-table lookup. The runtime side's BindSpace dissolution work
(bardioc PR #18 / lance-graph PR #470) hinted at this with the
Quintenzirkel qualia codebook ("frozen set + circle-of-fifths
progression → 8 B → 1-2 B per row") — same compression strategy,
different domain.

The proposal in the cross-session conversation (2026-06-05) was to
name the primitive explicitly so:
1. Future adopters don't reinvent it per domain.
2. New adopters report a falsifiable measurement (ρ-vs-reference)
   at adoption time rather than after the fact.
3. The 256-ceiling escape hatches are documented before reviewers
   ask.

**Decision.** **The codec is:**

```text
HHTL prefix (NiblePath / quadkey / class identity)
  ↓ establishes spatial / semantic frame
within-frame values cluster
  ↓ quantize to 256-index palette
  ↓ const-table lookup (compile-time HHTL where possible)
1-byte index per element, sub-microsecond decode, zero heap allocation
```
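
A minimal sketch of the decode half of this pipeline, assuming a static palette. The palette contents, the `i16` element type, and the names `build_palette`, `PALETTE`, `decode_one`, and `decode_into` are hypothetical; the ADR fixes only the shape (1-byte index, const table, no heap allocation).

```rust
/// Hypothetical static palette: 256 signed levels computed at compile time.
const fn build_palette() -> [i16; 256] {
    let mut table = [0i16; 256];
    let mut i = 0usize;
    // `while` is used because `for` loops are not allowed in const fn.
    while i < 256 {
        table[i] = (i as i16 - 128) * 4;
        i += 1;
    }
    table
}

/// The table lives in read-only data; nothing is built at runtime.
pub const PALETTE: [i16; 256] = build_palette();

/// Decode one element. A `u8` can never index past a 256-entry table.
#[inline]
pub fn decode_one(index: u8) -> i16 {
    PALETTE[index as usize]
}

/// Decode a run of 1-byte indices into a caller-provided buffer,
/// so the decode path itself allocates nothing.
pub fn decode_into(indices: &[u8], out: &mut [i16]) {
    assert_eq!(indices.len(), out.len());
    for (dst, &idx) in out.iter_mut().zip(indices) {
        *dst = PALETTE[idx as usize];
    }
}
```

Writing into a caller-owned slice keeps allocation decisions outside the hot path, which is the property the diagram's last line asks for.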

**Adoption checklist** for a new domain:
1. **Identify the prefix.** The NiblePath / quadkey / class identity
   that establishes the frame the values live in.
2. **Identify the palette domain.** Which values cluster within the
   frame? (Closed-keyspace tags, quantized continuous values,
   enumerated state, etc.)
3. **Build the palette + measure ρ-vs-reference.** The reference is
   the domain's full-precision metric (cosine for embeddings, L2 for
   coordinates, exact-match for tags). Report ρ at adoption time as
   the falsifiable property. Target: **ρ ≥ 0.99**, in line with the
   arm-discovery anchor (0.9973).
4. **Decode = const-table lookup.** Compile-time HHTL if the palette
   is static; a read-only table built at runtime if the palette is
   per-frame / per-tile. Either way the decode path is
   zero-allocation.
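
Step 3 of the checklist can be sketched as below. Several things here are assumptions, not statements from the ADR: ρ is computed as a Pearson correlation (the source does not say which correlation it uses), the palette is a uniform 256-level grid rather than a learned codebook, and the `reference` slice stands in for whatever full-precision metric values the domain produces. All function names are hypothetical.

```rust
/// Hypothetical palette: 256 evenly spaced levels over [lo, hi].
pub fn build_uniform_palette(lo: f64, hi: f64) -> [f64; 256] {
    let mut palette = [0.0; 256];
    for (i, slot) in palette.iter_mut().enumerate() {
        *slot = lo + (hi - lo) * (i as f64) / 255.0;
    }
    palette
}

/// Encode = index of the nearest palette entry (linear scan; encode is
/// an offline step here, only decode needs to be fast).
pub fn encode(palette: &[f64; 256], v: f64) -> u8 {
    let mut best = 0usize;
    for i in 1..256 {
        if (palette[i] - v).abs() < (palette[best] - v).abs() {
            best = i;
        }
    }
    best as u8
}

/// Pearson correlation coefficient between two equal-length series.
pub fn pearson(xs: &[f64], ys: &[f64]) -> f64 {
    assert_eq!(xs.len(), ys.len());
    let n = xs.len() as f64;
    let mx = xs.iter().sum::<f64>() / n;
    let my = ys.iter().sum::<f64>() / n;
    let (mut sxy, mut sxx, mut syy) = (0.0, 0.0, 0.0);
    for (x, y) in xs.iter().zip(ys) {
        sxy += (x - mx) * (y - my);
        sxx += (x - mx).powi(2);
        syy += (y - my).powi(2);
    }
    sxy / (sxx.sqrt() * syy.sqrt())
}

/// Round-trip every reference value through the palette and report rho.
pub fn rho_vs_reference(reference: &[f64], palette: &[f64; 256]) -> f64 {
    let decoded: Vec<f64> = reference
        .iter()
        .map(|&v| palette[encode(palette, v) as usize])
        .collect();
    pearson(reference, &decoded)
}
```

An adopter would compare the returned value against the 0.99 target and record it once, at adoption time.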

**The 256-ceiling escape hatches** (documented to avoid the
predictable reviewer question):

- **Per-tile / per-frame palettes** — the cheapest answer. A
  different spatial frame gets a different set of 256 entries. Used
  by Cesium tile codecs; matches the quadkey-prefix discipline.
  Long-tail OSM tags inside a zoom-21 tile rarely exceed 256
  distinct values.
- **Hierarchical palettes** — coarser palette at higher quadkey
  levels, finer per leaf. Mirrors the standard tile pyramid; the SH
  L0/L1 vs L2/L3 split in `splat-fit` is the same pattern.
- **Palette-64K upgrade** — 2-byte index instead of 1, for hot
  palettes that genuinely exceed 256 distinct values (rare; reserve
  for measured cases, not speculation).

The escape hatches are part of the primitive, not exceptions to it.
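
The per-tile hatch, and the point at which a tile would have to escalate to one of the other two, can be sketched like this. `TilePalette`, its methods, and the use of `u32` tag ids are hypothetical names chosen for the example; they are not taken from any of the cited crates.

```rust
/// Hypothetical per-tile palette: up to 256 distinct tag ids in fixed
/// storage, so building and reading it never touches the heap.
pub struct TilePalette {
    entries: [u32; 256],
    len: usize,
}

impl TilePalette {
    pub fn new() -> Self {
        TilePalette { entries: [0; 256], len: 0 }
    }

    /// Returns the 1-byte index for `tag`, adding it if there is room.
    /// `None` means this tile holds more than 256 distinct values and
    /// needs a different hatch (hierarchical split or a 2-byte index).
    pub fn intern(&mut self, tag: u32) -> Option<u8> {
        if let Some(pos) = self.entries[..self.len].iter().position(|&e| e == tag) {
            return Some(pos as u8);
        }
        if self.len == 256 {
            return None;
        }
        self.entries[self.len] = tag;
        self.len += 1;
        Some((self.len - 1) as u8)
    }

    /// Decode an index back to its tag id; `None` for unused slots.
    pub fn decode(&self, index: u8) -> Option<u32> {
        self.entries[..self.len].get(index as usize).copied()
    }
}
```

Two tiles can assign index 0 to different tags, which is exactly the "different frame, different 256 entries" behaviour; the `None` return makes the overflow case measurable rather than speculative.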

**Alternatives considered.**

- *Continuous distributions that don't cluster* (e.g. timestamps in
  microseconds, free-form text). Rejected as a counterargument to
  the codec — these are out-of-domain. For them, use delta encoding
  or VarInt or a different codec entirely. The codec applies to
  *clustered* domains; the adoption checklist's step 2 is the filter.
- *Domain-specific codecs per domain.* Rejected. Three independent
  re-derivations of the same primitive (security / attention /
  distance) are the receipt that the abstraction is real, not the
  receipt that each domain should have its own. ADR-024 reduces
  per-domain re-derivation.
- *Skip the ρ-vs-reference measurement.* Rejected. The arm-discovery
  ρ = 0.9973 is the existing FINDING-grade stake; new domains
  reporting at adoption time keeps the empirical floor honest as the
  primitive spreads.
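
For the out-of-domain case named in the first bullet, a delta + VarInt encoding of microsecond timestamps might look like the sketch below. It uses the common LEB128-style variable-length integer (7 payload bits per byte, high bit as continuation flag); that choice and all function names are assumptions for illustration, since the ADR names the technique but not a format.

```rust
/// Append `v` as a LEB128-style varint: 7 bits per byte, high bit = "more follows".
pub fn write_varint(mut v: u64, out: &mut Vec<u8>) {
    while v >= 0x80 {
        out.push(((v & 0x7F) as u8) | 0x80);
        v >>= 7;
    }
    out.push(v as u8);
}

/// Read one varint starting at `*pos`; returns None on truncated or overlong input.
pub fn read_varint(bytes: &[u8], pos: &mut usize) -> Option<u64> {
    let mut result = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None;
        }
        result |= ((byte & 0x7F) as u64) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

/// Store the first timestamp in full and every later one as the
/// difference from its predecessor. Small gaps cost one or two bytes.
pub fn encode_timestamps(ts: &[u64]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev = 0u64;
    for &t in ts {
        // wrapping_sub keeps the round trip exact even for unsorted input.
        write_varint(t.wrapping_sub(prev), &mut out);
        prev = t;
    }
    out
}

/// Inverse of `encode_timestamps`.
pub fn decode_timestamps(bytes: &[u8]) -> Option<Vec<u64>> {
    let mut out = Vec::new();
    let mut pos = 0usize;
    let mut prev = 0u64;
    while pos < bytes.len() {
        let delta = read_varint(bytes, &mut pos)?;
        prev = prev.wrapping_add(delta);
        out.push(prev);
    }
    Some(out)
}
```

Unlike the palette decode path, this codec has variable-width output and uses `Vec`, which illustrates why such data sits outside the palette256 domain rather than counting against it.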

**Consequences.**

- **The primitive is named.** Cross-domain reuse is now load-bearing,
  not coincidental. New domains adopt the codec instead of inventing
  their own quantization.
- **ρ-vs-reference becomes the adoption contract.** Reported once at
  adoption per domain. The arm-discovery 0.9973 is the existing
  anchor; new adopters target ≥ 0.99 and document if they fall short.
- **Two next-domain adopters are queued** (planned, not yet wired):
  - **D-OSM-2** (OSM tag palette + tile-local coordinate
    quantization) — per `lance-graph` PR #473
    (`cesium-osm-substrate-v1.md`). Reports ρ-vs-reference on first
    per-country PBF run per the runtime session's §11 follow-up
    commitment.
  - **D-SPLAT-4** (SH-aware palette extension on the
    `Gaussian3D` carrier) — per the splat-native arc. Same codec; SH
    coefficients are the long-tail-budget challenger.
- **The 256-ceiling has three explicit escapes** in the ADR body
  (per-tile / hierarchical / palette-64K). Reviewers don't need to
  re-derive the answer.
- **Cross-arc reuse argument is sharpened.** The substrate-reuse
  framing in `docs/RDF-OWL-ALIGNMENT.md §10` (geographic litmus
  complements anatomical) cashes out as: FMA-bones and OSM-vectors
  use *the same codec* (palette256 + HHTL prefix), not just the same
  IR. The §6 callout in `DOMAIN-INSTANCES.md` (queued, awaiting
  lance-graph PR #473 land) will reference ADR-024 as the falsifiable
  property.
- **The falsifiable property** that ties the substrate-reuse claim
  down: *"the same compile-time HHTL prefix + palette256 codec
  decodes (a) `_effectiveReaders` for row auth, (b) OSM way
  attributes at zoom-21 tile, and (c) FMA-bone SH coefficients at
  sub-microsecond per element with zero heap allocation."* If that
  property holds across all three, the substrate is doing its job.
  If it fails on one, the substrate is leaking dialect into the codec.
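
One way to put the falsifiable property under test is a single generic decode routine exercised with three element types, plus a rough timing helper. This is a sketch under assumptions: the function names are hypothetical, the element types (`u64` reader words, `u32` tag ids, `f32` SH coefficients) are stand-ins chosen for the example, and a wall-clock measurement like this is only indicative, not a proper benchmark.

```rust
use std::time::Instant;

/// One decode routine for every domain: only the table's element type changes.
/// The output buffer is caller-provided, so decoding allocates nothing.
pub fn decode_palette256<T: Copy>(table: &[T; 256], indices: &[u8], out: &mut [T]) {
    assert_eq!(indices.len(), out.len());
    for (dst, &i) in out.iter_mut().zip(indices) {
        *dst = table[i as usize];
    }
}

/// Rough per-element decode cost in nanoseconds.
/// The output buffer is allocated before the timed region starts.
pub fn ns_per_element<T: Copy + Default>(table: &[T; 256], indices: &[u8]) -> f64 {
    let mut out = vec![T::default(); indices.len()];
    let start = Instant::now();
    decode_palette256(table, indices, &mut out);
    let elapsed = start.elapsed();
    // Keep the optimizer from discarding the decode.
    std::hint::black_box(&out);
    elapsed.as_nanos() as f64 / indices.len().max(1) as f64
}
```

Calling `ns_per_element` with a `[u64; 256]`, a `[u32; 256]`, and a `[f32; 256]` table exercises the same code path three times; if one domain needed its own decode function to meet the budget, that would be the "dialect leaking into the codec" failure the property describes.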

**Change policy.** Adding a new palette256 adopter (new domain) is
routine — follow the adoption checklist + report ρ-vs-reference.
Changing the codec itself (e.g. palette-64K becoming default, or a
new escape-hatch added) is a substrate-wide concern and requires
consultation with the runtime session.

**References.**

- `lance-graph/.claude/board/EPIPHANIES.md:28` — FINDING-grade
  anchor for palette256 + Hamming popcount on `_effectiveReaders`.
- `lance-graph/.claude/knowledge/old-stack-capability-parity.md §3.39`
  — knowledge-doc record of the same primitive.
- `MedCare-rs/crates/medcare-analytics/src/graph_contract.rs:36` —
  `Binary16K = [u64; 256]` canonical home.
- `MedCare-rs/crates/medcare-analytics/src/column_mask_bridge.rs` —
  production wire-up; `redaction_mode_for` (line 128),
  `column_mask_policy_for_table` (line 165),
  `build_medcare_column_mask_registry` (line 192).
- `MedCare-rs/crates/medcare-server/src/state.rs:167, 265, 439` —
  F2-E install sites consuming the column-mask registry.
- `bgz-tensor/examples/compare_stacked_vs_i16.rs:90-92` —
  `WeightPalette::build(…, 256)` + `AttentionTable::build`.
- `lance-graph-arm-discovery` — aerial codebook with ρ = 0.9973 vs
  cosine measurement.
- ADR-022 (The Firewall) — the inner-side discipline this ADR
  specifies a primitive for.
- ADR-023 (IR-as-wire-truth) — palette256 is the codec on the IR's
  wire form.
- `docs/THE-FIREWALL.md` §3 (the inner/hot side) — palette256 + HHTL
  is one of its load-bearing primitives.
- `docs/HEALTHCARE-TRANSCODING.md §3.1` — palette256 + Hamming
  popcount on Binary16K named as the inner-side security mesh.
- `docs/RDF-OWL-ALIGNMENT.md §10` — the brutal-upgrade sequencing
  context (Phase 2c geospatial adopts the codec).
- `bardioc` PR #18 + `lance-graph` PR #470 — Quintenzirkel qualia
  codebook (8 B → 1-2 B per row) as the same compression strategy in
  a different domain.
- `lance-graph` PR #473 (forthcoming) `cesium-osm-substrate-v1.md`
  §11 — runtime-side commitment to a follow-up callout on this ADR
  once D-OSM-2 / D-SPLAT-4 wire.

## Implementation receipts — ADR ↔ commit cross-reference

> **Added in follow-up addendum (2026-06-05).** Records the implementation
