Commit ef5a362

fix(#498 review): version-gate value-slab offset shift + Signed360 sign-partition

Two codex P2s on #498:
- ENVELOPE_LAYOUT_VERSION 1→2: the HelixResidue 48→6 B right-size shifted every downstream value-tenant offset; the bump gates it so a v1 blob refuses to decode rather than read tenants from the wrong bytes (safe: nothing persisted under v1, FULL is POC-only).
- Signed360: encode |y| in 7 bits + sign in the partition (Pos⇒[128,255], Neg⇒[0,127]) so a near-rim negative lift (|y|≈0) can't round to 128 and read back as Pos. Regression test signed360_neg_sign_survives_near_rim_at_high_total.

Also adds the #498 PR_ARC_INVENTORY entry (CodeRabbit mis-attributed this PR's helix/keystone/OCR/causal-edge work to #496; corrected: it's #498's).

contract 623 lib, helix 73 lib + 7 doc green; clippy -D warnings + fmt clean.

https://claude.ai/code/session_01D2WSmezQBNC3bUdHuGfGmo
1 parent f880141 commit ef5a362

4 files changed

Lines changed: 78 additions & 12 deletions

File tree

.claude/board/PR_ARC_INVENTORY.md

Lines changed: 14 additions & 0 deletions
@@ -35,6 +35,20 @@
 
 ---
 
+## #498 GUID decode→read-mode keystone + helix Signed360 right-size + OCR→NodeRow transcode
+
+**Status:** OPEN 2026-06-15 (branch `claude/wonderful-hawking-lodtql`, 8 commits post-#496). In review (CodeRabbit + codex). **NOTE:** this entry documents the helix / keystone / OCR / causal-edge work that CodeRabbit on PR #498 mis-attributed to #496; those changes are **#498's, not #496's**. #496 shipped only ValueSchema presets + the reference plan (its immutable entry below correctly shows the pre-right-size 154/98 B budgets).
+
+**Added:** (1) **Keystone** `canonical_node::{GuidParts, ReadMode, classid_read_mode}` + `NodeGuid::{heel/hip/twig, decode()→GuidParts, read_mode()}`: read-the-GUID-as-a-GUID decode + a `LazyLock<HashMap>` classid→read-mode registry (the single source consumer + OGAR inherit); `ReadMode` bundles the two existing axes (`ValueSchema` + `EdgeCodecFlavor`), no new property. (2) `helix::{Sign, Signed360}` + `HemispherePoint::signed_lift` + `ResidueEncoder::encode_signed`: signed full-sphere codec; **`HelixResidue` value-tenant right-sized 48 B → 6 B** (bits→bytes slip fix) → downstream offsets shifted (Turbovec 160→118, Energy 176→134, …), budgets re-locked (Full 154→112, Compressed 98→56), value carve now `[32,144)`. (3) `ocr::{BlockKind::entity_type, LayoutBlock::to_node_row}` + `ValueTenant::{value_offset, byte_len}`: OCR-engine-agnostic transcode. (4) causal-edge `test_build_fast` boundary `<`→`<=` (standing red on main fixed). (5) **`ENVELOPE_LAYOUT_VERSION` 1→2**: gates the value-slab offset shift (codex P2). Tests: contract 623 lib, helix 73 lib + 7 doc, causal-edge green.
+
+**Locked:** (1) **one `NodeGuid` only**: the #490-retired `identity::NodeGuid` (UUIDv8) stays retired; the keystone extends the canon `canonical_node::NodeGuid`. (2) `ReadMode::DEFAULT = {Full, CoarseOnly}` mirrors the ClassView POC default; both flip back to Bootstrap together (guard `read_mode_default_is_full_poc`). (3) **`Signed360` sign-partition**: `|y|`-in-7-bits + sign in the partition (Pos ⇒ polar [128,255], Neg ⇒ [0,127]); sign is exact at `|y|≈0` at the rim (codex P2 fix, regression test `signed360_neg_sign_survives_near_rim_at_high_total`). (4) text/bbox never bundled into the node; content store keyed by identity (I-VSA-IDENTITIES). (5) a value-slab offset shift is **version-gated, not reserved-gap**, which is safe pre-persistence (FULL is POC-only; codex P2 disposition).
+
+**Deferred:** ontology-side `NiblePath::from_guid_prefix` (the keystone's other half); pure-Rust OCR via `ocrs`/`rten` (the tesseract-rs C-wrapper POC was reverted as the wrong direction). `TD-VALUESCHEMA-FULL-POC-DEFAULT` paired-revert note updated (ReadMode::DEFAULT pairs with ClassView).
+
+**Docs:** board `LATEST_STATE` + `TECH_DEBT` updated; this entry.
+
+**Confidence (2026-06-15):** open; both codex P2s dispositioned (ENVELOPE_LAYOUT_VERSION bump for the offset shift; Signed360 sign-partition fix + regression test); CodeRabbit's #496-vs-#498 misattribution corrected here.
+
 ## #496 integrated-cognitive-planner reference map + ValueSchema presets + FULL POC default
 
 **Status:** MERGED 2026-06-15 (merge commit `2e58e034`), branch `claude/wonderful-hawking-lodtql`. CI 5/5 green (format/clippy/linux-build/test/test-with-coverage). CodeRabbit 2 threads resolved; codex 2×P2 dispositioned (FULL-default intentional; CoarseResidue tracked as TD).

crates/helix/src/residue.rs

Lines changed: 42 additions & 8 deletions
@@ -64,8 +64,9 @@ impl ResidueEdge {
 /// to 48 bit (6 bytes)**. Maps a signed magnitude to the FULL sphere: the
 /// unsigned hemisphere `rim` edge (rim radius + place anchor via the existing
 /// pipeline), the signed `polar` byte (the equal-area lift `y = sign·√(1 − u)`
-/// quantised, centred at 128 - `> 128` upper hemisphere, `< 128` lower, so the
-/// hemisphere sign is recoverable), and the 16-bit `azimuth` (`n·φ` wrapped to
+/// quantised as `|y|`-in-7-bits with the sign in the partition - `≥ 128` upper
+/// hemisphere, `< 128` lower, so the hemisphere sign is recoverable even at the
+/// rim where `|y| ≈ 0`), and the 16-bit `azimuth` (`n·φ` wrapped to
 /// `[0, 2π)` over the full **360°**). Wire layout (LE):
 /// `[rim.start, rim.end, rim.floor_version, polar, azimuth_lo, azimuth_hi]`.
 ///
@@ -76,8 +77,10 @@ impl ResidueEdge {
 pub struct Signed360 {
     /// Unsigned hemisphere edge (rim radius + place anchor). 3 bytes.
     pub rim: ResidueEdge,
-    /// Signed equal-area lift `y` quantised, centred at 128 (128 = equator,
-    /// `> 128` = upper hemisphere, `< 128` = lower). 1 byte.
+    /// Signed equal-area lift `y` quantised: `|y|` in 7 bits, sign in the
+    /// partition - `[128, 255]` = upper hemisphere ([`Sign::Pos`]), `[0, 127]` =
+    /// lower ([`Sign::Neg`]). The partition (not a centred-at-128 round) keeps
+    /// the sign exact even when `|y| ≈ 0` at the rim. 1 byte.
     pub polar: u8,
     /// Golden azimuth `n·φ mod 2π` mapped to `[0, 65536)` over the full 360°. 2 bytes.
     pub azimuth: u16,
@@ -173,14 +176,23 @@ impl ResidueEncoder {
     /// full-sphere residue (the doubled-hemisphere companion to
     /// [`encode`](Self::encode)). The `rim` reuses the unsigned hemisphere
     /// pipeline; `polar` carries the signed equal-area lift `y = sign·√(1 − u)`
-    /// (centred at 128, so the hemisphere sign is recoverable via
-    /// [`Signed360::sign`]); `azimuth` is the golden angle `n·φ` over the full 360°.
+    /// (`|y|`-in-7-bits + sign-partition, so the hemisphere sign is recoverable
+    /// via [`Signed360::sign`] even at the rim); `azimuth` is the golden angle
+    /// `n·φ` over the full 360°.
     pub fn encode_signed(&self, place: u64, n: usize, sign: Sign) -> Signed360 {
         let n = n.min(self.total - 1);
         let rim = self.encode(place, n);
-        // Signed equal-area lift y ∈ [−1, 1] → byte centred at 128.
+        // Signed equal-area lift y ∈ [−1, 1]. Encode |y| in 7 bits and the sign
+        // in the PARTITION (Pos ⇒ [128, 255], Neg ⇒ [0, 127]) so the hemisphere
+        // sign survives even when |y| ≈ 0 near the rim. A naive `128 + y·127`
+        // rounds a tiny negative lift up to 128, which `sign()` reads as Pos
+        // (codex P2 on #498). The partition makes the sign exact at every magnitude.
         let p = HemispherePoint::signed_lift(n, self.total, sign);
-        let polar = (128.0 + p.y * 127.0).round().clamp(0.0, 255.0) as u8;
+        let mag = (p.y.abs() * 127.0).round().clamp(0.0, 127.0) as u8;
+        let polar = match sign {
+            Sign::Pos => 128 + mag,
+            Sign::Neg => 127 - mag,
+        };
         // Golden azimuth n·φ wrapped to [0, 2π) → u16 over the full 360°.
         let az = (n as f64 * GOLDEN_RATIO).rem_euclid(core::f64::consts::TAU);
         let azimuth = ((az / core::f64::consts::TAU) * 65536.0) as u16;
@@ -347,4 +359,26 @@ mod tests {
             enc.encode_signed(0x99, 2000, Sign::Neg)
         );
     }
+
+    #[test]
+    fn signed360_neg_sign_survives_near_rim_at_high_total() {
+        // Regression for codex P2 (#498): near the rim √(1−u) → 0, so |y| ≈ 0.
+        // A centred-at-128 round mapped a negative lift to polar 128, which
+        // `sign()` reads as Pos, losing the sign. The |y|-in-7-bits + partition
+        // encoding makes Neg ⇒ polar ∈ [0, 127] for EVERY magnitude, so the sign
+        // is exact even when the lift vanishes. Large total ⇒ rim n has tiny |y|.
+        let enc = ResidueEncoder::new(65_536);
+        for n in [65_530usize, 65_534, 65_535] {
+            let neg = enc.encode_signed(0x55, n, Sign::Neg);
+            assert_eq!(
+                neg.sign(),
+                Sign::Neg,
+                "Neg must survive at the rim (n={n}, polar={})",
+                neg.polar
+            );
+            assert!(neg.polar < 128, "Neg ⇒ polar < 128 even at |y| ≈ 0");
+            // And the Pos companion stays in the upper partition.
+            assert!(enc.encode_signed(0x55, n, Sign::Pos).polar >= 128);
+        }
+    }
 }
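
The rounding failure this hunk fixes can be reproduced in isolation. The following is a minimal standalone sketch, not the crate's code: `polar_centred`, `polar_partitioned`, and `sign_of` are hypothetical free functions, the local `Sign` enum is a stand-in for `helix::Sign`, and the decode rule (`polar >= 128` means `Pos`) is assumed from the doc comments and the regression test, since `Signed360::sign` itself is not shown in the diff.

```rust
/// Stand-in for the crate's `Sign` (assumption: two variants, as used in the diff).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Sign {
    Pos,
    Neg,
}

/// Pre-fix encoding: round the signed lift y in [-1, 1] around a centre of 128.
fn polar_centred(y: f64) -> u8 {
    (128.0 + y * 127.0).round().clamp(0.0, 255.0) as u8
}

/// Post-fix encoding: |y| in 7 bits, with the sign carried by the partition.
fn polar_partitioned(y_abs: f64, sign: Sign) -> u8 {
    let mag = (y_abs * 127.0).round().clamp(0.0, 127.0) as u8;
    match sign {
        Sign::Pos => 128 + mag, // [128, 255]
        Sign::Neg => 127 - mag, // [0, 127]
    }
}

/// Assumed decode rule: the upper partition is the upper hemisphere.
fn sign_of(polar: u8) -> Sign {
    if polar >= 128 {
        Sign::Pos
    } else {
        Sign::Neg
    }
}

fn main() {
    // A tiny negative lift, as seen near the rim.
    let y = -1.0e-3_f64;

    // Centred round: 128 - 0.127 = 127.873 rounds up to 128, so the sign flips.
    assert_eq!(polar_centred(y), 128);
    assert_eq!(sign_of(polar_centred(y)), Sign::Pos);

    // Partitioned: the magnitude rounds to 0, but Neg still lands on 127.
    assert_eq!(polar_partitioned(y.abs(), Sign::Neg), 127);
    assert_eq!(sign_of(polar_partitioned(y.abs(), Sign::Neg)), Sign::Neg);

    // The extremes stay inside the byte: no overflow in either partition.
    assert_eq!(polar_partitioned(1.0, Sign::Pos), 255);
    assert_eq!(polar_partitioned(1.0, Sign::Neg), 0);
}
```

The trade-off is one bit of magnitude resolution (7 bits instead of roughly 8) in exchange for a sign that cannot be lost to rounding.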

crates/helix/tests/probe_mantissa_fill.rs

Lines changed: 14 additions & 3 deletions
@@ -124,7 +124,11 @@ fn random_disk_points(k: usize, seed: u64) -> Vec<(f64, f64)> {
 fn probe_mantissa_fill_golden_beats_uniform_random() {
     // Three independent baseline seeds - golden must beat ALL of them on
     // BOTH metrics at BOTH sample counts; no cherry-picking.
-    const SEEDS: [u64; 3] = [0x9E37_79B9_7F4A_7C15, 0xD1B5_4A32_D192_ED03, 0x2545_F491_4F6C_DD1D];
+    const SEEDS: [u64; 3] = [
+        0x9E37_79B9_7F4A_7C15,
+        0xD1B5_4A32_D192_ED03,
+        0x2545_F491_4F6C_DD1D,
+    ];
 
     for &k in &[256usize, 1024] {
         let (g_occ, g_max) = fill_metrics(golden_points(k).into_iter());
@@ -189,7 +193,11 @@ fn probe_phase1_curve_ruler_regeneration_is_bit_exact() {
         for depth in [0u8, 1, 7, 16] {
             let a = CurveRuler::from_hhtl(path, depth);
             let b = CurveRuler::from_hhtl(path, depth);
-            assert_eq!(a.arc(), b.arc(), "regeneration drift at ({path:#x},{depth})");
+            assert_eq!(
+                a.arc(),
+                b.arc(),
+                "regeneration drift at ({path:#x},{depth})"
+            );
         }
     }
 }
@@ -207,6 +215,9 @@ fn probe_phase1_full_permutation_for_every_offset() {
             assert!(!seen[v as usize], "residue {v} repeated at place {place}");
             seen[v as usize] = true;
         }
-        assert!(seen.iter().all(|&s| s), "incomplete permutation at place {place}");
+        assert!(
+            seen.iter().all(|&s| s),
+            "incomplete permutation at place {place}"
+        );
     }
 }

crates/lance-graph-contract/src/soa_envelope.rs

Lines changed: 8 additions & 1 deletion
@@ -44,7 +44,14 @@
 /// changes. A reader MUST refuse to decode a packet whose stamped version it
 /// does not understand (per `I-LEGACY-API-FEATURE-GATED`: layout reclaim is
 /// paired with a version gate on the serialization path).
-pub const ENVELOPE_LAYOUT_VERSION: u8 = 1;
+///
+/// - **v1**: initial canonical `NodeRow` value carve.
+/// - **v2**: `HelixResidue` value-tenant right-sized 48 B → 6 B (a bits→bytes
+///   slip fix), which shifted every downstream tenant offset (`TurbovecResidue`
+///   160→118, `Energy` 176→134, …). The offsets moved, so the version gates it:
+///   a v1 blob now refuses to decode rather than read tenants from the wrong
+///   bytes. Safe because nothing persisted under v1 (FULL is POC-only).
+pub const ENVELOPE_LAYOUT_VERSION: u8 = 2;
 
 /// The little-endian element type of one column.
 ///
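
The gate described in the doc comment above amounts to a reader comparing the stamped version against the one it was built for before touching any tenant offset. Below is a rough sketch under stated assumptions: the diff does not show where the version is stamped, so a one-byte prefix is assumed here, and `DecodeError`, `check_layout`, `HELIX_SHRINK`, and `shifted` are hypothetical names for illustration only. The offset arithmetic uses only the numbers quoted in the commit.

```rust
/// Layout version this reader understands (mirrors the constant in the diff).
const ENVELOPE_LAYOUT_VERSION: u8 = 2;

/// Bytes reclaimed by the HelixResidue right-size: 48 B down to 6 B.
const HELIX_SHRINK: usize = 48 - 6;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum DecodeError {
    Empty,
    UnsupportedLayout { found: u8, expected: u8 },
}

/// Assumed framing: the first byte of the blob is the stamped layout version.
/// Returns the body only if the version matches, so a v1 blob is never read
/// with v2 offsets.
fn check_layout(blob: &[u8]) -> Result<&[u8], DecodeError> {
    let (&found, body) = blob.split_first().ok_or(DecodeError::Empty)?;
    if found != ENVELOPE_LAYOUT_VERSION {
        return Err(DecodeError::UnsupportedLayout {
            found,
            expected: ENVELOPE_LAYOUT_VERSION,
        });
    }
    Ok(body)
}

/// v2 position of an offset or budget that sat downstream of HelixResidue in v1.
const fn shifted(v1: usize) -> usize {
    v1 - HELIX_SHRINK
}

fn main() {
    // A v2 blob passes the gate; a v1 blob is refused outright.
    assert_eq!(check_layout(&[2, 0xAA]), Ok(&[0xAA][..]));
    assert_eq!(
        check_layout(&[1, 0xAA]),
        Err(DecodeError::UnsupportedLayout { found: 1, expected: 2 })
    );

    // Every quoted shift is the same 42-byte move.
    assert_eq!(shifted(160), 118); // TurbovecResidue offset
    assert_eq!(shifted(176), 134); // Energy offset
    assert_eq!(shifted(154), 112); // Full budget
    assert_eq!(shifted(98), 56); // Compressed budget
}
```

Because every downstream offset moves by the same 42 bytes, a v1 blob read with v2 offsets would not fail loudly on its own; it would return plausible but wrong tenant bytes, which is why an explicit refusal is preferable.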
