docs(knowledge): helix-48 information-preservation lineage cross-link (32 768-bit Jina → 48-bit Σ₁ → HelixResidue tenant) #499
The operator's framing ("helix-48 carries x32000 information preservation, with or without Morton cascade") is grounded in canon, but the grounding fragments are spread across five committed artifacts (lance-graph PRs #156, #176, #210, #218 + post-#496 canonical_node.rs) plus OGAR DISCOVERY-MAP. The post-#496 substrate exposes HelixResidue as a ValueTenant but its doc-comment does not cite the lineage, so a fresh reader sees "helix golden-spiral Place/Residue (48 B)" without the "94% of Jina 1024-D" claim that justifies the tenant's reason for being.

This doc compiles the cross-citations so the framing is referenceable from one place. Pure docs, append-only, no code touched.

Section map:

- §0 The claim in one line: 48-bit Σ₁ SEED preserves 94% of Jina 1024-D (= 32,768 bits), and the 48-byte HelixResidue ValueTenant inherits the lineage.
- §1 The committed fragments: Σ tier table (PR #210), CAM-PQ 48-bit lineage (PRs #176 + ndarray PR-x12), 11/17 X-Trans quasi-irrational stride rationale (PR #156), maximally-irrational stride finding (PR #218), Morton cascade addressing (OGAR DISCOVERY-MAP D-CASCADE), and the post-#496 HelixResidue ValueTenant.
- §2 The unified framing: two scales of "helix-48" (48 BIT Σ₁ SEED vs 48 BYTE HelixResidue tenant); the "with or without Morton cascade" independence (carrier vs addressing); why 32,768 specifically (1024D × f32 = full Jina embedding bit count).
- §3 Cross-references: five PRs + canon anchors.
- §4 What this doc does NOT do: no code, no canon edits, no retroactive plan rewrites.

Anchors the post-#496 HelixResidue tenant in its documented information-preservation lineage so future readers don't have to re-derive the 94% Jina claim from first principles.

https://claude.ai/code/session_01VysoWJ6vsyg3wEGc5v7T5v
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4e6b4be57c
> *CAM fingerprint (48-bit) → COCA 4096 codebook → DeepNSM addressing*

[`ndarray/.claude/knowledge/pr-x12-cam-pq-sigker-dn-tree-substrate-bindings.md:22`](../../../../ndarray/.claude/knowledge/pr-x12-cam-pq-sigker-dn-tree-substrate-bindings.md) (ndarray PR-x12):
Use stable links for cross-repo citations
This link, and the other ../../../../ndarray / ../../../../OGAR links below, escape the lance-graph checkout from .claude/knowledge and resolve to absolute local paths like /ndarray or /OGAR, which are not tracked in this repository and will be dead for readers unless they have the same sibling checkouts. Since this file's purpose is to provide citable cross-links, these should be stable URLs or committed in-repo references instead of local filesystem-relative paths.
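The escape described in this comment can be checked lexically. The sketch below is illustrative only: `levels_above_repo_root` is a hypothetical helper, not part of lance-graph, and it assumes both arguments are relative paths measured from the repository root. It counts how far a link climbs above that root without touching the filesystem.

```rust
use std::path::{Component, Path};

/// Hypothetical helper: lexically walk `link` starting from `base_dir`
/// (both relative to the repository root) and report how many directory
/// levels the walk climbs above the root. Returns 0 if the link stays
/// inside the checkout at every step.
fn levels_above_repo_root(base_dir: &str, link: &str) -> usize {
    let mut depth: isize = 0; // current depth below the repo root
    let mut min_depth: isize = 0; // lowest depth reached during the walk
    for component in Path::new(base_dir).join(link).components() {
        match component {
            Component::Normal(_) => depth += 1,
            Component::ParentDir => depth -= 1,
            // `.` changes nothing; root/prefix are not expected for relative input.
            _ => {}
        }
        min_depth = min_depth.min(depth);
    }
    (-min_depth) as usize
}
```

For a file in `.claude/knowledge`, a `../../../../ndarray/...` link climbs two levels above the repository root, so its target cannot be a file tracked in this repository; a sibling link such as `encoding-ecosystem.md` stays at zero.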
Retracted. The operator informed me that the 48-byte […]

My PR documented it as a canonical "upscaled carrier inheriting the Σ₁ SEED lineage" in §2.1 and §2.3. That framing is wrong on its own terms: the 94 %-of-Jina claim from PR #210 is a property of the 48-bit Σ₁ SEED (Hamming-searchable, bit-packed, FAISS PQ6×8 = 48-bit fingerprints under bgz17 11/17 stride). It does not generalise to a 48-byte tenant that doesn't run the algorithm. By citing […]

Failure mode: I treated presence-in-shipped-code as authority and built a framing on top without verifying the substrate. I should have searched the codebase for the actual […]

Closing without merge. If you want a leaner doc that strictly grounds the 48-bit Σ₁ SEED claim (PR #210) with no reference to the retracted 48-byte tenant, say the word and I'll open a tight replacement. Otherwise this stays closed and the existing fragments in PRs #156, #176, #210, #218 remain the canonical home of the lineage.
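For readers unfamiliar with the "PQ6×8 = 48-bit" shorthand, here is a minimal sketch of the bit budget only: six sub-quantizer codes of 8 bits each pack into 48 bits, and bit-packed fingerprints can be compared by Hamming distance. The names `pack_pq6x8` and `hamming48` are hypothetical, the big-endian packing order is an assumption, and the bgz17 11/17 stride is not modelled. None of this is taken from the lance-graph or ndarray code.

```rust
/// Hypothetical helper: pack six 8-bit PQ codes (one per sub-quantizer,
/// 256 centroids each) into the low 48 bits of a u64.
/// Assumption: the first code lands in the most significant byte.
fn pack_pq6x8(codes: [u8; 6]) -> u64 {
    codes.iter().fold(0u64, |acc, &c| (acc << 8) | u64::from(c))
}

/// Hypothetical helper: Hamming distance between two 48-bit fingerprints.
/// Bits above bit 47 are masked off before counting.
fn hamming48(a: u64, b: u64) -> u32 {
    const MASK48: u64 = (1u64 << 48) - 1;
    ((a ^ b) & MASK48).count_ones()
}
```

The point of the sketch is scale: a 48-byte tenant has eight times this budget and, per the retraction above, does not run this algorithm, so the 48-bit result does not transfer to it automatically.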
Summary
The operator's framing ("helix-48 carries x32 000 information preservation, with or without Morton cascade") is canonical, but the grounding fragments are spread across 5 committed artifacts + OGAR DISCOVERY-MAP. The post-#496 substrate exposes HelixResidue as a ValueTenant but its doc-comment does not cite the lineage, so a fresh reader sees "helix golden-spiral Place/Residue (48 B)" without the "94 % of Jina 1024-D" claim that justifies the tenant's reason for being.

This PR adds a single knowledge file that cross-links the fragments so the framing is referenceable from one place. Pure docs, append-only, no code touched.

+150 / -0 over one file.

The canonical claim, in one line
The 48-bit Σ₁ SEED preserves ~94 % of a Jina 1024-D embedding (32 768 bits), validated on SimLex-999. The post-#496 HelixResidue ValueTenant scales this carrier up to 48 bytes (384 bits) for substrate use, inheriting the lineage. The information-preservation property is independent of Morton cascade addressing: with or without the cascade, the helix-48 carrier holds the 32 768-bit Jina equivalent.

The committed fragments this doc cross-links
- .claude/knowledge/linguistic-epiphanies-2026-04-19.md:299-312 (dfcf246b)
- .claude/knowledge/encoding-ecosystem.md:91 (c1d44910)
- .claude/BGZ17_ELEVEN_SEVENTEEN_RATIONALE.md (79b46189)
- .claude/knowledge/codec-findings-2026-04-20.md:744 (4c4c0e7f)
- ndarray/.claude/knowledge/pr-x12-cam-pq-sigker-dn-tree-substrate-bindings.md:22
- OGAR/docs/DISCOVERY-MAP.md:127 (D-CASCADE): 64→256→1024→4096→16k→64k→256k = immaterialized Morton enumeration
- crates/lance-graph-contract/src/canonical_node.rs:333-334: "HelixResidue = 4 — helix golden-spiral Place/Residue (48 B)" ValueTenant

All fragments are committed. Only the unified framing was missing as a single citable artifact.
The two scales of "helix-48"
- 48-bit Σ₁ SEED
- 48-byte (384-bit) HelixResidue ValueTenant

Both are helix (golden-spiral place/residue, stride-4-over-17 walked by CurveRuler). They differ only in budget: the substrate uses the byte-wide tenant, and the bit-wide SEED is the compression floor validated against Jina.

Why "with or without Morton cascade" is independent
The information-preservation claim is a property of the carrier (the 48-bit fingerprint under CAM-PQ 6×256 + bgz17 11/17 stride), not of the addressing (Morton cascade 64→256→1024…). The two compose orthogonally:
Why 32 768 specifically
Σ₃ FULL = 1024 dimensions × 32 bits per f32 = 32 768 bits, the full Jina embedding. The "x32 000" in the operator's framing is this number, rounded: it is the size of the embedding the 48-bit seed preserves 94 % of. Compression ratio is 32 768 / 48 ≈ 683× (Σ₃ → Σ₁) or 32 768 / 384 ≈ 85× (Σ₃ → 48-byte HelixResidue tenant), with higher fidelity for the wider tenant.

What this PR does NOT do
- […] NODE_ROW_STRIDE.
- […] canonical_node.rs's HelixResidue doc-comment in this PR; that's a follow-up if desired (would touch shipped code, so kept separate).

Test plan

- cargo not invoked.
- […] dfcf246b, c1d44910, 79b46189, 4c4c0e7f, post-#496 (Integrated cognitive planner reference map + ValueSchema presets + FULL POC default) at 2e58e034).

Anchors

- […] HelixResidue ValueTenant + ValueSchema::Compressed preset.

https://claude.ai/code/session_01VysoWJ6vsyg3wEGc5v7T5v
Generated by Claude Code
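The size arithmetic in "Why 32 768 specifically" can be re-checked with a few lines. This is a sketch of the bit counts only; the constant and function names are hypothetical and do not come from the repository. It says nothing about the 94 % fidelity figure, which is an empirical claim attributed to PR #210.

```rust
// Hypothetical constants mirroring the numbers quoted in the PR description.
const JINA_DIMS: u32 = 1024; // Jina embedding dimensions
const BITS_PER_F32: u32 = 32; // one f32 per dimension
const SEED_BITS: u32 = 48; // 48-bit Σ₁ SEED
const TENANT_BITS: u32 = 48 * 8; // 48-byte HelixResidue tenant = 384 bits

/// Bit count of the full embedding (Σ₃ FULL in the PR's terminology).
fn full_embedding_bits() -> u32 {
    JINA_DIMS * BITS_PER_F32
}

/// Size ratio between a source representation and a smaller carrier.
fn compression_ratio(source_bits: u32, carrier_bits: u32) -> f64 {
    f64::from(source_bits) / f64::from(carrier_bits)
}
```

Calling `compression_ratio(full_embedding_bits(), SEED_BITS)` gives roughly 683 and `compression_ratio(full_embedding_bits(), TENANT_BITS)` roughly 85, matching the rounded figures in the description.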