9 changes: 9 additions & 0 deletions .claude/board/EPIPHANIES.md
@@ -7,6 +7,15 @@
**Why this matters here.** The ClassView design session is NOT blocked by this ask. The consumer correctly framed the recompute DAG as needing the *machinery* (which is now shipped consumer-side) AND the *data* (which the extractor owns). The reframe matters: the consumer's wishlist had originally listed this as P0 ASK to the ClassView session; the probe demonstrates that one extractor change is sufficient, no ClassView interface required. Producer-side action: when re-running the extractor on `/home/user/odoo`, lift the second AST level of `@api.depends('a.b')` strings into a sibling `reads_field` triple resolved through the relation's `comodel_name`. See `.claude/knowledge/odoo-extractor-wishlist-from-od-ontology-v1.md` § 3 for the concrete shape.

**Companion ratifications** (also 2026-06-17, also from the producer side): `ruff#19` (Rails STI → `inherits_from`, making `inherits_from` cross-language canonical for C++ + Rails + a natural P1 ask for Odoo `_inherit`) and `ruff#21` (`Predicate::ValidationKind` for per-attribute typed validation, a P2 ask for Odoo `@api.constrains`). Both recorded in the wishlist's "2026-06-17 update" section.
## 2026-06-17 — E-CPP-PARITY-1 — the unicharset adapter is byte-identical to libtesseract; PROBE-OGAR-ADAPTER-UNICHARSET green

**Status:** FINDING (in-env, real trained data). `lance_graph_contract::unicharset::UniCharSet` dumps the `eng.lstm-unicharset` id→unichar table **byte-identical to the C++ `UNICHARSET` FFI oracle, 112/112** (installed libtesseract 5.3.4 + libleptonica 1.82; `examples/unicharset_dump.rs` vs an oracle harness built over the source header + the lib's exported `UNICHARSET::{load_from_file,id_to_unichar}` symbols).

**The falsifier did exactly what a falsifier should.** The documented-format Rust parser (first token per line = unichar, id = position) matched 111/112; the C++ oracle named the one real convention it missed — the `NULL` file-token IS the space unichar (`unicharset.cpp:882` remaps `"NULL"` → `" "`). One-line fix, re-diff, 0 differences. NOT a Core gap.
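
The parsing rule and the one-line fix described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `lance_graph_contract::unicharset`: the function name `parse_unichars` and the `String` error type are hypothetical, and only the plain-table rule from the text is modeled (line 1 = count, first whitespace token per line = unichar, id = position, `NULL` token remapped to the space unichar).

```rust
/// Parse the plain-table subset of a `.unicharset` text into an id-ordered table.
/// Hypothetical sketch: index in the returned Vec is the unichar id.
pub fn parse_unichars(text: &str) -> Result<Vec<String>, String> {
    let mut lines = text.lines();

    // Line 1 carries the number of entries that follow.
    let count = lines
        .next()
        .ok_or_else(|| "empty input".to_string())?
        .trim()
        .parse::<usize>()
        .map_err(|e| format!("bad count line: {e}"))?;

    let mut table = Vec::with_capacity(count);
    for line in lines.take(count) {
        // Only the first whitespace-delimited token matters; property columns are ignored.
        let token = line
            .split_whitespace()
            .next()
            .ok_or_else(|| "blank entry line".to_string())?;
        // The convention the oracle surfaced: the file token "NULL" is the space unichar.
        let unichar = if token == "NULL" { " " } else { token };
        table.push(unichar.to_string());
    }

    if table.len() != count {
        return Err(format!("expected {count} entries, found {}", table.len()));
    }
    Ok(table)
}
```

Without the `NULL` branch, id 0 would dump as the literal text `NULL` rather than a space, which is the kind of single-entry mismatch a 111/112 result points to.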

**The leptonica epiphany:** leptonica is an *install*, not a transcode. It is only a *link* dep of the C++ oracle harness — never in the Rust path (the unicharset path is text parsing, never touches `Pix`). Transcoding leptonica (~250k LOC of pointer-heavy C image-processing, the hand-port category) is the far-off zero-C end-state, NOT a prerequisite to prove the pipeline. The whole "we need the operator's leptonica host" framing collapsed to one `apt-get`.

**Scope (honest):** this proves the unicharset adapter's id↔unichar bijection + content-store tier at byte-parity — the doctrine's designated falsifier (`PROBE-OGAR-ADAPTER-UNICHARSET`), now FINDING. The `classid → ClassView → UnifiedStep` dispatch wiring is mechanical remainder; each future method-body leaf is its own parity check, but the core-first adapter pattern is no longer a conjecture. Cross-ref: `core-first-transcode-doctrine.md` § falsifier RESULT; `transcode-extend-core-probe-v1.md` § BYTE-PARITY ACHIEVED.

## 2026-06-17 — E-MATERIALIZED-AWARENESS-2 — the driver wire is live (provenance-only); the four vocabularies are one 2-axis structure

4 changes: 4 additions & 0 deletions .claude/board/LATEST_STATE.md
@@ -82,6 +82,10 @@

## Current Contract Inventory (lance-graph-contract)

> **2026-06-17 — ADDED (D-UNICHARSET-1, byte-parity probe Rust side)**: `lance_graph_contract::unicharset::{UniCharSet, UniCharSetError}` — the Tesseract `UNICHARSET` content-store tier (the Core-First doctrine's variable-length classid-keyed registry, `deepnsm::Vocabulary`-shaped: `reverse: Vec<String>` id→unichar + `lookup: HashMap<String,u32>` unichar→id). `load_from_str`/`load_from_file` parse the `.unicharset` text format (line 1 = count, then the first whitespace token per line = unichar, id = position; property columns ignored — the `old_style_included_` plain-table scope); `id_to_unichar`/`unichar_to_id` are the two adapter leaves; `dump()` renders the `<id>\t<unichar>` table matching the C++ oracle. **The Rust side of `PROBE-OGAR-ADAPTER-UNICHARSET`** — pure text parsing, ZERO leptonica (the unicharset path never touches `Pix`), so it builds + unit-tests in-env; byte-parity is one `diff` against a libtesseract oracle harness on a leptonica-installed box (steps in `examples/unicharset_dump.rs`). Additive (a sibling content-store module, zero `NodeRow`/tenant impact). +4 tests (format parse, bijection round-trip, oracle-shape dump, typed errors) + the `unicharset_dump` example; 644 contract lib green; clippy `-D warnings` clean. Plan: `transcode-extend-core-probe-v1.md` (the deferred Option A content-store tier, now built for the probe). The classid→`&UniCharSet` `LazyLock` resolver remains the wiring follow-up.
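
The two-table shape named in this entry (`reverse: Vec<String>` for id→unichar, `lookup: HashMap<String,u32>` for unichar→id) might look roughly like the sketch below. The type name `Bijection`, the constructor `from_table`, the first-id-wins rule for repeated unichars, and the trailing newline in `dump` are all assumptions made for illustration; the real `UniCharSet` may differ on each of them.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the content-store tier: two tables kept in sync.
pub struct Bijection {
    reverse: Vec<String>,         // id -> unichar
    lookup: HashMap<String, u32>, // unichar -> id
}

impl Bijection {
    /// Build both directions from an id-ordered table.
    pub fn from_table(table: Vec<String>) -> Self {
        let mut lookup = HashMap::with_capacity(table.len());
        for (id, unichar) in table.iter().enumerate() {
            // Assumption: if a unichar repeats, the first id wins.
            lookup.entry(unichar.clone()).or_insert(id as u32);
        }
        Self { reverse: table, lookup }
    }

    /// Adapter leaf: id -> unichar, `None` when the id is out of range.
    pub fn id_to_unichar(&self, id: u32) -> Option<&str> {
        self.reverse.get(id as usize).map(String::as_str)
    }

    /// Adapter leaf: unichar -> id, `None` when the unichar is unknown.
    pub fn unichar_to_id(&self, unichar: &str) -> Option<u32> {
        self.lookup.get(unichar).copied()
    }

    /// Render one `<id>\t<unichar>` line per entry, in id order.
    pub fn dump(&self) -> String {
        let mut out = String::new();
        for (id, unichar) in self.reverse.iter().enumerate() {
            out.push_str(&format!("{id}\t{unichar}\n"));
        }
        out
    }
}
```

Keeping the `Vec` as the source of truth and deriving the map from it means the dump order is always the id order, which is what a line-by-line comparison against an oracle needs.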

> **2026-06-17 — ADDED (D-CPP-CODEGEN-1, C-FIRST step 2 compile target)**: `lance_graph_contract::codegen_manifest::{MethodSig, ClassMethods, methods_for}` — the Core-side target of the C++ method-resolution manifest emitted by `ruff_cpp_codegen` (the Tesseract AST-DLL pipeline's stage 2). `MethodSig` is the dispatch-relevant signature in a **`const`-constructible** shape (all fields `&'static`: `name`, `params: &'static [&'static str]`, `ret`, `is_const`, `is_static`, `overrides`) — the method-axis sibling of `class_view::ClassView`'s field projection, deliberately NOT `String`-backed (a generated `const X: &[MethodSig] = &[MethodSig { .. }]` must compile; `FieldRef` is `String`-backed and cannot). `ClassMethods{classid, methods}` is the registry ENTRY the generated code emits (classid bound OGAR-side, never minted here); `methods_for(registry, classid) -> &'static [MethodSig]` is the pure lookup with zero-fallback (unregistered classid → empty slice). **Additive** (container-architect ADDITIVE-CONFIRMED): a sibling module, zero `NodeRow`/`ValueTenant`/`ValueSchema`/stride/`ENVELOPE_LAYOUT_VERSION` impact; the runtime `classid→methods` registry DATA lives downstream (generated in the consumer repo), not here. Body-shaping flags (pure-virtual/constexpr/noexcept/operator/requires) are out of scope (they drive body generation, not the signature manifest). The 8-agent step-2 council's deferred-runtime-registry resolution. +2 tests (const-constructibility proof + zero-fallback lookup); 640 contract lib green; clippy `-D warnings` clean. Plan: `.claude/plans/transcode-extend-core-probe-v1.md` (C step 2). Consumer: `ruff_cpp_codegen::render` (AdaWorldAPI/ruff) names this type in emit-text-only output.
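
To make the `const`-constructibility point concrete, here is a small sketch of a manifest in that shape. The field names come from the entry above, but the field types for `ret` and `overrides`, the `u32` width of `classid`, and the example registry contents are assumptions for illustration only; the real definitions live in `lance_graph_contract::codegen_manifest`.

```rust
/// Sketch of a signature record built only from `&'static` data and `bool`s,
/// so a value can appear in a `const` item.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MethodSig {
    pub name: &'static str,
    pub params: &'static [&'static str],
    pub ret: &'static str,   // assumption: return type kept as text
    pub is_const: bool,
    pub is_static: bool,
    pub overrides: bool,     // assumption: a plain flag
}

/// Registry entry: one class id plus its method table.
pub struct ClassMethods {
    pub classid: u32, // assumption: the real classid width is not stated here
    pub methods: &'static [MethodSig],
}

/// Pure lookup with zero-fallback: an unregistered classid yields an empty slice.
pub fn methods_for(registry: &[ClassMethods], classid: u32) -> &'static [MethodSig] {
    registry
        .iter()
        .find(|entry| entry.classid == classid)
        .map(|entry| entry.methods)
        .unwrap_or(&[])
}

// Illustrative generated output; the classid value and signature text are made up.
const EXAMPLE_METHODS: &[MethodSig] = &[MethodSig {
    name: "id_to_unichar",
    params: &["UNICHAR_ID"],
    ret: "const char*",
    is_const: true,
    is_static: false,
    overrides: false,
}];

pub const REGISTRY: &[ClassMethods] = &[ClassMethods {
    classid: 0x0001,
    methods: EXAMPLE_METHODS,
}];
```

A `String`-backed field would force a heap allocation at construction time, which a `const` initializer cannot perform; that is the constraint the entry cites for not reusing `FieldRef`.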

> **2026-06-16 — ADDED (4-task unblock-cascade)**: `lance_graph_contract::hhtl::NiblePath::{from_guid_prefix(&NodeGuid) -> Option<NiblePath>, prefix(depth: u8) -> Option<NiblePath>}` — the ontology-side keystone follow-up of #498's `classid → ReadMode` LE contract. The 20-nibble `classid · HEEL · HIP · TWIG` prefix is deterministically folded to 16 (the canon-reserved high `u16` of classid drops); returns `None` when the fold would be lossy (callers don't get silent collisions). `prefix(d)` is the O(1) single-shot ancestor view that satisfies `prefix(d).is_ancestor_of(self)` for every `d ≤ self.depth` — the routing-cache view of a deeper class path. **One layer up** in `cognitive-shader-driver::MailboxSoA<N>`: `impl MailboxSoaView + MailboxSoaOwner` (cherry-pick of `jolly-cori-clnf9::463d71b`) + the `pub phase: KanbanColumn` field — the in-RAM Rubicon owner the contract's `MailboxSoaOwner` had no real implementor for (integrated-cognitive-planner-v1 §2 Seam #3 closed). In `lance_graph::graph::scheduler`: `LanceVersionScheduler<S = NextPhaseScheduler>` — D-MBX-9-IN core impl over `VersionedGraph::versions()`, generic over the inner `VersionScheduler` policy (closes `E-SUBSTRATE-IS-THE-SCHEDULER`'s OUT-direction). In `surreal_container::view`: `SurrealMailboxView<'a>` + `read_via_kv_lance()` (D-PG-6 contract slice) — the SurrealQL read-glove the integrator wires once the cold-build of the surrealdb fork is taken; the contract surface is available today. Plus `SurrealContainerError::BlockedColdBuild` — typed signal for callers to pattern-match the cold-build gate (distinct from the pre-existing `Blocked` variant which signals coordinate/API gaps). Zero-dep contract additions (+7 hhtl tests, 632 lib green); cognitive-shader-driver +1 driving-loop test (86 lib green); lance-graph::scheduler new module (+5 tests, real tempdir Lance); surreal_container::view new module (+4 tests). All four green; clippy `-D warnings` clean on the new files. 
EPIPHANIES `E-UNBLOCK-CASCADE-1` records the convergence of three independent landings onto the single `MailboxSoaView` trait surface.
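
The ancestor invariant stated above, `prefix(d).is_ancestor_of(self)` for every `d ≤ self.depth`, can be illustrated with a toy path type. Everything here is hypothetical: the name `Path`, the packing of up to 16 nibbles into a `u64` with the most significant nibble first, and the constructor are assumptions chosen to make the invariant checkable, not a description of `hhtl::NiblePath`.

```rust
/// Toy nibble path: up to 16 nibbles packed into a u64, most significant first.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Path {
    packed: u64,
    depth: u8,
}

impl Path {
    /// Build a path from nibble values; `None` if too long or a value exceeds 0xF.
    pub fn new(nibbles: &[u8]) -> Option<Path> {
        if nibbles.len() > 16 || nibbles.iter().any(|n| *n > 0xF) {
            return None;
        }
        let mut packed = 0u64;
        for (i, n) in nibbles.iter().enumerate() {
            packed |= (*n as u64) << (60 - 4 * i as u32);
        }
        Some(Path { packed, depth: nibbles.len() as u8 })
    }

    /// Bit mask covering the first `depth` nibbles.
    fn mask(depth: u8) -> u64 {
        if depth == 0 { 0 } else { u64::MAX << (64 - 4 * depth as u32) }
    }

    /// O(1) ancestor view; `None` when asked for more nibbles than the path has.
    pub fn prefix(&self, depth: u8) -> Option<Path> {
        if depth > self.depth {
            return None;
        }
        Some(Path { packed: self.packed & Self::mask(depth), depth })
    }

    /// True when `self` is a (possibly equal) leading segment of `other`.
    pub fn is_ancestor_of(&self, other: &Path) -> bool {
        self.depth <= other.depth && (other.packed & Self::mask(self.depth)) == self.packed
    }
}
```

Returning `Option` from both `new` and `prefix` mirrors the entry's stance on lossy folds: the caller sees `None` instead of a silently truncated path.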

> **2026-06-09 — ADDED (D-IDENTITY-1, Phase A of identity-architecture)**: `lance_graph_contract::identity::{NodeGuid([u8;16]), IDENTITY_LAYOUT_VERSION}` — the workspace's first **stable binary instance identity**: a structured 128-bit UUIDv8 (RFC 9562) = the HHTL nibble-address **formalized + namespaced**. **Composed from existing committed scalars, never re-invented** (Agent A sweep confirmed the 128-bit id space was empty): octets carry `namespace:u8 | entity_type:u16 | kind:u8` (the `SchemaPtr.packed` convention) ⊕ a truncated `NiblePath` routing prefix (`PREFIX_NIBBLES=4`) ⊕ a 22-bit `shape_hash` (truncated `StructuralSignature`) ⊕ a 24-bit `local`, with UUIDv8 version(=8)/variant(=0b10) at their RFC-fixed positions + an `IDENTITY_LAYOUT_VERSION` stamp. **Eineindeutigkeit**: `entity_type` is the canonical exact class identity; the `NiblePath` prefix is the bijective DERIVED view (a *truncated* prefix can't be the identity — deep classes collide past it; the prefix `is_ancestor_of` the full path). Five readings: resolve (`entity_type`) / route (`niblepath`) / witness (frozen bytes + merkle) / ground-truth (`shape_hash` drift) / dispatch-to-store (`as_bytes` → `EntityKey`). Also added `hhtl::NiblePath::from_packed` (inverse of `packed`). Zero-dep; 599 contract lib tests (+15: field-isolation matrix, UUIDv8 gates, ancestor-prefix invariant, Display=canonical-UUID); clippy `-D warnings` clean; fmt clean. Plans: `identity-architecture-exists-vs-needs-v1.md` (exists-vs-needs map + phases A→H), `cognitive-write-roundtrip-substrate-v1.md`. Epiphany: E-IDENTITY-WHITEBOX-1.
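
The "version(=8)/variant(=0b10) at their RFC-fixed positions" part of this entry can be shown in isolation. The sketch below only stamps and checks those RFC 9562 bits on 16 raw octets and renders the canonical 8-4-4-4-12 form; the helper names are hypothetical, and where `NodeGuid` places its namespace, prefix, `shape_hash`, and `local` fields is not modeled.

```rust
/// Stamp UUIDv8 version and variant bits onto 16 raw octets (RFC 9562 layout).
pub fn stamp_uuid_v8(mut octets: [u8; 16]) -> [u8; 16] {
    octets[6] = (octets[6] & 0x0F) | 0x80; // version 8 in the high nibble of octet 6
    octets[8] = (octets[8] & 0x3F) | 0x80; // variant 0b10 in the top two bits of octet 8
    octets
}

/// Gate: true when the octets carry version 8 and the RFC variant.
pub fn is_uuid_v8(octets: &[u8; 16]) -> bool {
    octets[6] >> 4 == 8 && octets[8] >> 6 == 0b10
}

/// Canonical lowercase 8-4-4-4-12 hex rendering.
pub fn to_canonical(octets: &[u8; 16]) -> String {
    let mut s = String::with_capacity(36);
    for (i, b) in octets.iter().enumerate() {
        if matches!(i, 4 | 6 | 8 | 10) {
            s.push('-');
        }
        s.push_str(&format!("{b:02x}"));
    }
    s
}
```

Because the stamp overwrites six bits, any payload packed into the octets has to leave those positions free, which is why the usable custom space in a UUIDv8 is 122 bits rather than 128.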
27 changes: 23 additions & 4 deletions .claude/knowledge/core-first-transcode-doctrine.md
@@ -158,8 +158,9 @@ reviewed — not an excuse to fatten one adapter.

## The falsifier (CONJECTURE → FINDING gate)

Per `truth-architect` discipline, this doctrine is a CONJECTURE until measured.
The cheapest end-to-end probe:
Per `truth-architect` discipline, this doctrine was a CONJECTURE until measured.
**The byte-parity heart RAN GREEN in-env on 2026-06-17 (result below) — promoted
to FINDING for the unicharset adapter.** The cheapest end-to-end probe:

```
PROBE-OGAR-ADAPTER-UNICHARSET (P0)
@@ -174,8 +175,26 @@ PROBE-OGAR-ADAPTER-UNICHARSET (P0)
building the whole transcode.
```

Until this runs green, "the OGAR Core makes the transcode clean" is a
CONJECTURE. Do NOT scale the adapter approach across modules until it passes.
**RESULT — RAN GREEN (2026-06-17, in-env).** Step 1's byte-parity heart is
confirmed: `lance_graph_contract::unicharset::UniCharSet` (the content-store tier
+ `id_to_unichar` / `unichar_to_id` leaves) is **byte-identical to libtesseract**
on the real `eng.lstm-unicharset` — 112/112 entries, diffed against a C++
`UNICHARSET` FFI oracle (installed `libtesseract` 5.3.4 + `libleptonica` 1.82;
`examples/unicharset_dump.rs` vs the oracle harness). The falsifier did its job:
it found exactly one real convention (`NULL` file-token → `" "` space,
`unicharset.cpp:882`) — a one-line fix, NOT a Core gap. The doctrine's central
worry ("the adapter needs state the SoA tenants can't carry") is **refuted**: the
variable-length bijection rides the content-store tier (`deepnsm::Vocabulary`-
shaped) cleanly, no new node state.

So "the OGAR Core makes the transcode clean" is now a **FINDING** for the
unicharset adapter — the bijection / content-store pattern is validated and may be
scaled. **Honest scope:** steps 2–3 (compose via `classid → ClassView` resolver,
invoke through `UnifiedStep`) are mechanical wiring of a now-proven-correct
adapter; each method-body leaf remains its own byte-parity check, but the pattern
is no longer a conjecture. Leptonica is an *install*, not a transcode — it is only
a link dep of the C++ oracle, never in the Rust path (the unicharset path never
touches `Pix`).
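
The parity check itself is just a comparison of two text dumps. As a rough sketch of how a first mismatch could be located, assuming both dumps are already in memory as strings (the function name `first_difference` is hypothetical, and the actual probe uses a plain `diff` between the Rust example's output and the C++ oracle harness):

```rust
/// Return the first disagreement between two dumps as
/// (1-based line number, our line, oracle line), or `None` if they are byte-identical.
pub fn first_difference<'a>(
    ours: &'a str,
    oracle: &'a str,
) -> Option<(usize, Option<&'a str>, Option<&'a str>)> {
    // Split on '\n' only (not `lines()`), so a stray '\r' or a missing
    // trailing newline still counts as a difference.
    let mut a = ours.split('\n');
    let mut b = oracle.split('\n');
    let mut line_no = 0;
    loop {
        line_no += 1;
        match (a.next(), b.next()) {
            (None, None) => return None,
            (x, y) if x == y => continue,
            (x, y) => return Some((line_no, x, y)),
        }
    }
}
```

A `None` result corresponds to the "0 differences" state reported above; a `Some` result names the entry to inspect, in the same way the oracle pointed at the `NULL` token.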

## Anti-patterns this doctrine exists to catch
