|
10 | 10 |
|
11 | 11 | --- |
12 | 12 |
|
| 13 | +> **2026-06-15 — REVERTED (operator)** — the tesseract-rs `soa` wiring below was **deleted** (branch reset to master `420de08`). Operator: *"we don't want to use original Tesseract, we want to transcode it into Rust — delete everything you copied from original Tesseract into tesseract-rs."* Wrapping the original Tesseract C engine + parsing its TSV is the wrong direction; the real goal is a **pure-Rust OCR**. The contract-side transcode (`LayoutBlock::to_node_row`) + keystone STAY — they are OCR-engine-agnostic (a pure-Rust OCR feeds the same `LayoutBlock` → `NodeRow`); only the original-Tesseract coupling was removed. The strike-through entry below is retained per APPEND-ONLY. |
| 14 | +> |
| 15 | +> ~~**2026-06-15 — cross-repo landed** — **tesseract-rs fork wired to the transcode.**~~ *(REVERTED — see above)* `AdaWorldAPI/tesseract-rs` branch `claude/wonderful-hawking-lodtql` commit `1687c718`: opt-in `soa` feature (default-OFF — standalone OCR build untouched) + `src/soa.rs::tsv_to_nodes(tsv, classid, min_conf) -> Vec<NodeRow>` parsing tesseract `get_tsv_text` word rows → `contract::ocr::LayoutBlock` → `to_node_row`. Contract dep is a path dep mirroring smb-office-rs (sibling checkout). **Edition-2015 compatible** (the fork has no `edition` field → 2015: root `extern crate` + submodule root-relative `use` + explicit `TryInto` — all caught + fixed by verifying in a 2015 scratch crate against the real contract before pushing, 2 tests green). Pushed via `GH_TOKEN`+pygithub (out-of-MCP-scope fork). Could NOT compile the full crate here (no tesseract C-lib) — the transcode LOGIC is what's verified; the fork's own CI needs a co-located lance-graph for `--features soa`. |
| 16 | +> |
| 17 | +> **2026-06-15 — branch work (post-#496)** — **tesseract OCR → NodeRow transcode POC (keystone payoff).** `lance_graph_contract::ocr::LayoutBlock::to_node_row(classid, identity) -> NodeRow` — the reference transcode any `OcrProvider` (tesseract-rs + others) reuses, the keystone end-to-end: `classid → classid_read_mode → ValueSchema` gates WHICH tenants land; `BlockKind::entity_type() -> u16` → `ValueTenant::EntityType`, `confidence: f32` → `ValueTenant::Energy`, each written at its canon slab offset via the new `ValueTenant::{value_offset(), byte_len()}` (derived accessors over the locked carve — not new properties). **`text`/`bbox` are NOT bundled** (`I-VSA-IDENTITIES`: node = identity + typed scalars; the string + pixel geometry live in an external content store keyed by `identity`). Schema-gated (`schema.has(t)` before each write) so a Bootstrap-resolving class writes an empty slab; transcoded rows ride the `SoaEnvelope` zero-copy (verified). §0 anti-invention: reuses the existing EntityType/Energy tenants, no "ocr_kind" field. +4 tests; **623 contract lib green; clippy `-D warnings` + fmt clean.** Lives in the contract (next to the `ocr` types it uses, zero-dep, testable here — no OCR C-lib, no fork); tesseract-rs just adds the contract dep + calls it (integration step). Branch, not yet a PR. |
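
The schema-gated write described in this entry can be sketched as follows. This is a minimal illustration, not the contract's code: `Tenant`, `Schema`, `Block`, `SLAB_LEN` and `to_slab` are hypothetical stand-ins for `ValueTenant`, `ValueSchema`, `LayoutBlock` and `to_node_row`. The offsets 134 and 142 are the ones quoted in the `HelixResidue` entry of this log; the byte lengths (4 for the `f32`, 2 for the `u16`), the 144-byte slab length and little-endian encoding are assumptions.

```rust
// Hypothetical stand-ins; the real types live in `lance_graph_contract`.
const SLAB_LEN: usize = 144; // assumed: 32-byte prefix + 112-byte Full budget

#[derive(Clone, Copy, PartialEq, Eq)]
enum Tenant {
    Energy,
    EntityType,
}

impl Tenant {
    // Offsets as quoted in the log after the HelixResidue right-sizing.
    fn value_offset(self) -> usize {
        match self {
            Tenant::Energy => 134,
            Tenant::EntityType => 142,
        }
    }

    // Assumed widths: f32 confidence and u16 entity type.
    fn byte_len(self) -> usize {
        match self {
            Tenant::Energy => 4,
            Tenant::EntityType => 2,
        }
    }
}

struct Schema {
    tenants: &'static [Tenant],
}

impl Schema {
    fn has(&self, t: Tenant) -> bool {
        self.tenants.contains(&t)
    }
}

// Only typed scalars: no text and no bounding box travel with the row.
struct Block {
    entity_type: u16,
    confidence: f32,
}

fn write(slab: &mut [u8], t: Tenant, bytes: &[u8]) {
    assert_eq!(bytes.len(), t.byte_len());
    let o = t.value_offset();
    slab[o..o + bytes.len()].copy_from_slice(bytes);
}

// Each tenant is written only if the resolved schema carries it.
fn to_slab(block: &Block, schema: &Schema) -> [u8; SLAB_LEN] {
    let mut slab = [0u8; SLAB_LEN];
    if schema.has(Tenant::EntityType) {
        write(&mut slab, Tenant::EntityType, &block.entity_type.to_le_bytes());
    }
    if schema.has(Tenant::Energy) {
        write(&mut slab, Tenant::Energy, &block.confidence.to_le_bytes());
    }
    slab
}
```

With a schema that lists no tenants (the Bootstrap case in the entry), `to_slab` returns an all-zero slab, which matches the "writes an empty slab" behaviour described above.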
| 18 | +> |
| 19 | +> **2026-06-15 — branch work (post-#496)** — **keystone (contract half): GUID decode + classid→read-mode `LazyLock`.** `lance_graph_contract::canonical_node::{GuidParts, ReadMode, classid_read_mode}` + `NodeGuid::{heel(), hip(), twig(), decode() -> GuidParts, read_mode() -> ReadMode}` (re-exported from `lib.rs`). **The "read the GUID as a GUID" surface** the operator spec'd: `decode()` returns all six canon groups (classid + HHT·HEEL/HIP/TWIG + family·"Leaf" + identity) in one read; `ReadMode` bundles the two *already-existing* read-mode axes (`ValueSchema` + `EdgeCodecFlavor`) — **NOT a new node property, NOT a SoA column** (§0 anti-invention; it's the resolution lens, nothing stored on the row); `classid_read_mode(u32)` is the **single source both the consumer and OGAR inherit** — a `LazyLock<HashMap<u32,ReadMode>>` builtin registry (same immutable-after-init pattern `lance-graph-ontology` uses for its seed namespace registry), zero-fallback to `ReadMode::DEFAULT` for any unconfigured classid. `ReadMode::DEFAULT = {Full, CoarseOnly}` mirrors the `ClassView::value_schema` POC default (paired revert; `read_mode_default_is_full_poc` guards it). `Display` deduped onto the new HHT accessors. +6 tests (decode round-trip, HHT↔Display, read-mode single-source, carrier delegation, full-slab connect); **619 contract lib green; clippy `-D warnings` + fmt clean.** Delivers the contract-side half of the #496 keystone; the ontology-side `NiblePath::from_guid_prefix` (20→≤16-nibble subset) meets it at the classid (follow-up). Branch, not yet a PR. |
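
The immutable-after-init registry with a default fallback can be sketched with `std::sync::LazyLock` (stable since Rust 1.80). The type and function names mirror the entry, but the fields of `ReadMode`, the `Fine` variant and the registered classid `1` are illustrative assumptions, not the contract's actual definitions.

```rust
use std::collections::HashMap;
use std::sync::LazyLock;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ValueSchema {
    Bootstrap,
    Cognitive,
    Compressed,
    Full,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum EdgeCodecFlavor {
    CoarseOnly,
    Fine, // hypothetical second variant, for illustration only
}

// A resolution lens: nothing here is stored on the row itself.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ReadMode {
    value_schema: ValueSchema,
    edge_codec: EdgeCodecFlavor,
}

impl ReadMode {
    const DEFAULT: ReadMode = ReadMode {
        value_schema: ValueSchema::Full,
        edge_codec: EdgeCodecFlavor::CoarseOnly,
    };
}

// Built once on first access, read-only afterwards.
static REGISTRY: LazyLock<HashMap<u32, ReadMode>> = LazyLock::new(|| {
    let mut m = HashMap::new();
    // Hypothetical builtin entry.
    m.insert(
        1,
        ReadMode {
            value_schema: ValueSchema::Compressed,
            edge_codec: EdgeCodecFlavor::CoarseOnly,
        },
    );
    m
});

// Any classid without a registry entry resolves to the default.
fn classid_read_mode(classid: u32) -> ReadMode {
    REGISTRY.get(&classid).copied().unwrap_or(ReadMode::DEFAULT)
}
```

Because every caller goes through `classid_read_mode`, flipping `ReadMode::DEFAULT` back when the POC concludes would change the resolution for all unconfigured classes in one place.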
| 20 | +> |
| 21 | +> **2026-06-15 — branch work (post-#496)** — **helix `Signed360` codec + `HelixResidue` right-sized 48 B → 6 B.** Operator caught a slab over-allocation: `HelixResidue` reserved **48 *bytes*** but the intent was a 24-bit equal-area hemisphere **doubled = 48 *bits* = 6 B** (a bits→bytes slip; 42 dead bytes), and the tenant used **none** of the `helix` crate (zero-dep contract — only a doc string). Fixed: **(1) `helix::Signed360`** — the signed full-sphere codec: `HemispherePoint::signed_lift(n,N,sign)` (`y = sign·√(1−u)` → full sphere, `r²+y²=1`), `Sign{Pos,Neg}`, and `Signed360 {rim: ResidueEdge, polar: signed-lift centred@128 (sign recoverable), azimuth: u16 over 360°}` + `ResidueEncoder::encode_signed`. +9 tests; **helix 72 lib + 7 doctests green; lib clippy `-D warnings` + fmt clean.** **(2) contract** `HelixResidue.elems_per_row` 48→6, downstream tenants shifted (Turbovec 118 / Energy 134 / Plasticity 138 / EntityType 142), budgets re-locked (**Full 154→112, Compressed 98→56**); **613 contract green.** **NO `HelixFlavour` enum** — one canonical encoding, one tenant size (a fixed-offset SoA can't vary width per-class; Hemisphere = degenerate `sign=+`); the contract stays zero-dep, the producer writes `Signed360::to_bytes` into the 6 B. Cheap NOW (POC FULL default, no persisted real instances); after instances persist it's a version bump. Branch, not yet a PR. New: `TD-HELIX-PROBE-CLIPPY` (pre-existing `probe_mantissa_fill` clippy/fmt drift, NOT introduced here — helix is excluded so CI-invisible, same class as the standing `causal-edge` 47/1 red). |
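
The signed lift and the bits-to-bytes arithmetic in this entry can be checked with a small sketch. It takes `u` directly rather than deriving it from `(n, N)`, since the log does not state that mapping. The one-byte quantisation (`128 + round(127·y)`) is an assumed reading of "centred@128", not the `helix` crate's actual encoding.

```rust
// 24-bit hemisphere code doubled to cover the full sphere.
const HELIX_RESIDUE_BITS: usize = 2 * 24;
const HELIX_RESIDUE_BYTES: usize = HELIX_RESIDUE_BITS / 8; // 6, not 48

#[derive(Clone, Copy, Debug, PartialEq)]
enum Sign {
    Pos,
    Neg,
}

/// Lifts u in [0, 1] onto the unit circle cross-section:
/// r = sqrt(u), y = sign * sqrt(1 - u), so r^2 + y^2 = 1.
fn signed_lift(u: f64, sign: Sign) -> (f64, f64) {
    assert!((0.0..=1.0).contains(&u));
    let s = match sign {
        Sign::Pos => 1.0,
        Sign::Neg => -1.0,
    };
    (u.sqrt(), s * (1.0 - u).sqrt())
}

/// Assumed quantisation: map y in [-1, 1] to a byte centred at 128.
fn quantize_polar(y: f64) -> u8 {
    (128.0 + (y * 127.0).round()).clamp(1.0, 255.0) as u8
}

/// The sign is read back from which side of 128 the byte falls on.
/// At y = 0 (the equator) both signs describe the same point.
fn polar_sign(byte: u8) -> Sign {
    if byte >= 128 {
        Sign::Pos
    } else {
        Sign::Neg
    }
}
```

The hemisphere case is the restriction of `signed_lift` to `Sign::Pos`, which is why one fixed-width encoding can serve both without a per-class flavour switch.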
| 22 | +> |
| 23 | +> **2026-06-15 — MERGED #496** (integrated-cognitive-planner reference map + ValueSchema + FULL POC default): `lance_graph_contract::canonical_node::{ValueSchema, ValueTenant, VALUE_TENANTS}` — the value-side `EdgeCodecFlavor` analog (9 append-only tenants carving `[32,186)`; presets Bootstrap/Cognitive/Compressed/Full). `ClassView::value_schema()` default flipped **Bootstrap→Full (TEMPORARY POC** — every unconfigured class materialises the full slab so consumers transcode against it; `TD-VALUESCHEMA-FULL-POC-DEFAULT` revert-when-POC-concludes; type-level `ValueSchema::default()` stays Bootstrap, only class→schema *resolution* flips). New reference plan `.claude/plans/integrated-cognitive-planner-v1.md` — **§0 ANTI-INVENTION GUARDRAIL (READ FIRST)**, §1–§7 grounded file:line map, §8 7-item additive ledger, §9 3-hardener verdicts; the SPEC for the integrated-planner refactor (~90% exists; remaining = the keystone + 6 seams, NOT a new build). CI 5/5 green; contract 613 lib tests; merge `2e58e034`. **The keystone = `NiblePath::from_guid_prefix` (the 20→≤16-nibble subset) + classid→ClassView read-mode on `lance-graph-ontology::registry` (already an immutable conflict-refusing `entity_type↔NiblePath` bijection)** — the single next unblock that converges the refactor, the tesseract-rs OCR transcode (`contract::ocr` → NodeRow), AND the OGAR-identity migration (`soa-migration-diff-resolution-2026-06-13.md`). HEEL=cache `dolce_id` / HIP·TWIG=deterministic subClassOf descent / registry=recorder-not-minter (verified `registry.rs`+`wikidata_hhtl.rs`). New: `TD-COARSERESIDUE-NO-VALUE-TENANT`, `TD-LAZY-IMPORT-VERSION-PIN`; IDEAS CLAM-residue-ladder TODO. |
| 24 | +> |
13 | 25 | > **2026-06-13 — shipped (autoattended, cross-repo)** (turbovec ⇄ ndarray): new excluded standalone crate **`crates/lance-graph-turbovec`** — Google TurboQuant (arXiv 2504.19874, the AdaWorldAPI `turbovec` fork) bridged onto the spine. `TurboVec` wraps `turbovec::TurboQuantIndex` with a `Kernel::{NativeLut, PolyfillGemm}` A/B switch. **Cross-repo (branch `claude/wonderful-hawking-lodtql` in turbovec + ndarray + lance-graph):** turbovec re-pointed from crates.io `ndarray 0.17` → the AdaWorldAPI fork (path, P0 forks-only; `blas` opt-in so default builds BLAS-free; `rust-toolchain.toml` = 1.95.0); new `turbovec::search_polyfill` (feature `ndarray-simd`) expresses scoring as a batched int8 GEMM via **`ndarray::simd::matmul_i8_to_i32`** (re-exported through `simd.rs` — AMX `TDPBUSD` tile → AVX-512 VPDPBUSD → AVX-VNNI → scalar, dispatched inside ndarray, zero intrinsics in turbovec). **Measured finding (E-TURBOVEC-AMX-WRONG-TOOL-1):** the polyfill GEMM is 11.4× SLOWER than the native nibble-LUT (TurboQuant trades the matmul away → AMX accelerates the op it removed); native LUT stays production, polyfill is the AMX-ready baseline. Placement: index → spine, kernel-math → ndarray (already owns clam/cam_pq/cascade/amx_matmul). Synergy map (HDR popcount stacking early-exit, Belichtungsmesser σ thresholds, preheating vs palette256) in `crates/lance-graph-turbovec/KNOWLEDGE.md`. Tests green in all three repos; benchmark via `examples/kernel_speed.rs`. NOT a merged PR yet (branch work). |
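
"Scoring as a batched int8 GEMM" means computing every query-against-code dot product in one matrix multiply with `i32` accumulation. The scalar reference below illustrates that formulation only. It is not `turbovec::search_polyfill`, and `score_i8_gemm` and `top1` are hypothetical names that do not reflect the signature of `ndarray::simd::matmul_i8_to_i32`.

```rust
/// Scalar reference for batched scoring:
/// out[q * n_d + d] = sum over k of queries[q][k] * db[d][k],
/// with i8 inputs stored row-major and i32 accumulation.
fn score_i8_gemm(queries: &[i8], db: &[i8], n_q: usize, n_d: usize, dim: usize) -> Vec<i32> {
    assert_eq!(queries.len(), n_q * dim);
    assert_eq!(db.len(), n_d * dim);
    let mut out = vec![0i32; n_q * n_d];
    for q in 0..n_q {
        let q_row = &queries[q * dim..(q + 1) * dim];
        for d in 0..n_d {
            let d_row = &db[d * dim..(d + 1) * dim];
            out[q * n_d + d] = q_row
                .iter()
                .zip(d_row)
                .map(|(&a, &b)| i32::from(a) * i32::from(b))
                .sum();
        }
    }
    out
}

/// Index of the highest-scoring database row for one query.
fn top1(scores_row: &[i32]) -> Option<usize> {
    scores_row
        .iter()
        .enumerate()
        .max_by_key(|&(_, &s)| s)
        .map(|(i, _)| i)
}
```

A hardware path would replace the inner loops with a tiled int8 kernel. The entry's measured finding is that this formulation remains slower than the native nibble-LUT, so a sketch like this serves as a baseline, not a production path.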
14 | 26 | > |
15 | 27 | > **2026-06-03 — hardened (follow-up after #460)** (D-HELIX-1 wiring): `crates/helix` now takes **ndarray as a MANDATORY, non-optional git dependency** (`git = AdaWorldAPI/ndarray @ master`), replacing the optional `path` dep + `ndarray-hpc` feature. Why: (1) codex P2 — an optional *path* dep still forces Cargo to read the local sibling manifest at resolution, so a clean checkout failed before feature selection; (2) directive "ndarray is mandatory for lance-graph". `simd.rs` always uses `ndarray::simd` (no scalar fallback); the self-contained fork → no import cycle. 63 unit + 6 doctests green; clippy/fmt clean. See E-HELIX-NDARRAY-MANDATORY. |
|