From 18240ec30368182f9cd752aaf0ef9e0f002fe39a Mon Sep 17 00:00:00 2001
From: Claude
Date: Wed, 29 Apr 2026 19:37:34 +0000
Subject: [PATCH] =?UTF-8?q?plan:=20grammar-foundry-followup-v1=20=E2=80=94?=
 =?UTF-8?q?=2013=20PRs=20to=20wire=20stubs=20to=20tissue?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Files the integration plan for the follow-up of PRs #275-#283 (Foundry +
Grammar). Six explicit stub/skeleton/placeholder/unimplemented! markers
in the merged code name what remains.

13 PRs across two parallel tracks (6 Foundry + 6 Grammar) sharing one
keystone (LF-12 Pipeline DAG). Three waves: Wave 1 (8 parallel, no deps),
Wave 2 (4 PRs after S1/F1), Wave 3 (G6 Animal Farm after G1+G2+G3).

Board hygiene: INTEGRATION_PLANS.md prepended, STATUS_BOARD.md section
added with 13 D-ids all Queued.

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj
---
 .claude/board/INTEGRATION_PLANS.md           |  16 ++
 .claude/board/STATUS_BOARD.md                |  35 ++++
 .claude/plans/grammar-foundry-followup-v1.md | 166 +++++++++++++++++++
 3 files changed, 217 insertions(+)
 create mode 100644 .claude/plans/grammar-foundry-followup-v1.md

diff --git a/.claude/board/INTEGRATION_PLANS.md b/.claude/board/INTEGRATION_PLANS.md
index 803230d8..9a4d8881 100644
--- a/.claude/board/INTEGRATION_PLANS.md
+++ b/.claude/board/INTEGRATION_PLANS.md
@@ -36,6 +36,22 @@
 
 ---
 
+## v1 — Grammar + Foundry Follow-up (authored 2026-04-29)
+
+**Author:** main thread (Opus 4.7), session 2026-04-29
+**Status:** Active
+**Scope:** Wire the stubs and scaffolds shipped in PRs #275-#283 to existing tissue. Six explicit `stub`/`skeleton`/`placeholder`/`unimplemented!` markers in the merged code (verified by grep) name what remains. 13 PRs across two parallel tracks (6 Foundry + 6 Grammar) sharing one keystone (LF-12 Pipeline DAG). All deliverables target `main` directly; no stacking PRs (avoids the merge-order orphaning that bit #281/#283 → #284/#285).
+**Path:** `.claude/plans/grammar-foundry-followup-v1.md`
+**Deliverables:** PR-S1 (Pipeline DAG keystone), PR-F1..F6 (Foundry: PolicyRewriter UDF wrap, Encrypt+DP, Lance audit, PostgREST dispatch, audit_from_plan, dn_path scent), PR-G1..G6 (Grammar: Triangle causality, Disambiguator wiring, ContextChain fp, verb_table seed, AriGraph unbundle, Animal Farm real run).
+**Cross-refs:**
+- `lf-integration-mapping-v1.md` — LF-12 keystone rationale (PR-S1)
+- `foundry-roadmap.md` — original PR-1..PR-5 (PR-1/PR-2 shipped as #278/#280; PR-3..PR-5 ship as PR-F1..F4 here)
+- `integration-plan-grammar-crystal-arigraph.md` — original AriGraph follow-up (now ships as PR-G5)
+- `grammar-landscape.md` — case inventories that PR-G4 consumes
+**Open decisions:** (1) PR-F2 encryption key management (KMS? in-process? user-supplied?); (2) PR-G6 Animal Farm text licensing; (3) PR-F6 bgz-tensor → callcenter dep; (4) PR-G4 ownership.
+
+---
+
 ## v1 — LF Integration Mapping (authored 2026-04-25)
 
 **Author:** main thread (Opus 4.7 1M), session 2026-04-25 (branch claude/scenario-world-facade)
diff --git a/.claude/board/STATUS_BOARD.md b/.claude/board/STATUS_BOARD.md
index b28d64fc..08889a45 100644
--- a/.claude/board/STATUS_BOARD.md
+++ b/.claude/board/STATUS_BOARD.md
@@ -281,6 +281,41 @@ pattern IS the Supabase-shape transcode approach).
 
 ---
 
+## grammar-foundry-followup-v1 — Wire stubs to existing tissue
+
+Plan: `.claude/plans/grammar-foundry-followup-v1.md`. Session 2026-04-29.
+Six explicit stubs in PRs #275-#283 + 1 keystone (LF-12 Pipeline DAG). 13 PRs total in 3 waves.
+
+### Wave 1 — no deps (parallel)
+
+| D-id | Title | Status | Notes |
+|---|---|---|---|
+| PR-S1 | LF-12 Pipeline DAG: `UnifiedStep.depends_on` + topological executor | **Queued** | Keystone. Unblocks F4, G2, G6 |
+| PR-F1 | PolicyRewriter UDF wrap: `RedactionMode` executors (closes `policy.rs:122`) | **Queued** | Unblocks F2, F5 |
+| PR-F3 | Audit log Lance-backed writer (closes `lib.rs:100`) | **Queued** | |
+| PR-F6 | `dn_path.rs` real scent via CAM-PQ (closes `dn_path.rs:53`) | **Queued** | Risk: bgz-tensor dep |
+| PR-G1 | Triangle bridge real Causality footprint (closes `triangle_bridge.rs:90,221`) | **Queued** | |
+| PR-G3 | ContextChain real `Binary16K` fingerprint (closes `context_chain.rs:345`) | **Queued** | |
+| PR-G4 | verb_table seed 10/12 families (closes empty `default_table()` rows) | **Queued** | |
+| PR-G5 | AriGraph episodic unbundle/rebundle (per `integration-plan-grammar-crystal-arigraph.md`) | **Queued** | |
+
+### Wave 2 — depends on Wave 1
+
+| D-id | Title | Status | Notes |
+|---|---|---|---|
+| PR-F2 | RowEncryption + DifferentialPrivacy executors (closes `policy.rs:147,181`) | **Queued** | After F1; needs key-mgmt ADR |
+| PR-F4 | PostgREST → DataFusion dispatch (closes `EchoHandler` stub) | **Queued** | After S1 |
+| PR-F5 | `audit_from_plan()` helper (closes `orchestration.rs:202` `unimplemented!`) | **Queued** | After F1 |
+| PR-G2 | Disambiguator wiring at parser boundary + FailureTicket emission | **Queued** | After S1 |
+
+### Wave 3 — depends on Waves 1+2
+
+| D-id | Title | Status | Notes |
+|---|---|---|---|
+| PR-G6 | Animal Farm harness real run (D10 from PR #243) | **Queued** | After G1+G2+G3; text licensing needed |
+
+---
+
 ## unified-integration-v1 — PersonaHub × ONNX × Archetype × MM-CoT × RoleDB
 
 Plan: `.claude/plans/unified-integration-v1.md`. Session 2026-04-23.
diff --git a/.claude/plans/grammar-foundry-followup-v1.md b/.claude/plans/grammar-foundry-followup-v1.md
new file mode 100644
index 00000000..fd3f316a
--- /dev/null
+++ b/.claude/plans/grammar-foundry-followup-v1.md
@@ -0,0 +1,166 @@
+# Grammar + Foundry Follow-up — v1
+
+> **Status:** Active (2026-04-29)
+> **Author:** main thread (Opus 4.7), session 2026-04-29
+> **Scope:** Wire the stubs and scaffolds shipped in PRs #275-#283 to existing tissue. 13 PRs across two parallel tracks (6 Foundry + 6 Grammar) sharing one keystone (LF-12 Pipeline DAG). All deliverables target `main` directly; no stacking PRs (avoids the merge-order orphaning that bit #281/#283 → #284/#285).
+> **Cross-refs:**
+> - `lf-integration-mapping-v1.md` — LF-12 keystone rationale
+> - `foundry-roadmap.md` — original PR-1..PR-5 sequence (PR-1/PR-2 shipped as #278/#280; this plan ships PR-3..PR-5 as PR-F1..F4)
+> - `integration-plan-grammar-crystal-arigraph.md` — original AriGraph follow-up (now ships as PR-G5)
+> - `grammar-landscape.md` — case inventories that PR-G4 consumes
+
+## Epiphany
+
+The shipped Foundry+Grammar surface is 70% scaffold, 30% executor. Six explicit `stub`/`skeleton`/`placeholder`/`unimplemented!` markers in the merged code (verified by grep) name what remains. The next pass is **wiring**, not new scaffolding.
+
+## Inventory: shipped scaffold vs. remaining stub
+
+### Foundry — `crates/lance-graph-callcenter`
+
+| Module | Wired | Stub |
+|---|---|---|
+| `rls.rs` (920 LOC) | RLS `OptimizerRule` predicates, sealed registry, 23 tests | actor_id column wiring (`filter_expr.rs:53`) |
+| `audit.rs` (310 LOC) | `AuditSink` trait, `InMemoryAuditSink`, FNV hash | **Lance-backed writer** (`lib.rs:100`) |
+| `postgrest.rs` (954 LOC) | URL parse, filter ops (eq/in/ilike/like/is), `EchoHandler` | **Real DataFusion dispatch** |
+| `lance_membrane.rs` | `with_registry()`, Plugin handshake, atomic ops | — |
+| `policy.rs` (309 LOC) | `PolicyRewriter` trait, `ColumnMaskRewriter` skeleton | **UDF column wrap** (`policy.rs:122`); **`RowEncryptionPolicy`** (`policy.rs:147`); **`DifferentialPrivacyPolicy`** (`policy.rs:181`) |
+| `dn_path.rs` | DN path parser | **Real scent** (`scent_stub` XOR-fold at `dn_path.rs:53`) |
+| `orchestration.rs` (contract) | `StepDomain`, `DomainProfile`, `Escalation`, `VerbTaxonomyId` | **`audit_from_plan()`** (`orchestration.rs:202` `unimplemented!`) |
+
+### Grammar — `crates/lance-graph-contract/src/grammar/` + `crates/deepnsm/`
+
+| Module | Wired | Stub |
+|---|---|---|
+| `role_keys.rs` (313 LOC) | 16384-dim slice catalogue, `LazyLock` arrays | — |
+| `context_chain.rs` (+355) | `coherence_at`, `replay_with_alternative`, `WeightingKernel` | **Real `Binary16K` chosen-fingerprint** (`context_chain.rs:345`) |
+| `thinking_styles.rs` (+725) | `GrammarStyleConfig`, NARS revision, 12 YAML configs | **Persistence E6** (`thinking_styles.rs:1209`) |
+| `verb_table.rs` (120 LOC) | 144-cell `VerbFamily × Tense` schema | **10/12 family rows** (only Becomes + Causes seeded) |
+| `disambiguator.rs` (137 LOC) | `Disambiguatable` trait, free function, 4 tests | **Caller wiring** (no parser callsite) |
+| `markov_bundle.rs` | Ring buffer, role-indexed bundling | — |
+| `trajectory.rs` | `Trajectory`, `role_bundle`, `role_candidates` | — |
+| `ticket_emit.rs` (181 LOC) | `FailureTicket` decomposition | **Caller wiring** (parser doesn't emit) |
+| `triangle_bridge.rs` (138 LOC) | NSM × Causality × Qualia merge | **Real Causality footprint** (`triangle_bridge.rs:90,221`) |
+| `quantum_mode.rs` | `PhaseTag(u128)`, `HolographicMode` | **Dispatch** (no consumer reads `HolographicMode`) |
+| `tests/animal_farm_harness.rs` | `EpiphanyPrediction`, `evaluate()`, 4 tests | **Real run on Animal Farm text** (D10 from PR #243) |
+
+## Deliverables
+
+### Shared keystone
+
+**PR-S1 — LF-12 Pipeline DAG**
+- Add `UnifiedStep.depends_on: Vec` to `contract::orchestration`
+- Topological executor in `lance-graph-planner::pipeline` (~300 LOC)
+- Unblocks: PR-F4, PR-G2, PR-G6
+- Effort: L. Per `lf-integration-mapping-v1.md` §"Open questions": schema field add + executor as one PR.
+
+### Foundry track (6 PRs)
+
+**PR-F1 — PolicyRewriter UDF wrap**
+- File: `policy.rs:120-124` skeleton → real
+- Wire `Expr::Column(c) → mask_udf(c)` for each column in `ColumnMaskRewriter::columns`
+- Wire `RedactionMode::{Null, Constant, Hash, Truncate}` to four UDF impls
+- Tests: 5 tests per `RedactionMode`
+- Effort: S–M (~250 LOC)
+
+**PR-F2 — RowEncryption + DifferentialPrivacy executors**
+- Files: `policy.rs:147` and `policy.rs:181` stubs → executors
+- Encryption: AES-GCM (or chacha20)
+- DP: Laplace noise on aggregate columns marked `Marking::Pii`
+- Depends on PR-F1 (UDF dispatch shape)
+- Effort: M (~200 LOC + dep)
+- Open ADR: encryption key management (KMS? in-process? user-supplied?)
+
+**PR-F3 — Audit log Lance-backed writer**
+- File: `audit.rs` — new `LanceAuditSink` (~200 LOC)
+- One row per `AuditEntry` to a Lance dataset
+- Reuse `LanceMembrane::with_registry()` for wiring
+- Tests: scan-back-after-flush, 1k roundtrip
+- Effort: M
+- Cross-ref: `foundry-roadmap.md` PR-2 ("append-only Lance table")
+
+**PR-F4 — PostgREST → DataFusion dispatch**
+- File: `postgrest.rs` `EchoHandler` → real DataFusion (~400 LOC)
+- Parse → `LogicalPlan` → `MembraneRegistry` (RLS rewrite) → `RecordBatch` → JSON
+- Tests: 10–15 PostgREST shape tests with real Lance scans
+- Depends on PR-S1
+- Effort: L (split if needed: dispatch-without-RLS, then RLS-aware)
+
+**PR-F5 — `audit_from_plan()` helper**
+- File: `orchestration.rs:202` `unimplemented!` → helper
+- Capture rewritten DataFusion `LogicalPlan` into `AuditEntry.rewritten_plan: Option` after RLS
+- Touches: `orchestration.rs`, `audit.rs` (new field)
+- Depends on PR-F1 (datafusion-plan feature non-stub)
+- Effort: S (~80 LOC)
+
+**PR-F6 — `dn_path.rs` real scent**
+- File: `dn_path.rs:53` `scent_stub` → real CAM-PQ lookup
+- Use existing `bgz-tensor::CamCodecContract`
+- Effort: S
+- Risk: pulls bgz-tensor into callcenter dep tree
+
+### Grammar track (6 PRs)
+
+**PR-G1 — Triangle bridge real Causality footprint**
+- Files: `triangle_bridge.rs:90,221` neutral 0.5 placeholder → real Pearl 2³ projection
+- Each parsed sentence → `CausalEdge64` from SPO triple → triangle's causality channel
+- Tests: two NSM-bound sentences, same Subject, different Pearl masks → different triangle outputs
+- Effort: M (~100 LOC)
+
+**PR-G2 — Disambiguator wiring at parser boundary**
+- File: `crates/deepnsm/src/parser.rs` — call `disambiguate_general` when `coverage < LOCAL_COVERAGE_THRESHOLD`
+- Emit `FailureTicket` (already in `ticket_emit.rs`) when can't resolve
+- Tests: Wechsel "auf den Tisch / auf dem Tisch" → disambiguator picks Akk vs Dat
+- Depends on PR-S1
+- Effort: M (~150 LOC)
+
+**PR-G3 — ContextChain real Binary16K fingerprint**
+- File: `context_chain.rs:345` placeholder zero-fingerprint → real `MarkovBundler::role_bundle()` output
+- Effort: S (~80 LOC)
+
+**PR-G4 — verb_table seed beyond Becomes/Causes**
+- File: `verb_table.rs` `default_table()` — populate 10 remaining `VerbFamily` rows × 12 `Tense` cells
+- 120 cells of `SlotPrior` data from `grammar-landscape.md` §3
+- Tests: 12 sentences (one per VerbFamily) → priors land non-zero
+- Effort: M (~300 LOC mostly content)
+- Risk: linguistic judgment — flag for `family-codec-smith` review
+
+**PR-G5 — AriGraph episodic unbundle/rebundle**
+- File: `crates/lance-graph/src/graph/arigraph/episodic.rs`
+- Add `unbundle_hardened()`, `unbundle_targeted()`, `rebundle_cold()` per `integration-plan-grammar-crystal-arigraph.md`
+- Wire to `UNBUNDLE_HARDNESS_THRESHOLD`
+- Effort: M (~200 LOC)
+
+**PR-G6 — Animal Farm harness real run**
+- File: `crates/deepnsm/tests/animal_farm_harness.rs` — feed actual Animal Farm chapters
+- Verify `chapter_10_acc > chapter_1_acc` (the AGI thesis from PR #243)
+- Depends on PR-G1 + PR-G2 + PR-G3 (otherwise harness measures placeholder behavior)
+- Effort: L (text licensing + ground-truth annotation are real work)
+
+## Sequencing
+
+```
+Wave 1 (parallel, no deps): S1, F1, F3, F6, G1, G3, G4, G5
+Wave 2 (depends on Wave 1): F2 (after F1), F4 (after S1), F5 (after F1), G2 (after S1)
+Wave 3 (depends on Waves 1+2): G6 (after G1, G2, G3)
+```
+
+## Out of scope
+
+- LF-13/14/15+ (cron, row-level lineage, additional connectors) — separate connector-tier plan after LF-12 lands
+- LF-50/52 (ModelRegistry + LlmProvider) — Stage 5 of `lf-integration-mapping-v1.md`
+- Cross-linguistic bundling (Finnish/Russian/Turkish) — needs parallel corpora; separate session
+- LF-71 redesign / scenario-world — already in PR via `claude/scenario-world-facade`
+- Quantum mode dispatch wiring — defer until one consumer requests `SinglePhase` vs `PerRole`
+
+## Open decisions
+
+1. **PR-F2 encryption key management.** KMS? in-process? user-supplied? Block until ADR.
+2. **PR-G6 Animal Farm text licensing.** Public domain in some jurisdictions. Gutenberg has it. Repo policy on bundling text data?
+3. **PR-F6 bgz-tensor → callcenter dep.** bgz-tensor is currently `exclude`d from workspace. Pull in or keep `dn_path::scent` best-effort?
+4. **PR-G4 ownership.** 120 cells whose content quality depends on linguistic judgment. Single-author or split by language family?
+
+## Update protocol
+
+This plan stays Active until all 13 PRs ship. Each PR-X gets a row in `STATUS_BOARD.md` under "grammar-foundry-followup-v1" section. PR merges flip Status: Queued → In progress → In PR → Shipped. Open decisions resolve via ADR in `.claude/decisions/` before their dependent PR opens.
+
+End of v1.
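
The wave split in the Sequencing section follows mechanically from the dependency edges listed under Deliverables: a PR's wave is one more than the deepest wave among its dependencies. The sketch below checks that split. It is an illustration under stated assumptions, not the PR-S1 executor: the function name `waves`, the `BTreeMap` representation, and the short ids without the `PR-` prefix are all hypothetical and do not come from `contract::orchestration` or `lance-graph-planner::pipeline`.

```rust
use std::collections::BTreeMap;

/// Assigns each PR a wave number: 1 + the highest wave among its dependencies.
/// Returns `None` if a dependency is unknown or the graph contains a cycle.
/// (Hypothetical helper for illustration; not part of the planned executor.)
fn waves(deps: &BTreeMap<&str, Vec<&str>>) -> Option<BTreeMap<String, usize>> {
    let mut wave: BTreeMap<String, usize> = BTreeMap::new();
    // Kahn-style layering: each pass places every PR whose dependencies
    // were all placed in earlier passes.
    while wave.len() < deps.len() {
        let mut placed = Vec::new();
        for (pr, ds) in deps {
            if wave.contains_key(*pr) {
                continue;
            }
            // Highest wave among dependencies; None if any is not placed yet.
            let mut depth = Some(0usize);
            for d in ds {
                depth = match (depth, wave.get(*d)) {
                    (Some(cur), Some(w)) => Some(cur.max(*w)),
                    _ => None,
                };
            }
            if let Some(max_dep) = depth {
                placed.push((pr.to_string(), max_dep + 1));
            }
        }
        if placed.is_empty() {
            return None; // no progress: cycle or dependency on an unknown PR
        }
        wave.extend(placed);
    }
    Some(wave)
}

fn main() {
    // Dependency edges as stated in the Deliverables section of the plan.
    let deps: BTreeMap<&str, Vec<&str>> = BTreeMap::from([
        ("S1", vec![]), ("F1", vec![]), ("F3", vec![]), ("F6", vec![]),
        ("G1", vec![]), ("G3", vec![]), ("G4", vec![]), ("G5", vec![]),
        ("F2", vec!["F1"]), ("F4", vec!["S1"]), ("F5", vec!["F1"]),
        ("G2", vec!["S1"]), ("G6", vec!["G1", "G2", "G3"]),
    ]);
    let assigned = waves(&deps).expect("dependency graph must be acyclic");
    for (pr, n) in &assigned {
        println!("{pr}: wave {n}");
    }
}
```

With the edges above, the eight dependency-free PRs land in wave 1, F2/F4/F5/G2 in wave 2, and G6 in wave 3, matching the Sequencing block. Returning `None` on a cycle or unknown id is one possible design choice; a real executor might prefer a typed error that names the offending step.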