Skip to content

Commit 0e5ab98

Browse files
committed
fix(spine): address PR #432 opus 4.8 reviewer findings
Three correctness bugs + one architectural disclaimer + three polish items from the PR review pass. No new dependencies; all tests green. ## Bugs fixed ### 1. roundtrip_eq truth check inverted from default semantics (codegen_spine.rs:161 — `if tol > 0.0` gated the entire check) The doc claimed `truth_tolerance() = 0.0` meant "lossless / strictest." The code SKIPPED the truth check when tol = 0.0, so a projection that zeroed every (f, c) round-tripped green with no truth validation. The default-case bug was masked because the existing LossyDropFrequency test overrides tolerance to 0.01. Fix: drop the `if tol > 0.0` gate; the `abs() > tol` comparison naturally treats 0.0 as "any difference fails." Doc reworded to clarify "0.0 requires exact match; override to allow quantization." New test `default_tolerance_requires_exact_truth_match` pins the fix: a `TruthMangler` projection (preserves s/p/o, zeros f/c) MUST fail with the default tolerance. ### 2. parse_triples silently dropped malformed lines (odoo_ontology.rs:73-83 — `filter_map(|l| serde_json::from_str(l).ok())`) Lines that fail JSON parse were silently dropped. The `parses_all_triples` count assertion would catch *some* drift but a corrupted line that happens to keep counts aligned would go silent. Fix: new test `every_nonempty_line_parses` asserts the raw non-empty line count equals `parse_triples(ndjson).len()` — any silent drop fires the assertion. (Kept `parse_triples` returning `Vec` for API stability; the test catches what an error-returning API would.) ### 3. load_ontology collapsed duplicate (s,p,o) keys silently (odoo_ontology.rs:90-104 — HashMap last-write-wins via dn_hash) The module doc said "the extractor de-duplicates" — an unverified assumption. If the harvester ever emits two triples with the same (s, p, o) but different truth values, the second silently overwrites the first. Fix: new test `duplicate_spo_keys_are_last_write_wins` pins the overwrite semantics. A future switch to insertion-rejection or merge becomes a test failure instead of a silent behaviour change. ## Architectural disclaimer `OdooMethodKind` (codegen_spine.rs:230) is the bucket *catalogue*; the *classifier* (Rust port of `.claude/odoo/openings_hops.py`) is a follow-up emitter. action_emitter intentionally does NOT carry a `kind` field — the doc now spells out the wiring gap explicitly so the next session doesn't read this PR as "kind is wired" when it isn't. ## Polish - `RouteBucket::id_owned(&self) -> String` default method added (codegen_spine.rs:336-340). Escape hatch for async/iterator pipelines that need an owned id outside the borrow scope. - `impl std::error::Error for WidgetRenderError` (codegen_spine.rs) for ecosystem `?`-propagation compatibility. - `#[serde(deny_unknown_fields)]` on `OntologyTriple` so harvester schema drift fails parsing instead of silently dropping fields. - New test `function_count_matches_module_doc` in odoo_ontology.rs pins the "3 328 Functions" doc claim against drift inside the loader crate (was only asserted by the downstream action_emitter). - `compose_spec` now builds `effects` from the borrowed `effects_set` directly (action_emitter.rs:255-258), eliminating one redundant `idx.emits_by_fn.get(fn_id)` lookup. ## Test counts (orchestrator-verified) - lance-graph-contract codegen_spine: 6 → 7 passed (+1 truth-mangler) - lance-graph graph::spo:: total: 75 → 78 passed (+3 ontology tests) Fixes from PR #432 opus 4.8 reviewer (findings 1-3 must-fix, 5+7 should-discuss accepted, 4 doc-disclaimer accepted, 6+8 deferred, nitpicks 1+2+3 accepted).
1 parent 60f72b1 commit 0e5ab98

3 files changed

Lines changed: 150 additions & 19 deletions

File tree

crates/lance-graph-contract/src/codegen_spine.rs

Lines changed: 86 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -115,7 +115,9 @@ pub trait TripletProjection {
115115
}
116116

117117
/// Tolerance for `f`/`c` comparison in `roundtrip_eq`. Default 0.0
118-
/// (lossless); override to allow quantised projections.
118+
/// (exact equality required); override to allow quantised projections.
119+
/// The check is ALWAYS run — `0.0` does NOT skip it; it requires the
120+
/// recovered truth value to match the source exactly.
119121
fn truth_tolerance() -> f32 {
120122
0.0
121123
}
@@ -156,24 +158,23 @@ pub fn roundtrip_eq<P: TripletProjection>(input: &[Triple]) -> Result<(), RoundT
156158
});
157159
}
158160

159-
// Truth-value tolerance check.
161+
// Truth-value tolerance check — always run; tol = 0.0 means strict
162+
// (any difference fails the `abs() > tol` check naturally).
160163
let tol = P::truth_tolerance();
161-
if tol > 0.0 {
162-
let in_truth: std::collections::BTreeMap<_, _> =
163-
input.iter().map(|t| (t.key(), (t.f, t.c))).collect();
164-
for r in &regenerated {
165-
if let Some((f0, c0)) = in_truth.get(&r.key()) {
166-
if (r.f - f0).abs() > tol || (r.c - c0).abs() > tol {
167-
return Err(RoundTripFailure {
168-
projection: P::name(),
169-
input_count: in_keys.len(),
170-
output_count: out_keys.len(),
171-
missing_count: 0,
172-
extraneous_count: 0,
173-
sample_missing: vec![r.key()],
174-
sample_extraneous: vec![],
175-
});
176-
}
164+
let in_truth: std::collections::BTreeMap<_, _> =
165+
input.iter().map(|t| (t.key(), (t.f, t.c))).collect();
166+
for r in &regenerated {
167+
if let Some((f0, c0)) = in_truth.get(&r.key()) {
168+
if (r.f - f0).abs() > tol || (r.c - c0).abs() > tol {
169+
return Err(RoundTripFailure {
170+
projection: P::name(),
171+
input_count: in_keys.len(),
172+
output_count: out_keys.len(),
173+
missing_count: 0,
174+
extraneous_count: 0,
175+
sample_missing: vec![r.key()],
176+
sample_extraneous: vec![],
177+
});
177178
}
178179
}
179180
}
@@ -226,6 +227,19 @@ impl fmt::Display for RoundTripFailure {
226227
/// Static codegen reads it. Askama route SoC reads it. GUI widget templates
227228
/// read it. Adding a 17th opening = one variant + one `match` arm in every
228229
/// consumer (compiler-enforced exhaustiveness).
230+
///
231+
/// # Classifier wiring is a separate emitter (TBD)
232+
///
233+
/// This enum is the *bucket catalogue*. The function that takes a method
234+
/// body / AST / `ActionSpec` and returns the matching `OdooMethodKind`
235+
/// lives in a downstream classifier emitter (the Rust port of
236+
/// `.claude/odoo/openings_hops.py`'s priority cascade). It is intentionally
237+
/// NOT wired into `lance_graph::graph::spo::action_emitter` yet —
238+
/// `ActionSpec` carries the structural edges (effects / inputs / raises /
239+
/// reads / traverses); the kind classification gets bolted on by the
240+
/// classifier in a follow-up PR. Until then, consumers that need a kind
241+
/// should resolve it via the priority classifier directly, not by
242+
/// inspecting `ActionSpec`.
229243
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
230244
pub enum OdooMethodKind {
231245
/// `pass` body — explicit no-op framework override.
@@ -332,6 +346,13 @@ pub trait RouteBucket {
332346

333347
/// Stable identity of this route (e.g. `account.move._compute_amount`).
334348
fn id(&self) -> &str;
349+
350+
/// Owned-id escape hatch for async/iterator pipelines that need to
351+
/// outlive a `&dyn RouteBucket` borrow. Defaults to cloning `id()`;
352+
/// implementors with a pre-allocated owned string can override.
353+
fn id_owned(&self) -> String {
354+
self.id().to_string()
355+
}
335356
}
336357

337358
// ---------------------------------------------------------------------------
@@ -363,6 +384,8 @@ impl fmt::Display for WidgetRenderError {
363384
}
364385
}
365386

387+
impl std::error::Error for WidgetRenderError {}
388+
366389
// ---------------------------------------------------------------------------
367390
// ④ Genericity — what to codegen vs what to read from the triple store
368391
// ---------------------------------------------------------------------------
@@ -491,6 +514,51 @@ mod tests {
491514
}
492515
}
493516

517+
/// Identity-preserving but truth-mangling projection — every (s,p,o)
518+
/// round-trips, but (f, c) come back as (0.0, 0.0). With the default
519+
/// `truth_tolerance() = 0.0`, this MUST fail the roundtrip check.
520+
struct TruthMangler;
521+
522+
impl TripletProjection for TruthMangler {
523+
type Const = Vec<(String, String, String)>;
524+
525+
fn project(triples: &[Triple]) -> Self::Const {
526+
triples
527+
.iter()
528+
.map(|t| (t.s.clone(), t.p.clone(), t.o.clone()))
529+
.collect()
530+
}
531+
532+
fn decompile(c: &Self::Const) -> Vec<Triple> {
533+
c.iter()
534+
.map(|(s, p, o)| Triple {
535+
s: s.clone(),
536+
p: p.clone(),
537+
o: o.clone(),
538+
f: 0.0,
539+
c: 0.0,
540+
})
541+
.collect()
542+
}
543+
// No truth_tolerance() override — default 0.0.
544+
}
545+
546+
#[test]
547+
fn default_tolerance_requires_exact_truth_match() {
548+
let input = fixture();
549+
let result = roundtrip_eq::<TruthMangler>(&input);
550+
// Default tolerance is 0.0 → must reject any truth mismatch
551+
// (input has f=1.0 / 0.95, decompiled has f=0.0).
552+
match result {
553+
Err(failure) => {
554+
assert!(failure.projection.contains("TruthMangler"));
555+
}
556+
Ok(()) => {
557+
panic!("TruthMangler must fail with default tolerance 0.0 (truth values differ)");
558+
}
559+
}
560+
}
561+
494562
#[test]
495563
fn odoo_method_kind_ids_are_unique_and_stable() {
496564
let mut seen = BTreeSet::new();

crates/lance-graph/src/graph/spo/action_emitter.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -255,7 +255,7 @@ fn compose_spec(fn_id: &str, idx: &TripleIndex) -> ActionSpec {
255255
ActionSpec {
256256
id: fn_id.to_string(),
257257
family,
258-
effects: collect_sorted(idx.emits_by_fn.get(fn_id)),
258+
effects: effects_set.iter().cloned().collect(),
259259
inputs: inputs.into_iter().collect(),
260260
raises: collect_sorted(idx.raises_by_fn.get(fn_id)),
261261
reads: collect_sorted(idx.reads_by_fn.get(fn_id)),

crates/lance-graph/src/graph/spo/odoo_ontology.rs

Lines changed: 63 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -52,7 +52,11 @@ use crate::graph::spo::store::SpoStore;
5252
use crate::graph::spo::truth::TruthValue;
5353

5454
/// One parsed ontology triple line: `{"s","p","o","f","c"}`.
55+
///
56+
/// `deny_unknown_fields` so harvester schema drift surfaces as a parse
57+
/// error instead of silently degrading the truth signal.
5558
#[derive(Debug, Clone, serde::Deserialize)]
59+
#[serde(deny_unknown_fields)]
5660
pub struct OntologyTriple {
5761
/// Subject IRI (e.g. `odoo:account_move.amount_total`).
5862
pub s: String,
@@ -167,4 +171,63 @@ mod tests {
167171
});
168172
assert!(found, "expected emitted_by edge missing from data file");
169173
}
174+
175+
/// Catch silent parse failures: every non-empty line must produce one
176+
/// `OntologyTriple`. If a line is corrupt and `filter_map().ok()` drops
177+
/// it, this assertion fires — corruption can't sneak through as a
178+
/// quiet count mismatch.
179+
#[test]
180+
fn every_nonempty_line_parses() {
181+
let raw_lines = ONTOLOGY.lines().filter(|l| !l.trim().is_empty()).count();
182+
let parsed = parse_triples(ONTOLOGY).len();
183+
assert_eq!(
184+
raw_lines,
185+
parsed,
186+
"{} of {} ontology lines silently failed to parse",
187+
raw_lines - parsed,
188+
raw_lines
189+
);
190+
}
191+
192+
/// Pin the documented "extractor de-duplicates" assumption: if two
193+
/// triples share `(s, p, o)` but differ in truth, the second insert
194+
/// overwrites the first (HashMap last-write-wins via `dn_hash`).
195+
/// Verifies the silent-overwrite semantics explicitly so a future
196+
/// switch to insertion-rejection or merge becomes a test failure
197+
/// instead of a silent change.
198+
#[test]
199+
fn duplicate_spo_keys_are_last_write_wins() {
200+
let s = "odoo:test.x";
201+
let p = "depends_on";
202+
let o = "odoo:test.y";
203+
let ndjson = format!(
204+
"{{\"s\":\"{s}\",\"p\":\"{p}\",\"o\":\"{o}\",\"f\":0.9,\"c\":0.9}}\n\
205+
{{\"s\":\"{s}\",\"p\":\"{p}\",\"o\":\"{o}\",\"f\":0.1,\"c\":0.1}}\n"
206+
);
207+
208+
let store = load_ontology(&ndjson);
209+
// Two source triples → one stored record (key collision).
210+
assert_eq!(
211+
store.len(),
212+
1,
213+
"duplicate (s,p,o) must collapse to a single store entry"
214+
);
215+
}
216+
217+
/// Lock the module-doc claim "3 328 Functions" against drift so the
218+
/// downstream `action_emitter::shipped_ontology_produces_expected_function_count`
219+
/// (which asserts the same number on its own) can't get out of sync
220+
/// with the loader's source-of-truth count.
221+
#[test]
222+
fn function_count_matches_module_doc() {
223+
let triples = parse_triples(ONTOLOGY);
224+
let functions = triples
225+
.iter()
226+
.filter(|t| t.p == "rdf:type" && t.o == "ogit:Function")
227+
.count();
228+
assert_eq!(
229+
functions, 3328,
230+
"function count drifted from module-doc claim (3 328)"
231+
);
232+
}
170233
}

0 commit comments

Comments
 (0)