feat(gfql/polars): engine followups — native multi-hop, to_fixed_point, undirected, min_hops>1, more predicates + NIE→native#1667
Open
lmeyerov wants to merge 39 commits into
Conversation
1e0d542 to
bfdfc65
Compare
This was referenced Jul 2, 2026
Open
lmeyerov
added a commit
that referenced
this pull request
Jul 2, 2026
… (3 files) The test-gfql-core per-file coverage audit fails on this stack because #1667 adds engine-specific (polars/cuDF-only) branches to files whose floors were generated pre-#1667 with razor-thin margins — those branches are unreachable in the pandas-only audit lane BY DESIGN (they are exercised in the polars CI lane and the dgx 4-engine conformance runs). Clean-room reproduction (fresh worktree at this tip + import-blocked polars/cudf to simulate the lane): 3312 tests pass, exactly 3 files below floor: row/frame_ops.py 60.22<65.0 (count_table polars branch + error paths), chain_let.py 70.07<70.21 and cypher/reentry/execution.py 85.33<85.66 (hair-thin engine-gate shifts). Two-part fix: - Direct unit tests for count_table's frame-op-level paths the GFQL call path can't reach (param validation intercepts first): bad-table + missing- source ValueErrors, null-in-mask counting, sibling-frame-templated and bare 0-count fallbacks. - Regenerate the 3 floors to the clean-room actuals (minus small local-vs-CI variance headroom): frame_ops 65.0->59.5, chain_let 70.21->69.5, reentry/execution 85.66->84.75. Intentional engine-specific growth, per the baseline-update-in-PR flow; all other floors untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lars' / 'polars-gpu' Two robustness gaps in engine selection: - engine='polars' without polars installed -> raw cryptic ImportError from deep in frame coercion / the lazy engine. - engine='polars-gpu' without the RAPIDS cudf_polars stack -> the missing-lib failure was caught by raise_on_fail=True and MISLABELED by _gpu_raise as "plan not GPU-executable, use engine='polars'" — pointing at the wrong fix. Add guards at the chain dispatch (compute/chain.py), pre-coercion, so the user always sees an actionable install message regardless of which query path runs: engine='polars'/'polars-gpu' both require polars (`pip install polars`); 'polars-gpu' additionally requires cudf_polars (checked via find_spec so it's consistent even for eager fast-path queries that never reach a GPU collect). lazy._engine_for keeps reporting genuine not-GPU-capable plans via _gpu_raise (unchanged; +clarifying comment). +1 regression test (polars-missing + cudf_polars-missing). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lt | streaming)
cudf-polars has multiple executor modes; expose a switch mirroring GFQL_POLARS_CPU_STREAMING.
Default 'in-memory' (fast + stable for results that fit device memory) is unchanged; opt-in
'streaming' is the escape hatch for larger-than-device-memory results (the in-memory executor
would OOM — the F3 85M-row case). Invalid values fall back to in-memory; raise_on_fail stays
True (NO-CHEATING) either way. ('auto' size-aware switch is a planned enhancement.) +1 test
(mocks GPUEngine, no GPU needed). Full streaming-vs-in-memory crossover benchmark is dgx-gated.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-bridge + 2 predicate bugs)
A parallel audit of the full polars NIE surface (predicates / row-pipeline / non-cypher modes)
found three real defects:
1. NO-CHEATING violation (the important one): a DAG `let({'d': call('get_degrees')})` binding
under engine='polars' SILENTLY ran the call on pandas and coerced the result back to polars
(call/executor.py ensure_engine_match), while the identical chain op `[call('get_degrees')]`
honestly raises NotImplementedError. Fix: under a polars engine, if a Plottable-method call's
result frames are not already polars, decline (NIE) instead of bridging; pass NotImplementedError
through execute_call + the DAG wrapper so it stays catchable (matching the chain surface).
pandas/cuDF engines unaffected. (158 call/DAG tests still green.)
2. Contains predicate ignored its regex flag (hardcoded literal=False) — a LITERAL pattern with
regex metacharacters (e.g. "a.b") was matched as a regex => wrong answer. Now honors regex=False
(literal substring; case-insensitive literal lowercases both sides). Differential parity verified.
3. Temporal comparison leak: _cmp_expr built `col > TemporalValue` (a non-None broken expr that
errors at df.filter / misorders) instead of declining. Now returns None for temporal-typed vals
-> honest NIE; numeric/string comparisons unaffected.
+2 regression tests. Full inventory + the tractable feature-win batch (CaseWhen, count_distinct,
str/match predicates, get_degrees-native, …) recorded in plans/gfql-engine-followups/plan.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… 0 strategy)
The bug pattern (DAG silent-bridge, Contains-regex, temporal leak, multi-hop/undirected combine)
showed the prior gates only covered the CHAIN surface with hand-picked cases. This adds the core
conformance invariant as automation: for any query, on a non-pandas engine the result is EITHER
parity-equal to the pandas oracle OR an honest NotImplementedError — never silently different,
never a silent bridge, never a non-NIE crash.
Covers the cross-product the chain-only gates missed:
- predicates (GT/LT/Between/IsIn/Contains{regex,case}/Startswith/Endswith/IsNA) x {chain, let-DAG};
- traversals (single-hop parity; multi-hop / undirected-multi-edge must NIE);
- cross-SURFACE call() consistency (chain vs DAG must agree — this permanently guards the
silent-bridge class we just fixed);
- a seeded predicate fuzz asserting the invariant.
37 cases green (validates the 3 correctness fixes). Wired into bin/test-polars.sh + ci.yml's
coverage list. CPU lane here; the cudf/polars-gpu lane + carve-out sweep + coverage ledger are the
next Phase-0 prongs (plan.md).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…arve-out cases - _assert_invariant now checks EVERY available non-pandas engine (polars always; cudf + polars-gpu auto-detected) against the pandas oracle — so the dgx run covers the full cross-product. Fixed _sig to normalize cudf frames to pandas too (was polars-only — a harness completeness bug that only running on the GPU box exposed; CPU-local never exercised the cudf lane). - Added hot-path carve-out cases (node-only MATCH, unconstrained/filtered single-hop, filters on BOTH endpoints, empty results, self-loop, isolated-node seed, reverse/undirected filtered) — the fast paths bypass the general engine, so they're the highest wrong-answer risk. - 48 cases green on dgx across pandas / polars / cudf / polars-gpu. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…gregation Three feature-gap lowerings from the NIE audit (were honest NIE -> now native): - sqrt(x) -> args.sqrt(); sign(x) -> args.sign() (_lower_function). - count(DISTINCT x) -> col.drop_nulls().n_unique() (_agg_expr) — drop_nulls matches cypher/pandas nunique(dropna=True) semantics (polars n_unique() counts null as a value). Conformance-gated: the value-level matrix (cypher RETURN/aggregation cases) verifies parity across pandas/polars/cudf/polars-gpu on dgx; confirmed each runs NATIVELY (not NIE) with parity=True. 339 polars tests green on dgx. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- CASE WHEN cond THEN a ELSE b END -> pl.when(cond.cast(Boolean)).then(a).otherwise(b) (cond cast to Boolean for Cypher 3-valued: a null WHEN takes ELSE, matching pandas). - Match -> regex anchored at START; Fullmatch -> anchored BOTH ends (case flag honored; declines on custom regex flags to avoid a flag-semantics gap). Conformance-gated: value-level matrix (63 cases) green across pandas/polars/cudf/polars-gpu on dgx; confirmed each runs NATIVELY (not NIE). 347 polars tests green on dgx. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
pandas boundary predicates accept a tuple of prefixes/suffixes (match if ANY); now OR-folds starts_with/ends_with over each element (case flag honored). Conformance-gated (67-case value-level matrix on dgx, all 4 engines); confirmed native, parity verified. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e temporal compare
Three Phase-2d feature-gap wins, each native-or-honest and conformance-gated
(parity vs the pandas oracle OR honest NIE — no silent pandas bridge):
- get_degrees: pure group_by/count over edge endpoints, left-joined onto nodes
(Int32, isolated/src-only/dst-only -> 0, self-loop double-counted). Wired on
BOTH the chain surface (engine_polars.chain) and the let()/ref() DAG + JSON
surface (call/executor) — turns a prior NIE/silent-bridge into a real win.
- with_(extend=True): native polars with_columns, reusing the shared
lower_select_items lowering (DRY with select_polars); unlowerable item -> NIE.
- DateValue temporal comparison: chain p.gt(date(...)) on a NAIVE Datetime column
lowers to col.dt.date() <op> pl.lit(date) (the exact pandas-oracle truncation);
tz-aware DateTimeValue / TimeValue / tz columns stay honest NIE.
Conformance fixes the matrix caught (verify-not-trust):
- Cypher WHERE n.ts > date('...') lowered date() to an ISO STRING, so a Datetime
column vs that string crashed polars (InvalidOperationError, a non-NIE crash).
Now decline (honest NIE) only when the column is schema-typed temporal; a String
column holding ISO text still computes lexicographically.
- no_silent_call_bridge test now uses get_indegrees (get_degrees is native).
+20 conformance cases (get_degrees / temporal / with_extend across chain·DAG·
cypher, incl. explicit "runs natively" + "honest NIE" assertions). dgx: matrix
87 passed across pandas/polars/cudf/polars-gpu; full polars suite 530 passed.
ruff + mypy clean. (Also splits 3 pre-existing E702/F541/F841 lint nits in the
touched test files.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…r predicate Phase-2d batch 2, each native-or-honest and conformance-gated (parity vs the pandas oracle OR honest NIE — no silent pandas bridge): - list literal `[e0, e1, ...]`: native `pl.concat_list` (order preserved, matching the pandas oracle) for a homogeneous-category list (all int / float / str / bool); mixed/nested/empty/null-dtype -> honest NIE. cudf is known to REORDER list elements (orthogonal cudf bug), so construction conformance is scoped pandas-vs-polars. - `x IN [literals]` as a row expression (distinct from the WHERE/IsIn predicate path): native 3-valued membership (null lhs masked to None to match cypher), Boolean output -> cudf-safe, full parity-or-NIE across all engines. - IsLeapYear predicate -> native `expr.dt.is_leap_year()` on a naive Datetime/Date column (Gregorian parity incl. 1900-non-leap / 2000-leap); tz-aware / non-temporal -> honest NIE. The 6 month/quarter/year boundary predicates KEEP declining — polars has no faithful boolean accessor (only rolled-datetime month_start/end), so re-deriving them would risk a subtle wrong answer (NO-CHEATING). +~15 conformance cases (list construction / IN / IsLeapYear across chain·cypher, incl. explicit native + honest-NIE assertions). Updated the row-expr unit test ([1,2,3] now lowers; [1,2.5] mixed still NIEs). dgx: matrix + full polars suite green across pandas/polars/cudf/polars-gpu; ruff + mypy clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s on feat) get_degrees DAG dispatch (call/executor.py) and the _cmp_expr predicate unit-test import were ADDED on the followups branch after the base, so the lazy/engine/polars home-move's import rewrite didn't cover them. Repoint both to lazy.engine.polars. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ng + conformance ledger Phase-2d batch 3, each native-or-honest and conformance-gated (parity vs the pandas oracle OR honest NIE — no silent bridge); + the Phase-0 coverage-ledger automation: - get_indegrees / get_outdegrees: native single-direction group_by/count (shared helper with get_degrees), wired on chain + DAG/JSON surfaces. Matches the pandas empty-edges quirk (keep pre-existing col unchanged) exactly. Repointed the no-bridge test to `hypergraph` (architecturally pandas-only) since the degree calls are now all native. - size(x): native str.len_chars (String) / list.len (List); numeric/other -> honest NIE (pandas' row-count quirk is not replicated). substring(s,start[,len]): native str.slice for NON-NEGATIVE int-literal bounds over a String col; negative/non-literal/non-string -> NIE (negative start diverges: pandas python-slice vs polars offset/length). - test_conformance_ledger.py (Phase 0 prong #4): pure-introspection CI gate — derives the predicate universe from the live type_to_predicate registry + the exercised set from the matrix's _predicate_queries labels, and FAILS when a new predicate lands without either a conformance case or a reasoned KNOWN_UNCOVERED waiver. Dual-wired into bin/test-polars.sh AND ci.yml. CPU-only (no engine exec). dgx (--gpus all): 414 passed across pandas/polars/cudf/polars-gpu; ruff + mypy clean. All on the post-rename lazy/engine/polars/ paths. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…dtypes; rest honest-NIE) - toInteger: Int/Bool identity cast; Float truncate-toward-zero with explicit NaN/null->null mask (matches pandas astype(float).astype(int64)); String -> honest NIE (pandas raises on non-numeric, not null-on-failure — polars strict=False would fabricate nulls). - toBoolean: Boolean identity only; String/numeric token-set parsing not provable -> NIE. - toString: Bool (lowercase true/false) / Int / String; Float -> NIE (cross-engine repr diverges) and temporal/Categorical -> NIE. + conformance cases (native + honest-NIE) per dtype. toString(float) is covered by a dedicated pandas-vs-polars NIE test, not _assert_invariant (cudf formats floats differently than pandas — an orthogonal cudf repr divergence). dgx (--gpus all): 221 passed across all engines; ruff+mypy clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-until-impl)
Multi-hop native chain prep. The repo fuzz did NOT exercise multi-hop
(_rand_edge was single-hop only), so the multi-hop NIE guard was never
tested. Amplify (TEST-ONLY; product still NIEs -> chains skip cleanly):
- _rand_edge: directed edges now emit hops in {2,3} / max_hops 1..N
(e_undirected stays single-hop; undirected-multi is a separate defect)
- _rand_graph: densify toward cycles/self-loops/parallel-edges, |E|>>|V|
- fuzz parametrize 60 -> 500; add edge-COUNT multiplicity guard
- adversarial multi-hop parity cases (cycle/self-loop/dup/empty/reverse/
both-filtered/hops>diameter/sandwiched) — NIE-skip until Stage 1
- conformance matrix: fwd-hops2/rev-hops2/maxhops3/midfilter/named/
sandwiched (parity-native after Stage 1) + tofixed/und-hops2 stay-NIE
- polars-only test_polars_chain_multihop_deferred_raises (to_fixed_point,
min_hops>1, undirected multi-hop)
Harness hardening: _frame_repr astype(object) before where(notna,None) so
cudf->pandas nullable-extension NA becomes real None (pd.NA in a signature
made res==base raise 'bool of NA ambiguous'). This surfaced a real cudf
min_hops divergence (seed node_hop None vs max_hops) — orthogonal cudf bug,
scoped out of the 4-engine matrix, polars NIE asserted directly.
dgx --gpus all: 348 passed, 352 skipped. ruff+mypy clean (test files).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…hops, fwd+rev) THE MARQUEE FEATURE. Native multi-hop polars chain, parity-equal to pandas. Two prior shortcut attempts (flat forward∩backward intersection) shipped wrong answers and were reverted; this ports the PATH-AWARE pandas algorithm. Root cause (was NIE): _combine_edges single-hop-semijoins each edge step by prev(src)/next(dst) wavefront, dropping intermediate-hop edges (e=3 vs pandas e=10 at hops=2). The forward/reverse hops were already correct. Fix (port of pandas combine_steps has_multihop, compute/chain.py:201-209): for each fixed-length multi-hop edge step, RE-EXECUTE the forward hop over the backward-pruned edge subgraph seeded from the forward-pass entry wavefront. Backward prune = 'reaches a valid endpoint'; forward re-exec = 'reachable from seed'; their composition along the BFS frontier = exactly the valid-path edges (NOT a coarse set intersection). _combine_edges then appends all recomputed edge ids as-is (bypass the single-hop semijoin). Recomputed frames feed the EDGE combine only; node combine + name tagging keep the original backward-pruned steps (matches pandas kind=='edges' gate). Guard NARROWED (not lifted): _is_native_multihop allows only plain hops=N / max_hops, forward/reverse. Still honest-NIE for to_fixed_point, min_hops>1, output slicing, hop labels, *_query, include_zero_hop_seed, prune_to_endpoints, and undirected multi-hop (Stage 3). NO-CHEATING intact. Validation (dgx --gpus all): - amplified fuzz (500 seeds, cycles/self-loops/parallel-edges) + 8 adversarial cases (cycle/self-loop/dup-multiplicity/empty/reverse/ both-filtered/hops>diameter/sandwiched): ZERO mismatches. 497 passed, 203 skipped (deferred surfaces) — up from 348/352 (+149 now native). - conformance matrix multi-hop cases parity across all 4 engines. - 387-test polars regression suite: zero regressions. - benchmark: polars vs pandas 1.0-2.1x @10k/50k, 2.5-4.1x @100k/500k, 6.0-8.2x @500k/2M; result sets identical across pandas/polars/cudf. - ruff + mypy clean on chain.py. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ersal matrix
Extract traversal cases to _TRAVERSAL_CASES; add test_conformance_traversals_dag
running the SAME cases via let({'a': query}). Guards the silent-bridge bug class
(chain NIEs but DAG silently bridges to pandas) for multi-hop: native multi-hop
must reach the DAG/let surface too. dgx --gpus all: all 15 cases pass (parity
where native incl fwd/rev-hops2, maxhops3, midfilter, named, sandwiched; NIE
consistent for to_fixed_point / undirected-multi). ruff clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extend the Phase-0 coverage ledger from predicates to cypher scalar functions. Universe = GFQL_SCALAR_FUNCTIONS (language_defs.py, importable like type_to_predicate); exercised set DERIVED by parsing function calls out of the cypher strings of a new importable _cypher_expression_queries() generator (refactored from the inline parametrize, mirrors _predicate_queries). Same 4 drift tests: registry entry must be exercised or waived; no bogus/stale/redundant waivers; reasons non-empty. 9 functions exercised (sqrt/sign/abs/coalesce/size/substring/tointeger/toboolean/ tostring), 11 waived honest one-liners (tofloat NIE; keys/labels/type/properties/range NIE; 5 internal __*__ helpers). Parser .lower()s for camelCase toInteger vs registry tointeger; intersects registry to drop aggregations/identifiers. dgx: ledger 14 passed (CPU introspection), cypher_expressions 19 passed (4-engine), ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extend the coverage ledger to the call() safelist. Universe = SAFELIST_V1 (call/validation.py, 59 entries); exercised DERIVED from a new importable _call_exercised_functions() (driven by the shared _CALL_CONSISTENCY_FNS constant the cross-surface consistency test runs on + the degree-trio dedicated tests). Same 4 drift tests. 5 exercised (get_degrees/in/out, hypergraph, limit), 54 waived honest one-liners (row-pipeline ops native-on-chain-but-NIE-via-call-executor; Plottable-method layouts/encoders/igraph/cugraph/umap pandas-cuDF-only -> no-bridge NIE; class covered by test_engine_polars_no_silent_call_bridge). The ledger immediately caught fa2_layout — a safelist entry the manual audit missed — proving the drift gate works (a new untracked safelist call now fails CI). dgx: ledger 21 passed (CPU introspection), ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… complete) Final coverage-ledger axis. Universe = ROW_PIPELINE_CALLS (row/pipeline.py, 15 ops); exercised DERIVED from importable _rowop_exercised() (labeled subjects: with_). Same 4 drift tests. 1 exercised, 14 waived honest one-liners (native frame ops not yet a labeled subject: skip/drop_cols/distinct/limit/rows; native-but-implicit-via-cypher: select/return_/where_rows/order_by/group_by/unwind; honest-NIE correlated ops: semi_apply_mark/anti_semi_apply/join_apply). Degree calls live on the call axis, not here. Coverage ledger now spans all 4 conformance axes (predicate / scalar-function / call- safelist / row-op), each deriving its universe from a live registry and failing CI when a new op lands without an assertion or an honest waiver. dgx: 28 passed, ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…hains (Stage 3) Fixes the seed-18 defect: a chain of >1 undirected edge with intermediate node filters dropped a node vs pandas. Root cause = the polars backward pass: the generic undirected hop returns a ONE-SIDED wavefront (TO-side endpoints only), while pandas' fast backward branch (compute/chain.py:1090-1098) threads BOTH endpoints of the surviving edges as the step's node frame. The one-sided frame dropped an intermediate node only reachable as the frontier-side endpoint of a sibling undirected edge, which then dropped that node's incident edge in the combine. Fix (localized to the backward pass): for a single-hop undirected edge, override the reverse hop's node frame with BOTH endpoints of its surviving edges (re-joined to g._nodes for full columns). The edge set already matched pandas (the undirected hop joins both orientations); only the node frame was one-sided. _combine_edges / _combine_nodes / _apply_node_names already mirror the pandas undirected oracle. Guard NARROWED: native for plain single-hop undirected in multi-edge chains; STILL NIE for undirected mixed with fixed-length multi-hop, include_zero_hop_seed, or *_query. Undirected MULTI-hop (e_undirected(hops>=2), to_fixed_point, min_hops>1) still NIE upstream. Validation (dgx --gpus all): amplified 500-seed fuzz (now generates+runs undirected multi-edge incl. seed-18 class) ZERO mismatch; 7 new adversarial undirected cases (seed-18 midfilter, 3x undirected, alternating filters, mixed directed+undirected, named-across-boundary, self-loop) parity; deferred-raises updated; full polars suite 1036 passed / 93 skipped (was 203 skipped — undirected fuzz cases now native). ruff+mypy clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ulti-hop, Stage 4) Final fixed-length-family multi-hop surface. to_fixed_point (traverse until no new nodes) is now native for forward/reverse: _is_native_multihop no longer excludes it. The existing path-aware recompute handles it unchanged — re-run the forward to_fixed_point hop over the backward-pruned subgraph; the hop's own fixed-point detection guarantees termination (no infinite loop on cycles/self-loops). Same combine bypass as hops=N. STILL NIE: undirected to_fixed_point (direction guard), undirected multi-hop, min_hops>1. Validation (dgx --gpus all): the 500-seed amplified fuzz now also emits to_fixed_point edges over the cyclic fuzz graph (cycles/self-loops/parallel) — ZERO mismatch, proving termination + parity; 7 adversarial to_fixed_point cases (fwd/rev, self-loop, midfilter, isolated, sandwiched, named) parity; conformance fwd/rev-tofixed parity all 4 engines; deferred-raises updated (forward to_fixed_point removed, undirected to_fixed_point added). Full polars suite 1052 passed / 92 skipped. ruff+mypy clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ti-hop surface) Completes the multi-hop NIE surface. Undirected e_undirected(hops=N / max_hops) in single- and multi-edge chains is now native + parity. _is_native_multihop admits undirected (except to_fixed_point); the multi-edge guard no longer NIEs undirected+multihop. NO new combine/override code: fixed undirected multi-hop uses the generic backward hop (NOT the single-hop both-endpoint override, which stays is_simple_single_hop-gated — same as pandas, which only applies its fast both-endpoint backward branch to single-hop) plus the existing direction-agnostic path-aware recompute. A design pass proved P=Q: for fixed undirected return_as_wave_front, the pandas backward frame (matches ∪ endpoints − unreached seeds) collapses to matches == the polars visited-set BFS, so no seed-18-style node drop at hops>=2. This commit RAN the gate the proof needed. STILL NIE (principled): undirected to_fixed_point (pandas connected-components + 2-core seed retention hop.py:817-887 — no vectorized polars analogue) and min_hops>1 (cudf-divergent). Validation (dgx --gpus all): 500-seed fuzz now emits undirected multi-hop (zero skips) — ZERO mismatch; +4 adversarial undirected multi-hop cases (hops2/maxhops3/midfilter/mixed); conformance und-hops2/und-maxhops3 parity all 4 engines, und-tofixed stays NIE. Full polars suite 1150 passed / 0 skipped (every multi-hop shape now native). ruff+mypy clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tays honest-NIE Temporal Between over a naive Datetime column with DateValue bounds now lowers natively. NO new proof obligation: the GFQL temporal Between evaluates in pandas as GE/LE (inclusive) or GT/LT (exclusive) on the same bound, and each endpoint is the exact date-truncated compare _cmp_expr already lowers with proven parity (col.dt.date() <op> pl.lit(date)). The patch composes the two proven endpoint compares, inheriting parity AND every NIE guard for free — a tz-aware DateTimeValue / TimeValue / raw-datetime / mixed / non-Datetime bound makes an endpoint decline, so the whole Between honest-NIEs (never a silent tz/instant mismatch). Temporal IsIn stays honest-NIE (design-verified keep): pandas IsIn does EXACT-instant membership (not date-truncated), and the candidate native is_in cross-precision cast is not provable from source — declining is correct, locked by a regression test. dgx --gpus all: 8 Between conformance (chain+dag, incl/excl boundary, single-day, empty, all 4 engines) parity + naive-native + tz-aware-NIE + temporal-IsIn-NIE = 11 passed; 315-test conformance+cypher regression clean. ruff+mypy clean (predicates.py). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…est-NIE forms; close ledger gap unwind is native ONLY for scalar-literal lists (unwind_polars cross-join); list-column / nested / scalar forms honestly NIE. It was untested in the matrix and waived in the row-op ledger. Add labeled conformance: 4 native literal-list cases (ints/strings/with-null/ singleton, chain+dag all engines), empty-list-drops-all-rows, scalar honest-NIE, nested-list honest-NIE, and chain-vs-cypher cross-surface consistency. Move unwind from ROW_OP_KNOWN_UNCOVERED waiver into _rowop_exercised() (the ledger redundant-waiver gate enforces the move). Omitted the list-column case (polars list-column ingestion unverified). dgx --gpus all: 8 unwind tests passed; ledger 28 passed (row-op axis rebalanced). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…NE/IsNull/NotNull Close ledger gaps (test-only; no product change). Row-op axis: add _ROW_OP_CASES labeled conformance subjects (rows/skip/limit/distinct/drop_cols frame ops + order_by/select/return_/ where_rows/group_by row_pipeline lowerings) on chain+dag, moving all 10 from ROW_OP_KNOWN_UNCOVERED -> _rowop_exercised() (axis now 12 exercised + 3 NIE-waived = ROW_PIPELINE_CALLS 15). Predicate axis: add scalar EQ/NE + str IsNull/NotNull to _predicate_queries(), removing those 4 waivers. VERIFICATION NOTE: local ruff clean + a from-source self-review confirmed param schemas (SAFELIST_V1), native dispatch (_try_native_row_op / predicate_to_expr), predicate crash-safety, and ledger set-math correct by construction; it caught + fixed a stale IsNull/NotNull waiver (CPU drift-test). NOT yet run with --gpus all (dgx-spark in a ~20min ssh outage). GPU-lane parity assertions use code paths already exercised+green elsewhere; will run matrix+ledger on dgx when back and fix-forward if any parity assertion needs re-waive. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nal cudf bug) dgx-verified fix-forward on 73e051d. The 4-engine group_by row-op assertion failed on cudf: GFQLTypeError 'truth value of a Series is ambiguous' in the GFQL group_by execution path, triggered only when the row frame carries extra non-key/non-agg columns (the _graph fixture's f/name). polars+pandas agree; it's an orthogonal cudf-path bug (Series-truthiness; see cudf-cross-engine-findings.md #4). Move group_by out of the 4-engine _ROW_OP_CASES into a dedicated polars-vs-pandas assertion (new _assert_polars_parity_or_nie helper, reused for the min_hops work). group_by stays exercised on the ledger row-op axis. dgx --gpus all: matrix+ledger 247 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…uadrants Already-native (confirmed by design review + the 500-seed fuzz), but the specific undirected-single + directed-fixed-multihop quadrant wasn't a named regression. Add 3 to UND_CHAINS (und-then-fwdhops2, fwdhops2-then-und, und-mid-fwdhops2) to lock that the path-aware recompute doesn't collide with the undirected both-endpoint backward override on the shared middle wavefront. dgx --gpus all: all undirected-multiedge parity cases pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…/End/Quarter/Year) The 6 date-part boundary predicates now lower natively on a naive Datetime/Date column. Polars has no is_month_start/.../is_year_end BOOLEAN accessor (month_start()/month_end() ROLL to a Datetime), BUT the pandas oracle with freq=None is a pure CALENDAR-FIELD test: is_month_start = day==1; is_month_end = day==days_in_month; quarter/year add a month-set / month==N. polars dt.day()/dt.month()/dt.days_in_month() extract the SAME fields (proleptic Gregorian, leap-aware), so each compose is BIT-IDENTICAL to the oracle on non-null rows — a PROVABLE derivation, not a guess. NaT->null->fill_null(False) matches pandas False. Dtype- gated to naive Datetime/Date (tz shifts wall-clock fields -> honest NIE), exactly like IsLeapYear. This admits proven DERIVATIONS, not just single-accessor lowerings — provable parity is the NO-CHEATING bar. Replaces the *_honest_nie_polars boundary test with native-parity (4-engine — cuDF agrees, confirming the derivation is faithful) + a tz-aware NIE guard; ledger waivers updated to native. dgx --gpus all: boundary native+parity all engines; full polars suite 1201 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…findings 1,2,4) Core compute fixes (pandas+cuDF row pipeline), each making cuDF match the pandas oracle without changing pandas/polars behavior. dgx --gpus all verified pandas==cudf==polars. #4 group_by 'truth value of a Series is ambiguous': the handler grouped over the WHOLE active table, carrying non-key/non-agg columns (object 'name', float 'f') into the cuDF GroupBy and tripping its Series.__bool__ path. Fix: _make_grouped now projects to key+value cols before grouping (selecting value cols before grouping cannot change group sizes/reductions -> identical on every engine). group_by promoted back into the 4-engine _ROW_OP_CASES matrix. #1 list-literal element reorder: cuDF groupby-collect (.agg(list)) gives no within-group row-order guarantee, permuting elements vs construction order (pandas groupby(sort=False) is stable). Fix: route cuDF to the order-deterministic column-wise host build (the same path list-typed elements already use); pandas melt/groupby path untouched. #2 toString(float) repr divergence: libcudf float->string != python/pandas repr (sci-notation thresholds). Fix: engine-gated host round-trip (arrow -> pandas float -> str = the exact pandas result), so cuDF is string-identical incl. 1e20 -> '1e+20'. Better than declining (which would crash, not defer). pandas/int/bool/string casts unchanged. (#3 min_hops seed hop-label stays scoped — a deeper cuDF NA-handling issue, not one-line-safe.) dgx: row-pipeline (pandas+cudf) + conformance + ledger 516 passed; group_by 4-engine + 2 new cuDF parity tests 22 passed. ruff+mypy clean (pipeline.py). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…arity Variable-length lower-bound traversals e(min_hops=N, max_hops=M) now run natively on the polars engine (CPU + polars-gpu/cudf_polars) instead of NIE. The eager polars hop ports pandas' min_hops algorithm vectorized: - NON-anti-joined revisit BFS + cumulative-closure termination (cycles bump reach depth to max_hops so the lower bound can be satisfied) - 3-case gate: max_reached<min -> empty; goal-labeled->=min -> layered backward-tree walk; cyclic-revisit-only -> unpruned ball - the exact min_hops NODE-output rule (the hard part, traced vs the pandas oracle through forward-stack/backward-pass/recompute/label-rebuild/ output-slice/seed-strip): wavefront = endpoints of the RETAINED edges, MINUS seeds never re-reached at >=min_hops, with FULL attributes only for hop-labeled nodes and NULL attributes for source-side endpoint-only nodes (so a downstream node-attribute filter rejects them, matching pandas' track_node_hops labeled path). Fixes fuzz seeds 24 (fwd n0 over-include), 404 (reverse n1 unreached-seed), 48 (reverse n5/n7 null-attr kind filter). NO-CHEATING preserved: undirected min_hops>1 still raises NotImplementedError. Validated: test_polars_chain_fuzz_parity 500/500, full polars hop+chain suites green, whole gfql/ dir 3880 passed, polars-gpu min_hops parity 8/8, mypy clean. Stale deferred/unsupported guards updated; direct-hop min_hops parity test added. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
n({col: ne(x)}) and cypher WHERE n.col <> x over a NULL cell used to KEEP the
null row on the pandas engine (NaN != x -> True), diverging from cuDF and the
polars engine (both drop it) and from pandas' own NOT n.col = x path. Per
openCypher/SQL 3VL, null <> x is null -> not a match -> row excluded, like
eq/gt/IN (which already drop nulls).
Fix: NE.__call__ masks out nulls (& s.notna()). A 4-engine audit pinned the
entire divergence to exactly 2 cells, pandas-only (filter_dict ne + cypher <>);
one predicate fix corrects both (single-entity <> routes through NE). cuDF/
polars/polars-gpu were already conformant; polars untouched (own lowering).
Behavior change for ne() on nullable columns under the default pandas engine.
Test flipped from a strict-xfail divergence marker to a 3VL parity assertion.
Validated on dgx: 4522 predicates+gfql tests pass, all 4 engines now 3VL.
(Broader openCypher null-semantics sweep + docs tracked as a follow-up issue.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
n({col: [.., None]}) over a NULL cell used to KEEP the null row on cuDF
(cuDF isin matched a null cell against a None list element), diverging from
pandas + polars which exclude it. Per openCypher/SQL 3VL, null IN [...] is
null -> not a member -> excluded. Fixed in filter_by_dict (& notna() on the
membership branch): no-op for pandas/polars, fix for cuDF.
Found via the #1664 conformance sweep; validated 4-engine on dgx (all give
[n0,n3] now). +test_membership_on_null_is_three_valued_logic. 287 predicate
+filter tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The eager hop (hop_eager.py) and the lazy single-hop (hop.py) carried verbatim copies of the column-name + edge-id + node-dtype setup and the directed-pairs builder (hop.py even commented 'stay textually identical ... don't drift'). Extracted _hop_setup_columns() + _build_hop_pairs() — identical on pl.DataFrame (eager) and pl.LazyFrame (lazy), so both call sites share one implementation. Pure refactor, behavior-preserving. Validated on dgx: polars chain fuzz + hop suites 1133 pass, mypy clean. (The bigger pandas/polars node-output policy unification stays an issue — risky production-hop.py surgery + label-dtype contract from #1664/#1663 is a prerequisite.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… — NIE->native Three previously-NIE cypher row surfaces now run natively on engine='polars', parity-validated vs the pandas oracle across pandas/cudf/polars/polars-gpu: - toFloat(x): int/uint/bool/float -> Float64 (NaN preserved; no fillna step, unlike toInteger — float64 has no null sentinel). Non-numeric String declines (NIE) because pandas astype(float) RAISES, not null-on-failure. - collect(x) / collect(DISTINCT x) aggregations complete the native group_by surface: drop nulls, preserve within-group first-occurrence order (collect keeps dups, DISTINCT dedups keep-first), all-null group -> []. drop_nulls() /unique(maintain_order=True), no .implode(). - where_rows / WHERE ... IN [list] membership -> is_in (null cell excluded, 3VL). Removed the stale tofloat conformance-ledger waiver; +tofloat matrix cases, tofloat-string-NIE test, collect parity test. Validated on dgx: 4242 gfql tests pass, conformance matrix+ledger+row-pipeline green, 4-engine parity, mypy clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ce without polars
Two CI failures on this layer, both diagnosed from the failing 3.8/3.14 jobs:
(A, real regression) execute_call() added an unconditional
'if isinstance(error, NotImplementedError): raise error' to propagate the polars
no-silent-bridge decline on the DAG surface — but it ALSO intercepted a
pandas/cudf NIE like fa2_layout's 'requires a GPU', so it stopped falling through
to the GFQLTypeError(E303) wrapper and test_fa2_layout_cpu_requires_gpu failed.
Gate the pass-through to engine in (POLARS, POLARS_GPU); pandas/cudf NIEs wrap to
E303 as before. Verified: fa2 test passes, full test_call_operations green (24/2).
(B, py3.14 env) test_engine_polars_conformance_matrix hardcodes the polars lane
without probing importability; on Python 3.14 (no cp314 polars wheel) polars is
absent from the lockfile, so every case reported a non-NIE ImportError as a
conformance failure (~60). Add module-level pytest.importorskip('polars') so the
matrix skips cleanly when polars is unavailable.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…me materialize A lone `RETURN count(*)` over a single node/edge pattern used to materialize the whole matched frame and run a constant-key group_by just to count rows (the ~770x-vs-Ladybug loss in the fair Cypher benchmark: count = 2543-8206ms while the df-path count is ~0.01ms). The Cypher lowering now emits a new `count_table` row op that reads the scanned table's height directly (or sums the boolean alias-mask column when the pattern filters) in a single reduction — no frame copy, no group_by. Guarded to the provably-equivalent shapes only: exactly one non-DISTINCT count(*), no group keys / post-agg exprs / row-level WHERE / UNWIND / paging / multi-relationship binding, and either a pure node scan (relationship_count==0) or a single relationship counted on its edge alias (== the cases _reject_unsound_relationship_multiplicity_aggregates permits). Every other shape falls through to the general aggregate path unchanged. The op replaces the rows()+with_+group_by prefix and keeps the identical trailing projection, so downstream/result is byte-identical. count_table is a native frame op (like rows/limit) routed via _POLARS_NATIVE_ROW_PIPELINE_CALLS -> op.execute -> frame_ops.count_table, so one implementation covers pandas / cuDF / polars / polars-gpu. Validation: CPU parity (pandas+polars node/edge count == oracle, fast path confirmed taken); full test_lowering suite 1394 passed; coverage ledger green (count_table accounted in _rowop_exercised + _ROW_OP_CASES + CALL_KNOWN_UNCOVERED); dgx 4-engine conformance (count_all_nodes/edges + count_table rowop, chain+dag) 55 passed on pandas/cuDF/polars/polars-gpu; ruff + mypy clean. The 5M/20M Ladybug head-to-head benchmark is the next step (needs the ladybug bench harness rebased onto this). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ble) The count_table row op (count(*) short-circuit) was only exercised via the polars-gated conformance matrix, so the polars-less test-gfql-core lane never executed the new frame_ops/pipeline code and the per-file coverage audit fell below baseline (CI: 'per-file coverage baseline failed', pre-existing since the count_table commit). Add always-on pandas tests: node-count + edge-count execution (incl. the dangling-edge endpoint-validation semantic), a structural lowering assert (pure count(*) -> count_table; grouped count does NOT short-circuit), the direct row-op path, and the unknown-table rejection. Validated: 11 pass locally (-k 'count_star or count_table'); ruff + mypy clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…stale wording Review-wave findings (no behavior change except count_table engine fix): - _is_pure_count_star_shortcircuit docstring claimed post-aggregate exprs are rejected; they are in fact SUPPORTED by design (count lands in a temp column, trailing select applies the expr — verified count(*)+1 == 4, count(*)*2+1 == 7 on 3 nodes). Doc now matches code; pandas-lane tests already lock the shortcircuit + grouped-count-does-not cases. - Remove 25 dead 'if polars not in _NONPANDAS_ENGINES: skip' guards in the conformance matrix (module already importorskips polars; condition was always False). - lazy polars chain: _is_native_multihop docstring + chain NIE message listed to_fixed_point / min_hops>1 / undirected as deferred — all three are native for fwd/rev at this tip; wording now names only the genuinely deferred combos (undirected to_fixed_point / undirected min_hops>1, slicing, labels, *_query, zero-hop-seed, prune_to_endpoints). - count_table: table-is-None fallback returned a pandas frame even in a polars pipeline; now templates the 0-count row from the sibling table's engine (mirrors empty_frame discovery). - test_engine_polars_chain.py stale test-name ref; executor.py duplicate 'Engine as _Eng' import. Validated: count tests 11 pass; matrix 244 pass local-CPU (15 cudf-lane errs = local no-GPU env artifact, dgx re-validation follows); ruff+mypy clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… (3 files) The test-gfql-core per-file coverage audit fails on this stack because #1667 adds engine-specific (polars/cuDF-only) branches to files whose floors were generated pre-#1667 with razor-thin margins — those branches are unreachable in the pandas-only audit lane BY DESIGN (they are exercised in the polars CI lane and the dgx 4-engine conformance runs). Clean-room reproduction (fresh worktree at this tip + import-blocked polars/cudf to simulate the lane): 3312 tests pass, exactly 3 files below floor: row/frame_ops.py 60.22<65.0 (count_table polars branch + error paths), chain_let.py 70.07<70.21 and cypher/reentry/execution.py 85.33<85.66 (hair-thin engine-gate shifts). Two-part fix: - Direct unit tests for count_table's frame-op-level paths the GFQL call path can't reach (param validation intercepts first): bad-table + missing- source ValueErrors, null-in-mask counting, sibling-frame-templated and bare 0-count fallbacks. - Regenerate the 3 floors to the clean-room actuals (minus small local-vs-CI variance headroom): frame_ops 65.0->59.5, chain_let 70.21->69.5, reentry/execution 85.66->84.75. Intentional engine-specific growth, per the baseline-update-in-PR flow; all other floors untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
626c7fd to
ba8d360
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Polars-engine followups stacked directly on the engine PR (#1660) — restacked 2026-07-02: an audit proved this PR has zero dependency on the index layer, so the experimental CSR-index PR (#1658) now stacks on top of this instead of underneath (each layer reviewable against what it actually uses). Extends the native polars GFQL engine to parity-or-NIE across many surfaces: native fixed multi-hop (hops=N), forward/reverse to_fixed_point, fixed undirected multi-hop, forward/reverse min_hops>1 in chain() (500/500 fuzz), more predicates (date-part boundaries, temporal Between), toInteger/toBoolean/toString/toFloat, collect/collect_distinct aggs, WHERE … IN membership, plus 3 cuDF cross-engine fixes (#1663). Test-amplification hardened the chain fuzz to compare null-aware node attributes + edge multiplicity. DRY: shared eager+lazy hop setup. All dgx-validated (parity-or-honest-NIE, NO-CHEATING). Also:
count(*)short-circuit — a newcount_tablerow op reads the scanned table's height (or alias-mask sum) in one reduction instead of materializing the full frame + constant-keygroup_by; the Cypher lowering short-circuits the pureMATCH (n)/()-[r]->() RETURN count(*)shape only (DISTINCT / WHERE / group keys / UNWIND / bindings all fall through unchanged; post-aggregate exprs likecount(*)+1compose via a temp column). Node-countMATCH (n) RETURN count(*): polars ~1 ms at 5M nodes (was ~2.5 s via materialize+group_by).openCypher-null conformance tracked in #1664; open NIE surfaces + per-engine docs in #1665.
Review notes: 38 conventional commits (merge-commit repo; adjacent fixups folded). Team-polish pass (
9c0ec314): conformance-matrix dead guards removed, stale NIE wording fixed (undirected/to_fixed_point/min_hops are native now), count_table engine-consistent empty fallback. Validation at tip: dgx 4-engine conformance matrix + ledger + lowering = 1359 pass; pandas-lane count(*) tests added (coverage-audit gate).