You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
* test+docs(#1219): residual row-boolean compositional matrix + guardrails
Worker B / independent-hardening stream of #1219. After #1217's Earley
swap surfaced row-boolean shapes (OR/NOT/XOR among row predicates) that
LALR rejected, four compositional shapes remained unverified beyond the
fixtures #1217 covered. This PR locks empirical correctness across the
residual matrix + adds two lightweight guardrail comments.
## Compositional matrix tests (test_lowering.py)
All four shapes verified correct empirically; locked with sorted-id
assertions against discriminating fixtures:
1. **Nullable NOT/OR** — `WHERE NOT n.x = 1 OR n.y IS NULL` against a
4-row fixture mixing real and projected nulls. Locks pandas-backed
row-evaluator preserves Cypher 3VL truth table (NULL OR T = T):
`{n2, n4}`.
2. **N-ary OR (3 branches)** — `WHERE n.x = 1 OR n.x = 2 OR n.x = 3`.
Locks left-associative parse `or(or(=1, =2), =3)` doesn't degenerate
under associativity bugs. `{n1, n2, n3}`.
3. **De Morgan compositions** (parametrized × 4) — both `NOT(A OR B)`
≡ `NOT-A AND NOT-B` and `NOT(A AND B)` ≡ `NOT-A OR NOT-B` against
a 4-row fixture covering all (x∈{1,2}, y∈{2,3}) combos. Each form
and its De-Morganed equivalent return the same row set.
4. **Mixed-string-numeric AND inside OR** — `WHERE (n.s = 'a' AND
n.x > 0) OR n.x < -1`, exercising `_StringAllowingComparisonMixin`
(#1217) paired with OR composition. `{n1, n2, n4}`.
## Guardrails
- `expr_split.py::split_top_level_and` — added load-bearing AND-only
docstring with #1219 cross-ref explaining why a sibling
`split_top_level_or` would break OR-distributivity-over-join
correctness. No future maintainer should accidentally add it
without first redesigning topology-aware pushdown safety.
- `_boolean_expr_text.py::boolean_expr_to_text` — added explicit
`if expr.op == "pattern"` branch with docstring. Currently
unreachable in production (lift step extracts pattern leaves before
the binder walks the tree) but documents the contract: emit the raw
pattern source rather than silently falling through to empty string.
## Test impact
Validated on dgx-spark: `graphistry/tests/compute/` → 2524 passed
(7 new tests; remaining delta from baseline absorbed by #1224's
unrelated additions).
Closes the residual frontier portion of #1219.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test+docs(#1219): wave-1 review fixes — discriminating fixtures, equivalence assertions, pattern-op unit test, mirrored guard
Wave-1 review on c196393 surfaced 3 IMPORTANTs + 5 SUGGESTIONs.
Addressed:
1. N-ary OR test now has a companion (duplicate-leftmost-branch) that
isolates the rightmost-drop associativity bug from the any-branch-
drop case. Comment rephrased to be honest about what the original
test covers (any branch dropped, not specifically rightmost).
2. De Morgan parametrize restructured: paired (compound, distributed,
expected) tuples instead of independent rows. Now asserts:
- compound matches expected
- distributed matches expected
- compound == distributed (the actual De Morgan equivalence)
Added separate double-negation test (NOT NOT A ≡ A).
3. New test_boolean_expr_to_text_emits_atom_text_for_pattern_op in
test_boolean_expr.py exercises the explicit pattern branch added
in c196393. Locks the contract even though the branch is currently
unreachable in production (lift step extracts pattern leaves before
the binder walks the tree).
4. Mirrored AND-only guard comment near _split_conjuncts in
predicate_pushdown.py — that's where future maintainers actually
look when adding pushdown features; the load-bearing rationale
stays in expr_split.py's docstring.
Test counts on dgx-spark: 2525 passed (was 2524 + 1 pattern_op unit
test; net 8 new tests in PR diff vs master).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test+docs(#1219): wave-2 review fixes — accurate docstring, inline pattern, terser comments
Wave-2 (2a targeted + 2b broad) found 1 IMPORTANT + several SUGGESTIONs.
IMPORTANT (Wave-2b):
- expr_split.split_top_level_and docstring overstated the topology
argument. Original draft claimed distribute-OR breaks on fan-out
topologies; the cross-alias-OR test in test_lowering.py:3206-3211
(added in #1217) explicitly confirms distribute-OR converges to
correct row-set semantics on a 2:2 fan-out fixture. The actual
reason pushdown leaves OR opaque is that the multi-alias references
on the conjunct cause it to be retained post-join. Rewrote
docstring to describe the real mechanism + cite the test.
SUGGESTIONs (Wave-2a + 2b):
- _ids_for(graph, query) helper inlined to match the established
inline-sorted-comprehension style used elsewhere in test_lowering.py
(12+ existing call sites). Removes inconsistency within this PR.
- Pattern-op branch comment in _boolean_expr_text.py compressed from
9 lines to 3 — the verbose explanation duplicated the test docstring.
- Local-variable assignment for graph.gfql() result kept (matches
existing test patterns + works around a Plottable.gfql pyright
attribute warning that fires on fluent chains).
Test counts unchanged: 8 new tests; full gfql suite 1581 passed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test+docs(#1219): wave-3b fixes — accurate split-OR rationale + XOR runtime sibling
Wave-3b from-scratch fresh-eyes review found 1 IMPORTANT + 3 SUGGESTIONs.
Addressed:
- IMPORTANT: split_top_level_and docstring's stated mechanism
('multi-alias refs cause retention post-join') was misleading. Real
mechanism: pushdown silently AND-recombines the pushed conjuncts
inside PatternMatch.predicates. AND distributes (split + AND-recombine
≡ original); OR does not (split + AND-recombine ≠ original). An
OR-aware split would need a UNION-of-pushed-branches recombine path
the current pipeline doesn't implement. Rewrote the docstring with
the accurate split + AND-recombine rationale.
- SUGGESTION 3: XOR runtime row-set test (sibling to OR/AND/NOT tests
this PR adds). Locks symmetric-difference semantics.
Skipped:
- SUGGESTION 4 (NOT IS NULL standalone): nullable_not_or already
exercises IS NULL composed with NOT/OR; standalone marginal.
- SUGGESTION 5 (cuDF parametrize): real scope creep, defer.
Test counts: 9 new tests; full gfql suite green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test+docs(#1219): wave-4 polish — accurate mixin claim, XOR with NULL, mirrored-guard clarification
Wave-4 (4a targeted + 4b from-scratch) — 0 BLOCKER + 0 IMPORTANT, 9
SUGGESTIONs. Picked up the cheapest valuable ones:
- Wave-4b S1: mixed-string-numeric AND-inside-OR test claimed to
exercise _StringAllowingComparisonMixin but used `n.s = 'a'` (plain
EQ, supported pre-#1217). Swapped to `n.s > 'a'` (string GT, the
mixin-specific path); flipped fixture s-values to keep the same
truth-table outcomes.
- Wave-4b S3: added test_string_cypher_executes_xor_with_null_uses_three_valued_logic
— sibling to the OR/NOT 3VL test, locks NULL XOR T = NULL. Reuses
the 3VL fixture from the De Morgan tests.
- Wave-4a S1: `_split_conjuncts` mirrored guard now names the actual
failure mode (`_combine_conjuncts` AND-joins residuals) before
pointing to the fuller rationale in expr_split.
Skipped (out of scope or duplicate):
- Wave-4b S2: rightmost-only discriminator comment is already honest;
test verified correct.
- Wave-4b S4: cross-alias OR / cross-product fixture — already covered
by test_string_cypher_executes_cross_alias_or_returns_correct_union (#1217).
- Wave-4b S6: optional CHANGELOG bullet for test+docs PR.
- Wave-4a S2 (OR-analyzer in pushdown_safety.py:58-60): pre-existing
unrelated, separate followup.
Test counts: 10 new tests; full gfql suite green (1583 passed).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test+docs(#1219): wave-5 polish — extract 3VL fixture helper, clarify pattern-op unreachability
Wave-5 formal review (multi-dim per .agents/skills/review/SKILL.md +
adversarial pressure-test) found 0 BLOCKER + 0 IMPORTANT, 3 SUGGESTIONs
of which 2 were actionable + cheap:
- Extract `_three_valued_logic_fixture_graph()` helper paralleling
`_de_morgan_fixture_graph()` — eliminates byte-identical 4-row NaN-
mixed fixture duplication between nullable_not_or and xor_with_null.
- Clarify pattern-op branch comment in `_boolean_expr_text.py` to name
BOTH unreachability paths (top-level AND lift + nested-NOT/OR/XOR
E108 rejection) instead of just the first.
- Adversarial-rejected: malformed-NOT-chain test would be a tautology
(right-recursive grammar makes depth-N equivalent to depth-2; existing
double-negation test already exercises the path).
Test counts unchanged: 11 new tests; full gfql suite 1583 passed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(changelog): add #1227 row-boolean residual matrix entry
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy file name to clipboardExpand all lines: CHANGELOG.md
+1Lines changed: 1 addition & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -30,6 +30,7 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm
30
30
-**Polars support**: `polars.DataFrame` and `polars.LazyFrame` now work in `plot()`, `materialize_nodes()`, `get_degrees()`, `get_indegrees()`, `get_outdegrees()`, and `hypergraph()`. Polars is an optional dependency — no behavior change when not installed. Upload path uses efficient Arrow conversion (`to_arrow()` with schema-metadata stripping and memoization); compute/hypergraph paths coerce to pandas at entry. `LazyFrame` is materialized via `.collect()` at each boundary. Adds `test_polars.py` with 17 tests; skips gracefully when polars is absent (#1133).
31
31
32
32
### Internal
33
+
- **GFQL / Cypher row-boolean residual matrix + guardrails (#1219 hardening)**: Locks compositional row-boolean WHERE shapes that #1217's Earley swap admitted but its initial test surface didn't cover. Adds 11 native tests: nullable NOT/OR over a 4-row 3VL fixture (`NULL OR T = T`); N-ary OR (3 branches) + duplicate-branch companion isolating rightmost-drop associativity bugs; De Morgan equivalences (`NOT (A OR B)` ≡ `NOT A AND NOT B`; `NOT (A AND B)` ≡ `NOT A OR NOT B`) parametrized to assert both per-form expected rows AND the form-equivalence; double negation; XOR symmetric difference + XOR with NULL preserving 3VL; mixed-string-numeric AND inside OR exercising `_StringAllowingComparisonMixin` GT path; unit test locking `boolean_expr_to_text(BooleanExpr(op="pattern", ...))` round-trip for the (currently unreachable) defensive branch. Three docstring guardrails: `expr_split.split_top_level_and` documents AND-only intent + the `_combine_conjuncts` AND-recombine mechanism that makes a hypothetical `split_top_level_or` silently incorrect; `predicate_pushdown._split_conjuncts` mirrored guard naming the failure mode; `_boolean_expr_text.boolean_expr_to_text` explicit `op == "pattern"` branch with both unreachability paths documented. No production-code behavior change. Closes the residual-frontier portion of #1219; deeper compositional shapes beyond current fixtures remain tracked under that issue (#1219, #1227).
33
34
- **GFQL / Cypher parser + ast_normalizer — multi-positive WHERE pattern predicates (#1031 slice 3)**: AND-joined positive WHERE pattern predicates (`WHERE (n)-[:R]->() AND (n)-[:T]->()`) now lift into structured `WhereClause.predicates` as N `WherePatternPredicate` entries. The ast_normalizer packs them into a single appended `MatchClause` whose `patterns: Tuple[Tuple[PatternElement, ...], ...]` carries one tuple per predicate (multi-pattern cartesian within MATCH), preserving the lowering invariant that only the FINAL match is connected — pre-binding seeds remain node-only. Per-predicate validation (must include a relationship; cannot introduce new aliases) runs independently before the lift. Removes the legacy `len(pattern_leaves) > 1` gate in `parser.py::_build_where_with_pattern_lift` and the corresponding gate in `ast_normalizer._rewrite_where_pattern_predicates_to_matches`. Refactors `pattern_atom` to split the greedy `WHERE_PATTERN` lexer token (which gobbles `pattern AND pattern AND ...` chains as a single match) back into individual pattern-item texts via `_WHERE_PATTERN_ITEM_RE.finditer` and emit one `BooleanExpr(op="pattern")` per item, joined by an AND-tree via `_rebuild_and_tree`. Adds `test_gfql_executes_multi_positive_where_pattern_predicates_as_intersected_seed` and updates the legacy rejection test to assert the new lift + compile shape. Closes #1031 slice 3 (#1031).
34
35
-**GFQL / Cypher lowering**: Connected `MATCH + OPTIONAL MATCH` compilation now supports row-boolean `WHERE` expressions (`OR`/`NOT`/`XOR` and mixed row predicates) by carrying non-lowerable expressions into post-binding `where_rows(...)` filters for base and optional arms, preserving null-extension behavior while expanding supported disjunction shapes (#1219, #1224).
35
36
- **GFQL / Cypher parser**: switched the Cypher parser from Lark's LALR(1) backend to Earley. *Earley's broader unification incidentally lifts the implicit LALR rejection on row-side OR/NOT/XOR among row predicates. Coverage validated empirically across the risky shapes available to current fixtures: simple homogeneous AND/OR/NOT (correct rows); cross-alias OR with predicate-pushdown candidates (correct union — pushdown leaves the OR intact past the join); OPTIONAL MATCH + WHERE OR (the pre-existing OPTIONAL-MATCH-projection validator gates the projection shape regardless of WHERE — including the OR variant); type-coerced OR against a mixed-type Series (the call executor wraps pandas's `TypeError` as `GFQLTypeError(E303)` via the generic unsupported-row-expression path). No silent wrong-rows surfaced in the shapes exercised; deeper compositional shapes (NOT inside OR with nullable arms, N-ary OR associativity, mixed-string-numeric AND inside OR, De Morgan compositions) are tracked under #1219.* This eliminates four LALR-induced workarounds: the 3 dedicated pattern-shape `where_clause` grammar alternatives (now collapsed into a single `WHERE_PATTERN -> pattern_atom` leaf in `?primary`), `_canonicalize_where_single_pattern_and_expr` (regex source-rewrite that reordered `expr AND pattern AND expr` to `pattern AND <rest>` so LALR could match), `_mixed_where_pattern_expr_error` (pre-flight rejector replaced with a structural lift in `generic_where_clause`), and the `parse_cypher` `except LarkError` retry block. `BooleanExpr.op` literal extended with `"pattern"` plus a new `BooleanExpr.pattern` payload field; the `pattern_atom` transformer wraps `WHERE_PATTERN` tokens as boolean-tree leaves; `_split_top_level_and_pattern_leaves` + `_rebuild_and_tree` + `_build_where_with_pattern_lift` extract pattern leaves from `expr_tree` into `WhereClause.predicates` as `WherePatternPredicate` entries before lowering. Strict-improvement consequences (Earley accepts what LALR rejected): `WHERE expr OR expr` now parses as a structured `or` tree; `WHERE expr AND (expr OR expr)` parses as `and(left, or(...))`; `WHERE n:Label AND n.prop = X` routes through structured `where_predicates`; mixed label+property+string-comparison shapes work via the paired `_StringAllowingComparisonMixin` fix in `comparison.py`. Slice 2/3/4 of #1200 territory (NOT-pattern, multi-positive-pattern, OR/XOR-around-pattern) emit explicit `unsupported` errors at the lift step. 1551 GFQL tests pass; the matching `tck-gfql` branch (`issue-1031-grammar-mixed-where-pattern-expr`) carries 8 paired contract refinements: `match-where1-10` lifted to UNEXPECTED_SUCCESS; `match-where5-{1,2,3}` + `expr-comparison2-1` + `with-where5-3` migrated to a new TYPE_ERROR_KEYS bucket (string-comparison-on-mixed-Series wraps as `GFQLTypeError(E303)`); `match-where5-4` upgraded to MATCHES_EXPECTED (Earley + 3-valued OR makes `WHERE i.var > 'te' OR i.var IS NOT NULL` semantically correct); `with2-1` filed under WRONG_ROW_KEYS (WITH-pipelined join now parses + executes but rows differ from oracle — separate gap); `with-where5-3` demoted from PROMOTION_ROW_KEYS (#1031, #1217).
0 commit comments