feat(gfql/cypher): execute WHERE NOT (pattern) via anti-semi row filtering (#1031 slice 2 phase 2b) (#1238)
* feat(gfql/cypher): execute WHERE NOT pattern via anti-semi row filtering (#1031, #1235)
* test(gfql/cypher): amplify OPTIONAL MATCH NOT-pattern coverage (#1235)
CHANGELOG.md: 1 addition, 0 deletions
@@ -30,6 +30,7 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm
- **Polars support**: `polars.DataFrame` and `polars.LazyFrame` now work in `plot()`, `materialize_nodes()`, `get_degrees()`, `get_indegrees()`, `get_outdegrees()`, and `hypergraph()`. Polars is an optional dependency; no behavior change when not installed. Upload path uses efficient Arrow conversion (`to_arrow()` with schema-metadata stripping and memoization); compute/hypergraph paths coerce to pandas at entry. `LazyFrame` is materialized via `.collect()` at each boundary. Adds `test_polars.py` with 17 tests; skips gracefully when polars is absent (#1133).
### Internal
+- **GFQL / Cypher lowering - WHERE NOT (pattern) anti-semi execution (#1031 slice 2 phase 2b)**: `WherePatternPredicate(negated=True)` no longer hard-fails at lowering. The compiler now emits a row pre-filter call (`anti_semi_apply`) for negated pattern predicates in both general MATCH lowering and connected OPTIONAL MATCH clause lowering, with per-predicate validation (must include relationship, must not introduce new aliases, must share bound aliases). Added `anti_semi_apply` row-pipeline runtime operation + call-safelist entry, plus row-table base-graph context preservation needed for correlated bindings execution. New tests cover compile shape (`row_pre_filters` emission) and runtime behavior for `MATCH (n) WHERE NOT (n)-[:R]->()` (including mixed row-expression + NOT-pattern filtering), plus connected `OPTIONAL MATCH ... WHERE NOT (pattern)` filtering/null-fill semantics. Full `cypher/test_lowering.py` suite passes (756 passed / 66 skipped) and touched-module mypy is clean.
- **GFQL / Cypher parser + ast_normalizer — NOT-pattern AST plumbing (#1031 slice 2 phase 2a)**: Top-level `WHERE NOT (pattern)` shapes (e.g. `WHERE NOT (n)-[:R]->()`) now parse cleanly and lift into `WhereClause.predicates` as `WherePatternPredicate(negated=True)` entries instead of tripping the legacy "cannot yet be mixed with generic row expressions" E108. `_split_top_level_and_pattern_leaves` adds a top-level `not(pattern_atom)` case that strips the NOT and emits the inner pattern as a negated leaf; `_build_where_with_pattern_lift` accepts both positive and negated leaves and emits one `WherePatternPredicate` per leaf with the matching `negated` flag. ast_normalizer's `_rewrite_where_pattern_predicates_to_matches` partitions into positive (rewrites to appended MatchClause as before) vs negated (passes through to lowering). Lowering now distinguishes the two cases: positive `WherePatternPredicate` still raises "must be rewritten before lowering" (defensive — slice 3 already rewrites all positives in ast_normalizer); negated raises a scoped "Cypher WHERE NOT (pattern) anti-semi-join lowering is not yet supported" pointing the way for the engine half (path-C row-pipeline anti-join, see `plans/1031-slices-2-3-4/findings/slice-2-scope.md`). Adds `test_parse_lifts_top_level_not_pattern_to_negated_predicate` and `test_string_cypher_failfast_rejects_negated_pattern_until_slice2_lowering`. De Morgan compositions, OR-around-pattern, and double-NOT remain rejected at the lift step (slice 4 / future). Phase 2a only — runtime (anti-semi-join lowering) ships in a follow-up sub-PR (#1031).
- **GFQL / Cypher row-boolean residual matrix + guardrails (#1219 hardening)**: Locks compositional row-boolean WHERE shapes that #1217's Earley swap admitted but its initial test surface didn't cover. Adds 11 native tests: nullable NOT/OR over a 4-row 3VL fixture (`NULL OR T = T`); N-ary OR (3 branches) + duplicate-branch companion isolating rightmost-drop associativity bugs; De Morgan equivalences (`NOT (A OR B)` ≡ `NOT A AND NOT B`; `NOT (A AND B)` ≡ `NOT A OR NOT B`) parametrized to assert both per-form expected rows AND the form-equivalence; double negation; XOR symmetric difference + XOR with NULL preserving 3VL; mixed-string-numeric AND inside OR exercising `_StringAllowingComparisonMixin` GT path; unit test locking `boolean_expr_to_text(BooleanExpr(op="pattern", ...))` round-trip for the (currently unreachable) defensive branch. Three docstring guardrails: `expr_split.split_top_level_and` documents AND-only intent + the `_combine_conjuncts` AND-recombine mechanism that makes a hypothetical `split_top_level_or` silently incorrect; `predicate_pushdown._split_conjuncts` mirrored guard naming the failure mode; `_boolean_expr_text.boolean_expr_to_text` explicit `op == "pattern"` branch with both unreachability paths documented. No production-code behavior change. Closes the residual-frontier portion of #1219; deeper compositional shapes beyond current fixtures remain tracked under that issue (#1219, #1227).
- **GFQL / Cypher parser + ast_normalizer — multi-positive WHERE pattern predicates (#1031 slice 3)**: AND-joined positive WHERE pattern predicates (`WHERE (n)-[:R]->() AND (n)-[:T]->()`) now lift into structured `WhereClause.predicates` as N `WherePatternPredicate` entries. The ast_normalizer packs them into a single appended `MatchClause` whose `patterns: Tuple[Tuple[PatternElement, ...], ...]` carries one tuple per predicate (multi-pattern cartesian within MATCH), preserving the lowering invariant that only the FINAL match is connected — pre-binding seeds remain node-only. Per-predicate validation (must include a relationship; cannot introduce new aliases) runs independently before the lift. Removes the legacy `len(pattern_leaves) > 1` gate in `parser.py::_build_where_with_pattern_lift` and the corresponding gate in `ast_normalizer._rewrite_where_pattern_predicates_to_matches`. Refactors `pattern_atom` to split the greedy `WHERE_PATTERN` lexer token (which gobbles `pattern AND pattern AND ...` chains as a single match) back into individual pattern-item texts via `_WHERE_PATTERN_ITEM_RE.finditer` and emit one `BooleanExpr(op="pattern")` per item, joined by an AND-tree via `_rebuild_and_tree`. Adds `test_gfql_executes_multi_positive_where_pattern_predicates_as_intersected_seed` and updates the legacy rejection test to assert the new lift + compile shape. Closes #1031 slice 3 (#1031).
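For readers unfamiliar with the term, the anti-semi row filtering described in the phase 2b entry can be sketched in plain Python. This is only an illustration of the semantics, not the library's implementation: the helper names (`anti_semi_filter`, `optional_match_not_pattern`), the edge-tuple layout, and the sample graph are all hypothetical, and the real `anti_semi_apply` operation runs over row tables rather than lists of dicts.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[str]]
Edge = Tuple[str, str, str]  # (source id, relationship type, destination id)


def anti_semi_filter(rows: List[Row], edges: List[Edge], alias: str, rel: str) -> List[Row]:
    """Keep rows whose `alias` binding has NO outgoing `rel` edge.

    Models `MATCH (n) WHERE NOT (n)-[:R]->()`: the pattern only tests
    existence, so rows are dropped or kept, never multiplied.
    """
    has_match = {src for src, edge_rel, _dst in edges if edge_rel == rel}
    return [row for row in rows if row[alias] not in has_match]


def optional_match_not_pattern(
    rows: List[Row], edges: List[Edge], alias: str, opt_rel: str, opt_alias: str, not_rel: str
) -> List[Row]:
    """Model `OPTIONAL MATCH (n)-[:S]->(m) WHERE NOT (m)-[:R]->()`.

    The WHERE filters candidate optional matches; a row with no surviving
    candidate is kept once, with `opt_alias` null-filled.
    """
    blocked = {src for src, edge_rel, _dst in edges if edge_rel == not_rel}
    out: List[Row] = []
    for row in rows:
        candidates = [
            dst
            for src, edge_rel, dst in edges
            if src == row[alias] and edge_rel == opt_rel and dst not in blocked
        ]
        if candidates:
            out.extend({**row, opt_alias: dst} for dst in candidates)
        else:
            out.append({**row, opt_alias: None})  # null-fill, row is preserved
    return out


# Hypothetical sample graph: only node "a" has an outgoing R edge.
edges: List[Edge] = [("a", "R", "b"), ("a", "S", "b"), ("b", "S", "c"), ("c", "S", "a")]
rows: List[Row] = [{"n": "a"}, {"n": "b"}, {"n": "c"}]

kept = anti_semi_filter(rows, edges, alias="n", rel="R")
optional = optional_match_not_pattern(rows, edges, "n", "S", "m", "R")
```

Here `kept` retains only the rows for `b` and `c`, since `a` is the sole source of an `R` edge. In `optional`, the row for `c` survives with `m` set to `None`, because its only `S` neighbour (`a`) is excluded by the negated pattern.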