Commit 2bc8e95

gregfelice and claude authored
Support pattern expressions as boolean expressions (#2360)
* Support pattern expressions in WHERE clause via GLR parser (issue #1577)

Enable bare graph patterns as boolean expressions in WHERE clauses:

    MATCH (a:Person), (b:Person)
    WHERE (a)-[:KNOWS]->(b)   -- now valid, equivalent to EXISTS(...)
    RETURN a.name, b.name

Previously, this required wrapping in EXISTS():

    WHERE EXISTS((a)-[:KNOWS]->(b))

The bare pattern syntax is standard openCypher and is used extensively in Neo4j. Its absence was the most frequently cited migration blocker.

Implementation approach:

- Switch the Cypher parser from LALR(1) to Bison GLR mode. GLR handles the inherent ambiguity between parenthesized expressions '(' expr ')' and graph path nodes '(' var_name label_opt props ')' by forking the parse stack and discarding the failing path.
- Add anonymous_path as an expr_atom alternative with %dprec 1 (lower priority than expression path at %dprec 2). The action wraps the pattern in a cypher_sub_pattern + EXISTS SubLink, reusing the same transform_cypher_sub_pattern() machinery as explicit EXISTS().
- Extract make_exists_pattern_sublink() helper shared by both EXISTS(pattern) and bare pattern rules.
- Fix YYLLOC_DEFAULT to use YYRHSLOC() for GLR compatibility.
- %dprec annotations on expr_var/var_name_opt resolve the reduce/reduce conflict between expression variables and pattern node variables.

Conflict budget: 7 shift/reduce (path extension vs arithmetic on -/<), 3 reduce/reduce (expr_var vs var_name_opt on )/}/=). All are expected and handled correctly by GLR forking + %dprec disambiguation.

All 32 regression tests pass (31 existing + 1 new). New pattern_expression test covers: bare patterns, NOT patterns, labeled nodes, AND/OR combinations, left-directed patterns, anonymous nodes, multi-hop patterns, EXISTS() backward compatibility, and non-pattern expression regression checks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address Copilot review: comment placement, %expect docs, test wording

1. Move "Helper function to create an ExplainStmt node" comment from above make_exists_pattern_sublink() to above make_explain_stmt() where it belongs.
2. Add block comment documenting the %expect/%expect-rr conflict budget: 7 S/R from path vs arithmetic on - and <, 3 R/R from expr_var vs var_name_opt on ) } =.
3. Clarify test comment: "Regular expressions" -> "Regular (non-pattern) expressions" to avoid confusion with regex.

Regression test: pattern_expression OK.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address Copilot round 3: broaden scope, remove %expect fragility

- Pattern expressions are now accepted anywhere an expr is valid (RETURN, WITH, SET, CASE, boolean combinations), not only WHERE. This matches openCypher semantics and documents the broader surface area that was already implicitly enabled by adding anonymous_path to expr_atom. Added regression tests for each new context: RETURN projection (bare and AS-aliased), mixed with other projections, CASE WHEN, boolean AND/OR combinators, SET to persist a computed boolean property, and WITH ... WHERE pipeline.
- Remove the hardcoded `%expect 7` / `%expect-rr 3` conflict budget from cypher_gram.y. The exact conflict counts can drift across Bison versions and distros, which would break builds even though the grammar is correct (GLR handles the conflicts at runtime via fork + %dprec). Instead, pass -Wno-conflicts-sr / -Wno-conflicts-rr via BISONFLAGS in the Makefile so the build stays clean without binding us to a specific Bison release. Kept a block comment in the grammar explaining why GLR conflicts are expected and how they resolve.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address jrgemignani review: keep -Werror, restore %expect budget

Reverts the broad `-Werror` drop and the no-%expect approach from the prior round on jrgemignani's request. The earlier framing (that conflict counts drift across Bison versions, so %expect is fragile) overcorrected: it removed the only build-time alarm bell for unintended new conflicts.

Makefile: keep -Werror so any unexpected Bison warning (unused rules, undeclared types, etc.) still fails the build; downgrade only the two conflict categories to plain warnings via -Wno-error=conflicts-sr -Wno-error=conflicts-rr. pgxs auto-adds -Wno-deprecated, so existing %name-prefix= / %pure-parser directives remain non-erroring.

cypher_gram.y: add `%expect 7` and `%expect-rr 3` matching the Bison 3.8.2 totals. Bison treats %expect as exact-match, not as a ceiling: any deviation fails the build and forces an audit of the new conflicts. Comment updated to reflect that future Bison versions reporting different counts should bump the numbers explicitly with a version note in the commit message, rather than removing the directive.

No grammar or runtime change. Cassert installcheck 34/34 AGE tests green.

* Add follow-up regression coverage for pattern expressions (#2360)

Addresses the non-blocking test-coverage follow-ups from the review: pattern expressions in additional contexts opened up by allowing anonymous_path as an expr_atom.

New cases (all verified against a PG18 build):

- Single-node pattern on a bound variable (a:Label). Documented as an EXISTS existence check, NOT an openCypher label predicate: a matching label is always true, and a non-matching label hits AGE's pre-existing "multiple labels for variable" restriction (captured as expected error).
- Pattern expressions inside list and map literals.
- Pattern expressions as function arguments: collect() shows correct per-row booleans; count() counts all rows (non-null bool) -- documented so the value is not mistaken for a bug.
- Pattern expression in OPTIONAL MATCH ... WHERE (null-preserving).
- EXISTS() and a bare pattern combined in one predicate.

make installcheck: 33/33 green.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
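
The fork-and-discard behaviour described in the first commit can be pictured with a toy model. The sketch below is not Bison's GLR algorithm (which forks LR stacks token by token) and does not use AGE's grammar. It simply runs two hypothetical recognizers, `parse_expression` and `parse_pattern`, over the same pre-split token list, drops whichever fails, and, when both succeed, keeps the one with the higher priority, mirroring the %dprec 2 vs %dprec 1 ordering mentioned above. All function names and the simplified token format are assumptions made for illustration.

```python
from typing import Callable, List, Optional, Tuple

Tokens = List[str]


def parse_expression(tokens: Tokens) -> Optional[str]:
    """Toy rule: '(' atom (('+' | '-') atom)* ')'. Returns None on failure."""
    if len(tokens) < 3 or tokens[0] != "(" or tokens[-1] != ")":
        return None
    inner = tokens[1:-1]
    if len(inner) % 2 == 0:  # must alternate atom, operator, atom, ...
        return None
    for i, tok in enumerate(inner):
        if i % 2 == 0 and not tok.isalnum():
            return None
        if i % 2 == 1 and tok not in ("+", "-"):
            return None
    return "expr"


def _parse_node(tokens: Tokens, pos: int) -> Optional[int]:
    """Toy rule: '(' [IDENT] [':' IDENT] ')'. Returns the next position."""
    if pos >= len(tokens) or tokens[pos] != "(":
        return None
    pos += 1
    if pos < len(tokens) and tokens[pos].isidentifier():
        pos += 1
    if (pos + 1 < len(tokens) and tokens[pos] == ":"
            and tokens[pos + 1].isidentifier()):
        pos += 2
    if pos < len(tokens) and tokens[pos] == ")":
        return pos + 1
    return None


def parse_pattern(tokens: Tokens) -> Optional[str]:
    """Toy rule: node (relationship node)*, relationships are single tokens."""
    pos = _parse_node(tokens, 0)
    while pos is not None and pos < len(tokens):
        if not tokens[pos].startswith(("-[", "<-[")):
            return None
        pos = _parse_node(tokens, pos + 1)
    return "pattern" if pos is not None else None


# (priority, recognizer): the higher priority wins when both forks succeed,
# analogous to the expression rule carrying %dprec 2 and the pattern %dprec 1.
ALTERNATIVES: List[Tuple[int, Callable[[Tokens], Optional[str]]]] = [
    (2, parse_expression),
    (1, parse_pattern),
]


def resolve(tokens: Tokens) -> str:
    """Run every alternative, discard failures, pick the highest priority."""
    survivors = [(prec, fn(tokens)) for prec, fn in ALTERNATIVES]
    survivors = [(prec, res) for prec, res in survivors if res is not None]
    if not survivors:
        raise SyntaxError("no alternative accepts the input")
    return max(survivors, key=lambda item: item[0])[1]
```

In this model `["(", "a", ")"]` is accepted by both recognizers and resolves to the expression reading, while `["(", "a", ")", "-[:KNOWS]->", "(", "b", ")"]` survives only as a pattern because the expression fork fails on the leftover tokens.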
1 parent 5abb02f commit 2bc8e95

4 files changed

Lines changed: 859 additions & 21 deletions

Makefile

Lines changed: 16 additions & 1 deletion
@@ -215,6 +215,7 @@ REGRESS = scan \
           jsonb_operators \
           list_comprehension \
           predicate_functions \
+          pattern_expression \
           map_projection \
           direct_field_access \
           security \
@@ -282,7 +283,21 @@ src/include/parser/cypher_kwlist_d.h: src/include/parser/cypher_kwlist.h $(GEN_K
 
 src/include/parser/cypher_gram_def.h: src/backend/parser/cypher_gram.c
 
-src/backend/parser/cypher_gram.c: BISONFLAGS += --defines=src/include/parser/cypher_gram_def.h -Werror
+#
+# The Cypher grammar uses GLR mode with a number of inherent shift/reduce
+# and reduce/reduce conflicts arising from the ambiguity between
+# parenthesized expressions and graph patterns (both start with '(').
+# GLR handles these correctly at runtime by forking at the conflict
+# point; %dprec annotations resolve cases where both forks succeed.
+#
+# We keep -Werror so any unexpected Bison warning (unused rules, undeclared
+# types, etc.) still fails the build; we downgrade only the two conflict
+# categories to plain warnings via -Wno-error=. The exact conflict totals
+# are pinned by %expect / %expect-rr in cypher_gram.y, which Bison treats
+# as exact-match: any deviation fails the build and forces an audit of
+# the new conflicts.
+#
+src/backend/parser/cypher_gram.c: BISONFLAGS += --defines=src/include/parser/cypher_gram_def.h -Werror -Wno-error=conflicts-sr -Wno-error=conflicts-rr
 
 src/backend/parser/cypher_parser.o: src/backend/parser/cypher_gram.c src/include/parser/cypher_gram_def.h
 src/backend/parser/cypher_parser.bc: src/backend/parser/cypher_gram.c src/include/parser/cypher_gram_def.h
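
The difference between an exact-match conflict budget and a ceiling, which the Makefile comment relies on, can be sketched as a small model. This is only an illustration of the policy as the commit message describes it, not Bison's implementation. `ConflictBudget`, `check_exact` and `check_ceiling` are hypothetical names, and the ceiling variant exists purely for contrast.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ConflictBudget:
    """Shift/reduce and reduce/reduce conflict counts for one grammar."""
    shift_reduce: int
    reduce_reduce: int


def check_exact(expected: ConflictBudget, actual: ConflictBudget) -> List[str]:
    """Exact-match policy: any deviation, up or down, is an error."""
    errors = []
    if actual.shift_reduce != expected.shift_reduce:
        errors.append(
            f"shift/reduce: expected {expected.shift_reduce}, "
            f"got {actual.shift_reduce}"
        )
    if actual.reduce_reduce != expected.reduce_reduce:
        errors.append(
            f"reduce/reduce: expected {expected.reduce_reduce}, "
            f"got {actual.reduce_reduce}"
        )
    return errors


def check_ceiling(expected: ConflictBudget, actual: ConflictBudget) -> List[str]:
    """Hypothetical ceiling policy, shown only for contrast: fewer is fine."""
    errors = []
    if actual.shift_reduce > expected.shift_reduce:
        errors.append("too many shift/reduce conflicts")
    if actual.reduce_reduce > expected.reduce_reduce:
        errors.append("too many reduce/reduce conflicts")
    return errors


# Budget taken from the commit message: %expect 7 and %expect-rr 3.
BUDGET = ConflictBudget(shift_reduce=7, reduce_reduce=3)
```

Under the exact-match policy a grammar change that reduces the count to 6 shift/reduce conflicts is flagged just like one that raises it to 8, which is what forces the audit the comment describes. A ceiling would let the first case pass silently.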
