Skip to content

Commit 51b10dc

Browse files
lmeyerovclaude
andcommitted
test+docs(cypher): pin aggregate-failfast interaction with chained-reentry carry (#1256 W1-I1)
Wave-1 review flagged that hidden-column forwarding could in principle silently mutate grouping when the appended `__cypher_reentry_<S>_<X>__` bare item lands inside a `WITH a, friend, count(*) AS n` chain — the carry would join the group key set. Empirically those queries already fail at the pre-existing "aggregate would need repeated MATCH rows from a relationship pattern" failfast, so no silent wrong-result path exists today. Pin that behavior with a regression-lock test so a future change that lifts the aggregate failfast must reckon with carry-grouping interaction explicitly. Also document the DISTINCT/aggregate interaction in `_demote_secondary_whole_row_aliases` inline so the next reader sees the design tradeoff. Refs #1256. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 5762270 commit 51b10dc

2 files changed

Lines changed: 37 additions & 0 deletions

File tree

graphistry/compute/gfql/cypher/lowering.py

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7948,6 +7948,17 @@ def rewrite_text(expr: ExpressionText, field: str) -> ExpressionText:
79487948
# is in the prefix) but the subsequent WITH/RETURN scope-narrowing drops it,
79497949
# and references like `RETURN x.id AS xid` (rewritten to a bare hidden
79507950
# identifier) fail at the inner compile's alias resolution.
7951+
#
7952+
# Interaction with DISTINCT/aggregating downstream stages: appending the
7953+
# carry as a bare item makes it a participant in DISTINCT key sets and (in
7954+
# principle) aggregation grouping. For multi-alias carry semantics this is
7955+
# what callers want — DISTINCT over `(friend, x.id)` is the desired
7956+
# behavior when `x.id` is referenced downstream. Queries that combine carry
7957+
# forwarding with relationship-pattern aggregation already fail at the
7958+
# earlier "aggregate would need repeated MATCH rows" failfast, so no silent
7959+
# wrong-grouping reaches a user via this code path. A future tightening
7960+
# (gate per-stage on `not stage.clause.distinct and not has_aggregate`) is
7961+
# tracked under #1256 follow-up.
79517962
if refs_collected and rewritten_with_stages_tail:
79527963
forwarded_items: List[ReturnItem] = []
79537964
for alias_name, prop in sorted(refs_collected):

graphistry/tests/compute/gfql/cypher/test_lowering.py

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8081,6 +8081,32 @@ def test_string_cypher_admits_multi_alias_distinct_forwarding_through_reentry()
80818081
]
80828082

80838083

8084+
def test_string_cypher_chained_reentry_carry_with_aggregate_hits_existing_aggregate_failfast() -> None:
8085+
"""#1256 wave-1 review W1-I1 regression-lock: hidden-column forwarding
8086+
appends ``__cypher_reentry_<S>_<X>__`` bare items to downstream WITH stages
8087+
so chained recursive compiles see them as scalar carries. The reviewer
8088+
flagged that a chained WITH with relationship-pattern aggregation could in
8089+
principle silently mutate grouping (`WITH a, friend, count(*) AS n` would
8090+
add the carry to the group key). Empirically those queries already fail at
8091+
the pre-existing aggregate-after-relationship-MATCH failfast, so no silent
8092+
wrong-result path exists. This test pins that behavior so a future change
8093+
that lifts the aggregate failfast must also reckon with carry-grouping
8094+
interaction explicitly.
8095+
"""
8096+
query = (
8097+
"MATCH (a:A {id: 'a'}), (x:B {id: 'b'}) "
8098+
"WITH a, x "
8099+
"MATCH (a)-[:R]->(friend) "
8100+
"WITH a, x, count(*) AS n "
8101+
"RETURN x.id AS xid, n"
8102+
)
8103+
with pytest.raises(
8104+
GFQLValidationError,
8105+
match=r"aggregate would need repeated MATCH rows",
8106+
):
8107+
_mk_multi_stage_reentry_graph().gfql(query)
8108+
8109+
80848110
def test_string_cypher_failfast_rejects_intermediate_reentry_match_with_no_carried_source() -> None:
80858111
"""#1256 slice 4.3d remaining gap: trailing MATCH that does NOT start from a
80868112
carried alias (free-form intermediate MATCH) is still unsupported.

0 commit comments

Comments
 (0)