Support outer references in reduce() fold bodies#2448
Open
jrgemignani wants to merge 1 commit into
Open
Conversation
There was a problem hiding this comment.
Pull request overview
This PR extends Cypher reduce(acc = init, var IN list | body) so the fold body can reference loop-invariant values from the enclosing query (outer-query variables and cypher() parameters) by capturing those outer references at transform time and plumbing them into the fold evaluator as additional PARAM_EXEC slots.
Changes:
- Capture loop-invariant outer-reference subtrees in
transform_cypher_reduce()asPARAM_EXECparams 2..N and pass their values via a new trailingagtype[]“extras” aggregate argument. - Update the
age_reduceaggregate /age_reduce_transfnsignature to accept the newagtype[]extras argument and bind captured values during transition evaluation. - Expand regression coverage for outer references, value-type folds, function usage, composition, and syntax errors.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/backend/utils/adt/agtype.c | Extends age_reduce_transfn to size/bind additional PARAM_EXEC slots for captured outer values. |
| src/backend/parser/cypher_clause.c | Implements capture + rewrite of loop-invariant outer references into PARAM_EXEC params 2.. and builds the extras array argument. |
| sql/age_aggregate.sql | Updates extension SQL definitions for age_reduce_transfn and age_reduce to include agtype[] extras. |
| age--1.7.0--y.y.y.sql | Mirrors the aggregate/function signature changes in the extension install/upgrade SQL. |
| regress/sql/age_reduce.sql | Adds/updates regression queries covering outer references and additional reduce() behaviors. |
| regress/expected/age_reduce.out | Updates expected output to match the new regression cases and new supported behavior. |
| regress/age_load/data/junk.csv | Adds a new CSV fixture file (currently appears unrelated to the reduce() feature). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Allow a reduce(acc = init, var IN list | body) fold body to reference
loop-invariant values from the enclosing query -- outer-query variables and
cypher() parameters -- in addition to the accumulator and element. These were
previously rejected with ERRCODE_FEATURE_NOT_SUPPORTED.
How it works
------------
The fold body is still compiled to a standalone expression evaluated by
age_reduce_transfn, so an outer reference (which cannot be evaluated there)
is captured at transform time and supplied as a value:
- After the accumulator and element are rewritten to PARAM_EXEC params 0 and
1, transform_cypher_reduce() walks the body and replaces each maximal
agtype-typed, loop-invariant subtree -- one that references an outer Var or
a cypher() $parameter but not the accumulator/element -- with a new
PARAM_EXEC param 2, 3, ... in body order.
- The captured expressions are passed to the aggregate as a trailing
agtype[] argument; age_reduce(agtype, text, agtype, agtype[]) and its
transition function gain this argument.
- age_reduce_transfn sizes its param array to 2 + the number of captures and
binds the captured values to params 2.. on every row. Because the captures
are evaluated in the outer query context as ordinary aggregate arguments, a
correlated capture is re-evaluated per group, so an outer value that varies
per row (for example under UNWIND) is folded with the correct value.
Each capture slot is rebound on every row, and the trailing extras
argument is read only when the aggregate actually passes it (PG_NARGS),
keeping the transition safe under direct age_reduce() SQL calls and an
older 4-argument signature.
This keeps the no-core-patch design: the body is still a serialized standalone
expression, and the only new machinery is the captured-value plumbing.
Still rejected
--------------
Subqueries in the body (including a nested reduce()) and aggregate functions
remain unsupported and raise a clean ERRCODE_FEATURE_NOT_SUPPORTED error: a
subquery cannot be planned as a plain aggregate argument, and an aggregate in a
per-element fold is undefined per the openCypher specification.
Tests
-----
age_reduce gains an "Outer references in the fold body" section covering a
plain outer variable, an outer variable used as a multiplier, two distinct
outer variables, a property of an outer graph variable, the same outer variable
referenced more than once, a property of an outer map, a subexpression that
mixes an outer reference with the element (only the loop-invariant part is
captured), an outer reference inside a CASE branch of the body, a NULL outer
value propagating through the fold, multiple captures mixing a NULL and a
non-NULL outer value, an outer variable that changes per row (captured per
group), and a cypher() parameter supplied via a prepared statement. The
previously-rejected outer-variable case is moved out of the not-supported
section, which now covers a nested reduce() (any subquery in the body is
unsupported) and an aggregate in the body.
The same change also broadens the base reduce() coverage with value-type folds
(a float accumulator, negative numbers, a map accumulator passed through
unchanged, and list elements indexed in the body), function calls in the fold
body (a scalar function over the element and the list itself produced by a
function), reduce() composed with surrounding expressions (consumed by another
function and used in a comparison), and syntax-error checks for each required
piece of the form -- the "= init", ", var IN list", and "| body" clauses, plus
a rejected qualified iterator variable. 42/42 installcheck pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
modified: age--1.7.0--y.y.y.sql
modified: regress/expected/age_reduce.out
modified: regress/sql/age_reduce.sql
modified: sql/age_aggregate.sql
modified: src/backend/parser/cypher_clause.c
modified: src/backend/utils/adt/agtype.c
Contributor
Author
|
Resolved all 3 Copilot issues. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Allow a reduce(acc = init, var IN list | body) fold body to reference loop-invariant values from the enclosing query -- outer-query variables and cypher() parameters -- in addition to the accumulator and element. These were previously rejected with ERRCODE_FEATURE_NOT_SUPPORTED.
How it works
The fold body is still compiled to a standalone expression evaluated by age_reduce_transfn, so an outer reference (which cannot be evaluated there) is captured at transform time and supplied as a value:
This keeps the no-core-patch design: the body is still a serialized standalone expression, and the only new machinery is the captured-value plumbing.
Still rejected
Subqueries in the body (including a nested reduce()) and aggregate functions remain unsupported and raise a clean ERRCODE_FEATURE_NOT_SUPPORTED error: a subquery cannot be planned as a plain aggregate argument, and an aggregate in a per-element fold is undefined per the openCypher specification.
Tests
age_reduce gains an "Outer references in the fold body" section covering a plain outer variable, an outer variable used as a multiplier, two distinct outer variables, a property of an outer graph variable, the same outer variable referenced more than once, a property of an outer map, a subexpression that mixes an outer reference with the element (only the loop-invariant part is captured), an outer reference inside a CASE branch of the body, a NULL outer value propagating through the fold, multiple captures mixing a NULL and a non-NULL outer value, an outer variable that changes per row (captured per group), and a cypher() parameter supplied via a prepared statement. The previously-rejected outer-variable case is moved out of the not-supported section, which now covers a nested reduce() (any subquery in the body is unsupported) and an aggregate in the body.
The same change also broadens the base reduce() coverage with value-type folds (a float accumulator, negative numbers, a map accumulator passed through unchanged, and list elements indexed in the body), function calls in the fold body (a scalar function over the element and the list itself produced by a function), reduce() composed with surrounding expressions (consumed by another function and used in a comparison), and syntax-error checks for each required piece of the form -- the "= init", ", var IN list", and "| body" clauses, plus a rejected qualified iterator variable. 42/42 installcheck pass.
Co-authored-by: Copilot copilot@github.com
modified: age--1.7.0--y.y.y.sql
new file: regress/age_load/data/junk.csv
modified: regress/expected/age_reduce.out
modified: regress/sql/age_reduce.sql
modified: sql/age_aggregate.sql
modified: src/backend/parser/cypher_clause.c
modified: src/backend/utils/adt/agtype.c