Skip to content

Support outer references in reduce() fold bodies#2448

Open
jrgemignani wants to merge 1 commit into
apache:masterfrom
jrgemignani:reduce_2_1
Open

Support outer references in reduce() fold bodies#2448
jrgemignani wants to merge 1 commit into
apache:masterfrom
jrgemignani:reduce_2_1

Conversation

@jrgemignani

Copy link
Copy Markdown
Contributor

Allow a reduce(acc = init, var IN list | body) fold body to reference loop-invariant values from the enclosing query -- outer-query variables and cypher() parameters -- in addition to the accumulator and element. These were previously rejected with ERRCODE_FEATURE_NOT_SUPPORTED.

How it works

The fold body is still compiled to a standalone expression evaluated by age_reduce_transfn, so an outer reference (which cannot be evaluated there) is captured at transform time and supplied as a value:

  • After the accumulator and element are rewritten to PARAM_EXEC params 0 and 1, transform_cypher_reduce() walks the body and replaces each maximal agtype-typed, loop-invariant subtree -- one that references an outer Var or a cypher() $parameter but not the accumulator/element -- with a new PARAM_EXEC param 2, 3, ... in body order.
  • The captured expressions are passed to the aggregate as a trailing agtype[] argument; age_reduce(agtype, text, agtype, agtype[]) and its transition function gain this argument.
  • age_reduce_transfn sizes its param array to 2 + the number of captures and binds the captured values to params 2.. on every row. Because the captures are evaluated in the outer query context as ordinary aggregate arguments, a correlated capture is re-evaluated per group, so an outer value that varies per row (for example under UNWIND) is folded with the correct value.

This keeps the no-core-patch design: the body is still a serialized standalone expression, and the only new machinery is the captured-value plumbing.

Still rejected

Subqueries in the body (including a nested reduce()) and aggregate functions remain unsupported and raise a clean ERRCODE_FEATURE_NOT_SUPPORTED error: a subquery cannot be planned as a plain aggregate argument, and an aggregate in a per-element fold is undefined per the openCypher specification.

Tests

age_reduce gains an "Outer references in the fold body" section covering a plain outer variable, an outer variable used as a multiplier, two distinct outer variables, a property of an outer graph variable, the same outer variable referenced more than once, a property of an outer map, a subexpression that mixes an outer reference with the element (only the loop-invariant part is captured), an outer reference inside a CASE branch of the body, a NULL outer value propagating through the fold, multiple captures mixing a NULL and a non-NULL outer value, an outer variable that changes per row (captured per group), and a cypher() parameter supplied via a prepared statement. The previously-rejected outer-variable case is moved out of the not-supported section, which now covers a nested reduce() (any subquery in the body is unsupported) and an aggregate in the body.

The same change also broadens the base reduce() coverage with value-type folds (a float accumulator, negative numbers, a map accumulator passed through unchanged, and list elements indexed in the body), function calls in the fold body (a scalar function over the element and the list itself produced by a function), reduce() composed with surrounding expressions (consumed by another function and used in a comparison), and syntax-error checks for each required piece of the form -- the "= init", ", var IN list", and "| body" clauses, plus a rejected qualified iterator variable. 42/42 installcheck pass.

Co-authored-by: Copilot copilot@github.com

modified: age--1.7.0--y.y.y.sql
new file: regress/age_load/data/junk.csv
modified: regress/expected/age_reduce.out
modified: regress/sql/age_reduce.sql
modified: sql/age_aggregate.sql
modified: src/backend/parser/cypher_clause.c
modified: src/backend/utils/adt/agtype.c

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends Cypher reduce(acc = init, var IN list | body) so the fold body can reference loop-invariant values from the enclosing query (outer-query variables and cypher() parameters) by capturing those outer references at transform time and plumbing them into the fold evaluator as additional PARAM_EXEC slots.

Changes:

  • Capture loop-invariant outer-reference subtrees in transform_cypher_reduce() as PARAM_EXEC params 2..N and pass their values via a new trailing agtype[] “extras” aggregate argument.
  • Update the age_reduce aggregate / age_reduce_transfn signature to accept the new agtype[] extras argument and bind captured values during transition evaluation.
  • Expand regression coverage for outer references, value-type folds, function usage, composition, and syntax errors.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/backend/utils/adt/agtype.c Extends age_reduce_transfn to size/bind additional PARAM_EXEC slots for captured outer values.
src/backend/parser/cypher_clause.c Implements capture + rewrite of loop-invariant outer references into PARAM_EXEC params 2.. and builds the extras array argument.
sql/age_aggregate.sql Updates extension SQL definitions for age_reduce_transfn and age_reduce to include agtype[] extras.
age--1.7.0--y.y.y.sql Mirrors the aggregate/function signature changes in the extension install/upgrade SQL.
regress/sql/age_reduce.sql Adds/updates regression queries covering outer references and additional reduce() behaviors.
regress/expected/age_reduce.out Updates expected output to match the new regression cases and new supported behavior.
regress/age_load/data/junk.csv Adds a new CSV fixture file (currently appears unrelated to the reduce() feature).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/backend/utils/adt/agtype.c Outdated
Comment thread regress/age_load/data/junk.csv Outdated
Comment thread src/backend/utils/adt/agtype.c Outdated
Allow a reduce(acc = init, var IN list | body) fold body to reference
loop-invariant values from the enclosing query -- outer-query variables and
cypher() parameters -- in addition to the accumulator and element. These were
previously rejected with ERRCODE_FEATURE_NOT_SUPPORTED.

How it works
------------
The fold body is still compiled to a standalone expression evaluated by
age_reduce_transfn, so an outer reference (which cannot be evaluated there)
is captured at transform time and supplied as a value:

  - After the accumulator and element are rewritten to PARAM_EXEC params 0 and
    1, transform_cypher_reduce() walks the body and replaces each maximal
    agtype-typed, loop-invariant subtree -- one that references an outer Var or
    a cypher() $parameter but not the accumulator/element -- with a new
    PARAM_EXEC param 2, 3, ... in body order.
  - The captured expressions are passed to the aggregate as a trailing
    agtype[] argument; age_reduce(agtype, text, agtype, agtype[]) and its
    transition function gain this argument.
  - age_reduce_transfn sizes its param array to 2 + the number of captures and
    binds the captured values to params 2.. on every row. Because the captures
    are evaluated in the outer query context as ordinary aggregate arguments, a
    correlated capture is re-evaluated per group, so an outer value that varies
    per row (for example under UNWIND) is folded with the correct value.
    Each capture slot is rebound on every row, and the trailing extras
    argument is read only when the aggregate actually passes it (PG_NARGS),
    keeping the transition safe under direct age_reduce() SQL calls and an
    older 4-argument signature.

This keeps the no-core-patch design: the body is still a serialized standalone
expression, and the only new machinery is the captured-value plumbing.

Still rejected
--------------
Subqueries in the body (including a nested reduce()) and aggregate functions
remain unsupported and raise a clean ERRCODE_FEATURE_NOT_SUPPORTED error: a
subquery cannot be planned as a plain aggregate argument, and an aggregate in a
per-element fold is undefined per the openCypher specification.

Tests
-----
age_reduce gains an "Outer references in the fold body" section covering a
plain outer variable, an outer variable used as a multiplier, two distinct
outer variables, a property of an outer graph variable, the same outer variable
referenced more than once, a property of an outer map, a subexpression that
mixes an outer reference with the element (only the loop-invariant part is
captured), an outer reference inside a CASE branch of the body, a NULL outer
value propagating through the fold, multiple captures mixing a NULL and a
non-NULL outer value, an outer variable that changes per row (captured per
group), and a cypher() parameter supplied via a prepared statement. The
previously-rejected outer-variable case is moved out of the not-supported
section, which now covers a nested reduce() (any subquery in the body is
unsupported) and an aggregate in the body.

The same change also broadens the base reduce() coverage with value-type folds
(a float accumulator, negative numbers, a map accumulator passed through
unchanged, and list elements indexed in the body), function calls in the fold
body (a scalar function over the element and the list itself produced by a
function), reduce() composed with surrounding expressions (consumed by another
function and used in a comparison), and syntax-error checks for each required
piece of the form -- the "= init", ", var IN list", and "| body" clauses, plus
a rejected qualified iterator variable. 42/42 installcheck pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

modified:   age--1.7.0--y.y.y.sql
modified:   regress/expected/age_reduce.out
modified:   regress/sql/age_reduce.sql
modified:   sql/age_aggregate.sql
modified:   src/backend/parser/cypher_clause.c
modified:   src/backend/utils/adt/agtype.c
@jrgemignani

Copy link
Copy Markdown
Contributor Author

Resolved all 3 Copilot issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants