Commit b6236eb

Support outer references in reduce() fold bodies
Allow a reduce(acc = init, var IN list | body) fold body to reference
loop-invariant values from the enclosing query -- outer-query variables and
cypher() parameters -- in addition to the accumulator and element. These were
previously rejected with ERRCODE_FEATURE_NOT_SUPPORTED.

How it works
------------
The fold body is still compiled to a standalone expression evaluated by
age_reduce_transfn, so an outer reference (which cannot be evaluated there) is
captured at transform time and supplied as a value:

- After the accumulator and element are rewritten to PARAM_EXEC params 0 and
  1, transform_cypher_reduce() walks the body and replaces each maximal
  agtype-typed, loop-invariant subtree -- one that references an outer Var or
  a cypher() $parameter but not the accumulator/element -- with a new
  PARAM_EXEC param 2, 3, ... in body order.
- The captured expressions are passed to the aggregate as a trailing agtype[]
  argument; age_reduce(agtype, text, agtype, agtype[]) and its transition
  function gain this argument.
- age_reduce_transfn sizes its param array to 2 + the number of captures and
  binds the captured values to params 2.. on every row.

Because the captures are evaluated in the outer query context as ordinary
aggregate arguments, a correlated capture is re-evaluated per group, so an
outer value that varies per row (for example under UNWIND) is folded with the
correct value. Each capture slot is rebound on every row, and the trailing
extras argument is read only when the aggregate actually passes it (PG_NARGS),
keeping the transition safe under direct age_reduce() SQL calls and an older
4-argument signature.

This keeps the no-core-patch design: the body is still a serialized standalone
expression, and the only new machinery is the captured-value plumbing.
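The capture walk described above can be sketched on a toy expression tree. This is only an illustration under stated assumptions, not the code in cypher_clause.c: `Expr`, `Kind`, `mk`, `contains`, `capturable`, and `capture_outer` are hypothetical names, `K_ACC` and `K_ELEM` stand in for PARAM_EXEC params 0 and 1, the agtype type check is omitted, "body order" is assumed to mean a left-to-right top-down walk, and no attempt is made to deduplicate repeated references to the same outer variable.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_CAPTURES 8

/* Toy node kinds (hypothetical; not PostgreSQL's Node tags). */
typedef enum { K_CONST, K_ACC, K_ELEM, K_OUTER, K_OP, K_PARAM } Kind;

typedef struct Expr {
    Kind kind;
    int id;                 /* K_PARAM: slot; K_OUTER: variable id; K_CONST: value */
    struct Expr *lhs, *rhs; /* operands of K_OP, NULL otherwise */
} Expr;

static Expr *mk(Kind kind, int id, Expr *lhs, Expr *rhs)
{
    Expr *e = malloc(sizeof *e);
    assert(e != NULL);
    e->kind = kind;
    e->id = id;
    e->lhs = lhs;
    e->rhs = rhs;
    return e;
}

/* Does the subtree rooted at e contain a node of the given kind? */
static int contains(const Expr *e, Kind kind)
{
    if (e == NULL)
        return 0;
    return e->kind == kind || contains(e->lhs, kind) || contains(e->rhs, kind);
}

/* Loop-invariant: touches an outer value but neither accumulator nor element. */
static int capturable(const Expr *e)
{
    return contains(e, K_OUTER) && !contains(e, K_ACC) && !contains(e, K_ELEM);
}

/*
 * Walk top-down so the first capturable node found on any path is the
 * maximal one. Each captured subtree is saved in captures[] and replaced
 * in the body by a K_PARAM node numbered 2, 3, ... in visit order.
 */
static Expr *capture_outer(Expr *e, Expr **captures, int *ncaptures)
{
    if (e == NULL)
        return NULL;
    if (capturable(e)) {
        assert(*ncaptures < MAX_CAPTURES);
        captures[*ncaptures] = e;
        return mk(K_PARAM, 2 + (*ncaptures)++, NULL, NULL);
    }
    e->lhs = capture_outer(e->lhs, captures, ncaptures);
    e->rhs = capture_outer(e->rhs, captures, ncaptures);
    return e;
}
```

For a body shaped like `s + lookup[x - 1]`, this walk replaces only the `lookup` leaf, because the enclosing index expression also touches the element. For `s + x + (a * b)`, the whole `a * b` subtree becomes a single captured slot.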

Still rejected
--------------
Subqueries in the body (including a nested reduce()) and aggregate functions
remain unsupported and raise a clean ERRCODE_FEATURE_NOT_SUPPORTED error: a
subquery cannot be planned as a plain aggregate argument, and an aggregate in
a per-element fold is undefined per the openCypher specification.

Tests
-----
age_reduce gains an "Outer references in the fold body" section covering a
plain outer variable, an outer variable used as a multiplier, two distinct
outer variables, a property of an outer graph variable, the same outer
variable referenced more than once, a property of an outer map, a
subexpression that mixes an outer reference with the element (only the
loop-invariant part is captured), an outer reference inside a CASE branch of
the body, a NULL outer value propagating through the fold, multiple captures
mixing a NULL and a non-NULL outer value, an outer variable that changes per
row (captured per group), and a cypher() parameter supplied via a prepared
statement.

The previously-rejected outer-variable case is moved out of the not-supported
section, which now covers a nested reduce() (any subquery in the body is
unsupported) and an aggregate in the body.

The same change also broadens the base reduce() coverage with value-type folds
(a float accumulator, negative numbers, a map accumulator passed through
unchanged, and list elements indexed in the body), function calls in the fold
body (a scalar function over the element and the list itself produced by a
function), reduce() composed with surrounding expressions (consumed by another
function and used in a comparison), and syntax-error checks for each required
piece of the form -- the "= init", ", var IN list", and "| body" clauses, plus
a rejected qualified iterator variable.

42/42 installcheck pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

modified: age--1.7.0--y.y.y.sql
modified: regress/expected/age_reduce.out
modified: regress/sql/age_reduce.sql
modified: sql/age_aggregate.sql
modified: src/backend/parser/cypher_clause.c
modified: src/backend/utils/adt/agtype.c
1 parent 92e48e9 commit b6236eb

6 files changed

Lines changed: 757 additions & 51 deletions

age--1.7.0--y.y.y.sql

Lines changed: 9 additions & 5 deletions
@@ -1107,17 +1107,21 @@ COMMENT ON FUNCTION ag_catalog.create_subgraph(name, name, text, text) IS
 -- Transition function for the age_reduce aggregate. The fold body is compiled
 -- by transform_cypher_reduce() with the accumulator and element rewritten to
 -- PARAM_EXEC params 0 and 1 and serialized into the text argument; the
--- transition evaluates it for each element in list order. It must be callable
--- with a NULL transition state (no initcond), so it is intentionally not STRICT.
-CREATE FUNCTION ag_catalog.age_reduce_transfn(agtype, agtype, text, agtype)
+-- transition evaluates it for each element in list order. The trailing
+-- agtype[] argument carries the loop-invariant outer values (outer-query
+-- variables and cypher() parameters) referenced by the body, bound to
+-- PARAM_EXEC params 2, 3, ... It must be callable with a NULL transition state
+-- (no initcond), so it is intentionally not STRICT.
+CREATE FUNCTION ag_catalog.age_reduce_transfn(agtype, agtype, text, agtype, agtype[])
 RETURNS agtype
 LANGUAGE c
 PARALLEL UNSAFE
 AS 'MODULE_PATHNAME';
 
 -- aggregate definition for reduce(); direct arguments are
--- (init, serialized-body, element), with the element fed ORDER BY ordinality.
-CREATE AGGREGATE ag_catalog.age_reduce(agtype, text, agtype)
+-- (init, serialized-body, element, captured-outer-values), with the element
+-- fed ORDER BY ordinality.
+CREATE AGGREGATE ag_catalog.age_reduce(agtype, text, agtype, agtype[])
 (
 stype = agtype,
 sfunc = ag_catalog.age_reduce_transfn
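The slot binding that the comment on age_reduce_transfn describes can be modeled in a few lines of plain C. This is a sketch under assumptions, not the code in agtype.c: `Value` is a toy stand-in for agtype, `Body` replaces the deserialized expression with a function pointer, and `transfn`, `fold`, `add`, `mul`, `body_s_xa_b`, and `body_s_x` are hypothetical names. Passing `extras == NULL` with `nextras == 0` models a call that omits the trailing array argument.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARAMS 16

/* Toy stand-in for agtype: a nullable integer. */
typedef struct { int is_null; long v; } Value;

/* A compiled fold body: reads its inputs from numbered param slots. */
typedef Value (*Body)(const Value *params);

static Value val(long v) { Value x = {0, v}; return x; }
static Value null_val(void) { Value x = {1, 0}; return x; }

/* NULL-propagating arithmetic, as in the NULL outer-value tests. */
static Value add(Value a, Value b)
{
    return (a.is_null || b.is_null) ? null_val() : val(a.v + b.v);
}

static Value mul(Value a, Value b)
{
    return (a.is_null || b.is_null) ? null_val() : val(a.v * b.v);
}

/*
 * One transition step: slot 0 is the accumulator, slot 1 the element,
 * slots 2.. the captured outer values. Every slot is rebound on every
 * call, and extras is only read when nextras > 0.
 */
static Value transfn(Value state, Body body, Value elem,
                     const Value *extras, size_t nextras)
{
    Value params[MAX_PARAMS];
    size_t i;

    assert(2 + nextras <= MAX_PARAMS);
    params[0] = state;
    params[1] = elem;
    for (i = 0; i < nextras; i++)
        params[2 + i] = extras[i];
    return body(params);
}

/* Fold the list in order, threading the accumulator through transfn. */
static Value fold(Value init, const Value *list, size_t n, Body body,
                  const Value *extras, size_t nextras)
{
    Value state = init;
    size_t i;

    for (i = 0; i < n; i++)
        state = transfn(state, body, list[i], extras, nextras);
    return state;
}

/* Body for: s + x * a + b, with a in slot 2 and b in slot 3. */
static Value body_s_xa_b(const Value *p)
{
    return add(add(p[0], mul(p[1], p[2])), p[3]);
}

/* Body for: s + x, with no captures. */
static Value body_s_x(const Value *p)
{
    return add(p[0], p[1]);
}
```

Folding `[1, 2, 3]` from 0 with `body_s_xa_b` and extras `{2, 100}` mirrors the "two distinct outer variables" regression case, and swapping the second extra for a NULL mirrors the mixed NULL and non-NULL case, where the NULL propagates through the whole fold.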

regress/expected/age_reduce.out

Lines changed: 251 additions & 7 deletions
@@ -222,6 +222,87 @@ $$) AS (result agtype);
 [1, 4, 9]
 (1 row)
 
+--
+-- Value types in the fold
+--
+-- a float accumulator and float elements
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0.0, x IN [1.5, 2.5, 3.0] | s + x)
+$$) AS (result agtype);
+result
+--------
+7.0
+(1 row)
+
+-- negative numbers
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x IN [-1, -2, -3] | s + x)
+$$) AS (result agtype);
+result
+--------
+-6
+(1 row)
+
+-- a map accumulator passed through unchanged
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = {n: 0}, x IN [1, 2, 3] | s)
+$$) AS (result agtype);
+result
+----------
+{"n": 0}
+(1 row)
+
+-- elements that are themselves lists, indexed in the body
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x IN [[1, 2], [3, 4], [5, 6]] | s + x[0])
+$$) AS (result agtype);
+result
+--------
+9
+(1 row)
+
+--
+-- Function calls in the fold body
+--
+-- a scalar function applied to the element
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x IN ['a', 'bb', 'ccc'] | s + size(x))
+$$) AS (result agtype);
+result
+--------
+6
+(1 row)
+
+-- the list itself produced by a function
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x IN range(1, 5) | s + x)
+$$) AS (result agtype);
+result
+--------
+15
+(1 row)
+
+--
+-- Composing reduce() with surrounding expressions
+--
+-- the reduce() result consumed by another function
+SELECT * FROM cypher('reduce', $$
+RETURN size(reduce(s = [], x IN [1, 2, 3, 4] | s + [x]))
+$$) AS (result agtype);
+result
+--------
+4
+(1 row)
+
+-- the reduce() result used in a comparison
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x) = 6
+$$) AS (result agtype);
+result
+--------
+true
+(1 row)
+
 --
 -- A conditional body (CASE)
 --
@@ -484,17 +565,129 @@ $$) AS (name agtype, total agtype);
 (3 rows)
 
 --
--- Not-yet-supported constructs raise a clean feature error
+-- Outer references in the fold body
 --
--- an outer variable referenced in the body
+-- The body may reference loop-invariant values from the enclosing query: an
+-- outer variable, a property of an outer variable, or a cypher() parameter.
+-- a plain outer variable in the body
 SELECT * FROM cypher('reduce', $$
 WITH 5 AS w
-RETURN reduce(s = 0, x IN [1, 2] | s + x + w)
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x + w)
 $$) AS (result agtype);
-ERROR: a reduce() expression may only reference its accumulator and element variables
-LINE 1: SELECT * FROM cypher('reduce', $$
-^
--- a nested reduce() in the body
+result
+--------
+21
+(1 row)
+
+-- an outer variable used as a multiplier
+SELECT * FROM cypher('reduce', $$
+WITH 3 AS factor
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x * factor)
+$$) AS (result agtype);
+result
+--------
+18
+(1 row)
+
+-- two distinct outer variables in the body
+SELECT * FROM cypher('reduce', $$
+WITH 2 AS a, 100 AS b
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x * a + b)
+$$) AS (result agtype);
+result
+--------
+312
+(1 row)
+
+-- a property of an outer (graph) variable in the body
+SELECT * FROM cypher('reduce', $$
+MATCH (u:bag) WHERE u.name = 'mid'
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x + u.vals[0])
+$$) AS (result agtype);
+result
+--------
+21
+(1 row)
+
+-- the same outer variable referenced more than once in the body
+SELECT * FROM cypher('reduce', $$
+WITH 7 AS k
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + k + k)
+$$) AS (result agtype);
+result
+--------
+42
+(1 row)
+
+-- a property of an outer map referenced in the body
+SELECT * FROM cypher('reduce', $$
+WITH {factor: 10} AS m
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x * m.factor)
+$$) AS (result agtype);
+result
+--------
+60
+(1 row)
+
+-- a subexpression that mixes an outer reference with the element: only the
+-- loop-invariant part (the outer list) is captured, the element index is not
+SELECT * FROM cypher('reduce', $$
+WITH [10, 20, 30] AS lookup
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + lookup[x - 1])
+$$) AS (result agtype);
+result
+--------
+60
+(1 row)
+
+-- an outer reference inside a CASE branch of the body is captured
+SELECT * FROM cypher('reduce', $$
+WITH 10 AS w
+RETURN reduce(s = 0, x IN [1, 2, 3] | CASE WHEN x % 2 = 0 THEN s + w ELSE s + x END)
+$$) AS (result agtype);
+result
+--------
+14
+(1 row)
+
+-- a NULL outer value propagates through the fold
+SELECT * FROM cypher('reduce', $$
+WITH null AS w
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x + w)
+$$) AS (result agtype);
+result
+--------
+null
+(1 row)
+
+-- multiple outer captures with a mix of NULL and non-NULL: each is bound to its
+-- own slot (the non-NULL multiplier is bound and the NULL still propagates)
+SELECT * FROM cypher('reduce', $$
+WITH 3 AS a, null AS b
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x * a + b)
+$$) AS (result agtype);
+result
+--------
+null
+(1 row)
+
+-- an outer variable that changes per row is captured per group
+SELECT * FROM cypher('reduce', $$
+UNWIND [1, 2, 3] AS m
+RETURN reduce(s = 0, x IN [1, 2, 3, 4] | s + x * m) AS total
+ORDER BY total
+$$) AS (result agtype);
+result
+--------
+10
+20
+30
+(3 rows)
+
+--
+-- Not-yet-supported constructs raise a clean feature error
+--
+-- a nested reduce() in the body (any subquery in the body is unsupported)
 SELECT * FROM cypher('reduce', $$
 RETURN reduce(s = 0, x IN [1, 2] | s + reduce(t = 0, y IN [x] | t + y))
 $$) AS (result agtype);
@@ -509,6 +702,57 @@ ERROR: aggregate functions are not supported in a reduce() expression
 LINE 1: SELECT * FROM cypher('reduce', $$
 ^
 --
+-- Syntax errors: each required piece of the reduce() form is enforced
+--
+-- missing "= init"
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s, x IN [1, 2] | s + x)
+$$) AS (result agtype);
+ERROR: syntax error at or near ","
+LINE 2: RETURN reduce(s, x IN [1, 2] | s + x)
+^
+-- missing ", var IN list"
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0 | s)
+$$) AS (result agtype);
+ERROR: syntax error at or near "|"
+LINE 2: RETURN reduce(s = 0 | s)
+^
+-- missing "| body"
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x IN [1, 2])
+$$) AS (result agtype);
+ERROR: syntax error at or near ")"
+LINE 2: RETURN reduce(s = 0, x IN [1, 2])
+^
+-- a qualified iterator variable is not allowed
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x.y IN [1, 2] | s)
+$$) AS (result agtype);
+ERROR: syntax error at or near "."
+LINE 2: RETURN reduce(s = 0, x.y IN [1, 2] | s)
+^
+--
+-- cypher() parameter referenced in the fold body (via a prepared statement)
+--
+PREPARE reduce_param(agtype) AS
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x + $p)
+$$, $1) AS (result agtype);
+EXECUTE reduce_param('{"p": 10}');
+result
+--------
+36
+(1 row)
+
+EXECUTE reduce_param('{"p": 100}');
+result
+--------
+306
+(1 row)
+
+DEALLOCATE reduce_param;
+--
 -- "reduce" as a property key name (safe_keywords backward compatibility):
 -- because reduce() introduced a reserved keyword, confirm the word is still
 -- usable as a map key, the same way any/none/single are.
