Commit a48ec6f

Support outer references in reduce() fold bodies
Allow a reduce(acc = init, var IN list | body) fold body to reference loop-invariant values from the enclosing query -- outer-query variables and cypher() parameters -- in addition to the accumulator and element. These were previously rejected with ERRCODE_FEATURE_NOT_SUPPORTED.

How it works
------------

The fold body is still compiled to a standalone expression evaluated by age_reduce_transfn, so an outer reference (which cannot be evaluated there) is captured at transform time and supplied as a value:

- After the accumulator and element are rewritten to PARAM_EXEC params 0 and 1, transform_cypher_reduce() walks the body and replaces each maximal agtype-typed, loop-invariant subtree -- one that references an outer Var or a cypher() $parameter but not the accumulator/element -- with a new PARAM_EXEC param 2, 3, ... in body order.
- The captured expressions are passed to the aggregate as a trailing agtype[] argument; age_reduce(agtype, text, agtype, agtype[]) and its transition function gain this argument.
- age_reduce_transfn sizes its param array to 2 + the number of captures and binds the captured values to params 2.. on every row.

Because the captures are evaluated in the outer query context as ordinary aggregate arguments, a correlated capture is re-evaluated per group, so an outer value that varies per row (for example under UNWIND) is folded with the correct value.

This keeps the no-core-patch design: the body is still a serialized standalone expression, and the only new machinery is the captured-value plumbing.

Still rejected
--------------

Subqueries in the body (including a nested reduce()) and aggregate functions remain unsupported and raise a clean ERRCODE_FEATURE_NOT_SUPPORTED error: a subquery cannot be planned as a plain aggregate argument, and an aggregate in a per-element fold is undefined per the openCypher specification.
Tests
-----

age_reduce gains an "Outer references in the fold body" section covering:

- a plain outer variable
- an outer variable used as a multiplier
- two distinct outer variables
- a property of an outer graph variable
- the same outer variable referenced more than once
- a property of an outer map
- a subexpression that mixes an outer reference with the element (only the loop-invariant part is captured)
- an outer reference inside a CASE branch of the body
- a NULL outer value propagating through the fold
- multiple captures mixing a NULL and a non-NULL outer value
- an outer variable that changes per row (captured per group)
- a cypher() parameter supplied via a prepared statement

The previously-rejected outer-variable case is moved out of the not-supported section, which now covers a nested reduce() (any subquery in the body is unsupported) and an aggregate in the body.

The same change also broadens the base reduce() coverage with:

- value-type folds (a float accumulator, negative numbers, a map accumulator passed through unchanged, and list elements indexed in the body)
- function calls in the fold body (a scalar function over the element and the list itself produced by a function)
- reduce() composed with surrounding expressions (consumed by another function and used in a comparison)
- syntax-error checks for each required piece of the form -- the "= init", ", var IN list", and "| body" clauses, plus a rejected qualified iterator variable

42/42 installcheck pass.

Co-authored-by: Copilot <copilot@github.com>

modified: age--1.7.0--y.y.y.sql
modified: regress/expected/age_reduce.out
modified: regress/sql/age_reduce.sql
modified: sql/age_aggregate.sql
modified: src/backend/parser/cypher_clause.c
modified: src/backend/utils/adt/agtype.c
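The capture step described under "How it works" can be illustrated with a small model. The following is a minimal Python sketch over a toy expression tree, not the C code in transform_cypher_reduce(). The node classes (`Param`, `Outer`, `Const`, `Op`) and the function `capture_outer_refs` are hypothetical names invented for this illustration. The sketch ignores the agtype type check and assumes that repeated references to the same outer variable each get their own slot, which the commit message does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Param:
    """PARAM_EXEC-style slot: 0 = accumulator, 1 = element, 2.. = captures."""
    index: int


@dataclass(frozen=True)
class Outer:
    """Stand-in for an outer-query variable or a cypher() $parameter."""
    name: str


@dataclass(frozen=True)
class Const:
    value: object


@dataclass(frozen=True)
class Op:
    """Any operator or function call; args is a tuple of child expressions."""
    name: str
    args: tuple


def uses_loop_var(expr):
    """True if the subtree references the accumulator (0) or element (1)."""
    if isinstance(expr, Param):
        return expr.index in (0, 1)
    if isinstance(expr, Op):
        return any(uses_loop_var(a) for a in expr.args)
    return False


def uses_outer(expr):
    """True if the subtree references at least one outer value."""
    if isinstance(expr, Outer):
        return True
    if isinstance(expr, Op):
        return any(uses_outer(a) for a in expr.args)
    return False


def capture_outer_refs(body):
    """Replace each maximal loop-invariant subtree that touches an outer
    value with a new Param slot (2, 3, ...) in body order.

    Returns (rewritten_body, captures); captures[i] is bound to Param(2 + i).
    """
    captures = []

    def walk(expr):
        # Top-down: the first capturable node on a path is the maximal one.
        if uses_outer(expr) and not uses_loop_var(expr):
            slot = 2 + len(captures)
            captures.append(expr)
            return Param(slot)
        if isinstance(expr, Op):
            return Op(expr.name, tuple(walk(a) for a in expr.args))
        return expr  # constants and the accumulator/element stay as they are

    return walk(body), captures


# Toy encoding of the body  s + lookup[x - 1]  with s = Param(0), x = Param(1).
mixed_body = Op("+", (Param(0),
                      Op("index", (Outer("lookup"),
                                   Op("-", (Param(1), Const(1)))))))
mixed_rewritten, mixed_captures = capture_outer_refs(mixed_body)
```

In this model only `lookup` is captured for the mixed body, while the `x - 1` index stays in the per-element expression, mirroring the "only the loop-invariant part is captured" test case.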
1 parent 92e48e9 commit a48ec6f

6 files changed

Lines changed: 738 additions & 51 deletions

age--1.7.0--y.y.y.sql

Lines changed: 9 additions & 5 deletions
@@ -1107,17 +1107,21 @@ COMMENT ON FUNCTION ag_catalog.create_subgraph(name, name, text, text) IS
 -- Transition function for the age_reduce aggregate. The fold body is compiled
 -- by transform_cypher_reduce() with the accumulator and element rewritten to
 -- PARAM_EXEC params 0 and 1 and serialized into the text argument; the
--- transition evaluates it for each element in list order. It must be callable
--- with a NULL transition state (no initcond), so it is intentionally not STRICT.
-CREATE FUNCTION ag_catalog.age_reduce_transfn(agtype, agtype, text, agtype)
+-- transition evaluates it for each element in list order. The trailing
+-- agtype[] argument carries the loop-invariant outer values (outer-query
+-- variables and cypher() parameters) referenced by the body, bound to
+-- PARAM_EXEC params 2, 3, ... It must be callable with a NULL transition state
+-- (no initcond), so it is intentionally not STRICT.
+CREATE FUNCTION ag_catalog.age_reduce_transfn(agtype, agtype, text, agtype, agtype[])
 RETURNS agtype
 LANGUAGE c
 PARALLEL UNSAFE
 AS 'MODULE_PATHNAME';

 -- aggregate definition for reduce(); direct arguments are
--- (init, serialized-body, element), with the element fed ORDER BY ordinality.
-CREATE AGGREGATE ag_catalog.age_reduce(agtype, text, agtype)
+-- (init, serialized-body, element, captured-outer-values), with the element
+-- fed ORDER BY ordinality.
+CREATE AGGREGATE ag_catalog.age_reduce(agtype, text, agtype, agtype[])
 (
 stype = agtype,
 sfunc = ag_catalog.age_reduce_transfn

regress/expected/age_reduce.out

Lines changed: 251 additions & 7 deletions
@@ -222,6 +222,87 @@ $$) AS (result agtype);
 [1, 4, 9]
 (1 row)

+--
+-- Value types in the fold
+--
+-- a float accumulator and float elements
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0.0, x IN [1.5, 2.5, 3.0] | s + x)
+$$) AS (result agtype);
+result
+--------
+7.0
+(1 row)
+
+-- negative numbers
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x IN [-1, -2, -3] | s + x)
+$$) AS (result agtype);
+result
+--------
+-6
+(1 row)
+
+-- a map accumulator passed through unchanged
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = {n: 0}, x IN [1, 2, 3] | s)
+$$) AS (result agtype);
+result
+----------
+{"n": 0}
+(1 row)
+
+-- elements that are themselves lists, indexed in the body
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x IN [[1, 2], [3, 4], [5, 6]] | s + x[0])
+$$) AS (result agtype);
+result
+--------
+9
+(1 row)
+
+--
+-- Function calls in the fold body
+--
+-- a scalar function applied to the element
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x IN ['a', 'bb', 'ccc'] | s + size(x))
+$$) AS (result agtype);
+result
+--------
+6
+(1 row)
+
+-- the list itself produced by a function
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x IN range(1, 5) | s + x)
+$$) AS (result agtype);
+result
+--------
+15
+(1 row)
+
+--
+-- Composing reduce() with surrounding expressions
+--
+-- the reduce() result consumed by another function
+SELECT * FROM cypher('reduce', $$
+RETURN size(reduce(s = [], x IN [1, 2, 3, 4] | s + [x]))
+$$) AS (result agtype);
+result
+--------
+4
+(1 row)
+
+-- the reduce() result used in a comparison
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x) = 6
+$$) AS (result agtype);
+result
+--------
+true
+(1 row)
+
 --
 -- A conditional body (CASE)
 --
@@ -484,17 +565,129 @@ $$) AS (name agtype, total agtype);
 (3 rows)

 --
--- Not-yet-supported constructs raise a clean feature error
+-- Outer references in the fold body
 --
--- an outer variable referenced in the body
+-- The body may reference loop-invariant values from the enclosing query: an
+-- outer variable, a property of an outer variable, or a cypher() parameter.
+-- a plain outer variable in the body
 SELECT * FROM cypher('reduce', $$
 WITH 5 AS w
-RETURN reduce(s = 0, x IN [1, 2] | s + x + w)
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x + w)
 $$) AS (result agtype);
-ERROR: a reduce() expression may only reference its accumulator and element variables
-LINE 1: SELECT * FROM cypher('reduce', $$
-^
--- a nested reduce() in the body
+result
+--------
+21
+(1 row)
+
+-- an outer variable used as a multiplier
+SELECT * FROM cypher('reduce', $$
+WITH 3 AS factor
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x * factor)
+$$) AS (result agtype);
+result
+--------
+18
+(1 row)
+
+-- two distinct outer variables in the body
+SELECT * FROM cypher('reduce', $$
+WITH 2 AS a, 100 AS b
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x * a + b)
+$$) AS (result agtype);
+result
+--------
+312
+(1 row)
+
+-- a property of an outer (graph) variable in the body
+SELECT * FROM cypher('reduce', $$
+MATCH (u:bag) WHERE u.name = 'mid'
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x + u.vals[0])
+$$) AS (result agtype);
+result
+--------
+21
+(1 row)
+
+-- the same outer variable referenced more than once in the body
+SELECT * FROM cypher('reduce', $$
+WITH 7 AS k
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + k + k)
+$$) AS (result agtype);
+result
+--------
+42
+(1 row)
+
+-- a property of an outer map referenced in the body
+SELECT * FROM cypher('reduce', $$
+WITH {factor: 10} AS m
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x * m.factor)
+$$) AS (result agtype);
+result
+--------
+60
+(1 row)
+
+-- a subexpression that mixes an outer reference with the element: only the
+-- loop-invariant part (the outer list) is captured, the element index is not
+SELECT * FROM cypher('reduce', $$
+WITH [10, 20, 30] AS lookup
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + lookup[x - 1])
+$$) AS (result agtype);
+result
+--------
+60
+(1 row)
+
+-- an outer reference inside a CASE branch of the body is captured
+SELECT * FROM cypher('reduce', $$
+WITH 10 AS w
+RETURN reduce(s = 0, x IN [1, 2, 3] | CASE WHEN x % 2 = 0 THEN s + w ELSE s + x END)
+$$) AS (result agtype);
+result
+--------
+14
+(1 row)
+
+-- a NULL outer value propagates through the fold
+SELECT * FROM cypher('reduce', $$
+WITH null AS w
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x + w)
+$$) AS (result agtype);
+result
+--------
+null
+(1 row)
+
+-- multiple outer captures with a mix of NULL and non-NULL: each is bound to its
+-- own slot (the non-NULL multiplier is bound and the NULL still propagates)
+SELECT * FROM cypher('reduce', $$
+WITH 3 AS a, null AS b
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x * a + b)
+$$) AS (result agtype);
+result
+--------
+null
+(1 row)
+
+-- an outer variable that changes per row is captured per group
+SELECT * FROM cypher('reduce', $$
+UNWIND [1, 2, 3] AS m
+RETURN reduce(s = 0, x IN [1, 2, 3, 4] | s + x * m) AS total
+ORDER BY total
+$$) AS (result agtype);
+result
+--------
+10
+20
+30
+(3 rows)
+
+--
+-- Not-yet-supported constructs raise a clean feature error
+--
+-- a nested reduce() in the body (any subquery in the body is unsupported)
 SELECT * FROM cypher('reduce', $$
 RETURN reduce(s = 0, x IN [1, 2] | s + reduce(t = 0, y IN [x] | t + y))
 $$) AS (result agtype);
@@ -509,6 +702,57 @@ ERROR: aggregate functions are not supported in a reduce() expression
 LINE 1: SELECT * FROM cypher('reduce', $$
 ^
 --
+-- Syntax errors: each required piece of the reduce() form is enforced
+--
+-- missing "= init"
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s, x IN [1, 2] | s + x)
+$$) AS (result agtype);
+ERROR: syntax error at or near ","
+LINE 2: RETURN reduce(s, x IN [1, 2] | s + x)
+^
+-- missing ", var IN list"
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0 | s)
+$$) AS (result agtype);
+ERROR: syntax error at or near "|"
+LINE 2: RETURN reduce(s = 0 | s)
+^
+-- missing "| body"
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x IN [1, 2])
+$$) AS (result agtype);
+ERROR: syntax error at or near ")"
+LINE 2: RETURN reduce(s = 0, x IN [1, 2])
+^
+-- a qualified iterator variable is not allowed
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x.y IN [1, 2] | s)
+$$) AS (result agtype);
+ERROR: syntax error at or near "."
+LINE 2: RETURN reduce(s = 0, x.y IN [1, 2] | s)
+^
+--
+-- cypher() parameter referenced in the fold body (via a prepared statement)
+--
+PREPARE reduce_param(agtype) AS
+SELECT * FROM cypher('reduce', $$
+RETURN reduce(s = 0, x IN [1, 2, 3] | s + x + $p)
+$$, $1) AS (result agtype);
+EXECUTE reduce_param('{"p": 10}');
+result
+--------
+36
+(1 row)
+
+EXECUTE reduce_param('{"p": 100}');
+result
+--------
+306
+(1 row)
+
+DEALLOCATE reduce_param;
+--
 -- "reduce" as a property key name (safe_keywords backward compatibility):
 -- because reduce() introduced a reserved keyword, confirm the word is still
 -- usable as a map key, the same way any/none/single are.
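As a cross-check on the expected outputs in the "Outer references in the fold body" and prepared-statement sections, the same folds can be written as ordinary left folds in Python. This is only an arithmetic sanity check under simplifying assumptions: `fold`, `null_add`, and `outer_reference_cases` are names made up here, outer values are inlined as Python constants, and Cypher null is modeled as `None` with a hand-written null-propagating addition. The graph-property case is left out because the stored `u.vals` data is not shown in this diff.

```python
from functools import reduce


def fold(init, items, body):
    """Left fold: body(accumulator, element) applied in list order."""
    return reduce(body, items, init)


def null_add(a, b):
    """Model of '+' where a null (None) operand makes the result null."""
    return None if a is None or b is None else a + b


def outer_reference_cases():
    """Recompute the fold results from the regression cases above."""
    lookup = [10, 20, 30]
    xs = [1, 2, 3]
    return {
        # WITH 5 AS w ... s + x + w
        "plain": fold(0, xs, lambda s, x: s + x + 5),
        # WITH 3 AS factor ... s + x * factor
        "multiplier": fold(0, xs, lambda s, x: s + x * 3),
        # WITH 2 AS a, 100 AS b ... s + x * a + b
        "two_outer": fold(0, xs, lambda s, x: s + x * 2 + 100),
        # WITH 7 AS k ... s + k + k
        "repeated": fold(0, xs, lambda s, x: s + 7 + 7),
        # WITH [10, 20, 30] AS lookup ... s + lookup[x - 1]
        "lookup": fold(0, xs, lambda s, x: s + lookup[x - 1]),
        # CASE WHEN x % 2 = 0 THEN s + w ELSE s + x END with w = 10
        "case": fold(0, xs, lambda s, x: s + 10 if x % 2 == 0 else s + x),
        # WITH null AS w ... s + x + w
        "null": fold(0, xs, lambda s, x: null_add(null_add(s, x), None)),
        # UNWIND [1, 2, 3] AS m ... s + x * m, one fold per outer row
        "per_row": [fold(0, [1, 2, 3, 4], lambda s, x, m=m: s + x * m)
                    for m in (1, 2, 3)],
        # EXECUTE reduce_param with p = 10 and p = 100 ... s + x + $p
        "param": [fold(0, xs, lambda s, x, p=p: s + x + p)
                  for p in (10, 100)],
    }
```

The `m=m` and `p=p` default arguments pin each outer value to its own fold, which is the Python analogue of the capture being re-evaluated for every outer row.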
