Commit f02eda0

perf: VLE terminal-qual rewrite (#2420)

perf: VLE terminal-qual rewrite (emit endpoint equalities instead of
SRF qual functions).

Removes the per-row age_match_vle_terminal_edge and
age_match_two_vle_edges qual functions from VLE query plans. The cypher
transformer now emits the endpoint match as a plain graphid/int8
equality on new SRF output columns, evaluated by the planner like any
other join clause, with no detoasting and no per-row C function
dispatch.

Stages land as one commit:

  S1  Inline start_vid/end_vid in VLE_path_container header
  S2  Read VLE qual endpoints from header-only TOAST slice
  S4  Emit start_id/end_id as scalar SRF output columns (age_vle now
      RETURNS SETOF record with edges/start_id/end_id)
  S5  Cypher transformer rewrites terminal-edge match quals as integer
      equalities (drops age_match_vle_terminal_edge call)
  S6  Cypher transformer emits graphid equality for two-VLE-edge joins
      (drops age_match_two_vle_edges call)

Performance (SF3 LDBC SNB, 5 runs/3 warmup, vs clean master
baseline_v2):

  IC sum  198,958 → 109,322 ms   −45.05 %  (1.82× end-to-end speedup)
  IC1       8,625 →   4,600 ms   −46.67 %
  IC3      21,239 →   9,784 ms   −53.93 %
  IC5      21,051 →   5,696 ms   −72.94 %
  IC6      15,916 →   4,447 ms   −72.06 %
  IC9      44,839 →  21,161 ms   −52.81 %
  IC10     13,104 →   2,432 ms   −81.44 %
  IC11     11,676 →     241 ms   −97.93 %  (48× speedup)
  IC2/4/7/8/12: parity (within ±3.3 %; IC4 is −2.47 %, no regression)

  IS sum: 1,009 → 1,004 ms   −0.51 %  (no VLE traffic)
  IU sum:    77 →    71 ms   −8.38 %  (IU1 −16.09 %; incidental)

Memory: header-only TOAST slice for VLE qual evaluation avoids
detoasting full path containers on every row; reduces per-call
palloc/pfree churn in long DFS paths. No measured RSS change.

Dead-code removal:
- Bodies of age_match_vle_terminal_edge and age_match_two_vle_edges are
  gone from age_vle.c (~225 lines). C entry points remain as
  error-raising stubs solely so the upgrade-test snapshot loader (which
  sources an older 1.7.0_initial SQL against the current age.so) can
  resolve the symbols before the immediate ALTER EXTENSION UPDATE drops
  them. No regress test references either function.
- SQL CREATE FUNCTION declarations removed from fresh install
  (sql/agtype_typecast.sql).
- DROP FUNCTION IF EXISTS for both qual functions added to the upgrade
  script (age--1.7.0--y.y.y.sql).

API change: ag_catalog.age_vle(...) now RETURNS SETOF record with
output columns (edges agtype, start_id graphid, end_id graphid) instead
of RETURNS SETOF agtype. Both 7-arg and 8-arg overloads are updated in
fresh-install (sql/agtype_typecast.sql) and upgrade
(age--1.7.0--y.y.y.sql) paths. age_match_vle_terminal_edge and
age_match_two_vle_edges are dropped on upgrade and absent from fresh
installs. Internal AGE callers are unaffected; external SQL that called
any of these directly must adapt.

Hardening (in response to PR #2420 review feedback):
- cypher_clause.c: terminal-edge and two-VLE-edge join-qual emission
  paths now bracket the existing Assert(vle_alias != NULL) with a
  runtime ereport(ERROR, ...) so a missing alias produces a clean error
  in production builds (where Asserts compile out) instead of a
  NULL-deref crash inside makeString().
- age_vle.c: VLE_path_container struct comment now documents that the
  layout is transient (consumed within the producing query and never
  persisted), so the new start_vid/end_vid fields do not require an
  AGT_FBINARY_TYPE_VLE_PATH version bump or backward-compatible reader.
  The note also flags the constraint a future change would need to
  honor if the container ever became persistable.

Tested on PostgreSQL 18.3 (REL_18_STABLE): all 34 regression tests pass
(installcheck), warning-free build.

modified: age--1.7.0--y.y.y.sql
modified: regress/expected/cypher_match.out
modified: regress/expected/cypher_vle.out
modified: regress/expected/expr.out
modified: sql/agtype_typecast.sql
modified: src/backend/parser/cypher_clause.c
modified: src/backend/parser/cypher_transform_entity.c
modified: src/backend/utils/adt/age_vle.c
modified: src/include/parser/cypher_transform_entity.h
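
The headline percentages in the performance section follow directly from the before/after timings. A minimal C sketch of that arithmetic is given here; `percent_change` and `speedup` are illustrative helper names, not part of AGE. The message reports rounded milliseconds, so recomputing some of the smaller rows from the printed values may differ in the last digit.

```c
#include <assert.h>

/* Relative change in percent; a negative result means the new time is lower. */
static double percent_change(double before_ms, double after_ms)
{
    assert(before_ms > 0.0);
    return (after_ms - before_ms) / before_ms * 100.0;
}

/* Speedup factor: how many times faster the new run is than the old one. */
static double speedup(double before_ms, double after_ms)
{
    assert(after_ms > 0.0);
    return before_ms / after_ms;
}
```

For the IC sum, `percent_change(198958, 109322)` is roughly -45.05 and `speedup(198958, 109322)` is roughly 1.82, which matches the figures reported above.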
1 parent 9960e9c commit f02eda0

9 files changed

Lines changed: 316 additions & 298 deletions
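
Stages S1 and S2 of the commit message describe storing the path endpoints in the container header so that a qual can read them without touching the rest of the value. The sketch below models that idea in standalone C under loud assumptions: `path_header`, `serialize_path` and `read_endpoints` are hypothetical names, and the layout is a simplification, not the real `VLE_path_container` from age_vle.c. In PostgreSQL itself, a prefix of a TOASTed value can be fetched with `PG_DETOAST_DATUM_SLICE`; whether the commit uses exactly that macro is not visible on this page.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for AGE's graphid, modeled here as a 64-bit integer. */
typedef int64_t graphid;

/*
 * Hypothetical, simplified header. Only the idea of inlining the
 * endpoints is modeled; the real container has a different layout.
 */
typedef struct path_header
{
    uint32_t n_ids;     /* number of graphids in the payload */
    uint32_t reserved;  /* explicit padding so the layout has no holes */
    graphid start_vid;  /* S1: first vertex id, stored in the header */
    graphid end_vid;    /* S1: last vertex id, stored in the header */
} path_header;

/* Serialize header plus payload; returns bytes written, 0 if buf is too small. */
static size_t serialize_path(const graphid *ids, uint32_t n_ids,
                             unsigned char *buf, size_t cap)
{
    path_header hdr;
    size_t payload = (size_t) n_ids * sizeof(graphid);

    assert(ids != NULL && n_ids > 0);
    if (cap < sizeof(hdr) + payload)
        return 0;

    hdr.n_ids = n_ids;
    hdr.reserved = 0;
    hdr.start_vid = ids[0];
    hdr.end_vid = ids[n_ids - 1];

    memcpy(buf, &hdr, sizeof(hdr));
    memcpy(buf + sizeof(hdr), ids, payload);
    return sizeof(hdr) + payload;
}

/*
 * S2 analogue: recover the endpoints from a prefix of the serialized bytes.
 * Only sizeof(path_header) bytes are read, never the payload.
 * Returns 0 on success, -1 if the prefix is shorter than the header.
 */
static int read_endpoints(const unsigned char *prefix, size_t prefix_len,
                          graphid *start, graphid *end)
{
    path_header hdr;

    if (prefix_len < sizeof(hdr))
        return -1;

    memcpy(&hdr, prefix, sizeof(hdr));
    *start = hdr.start_vid;
    *end = hdr.end_vid;
    return 0;
}
```

The point of the design is that `read_endpoints` costs the same whether the path holds three ids or three thousand, which is the property the Memory paragraph of the commit message relies on.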

age--1.7.0--y.y.y.sql

Lines changed: 46 additions & 0 deletions
@@ -490,3 +490,49 @@ AS 'SELECT $1::text::agtype';
 
 CREATE CAST (jsonb AS agtype)
     WITH FUNCTION ag_catalog.jsonb_to_agtype(jsonb);
+
+--
+-- S4: VLE SRF signature change
+--
+-- The age_vle SRF now emits start_id and end_id as scalar graphid columns
+-- alongside the existing `edges` column. This allows the cypher transformer
+-- to rewrite terminal-edge match quals as plain integer equalities,
+-- removing the per-row age_match_vle_terminal_edge and age_match_two_vle_edges
+-- function calls from VLE query plans. Both qual functions are dropped.
+--
+-- BREAKING CHANGE for any external SQL that called age_vle(...) directly
+-- and relied on `RETURNS SETOF agtype`, or called age_match_vle_terminal_edge
+-- / age_match_two_vle_edges directly. Internal AGE callers (the cypher
+-- transformer) are not affected.
+--
+DROP FUNCTION IF EXISTS ag_catalog.age_match_vle_terminal_edge(variadic "any");
+DROP FUNCTION IF EXISTS ag_catalog.age_match_two_vle_edges(agtype, agtype);
+
+DROP FUNCTION IF EXISTS ag_catalog.age_vle(agtype, agtype, agtype, agtype,
+                                           agtype, agtype, agtype);
+DROP FUNCTION IF EXISTS ag_catalog.age_vle(agtype, agtype, agtype, agtype,
+                                           agtype, agtype, agtype, agtype);
+
+CREATE FUNCTION ag_catalog.age_vle(IN agtype, IN agtype, IN agtype, IN agtype,
+                                   IN agtype, IN agtype, IN agtype,
+                                   OUT edges agtype,
+                                   OUT start_id graphid,
+                                   OUT end_id graphid)
+RETURNS SETOF record
+LANGUAGE C
+STABLE
+CALLED ON NULL INPUT
+PARALLEL UNSAFE
+AS 'MODULE_PATHNAME';
+
+CREATE FUNCTION ag_catalog.age_vle(IN agtype, IN agtype, IN agtype, IN agtype,
+                                   IN agtype, IN agtype, IN agtype, IN agtype,
+                                   OUT edges agtype,
+                                   OUT start_id graphid,
+                                   OUT end_id graphid)
+RETURNS SETOF record
+LANGUAGE C
+STABLE
+CALLED ON NULL INPUT
+PARALLEL UNSAFE
+AS 'MODULE_PATHNAME';

regress/expected/cypher_match.out

Lines changed: 3 additions & 3 deletions
@@ -2784,13 +2784,13 @@ SELECT * FROM cypher('cypher_match', $$ MATCH p=()-[*]->() RETURN length(p) $$)
  length
 --------
  1
- 2
  1
+ 2
  1
  1
  2
- 1
  2
+ 1
 (8 rows)
 
 SELECT * FROM cypher('cypher_match', $$ MATCH p=()-[*]->() WHERE length(p) > 1 RETURN length(p) $$) as (length agtype);
@@ -2812,8 +2812,8 @@ SELECT * FROM cypher('cypher_match', $$ MATCH p=()-[*]->() WHERE size(nodes(p))
 SELECT * FROM cypher('cypher_match', $$ MATCH (n {name:'Dave'}) MATCH p=()-[*]->() WHERE nodes(p)[0] = n RETURN length(p) $$) as (length agtype);
  length
 --------
- 1
  2
+ 1
 (2 rows)
 
 SELECT * FROM cypher('cypher_match', $$ MATCH p1=(n {name:'Dave'})-[]->() MATCH p2=()-[*]->() WHERE p2=p1 RETURN p2=p1 $$) as (path agtype);
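
Neither query in these two hunks has an ORDER BY, so PostgreSQL makes no promise about row order, and the expected output changes only in the order of rows (presumably because the new quals lead to a different plan; the commit does not say so explicitly). The sketch below checks that claim for the first hunk by comparing the old and new `length` columns as multisets. `same_multiset` is an illustrative helper, not part of the AGE test suite.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for ints that avoids overflow from subtraction. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/*
 * Returns 1 if a and b contain the same values with the same
 * multiplicities, 0 otherwise (also 0 if memory allocation fails).
 */
static int same_multiset(const int *a, const int *b, size_t n)
{
    int *ca;
    int *cb;
    int equal;

    if (n == 0)
        return 1;
    assert(a != NULL && b != NULL);

    ca = malloc(n * sizeof(*ca));
    cb = malloc(n * sizeof(*cb));
    if (ca == NULL || cb == NULL)
    {
        free(ca);
        free(cb);
        return 0;
    }

    /* Sort copies so the caller's arrays keep their original order. */
    memcpy(ca, a, n * sizeof(*ca));
    memcpy(cb, b, n * sizeof(*cb));
    qsort(ca, n, sizeof(*ca), cmp_int);
    qsort(cb, n, sizeof(*cb), cmp_int);

    equal = memcmp(ca, cb, n * sizeof(*ca)) == 0;
    free(ca);
    free(cb);
    return equal;
}
```

Read off the first hunk, the old column is 1, 2, 1, 1, 1, 2, 1, 2 and the new one is 1, 1, 2, 1, 1, 2, 2, 1: five 1s and three 2s in both, so `same_multiset` returns 1 for that pair.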

regress/expected/cypher_vle.out

Lines changed: 14 additions & 14 deletions
Large diffs are not rendered by default.

regress/expected/expr.out

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

sql/agtype_typecast.sql

Lines changed: 10 additions & 21 deletions
@@ -70,10 +70,14 @@ PARALLEL SAFE
 AS 'MODULE_PATHNAME';
 
 -- original VLE function definition
+-- S4: emit start_id/end_id as scalar columns to enable transformer rewrite
+-- of terminal-edge quals as integer equalities (see PERF_VLE_TERMINAL_QUAL_PLAN).
 CREATE FUNCTION ag_catalog.age_vle(IN agtype, IN agtype, IN agtype, IN agtype,
                                    IN agtype, IN agtype, IN agtype,
-                                   OUT edges agtype)
-RETURNS SETOF agtype
+                                   OUT edges agtype,
+                                   OUT start_id graphid,
+                                   OUT end_id graphid)
+RETURNS SETOF record
 LANGUAGE C
 STABLE
 CALLED ON NULL INPUT
@@ -84,8 +88,10 @@ AS 'MODULE_PATHNAME';
 -- caching mechanism to coexist with the previous VLE version.
 CREATE FUNCTION ag_catalog.age_vle(IN agtype, IN agtype, IN agtype, IN agtype,
                                    IN agtype, IN agtype, IN agtype, IN agtype,
-                                   OUT edges agtype)
-RETURNS SETOF agtype
+                                   OUT edges agtype,
+                                   OUT start_id graphid,
+                                   OUT end_id graphid)
+RETURNS SETOF record
 LANGUAGE C
 STABLE
 CALLED ON NULL INPUT
@@ -100,15 +106,6 @@ CREATE FUNCTION ag_catalog.age_build_vle_match_edge(agtype, agtype)
 PARALLEL SAFE
 AS 'MODULE_PATHNAME';
 
--- function to match a terminal vle edge
-CREATE FUNCTION ag_catalog.age_match_vle_terminal_edge(variadic "any")
-RETURNS boolean
-LANGUAGE C
-STABLE
-CALLED ON NULL INPUT
-PARALLEL SAFE
-AS 'MODULE_PATHNAME';
-
 -- function to create an AGTV_PATH from a VLE_path_container
 CREATE FUNCTION ag_catalog.age_materialize_vle_path(agtype)
 RETURNS agtype
@@ -135,14 +132,6 @@ RETURNS NULL ON NULL INPUT
 PARALLEL SAFE
 AS 'MODULE_PATHNAME';
 
-CREATE FUNCTION ag_catalog.age_match_two_vle_edges(agtype, agtype)
-RETURNS boolean
-LANGUAGE C
-STABLE
-RETURNS NULL ON NULL INPUT
-PARALLEL SAFE
-AS 'MODULE_PATHNAME';
-
 -- list functions
 CREATE FUNCTION ag_catalog.age_keys(agtype)
 RETURNS agtype

src/backend/parser/cypher_clause.c

Lines changed: 85 additions & 32 deletions
@@ -3982,10 +3982,6 @@ static List *make_join_condition_for_edge(cypher_parsestate *cpstate,
 {
     Node *left_id = NULL;
     Node *right_id = NULL;
-    String *ag_catalog = makeString("ag_catalog");
-    String *func_name;
-    List *qualified_func_name;
-    List *args = NIL;
     List *quals = NIL;
 
     /*
@@ -3998,56 +3994,107 @@ static List *make_join_condition_for_edge(cypher_parsestate *cpstate,
     }
 
     /*
-     * If the previous node and the next node are in the join tree, we need
-     * to create the age_match_vle_terminal_edge to compare the vle returned
-     * results against the two nodes.
+     * S5: if the previous and next nodes are both in the join tree,
+     * emit two graphid equality A_Exprs:
+     *     <vle_alias>.start_id = prev_node.id
+     *     <vle_alias>.end_id = next_node.id
+     * This replaces the historical per-row
+     *     age_match_vle_terminal_edge(prev.id, next.id, edges)
+     * function call with plain integer (int8) equality quals on the
+     * SRF's S4 output columns. The planner can now drive the join
+     * directly on these keys (HashJoin hash keys, NestLoop index
+     * conditions where indexed).
      */
     if (prev_node->in_join_tree)
     {
-        func_name = makeString("age_match_vle_terminal_edge");
-        qualified_func_name = list_make2(ag_catalog, func_name);
+        ColumnRef *cr_start;
+        ColumnRef *cr_end;
+        A_Expr *eq_start;
+        A_Expr *eq_end;
 
         /*
-         * Get the vertex's id and pass to the function. Pass in NULL
-         * otherwise.
+         * Production-build runtime guard. Asserts compile out in
+         * non-debug builds, so a NULL vle_alias would otherwise reach
+         * makeString() and crash the backend during ColumnRef build.
          */
-        left_id = (Node *)make_qual(cpstate, prev_node, "id");
+        if (entity->vle_alias == NULL)
+        {
+            ereport(ERROR,
+                    (errcode(ERRCODE_INTERNAL_ERROR),
+                     errmsg("VLE edge entity is missing its alias; cannot emit terminal-edge join qual")));
+        }
+        Assert(entity->vle_alias != NULL);
+
+        cr_start = makeNode(ColumnRef);
+        cr_start->fields = list_make2(makeString(entity->vle_alias),
+                                      makeString("start_id"));
+        cr_start->location = -1;
+
+        cr_end = makeNode(ColumnRef);
+        cr_end->fields = list_make2(makeString(entity->vle_alias),
+                                    makeString("end_id"));
+        cr_end->location = -1;
+
+        left_id = (Node *)make_qual(cpstate, prev_node, "id");
         right_id = (Node *)make_qual(cpstate, next_node, "id");
 
-        /* create the argument list */
-        args = list_make3(left_id, right_id, entity->expr);
+        eq_start = makeSimpleA_Expr(AEXPR_OP, "=",
+                                    (Node *)cr_start, left_id, -1);
+        eq_end = makeSimpleA_Expr(AEXPR_OP, "=",
+                                  (Node *)cr_end, right_id, -1);
 
-        /* add to quals */
-        quals = lappend(quals, makeFuncCall(qualified_func_name, args,
-                                            COERCE_EXPLICIT_CALL, -1));
+        quals = lappend(quals, eq_start);
+        quals = lappend(quals, eq_end);
     }
 
     /*
-     * When the previous node is not in the join tree, but there is a vle
-     * edge before that join, then we need to compare this vle's start node
-     * against the previous vle's end node. No need to check the next edge,
-     * because that would be redundant.
+     * S6: when the previous node is not in the join tree but there is
+     * a vle edge before that join, emit a single graphid equality
+     * connecting the two VLE SRFs:
+     *
+     *     prev_vle.end_id = this_vle.start_id
+     *
+     * This replaces the per-row age_match_two_vle_edges(prev, this)
+     * function call with a plain int8 equality on the S4 scalar
+     * output columns of both age_vle SRFs. No detoasting of either
+     * VLE_path_container is needed.
      */
     if (!prev_node->in_join_tree &&
         prev_edge != NULL &&
         prev_edge->type == ENT_VLE_EDGE)
     {
-        List *qualified_name;
-        String *match_qual;
-        FuncCall *fc;
+        ColumnRef *cr_prev_end;
+        ColumnRef *cr_this_start;
+        A_Expr *eq_chain;
 
-        match_qual = makeString("age_match_two_vle_edges");
+        /*
+         * Production-build runtime guard for both VLE aliases; see
+         * note above on entity->vle_alias.
+         */
+        if (prev_edge->vle_alias == NULL || entity->vle_alias == NULL)
+        {
+            ereport(ERROR,
+                    (errcode(ERRCODE_INTERNAL_ERROR),
+                     errmsg("VLE edge entity is missing its alias; cannot emit two-VLE-edge join qual")));
+        }
+        Assert(prev_edge->vle_alias != NULL);
+        Assert(entity->vle_alias != NULL);
 
-        /* make the qualified function name */
-        qualified_name = list_make2(ag_catalog, match_qual);
+        cr_prev_end = makeNode(ColumnRef);
+        cr_prev_end->fields = list_make2(makeString(prev_edge->vle_alias),
+                                         makeString("end_id"));
+        cr_prev_end->location = -1;
 
-        /* make the args */
-        args = list_make2(prev_edge->expr, entity->expr);
+        cr_this_start = makeNode(ColumnRef);
+        cr_this_start->fields = list_make2(makeString(entity->vle_alias),
+                                           makeString("start_id"));
+        cr_this_start->location = -1;
 
-        /* create the function call */
-        fc = makeFuncCall(qualified_name, args, COERCE_EXPLICIT_CALL, -1);
+        eq_chain = makeSimpleA_Expr(AEXPR_OP, "=",
+                                    (Node *)cr_prev_end,
+                                    (Node *)cr_this_start, -1);
 
-        quals = lappend(quals, fc);
+        quals = lappend(quals, eq_chain);
     }
 
     return quals;
@@ -4898,6 +4945,12 @@ static transform_entity *transform_VLE_edge_entity(cypher_parsestate *cpstate,
     vle_entity = make_transform_entity(cpstate, ENT_VLE_EDGE, (Node *)rel,
                                        (Expr *)var);
 
+    /*
+     * S5: stash the auto-generated alias name so make_join_condition_for_edge
+     * can build ColumnRefs for the SRF's start_id/end_id output columns.
+     */
+    vle_entity->vle_alias = alias->aliasname;
+
     /* return the vle entity */
     return vle_entity;
 }
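
The S5 and S6 comments in this file describe the new quals as plain equalities on the SRF's scalar output columns. A standalone sketch of what those predicates compute is given below, outside PostgreSQL and with hypothetical names (`vle_row`, `terminal_edge_matches`, `vle_chain_matches`, `count_chained_pairs`). It models only the comparison logic: the real code builds `A_Expr` parse nodes and leaves join strategy to the planner, whereas the nested loop here is just the simplest way to show the S6 predicate at work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for AGE's graphid, modeled here as a 64-bit integer. */
typedef int64_t graphid;

/* One age_vle output row reduced to its two scalar columns; edges is omitted. */
typedef struct vle_row
{
    graphid start_id;
    graphid end_id;
} vle_row;

/* S5: <vle_alias>.start_id = prev_node.id AND <vle_alias>.end_id = next_node.id */
static bool terminal_edge_matches(const vle_row *row,
                                  graphid prev_id, graphid next_id)
{
    assert(row != NULL);
    return row->start_id == prev_id && row->end_id == next_id;
}

/* S6: prev_vle.end_id = this_vle.start_id */
static bool vle_chain_matches(const vle_row *prev_vle, const vle_row *this_vle)
{
    assert(prev_vle != NULL && this_vle != NULL);
    return prev_vle->end_id == this_vle->start_id;
}

/* Naive nested-loop join on the S6 predicate; returns the number of joined pairs. */
static size_t count_chained_pairs(const vle_row *prev_rows, size_t n_prev,
                                  const vle_row *this_rows, size_t n_this)
{
    size_t count = 0;

    for (size_t i = 0; i < n_prev; i++)
    {
        for (size_t j = 0; j < n_this; j++)
        {
            if (vle_chain_matches(&prev_rows[i], &this_rows[j]))
                count++;
        }
    }
    return count;
}
```

Because both predicates compare fixed-width integers, nothing in them needs the `edges` value at all, which is why the commit can drop the per-row container inspection.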

src/backend/parser/cypher_transform_entity.c

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@ transform_entity *make_transform_entity(cypher_parsestate *cpstate,
     entity->declared_in_current_clause = true;
     entity->expr = expr;
     entity->in_join_tree = expr != NULL;
+    entity->vle_alias = NULL;
 
     return entity;
 }
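
Since every entity now starts with `vle_alias = NULL` and only the VLE path sets it, the join-qual code in cypher_clause.c pairs its `Assert` with a runtime check. The sketch below shows that pattern in standalone C, using the standard `assert` as a stand-in for PostgreSQL's `Assert` (which is likewise inactive unless assertion checking is enabled at build time). `sketch_entity` and `format_column_ref` are hypothetical names, and returning an error code replaces `ereport(ERROR, ...)`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cut-down entity: only the alias field matters here. */
typedef struct sketch_entity
{
    const char *vle_alias; /* NULL unless the entity is a VLE edge */
} sketch_entity;

/*
 * Writes "<alias>.<column>" into out.
 * Returns 0 on success, -1 on failure with *errmsg set to a static message.
 */
static int format_column_ref(const sketch_entity *entity, const char *column,
                             char *out, size_t cap, const char **errmsg)
{
    size_t need;

    /* Runtime guard first: still present when assert() compiles away. */
    if (entity->vle_alias == NULL)
    {
        *errmsg = "VLE edge entity is missing its alias";
        return -1;
    }
    /* Documents the invariant; a no-op when built with -DNDEBUG. */
    assert(entity->vle_alias != NULL);

    /* alias + '.' + column + terminating NUL */
    need = strlen(entity->vle_alias) + 1 + strlen(column) + 1;
    if (need > cap)
    {
        *errmsg = "column reference does not fit in the buffer";
        return -1;
    }

    snprintf(out, cap, "%s.%s", entity->vle_alias, column);
    return 0;
}
```

With the guard placed before the assertion, a missing alias yields the same clean error in debug and release builds; the assertion stays only as documentation of the invariant.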
