Skip to content

Commit 85bca9f

Browse files
committed
Restore contsel/contjoinsel for containment & key-existence operators (#2356)
The containment (`@>`, `<@`, `@>>`, `<<@`) and key-existence (`?`, `?|`, `?&`) operators on `agtype` were bound to `matchingsel`/`matchingjoinsel` on the PG14+ source tree. `matchingsel` is built for pattern operators (LIKE/regex) and during planning invokes the operator's underlying function (`agtype_contains`) once per `pg_statistic` MCV. With realistic statistics targets that produces a planner-time regression that dominates simple OLTP-style point queries. Rebind those operators to the lighter `contsel`/`contjoinsel` estimators, which return fixed selectivity constants without invoking the operator function during planning. This is a deliberate planning-speed vs. estimate-accuracy trade-off. Note it DIVERGES from PostgreSQL core, which keeps jsonb's `@>`, `<@`, `?`, `?|`, `?&` on `matchingsel`/`matchingjoinsel` (verified on REL_16/17/18_STABLE in `pg_operator.dat`); it is an AGE-specific choice favoring workloads where these operators appear in selective point lookups. A future improvement could add a custom `agtype` selectivity function that is both cheap and statistics-aware. Changes: * `sql/agtype_operators.sql`, `sql/agtype_exists.sql`: 10 operators flipped from `matchingsel`/`matchingjoinsel` to `contsel`/`contjoinsel`. * `age--1.7.0--y.y.y.sql`: appended `ALTER OPERATOR ... SET (RESTRICT, JOIN)` for all 10 operators so existing installs flip on `ALTER EXTENSION age UPDATE`. * `regress/sql/containment_selectivity.sql` (+ `expected/.out`): pin the bindings via `pg_operator`, plus a scoped "no leaked matchingsel" guard and functional smoke for all 10 operators. Adds an upgrade-path assertion that simulates a stale (pre-fix) install, replays the shipped `ALTER OPERATOR` block, and confirms every overload flips to `contsel`/`contjoinsel` (run in a rolled-back transaction). * `regress/expected/cypher_match.out`, `regress/expected/cypher_vle.out`: refresh expected to reflect new (and better) plan shapes that the lower-selectivity helper produces — `test_enable_containment` now picks Nested Loop + Index Only Scans over a Seq Scan/Hash Join, and two `MATCH p=...` and `show_list_use_vle` queries flip row order (queries had no `ORDER BY`; result set is unchanged, only ordering). * `Makefile`: register `containment_selectivity` in `REGRESS`. Validation: * Build: clean, `-Werror`. * Regression: 36/37 tests pass under `EXTRA_TESTS="pgvector fuzzystrmatch pg_trgm"`. Only `age_upgrade` fails — pre-existing on master at 774e781 (verified by `git stash && installcheck`). * Reporter's exact methodology (LDBC-SNB-style snb_graph + pgbench on `bench_message_content`) reproduces the regression and the fix: | Metric | matchingsel | contsel | Delta | |----------------------------|-------------|---------|-------| | EXPLAIN planning time (ms) | 1.42 | 0.97 | -32% | | EXPLAIN execution time (ms)| 0.34 | 0.31 | ~equal| | pgbench TPS (8c x 30s) | 5247 | 7378 | +40.6%| Run with `default_statistics_target = 1000` to populate MCV lists, matching the reporter's analyzed-graph conditions. * Upgrade path: validated end-to-end during the benchmark — operator bindings were flipped from `matchingsel` -> `contsel` via the same `ALTER OPERATOR` statements the upgrade SQL ships, while operators remained functional throughout. Driver workflows (python/go/node/jdbc) intentionally not run: this PR only adjusts pg_operator selectivity metadata. There is no C, type, or protocol change that drivers could observe. Closes #2356.
1 parent 73d0705 commit 85bca9f

10 files changed

Lines changed: 483 additions & 45 deletions

Makefile

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -183,7 +183,8 @@ REGRESS = scan \
183183
direct_field_access \
184184
security \
185185
reserved_keyword_alias \
186-
agtype_jsonb_cast
186+
agtype_jsonb_cast \
187+
containment_selectivity
187188

188189
ifneq ($(EXTRA_TESTS),)
189190
REGRESS += $(EXTRA_TESTS)

age--1.7.0--y.y.y.sql

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -762,3 +762,41 @@ CREATE FUNCTION ag_catalog._agehash_self_test()
762762
VOLATILE
763763
PARALLEL UNSAFE
764764
AS 'MODULE_PATHNAME';
765+
766+
--
767+
-- Issue #2356: rebind containment and key-existence operators to the
768+
-- lightweight contsel / contjoinsel selectivity estimators.
769+
--
770+
-- @>, <@, @>>, <<@, ?, ?|, ?& on agtype were bound to matchingsel /
771+
-- matchingjoinsel. During planning matchingsel invokes the operator's
772+
-- underlying function (agtype_contains) once per pg_statistic MCV entry and
773+
-- histogram bin; with a realistic default_statistics_target that planning
774+
-- cost dominates simple point queries (the regression reported in #2356).
775+
--
776+
-- contsel / contjoinsel return fixed selectivity constants without calling the
777+
-- operator function, so planning is constant-time. This is a deliberate
778+
-- planning-speed vs. estimate-accuracy trade-off. Note it DIVERGES from
779+
-- PostgreSQL core, which keeps jsonb's @>, <@, ?, ?|, ?& on matchingsel /
780+
-- matchingjoinsel; it is an AGE-specific choice favoring workloads where these
781+
-- operators appear in selective point lookups.
782+
--
783+
ALTER OPERATOR ag_catalog.@>(agtype, agtype)
784+
SET (RESTRICT = contsel, JOIN = contjoinsel);
785+
ALTER OPERATOR ag_catalog.<@(agtype, agtype)
786+
SET (RESTRICT = contsel, JOIN = contjoinsel);
787+
ALTER OPERATOR ag_catalog.@>>(agtype, agtype)
788+
SET (RESTRICT = contsel, JOIN = contjoinsel);
789+
ALTER OPERATOR ag_catalog.<<@(agtype, agtype)
790+
SET (RESTRICT = contsel, JOIN = contjoinsel);
791+
ALTER OPERATOR ag_catalog.?(agtype, text)
792+
SET (RESTRICT = contsel, JOIN = contjoinsel);
793+
ALTER OPERATOR ag_catalog.?(agtype, agtype)
794+
SET (RESTRICT = contsel, JOIN = contjoinsel);
795+
ALTER OPERATOR ag_catalog.?|(agtype, text[])
796+
SET (RESTRICT = contsel, JOIN = contjoinsel);
797+
ALTER OPERATOR ag_catalog.?|(agtype, agtype)
798+
SET (RESTRICT = contsel, JOIN = contjoinsel);
799+
ALTER OPERATOR ag_catalog.?&(agtype, text[])
800+
SET (RESTRICT = contsel, JOIN = contjoinsel);
801+
ALTER OPERATOR ag_catalog.?&(agtype, agtype)
802+
SET (RESTRICT = contsel, JOIN = contjoinsel);
Lines changed: 245 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,245 @@
1+
/*
2+
* Licensed to the Apache Software Foundation (ASF) under one
3+
* or more contributor license agreements. See the NOTICE file
4+
* distributed with this work for additional information
5+
* regarding copyright ownership. The ASF licenses this file
6+
* to you under the Apache License, Version 2.0 (the
7+
* "License"); you may not use this file except in compliance
8+
* with the License. You may obtain a copy of the License at
9+
*
10+
* http://www.apache.org/licenses/LICENSE-2.0
11+
*
12+
* Unless required by applicable law or agreed to in writing,
13+
* software distributed under the License is distributed on an
14+
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
15+
* KIND, either express or implied. See the License for the
16+
* specific language governing permissions and limitations
17+
* under the License.
18+
*/
19+
/*
20+
* Regression coverage for issue #2356:
21+
* The containment (@>, <@, @>>, <<@) and key-existence (?, ?|, ?&)
22+
* operators on agtype must be bound to the lightweight selectivity
23+
* helpers contsel / contjoinsel during planning. Earlier PG14+
24+
* branches used matchingsel / matchingjoinsel, which caused planning
25+
* to invoke agtype_contains() against pg_statistic MCVs and produced
26+
* a 30%+ planning-time regression on point queries (severe TPS drop
27+
* reported on the PG18 branch).
28+
*
29+
* This test pins the bindings by querying pg_operator directly. If
30+
* someone re-introduces matchingsel here, the test diff is loud and
31+
* precise.
32+
*/
33+
LOAD 'age';
34+
SET search_path TO ag_catalog;
35+
-- Selectivity helpers for the four containment operators.
36+
SELECT o.oprname,
37+
pg_catalog.format_type(o.oprleft, NULL) AS lhs,
38+
pg_catalog.format_type(o.oprright, NULL) AS rhs,
39+
o.oprrest::text AS restrict_fn,
40+
o.oprjoin::text AS join_fn
41+
FROM pg_catalog.pg_operator o
42+
JOIN pg_catalog.pg_namespace n ON n.oid = o.oprnamespace
43+
WHERE n.nspname = 'ag_catalog'
44+
AND o.oprname IN ('@>', '<@', '@>>', '<<@')
45+
ORDER BY o.oprname, lhs, rhs;
46+
oprname | lhs | rhs | restrict_fn | join_fn
47+
---------+--------+--------+-------------+-------------
48+
<<@ | agtype | agtype | contsel | contjoinsel
49+
<@ | agtype | agtype | contsel | contjoinsel
50+
@> | agtype | agtype | contsel | contjoinsel
51+
@>> | agtype | agtype | contsel | contjoinsel
52+
(4 rows)
53+
54+
-- Selectivity helpers for all key-existence operator overloads
55+
-- (right-hand side may be text, text[], or agtype).
56+
SELECT o.oprname,
57+
pg_catalog.format_type(o.oprleft, NULL) AS lhs,
58+
pg_catalog.format_type(o.oprright, NULL) AS rhs,
59+
o.oprrest::text AS restrict_fn,
60+
o.oprjoin::text AS join_fn
61+
FROM pg_catalog.pg_operator o
62+
JOIN pg_catalog.pg_namespace n ON n.oid = o.oprnamespace
63+
WHERE n.nspname = 'ag_catalog'
64+
AND o.oprname IN ('?', '?|', '?&')
65+
ORDER BY o.oprname, lhs, rhs;
66+
oprname | lhs | rhs | restrict_fn | join_fn
67+
---------+--------+--------+-------------+-------------
68+
? | agtype | agtype | contsel | contjoinsel
69+
? | agtype | text | contsel | contjoinsel
70+
?& | agtype | agtype | contsel | contjoinsel
71+
?& | agtype | text[] | contsel | contjoinsel
72+
?| | agtype | agtype | contsel | contjoinsel
73+
?| | agtype | text[] | contsel | contjoinsel
74+
(6 rows)
75+
76+
-- Scoped guard for issue #2356: assert that none of the specific containment
77+
-- and key-existence operators on agtype are bound to matchingsel /
78+
-- matchingjoinsel. We deliberately limit the check to these operator names
79+
-- (rather than every operator in ag_catalog) so unrelated operators that
80+
-- legitimately use matchingsel for their own semantics are not affected by
81+
-- this regression test.
82+
SELECT COUNT(*) AS leaked_matchingsel_bindings
83+
FROM pg_catalog.pg_operator o
84+
JOIN pg_catalog.pg_namespace n ON n.oid = o.oprnamespace
85+
WHERE n.nspname = 'ag_catalog'
86+
AND o.oprname IN ('@>', '<@', '@>>', '<<@', '?', '?|', '?&')
87+
AND (o.oprrest::text = 'matchingsel'
88+
OR o.oprjoin::text = 'matchingjoinsel');
89+
leaked_matchingsel_bindings
90+
-----------------------------
91+
0
92+
(1 row)
93+
94+
-- Smoke test: each operator still works functionally. Selectivity binding
95+
-- only affects the planner; this guards against an inadvertent operator
96+
-- removal as part of any future cleanup.
97+
SELECT '{"a":1,"b":2}'::agtype @> '{"a":1}'::agtype AS contains_yes;
98+
contains_yes
99+
--------------
100+
t
101+
(1 row)
102+
103+
SELECT '{"a":1}'::agtype <@ '{"a":1,"b":2}'::agtype AS contained_yes;
104+
contained_yes
105+
---------------
106+
t
107+
(1 row)
108+
109+
SELECT '{"a":{"b":1}}'::agtype @>> '{"a":{"b":1}}'::agtype AS top_contains_yes;
110+
top_contains_yes
111+
------------------
112+
t
113+
(1 row)
114+
115+
SELECT '{"a":{"b":1}}'::agtype <<@ '{"a":{"b":1}}'::agtype AS top_contained_yes;
116+
top_contained_yes
117+
-------------------
118+
t
119+
(1 row)
120+
121+
SELECT '{"a":1}'::agtype ? 'a'::text AS exists_text_yes;
122+
exists_text_yes
123+
-----------------
124+
t
125+
(1 row)
126+
127+
SELECT '{"a":1}'::agtype ? '"a"'::agtype AS exists_agtype_yes;
128+
exists_agtype_yes
129+
-------------------
130+
t
131+
(1 row)
132+
133+
SELECT '{"a":1,"b":2}'::agtype ?| ARRAY['a','c'] AS exists_any_text_yes;
134+
exists_any_text_yes
135+
---------------------
136+
t
137+
(1 row)
138+
139+
SELECT '{"a":1,"b":2}'::agtype ?| '["a","c"]'::agtype AS exists_any_agtype_yes;
140+
exists_any_agtype_yes
141+
-----------------------
142+
t
143+
(1 row)
144+
145+
SELECT '{"a":1,"b":2}'::agtype ?& ARRAY['a','b'] AS exists_all_text_yes;
146+
exists_all_text_yes
147+
---------------------
148+
t
149+
(1 row)
150+
151+
SELECT '{"a":1,"b":2}'::agtype ?& '["a","b"]'::agtype AS exists_all_agtype_yes;
152+
exists_all_agtype_yes
153+
-----------------------
154+
t
155+
(1 row)
156+
157+
-- Upgrade-path assertion for issue #2356.
158+
--
159+
-- The checks above cover a FRESH install: contsel / contjoinsel come straight
160+
-- from agtype_operators.sql and agtype_exists.sql. Existing installs instead
161+
-- pick up the fix from the ALTER OPERATOR ... SET (RESTRICT, JOIN) block that
162+
-- age--1.7.0--y.y.y.sql ships and "ALTER EXTENSION age UPDATE" replays. Nothing
163+
-- above exercises that block, so a silent regression in it would go unnoticed.
164+
--
165+
-- We replay the shipped ALTER OPERATOR statements directly rather than running
166+
-- ALTER EXTENSION age UPDATE: the dev upgrade script targets the placeholder
167+
-- version "y.y.y" and is not a stable version-chain target inside the
168+
-- regression harness. The whole section runs in a transaction that is rolled
169+
-- back, so it observes the flip without permanently mutating the operator
170+
-- catalog (PostgreSQL DDL is transactional).
171+
BEGIN;
172+
-- Simulate a stale (pre-fix) install: force all ten overloads back onto
173+
-- matchingsel / matchingjoinsel.
174+
ALTER OPERATOR ag_catalog.@>(agtype, agtype) SET (RESTRICT = matchingsel, JOIN = matchingjoinsel);
175+
ALTER OPERATOR ag_catalog.<@(agtype, agtype) SET (RESTRICT = matchingsel, JOIN = matchingjoinsel);
176+
ALTER OPERATOR ag_catalog.@>>(agtype, agtype) SET (RESTRICT = matchingsel, JOIN = matchingjoinsel);
177+
ALTER OPERATOR ag_catalog.<<@(agtype, agtype) SET (RESTRICT = matchingsel, JOIN = matchingjoinsel);
178+
ALTER OPERATOR ag_catalog.?(agtype, text) SET (RESTRICT = matchingsel, JOIN = matchingjoinsel);
179+
ALTER OPERATOR ag_catalog.?(agtype, agtype) SET (RESTRICT = matchingsel, JOIN = matchingjoinsel);
180+
ALTER OPERATOR ag_catalog.?|(agtype, text[]) SET (RESTRICT = matchingsel, JOIN = matchingjoinsel);
181+
ALTER OPERATOR ag_catalog.?|(agtype, agtype) SET (RESTRICT = matchingsel, JOIN = matchingjoinsel);
182+
ALTER OPERATOR ag_catalog.?&(agtype, text[]) SET (RESTRICT = matchingsel, JOIN = matchingjoinsel);
183+
ALTER OPERATOR ag_catalog.?&(agtype, agtype) SET (RESTRICT = matchingsel, JOIN = matchingjoinsel);
184+
-- Stale state: every overload now reports matchingsel / matchingjoinsel.
185+
SELECT o.oprname,
186+
pg_catalog.format_type(o.oprleft, NULL) AS lhs,
187+
pg_catalog.format_type(o.oprright, NULL) AS rhs,
188+
o.oprrest::text AS restrict_fn,
189+
o.oprjoin::text AS join_fn
190+
FROM pg_catalog.pg_operator o
191+
JOIN pg_catalog.pg_namespace n ON n.oid = o.oprnamespace
192+
WHERE n.nspname = 'ag_catalog'
193+
AND o.oprname IN ('@>', '<@', '@>>', '<<@', '?', '?|', '?&')
194+
ORDER BY o.oprname, lhs, rhs;
195+
oprname | lhs | rhs | restrict_fn | join_fn
196+
---------+--------+--------+-------------+-----------------
197+
<<@ | agtype | agtype | matchingsel | matchingjoinsel
198+
<@ | agtype | agtype | matchingsel | matchingjoinsel
199+
? | agtype | agtype | matchingsel | matchingjoinsel
200+
? | agtype | text | matchingsel | matchingjoinsel
201+
?& | agtype | agtype | matchingsel | matchingjoinsel
202+
?& | agtype | text[] | matchingsel | matchingjoinsel
203+
?| | agtype | agtype | matchingsel | matchingjoinsel
204+
?| | agtype | text[] | matchingsel | matchingjoinsel
205+
@> | agtype | agtype | matchingsel | matchingjoinsel
206+
@>> | agtype | agtype | matchingsel | matchingjoinsel
207+
(10 rows)
208+
209+
-- Replay the exact ALTER OPERATOR block shipped in age--1.7.0--y.y.y.sql.
210+
ALTER OPERATOR ag_catalog.@>(agtype, agtype) SET (RESTRICT = contsel, JOIN = contjoinsel);
211+
ALTER OPERATOR ag_catalog.<@(agtype, agtype) SET (RESTRICT = contsel, JOIN = contjoinsel);
212+
ALTER OPERATOR ag_catalog.@>>(agtype, agtype) SET (RESTRICT = contsel, JOIN = contjoinsel);
213+
ALTER OPERATOR ag_catalog.<<@(agtype, agtype) SET (RESTRICT = contsel, JOIN = contjoinsel);
214+
ALTER OPERATOR ag_catalog.?(agtype, text) SET (RESTRICT = contsel, JOIN = contjoinsel);
215+
ALTER OPERATOR ag_catalog.?(agtype, agtype) SET (RESTRICT = contsel, JOIN = contjoinsel);
216+
ALTER OPERATOR ag_catalog.?|(agtype, text[]) SET (RESTRICT = contsel, JOIN = contjoinsel);
217+
ALTER OPERATOR ag_catalog.?|(agtype, agtype) SET (RESTRICT = contsel, JOIN = contjoinsel);
218+
ALTER OPERATOR ag_catalog.?&(agtype, text[]) SET (RESTRICT = contsel, JOIN = contjoinsel);
219+
ALTER OPERATOR ag_catalog.?&(agtype, agtype) SET (RESTRICT = contsel, JOIN = contjoinsel);
220+
-- After the upgrade replay every overload is back on contsel / contjoinsel.
221+
SELECT o.oprname,
222+
pg_catalog.format_type(o.oprleft, NULL) AS lhs,
223+
pg_catalog.format_type(o.oprright, NULL) AS rhs,
224+
o.oprrest::text AS restrict_fn,
225+
o.oprjoin::text AS join_fn
226+
FROM pg_catalog.pg_operator o
227+
JOIN pg_catalog.pg_namespace n ON n.oid = o.oprnamespace
228+
WHERE n.nspname = 'ag_catalog'
229+
AND o.oprname IN ('@>', '<@', '@>>', '<<@', '?', '?|', '?&')
230+
ORDER BY o.oprname, lhs, rhs;
231+
oprname | lhs | rhs | restrict_fn | join_fn
232+
---------+--------+--------+-------------+-------------
233+
<<@ | agtype | agtype | contsel | contjoinsel
234+
<@ | agtype | agtype | contsel | contjoinsel
235+
? | agtype | agtype | contsel | contjoinsel
236+
? | agtype | text | contsel | contjoinsel
237+
?& | agtype | agtype | contsel | contjoinsel
238+
?& | agtype | text[] | contsel | contjoinsel
239+
?| | agtype | agtype | contsel | contjoinsel
240+
?| | agtype | text[] | contsel | contjoinsel
241+
@> | agtype | agtype | contsel | contjoinsel
242+
@>> | agtype | agtype | contsel | contjoinsel
243+
(10 rows)
244+
245+
ROLLBACK;

0 commit comments

Comments
 (0)