Context
PgProtocolHandler.rewriteQuery() in convex-db rewrites incoming PostgreSQL SQL into Calcite-compatible SQL via string replacement. Each of the 10 hacks carries a "TODO: Remove hack" comment and names the proper fix.
String-munging is fragile (e.g. the regex-operator check returns an empty result for any query containing ~, which false-matches some legitimate SQL), loses query semantics, and will keep growing as more PostgreSQL clients hit it.
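The false-match can be reproduced with a minimal sketch. `naiveHasRegexOperator` is a hypothetical stand-in for a substring-based check of the kind described above, not the actual code in PgProtocolHandler. It flags a query whose only ~ sits inside a string literal:

```java
// Hypothetical stand-in for a substring-based operator check;
// NOT the actual code in PgProtocolHandler.rewriteQuery().
final class TildeCheckDemo {

    // Flags any SQL text containing '~', wherever it appears.
    static boolean naiveHasRegexOperator(String sql) {
        return sql.contains("~");
    }

    public static void main(String[] args) {
        // Genuine POSIX regex match: the operator really is present.
        String regexQuery = "SELECT name FROM users WHERE name ~ '^a'";
        // No regex operator: '~' only occurs inside a string literal.
        String literalQuery = "SELECT path FROM files WHERE path = '~/notes.txt'";

        System.out.println(naiveHasRegexOperator(regexQuery));   // true
        System.out.println(naiveHasRegexOperator(literalQuery)); // true (false positive)
    }
}
```

A parser distinguishes the two cases because it sees the literal as a single token, which a substring check cannot do.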
Proposal — phased refactor
Phase 1: Adopt Calcite Babel parser (biggest leverage)
Switch the Calcite SqlParser.Config to use SqlBabelParserImpl.FACTORY with PostgreSQL conformance. Babel natively supports:
- POSIX regex operators ~, ~*, !~, !~* → removes line 256 hack
- ::type cast syntax → removes line 320 hack (and the follow-up ::integer, ::int4, ::text, ::regclass, ::oid etc. stripping)
One parser config change eliminates two of the most aggressive rewrites.
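A rough sketch of what the parser change might look like, assuming the calcite-babel artifact is on the classpath and a Calcite version that provides `SqlParser.config()`. `BabelParserSketch` and the sample query are illustrative, and `SqlConformanceEnum.BABEL` is an assumption standing in for the "PostgreSQL conformance" mentioned above. How this is wired into the existing convex-db planner setup is not shown:

```java
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.parser.SqlParseException;
import org.apache.calcite.sql.parser.SqlParser;
import org.apache.calcite.sql.parser.babel.SqlBabelParserImpl;
import org.apache.calcite.sql.validate.SqlConformanceEnum;

// Illustrative only; class name and query are not from convex-db.
final class BabelParserSketch {

    // Parser config that swaps in the Babel parser implementation.
    static SqlParser.Config babelConfig() {
        return SqlParser.config()
            .withParserFactory(SqlBabelParserImpl.FACTORY)
            // Assumption: the lenient Babel conformance level.
            .withConformance(SqlConformanceEnum.BABEL);
    }

    public static void main(String[] args) throws SqlParseException {
        // Uses both a POSIX regex operator and a :: cast.
        String sql = "SELECT relname FROM pg_class "
            + "WHERE relname ~ '^pg_' AND '42'::integer = 42";
        SqlNode node = SqlParser.create(sql, babelConfig()).parseQuery();
        System.out.println(node);
    }
}
```

Parsing is only the first step. Whether the validator and the operator table resolve these operators and type names would still need checking against the Calcite version in use.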
Phase 2: Register PostgreSQL system functions as Calcite ScalarFunctions
Replace 6 regex substitutions with proper function registrations against the Calcite root schema. Each reads from session context (user, database, processId) rather than being baked into the rewritten string:
| TODO line | Function | Replaces |
| --- | --- | --- |
| 289 | CURRENT_SCHEMA() | regex → 'public' |
| 295 | CURRENT_DATABASE() | regex → database literal |
| 300 | CURRENT_USER | regex → 'convex' |
| 305 | SESSION_USER | regex → 'convex' |
| 310 | version() | regex → version literal |
| 315 | pg_backend_pid() | regex → processId |
Suggest a new convex.db.calcite.pgcatalog.PgSystemFunctions class that registers all six in one place, analogous to the existing pg_catalog table setup.
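One possible shape for such a class, as a sketch under assumptions: the `Session` holder, the thread-local binding, and all method names are hypothetical, and it assumes the function body runs on the same thread that handles the connection. It shows only three of the six functions, and only the idea of reading values from session state at call time:

```java
// Sketch only: names and the thread-local session holder are assumptions,
// not existing convex-db code.
final class PgSystemFunctionsSketch {

    // Hypothetical per-connection state.
    static final class Session {
        final String user;
        final String database;
        final int processId;

        Session(String user, String database, int processId) {
            this.user = user;
            this.database = database;
            this.processId = processId;
        }
    }

    // Assumes one session per handler thread.
    private static final ThreadLocal<Session> CURRENT = new ThreadLocal<>();

    static void bind(Session session) { CURRENT.set(session); }

    static void unbind() { CURRENT.remove(); }

    private static Session session() {
        Session s = CURRENT.get();
        if (s == null) {
            throw new IllegalStateException("no session bound to this thread");
        }
        return s;
    }

    // Each function reads the live session instead of a baked-in literal.
    public static String currentDatabase() { return session().database; }

    public static String currentUser() { return session().user; }

    public static int pgBackendPid() { return session().processId; }

    public static void main(String[] args) {
        bind(new Session("convex", "mydb", 4242));
        try {
            System.out.println(currentDatabase()); // mydb
            System.out.println(currentUser());     // convex
            System.out.println(pgBackendPid());    // 4242
        } finally {
            unbind();
        }
    }
}
```

Registration with Calcite is left out. One plausible route is wrapping each static method with `ScalarFunctionImpl.create(Class, String)` and adding the result to the root `SchemaPlus`, which would likely also require the class to be public.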
Phase 3: Expand pg_catalog virtual tables and fix search_path
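Partial implementations already exist under convex-db/src/main/java/convex/db/calcite/pgcatalog/ (PgClassTable, PgAttributeTable, PgNamespaceTable, PgDatabaseTable, PgTypeTable, PgTablesTable). Phase 3 fills in the gaps:
- Add the missing tables: pg_constraint, pg_index, pg_proc, pg_views, pg_settings, pg_am, pg_roles, pg_stat*, and information_schema.*.
- Replace the addPgCatalogPrefix search_path emulation with SchemaPlus.setPath() so unqualified pg_tables etc. resolve without string prefixing.
Once Phase 3 lands, rewriteQuery() should be removable entirely (or reduced to a no-op).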
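The search_path idea can be modelled with plain collections. The sketch below is illustrative only: `SearchPathSketch`, its data, and the path order are assumptions, and it does not use Calcite. It shows an unqualified name resolved by walking an ordered schema path rather than by rewriting the SQL text:

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Illustrative only: models path-based name resolution with plain collections.
final class SearchPathSketch {

    // Returns "schema.table" for the first schema on the path that has the table.
    static Optional<String> resolve(String table,
                                    List<String> searchPath,
                                    Map<String, Set<String>> tablesBySchema) {
        for (String schema : searchPath) {
            Set<String> tables = tablesBySchema.get(schema);
            if (tables != null && tables.contains(table)) {
                return Optional.of(schema + "." + table);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, Set<String>> catalog = Map.of(
            "pg_catalog", Set.of("pg_tables", "pg_class"),
            "public", Set.of("orders"));
        // PostgreSQL searches pg_catalog implicitly before the listed schemas.
        List<String> path = List.of("pg_catalog", "public");

        System.out.println(resolve("pg_tables", path, catalog)); // Optional[pg_catalog.pg_tables]
        System.out.println(resolve("orders", path, catalog));    // Optional[public.orders]
        System.out.println(resolve("missing", path, catalog));   // Optional.empty
    }
}
```

With resolution done at the schema level, the query text stays untouched and a name that appears inside a string literal or comment is never prefixed by accident.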
Why do this
- Real PostgreSQL clients (pgAdmin, DBeaver, psql \d) issue introspection queries that the current empty-result hack silently breaks.
- Rewriting via regex drops query semantics: for example, any query on a user table that uses ~ in a column comparison currently returns no rows.
- The :: cast stripping silently changes query meaning when a cast was semantically required.
- Babel is already a supported Calcite dialect, so Phase 1 is low-risk.
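A small worked example of the cast point, as a sketch: the two helper methods are hypothetical and only model the comparison semantics in Java. They show how '10'::integer < '9'::integer and the stripped form '10' < '9' give opposite answers:

```java
// Illustrative only: models integer vs. text comparison, not convex-db code.
final class CastStrippingDemo {

    // Casts intact: both operands are compared as integers.
    static boolean lessThanAsInteger(String a, String b) {
        return Integer.parseInt(a) < Integer.parseInt(b);
    }

    // Casts stripped: operands are compared character by character as text.
    static boolean lessThanAsText(String a, String b) {
        return a.compareTo(b) < 0;
    }

    public static void main(String[] args) {
        // Models '10'::integer < '9'::integer
        System.out.println(lessThanAsInteger("10", "9")); // false
        // Models '10' < '9' after the casts are removed
        System.out.println(lessThanAsText("10", "9"));    // true
    }
}
```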
Out of scope
- Full PostgreSQL dialect parity (procedures, arrays, ranges, JSON operators). This issue is strictly about removing the existing string-rewrite layer.
Related
Part of the broader TODO cleanup identified in the repo review. See also #559 (Local op cache).