
roachtest: add pg_dump round-trip compatibility test #167543

Draft
rafiss wants to merge 7 commits into cockroachdb:master from rafiss:pgdump-roachtest

Conversation

@rafiss
Collaborator

@rafiss rafiss commented Apr 5, 2026

Add a weekly roachtest that validates pg_dump round-trip compatibility
with CockroachDB. The test builds pg_dump from PostgreSQL source
(REL_17_4), creates a representative schema in a source database,
dumps it with pg_dump --schema-only using the pg_dump_compatibility
session variable, restores into a target database via psql, dumps
again, and compares the two outputs.

Known deviations are tracked via an expected-diff baseline file in
testdata, following the same pattern as the pg_regress roachtest. The
current baseline captures one known issue: pg_dump emits a view
before the table it depends on, causing the view to be lost during
restore.

The test covers enum types, tables with various column types,
constraints, secondary indexes (including partial and covering),
foreign keys, sequences, views, comments, multi-schema objects,
and default expressions.

Resolves: #167442
Epic: CRDB-28751

Release note: None
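The dump-compare step can be sketched as a line-set check against the expected-diff baseline. All names and the baseline format below are illustrative; the actual roachtest shells out to diff and compares against a baseline file in testdata.

```go
package main

import (
	"fmt"
	"strings"
)

// unexpectedDiff returns the lines present in dump2 but not in dump1
// that are also not covered by the expected-diff baseline. A simplified
// line-set comparison, not the roachtest's actual diffing logic.
func unexpectedDiff(dump1, dump2 string, baseline map[string]bool) []string {
	lines1 := map[string]bool{}
	for _, l := range strings.Split(dump1, "\n") {
		lines1[l] = true
	}
	var out []string
	for _, l := range strings.Split(dump2, "\n") {
		if !lines1[l] && !baseline[l] {
			out = append(out, l)
		}
	}
	return out
}

func main() {
	// A deviation listed in the baseline does not fail the test.
	baseline := map[string]bool{"-- known deviation": true}
	d := unexpectedDiff(
		"CREATE TABLE t ();\nCREATE VIEW v AS SELECT 1;",
		"CREATE TABLE t ();\n-- known deviation",
		baseline,
	)
	fmt.Println(len(d))
}
```

A real implementation would normalize whitespace and comments before comparing, since pg_dump output is sensitive to ordering.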

rafiss and others added 7 commits April 2, 2026 23:22
pg_dump fails when connecting to CockroachDB because it looks up catalog
objects by {tableoid, oid} using hardcoded PostgreSQL RelationId constants
(e.g. NamespaceRelationId = 2615 for pg_namespace), but CockroachDB
returns its internal descriptor IDs as tableoid (e.g. 4294967048). The
key mismatch causes "schema with OID X does not exist" errors.

Add a `pg_dump_compatibility` session variable (default off) that, when
enabled, causes virtual pg_catalog tables to report standard PostgreSQL
tableoid values instead of CockroachDB's internal descriptor IDs. The
mapping covers the 10 pg_catalog tables that pg_dump looks up by
hardcoded tableoid: pg_class, pg_type, pg_proc, pg_operator, pg_am,
pg_collation, pg_namespace, pg_extension, pg_publication, and
pg_subscription. Non-pg_catalog virtual tables (information_schema,
crdb_internal) are unaffected.
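The remapping amounts to a lookup table keyed by catalog table name. A minimal sketch, with illustrative Go names (only pg_namespace = 2615 is cited in this commit; the other values are PostgreSQL's standard relation OIDs, and the full mapping covers all ten tables listed above):

```go
package main

import "fmt"

// pgCatalogTableOID maps a few pg_catalog table names to the fixed
// relation OIDs that PostgreSQL (and therefore pg_dump) hardcodes.
var pgCatalogTableOID = map[string]uint32{
	"pg_class":     1259,
	"pg_type":      1247,
	"pg_proc":      1255,
	"pg_namespace": 2615, // NamespaceRelationId, hardcoded by pg_dump
}

// reportedTableOID returns the tableoid a virtual pg_catalog row
// reports: the standard PostgreSQL OID when compatibility mode is on,
// otherwise CockroachDB's internal descriptor ID.
func reportedTableOID(compat bool, table string, descriptorID uint32) uint32 {
	if compat {
		if oid, ok := pgCatalogTableOID[table]; ok {
			return oid
		}
	}
	return descriptorID
}

func main() {
	fmt.Println(reportedTableOID(true, "pg_namespace", 4294967048))
	fmt.Println(reportedTableOID(false, "pg_namespace", 4294967048))
}
```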

Informs: cockroachdb#20296

Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
pg_dump issues `LOCK TABLE ... IN ACCESS SHARE MODE` for all tables it
discovers. CockroachDB didn't parse `LOCK TABLE` at all — `LOCK` wasn't
even a keyword in the grammar. Since CockroachDB uses MVCC, lock modes
like ACCESS SHARE are semantically meaningless (reads never block), so
this can be safely no-oped.

Add `LOCK` and `EXCLUSIVE` as unreserved keywords and implement parsing
for the full `LOCK TABLE` statement with all eight PostgreSQL lock modes
and the optional `NOWAIT` clause. When `pg_dump_compatibility` is true,
the statement is accepted as a no-op. Otherwise, it returns a
FeatureNotSupported error.
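The dispatch described above can be sketched as follows; the function and variable names are illustrative, not the parser's actual code, but the eight lock modes are PostgreSQL's real set:

```go
package main

import (
	"errors"
	"fmt"
)

// The eight lock modes PostgreSQL's LOCK TABLE accepts.
var lockModes = map[string]bool{
	"ACCESS SHARE": true, "ROW SHARE": true, "ROW EXCLUSIVE": true,
	"SHARE UPDATE EXCLUSIVE": true, "SHARE": true,
	"SHARE ROW EXCLUSIVE": true, "EXCLUSIVE": true, "ACCESS EXCLUSIVE": true,
}

// execLockTable models the semantics described above: with compatibility
// mode on, any valid LOCK TABLE is accepted as a no-op (MVCC reads never
// block); otherwise it reports FeatureNotSupported.
func execLockTable(compatEnabled bool, mode string) error {
	if !lockModes[mode] {
		return fmt.Errorf("syntax error: unknown lock mode %q", mode)
	}
	if !compatEnabled {
		return errors.New("FeatureNotSupported: LOCK TABLE")
	}
	return nil // accepted as a no-op
}

func main() {
	fmt.Println(execLockTable(true, "ACCESS SHARE"))
	fmt.Println(execLockTable(false, "ACCESS SHARE"))
}
```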

Informs: cockroachdb#20296

Release note: None

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
When pg_dump_compatibility is enabled, remap several pg_catalog OIDs so
pg_dump treats built-in objects correctly:

- pg_cast: assign sequential OIDs starting from 14000 instead of FNV-32
  hashes, keeping them below 16384 (FirstNormalObjectId) so pg_dump
  skips them as built-in
- pg_am: emit PostgreSQL's btree (403) and heap (2) access methods
  instead of CockroachDB's prefix and inverted, preventing broken
  CREATE ACCESS METHOD statements
- pg_class.relam: use heapAmOid (2) for tables and btreeAmOid (403) for
  forward indexes, preventing SET default_table_access_method output
- pg_tablespace: use OID 1663 for pg_default (matching PostgreSQL v17)
  instead of 0, preventing spurious "Tablespace: pg_default" annotations
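The pg_cast change is the simplest of these: sequential assignment starting at 14000, with the invariant that every assigned OID stays below FirstNormalObjectId (16384) so pg_dump treats the casts as built-in. A simplified sketch with illustrative names:

```go
package main

import "fmt"

// firstNormalObjectID is PostgreSQL's boundary between built-in and
// user-defined objects; pg_dump skips anything below it.
const firstNormalObjectID = 16384

// assignCastOIDs gives each pg_cast row a sequential OID starting at
// 14000, instead of an FNV-32 hash that could land above the boundary.
func assignCastOIDs(numCasts int) ([]uint32, error) {
	const base = 14000
	if base+numCasts > firstNormalObjectID {
		return nil, fmt.Errorf("%d casts would cross FirstNormalObjectId", numCasts)
	}
	oids := make([]uint32, numCasts)
	for i := range oids {
		oids[i] = uint32(base + i)
	}
	return oids, nil
}

func main() {
	oids, _ := assignCastOIDs(3)
	fmt.Println(oids) // [14000 14001 14002]
}
```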

Informs: cockroachdb#20296

Release note: None

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
When pg_dump_compatibility is enabled, hide CockroachDB-internal objects
from pg_catalog tables to prevent pg_dump from attempting to dump them:

- pg_proc: skip crdb_internal and information_schema built-in functions
  with OIDs below 16384, which pg_dump would collect as "user-defined"
- pg_class: skip crdb_internal tables so pg_dump does not try to dump
  internal system tables
- pg_namespace: skip the crdb_internal schema to prevent pg_dump from
  emitting CREATE SCHEMA crdb_internal
- pg_type: skip crdb_internal composite types to avoid referencing a
  missing schema OID

Uses schema ID comparison (catconstants.CrdbInternalID) instead of name
comparison for robustness.
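The filtering pattern is the same for all four tables: drop rows whose schema ID matches the crdb_internal constant. A minimal sketch for pg_namespace; the constant's value and all names here are made up for illustration and are not catconstants' actual values:

```go
package main

import "fmt"

// crdbInternalSchemaID stands in for catconstants.CrdbInternalID;
// the value is illustrative only.
const crdbInternalSchemaID = 4294967294

type namespaceRow struct {
	name     string
	schemaID uint64
}

// visibleNamespaces drops the crdb_internal schema when compatibility
// mode is on, comparing by schema ID rather than name, as the commit
// notes for robustness.
func visibleNamespaces(compat bool, rows []namespaceRow) []namespaceRow {
	if !compat {
		return rows
	}
	var out []namespaceRow
	for _, r := range rows {
		if r.schemaID != crdbInternalSchemaID {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	rows := []namespaceRow{
		{"public", 29},
		{"crdb_internal", crdbInternalSchemaID},
	}
	for _, r := range visibleNamespaces(true, rows) {
		fmt.Println(r.name)
	}
}
```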

Informs: cockroachdb#20296

Release note: None

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Previously, the pg_dump_compatibility session setting was a boolean.
This changes it to a string enum with three values: "off" (default),
"postgres", and "cockroachdb". Both "postgres" and "cockroachdb" modes
enable the existing pg_catalog compatibility behavior (standard OIDs,
hidden crdb_internal schema, etc.). The "postgres" mode additionally
suppresses CRDB-specific storage parameters like schema_locked from
SHOW CREATE TABLE output, pg_class.reloptions, and LOCALITY clauses,
so that pg_dump produces valid PostgreSQL syntax.

The GetStorageParams interface gains an excludeCRDBInternal parameter
to support filtering schema_locked when in postgres compat mode.
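The mode split can be summarized with two predicates; the Go type and method names below are illustrative, not the session-variable implementation:

```go
package main

import "fmt"

// pgDumpCompatMode mirrors the three-valued session setting.
type pgDumpCompatMode string

const (
	compatOff         pgDumpCompatMode = "off"
	compatPostgres    pgDumpCompatMode = "postgres"
	compatCockroachDB pgDumpCompatMode = "cockroachdb"
)

// catalogFixupsEnabled: both non-off modes get the pg_catalog behavior
// (standard OIDs, hidden crdb_internal schema, etc.).
func (m pgDumpCompatMode) catalogFixupsEnabled() bool {
	return m == compatPostgres || m == compatCockroachDB
}

// suppressCRDBSyntax: only "postgres" mode hides CRDB-specific output
// such as the schema_locked storage parameter and LOCALITY clauses.
func (m pgDumpCompatMode) suppressCRDBSyntax() bool {
	return m == compatPostgres
}

func main() {
	for _, m := range []pgDumpCompatMode{compatOff, compatPostgres, compatCockroachDB} {
		fmt.Println(m, m.catalogFixupsEnabled(), m.suppressCRDBSyntax())
	}
}
```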

Informs: cockroachdb#20296

Release note (sql change): Added a `pg_dump_compatibility` session
variable that improves compatibility with the pg_dump PostgreSQL tool.
Set it to `postgres` to make pg_catalog report the OIDs that pg_dump
hardcodes, hide CockroachDB-internal objects that do not work with
pg_dump, and apply other fixups so that the pg_dump output can run on
non-CockroachDB servers. Set it to `cockroachdb` for the same pg_catalog
fixes while keeping CockroachDB-specific syntax. Like other session
variables, `pg_dump_compatibility` can be set in the connection string
so that it takes effect as soon as pg_dump connects to CockroachDB.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
When pg_dump_compatibility is enabled, strip the database catalog from
table references in pg_views and pg_matviews definitions (e.g.
"mydb.public.t" becomes "public.t"). PostgreSQL does not have
cross-database references, so view definitions should only use
schema-qualified names. Without this fix, pg_dump emits CREATE VIEW
statements that reference the source database name, which fail on
restore to a differently-named database.

The new viewQueryWithoutDatabasePrefix helper parses the view query
and re-formats it using FmtReformatTableNames to omit the catalog
from all table name nodes.
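The intended transformation can be illustrated at the string level; the real helper operates on the parsed AST via FmtReformatTableNames, so this simplified version (which would mishandle quoted identifiers containing dots) is only a sketch:

```go
package main

import (
	"fmt"
	"strings"
)

// stripCatalog removes the leading database qualifier from a fully
// qualified db.schema.table name, leaving schema.table. Two-part and
// bare names pass through unchanged.
func stripCatalog(name string) string {
	parts := strings.Split(name, ".")
	if len(parts) == 3 {
		return strings.Join(parts[1:], ".")
	}
	return name
}

func main() {
	fmt.Println(stripCatalog("mydb.public.t")) // public.t
	fmt.Println(stripCatalog("public.t"))      // public.t
}
```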

Informs: cockroachdb#20296

Release note: None

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Add a weekly roachtest that validates pg_dump round-trip compatibility
with CockroachDB. The test builds pg_dump from PostgreSQL source
(REL_17_4), creates a representative schema in a source database,
dumps it with `pg_dump --schema-only` using the `pg_dump_compatibility`
session variable, restores into a target database via psql, dumps
again, and compares the two outputs.

Known deviations are tracked via an expected-diff baseline file in
testdata, following the same pattern as the pg_regress roachtest. The
current baseline captures one known issue: pg_dump emits a view
before the table it depends on, causing the view to be lost during
restore.

The test covers enum types, tables with various column types,
constraints, secondary indexes (including partial and covering),
foreign keys, sequences, views, comments, multi-schema objects,
and default expressions.

Resolves: cockroachdb#167442
Epic: CRDB-28751

Release note: None
@trunk-io
Contributor

trunk-io bot commented Apr 5, 2026

Merging to master in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here

@blathers-crl

blathers-crl bot commented Apr 5, 2026

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

It looks like your PR touches SQL parser code but doesn't add or edit parser tests. Please make sure you add or edit parser tests if you edit the parser.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Member

This change is Reviewable


Development

Successfully merging this pull request may close these issues.

roachtest: add nightly pg_dump compatibility test using PostgreSQL's test suite
