[FLINK-34806][postgres] Expose scan.newly-added-table.enabled option #4436

Open: cansakiroglu wants to merge 2 commits into apache:master from cansakiroglu:FLINK-34806-postgres-scan-newly-added-table

@cansakiroglu

SUMMARY

Add the scan.newly-added-table.enabled YAML option to the Postgres Pipeline connector. The underlying SnapshotSplitAssigner.captureNewlyAddedTables() mechanism and the PostgresSourceBuilder.scanNewlyAddedTableEnabled() builder method already exist in the postgres-cdc source; this PR adds the missing YAML-side wiring.

Mirrors the same option already exposed by the MySQL Pipeline connector (MySqlDataSourceOptions.SCAN_NEWLY_ADDED_TABLE_ENABLED).

Default is false, so the change is a no-op for existing pipelines. When set to true, restoring from a savepoint discovers tables that match the source tables: pattern but were not part of the captured set at savepoint time. This enables DMS-style 'add a new table without re-snapshotting existing tables' workflows.

JIRA

FLINK-34806
"[Feature][Postgres] Support automatically identify newly added tables"

What changes

Adds the scan.newly-added-table.enabled YAML option to the Postgres Pipeline
connector.

No behaviour change unless the user opts in (defaultValue(false)).

Why

The MySQL Pipeline connector exposes the same option via MySqlDataSourceOptions.SCAN_NEWLY_ADDED_TABLE_ENABLED and reads it in MySqlDataSourceFactory. This PR brings the Postgres Pipeline connector to parity.

When this matters

Adding a new table to a long-running pipeline. Today the only way to capture a newly created Postgres table on an already-running Pipeline job is to cancel the job and re-snapshot every captured table from scratch. With this option set to true, when the job is restored from a savepoint the source compares the table set saved in the savepoint against Postgres's current table set, picks up newly matching tables, snapshots only the new ones, and resumes the existing captured tables from their saved WAL offsets. No re-snapshot of existing tables, no source-side load spike.
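
The comparison on restore can be pictured as a set difference. The sketch below is a simplified model under stated assumptions, not the connector's code: the class NewTableDiff, its methods, the table names, and the use of a plain Java regex for the tables: pattern are all hypothetical. The real logic lives in SnapshotSplitAssigner.captureNewlyAddedTables().

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical model of the restore-time table comparison; not Flink CDC code. */
final class NewTableDiff {

    /** Returns the tables currently in the database that match the configured pattern. */
    public static Set<String> matching(List<String> currentTables, String tablesRegex) {
        Pattern pattern = Pattern.compile(tablesRegex);
        Set<String> out = new LinkedHashSet<>();
        for (String table : currentTables) {
            if (pattern.matcher(table).matches()) {
                out.add(table);
            }
        }
        return out;
    }

    /**
     * Returns the tables that need an initial snapshot on restore: those that
     * match now but were absent from the savepoint. With the option disabled
     * the result is always empty, so nothing new is captured.
     */
    public static Set<String> newlyAdded(
            Set<String> capturedAtSavepoint, Set<String> matchingNow, boolean enabled) {
        Set<String> out = new LinkedHashSet<>();
        if (!enabled) {
            return out;
        }
        out.addAll(matchingNow);
        out.removeAll(capturedAtSavepoint);
        return out;
    }

    public static void main(String[] args) {
        // Assumed example data: two tables were captured when the savepoint was taken.
        List<String> inDatabase =
                List.of("public.orders", "public.customers", "public.payments", "audit.log");
        Set<String> captured = Set.of("public.orders", "public.customers");
        Set<String> matchingNow = matching(inDatabase, "public\\..*");

        System.out.println(newlyAdded(captured, matchingNow, true));  // [public.payments]
        System.out.println(newlyAdded(captured, matchingNow, false)); // []
    }
}
```

In this model only public.payments would be snapshotted; public.orders and public.customers would continue from their saved offsets, and audit.log is ignored because it does not match the pattern.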

Default

false — preserves current behaviour for existing pipelines. Opt-in only.

Signed-off-by: Mehmet Can Şakiroğlu <cansakiroglu@gmail.com>

Copilot AI left a comment

Pull request overview

This PR exposes the scan.newly-added-table.enabled YAML option for the Postgres Pipeline connector by adding the missing ConfigOption and wiring it through PostgresDataSourceFactory into the underlying PostgresSourceBuilder.

Changes:

  • Added scan.newly-added-table.enabled to PostgresDataSourceOptions (default false).
  • Plumbed the option from PostgresDataSourceFactory configuration into PostgresSourceBuilder.scanNewlyAddedTableEnabled(...).
  • Registered the option in the factory’s optionalOptions() set so it is accepted by config validation.
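
The three changes listed above follow a common define, read, register shape. The following is a minimal self-contained sketch of that shape, assuming simplified stand-ins: BoolOption, SourceBuilderSketch, createBuilder, and the validation loop are hypothetical and do not reproduce the actual PostgresDataSourceOptions, PostgresDataSourceFactory, or PostgresSourceBuilder classes.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-ins that mirror the shape of the wiring; not the Flink CDC classes. */
final class OptionWiringSketch {

    /** Minimal boolean option: a key plus a default value. */
    static final class BoolOption {
        final String key;
        final boolean defaultValue;

        BoolOption(String key, boolean defaultValue) {
            this.key = key;
            this.defaultValue = defaultValue;
        }

        /** Reads the option from a flat config map, falling back to the default. */
        boolean readFrom(Map<String, String> conf) {
            String raw = conf.get(key);
            return raw == null ? defaultValue : Boolean.parseBoolean(raw.trim());
        }
    }

    // Step 1: declare the option with default false, so existing configs are unaffected.
    static final BoolOption SCAN_NEWLY_ADDED_TABLE_ENABLED =
            new BoolOption("scan.newly-added-table.enabled", false);

    /** Stand-in for the source builder; only the flag under discussion is modelled. */
    static final class SourceBuilderSketch {
        boolean newlyAddedEnabled;

        SourceBuilderSketch scanNewlyAddedTableEnabled(boolean enabled) {
            this.newlyAddedEnabled = enabled;
            return this;
        }
    }

    // Step 3: register the key so that (simplified) validation accepts it.
    public static Set<String> optionalOptions() {
        Set<String> options = new HashSet<>();
        options.add(SCAN_NEWLY_ADDED_TABLE_ENABLED.key);
        return options;
    }

    // Step 2: read the value from the config and pass it to the builder.
    public static SourceBuilderSketch createBuilder(Map<String, String> conf) {
        Set<String> allowed = optionalOptions();
        for (String key : conf.keySet()) {
            if (!allowed.contains(key)) {
                throw new IllegalArgumentException("Unsupported option: " + key);
            }
        }
        return new SourceBuilderSketch()
                .scanNewlyAddedTableEnabled(SCAN_NEWLY_ADDED_TABLE_ENABLED.readFrom(conf));
    }

    public static void main(String[] args) {
        // Option absent: falls back to the default.
        System.out.println(createBuilder(Map.of()).newlyAddedEnabled); // false
        // Option set explicitly: the value reaches the builder.
        System.out.println(
                createBuilder(Map.of("scan.newly-added-table.enabled", "true")).newlyAddedEnabled); // true
    }
}
```

Without the registration step, this simplified validation would reject a config that sets the key, which is why the third bullet matters even though the builder method already existed.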

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| .../source/PostgresDataSourceOptions.java | Defines the new scan.newly-added-table.enabled config option. |
| .../factory/PostgresDataSourceFactory.java | Reads the option from config, passes it into the source builder, and adds it to optional options. |
