This guide keeps the PostgreSQL-specific information that is useful to operators and readers evaluating the adapter. It intentionally does not restate the shared core algorithm contract that now lives in the source Javadocs.
Current supported slice:
- PostgreSQL source majors verified in CI: 14 through 18
- logical decoding plugin: `pgoutput`
- one PostgreSQL database per runtime request
- explicit-table publication contract
- logical replication slot reused across restarts
- metadata tables in schema `dblog_meta`
- captured user tables must use `REPLICA IDENTITY FULL`
The adapter is intentionally narrow rather than trying to support every PostgreSQL logical-replication feature.
The DBLog paper describes wal2json; this implementation uses pgoutput
instead. That is a modernization, not a semantic change.
Reasons:
- `pgoutput` ships with every supported PostgreSQL release
- no external shared library or extra install step is required
- `pgoutput` is a standard PostgreSQL logical-replication output plugin
- pgJDBC can decode the binary protocol directly
- the DBLog watermark algorithm only needs committed-order change streams, and both plugins provide that
The adapter is usable only when all of the following hold:
- logical decoding is available on the source
- every captured table uses `REPLICA IDENTITY FULL`
- the live publication contains the captured user tables and `dblog_meta.watermarks`
- on the standard DBLog-managed path, the publication also contains `dblog_meta.heartbeats`
- the logical slot uses plugin `pgoutput`
- publication, slot, and captured tables all belong to the same database
- the runtime role has the privileges and ownership needed for the configured publication and slot ownership modes
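Some of these preconditions can be checked by hand before the first run. The queries below are a minimal sketch using standard PostgreSQL settings and catalogs. They only cover server-level availability of logical decoding, and DBLog's own preflight remains the authority on what the adapter accepts.

```sql
-- Logical decoding requires wal_level = logical (changing it needs a server restart).
SHOW wal_level;

-- The source needs a free replication slot and a free WAL sender for the DBLog stream.
SHOW max_replication_slots;
SHOW max_wal_senders;

-- Slots already defined on this server, for comparison against max_replication_slots.
SELECT count(*) AS slots_in_use
FROM pg_replication_slots;
```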
The shared runtime SQL connection is expected to be:
- open
- `autoCommit=true`
- `READ_COMMITTED`
DBLog ships no typed TLS support for the PostgreSQL adapter. Both the regular SQL connection (chunk reads, watermark writes, schema inspection, preflight) and the logical-replication connection (CDC stream via pgoutput) are opened with the operator's dblog.source.postgres.jdbc-url and dblog.source.postgres.replication-jdbc-url verbatim. pgJDBC's TLS parameters work because the driver parses them — DBLog itself does nothing.
If you need TLS, configure it through the JDBC URL:
```properties
dblog.source.postgres.jdbc-url=jdbc:postgresql://host:5432/app?sslmode=verify-full&sslrootcert=/path/to/ca.crt
dblog.source.postgres.replication-jdbc-url=jdbc:postgresql://host:5432/app?sslmode=verify-full&sslrootcert=/path/to/ca.crt
```
Both URLs must carry the TLS parameters independently; DBLog does not propagate them between connections. mTLS (`sslcert`/`sslkey`) works the same way.
This is a deliberate scope choice for a reference implementation. The shipped Docker fixtures do not enable TLS and the compatibilityMatrix lane runs plaintext.
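Because the two connections are configured separately, it can be worth confirming from the server side that both are actually encrypted. The query below is a sketch that joins the standard `pg_stat_activity` and `pg_stat_ssl` views. It assumes the runtime role is named `dblog` as in the bundled fixture. It only shows that a session is encrypted; certificate verification under `sslmode=verify-full` happens in the client and is not visible here.

```sql
-- Run as the DBLog runtime role, or as a role that can read other sessions' statistics.
SELECT a.pid,
       a.backend_type,      -- 'walsender' for the logical-replication connection
       a.application_name,
       s.ssl,               -- true when the session is encrypted
       s.version,           -- negotiated TLS protocol version
       s.cipher
FROM pg_stat_activity AS a
JOIN pg_stat_ssl AS s USING (pid)
WHERE a.usename = 'dblog';  -- assumption: runtime role name
```

If the regular SQL session reports `ssl = true` and the `walsender` row reports `false`, the replication URL is most likely missing its TLS parameters.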
Run DBLog with a dedicated PostgreSQL role, not the postgres superuser. The
bundled fixture role is named dblog, but the name is not special.
For the default DBLOG_MANAGED publication and slot ownership mode, the runtime
role must:
- have `LOGIN` and `REPLICATION`
- connect to the configured database
- have `USAGE` and `CREATE` on schema `dblog_meta`
- be able to create, insert, and update `dblog_meta.watermarks` and `dblog_meta.heartbeats`
- own every captured user table that DBLog may add to the publication
- own the DBLog-managed publication after creation
If the dblog_meta schema does not yet exist, the runtime role additionally
needs CREATE on the database so it can create the schema on first run. The
recommended setup is to pre-create the schema with an admin connection and
grant USAGE, CREATE on it to the runtime role, which is what the bundled
fixture does via
CREATE SCHEMA IF NOT EXISTS dblog_meta AUTHORIZATION dblog in
ops/docker/postgres/init/01-create-users.sql.
The table-ownership rule is PostgreSQL's publication rule: the role that creates
or alters an explicit-table publication must own the tables in that publication.
If DBLog runs as a role that only has table grants, startup can fail with
must be owner of table ... when it creates or repairs the publication.
Administrator connections may still perform one-time setup, fixture reset, slot cleanup, and emergency repair out of band. They are not required for the normal DBLog runtime path.
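The role requirements above could be provisioned with a one-time administrator script along these lines. This is a sketch, not the bundled fixture script: the database name `app` is taken from the example JDBC URL, and the password and the table `public.orders` are placeholders.

```sql
-- Run once as an administrator. Names marked "assumption" are placeholders.
CREATE ROLE dblog WITH LOGIN REPLICATION PASSWORD 'change-me';  -- assumption: password auth
GRANT CONNECT ON DATABASE app TO dblog;                         -- assumption: database "app"

-- PostgreSQL requires CREATE on the database to run CREATE PUBLICATION,
-- so a DBLog-managed publication likely needs this grant as well.
GRANT CREATE ON DATABASE app TO dblog;

-- Pre-create the metadata schema owned by the runtime role, as the bundled fixture does.
CREATE SCHEMA IF NOT EXISTS dblog_meta AUTHORIZATION dblog;

-- Publication membership requires table ownership, not just table grants.
ALTER TABLE public.orders OWNER TO dblog;                       -- assumption: captured table
```

Making the runtime role the schema owner through `AUTHORIZATION` gives it `USAGE` and `CREATE` on `dblog_meta` without separate grants.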
The PostgreSQL publication contract is intentionally explicit and narrow.
The publication must:
- be an exact explicit table set
- not use `FOR ALL TABLES`
- not use `TABLES IN SCHEMA`
- publish `INSERT`, `UPDATE`, and `DELETE`
- not publish `TRUNCATE`
- not use row filters
- not use column lists
- not enable `publish_via_partition_root`
DBLog-managed publications may be created or repaired inside that narrow model. If an existing publication is already broad or schema-scoped, the adapter fails closed rather than mutating it in place.
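To see whether an existing publication fits this contract before pointing DBLog at it, the system catalogs can be inspected directly. The queries below are a sketch; the publication name `dblog_pub` is an assumption, and the adapter's own checks may be stricter than what is shown here.

```sql
-- Publication-level flags. Expected: puballtables = f, pubinsert/pubupdate/pubdelete = t,
-- pubtruncate = f, pubviaroot = f.
SELECT pubname, puballtables, pubinsert, pubupdate, pubdelete, pubtruncate, pubviaroot
FROM pg_publication
WHERE pubname = 'dblog_pub';           -- assumption: publication name

-- Effective membership: should be exactly the captured tables plus the dblog_meta tables.
SELECT schemaname, tablename
FROM pg_publication_tables
WHERE pubname = 'dblog_pub'
ORDER BY schemaname, tablename;

-- PostgreSQL 15+ only: row filters and column lists per published table.
SELECT r.prrelid::regclass   AS table_name,
       r.prqual  IS NOT NULL AS has_row_filter,
       r.prattrs IS NOT NULL AS has_column_list
FROM pg_publication_rel AS r
JOIN pg_publication AS p ON p.oid = r.prpubid
WHERE p.pubname = 'dblog_pub';

-- PostgreSQL 15+ only: schema-scoped membership (should return no rows).
SELECT n.pnnspid::regnamespace AS published_schema
FROM pg_publication_namespace AS n
JOIN pg_publication AS p ON p.oid = n.pnpubid
WHERE p.pubname = 'dblog_pub';
```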
The PostgreSQL slot contract is also intentionally narrow.
The slot must:
- be a logical slot
- belong to the configured database
- use plugin `pgoutput`
- match the configured temporary / two-phase / failover flags
- be usable (`wal_status` not `lost`, blank invalidation reason)
- not already be active
A missing slot is created only when ownership is DBLog-managed.
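The slot attributes named above map onto columns of `pg_replication_slots`. The sketch below uses only columns available on PostgreSQL 14 and later; the slot name `dblog_slot` is an assumption. When slot ownership is not DBLog-managed, an operator would presumably create the slot ahead of time, which the second statement illustrates.

```sql
-- Inspect an existing slot against the contract.
SELECT slot_name,
       slot_type,    -- expect 'logical'
       database,     -- expect the configured database
       plugin,       -- expect 'pgoutput'
       temporary,
       two_phase,
       active,       -- expect false before DBLog connects
       wal_status    -- must not be 'lost'
FROM pg_replication_slots
WHERE slot_name = 'dblog_slot';   -- assumption: slot name

-- Create a persistent pgoutput slot by hand (run while connected to the captured database).
SELECT * FROM pg_create_logical_replication_slot('dblog_slot', 'pgoutput');
```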
The adapter uses two singleton metadata tables in schema dblog_meta:
- `watermarks`
- `heartbeats`
The publication must include watermarks, and the standard DBLog-managed path
also includes heartbeats so same-stream interference detection can see them
on the logical stream. Malformed metadata events and same-stream heartbeat
interference fail closed. When the state store is available, these paths record
a schema-uncertainty signal before failing closed.
dblog_meta.watermarks is a singleton control table, not a watermark history
table. DBLog updates the single id = 1 row for each LOW/HIGH watermark, so
SELECT * FROM dblog_meta.watermarks shows only the latest token. Use the tap
stream or the PostgreSQL logical stream if you need a chronological watermark
trace.
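The singleton pattern can be illustrated with a stand-in table. Everything in this sketch is hypothetical: the table name `demo_watermarks` and the `kind` and `token` columns are invented for illustration, and the real `dblog_meta.watermarks` definition may differ. It shows only why a plain `SELECT` returns the latest token and never a history.

```sql
-- Hypothetical stand-in; not the real dblog_meta.watermarks definition.
CREATE TEMP TABLE demo_watermarks (
    id    integer PRIMARY KEY CHECK (id = 1),  -- singleton: only id = 1 can exist
    kind  text NOT NULL,                       -- 'LOW' or 'HIGH'
    token uuid NOT NULL
);

-- Each watermark write overwrites the same row.
INSERT INTO demo_watermarks VALUES (1, 'LOW', gen_random_uuid())
ON CONFLICT (id) DO UPDATE SET kind = EXCLUDED.kind, token = EXCLUDED.token;

INSERT INTO demo_watermarks VALUES (1, 'HIGH', gen_random_uuid())
ON CONFLICT (id) DO UPDATE SET kind = EXCLUDED.kind, token = EXCLUDED.token;

-- Returns a single row holding the HIGH watermark; the LOW token is gone from the table.
SELECT * FROM demo_watermarks;
```

Both writes still appear as separate changes on the logical stream, which is why the stream is the place to look for a chronological trace.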
DBLog checks REPLICA IDENTITY FULL before opening the replication stream.
That gate runs up front on purpose: deferring it until the first relation or
row event would allow the runtime to open the stream and risk leaking partial
results before rejecting the table shape.
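The same condition can be checked and corrected by hand before starting DBLog. This is a sketch against the standard `pg_class` catalog; the table names are placeholders for the captured tables.

```sql
-- relreplident: 'f' = FULL, 'd' = DEFAULT (primary key), 'i' = USING INDEX, 'n' = NOTHING.
-- Any row returned here is a table that would not pass a REPLICA IDENTITY FULL check.
SELECT c.oid::regclass AS table_name, c.relreplident
FROM pg_class AS c
WHERE c.oid IN ('public.orders'::regclass, 'public.customers'::regclass)  -- assumption: captured tables
  AND c.relreplident <> 'f';

-- Fix a table that is not yet FULL (requires table ownership).
ALTER TABLE public.orders REPLICA IDENTITY FULL;
```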
- one committed transaction is buffered in memory before being handed to the sink as a batch
- watermark and heartbeat writes happen on the same source-side SQL path as chunk reads
- targeted repair reuses the same watermark-window machinery as ordinary table dumps
- replication-slot feedback failure does not lose DBLog's durable local checkpoint, but it can cause WAL retention to grow on the server until the slot advances again
For heap sizing guidance around large committed transactions, see
docs/OPERATION.md §2.4.2.
Logical replication slots are persistent server objects. A slot whose consumer disappears retains WAL on the source until the slot advances or is dropped. DBLog behaves like Debezium here: it creates a missing slot when DBLog-managed, reuses it across restarts, and never drops it automatically.
Operators are responsible for slot hygiene.
Consequences:
- decommissioning DBLog requires explicit slot cleanup
- recreating local fixtures without clearing old slot/state can leave the slot pointing at WAL that no longer exists
- two DBLog deployments pointed at the same slot name will fail closed when the second finds the slot already active
Recommended safety net:
- set `max_slot_wal_keep_size` on PostgreSQL 13+ so an abandoned slot cannot retain unbounded WAL forever
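One way to apply that setting is sketched below. The `10GB` value is purely illustrative and should be sized against available disk and expected DBLog downtime. `ALTER SYSTEM` requires superuser privileges, so this belongs on an administrator connection rather than the DBLog runtime role.

```sql
-- Cap the WAL any replication slot may retain (the default of -1 means unlimited).
ALTER SYSTEM SET max_slot_wal_keep_size = '10GB';  -- illustrative value
SELECT pg_reload_conf();                           -- the setting is reloadable, no restart needed
SHOW max_slot_wal_keep_size;

-- How close each slot is to the cap: safe_wal_size is the number of bytes that can
-- still be written before the slot is at risk of losing required WAL.
SELECT slot_name, wal_status, safe_wal_size
FROM pg_replication_slots;
```

The trade-off is that a slot which exceeds the cap is invalidated (`wal_status` becomes `lost`), and the slot contract above then rejects it.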
Useful checks:
```sql
-- invalidation_reason exists on PostgreSQL 17+; omit that column on older majors.
SELECT slot_name, database, active, restart_lsn, confirmed_flush_lsn, wal_status, invalidation_reason
FROM pg_replication_slots
WHERE plugin = 'pgoutput';
```
```sql
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS restart_lsn_lag_bytes,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS confirmed_flush_lsn_lag_bytes
FROM pg_replication_slots
WHERE plugin = 'pgoutput';
```
Decommissioning cleanup:
```sql
SELECT pg_drop_replication_slot('<slot_name>');
```
Do not drop a slot while a DBLog process is still connected.
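A slightly more defensive cleanup checks for an attached consumer first. This is a sketch; the slot name `dblog_slot` is an assumption. PostgreSQL refuses to drop an active slot in any case, so the `NOT active` filter mainly turns that error into a no-op.

```sql
-- See whether a consumer is still attached, and which backend holds the slot.
SELECT slot_name, active, active_pid
FROM pg_replication_slots
WHERE slot_name = 'dblog_slot';   -- assumption: slot name

-- Drop the slot only if nothing is connected to it; returns no rows otherwise.
SELECT pg_drop_replication_slot(slot_name)
FROM pg_replication_slots
WHERE slot_name = 'dblog_slot'
  AND NOT active;
```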
The PostgreSQL adapter does not ship a schema-history subsystem and does not attempt broad DDL replay.
Current policy:
- additive schema changes may be reconciled if they preserve the configured selected-column contract
- non-additive or continuity-breaking changes require a fresh dump
- unsupported or malformed `pgoutput` shapes fail closed
- captured-table `TRUNCATE` messages, runtime replica-identity drift, selected-column relation metadata drift, key-only or missing old tuples for captured-table `UPDATE`/`DELETE` messages, and live primary-key updates record a full-dump-required signal and fail closed
The live pgoutput decoder maps tuple values by selected column name, so an
added unselected relation column can be ignored mechanically. Relation messages
also carry type OIDs, but this runtime does not use live relation OIDs as a
selected-column type-drift detector. Startup/restart schema inspection remains
the supported point that catches selected type, source-type, and nullability
drift. Operators should not rely on online PostgreSQL schema evolution as a
feature; stop DBLog, change the source, verify startup reconciliation, and submit
a fresh dump when the selected contract changed or correctness is uncertain.
This is a deliberate tradeoff: smaller implementation, no schema-history dependency, and explicit operator action when schema continuity becomes uncertain.
The current PostgreSQL adapter intentionally does not do the following:
- support `FOR ALL TABLES` or schema-scoped publications
- clean up logical slots automatically
- replay broad DDL history
- broaden into a general-purpose PostgreSQL CDC platform
If you need the implementation entry points rather than the operator guide, start here:
- `adapter/postgres/internal/PostgresSourcePreflight.java`
- `adapter/postgres/internal/JdbcPostgresPublicationManager.java`
- `adapter/postgres/internal/JdbcPostgresReplicationSlotManager.java`
- `adapter/postgres/internal/PostgresTransactionStreamingSession.java`
- `adapter/postgres/PostgresLiveStreamingRuntime.java`