
Fix/iceberg drop tables #16

Open
hentzthename wants to merge 2 commits into sidequery:main from hentzthename:fix/iceberg-drop-tables

Conversation

@hentzthename (Contributor)

Hi Nico, another small one that fell out of running pipeline.extract(source, refresh="drop_resources") after pipeline.sync_destination() against a Nessie deployment.

The load layer emitted:

Client for iceberg_rest does not implement drop table.
Following tables {'x', 'y'} will not be dropped

…and silently skipped the drops, so stale tables stuck around across refreshes.

Problem

dlt core gates the per-table drop path on hasattr(job_client, "drop_tables") (dlt/load/utils.py:170). IcebergRestClient only exposed drop_storage() (a full namespace wipe) -- no drop_tables(*names, delete_schema=True) -- so the load layer fell back to the warn-and-skip branch. Net effect:

  • refresh="drop_resources" / refresh="drop_sources" were effectively no-ops on this destination.
  • pipeline.sql_client().drop_dataset() had no coherent per-table partner (dataset-level works via the base-class DROP SCHEMA CASCADE).
  • Consumers had to reach around the destination with pyiceberg directly for destructive ops.
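The warn-and-skip behavior above can be modeled with a minimal, hypothetical sketch (class and function names here are invented stand-ins, not dlt's actual code in dlt/load/utils.py):

```python
# Hypothetical mini-model of dlt core's gate: a client without drop_tables
# falls into the warn-and-skip branch, so refresh drops become no-ops.

class IcebergRestClientBefore:
    """Stand-in for the old client: only a full namespace wipe."""
    def drop_storage(self):
        pass

class IcebergRestClientAfter:
    """Stand-in for the patched client: implements the per-table contract."""
    def __init__(self):
        self.dropped = []
    def drop_tables(self, *names, delete_schema=False):
        self.dropped.extend(names)

def drop_resources(job_client, table_names):
    # dlt core gates the per-table drop path on this hasattr check
    if not hasattr(job_client, "drop_tables"):
        print(f"Client does not implement drop table. "
              f"Following tables {set(table_names)} will not be dropped")
        return False
    job_client.drop_tables(*table_names, delete_schema=True)
    return True

assert drop_resources(IcebergRestClientBefore(), ["x", "y"]) is False  # silent skip
client = IcebergRestClientAfter()
assert drop_resources(client, ["x", "y"]) is True and client.dropped == ["x", "y"]
```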

Solution

Implement the JobClient.drop_tables contract on IcebergRestClient:

  • Drop each named table via the PyIceberg catalog, swallowing NoSuchTableError so the call is idempotent (dlt may pass tables that were never physically created).
  • When delete_schema=True, remove all _dlt_version rows where schema_name = self.schema.name via table.delete(EqualTo(...)), matching the SqlJobClientBase.drop_tables contract.

One deviation worth calling out: the obvious move would be self._delete_schema_in_storage(self.schema), but that method lives on SqlJobClientBase (not JobClientBase) and uses self.sql_client.execute_sql(...). IcebergRestClient extends JobClientBase directly, and its sql_client is a DuckDB view provider rather than a real DDL-capable client -- so the DELETE is issued via PyIceberg's row-delete API instead, reusing the pattern already at destination_client.py:1151-1153.
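The shape of the new method can be sketched with fake stand-ins for the PyIceberg catalog and EqualTo predicate so the control flow is runnable in isolation; everything except the drop_tables/delete_schema/NoSuchTableError/EqualTo names is invented for illustration:

```python
class NoSuchTableError(Exception):
    """Stand-in for pyiceberg.exceptions.NoSuchTableError."""

class EqualTo:
    """Stand-in for pyiceberg.expressions.EqualTo(field, value)."""
    def __init__(self, field, value):
        self.field, self.value = field, value

class FakeCatalog:
    def __init__(self, tables):
        self.tables = set(tables)
    def drop_table(self, identifier):
        if identifier not in self.tables:
            raise NoSuchTableError(identifier)
        self.tables.discard(identifier)

class FakeVersionTable:
    """Stand-in for the _dlt_version Iceberg table."""
    def __init__(self, rows):
        self.rows = rows
    def delete(self, predicate):
        self.rows = [r for r in self.rows if r.get(predicate.field) != predicate.value]

def drop_tables(catalog, version_table, schema_name, *table_names, delete_schema=False):
    for name in table_names:
        try:
            catalog.drop_table(name)
        except NoSuchTableError:
            # idempotent: dlt may pass tables that were never physically created
            pass
    if delete_schema:
        # row-level delete via the PyIceberg API instead of SQL DELETE,
        # since this client's sql_client is only a DuckDB view provider
        version_table.delete(EqualTo("schema_name", schema_name))

cat = FakeCatalog({"x", "y"})
ver = FakeVersionTable([{"schema_name": "s1"}, {"schema_name": "s2"}])
drop_tables(cat, ver, "s1", "x", "never_created", delete_schema=True)
assert cat.tables == {"y"}
assert ver.rows == [{"schema_name": "s2"}]
```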


Changes

destination_client.py

| Symbol | Change |
| --- | --- |
| IcebergRestClient.drop_tables (new) | Drops each named table via catalog.drop_table(...); NoSuchTableError is swallowed. When delete_schema=True, deletes _dlt_version rows for self.schema.name via version_table.delete(EqualTo("schema_name", ...)). |

No changes to sql_client.py or schema_evolution.py.


Tests

  • New test_drop_tables.py covering:
    • the hasattr gate (method is actually exposed on the class)
    • selective drop of named tables only
    • idempotent behavior on missing tables
    • delete_schema=True clears _dlt_version rows for the current schema
    • pipeline.run(..., refresh="drop_resources") end-to-end (the originally reported symptom)
  • Full suite: 175/175 passing locally (SQLite + Nessie + Polaris + Lakekeeper).
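The idempotence and hasattr-gate cases from the list above can be sketched as a test against any client exposing the contract (FakeClient here is invented for illustration, not the test suite's actual fixture):

```python
class FakeClient:
    """Minimal stand-in for a destination client implementing the contract."""
    def __init__(self, tables):
        self.tables = set(tables)
    def drop_tables(self, *names, delete_schema=False):
        for name in names:
            self.tables.discard(name)  # missing names are ignored -> idempotent

def test_drop_tables_idempotent():
    client = FakeClient({"x"})
    assert hasattr(client, "drop_tables")   # the dlt core gate
    client.drop_tables("x", "never_created")  # must not raise
    client.drop_tables("x")                   # second call is a no-op
    assert client.tables == set()

test_drop_tables_idempotent()
```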

Commit messages
dlt core gates refresh="drop_resources" / refresh="drop_sources" on
hasattr(job_client, "drop_tables") (dlt/load/utils.py). Without that
method the load layer warns and silently skips the drops, which is
what triggered the "Client for iceberg_rest does not implement drop
table" message in the field.

Tests cover:
- method existence (the hasattr gate)
- selective drop of named tables only
- idempotent behavior for missing tables
- delete_schema=True clears _dlt_version rows for the current schema
- refresh="drop_resources" end-to-end

IcebergRestClient only exposed drop_storage() (full namespace wipe).
dlt core's refresh="drop_resources" / refresh="drop_sources" path calls
job_client.drop_tables(*names, delete_schema=True); without it, drops
are warned-and-skipped, leaving stale tables in the destination.

Drops each named table via the PyIceberg catalog (swallowing
NoSuchTableError for idempotence). When delete_schema=True, wipes
_dlt_version rows for self.schema.name via table.delete(EqualTo(...)),
matching the SqlJobClientBase.drop_tables contract. Inherited
_delete_schema_in_storage isn't used because IcebergRestClient extends
JobClientBase directly, not SqlJobClientBase, and would need a real SQL
client to run the DELETE statement.