Commit e3ba0d2

feat: add skip_merge_on_empty_source incremental config
Adds an opt-in incremental config that bypasses MERGE and all associated metadata queries (DESCRIBE, SHOW TBLPROPERTIES, constraint/tag/mask lookups) when the compiled source SELECT returns zero rows.

Motivating case: customers who run `dbt run` on a schedule against source tables that receive deltas sporadically. Today, each incremental model still pays ~4-7s per run on temp view creation + metadata queries + MERGE planning even when there is nothing to merge. With `skip_merge_on_empty_source: true`, the materialization runs a cheap `SELECT 1 FROM (<compiled>) LIMIT 1` probe and, if empty, returns early after firing pre/post hooks and a no-op `main` statement.

Scope:
- V1 (`use_materialization_v2: false`) and V2 paths both honor the flag
- Default is `false` (opt-in, no behavior change for existing projects)
- SQL language only (Python models fall through to the standard path)

Files:
- `dbt/adapters/databricks/impl.py`: new `skip_merge_on_empty_source` field on `DatabricksConfig`
- `dbt/include/databricks/macros/materializations/incremental/incremental.sql`: two helper macros (`source_has_rows`, `should_skip_merge_on_empty_source`) and short-circuit calls in the V1/V2 merge branches
- `tests/functional/adapter/incremental/test_incremental_skip_on_empty_source.py`: functional tests covering the short-circuit path and the default-off behavior under both V1 and V2
- `CHANGELOG.md`: Features entry

Co-authored-by: Isaac
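
The probe described in the commit message can be illustrated with a minimal sketch. This is not the adapter's code: it assumes Python's built-in `sqlite3` as a stand-in for a Databricks connection, and the `target`/`src` tables, the `probe_src` alias, and the `delta_sql` string are hypothetical names chosen for illustration. It only shows that wrapping a SELECT in `select 1 from (...) limit 1` returns a row exactly when the wrapped query is non-empty.

```python
import sqlite3


def source_has_rows(conn: sqlite3.Connection, compiled_sql: str) -> bool:
    """Return True if the wrapped SELECT yields at least one row."""
    # Same shape as the probe in the commit: select 1 from (<compiled>) limit 1
    probe = f"select 1 from ({compiled_sql}) as probe_src limit 1"
    return conn.execute(probe).fetchone() is not None


# In-memory stand-in for the warehouse (assumption: sqlite3, not Databricks).
conn = sqlite3.connect(":memory:")
rows = [(1, "hello"), (2, "goodbye")]
conn.execute("create table target (id integer, msg text)")
conn.executemany("insert into target values (?, ?)", rows)
conn.execute("create table src (id integer, msg text)")
conn.executemany("insert into src values (?, ?)", rows)

# Delta filter: only rows newer than what the target already holds.
delta_sql = "select id, msg from src where id > (select max(id) from target)"

# No new rows yet, so the probe comes back empty and a MERGE could be skipped.
empty_before = not source_has_rows(conn, delta_sql)

# Once a new row arrives, the probe finds it and the normal path would run.
conn.execute("insert into src values (3, 'new')")
has_delta = source_has_rows(conn, delta_sql)
```

Because of the `limit 1`, the probe asks for at most one row; how cheap that is in practice depends on how the engine plans the wrapped query.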
1 parent 86d26c9 commit e3ba0d2

4 files changed

Lines changed: 148 additions & 2 deletions

CHANGELOG.md

Lines changed: 6 additions & 1 deletion
@@ -1,3 +1,9 @@
+## dbt-databricks 1.13.0 (TBD)
+
+### Features
+
+- Add `skip_merge_on_empty_source` incremental config to bypass MERGE and associated metadata queries when the compiled source SELECT returns no rows, significantly reducing no-op incremental run time. ([#1410](https://github.com/databricks/dbt-databricks/pull/1410))
+
 ## dbt-databricks 1.12.1 (June 10, 2026)
 
 ### Features
@@ -69,7 +75,6 @@
 - Warn when `contract.enforced: true` is set on a `materialized_view` model ([#1279](https://github.com/databricks/dbt-databricks/issues/1279))
 - Fix `materialized_view` models with `databricks_tags` silently going stale on `dbt run`. `MaterializedViewAPI._describe_relation` was not fetching `information_schema.tags`, so existing tags always parsed as empty, producing a spurious tag diff that routed the materialization to `ALTER ... SET TAGS` instead of `REFRESH MATERIALIZED VIEW` ([#1419](https://github.com/databricks/dbt-databricks/issues/1419))
 - Fix `dbt docs generate` failing with `RuntimeError: Tables contain columns with the same names ... but different types` during catalog merge across schemas ([#1392](https://github.com/databricks/dbt-databricks/issues/1392))
-
 ## dbt-databricks 1.11.7 (Apr 17, 2026)
 
 ### Features

dbt/adapters/databricks/impl.py

Lines changed: 1 addition & 0 deletions
@@ -218,6 +218,7 @@ class DatabricksConfig(AdapterConfig):
     use_safer_relation_operations: Optional[bool] = None
     incremental_apply_config_changes: Optional[bool] = None
     view_update_via_alter: Optional[bool] = None
+    skip_merge_on_empty_source: Optional[bool] = None
 
 
 def get_identifier_list_string(table_names: set[str]) -> str:

dbt/include/databricks/macros/materializations/incremental/incremental.sql

Lines changed: 57 additions & 1 deletion
@@ -53,6 +53,16 @@
       {% endif %}
     {%- else -%}
       {{ log("Existing relation found, proceeding with incremental work")}}
+      {#-- Short-circuit when `skip_merge_on_empty_source=true` and the source is empty.
+           This must come after the intermediate relation was created (so pre-hooks and
+           any side-effect SQL inside `compiled_code` still execute) but before we pay
+           for schema/config metadata queries, strategy planning, and the MERGE itself.
+           The `run_query` below issues `SELECT 1 FROM (<compiled_code>) LIMIT 1`; when
+           the source SELECT has no rows, we skip the remainder and return. --#}
+      {%- if should_skip_merge_on_empty_source(target_relation, existing_relation, compiled_code, grant_config, full_refresh_mode) -%}
+        {{ run_post_hooks() }}
+        {{ return({'relations': [target_relation]}) }}
+      {%- endif -%}
       {#-- Set Overwrite Mode to DYNAMIC for subsequent incremental operations --#}
       {%- if incremental_strategy == 'insert_overwrite' and partition_by -%}
         {{ set_overwrite_mode('DYNAMIC') }}
@@ -128,6 +138,13 @@
       {% do apply_tags(target_relation, tags) %}
       {% do persist_docs(target_relation, model, for_relation=language=='python') %}
     {%- else -%}
+      {#-- Short-circuit when `skip_merge_on_empty_source=true` and the source is empty.
+           Placed before `get_relation_config` / `create_temp_relation` so we skip all
+           downstream metadata queries and the MERGE itself when there are no deltas. --#}
+      {%- if should_skip_merge_on_empty_source(target_relation, existing_relation, compiled_code, grant_config, full_refresh_mode) -%}
+        {{ run_hooks(post_hooks) }}
+        {{ return({'relations': [target_relation]}) }}
+      {%- endif -%}
       {#-- Set Overwrite Mode to DYNAMIC for subsequent incremental operations --#}
       {%- if incremental_strategy == 'insert_overwrite' and partition_by -%}
         {{ set_overwrite_mode('DYNAMIC') }}
@@ -246,4 +263,43 @@
     {%- set configuration_changes = model_config.get_changeset(existing_config) -%}
     {{ apply_config_changeset(target_relation, model, configuration_changes) }}
   {% endif %}
-{% endmacro %}
+{% endmacro %}
+
+{#-- Returns true iff the compiled source SELECT produces at least one row.
+     Used by the `skip_merge_on_empty_source` incremental config to avoid
+     unnecessary MERGE / temp view / metadata queries when the delta is empty. --#}
+{% macro source_has_rows(compiled_code) %}
+  {%- set check_sql -%}
+    select 1 from ({{ compiled_code }}) as __dbt_empty_source_check limit 1
+  {%- endset -%}
+  {%- set result = run_query(check_sql) -%}
+  {{ return(result is not none and (result | length) > 0) }}
+{% endmacro %}
+
+{#-- Short-circuit helper: if `skip_merge_on_empty_source` is true and the
+     compiled source SELECT is empty, perform the minimal work required by dbt
+     (pre/post hooks + a no-op `main` statement + grants) and return early.
+
+     Returns true if the materialization should short-circuit (caller should
+     then `{{ return({'relations': [target_relation]}) }}`), false otherwise. --#}
+{% macro should_skip_merge_on_empty_source(target_relation, existing_relation, compiled_code, grant_config, full_refresh_mode) %}
+  {%- set skip_flag = config.get('skip_merge_on_empty_source', False) | as_bool -%}
+  {%- if not skip_flag -%}
+    {{ return(false) }}
+  {%- endif -%}
+  {%- if not execute -%}
+    {{ return(false) }}
+  {%- endif -%}
+  {%- if model['language'] != 'sql' -%}
+    {{ return(false) }}
+  {%- endif -%}
+  {%- if source_has_rows(compiled_code) -%}
+    {{ return(false) }}
+  {%- endif -%}
+  {{ log("[skip_merge_on_empty_source] " ~ target_relation ~ ": empty source, skipping MERGE", info=True) }}
+  {%- call statement('main') -%}
+    select 1 as __dbt_skip_merge_noop where false
+  {%- endcall -%}
+  {% do apply_grants(target_relation, grant_config, should_revoke(existing_relation, full_refresh_mode)) %}
+  {{ return(true) }}
+{% endmacro %}
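
The guard order in `should_skip_merge_on_empty_source` can be read as: check the cheap conditions first and issue the probe query last. A rough Python rendering of that order follows. It is a sketch only; `should_skip_merge`, `empty_source_probe`, and `probe_calls` are hypothetical names, and the probe is passed in as a callable so the example can show that it is never invoked when an earlier guard already rules out the short-circuit.

```python
from typing import Callable, List


def should_skip_merge(
    skip_flag: bool,
    execute: bool,
    language: str,
    probe: Callable[[], bool],
) -> bool:
    """Mirror the macro's guard order: cheap checks first, probe last."""
    if not skip_flag:
        return False  # opt-in only; the default path is unchanged
    if not execute:
        return False  # parse phase: no queries should run
    if language != "sql":
        return False  # Python models fall through to the standard path
    return not probe()  # skip only when the source has no rows


probe_calls: List[str] = []


def empty_source_probe() -> bool:
    """Hypothetical probe that records each call and reports no rows."""
    probe_calls.append("probe")
    return False


# Earlier guards return before the probe is ever issued.
flag_off = should_skip_merge(False, True, "sql", empty_source_probe)
python_model = should_skip_merge(True, True, "python", empty_source_probe)
calls_before_probe = len(probe_calls)

# With the flag on and a SQL model, the probe runs and the merge is skipped.
empty_sql = should_skip_merge(True, True, "sql", empty_source_probe)
```

Ordering the guards this way means projects that leave the flag at its default never pay for the extra probe query.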

tests/functional/adapter/incremental/test_incremental_skip_on_empty_source.py

Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
+import pytest
+from dbt.tests.util import check_relations_equal, run_dbt
+
+from tests.functional.adapter.fixtures import MaterializationV2Mixin
+
+_MODEL_SQL = """
+{{ config(
+    materialized='incremental',
+    unique_key='id',
+    skip_merge_on_empty_source=true,
+) }}
+
+{% if not is_incremental() %}
+
+select cast(1 as bigint) as id, 'hello' as msg
+union all
+select cast(2 as bigint) as id, 'goodbye' as msg
+
+{% else %}
+
+-- Delta filter: only rows with id greater than existing max (=> empty on 2nd run)
+select cast(id as bigint) as id, msg from (
+    select 1 as id, 'hello' as msg
+    union all
+    select 2 as id, 'goodbye' as msg
+) src
+where id > (select max(id) from {{ this }})
+
+{% endif %}
+"""
+
+_SEED_AFTER_FIRST_RUN = """id,msg
+1,hello
+2,goodbye
+"""
+
+
+class TestSkipMergeOnEmptySource:
+    @pytest.fixture(scope="class")
+    def models(self):
+        return {"skip_merge_model.sql": _MODEL_SQL}
+
+    @pytest.fixture(scope="class")
+    def seeds(self):
+        return {"expected.csv": _SEED_AFTER_FIRST_RUN}
+
+    def test_skip_merge_when_source_empty(self, project):
+        # 1st run: seeds target with 2 rows
+        results = run_dbt(["seed"])
+        assert len(results) == 1
+        results = run_dbt(["run"])
+        assert len(results) == 1
+
+        # 2nd run: incremental with empty delta -> short-circuit should trigger
+        results = run_dbt(["run"])
+        assert len(results) == 1
+        # Data must be unchanged (no MERGE happened, table same as after 1st run)
+        check_relations_equal(project.adapter, ["skip_merge_model", "expected"])
+
+
+class TestSkipMergeOnEmptySourceV2(MaterializationV2Mixin, TestSkipMergeOnEmptySource):
+    """Same behavior under V2 materialization path."""
+
+
+class TestSkipMergeDefaultDisabled:
+    """When `skip_merge_on_empty_source` is not set, behavior is unchanged
+    (MERGE runs as before, even if source is empty)."""
+
+    @pytest.fixture(scope="class")
+    def models(self):
+        # Same model but WITHOUT the skip flag
+        return {"default_model.sql": _MODEL_SQL.replace("skip_merge_on_empty_source=true,", "")}
+
+    @pytest.fixture(scope="class")
+    def seeds(self):
+        return {"expected.csv": _SEED_AFTER_FIRST_RUN}
+
+    def test_default_no_skip(self, project):
+        run_dbt(["seed"])
+        run_dbt(["run"])
+        # 2nd run without the flag still succeeds (MERGE with empty source)
+        results = run_dbt(["run"])
+        assert len(results) == 1
+        check_relations_equal(project.adapter, ["default_model", "expected"])
