Commit 9212bba

moomindani and sd-db authored
feat: add skip_optimize model config to opt out of post-materialization OPTIMIZE (#1485)
Resolves #703

### Description

Adds a `skip_optimize` model config that lets users opt out of the post-materialization `OPTIMIZE` call without removing `zorder` / `liquid_clustered_by` / `auto_liquid_cluster` from the table definition.

**Motivation**: the existing opt-out today is the run-wide `DATABRICKS_SKIP_OPTIMIZE` var, which forces an all-or-nothing decision for the entire invocation. Several users in #703 asked for a config-level opt-out so they can:

- delegate `OPTIMIZE` to Predictive Optimization while keeping `auto_liquid_cluster=true` on the table
- skip `OPTIMIZE` only for specific high-churn models and let it run for the rest
- schedule `OPTIMIZE` out of band (workflow / job) instead of on the dbt critical path

**Behavior**:

- New model config `skip_optimize` (bool, default `false`)
- When truthy, `databricks__optimize` short-circuits to a no-op even if `zorder` / `liquid_clustered_by` / `auto_liquid_cluster` is set on the model; the clustering declaration remains in the table DDL, only the `OPTIMIZE` SQL emission is suppressed
- Inherits via standard dbt config resolution: project → folder → model (more specific wins). Example:

  ```yaml
  # dbt_project.yml
  models:
    my_project:
      +skip_optimize: true
      high_read_models:
        +skip_optimize: false
  ```

- `DATABRICKS_SKIP_OPTIMIZE` var is unchanged (still skips run-wide)

### Docs follow-up

User-facing config reference lives in `dbt-labs/docs.getdbt.com` (`databricks-configs.md`). A companion docs PR will be opened there to document `skip_optimize` alongside the existing `zorder` / `liquid_clustered_by` / `auto_liquid_cluster` entries.

### Checklist

- [x] I have run this code in development and it appears to resolve the stated issue
- [x] This PR includes tests, or tests are not required/relevant for this PR
- [x] I have updated the `CHANGELOG.md` and added information about my change to the "dbt-databricks next" section.

---------

Co-authored-by: Shubham Dhal <shubham.dhal@databricks.com>
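The "more specific wins" rule from the commit message can be illustrated with a small Python sketch. This is not dbt's resolver: `PROJECT_CONFIG` and `resolve_skip_optimize` are hypothetical names, and the folder layout only mirrors the `dbt_project.yml` example in the description.

```python
from typing import Optional

# Hypothetical stand-in for the dbt_project.yml example in the description:
# keys are folder paths, values are the +skip_optimize setting at that level.
PROJECT_CONFIG: dict[tuple[str, ...], bool] = {
    ("my_project",): True,
    ("my_project", "high_read_models"): False,
}


def resolve_skip_optimize(
    model_path: tuple[str, ...],
    model_level: Optional[bool] = None,
    project_config: dict[tuple[str, ...], bool] = PROJECT_CONFIG,
) -> bool:
    """Return the effective skip_optimize for a model (illustrative only)."""
    # A value set on the model itself is the most specific and wins.
    if model_level is not None:
        return model_level
    # Otherwise walk from the longest matching folder prefix to the shortest.
    for depth in range(len(model_path), 0, -1):
        prefix = model_path[:depth]
        if prefix in project_config:
            return project_config[prefix]
    # Nothing configured anywhere: fall back to the documented default (false).
    return False


if __name__ == "__main__":
    print(resolve_skip_optimize(("my_project", "high_read_models", "orders")))
    print(resolve_skip_optimize(("my_project", "staging", "events")))
```

Under these assumptions, a model in `high_read_models` still gets `OPTIMIZE`, while every other model in `my_project` skips it unless the model sets its own value.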
1 parent ce47403 commit 9212bba

5 files changed

Lines changed: 34 additions & 2 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@
 ### Features
 
 - Add catalogs.yml v2 support (requires `use_catalogs_v2: true` in dbt-core) ([1440](https://github.com/databricks/dbt-databricks/pull/1440))
+- Add `skip_optimize` model config to opt out of the post-materialization `OPTIMIZE` call without dropping `zorder` / `liquid_clustered_by` / `auto_liquid_cluster` from the table definition. Useful when `OPTIMIZE` is delegated to Predictive Optimization or scheduled out of band. Complements the existing run-wide `DATABRICKS_SKIP_OPTIMIZE` var by allowing project-, folder-, or model-level opt-out via standard dbt config inheritance ([#703](https://github.com/databricks/dbt-databricks/issues/703)).
 
 ### Fixes
 - Apply column-level `databricks_tags` for incremental models on the V1 materialization path (`use_materialization_v2: false`, the default). They were silently dropped at create and on subsequent tag changes; the V1 incremental materialization now applies them, matching the `table` materialization and the V2 path. ([#1520](https://github.com/databricks/dbt-databricks/pull/1520) closes [#1307](https://github.com/databricks/dbt-databricks/issues/1307))

dbt/adapters/databricks/impl.py

Lines changed: 1 addition & 0 deletions
@@ -205,6 +205,7 @@ class DatabricksConfig(AdapterConfig):
     query_tags: Optional[str] = None
     tblproperties: Optional[dict[str, str]] = None
     zorder: Optional[Union[list[str], str]] = None
+    skip_optimize: Optional[bool] = None
     unique_tmp_table_suffix: bool = False
     skip_non_matched_step: Optional[bool] = None
     skip_matched_step: Optional[bool] = None

dbt/include/databricks/macros/relations/optimize.sql

Lines changed: 2 additions & 1 deletion
@@ -3,7 +3,8 @@
 {% endmacro %}
 
 {%- macro databricks__optimize(relation) -%}
-  {%- if var('DATABRICKS_SKIP_OPTIMIZE', 'false')|lower != 'true' and
+  {%- if config.get('skip_optimize', false) | as_bool -%}
+  {%- elif var('DATABRICKS_SKIP_OPTIMIZE', 'false')|lower != 'true' and
         var('databricks_skip_optimize', 'false')|lower != 'true' and
         adapter.resolve_file_format(config) == 'delta' -%}
     {%- if (config.get('zorder', False) or config.get('liquid_clustered_by', False)) or config.get('auto_liquid_cluster', False) -%}
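For readers who want the new control flow spelled out, the gate in this hunk can be approximated in Python. This is a rough sketch, not the adapter's code: `as_bool` copies the coercion used by the unit-test helper in this commit, `should_reach_optimize` is a hypothetical name, and plain dicts stand in for dbt's `config` and `var` objects.

```python
def as_bool(value) -> bool:
    """Coerce a config value to bool, following the test helper in this commit."""
    if isinstance(value, bool):
        return value
    text = str(value).lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise ValueError(f"Cannot convert {value!r} to bool")


def should_reach_optimize(config: dict, project_vars: dict, file_format: str) -> bool:
    """Approximate whether the macro reaches the branch that emits OPTIMIZE."""
    # New in this commit: the model-level opt-out is checked first.
    if as_bool(config.get("skip_optimize", False)):
        return False
    # Existing run-wide opt-out, in either spelling of the var.
    for name in ("DATABRICKS_SKIP_OPTIMIZE", "databricks_skip_optimize"):
        if str(project_vars.get(name, "false")).lower() == "true":
            return False
    # The macro only considers delta tables.
    if file_format != "delta":
        return False
    # The macro only proceeds when some clustering option is configured.
    return bool(
        config.get("zorder", False)
        or config.get("liquid_clustered_by", False)
        or config.get("auto_liquid_cluster", False)
    )


if __name__ == "__main__":
    print(should_reach_optimize({"zorder": "foo", "skip_optimize": True}, {}, "delta"))
    print(should_reach_optimize({"zorder": "foo"}, {}, "delta"))
```

Because the `skip_optimize` check comes first in the `if` / `elif` chain, a truthy value yields a no-op regardless of the run-wide vars or the clustering settings.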

tests/unit/macros/base.py

Lines changed: 13 additions & 1 deletion
@@ -116,13 +116,25 @@ def databricks_env(self, macro_folders_to_load) -> Environment:
         """
         The environment used for rendering Databricks macros
         """
-        return Environment(
+        env = Environment(
             loader=FileSystemLoader(
                 [f"dbt/include/databricks/{folder}" for folder in macro_folders_to_load]
             ),
             extensions=["jinja2.ext.do"],
         )
 
+        def _as_bool(value):
+            if isinstance(value, bool):
+                return value
+            if str(value).lower() in ("true", "1", "yes"):
+                return True
+            if str(value).lower() in ("false", "0", "no"):
+                return False
+            raise ValueError(f"Cannot convert {value!r} to bool")
+
+        env.filters["as_bool"] = _as_bool
+        return env
+
     @pytest.fixture
     def databricks_template_names(self) -> list:
         """

tests/unit/macros/relations/test_optimize_macros.py

Lines changed: 17 additions & 0 deletions
@@ -41,3 +41,20 @@ def test_macros_optimize_with_skip(self, key_val, var, template_bundle):
         r = self.render_bundle(template_bundle, "optimize")
 
         assert r == ""
+
+    @pytest.mark.parametrize(
+        "cluster_key,cluster_val",
+        [
+            ("zorder", "foo"),
+            ("liquid_clustered_by", ["foo"]),
+            ("auto_liquid_cluster", True),
+        ],
+    )
+    def test_macros_optimize_with_skip_optimize_config(
+        self, cluster_key, cluster_val, config, template_bundle
+    ):
+        config[cluster_key] = cluster_val
+        config["skip_optimize"] = True
+        r = self.render_bundle(template_bundle, "optimize")
+
+        assert r == ""
