Commit 39db3e9

fix: Improve perf on schema enumeration/validation (#1168)
Resolves #1166

### Description

Apparently we've had a performance regression for a while due to switching to GetSchemas instead of using `show schemas` for everything. Prior to 1.9, you could get the `show schemas` behavior if you were on HMS but didn't specify a database. It is unclear why GetSchemas performs poorly, but even in my non-HMS testing, `show schemas` is faster.

### Checklist

- [x] I have run this code in development and it appears to resolve the stated issue
- [x] This PR includes tests, or tests are not required/relevant for this PR
- [x] I have updated the `CHANGELOG.md` and added information about my change to the "dbt-databricks next" section.
1 parent 70bec4d commit 39db3e9

2 files changed

Lines changed: 8 additions & 11 deletions

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -6,6 +6,10 @@
 - Remove external path on intermediate tables for incremental models (with Materialization V2) ([1161](https://github.com/databricks/dbt-databricks/pull/1161))
 - Fix get_columns_in_relation branching logic for streaming tables to prevent it from running `AS JSON`
 
+### Under the hood
+
+- Improve performance of schema enumeration/validation ([1168](https://github.com/databricks/dbt-databricks/pull/1168))
+
 ## dbt-databricks 1.10.10 (August 20, 2025)
 
 ### Fixes

dbt/adapters/databricks/impl.py

Lines changed: 4 additions & 11 deletions
@@ -300,21 +300,14 @@ def compare_dbr_version(self, major: int, minor: int) -> int:
         return self.connections.compare_dbr_version(major, minor)
 
     def list_schemas(self, database: Optional[str]) -> list[str]:
-        """
-        Get a list of existing schemas in database.
-
-        If `database` is `None`, fallback to executing `show databases` because
-        `list_schemas` tries to collect schemas from all catalogs when `database` is `None`.
-        """
-        if database is not None:
-            results = self.connections.list_schemas(database=database)
-        else:
-            results = self.execute_macro(LIST_SCHEMAS_MACRO_NAME, kwargs={"database": database})
+        results = self.execute_macro(LIST_SCHEMAS_MACRO_NAME, kwargs={"database": database})
         return [row[0] for row in results]
 
     def check_schema_exists(self, database: Optional[str], schema: str) -> bool:
         """Check if a schema exists."""
-        return schema.lower() in set(s.lower() for s in self.list_schemas(database=database))
+        return schema.lower() in set(
+            s.lower() for s in self.connections.list_schemas(database or "hive_metastore", schema)
+        )
 
     def execute(
         self,
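
The new `check_schema_exists` can be exercised in isolation with a rough sketch like the one below. `FakeConnections` and `SketchAdapter` are hypothetical stand-ins, not part of dbt-databricks. The stub assumes that `list_schemas(database, schema)` returns plain strings filtered to the requested schema name, which may differ from what the real connection manager returns. Only the body of `check_schema_exists` mirrors the diff above.

```python
from typing import Optional


class FakeConnections:
    """Hypothetical stand-in for the adapter's connection manager."""

    def __init__(self, catalogs: dict[str, list[str]]) -> None:
        # Maps a catalog (database) name to the schema names it contains.
        self._catalogs = catalogs

    def list_schemas(self, database: str, schema: Optional[str] = None) -> list[str]:
        # Assumption: passing `schema` narrows the lookup to matching names,
        # so the caller does not have to enumerate every schema in the catalog.
        names = self._catalogs.get(database, [])
        if schema is None:
            return list(names)
        return [name for name in names if name.lower() == schema.lower()]


class SketchAdapter:
    """Hypothetical adapter holding only the method under discussion."""

    def __init__(self, connections: FakeConnections) -> None:
        self.connections = connections

    def check_schema_exists(self, database: Optional[str], schema: str) -> bool:
        """Check if a schema exists (same expression as in the diff)."""
        return schema.lower() in set(
            s.lower() for s in self.connections.list_schemas(database or "hive_metastore", schema)
        )


adapter = SketchAdapter(
    FakeConnections({"hive_metastore": ["Default", "analytics"], "main": ["Sales"]})
)
found_in_default_catalog = adapter.check_schema_exists(None, "default")
found_in_named_catalog = adapter.check_schema_exists("main", "SALES")
```

In this sketch, a `database` of `None` falls back to the `hive_metastore` catalog, and the comparison is case-insensitive on both sides, matching the expression in the commit.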
