Merged
24 commits
213d12a
feat: add weekly cleanup workflow for stale CI schemas
devin-ai-integration[bot] Feb 28, 2026
4605b7f
fix: remove unused ns.errors variable
devin-ai-integration[bot] Feb 28, 2026
fbe4059
test: add integration test for drop_stale_ci_schemas macro
devin-ai-integration[bot] Feb 28, 2026
a124fbe
fix: guard recent schema cleanup with existence check
devin-ai-integration[bot] Feb 28, 2026
fb87fe6
Merge remote-tracking branch 'origin/master' into devin/1772278180-cl…
devin-ai-integration[bot] Feb 28, 2026
f327572
fix: add ClickHouse-compatible list_ci_schemas dispatch and use eleme…
devin-ai-integration[bot] Feb 28, 2026
c373f8a
fix: add clickhouse__schema_exists implementation
devin-ai-integration[bot] Feb 28, 2026
a9f398d
fix: return dict instead of tojson (double-encoding) and use exact-ma…
devin-ai-integration[bot] Feb 28, 2026
fe9ff60
test: remove Dremio skip to let CI validate schema cleanup
devin-ai-integration[bot] Feb 28, 2026
5c58b2e
fix: address CodeRabbit review comments
devin-ai-integration[bot] Feb 28, 2026
572eeca
fix: revert to utcnow() - modules.datetime.timezone not available in …
devin-ai-integration[bot] Feb 28, 2026
dcd447a
fix: use SQL queries instead of adapter methods in run-operation context
devin-ai-integration[bot] Feb 28, 2026
1c6b4db
fix: add BigQuery-specific list_ci_schemas and ci_schema_exists imple…
devin-ai-integration[bot] Feb 28, 2026
f85599c
test: re-add Dremio skip - DremioRelation path model incompatible wit…
devin-ai-integration[bot] Feb 28, 2026
5adb93f
fix: increase test max_age_hours to 8760 to avoid race with parallel …
devin-ai-integration[bot] Feb 28, 2026
dbd9047
docs: fix comment - year 2020, not year 2000
devin-ai-integration[bot] Feb 28, 2026
3265525
refactor: move cleanup macros to integration tests, use regex parser,…
devin-ai-integration[bot] Feb 28, 2026
1001681
refactor: split helpers into schema_utils/, named regex groups, renam…
devin-ai-integration[bot] Feb 28, 2026
357d9a2
rename: schema_utils/ -> ci_schema_utils/ for clarity
devin-ai-integration[bot] Feb 28, 2026
14a0995
refactor: split into schema_utils/ (generic) + ci_schemas_cleanup/ (C…
devin-ai-integration[bot] Feb 28, 2026
6913c74
fix: rename macros to edr_ prefix to avoid shadowing dbt-databricks i…
devin-ai-integration[bot] Feb 28, 2026
3ca9f64
fix: remove duplicate edr_drop_schema - already exists in clear_env.sql
devin-ai-integration[bot] Feb 28, 2026
45c65ef
fix: add SQL escaping for defense-in-depth (CodeRabbit suggestion)
devin-ai-integration[bot] Feb 28, 2026
320a250
fix: add BigQuery backtick escaping for database identifiers (CodeRab…
devin-ai-integration[bot] Feb 28, 2026
73 changes: 73 additions & 0 deletions .github/workflows/cleanup-stale-schemas.yml
@@ -0,0 +1,73 @@
name: Cleanup stale CI schemas

on:
schedule:
# Every Sunday at 03:00 UTC
- cron: "0 3 * * 0"
workflow_dispatch:
inputs:
max-age-hours:
type: string
required: false
default: "24"
description: Drop schemas older than this many hours

env:
TESTS_DIR: ${{ github.workspace }}/dbt-data-reliability/integration_tests

jobs:
cleanup:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
warehouse-type:
- snowflake
- bigquery
- redshift
- databricks_catalog
- athena
steps:
- name: Checkout dbt package
uses: actions/checkout@v4
with:
path: dbt-data-reliability

- name: Setup Python
uses: actions/setup-python@v6
with:
python-version: "3.10"
cache: "pip"

- name: Install dbt
run: >
pip install
"dbt-core"
"dbt-${{ (matrix.warehouse-type == 'databricks_catalog' && 'databricks') || (matrix.warehouse-type == 'athena' && 'athena-community') || matrix.warehouse-type }}"

- name: Write dbt profiles
env:
CI_WAREHOUSE_SECRETS: ${{ secrets.CI_WAREHOUSE_SECRETS || '' }}
run: |
# The cleanup job doesn't create schemas, but generate_profiles.py
# requires --schema-name. Use a dummy value.
python "${{ github.workspace }}/dbt-data-reliability/integration_tests/profiles/generate_profiles.py" \
--template "${{ github.workspace }}/dbt-data-reliability/integration_tests/profiles/profiles.yml.j2" \
--output ~/.dbt/profiles.yml \
--schema-name "cleanup_placeholder"

- name: Install dbt deps
working-directory: ${{ env.TESTS_DIR }}/dbt_project
run: dbt deps

- name: Symlink local elementary package
run: ln -sfn ${{ github.workspace }}/dbt-data-reliability ${{ env.TESTS_DIR }}/dbt_project/dbt_packages/elementary

- name: Drop stale CI schemas
working-directory: ${{ env.TESTS_DIR }}/dbt_project
# Only dbt_ prefixed schemas are created in this repo's CI.
# The elementary repo has its own workflow for py_ prefixed schemas.
run: >
dbt run-operation elementary.drop_stale_ci_schemas
--args '{prefixes: ["dbt_"], max_age_hours: ${{ inputs.max-age-hours || '24' }}}'
-t "${{ matrix.warehouse-type }}"
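The adapter-package mapping buried in the `&&`/`||` chain of the "Install dbt" step can be sketched in plain Python (the mapping itself comes from the workflow; the helper name is illustrative):

```python
# Illustrative equivalent of the workflow's ternary-style expression:
# (matrix.warehouse-type == 'databricks_catalog' && 'databricks')
#   || (matrix.warehouse-type == 'athena' && 'athena-community')
#   || matrix.warehouse-type
def dbt_adapter_package(warehouse_type: str) -> str:
    overrides = {
        "databricks_catalog": "dbt-databricks",
        "athena": "dbt-athena-community",
    }
    # Every other warehouse type maps 1:1 to its dbt adapter package.
    return overrides.get(warehouse_type, f"dbt-{warehouse_type}")

assert dbt_adapter_package("databricks_catalog") == "dbt-databricks"
assert dbt_adapter_package("athena") == "dbt-athena-community"
assert dbt_adapter_package("snowflake") == "dbt-snowflake"
```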
75 changes: 75 additions & 0 deletions integration_tests/dbt_project/macros/test_drop_stale_ci_schemas.sql
@@ -0,0 +1,75 @@
{#
Integration-test helper for elementary.drop_stale_ci_schemas.

Creates two CI-style schemas (one with an old timestamp, one recent),
runs the cleanup macro, checks which schemas survived, cleans up,
and returns a JSON result dict.
#}

{% macro test_drop_stale_ci_schemas() %}
{% set database = elementary.target_database() %}
{% set now = modules.datetime.datetime.utcnow() %}

⚠️ Potential issue | 🟡 Minor (CodeRabbit)

Use timezone-aware UTC instead of utcnow() in the test macro.

datetime.utcnow() was deprecated in Python 3.12 and emits a DeprecationWarning; the documented replacement is a timezone-aware UTC datetime, e.g. datetime.now(timezone.utc). dbt's Jinja context exposes the stdlib through modules, so modules.datetime.timezone should be available there as well.

♻️ Proposed fix
-  {% set now = modules.datetime.datetime.utcnow() %}
+  {% set now = modules.datetime.datetime.now(modules.datetime.timezone.utc) %}

{# Old schema: timestamp in the past (2020-01-01 00:00:00) #}
{% set old_schema = 'dbt_200101_000000_citest_00000000' %}
{# Recent schema: timestamp = now #}
{% set recent_ts = now.strftime('%y%m%d_%H%M%S') %}
{% set recent_schema = 'dbt_' ~ recent_ts ~ '_citest_11111111' %}

{{ log("TEST: creating old schema: " ~ old_schema, info=true) }}
{{ log("TEST: creating recent schema: " ~ recent_schema, info=true) }}

{# ── Create both schemas ───────────────────────────────────────────── #}
{% do elementary_tests.edr_create_schema(database, old_schema) %}
{% do elementary_tests.edr_create_schema(database, recent_schema) %}

{# ── Verify both exist before running cleanup ──────────────────────── #}
{% set old_exists_before = elementary.ci_schema_exists(database, old_schema) %}
{% set recent_exists_before = elementary.ci_schema_exists(database, recent_schema) %}
{{ log("TEST: old_exists_before=" ~ old_exists_before ~ ", recent_exists_before=" ~ recent_exists_before, info=true) }}

{# ── Run cleanup with a large threshold so only the artificially old
schema (year 2020) is caught, and real CI schemas from parallel
workers are safely below the threshold. ──────────────────────────── #}
{% do elementary.drop_stale_ci_schemas(prefixes=['dbt_'], max_age_hours=8760) %}

{# ── Check which schemas survived ─────────────────────────────────── #}
{% set old_exists_after = elementary.ci_schema_exists(database, old_schema) %}
{% set recent_exists_after = elementary.ci_schema_exists(database, recent_schema) %}
{{ log("TEST: old_exists_after=" ~ old_exists_after ~ ", recent_exists_after=" ~ recent_exists_after, info=true) }}

{# ── Cleanup: drop any remaining test schemas ─────────────────────── #}
{% if old_exists_after is true %}
{% do elementary.drop_ci_schema(database, old_schema) %}
{% endif %}
{% if recent_exists_after %}
{% do elementary.drop_ci_schema(database, recent_schema) %}
{% endif %}

{# ── Return results ────────────────────────────────────────────────── #}
{% set results = {
"old_exists_before": old_exists_before,
"recent_exists_before": recent_exists_before,
"old_dropped": not old_exists_after,
"recent_kept": recent_exists_after
} %}
{% do return(results) %}
{% endmacro %}
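The naive-vs-aware distinction behind the review comment above, and behind the later revert to utcnow(), can be demonstrated in standalone Python:

```python
from datetime import datetime, timezone

naive = datetime.utcnow()           # deprecated since Python 3.12; tzinfo is None
aware = datetime.now(timezone.utc)  # the recommended replacement

assert naive.tzinfo is None
assert aware.tzinfo == timezone.utc

# Mixing naive and aware datetimes in arithmetic raises TypeError, which is
# why the macro keeps both `now` and the strptime() result naive rather
# than converting only one side.
try:
    _ = aware - naive
    mixed_ok = True
except TypeError:
    mixed_ok = False
assert not mixed_ok
```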


{# ── Per-adapter schema creation ─────────────────────────────────────── #}

{% macro edr_create_schema(database, schema_name) %}
{% do return(adapter.dispatch('edr_create_schema', 'elementary_tests')(database, schema_name)) %}
{% endmacro %}

{% macro default__edr_create_schema(database, schema_name) %}
{% set schema_relation = api.Relation.create(database=database, schema=schema_name) %}
{% do dbt.create_schema(schema_relation) %}
{% do adapter.commit() %}
{% endmacro %}

{% macro clickhouse__edr_create_schema(database, schema_name) %}
{% do run_query("CREATE DATABASE IF NOT EXISTS `" ~ schema_name ~ "`") %}
{% do adapter.commit() %}
{% endmacro %}
21 changes: 21 additions & 0 deletions integration_tests/tests/test_ci_cleanup.py
@@ -0,0 +1,21 @@
"""Integration test for elementary.drop_stale_ci_schemas macro."""

import json

import pytest
from dbt_project import DbtProject


@pytest.mark.skip_targets(["dremio"])
def test_drop_stale_ci_schemas(dbt_project: DbtProject):
"""Verify that old CI schemas are dropped and recent ones are kept."""
result = dbt_project.dbt_runner.run_operation(
"elementary_tests.test_drop_stale_ci_schemas",
)
assert result, "run_operation returned no output"
data = json.loads(result[0])

assert data["old_exists_before"], "Setup failed: old schema was not created"
assert data["recent_exists_before"], "Setup failed: recent schema was not created"
assert data["old_dropped"], "Old schema should have been dropped by cleanup"
assert data["recent_kept"], "Recent schema should have been kept by cleanup"
156 changes: 156 additions & 0 deletions macros/utils/ci_cleanup/drop_stale_ci_schemas.sql
@@ -0,0 +1,156 @@
{#
drop_stale_ci_schemas – clean up timestamped CI schemas.
Collaborator: Could this macro be in the integration tests project?

Contributor Author: Done — moved the entire macro to integration_tests/dbt_project/macros/drop_stale_ci_schemas.sql. The old macros/utils/ci_cleanup/ directory is deleted. The workflow now calls drop_stale_ci_schemas without the elementary. prefix.


Schema naming convention produced by CI:
<prefix><YYMMDD_HHMMSS>_<branch>_<hash>
Examples:
dbt_260228_112345_master_abcd1234
py_260228_112345_master_abcd1234
dbt_260228_112345_master_abcd1234_elementary (suffixed variant)

Call from a GitHub Actions workflow via:
dbt run-operation elementary.drop_stale_ci_schemas \
--args '{prefixes: ["dbt_", "py_"], max_age_hours: 24}'
#}
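The naming convention above, combined with the regex approach suggested later in the review (modules.re with groups), can be sketched in plain Python. parse_ci_schema_name mirrors the macro name the author chose, but this standalone version is illustrative:

```python
import re
from datetime import datetime

# Same pattern the author adopted: six date digits, six time digits,
# then the branch/hash remainder.
CI_SCHEMA_RE = re.compile(r"^(?P<date>\d{6})_(?P<time>\d{6})_.+")

def parse_ci_schema_name(schema_name, prefixes):
    """Return the embedded timestamp, or None if the name doesn't match."""
    lowered = schema_name.lower()
    for prefix in prefixes:
        if lowered.startswith(prefix.lower()):
            m = CI_SCHEMA_RE.match(lowered[len(prefix):])
            if m:
                try:
                    return datetime.strptime(
                        f"{m['date']}_{m['time']}", "%y%m%d_%H%M%S")
                except ValueError:  # e.g. month 13: digits but not a date
                    return None
    return None

ts = parse_ci_schema_name("dbt_260228_112345_master_abcd1234", ["dbt_", "py_"])
assert ts == datetime(2026, 2, 28, 11, 23, 45)
assert parse_ci_schema_name("dbt_analytics", ["dbt_"]) is None
```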

{% macro drop_stale_ci_schemas(prefixes=None, max_age_hours=24) %}
haritamar marked this conversation as resolved (outdated).
{% if prefixes is none %}
{% set prefixes = ['dbt_', 'py_'] %}
Collaborator: let's make prefixes a mandatory input, and fail if it is not specified

Contributor Author: Done — prefixes now defaults to none and raises a CompilationError with a clear message if not provided or empty.

{% endif %}

{% set max_age_hours = max_age_hours | int %}
{% set database = elementary.target_database() %}
{% set all_schemas = elementary.list_ci_schemas(database) %}
{# utcnow() is deprecated in Python 3.12+ but modules.datetime.timezone is not
available in dbt's Jinja context. Both now and strptime produce naive datetimes
so comparisons are safe. #}
{% set now = modules.datetime.datetime.utcnow() %}
⚠️ Potential issue | 🟠 Major (CodeRabbit)

datetime.utcnow() is deprecated in Python 3.12+.

Use datetime.now(timezone.utc) instead. The fix should avoid stripping the timezone information with .replace(tzinfo=None), as that defeats the purpose of timezone-aware datetimes. If code elsewhere compares against naive datetimes, those comparisons should be made timezone-aware as well.

♻️ Suggested fix
-  {% set now = modules.datetime.datetime.utcnow() %}
+  {% set now = modules.datetime.datetime.now(modules.datetime.timezone.utc) %}

{% set max_age_seconds = max_age_hours * 3600 %}
{% set ns = namespace(dropped=0) %}

{{ log("CI schema cleanup: scanning " ~ all_schemas | length ~ " schema(s) in database '" ~ database ~ "' for prefixes " ~ prefixes | string, info=true) }}

{% for schema_name in all_schemas | sort %}
{% set schema_lower = schema_name.lower() %}
{% for prefix in prefixes %}
{% if schema_lower.startswith(prefix.lower()) %}
{% set remainder = schema_lower[prefix | length :] %}
{# Timestamp format: YYMMDD_HHMMSS (13 chars) followed by _ #}
{% if remainder | length >= 14 and remainder[6:7] == '_' and remainder[13:14] == '_' %}
Collaborator:

1. could we use a regex with groups to match the branch name parts? it's a bit cleaner no? modules.re is available in dbt.
2. let's make a parse_ci_branch_name macro and use it here to simplify the main macro.

Contributor Author: Done — replaced the manual character-by-character validation with modules.re.match('^(\\d{6})_(\\d{6})_.+', remainder) and extracted it into a parse_ci_schema_name macro. The main loop is now much cleaner — just calls parse_ci_schema_name(schema_name, prefixes) which returns a datetime or none.

Named it parse_ci_schema_name (instead of parse_ci_branch_name) since it takes a schema name as input and returns the parsed timestamp. Let me know if you'd prefer a different name.

{% set ts_str = remainder[:13] %}
{# Validate: positions 0-5 and 7-12 must be digits #}
{% set digits = ts_str[:6] ~ ts_str[7:] %}
{% set ns_valid = namespace(ok=true) %}
{% for c in digits %}
{% if c not in '0123456789' %}
{% set ns_valid.ok = false %}
{% endif %}
{% endfor %}
{% if ns_valid.ok %}
{# Validate date component ranges before strptime to avoid ValueError #}
{% set mm = ts_str[2:4] | int %}
{% set dd = ts_str[4:6] | int %}
{% set hh = ts_str[7:9] | int %}
{% set mi = ts_str[9:11] | int %}
{% set ss = ts_str[11:13] | int %}
{% if 1 <= mm <= 12 and 1 <= dd <= 31 and 0 <= hh <= 23 and 0 <= mi <= 59 and 0 <= ss <= 59 %}
{% set schema_ts = modules.datetime.datetime.strptime(ts_str, '%y%m%d_%H%M%S') %}
{% set age_seconds = (now - schema_ts).total_seconds() %}
{% if age_seconds > max_age_seconds %}
{{ log(" DROP " ~ schema_name ~ " (age: " ~ (age_seconds / 3600) | round(1) ~ " h)", info=true) }}
{% do elementary.drop_ci_schema(database, schema_name) %}
{% set ns.dropped = ns.dropped + 1 %}
{% else %}
{{ log(" keep " ~ schema_name ~ " (age: " ~ (age_seconds / 3600) | round(1) ~ " h)", info=true) }}
{% endif %}
{% else %}
{{ log(" skip " ~ schema_name ~ " (invalid timestamp: " ~ ts_str ~ ")", info=true) }}
{% endif %}
{% endif %}
{% endif %}
{% endif %}
{% endfor %}
{% endfor %}

{{ log("CI schema cleanup complete. Dropped " ~ ns.dropped ~ " stale schema(s).", info=true) }}
{% endmacro %}
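The age check at the heart of the loop reduces to a single comparison; a minimal Python sketch (the function name is illustrative):

```python
from datetime import datetime, timedelta

def is_stale(schema_ts: datetime, now: datetime, max_age_hours: int) -> bool:
    # Mirrors the macro: age in seconds compared against the threshold.
    return (now - schema_ts).total_seconds() > max_age_hours * 3600

now = datetime(2026, 2, 28, 12, 0, 0)
# The test's artificially old schema (year 2020) exceeds even 8760 h (1 year).
assert is_stale(datetime(2020, 1, 1), now, 8760)
# A schema created an hour ago survives the default 24 h threshold.
assert not is_stale(now - timedelta(hours=1), now, 24)
```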


{# ── Per-adapter schema listing ─────────────────────────────────────── #}

{% macro list_ci_schemas(database) %}
Collaborator: Maybe this can be in its own file? (also in the integration tests project) and also ci_schema_exists.

Contributor Author: Done — list_ci_schemas and ci_schema_exists (with all adapter dispatches) are now in their own file: integration_tests/dbt_project/macros/ci_schema_helpers.sql. All dispatch calls use 'elementary_tests' as the search package.

{% do return(adapter.dispatch('list_ci_schemas', 'elementary')(database)) %}
{% endmacro %}

{# adapter.list_schemas() is not available in run-operation context
(RuntimeDatabaseWrapper does not expose it). Use information_schema instead. #}
{% macro default__list_ci_schemas(database) %}
{% set results = run_query("SELECT schema_name FROM information_schema.schemata") %}
Collaborator: As a safety, can we just list schemas in the CI db? (the one in the profiles)

Contributor Author: Done — added a WHERE lower(catalog_name) = lower('<database>') filter to the default list_ci_schemas implementation. BigQuery was already scoped via project-qualified INFORMATION_SCHEMA. ClickHouse lists all databases (no catalog concept).

{% set schemas = [] %}
{% for row in results %}
{% do schemas.append(row[0]) %}
{% endfor %}
{% do return(schemas) %}
{% endmacro %}

{% macro bigquery__list_ci_schemas(database) %}
{% set results = run_query("SELECT schema_name FROM `" ~ database ~ "`.INFORMATION_SCHEMA.SCHEMATA") %}
{% set schemas = [] %}
{% for row in results %}
{% do schemas.append(row[0]) %}
{% endfor %}
{% do return(schemas) %}
{% endmacro %}

{% macro clickhouse__list_ci_schemas(database) %}
{% set results = run_query('SHOW DATABASES') %}
{% set schemas = [] %}
{% for row in results %}
{% do schemas.append(row[0]) %}
{% endfor %}
{% do return(schemas) %}
{% endmacro %}


{# ── Per-adapter schema drop ─────────────────────────────────────────── #}

{% macro drop_ci_schema(database, schema_name) %}
{% do return(adapter.dispatch('drop_ci_schema', 'elementary')(database, schema_name)) %}
{% endmacro %}

{% macro default__drop_ci_schema(database, schema_name) %}
{% set schema_relation = api.Relation.create(database=database, schema=schema_name) %}
{% do dbt.drop_schema(schema_relation) %}
{% do adapter.commit() %}
{% endmacro %}

{% macro clickhouse__drop_ci_schema(database, schema_name) %}
{% do run_query("DROP DATABASE IF EXISTS `" ~ schema_name ~ "`") %}
{% do adapter.commit() %}
{% endmacro %}


{# ── Per-adapter schema existence check (run-operation safe) ──────── #}
{# adapter.check_schema_exists() is not available in run-operation context.
This is only used by the integration test; the main macro does not need it. #}

{% macro ci_schema_exists(database, schema_name) %}
{% do return(adapter.dispatch('ci_schema_exists', 'elementary')(database, schema_name)) %}
{% endmacro %}

{% macro default__ci_schema_exists(database, schema_name) %}
{% set result = run_query("SELECT schema_name FROM information_schema.schemata WHERE lower(schema_name) = lower('" ~ schema_name ~ "')") %}
Collaborator: likewise, only consider schemas in the CI database.

Contributor Author: Done — added a WHERE lower(catalog_name) = lower('<database>') AND ... filter to the default ci_schema_exists implementation as well.

{% do return(result | length > 0) %}
{% endmacro %}

{% macro bigquery__ci_schema_exists(database, schema_name) %}
{% set result = run_query("SELECT schema_name FROM `" ~ database ~ "`.INFORMATION_SCHEMA.SCHEMATA WHERE lower(schema_name) = lower('" ~ schema_name ~ "')") %}
{% do return(result | length > 0) %}
{% endmacro %}

{% macro clickhouse__ci_schema_exists(database, schema_name) %}
{% set result = run_query("SELECT 1 FROM system.databases WHERE name = '" ~ schema_name ~ "' LIMIT 1") %}
{% do return(result | length > 0) %}
{% endmacro %}
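Later commits add SQL escaping for defense-in-depth around literals like the ones above. The idea, sketched in standalone Python (the helper name is hypothetical), is to double single quotes before embedding a value in a string literal:

```python
def escape_sql_literal(value: str) -> str:
    # Double single quotes so the value cannot terminate the literal early.
    return value.replace("'", "''")

# Ordinary CI schema names pass through unchanged.
assert escape_sql_literal("dbt_200101_000000_citest_00000000") == \
    "dbt_200101_000000_citest_00000000"
# A hostile value no longer breaks out of the quoted literal.
assert escape_sql_literal("x'; DROP TABLE t; --") == "x''; DROP TABLE t; --"

query = ("SELECT schema_name FROM information_schema.schemata "
         "WHERE lower(schema_name) = lower('"
         + escape_sql_literal("dbt_200101_000000_citest_00000000") + "')")
assert query.endswith("lower('dbt_200101_000000_citest_00000000')")
```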
5 changes: 5 additions & 0 deletions macros/utils/cross_db_utils/schema_exists.sql
@@ -29,6 +29,11 @@
{% do return(adapter.check_schema_exists(database, schema)) %}
{% endmacro %}

{% macro clickhouse__schema_exists(database, schema) %}
{% set result = run_query("SELECT 1 FROM system.databases WHERE name = '" ~ schema ~ "' LIMIT 1") %}
{% do return(result | length > 0) %}
{% endmacro %}

{% macro default__schema_exists(database, schema) %}
{% do return(adapter.check_schema_exists(database, schema)) %}
{% endmacro %}