feat: add full ClickHouse support - remove all skip_targets markers (CORE-397) (#934)
* chore: remove all skip_targets(["clickhouse"]) markers from test files
Remove ClickHouse from skip_targets in all integration test files to enable
full ClickHouse support testing. For multi-target skip lists (e.g. in
test_schema_changes.py and test_exposure_schema_validity.py), only 'clickhouse'
was removed while keeping other targets.
Also remove now-unused 'import pytest' statements in files where pytest was
only imported for the skip_targets decorator.
CORE-397
Co-Authored-By: unknown <>
* ci: temporarily limit CI matrix to clickhouse-only for iteration
Reduce the warehouse-type matrix to only [clickhouse] to enable fast
iteration on ClickHouse test fixes. Will be restored to full matrix
once all ClickHouse tests pass.
CORE-397
Co-Authored-By: unknown <>
* fix: use NOT IN instead of LEFT JOIN IS NULL for ClickHouse compatibility
By default (join_use_nulls=0), ClickHouse LEFT OUTER JOIN fills unmatched rows with
type default values (e.g. 1970-01-01 for DateTime) instead of NULL, so the
LEFT JOIN ... IS NULL anti-join pattern never matches.
Changed the missing_bucket_starts CTE to use NOT IN, which does not depend on NULL-filling.
Co-Authored-By: unknown <>
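The failure mode above can be reproduced without a database. The following is a minimal Python simulation, not the project's SQL; the function names and sample data are hypothetical and exist only for illustration. It models a LEFT JOIN that fills unmatched rows either with NULL (None) or with the DateTime default.

```python
from datetime import datetime

# ClickHouse's default value for a non-Nullable DateTime column.
EPOCH = datetime(1970, 1, 1)


def left_join(left, right, use_nulls):
    """Simulate an equality LEFT JOIN on a single key column.

    Unmatched rows get None when use_nulls is True (join_use_nulls=1),
    otherwise the type default (join_use_nulls=0).
    """
    right_keys = set(right)
    fill = None if use_nulls else EPOCH
    return [(key, key if key in right_keys else fill) for key in left]


def anti_join_is_null(left, right, use_nulls):
    """LEFT JOIN ... WHERE right.key IS NULL."""
    return [l for l, r in left_join(left, right, use_nulls) if r is None]


def anti_join_not_in(left, right):
    """WHERE key NOT IN (SELECT key FROM right)."""
    right_keys = set(right)
    return [key for key in left if key not in right_keys]


# Hypothetical data: three expected buckets, one of them missing.
all_buckets = [datetime(2024, 1, day) for day in (1, 2, 3)]
present_buckets = [datetime(2024, 1, 1), datetime(2024, 1, 3)]

missing_default = anti_join_is_null(all_buckets, present_buckets, use_nulls=False)
missing_nulls = anti_join_is_null(all_buckets, present_buckets, use_nulls=True)
missing_not_in = anti_join_not_in(all_buckets, present_buckets)
```

With default filling the IS NULL filter finds nothing, while both NULL-filling and NOT IN report the missing bucket.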
* fix: ClickHouse Nullable(Float32) cast + HTTP API seed null fix
- Use Nullable(Float32) in clickhouse__standard_deviation and clickhouse__variance
to handle CASE expressions that return NULL
- Add _fix_clickhouse_seed_nulls() to rebuild seed tables with proper Nullable types
using ClickHouse HTTP API with nullIf() function
- Configure ClickHouse Docker with join_use_nulls=1 and mutations_sync=1
- Fix unused variable lint warning in dbt_project.py
Co-Authored-By: unknown <>
* fix: address CodeRabbit review + revert NOT IN back to LEFT JOIN
- Revert NOT IN subquery back to LEFT JOIN IS NULL (join_use_nulls=1 handles NULLs)
- Add _fix_seed_if_needed to seed_context for ClickHouse NULL fix
- Add try/finally for cleanup in table-rebuild sequence
- Handle Nullable wrapping to avoid Nullable(Nullable(...))
- Handle FixedString/LowCardinality string variants
- Add warning when cols_result is empty
- Backtick-quote column names in ClickHouse ALTER statements
Co-Authored-By: unknown <>
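The Nullable-wrapping and string-variant points above could look roughly like the sketch below. The helper names (wrap_nullable, is_string_like) are hypothetical and are not taken from the repository. The sketch assumes LowCardinality must stay outermost, since ClickHouse accepts LowCardinality(Nullable(String)) but not Nullable(LowCardinality(String)).

```python
def wrap_nullable(ch_type: str) -> str:
    """Return a Nullable version of a ClickHouse type string without double wrapping."""
    t = ch_type.strip()
    if t.startswith("Nullable("):
        # Already nullable: avoid Nullable(Nullable(...)).
        return t
    if t.startswith("LowCardinality(") and t.endswith(")"):
        # Push Nullable inside the LowCardinality wrapper.
        inner = t[len("LowCardinality("):-1]
        return f"LowCardinality({wrap_nullable(inner)})"
    return f"Nullable({t})"


def is_string_like(ch_type: str) -> bool:
    """True for String, FixedString(N), and their Nullable/LowCardinality wrappers."""
    t = ch_type.strip()
    # Peel off wrappers until the base type is reached.
    for prefix in ("LowCardinality(", "Nullable("):
        while t.startswith(prefix) and t.endswith(")"):
            t = t[len(prefix):-1]
    return t == "String" or t.startswith("FixedString(")
```

The wrapper check is purely textual, so it only suits type strings as reported by the server (for example from system.columns), not arbitrary user input.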
* fix: address CodeRabbit review round 2 - env vars, timeout, SQL injection guard, mutations_sync
Co-Authored-By: unknown <>
* fix: ClickHouse full_names adapter.dispatch, seasonality macros, event freshness Nullable cast
Co-Authored-By: unknown <>
* fix: ClickHouse event_freshness timediff NULL handling + list_concat Nullable dimension cast
Co-Authored-By: unknown <>
* fix: dynamically resolve ClickHouse schema from dbt profiles.yml instead of hardcoding 'default'
Co-Authored-By: unknown <>
* ci: restore full CI matrix with all warehouse types
Co-Authored-By: unknown <>
* refactor: extract ClickHouse seed repair utils + dispatch empty-string NULL macro
Co-Authored-By: unknown <>
* refactor: remove unused clickhouse__ dispatch from replace_empty_strings_with_nulls
The macro is only called for BigQuery fusion seeds. ClickHouse seed NULL
repair is handled by fix_clickhouse_seed_nulls() in clickhouse_utils.py
via the HTTP API (covers all column types, not just strings).
Co-Authored-By: unknown <>
* ci: retrigger CI to verify flaky test_seed_group_attribute failure
Co-Authored-By: unknown <>
* refactor: replace clickhouse_utils.py with ClickHouseDirectSeeder
- Add ClickHouseDirectSeeder to data_seeder.py: creates tables with
Nullable(String) columns directly via the dbt adapter, bypassing
dbt seed and eliminating the need for post-hoc NULL repair
- Add execute_sql() and schema_name property to AdapterQueryRunner
- DbtProject._create_seeder() auto-selects ClickHouseDirectSeeder
when target is 'clickhouse'
- Delete clickhouse_utils.py (HTTP API no longer needed for seeding)
- Update replace_empty_strings_with_nulls.sql comment
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
Co-Authored-By: unknown <>
* fix: add type inference to ClickHouseDirectSeeder
Infer ClickHouse column types from Python values instead of using
Nullable(String) for all columns. This preserves proper numeric types
(Int64, Float64) so that Elementary's numeric monitors (average,
zero_count) and schema change detection (type_changed) work correctly.
- _infer_column_type(): examines Python types (bool→UInt8, int→Int64,
float→Float64, str→String), all wrapped in Nullable()
- _escape(): returns unquoted literals for numeric/boolean types
- seed(): logs inferred column types for debugging
Co-Authored-By: unknown <>
* fix: treat booleans as strings in ClickHouseDirectSeeder
dbt seed writes Python True/False as 'True'/'False' strings in CSV,
so ClickHouse stores them as String columns. Match this behavior in the
direct seeder so count_true/count_false monitors work correctly.
- Remove Nullable(UInt8) inference for booleans (fall through to String)
- Escape True/False as quoted strings 'True'/'False'
Co-Authored-By: unknown <>
* fix: use Nullable(Bool) for boolean columns in ClickHouseDirectSeeder
dbt seed infers True/False CSV values as boolean; dbt-clickhouse maps
this to Bool (stored internally as UInt8). Match this behavior so count_true and
count_false monitors work correctly.
- Infer Nullable(Bool) for all-boolean columns
- Escape True/False as ClickHouse Bool literals (true/false)
Co-Authored-By: unknown <>
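A minimal sketch of the inference and escaping rules described in the last three commits, as they stand after the Bool change. This is not the actual ClickHouseDirectSeeder code: the function names, the handling of mixed int/float columns (promoted to Float64), and the all-NULL fallback to String are assumptions made for illustration.

```python
from typing import Any, Iterable


def infer_column_type(values: Iterable[Any]) -> str:
    """Infer a Nullable ClickHouse type from the Python values in one column."""
    non_null = [v for v in values if v is not None]

    def is_number(v: Any) -> bool:
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    if non_null and all(isinstance(v, bool) for v in non_null):
        base = "Bool"
    elif non_null and all(isinstance(v, int) and not isinstance(v, bool) for v in non_null):
        base = "Int64"
    elif non_null and all(is_number(v) for v in non_null):
        base = "Float64"  # assumption: mixed int/float columns become Float64
    else:
        base = "String"  # assumption: mixed or all-NULL columns fall back to String
    return f"Nullable({base})"


def escape_literal(value: Any, column_type: str) -> str:
    """Render one value as a SQL literal for a column of the given inferred type."""
    if value is None:
        return "NULL"
    if column_type == "Nullable(Bool)":
        return "true" if value else "false"
    if column_type in ("Nullable(Int64)", "Nullable(Float64)"):
        return repr(value)  # numeric literals are emitted unquoted
    # Everything else is a quoted string with backslashes and quotes escaped.
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{text}'"
```

Checking for bool before int matters here; otherwise True and False would be inferred as Int64 and the count_true/count_false monitors would see the wrong type.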
* fix: write CSV for dbt node discovery in ClickHouseDirectSeeder
The direct seeder bypasses dbt seed but still needs a CSV file on disk
so that dbt can discover the seed node for {{ ref() }} resolution
during run_operation. Without it, queries referencing the seed table
fail with 'node not found'.
- Add seeds_dir_path to ClickHouseDirectSeeder.__init__
- Write CSV before creating the table; delete it in finally block
- Pass seeds_dir_path from DbtProject._create_seeder()
Co-Authored-By: unknown <>
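The write-then-clean-up pattern described above can be sketched as a context manager. The function name, its parameters, and the create_table callback are hypothetical stand-ins; the real seeder creates the table through the dbt adapter.

```python
import csv
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def seed_csv_for_discovery(rows, table_name, seeds_dir, create_table):
    """Write a seed CSV so dbt can discover the node, then remove it on exit.

    create_table is a hypothetical callback standing in for the direct
    table creation performed by the seeder.
    """
    csv_path = Path(seeds_dir) / f"{table_name}.csv"
    fieldnames = list(rows[0].keys()) if rows else []
    with csv_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    try:
        create_table(table_name, rows)
        yield csv_path  # the CSV stays on disk while the caller runs queries
    finally:
        # Always delete the file, even if table creation or the caller fails.
        csv_path.unlink(missing_ok=True)
```

Keeping the deletion in the finally block means a failed test does not leave stray seed files behind for later dbt invocations to pick up.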
* refactor: remove run_operation retry logic from run_query path
The retry masked non-transient errors (e.g. 'node not found') by
retrying them pointlessly. Since most queries now use the direct
adapter path (AdapterQueryRunner), the retry is no longer needed.
If the log-capture issue resurfaces, we can add a proper fix that
distinguishes transient from non-transient failures.
Co-Authored-By: unknown <>
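If the retry is ever reintroduced, one possible shape for the "proper fix" mentioned above is a retry that only fires for errors classified as transient. This is a sketch under assumptions: TransientError, run_with_retry, and the classification predicate are hypothetical and do not exist in the repository.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def run_with_retry(fn, attempts=3, delay=0.0,
                   is_transient=lambda exc: isinstance(exc, TransientError)):
    """Call fn(), retrying only when the raised error is classified as transient."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Non-transient errors (e.g. 'node not found') surface immediately.
            if attempt == attempts or not is_transient(exc):
                raise
            time.sleep(delay)
```

A deterministic failure such as a missing node is then raised on the first attempt instead of being retried.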
* docs: add comment explaining why clickhouse__has_temp_table_support returns false
Co-Authored-By: unknown <>
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Itamar Hartstein <haritamar@gmail.com>