Add integration tests for data_freshness_sla and volume_threshold (#965)
Merged commit be7de95 into feature/volume-threshold-test
* Add data_freshness_sla and volume_threshold tests

  Add two new Elementary tests:
  - data_freshness_sla: checks whether data was updated before a specified SLA deadline
  - volume_threshold: monitors row count changes with configurable warn/error thresholds, using Elementary's metric caching to avoid redundant computation

  Fixes applied:
  - volume_threshold: union historical metrics with new metrics for comparison
  - volume_threshold: deterministic dedup with source_priority tiebreaker
  - volume_threshold: let get_time_bucket handle defaults
  - data_freshness_sla: treat future-dated data as fresh (remove upper bound)
  - data_freshness_sla: escape single quotes in where_expression
  - data_freshness_sla: simplify deadline_passed logic
  - data_freshness_sla: rename max_timestamp_utc to max_timestamp (no UTC conversion)
  - data_freshness_sla: fix macro comment to match actual behavior
  - data_freshness_sla: document UTC assumption, add ephemeral model check

* fix: handle missing table in read_table when raise_if_empty=False (BigQuery test_seed_run_results)

* Revert "fix: handle missing table in read_table when raise_if_empty=False (BigQuery test_seed_run_results)"

  This reverts commit 9fc552e.

* Add integration tests for data_freshness_sla and volume_threshold (#965)

  - Add integration tests for data_freshness_sla and volume_threshold
  - Fix sla_time YAML sexagesimal issue - use AM/PM format

* Fix sqlfmt issues and Postgres round() bug in test macros

  - Reformat both test_data_freshness_sla.sql and test_volume_threshold.sql to pass sqlfmt
  - Fix Postgres round() bug: cast expression to numeric for round(numeric, int) compatibility
  - Restructure Jinja conditionals in data_freshness_sla to be sqlfmt-compatible
  - Extract where_suffix Jinja set to avoid parentheses inside SQL CASE expressions
  - Use edr_boolean_literal for is_failure CASE and WHERE clause
  - Remove 'where severity_level > 0' filter to prevent NULL fail_calc validation error

* Fix volume_threshold integration tests to use complete buckets

  Elementary only processes complete time buckets (the latest full bucket before run_started_at). With daily buckets, today's bucket is incomplete and gets excluded. The tests were putting anomalous data in today's bucket, so the macro never saw the spike/drop. Fix: shift all test data one day back so the anomalous data lands in yesterday's bucket (the latest complete bucket) and baseline data in the day before.

* Fix cross-database compatibility in volume_threshold macro

  - Replace 'cast(... as numeric)' with edr_type_numeric() for adapter-aware typing
  - Replace 'limit 1' with a row_number() pattern (SQL Server/Fabric compatible)
  - Replace '||' string concatenation with edr_concat() (SQL Server/Fabric compatible)

* Fix Dremio: rename 'prev' alias to 'prev_b' (reserved keyword)

* Fix Dremio: rename 'result' CTE to 'volume_result' (reserved keyword)

* Fix fusion pytz issue and SQL Server/Fabric concat in data_freshness_sla

  - Skip pytz.all_timezones validation in dbt-fusion (known discrepancy, dbt-labs/dbt-fusion#143)
  - Replace || concatenation with edr_concat() in test_data_freshness_sla.sql for SQL Server/Fabric

* Fix fusion pytz.localize() producing incorrect results in calculate_sla_deadline_utc

  In dbt-fusion, pytz.localize() produces incorrect timezone-aware datetimes, causing deadline_passed to be False when it should be True. Use datetime.timezone.utc and replace(tzinfo=) in the fusion path instead.

* Fix fusion: use naive UTC datetimes to avoid broken tz-aware operations

  In dbt-fusion, datetime.timezone.utc doesn't exist and timezone-aware datetime comparison produces incorrect results. Use datetime.utcnow() with manual offset calculation so all comparisons use naive datetimes.
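The complete-bucket rule that the test fix above relies on can be sketched in Python. This is a hypothetical helper for illustration only, not Elementary's actual implementation: with daily buckets, the latest complete bucket is the one ending at the last midnight at or before `run_started_at`, so "today's" partial bucket is excluded.

```python
from datetime import datetime, timedelta


def latest_complete_daily_bucket(run_started_at: datetime) -> tuple[datetime, datetime]:
    """Return (bucket_start, bucket_end) of the latest *complete* daily bucket.

    A daily bucket [midnight, next midnight) is complete only if its end is
    at or before run_started_at, so the current day's partial bucket is skipped.
    """
    bucket_end = run_started_at.replace(hour=0, minute=0, second=0, microsecond=0)
    bucket_start = bucket_end - timedelta(days=1)
    return bucket_start, bucket_end


# A run at 2024-06-15 07:30 sees [2024-06-14, 2024-06-15) as the latest
# complete bucket -- anomalous test data must land on June 14, not June 15.
start, end = latest_complete_daily_bucket(datetime(2024, 6, 15, 7, 30))
print(start.date(), end.date())  # 2024-06-14 2024-06-15
```

This is why the tests shift all seeded data one day back: the spike/drop must fall inside yesterday's bucket to be visible to the macro at all.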
* Fix Fabric, BigQuery, and fusion failures in volume_threshold and data_freshness_sla

  - Fabric: rename CTE current_bucket -> curr_bucket (SQL Server rejects the CURRENT_ prefix)
  - BigQuery: use \' instead of '' to escape single quotes in where_suffix string literals
  - fusion: replace pytz.localize() with stdlib datetime.timezone.utc in calculate_sla_deadline_utc to fix deadline_passed always being False

* Fix sqlfmt formatting in test_data_freshness_sla.sql

* Fix Fabric: replace scalar subquery with JOIN in previous_bucket CTE

  SQL Server/Fabric cannot resolve a CTE name inside a nested subquery within another CTE. Replace (select bucket_start from curr_bucket) with an INNER JOIN to curr_bucket.

* Fix sqlfmt formatting in test_volume_threshold.sql

* Fix freshness SLA deadline check and volume threshold Fabric compatibility

  Move the SLA deadline_passed check from compile-time Python (broken in dbt-fusion due to pytz issues) to SQL, using edr_condition_as_boolean with edr_current_timestamp_in_utc. Replace the cross-CTE subquery in volume_threshold with a LAG window function to fix the Fabric/SQL Server "Invalid object name" error.

* Revert fusion path to use naive UTC datetimes instead of datetime.timezone.utc

  dbt-fusion does not support datetime.timezone.utc, causing "undefined value" render errors. Revert to the pytz probe approach.

* Address CodeRabbit review feedback

  - Fix clock-flaky SLA tests: use Etc/GMT-14 for deadline-passed cases so the deadline is always in the past regardless of CI runner time
  - Use exact status assertions (== "error") instead of != "pass" in volume threshold tests to catch severity regressions
  - Add negative threshold and min_row_count validation in volume_threshold
  - Document the DST limitation in the fusion sla_utils path

* Fix volume threshold assertions: dbt reports 'fail' not 'error'

* Fix ClickHouse and DuckDB CI failures

  ClickHouse: reorder columns in the freshness SLA final_result CTE so is_failure is computed before sla_deadline_utc is cast to string (ClickHouse resolves column refs to already-aliased columns in the same SELECT, causing a DateTime vs String type mismatch). DuckDB: add explicit casts to LAG window function results in volume_threshold to work around a DuckDB internal binder bug that confuses TIMESTAMP and FLOAT types across multiple LAG calls.

* Fix ClickHouse freshness SLA: rename string-cast alias to avoid shadowing

  ClickHouse resolves column references against output aliases regardless of SELECT clause order. The cast(sla_deadline_utc as string) with the same alias name caused the is_failure comparison to use the string version instead of the timestamp, producing a DateTime vs String type mismatch. Renamed to sla_deadline_utc_str internally and re-aliased in the final SELECT.

* Fix DuckDB volume_threshold: replace LAG with ROW_NUMBER + self-join

  DuckDB has an internal binder bug where LAG window functions over UNION ALL sources confuse TIMESTAMP and FLOAT column types, causing "Failed to bind column reference bucket_end: inequal types". Using ROW_NUMBER + self-join achieves the same result without triggering the bug, and is cross-database compatible.

* Fix fusion: use pytz.utc instead of datetime.timezone.utc; fix brittle test timezone

  - sla_utils.sql: replace datetime.timezone.utc with pytz.utc in the fusion path. dbt-fusion's modules.datetime does not expose datetime.timezone, causing dbt1501 "undefined value" errors in all data_freshness_sla tests on fusion.
  - test_data_freshness_sla.py: change test_deadline_not_passed_does_not_fail from Etc/GMT-14 (UTC+14, deadline = 09:59 UTC) to plain UTC (deadline = 23:59 UTC). Etc/GMT-14 caused the test to fail whenever CI ran after 09:59 UTC.

* Fix Fabric, ClickHouse, and Vertica failures in volume_threshold and freshness_sla

  volume_threshold (Fabric + ClickHouse):
  - Cast row_number() to signed int in the bucket_numbered CTE to fix the ClickHouse NO_COMMON_TYPE error (UInt64 vs Int64) on JOIN with bucket_num - 1.
  - Move max(bucket_num) into a separate max_bucket CTE and use an INNER JOIN instead of a scalar subquery in the WHERE clause to fix the SQL Server/Fabric "Invalid object name 'bucket_numbered'" error.

  data_freshness_sla (Vertica):
  - Replace timezone: "Etc/GMT-14" with timezone: "UTC" for tests that need a deadline in the past. Etc/GMT-14 behaved incorrectly in some pytz versions, causing Vertica tests to return 'pass' instead of 'fail'. 12:01am UTC (= 00:01 UTC) is always in the past when CI runs at 07:00+ UTC.

* Fix Dremio: rename 'prev' alias to 'prev_b' (reserved keyword in Dremio)

* Address CodeRabbit review comments: clarify DATA_FRESH semantics, handle zero-baseline spikes, add timestamp_column validation to volume_threshold

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Itamar Hartstein <haritamar@gmail.com>
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Add integration tests for data_freshness_sla and volume_threshold
Summary
Adds integration test coverage for the two new Elementary tests introduced in this PR branch:
`data_freshness_sla` and `volume_threshold`.

- `test_data_freshness_sla.py` (6 tests): `where_expression` filtering
- `test_volume_threshold.py` (8 tests): `direction` parameter (spike-only, drop-only), `min_row_count` skips small baselines, `where_expression` filtering

Both files follow existing integration test conventions (pytest fixtures, `DbtProject.test()`, `generate_dates`, etc.).

Review & Testing Checklist for Human
- `test_stale_data_fails` uses `sla_time: "00:01"` UTC — will break if CI runs between 00:00–00:01 UTC.
- `test_deadline_not_passed_does_not_fail` uses `Etc/GMT-14` to push the deadline into the future — verify this works with pytz in the SLA macro.
- `test_custom_thresholds` warn assertion: asserts `status == "warn"` for an 8% change. This depends on the `fail_calc='max(severity_level)'` / `warn_if='>=1'` / `error_if='>=2'` config in the macro — verify this produces a `"warn"` (not `"fail"` or `"pass"`) in `elementary_test_results`.
- Run the tests locally (`pytest test_data_freshness_sla.py test_volume_threshold.py -vvv --target postgres`) to validate.
- Reuse of `test_id` without re-seeding: in `test_with_where_expression` and `test_custom_thresholds`, the second assertion relies on data persisting from the first seed. Verify this works with the test infrastructure.
- `day_of_week` / `day_of_month` scheduling for `data_freshness_sla` is not tested. Consider if this is acceptable or needs follow-up.

Notes
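The warn/fail semantics the checklist asks reviewers to verify can be sketched as follows. This is a hypothetical Python mapping (not the macro's SQL), assuming the `fail_calc='max(severity_level)'`, `warn_if='>=1'`, `error_if='>=2'` configuration described above, and that dbt reports the error case with status `"fail"` as noted in the commit history:

```python
def volume_test_status(severity_levels: list[int]) -> str:
    """Map per-bucket severity levels to a dbt test status.

    Mirrors fail_calc='max(severity_level)' with warn_if='>=1' and
    error_if='>=2': 0 = within thresholds, 1 = warn threshold crossed,
    2 = error threshold crossed. dbt reports the error case as 'fail'.
    """
    result = max(severity_levels, default=0)  # fail_calc over all buckets
    if result >= 2:  # error_if='>=2'
        return "fail"
    if result >= 1:  # warn_if='>=1'
        return "warn"
    return "pass"


print(volume_test_status([0, 1]))  # warn  (e.g. an 8% change past the warn threshold)
print(volume_test_status([0, 2]))  # fail  (error threshold dominates)
print(volume_test_status([]))      # pass  (no complete buckets flagged)
```

Under this mapping an 8% change that crosses only the warn threshold yields severity_level 1 and hence status `"warn"`, which is exactly what `test_custom_thresholds` asserts.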