
Add integration tests for data_freshness_sla and volume_threshold#965

Merged
haritamar merged 2 commits into feature/volume-threshold-test from devin/1773323122-add-integration-tests-sla-volume-threshold
Mar 12, 2026

Conversation

@devin-ai-integration
Contributor

Add integration tests for data_freshness_sla and volume_threshold

Summary

Adds integration test coverage for the two new Elementary tests introduced in this PR branch: data_freshness_sla and volume_threshold.

test_data_freshness_sla.py (6 tests):

  • Fresh data passes, stale data fails, no-data fails
  • Deadline-not-yet-passed does not fail
  • where_expression filtering
  • Non-UTC timezone handling

test_volume_threshold.py (8 tests):

  • Stable volume passes
  • Large spike/drop triggers failure
  • direction parameter (spike-only, drop-only)
  • min_row_count skips small baselines
  • Custom warn/error thresholds
  • where_expression filtering

Both files follow existing integration test conventions (pytest fixtures, DbtProject.test(), generate_dates, etc.).
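To make the seeding convention concrete, here is a plain-Python sketch of the date-generation pattern the tests rely on. `generate_dates` below is a hypothetical stand-in for the repo's helper of the same name, not its actual implementation:

```python
from datetime import datetime, timedelta

def generate_dates(base_date, step=timedelta(days=1), days_back=3):
    # Hypothetical re-sketch of the repo's generate_dates helper:
    # walk backwards from base_date at a fixed step, producing the
    # timestamps used to seed the test table.
    end = base_date - timedelta(days=days_back)
    dates = []
    cur = base_date
    while cur > end:
        dates.append(cur)
        cur -= step
    return dates

# Seed rows shaped like the integration tests' fixtures.
rows = [
    {"updated_at": d.strftime("%Y-%m-%d %H:%M:%S")}
    for d in generate_dates(datetime(2026, 3, 12))
]
```

The real tests pass such rows to `DbtProject.test()`; this sketch only shows the data-shaping step.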

Review & Testing Checklist for Human

  • Time-sensitive flakiness: test_stale_data_fails uses sla_time: "00:01" UTC — will break if CI runs between 00:00–00:01 UTC. test_deadline_not_passed_does_not_fail uses Etc/GMT-14 to push deadline into the future — verify this works with pytz in the SLA macro.
  • test_custom_thresholds warn assertion: Asserts status == "warn" for an 8% change. This depends on the fail_calc='max(severity_level)' / warn_if='>=1' / error_if='>=2' config in the macro — verify this produces a "warn" (not "fail" or "pass") in elementary_test_results.
  • Tests were not executed against any warehouse — they were written by reading existing test patterns and the macro SQL. Run at least the Postgres suite (pytest test_data_freshness_sla.py test_volume_threshold.py -vvv --target postgres) to validate.
  • Multi-step tests reuse test_id without re-seeding: In test_with_where_expression and test_custom_thresholds, the second assertion relies on data persisting from the first seed. Verify this works with the test infrastructure.
  • Missing coverage: day_of_week/day_of_month scheduling for data_freshness_sla is not tested. Consider if this is acceptable or needs follow-up.
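The flakiness window called out in the first bullet reduces to a single comparison. A minimal sketch of the deadline check (not the macro's actual Jinja code — `deadline_passed` is an illustrative name):

```python
from datetime import datetime, time, timezone

def deadline_passed(sla_time: time, now: datetime) -> bool:
    # Today's SLA deadline in UTC has passed once `now` reaches the
    # current date combined with sla_time. If CI runs in the minute
    # before a 00:01 deadline, stale data has not yet "missed" it.
    deadline = datetime.combine(now.date(), sla_time, tzinfo=timezone.utc)
    return now >= deadline
```

With `sla_time: "00:01"`, a run at 07:00 UTC sees the deadline as passed, but a run at 00:00:30 UTC does not — the source of the 00:00–00:01 flakiness window.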

Notes

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@coderabbitai

coderabbitai bot commented Mar 12, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4230a797-7c8d-47b2-a6fe-aefe4f566364

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.

haritamar merged commit be7de95 into feature/volume-threshold-test on Mar 12, 2026
1 of 2 checks passed
haritamar deleted the devin/1773323122-add-integration-tests-sla-volume-threshold branch on March 12, 2026 at 14:05
arbiv pushed a commit that referenced this pull request Apr 5, 2026
* Add data_freshness_sla and volume_threshold tests

Add two new Elementary tests:
- data_freshness_sla: checks if data was updated before a specified SLA deadline
- volume_threshold: monitors row count changes with configurable warn/error
  thresholds, using Elementary's metric caching to avoid redundant computation

Fixes applied:
- volume_threshold: union historical metrics with new metrics for comparison
- volume_threshold: deterministic dedup with source_priority tiebreaker
- volume_threshold: let get_time_bucket handle defaults
- data_freshness_sla: treat future-dated data as fresh (remove upper bound)
- data_freshness_sla: escape single quotes in where_expression
- data_freshness_sla: simplify deadline_passed logic
- data_freshness_sla: rename max_timestamp_utc to max_timestamp (no UTC conversion)
- data_freshness_sla: fix macro comment to match actual behavior
- data_freshness_sla: document UTC assumption, add ephemeral model check
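The quote-escaping fix for `where_expression` amounts to doubling single quotes inside the SQL string literal. A hypothetical helper illustrating the idea, including the backslash variant that a later commit applies for BigQuery (`escape_single_quotes` is an illustrative name, not the macro's):

```python
def escape_single_quotes(expr: str, style: str = "ansi") -> str:
    # ANSI SQL escapes a quote inside a string literal by doubling it;
    # BigQuery string literals use a backslash escape instead.
    if style == "bigquery":
        return expr.replace("'", "\\'")
    return expr.replace("'", "''")
```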

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: handle missing table in read_table when raise_if_empty=False (BigQuery test_seed_run_results)

Co-authored-by: Cursor <cursoragent@cursor.com>

* Revert "fix: handle missing table in read_table when raise_if_empty=False (BigQuery test_seed_run_results)"

This reverts commit 9fc552e.

* Add integration tests for data_freshness_sla and volume_threshold (#965)

* Add integration tests for data_freshness_sla and volume_threshold

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* Fix sla_time YAML sexagesimal issue - use AM/PM format

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
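For context on the sexagesimal issue: YAML 1.1 resolves unquoted colon-separated digit scalars such as `1:30` as base-60 integers, so an `sla_time` value can silently load as an int rather than a string. A stdlib-only sketch of the resolution rule (quoting the value or using an AM/PM form like `12:01am` sidesteps the resolver entirely):

```python
def yaml11_sexagesimal(token: str) -> int:
    # YAML 1.1 base-60 integer resolution: "1:30" -> 1*60 + 30 = 90.
    # This is why a time-of-day written unquoted in YAML can come
    # back as a number instead of the expected "HH:MM" string.
    value = 0
    for part in token.split(":"):
        value = value * 60 + int(part)
    return value
```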

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Itamar Hartstein <haritamar@gmail.com>

* Fix sqlfmt issues and Postgres round() bug in test macros

- Reformat both test_data_freshness_sla.sql and test_volume_threshold.sql to pass sqlfmt
- Fix Postgres round() bug: cast expression to numeric for round(numeric, int) compatibility
- Restructure Jinja conditionals in data_freshness_sla to be sqlfmt-compatible
- Extract where_suffix Jinja set to avoid parentheses inside SQL CASE expressions
- Use edr_boolean_literal for is_failure CASE and WHERE clause
- Remove 'where severity_level > 0' filter to prevent NULL fail_calc validation error

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* Fix volume_threshold integration tests to use complete buckets

Elementary only processes complete time buckets (the latest full bucket
before run_started_at). With daily buckets, today's bucket is incomplete
and gets excluded. Tests were putting anomalous data in today's bucket,
so the macro never saw the spike/drop.

Fix: shift all test data one day back so the anomalous data lands in
yesterday's bucket (the latest complete bucket) and baseline data in
the day before.

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
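The complete-bucket rule can be modeled in a few lines. This is a simplified illustration assuming daily buckets aligned to midnight — not Elementary's actual `get_time_bucket` logic:

```python
from datetime import datetime, timedelta

def latest_complete_bucket(run_started_at: datetime,
                           bucket: timedelta = timedelta(days=1)) -> datetime:
    # Only buckets that closed before run_started_at are processed,
    # so with daily buckets "today" is always excluded and anomalous
    # seed data must land in yesterday's bucket to be seen.
    epoch = datetime(1970, 1, 1)
    n = (run_started_at - epoch) // bucket
    return epoch + (n - 1) * bucket  # start of the previous (complete) bucket
```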

* Fix cross-database compatibility in volume_threshold macro

- Replace 'cast(... as numeric)' with edr_type_numeric() for adapter-aware type
- Replace 'limit 1' with row_number() pattern (SQL Server/Fabric compatible)
- Replace '||' string concat with edr_concat() (SQL Server/Fabric compatible)

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* Fix Dremio: rename 'prev' alias to 'prev_b' (reserved keyword)

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* Fix Dremio: rename 'result' CTE to 'volume_result' (reserved keyword)

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* Fix fusion pytz issue and SQL Server/Fabric concat in data_freshness_sla

- Skip pytz.all_timezones validation in dbt-fusion (known discrepancy, dbt-labs/dbt-fusion#143)
- Replace || concatenation with edr_concat() in test_data_freshness_sla.sql for SQL Server/Fabric

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* Fix fusion pytz.localize() producing incorrect results in calculate_sla_deadline_utc

In dbt-fusion, pytz.localize() produces incorrect timezone-aware datetimes,
causing deadline_passed to be False when it should be True. Use
datetime.timezone.utc and replace(tzinfo=) in the fusion path instead.

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* Fix fusion: use naive UTC datetimes to avoid broken tz-aware operations

In dbt-fusion, datetime.timezone.utc doesn't exist and timezone-aware
datetime comparison produces incorrect results. Use datetime.utcnow()
with manual offset calculation so all comparisons use naive datetimes.

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
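The manual-offset approach can be illustrated with naive datetimes only — a sketch of the idea, not the actual `sla_utils.sql` code:

```python
from datetime import datetime, timedelta

def deadline_to_utc_naive(local_deadline: datetime,
                          utc_offset_hours: float) -> datetime:
    # Keep everything timezone-naive and subtract the zone's UTC
    # offset by hand, avoiding tz-aware arithmetic (broken in
    # dbt-fusion per the commit above).
    return local_deadline - timedelta(hours=utc_offset_hours)
```

For example, a 23:59 deadline in UTC+14 corresponds to 09:59 UTC the same day — matching the Etc/GMT-14 behavior discussed later in this log.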

* Fix Fabric, BigQuery, and fusion failures in volume_threshold and data_freshness_sla

- Fabric: rename CTE current_bucket -> curr_bucket (SQL Server rejects CURRENT_ prefix)
- BigQuery: use \' instead of '' to escape single quotes in where_suffix string literals
- fusion: replace pytz.localize() with stdlib datetime.timezone.utc in calculate_sla_deadline_utc to fix deadline_passed always being False

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix sqlfmt formatting in test_data_freshness_sla.sql

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix Fabric: replace scalar subquery with JOIN in previous_bucket CTE

SQL Server/Fabric cannot resolve a CTE name inside a nested subquery
within another CTE. Replace (select bucket_start from curr_bucket)
with an INNER JOIN to curr_bucket.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix sqlfmt formatting in test_volume_threshold.sql

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix freshness SLA deadline check and volume threshold Fabric compatibility

Move SLA deadline_passed check from compile-time Python (broken in
dbt-fusion due to pytz issues) to SQL-level using
edr_condition_as_boolean with edr_current_timestamp_in_utc.

Replace cross-CTE subquery in volume_threshold with LAG window function
to fix Fabric/SQL Server "Invalid object name" error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revert fusion path to use naive UTC datetimes instead of datetime.timezone.utc

dbt-fusion does not support datetime.timezone.utc, causing
"undefined value" render errors. Revert to the pytz probe approach.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Address CodeRabbit review feedback

- Fix clock-flaky SLA tests: use Etc/GMT-14 for deadline-passed cases
  so the deadline is always in the past regardless of CI runner time
- Use exact status assertions (== "error") instead of != "pass" in
  volume threshold tests to catch severity regressions
- Add negative threshold and min_row_count validation in volume_threshold
- Document DST limitation in fusion sla_utils path

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
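Worth spelling out for reviewers: POSIX-style `Etc/GMT-14` inverts the sign, i.e. it denotes UTC+14, the earliest clock on Earth — which is why a local-time deadline there is usually already past in UTC. A quick stdlib check (requires the system tz database):

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

# Etc/GMT-14 is UTC+14 despite the minus sign (POSIX convention).
dt = datetime(2026, 3, 12, 23, 59, tzinfo=ZoneInfo("Etc/GMT-14"))
offset = dt.utcoffset()  # timedelta of +14 hours
```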

* Fix volume threshold assertions: dbt reports 'fail' not 'error'

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix ClickHouse and DuckDB CI failures

ClickHouse: reorder columns in freshness SLA final_result CTE so
is_failure is computed before sla_deadline_utc is cast to string
(ClickHouse resolves column refs to already-aliased columns in the
same SELECT, causing DateTime vs String type mismatch).

DuckDB: add explicit casts to LAG window function results in
volume_threshold to work around DuckDB internal binder bug that
confuses TIMESTAMP and FLOAT types across multiple LAG calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix ClickHouse freshness SLA: rename string-cast alias to avoid shadowing

ClickHouse resolves column references against output aliases regardless
of SELECT clause order. The cast(sla_deadline_utc as string) with the
same alias name caused the is_failure comparison to use the string
version instead of the timestamp, producing DateTime vs String type
mismatch. Renamed to sla_deadline_utc_str internally and re-aliased
in the final SELECT.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix DuckDB volume_threshold: replace LAG with ROW_NUMBER + self-join

DuckDB has an internal binder bug where LAG window functions over
UNION ALL sources confuse TIMESTAMP and FLOAT column types, causing
"Failed to bind column reference bucket_end: inequal types". Using
ROW_NUMBER + self-join achieves the same result without triggering
the bug, and is cross-database compatible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix fusion: use pytz.utc instead of datetime.timezone.utc; fix brittle test timezone

- sla_utils.sql: replace datetime.timezone.utc with pytz.utc in fusion path.
  dbt-fusion's modules.datetime does not expose datetime.timezone, causing
  dbt1501 "undefined value" errors in all data_freshness_sla tests on fusion.

- test_data_freshness_sla.py: change test_deadline_not_passed_does_not_fail
  from Etc/GMT-14 (UTC+14, deadline = 09:59 UTC) to plain UTC (deadline = 23:59 UTC).
  Etc/GMT-14 caused the test to fail whenever CI ran after 09:59 UTC.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix Fabric, ClickHouse, and Vertica failures in volume_threshold and freshness_sla

volume_threshold (Fabric + ClickHouse):
- Cast row_number() to signed int in bucket_numbered CTE to fix ClickHouse
  NO_COMMON_TYPE error (UInt64 vs Int64) on JOIN with bucket_num - 1.
- Move max(bucket_num) into a separate max_bucket CTE and use INNER JOIN
  instead of a scalar subquery in the WHERE clause to fix SQL Server / Fabric
  "Invalid object name 'bucket_numbered'" error.

data_freshness_sla (Vertica):
- Replace timezone: "Etc/GMT-14" with timezone: "UTC" for tests that need a
  deadline in the past. Etc/GMT-14 behaved incorrectly in some pytz versions,
  causing Vertica tests to return 'pass' instead of 'fail'. 12:01am UTC
  (= 00:01 UTC) is always in the past when CI runs at 07:00+ UTC.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix Dremio: rename 'prev' alias to 'prev_b' (reserved keyword in Dremio)

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* Address CodeRabbit review comments: clarify DATA_FRESH semantics, handle zero-baseline spikes, add timestamp_column validation to volume_threshold

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Itamar Hartstein <haritamar@gmail.com>
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
