
Commit 4121a3f

Merge pull request #2119 from elementary-data/add-sla-and-volume-threshold-test-docs
Clarify anomaly detection methods and execution vs data freshness
2 parents e37213b + a4e28cf commit 4121a3f

3 files changed

Lines changed: 14 additions & 10 deletions


docs/data-tests/data-freshness-sla.mdx

Lines changed: 6 additions & 4 deletions
@@ -12,7 +12,9 @@ import AiGenerateTest from '/snippets/ai-generate-test.mdx';
 
 Verifies that data in a model was updated before a specified SLA deadline time.
 
-This test checks the maximum timestamp value of a specified column in your data to determine whether the data was refreshed before your deadline. Unlike `freshness_anomalies` (which uses ML-based anomaly detection), this test validates against a fixed, explicit SLA time — making it ideal when you have a concrete contractual or operational deadline.
+This test checks the maximum timestamp value of a specified column in your data to determine whether the data was actually refreshed before your deadline. Unlike `freshness_anomalies` (which uses z-score based anomaly detection as a dbt test, or ML-based detection in Elementary Cloud), this test validates against a fixed, explicit SLA time, making it ideal when you have a concrete contractual or operational deadline.
+
+Unlike `execution_sla` (which only checks if the dbt model _ran_ on time), `data_freshness_sla` checks whether the actual _data_ is fresh. A pipeline can run successfully but still serve stale data if, for example, an upstream source didn't update. This test catches that.
 
 ### Use Case
 

@@ -131,9 +133,9 @@ models:
 
 | Feature | `data_freshness_sla` | `freshness_anomalies` | `execution_sla` |
 | --- | --- | --- | --- |
-| What it checks | Data timestamps | Data timestamps | Pipeline run time |
-| Detection method | Fixed SLA deadline | ML-based anomaly detection | Fixed SLA deadline |
-| Best for | Contractual/operational deadlines | Detecting unexpected delays | Pipeline execution deadlines |
+| What it checks | Actual data freshness (timestamps in the data) | Actual data freshness (timestamps in the data) | Pipeline execution (did the model run?) |
+| Detection method | Fixed SLA deadline | Z-score (dbt test) / ML (Cloud) | Fixed SLA deadline |
+| Best for | Contractual/operational deadlines on data | Detecting unexpected delays in data updates | Ensuring the pipeline itself ran on time |
 | Works with sources | Yes | Yes | No (models only) |
 
 ### Notes
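The distinction the new copy draws (fresh data vs. an on-time run) can be sketched in a few lines. This is an illustrative sketch only; the function and parameter names are invented here, not Elementary's actual implementation:

```python
from datetime import datetime, timedelta

def data_freshness_sla_passes(max_timestamp: datetime,
                              sla_deadline: datetime,
                              max_staleness: timedelta) -> bool:
    # At the SLA deadline, the newest value in the monitored timestamp
    # column must be no older than `max_staleness` (hypothetical knob).
    return sla_deadline - max_timestamp <= max_staleness

deadline = datetime(2024, 6, 1, 8, 0)  # 08:00 SLA, illustrative

# Data last updated 07:30 the same morning: within a 2-hour window.
fresh = data_freshness_sla_passes(datetime(2024, 6, 1, 7, 30),
                                  deadline, timedelta(hours=2))

# Data last updated 22:00 the previous day: the pipeline may well have
# run on time, but the data itself is stale.
stale = data_freshness_sla_passes(datetime(2024, 5, 31, 22, 0),
                                  deadline, timedelta(hours=2))
```

The second case is exactly the "pipeline ran but served stale data" failure mode the added paragraph describes: an `execution_sla`-style check would pass while this one fails.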

docs/data-tests/execution-sla.mdx

Lines changed: 3 additions & 1 deletion
@@ -12,7 +12,9 @@ import AiGenerateTest from '/snippets/ai-generate-test.mdx';
 
 Verifies that dbt models are executed successfully before a specified SLA deadline time.
 
-This test checks whether your pipeline completed before a specified deadline on the days you care about. It queries `dbt_run_results` for successful runs of the model and validates that at least one run completed before the SLA deadline.
+This test checks whether your pipeline _ran_ before a specified deadline on the days you care about. It queries `dbt_run_results` for successful runs of the model and validates that at least one run completed before the SLA deadline.
+
+Note that this test only verifies that the model executed, not that the data is actually fresh. If you need to verify that the underlying data was updated (e.g., an upstream source refreshed), use [`data_freshness_sla`](/data-tests/data-freshness-sla) instead.
 
 ### Use Case
 
docs/data-tests/volume-threshold.mdx

Lines changed: 5 additions & 5 deletions
@@ -12,7 +12,7 @@ import AiGenerateTest from '/snippets/ai-generate-test.mdx';
 
 Monitors row count changes between time buckets using configurable percentage thresholds with multiple severity levels.
 
-Unlike `volume_anomalies` (which uses ML-based anomaly detection to determine what's "normal"), this test lets you define explicit percentage thresholds for warnings and errors giving you precise control over when to be alerted. It uses Elementary's metric caching infrastructure to avoid recalculating row counts for buckets that have already been computed.
+Unlike `volume_anomalies` (which uses z-score based anomaly detection as a dbt test, or ML-based detection in Elementary Cloud), this test lets you define explicit percentage thresholds for warnings and errors, giving you precise control over when to be alerted. It uses Elementary's metric caching infrastructure to avoid recalculating row counts for buckets that have already been computed.
 
 ### Use Case
 
@@ -124,12 +124,12 @@ models:
 
 | Parameter | Required | Default | Description |
 | ------------------------- | -------- | ------- | ---------------------------------------------------------------------------- |
-| `timestamp_column` | Yes | | Column to determine time periods |
+| `timestamp_column` | Yes | - | Column to determine time periods |
 | `warn_threshold_percent` | No | 5 | Percentage change that triggers a warning |
 | `error_threshold_percent` | No | 10 | Percentage change that triggers an error |
 | `direction` | No | `both` | Direction to monitor: `both`, `spike`, or `drop` |
 | `time_bucket` | No | `{period: day, count: 1}` | Time bucket configuration |
-| `where_expression` | No | | SQL expression to filter the data |
+| `where_expression` | No | - | SQL expression to filter the data |
 | `days_back` | No | 14 | Days of metric history to retain |
 | `backfill_days` | No | 2 | Days to recalculate on each run |
 | `min_row_count` | No | 100 | Minimum rows in the previous bucket required to trigger the check |
@@ -138,7 +138,7 @@ models:
 
 | Feature | `volume_threshold` | `volume_anomalies` |
 | --- | --- | --- |
-| Detection method | Fixed percentage thresholds | ML-based anomaly detection |
+| Detection method | Fixed percentage thresholds | Z-score (dbt test) / ML (Cloud) |
 | Severity levels | Dual (warn + error) | Single (pass/fail) |
 | Best for | Known acceptable ranges | Unknown/variable patterns |
 | Configuration | Explicit thresholds | Sensitivity tuning |
@@ -147,6 +147,6 @@ models:
 ### Notes
 
 - The `warn_threshold_percent` must be less than or equal to `error_threshold_percent`
-- The test uses Elementary's metric caching infrastructure — row counts for previously computed time buckets are reused across runs
+- The test uses Elementary's metric caching infrastructure. Row counts for previously computed time buckets are reused across runs
 - If the previous bucket has fewer rows than `min_row_count`, the test passes (insufficient data for a meaningful comparison)
 - The test only evaluates completed time buckets
