Merged

Commits (39)
2c08c87
feat: add Microsoft Fabric DWH integration (ELE-5282)
devin-ai-integration[bot] Mar 4, 2026
4699b73
fix: add env_var function to profile generator for fabric password
devin-ai-integration[bot] Mar 4, 2026
a0a7285
fix: use sqlserver adapter for local CI testing, rename fabric__ macr…
devin-ai-integration[bot] Mar 4, 2026
319dee6
fix: T-SQL compatibility for Elementary models (boolean literals, std…
devin-ai-integration[bot] Mar 4, 2026
ba6bece
fix: T-SQL GROUP BY positional refs, undefined macro, variance, maxre…
devin-ai-integration[bot] Mar 4, 2026
bea97a1
feat: refactor T-SQL macros to use fabric__ as primary implementation
devin-ai-integration[bot] Mar 4, 2026
d489172
refactor: remove sqlserver__ wrappers, use fabric__ as sole T-SQL dis…
devin-ai-integration[bot] Mar 4, 2026
4a13b4e
ci: temporarily disable non-fabric targets to focus testing
devin-ai-integration[bot] Mar 4, 2026
a350671
fix: T-SQL compatibility fixes for all anomaly detection tests
devin-ai-integration[bot] Mar 4, 2026
0505cde
ci: re-enable all CI targets (local + cloud) alongside new fabric target
devin-ai-integration[bot] Mar 4, 2026
dd0dedf
fix: reduce fabric max identifier length to 60 to prevent 128-char ov…
devin-ai-integration[bot] Mar 4, 2026
00715f1
fix: CTE-in-subquery workaround, data type normalization, and test LI…
devin-ai-integration[bot] Mar 4, 2026
ddd219c
fix: dispatch render_value for Fabric boolean literal compatibility i…
devin-ai-integration[bot] Mar 4, 2026
f3e8a33
refactor: extract Fabric-specific logic to dispatched macros and use …
devin-ai-integration[bot] Mar 4, 2026
1a8635c
fix: dispatch GROUP BY in get_columns_changes_query_generic for T-SQL…
devin-ai-integration[bot] Mar 4, 2026
e17d7eb
fix: use dbt.concat() for cross-DB string concatenation in schema cha…
devin-ai-integration[bot] Mar 5, 2026
70dbd4b
fix: resolve Fabric Warehouse compatibility issues found in local tes…
haritamar Mar 7, 2026
72c3f8e
ci: add Fabric Warehouse to CI test matrix
haritamar Mar 7, 2026
9864781
ci: add 60-minute job timeout to prevent hung CI runs
devin-ai-integration[bot] Mar 7, 2026
fbd07fb
feat: add sqlserver__target_database() dispatch macro
devin-ai-integration[bot] Mar 7, 2026
3d1aa76
feat: add comprehensive sqlserver__ dispatch macros delegating to fab…
devin-ai-integration[bot] Mar 7, 2026
312ccbc
fix: use do return() for sqlserver dispatches returning objects/value…
devin-ai-integration[bot] Mar 7, 2026
3bba723
fix: use return() instead of {{ }} for sqlserver__ dispatch macros re…
devin-ai-integration[bot] Mar 7, 2026
b04d35a
Merge branch 'devin/ELE-5282-1772640713' of https://git-manager.devin…
devin-ai-integration[bot] Mar 7, 2026
241cb0b
revert: remove redundant sqlserver__ dispatch wrappers
devin-ai-integration[bot] Mar 7, 2026
b083d2b
refactor: hardcode SQL Server password directly, remove MSSQL_SA_PASS…
devin-ai-integration[bot] Mar 7, 2026
0c30c35
refactor: address PR review comments - unify cross-adapter patterns
devin-ai-integration[bot] Mar 7, 2026
ed337d7
refactor: address PR review round 2 - is_tsql macro, merge render_val…
devin-ai-integration[bot] Mar 7, 2026
1487485
fix: normalize TEXT to VARCHAR in sqlserver__get_normalized_data_type
devin-ai-integration[bot] Mar 7, 2026
a50077a
fix: restore positional GROUP BY for non-T-SQL adapters in schema cha…
devin-ai-integration[bot] Mar 7, 2026
9e7778e
refactor: address PR review comments (samples query + T-SQL temp table)
devin-ai-integration[bot] Mar 8, 2026
342f00f
fix: compute anomaly_score after window aggregates
devin-ai-integration[bot] Mar 8, 2026
8821965
fix: inline T-SQL temp table logic into fabric__ macro, remove undefi…
devin-ai-integration[bot] Mar 8, 2026
ad1f9ac
refactor: add edr_condition_as_boolean, edr_is_true/edr_is_false, ren…
devin-ai-integration[bot] Mar 8, 2026
dd9bbca
refactor: simplify column_boolean_monitors with set variable, extract…
devin-ai-integration[bot] Mar 8, 2026
12e20a1
refactor: move set variables to top of model_run_results.sql for read…
devin-ai-integration[bot] Mar 8, 2026
acb01a0
fix: use target.schema for temp view in fabric query_test_result_rows…
devin-ai-integration[bot] Mar 8, 2026
fd27dac
refactor: remove fabric__get_unified_metrics_query, use UNION ALL app…
devin-ai-integration[bot] Mar 8, 2026
a2978b5
fix: use elementary package schema for temp view in fabric query_test…
devin-ai-integration[bot] Mar 8, 2026
5 changes: 4 additions & 1 deletion .github/workflows/test-all-warehouses.yml
@@ -48,7 +48,8 @@ jobs:
       dbt-version:
         ${{ inputs.dbt-version && fromJSON(format('["{0}"]', inputs.dbt-version)) ||
         fromJSON('["latest_official", "latest_pre"]') }}
-      warehouse-type: [postgres, clickhouse, trino, dremio, spark, duckdb]
+      warehouse-type:
+        [postgres, clickhouse, trino, dremio, spark, duckdb, fabric]
       exclude:
         # latest_pre is only tested on postgres
         - dbt-version: latest_pre
@@ -61,6 +62,8 @@ jobs:
           warehouse-type: spark
         - dbt-version: latest_pre
           warehouse-type: duckdb
+        - dbt-version: latest_pre
+          warehouse-type: fabric
     uses: ./.github/workflows/test-warehouse.yml
     with:
       warehouse-type: ${{ matrix.warehouse-type }}
26 changes: 24 additions & 2 deletions .github/workflows/test-warehouse.yml
@@ -18,6 +18,7 @@ on:
           - clickhouse
           - dremio
           - duckdb
+          - fabric
       elementary-ref:
         type: string
         required: false
@@ -51,6 +52,7 @@ on:
 env:
   BRANCH_NAME: ${{ github.head_ref || github.ref_name }}
   TESTS_DIR: ${{ github.workspace }}/dbt-data-reliability/integration_tests
+  MSSQL_SA_PASSWORD: ${{ secrets.MSSQL_SA_PASSWORD || 'Elementary123!' }}

 jobs:
   test:
@@ -100,6 +102,25 @@ jobs:
           timeout 180 bash -c 'until [ "$(docker inspect -f {{.State.Health.Status}} dremio 2>/dev/null)" = "healthy" ]; do sleep 5; done'
           echo "Dremio is healthy."

+      - name: Start SQL Server
+        if: inputs.warehouse-type == 'fabric'
+        working-directory: ${{ env.TESTS_DIR }}
+        env:
+          MSSQL_SA_PASSWORD: ${{ env.MSSQL_SA_PASSWORD }}
+        run: |
+          docker compose -f docker-compose-sqlserver.yml up -d
+          echo "Waiting for SQL Server to become healthy..."
+          timeout 120 bash -c 'until [ "$(docker inspect -f {{.State.Health.Status}} sqlserver 2>/dev/null)" = "healthy" ]; do sleep 5; done'
+          echo "SQL Server is healthy."
+
+      - name: Install ODBC Driver
+        if: inputs.warehouse-type == 'fabric'
+        run: |
+          curl https://packages.microsoft.com/keys/microsoft.asc | sudo tee /etc/apt/trusted.gpg.d/microsoft.asc
+          curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list | sudo tee /etc/apt/sources.list.d/mssql-release.list
+          sudo apt-get update
+          sudo ACCEPT_EULA=Y apt-get install -y msodbcsql18 unixodbc-dev
+
       - name: Start Spark
         if: inputs.warehouse-type == 'spark'
         working-directory: ${{ env.TESTS_DIR }}
@@ -136,19 +157,20 @@ jobs:
         run:
           pip install${{ (inputs.dbt-version == 'latest_pre' && ' --pre') || '' }}
           "dbt-core${{ (!startsWith(inputs.dbt-version, 'latest') && format('=={0}', inputs.dbt-version)) || '' }}"
-          "dbt-${{ (inputs.warehouse-type == 'databricks_catalog' && 'databricks') || (inputs.warehouse-type == 'spark' && 'spark[PyHive]') || (inputs.warehouse-type == 'athena' && 'athena-community') || inputs.warehouse-type }}${{ (!startsWith(inputs.dbt-version, 'latest') && format('~={0}', inputs.dbt-version)) || '' }}"
+          "dbt-${{ (inputs.warehouse-type == 'databricks_catalog' && 'databricks') || (inputs.warehouse-type == 'spark' && 'spark[PyHive]') || (inputs.warehouse-type == 'athena' && 'athena-community') || (inputs.warehouse-type == 'fabric' && 'sqlserver') || inputs.warehouse-type }}${{ (!startsWith(inputs.dbt-version, 'latest') && format('~={0}', inputs.dbt-version)) || '' }}"

       - name: Install dbt-fusion
         if: inputs.dbt-version == 'fusion'
         run: |
           curl -fsSL https://public.cdn.getdbt.com/fs/install/install.sh | sh -s --

       - name: Install Elementary
-        run: pip install "./elementary[${{ (inputs.warehouse-type == 'databricks_catalog' && 'databricks') || inputs.warehouse-type }}]"
+        run: pip install "./elementary[${{ (inputs.warehouse-type == 'databricks_catalog' && 'databricks') || (inputs.warehouse-type == 'fabric' && 'sqlserver') || inputs.warehouse-type }}]"

       - name: Write dbt profiles
         env:
           CI_WAREHOUSE_SECRETS: ${{ secrets.CI_WAREHOUSE_SECRETS || '' }}
+          MSSQL_SA_PASSWORD: ${{ env.MSSQL_SA_PASSWORD }}
         run: |
           # Schema name = dbt_<YYMMDD_HHMMSS>_<branch≤18>_<8-char hash>
           # The hash prevents collisions across concurrent jobs; the branch
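The truncated comment above describes the schema naming scheme; a rough, hypothetical reconstruction in Python (the real sanitization rules and hash inputs live in the workflow script and are assumptions here):

# Hypothetical sketch of dbt_<YYMMDD_HHMMSS>_<branch≤18>_<8-char hash>;
# sanitization and the exact hash source are assumptions, not the workflow's code.
import hashlib
import re
import time

branch = re.sub(r"[^A-Za-z0-9]", "_", "devin/ELE-5282")[:18]  # branch name, capped at 18 chars
stamp = time.strftime("%y%m%d_%H%M%S")                        # YYMMDD_HHMMSS
digest = hashlib.sha256(f"{branch}-{time.time_ns()}".encode()).hexdigest()[:8]
print(f"dbt_{stamp}_{branch}_{digest}")  # e.g. dbt_260307_141503_devin_ELE_5282_1a2b3c4d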
16 changes: 16 additions & 0 deletions integration_tests/docker-compose-sqlserver.yml
@@ -0,0 +1,16 @@
+version: "3.8"
+services:
+  sqlserver:
+    image: mcr.microsoft.com/mssql/server:2022-latest
+    container_name: sqlserver
+    ports:
+      - "127.0.0.1:1433:1433"
+    environment:
+      ACCEPT_EULA: "Y"
+      MSSQL_SA_PASSWORD: "${MSSQL_SA_PASSWORD}"
+      MSSQL_PID: "Developer"
+    healthcheck:
+      test: /opt/mssql-tools18/bin/sqlcmd -S localhost -U sa -P "${MSSQL_SA_PASSWORD}" -C -Q "SELECT 1" -b
+      interval: 10s
+      timeout: 5s
+      retries: 10
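For a quick local sanity check, the service can be probed with the same settings the fabric profile below uses; a minimal sketch, assuming pyodbc and the msodbcsql18 driver are installed (the password fallback mirrors the workflow default):

import os
import pyodbc

# Mirrors the fabric profile: sa user, encryption off, trusted cert, master DB.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=127.0.0.1,1433;"
    "DATABASE=master;"
    "UID=sa;"
    f"PWD={os.environ.get('MSSQL_SA_PASSWORD', 'Elementary123!')};"
    "Encrypt=no;TrustServerCertificate=yes;"
)
print(conn.cursor().execute("SELECT 1").fetchone())  # (1,)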
1 change: 1 addition & 0 deletions integration_tests/profiles/generate_profiles.py
@@ -123,6 +123,7 @@ def main(
         keep_trailing_newline=True,
     )
     env.filters["toyaml"] = _yaml_inline
+    env.globals["env_var"] = lambda key, default="": os.environ.get(key, default)

Review comment (coderabbitai[bot]):
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Scripts executed:

# Locate and read generate_profiles.py around line 126
wc -l integration_tests/profiles/generate_profiles.py
sed -n '110,140p' integration_tests/profiles/generate_profiles.py | cat -n
sed -n '100,134p' integration_tests/profiles/generate_profiles.py | cat -n
sed -n '80,134p' integration_tests/profiles/generate_profiles.py | cat -n

# Look for usage of env_var in the Jinja templates
find integration_tests -name "*.j2" -o -name "*.yml.j2" | head -20
rg "env_var\(" integration_tests/profiles/ -C 2
rg "env_var" integration_tests/ -t yml -t yaml -C 3
rg "env_var" integration_tests/profiles/
fd "\.j2$" integration_tests/ | xargs cat | grep -n "env_var"

# Check if MSSQL_SA_PASSWORD or similar env vars are expected to be set
grep -r "MSSQL_SA_PASSWORD\|env_var" integration_tests/ --include="*.py" --include="*.md"
rg "env_var\(" --type py --type yaml -C 2

🌐 Web query: dbt env_var Jinja function behavior when variable not set, no default

💡 Result:

In dbt, env_var() requires the environment variable to be set unless you pass a default:

  • {{ env_var('MY_VAR') }} (no default) → if MY_VAR is not set, dbt errors with "Env var required but not provided: 'MY_VAR'" (a hard failure during parsing/compilation). [1][2]
  • {{ env_var('MY_VAR', 'some_default') }} → if MY_VAR is not set, dbt returns the provided default instead of erroring. [3]

So "variable not set, no default" behavior = fail fast with an exception/error. [1][2]

Sources:
[1] https://airflow.apache.org/docs/apache-airflow-providers-openlineage/2.6.1/_modules/tests/system/openlineage/operator.html
[2] https://stackoverflow.com/questions/76653253/env-var-function-in-dbt-project-is-not-behaving-as-expected
[3] https://docs.open-metadata.org/latest/connectors/ingestion/workflows/dbt/auto-ingest-dbt-core

Preserve required env_var semantics instead of silently returning empty string.

At line 126, missing env vars silently resolve to "" even when no default is provided. The template at line 73 calls env_var('MSSQL_SA_PASSWORD') without an explicit default, which should fail in strict mode (when secrets are loaded) per dbt's documented behavior. The current implementation bypasses StrictUndefined's fail-fast guarantee.

🔧 Suggested fix
+_MISSING = object()
+
+def _env_var(key: str, default: Any = _MISSING) -> str:
+    value = os.environ.get(key)
+    if value is None:
+        if default is _MISSING:
+            raise click.ClickException(f"Missing required environment variable: {key}")
+        return default
+    return value
+
@@
-    env.globals["env_var"] = lambda key, default="": os.environ.get(key, default)
+    env.globals["env_var"] = _env_var
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@integration_tests/profiles/generate_profiles.py` at line 126, The current
env.globals["env_var"] lambda silently returns "" when no default is provided;
replace it with a real function (e.g., env_var) that distinguishes between an
omitted default and an explicit default: if a default is supplied, return
os.environ.get(key, default); if no default was supplied and the environment
variable is missing, raise an error (e.g., KeyError or a clear RuntimeError) so
templates like env_var('MSSQL_SA_PASSWORD') fail fast in strict mode; update the
reference to env.globals["env_var"] to point to this new function.

     tmpl = env.from_string(template.read_text())
     rendered = tmpl.render(**context)
     output.write_text(rendered)
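A minimal sketch of the behavior the review flags, reproducing the lambda above in a bare jinja2 environment (the StrictUndefined setup from the real generator is omitted):

import os
from jinja2 import Environment

env = Environment()
env.globals["env_var"] = lambda key, default="": os.environ.get(key, default)

os.environ.pop("MSSQL_SA_PASSWORD", None)  # simulate the secret being unset
print(env.from_string("pw={{ env_var('MSSQL_SA_PASSWORD') }}").render())
# prints "pw=": the missing secret silently becomes an empty string, whereas
# dbt's own env_var() would fail fast with "Env var required but not provided".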
15 changes: 14 additions & 1 deletion integration_tests/profiles/profiles.yml.j2
@@ -62,6 +62,19 @@ elementary_tests:
       schema: {{ schema_name }}
       threads: 8

+    fabric: &fabric
+      type: sqlserver
+      driver: "ODBC Driver 18 for SQL Server"
+      server: 127.0.0.1
+      port: 1433
+      database: master
+      schema: {{ schema_name }}
+      user: sa
+      password: "{{ env_var('MSSQL_SA_PASSWORD') }}"
+      encrypt: false
+      trust_cert: true
+      threads: 4
+
     # ── Cloud targets (secrets substituted at CI time) ─────────────────

     snowflake: &snowflake
@@ -122,7 +135,7 @@
 elementary:
   target: postgres
   outputs:
-{%- set targets = ['postgres', 'clickhouse', 'trino', 'dremio', 'spark', 'duckdb', 'snowflake', 'bigquery', 'redshift', 'databricks_catalog', 'athena'] %}
+{%- set targets = ['postgres', 'clickhouse', 'trino', 'dremio', 'spark', 'duckdb', 'fabric', 'snowflake', 'bigquery', 'redshift', 'databricks_catalog', 'athena'] %}
 {%- for t in targets %}
   {{ t }}:
     <<: *{{ t }}
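The outputs loop works because each warehouse block is a YAML anchor that the loop merges into every output via the `<<` merge key. A small illustration of that merge behavior (values are illustrative; PyYAML resolves `<<` on load):

import yaml

doc = """
fabric: &fabric
  type: sqlserver
  server: 127.0.0.1
  encrypt: false
outputs:
  fabric:
    <<: *fabric
    schema: dbt_ci_schema
"""
print(yaml.safe_load(doc)["outputs"]["fabric"])
# {'type': 'sqlserver', 'server': '127.0.0.1', 'encrypt': False, 'schema': 'dbt_ci_schema'}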
10 changes: 7 additions & 3 deletions integration_tests/tests/dbt_project.py
@@ -95,22 +95,26 @@ def _run_query_with_run_operation(self, prerendered_query: str):
         )
         return json.loads(run_operation_results[0])

-    @staticmethod
     def read_table_query(
+        self,
         table_name: str,
         where: Optional[str] = None,
         group_by: Optional[str] = None,
         order_by: Optional[str] = None,
         limit: Optional[int] = None,
         column_names: Optional[List[str]] = None,
     ):
+        is_tsql = self.target in ("fabric", "sqlserver")
+        columns = ", ".join(column_names) if column_names else "*"
+        top_clause = f"TOP {limit} " if limit and is_tsql else ""
+        limit_clause = f"LIMIT {limit}" if limit and not is_tsql else ""
         return f"""
-            SELECT {', '.join(column_names) if column_names else '*'}
+            SELECT {top_clause}{columns}
             FROM {{{{ ref('{table_name}') }}}}
             {f"WHERE {where}" if where else ""}
             {f"GROUP BY {group_by}" if group_by else ""}
             {f"ORDER BY {order_by}" if order_by else ""}
-            {f"LIMIT {limit}" if limit else ""}
+            {limit_clause}
         """

     def read_table(
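To make the dialect split concrete, this is roughly what the two branches render (illustrative table name and limit):

# T-SQL has no LIMIT clause; row capping moves into the select list as TOP N.
limit = 5
for is_tsql in (True, False):
    top_clause = f"TOP {limit} " if limit and is_tsql else ""
    limit_clause = f"LIMIT {limit}" if limit and not is_tsql else ""
    print(f"SELECT {top_clause}* FROM my_table {limit_clause}".strip())
# SELECT TOP 5 * FROM my_table
# SELECT * FROM my_table LIMIT 5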
12 changes: 9 additions & 3 deletions integration_tests/tests/test_anomalies_backfill_logic.py
@@ -29,11 +29,11 @@
 # This returns data points used in the latest anomaly test
 ANOMALY_TEST_POINTS_QUERY = """
 with latest_elementary_test_result as (
-    select id
+    select {top_clause}id
     from {{{{ ref("elementary_test_results") }}}}
     where lower(table_name) = lower('{test_id}')
     order by created_at desc
-    limit 1
+    {limit_clause}
 )

 select result_row
@@ -62,7 +62,13 @@ def get_daily_row_count_metrics(dbt_project: DbtProject, test_id: str):


 def get_latest_anomaly_test_metrics(dbt_project: DbtProject, test_id: str):
-    results = dbt_project.run_query(ANOMALY_TEST_POINTS_QUERY.format(test_id=test_id))
+    is_tsql = dbt_project.target in ("fabric", "sqlserver")
+    query = ANOMALY_TEST_POINTS_QUERY.format(
+        test_id=test_id,
+        top_clause="TOP 1 " if is_tsql else "",
+        limit_clause="" if is_tsql else "limit 1",
+    )
+    results = dbt_project.run_query(query)
     result_rows = [json.loads(result["result_row"]) for result in results]
     return {
         (
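Worth noting why TOP lands inside the CTE rather than as a trailing cap: T-SQL rejects ORDER BY in a CTE or subquery unless TOP (or OFFSET/FETCH) accompanies it. A sketch of the two renderings, with the template trimmed for brevity:

QUERY = "select {top_clause}id from results order by created_at desc {limit_clause}"
print(QUERY.format(top_clause="TOP 1 ", limit_clause="").strip())   # fabric / sqlserver
print(QUERY.format(top_clause="", limit_clause="limit 1").strip())  # other adapters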
12 changes: 9 additions & 3 deletions integration_tests/tests/test_anomalies_ranges.py
@@ -15,11 +15,11 @@

 ANOMALY_TEST_POINTS_QUERY = """
 with latest_elementary_test_result as (
-    select id
+    select {top_clause}id
     from {{{{ ref("elementary_test_results") }}}}
     where lower(table_name) = lower('{test_id}')
    order by created_at desc
-    limit 1
+    {limit_clause}
 )

 select result_row
@@ -29,7 +29,13 @@


 def get_latest_anomaly_test_points(dbt_project: DbtProject, test_id: str):
-    results = dbt_project.run_query(ANOMALY_TEST_POINTS_QUERY.format(test_id=test_id))
+    is_tsql = dbt_project.target in ("fabric", "sqlserver")
+    query = ANOMALY_TEST_POINTS_QUERY.format(
+        test_id=test_id,
+        top_clause="TOP 1 " if is_tsql else "",
+        limit_clause="" if is_tsql else "limit 1",
+    )
+    results = dbt_project.run_query(query)
     return [json.loads(result["result_row"]) for result in results]
5 changes: 4 additions & 1 deletion integration_tests/tests/test_anomaly_exclude_metrics.py
@@ -101,9 +101,12 @@ def test_exclude_specific_timestamps(test_id: str, dbt_project: DbtProject):
     )
     assert test_result["status"] == "pass"

+    ts_type = (
+        "datetime2" if dbt_project.target in ("fabric", "sqlserver") else "timestamp"
+    )
     excluded_buckets_str = ", ".join(
         [
-            "cast('%s' as timestamp)" % cur_ts.strftime(DATE_FORMAT)
+            "cast('%s' as %s)" % (cur_ts.strftime(DATE_FORMAT), ts_type)
             for cur_ts in excluded_buckets
         ]
     )

Review comment (Collaborator, on the ts_type conditional): I think we have an edr_timestamp or something like that, could we use it? Not sure
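A small sketch of the casts the test now emits per target; DATE_FORMAT here is an assumption standing in for the constant the test imports:

from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed stand-in for the test's constant
ts = datetime(2026, 3, 1, 12, 0, 0)
for target, ts_type in (("postgres", "timestamp"), ("fabric", "datetime2")):
    print(target, "->", "cast('%s' as %s)" % (ts.strftime(DATE_FORMAT), ts_type))
# postgres -> cast('2026-03-01 12:00:00' as timestamp)
# fabric -> cast('2026-03-01 12:00:00' as datetime2)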
36 changes: 23 additions & 13 deletions integration_tests/tests/test_column_pii_sampling.py
@@ -7,18 +7,28 @@

 SAMPLES_QUERY = """
 with latest_elementary_test_result as (
-    select id
+    select {top_clause}id
     from {{{{ ref("elementary_test_results") }}}}
     where lower(table_name) = lower('{test_id}')
     order by created_at desc
-    limit 1
+    {limit_clause}
 )

 select result_row
 from {{{{ ref("test_result_rows") }}}}
 where elementary_test_results_id in (select * from latest_elementary_test_result)
 """


+def _fmt_samples_query(dbt_project: DbtProject, test_id: str) -> str:
+    is_tsql = dbt_project.target in ("fabric", "sqlserver")
+    return SAMPLES_QUERY.format(
+        test_id=test_id,
+        top_clause="TOP 1 " if is_tsql else "",
+        limit_clause="" if is_tsql else "limit 1",
+    )
+
+
 TEST_SAMPLE_ROW_COUNT = 5

@@ -47,7 +57,7 @@ def test_column_pii_sampling_enabled(test_id: str, dbt_project: DbtProject):

     samples = [
         json.loads(row["result_row"])
-        for row in dbt_project.run_query(SAMPLES_QUERY.format(test_id=test_id))
+        for row in dbt_project.run_query(_fmt_samples_query(dbt_project, test_id))
     ]

     assert len(samples) == 0
@@ -77,7 +87,7 @@ def test_column_pii_sampling_disabled(test_id: str, dbt_project: DbtProject):

     samples = [
         json.loads(row["result_row"])
-        for row in dbt_project.run_query(SAMPLES_QUERY.format(test_id=test_id))
+        for row in dbt_project.run_query(_fmt_samples_query(dbt_project, test_id))
     ]

     # sample should be {'unique_field': 'user@example.com', 'n_records': 10}
@@ -113,7 +123,7 @@ def test_column_pii_default_tag_override(test_id: str, dbt_project: DbtProject):

     samples = [
         json.loads(row["result_row"])
-        for row in dbt_project.run_query(SAMPLES_QUERY.format(test_id=test_id))
+        for row in dbt_project.run_query(_fmt_samples_query(dbt_project, test_id))
     ]

     # sample should be {'unique_field': 'user@example.com', 'n_records': 10}
@@ -150,7 +160,7 @@ def test_column_pii_sampling_tags_exist_but_flag_disabled(

     samples = [
         json.loads(row["result_row"])
-        for row in dbt_project.run_query(SAMPLES_QUERY.format(test_id=test_id))
+        for row in dbt_project.run_query(_fmt_samples_query(dbt_project, test_id))
     ]

     # When flag is disabled, we get the full sample (not limited by PII filtering)
@@ -188,7 +198,7 @@ def test_column_pii_sampling_all_columns_pii(test_id: str, dbt_project: DbtProject):

     samples = [
         json.loads(row["result_row"])
-        for row in dbt_project.run_query(SAMPLES_QUERY.format(test_id=test_id))
+        for row in dbt_project.run_query(_fmt_samples_query(dbt_project, test_id))
     ]

     # When all columns are PII, no samples should be collected
@@ -219,7 +229,7 @@ def test_unique_test_custom_tag(test_id: str, dbt_project: DbtProject):

     samples = [
         json.loads(row["result_row"])
-        for row in dbt_project.run_query(SAMPLES_QUERY.format(test_id=test_id))
+        for row in dbt_project.run_query(_fmt_samples_query(dbt_project, test_id))
     ]

     assert len(samples) == 0
@@ -249,7 +259,7 @@ def test_accepted_values_multi_tags(test_id: str, dbt_project: DbtProject):

     samples = [
         json.loads(row["result_row"])
-        for row in dbt_project.run_query(SAMPLES_QUERY.format(test_id=test_id))
+        for row in dbt_project.run_query(_fmt_samples_query(dbt_project, test_id))
     ]

     assert len(samples) == 0
@@ -279,7 +289,7 @@ def test_not_null_test_multi_matched_tags(test_id: str, dbt_project: DbtProject):

     samples = [
         json.loads(row["result_row"])
-        for row in dbt_project.run_query(SAMPLES_QUERY.format(test_id=test_id))
+        for row in dbt_project.run_query(_fmt_samples_query(dbt_project, test_id))
     ]

     assert len(samples) == 0
@@ -312,7 +322,7 @@ def test_multiple_pii_columns_mapping(test_id: str, dbt_project: DbtProject):

     samples = [
         json.loads(row["result_row"])
-        for row in dbt_project.run_query(SAMPLES_QUERY.format(test_id=test_id))
+        for row in dbt_project.run_query(_fmt_samples_query(dbt_project, test_id))
     ]

     assert len(samples) == 0
@@ -342,7 +352,7 @@ def test_custom_sql_test_with_pii_column_simple(test_id: str, dbt_project: DbtProject):

     samples = [
         json.loads(row["result_row"])
-        for row in dbt_project.run_query(SAMPLES_QUERY.format(test_id=test_id))
+        for row in dbt_project.run_query(_fmt_samples_query(dbt_project, test_id))
     ]

     assert len(samples) == 0
@@ -372,7 +382,7 @@ def test_meta_tags_and_accepted_values(test_id: str, dbt_project: DbtProject):

     samples = [
         json.loads(row["result_row"])
-        for row in dbt_project.run_query(SAMPLES_QUERY.format(test_id=test_id))
+        for row in dbt_project.run_query(_fmt_samples_query(dbt_project, test_id))
     ]

     assert len(samples) == 0