
feat: add timestamp normalization for BigQuery artifacts#977

Open
niccoloalexander wants to merge 3 commits into elementary-data:master from niccoloalexander:feat/bigquery-adaptability-improvements

Conversation

@niccoloalexander

@niccoloalexander niccoloalexander commented Mar 31, 2026

Issues:

  • Hyphenated BigQuery dataset names (e.g. "project-name") fail to parse when using dbt Fusion
  • dbt artifact timestamps that elementary writes to BigQuery carry nanosecond-level fractional seconds, which BigQuery cannot parse as TIMESTAMP values (microsecond precision)

Changes:

  • Introduced a new macro normalize_artifact_timestamp_precision to enforce BigQuery-compatible timestamp precision.
  • Updated existing macros to use this new macro for the execute_started_at, execute_completed_at, compile_started_at, and compile_completed_at fields in upload_run_results.sql and upload_source_freshness.sql.
  • Enhanced schema existence checks for BigQuery in create_elementary_tests_schema.sql and get_elementary_tests_schema.sql to improve compatibility.
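The macro body is not shown in this description. As a rough illustration only (the macro name comes from the PR; the body below is entirely hypothetical, and assumes the timestamp arrives as an ISO-8601 string without a trailing timezone suffix), trimming fractional seconds to BigQuery's six-digit microsecond limit could look like:

```jinja
{% macro normalize_artifact_timestamp_precision(timestamp_value) %}
  {# Hypothetical sketch: BigQuery TIMESTAMP supports at most microsecond
     (6-digit) fractional seconds, while dbt artifacts can carry nanoseconds. #}
  {%- if timestamp_value is none -%}
    {{ return(none) }}
  {%- endif -%}
  {%- set parts = timestamp_value.split('.') -%}
  {%- if parts | length == 2 -%}
    {#- Keep at most 6 fractional digits -#}
    {{ return(parts[0] ~ '.' ~ parts[1][:6]) }}
  {%- endif -%}
  {{ return(timestamp_value) }}
{% endmacro %}
```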

Summary by CodeRabbit

  • Bug Fixes
    • Normalize timestamp precision for artifact timing fields to prevent overly‑precise fractional seconds (improves BigQuery compatibility).
    • Improve schema-existence checks and validation logic to handle BigQuery information schema correctly, reducing false negatives when creating or detecting test schemas.

@github-actions
Contributor

👋 @niccoloalexander
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

@coderabbitai

coderabbitai bot commented Mar 31, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e759373d-f019-4688-9d95-a74d94d09d58

📥 Commits

Reviewing files that changed from the base of the PR and between 7e1be1b and 8a7967c.

📒 Files selected for processing (2)
  • macros/edr/tests/on_run_start/create_elementary_tests_schema.sql
  • macros/edr/tests/test_utils/get_elementary_tests_schema.sql
🚧 Files skipped from review as they are similar to previous changes (1)
  • macros/edr/tests/on_run_start/create_elementary_tests_schema.sql

📝 Walkthrough

Walkthrough

Normalize timestamp precision for BigQuery artifacts and switch to BigQuery-specific INFORMATION_SCHEMA schema existence checks while retaining adapter-based checks for other targets across EDR and test utility macros.

Changes

  • Timestamp Precision Normalization (macros/edr/dbt_artifacts/upload_run_results.sql, macros/edr/dbt_artifacts/upload_source_freshness.sql): Added normalize_artifact_timestamp_precision(timestamp_value) and updated flatten_run_result / flatten_source_freshness to pass execute_*_at and compile_*_at through the normalizer for BigQuery targets.
  • BigQuery Schema Existence Checks (macros/edr/tests/on_run_start/create_elementary_tests_schema.sql, macros/edr/tests/test_utils/get_elementary_tests_schema.sql): Replaced unconditional adapter.check_schema_exists(...) calls with a branch: for target.type == "bigquery", run an INFORMATION_SCHEMA.SCHEMATA COUNT(*) query (case-insensitive) and derive existence from the count; otherwise keep the adapter check.
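The case-insensitive COUNT(*) branch described above presumably issues a query of roughly this shape (identifier quoting and the variable names database_name / tests_schema_name are assumptions, not the PR's exact code):

```sql
-- Hypothetical sketch of the BigQuery existence check: count matching
-- schemas case-insensitively in the dataset's INFORMATION_SCHEMA view.
select count(*) as schema_count
from `{{ database_name }}`.INFORMATION_SCHEMA.SCHEMATA
where upper(schema_name) = upper('{{ tests_schema_name }}')
```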

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

Hop-hop, I trim the timestamps neat,
Six tiny ticks make records sweet.
I peek in schemas, count with care,
BigQuery answers — none elsewhere. 🐇✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check — ✅ Passed: Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: The pull request title accurately describes the main change: adding timestamp normalization functionality specifically for BigQuery artifacts, the primary focus across the modified files.
  • Docstring Coverage — ✅ Passed: No functions found in the changed files to evaluate; docstring coverage check skipped.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
macros/edr/tests/on_run_start/create_elementary_tests_schema.sql (1)

9-23: Consider extracting this BigQuery schema-exists check into a shared macro.

This logic is duplicated in macros/edr/tests/test_utils/get_elementary_tests_schema.sql (same SQL + result parsing pattern). A shared helper would keep behavior consistent and reduce drift.
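A shared helper along the lines the reviewer suggests might look like this (the macro name comes from the review comment; the body is a sketch, not the repository's code):

```jinja
{% macro get_elementary_tests_schema_exists(database_name, tests_schema_name) %}
  {# Sketch of the deduplicated check: one COUNT(*) query against
     INFORMATION_SCHEMA.SCHEMATA, parsed into a boolean result. #}
  {% set schema_exists_sql %}
    select count(*)
    from `{{ database_name }}`.INFORMATION_SCHEMA.SCHEMATA
    where upper(schema_name) = upper('{{ tests_schema_name }}')
  {% endset %}
  {% set schema_exists_result = elementary.run_query(schema_exists_sql) %}
  {{ return(schema_exists_result.rows[0][0] | int > 0) }}
{% endmacro %}
```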

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/edr/tests/on_run_start/create_elementary_tests_schema.sql` around
lines 9 - 23, Extract the BigQuery schema-exists logic (the block that builds
schema_exists_sql, calls elementary.run_query into schema_exists_result, and
computes schema_exists) into a shared macro (e.g.,
get_elementary_tests_schema_exists) and use that macro from both
create_elementary_tests_schema.sql and get_elementary_tests_schema.sql; the
macro should accept database_name and tests_schema_name, run the same
INFORMATION_SCHEMA query via elementary.run_query, parse rows[0][0] to an int
boolean, and return the boolean so you can replace the duplicated
schema_exists_sql/schema_exists_result/schema_exists code with a single macro
call.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@macros/edr/tests/on_run_start/create_elementary_tests_schema.sql`:
- Around line 15-20: The current check reads schema_exists_result.rows[0][0]
directly which breaks in Fusion; replace the direct rows access by converting
the run_query result with elementary.agate_to_dicts(schema_exists_result) and
then inspect the first dict/value to determine existence (update the logic
around schema_exists_result and schema_exists); apply the same change in both
locations referencing elementary.run_query in create_elementary_tests_schema.sql
and get_elementary_tests_schema.sql so the code uses
elementary.agate_to_dicts(...) instead of schema_exists_result.rows[0][0].

In `@macros/edr/tests/test_utils/get_elementary_tests_schema.sql`:
- Around line 24-29: The current legacy_schema_exists computation reads
legacy_schema_exists_result.rows[0][0] which breaks in dbt Fusion because
elementary.run_query() can return a different shape; update the logic that sets
legacy_schema_exists (and the variable legacy_schema_exists_result from
elementary.run_query(legacy_schema_exists_sql)) to normalize the result via
elementary.agate_to_dicts(legacy_schema_exists_result) (or the equivalent
conversion) and then check the first row/value safely via the normalized
dict/list structure so both legacy and Fusion run_query() result shapes are
supported.
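The suggested fix replaces direct row access with the project's elementary.agate_to_dicts conversion. Schematically (variable names are illustrative, taken from the comments above):

```jinja
{# Before: breaks under dbt Fusion, whose run_query() result shape differs #}
{% set schema_exists = schema_exists_result.rows[0][0] | int > 0 %}

{# After: normalize the result shape first, then read the single value #}
{% set result_dicts = elementary.agate_to_dicts(schema_exists_result) %}
{% set schema_exists = result_dicts and (result_dicts[0].values() | list)[0] | int > 0 %}
```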


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 99ee226d-2fc5-4e5e-86f8-97a52e265b73

📥 Commits

Reviewing files that changed from the base of the PR and between b8c7ab0 and 7e1be1b.

📒 Files selected for processing (4)
  • macros/edr/dbt_artifacts/upload_run_results.sql
  • macros/edr/dbt_artifacts/upload_source_freshness.sql
  • macros/edr/tests/on_run_start/create_elementary_tests_schema.sql
  • macros/edr/tests/test_utils/get_elementary_tests_schema.sql

… review comments

- Updated `create_elementary_tests_schema.sql` and `get_elementary_tests_schema.sql` to utilize a more robust method for checking schema existence by converting query results to dictionaries.
- Improved readability and maintainability of the schema existence logic.
