fix: use py_<YYMMDD_HHMMSS>_<branch>_<hash> schema naming to prevent CI collisions by devin-ai-integration[bot] · Pull Request #2126 · elementary-data/elementary

devin-ai-integration · 2026-02-27T19:16:27Z

fix: use `py_<YYMMDD_HHMMSS>_<branch>_<hash>` schema naming to prevent CI collisions

Summary

Replaces the old truncation-based CI schema naming (py_<warehouse>_dbt_<version>_<branch> truncated to 40 chars) with a hash-based approach that prevents cross-branch collisions when concurrent CI jobs target the same warehouse.

- Old format: `py_` + `<warehouse>_dbt_<dbt_version>_<branch>`, truncated to 40 chars via `head -c 40`
- New format: `py_<YYMMDD_HHMMSS>_<branch≤19>_<8-char SHA-256 hash of concurrency group>`

The hash is derived from the full concurrency group key (warehouse × dbt-version × branch), so jobs that can run concurrently get different schema names unless two keys collide on the first 8 hex characters (32 bits) of their SHA-256 digests, which is very unlikely. The timestamp (second-level, explicit UTC) gives each CI run its own schema and enables time-based cleanup of stale schemas.
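
The derivation described above can be sketched roughly as follows. This is an illustration, not the workflow's actual step: the function name `build_schema_name`, the `warehouse_version_branch` layout of the concurrency group string, and the optional fourth argument for pinning the timestamp are all assumptions made here. It also assumes GNU `sha256sum` is available.

```shell
# Sketch only. The exact concurrency-group string used by the workflow is not
# shown in this description, so the layout below is an assumption.
build_schema_name() {
  local warehouse="$1" dbt_version="$2" branch="$3"
  # Optional 4th argument pins the timestamp (handy for testing);
  # otherwise use the current time in explicit UTC.
  local date_stamp="${4:-$(date -u +%y%m%d_%H%M%S)}"

  # Hash the full, untruncated key so that different jobs get different hashes.
  local concurrency_group="${warehouse}_${dbt_version}_${branch}"
  local short_hash
  # printf '%s' avoids hashing a trailing newline (same intent as echo -n).
  short_hash=$(printf '%s' "$concurrency_group" | sha256sum | head -c 8)

  # Lowercase, replace anything outside [a-z0-9] with "_", collapse runs of
  # underscores, then cap the result at 19 characters.
  local safe_branch
  safe_branch=$(printf '%s\n' "$branch" \
    | awk '{print tolower($0)}' \
    | sed 's/[^a-z0-9]/_/g; s/__*/_/g' \
    | head -c 19)

  printf 'py_%s_%s_%s\n' "$date_stamp" "$safe_branch" "$short_hash"
}
```

Calling `build_schema_name postgres 1.8.0 "feature/v2--hotfix"` would print a name of the form `py_<YYMMDD_HHMMSS>_feature_v2_hotfix_<hash>`. Because the hash covers the untruncated key, two branches that share their first 19 sanitized characters still end up with different schema names.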

This is a companion PR to dbt-data-reliability#940 which applies the same pattern with a dbt_ prefix.

Updates since last revision

- Added HHMM to timestamp (YYMMDD_HHMM, 11 chars) for per-run uniqueness. This was motivated by an Athena investigation showing stale data_monitoring_metrics from prior runs contaminating test results when schemas collide across runs on the same day
- Added seconds (%S) to timestamp (YYMMDD_HHMMSS, 13 chars) per maintainer request for completeness
- Branch truncated to 19 chars (down from 29) to accommodate the longer timestamp while staying within budget
- Explicit UTC (date -u) for timestamp consistency across runners
- Collapsed consecutive underscores in sanitized branch names (s/__*/_/g)

Budget (PostgreSQL 63-char limit):
py_(3) + timestamp(13) + _(1) + branch(≤19) + _(1) + hash(8) = 45, plus _elementary(11) + _gw7(4) = 60
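
The budget arithmetic can be checked mechanically. The sketch below builds a worst-case name (19-character branch, placeholder hash `deadbeef`, example timestamp) and appends a suffix; the helper name `check_budget` is hypothetical, and the suffixes are the ones quoted in the budget line.

```shell
check_budget() {
  # $1: everything appended after the base schema name, e.g. "_elementary_gw7"
  local suffix="$1"
  # Worst case: 13-char timestamp, 19-char branch (19 zeros), 8-char hash.
  local base
  base="py_260227_191627_$(printf '%019d' 0)_deadbeef"
  local full="${base}${suffix}"
  # 63 is PostgreSQL's maximum identifier length.
  echo "base=${#base} full=${#full} margin=$((63 - ${#full}))"
}

check_budget "_elementary_gw7"    # prints: base=45 full=60 margin=3
check_budget "_elementary_gw10"   # prints: base=45 full=61 margin=2
```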

Review & Testing Checklist for Human

- Verify 60-char budget math. py_(3) + YYMMDD_HHMMSS(13) + _(1) + branch(≤19) + _(1) + hash(8) = 45, plus _elementary(11) + _gw7(4) = 60. Confirm this holds for your longest xdist worker suffix and schema postfixes. The margin is only 3 chars; suffixes that add more than that will be silently truncated by PostgreSQL.
- Run a CI job on a branch and verify the schema name in the "Write dbt profiles" step output matches the expected py_YYMMDD_HHMMSS_branch_hash format (with seconds).
- Old schemas are now orphaned. Existing schemas named py_<warehouse>_dbt_... will no longer be used or cleaned up by CI. Plan manual cleanup if needed.
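
This PR does not include a cleanup job. As a sketch of how the timestamp could support the time-based cleanup mentioned in the summary, the hypothetical helper below decides whether a schema name is older than a cutoff. Listing and dropping schemas is warehouse-specific and left out. Old-format names do not match the pattern and are deliberately skipped, so they would still need the manual cleanup noted above.

```shell
is_stale_schema() {
  # $1: schema name, $2: cutoff in YYMMDD_HHMMSS (UTC).
  # Returns 0 if the name is a new-format CI schema strictly older than the cutoff.
  local name="$1" cutoff="$2"
  case "$name" in
    py_[0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9]_*) ;;
    *) return 1 ;;  # not a new-format CI schema: leave it alone
  esac
  # Characters 4-16 hold the 13-character timestamp.
  local stamp
  stamp=$(printf '%s' "$name" | cut -c4-16)
  # Fixed-width, zero-padded stamps sort chronologically as plain strings
  # (within the same century, since the year has two digits).
  local oldest
  oldest=$(printf '%s\n%s\n' "$stamp" "$cutoff" | LC_ALL=C sort | head -n 1)
  [ "$oldest" = "$stamp" ] && [ "$stamp" != "$cutoff" ]
}
```

With GNU date, a cutoff such as `date -u -d '7 days ago' +%y%m%d_%H%M%S` could be passed as the second argument; the 7-day retention is only an example.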

Notes

Second-level timestamp precision provides extra safety beyond the minute-level uniqueness already enforced by concurrency groups + cancel-in-progress: true.
Link to Devin run
Requested by: @haritamar

Summary by CodeRabbit

Chores
- Updated warehouse testing workflow to use an explicit multi-part schema naming scheme.
- New schema names include a UTC timestamp (YYMMDD_HHMMSS), a sanitized lowercased branch label (max 19 chars), and a short hash to reduce collisions.
- Runs are grouped and serialized by warehouse type, tooling version, and branch to avoid conflicts.
- Workflow now logs the generated schema identifier for visibility.
- All other workflow behavior remains unchanged.

…isions

Replace the old truncation-based schema naming with a hash-based approach that prevents cross-branch collisions when concurrent CI jobs share the same warehouse.

Uses py_ prefix to identify the Python package CI (matching dbt_ prefix in dbt-data-reliability).

Format: py_<YYMMDD>_<branch≤29>_<8-char-hash>
The hash is derived from the concurrency group key.

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

coderabbitai · 2026-02-27T19:16:50Z

Walkthrough

Replaces prior UNDERSCORED_REF_NAME logic in the CI workflow with an explicit SCHEMA_NAME composed of DATE_STAMP (YYMMDD_HHMMSS), SAFE_BRANCH (sanitized, lowercased, alphanumeric/underscore, length-capped), and SHORT_HASH (first 8 chars of SHA‑256 of CONCURRENCY_GROUP). Emits an echo and updates the --schema-name argument; introduces CONCURRENCY_GROUP for run serialization.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Workflow: schema naming `.github/workflows/test-warehouse.yml` | Reworks schema-name derivation: introduces `CONCURRENCY_GROUP` (warehouse type + dbt version + branch), computes `SHORT_HASH` as first 8 chars of SHA‑256(CONCURRENCY_GROUP), creates `SAFE_BRANCH` (lowercased, alphanumeric/underscore, max 19 chars), adds `DATE_STAMP` (YYMMDD_HHMMSS), and builds `SCHEMA_NAME` as `py_<DATE_STAMP>_<SAFE_BRANCH>_<SHORT_HASH>`. Replaces prior DBT_VERSION/UNDERSCORED_REF_NAME approach, updates `--schema-name` usage, echoes derived components, and clarifies concurrency-group comment about serialization by warehouse type, dbt version, and branch. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇
I dig a name with time and hash tonight,
A tidy burrow stamped in moonlit light,
Branches scrubbed and shortened, hashes neat,
Tests hop in order, no two burrows meet. 🥕

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: replacing the schema naming approach from truncation-based to hash-and-timestamp-based format to prevent CI collisions. |


coderabbitai

🧹 Nitpick comments (1)

.github/workflows/test-warehouse.yml (1)
120-131: Schema naming logic is well-designed with correct budget calculation.

The implementation correctly:

- Derives the hash from the same components as the concurrency group
- Uses `echo -n` to avoid hashing a trailing newline
- Applies lowercase before the character filter
- Respects the 63-character PostgreSQL limit with the calculated budget

One optional enhancement: multiple consecutive special characters in branch names become multiple underscores (e.g., feature/v2--hotfix → feature_v2__hotfix). If consistent formatting is desired, you could collapse them:
Optional: collapse consecutive underscores
-          SAFE_BRANCH=$(echo "${BRANCH_NAME}" | awk '{print tolower($0)}' | sed "s/[^a-z0-9]/_/g" | head -c 29)
+          SAFE_BRANCH=$(echo "${BRANCH_NAME}" | awk '{print tolower($0)}' | sed "s/[^a-z0-9]/_/g; s/__*/_/g" | head -c 29)
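
To see the effect of the extra `sed` expression in isolation, here is a small sketch using the branch name from the example above. The helper `sanitize_branch` and its `collapse` switch are hypothetical; the length cap is omitted to keep the comparison simple.

```shell
sanitize_branch() {
  # $1: raw branch name; $2: pass "collapse" to squeeze runs of underscores
  local expr='s/[^a-z0-9]/_/g'
  if [ "${2:-}" = "collapse" ]; then
    expr='s/[^a-z0-9]/_/g; s/__*/_/g'
  fi
  printf '%s\n' "$1" | awk '{print tolower($0)}' | sed "$expr"
}

sanitize_branch 'feature/v2--hotfix'            # prints: feature_v2__hotfix
sanitize_branch 'feature/v2--hotfix' collapse   # prints: feature_v2_hotfix
```
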
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/test-warehouse.yml around lines 120 - 131, The SAFE_BRANCH
transformation currently lowercases and replaces non-alphanumerics with
underscores but leaves consecutive underscores; update the SAFE_BRANCH pipeline
(the command that produces SAFE_BRANCH used by CONCURRENCY_GROUP, SHORT_HASH and
SCHEMA_NAME) to collapse repeated underscores into a single underscore (e.g., by
adding a step such as running the output through a squeeze/dedup of underscores)
and then apply head -c 29 as before so the final SCHEMA_NAME still fits the
63-char PostgreSQL budget.

📥 Commits

Reviewing files that changed from the base of the PR and between c65cd98 and 41c5055.

📒 Files selected for processing (1)

.github/workflows/test-warehouse.yml

coderabbitai

🧹 Nitpick comments (1)

.github/workflows/test-warehouse.yml (1)
120-134: Consider using explicit UTC for consistent timestamps across runners.

The schema naming logic is well-designed with accurate budget calculations. However, date +%y%m%d_%H%M uses the system timezone. While GitHub-hosted runners default to UTC, explicitly specifying UTC ensures consistency if runners are ever reconfigured or if self-hosted runners are used.
🔧 Suggested change for explicit UTC
-          DATE_STAMP=$(date +%y%m%d_%H%M)
+          DATE_STAMP=$(date -u +%y%m%d_%H%M)
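
A quick way to see why `-u` matters is to format the same instant under a non-UTC `TZ`. The sketch below assumes GNU `date` (for `-d @<epoch>`); the helper names are hypothetical, and `JST-9` is a POSIX TZ string meaning UTC+9. The epoch value corresponds to 2026-02-27T19:16:27Z.

```shell
stamp_local() {
  # $1: POSIX TZ string, $2: Unix epoch seconds; formats in the given zone
  TZ="$1" date -d "@$2" +%y%m%d_%H%M
}

stamp_utc() {
  # Same arguments, but -u forces UTC regardless of TZ
  TZ="$1" date -u -d "@$2" +%y%m%d_%H%M
}

stamp_local JST-9 1772219787   # prints: 260228_0416 (already the next day)
stamp_utc   JST-9 1772219787   # prints: 260227_1916
```
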
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/test-warehouse.yml around lines 120 - 134, The DATE_STAMP
generation uses the local system timezone; change it to explicit UTC so schema
timestamps are consistent across runners by updating how DATE_STAMP is created
(the DATE_STAMP variable used to build SCHEMA_NAME). Replace the current date
invocation with an explicit UTC invocation (e.g., use date -u or set TZ=UTC for
the command) so DATE_STAMP and consequently SCHEMA_NAME are always based on UTC.

📥 Commits

Reviewing files that changed from the base of the PR and between 3f540f6 and 415b35b.

📒 Files selected for processing (1)

.github/workflows/test-warehouse.yml

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.

devin-ai-integration Bot assigned haritamar Feb 27, 2026

coderabbitai Bot reviewed Feb 27, 2026

devin-ai-integration Bot and others added 2 commits February 27, 2026 19:20

style: collapse consecutive underscores in SAFE_BRANCH (CodeRabbit ni…

3f540f6

…tpick) Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

feat: add HHMM to schema timestamp for per-run uniqueness

415b35b

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

coderabbitai Bot reviewed Feb 27, 2026

style: use explicit UTC for timestamp (date -u)

4a3f43b

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

devin-ai-integration Bot mentioned this pull request Feb 27, 2026

fix: use dbt_<YYMMDD_HHMMSS>_<branch>_<hash> schema naming to prevent CI collisions elementary-data/dbt-data-reliability#940

Merged

4 tasks

devin-ai-integration Bot changed the title ~~fix: use py_<yymmdd>_<branch>_<hash> schema naming to prevent CI collisions~~ fix: use py_<YYMMDD_HHMM>_<branch>_<hash> schema naming to prevent CI collisions Feb 27, 2026

style: add seconds to timestamp (YYMMDD_HHMMSS) per maintainer request

649af0b

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

devin-ai-integration Bot changed the title ~~fix: use py_<YYMMDD_HHMM>_<branch>_<hash> schema naming to prevent CI collisions~~ fix: use py_<YYMMDD_HHMMSS>_<branch>_<hash> schema naming to prevent CI collisions Feb 27, 2026

haritamar approved these changes Feb 28, 2026

haritamar merged commit c0a9602 into master Feb 28, 2026
16 checks passed

haritamar deleted the devin/1772219716-fix-ci-schema-naming branch February 28, 2026 08:44

haritamar merged 5 commits into master from devin/1772219716-fix-ci-schema-naming
