
Commit c0a9602

fix: use py_<YYMMDD_HHMMSS>_<branch>_<hash> schema naming to prevent CI collisions (#2126)
* fix: use py_<yymmdd>_<branch>_<hash> schema naming to prevent CI collisions

  Replace the old truncation-based schema naming with a hash-based approach that prevents cross-branch collisions when concurrent CI jobs share the same warehouse.

  Uses py_ prefix to identify the Python package CI (matching dbt_ prefix in dbt-data-reliability).

  Format: py_<YYMMDD>_<branch≤29>_<8-char-hash>
  The hash is derived from the concurrency group key.

  Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* style: collapse consecutive underscores in SAFE_BRANCH (CodeRabbit nitpick)

  Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* feat: add HHMM to schema timestamp for per-run uniqueness

  Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* style: use explicit UTC for timestamp (date -u)

  Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* style: add seconds to timestamp (YYMMDD_HHMMSS) per maintainer request

  Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Itamar Hartstein <haritamar@gmail.com>
1 parent c65cd98 commit c0a9602

1 file changed

Lines changed: 18 additions & 4 deletions


.github/workflows/test-warehouse.yml

@@ -65,7 +65,8 @@ jobs:
       run:
         working-directory: elementary
     concurrency:
-      # This is what eventually defines the schema name in the data platform.
+      # Serialises runs for the same warehouse × dbt-version × branch.
+      # The schema name is derived from a hash of this group (see "Write dbt profiles").
       group: tests_${{ inputs.warehouse-type }}_dbt_${{ inputs.dbt-version }}_${{ github.head_ref || github.ref_name }}
       cancel-in-progress: true
     steps:
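
The new comment says the schema name is derived from a hash of the concurrency group. A minimal sketch of that property follows, assuming GNU coreutils `sha256sum` is available; `group_hash` is a hypothetical helper name and the example group keys are invented. It illustrates why two branches that would sanitise to the same text (`foo-bar` and `foo_bar`) should still get different hash suffixes.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): reduce a concurrency-group
# string to the first 8 hex characters of its SHA-256 digest.
# Assumes GNU coreutils sha256sum.
group_hash() {
  printf '%s' "$1" | sha256sum | head -c 8
}

# Made-up group keys that differ only in "-" versus "_" in the branch name.
a=$(group_hash "tests_postgres_dbt_1.8.0_feature/foo-bar")
b=$(group_hash "tests_postgres_dbt_1.8.0_feature/foo_bar")

echo "feature/foo-bar -> $a"
echo "feature/foo_bar -> $b"
```

An 8-character prefix carries 32 bits, so a collision between two group keys is possible in principle but very unlikely for the handful of branches active in CI at one time.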
@@ -116,13 +117,26 @@ jobs:
         env:
           CI_WAREHOUSE_SECRETS: ${{ secrets.CI_WAREHOUSE_SECRETS || '' }}
         run: |
-          DBT_VERSION=$(pip show dbt-core | grep -i version | awk '{print $2}' | sed 's/\.//g')
-          UNDERSCORED_REF_NAME=$(echo "${{ inputs.warehouse-type }}_dbt_${DBT_VERSION}_${BRANCH_NAME}" | awk '{print tolower($0)}' | head -c 40 | sed "s/[-\/]/_/g")
+          # Schema name = py_<YYMMDD_HHMMSS>_<branch≤19>_<8-char hash>
+          # The hash prevents collisions across concurrent jobs; the branch
+          # keeps it human-readable; the timestamp helps with stale schema
+          # cleanup and ensures each CI run gets a unique schema.
+          #
+          # Budget (PostgreSQL 63-char limit):
+          #   py_(3) + timestamp(13) + _(1) + branch(≤19) + _(1) + hash(8) = 45
+          #   + _elementary(11) + _gw7(4) = 60
+          CONCURRENCY_GROUP="tests_${{ inputs.warehouse-type }}_dbt_${{ inputs.dbt-version }}_${BRANCH_NAME}"
+          SHORT_HASH=$(echo -n "$CONCURRENCY_GROUP" | sha256sum | head -c 8)
+          SAFE_BRANCH=$(echo "${BRANCH_NAME}" | awk '{print tolower($0)}' | sed "s/[^a-z0-9]/_/g; s/__*/_/g" | head -c 19)
+          DATE_STAMP=$(date -u +%y%m%d_%H%M%S)
+          SCHEMA_NAME="py_${DATE_STAMP}_${SAFE_BRANCH}_${SHORT_HASH}"
+
+          echo "Schema name: $SCHEMA_NAME (branch='${BRANCH_NAME}', timestamp=${DATE_STAMP}, hash of concurrency group)"
 
           python "${{ github.workspace }}/elementary/tests/profiles/generate_profiles.py" \
             --template "${{ github.workspace }}/elementary/tests/profiles/profiles.yml.j2" \
             --output ~/.dbt/profiles.yml \
-            --schema-name "py_$UNDERSCORED_REF_NAME"
+            --schema-name "$SCHEMA_NAME"
 
       - name: Run Python package unit tests
         run: pytest -vv tests/unit --warehouse-type ${{ inputs.warehouse-type }}
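
To try the naming scheme outside the workflow, the added lines can be wrapped in a small function. The following is a sketch under stated assumptions rather than code from the repository: `make_schema_name` and the example inputs are hypothetical, and GNU coreutils (`sha256sum`, `head -c`) is assumed. It also checks the character budget from the comment in the diff by appending the `_elementary` and `_gw7` suffixes.

```shell
#!/usr/bin/env bash
# Sketch only: make_schema_name is a hypothetical helper, not part of the repository.
# Assumes GNU coreutils (sha256sum, head -c) plus awk and sed.
make_schema_name() {
  local warehouse_type="$1" dbt_version="$2" branch_name="$3"
  local group short_hash safe_branch date_stamp

  # Same key shape as the workflow's CONCURRENCY_GROUP variable.
  group="tests_${warehouse_type}_dbt_${dbt_version}_${branch_name}"
  # First 8 hex characters of the SHA-256 digest of the group key.
  short_hash=$(printf '%s' "$group" | sha256sum | head -c 8)
  # Lowercase, map anything outside [a-z0-9] to "_", collapse runs of "_",
  # then cap the result at 19 characters.
  safe_branch=$(echo "$branch_name" | awk '{print tolower($0)}' \
    | sed 's/[^a-z0-9]/_/g; s/__*/_/g' | head -c 19)
  # UTC timestamp, 13 characters: YYMMDD_HHMMSS.
  date_stamp=$(date -u +%y%m%d_%H%M%S)

  echo "py_${date_stamp}_${safe_branch}_${short_hash}"
}

# Example inputs are invented for illustration.
name=$(make_schema_name postgres 1.8.0 "feature/My-Branch--X")
# Suffixes taken from the budget comment in the diff.
full="${name}_elementary_gw7"
echo "$name (${#name} chars); with suffixes: ${#full} chars"
```

The example branch sanitises to exactly 19 characters, so it exercises the worst case of the budget: 45 characters for the base name and 60 with both suffixes, which stays under the 63-character PostgreSQL limit cited in the diff.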
