Commit 4566cc2

feat: add Vertica adapter support (#2141)
* feat: add Vertica adapter support

  - Add dbt-vertica as optional dependency in pyproject.toml
  - Add Vertica transient error patterns for automatic retry
  - Add Vertica Docker service to e2e test docker-compose
  - Add Vertica profile to test profiles template
  - Add Vertica to CI workflow matrix (test-warehouse + test-all-warehouses)
  - Handle dbt-vertica install with --no-deps (pins dbt-core~=1.8)

  Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* fix: prevent dbt-core downgrade when installing elementary with vertica

  The dbt-vertica package pins dbt-core~=1.8. Installing elementary with the [vertica] extra would re-resolve dbt-vertica's deps and downgrade dbt-core, breaking package-lock.yml parsing and protobuf compat.

  Fix: install elementary without the adapter extra for Vertica (dbt-vertica is already present via --no-deps), and explicitly install vertica-python.

  Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* fix: address CodeRabbit review feedback

  - Remove 'no transaction in progress' from Vertica transient patterns (it's a transaction-state notice, not a transient connection error)
  - Hardcode dbt-data-reliability-ref to 'vertica-compat' for the Vertica matrix entry until PR #963 merges into master

  Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* chore: update dbt-data-reliability ref to merged master commit

  PR #963 has been merged to master (2ab66fbe). Update:

  - packages.yml/package-lock.yml: point to merge commit on master
  - test-all-warehouses.yml: remove vertica-compat branch override

  Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

* fix: add vertica_seed_override to e2e project (fixes seed_rejects collision)

  dbt-vertica's COPY command hardcodes 'seed_rejects' as the reject table name for every seed, causing 'Object already exists' errors when dbt seed processes more than one file. This override (same as in dbt-data-reliability integration tests) uses per-seed reject table names.

  Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Itamar Hartstein <haritamar@gmail.com>
1 parent 13a4e0a commit 4566cc2

9 files changed: +105 -7 lines changed

.github/workflows/test-all-warehouses.yml

Lines changed: 1 addition & 0 deletions
@@ -102,6 +102,7 @@ jobs:
             spark,
             fabric,
             sqlserver,
+            vertica,
           ]
     uses: ./.github/workflows/test-warehouse.yml
     with:

.github/workflows/test-warehouse.yml

Lines changed: 28 additions & 2 deletions
@@ -20,6 +20,7 @@ on:
           - dremio
           - fabric
           - sqlserver
+          - vertica
       elementary-ref:
         type: string
         required: false
@@ -167,6 +168,12 @@ jobs:
         run: |
           docker compose up -d --wait sqlserver

+      - name: Start Vertica
+        if: inputs.warehouse-type == 'vertica'
+        working-directory: ${{ env.E2E_DBT_PROJECT_DIR }}
+        run: |
+          docker compose up -d --wait vertica
+
       - name: Setup Python
         uses: actions/setup-python@v5
         with:
@@ -185,15 +192,34 @@ jobs:
           sudo ACCEPT_EULA=Y apt-get install -y msodbcsql18 unixodbc-dev

       - name: Install dbt
+        if: inputs.warehouse-type != 'vertica'
         run: >
           pip install
           "dbt-core${{ inputs.dbt-version && format('=={0}', inputs.dbt-version) }}"
           "dbt-${{ (inputs.warehouse-type == 'databricks_catalog' && 'databricks') || (inputs.warehouse-type == 'athena' && 'athena-community') || (inputs.warehouse-type == 'dremio' && 'dremio') || inputs.warehouse-type }}${{ (inputs.warehouse-type == 'spark' && '[PyHive]') || '' }}${{ inputs.dbt-version && format('~={0}', inputs.dbt-version) }}"

+      # dbt-vertica pins dbt-core~=1.8 which lacks the 'arguments' attribute
+      # used by newer dbt-core. Install dbt-vertica without deps first, then
+      # install the latest compatible dbt-core separately. We also install
+      # vertica-python (dbt-vertica's runtime dep) explicitly.
+      - name: Install dbt (Vertica)
+        if: inputs.warehouse-type == 'vertica'
+        run: |
+          pip install --no-deps dbt-vertica
+          pip install vertica-python
+          pip install "dbt-core${{ inputs.dbt-version && format('=={0}', inputs.dbt-version) }}"
+
       - name: Install Elementary
         run: |
           pip install -r dev-requirements.txt
-          pip install ".[${{ (inputs.warehouse-type == 'databricks_catalog' && 'databricks') || inputs.warehouse-type }}]"
+          # For Vertica, dbt-vertica is already installed with --no-deps above;
+          # using ".[vertica]" would re-resolve dbt-vertica's deps and downgrade
+          # dbt-core to ~=1.8. Install elementary without the adapter extra.
+          if [ "${{ inputs.warehouse-type }}" = "vertica" ]; then
+            pip install "."
+          else
+            pip install ".[${{ (inputs.warehouse-type == 'databricks_catalog' && 'databricks') || inputs.warehouse-type }}]"
+          fi

       - name: Write dbt profiles
         env:
@@ -204,7 +230,7 @@ jobs:
           # This enables caching the seeded database state between runs.
           IS_DOCKER=false
           case "${{ inputs.warehouse-type }}" in
-            postgres|clickhouse|trino|dremio|duckdb|spark|sqlserver) IS_DOCKER=true ;;
+            postgres|clickhouse|trino|dremio|duckdb|spark|sqlserver|vertica) IS_DOCKER=true ;;
           esac

           if [ "$IS_DOCKER" = "true" ]; then
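
The install ordering in this workflow can be summarized with a small sketch. `install_commands` and `ADAPTER_PACKAGE` are hypothetical names used only for illustration; they are not part of the repository. The sketch mirrors the two install steps shown in the diff (it leaves out the `dev-requirements.txt` step) and assumes the name mappings are exactly those visible in the workflow expressions.

```python
from typing import Dict, List, Optional

# Warehouse types whose pip package name differs from the matrix name
# (taken from the workflow expression in the diff; hypothetical constant).
ADAPTER_PACKAGE: Dict[str, str] = {
    "databricks_catalog": "databricks",
    "athena": "athena-community",
}


def install_commands(
    warehouse_type: str, dbt_version: Optional[str] = None
) -> List[List[str]]:
    """Return the pip invocations, in order, for one warehouse type."""
    core = "dbt-core" + (f"=={dbt_version}" if dbt_version else "")

    if warehouse_type == "vertica":
        # dbt-vertica pins dbt-core~=1.8, so it is installed without deps,
        # and elementary is installed without the [vertica] extra.
        return [
            ["pip", "install", "--no-deps", "dbt-vertica"],
            ["pip", "install", "vertica-python"],
            ["pip", "install", core],
            ["pip", "install", "."],
        ]

    adapter = "dbt-" + ADAPTER_PACKAGE.get(warehouse_type, warehouse_type)
    if warehouse_type == "spark":
        adapter += "[PyHive]"
    if dbt_version:
        adapter += f"~={dbt_version}"
    extra = "databricks" if warehouse_type == "databricks_catalog" else warehouse_type
    return [
        ["pip", "install", core, adapter],
        ["pip", "install", f".[{extra}]"],
    ]


if __name__ == "__main__":
    for cmd in install_commands("vertica", "1.9.0"):
        print(" ".join(cmd))
```

The point of the Vertica branch is that no later `pip install` gets a chance to re-resolve dbt-vertica's declared dependencies, which is what would pull dbt-core back to the 1.8 series.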

elementary/clients/dbt/transient_errors.py

Lines changed: 5 additions & 0 deletions
@@ -120,6 +120,11 @@
     ),
     "fabric": _TSQL_TRANSIENT,
     "sqlserver": _TSQL_TRANSIENT,
+    "vertica": (
+        "connection timed out",
+        "could not connect to the server",
+        "ssl syscall error",
+    ),
 }

 # Pre-computed union of all adapter-specific patterns for the fallback path
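
How such patterns might be applied is not shown in this diff. The following is a minimal sketch that assumes case-insensitive substring matching against the dbt error output. `TRANSIENT_PATTERNS` and `looks_transient` are hypothetical names; only the three Vertica strings come from the change above.

```python
from typing import Dict, Tuple

# The Vertica patterns are copied from the diff; the dict name and the
# matching strategy below are assumptions made for illustration.
TRANSIENT_PATTERNS: Dict[str, Tuple[str, ...]] = {
    "vertica": (
        "connection timed out",
        "could not connect to the server",
        "ssl syscall error",
    ),
}


def looks_transient(adapter: str, error_output: str) -> bool:
    """Return True if the output contains a known transient pattern."""
    haystack = error_output.lower()
    return any(
        pattern in haystack for pattern in TRANSIENT_PATTERNS.get(adapter, ())
    )


if __name__ == "__main__":
    print(looks_transient("vertica", "ERROR: SSL SYSCALL error: EOF detected"))
    print(looks_transient("vertica", "NOTICE: no transaction in progress"))
```

Under this assumption, a "no transaction in progress" notice would not trigger a retry, which is consistent with the commit message's decision to drop that string from the Vertica patterns.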

elementary/monitor/dbt_project/package-lock.yml

Lines changed: 2 additions & 2 deletions
@@ -4,5 +4,5 @@ packages:
     version: 0.8.6
   - git: https://github.com/elementary-data/dbt-data-reliability.git
     name: elementary
-    revision: 534afc63c75d28b87d7cbd3b222dd3ea9a980f7b
-    sha1_hash: cb18b7df65415901187dcf469dcd377e56c0dc70
+    revision: 2ab66fbe7e347c3cbbf2910c91f03cd6db2ef517
+    sha1_hash: 7dc83ea83a781be623eea141eca2a0cceb4878e9

elementary/monitor/dbt_project/packages.yml

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@ packages:
   - package: dbt-labs/dbt_utils
     version: [">=0.8.0", "<0.9.0"]
   - git: https://github.com/elementary-data/dbt-data-reliability.git
-    revision: 534afc63c75d28b87d7cbd3b222dd3ea9a980f7b
+    revision: 2ab66fbe7e347c3cbbf2910c91f03cd6db2ef517

   # NOTE - for unreleased CLI versions we often need to update the package version to a commit hash (please leave this
   # commented, so it will be easy to access)

pyproject.toml

Lines changed: 3 additions & 1 deletion
@@ -58,6 +58,7 @@ dbt-duckdb = {version = ">=1.5.0,<2.0.0", optional = true}
 dbt-dremio = {version = ">=1.5.0,<2.0.0", optional = true}
 dbt-fabric = {version = ">=1.4,<2.0.0", optional = true}
 dbt-sqlserver = {version = ">=1.4,<2.0.0", optional = true}
+dbt-vertica = {version = ">=1.7,<2.0.0", optional = true}
 [tool.poetry.extras]
 snowflake = ["dbt-snowflake"]
 bigquery = ["dbt-bigquery"]
@@ -72,7 +73,8 @@ duckdb = ["dbt-duckdb"]
 dremio = ["dbt-dremio"]
 fabric = ["dbt-fabric"]
 sqlserver = ["dbt-sqlserver"]
-all = ["dbt-snowflake", "dbt-bigquery", "dbt-redshift", "dbt-postgres", "dbt-databricks", "dbt-spark", "dbt-clickhouse", "dbt-athena-community", "dbt-trino", "dbt-duckdb", "dbt-dremio", "dbt-fabric", "dbt-sqlserver"]
+vertica = ["dbt-vertica"]
+all = ["dbt-snowflake", "dbt-bigquery", "dbt-redshift", "dbt-postgres", "dbt-databricks", "dbt-spark", "dbt-clickhouse", "dbt-athena-community", "dbt-trino", "dbt-duckdb", "dbt-dremio", "dbt-fabric", "dbt-sqlserver", "dbt-vertica"]

 [build-system]
 requires = ["poetry-core>=1.0.0"]

tests/e2e_dbt_project/docker-compose.yml

Lines changed: 31 additions & 0 deletions
@@ -290,6 +290,36 @@ services:
       timeout: 5s
       retries: 10

+  # ── Vertica CE ────────────────────────────────────────────────────
+  vertica:
+    image: ghcr.io/ratiopbc/vertica-ce
+    container_name: vertica
+    ports:
+      - "127.0.0.1:5433:5433"
+    environment:
+      APP_DB_USER: dbadmin
+      APP_DB_PASSWORD: vertica
+      TZ: "UTC"
+      VERTICA_DB_NAME: elementary_tests
+      VMART_ETL_SCRIPT: ""
+    deploy:
+      mode: global
+    ulimits:
+      nofile:
+        soft: 65536
+        hard: 65536
+    volumes:
+      - vertica-data:/data
+    healthcheck:
+      test:
+        [
+          "CMD-SHELL",
+          "/opt/vertica/bin/vsql -U dbadmin -w vertica -c 'SELECT 1;'",
+        ]
+      interval: 5s
+      timeout: 5s
+      retries: 10
+
   # ── SQL Server (for Fabric / SQL Server adapters) ─────────────────
   sqlserver:
     image: mcr.microsoft.com/mssql/server:2022-latest
@@ -316,3 +346,4 @@ volumes:
   dremio-minio-data:
   spark-warehouse:
   spark-hive-metastore:
+  vertica-data:
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+{#- Override the dbt-vertica seed helper so that each seed file uses a
+    unique reject-table name. The upstream macro hardcodes
+    ``seed_rejects`` for every seed, which causes "Object already exists"
+    errors when ``dbt seed`` processes more than one file. -#}
+{% macro copy_local_load_csv_rows(model, agate_table) %}
+    {% set cols_sql = get_seed_column_quoted_csv(model, agate_table.column_names) %}
+
+    {#- Build a per-seed reject table name so concurrent seeds don't clash. -#}
+    {% set reject_table = model["alias"] ~ "_rejects" %}
+
+    {% set sql %}
+        copy {{ this.render() }}
+        ({{ cols_sql }})
+        from local '{{ agate_table.original_abspath }}'
+        delimiter ','
+        enclosed by '"'
+        skip 1
+        abort on error
+        rejected data as table {{ this.without_identifier() }}.{{ reject_table }};
+    {% endset %}
+
+    {{ return(sql) }}
+{% endmacro %}
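
The naming change in this macro can be illustrated outside of Jinja. The sketch below only models how reject-table names are derived. The function names and seed aliases are hypothetical, and it does not execute any Vertica `COPY` statement.

```python
from typing import List, Sequence


def upstream_reject_tables(seed_aliases: Sequence[str]) -> List[str]:
    """Model the upstream behaviour: one hardcoded name for every seed."""
    return ["seed_rejects" for _ in seed_aliases]


def override_reject_tables(seed_aliases: Sequence[str]) -> List[str]:
    """Model the override: alias ~ "_rejects", as in the macro above."""
    return [f"{alias}_rejects" for alias in seed_aliases]


def has_collision(table_names: Sequence[str]) -> bool:
    """True if two seeds would try to create the same reject table."""
    return len(set(table_names)) < len(table_names)


if __name__ == "__main__":
    seeds = ["customers", "orders"]  # hypothetical seed aliases
    print(has_collision(upstream_reject_tables(seeds)))
    print(has_collision(override_reject_tables(seeds)))
```

The override stays collision-free only as long as seed aliases are themselves unique within the target schema, which dbt normally requires anyway.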

tests/profiles/profiles.yml.j2

Lines changed: 11 additions & 1 deletion
@@ -73,6 +73,16 @@ elementary_tests:
       trust_cert: true
       threads: 4

+    vertica: &vertica
+      type: vertica
+      host: 127.0.0.1
+      port: 5433
+      username: dbadmin
+      password: vertica
+      database: elementary_tests
+      schema: {{ schema_name }}
+      threads: 4
+
     # ── Cloud targets (secrets substituted at CI time) ─────────────────

     fabric: &fabric
@@ -148,7 +158,7 @@ elementary_tests:
 elementary:
   target: postgres
   outputs:
-    {%- set targets = ['postgres', 'clickhouse', 'trino', 'dremio', 'duckdb', 'spark', 'fabric', 'sqlserver', 'snowflake', 'bigquery', 'redshift', 'databricks_catalog', 'athena'] %}
+    {%- set targets = ['postgres', 'clickhouse', 'trino', 'dremio', 'duckdb', 'spark', 'fabric', 'sqlserver', 'vertica', 'snowflake', 'bigquery', 'redshift', 'databricks_catalog', 'athena'] %}
     {%- for t in targets %}
     {{ t }}:
       <<: *{{ t }}
