Commit f7b9497

suryaiyer95 and claude authored
feat: add MSSQL/Fabric support to data-parity skill (#705)
* fix: use synchronous DuckDB constructor to avoid bun runtime timeout

Bun's runtime never fires native addon async callbacks, so the async `new duckdb.Database(path, opts, callback)` form would hit the 2-second timeout fallback on every connection attempt. Switch to the synchronous constructor form `new duckdb.Database(path)` / `new duckdb.Database(path, opts)` which throws on error and completes immediately in both Node and bun runtimes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* revert: restore async DuckDB constructor — sync change was bogus

The async callback form with 2s fallback was already working correctly at e3df5a4. The timeout was caused by a missing duckdb .node binary, not a bun incompatibility.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add MSSQL/Fabric dialect mapping and data-parity support

- Add `warehouseTypeToDialect()` mapping: sqlserver→tsql, mssql→tsql, fabric→fabric, postgresql→postgres, mariadb→mysql. Fixes critical serde mismatch where Rust engine rejects raw warehouse type names.
- Update both `resolveDialect()` functions to use the mapping
- Add MSSQL/Fabric cases to `dateTruncExpr()` — DATETRUNC(DAY, col)
- Add locale-safe date literal casting via CONVERT(DATE, ..., 23)
- Register `fabric` in DRIVER_MAP (reuses sqlserver TDS driver)
- Add `fabric` normalize aliases in normalize.ts
- Add 15 SQL Server driver unit tests (TOP injection, truncation, schema introspection, connection lifecycle, result format)
- Add 9 dialect mapping unit tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add Azure AD authentication to SQL Server driver (7 flows)

- Support all 7 Azure AD / Entra ID auth types in `sqlserver.ts`: `azure-active-directory-password`, `access-token`, `service-principal-secret`, `msi-vm`, `msi-app-service`, `azure-active-directory-default`, `token-credential`
- Force TLS encryption for all Azure AD connections
- Dynamic import of `@azure/identity` for `DefaultAzureCredential`
- Add normalize aliases for Azure AD config fields (`authentication`, `azure_tenant_id`, `azure_client_id`, `azure_client_secret`, `access_token`)
- Add `fabric: SQLSERVER_ALIASES` to DRIVER_ALIASES
- Add 10 Azure AD unit tests covering all auth flows, encryption, and `DefaultAzureCredential` with managed identity

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add MSSQL and Microsoft Fabric documentation to data-parity SKILL.md

- Add SQL Server / Fabric schema inspection query in Step 2
- Add "SQL Server and Microsoft Fabric" section with:
  - Supported configurations table (sqlserver, mssql, fabric)
  - Fabric connection guide with Azure AD auth types
  - Algorithm behavior notes (joindiff vs hashdiff selection)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: delegate Azure AD credential creation to tedious and remove underscore column filter

- **Azure AD auth**: Pass `azure-active-directory-*` types directly to tedious instead of constructing `DefaultAzureCredential` ourselves.
  Tedious imports `@azure/identity` internally and creates credentials — avoids bun CJS/ESM `isTokenCredential` boundary issue that caused "not an instance of the token credential class" errors.
- **Auth shorthands**: Map `CLI`, `default`, `password`, `service-principal`, `msi`, `managed-identity` to their full tedious type names.
- **Column filter**: Remove `_.startsWith("_")` filter from `execute()` result columns — it stripped legitimate aliases like `_p` used by partition discovery, causing partitioned diffs to return empty results.
- **Tests**: Remove `@azure/identity` mock (no longer imported by driver), update auth assertions, add shorthand mapping tests, fix column filter test.
- **Verified**: All 97 driver tests pass. Full data-diff pipeline tested against real MSSQL server (profile, joindiff, auto, where_clause, partitioned).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: upgrade `mssql` to v12 with `ConnectionPool` isolation and row flattening

- Upgrade `mssql` from v11 to v12 (`tedious` 18 → 19)
- Use explicit `ConnectionPool` instead of global `mssql.connect()` to isolate multiple simultaneous connections
- Flatten unnamed column arrays — `mssql` merges unnamed columns (e.g. `SELECT COUNT(*), SUM(...)`) into a single array under the empty-string key; restore positional column values
- Proper column name resolution: compare `namedKeys.length` against flattened row length, fall back to synthetic `col_0`, `col_1`, etc.
- Update test mock to export `ConnectionPool` class and `createMockPool`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve TypeScript spread-type errors in Azure AD conditional options

Use ternary expressions (`x ? {...} : {}`) instead of short-circuit (`x && {...}`) to avoid spreading a boolean value.
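The ternary pattern named in this fix can be sketched as follows. This is a minimal illustration under stated assumptions, not the driver's code: `AzureAuthConfig`, `AuthOptions`, `buildAuthOptions`, and all field names are hypothetical.

```typescript
// Hypothetical config shape, for illustration only (not taken from sqlserver.ts).
interface AzureAuthConfig {
  user: string;
  tenantId?: string;
  clientId?: string;
}

// Hypothetical options shape that the config is translated into.
interface AuthOptions {
  userName: string;
  tenantId?: string;
  clientId?: string;
}

// With the short-circuit form `...(cfg.tenantId && { tenantId: cfg.tenantId })`
// the spread operand is a falsy primitive whenever the field is unset, so its
// type is not always an object. The ternary form always yields an object.
function buildAuthOptions(cfg: AzureAuthConfig): AuthOptions {
  return {
    userName: cfg.user,
    ...(cfg.tenantId ? { tenantId: cfg.tenantId } : {}),
    ...(cfg.clientId ? { clientId: cfg.clientId } : {}),
  };
}
```

Unset fields are omitted from the result entirely rather than appearing as `undefined` keys, which keeps the options block minimal.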
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve cubic review findings on MSSQL/Fabric PR

- P1: restrict `flattenRow` to only spread the empty-string key (`""`) where mssql merges unnamed columns, preserving legitimate array values
- P2: escape single quotes in `partitionValue` for date-mode branches in `buildPartitionWhereClause` (categorical mode already escaped)
- P2: add `fabric` to `PASSWORD_DRIVERS` set in registry for consistent password validation alongside `sqlserver`/`mssql`
- P2: fallback to `"(no values)"` when `d.values` is nullish to prevent template literal coercing `undefined` to the string `"undefined"`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add fabric connection path and flattenRow coverage

- sqlserver-unit: 3 tests for unnamed column flattening — verifies only the empty-string key is spread, legitimate named arrays are preserved
- driver-normalize: fabric type uses SQLSERVER_ALIASES (server → host, trustServerCertificate → trust_server_certificate)
- connections: fabric type is recognized in DRIVER_MAP and listed correctly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: document minimum versions and make @azure/identity optional

- Add "Minimum Version Requirements" table to SKILL.md covering SQL Server 2022+, mssql v12, and @azure/identity v4 with rationale for each
- Document auth shorthands (CLI, default, password, service-principal, msi)
- Move @azure/identity from dependencies to optional peerDependencies so it is NOT installed by default — only required for Azure AD auth
- Add runtime check in sqlserver driver: if Azure AD auth type is requested but @azure/identity is missing, throw a clear install instruction error

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: acquire Azure AD tokens directly to bypass Bun browser-bundle resolution

- For `azure-active-directory-default` (CLI/default auth), acquire token ourselves instead of delegating to tedious's internal
  `@azure/identity`
- Strategy: try `DefaultAzureCredential` first, fall back to `az` CLI subprocess
- Bypasses Bun resolving `@azure/identity` to browser bundle where `DefaultAzureCredential` is a non-functional stub
- Also bypasses CJS/ESM `isTokenCredential` boundary mismatch
- All 31 driver unit tests pass, verified against real Fabric endpoint

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: auto-acquire Azure AD token for `azure-active-directory-access-token` when none supplied

The `azure-active-directory-access-token` branch passed `token: config.token ?? config.access_token` to tedious. When neither field was set on a connection (e.g. a `fabric-migration` entry that declared the auth type but no token), tedious threw:

    TypeError: The "config.authentication.options.token" property must be of type string

This blocked any Fabric/MSSQL config that relied on ambient credentials (Azure CLI / managed identity) but used the explicit `azure-active-directory-access-token` type instead of the `default` shorthand.

Refactor token acquisition (`DefaultAzureCredential` → `az` CLI fallback) into a shared `acquireAzureToken()` helper used by both the `default` path and the `access-token` path when no token was supplied. Callers that pass an explicit token are unchanged.

Also harden `mock.module("node:child_process", ...)` in `sqlserver-unit.test.ts` to spread the real module so sibling tests in the same `bun test` run keep access to `spawn` / `exec` / `fork`.

Tests: 110 pass, 0 fail in `packages/drivers`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: side-aware CTE injection for cross-warehouse `data_diff` SQL-query mode

When `source` and `target` are both SQL queries, `resolveTableSources` wraps them as `__diff_source` / `__diff_target` CTEs and the executor prepends the combined `WITH …` block to every engine-emitted task.
T-SQL and Fabric parse-bind every CTE body even when unreferenced, so a task routed to the source warehouse failed to resolve the target-only base table referenced inside the unused `__diff_target` CTE (and vice versa), producing `Invalid object name` errors from the wrong warehouse.

Return side-specific prefixes from `resolveTableSources` alongside the combined one, and have the executor loop in `runDataDiff` pick the source or target prefix per task when `source_warehouse !== target_warehouse`. Same-warehouse behaviour is unchanged.

Adds `data-diff-cte.test.ts` covering plain-name passthrough, both-query wrapping, side-specific CTE isolation, and CTE merging with engine-emitted `WITH` clauses (10 tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate `bun.lock` to match drivers `peerDependencies` layout

Commit 333a45c moved `@azure/identity` from `optionalDependencies` to `peerDependencies` with `optional: true` in `packages/drivers/package.json`, but the lockfile was not regenerated. That left CI under `--frozen-lockfile` broken and made fresh installs silently diverge from the committed state.

Running `bun install` brings the lockfile in sync: `@azure/identity` is recorded as an optional peer, and its transitive pins (`@azure/msal-browser`, `@azure/msal-common`, `@azure/msal-node`) re-resolve to the versions required by `tedious` and `snowflake-sdk`, matching the reachable runtime surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address all CRITICAL/MAJOR findings from multi-model review

Fixes five correctness, reliability, and portability issues surfaced by the consensus code review of this branch.

CRITICAL #1 — Cross-dialect partitioned diff (`data-diff.ts`):
`runPartitionedDiff` built one partition WHERE clause with `sourceDialect` and passed it as shared `where_clause` to the recursive `runDataDiff`, which applied it to both warehouses identically.
Cross-dialect partition mode (MSSQL → Postgres) failed because the target received T-SQL `DATETRUNC`/`CONVERT(DATE, …, 23)`. Now builds per-side WHERE using each warehouse's dialect and bakes it into dialect-quoted subquery SQL for source and target independently. The existing side-aware CTE injection handles the rest.

MAJOR #2 — Azure AD token caching and refresh (`sqlserver.ts`):
`acquireAzureToken` fetched a fresh token on every `connect()` and embedded it in the pool config with no refresh. Long-lived sessions silently failed when the ~1h token expired. Adds a module-scoped cache keyed by `(resource, client_id)` with proactive refresh 5 min before expiry, parsing `expiresOnTimestamp` from `@azure/identity` or the JWT `exp` claim from the `az` CLI fallback. Exposes `_resetTokenCacheForTests` for isolation.

MAJOR #3 — `joindiff` + cross-warehouse guard (`data-diff.ts`):
Explicit `algorithm: "joindiff"` combined with different warehouses produced broken SQL (one task referencing two CTE aliases with only one injected). Now returns an early error with a clear message steering users to `hashdiff` or `auto`. Cross-warehouse detection switched from warehouse-name string compare to dialect compare, matching the underlying SQL-divergence invariant.

MAJOR #4 — Dialect-aware identifier quoting in CTE wrapping (`data-diff.ts`):
`resolveTableSources` wrapped plain-table names with ANSI double-quotes for all dialects. T-SQL/Fabric require `QUOTED_IDENTIFIER ON` for this to work; default for `mssql`/tedious is ON, but user contexts (stored procs, legacy collations) can override. Now accepts source/target dialect parameters and delegates to `quoteIdentForDialect`, which was hoisted to module scope so it can be reused across partition and CTE paths.

MAJOR #5 — Configurable Azure resource URL (`sqlserver.ts`, `normalize.ts`):
Token acquisition hardcoded `https://database.windows.net/`, blocking Azure Government, Azure China, and sovereign-cloud customers.
Now honours an explicit `azure_resource_url` config field and otherwise infers the URL from the host suffix (`.usgovcloudapi.net`, `.chinacloudapi.cn`). Adds the usual camelCase/snake_case aliases in the SQL Server normalizer.

Also surfaces Azure auth error causes: if both `@azure/identity` and `az` CLI fail, the thrown error includes both hints (redacted) so users know why rather than seeing the generic "install @azure/identity or run az login" message.

Tests: adds `data-diff-cross-dialect.test.ts` covering the cross-dialect partition WHERE routing and the `joindiff` guard; extends `data-diff-cte.test.ts` with dialect-aware quoting assertions for tsql, fabric, and mysql; extends `sqlserver-unit.test.ts` with cache hit / expiry refresh / client-id keyed cache tests, commercial/gov/china/custom resource URL resolution, and the combined-error-hints surface. All 41 sqlserver driver tests, 24 data-diff orchestrator tests, and 214 normalize/connections tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address PR #705 bot review findings (coderabbitai + cubic + copilot)

Addresses the remaining issues raised by coderabbitai, cubic-dev-ai, and the Copilot PR reviewer on top of the multi-model consensus fix.

### CRITICAL

- **`@azure/identity` peer dep removed** (`drivers/package.json`)
  `mssql@12` → `tedious@19` bundles `@azure/identity ^4.2.1` as a regular dependency. Declaring it here as an optional peer was redundant and caused transitive-version-drift concerns. Users get the correct version automatically through the tedious chain; our lazy import handles the browser-bundle edge case itself.

### MAJOR

- **Cross-dialect date partition literal normalization** (`data-diff.ts`)
  `buildPartitionDiscoverySQL` on MSSQL returns a JS `Date` object, stringified upstream as `"Mon Jan 01 2024 …"`. `CONVERT(DATE, …, 23)` rejects that format.
  Normalize `partitionValue` to ISO `yyyy-mm-dd` before dialect casting so the T-SQL/Fabric path works end-to-end on dates discovered from MSSQL sources.
- **`crossWarehouse` uses resolved warehouse identity** (`data-diff.ts`)
  Previous commit gated on dialect compare, which treated two independent MSSQL instances as "same warehouse" and would have let `joindiff` route a JOIN through a warehouse that can't resolve the other side's base tables. Now resolves both sides' warehouse name (falling back to the default warehouse when a side is omitted) and compares identities — identity-based gating handles both the "undefined vs default" case (cubic) and the "same-dialect, different instance" case (Copilot).
- **Drop `mssql.connect()` fallback** (`sqlserver.ts`)
  `mssql@^12` guarantees `ConnectionPool` as a named export. The fallback silently re-introduced the global-shared-pool bug this branch was added to fix. Now throws a descriptive error if `ConnectionPool` is missing — cross-database pool interference cannot regress.
- **Non-string `config.authentication` guarded** (`sqlserver.ts`)
  Caller passing a pre-built `{ type, options }` block (or `null`) previously crashed with `TypeError: rawAuth.toLowerCase is not a function`. Now only applies the shorthand lookup when `rawAuth` is a string; other values pass through so tedious can handle them or reject them with its own error.
- **Unknown `azure-active-directory-*` subtype fails fast** (`sqlserver.ts`)
  Typos or future tedious subtypes previously dropped through all `else if` branches, producing a config with `encrypt: true` but no `authentication` block. tedious then surfaced an opaque error far from the root cause. Now throws with the offending subtype and the supported list.
- **`execSync` replaced with async `exec`** (`sqlserver.ts`)
  The `az account get-access-token` CLI fallback previously blocked the event loop for up to 15s. Switched to `util.promisify(exec)` so the connection path stays non-blocking.
- **Mixed named + unnamed column derivation preserves headers** (`sqlserver.ts`)
  Previously `SELECT name, COUNT(*), SUM(x)` produced either `["name", ""]` (blank header) or `["col_0", "col_1", "col_2"]` (lost `name`). Rewrote column/row derivation to iterate in one pass, preserving known named columns and synthesizing `col_N` only for expanded `""`-key positions.

### MINOR

- **`(no values)` fallback for empty `diff_row.values` array** (`tools/data-diff.ts`)
  `[].join(" | ") ?? "(no values)"` never fires because `""` is falsy-but-not-nullish. Gate on `d.values?.length` instead.

### Test / docs

- `sqlserver-unit.test.ts`: token-cache client-id test now counts actual `getToken` invocations (previous version only verified both got the same mocked token, which proved nothing about keying).
- `sqlserver-unit.test.ts`: "empty result" test now mirrors the real mssql shape (`recordset.columns` is a property *on* the recordset array, not a sibling key).
- `sqlserver-unit.test.ts`: added mixed-column regression tests — "name + COUNT + SUM" and "single unnamed column" — to lock in the derivation fix.
- `sqlserver-unit.test.ts`: stubbed async `exec` via `util.promisify.custom` so tests drive both the `execSync` legacy path and the new async path.
- `SKILL.md`: Fabric config fenced block now declares `yaml` (markdownlint MD040).

All tests: 43/43 sqlserver driver + 238/238 opencode test suite.

Attribution: findings identified by coderabbitai, cubic-dev-ai, and the Copilot PR reviewer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: drop stale `@azure/identity` peer-dep entries from `bun.lock`

Commit 38cfb0e removed `@azure/identity` from the drivers package's `peerDependencies` (tedious already bundles it), but the lockfile's `packages/drivers` workspace section still carried the corresponding `peerDependencies` and `optionalPeers` blocks. CI running `bun install --frozen-lockfile` would fail on the drift.
Minimal edit — just removes the two stale blocks. No resolution changes (`bun install --frozen-lockfile` passes with "no changes").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: CI — isolate `data-diff-cross-dialect` tests from other files

The prior integration-style test mocked the Registry module globally with `mock.module(".../registry", ...)`, which leaks across all test files in bun:test's single-process runner. That caused 14 unrelated tests in `connections.test.ts`, `telemetry-safety.test.ts`, and `dbt-first-execution.test.ts` to fail in CI.

Additionally, the test relied on `mock.module("@altimateai/altimate-core")` to supply a fake `DataParitySession`. The npm-published 0.2.6 of that package does not export `DataParitySession` (sessions are only in the locally-built `altimate-core-internal` binary), and Bun's `mock.module` cannot override a package that another test file has already imported — so the integration test was structurally unreliable.

Resolution:

1. **Export pure SQL-builder helpers** from `data-diff.ts` (`dateTruncExpr`, `buildPartitionWhereClause`) and unit-test them directly. No module mocking required; the test directly exercises the logic the CRITICAL/MAJOR fix changed.
2. **Move the `joindiff` + cross-warehouse guard earlier** in `runDataDiff` — before the NAPI import. Semantically identical for callers (guard still fires, same error message, `steps: 0`), but now it can be integration-tested without any NAPI mock. Preserves end-to-end wiring coverage for the guard.
3. **Rewrite `data-diff-cross-dialect.test.ts`** as pure-function unit tests for the partition WHERE logic + a real `runDataDiff` call for the joindiff guard. No more cross-file mock pollution.

Functionality unchanged:

- `runDataDiff` behavior for real callers is identical.
  The only observable difference is error-ordering: if a caller simultaneously omits NAPI and passes `joindiff + cross-warehouse`, they now get the "joindiff requires same warehouse" error instead of the NAPI-missing error. That's strictly better UX — NAPI availability is a deployment concern, `joindiff`+cross-warehouse is a user error.
- `buildPartitionWhereClause` and `dateTruncExpr` are now exported but semantically unchanged — same inputs, same outputs.

Test results:

- 2821 altimate tests pass, 0 fail
- 43 sqlserver driver tests pass, 0 fail
- The 19 remaining full-suite failures (`mcp/`, `tool/project-scan`, `plan-approval-phrase`) are pre-existing on `main` and unrelated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: follow-up PR bot review findings (cubic P1/P2 + coderabbit MAJOR/MINOR)

Addresses 5 substantive issues raised by the latest round of bot reviews.

### P1 / MAJOR

- **MySQL/MariaDB week-partition values no longer corrupted** (cubic P1, data-diff.ts:610) — the prior ISO `yyyy-mm-dd` normalization applied to every dialect silently rewrote MySQL `DATE_FORMAT(%Y-%u)` outputs like `"2024-42"` into invalid dates, producing WHERE clauses that never match. Scope the normalization to T-SQL / Fabric only — those use `CONVERT(DATE, …, 23)` which is the only code path that requires ISO. Postgres, MySQL, ClickHouse, BigQuery, Oracle all get the raw value verbatim, matching their own `DATE_TRUNC`/`toStartOf*` output.
- **Partitioned diff no longer drops extra_columns** (coderabbit MAJOR, data-diff.ts:824) — the partition fix wraps each side as a SELECT subquery before recursing. `discoverExtraColumns` skips SQL queries (only inspects plain table names), so the recursive `runDataDiff` fell through to key-only comparison, silently losing value-level diffs. Now `runPartitionedDiff` runs discovery ONCE on the plain source table up-front and passes the resolved `extra_columns` explicitly to each recursive call.
  Audit-column exclusion metadata is also propagated to the aggregated result for user reporting.

### P2 / MINOR

- **`azure_resource_url` trailing slash normalized** (cubic P2, sqlserver.ts:50) — an explicit `"https://custom-host"` (no slash) would produce an invalid OAuth scope `"https://custom-host.default"`. Enforce a trailing slash in `resolveAzureResourceUrl`.
- **`az account get-access-token` uses `execFile`** (coderabbit, sqlserver.ts:200) — replaces `exec(<shell command string>)` with `execFile("az", [args])` so user-supplied `azure_resource_url` can't introduce shell metacharacters into the command string. Also updates the test harness to stub both `exec` and `execFile`.

### Test isolation / coverage

- **Added same-dialect cross-warehouse joindiff test** (cubic, data-diff-cross-dialect.test.ts:97) — two MSSQL servers with different hosts must still be gated by the joindiff guard; previous tests only exercised mixed dialects.
- **Added MySQL week-partition regression tests** — prevent future revivals of the dialect-unaware ISO rewrite.
- **Added trailing-slash `azure_resource_url` test.**

Test results:

- 44/44 sqlserver driver tests pass
- 2824/2824 altimate tests pass, 0 fail
- Remaining full-suite failures (`mcp/`, `tool/project-scan`, `plan-approval-phrase`) are pre-existing on `main`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
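The resource-URL resolution and trailing-slash fix described in the message can be sketched roughly as below. This is an illustration under stated assumptions, not the code in `sqlserver.ts`: the signature of `resolveAzureResourceUrl`, the `toScope` helper, and the two sovereign-cloud resource URLs are assumptions (the message names only the host suffixes and the commercial default).

```typescript
// Commercial default, as named in the commit message.
const COMMERCIAL_RESOURCE = "https://database.windows.net/";

// ASSUMPTION: resource URLs per host suffix. The commit message names the
// suffixes; the URLs on the right are illustrative, not taken from the source.
const SUFFIX_RESOURCES: Array<[string, string]> = [
  [".usgovcloudapi.net", "https://database.usgovcloudapi.net/"],
  [".chinacloudapi.cn", "https://database.chinacloudapi.cn/"],
];

// Assumed signature: an explicit override wins, otherwise infer from the host.
function resolveAzureResourceUrl(host: string, explicit?: string): string {
  let url = explicit ? explicit.trim() : "";
  if (!url) {
    const h = host.toLowerCase();
    const match = SUFFIX_RESOURCES.find(([suffix]) => h.endsWith(suffix));
    url = match ? match[1] : COMMERCIAL_RESOURCE;
  }
  // Enforce the trailing slash so the derived scope is well-formed.
  return url.endsWith("/") ? url : `${url}/`;
}

// Hypothetical helper: derive the OAuth scope from the resource URL.
function toScope(resourceUrl: string): string {
  return `${resourceUrl}.default`;
}
```

Without the trailing-slash step, an explicit `https://custom-host` would yield the scope `https://custom-host.default`, which is the malformed value the fix guards against.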
1 parent 73fb5e8 commit f7b9497

14 files changed

Lines changed: 1974 additions & 93 deletions

.opencode/skills/data-parity/SKILL.md

Lines changed: 66 additions & 0 deletions
@@ -71,6 +71,19 @@ WHERE table_schema = 'mydb' AND table_name = 'orders'
 ORDER BY ordinal_position
 ```
 
+```sql
+-- SQL Server / Fabric
+SELECT c.name AS column_name, tp.name AS data_type, c.is_nullable,
+       dc.definition AS column_default
+FROM sys.columns c
+INNER JOIN sys.types tp ON c.user_type_id = tp.user_type_id
+INNER JOIN sys.objects o ON c.object_id = o.object_id
+INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
+LEFT JOIN sys.default_constraints dc ON c.default_object_id = dc.object_id
+WHERE s.name = 'dbo' AND o.name = 'orders'
+ORDER BY c.column_id
+```
+
 ```sql
 -- ClickHouse
 DESCRIBE TABLE source_db.events
@@ -409,3 +422,56 @@ Even when tables match perfectly, state what was checked:
 
 **Silently excluding auto-timestamp columns without asking the user**
 → Always present detected auto-timestamp columns (Step 4) and get explicit confirmation. In migration scenarios, `created_at` should be *identical* — excluding it silently hides real bugs.
+
+---
+
+## SQL Server and Microsoft Fabric
+
+### Minimum Version Requirements
+
+| Component | Minimum Version | Why |
+|---|---|---|
+| **SQL Server** | 2022 (16.x) | `DATETRUNC()` used for date partitioning; `LEAST()`/`GREATEST()` used by Rust engine |
+| **Azure SQL Database** | Any current version | Always has `DATETRUNC()` and `LEAST()` |
+| **Microsoft Fabric** | Any current version | T-SQL surface includes all required functions |
+| **mssql** (npm) | 12.0.0 | `ConnectionPool` isolation for concurrent connections, tedious 19 |
+| **@azure/identity** (npm) | 4.0.0 | Required only for Azure AD authentication; tedious imports it internally |
+
+> **Note:** Date partitioning (`partition_column` + `partition_granularity`) uses `DATETRUNC()` which is **not available on SQL Server 2019 or earlier**. Basic diff operations (joindiff, hashdiff, profile) work on older versions. If you need partitioned diffs on SQL Server < 2022, use numeric or categorical partitioning instead.
+
+### Supported Configurations
+
+| Warehouse Type | Authentication | Notes |
+|---|---|---|
+| `sqlserver` / `mssql` | User/password or Azure AD | On-prem or Azure SQL. SQL Server 2022+ required for date partitioning. |
+| `fabric` | Azure AD only | Microsoft Fabric SQL endpoint. Always uses TLS encryption. |
+
+### Connecting to Microsoft Fabric
+
+Fabric uses the same TDS protocol as SQL Server — no separate driver needed. Configuration:
+
+```yaml
+type: "fabric"
+host: "<workspace-id>-<item-id>.datawarehouse.fabric.microsoft.com"
+database: "<warehouse-name>"
+authentication: "azure-active-directory-default"  # recommended
+```
+
+Auth shorthands (mapped to full tedious type names):
+- `CLI` or `default` → `azure-active-directory-default`
+- `password` → `azure-active-directory-password`
+- `service-principal` → `azure-active-directory-service-principal-secret`
+- `msi` or `managed-identity` → `azure-active-directory-msi-vm`
+
+Full Azure AD authentication types:
+- `azure-active-directory-default` — auto-discovers credentials via `DefaultAzureCredential` (recommended; works with `az login`)
+- `azure-active-directory-password` — username/password with `azure_client_id` and `azure_tenant_id`
+- `azure-active-directory-access-token` — pre-obtained token (does **not** auto-refresh)
+- `azure-active-directory-service-principal-secret` — service principal with `azure_client_id`, `azure_client_secret`, `azure_tenant_id`
+- `azure-active-directory-msi-vm` / `azure-active-directory-msi-app-service` — managed identity
+
+### Algorithm Behavior
+
+- **Same-warehouse** MSSQL or Fabric → `joindiff` (single FULL OUTER JOIN, most efficient)
+- **Cross-warehouse** MSSQL/Fabric ↔ other database → `hashdiff` (automatic when using `auto`)
+- The Rust engine maps `sqlserver`/`mssql` to `tsql` dialect and `fabric` to `fabric` dialect — both generate valid T-SQL syntax with bracket quoting (`[schema].[table]`).
bun.lock

Lines changed: 18 additions & 6 deletions
Generated file; diff not rendered by default.

packages/drivers/package.json

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@
     "@google-cloud/bigquery": "^8.0.0",
     "@databricks/sql": "^1.0.0",
     "mysql2": "^3.0.0",
-    "mssql": "^11.0.0",
+    "mssql": "^12.0.0",
     "oracledb": "^6.0.0",
     "duckdb": "^1.0.0",
     "mongodb": "^6.0.0",

packages/drivers/src/normalize.ts

Lines changed: 7 additions & 0 deletions
@@ -65,6 +65,12 @@ const SQLSERVER_ALIASES: AliasMap = {
   ...COMMON_ALIASES,
   host: ["server", "serverName", "server_name"],
   trust_server_certificate: ["trustServerCertificate"],
+  authentication: ["authenticationType", "auth_type", "authentication_type"],
+  azure_tenant_id: ["tenantId", "tenant_id", "azureTenantId"],
+  azure_client_id: ["clientId", "client_id", "azureClientId"],
+  azure_client_secret: ["clientSecret", "client_secret", "azureClientSecret"],
+  access_token: ["token", "accessToken"],
+  azure_resource_url: ["azureResourceUrl", "resourceUrl", "resource_url"],
 }
 
 const ORACLE_ALIASES: AliasMap = {
@@ -104,6 +110,7 @@ const DRIVER_ALIASES: Record<string, AliasMap> = {
   mariadb: MYSQL_ALIASES,
   sqlserver: SQLSERVER_ALIASES,
   mssql: SQLSERVER_ALIASES,
+  fabric: SQLSERVER_ALIASES,
   oracle: ORACLE_ALIASES,
   mongodb: MONGODB_ALIASES,
   mongo: MONGODB_ALIASES,
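
For readers unfamiliar with the alias tables, the following sketch shows how a map of canonical keys to aliases might be applied to a user config. It is an assumption-laden illustration: `applyAliases` is a hypothetical helper, the precedence rule (canonical key wins over an alias) is a guess, and the table is a subset of `SQLSERVER_ALIASES` with `COMMON_ALIASES` omitted.

```typescript
type AliasMap = Record<string, string[]>;

// Subset of the alias table from the diff above (COMMON_ALIASES omitted).
const SQLSERVER_ALIASES: AliasMap = {
  host: ["server", "serverName", "server_name"],
  trust_server_certificate: ["trustServerCertificate"],
  authentication: ["authenticationType", "auth_type", "authentication_type"],
  azure_tenant_id: ["tenantId", "tenant_id", "azureTenantId"],
  access_token: ["token", "accessToken"],
};

// Hypothetical helper: rewrite alias keys to their canonical names.
// ASSUMPTION: a canonical key that is already set is never overwritten.
function applyAliases(
  config: Record<string, unknown>,
  aliases: AliasMap,
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...config };
  for (const [canonical, names] of Object.entries(aliases)) {
    for (const name of names) {
      if (!(name in out)) continue;
      if (out[canonical] === undefined) out[canonical] = out[name];
      delete out[name]; // drop the alias key either way
    }
  }
  return out;
}

// Example: a Fabric config written with camelCase / driver-style keys.
const normalized = applyAliases(
  { type: "fabric", server: "example-host", tenantId: "t-123", trustServerCertificate: true },
  SQLSERVER_ALIASES,
);
```

Because `fabric` reuses `SQLSERVER_ALIASES` in `DRIVER_ALIASES`, a Fabric config accepts the same spellings as a `sqlserver` or `mssql` one.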
