Skip to content

Commit 5413ff9

Browse files
authored
[SEA-NodeJS] Thrift-parity fixes: SQL error class + INTERVAL type code (#421)
* [SEA-NodeJS] map kernel SqlError to OperationStateError for Thrift parity Server-reported SQL execution failures (kernel `SqlError` — bad query, missing table, divide-by-zero, invalid cast, param type mismatch) were surfaced as the base `HiveDriverError` on the SEA/kernel path, while the Thrift backend raises `OperationStateError(Error)` when the operation reaches ERROR_STATE. The comparator flagged every error path as a class mismatch (Thrift `OperationStateError` vs SEA `HiveDriverError`), and the Python kernel connector matches Thrift here — so the divergence was in this mapping, not the kernel. Map `SqlError` -> `OperationStateError(OperationStateErrorCode.Error)`, preserving the kernel message. `OperationStateError extends HiveDriverError`, so existing `instanceof HiveDriverError` catches are unaffected. Other code mappings (InvalidArgument -> ParameterError, auth, network, etc.) are unchanged. Verified against the comparator warehouse: all ERROR_PATHS cases (table-not-found, syntax error, unresolved column, divide-by-zero, invalid cast, param mismatch) now match the Thrift backend. Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com> * [SEA-NodeJS] report INTERVAL columns as STRING_TYPE (Thrift / Python kernel parity) The SEA Arrow→Thrift type synthesis surfaced interval columns with the true INTERVAL_YEAR_MONTH / INTERVAL_DAY_TIME type ids, while the Thrift backend and the Python kernel connector both report interval columns with a STRING type code. The comparator flagged every interval column as a type-code mismatch. Map INTERVAL (via databricks.type_name, the rewritten-duration Int64 path, and the native Arrow interval fallback) to STRING_TYPE. The cell value is already rendered to the canonical interval string ("2-6" / "3 12:30:15.000000000") by ArrowResultConverter, which keys off the Arrow value type — not this synthesized TTypeId — so value formatting is unchanged. Verified against the comparator warehouse: STATEMENT_SELECT / EXTREME_VALUES interval columns now match the Thrift backend (type 7 + identical string value). Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com> --------- Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
1 parent 21df925 commit 5413ff9

3 files changed

Lines changed: 36 additions & 11 deletions

File tree

lib/sea/SeaArrowIpc.ts

Lines changed: 16 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -162,6 +162,12 @@ function arrowTypeToTTypeId(field: Field<DataType>): TTypeId {
162162
return TTypeId.TIMESTAMP_TYPE;
163163
case 'DECIMAL':
164164
return TTypeId.DECIMAL_TYPE;
165+
// INTERVAL — surface as STRING_TYPE to match the Thrift backend and the
166+
// Python kernel connector, both of which report interval columns with a
167+
// string type code. The cell value is already rendered to the canonical
168+
// interval string (e.g. "2-6" / "3 12:30:15.000000000") by
169+
// ArrowResultConverter, which keys off the Arrow value type (not this
170+
// synthesized TTypeId), so value formatting is unaffected.
165171
case 'INTERVAL':
166172
case 'INTERVAL DAY':
167173
case 'INTERVAL DAY TO HOUR':
@@ -173,11 +179,10 @@ function arrowTypeToTTypeId(field: Field<DataType>): TTypeId {
173179
case 'INTERVAL MINUTE':
174180
case 'INTERVAL MINUTE TO SECOND':
175181
case 'INTERVAL SECOND':
176-
return TTypeId.INTERVAL_DAY_TIME_TYPE;
177182
case 'INTERVAL YEAR':
178183
case 'INTERVAL YEAR TO MONTH':
179184
case 'INTERVAL MONTH':
180-
return TTypeId.INTERVAL_YEAR_MONTH_TYPE;
185+
return TTypeId.STRING_TYPE;
181186
case 'ARRAY':
182187
return TTypeId.ARRAY_TYPE;
183188
case 'MAP':
@@ -198,10 +203,12 @@ function arrowTypeToTTypeId(field: Field<DataType>): TTypeId {
198203
if (DataType.isInt(arrowType)) {
199204
// Duration columns are rewritten to Int64 with a
200205
// `databricks.arrow.duration_unit` metadata marker (see
201-
// `SeaArrowIpcDurationFix.ts`). Surface them as INTERVAL_DAY_TIME
202-
// so the converter formats them back into the thrift string form.
206+
// `SeaArrowIpcDurationFix.ts`). Surface them as STRING_TYPE (matching the
207+
// Thrift backend and Python kernel) — the converter still formats the
208+
// value into the thrift INTERVAL DAY-TIME string via the duration_unit
209+
// metadata, independent of this type code.
203210
if (arrowType.bitWidth === 64 && field.metadata.has(DURATION_UNIT_METADATA_KEY)) {
204-
return TTypeId.INTERVAL_DAY_TIME_TYPE;
211+
return TTypeId.STRING_TYPE;
205212
}
206213
switch (arrowType.bitWidth) {
207214
case 8:
@@ -233,8 +240,10 @@ function arrowTypeToTTypeId(field: Field<DataType>): TTypeId {
233240
// pairs which the converter formats to thrift's `"Y-M"` / day-time
234241
// strings.
235242
if (DataType.isInterval(arrowType)) {
236-
// unit 0 = YEAR_MONTH, unit 1 = DAY_TIME, unit 2 = MONTH_DAY_NANO
237-
return arrowType.unit === 0 ? TTypeId.INTERVAL_YEAR_MONTH_TYPE : TTypeId.INTERVAL_DAY_TIME_TYPE;
243+
// Surface native Arrow interval types as STRING_TYPE too (Thrift / Python
244+
// kernel parity). The converter formats the value to the thrift "Y-M" /
245+
// day-time string from the Arrow value, independent of this type code.
246+
return TTypeId.STRING_TYPE;
238247
}
239248
if (DataType.isList(arrowType)) return TTypeId.ARRAY_TYPE;
240249
if (DataType.isMap(arrowType)) return TTypeId.MAP_TYPE;

lib/sea/SeaErrorMapping.ts

Lines changed: 14 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -147,6 +147,20 @@ export function mapKernelErrorToJsError(kErr: KernelErrorShape): ErrorWithSqlSta
147147
error = new ParameterError(message);
148148
break;
149149

150+
case 'SqlError': {
151+
// A server-reported SQL execution failure (kernel `SqlError`, e.g. a
152+
// bad query, missing table, divide-by-zero, invalid cast). The Thrift
153+
// backend surfaces the same situation as `OperationStateError(Error)`
154+
// when the operation reaches ERROR_STATE (see DBSQLOperation), so map
155+
// SqlError to the same class for backend parity. OperationStateError
156+
// extends HiveDriverError, so existing `instanceof HiveDriverError`
157+
// catches are unaffected.
158+
const stateError = new OperationStateError(OperationStateErrorCode.Error);
159+
stateError.message = message;
160+
error = stateError;
161+
break;
162+
}
163+
150164
// All remaining kernel ErrorCode variants map to the base driver error class.
151165
// M0 intentionally does not introduce new error classes; M1 may add nuance.
152166
case 'NotFound':
@@ -156,7 +170,6 @@ export function mapKernelErrorToJsError(kErr: KernelErrorShape): ErrorWithSqlSta
156170
case 'Internal':
157171
case 'InvalidStatementHandle':
158172
case 'NetworkError':
159-
case 'SqlError':
160173
error = new HiveDriverError(message);
161174
break;
162175

tests/unit/sea/SeaIntervalParity.test.ts

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -403,12 +403,15 @@ describe('SeaOperationBackend — INTERVAL parity with thrift', () => {
403403
const backend = new SeaOperationBackend({ statement: stub, context: new ClientContextStub() });
404404

405405
// Round-trip the metadata to confirm we synthesise the right TTypeId.
406+
// Interval columns are surfaced as STRING_TYPE — matching the Thrift
407+
// backend and the Python kernel connector, both of which report interval
408+
// columns with a string type code. The value is still rendered to the
409+
// canonical interval string (asserted below), which is what makes this
410+
// "interval parity with thrift".
406411
const metadata = await backend.getResultMetadata();
407412
expect(metadata.schema?.columns?.[0]?.typeDesc.types?.[0]?.primitiveEntry?.type).to.equal(
408-
// INTERVAL_DAY_TIME_TYPE = 30 in TCLIService_types
409-
// We assert by importing the enum below to avoid magic numbers.
410413
// eslint-disable-next-line global-require, @typescript-eslint/no-var-requires
411-
require('../../../thrift/TCLIService_types').TTypeId.INTERVAL_DAY_TIME_TYPE,
414+
require('../../../thrift/TCLIService_types').TTypeId.STRING_TYPE,
412415
);
413416

414417
const rows = await backend.fetchChunk({ limit: 100 });

0 commit comments

Comments
 (0)