Skip to content

Commit e81bf9b

Browse files
committed
[SEA-NodeJS] Accept-and-ignore unsupported per-statement options; drive sync execute to terminal
Two correctness fixes to the SEA (kernel) backend, aligning it with the Thrift backend and the Python connector's kernel backend. 1) Accept-and-ignore unsupported per-statement options (was: hard failure). `SeaSessionBackend.executeStatement` threw on options the kernel does not expose at per-statement granularity, breaking callers that set them globally. `useCloudFetch`, `useLZ4Compression`, and `stagingAllowedLocalPath` are now accepted and ignored (no-op), matching `KernelDatabricksClient.execute_command` which takes the same flags and never reads them (CloudFetch is the session-level `ResultConfig`, compression is decoded from the result manifest, and the kernel has no Volume API yet). `queryTimeout` is also a no-op (+TODO): it must not be mapped onto the SEA `wait_timeout` wire field (valid range {0} ∪ [5,50]s — a different concept from a statement timeout; out-of-range values fail HTTP 400). The correct mechanism is the `STATEMENT_TIMEOUT` session config. 2) Drive the default (sync) executeStatement to terminal before returning. Previously the sync path executed lazily (only at first fetch), so a side-effecting statement run via `executeStatement(...)` then `close()` with no fetch was silently dropped, and dependent statements raced. The sync path now awaits the statement to terminal in `executeStatement` (matching JDBC, ADBC C#, and Python `use_kernel`), so by the time it resolves the side effect is committed and the result is materialised — fire-and-forget DDL/DML and dependent statements just work, and `close()` is a trivial non-blocking cleanup. Mid-run cancellation of a long-running query is done via `runAsync: true` (submit + poll → cancellable handle), mirroring JDBC's sync-vs-async split. This needs no change to `SeaOperationBackend`. Verified: 249 SEA unit tests pass; e2e on a SQL warehouse confirms fire-and-forget + dependent CREATE/INSERT/UPDATE/DELETE persist on SEA, mid-run cancel via runAsync works, and the common execute/fetch/close path is at parity with Thrift (latency front-loads from fetch into execute; total unchanged). Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
1 parent 4b9e16e commit e81bf9b

2 files changed

Lines changed: 111 additions & 105 deletions

File tree

lib/sea/SeaSessionBackend.ts

Lines changed: 54 additions & 64 deletions
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,6 @@ import ParameterError from '../errors/ParameterError';
3535
import { LogLevel } from '../contracts/IDBSQLLogger';
3636
import { SeaConnection, SeaNativeExecuteOptions, SeaStatement } from './SeaNativeLoader';
3737
import { decodeNapiKernelError } from './SeaErrorMapping';
38-
import { numberToInt64 } from '../thrift-backend/ThriftSessionBackend';
3938
import SeaOperationBackend from './SeaOperationBackend';
4039
import { buildSeaPositionalParams, buildSeaNamedParams } from './SeaPositionalParams';
4140
import { seaServerInfoValue } from './SeaServerInfo';
@@ -122,38 +121,41 @@ export default class SeaSessionBackend implements ISessionBackend {
122121
* Per-statement options forwarded to the kernel `ExecuteOptions`:
123122
* - `ordinalParameters` / `namedParameters` → bound params (mutually
124123
* exclusive — the kernel binds one placeholder style per statement);
125-
* - `queryTimeout` → enforced client-side by the operation backend's poll
126-
* deadline (the kernel ignores `queryTimeoutSecs` on the async submit
127-
* path), NOT forwarded to the napi options;
128124
* - `rowLimit` → `rowLimit` (SEA-only server-side row cap);
129125
* - `queryTags` → serialised into the conf overlay's reserved
130126
* `query_tags` key (the same wire shape Thrift's `serializeQueryTags`
131127
* produces), merged with any explicit `statementConf`.
132128
*
133-
* Still rejected (genuinely unsupported on SEA, rather than silently
134-
* dropped): `useCloudFetch` (governed by the kernel `ResultConfig`, not a
135-
* per-statement knob), `useLZ4Compression` (kernel owns result compression),
136-
* and `stagingAllowedLocalPath` (volume operations). `maxRows` is applied by
137-
* the facade at fetch time, so it is intentionally not handled here.
129+
* Accepted but IGNORED (no-op — the kernel exposes no per-statement knob, so
130+
* we drop rather than reject; see the body for details and TODOs):
131+
* `useCloudFetch`, `useLZ4Compression`, `stagingAllowedLocalPath`, and
132+
* `queryTimeout`. `maxRows` is applied by the facade at fetch time, so it is
133+
* intentionally not handled here.
138134
*/
139135
public async executeStatement(statement: string, options: ExecuteStatementOptions): Promise<IOperationBackend> {
140136
this.failIfClosed();
141137

142-
if (options.useCloudFetch !== undefined) {
143-
throw new HiveDriverError(
144-
'SEA executeStatement: useCloudFetch is controlled by the kernel result configuration and is not a per-statement option on SEA',
145-
);
146-
}
147-
if (options.useLZ4Compression !== undefined) {
148-
throw new HiveDriverError(
149-
'SEA executeStatement: useLZ4Compression is not supported on SEA (result compression is governed by the kernel)',
150-
);
151-
}
152-
if (options.stagingAllowedLocalPath !== undefined) {
153-
throw new HiveDriverError(
154-
'SEA executeStatement: stagingAllowedLocalPath (volume operations) is not supported on SEA',
155-
);
156-
}
138+
// `useCloudFetch`, `useLZ4Compression`, and `stagingAllowedLocalPath` are
139+
// accepted and IGNORED (no-op) on the kernel-backed SEA path rather than
140+
// rejected — the kernel exposes no per-statement knob for any of them, so a
141+
// hard failure would break callers that set these options globally. This
142+
// mirrors the Python connector's kernel backend
143+
// (`KernelDatabricksClient.execute_command`), which takes the same flags and
144+
// never reads them.
145+
//
146+
// - `useCloudFetch`: result transport is governed by the session-level
147+
// kernel `ResultConfig.cloudfetch_enabled` (default: CloudFetch on);
148+
// there is no per-statement override on the napi surface.
149+
// - `useLZ4Compression`: the kernel transparently decodes whatever
150+
// compression the server returns (`manifest.result_compression`) and
151+
// exposes no compression-request knob.
152+
// - `stagingAllowedLocalPath`: the kernel has no Volume (PUT/GET/REMOVE)
153+
// API yet, so `SeaOperationBackend` always reports
154+
// `isStagingOperation: false` and `DBSQLSession` treats such statements
155+
// as ordinary queries. Non-staging queries that set the option run
156+
// normally (parity with Thrift).
157+
// TODO(SEA): wire real volume operations once the kernel exposes a
158+
// Volume API + napi `is_volume_operation`.
157159

158160
// `runAsync` selects the kernel execution path. NOTE: this is a SEA/kernel-
159161
// specific use of the option — the Thrift backend hardcodes `runAsync: true`
@@ -166,26 +168,27 @@ export default class SeaSessionBackend implements ISessionBackend {
166168
// - DEFAULT (`runAsync` false/undefined) — SYNC. Route through
167169
// `executeStatementCancellable`: the kernel blocks on `execute()`
168170
// (server-side direct-results / poll-to-terminal), which is faster and,
169-
// with the napi sync canceller, fully cancellable mid-COMPUTE. The
170-
// blocking drive runs in the operation backend's `result()` (inside
171-
// `waitUntilReady`, which the facade invokes lazily at first fetch).
172-
// `queryTimeoutSecs` IS honoured on this path (forwarded to the napi
173-
// options below) since the kernel `execute()` consults it.
171+
// with the napi sync canceller, fully cancellable mid-COMPUTE.
174172
//
175173
// - `runAsync: true` — ASYNC. Submit (`wait_timeout=0s`): the server
176174
// returns a pending `AsyncStatement` immediately while the query runs;
177175
// the backend polls `status()` to terminal in `waitUntilReady()` and
178-
// materialises results via `awaitResult()`. `queryTimeoutSecs` is
179-
// ignored by the kernel on submit, so it is enforced client-side by the
180-
// operation backend's poll-loop deadline instead.
176+
// materialises results via `awaitResult()`.
177+
//
178+
// TODO(SEA): `queryTimeout` is intentionally a NO-OP here. It must NOT be
179+
// mapped to the SEA `wait_timeout` wire field: `wait_timeout` is the
180+
// inline-result wait knob (valid range {0} ∪ [5,50]s, paired with
181+
// `on_wait_timeout`), a different concept from a server statement-execution
182+
// timeout, and out-of-range values fail with HTTP 400. The correct SEA
183+
// mechanism is the `STATEMENT_TIMEOUT` session configuration (seconds); the
184+
// Python connector forwards no per-statement timeout at all. Wiring this
185+
// properly (STATEMENT_TIMEOUT and/or a client-side poll deadline) is
186+
// deferred — until then the option is accepted and ignored.
181187
const runAsync = options.runAsync ?? false;
182-
const queryTimeoutSecs =
183-
options.queryTimeout !== undefined ? numberToInt64(options.queryTimeout).toNumber() : undefined;
188+
189+
const execOptions = this.buildExecuteOptions(options);
184190

185191
if (!runAsync) {
186-
// Sync path: forward `queryTimeoutSecs` to the napi options — the kernel
187-
// `execute()` honours it (server statement timeout).
188-
const execOptions = this.buildExecuteOptions(options, queryTimeoutSecs);
189192
let cancellableExecution;
190193
try {
191194
cancellableExecution =
@@ -195,19 +198,22 @@ export default class SeaSessionBackend implements ISessionBackend {
195198
} catch (err) {
196199
throw this.logAndMapError('executeStatement', err);
197200
}
198-
return new SeaOperationBackend({
201+
const op = new SeaOperationBackend({
199202
cancellableExecution: cancellableExecution!,
200203
context: this.context,
201-
// The kernel honours `queryTimeoutSecs` on the sync `execute` path, so
202-
// it is forwarded via the napi options (see `buildExecuteOptions`); the
203-
// backend also keeps it as a deadline guard for parity with async.
204-
queryTimeoutSecs,
205204
});
205+
// Option A (matches JDBC / ADBC C# / Python use_kernel): the default sync
206+
// path drives the statement to terminal HERE, so by the time
207+
// executeStatement resolves the side effect is committed and the result is
208+
// materialised. This makes dependent statements + fire-and-forget DDL/DML
209+
// "just work" and keeps `close()` a trivial non-blocking cleanup. Mid-run
210+
// cancellation of a long query is done via `runAsync: true` (below).
211+
// Errors are swallowed here and re-surfaced at first fetch/finished to
212+
// preserve the existing error-timing contract.
213+
await op.waitUntilReady().catch(() => undefined);
214+
return op;
206215
}
207216

208-
// Async path: do NOT forward `queryTimeoutSecs` (the kernel ignores it on
209-
// submit — `wait_timeout=0s`); it is enforced client-side by the poll loop.
210-
const execOptions = this.buildExecuteOptions(options);
211217
let asyncStatement;
212218
try {
213219
asyncStatement =
@@ -217,16 +223,9 @@ export default class SeaSessionBackend implements ISessionBackend {
217223
} catch (err) {
218224
throw this.logAndMapError('executeStatement', err);
219225
}
220-
// `queryTimeout` is enforced client-side by the operation backend's poll
221-
// loop: the kernel ignores `queryTimeoutSecs` on the async submit path
222-
// (`submitStatement` always sends `wait_timeout=0s`), so we do NOT forward
223-
// it to the napi options — passing it there would be a silent no-op.
224226
return new SeaOperationBackend({
225227
asyncStatement: asyncStatement!,
226228
context: this.context,
227-
// `queryTimeout` is typed `number | bigint | Int64`; `numberToInt64(...).toNumber()`
228-
// coerces all three (a bare `Number(int64)` yields NaN — node-int64 has no valueOf).
229-
queryTimeoutSecs,
230229
});
231230
}
232231

@@ -235,10 +234,7 @@ export default class SeaSessionBackend implements ISessionBackend {
235234
* `ExecuteOptions`, returning `undefined` when nothing is set so the
236235
* no-options call shape (`executeStatement(sql)`) is preserved.
237236
*/
238-
private buildExecuteOptions(
239-
options: ExecuteStatementOptions,
240-
queryTimeoutSecs?: number,
241-
): SeaNativeExecuteOptions | undefined {
237+
private buildExecuteOptions(options: ExecuteStatementOptions): SeaNativeExecuteOptions | undefined {
242238
// Positional (`?`) and named (`:name`) parameters are mutually exclusive —
243239
// the kernel binds one placeholder style per statement. Use the SAME error
244240
// type and message as the Thrift backend (`ThriftSessionBackend`) so a
@@ -256,14 +252,8 @@ export default class SeaSessionBackend implements ISessionBackend {
256252
if (namedParams !== undefined) {
257253
execOptions.namedParams = namedParams;
258254
}
259-
// `queryTimeoutSecs` is forwarded only on the SYNC path (the caller passes
260-
// it in): the kernel `execute()` consults it as the server statement
261-
// timeout. On the async submit path the caller omits it (the kernel ignores
262-
// it under `wait_timeout=0s`), so it is enforced client-side by the
263-
// operation backend's poll-loop deadline instead (see executeStatement).
264-
if (queryTimeoutSecs !== undefined) {
265-
execOptions.queryTimeoutSecs = queryTimeoutSecs;
266-
}
255+
// `queryTimeout` is intentionally NOT forwarded — it is a no-op on SEA (see
256+
// the TODO in executeStatement). It must not become the SEA `wait_timeout`.
267257
if (options.rowLimit !== undefined) {
268258
execOptions.rowLimit = Number(options.rowLimit);
269259
}

tests/unit/sea/execution.test.ts

Lines changed: 57 additions & 41 deletions
Original file line numberDiff line numberDiff line change
@@ -591,44 +591,59 @@ describe('SeaSessionBackend', () => {
591591
expect((thrown as Error).message).to.equal('Driver does not support both ordinal and named parameters.');
592592
});
593593

594-
it('executeStatement (sync default) DOES forward queryTimeout to the napi options', async () => {
594+
it('executeStatement does NOT forward queryTimeout — it is a no-op on SEA', async () => {
595595
const connection = new FakeNativeConnection();
596596
const session = makeSession(connection);
597597
await session.executeStatement('SELECT 1', { queryTimeout: 30 });
598-
// Sync path: the kernel `execute()` honours queryTimeoutSecs (server
599-
// statement timeout), so the backend forwards it onto the napi options.
600-
expect((connection.lastOptions as { queryTimeoutSecs?: number } | undefined)?.queryTimeoutSecs).to.equal(30);
598+
// queryTimeout is intentionally a NO-OP on SEA (see SeaSessionBackend): the
599+
// kernel would otherwise map queryTimeoutSecs onto the SEA `wait_timeout` wire
600+
// field (valid range {0} ∪ [5,50]s) — a different concept from a statement
601+
// timeout, and out-of-range values fail HTTP 400. The proper mechanism is the
602+
// `STATEMENT_TIMEOUT` session config (deferred). So it is neither forwarded
603+
// nor allowed to synthesise an options object.
604+
expect(connection.lastOptions).to.equal(undefined);
601605
});
602606

603-
it('executeStatement (runAsync: true) does NOT forward queryTimeout to submit (kernel ignores it; enforced client-side)', async () => {
607+
it('executeStatement (runAsync: true) does NOT forward queryTimeout (no-op on SEA)', async () => {
604608
const connection = new FakeNativeConnection();
605609
const session = makeSession(connection);
606610
await session.executeStatement('SELECT 1', { queryTimeout: 30, runAsync: true });
607-
// Async submit path: the kernel ignores queryTimeoutSecs under
608-
// `wait_timeout=0s`, so it's enforced client-side by the poll deadline
609-
// instead — never forwarded to the napi options.
610611
expect((connection.lastOptions as { queryTimeoutSecs?: number } | undefined)?.queryTimeoutSecs).to.equal(undefined);
611612
});
612613

613-
it('coerces an Int64 queryTimeout into the client-side deadline on the async path (not NaN)', async function int64Timeout() {
614-
// Regression: `Number(new Int64(...))` yields NaN (node-int64 has no valueOf),
615-
// which would silently disable the deadline. The backend must coerce via
616-
// numberToInt64(...).toNumber() so an Int64 queryTimeout still bounds the poll.
617-
// Exercised on the async path, where the client-side poll deadline applies.
618-
// eslint-disable-next-line no-invalid-this
619-
this.timeout(5000);
614+
it('accepts an Int64 queryTimeout without forwarding it or crashing (no-op on SEA)', async () => {
615+
// queryTimeout is a no-op on SEA, but the option must still be ACCEPTED for
616+
// any value type the public API allows (number | bigint | Int64) without
617+
// throwing or synthesising napi options. Int64 is the awkward case
618+
// (node-int64 has no valueOf), so assert it specifically.
620619
const connection = new FakeNativeConnection();
621-
connection.submitStatusValue = 'Running'; // never reaches a terminal state
622620
const session = makeSession(connection);
623-
const op = await session.executeStatement('SELECT 1', { queryTimeout: new Int64(1), runAsync: true });
624-
let thrown: unknown;
625-
try {
626-
await op.waitUntilReady();
627-
} catch (err) {
628-
thrown = err;
629-
}
630-
expect(thrown, 'Int64(1) timeout must fire — NaN would poll forever').to.be.instanceOf(OperationStateError);
631-
expect((thrown as OperationStateError).errorCode).to.equal(OperationStateErrorCode.Timeout);
621+
await session.executeStatement('SELECT 1', { queryTimeout: new Int64(1) });
622+
expect((connection.lastOptions as { queryTimeoutSecs?: number } | undefined)?.queryTimeoutSecs).to.equal(undefined);
623+
expect(connection.lastOptions).to.equal(undefined);
624+
});
625+
626+
it('Option A: sync executeStatement drives the statement to terminal before returning', async () => {
627+
const connection = new FakeNativeConnection();
628+
const session = makeSession(connection);
629+
const op = await session.executeStatement('SELECT 1', {});
630+
// The default sync path blocks to terminal in executeStatement (matching
631+
// JDBC / ADBC C# / Python use_kernel), so the returned op is already finished
632+
// — status reports Succeeded with no explicit waitUntilReady()/fetch. This is
633+
// what makes fire-and-forget DDL/DML and dependent statements "just work".
634+
const status = await op.status(false);
635+
expect(status.state).to.equal(OperationState.Succeeded);
636+
});
637+
638+
it('runAsync: true does NOT drive to terminal in executeStatement (returns a pending, cancellable handle)', async () => {
639+
const connection = new FakeNativeConnection();
640+
connection.submitStatusValue = 'Running';
641+
const session = makeSession(connection);
642+
const op = await session.executeStatement('SELECT 1', { runAsync: true });
643+
// The async path returns immediately with a running handle so the caller can
644+
// poll / cancel mid-run (the place to do mid-run cancellation under Option A).
645+
const status = await op.status(false);
646+
expect(status.state).to.equal(OperationState.Running);
632647
});
633648

634649
it('executeStatement forwards rowLimit', async () => {
@@ -690,25 +705,26 @@ describe('SeaSessionBackend', () => {
690705
}
691706
});
692707

693-
// Genuinely unsupported on SEA — rejected (rather than silently ignored) so
694-
// a caller/agent gets signal instead of a no-op. queryTags / queryTimeout /
695-
// rowLimit are NOT here — they are forwarded (asserted above).
696-
for (const { name, options, re } of [
697-
{ name: 'useCloudFetch', options: { useCloudFetch: true }, re: /useCloudFetch/ },
698-
{ name: 'useLZ4Compression', options: { useLZ4Compression: true }, re: /useLZ4Compression/ },
699-
{ name: 'stagingAllowedLocalPath', options: { stagingAllowedLocalPath: '/tmp' }, re: /stagingAllowedLocalPath/ },
708+
// Not per-statement knobs on the kernel-backed SEA path — ACCEPTED and IGNORED
709+
// (no-op) rather than rejected, matching the Python connector's kernel backend
710+
// (`KernelDatabricksClient.execute_command`, which takes the same flags and
711+
// never reads them). A hard failure would break callers that set these
712+
// globally; the kernel exposes no per-statement override for any of them
713+
// (CloudFetch is the session-level ResultConfig, compression is decoded from
714+
// the manifest, and volume operations have no kernel API yet).
715+
for (const { name, options } of [
716+
{ name: 'useCloudFetch', options: { useCloudFetch: true } },
717+
{ name: 'useLZ4Compression', options: { useLZ4Compression: true } },
718+
{ name: 'stagingAllowedLocalPath', options: { stagingAllowedLocalPath: '/tmp' } },
700719
] as const) {
701-
it(`executeStatement rejects ${name} rather than silently ignoring it`, async () => {
720+
it(`executeStatement accepts and ignores ${name} (no throw, not forwarded)`, async () => {
702721
const connection = new FakeNativeConnection();
703722
const session = makeSession(connection);
704-
let thrown: unknown;
705-
try {
706-
await session.executeStatement('SELECT 1', options);
707-
} catch (err) {
708-
thrown = err;
709-
}
710-
expect(thrown).to.be.instanceOf(HiveDriverError);
711-
expect((thrown as Error).message).to.match(re);
723+
// Must not throw...
724+
await session.executeStatement('SELECT 1', options);
725+
// ...and must not be forwarded to the napi options (it has no kernel knob),
726+
// so a SELECT-1 with only this option still uses the no-options fast path.
727+
expect(connection.lastOptions).to.equal(undefined);
712728
});
713729
}
714730

0 commit comments

Comments
 (0)