
Commit 36f05bf

[SEA-NodeJS] SEA connection & statement options
Wire the SEA connection-level and per-statement option surfaces onto the merged-kernel napi binding (thin forwarding — the kernel owns the behaviour):

Connection options (SeaAuth.buildSeaConnectionOptions):
- `maxConnections` → kernel pool size, validated as a positive integer within the napi u32 range.
- TLS: `checkServerCertificate` (secure-by-default — omit to keep the kernel's verify-on default; `false` opts into insecure) and `customCaCert` (PEM string or Buffer; strings are PEM-sanity-checked and normalised to a Buffer before the FFI boundary), via the new `buildSeaTlsOptions`.
- `intervalsAsString: true` is always set so SEA interval/duration columns render as strings — a byte-compatible drop-in for the Thrift backend. `complexTypesAsJson` is intentionally left at the kernel default (native Arrow), which already decodes identically to Thrift via the shared converter.

Statement options (SeaSessionBackend.executeStatement, via buildExecuteOptions):
- `queryTimeout` → `queryTimeoutSecs`; `rowLimit` → `rowLimit` (SEA-only cap).
- `queryTags` serialised JS-side (reusing Thrift's `serializeQueryTags`) into the reserved `query_tags` conf key, merged with any explicit `statementConf` — the napi `queryTags` field can't carry null-valued tags, and the kernel rejects setting both. `queryTags` / `queryTimeout` are no longer rejected.
- Still rejected (genuinely unsupported on SEA): `useCloudFetch`, `useLZ4Compression`, `stagingAllowedLocalPath`.

`rowLimit` / `statementConf` added to the public `ExecuteStatementOptions`; SEA-only knobs (`maxConnections` / `checkServerCertificate` / `customCaCert`) added to the internal `InternalConnectionOptions`.

Validated against a live warehouse: secure-by-default connect, maxConnections, checkServerCertificate, rowLimit (caps rows), queryTimeout, queryTags, statementConf, and non-PEM customCaCert rejection.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
1 parent 9fe5357 commit 36f05bf

9 files changed

Lines changed: 393 additions & 56 deletions

lib/contracts/IDBSQLSession.ts

Lines changed: 12 additions & 0 deletions
@@ -27,6 +27,18 @@ export type ExecuteStatementOptions = {
    * These tags apply only to this statement and do not persist across queries.
    */
   queryTags?: Record<string, string | null | undefined>;
+  /**
+   * SEA-only: server-side row cap for this statement (kernel `row_limit`). The
+   * Thrift backend has no execute-time server cap, so this is a no-op there;
+   * use `maxRows` for the cross-backend client-side fetch limit.
+   */
+  rowLimit?: number;
+  /**
+   * SEA-only: per-statement Spark conf overlay (kernel `statement_conf`).
+   * Merged with the serialized `queryTags` (which land under the reserved
+   * `query_tags` key). Ignored by the Thrift backend.
+   */
+  statementConf?: Record<string, string>;
 };
 
 export type TypeInfoRequest = {

lib/contracts/InternalConnectionOptions.ts

Lines changed: 23 additions & 0 deletions
@@ -18,4 +18,27 @@ export interface InternalConnectionOptions {
    * @internal Not stable; M0 stub only.
    */
   useSEA?: boolean;
+
+  /**
+   * SEA-only: kernel connection-pool size (`ConnectionOptions.max_connections`).
+   * Validated as a positive integer within the napi `u32` range.
+   * @internal SEA path only.
+   */
+  maxConnections?: number;
+
+  /**
+   * SEA-only: verify the server's TLS certificate. Secure-by-default — omit
+   * to keep full chain + hostname verification; set `false` only to opt into
+   * the insecure accept-anything mode.
+   * @internal SEA path only.
+   */
+  checkServerCertificate?: boolean;
+
+  /**
+   * SEA-only: PEM-encoded CA certificate (string or `Buffer`) added to the
+   * trust store on top of the system roots — for TLS-inspecting proxies or
+   * on-prem internal CAs. Honoured regardless of `checkServerCertificate`.
+   * @internal SEA path only.
+   */
+  customCaCert?: Buffer | string;
 }
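
The `maxConnections` contract stated in the comment above (a positive integer that fits in a napi `u32`) can be sketched in isolation. This is a minimal illustration rather than the driver's code: `validateMaxConnections` and `isRejected` are hypothetical helpers, and the sketch throws a plain `RangeError` where the driver throws `HiveDriverError`.

```typescript
// Hypothetical standalone helper (assumption): the driver performs this check
// inside buildSeaConnectionOptions and throws HiveDriverError instead.
const MAX_U32 = 0xffffffff;

function validateMaxConnections(value: number | undefined): number | undefined {
  if (value === undefined) {
    return undefined; // option omitted: the kernel default stays in place
  }
  // Number.isInteger is false for NaN, Infinity, and fractional values.
  if (!Number.isInteger(value) || value < 1) {
    throw new RangeError(`maxConnections must be a positive integer; got ${value}`);
  }
  if (value > MAX_U32) {
    throw new RangeError(`maxConnections exceeds the u32 limit (${MAX_U32}); got ${value}`);
  }
  return value;
}

// Demonstration helper: true when the value is rejected.
function isRejected(value: number): boolean {
  try {
    validateMaxConnections(value);
    return false;
  } catch {
    return true;
  }
}

console.log(validateMaxConnections(16)); // 16
console.log(isRejected(0), isRejected(1.5), isRejected(2 ** 32)); // true true true
```

Checking the range on the JS side means an out-of-range value fails with a readable message before it reaches the napi boundary.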

lib/sea/SeaAuth.ts

Lines changed: 135 additions & 1 deletion
@@ -66,9 +66,58 @@ export interface SeaSessionDefaults {
   catalog?: string;
   schema?: string;
   sessionConf?: Record<string, string>;
+  /**
+   * Render `INTERVAL` / `DURATION` result columns as strings
+   * (kernel `ResultConfig.intervals_as_string`). The kernel default is
+   * native Arrow `month_interval` / `duration[us]`, but the NodeJS
+   * Thrift driver surfaces intervals as strings — so the SEA path sets
+   * this `true` so its result shape is a byte-compatible drop-in for the
+   * Thrift backend. Omitting it falls back to the kernel's native types.
+   */
+  intervalsAsString?: boolean;
+  /**
+   * Render complex (`ARRAY` / `MAP` / `STRUCT` / `VARIANT`) result
+   * columns as JSON strings (kernel `ResultConfig.complex_types_as_json`).
+   * Left unset on the SEA path: native Arrow nested types already decode
+   * identically to the Thrift backend through the shared Arrow converter,
+   * so forcing JSON here would *introduce* a divergence rather than
+   * remove one.
+   */
+  complexTypesAsJson?: boolean;
+  /**
+   * Per-session kernel connection-pool size
+   * (kernel `ConnectionOptions.max_connections`). Validated as a positive
+   * integer within the napi `u32` range by `buildSeaConnectionOptions`.
+   */
+  maxConnections?: number;
+}
+
+/**
+ * TLS options shared across all auth-mode variants. Mirror the napi
+ * binding's `ConnectionOptions.checkServerCertificate` / `.customCaCert`
+ * (kernel `Session::builder().tls(TlsConfig)`).
+ *
+ * The napi shape takes `customCaCert` as a `Buffer` only; the public
+ * `ConnectionOptions` additionally accepts a PEM string, which
+ * `buildSeaConnectionOptions` normalises to a `Buffer` before crossing
+ * the FFI boundary.
+ */
+export interface SeaTlsOptions {
+  /**
+   * Verify the server's TLS certificate. The SEA backend is
+   * **secure-by-default**: omitting this leaves the kernel default of
+   * `true` (full chain + hostname verification). Set `false` only to opt
+   * into the insecure, accept-anything mode (analogous to Thrift's
+   * `rejectUnauthorized: false`); prefer pairing strict checking with
+   * `customCaCert` over disabling verification entirely.
+   */
+  checkServerCertificate?: boolean;
+  /** PEM-encoded CA bytes to add to the trust store. */
+  customCaCert?: Buffer;
 }
 
 export type SeaNativeConnectionOptions = SeaSessionDefaults &
+  SeaTlsOptions &
   (
     | {
         hostName: string;
@@ -114,6 +163,57 @@ export function isBlankOrReserved(s: string): boolean {
   return normalized.length === 0 || normalized === 'undefined' || normalized === 'null';
 }
 
+/** napi-rs marshals `maxConnections` as a `u32`; reject values it can't hold. */
+const MAX_U32 = 0xffffffff;
+
+/**
+ * Normalise the public TLS options (`checkServerCertificate` /
+ * `customCaCert`) into the napi shape.
+ *
+ * - `checkServerCertificate` passes through verbatim (only when set; an
+ * absent value leaves the kernel default, which is secure — verify on).
+ * - `customCaCert` accepts a PEM string or `Buffer` on the public
+ * surface; we convert a string to a `Buffer` here and do a light PEM
+ * sanity check. The bytes are NOT parsed in JS — the kernel returns a
+ * meaningful error if the PEM is malformed.
+ *
+ * Throws `HiveDriverError` when `customCaCert` is supplied but empty or
+ * (for strings) lacks a PEM certificate header.
+ */
+export function buildSeaTlsOptions(options: ConnectionOptions): SeaTlsOptions {
+  const { checkServerCertificate, customCaCert } = options as {
+    checkServerCertificate?: boolean;
+    customCaCert?: Buffer | string;
+  };
+
+  const tls: SeaTlsOptions = {};
+
+  if (checkServerCertificate !== undefined) {
+    tls.checkServerCertificate = checkServerCertificate;
+  }
+
+  if (customCaCert !== undefined) {
+    if (typeof customCaCert === 'string') {
+      if (!customCaCert.includes('-----BEGIN CERTIFICATE-----')) {
+        throw new HiveDriverError(
+          'SEA backend: `customCaCert` string does not look like a PEM certificate ' +
+            "(missing '-----BEGIN CERTIFICATE-----'). Pass PEM text or a Buffer of PEM bytes.",
+        );
+      }
+      tls.customCaCert = Buffer.from(customCaCert, 'utf8');
+    } else if (Buffer.isBuffer(customCaCert)) {
+      if (customCaCert.length === 0) {
+        throw new HiveDriverError('SEA backend: `customCaCert` Buffer is empty.');
+      }
+      tls.customCaCert = customCaCert;
+    } else {
+      throw new HiveDriverError('SEA backend: `customCaCert` must be a PEM string or a Buffer.');
+    }
+  }
+
+  return tls;
+}
+
 /**
  * Validate the user-supplied `ConnectionOptions` and build the
  * napi-binding's connection-options shape.
@@ -170,11 +270,45 @@ export function isBlankOrReserved(s: string): boolean {
 export function buildSeaConnectionOptions(options: ConnectionOptions): SeaNativeConnectionOptions {
   const { authType } = options as { authType?: string };
 
-  const base = {
+  const base: {
+    hostName: string;
+    httpPath: string;
+    intervalsAsString: boolean;
+    maxConnections?: number;
+  } & SeaTlsOptions = {
     hostName: options.host,
     httpPath: prependSlash(options.path),
+    // Match the NodeJS Thrift driver, which surfaces INTERVAL columns as
+    // strings. The kernel defaults to native Arrow interval/duration types;
+    // forcing the string rendering here keeps the SEA path a byte-compatible
+    // drop-in. Complex types are intentionally left at the kernel default
+    // (native Arrow) — they already decode identically to Thrift via the
+    // shared Arrow converter, so `complexTypesAsJson` is not forced on.
+    intervalsAsString: true,
+    // TLS knobs (server-cert verification toggle + custom CA). Validated and
+    // normalised (string PEM → Buffer) here so the napi shape only sees a Buffer.
+    ...buildSeaTlsOptions(options),
   };
 
+  // SEA-only pool sizing; read via cast to match how this function reads the
+  // other SEA-specific options (TLS) — they live on the internal options
+  // surface, not the published public `ConnectionOptions` `.d.ts`.
+  const { maxConnections } = options as { maxConnections?: number };
+  if (maxConnections !== undefined) {
+    if (!Number.isInteger(maxConnections) || maxConnections < 1) {
+      throw new HiveDriverError(
+        `SEA backend: \`maxConnections\` must be a positive integer; got ${maxConnections}.`,
+      );
+    }
+    if (maxConnections > MAX_U32) {
+      throw new HiveDriverError(
+        `SEA backend: \`maxConnections\` exceeds the napi u32 limit (${MAX_U32}); got ${maxConnections}. ` +
+          'Typical pool sizes are 10-500.',
+      );
+    }
+    base.maxConnections = maxConnections;
+  }
+
   const oauth = options as {
     oauthClientId?: string;
     oauthClientSecret?: string;
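
To make the TLS normalisation in this file easier to follow, here is a minimal standalone sketch of the same rules: pass `checkServerCertificate` through only when set, convert a PEM string to a `Buffer` after a header check, and reject an empty `Buffer`. The names `normaliseTls`, `TlsOptionsSketch`, and `fakePem` are hypothetical, the PEM text is a placeholder rather than a real certificate, and the sketch throws a plain `Error` where the driver throws `HiveDriverError`.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical mirror of the napi-facing TLS shape (assumption, for illustration).
interface TlsOptionsSketch {
  checkServerCertificate?: boolean;
  customCaCert?: Buffer;
}

const PEM_HEADER = '-----BEGIN CERTIFICATE-----';

function normaliseTls(input: {
  checkServerCertificate?: boolean;
  customCaCert?: Buffer | string;
}): TlsOptionsSketch {
  const tls: TlsOptionsSketch = {};
  // Only forward the flag when the caller set it, so the secure default survives.
  if (input.checkServerCertificate !== undefined) {
    tls.checkServerCertificate = input.checkServerCertificate;
  }
  const ca = input.customCaCert;
  if (ca === undefined) {
    return tls;
  }
  if (typeof ca === 'string') {
    // Light sanity check only; the bytes are not parsed here.
    if (!ca.includes(PEM_HEADER)) {
      throw new Error('customCaCert string lacks a PEM certificate header');
    }
    tls.customCaCert = Buffer.from(ca, 'utf8');
  } else if (Buffer.isBuffer(ca)) {
    if (ca.length === 0) {
      throw new Error('customCaCert Buffer is empty');
    }
    tls.customCaCert = ca;
  } else {
    throw new Error('customCaCert must be a PEM string or a Buffer');
  }
  return tls;
}

// Placeholder PEM text, not a real certificate.
const fakePem = `${PEM_HEADER}\nAAAA\n-----END CERTIFICATE-----\n`;
const fromString = normaliseTls({ customCaCert: fakePem });
console.log(Buffer.isBuffer(fromString.customCaCert)); // true
console.log('checkServerCertificate' in normaliseTls({})); // false
```

Because the flag is omitted from the result unless the caller supplies it, an untouched configuration keeps certificate verification on.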

lib/sea/SeaSessionBackend.ts

Lines changed: 67 additions & 42 deletions
@@ -36,6 +36,7 @@ import { decodeNapiKernelError } from './SeaErrorMapping';
 import SeaOperationBackend from './SeaOperationBackend';
 import { buildSeaPositionalParams, buildSeaNamedParams } from './SeaPositionalParams';
 import { seaServerInfoValue } from './SeaServerInfo';
+import { serializeQueryTags } from '../utils';
 
 export interface SeaSessionBackendOptions {
   /** The opaque napi `Connection` handle returned by `openSession`. */
@@ -105,68 +106,42 @@ export default class SeaSessionBackend implements ISessionBackend {
   /**
    * Execute a SQL statement through the napi binding.
    *
-   * Catalog / schema / sessionConf were applied at session open, so
-   * there are no per-statement options to thread through.
+   * Catalog / schema / sessionConf are session-level (applied at open).
+   * Per-statement options forwarded to the kernel `ExecuteOptions`:
+   * - `ordinalParameters` / `namedParameters` → bound params (mutually
+   * exclusive — the kernel binds one placeholder style per statement);
+   * - `queryTimeout` → `queryTimeoutSecs` (SEA server wait timeout);
+   * - `rowLimit` → `rowLimit` (SEA-only server-side row cap);
+   * - `queryTags` → serialised into the conf overlay's reserved
+   * `query_tags` key (the same wire shape Thrift's `serializeQueryTags`
+   * produces), merged with any explicit `statementConf`.
    *
-   * M0 intentionally rejects `queryTimeout`, `namedParameters`, and
-   * `ordinalParameters` with explicit deferred-to-M1 errors. `useCloudFetch`
-   * is a no-op on the SEA path — the kernel hardcodes the SEA
-   * `disposition` to `INLINE_OR_EXTERNAL_LINKS`, and per-statement
-   * conf overrides have no reader on the kernel; cloud-fetch behaviour
-   * is governed entirely by the kernel's `ResultConfig` (M1 binding
-   * surface).
-   *
-   * The Thrift backend remains the path for consumers that need any
-   * of those today.
+   * Still rejected (genuinely unsupported on SEA, rather than silently
+   * dropped): `useCloudFetch` (governed by the kernel `ResultConfig`, not a
+   * per-statement knob), `useLZ4Compression` (kernel owns result compression),
+   * and `stagingAllowedLocalPath` (volume operations). `maxRows` is applied by
+   * the facade at fetch time, so it is intentionally not handled here.
    */
   public async executeStatement(statement: string, options: ExecuteStatementOptions): Promise<IOperationBackend> {
     this.failIfClosed();
 
-    // Positional (`?`) and named (`:name`) parameters are mutually exclusive —
-    // the kernel param codec binds exactly one placeholder style per
-    // statement, so passing both is a caller error we surface up-front.
-    const positionalParams = buildSeaPositionalParams(options.ordinalParameters);
-    const namedParams = buildSeaNamedParams(options.namedParameters);
-    if (positionalParams !== undefined && namedParams !== undefined) {
-      throw new HiveDriverError(
-        'SEA executeStatement: ordinalParameters and namedParameters are mutually exclusive — pass only one.',
-      );
-    }
-
-    if (options.queryTimeout !== undefined) {
-      throw new HiveDriverError('SEA executeStatement: queryTimeout is not supported in M0 (deferred to M1)');
-    }
     if (options.useCloudFetch !== undefined) {
       throw new HiveDriverError(
         'SEA executeStatement: useCloudFetch is controlled by the kernel result configuration and is not a per-statement option on SEA',
       );
     }
-    // Reject — rather than silently ignore — the remaining Thrift-path
-    // options the SEA M0 backend does not honor. Silently dropping them
-    // is the worst failure mode for an agent/caller: passing e.g.
-    // `queryTags` or `useLZ4Compression` would no-op with zero signal.
-    // (`maxRows` is intentionally NOT here — the facade applies it at
-    // fetch time.)
-    if (options.queryTags !== undefined) {
-      throw new HiveDriverError('SEA executeStatement: queryTags is not supported in M0 (deferred to M1)');
-    }
     if (options.useLZ4Compression !== undefined) {
       throw new HiveDriverError(
         'SEA executeStatement: useLZ4Compression is not supported on SEA (result compression is governed by the kernel)',
       );
     }
     if (options.stagingAllowedLocalPath !== undefined) {
       throw new HiveDriverError(
-        'SEA executeStatement: stagingAllowedLocalPath (volume operations) is not supported in M0 (deferred to M1)',
+        'SEA executeStatement: stagingAllowedLocalPath (volume operations) is not supported on SEA',
       );
     }
 
-    // Only build the napi options object when there is something to send —
-    // the no-params path keeps the minimal call shape (`executeStatement(sql)`).
-    let execOptions: SeaNativeExecuteOptions | undefined;
-    if (positionalParams !== undefined || namedParams !== undefined) {
-      execOptions = { positionalParams, namedParams };
-    }
+    const execOptions = this.buildExecuteOptions(options);
 
     let nativeStatement;
     try {
@@ -180,6 +155,56 @@ export default class SeaSessionBackend implements ISessionBackend {
     return this.wrapStatement(nativeStatement!);
   }
 
+  /**
+   * Translate the public `ExecuteStatementOptions` into the kernel napi
+   * `ExecuteOptions`, returning `undefined` when nothing is set so the
+   * no-options call shape (`executeStatement(sql)`) is preserved.
+   */
+  private buildExecuteOptions(options: ExecuteStatementOptions): SeaNativeExecuteOptions | undefined {
+    // Positional (`?`) and named (`:name`) parameters are mutually exclusive.
+    const positionalParams = buildSeaPositionalParams(options.ordinalParameters);
+    const namedParams = buildSeaNamedParams(options.namedParameters);
+    if (positionalParams !== undefined && namedParams !== undefined) {
+      throw new HiveDriverError(
+        'SEA executeStatement: ordinalParameters and namedParameters are mutually exclusive — pass only one.',
+      );
+    }
+
+    const execOptions: SeaNativeExecuteOptions = {};
+    if (positionalParams !== undefined) {
+      execOptions.positionalParams = positionalParams;
+    }
+    if (namedParams !== undefined) {
+      execOptions.namedParams = namedParams;
+    }
+    // JDBC `setQueryTimeout` is whole seconds; the kernel's `queryTimeoutSecs`
+    // (SEA wait timeout) is the native equivalent. The SEA wire caps it at 50s.
+    if (options.queryTimeout !== undefined) {
+      execOptions.queryTimeoutSecs = Number(options.queryTimeout);
+    }
+    if (options.rowLimit !== undefined) {
+      execOptions.rowLimit = Number(options.rowLimit);
+    }
+    // Per-statement conf overlay plus query tags. Tags are serialised JS-side
+    // into the reserved `query_tags` key (the same wire shape the Thrift
+    // backend produces via `serializeQueryTags` → `confOverlay`), rather than
+    // via the napi `queryTags` field: napi's `HashMap<String,String>` can't
+    // represent a null-valued tag, and the kernel rejects setting both the
+    // `queryTags` field and a `query_tags` conf key.
+    const serializedQueryTags = serializeQueryTags(options.queryTags);
+    if (options.statementConf !== undefined || serializedQueryTags !== undefined) {
+      const statementConf: Record<string, string> = { ...(options.statementConf ?? {}) };
+      if (serializedQueryTags !== undefined) {
+        statementConf.query_tags = serializedQueryTags;
+      }
+      if (Object.keys(statementConf).length > 0) {
+        execOptions.statementConf = statementConf;
+      }
+    }
+
+    return Object.keys(execOptions).length > 0 ? execOptions : undefined;
+  }
+
   /** Wrap a napi `Statement` (from execute or a metadata call) as an operation backend. */
   private wrapStatement(nativeStatement: SeaStatement): IOperationBackend {
     return new SeaOperationBackend({
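
The option translation in `buildExecuteOptions` can be sketched without the napi binding. The following is a simplified illustration under stated assumptions: `serializeTagsSketch` is a placeholder whose `key:value,key` output format is invented for the example (the driver uses its own `serializeQueryTags`, whose wire format is not shown in this diff), parameter binding is left out, and all `*Sketch` names are hypothetical.

```typescript
type QueryTags = Record<string, string | null | undefined>;

interface StatementOptionsSketch {
  queryTimeout?: number;
  rowLimit?: number;
  queryTags?: QueryTags;
  statementConf?: Record<string, string>;
}

interface NativeExecuteOptionsSketch {
  queryTimeoutSecs?: number;
  rowLimit?: number;
  statementConf?: Record<string, string>;
}

// ASSUMPTION: placeholder serializer with an invented format; it stands in
// for the driver's real `serializeQueryTags`.
function serializeTagsSketch(tags?: QueryTags): string | undefined {
  if (tags === undefined) return undefined;
  const entries = Object.entries(tags);
  if (entries.length === 0) return undefined;
  return entries.map(([key, value]) => (value == null ? key : `${key}:${value}`)).join(',');
}

function buildExecuteOptionsSketch(options: StatementOptionsSketch): NativeExecuteOptionsSketch | undefined {
  const exec: NativeExecuteOptionsSketch = {};
  if (options.queryTimeout !== undefined) exec.queryTimeoutSecs = Number(options.queryTimeout);
  if (options.rowLimit !== undefined) exec.rowLimit = Number(options.rowLimit);

  const serialized = serializeTagsSketch(options.queryTags);
  if (options.statementConf !== undefined || serialized !== undefined) {
    // Copy the caller's conf so the input object is never mutated.
    const conf: Record<string, string> = { ...(options.statementConf ?? {}) };
    // Assigned after the copy, so serialized tags replace an explicit `query_tags` entry.
    if (serialized !== undefined) conf.query_tags = serialized;
    if (Object.keys(conf).length > 0) exec.statementConf = conf;
  }
  // Nothing set: return undefined to keep the bare executeStatement(sql) call shape.
  return Object.keys(exec).length > 0 ? exec : undefined;
}

const built = buildExecuteOptionsSketch({
  queryTimeout: 30,
  queryTags: { team: 'etl', adhoc: null },
  statementConf: { 'spark.sql.ansi.enabled': 'true' },
});
console.log(built);
console.log(buildExecuteOptionsSketch({})); // undefined
console.log(buildExecuteOptionsSketch({ statementConf: {} })); // undefined
```

As in the diff, the serialized tags are written after the caller's `statementConf` is copied, so a `query_tags` entry supplied directly in `statementConf` is overwritten whenever `queryTags` is also present.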

tests/unit/sea/auth-m2m.test.ts

Lines changed: 2 additions & 0 deletions
@@ -35,6 +35,7 @@ describe('SeaAuth + SeaBackend — OAuth M2M auth flow', () => {
     expect(native).to.deep.equal({
       hostName: 'example.cloud.databricks.com',
       httpPath: '/sql/1.0/warehouses/abc',
+      intervalsAsString: true,
       authMode: 'OAuthM2m',
       oauthClientId: 'client-uuid',
       oauthClientSecret: 'dose-fake-secret',
@@ -165,6 +166,7 @@ describe('SeaAuth + SeaBackend — OAuth M2M auth flow', () => {
     expect(calls[0].args[0]).to.deep.equal({
       hostName: 'example.cloud.databricks.com',
       httpPath: '/sql/1.0/warehouses/abc',
+      intervalsAsString: true,
       authMode: 'OAuthM2m',
       oauthClientId: 'client-uuid',
       oauthClientSecret: 'dose-fake-secret',

tests/unit/sea/auth-pat.test.ts

Lines changed: 1 addition & 0 deletions
@@ -31,6 +31,7 @@ describe('SeaAuth — PAT auth options builder', () => {
     expect(native).to.deep.equal({
       hostName: 'example.cloud.databricks.com',
       httpPath: '/sql/1.0/warehouses/abc',
+      intervalsAsString: true,
       authMode: 'Pat',
       token: 'dapi-fake-pat',
     });
