Skip to content

Commit fdba15b

Browse files
committed
feat(kernel): route OAuth identically to Thrift (strict parity) + custom U2M client id
Make the kernel auth flow selector + client-id resolution byte-for-byte identical to the Thrift driver (`DBSQLClient.createAuthProvider`): - flow = `oauthClientSecret === undefined ? U2M : M2M` (strict undefined) - clientId = `oauthClientId ?? defaultClientId` (`??` guards null/undefined only) No blank/reserved normalization on the OAuth fields — a present-but-degenerate value (`""` / `"undefined"` / whitespace) is forwarded verbatim, exactly as Thrift forwards it. This removes every divergence from the Thrift backend across the full (clientId × clientSecret) input matrix (verified). Consequences vs the prior id-presence routing: - `oauthClientId` + no secret now runs U2M (browser) and forwards the id as a custom U2M client (Thrift does the same). - M2M with no/blank id no longer throws "id required" — it uses the default client (`?? defaultClientId`), matching Thrift. - A present-but-empty/`"undefined"` secret routes to M2M (not U2M), matching Thrift's strict `=== undefined` check. Trade-off (accepted for parity): this re-imports Thrift's env-stringification behaviour — a secret/id env var that resolved to `""`/`"undefined"` is taken literally rather than treated as unset. Updated the auth unit tests (u2m/m2m/edge-cases) to assert the strict-parity matrix. Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
1 parent 5be4626 commit fdba15b

4 files changed

Lines changed: 134 additions & 254 deletions

File tree

lib/kernel/KernelAuth.ts

Lines changed: 31 additions & 59 deletions
Original file line numberDiff line numberDiff line change
@@ -42,6 +42,11 @@ const U2M_DEFAULT_SCOPES = ['sql', 'offline_access'];
4242
// (parity with pyo3, which forwards `scopes` on M2M).
4343
const M2M_DEFAULT_SCOPES = ['all-apis'];
4444

45+
// Default OAuth client id — identical to the Thrift driver's
46+
// `DatabricksOAuthManager.defaultClientId` and the kernel napi's own U2M default.
47+
// Used for `oauthClientId ?? default`, mirroring Thrift's `getClientId()`.
48+
const DEFAULT_OAUTH_CLIENT_ID = 'databricks-sql-connector';
49+
4550
/**
4651
* Shape consumed by the napi-binding's `openSession()` (see
4752
* `native/kernel/index.d.ts`). Mirrors `ConnectionOptions` in the binding's
@@ -211,6 +216,7 @@ export type KernelNativeConnectionOptions = KernelSessionDefaults &
211216
authMode: 'OAuthU2m';
212217
oauthRedirectPort: number;
213218
oauthScopes?: Array<string>;
219+
oauthClientId?: string;
214220
}
215221
);
216222

@@ -596,40 +602,18 @@ export function buildKernelConnectionOptions(options: ConnectionOptions): Kernel
596602
);
597603
}
598604

599-
// Flow selector — DELIBERATELY DIFFERENT from thrift's
600-
// `DBSQLClient.createAuthProvider` (`DBSQLClient.ts:216`), which keys off
601-
// the secret (`oauthClientSecret === undefined ? U2M : M2M`). kernel keys off
602-
// `oauthClientId` *presence* (the "do I have an id?" signal) instead, so a
603-
// user who set an id but typoed/forgot the secret gets the actionable M2M
604-
// "secret is required" error rather than being silently routed to U2M
605-
// (which would hide their intent). Cost: `id + no secret` throws here
606-
// where thrift would run U2M, and kernel U2M has no custom-client-id support
607-
// (see buildKernelConnectionOptions header). The U2M arm still defends against an id
608-
// sneaking through: fires only when `oauthClientId` is provided as
609-
// a blank-reserved literal (e.g., whitespace, `"null"`, `"undefined"`)
610-
// alongside an absent/blank secret — both `idIsBlank` and
611-
// `secretIsBlank` are true so U2M wins routing, but the caller's
612-
// intent to use U2M with a partially-set id is ambiguous and
613-
// rejected explicitly.
614-
const idIsBlank =
615-
oauth.oauthClientId === undefined ||
616-
(typeof oauth.oauthClientId === 'string' && isBlankOrReserved(oauth.oauthClientId));
617-
const secretIsBlank =
618-
oauth.oauthClientSecret === undefined ||
619-
(typeof oauth.oauthClientSecret === 'string' && isBlankOrReserved(oauth.oauthClientSecret));
620-
621-
if (idIsBlank && secretIsBlank) {
622-
// U2M — neither id nor secret supplied.
623-
if (oauth.oauthClientId !== undefined) {
624-
// Defense-in-depth: id was set but blank/reserved literal.
625-
// The kernel hardcodes `client_id = "databricks-cli"` for U2M;
626-
// there's no JS-side override knob.
627-
throw new HiveDriverError(
628-
'kernel backend: `oauthClientId` is not supported on the OAuth U2M flow; ' +
629-
"the kernel uses the built-in 'databricks-cli' client. " +
630-
'Omit `oauthClientId` for U2M, or supply `oauthClientSecret` for the M2M flow.',
631-
);
632-
}
605+
// Flow selector + client-id resolution mirror the Thrift driver EXACTLY
606+
// (`DBSQLClient.createAuthProvider`, DBSQLClient.ts:220):
607+
// flow = oauthClientSecret === undefined ? U2M : M2M (strict undefined)
608+
// clientId = oauthClientId ?? defaultClientId (`??` guards null/undefined only)
609+
// No blank/reserved normalization on the OAuth fields — a present-but-
610+
// degenerate value (`""`, `"undefined"`, whitespace) is forwarded verbatim,
611+
// exactly as Thrift forwards it, so the kernel path does not diverge from the
612+
// Thrift backend. (This intentionally re-imports Thrift's env-stringification
613+
// behaviour: a secret that resolved to `""`/`"undefined"` counts as a real
614+
// secret ⇒ M2M, just like Thrift.)
615+
if (oauth.oauthClientSecret === undefined) {
616+
// U2M (browser) — no secret, exactly like Thrift.
633617
if (oauth.persistence !== undefined) {
634618
throw new HiveDriverError(
635619
'kernel backend: `persistence` (custom OAuth token store) is not yet wired through ' +
@@ -640,31 +624,21 @@ export function buildKernelConnectionOptions(options: ConnectionOptions): Kernel
640624
'when the kernel exposes it.',
641625
);
642626
}
643-
return {
627+
const u2m = {
644628
...base,
645-
authMode: 'OAuthU2m',
629+
authMode: 'OAuthU2m' as const,
646630
oauthRedirectPort: U2M_DEFAULT_REDIRECT_PORT,
647-
// Pass scopes explicitly so the kernel requests the same set as the
648-
// Thrift driver (`sql offline_access`) rather than its bare-Rust
649-
// `all-apis offline_access` default. Caller can override via `oauthScopes`.
631+
// Scopes default to Thrift parity (`sql offline_access`); overridable.
650632
oauthScopes:
651-
Array.isArray(oauth.oauthScopes) && oauth.oauthScopes.length > 0
652-
? oauth.oauthScopes
653-
: U2M_DEFAULT_SCOPES,
633+
Array.isArray(oauth.oauthScopes) && oauth.oauthScopes.length > 0 ? oauth.oauthScopes : U2M_DEFAULT_SCOPES,
654634
};
635+
// clientId: Thrift uses `oauthClientId ?? default`. Forward it verbatim
636+
// when set; when absent the napi applies the same default
637+
// (`databricks-sql-connector`), so omitting it is identical to Thrift.
638+
return oauth.oauthClientId !== undefined ? { ...u2m, oauthClientId: oauth.oauthClientId } : u2m;
655639
}
656640

657-
// M2M.
658-
if (typeof oauth.oauthClientId !== 'string' || isBlankOrReserved(oauth.oauthClientId)) {
659-
throw new AuthenticationError(
660-
'kernel backend: `oauthClientId` is required (non-empty, non-whitespace) for OAuth M2M.',
661-
);
662-
}
663-
if (typeof oauth.oauthClientSecret !== 'string' || isBlankOrReserved(oauth.oauthClientSecret)) {
664-
throw new AuthenticationError(
665-
'kernel backend: `oauthClientSecret` must be a non-empty non-whitespace string for OAuth M2M.',
666-
);
667-
}
641+
// M2M (client credentials) — a secret is present, exactly like Thrift.
668642
if (oauth.persistence !== undefined) {
669643
throw new HiveDriverError(
670644
'kernel backend: `persistence` is not supported on OAuth M2M ' +
@@ -674,14 +648,12 @@ export function buildKernelConnectionOptions(options: ConnectionOptions): Kernel
674648
return {
675649
...base,
676650
authMode: 'OAuthM2m',
677-
oauthClientId: oauth.oauthClientId,
651+
// Thrift: `getClientId()` = `oauthClientId ?? defaultClientId`.
652+
oauthClientId: oauth.oauthClientId ?? DEFAULT_OAUTH_CLIENT_ID,
678653
oauthClientSecret: oauth.oauthClientSecret,
679-
// Configurable (parity with pyo3); defaults to `['all-apis']` — the only
680-
// scope the client-credentials flow allows, matching Thrift + the kernel.
654+
// Configurable (parity with pyo3); defaults to `['all-apis']`.
681655
oauthScopes:
682-
Array.isArray(oauth.oauthScopes) && oauth.oauthScopes.length > 0
683-
? oauth.oauthScopes
684-
: M2M_DEFAULT_SCOPES,
656+
Array.isArray(oauth.oauthScopes) && oauth.oauthScopes.length > 0 ? oauth.oauthScopes : M2M_DEFAULT_SCOPES,
685657
};
686658
}
687659

tests/unit/kernel/auth-edge-cases.test.ts

Lines changed: 62 additions & 149 deletions
Original file line numberDiff line numberDiff line change
@@ -67,126 +67,59 @@ describe('KernelAuth — edge cases (input validation + ambiguity)', () => {
6767
}
6868
});
6969

70-
// Round-4 NF3-2: presence of `oauthClientId` signals M2M intent.
71-
// A blank/reserved-literal `oauthClientSecret` is then a missing-secret
72-
// typo, not a request to fall back to U2M. Surface the M2M "secret
73-
// required" AuthenticationError so the user fixes the real problem
74-
// rather than swap class to a HiveDriverError pointing at a flow
75-
// they didn't intend to use.
76-
it('rejects mixed-case reserved-literal oauthClientSecret with AuthenticationError when id is set', () => {
77-
const opts: ConnectionOptions = {
78-
host: 'example.cloud.databricks.com',
79-
path: '/sql/1.0/warehouses/abc',
80-
authType: 'databricks-oauth',
81-
oauthClientId: 'client-uuid',
82-
oauthClientSecret: 'NULL',
83-
};
84-
85-
expect(() => buildKernelConnectionOptions(opts)).to.throw(
86-
AuthenticationError,
87-
/oauthClientSecret.*non-empty.*OAuth M2M/,
88-
);
89-
});
90-
91-
it('rejects whitespace-only oauthClientId on M2M', () => {
92-
const opts: ConnectionOptions = {
93-
host: 'example.cloud.databricks.com',
94-
path: '/sql/1.0/warehouses/abc',
95-
authType: 'databricks-oauth',
96-
oauthClientId: ' ',
97-
oauthClientSecret: 'dose-fake-secret',
98-
};
99-
100-
expect(() => buildKernelConnectionOptions(opts)).to.throw(AuthenticationError, /oauthClientId.*required/);
101-
});
102-
103-
it('rejects whitespace-only oauthClientSecret with AuthenticationError when oauthClientId is set (M2M intent)', () => {
104-
const opts: ConnectionOptions = {
105-
host: 'example.cloud.databricks.com',
106-
path: '/sql/1.0/warehouses/abc',
107-
authType: 'databricks-oauth',
108-
oauthClientId: 'client-uuid',
109-
oauthClientSecret: '\n\t',
110-
};
111-
112-
expect(() => buildKernelConnectionOptions(opts)).to.throw(
113-
AuthenticationError,
114-
/oauthClientSecret.*non-empty.*OAuth M2M/,
115-
);
116-
});
117-
118-
it('rejects literal "undefined" as oauthClientId on M2M', () => {
119-
const opts: ConnectionOptions = {
120-
host: 'example.cloud.databricks.com',
121-
path: '/sql/1.0/warehouses/abc',
122-
authType: 'databricks-oauth',
123-
oauthClientId: 'undefined',
124-
oauthClientSecret: 'dose-fake-secret',
125-
};
126-
127-
expect(() => buildKernelConnectionOptions(opts)).to.throw(AuthenticationError, /oauthClientId.*required/);
128-
});
129-
130-
it('rejects literal "undefined" as oauthClientSecret with AuthenticationError when id is set (M2M intent)', () => {
131-
const opts: ConnectionOptions = {
132-
host: 'example.cloud.databricks.com',
133-
path: '/sql/1.0/warehouses/abc',
134-
authType: 'databricks-oauth',
135-
oauthClientId: 'client-uuid',
136-
oauthClientSecret: 'undefined',
137-
};
138-
139-
expect(() => buildKernelConnectionOptions(opts)).to.throw(
140-
AuthenticationError,
141-
/oauthClientSecret.*non-empty.*OAuth M2M/,
142-
);
143-
});
144-
145-
// Round-4 NF3-2: pin the exact class against the round-3 NF-N3
146-
// regression where M2M-with-empty-secret was routed through the U2M
147-
// arm and raised a bare `HiveDriverError`. `instanceof
148-
// AuthenticationError` correctly returns `false` for a bare
149-
// `HiveDriverError` instance (instanceof is a one-way subclass
150-
// check), so the subclass check IS sufficient to catch the
151-
// regression. We don't add an `error.name` or `constructor.name`
152-
// belt — the former requires `this.name` on the subclass (LE4-1
153-
// handles that separately for downstream-consumer benefit, not for
154-
// this test), and the latter is bundler-fragile (terser/esbuild
155-
// strip class names without `keep_classnames`).
156-
it('M2M-with-empty-secret throws AuthenticationError, not bare HiveDriverError (class pin)', () => {
157-
const opts: ConnectionOptions = {
158-
host: 'example.cloud.databricks.com',
159-
path: '/sql/1.0/warehouses/abc',
160-
authType: 'databricks-oauth',
161-
oauthClientId: 'x',
162-
oauthClientSecret: '',
163-
};
70+
// Strict Thrift parity: flow = `oauthClientSecret === undefined ? U2M : M2M`,
71+
// and OAuth fields are forwarded VERBATIM (no blank/reserved normalization),
72+
// exactly as the Thrift driver does. A present-but-degenerate secret
73+
// (`""` / whitespace / `"undefined"`) therefore counts as a real secret ⇒ M2M
74+
// — byte-for-byte with Thrift (which only routes to U2M when the secret is
75+
// strictly `undefined`). The id rides through as `oauthClientId ?? default`.
76+
const m2mDegenerateSecret = [
77+
{ label: 'reserved-literal "NULL"', secret: 'NULL' },
78+
{ label: 'whitespace-only', secret: '\n\t' },
79+
{ label: 'literal "undefined"', secret: 'undefined' },
80+
{ label: 'empty string', secret: '' },
81+
];
82+
for (const { label, secret } of m2mDegenerateSecret) {
83+
it(`routes id + ${label} secret to M2M (secret !== undefined ⇒ M2M, like Thrift)`, () => {
84+
const native = buildKernelConnectionOptions({
85+
host: 'example.cloud.databricks.com',
86+
path: '/sql/1.0/warehouses/abc',
87+
authType: 'databricks-oauth',
88+
oauthClientId: 'client-uuid',
89+
oauthClientSecret: secret,
90+
} as ConnectionOptions);
91+
expect(native.authMode).to.equal('OAuthM2m');
92+
expect((native as { oauthClientId?: string }).oauthClientId).to.equal('client-uuid');
93+
});
94+
}
16495

165-
expect(() => buildKernelConnectionOptions(opts)).to.throw(
166-
AuthenticationError,
167-
/oauthClientSecret.*non-empty.*OAuth M2M/,
168-
);
169-
});
96+
// Degenerate ids on M2M are forwarded verbatim (`oauthClientId ?? default`
97+
// keeps a non-nullish value), NOT rejected — matching Thrift's getClientId().
98+
for (const id of [' ', 'undefined']) {
99+
it(`forwards a degenerate oauthClientId (${JSON.stringify(id)}) verbatim on M2M`, () => {
100+
const native = buildKernelConnectionOptions({
101+
host: 'example.cloud.databricks.com',
102+
path: '/sql/1.0/warehouses/abc',
103+
authType: 'databricks-oauth',
104+
oauthClientId: id,
105+
oauthClientSecret: 'dose-fake-secret',
106+
} as ConnectionOptions);
107+
expect(native.authMode).to.equal('OAuthM2m');
108+
expect((native as { oauthClientId?: string }).oauthClientId).to.equal(id);
109+
});
110+
}
170111

171-
// Round-5 DA4-2: the round-3 → round-4 test flips left the U2M-arm
172-
// defense-in-depth U2M+id rejection without coverage. It's still
173-
// reachable: when `oauthClientId` is a blank-reserved literal
174-
// (whitespace, `"null"`, `"undefined"`) AND `oauthClientSecret` is
175-
// absent/blank, BOTH `idIsBlank` and `secretIsBlank` are true so
176-
// U2M wins routing — but a non-undefined id signals ambiguity that
177-
// U2M cannot honor (the kernel hardcodes `databricks-cli`).
178-
it('routes a whitespace oauthClientId with no oauthClientSecret to the U2M defense-in-depth rejection', () => {
179-
const opts: ConnectionOptions = {
112+
// No secret ⇒ U2M, and a degenerate id is forwarded verbatim too (Thrift's
113+
// `oauthClientId ?? default` keeps a non-nullish whitespace id).
114+
it('routes a whitespace oauthClientId with no secret to U2M, forwarding the id verbatim', () => {
115+
const native = buildKernelConnectionOptions({
180116
host: 'example.cloud.databricks.com',
181117
path: '/sql/1.0/warehouses/abc',
182118
authType: 'databricks-oauth',
183119
oauthClientId: ' ',
184-
} as unknown as ConnectionOptions;
185-
186-
expect(() => buildKernelConnectionOptions(opts)).to.throw(
187-
HiveDriverError,
188-
/oauthClientId.*not supported on the OAuth U2M flow/,
189-
);
120+
} as unknown as ConnectionOptions);
121+
expect(native.authMode).to.equal('OAuthU2m');
122+
expect((native as { oauthClientId?: string }).oauthClientId).to.equal(' ');
190123
});
191124
});
192125

@@ -260,41 +193,21 @@ describe('KernelAuth — edge cases (input validation + ambiguity)', () => {
260193
// `process.env.MY_SECRET || ''` shape) should route to U2M, not
261194
// to the M2M arm with an "empty secret" rejection. M2M's error
262195
// message would never mention U2M, leaving the user stuck.
263-
it('routes blank oauthClientSecret to U2M (not to an M2M-blank-secret rejection)', () => {
264-
const opts: ConnectionOptions = {
265-
host: 'example.cloud.databricks.com',
266-
path: '/sql/1.0/warehouses/abc',
267-
authType: 'databricks-oauth',
268-
oauthClientSecret: '',
269-
};
270-
271-
const native = buildKernelConnectionOptions(opts);
272-
expect(native.authMode).to.equal('OAuthU2m');
273-
});
274-
275-
it('routes whitespace-only oauthClientSecret to U2M too', () => {
276-
const opts: ConnectionOptions = {
277-
host: 'example.cloud.databricks.com',
278-
path: '/sql/1.0/warehouses/abc',
279-
authType: 'databricks-oauth',
280-
oauthClientSecret: ' \t ',
281-
};
282-
283-
const native = buildKernelConnectionOptions(opts);
284-
expect(native.authMode).to.equal('OAuthU2m');
285-
});
286-
287-
it('routes literal-"undefined" oauthClientSecret to U2M too', () => {
288-
const opts: ConnectionOptions = {
289-
host: 'example.cloud.databricks.com',
290-
path: '/sql/1.0/warehouses/abc',
291-
authType: 'databricks-oauth',
292-
oauthClientSecret: 'undefined',
293-
};
294-
295-
const native = buildKernelConnectionOptions(opts);
296-
expect(native.authMode).to.equal('OAuthU2m');
297-
});
196+
// Strict Thrift parity: a present-but-degenerate secret (no id) is still a
197+
// defined secret ⇒ M2M (only a strictly-`undefined` secret routes to U2M).
198+
// With no id, the client defaults via `oauthClientId ?? default`.
199+
for (const secret of ['', ' \t ', 'undefined']) {
200+
it(`routes a degenerate-but-present oauthClientSecret (${JSON.stringify(secret)}, no id) to M2M`, () => {
201+
const native = buildKernelConnectionOptions({
202+
host: 'example.cloud.databricks.com',
203+
path: '/sql/1.0/warehouses/abc',
204+
authType: 'databricks-oauth',
205+
oauthClientSecret: secret,
206+
} as ConnectionOptions);
207+
expect(native.authMode).to.equal('OAuthM2m');
208+
expect((native as { oauthClientId?: string }).oauthClientId).to.equal('databricks-sql-connector');
209+
});
210+
}
298211
});
299212

300213
describe('explicit-undefined vs missing for Azure-direct discriminants', () => {

0 commit comments

Comments
 (0)