Bump databricks-sdk-java from 0.69.0 to 0.106.0 #1464
Closed
msrathore-db wants to merge 1 commit into
SDK 0.106 introduces native AI-coding-agent detection in
com.databricks.sdk.core.UserAgent (new agentProvider() field +
listKnownAgents()/lookupAgentProvider() methods + an agent/<name> block
in asString()). SDK 0.69 had none of these.
The driver's existing AgentDetector.detect() injection into
UserAgent.withOtherInfo("agent", ...) at UserAgentManager.setUserAgent()
now layers on top of the SDK's built-in detection. Both fire on the same
env-var contract, producing two agent/<name> tokens in every SDK-routed
request's User-Agent header (verified live via mitmproxy: agent/x2 on
both SEA and Thrift on SDK 0.106 vs agent/x1 on SDK 0.69).
Remove the driver-side injection in setUserAgent so the SDK is the
single source for the agent/<name> token (and gains coverage for two
agents the driver's list misses: Augment and Windsurf).
The hand-built buildUserAgentForConnectorService path keeps its own
AgentDetector.detect() call because that bootstrap UA is constructed
via StringBuilder and never goes through UserAgent.asString() —
no SDK-side injection happens there.
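
The duplicate-token check described above can be sketched in plain Java. This is a minimal illustration, not the driver's or the SDK's code: the class name `AgentTokenCheck`, the sample header values, and the assumption that User-Agent entries are whitespace-separated `key/value` tokens are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class AgentTokenCheck {
    private static final String PREFIX = "agent/";

    /** Collects every whitespace-separated token of the form agent/<name>. */
    static List<String> agentTokens(String userAgent) {
        List<String> found = new ArrayList<>();
        for (String token : userAgent.trim().split("\\s+")) {
            // Require a non-empty name after the prefix.
            if (token.startsWith(PREFIX) && token.length() > PREFIX.length()) {
                found.add(token);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical header values; real captured headers carry more entries.
        String layered = "ExampleDriver/1.0 databricks-sdk-java/0.106.0 "
                + "agent/example-agent agent/example-agent";
        String single = "ExampleDriver/1.0 databricks-sdk-java/0.106.0 agent/example-agent";

        System.out.println(agentTokens(layered).size()); // prints 2
        System.out.println(agentTokens(single).size());  // prints 1
    }
}
```

A check like this could be run over headers exported from a proxy capture to confirm that exactly one `agent/<name>` token remains per request.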
The audit diffed every driver-imported SDK class between
0.69 and 0.106. The agent collision was the only behavior-impacting
delta on the driver's hot path. Other SDK additions (workspaceId /
accountId / discoveryUrl / tokenAudience auto-population in
DatabricksConfig.resolve(), CachedTokenSource dynamic stale period,
X-Databricks-Org-Id auto-injection in SDK service impls) either don't
reach driver code paths (the driver bypasses SDK service impls by
calling apiClient.execute() directly) or only populate when the
corresponding config field is null (driver-set values win).
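
The "populate only when null" behavior can be illustrated with a small sketch. It models the pattern only: `ConfigDefaultsSketch`, its `Config` class, and the discovered values are hypothetical stand-ins, not the SDK's `DatabricksConfig` or its actual resolution logic.

```java
import java.util.function.Supplier;

class ConfigDefaultsSketch {
    /** Hypothetical stand-in for a config object; not the SDK's DatabricksConfig. */
    static final class Config {
        String workspaceId;
        String accountId;
    }

    /** Keeps an explicitly set value; falls back to discovery only when unset. */
    static String fillIfNull(String current, Supplier<String> discovered) {
        return current != null ? current : discovered.get();
    }

    /** Assumed shape of an auto-population step: each field is filled only if null. */
    static Config resolve(Config cfg) {
        cfg.workspaceId = fillIfNull(cfg.workspaceId, () -> "discovered-workspace");
        cfg.accountId = fillIfNull(cfg.accountId, () -> "discovered-account");
        return cfg;
    }

    public static void main(String[] args) {
        Config cfg = new Config();
        cfg.workspaceId = "driver-set-workspace"; // set by the caller before resolve()
        resolve(cfg);
        System.out.println(cfg.workspaceId); // prints driver-set-workspace
        System.out.println(cfg.accountId);   // prints discovered-account
    }
}
```

Under this pattern, a value the caller sets before resolution is never overwritten, which is the property the audit relies on.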
Live verified on SDK 0.106 against pecotesting (Azure SPOG + Legacy)
and PECOAWS workspaces:
- PAT, M2M Databricks-OIDC, AAD SP, U2M browser, U2M refresh-token
cache reuse — all PASS
- Multi-chunk download (10k rows), prepared statements, complex
types (ARRAY/MAP/STRUCT), metadata APIs, 50x connection lifecycle
— all PASS
- mitm-captured wire: agent/x1 on both SEA and Thrift after the
one-line removal in setUserAgent
NO_CHANGELOG=false
Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
Closing in favor of a new PR from the internal branch. The forked-PR CI runs Maven in offline-mode-only and can't download new dependencies — SDK 0.106.0 isn't in the runner cache, causing all build jobs to fail at dependency resolution before any code is tested. Reopening from the internal
msrathore-db added a commit that referenced this pull request on May 22, 2026
## Summary

- Bumps `databricks-sdk-java` from **0.69.0 → 0.106.0** (37 minor versions).
- Removes the driver-side `AgentDetector.detect()` call in `UserAgentManager.setUserAgent()` to avoid duplicate `agent/<name>` User-Agent tokens. SDK 0.106 introduced native AI-coding-agent detection in `com.databricks.sdk.core.UserAgent` (new `agentProvider()` + 9-agent `listKnownAgents()`); layering the driver's own injection on top produced `agent/×2` on every SDK-routed request.

> Note: this PR replaces #1464 which failed CI on dependency resolution because forked-PR builds run Maven in offline-cache-only mode and SDK 0.106 isn't in the cache yet. Reopened from an internal branch so CI can fetch the new SDK artifact from Maven Central.

## Why this is safe to ship

Bytecode-diffed every driver-imported SDK class between 0.69 and 0.106. All 38 imports survive to 0.106 with compatible signatures (compile passes unchanged).

The `agent/<name>` collision was the **only** behavior-impacting delta on the driver's hot path. The other 0.69→0.106 deltas either don't reach driver code paths (driver bypasses SDK service impls via `apiClient.execute()` direct, so SDK's new `X-Databricks-Org-Id` auto-injection on `StatementExecutionImpl`/`AppsImpl`/etc. never fires for us), or only populate when the corresponding `DatabricksConfig` field is `null` (driver-set values continue to win, as verified by reading `resolveHostMetadata()` source).

The bootstrap `buildUserAgentForConnectorService` path retains its own `AgentDetector.detect()` call because that UA is hand-built via `StringBuilder` and never goes through `UserAgent.asString()`. Mitm-verified `agent/×1` on every wire request after the fix.

## What changed

**3 files, +2 / -4 lines total:**

```
pom.xml                                                             | 2 +-
src/main/java/com/databricks/jdbc/common/util/UserAgentManager.java | 3 ---
NEXT_CHANGELOG.md                                                   | 1 +
```

## Test plan

- [x] `mvn install` builds clean on SDK 0.106 with no other source changes
- [x] Mitm wire verified: `agent/×2` on SDK 0.106 pre-fix → `agent/×1` after, on both SEA and Thrift
- [x] PAT, M2M Databricks-OIDC, AAD SP: live PASS on pecotesting (Azure SPOG + Legacy) on SDK 0.106
- [x] U2M browser flow: live PASS on Azure SPOG, Azure Legacy, AWS workspace
- [x] U2M refresh-token cache reuse: live PASS on AWS (first run opens browser + caches; second run reuses cache, no browser, 6× faster)
- [x] Statement execution e2e: SELECT 1, NULL, multi-row 100, **10k-row multi-chunk**, prepared statement, ResultSetMetaData: all PASS
- [x] DatabaseMetaData APIs: `getCatalogs`, `getSchemas`, `getTableTypes`, `getDatabaseProductName`: all PASS
- [x] Complex types round-trip: ARRAY, MAP, STRUCT, DECIMAL+DATE: all PASS
- [x] 50× connection lifecycle on both PAT and M2M: no leaks, no token-cache pollution
- [x] Mocked: `CachedTokenSourceRefreshTest`, `AzureMsiMockTest`, `OidcDiscoveryTest`, `ErrorMappingTest`, `DefaultProfileResolutionTest`, `UserAgentTest`: all PASS

## Coverage gaps (not blocking; for transparency)

- **GCP workspace not live-tested**: SDK's `GoogleCredentialsCredentialsProvider`/`GoogleIdCredentialsProvider` bytecode is essentially unchanged 0.69 → 0.106 (logger refactor only). Recommend canary on GCP customers.
- **Real Azure MSI flow**: only mocked-IMDS test; needs an Azure VM for live coverage.

## Pre-existing driver issues observed but NOT addressed by this PR

These reproduce identically on SDK 0.69 and are not introduced by the bump. Each should be filed separately:

1. `EnableTokenFederation=1` default breaks vanilla U2M (workaround: set `EnableTokenFederation=0`). Wraps `ExternalBrowserCredentialsProvider` with `DatabricksTokenFederationProvider` which expects an external IdP token that vanilla U2M doesn't have.
2. Cross-cloud federation rejected by the `Cloud.AZURE`-only check at `ClientConfigurator.java:172` (blocks AAD external IdP → GCP/AWS workspace federation).
3. OAuth-M2M scope is not pluggable via JDBC URL: driver's `Auth_Scope` applies only to U2M and JWT M2M paths, not OAuth M2M `client_credentials`. Breaks federation when the external IdP is Azure AAD because AAD requires `<resource>/.default` and the SDK sends `all-apis` from `DatabricksConfig.getScopes()` default.

NO_CHANGELOG=false

This pull request was AI-assisted by Isaac.

---------

Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>