Commit 3204e67
authored
Make secure token storage the default storage mode (#5272)
## Why
Part of CLI GA. Storing long-lived OAuth refresh tokens for interactive
logins (`auth_type = databricks-cli`) in a plain JSON file in the user's
home directory is a security weakness: any process with home-directory
access can read them. The CLI is increasingly the entry point for local
agent workflows, so we want tokens in the OS-native secure store by
default.
The flip has to handle one big "but": not every system has a usable OS
keyring. Linux containers, headless SSH sessions, WSL1, and some CI
runners do not have a D-Bus session bus. On those systems the keyring is
not merely empty, it is not reachable at all. If we shipped the default
flip alone, every command on those systems would error out with a
cryptic backend message ("Cannot autolaunch D-Bus without X11 $DISPLAY",
etc.) until the user manually set `DATABRICKS_AUTH_STORAGE=plaintext` or
re-ran `databricks auth login`. That is a real support burden for the GA
window.
This PR ships the default flip together with the supporting UX. Three
pieces, each scoped narrowly:
1. **Pin-on-success** after a successful login on the default-secure
path, so a later transient keyring failure cannot silently demote a
working secure-storage user back to plaintext.
2. **Read-path fallback for unavailable keyrings only.** When the
keyring backend itself is unreachable (no D-Bus, no Secret Service
daemon, etc.), reads fall through to the file cache so pre-upgrade
`token-cache.json` entries stay accessible without manual configuration.
**This does not fire when the keyring is reachable but empty** — that is
the normal post-upgrade case, where we still surface a clear "run
`databricks auth login` to sign in" nudge so the user moves their tokens
into the secure store rather than silently keeping them in plaintext.
3. **Friendlier `ErrNotFound` messages** that tell the user what to do,
with upgrade-specific copy when a legacy `token-cache.json` is present.
## Behavior matrix
The distinction between "keyring not available" and "keyring available
but empty" drives most of the design:
| Scenario | Read-path behavior |
| --- | --- |
| Keyring reachable, has token for profile | Return token from keyring.
|
| Keyring reachable, no token for profile | `ErrNotFound` wrapped with
"no cached credentials; run `databricks auth login` to sign in" (or
upgrade copy if `token-cache.json` exists). User re-authenticates and
writes the new token to the keyring. |
| Keyring NOT reachable, default-secure mode | Silent fall-back to file
cache. Pre-upgrade tokens keep working. **No nudge, no error.** |
| Keyring NOT reachable, user explicitly chose secure (env / config) |
Return the keyring cache anyway. The actual `Lookup` surfaces the
unreachability rather than being silently downgraded against the user's
stated intent. |
| Keyring probe times out | Stay on the keyring. A locked keyring being
unlocked is the common timeout case; misdiagnosing it as "unavailable"
would silently route reads to a different backend. |
## Changes
Before: tokens were written to `~/.databricks/token-cache.json`. Setting
`DATABRICKS_AUTH_STORAGE=secure` opted in to the OS keyring.
Now:
- Default storage is the OS keyring (Keychain / Credential Manager /
Secret Service). Users re-run `databricks auth login` once after
upgrade.
- `DATABRICKS_AUTH_STORAGE=plaintext` or `[__settings__].auth_storage =
plaintext` opts back to the file cache. Env wins over config.
- After a successful login on the default-secure path, the CLI writes
`auth_storage = secure` to `[__settings__]`. Pins the choice so a later
transient probe failure cannot silently demote the user.
- Read paths cheap-probe the keyring with a read-only `Get` on a
non-existent account. If the backend is unreachable on the
default-secure path, the file cache is returned instead. The probe and
fall-back are scoped strictly to backend-unavailability, not to empty
results. Read-path fallback does NOT pin; pinning stays exclusive to
login, which has the stronger write-probe signal and an explicit user
action.
- `ErrNotFound` from `Lookup` is wrapped with actionable copy: generic
case "no cached credentials; run `databricks auth login` to sign in";
upgrade case (mode=secure AND `~/.databricks/token-cache.json` has
entries) "stored credentials from older CLI versions are no longer used;
run `databricks auth login` to sign in again, or set
`DATABRICKS_AUTH_STORAGE=plaintext` to keep using the file cache".
- Non-`ErrNotFound` keyring errors get wrapped with the same actionable
hint so users on no-keyring systems who somehow bypass the probe (e.g.
explicit-secure callers) see "OS keyring unreachable: ... (set
`DATABRICKS_AUTH_STORAGE=plaintext` or run `databricks auth login`)"
instead of a raw D-Bus message.
- Login-time silent fallback (already on `main` as dormant
infrastructure) activates and pins.
Implementation:
- `libs/auth/storage/mode.go`: resolver default flips from
`StorageModePlaintext` to `StorageModeSecure`. Constant doc comments
updated.
- `libs/auth/storage/cache.go`: drops "dormant today" comments. New
`PinSecureMode` (login-side pin) and `applyReadFallback` (read-side
fallback). `cacheFactories` gains `probeKeyringRead`.
`persistPlaintextFallback` now logs internally at debug for
shape-consistency with `PinSecureMode`.
- `libs/auth/storage/keyring.go`: new `ProbeKeyringRead` (read-only
probe). `Lookup` wraps non-`ErrNotFound` errors with the unreachability
hint.
- `libs/auth/storage/not_found_hint.go` (new): `notFoundHintCache` wraps
`ResolveCache` / `ResolveCacheForLogin` so `ErrNotFound` from `Lookup`
carries an actionable hint without getting sandwiched between the SDK's
`cache:` prefix and `ErrNotFound`'s tail.
- `cmd/auth/login.go`, `cmd/auth/token.go`: call `storage.PinSecureMode`
after each `persistentAuth.Challenge()`. `login.go` also moves
`ResolveCacheForLogin` to run after input validation so
trivially-invalid commands no longer probe the keyring.
- Unit tests cover all of the above (`PinSecureMode` cases,
`applyReadFallback` cases, `ProbeKeyringRead`, `notFoundHintCache`,
`legacyCacheHasTokens`).
- `acceptance/script.prepare` forces `DATABRICKS_AUTH_STORAGE=plaintext`
at the root so existing auth acceptance tests keep exercising the
file-backed path. Tests that want the resolver default override it.
- `acceptance/cmd/auth/describe/u2m-plaintext-default` renamed to
`u2m-secure-default`; its `test.toml` adds a `[[Repls]]` regex
normalizing the platform-dependent keyring lookup error.
- `acceptance/cmd/auth/describe/u2m-json-output`, `u2m-plaintext-env`,
`u2m-plaintext-config`: regenerated to match the new error copy.
- `cmd/auth/auth_test.go`: `TestProfileHostCompatibleViaCobra` copies
the fixture into a temp directory so the resolver's writes can never
dirty the checked-in file.
- `NEXT_CHANGELOG.md`: breaking-change entry under Notable Changes
covering the flip, the re-login requirement, both opt-out paths, and the
read-path fallback for systems without a usable keyring.
## Test plan
- [x] `task checks` clean
- [x] `task lint-q` clean
- [x] `go test ./libs/auth/... ./cmd/auth/... ./libs/databrickscfg/...`
passes
- [x] `go test ./acceptance -run 'TestAccept/cmd/auth'` passes on macOS
- [x] `go test ./acceptance -run 'TestAccept/cmd/configure'` passes
(covers a `databricks-cli` auth path outside `cmd/auth`)
- [ ] Linux CI is the real test for the `[[Repls]]` regex in
`u2m-secure-default/test.toml` (macOS clean miss vs. Linux backend
error).
- [ ] Manual: with `DATABRICKS_AUTH_STORAGE` unset, `databricks auth
login --profile X` writes to the keyring and persists `auth_storage =
secure` to `[__settings__]`.
- [ ] Manual: `DATABRICKS_AUTH_STORAGE=plaintext databricks auth login
--profile X` continues to write to `~/.databricks/token-cache.json` with
the host-key dual-write entry; `[__settings__]` is not modified.
- [ ] Manual: keyring reachable but empty for the current profile, an
auth command produces the "run `databricks auth login` to sign in" nudge
(not a silent fall-back).
- [ ] Manual: keyring NOT reachable (Linux container, headless SSH), an
auth command silently uses the file cache; a populated pre-upgrade
`token-cache.json` keeps working.
This pull request and its description were written by Isaac.1 parent 193be7f commit 3204e67
27 files changed
Lines changed: 915 additions & 88 deletions
File tree
- acceptance
- cmd/auth
- describe
- u2m-json-output
- u2m-plaintext-config
- u2m-plaintext-default
- u2m-plaintext-env
- u2m-secure-default
- logout/stale-account-id-workspace-host
- token
- default-profile
- force-refresh-no-cache
- cmd/auth
- libs/auth/storage
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
| 7 | + | |
| 8 | + | |
7 | 9 | | |
8 | 10 | | |
9 | 11 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | | - | |
6 | | - | |
| 5 | + | |
| 6 | + | |
7 | 7 | | |
8 | 8 | | |
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
4 | | - | |
| 4 | + | |
5 | 5 | | |
6 | 6 | | |
7 | 7 | | |
| |||
Lines changed: 0 additions & 3 deletions
This file was deleted.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
4 | | - | |
| 4 | + | |
5 | 5 | | |
6 | 6 | | |
7 | 7 | | |
| |||
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
4 | | - | |
5 | | - | |
| 4 | + | |
| 5 | + | |
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
| |||
Lines changed: 11 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
36 | 36 | | |
37 | 37 | | |
38 | 38 | | |
39 | | - | |
| 39 | + | |
0 commit comments