auth profiles: always validate SPOG profiles as account #5284

Open
mihaimitrea-db wants to merge 3 commits into main from mihai/fix-auth-profiles-spog-validation

Conversation

@mihaimitrea-db
Contributor

@mihaimitrea-db mihaimitrea-db commented May 20, 2026

Summary

auth profiles was reporting Valid: NO for SPOG profiles with a real workspace_id, even though auth token worked. Root cause: ResolveConfigType routed those profiles to WorkspaceConfig, so validation went through CurrentUser.Me — but SPOG OAuth tokens are account-audience and the workspace API rejects them with 400 "Unable to load OAuth Config".

Replace the either/or routing with a host-shape-aware probe: probe the account API for classic accounts.* and SPOG hosts, probe the workspace API for everything else, and additionally probe workspace if the profile carries an explicit workspace_id (covers SPOG-with-PAT). A profile is valid if any applicable probe succeeds.

Probes run sequentially against the shared cfg — parallel probes race on cfg.Host (SDK's lazy Authenticate() writes it while the other probe's client construction reads it). Profile-level parallelism is unchanged.

Test plan

  • go test -race ./libs/auth/... ./cmd/auth/... — green.
  • Live repro on a SPOG host (db-deco-test.gcp.databricks.com): Valid: NO -> YES.
  • TestProfileLoadSPOGWorkspaceCredential covers SPOG-with-PAT.
  • TestIsSPOGHost / TestIsClassicWorkspaceHost lock in the three-way host classification including accounts-dod.*.

This pull request and its description were written by Isaac.

ResolveConfigType used to route SPOG profiles with a real workspace_id to
WorkspaceConfig, so auth profiles validated them with CurrentUser.Me. SPOG
OAuth is account-scoped, so every token's audience is the account, and the
workspace API rejects those tokens with `400 "Unable to load OAuth Config"`
— flagging otherwise-functional profiles as invalid. Always classify SPOG
as AccountConfig so validation goes through Workspaces.List, which the
account-audience token can actually authenticate.

The SPOG mock in newSPOGServer now returns 500 on /scim/v2/Me so any
future regression that reintroduces the workspace branch fails the test.

Co-authored-by: Isaac
@github-actions
Contributor

github-actions Bot commented May 20, 2026

Approval status: pending

/cmd/auth/ - needs approval

Files: cmd/auth/profiles.go, cmd/auth/profiles_test.go
Suggested: @simonfaltum
Also eligible: @tanmay-db, @tejaskochar-db, @renaudhartert-db, @hectorcast-db, @parthban-db, @Divyansh-db, @chrisst, @rauchy

/libs/auth/ - needs approval

Files: libs/auth/config_type.go, libs/auth/config_type_test.go
Suggested: @simonfaltum
Also eligible: @tanmay-db, @tejaskochar-db, @renaudhartert-db, @hectorcast-db, @parthban-db, @Divyansh-db, @chrisst, @rauchy

Any maintainer (@andrewnester, @anton-107, @denik, @pietern, @shreyas-goenka, @simonfaltum, @renaudhartert-db) can approve all areas.
See OWNERS for ownership rules.

@simonfaltum
Member

One thing I’m wondering about: does this make any non-OAuth SPOG workspace profile worse?

My understanding is that auth profiles validates all config-file auth types, not only databricks-cli OAuth profiles. For a SPOG host with a real workspace_id, SDK workspace calls include X-Databricks-Org-Id, so a PAT/manual workspace credential could potentially pass CurrentUser.Me but fail AccountClient.Workspaces.List if the user doesn’t have account-level access.

So I think this is clearly the right direction for the OAuth case, but should the AccountConfig override be scoped to OAuth/databricks-cli profiles? Or do we consider workspace-scoped credentials on SPOG hosts unsupported?

Not entirely sure this credential shape matters in practice, but it seems like the main strict-worse case to rule out.

@simonfaltum
Member

One broader thought: I wonder if we can avoid having auth profiles reason about IsSPOG at all.

It would be great if this command could move toward treating profiles as unified profiles by default, i.e. not first deciding "this is account" vs "this is workspace" based on host shape. The tricky part is the validation probe. Maybe the rule could be: try the account-level probe when account_id is present, try the workspace-level probe when a real workspace_id is present, and consider the profile valid if either applicable probe succeeds.

That would make the SPOG OAuth case work because Workspaces.List succeeds, but it would also avoid making SPOG workspace-scoped credentials worse if CurrentUser.Me is the only probe that succeeds.

I’m not entirely sure this is the best probing strategy, but I think the direction of removing the SPOG-specific classification would be nice if we can make the semantics clear.

@mihaimitrea-db
Contributor Author

mihaimitrea-db commented May 20, 2026

> One broader thought: I wonder if we can avoid having auth profiles reason about IsSPOG at all.
>
> It would be great if this command could move toward treating profiles as unified profiles by default, i.e. not first deciding "this is account" vs "this is workspace" based on host shape. The tricky part is the validation probe. Maybe the rule could be: try the account-level probe when account_id is present, try the workspace-level probe when a real workspace_id is present, and consider the profile valid if either applicable probe succeeds.
>
> That would make the SPOG OAuth case work because Workspaces.List succeeds, but it would also avoid making SPOG workspace-scoped credentials worse if CurrentUser.Me is the only probe that succeeds.
>
> I’m not entirely sure this is the best probing strategy, but I think the direction of removing the SPOG-specific classification would be nice if we can make the semantics clear.

I think this is a good approach. My main concern is that it doubles the number of API calls per profile. Do we run the two probes async and accept the extra calls, or run them sequentially and accept that validation might take twice as long? I would lean towards async.

We also need to account for the host somehow. If I remember correctly, the following sequence of actions:

login with profile A into account > don't select workspace > logout of profile A > login again with workspace host

leaves you with a workspace profile that still carries a stale account ID. So the host also needs to be a signal for which endpoint to try.

Replace ResolveConfigType's "either account or workspace" routing with a
permissive validator: probe whichever API surfaces the profile has a
signal for (host shape or field presence), and mark the profile valid if
any probe succeeds. This addresses review feedback on the previous fix —
it now also handles SPOG workspace-scoped credentials (e.g. a PAT),
which the strict "SPOG always validates as account" rule would have
falsely flagged as invalid.

Probes run sequentially against the shared cfg because the SDK's lazy
Authenticate() chain writes cfg.Host (via fixHostIfNeeded) while the
other client's construction reads cfg.Host unlocked — go test -race
flags the parallel version. Profile-level parallelism in
newProfilesCommand is unchanged, so overall auth profiles wall-clock is
still bounded by the slowest profile.
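
As a rough illustration of the access pattern described above, here is a minimal sketch with a hypothetical `sharedConfig` type standing in for the SDK config. `authenticate`, `newClient`, and the probe paths are invented for this example and are not SDK APIs. Each probe reads `Host` during client construction and then writes it during lazy authentication, so running the probes in a plain loop keeps the read and the write from overlapping.

```go
package main

import (
	"fmt"
	"strings"
)

// sharedConfig is a hypothetical stand-in for the config shared by both probes.
type sharedConfig struct {
	Host string
}

// authenticate mimics a lazy step that normalizes Host in place (a write).
func (c *sharedConfig) authenticate() {
	c.Host = strings.TrimSuffix(c.Host, "/")
}

// newClient mimics client construction, which reads Host without locking.
func newClient(c *sharedConfig) string {
	return c.Host
}

// probe constructs a client (read), triggers lazy authentication (write),
// and returns the URL it would call.
func probe(c *sharedConfig, path string) string {
	_ = newClient(c)
	c.authenticate()
	return c.Host + path
}

// runSequentially runs every probe against the same config, one at a time.
// Starting each probe in its own goroutine against the same pointer would
// let one probe's write overlap another probe's read.
func runSequentially(c *sharedConfig, paths []string) []string {
	urls := make([]string, 0, len(paths))
	for _, p := range paths {
		urls = append(urls, probe(c, p))
	}
	return urls
}

func main() {
	cfg := &sharedConfig{Host: "https://example.test/"}
	for _, u := range runSequentially(cfg, []string{"/account-probe", "/workspace-probe"}) {
		fmt.Println(u)
	}
}
```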

ResolveConfigType is removed; IsSPOGHost and IsClassicWorkspaceHost are
added in libs/auth as named wrappers so the host-shape check in
profiles.go reads as a flat block of named booleans.

Tests:
- TestProfileLoadSPOGWorkspaceCredential covers PAT-on-SPOG explicitly:
  workspace probe succeeds, account probe 403s, OR yields Valid=true.
- TestIsSPOGHost and TestIsClassicWorkspaceHost cover the three-way
  host classification, including the accounts-dod.* variant.

Co-authored-by: Isaac
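
A three-way host classification of the kind these tests exercise might look like the sketch below. This is an assumption-laden illustration rather than the helpers added in libs/auth: `classify` and `hostKind` are hypothetical names, whether a host is SPOG is passed in by the caller as a flag instead of being detected, and checking the account prefixes before the SPOG flag is a choice made for the sketch.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

type hostKind string

const (
	classicAccount   hostKind = "classic-account"
	spog             hostKind = "spog"
	classicWorkspace hostKind = "classic-workspace"
)

// classify is a hypothetical three-way classifier. isSPOG is supplied by the
// caller because this sketch does not attempt to detect SPOG hosts itself.
func classify(rawHost string, isSPOG bool) hostKind {
	name := rawHost
	if u, err := url.Parse(rawHost); err == nil && u.Hostname() != "" {
		name = u.Hostname()
	}
	switch {
	case strings.HasPrefix(name, "accounts.") || strings.HasPrefix(name, "accounts-dod."):
		return classicAccount
	case isSPOG:
		return spog
	default:
		return classicWorkspace
	}
}

func main() {
	cases := []struct {
		host   string
		isSPOG bool
	}{
		{"https://accounts.cloud.databricks.com", false},
		{"https://accounts-dod.example.test", false}, // placeholder host
		{"https://db-deco-test.gcp.databricks.com", true},
		{"https://my-workspace.example.test", false}, // placeholder host
	}
	for _, c := range cases {
		fmt.Printf("%s -> %s\n", c.host, classify(c.host, c.isSPOG))
	}
}
```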
The previous validator probed the account API whenever `account_id` was
set in cfg, on top of host signals. That over-fires in practice:
`account_id` is back-filled by discovery onto every workspace profile,
and can linger from a prior account login on the same profile name. The
acceptance test `cmd/auth/login/discovery` failed because the new
account probe hit an endpoint the test's mock server didn't declare,
even though the underlying validation outcome was still correct
(workspace probe carried the profile to Valid=true).

Gate the account probe on host signals only (classic accounts.* or
SPOG). `hasRealWorkspaceID` still adds the workspace probe on account
hosts so the SPOG-PAT case keeps working.

Co-authored-by: Isaac
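
The difference between the two gates in the last commit can be sketched as below. The `profile` struct and both gate functions are hypothetical and only model the decision of whether to run the account probe, not the probe itself.

```go
package main

import "fmt"

// profile is a hypothetical, trimmed-down view of a saved profile.
type profile struct {
	AccountID         string // may be back-filled or left over from a prior login
	accountShapedHost bool   // classic accounts.* or SPOG host
}

// accountProbeByFieldOrHost models the earlier gate: host signals or a set account_id.
func accountProbeByFieldOrHost(p profile) bool {
	return p.accountShapedHost || p.AccountID != ""
}

// accountProbeByHost models the revised gate: host signals only.
func accountProbeByHost(p profile) bool {
	return p.accountShapedHost
}

func main() {
	// A workspace profile whose account_id was back-filled (placeholder value).
	ws := profile{AccountID: "placeholder-account-id", accountShapedHost: false}
	fmt.Println("earlier gate fires:", accountProbeByFieldOrHost(ws))
	fmt.Println("revised gate fires:", accountProbeByHost(ws))
}
```

Under the revised gate, a workspace-host profile with a lingering `account_id` triggers only the workspace probe, while account-shaped hosts still get the account probe.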