feat(settings): add native Databricks AI Gateway provider to settings TUI #740

Draft
prasadkona wants to merge 10 commits into OpenHands:main from prasadkona:feat/databricks-native-provider

@prasadkona

Summary

Adds first-class support for the native Databricks AI Gateway provider
(Databricks PWAF compliant) to the settings TUI: workspace host, auth method
selector (PAT / M2M service principal / Browser SSO / CLI profile), live model
discovery from the workspace, and Databricks-aware env-var overrides.

Depends on the companion SDK PR (OpenHands/software-agent-sdk#3286).


What's new

Settings screen additions

  • Databricks auth section — shown only when provider is databricks:
    • Workspace host field
    • Optional dedicated AI Gateway host override
    • Auth method selector: PAT / Service Principal (M2M) / Browser SSO (U2M) / CLI Profile
    • Context-specific field groups (client ID + secret for M2M; profile name for PROFILE)
    • Dynamic auth hint that generates a workspace-specific databricks auth login
      command for Browser SSO, using the hostname already entered on the same screen
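
The context-specific field groups can be pictured as a lookup from auth method to the credential fields that method needs. This is only a sketch: the enum values, the field keys, and the `missing_fields` helper are hypothetical names chosen for illustration, not identifiers from this PR.

```python
from enum import Enum


class AuthMethod(str, Enum):
    # Hypothetical identifiers; the PR names the four methods but not their values.
    PAT = "pat"
    M2M = "m2m"
    U2M = "u2m"
    PROFILE = "profile"


# Credential fields each method needs besides the workspace host (assumed keys).
REQUIRED_FIELDS = {
    AuthMethod.PAT: ("api_key",),
    AuthMethod.M2M: ("client_id", "client_secret"),
    AuthMethod.U2M: (),  # browser SSO: no stored credential fields
    AuthMethod.PROFILE: ("profile",),
}


def missing_fields(method: AuthMethod, form: dict) -> list:
    """Return the required field names that are empty or absent in the form."""
    return [
        name
        for name in REQUIRED_FIELDS[method]
        if not (form.get(name) or "").strip()
    ]
```

For example, `missing_fields(AuthMethod.M2M, {"client_id": "abc"})` reports that `client_secret` is still needed. The same table could also drive which inputs are shown or hidden.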

Model discovery

  • Live workspace endpoint picker (calls list_chat_endpoints from the SDK) when
    host + credentials are available — same two-tier list as the web UI
  • Graceful fallback to static curated model list when workspace is unreachable
  • TTL cache (5 min) so switching provider tabs doesn't hammer the workspace

Env-var overrides

  • DATABRICKS_HOST and DATABRICKS_TOKEN env vars supported in headless mode
    (alongside existing LLM_API_KEY / LLM_MODEL / LLM_BASE_URL)
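
As an illustration of the override lookup, the sketch below collects the five environment variables named above into a plain dict. The function name and the dict keys are hypothetical; the real `LLMEnvOverrides` class in agent_store.py is not reproduced here.

```python
import os
from typing import Dict, Mapping, Optional

# Environment variable -> settings key. The variable names come from the PR
# text; the settings keys are assumptions for this sketch.
ENV_TO_SETTING = {
    "LLM_API_KEY": "api_key",
    "LLM_MODEL": "model",
    "LLM_BASE_URL": "base_url",
    "DATABRICKS_HOST": "databricks_host",
    "DATABRICKS_TOKEN": "databricks_token",
}


def read_env_overrides(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect the overrides that are set and non-empty in the environment."""
    env = os.environ if env is None else env
    return {
        setting: env[var]
        for var, setting in ENV_TO_SETTING.items()
        if env.get(var)
    }
```

Treating an empty variable as unset means an accidental `DATABRICKS_TOKEN=` does not replace a stored credential with an empty string.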

Bug fix (found during testing)

  • M2M secret serialization: databricks_client_secret (SecretStr) was not
    in the base LLM_SECRET_FIELDS tuple, so AgentStore.save() was writing
    "**********" to agent_settings.json instead of the real value. Every CLI
    restart sent the masked string to the OIDC token endpoint, causing a persistent
    401. Fixed by adding a @field_serializer on DatabricksLLM (in the SDK PR).
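
The failure mode can be reproduced without pydantic. In the sketch below, `Secret` is a stdlib stand-in for a masked secret type such as `SecretStr`, and both dump functions are hypothetical; the actual fix is the `@field_serializer` in the SDK PR.

```python
import json


class Secret:
    """Stand-in for a masked secret type: str() never reveals the value."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        return self._value

    def __str__(self) -> str:
        return "**********"


def dump_masked(settings: dict) -> str:
    # Buggy path: the secret falls through to its masked string form,
    # so the mask itself is what gets written to disk.
    return json.dumps(settings, default=str)


def dump_for_storage(settings: dict) -> str:
    # Fixed path: unwrap secrets explicitly when persisting.
    def unwrap(obj):
        if isinstance(obj, Secret):
            return obj.get_secret_value()
        raise TypeError(f"not JSON serializable: {type(obj).__name__}")

    return json.dumps(settings, default=unwrap)
```

Reloading the output of `dump_masked` yields the literal mask as the client secret, which is the value that was then sent to the token endpoint on every restart.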

Changes to existing files

All additions follow the existing patterns in each file:

| File | Change |
| --- | --- |
| agent_store.py | Extends LLMEnvOverrides for DATABRICKS_HOST/DATABRICKS_TOKEN; rebuilds via create_llm for Databricks instances so private client state is fresh |
| settings_screen.py | Adds Databricks field group + auth method selector; injects saved model gracefully on load |
| settings_tab.py | Adds Databricks form fields (host, auth method, credentials) |
| utils.py | Extends SettingsFormData validation for 4 Databricks auth paths |
| choices.py | Adds get_databricks_model_options() with TTL-cached live discovery |
| model_recommendations.py | Adds databricks to the provider enum |

Note for reviewers: The Databricks-specific settings logic is added inline
in the existing modules to keep the same import structure as the other providers.
If you'd prefer a separate databricks/ sub-module in the settings package
I'm happy to restructure.


Tests

  • Tests across 4 files: two new files (test_choices_databricks.py,
    test_databricks_auth_method.py) plus additions to test_settings_utils.py
    and test_settings_tab.py
  • Manually tested end-to-end in the TUI against a live Databricks workspace:
    • PAT auth ✅
    • M2M service principal auth ✅ (including secret persistence fix)
    • Browser SSO (U2M) ✅

Test plan

  • uv run pytest tests/tui/modals/settings/ -q — all settings tests pass
  • Launch CLI with DATABRICKS_HOST=... DATABRICKS_TOKEN=... LLM_MODEL=databricks/... — headless agent starts without prompting for API key
  • Open TUI settings, select "databricks" provider, enter host → model picker populates with live endpoints

Wires the DatabricksLLM provider into the CLI settings screen:

Settings UI
- Databricks auth section: PAT / M2M / CLI Profile / U2M (browser SSO)
- Auth-method-aware field visibility: API Key field hidden for non-PAT
  methods; Profile/M2M credential fields shown only when relevant
- Live auth-method hints with step-by-step instructions for U2M and
  CLI Profile (includes install + login commands)
- Model dropdown: two-tier picker (curated + live-discovered) auto-
  refreshes when auth method or workspace host changes, using the
  credentials typed in the form so U2M/profile show the full list

Agent store
- `LLMEnvOverrides`: DATABRICKS_HOST + DATABRICKS_TOKEN env overrides
- `apply_llm_overrides`: rebuilds via create_llm for Databricks models
- Headless agent creation uses create_llm factory

Discovery (choices.py)
- `_resolve_credentials_for_host`: builds credentials from form state
  (host + auth method) for targeted per-workspace discovery
- `_get_databricks_model_options`: accepts pre-built credentials so
  auth-method changes immediately surface the correct model list

run_local.sh
- Installs local SDK in compat editable mode and ensures databricks-sdk
  is present (needed for U2M / profile auth) before launching the CLI

pytest namespace-package collection fails without __init__.py when sibling
test directories already have one. All other test dirs under tests/tui/
have __init__.py; add it to tests/tui/modals/settings/ so Databricks and
existing settings tests can be collected correctly.

When a Databricks discovered-only model (not in the curated list) was
previously saved and the Settings screen mounts without credentials, the
model dropdown was populated with curated-only options, causing Textual to
raise InvalidSelectValueError when trying to restore the saved selection.

Fix: wrap the value assignment in a try/except — on failure, inject the
saved model as the sole option with a "(saved — re-enter credentials to
refresh)" label so the user sees their current selection without crashing.
_refresh_databricks_models() repopulates the full discovered list once host
and auth fields are filled in.
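
The fallback can be sketched with a small stand-in for the dropdown widget. `FakeSelect` and `InvalidSelectValue` below only imitate the relevant behaviour of Textual's `Select` and its `InvalidSelectValueError`; the helper name and label text are likewise assumptions.

```python
class InvalidSelectValue(Exception):
    """Stand-in for the error a select widget raises on an unknown value."""


class FakeSelect:
    """Minimal dropdown: options are (label, value) pairs."""

    def __init__(self, options):
        self.options = list(options)
        self.value = None

    def set_options(self, options):
        self.options = list(options)

    def set_value(self, value):
        if value not in {v for _, v in self.options}:
            raise InvalidSelectValue(value)
        self.value = value


SAVED_SUFFIX = "(saved; re-enter credentials to refresh)"  # assumed wording


def restore_saved_model(select: FakeSelect, saved: str) -> None:
    """Restore the saved selection, injecting it if the options lack it."""
    try:
        select.set_value(saved)
    except InvalidSelectValue:
        # The saved model was discovered earlier but is not in the curated
        # list: show it as the sole option instead of crashing on mount.
        select.set_options([(f"{saved} {SAVED_SUFFIX}", saved)])
        select.set_value(saved)
```

Once host and credentials are available, a later refresh can replace the single injected option with the full discovered list.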
The U2M auth hint now uses whatever the user has typed into the Workspace
Host field to build the exact databricks auth login command, so they can
copy-paste it directly from the TUI.

Before:  databricks auth login --host <workspace_host>
After:   databricks auth login --host https://e2-demo-field-eng.cloud.databricks.com

The hint updates live as the user types the host — no need to leave the
settings screen to figure out the right command.
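
A minimal sketch of how such a hint could be built from the host field. The function name, the trailing-slash handling, and the choice to prepend `https://` when no scheme is typed are assumptions, not behaviour confirmed by the PR.

```python
PLACEHOLDER = "<workspace_host>"


def u2m_login_hint(host_input: str) -> str:
    """Build the login command shown in the Browser SSO hint."""
    host = host_input.strip().rstrip("/")
    if not host:
        # Nothing typed yet: fall back to the generic placeholder form.
        return f"databricks auth login --host {PLACEHOLDER}"
    if not host.startswith(("https://", "http://")):
        host = "https://" + host  # assumption: bare hostnames get https
    return f"databricks auth login --host {host}"
```

Recomputing the string on every change to the host input is enough to give the live-updating behaviour described above.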
Two blockers for external contributors:

1. The previous "chore: update uv.lock" commit rewrote all 401 package
   sources from pypi.org to pypi-proxy.dev.databricks.com (an internal
   Databricks mirror). This has been reverted so external contributors
   can run `uv sync` without VPN access.

2. pyproject.toml pinned openhands-sdk==1.21.0 (published), but the
   Databricks native provider ships in the still-unmerged SDK PR #3286.
   Enable [tool.uv.sources] with a git-source pin to the PR branch HEAD
   so the CLI is installable without the run_local.sh editable override.
   Also relax the constraint from ==1.21.0 to >=1.21.0 to allow the
   git-sourced version to satisfy the requirement.

   This section should be removed and the version constraint restored
   once SDK PR #3286 is merged and published to PyPI.

The DATABRICKS_AI_GATEWAY_HOST override is supported in the backend
and configurable via environment variable, but surfacing it in the TUI
adds unnecessary complexity for most users who use a standard workspace.

Remove the input widget and its help text from settings_tab.py.
The field remains functional via env var for advanced/split-hostname
deployments. Re-enabling the UI field is a one-file change when needed.

…erences

The UI widget was removed from settings_tab.py but settings_screen.py
still declared the query_one getter and read/wrote its value in three
places (_clear_databricks_fields, _load_current_settings, _get_form_data).
Textual's query_one raises NoMatches at runtime, crashing the TUI on Save.

Remove the getter declaration and all three call sites. The env-var-only
DATABRICKS_AI_GATEWAY_HOST path is preserved in the backend; the TUI
simply passes db_ai_gateway_host=None so the backend falls through to
the env var.

@enyst enyst left a comment


Thank you for the contribution, @prasadkona. I’m not sure this is the best way, though, so just so you know: I suggest we first discuss your proposal in the SDK repo PR.

I mean, up to you, I’m just concerned that we may or may not be able to accept your Databricks proposal, and if yes, how; and the best place for that discussion is in the SDK where we started talking.

…_llm_overrides

When env-var overrides force a Databricks model swap, apply_llm_overrides()
calls create_llm() to build a fresh DatabricksLLM. The previous instance
was discarded without closing its httpx client, leaking an open connection
pool. Explicitly call llm.close() on the existing DatabricksLLM instance
before creating the replacement to release the underlying httpx.Client.
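
The close-before-replace pattern looks roughly like this. `FakeLLM` stands in for an object that owns an HTTP client, and `replace_llm` and its `factory` parameter are hypothetical; the real code calls `create_llm()` inside `apply_llm_overrides()`.

```python
from typing import Any, Callable


class FakeLLM:
    """Stand-in for an LLM object that owns a connection pool."""

    def __init__(self, model: str) -> None:
        self.model = model
        self.closed = False

    def close(self) -> None:
        # A real implementation would close its underlying HTTP client here.
        self.closed = True


def replace_llm(current: Any, new_model: str, factory: Callable[[str], Any]) -> Any:
    """Release the old instance's resources, then build the replacement."""
    close = getattr(current, "close", None)
    if callable(close):
        close()
    return factory(new_model)
```

Looking up `close` with `getattr` lets the same helper accept instances that have nothing to release.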