Fall back to remote runtime on Spark Connect when the legacy namespace is unavailable #1469

Merged
Divyansh-db merged 6 commits into main from fix/runtime-spark-connect-import on Jun 11, 2026

@Divyansh-db Divyansh-db commented Jun 9, 2026

Summary

Importing databricks.sdk.runtime on a Spark Connect runtime (e.g. shared-access-mode clusters) no longer raises CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT at import time. When the legacy user namespace cannot be materialized, the import now logs a warning and falls back to the existing Spark Connect-compatible remote implementation, so WorkspaceClient() construction succeeds on such clusters.

Fixes #1463. Carries forward @sd-db's work from #1464 (closed because fork PRs in this repo cannot run tests).

Why

WorkspaceClient.__init__ eagerly builds dbutils via _make_dbutils, which on a cluster does from databricks.sdk.runtime import dbutils. That import calls UserNamespaceInitializer.getOrCreate().get_namespace_globals(), materializing a legacy SparkContext. On a Spark Connect cluster this raises CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT — a pyspark.errors.PySparkRuntimeError, not an ImportError — so the existing except ImportError: does not catch it and the error escapes the import, crashing WorkspaceClient construction before any API call. This is what databricks/dbt-databricks#1252 hits in Python models on shared clusters.

The existing except ImportError branch is already the Spark Connect-compatible path (it builds spark via DatabricksSession and dbutils via RemoteDbUtils), so this PR routes the materialization failure there.

A complementary follow-up — making WorkspaceClient.dbutils lazy via a cached_property so consumers that never touch it skip the build entirely — is noted in #1463 as a separate discussion since it touches generated code. Related issue #986 (off-cluster eager RemoteDbUtils auth failure) is the symmetric case and is intentionally not addressed here; the lazy-dbutils follow-up would unify both.

What changed

Behavioral changes

On a Spark Connect runtime, importing databricks.sdk.runtime now logs a WARNING and uses the remote implementation instead of raising at import time. When dbruntime is absent (off-cluster) or the namespace materializes successfully (classic runtime), behavior is unchanged.

Internal changes

databricks/sdk/runtime/__init__.py: the runtime-namespace block is restructured into a single try with sibling except ImportError (existing — "not in a classic runtime") and except Exception (new — Spark Connect / CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT, logged) clauses, plus an if not _use_runtime_namespace: guard over the existing — unchanged — OSS/remote block. The catch is intentionally broad rather than typed on PySparkRuntimeError to avoid pulling pyspark in at SDK import time just to narrow the exception type; the inline comment notes this.
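
The control flow described above can be sketched as follows. This is a minimal illustration, not the actual contents of databricks/sdk/runtime/__init__.py: `load_namespace`, `FakeContextUnavailable`, and the `materialize` callable are hypothetical names introduced here, and the returned dictionary is only a placeholder for the unchanged OSS/remote block.

```python
import logging

logger = logging.getLogger(__name__)


class FakeContextUnavailable(RuntimeError):
    """Hypothetical stand-in for the PySparkRuntimeError raised on Spark Connect."""


def load_namespace(materialize):
    """Return (namespace, used_runtime_namespace).

    `materialize` is a hypothetical callable standing in for
    UserNamespaceInitializer.getOrCreate().get_namespace_globals().
    """
    namespace = {}
    use_runtime_namespace = False
    try:
        namespace = materialize()
        use_runtime_namespace = True
    except ImportError:
        # Existing branch: dbruntime is absent, so this is not a classic runtime.
        pass
    except Exception as exc:
        # New branch: deliberately broad, so pyspark does not have to be
        # imported just to name the exception type.
        logger.warning("Runtime namespace unavailable, falling back: %s", exc)

    if not use_runtime_namespace:
        # Placeholder for the unchanged OSS/remote block.
        namespace = {"source": "remote"}
    return namespace, use_runtime_namespace
```

Because `except ImportError` is listed first, the off-cluster case keeps its silent behavior, and only other failures reach the logged branch.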

How is this tested?

New tests/test_runtime.py simulates a Spark Connect runtime by injecting a fake dbruntime whose get_namespace_globals() raises CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT, and asserts that:

  • reloading databricks.sdk.runtime survives the failure and falls back (is_local_implementation is True, dbutils is not None)
  • WorkspaceClient(config=…) constructs without raising — the direct reproduction of the reported failure

Verified red→green locally. Full unit test suite (2098 tests) passes with no regressions.
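
The simulation technique can be sketched without the SDK installed. The example below is an assumption-laden illustration rather than the contents of tests/test_runtime.py: it uses the standard library's `unittest.mock.patch.dict` in place of pytest's `monkeypatch`, raises a plain `RuntimeError` in place of the pyspark error, and `FakeInitializer`, `materialize_namespace`, and `simulate_spark_connect` are hypothetical names.

```python
import sys
import types
from unittest import mock


class FakeInitializer:
    """Hypothetical fake exposing the two calls named in the PR description."""

    @classmethod
    def getOrCreate(cls):
        return cls()

    def get_namespace_globals(self):
        # Plain RuntimeError stands in for the pyspark error type.
        raise RuntimeError("CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT")


def make_fake_dbruntime():
    module = types.ModuleType("dbruntime")
    module.UserNamespaceInitializer = FakeInitializer
    return module


def materialize_namespace():
    # Resolved from sys.modules, so the injected fake is what gets imported.
    from dbruntime import UserNamespaceInitializer

    return UserNamespaceInitializer.getOrCreate().get_namespace_globals()


def simulate_spark_connect():
    # patch.dict restores sys.modules on exit, so the fake does not leak
    # into other tests.
    with mock.patch.dict(sys.modules, {"dbruntime": make_fake_dbruntime()}):
        try:
            materialize_namespace()
        except ImportError:
            return "import-error"
        except Exception as exc:
            return f"fallback: {exc}"
    return "materialized"
```

Calling `simulate_spark_connect()` takes the broad `except Exception` path, which is the branch the real tests exercise by reloading databricks.sdk.runtime.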

sd-db and others added 2 commits June 8, 2026 14:08
…e is unavailable

On a Databricks shared-access-mode (Spark Connect) cluster, importing databricks.sdk.runtime (which happens when WorkspaceClient.__init__ eagerly builds dbutils) materializes a legacy SparkContext via UserNamespaceInitializer.get_namespace_globals() and raises CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT (a PySparkRuntimeError, not ImportError). The surrounding 'except ImportError' does not catch it, so the error escapes the import and crashes WorkspaceClient construction.

Treat a namespace-materialization failure the same as 'not in a classic runtime': log a warning and fall back to the existing OSS/remote implementation, which is Spark Connect-compatible (DatabricksSession + RemoteDbUtils).

Fixes #1463

Signed-off-by: Shubham Dhal <shubham.dhal@databricks.com>
Add a sentence to the inline comment explaining why the catch is broad
rather than typed on PySparkRuntimeError — avoids importing pyspark at
SDK import time and keeps unexpected runtime-namespace errors surfaced
as a warning + safe fallback instead of a constructor crash.
@Divyansh-db Divyansh-db force-pushed the fix/runtime-spark-connect-import branch from 3d8e600 to 61cc935 Compare June 9, 2026 15:14
Use monkeypatch.setitem for the sys.modules injection (auto-teardown
instead of manual save/restore), move the runtime reload into the
fixture so test bodies stay focused on the assertion, inline the fake
initializer, and strengthen the first assertion to isinstance(
RemoteDbUtils) so it explicitly proves the Spark Connect fallback path
was taken rather than just that some dbutils exists.
…tests

test_notebook_oauth.py caches a fake ``databricks.sdk.runtime`` directly
in ``sys.modules`` without going through the import machinery, which
leaves the ``runtime`` attribute on ``databricks.sdk`` unset. The
previous fixture's ``import databricks.sdk.runtime`` then hit the cached
fake (skipping the loader), and the follow-up ``importlib.reload(
databricks.sdk.runtime)`` died with AttributeError when CI happened to
run test_notebook_oauth.py first.

Drop the eager import + reload from the fixture; just delitem the stale
``sys.modules`` entry via monkeypatch so the next ``import`` in the test
body triggers a fresh load (which correctly sets both ``sys.modules``
and the parent attribute). Verified locally that the suite passes both
in isolation and when ordered after test_notebook_oauth.py.
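
The failure mode in this commit message can be reproduced in isolation. The sketch below uses hypothetical module names (`demo_pkg`, `demo_pkg.runtime`) rather than the real SDK packages, and it only shows the stale-cache behavior, not the fixture itself.

```python
import sys
import types

# Hypothetical package and submodule, created only for this demonstration.
parent = types.ModuleType("demo_pkg")
parent.__path__ = []  # an empty __path__ marks the module as a package
child = types.ModuleType("demo_pkg.runtime")

sys.modules["demo_pkg"] = parent
# Cached directly, bypassing the import machinery.
sys.modules["demo_pkg.runtime"] = child

import demo_pkg.runtime  # served from the cache; no loader runs

# Nothing set the `runtime` attribute on the parent package.
parent_has_attribute = hasattr(demo_pkg, "runtime")

try:
    demo_pkg.runtime  # the expression a later reload call would evaluate
    outcome = "resolved"
except AttributeError:
    outcome = "AttributeError"

# Remove the demonstration modules again.
sys.modules.pop("demo_pkg.runtime", None)
sys.modules.pop("demo_pkg", None)
```

The import statement succeeds because the entry is already cached, but the attribute lookup on the parent fails. Deleting the stale `sys.modules` entry first, as the commit does, lets the next import go through the loader, which sets both the cache entry and the parent attribute.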
@Divyansh-db Divyansh-db requested a review from hectorcast-db June 9, 2026 18:17
Divyansh-db added a commit that referenced this pull request Jun 9, 2026
Regenerated ``databricks/sdk/__init__.py`` with the updated template
(imports ``functools.cached_property``, drops the eager
``self._dbutils = _make_dbutils(self._config)`` from ``__init__``,
emits ``dbutils`` as a ``@cached_property`` that calls
``_make_dbutils`` on first access).

Adds four ``tests/test_client.py`` tests that lock in the contract:

- ``dbutils`` is a ``functools.cached_property`` descriptor on
  ``WorkspaceClient``.
- ``WorkspaceClient.__init__`` does not invoke ``_make_dbutils``.
- The first ``ws.dbutils`` read invokes ``_make_dbutils`` once;
  subsequent reads return the cached value without re-invoking.
- Constructing ``WorkspaceClient`` on a faked Spark Connect runtime
  (whose ``dbruntime`` raises ``CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT``
  on any namespace materialization) succeeds without importing
  ``databricks.sdk.runtime`` at all — the durable sidestep of
  databricks/dbt-databricks#1252.

Complements #1469 (which catches the same failure at runtime-module
import time as a defense-in-depth fallback).
Signed-off-by: Divyansh Vijayvergia <171924202+Divyansh-db@users.noreply.github.com>
@Divyansh-db Divyansh-db enabled auto-merge June 11, 2026 12:16
@github-actions

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/sdk-py

Inputs:

  • PR number: 1469
  • Commit SHA: ad5a86173022e030bd96501c11e3fcd33bee8681

Checks will be approved automatically on success.

@Divyansh-db Divyansh-db added this pull request to the merge queue Jun 11, 2026
Merged via the queue into main with commit 8d80062 Jun 11, 2026
14 checks passed
@Divyansh-db Divyansh-db deleted the fix/runtime-spark-connect-import branch June 11, 2026 13:02
pull Bot pushed a commit to Future-Outlier/databricks-sdk-py that referenced this pull request Jun 11, 2026
## Summary

Makes `WorkspaceClient.dbutils` a `functools.cached_property` so
consumers that never read it pay no construction cost — and, on Spark
Connect runtimes, never touch the legacy `SparkContext` path that
`databricks.sdk.runtime` materializes on import. Includes four
regression tests that lock in the contract.

## Why

`WorkspaceClient.__init__` used to call `_make_dbutils(self._config)`
eagerly, which on a cluster imports `databricks.sdk.runtime`. On a Spark
Connect (shared-access-mode) cluster, that import materializes a legacy
`SparkContext` and raises `CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT`,
crashing the constructor before any API call. Downstream consumers that
never touch `.dbutils` (notably `dbt-databricks` Python models) hit this
for no reason — see databricks#1463 and databricks/dbt-databricks#1252.

databricks#1469 patches the runtime side as a defense-in-depth fallback (catch the
materialization failure, fall back to `RemoteDbUtils`). This PR is the
durable fix: callers that don't read `.dbutils` never trigger the build
at all, sidestepping the entire code path. The first read still calls
`_make_dbutils` once, lazily; subsequent reads hit the cached attribute
in `__dict__` at plain-attribute speed.

## What changed

`databricks/sdk/__init__.py` (generated from updated template):

- `from functools import cached_property` added to the imports.
- The eager `self._dbutils = _make_dbutils(self._config)` line is
removed from `__init__`.
- `@property def dbutils` (which returned the cached `self._dbutils`)
becomes `@cached_property def dbutils` that calls
`_make_dbutils(self._config)` on first access.

`_dbutils` was a private attribute with no external consumers (verified
across the codebase), so removing it does not break any public surface.
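
A minimal sketch of the lazy pattern described above, assuming nothing about the generated code beyond what this message states. `LazyClient`, `_make_dbutils_stub`, and `build_calls` are hypothetical names used only for illustration.

```python
from functools import cached_property

build_calls = []


def _make_dbutils_stub(config):
    """Hypothetical stand-in for _make_dbutils; records each invocation."""
    build_calls.append(config)
    return object()


class LazyClient:
    """Illustrative class, not the generated WorkspaceClient."""

    def __init__(self, config):
        # Only the config is stored; dbutils is not built here.
        self._config = config

    @cached_property
    def dbutils(self):
        # Runs on first access only; the result is then stored in the
        # instance __dict__ under the name "dbutils".
        return _make_dbutils_stub(self._config)
```

Constructing `LazyClient` leaves `build_calls` empty. The first read of `.dbutils` calls the stub once, and later reads return the same object from the instance `__dict__` without calling it again.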

`tests/test_client.py` — four new tests:

- `test_dbutils_is_a_cached_property` — descriptor type check.
- `test_workspace_client_init_does_not_build_dbutils` — spies
`_make_dbutils`, constructs a `WorkspaceClient`, asserts the spy was
never called.
- `test_dbutils_first_access_builds_exactly_once` — first read invokes
`_make_dbutils` once (returns the spy's sentinel); second read still
shows `call_count == 1` and same identity.
- `test_workspace_client_constructs_on_spark_connect_without_touching_runtime`
— fakes `dbruntime` to raise `CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT` on
any namespace materialization; asserts `WorkspaceClient(config=...)`
succeeds and `databricks.sdk.runtime` is never imported during
construction. This is the strongest evidence that the dbt-databricks
failure mode is sidestepped by this change alone.

## How is this tested?

- 4/4 new tests pass locally (0.03s).
- Existing `tests/test_client.py` autospec tests untouched, still pass.
- The fourth test is the negative-space proof: asserts
`databricks.sdk.runtime` is *not* in `sys.modules` after
`WorkspaceClient(config=...)` — i.e., the constructor literally does not
reach for the runtime module.

NO_CHANGELOG=true

Development

Successfully merging this pull request may close these issues.

WorkspaceClient construction fails on Spark Connect clusters: CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT when importing databricks.sdk.runtime

3 participants