Fall back to remote runtime on Spark Connect when the legacy namespace is unavailable (#1469)

Divyansh-db · sd-db · web-flow · commit 8d800623f47b · 2026-06-11T12:54:34.000Z
## Summary

Importing `databricks.sdk.runtime` on a Spark Connect runtime (e.g. shared-access-mode clusters) no longer raises `CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT` at import time. When the legacy user namespace cannot be materialized, the import now logs a warning and falls back to the existing Spark Connect-compatible remote implementation, so `WorkspaceClient()` construction succeeds on such clusters.

Fixes #1463. Carries forward @sd-db's work from #1464 (closed because fork PRs in this repo cannot run tests).

## Why

`WorkspaceClient.__init__` eagerly builds `dbutils` via `_make_dbutils`, which on a cluster does `from databricks.sdk.runtime import dbutils`. That import calls `UserNamespaceInitializer.getOrCreate().get_namespace_globals()`, materializing a legacy `SparkContext`. On a Spark Connect cluster this raises `CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT`, which is a `pyspark.errors.PySparkRuntimeError`, not an `ImportError`. The existing `except ImportError:` therefore does not catch it, and the error escapes the import, crashing `WorkspaceClient` construction before any API call. This is what databricks/dbt-databricks#1252 hits in Python models on shared clusters.

The existing `except ImportError` branch is already the Spark Connect-compatible path (it builds `spark` via `DatabricksSession` and `dbutils` via `RemoteDbUtils`), so this PR routes the materialization failure there.

A complementary follow-up, making `WorkspaceClient.dbutils` lazy via a `cached_property` so consumers that never touch it skip the build entirely, is noted in #1463 as a separate discussion since it touches generated code. Related issue #986 (off-cluster eager `RemoteDbUtils` auth failure) is the symmetric case and is intentionally not addressed here; the lazy-dbutils follow-up would unify both.

## What changed

### Behavioral changes

On a Spark Connect runtime, importing `databricks.sdk.runtime` now logs a `WARNING` and uses the remote implementation instead of raising at import time. When `dbruntime` is absent (off-cluster) or the namespace materializes successfully (classic runtime), behavior is unchanged.

### Internal changes

`databricks/sdk/runtime/__init__.py`: the runtime-namespace block is restructured into a single `try` with two sibling clauses, plus an `if not _use_runtime_namespace:` guard over the existing, unchanged OSS/remote block:

- `except ImportError` (existing): "not in a classic runtime".
- `except Exception` (new): Spark Connect / `CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT`, logged.

The catch is intentionally broad rather than typed on `PySparkRuntimeError` to avoid pulling `pyspark` in at SDK import time just to narrow the exception type; the inline comment notes this.

## How is this tested?

New `tests/test_runtime.py` simulates a Spark Connect runtime by injecting a fake `dbruntime` whose `get_namespace_globals()` raises `CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT`, and asserts that:

- re-importing `databricks.sdk.runtime` survives the failure and falls back (`is_local_implementation is True`, and `dbutils` is a `RemoteDbUtils`)
- `WorkspaceClient(config=…)` constructs without raising, which is the direct reproduction of the reported failure

Verified red→green locally. Full unit test suite (2098 tests) passes with no regressions.

---------

Signed-off-by: Shubham Dhal <shubham.dhal@databricks.com>
Signed-off-by: Divyansh Vijayvergia <171924202+Divyansh-db@users.noreply.github.com>
Co-authored-by: Shubham Dhal <shubham.dhal@databricks.com>
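
For readers who want to see the control flow from "Internal changes" in isolation, here is a minimal standalone sketch. It is not the SDK's module code: `resolve_implementation` and `make_fake_dbruntime` are hypothetical names introduced for illustration, and the fake `dbruntime` module only imitates the `UserNamespaceInitializer.getOrCreate().get_namespace_globals()` call chain quoted above.

```python
import logging
import sys
import types

logger = logging.getLogger("runtime_fallback_sketch")


def make_fake_dbruntime(fail: bool) -> types.ModuleType:
    """Build a stand-in `dbruntime` module (hypothetical helper, illustration only)."""

    class _Namespace:
        def get_namespace_globals(self):
            if fail:
                # Imitates the Spark Connect failure mode; the real error type differs.
                raise RuntimeError("[CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT] simulated")
            return {}

    class _Initializer:
        @staticmethod
        def getOrCreate():
            return _Namespace()

    module = types.ModuleType("dbruntime")
    module.UserNamespaceInitializer = _Initializer
    return module


def resolve_implementation() -> str:
    """Return "runtime" if the namespace materializes, otherwise "remote"."""
    use_runtime_namespace = False
    try:
        from dbruntime import UserNamespaceInitializer

        UserNamespaceInitializer.getOrCreate().get_namespace_globals()
        use_runtime_namespace = True
    except ImportError:
        # `dbruntime` is not importable: not inside a classic runtime.
        pass
    except Exception as e:
        # Any other failure while materializing the namespace: warn and fall back.
        logger.warning("Runtime namespace unavailable, falling back: %s", e)
    return "runtime" if use_runtime_namespace else "remote"
```

Placing a module built with `make_fake_dbruntime(fail=True)` into `sys.modules["dbruntime"]` before calling `resolve_implementation()` exercises the new `except Exception` branch. Setting `sys.modules["dbruntime"] = None` makes the import raise `ImportError`, which exercises the pre-existing branch.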
diff --git a/NEXT_CHANGELOG.md b/NEXT_CHANGELOG.md
@@ -8,6 +8,7 @@
 
 ### Bug Fixes
 
+* Fall back to the remote runtime implementation when the legacy user namespace cannot be materialized. On Spark Connect runtimes (e.g. shared-access-mode clusters), importing `databricks.sdk.runtime` — which happens when constructing a `WorkspaceClient` on such a cluster — tried to build a legacy `SparkContext` and raised `CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT` at import time. It now logs a warning and falls back to the Spark Connect-compatible remote implementation instead of crashing.
 * Cache tokens minted by `DatabricksOidcTokenSource` (Workload Identity Federation / account-wide token federation). Previously a fresh `/oidc/v1/token` exchange was performed on every authenticated API call, adding latency, amplifying transient federation-policy errors, and hitting OIDC token-endpoint rate limits. The token source now reuses the cached token until it is stale or expired, fetching a fresh ID token on each refresh to support rotation.
 
 ### Documentation
diff --git a/databricks/sdk/runtime/__init__.py b/databricks/sdk/runtime/__init__.py
@@ -93,9 +93,10 @@ def inner() -> Dict[str, str]:
         return None, None
 
 
+# Internal implementation
+# Separated from above for backward compatibility
+_use_runtime_namespace = False
 try:
-    # Internal implementation
-    # Separated from above for backward compatibility
     from dbruntime import UserNamespaceInitializer
 
     userNamespaceGlobals = UserNamespaceInitializer.getOrCreate().get_namespace_globals()
@@ -105,7 +106,23 @@ def inner() -> Dict[str, str]:
             continue
         _globals[var] = userNamespaceGlobals[var]
     is_local_implementation = False
+    _use_runtime_namespace = True
 except ImportError:
+    # Not running inside a classic Databricks runtime; fall back to the OSS implementation below.
+    pass
+except Exception as e:
+    # On Spark Connect runtimes (e.g. shared-access-mode clusters), materializing the
+    # legacy user namespace builds a SparkContext, which is unavailable in remote clients
+    # and raises CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT. Treat this like "not in a classic
+    # runtime" and fall back to the OSS/remote implementation below, which is Spark
+    # Connect-compatible. Without this, importing databricks.sdk.runtime (and therefore
+    # constructing a WorkspaceClient on such a cluster) raises at import time. The catch
+    # is broad rather than typed on PySparkRuntimeError so the SDK does not need to import
+    # pyspark just to narrow the exception type; any other unexpected failure here is also
+    # safer surfaced as a warning + remote fallback than as a constructor crash.
+    logger.warning(f"Runtime namespace unavailable, falling back to remote implementation: {e}")
+
+if not _use_runtime_namespace:
     # OSS implementation
     is_local_implementation = True
 
diff --git a/tests/test_runtime.py b/tests/test_runtime.py
@@ -0,0 +1,62 @@
+"""Tests for the import-time behavior of ``databricks.sdk.runtime``."""
+
+import sys
+import types
+
+import pytest
+
+from databricks.sdk.dbutils import RemoteDbUtils
+
+
+@pytest.fixture
+def spark_connect_runtime(monkeypatch):
+    """``dbruntime`` is importable, but materializing the legacy user namespace raises
+    ``CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT`` — the Spark Connect failure mode."""
+
+    class _Initializer:
+        @staticmethod
+        def getOrCreate():
+            class _Namespace:
+                def get_namespace_globals(self):
+                    raise RuntimeError(
+                        "[CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT] Calls to SparkContext are "
+                        "not supported on a Spark Connect cluster. Use spark instead."
+                    )
+
+            return _Namespace()
+
+    fake = types.ModuleType("dbruntime")
+    fake.UserNamespaceInitializer = _Initializer
+    monkeypatch.setitem(sys.modules, "dbruntime", fake)
+
+    # The remote fallback constructs ``RemoteDbUtils()``, which initializes a default
+    # ``Config``; hermetic PAT credentials keep the fallback from failing for unrelated
+    # auth reasons (see databricks-sdk-py#986).
+    monkeypatch.setenv("DATABRICKS_HOST", "https://test.cloud.databricks.com")
+    monkeypatch.setenv("DATABRICKS_TOKEN", "test-token")
+
+    # Force ``databricks.sdk.runtime`` to re-execute its module body on next import so it
+    # picks up the fake ``dbruntime``. Earlier tests (e.g. test_notebook_oauth.py) cache a
+    # fake module here directly via ``sys.modules`` without going through the import
+    # machinery, which leaves the ``runtime`` attribute on ``databricks.sdk`` unset —
+    # dropping the cached entry repairs that on the next real import. ``monkeypatch``
+    # restores the prior value on teardown.
+    monkeypatch.delitem(sys.modules, "databricks.sdk.runtime", raising=False)
+
+
+def test_runtime_import_falls_back_on_spark_connect(spark_connect_runtime):
+    """Regression for dbt-databricks#1252: import survives the namespace failure."""
+    import databricks.sdk.runtime as runtime
+
+    assert runtime.is_local_implementation is True
+    assert isinstance(runtime.dbutils, RemoteDbUtils)
+
+
+def test_workspace_client_constructs_on_spark_connect(spark_connect_runtime, config):
+    """Regression for dbt-databricks#1252: ``WorkspaceClient.__init__`` eagerly builds
+    dbutils via ``databricks.sdk.runtime`` and must not raise on Spark Connect."""
+    from databricks.sdk import WorkspaceClient
+
+    ws = WorkspaceClient(config=config)
+
+    assert ws is not None