143 changes: 26 additions & 117 deletions CLAUDE.md
@@ -1,132 +1,41 @@
# Genie Workbench

Databricks App for creating, scoring, and optimizing Genie Spaces. FastAPI backend + React/Vite frontend deployed together on Databricks Apps.
## Project Overview

## Commands
Genie Workbench is a Databricks App that acts as a quality control and optimization platform for Genie Space administrators. It helps builders understand why their Genie Space isn't performing well and how to fix it.

```bash
# Backend (from project root)
uv pip install -e . # Install Python deps
uvicorn backend.main:app --host 0.0.0.0 --port 8000 --reload # Dev server

# Frontend (from frontend/)
cd frontend && npm install && npm run build # Build for production
cd frontend && npm run dev # Vite dev server (port 5173, proxies /api to :8000)
cd frontend && npm run lint # ESLint

# Full build (what Databricks Apps runs)
npm install # Triggers postinstall -> cd frontend && npm install
npm run build # Triggers cd frontend && npm run build

# Deploy
databricks sync --watch . /Workspace/Users/<email>/genie-workbench
databricks apps deploy <app-name> --source-code-path /Workspace/Users/<email>/genie-workbench

# Tests (require running backend at localhost:8000)
python tests/test_e2e_local.py # E2E create agent tests
python tests/test_full_schema.py # Schema validation
# Deployed E2E tests require: pip install playwright && playwright install chromium
python tests/test_e2e_deployed.py
```

## Architecture

```
backend/
main.py # FastAPI app entry point, OBO middleware, static file serving
models.py # All Pydantic models (shared between routers/services)
routers/
analysis.py # /api/space/*, /api/analyze/*, /api/optimize, /api/genie/*, /api/sql/*
spaces.py # /api/spaces/* (list, scan, history, star, fix)
admin.py # /api/admin/* (dashboard, leaderboard, alerts)
auth.py # /api/auth/me
create.py # /api/create/* (agent chat, UC discovery, wizard)
services/
auth.py # OBO auth (ContextVar), SP fallback, WorkspaceClient mgmt
genie_client.py # Databricks Genie API (fetch space, list spaces, query for SQL)
scanner.py # Rule-based IQ scoring engine (0-100, 4 dimensions)
analyzer.py # LLM-based deep analysis against best-practices checklist
optimizer.py # LLM-based optimization from benchmark feedback
fix_agent.py # LLM agent that generates JSON patches and applies via Genie API
create_agent.py # Multi-turn LLM agent for creating new Genie Spaces
create_agent_session.py # Session persistence for create agent (Lakebase)
create_agent_tools.py # Tool definitions for create agent (UC discovery, SQL, etc.)
lakebase.py # PostgreSQL persistence (asyncpg pool, in-memory fallback)
llm_utils.py # OpenAI-compatible LLM client via Databricks serving endpoints
uc_client.py # Unity Catalog browsing (catalogs, schemas, tables)
prompts/ # Prompt templates for analysis
prompts_create/ # Prompt templates for create agent (multi-file, modular)
references/schema.md # Genie Space JSON schema reference
frontend/
src/
App.tsx # Root: SpaceList | SpaceDetail | AdminDashboard | CreateAgentChat
lib/api.ts # All API calls (fetch, SSE streaming helpers)
types/index.ts # TypeScript types mirroring backend Pydantic models
components/ # UI components (analysis, optimization, fix agent, etc.)
pages/ # SpaceList, SpaceDetail, AdminDashboard, HistoryTab, IQScoreTab
hooks/ # useAnalysis, useTheme
vite.config.ts # Vite config with /api proxy to localhost:8000
```

## Key Patterns

### Authentication (OBO)
On Databricks Apps, user identity flows via `x-forwarded-access-token` header. `OBOAuthMiddleware` in `main.py` stores the token in a `ContextVar`. All services call `get_workspace_client()` which returns the OBO client if set, otherwise the SP singleton. Some Genie API calls require SP auth (missing `genie` OAuth scope) — see `_is_scope_error()` fallback in `genie_client.py`.

### SSE Streaming
Multiple endpoints use `StreamingResponse` with `text/event-stream`:
- `/api/analyze/stream` — analysis progress
- `/api/optimize` — optimization with heartbeat keepalives (15s)
- `/api/spaces/{id}/fix` — fix agent patches
- `/api/create/agent/chat` — multi-turn agent with typed events (session, step, thinking, tool_call, tool_result, message_delta, message, created, error, done)

Frontend consumes these via manual `fetch` + `ReadableStream` in `lib/api.ts` (not `EventSource`), buffering the stream and splitting complete events on `\n\n`.
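The framing and the `\n\n` split can be illustrated in Python (a sketch of the parsing idea only; the real consumer is TypeScript in `lib/api.ts`):

```python
def sse_frame(event: str, data: str) -> str:
    """Frame one server-sent event (what a StreamingResponse emits per chunk)."""
    return f"event: {event}\ndata: {data}\n\n"

def sse_parse(chunks):
    """Accumulate raw chunks and yield complete events at each \\n\\n boundary,
    so events survive arbitrary chunking by the transport."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n\n" in buffer:
            raw, buffer = buffer.split("\n\n", 1)
            yield dict(line.split(": ", 1) for line in raw.split("\n") if ": " in line)
```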
- **Backend:** Python (FastAPI), deployed as a Databricks App
- **Frontend:** React/TypeScript (Vite)
- **Storage:** Lakebase (with in-memory fallback for local dev)
- **Tracing:** Optional MLflow integration

### Lakebase Persistence
`services/lakebase.py` uses asyncpg with graceful fallback to in-memory dicts when `LAKEBASE_HOST` is not set. Credentials auto-generated via Databricks SDK (`/api/2.0/database/credentials`). Schema defined in `sql/setup_lakebase.sql`.
## GenieRX Specification

### LLM Calls
All LLM calls go through Databricks model serving endpoints using OpenAI-compatible API. Model configured via `LLM_MODEL` env var (default: `databricks-claude-sonnet-4-6`). MLflow tracing is optional — controlled by `MLFLOW_EXPERIMENT_ID`.
The GenieRX spec (`docs/genierx-spec.md`) defines the core analysis and recommendation framework used throughout this project. **Always consult it when working on analysis, scoring, or recommendation features.**

## Environment Variables
Key concepts from the spec:

Defined in `app.yaml`. Key ones:
- `SQL_WAREHOUSE_ID` — from app resource `sql-warehouse`
- `LLM_MODEL` — serving endpoint name
- `LAKEBASE_HOST`, `LAKEBASE_PORT`, `LAKEBASE_DATABASE`, `LAKEBASE_INSTANCE_NAME` — Lakebase config
- `MLFLOW_EXPERIMENT_ID` — enables MLflow tracing (validated at startup, cleared if invalid)
- `GENIE_TARGET_DIRECTORY` — where new spaces are created (default `/Shared/`)
- `DEV_USER_EMAIL` — local dev only
- **Authoritative Facts** — raw data from systems of record, safe to surface directly
- **Canonical Metrics** — governed KPIs with stable definitions and cross-team agreement
- **Heuristic Signals** — derived fields with subjective thresholds; must always carry caveats

Local dev uses `.env.local` (loaded first with override) then `.env`.
When implementing or modifying any analyzer, scorer, or recommender logic, ensure field classifications align with this taxonomy. Heuristic signals must never be presented as authoritative facts in Genie answers.

## Dev/Test Workflow
## Key Documentation

There is no local dev server — all testing is done by syncing code to Databricks and redeploying:
- `docs/genierx-spec.md` — GenieRX analyzer/recommender specification
- `docs/genie-space-schema.md` — Genie space schema reference
- `docs/checklist-by-schema.md` — Analysis checklist organized by schema section
- `CUJ.md` — Core user journeys and product analysis

1. Edit code locally
2. `databricks sync --watch . /Workspace/Users/<email>/genie-workbench` picks up changes automatically
3. Re-run `databricks apps deploy <app-name> --source-code-path /Workspace/Users/<email>/genie-workbench` to trigger a new deployment
4. Test in the deployed Databricks App
## Development

Do NOT suggest running `uvicorn` or `npm run dev` locally. The app depends on Databricks-managed resources (OBO auth, Lakebase, serving endpoints) that aren't available outside a Databricks App environment.

## Gotchas

- **frontend/dist/ is gitignored but NOT databricksignored** — the built React app must be synced to workspace for deployment. Build before `databricks sync`.
- **`.databricksignore` excludes `*.md`** but explicitly includes `backend/references/schema.md` (needed at runtime by the analyzer).
- **OBO ContextVar and streaming** — for SSE endpoints, the ContextVar is NOT cleared after `call_next` because the response streams lazily. Streaming handlers stash the token on `request.state` and re-set it inside the generator.
- **Two separate "analysis" paths** — IQ Scan (`scanner.py`, rule-based, instant) and Deep Analysis (`analyzer.py`, LLM-based, streaming). They produce different outputs and don't cross-reference.
- **Two separate "fix" paths** — Fix Agent (from scan findings, auto-applies patches) and Optimize flow (from benchmark labeling, produces suggestions for a new space). They're independent.
- **Vite proxy** — dev frontend at :5173 proxies `/api` to :8000. In production, FastAPI serves static files from `frontend/dist/` directly.
- **Python 3.11+** required (`pyproject.toml`). Uses `uv` for dependency management (`uv.lock` present).
- **Root `package.json`** exists solely as a build hook for Databricks Apps — `postinstall` runs `npm install` in `frontend/`, and `build` runs `npm run build` in `frontend/`.
```bash
# Backend (from repo root)
uv run start-server

## Code Style
# Frontend
cd frontend && npm run dev
```

- Backend: Python, Pydantic models, FastAPI routers, no class-based views
- Frontend: React 19 + TypeScript + Tailwind CSS v4 + Vite 7, functional components only
- UI primitives in `frontend/src/components/ui/` (button, card, badge, etc.) using `class-variance-authority`
- Path alias `@` maps to `frontend/src/` (configured in `vite.config.ts` and `tsconfig.app.json`)
- All API routes prefixed with `/api`
- Pydantic models in `backend/models.py`, TypeScript mirrors in `frontend/src/types/index.ts` — keep in sync
Frontend runs at `localhost:5173`, proxies API calls to backend at `localhost:8000`.
49 changes: 49 additions & 0 deletions agents.yaml
@@ -0,0 +1,49 @@
# Multi-agent deployment configuration for Genie Workbench.
#
# Deploy all agents:
# dbx-agent-app deploy --config agents.yaml
#
# Deploy a single agent:
# dbx-agent-app deploy --config agents.yaml --agent scorer
#
# Each agent is a standalone Databricks App that exposes:
# - /invocations (Responses Agent protocol)
# - /.well-known/agent.json (A2A discovery)
# - /health (liveness probe)
# - MCP server (tool integration)

project:
name: genie-workbench
workspace_path: /Workspace/Shared/apps

agents:
- name: scorer
source: ./agents/scorer
description: "IQ scoring for Genie Spaces — scan, history, stars"

- name: analyzer
source: ./agents/analyzer
description: "LLM-powered deep analysis of Genie Space configurations"

- name: creator
source: ./agents/creator
description: "Conversational wizard for building new Genie Spaces"

- name: optimizer
source: ./agents/optimizer
description: "Optimization suggestions from benchmark labeling feedback"

- name: fixer
source: ./agents/fixer
description: "AI fix agent — generates and applies config patches"

- name: supervisor
source: .
description: "Supervisor — serves React SPA and routes to sub-agents"
depends_on: [scorer, analyzer, creator, optimizer, fixer]
url_env_map:
scorer: SCORER_URL
analyzer: ANALYZER_URL
creator: CREATOR_URL
optimizer: OPTIMIZER_URL
fixer: FIXER_URL
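A sketch of how the supervisor might resolve sub-agent URLs from the `url_env_map` variables (the env var names come from `agents.yaml`; the resolution logic itself is illustrative):

```python
import os

# Mirrors url_env_map in agents.yaml.
AGENT_ENV_VARS = {
    "scorer": "SCORER_URL",
    "analyzer": "ANALYZER_URL",
    "creator": "CREATOR_URL",
    "optimizer": "OPTIMIZER_URL",
    "fixer": "FIXER_URL",
}

def resolve_agent_urls() -> dict[str, str]:
    """Map agent name -> base URL, failing fast if a dependency is missing."""
    urls = {}
    for agent, env_var in AGENT_ENV_VARS.items():
        url = os.environ.get(env_var)
        if not url:
            raise RuntimeError(f"missing {env_var}; was agent '{agent}' deployed?")
        urls[agent] = url.rstrip("/")
    return urls
```

Failing fast at startup surfaces a missing `depends_on` deployment immediately rather than on the first routed request.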
7 changes: 7 additions & 0 deletions agents/_shared/__init__.py
@@ -0,0 +1,7 @@
"""Shared utilities for Genie Workbench agents.

Provides cross-cutting concerns that multiple agents need:
- auth_bridge: Bridge @app_agent UserContext into monolith + AI Dev Kit auth
- sp_fallback: Service principal fallback for Genie API scope errors
- lakebase_client: Shared PostgreSQL connection pool management
"""
125 changes: 125 additions & 0 deletions agents/_shared/auth_bridge.py
@@ -0,0 +1,125 @@
"""Bridge @app_agent UserContext into both monolith and AI Dev Kit auth systems.

During migration, agent tools receive `request.user_context` from @app_agent,
but domain logic (scanner, genie_client, etc.) calls `get_workspace_client()`
from the monolith's auth module. And `databricks-tools-core` functions use
their own separate ContextVars via `set_databricks_auth()`.

This module provides `obo_context()` — a single context manager that sets up
all three auth systems so existing domain logic works unchanged inside agents.

Source patterns:
- backend/services/auth.py:25 (_obo_client ContextVar)
- backend/services/auth.py:33-58 (set_obo_user_token)
- databricks_tools_core/auth.py (set_databricks_auth / clear_databricks_auth)
"""

from __future__ import annotations

import os
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Optional

from databricks.sdk import WorkspaceClient
from databricks.sdk.config import Config


# Monolith-compatible ContextVar (mirrors backend/services/auth.py:25)
_obo_client: ContextVar[Optional[WorkspaceClient]] = ContextVar(
"_obo_client", default=None
)

# Singleton SP client (lazy-initialized)
_sp_client: Optional[WorkspaceClient] = None


@contextmanager
def obo_context(access_token: str, host: Optional[str] = None):
"""Set up OBO auth for monolith code and databricks-tools-core.

Creates a per-request WorkspaceClient from the user's OBO token and
stores it in both the monolith ContextVar and the AI Dev Kit ContextVars.

Usage in any agent tool::

@scorer.tool(description="Run IQ scan on a Genie Space")
async def scan_space(space_id: str, request: AgentRequest) -> dict:
with obo_context(request.user_context.access_token):
# All of these now work:
# - get_workspace_client() returns OBO client
# - databricks-tools-core functions use OBO token
result = scanner.calculate_score(space_id)

For streaming generators, capture the token before yielding and
re-enter obo_context() per-yield. This matches the pattern in
backend/routers/create.py:125-198.

Args:
access_token: The user's OBO access token.
host: Databricks workspace host. Defaults to DATABRICKS_HOST env var.

Yields:
WorkspaceClient configured with the user's OBO token.
"""
resolved_host = host or os.environ.get("DATABRICKS_HOST", "")

# 1. Create OBO WorkspaceClient (monolith pattern from auth.py:49-58)
# Must set auth_type="pat" and clear client_id/client_secret to prevent
# the SDK from using oauth-m2m from env vars on Databricks Apps.
cfg = Config(
host=resolved_host,
token=access_token,
auth_type="pat",
client_id=None,
client_secret=None,
)
client = WorkspaceClient(config=cfg)
token = _obo_client.set(client)

# 2. Set databricks-tools-core ContextVars (if available)
has_tools_core = False
try:
from databricks_tools_core.auth import (
set_databricks_auth,
clear_databricks_auth,
)

set_databricks_auth(resolved_host, access_token)
has_tools_core = True
except ImportError:
pass

try:
yield client
finally:
_obo_client.reset(token)
if has_tools_core:
clear_databricks_auth()


def get_workspace_client() -> WorkspaceClient:
"""Drop-in replacement for backend.services.auth.get_workspace_client().

Returns the OBO client if inside an obo_context(), otherwise the default
singleton (SP on Databricks Apps, CLI/PAT locally).

Domain logic can import this instead of the monolith version during
migration — the behavior is identical.
"""
obo = _obo_client.get()
if obo is not None:
return obo
return get_service_principal_client()


def get_service_principal_client() -> WorkspaceClient:
"""Get the service principal client (bypasses OBO).

Used for app-level operations and as fallback when the user's OBO token
lacks required scopes (e.g., Genie API before consent flow).
"""
global _sp_client
if _sp_client is None:
_sp_client = WorkspaceClient()
return _sp_client