143 changes: 26 additions & 117 deletions CLAUDE.md
@@ -1,132 +1,41 @@
# Genie Workbench

Databricks App for creating, scoring, and optimizing Genie Spaces. FastAPI backend + React/Vite frontend deployed together on Databricks Apps.
## Project Overview

## Commands
Genie Workbench is a Databricks App that acts as a quality control and optimization platform for Genie Space administrators. It helps builders understand why their Genie Space isn't performing well and how to fix it.

```bash
# Backend (from project root)
uv pip install -e . # Install Python deps
uvicorn backend.main:app --host 0.0.0.0 --port 8000 --reload # Dev server

# Frontend (from frontend/)
cd frontend && npm install && npm run build # Build for production
cd frontend && npm run dev # Vite dev server (port 5173, proxies /api to :8000)
cd frontend && npm run lint # ESLint

# Full build (what Databricks Apps runs)
npm install # Triggers postinstall -> cd frontend && npm install
npm run build # Triggers cd frontend && npm run build

# Deploy
databricks sync --watch . /Workspace/Users/<email>/genie-workbench
databricks apps deploy <app-name> --source-code-path /Workspace/Users/<email>/genie-workbench

# Tests (require running backend at localhost:8000)
python tests/test_e2e_local.py # E2E create agent tests
python tests/test_full_schema.py # Schema validation
# Deployed E2E tests require: pip install playwright && playwright install chromium
python tests/test_e2e_deployed.py
```

## Architecture

```
backend/
main.py # FastAPI app entry point, OBO middleware, static file serving
models.py # All Pydantic models (shared between routers/services)
routers/
analysis.py # /api/space/*, /api/analyze/*, /api/optimize, /api/genie/*, /api/sql/*
spaces.py # /api/spaces/* (list, scan, history, star, fix)
admin.py # /api/admin/* (dashboard, leaderboard, alerts)
auth.py # /api/auth/me
create.py # /api/create/* (agent chat, UC discovery, wizard)
services/
auth.py # OBO auth (ContextVar), SP fallback, WorkspaceClient mgmt
genie_client.py # Databricks Genie API (fetch space, list spaces, query for SQL)
scanner.py # Rule-based IQ scoring engine (0-100, 4 dimensions)
analyzer.py # LLM-based deep analysis against best-practices checklist
optimizer.py # LLM-based optimization from benchmark feedback
fix_agent.py # LLM agent that generates JSON patches and applies via Genie API
create_agent.py # Multi-turn LLM agent for creating new Genie Spaces
create_agent_session.py # Session persistence for create agent (Lakebase)
create_agent_tools.py # Tool definitions for create agent (UC discovery, SQL, etc.)
lakebase.py # PostgreSQL persistence (asyncpg pool, in-memory fallback)
llm_utils.py # OpenAI-compatible LLM client via Databricks serving endpoints
uc_client.py # Unity Catalog browsing (catalogs, schemas, tables)
prompts/ # Prompt templates for analysis
prompts_create/ # Prompt templates for create agent (multi-file, modular)
references/schema.md # Genie Space JSON schema reference
frontend/
src/
App.tsx # Root: SpaceList | SpaceDetail | AdminDashboard | CreateAgentChat
lib/api.ts # All API calls (fetch, SSE streaming helpers)
types/index.ts # TypeScript types mirroring backend Pydantic models
components/ # UI components (analysis, optimization, fix agent, etc.)
pages/ # SpaceList, SpaceDetail, AdminDashboard, HistoryTab, IQScoreTab
hooks/ # useAnalysis, useTheme
vite.config.ts # Vite config with /api proxy to localhost:8000
```

## Key Patterns

### Authentication (OBO)
On Databricks Apps, user identity flows via `x-forwarded-access-token` header. `OBOAuthMiddleware` in `main.py` stores the token in a `ContextVar`. All services call `get_workspace_client()` which returns the OBO client if set, otherwise the SP singleton. Some Genie API calls require SP auth (missing `genie` OAuth scope) — see `_is_scope_error()` fallback in `genie_client.py`.

### SSE Streaming
Multiple endpoints use `StreamingResponse` with `text/event-stream`:
- `/api/analyze/stream` — analysis progress
- `/api/optimize` — optimization with heartbeat keepalives (15s)
- `/api/spaces/{id}/fix` — fix agent patches
- `/api/create/agent/chat` — multi-turn agent with typed events (session, step, thinking, tool_call, tool_result, message_delta, message, created, error, done)

Frontend consumes these via manual `fetch` + `ReadableStream` in `lib/api.ts` (not `EventSource`), buffering the stream and splitting complete events on `\n\n`.
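The framing and the `\n\n` split can be illustrated in Python (a sketch of the parsing idea only; the real consumer is TypeScript in `lib/api.ts`):

```python
def sse_frame(event: str, data: str) -> str:
    """Frame one server-sent event (what a StreamingResponse emits per chunk)."""
    return f"event: {event}\ndata: {data}\n\n"

def sse_parse(chunks):
    """Accumulate raw chunks and yield complete events at each \\n\\n boundary,
    so events survive arbitrary chunking by the transport."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n\n" in buffer:
            raw, buffer = buffer.split("\n\n", 1)
            yield dict(line.split(": ", 1) for line in raw.split("\n") if ": " in line)
```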
- **Backend:** Python (FastAPI), deployed as a Databricks App
- **Frontend:** React/TypeScript (Vite)
- **Storage:** Lakebase (with in-memory fallback for local dev)
- **Tracing:** Optional MLflow integration

### Lakebase Persistence
`services/lakebase.py` uses asyncpg with graceful fallback to in-memory dicts when `LAKEBASE_HOST` is not set. Credentials auto-generated via Databricks SDK (`/api/2.0/database/credentials`). Schema defined in `sql/setup_lakebase.sql`.
## GenieRX Specification

### LLM Calls
All LLM calls go through Databricks model serving endpoints using OpenAI-compatible API. Model configured via `LLM_MODEL` env var (default: `databricks-claude-sonnet-4-6`). MLflow tracing is optional — controlled by `MLFLOW_EXPERIMENT_ID`.
The GenieRX spec (`docs/genierx-spec.md`) defines the core analysis and recommendation framework used throughout this project. **Always consult it when working on analysis, scoring, or recommendation features.**

## Environment Variables
Key concepts from the spec:

Defined in `app.yaml`. Key ones:
- `SQL_WAREHOUSE_ID` — from app resource `sql-warehouse`
- `LLM_MODEL` — serving endpoint name
- `LAKEBASE_HOST`, `LAKEBASE_PORT`, `LAKEBASE_DATABASE`, `LAKEBASE_INSTANCE_NAME` — Lakebase config
- `MLFLOW_EXPERIMENT_ID` — enables MLflow tracing (validated at startup, cleared if invalid)
- `GENIE_TARGET_DIRECTORY` — where new spaces are created (default `/Shared/`)
- `DEV_USER_EMAIL` — local dev only
- **Authoritative Facts** — raw data from systems of record, safe to surface directly
- **Canonical Metrics** — governed KPIs with stable definitions and cross-team agreement
- **Heuristic Signals** — derived fields with subjective thresholds; must always carry caveats

Local dev uses `.env.local` (loaded first with override) then `.env`.
When implementing or modifying any analyzer, scorer, or recommender logic, ensure field classifications align with this taxonomy. Heuristic signals must never be presented as authoritative facts in Genie answers.

## Dev/Test Workflow
## Key Documentation

There is no local dev server — all testing is done by syncing code to Databricks and redeploying:
- `docs/genierx-spec.md` — GenieRX analyzer/recommender specification
- `docs/genie-space-schema.md` — Genie space schema reference
- `docs/checklist-by-schema.md` — Analysis checklist organized by schema section
- `CUJ.md` — Core user journeys and product analysis

1. Edit code locally
2. `databricks sync --watch . /Workspace/Users/<email>/genie-workbench` picks up changes automatically
3. Re-run `databricks apps deploy <app-name> --source-code-path /Workspace/Users/<email>/genie-workbench` to trigger a new deployment
4. Test in the deployed Databricks App
## Development

Do NOT suggest running `uvicorn` or `npm run dev` locally. The app depends on Databricks-managed resources (OBO auth, Lakebase, serving endpoints) that aren't available outside a Databricks App environment.

## Gotchas

- **frontend/dist/ is gitignored but NOT databricksignored** — the built React app must be synced to workspace for deployment. Build before `databricks sync`.
- **`.databricksignore` excludes `*.md`** but explicitly includes `backend/references/schema.md` (needed at runtime by the analyzer).
- **OBO ContextVar and streaming** — for SSE endpoints, the ContextVar is NOT cleared after `call_next` because the response streams lazily. Streaming handlers stash the token on `request.state` and re-set it inside the generator.
- **Two separate "analysis" paths** — IQ Scan (`scanner.py`, rule-based, instant) and Deep Analysis (`analyzer.py`, LLM-based, streaming). They produce different outputs and don't cross-reference.
- **Two separate "fix" paths** — Fix Agent (from scan findings, auto-applies patches) and Optimize flow (from benchmark labeling, produces suggestions for a new space). They're independent.
- **Vite proxy** — dev frontend at :5173 proxies `/api` to :8000. In production, FastAPI serves static files from `frontend/dist/` directly.
- **Python 3.11+** required (`pyproject.toml`). Uses `uv` for dependency management (`uv.lock` present).
- **Root `package.json`** exists solely as a build hook for Databricks Apps — `postinstall` runs `npm install` in `frontend/`, and `build` runs `npm run build` in `frontend/`.
```bash
# Backend (from repo root)
uv run start-server

## Code Style
# Frontend
cd frontend && npm run dev
```

- Backend: Python, Pydantic models, FastAPI routers, no class-based views
- Frontend: React 19 + TypeScript + Tailwind CSS v4 + Vite 7, functional components only
- UI primitives in `frontend/src/components/ui/` (button, card, badge, etc.) using `class-variance-authority`
- Path alias `@` maps to `frontend/src/` (configured in `vite.config.ts` and `tsconfig.app.json`)
- All API routes prefixed with `/api`
- Pydantic models in `backend/models.py`, TypeScript mirrors in `frontend/src/types/index.ts` — keep in sync
Frontend runs at `localhost:5173`, proxies API calls to backend at `localhost:8000`.
49 changes: 49 additions & 0 deletions agents.yaml
@@ -0,0 +1,49 @@
# Multi-agent deployment configuration for Genie Workbench.
#
# Deploy all agents:
# dbx-agent-app deploy --config agents.yaml
#
# Deploy a single agent:
# dbx-agent-app deploy --config agents.yaml --agent scorer
#
# Each agent is a standalone Databricks App that exposes:
# - /invocations (Responses Agent protocol)
# - /.well-known/agent.json (A2A discovery)
# - /health (liveness probe)
# - MCP server (tool integration)

project:
name: genie-workbench
workspace_path: /Workspace/Shared/apps

agents:
- name: scorer
source: ./agents/scorer
description: "IQ scoring for Genie Spaces — scan, history, stars"

- name: analyzer
source: ./agents/analyzer
description: "LLM-powered deep analysis of Genie Space configurations"

- name: creator
source: ./agents/creator
description: "Conversational wizard for building new Genie Spaces"

- name: optimizer
source: ./agents/optimizer
description: "Optimization suggestions from benchmark labeling feedback"

- name: fixer
source: ./agents/fixer
description: "AI fix agent — generates and applies config patches"

- name: supervisor
source: .
description: "Supervisor — serves React SPA and routes to sub-agents"
depends_on: [scorer, analyzer, creator, optimizer, fixer]
url_env_map:
scorer: SCORER_URL
analyzer: ANALYZER_URL
creator: CREATOR_URL
optimizer: OPTIMIZER_URL
fixer: FIXER_URL
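A sketch of how the supervisor might resolve sub-agent URLs from the `url_env_map` variables (the env var names come from `agents.yaml`; the resolution logic itself is illustrative):

```python
import os

# Mirrors url_env_map in agents.yaml.
AGENT_ENV_VARS = {
    "scorer": "SCORER_URL",
    "analyzer": "ANALYZER_URL",
    "creator": "CREATOR_URL",
    "optimizer": "OPTIMIZER_URL",
    "fixer": "FIXER_URL",
}

def resolve_agent_urls() -> dict[str, str]:
    """Map agent name -> base URL, failing fast if a dependency is missing."""
    urls = {}
    for agent, env_var in AGENT_ENV_VARS.items():
        url = os.environ.get(env_var)
        if not url:
            raise RuntimeError(f"missing {env_var}; was agent '{agent}' deployed?")
        urls[agent] = url.rstrip("/")
    return urls
```

Failing fast at startup surfaces a missing `depends_on` deployment immediately rather than on the first routed request.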
7 changes: 7 additions & 0 deletions agents/_shared/__init__.py
@@ -0,0 +1,7 @@
"""Shared utilities for Genie Workbench agents.

Provides cross-cutting concerns that multiple agents need:
- auth_bridge: Bridge @app_agent UserContext into monolith + AI Dev Kit auth
- sp_fallback: Service principal fallback for Genie API scope errors
- lakebase_client: Shared PostgreSQL connection pool management
"""
125 changes: 125 additions & 0 deletions agents/_shared/auth_bridge.py
@@ -0,0 +1,125 @@
"""Bridge @app_agent UserContext into both monolith and AI Dev Kit auth systems.

During migration, agent tools receive `request.user_context` from @app_agent,
but domain logic (scanner, genie_client, etc.) calls `get_workspace_client()`
from the monolith's auth module. And `databricks-tools-core` functions use
their own separate ContextVars via `set_databricks_auth()`.

This module provides `obo_context()` — a single context manager that sets up
all three auth systems so existing domain logic works unchanged inside agents.

Source patterns:
- backend/services/auth.py:25 (_obo_client ContextVar)
- backend/services/auth.py:33-58 (set_obo_user_token)
- databricks_tools_core/auth.py (set_databricks_auth / clear_databricks_auth)
"""

from __future__ import annotations

import os
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Optional

from databricks.sdk import WorkspaceClient
from databricks.sdk.config import Config


# Monolith-compatible ContextVar (mirrors backend/services/auth.py:25)
_obo_client: ContextVar[Optional[WorkspaceClient]] = ContextVar(
"_obo_client", default=None
)

# Singleton SP client (lazy-initialized)
_sp_client: Optional[WorkspaceClient] = None


@contextmanager
def obo_context(access_token: str, host: Optional[str] = None):
"""Set up OBO auth for monolith code and databricks-tools-core.

Creates a per-request WorkspaceClient from the user's OBO token and
stores it in both the monolith ContextVar and the AI Dev Kit ContextVars.

Usage in any agent tool::

@scorer.tool(description="Run IQ scan on a Genie Space")
async def scan_space(space_id: str, request: AgentRequest) -> dict:
with obo_context(request.user_context.access_token):
# All of these now work:
# - get_workspace_client() returns OBO client
# - databricks-tools-core functions use OBO token
result = scanner.calculate_score(space_id)

For streaming generators, capture the token before yielding and
re-enter obo_context() per-yield. This matches the pattern in
backend/routers/create.py:125-198.

Args:
access_token: The user's OBO access token.
host: Databricks workspace host. Defaults to DATABRICKS_HOST env var.

Yields:
WorkspaceClient configured with the user's OBO token.
"""
resolved_host = host or os.environ.get("DATABRICKS_HOST", "")

# 1. Create OBO WorkspaceClient (monolith pattern from auth.py:49-58)
# Must set auth_type="pat" and clear client_id/client_secret to prevent
# the SDK from using oauth-m2m from env vars on Databricks Apps.
cfg = Config(
host=resolved_host,
token=access_token,
auth_type="pat",
client_id=None,
client_secret=None,
)
client = WorkspaceClient(config=cfg)
token = _obo_client.set(client)

# 2. Set databricks-tools-core ContextVars (if available)
has_tools_core = False
try:
from databricks_tools_core.auth import (
set_databricks_auth,
clear_databricks_auth,
)

set_databricks_auth(resolved_host, access_token)
has_tools_core = True
except ImportError:
pass

try:
yield client
finally:
_obo_client.reset(token)
if has_tools_core:
clear_databricks_auth()


def get_workspace_client() -> WorkspaceClient:
"""Drop-in replacement for backend.services.auth.get_workspace_client().

Returns the OBO client if inside an obo_context(), otherwise the default
singleton (SP on Databricks Apps, CLI/PAT locally).

Domain logic can import this instead of the monolith version during
migration — the behavior is identical.
"""
obo = _obo_client.get()
if obo is not None:
return obo
return get_service_principal_client()


def get_service_principal_client() -> WorkspaceClient:
"""Get the service principal client (bypasses OBO).

Used for app-level operations and as fallback when the user's OBO token
lacks required scopes (e.g., Genie API before consent flow).
"""
global _sp_client
if _sp_client is None:
_sp_client = WorkspaceClient()
return _sp_client