
Release v1.10.0: SDK auth migration, summarization agent, and Lakebase fixes #125

Open
forrestmurray-db wants to merge 88 commits into main from release/v1.10.0


@forrestmurray-db
Collaborator

Summary

  • SDK Auth Migration: Replace manual token storage with Databricks SDK-based authentication (resolve_databricks_token()). Removes DatabricksTokenDB model, token input fields, and DATABRICKS_TOKEN env var mutations. All services now use SDK auth.
  • Summarization Agent Overhaul: Refactor summarization to a tool-based agent with span data resolution. Add facilitator visibility into summarization status/results, job tracking via SummarizationJob table, and resummarize capability.
  • Lakebase Fixes: Switch to do_connect token injection, fix connection pool settings, and update specs with pool requirements and service principal permissions.
  • Docs: Update facilitator guide for Lakebase and Git-based deployment, fix setup prerequisites.
  • Bug Fixes: Deduplicate convertTraceToTraceData for summary propagation, handle databricks_host with existing https:// prefix, resolve available-models without mlflow intake config.

Changes (63 files, +4839 / -1206)

Auth (12 commits)

  • Add resolve_databricks_token() utility using Databricks SDK
  • Remove DatabricksTokenDB model and databricks_tokens table
  • Remove token input fields from IntakePage and DBSQLExportPage
  • Replace token_storage patterns across all services and routers
  • Update TypeScript models and service docstrings

Summarization (7 commits)

  • Refactor to tool-based agent with span data resolution
  • Add facilitator visibility into summarization status and results
  • New SummarizationJob model and migration (0018)
  • Fix summary propagation through convertTraceToTraceData
  • Use SDK auth and separate DB session for background tasks

Lakebase & Database (3 commits)

  • Switch to do_connect token injection for Lakebase
  • Fix pool settings for Databricks SQL connections
  • Update specs with connection pool requirements

Docs (4 commits)

  • Update facilitator guide for Lakebase and Git-based deployment
  • Add service principal permissions to AUTHENTICATION_SPEC
  • Fix Lakebase setup prerequisites

Test plan

  • Verify SDK auth works end-to-end (token resolution, service initialization)
  • Test summarization agent with tool-based flow
  • Confirm facilitator dashboard shows summarization status
  • Verify Lakebase connection pool behavior
  • Run just test-server — all backend tests pass
  • Run just ui-test-unit — all frontend tests pass
  • Run just e2e — end-to-end tests pass

🤖 Generated with Claude Code

forrestmurray-db and others added 30 commits April 10, 2026 10:51
Replace the hardcoded MODEL_MAPPING with a live API call to Databricks
serving-endpoints. The backend uses async httpx to avoid blocking the
event loop, and the frontend fetches models via useAvailableModels and
builds options dynamically with buildModelOptions. All components now
store and pass endpoint names directly instead of translating between
display names and backend names.

Also switches model prefetching from an eager useEffect in
WorkflowContext to intent-based prefetchQuery on hover/focus of
navigation buttons, and clears Databricks auth env vars that can
override token auth in the MLflow intake service.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace stale hasMlflowConfig references in DiscoveryAnalysisTab with
modelOptions.length checks to match the switch to dynamic model listing.
Fix discovery-complete endpoint returning 404 for facilitators whose
workshop_id is NULL by also checking against workshop.facilitator_id.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prevent worktree contents from being tracked.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…vice init

Add a public resolve_databricks_token() function that uses the Databricks
SDK for auth (service principal on Apps, CLI profile locally) with a
fallback to DATABRICKS_TOKEN env var. Remove the token_storage/db_service
fallback chain from DatabricksService.__init__.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
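
The resolution order described in the commit above (SDK auth first, then the DATABRICKS_TOKEN env var) could look roughly like the sketch below. This is not the PR's actual resolve_databricks_token(); the names resolve_token_sketch and _token_from_sdk are hypothetical, and reading the bearer token from WorkspaceClient().config.authenticate() is an assumption about how the SDK step might be done.

```python
import os
from typing import Callable, Optional


def _token_from_sdk() -> Optional[str]:
    """Ask the Databricks SDK for a bearer token (assumed approach)."""
    # Imported lazily so the sketch still loads where the SDK is not installed.
    from databricks.sdk import WorkspaceClient

    headers = WorkspaceClient().config.authenticate()
    auth = headers.get("Authorization", "")
    return auth[len("Bearer "):] if auth.startswith("Bearer ") else None


def resolve_token_sketch(
    sdk_provider: Callable[[], Optional[str]] = _token_from_sdk,
) -> str:
    """Return an SDK-issued token, else fall back to DATABRICKS_TOKEN."""
    try:
        token = sdk_provider()
        if token:
            return token
    except Exception:
        # Any SDK failure (missing package, no credentials) triggers the fallback.
        pass
    fallback = os.environ.get("DATABRICKS_TOKEN", "").strip()
    if fallback:
        return fallback
    raise RuntimeError("No Databricks credentials available")
```

Taking the provider as a parameter keeps the fallback logic testable without a workspace, which matches the commit's note that test mocks now patch a single function.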
MLflow uses whatever Databricks auth the SDK provides. Stop setting
DATABRICKS_TOKEN in the environment — only set DATABRICKS_HOST so the
SDK knows which workspace to target.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mark databricks_token as deprecated with empty default in Python models
(MLflowIntakeConfig, MLflowIntakeConfigCreate, DBSQLExportRequest,
DatabricksConfig) and optional in TypeScript models. SDK auth is used
instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…outer

Replace 10+ token_storage.get_token / db_service.get_databricks_token
fallback chains with resolve_databricks_token(). Remove all
os.environ["DATABRICKS_TOKEN"] mutations. Update test mocks to patch
resolve_databricks_token instead of token_storage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…outers

Update discovery_service (7 refs), judge_service, draft_rubric_grouping,
database_service, databricks router, dbsql_export router. Remove
set/get_databricks_token methods from database_service. Update test
mocks to patch resolve_databricks_token.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the token persistence infrastructure:
- DatabricksTokenDB SQLAlchemy model from database.py
- databricks_tokens from postgres_manager ALLOWED_TABLES and CREATE TABLE
- DatabricksTokenDB import from database_service.py
- test_token_storage_service.py (5 tests for deleted functionality)
- Update postgres_manager test expectations

token_storage_service.py is kept for Custom LLM API key storage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…Page

Users no longer need to provide Databricks tokens — the backend uses
SDK auth (service principal on Apps, CLI profile locally). Remove all
token state, localStorage persistence, form fields, and validation
from both pages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove os.environ["DATABRICKS_TOKEN"] and DATABRICKS_CLIENT_ID/SECRET
pop() calls from alignment_service, judge_service, dbsql_export_service,
and database_service. The SDK handles auth automatically — only
DATABRICKS_HOST needs to be set for MLflow to know which workspace.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AUTHENTICATION_SPEC:
- Rewrite Architecture Context to describe the two-layer model accurately
- Add new "Databricks API Authentication" section with token resolution
  contract, environment-specific behavior, MLflow auth, and what was removed
- Add "Future: Per-User Auth" subsection for OBO pattern
- Add 8 success criteria for Databricks API auth
- Mark SDK Auth Migration as complete in implementation log

BUILD_AND_DEPLOY_SPEC:
- Mark DATABRICKS_TOKEN as optional (SDK auth preferred) in env vars table
- Update Databricks Apps Authentication section to reference
  resolve_databricks_token() and link to AUTHENTICATION_SPEC

JUDGE_EVALUATION_SPEC:
- Fix troubleshooting note: "host, token" → "host, experiment ID + SDK auth"
- Add SDK Auth Migration to implementation log

README.md:
- Add keyword index entries: PAT, SDK auth, resolve_databricks_token,
  service principal, DATABRICKS_TOKEN, DATABRICKS_CLIENT_ID, OAuth,
  CLI profile

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document the Databricks resources the app's service principal needs
access to: MLflow Experiment (Can edit), Model Serving Endpoints
(Can query), SQL Warehouse (Can use), Unity Catalog Volume (Can read
and write). Note which are required vs optional.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lakebase (PostgreSQL) is the primary production database. Its OAuth
tokens are refreshed via WorkspaceClient().config.oauth_token() every
15 minutes. Split permissions into core (Lakebase, MLflow, Serving
Endpoints) vs optional (SQL Warehouse, UC Volume).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AUTHENTICATION_SPEC:
- Add "Lakebase Connection Pool" section with token lifecycle, do_connect
  injection pattern, required pool settings, credential API, and setup
  prerequisites — all with links to Databricks docs
- Update Lakebase row in permissions table to reference generate_database_credential
- Add 7 Lakebase connection pool success criteria
- Add implementation log entry

BUILD_AND_DEPLOY_SPEC:
- Add Lakebase env vars (PGHOST, PGDATABASE, PGUSER, PGPORT, PGSSLMODE,
  PGAPPNAME, ENDPOINT_NAME, DATABASE_ENV) to environment variables table
- Add implementation log section with SDK auth and Lakebase pool entries

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ings

Replace the creator-based connection factory with the recommended
do_connect event pattern from Databricks docs. Key changes:

- OAuthTokenManager → LakebaseCredentialManager using
  generate_database_credential(endpoint=ENDPOINT_NAME) API
- Token injection via do_connect event (not creator callable)
- pool_recycle: 300s → 3600s (was causing excessive connection churn)
- pool_pre_ping: True → False (conflicts with do_connect injection)
- max_overflow: 10 → 5 (caps at 20 total across 2 workers)
- postgres_manager: pool created once with custom OAuthConnection
  class, never recreated on token refresh
- database.py: _reset_connection_pool no longer calls force_refresh

Reference: https://docs.databricks.com/aws/en/lakebase/connect/custom-app.html

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
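
A minimal sketch of the do_connect injection pattern from the commit above, assuming a cached credential with a 15-minute lifetime (the refresh interval mentioned in an earlier commit). CachedCredential, make_do_connect_listener, and attach are hypothetical names, not the PR's LakebaseCredentialManager, and the fetch callable stands in for the real credential API call.

```python
import time
from typing import Callable, Optional


class CachedCredential:
    """Caches a short-lived database token and refreshes it after its TTL."""

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl_seconds: float = 900.0,  # 15 minutes, an assumed lifetime
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def token(self) -> str:
        # Fetch on first use, and again once the cached token has expired.
        if self._token is None or self._clock() >= self._expires_at:
            self._token = self._fetch()
            self._expires_at = self._clock() + self._ttl
        return self._token


def make_do_connect_listener(cred: CachedCredential):
    """Build a listener with SQLAlchemy's do_connect signature."""

    def provide_token(dialect, conn_rec, cargs, cparams):
        # Mutating cparams changes the arguments passed to the DBAPI connect().
        cparams["password"] = cred.token()

    return provide_token


def attach(engine, cred: CachedCredential) -> None:
    """Register the listener once; the pool is not rebuilt on token refresh."""
    from sqlalchemy import event  # lazy import: only needed with a real engine

    event.listen(engine, "do_connect", make_do_connect_listener(cred))
```

Because the token is looked up at connect time, only new physical connections pick up a refreshed credential, which is consistent with creating the pool once and never recreating it.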
Remove databricks_token from CSV upload body type, make
DatabricksConfig.token optional, update ApiService/WorkshopsService
docstrings to reflect SDK auth.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When Lakebase is added as a Databricks App resource, the platform
automatically creates a Postgres role for the service principal.
Manual databricks_create_role() is only needed for external/additional
identities outside the App resource integration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ndency

- Add summarization_enabled, summarization_model, summarization_guidance
  columns to WorkshopDB
- Add summary (JSON) column to TraceDB for structured milestone views
- Add corresponding Pydantic model fields and DB service methods
- Add pydantic-ai-slim[openai] dependency
- Create TRACE_SUMMARIZATION_SPEC with success criteria
- Create implementation plan

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… with batch support

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…raceViewer

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ingestion

- PUT /workshops/{id}/summarization-settings for facilitator config
- POST /workshops/{id}/resummarize for on-demand re-summarization
- Background summarization triggered after MLflow trace ingestion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…odelOptions

The settings agent used a function name that doesn't exist in
modelMapping.ts. Fixed to follow the same pattern as other components:
useAvailableModels() + buildModelOptions().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…s fork

The FastAPI lifespan bootstrap ran migrations in each worker process,
requiring interprocess locks and never applying new migrations after
initial deploy. Move migration execution to gunicorn's on_starting hook
which runs exactly once in the master process before any workers fork.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
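
For illustration, a gunicorn config file using the on_starting server hook might be sketched as follows. run_migrations is a hypothetical placeholder for the project's migration entry point, and the worker count of 2 is taken from the pool-sizing note elsewhere in this PR rather than from the actual config.

```python
# gunicorn.conf.py (sketch, not the project's actual file)
import logging

logger = logging.getLogger("gunicorn.error")

workers = 2  # assumed from the "2 workers" pool-sizing note in this PR


def run_migrations() -> None:
    """Hypothetical stand-in for the project's migration runner."""
    logger.info("applying pending migrations")


def on_starting(server) -> None:
    """Gunicorn server hook: runs in the master before workers are forked."""
    run_migrations()
```

Since the hook runs in the master process only, no interprocess lock is needed to keep workers from racing each other on the same migration.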
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts:
#	specs/BUILD_AND_DEPLOY_SPEC.md
…nd tasks

- Use resolve_databricks_token() instead of stored PAT (SDK auth compat)
- Create new SessionLocal() inside background tasks to avoid using the
  request-scoped DB session after it's closed
- Add logging for summarization completion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
forrestmurray-db and others added 30 commits April 15, 2026 11:40
Fix two indentation errors in workshops.py caused by removing
'if databricks_token:' gatekeeping without dedenting the body.
Remove orphaned 'else: no token' branch.

Update test fixtures for databricks and dbsql_export routers to match
new no-arg create_databricks_service() and DBSQLExportService() APIs.

810 passed, 0 failed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Migration 0017_remove_databricks_host branched from 0016 alongside
the existing 0017_add_summarization, creating multiple heads.
Renumbered to 0019 with down_revision pointing to 0018.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The frontend no longer sends experiment_id (it comes from
MLFLOW_EXPERIMENT_ID env var). The endpoint now resolves it from the
environment when not provided in the request body.

810 passed, 0 failed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Variable was renamed to workspace_host via get_databricks_host() but
the call to _run_summarization_background still referenced the old name.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nvalidation

- Remove duplicate QueryClient from App.tsx — main.tsx's configured
  client is now the single provider
- Set global staleTime: 30s to prevent refetch storms on navigation;
  remove 12 redundant per-hook staleTime overrides
- Add 7 selector hooks (useWorkshopPhase, useWorkshopDisplayConfig,
  useWorkshopMeta, useWorkshopDiscoveryConfig, useWorkshopAnnotationConfig,
  useWorkshopEvalConfig, useWorkshopSummarizationConfig) using TanStack
  Query select option — components only re-render when their slice changes
- Migrate 15 components from useWorkshop() to selector hooks
- Fix mutation anti-patterns: remove unnecessary workshop invalidation
  from annotation submit; eliminate setQueryData + invalidateQueries in
  4 mutations (toggle notes, JSONPath, span filter, summarization)
- Update 12 test files to mock new selector hooks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summarization jobs could get stuck with no way to stop them. This adds
a cancel endpoint that cancels the asyncio background task and a cancel
button in the SummarizationSettings progress UI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Deleted the .env.local file containing Databricks configuration.
- Updated justfile to include log level configuration for uvicorn commands.
- Removed unused databricks_host property from Body_upload_csv_and_log_to_mlflow_workshops model.
- Made experiment_id optional in MLflowIntakeConfigCreate type.
- Added cancelSummarizationJob API method to ApiService and WorkshopsService.
- Enhanced logging in summarization background tasks and services for better traceability.
- Refactored DiscoveryService to use unified SDK authentication for Databricks LLM calls.
…roup display and focus handling

- Added a section in DraftRubricPanel to display proposed groups for faster review, including apply and dismiss buttons.
- Updated DraftRubricSidebar to handle focus changes, allowing the sidebar to expand when inputs are focused.
- Improved the layout and styling of proposed group displays in both components for better user experience.
Strip wrapping quotes and whitespace from experiment IDs loaded from env and request inputs so Databricks MLflow lookups resolve correctly.
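
The normalization described above might be sketched like this. The function name normalize_experiment_id is hypothetical, and treating an empty result as None is an assumption about how missing IDs are represented.

```python
from typing import Optional


def normalize_experiment_id(raw: Optional[str]) -> Optional[str]:
    """Trim whitespace and matching wrapping quotes from an experiment ID."""
    if raw is None:
        return None
    cleaned = raw.strip()
    # Peel matching single or double quotes, e.g. from a quoted env value.
    while len(cleaned) >= 2 and cleaned[0] == cleaned[-1] and cleaned[0] in "\"'":
        cleaned = cleaned[1:-1].strip()
    return cleaned or None
```

For example, an env value written as `"12345"` with the quotes included would resolve to `12345`.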
Improve discovery for one-at-a-time eval mode by removing trace-count selection at start, injecting workshop use-case and milestone summary context into follow-up and analysis prompts, and tracking milestone references through follow-up answers and findings evidence. Add clickable origin badges that scroll facilitators to the referenced trace or milestone in the feed.
Add question-level lineage refs and markdown link rendering so findings can cite trace milestones and follow-up questions inline, with navigation that resolves question links back to participant-selected context.
Add eval-mode workshop support with per-trace criteria CRUD, rubric rendering, scoring aggregation, and mode-gated routing/UI so eval and legacy workshop flows can evolve independently.

(cherry picked from commit e85d269f7d53b67fa8fb8aeec1003da42ab0a72d)
…128)

Define and normalize MLflow experiment IDs during ingestion, switch trace search to the non-deprecated locations API, and ensure Databricks hosts always include a protocol to prevent endpoint listing failures.
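
As a sketch of the host handling mentioned above, the helper below adds a protocol only when one is missing. normalize_host is a hypothetical name, and stripping a trailing slash is an extra assumption not stated in the commit.

```python
def normalize_host(host: str) -> str:
    """Return the host with a protocol, leaving an existing one untouched."""
    cleaned = host.strip().rstrip("/")  # trailing-slash removal is assumed
    if not cleaned:
        raise ValueError("Databricks host is empty")
    if cleaned.lower().startswith(("http://", "https://")):
        return cleaned
    return f"https://{cleaned}"
```

This also covers the case listed under Bug Fixes, where a databricks_host that already starts with https:// must not be prefixed a second time.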
Introduce a social discovery mode with threaded trace/milestone comments, voting, facilitator @assistant/@agent mentions, and SSE streaming updates while keeping analysis mode behind a toggle. Also include eval-mode regression hotfixes for Databricks/MLflow intake behavior with expanded unit coverage.
Add facilitator-only comment deletion and make social thread vote/delete interactions respond instantly with optimistic updates, so moderation and rating actions feel reliable under live SSE updates.
…ation timeout

- Extend get_display_text with optional milestone context enrichment so
  LLM judges can reason about agent trajectory, not just final response
- JudgeService now includes trace milestone summaries when evaluating
  (normal workshop mode)
- Fix summarization background job idle-in-transaction timeout by using
  short-lived DB sessions per write instead of one long-lived session
  across the entire batch of LLM calls
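
The short-lived session pattern from the last bullet can be sketched as below, assuming a session factory with the usual commit/rollback/close methods (for example SQLAlchemy's SessionLocal). short_session, summarize_batch, and the summarize and save callables are hypothetical names used only for illustration.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterable, Iterator


@contextmanager
def short_session(session_factory: Callable[[], Any]) -> Iterator[Any]:
    """Open a session, commit on success, roll back on error, always close."""
    session = session_factory()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()


def summarize_batch(
    trace_ids: Iterable[str],
    session_factory: Callable[[], Any],
    summarize: Callable[[str], Any],
    save: Callable[[Any, str, Any], None],
) -> int:
    """Summarize each trace, opening a DB session only for the write."""
    written = 0
    for trace_id in trace_ids:
        summary = summarize(trace_id)  # slow LLM call; no session is open here
        with short_session(session_factory) as session:
            save(session, trace_id, summary)
        written += 1
    return written
```

Keeping the LLM call outside the session means no transaction sits idle while the model responds, which is the condition the idle-in-transaction timeout penalizes.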
The lockfile was generated against pypi-proxy.dev.databricks.com which
is unavailable in the deployment environment.
Drop client package-lock.json and add client .npmrc to avoid proxy-pinned tarball URLs during Databricks Apps npm installs.
