Release v1.10.0: SDK auth migration, summarization agent, and Lakebase fixes #125
Open
forrestmurray-db wants to merge 88 commits into
Conversation
Replace the hardcoded MODEL_MAPPING with a live API call to Databricks serving-endpoints. The backend uses async httpx to avoid blocking the event loop, and the frontend fetches models via useAvailableModels and builds options dynamically with buildModelOptions. All components now store and pass endpoint names directly instead of translating between display names and backend names. Also switches model prefetching from an eager useEffect in WorkflowContext to intent-based prefetchQuery on hover/focus of navigation buttons, and clears Databricks auth env vars that can override token auth in the MLflow intake service. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace stale hasMlflowConfig references in DiscoveryAnalysisTab with modelOptions.length checks to match the switch to dynamic model listing. Fix discovery-complete endpoint returning 404 for facilitators whose workshop_id is NULL by also checking against workshop.facilitator_id. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prevent worktree contents from being tracked. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…vice init Add a public resolve_databricks_token() function that uses the Databricks SDK for auth (service principal on Apps, CLI profile locally) with a fallback to DATABRICKS_TOKEN env var. Remove the token_storage/db_service fallback chain from DatabricksService.__init__. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
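The resolution order described in this commit (SDK auth first, DATABRICKS_TOKEN as a fallback) could look roughly like the sketch below. This is not the PR's implementation: `resolve_token` and its `sdk_headers` parameter are hypothetical names, and `sdk_headers` stands in for whatever the Databricks SDK call returns, assumed here to be a header mapping containing `Authorization: Bearer <token>`.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_token(
    sdk_headers: Callable[[], Mapping[str, str]],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return a bearer token: SDK auth first, DATABRICKS_TOKEN env var second."""
    env = os.environ if env is None else env
    try:
        # Assumption: the SDK hands back HTTP headers with a Bearer token.
        auth = sdk_headers().get("Authorization", "")
        if auth.startswith("Bearer "):
            return auth[len("Bearer "):]
    except Exception:
        # SDK auth unavailable (no service principal, no CLI profile):
        # fall through to the environment variable.
        pass
    token = env.get("DATABRICKS_TOKEN", "")
    if token:
        return token
    raise RuntimeError("No Databricks credentials available")
```

Passing the SDK call in as a callable keeps the fallback logic testable without a workspace, which matches the commit's note that tests now patch resolve_databricks_token rather than token_storage.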
MLflow uses whatever Databricks auth the SDK provides. Stop setting DATABRICKS_TOKEN in the environment — only set DATABRICKS_HOST so the SDK knows which workspace to target. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mark databricks_token as deprecated with empty default in Python models (MLflowIntakeConfig, MLflowIntakeConfigCreate, DBSQLExportRequest, DatabricksConfig) and optional in TypeScript models. SDK auth is used instead. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…outer Replace 10+ token_storage.get_token / db_service.get_databricks_token fallback chains with resolve_databricks_token(). Remove all os.environ["DATABRICKS_TOKEN"] mutations. Update test mocks to patch resolve_databricks_token instead of token_storage. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…outers Update discovery_service (7 refs), judge_service, draft_rubric_grouping, database_service, databricks router, dbsql_export router. Remove set/get_databricks_token methods from database_service. Update test mocks to patch resolve_databricks_token. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the token persistence infrastructure:
- DatabricksTokenDB SQLAlchemy model from database.py
- databricks_tokens from postgres_manager ALLOWED_TABLES and CREATE TABLE
- DatabricksTokenDB import from database_service.py
- test_token_storage_service.py (5 tests for deleted functionality)
- Update postgres_manager test expectations

token_storage_service.py is kept for Custom LLM API key storage.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…Page Users no longer need to provide Databricks tokens — the backend uses SDK auth (service principal on Apps, CLI profile locally). Remove all token state, localStorage persistence, form fields, and validation from both pages. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove os.environ["DATABRICKS_TOKEN"] and DATABRICKS_CLIENT_ID/SECRET pop() calls from alignment_service, judge_service, dbsql_export_service, and database_service. The SDK handles auth automatically — only DATABRICKS_HOST needs to be set for MLflow to know which workspace. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AUTHENTICATION_SPEC:
- Rewrite Architecture Context to describe the two-layer model accurately
- Add new "Databricks API Authentication" section with token resolution contract, environment-specific behavior, MLflow auth, and what was removed
- Add "Future: Per-User Auth" subsection for OBO pattern
- Add 8 success criteria for Databricks API auth
- Mark SDK Auth Migration as complete in implementation log

BUILD_AND_DEPLOY_SPEC:
- Mark DATABRICKS_TOKEN as optional (SDK auth preferred) in env vars table
- Update Databricks Apps Authentication section to reference resolve_databricks_token() and link to AUTHENTICATION_SPEC

JUDGE_EVALUATION_SPEC:
- Fix troubleshooting note: "host, token" → "host, experiment ID + SDK auth"
- Add SDK Auth Migration to implementation log

README.md:
- Add keyword index entries: PAT, SDK auth, resolve_databricks_token, service principal, DATABRICKS_TOKEN, DATABRICKS_CLIENT_ID, OAuth, CLI profile

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document the Databricks resources the app's service principal needs access to: MLflow Experiment (Can edit), Model Serving Endpoints (Can query), SQL Warehouse (Can use), Unity Catalog Volume (Can read and write). Note which are required vs optional. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lakebase (PostgreSQL) is the primary production database. Its OAuth tokens are refreshed via WorkspaceClient().config.oauth_token() every 15 minutes. Split permissions into core (Lakebase, MLflow, Serving Endpoints) vs optional (SQL Warehouse, UC Volume). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AUTHENTICATION_SPEC:
- Add "Lakebase Connection Pool" section with token lifecycle, do_connect injection pattern, required pool settings, credential API, and setup prerequisites, all with links to Databricks docs
- Update Lakebase row in permissions table to reference generate_database_credential
- Add 7 Lakebase connection pool success criteria
- Add implementation log entry

BUILD_AND_DEPLOY_SPEC:
- Add Lakebase env vars (PGHOST, PGDATABASE, PGUSER, PGPORT, PGSSLMODE, PGAPPNAME, ENDPOINT_NAME, DATABASE_ENV) to environment variables table
- Add implementation log section with SDK auth and Lakebase pool entries

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ings
Replace the creator-based connection factory with the recommended do_connect event pattern from Databricks docs. Key changes:
- OAuthTokenManager → LakebaseCredentialManager using generate_database_credential(endpoint=ENDPOINT_NAME) API
- Token injection via do_connect event (not creator callable)
- pool_recycle: 300s → 3600s (was causing excessive connection churn)
- pool_pre_ping: True → False (conflicts with do_connect injection)
- max_overflow: 10 → 5 (caps at 20 total across 2 workers)
- postgres_manager: pool created once with custom OAuthConnection class, never recreated on token refresh
- database.py: _reset_connection_pool no longer calls force_refresh

Reference: https://docs.databricks.com/aws/en/lakebase/connect/custom-app.html
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
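To illustrate the do_connect injection pattern, here is a minimal sketch rather than the PR's LakebaseCredentialManager. `CachedCredential`, `attach_token_injection`, and the `fetch` callable are hypothetical; `fetch` stands in for the credential API call, and the 900-second default mirrors the 15-minute refresh mentioned earlier in this PR. The SQLAlchemy `do_connect` event and its `(dialect, conn_rec, cargs, cparams)` signature are the library's own.

```python
import threading
import time
from typing import Callable, Optional


class CachedCredential:
    """Caches a short-lived database password and refreshes it after a TTL."""

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl_seconds: float = 900.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        # The lock keeps concurrent pool checkouts from refreshing twice.
        with self._lock:
            now = self._clock()
            if self._token is None or now >= self._expires_at:
                self._token = self._fetch()
                self._expires_at = now + self._ttl
            return self._token


def attach_token_injection(engine, credential: CachedCredential) -> None:
    """Set the password on every new DBAPI connection via do_connect."""
    # Imported lazily so the cache above is usable without SQLAlchemy.
    from sqlalchemy import event

    @event.listens_for(engine, "do_connect")
    def _inject(dialect, conn_rec, cargs, cparams):
        # Mutating cparams in place changes the arguments passed to connect().
        cparams["password"] = credential.get()
```

Because the listener runs only when the pool opens a new connection, the engine and pool can be created once and left alone when the token rotates, which is the behavior the commit describes for postgres_manager.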
Remove databricks_token from CSV upload body type, make DatabricksConfig.token optional, update ApiService/WorkshopsService docstrings to reflect SDK auth. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When Lakebase is added as a Databricks App resource, the platform automatically creates a Postgres role for the service principal. Manual databricks_create_role() is only needed for external/additional identities outside the App resource integration. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ndency
- Add summarization_enabled, summarization_model, summarization_guidance columns to WorkshopDB
- Add summary (JSON) column to TraceDB for structured milestone views
- Add corresponding Pydantic model fields and DB service methods
- Add pydantic-ai-slim[openai] dependency
- Create TRACE_SUMMARIZATION_SPEC with success criteria
- Create implementation plan

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… with batch support Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…raceViewer Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ingestion
- PUT /workshops/{id}/summarization-settings for facilitator config
- POST /workshops/{id}/resummarize for on-demand re-summarization
- Background summarization triggered after MLflow trace ingestion
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…odelOptions The settings agent used a function name that doesn't exist in modelMapping.ts. Fixed to follow the same pattern as other components: useAvailableModels() + buildModelOptions(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…s fork The FastAPI lifespan bootstrap ran migrations in each worker process, requiring interprocess locks and never applying new migrations after initial deploy. Move migration execution to gunicorn's on_starting hook which runs exactly once in the master process before any workers fork. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts: # specs/BUILD_AND_DEPLOY_SPEC.md
…nd tasks
- Use resolve_databricks_token() instead of stored PAT (SDK auth compat)
- Create new SessionLocal() inside background tasks to avoid using the request-scoped DB session after it's closed
- Add logging for summarization completion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix two indentation errors in workshops.py caused by removing 'if databricks_token:' gatekeeping without dedenting the body. Remove orphaned 'else: no token' branch. Update test fixtures for databricks and dbsql_export routers to match new no-arg create_databricks_service() and DBSQLExportService() APIs. 810 passed, 0 failed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Migration 0017_remove_databricks_host branched from 0016 alongside the existing 0017_add_summarization, creating multiple heads. Renumbered to 0019 with down_revision pointing to 0018. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The frontend no longer sends experiment_id (it comes from MLFLOW_EXPERIMENT_ID env var). The endpoint now resolves it from the environment when not provided in the request body. 810 passed, 0 failed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Variable was renamed to workspace_host via get_databricks_host() but the call to _run_summarization_background still referenced the old name. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nvalidation
- Remove duplicate QueryClient from App.tsx; main.tsx's configured client is now the single provider
- Set global staleTime: 30s to prevent refetch storms on navigation; remove 12 redundant per-hook staleTime overrides
- Add 7 selector hooks (useWorkshopPhase, useWorkshopDisplayConfig, useWorkshopMeta, useWorkshopDiscoveryConfig, useWorkshopAnnotationConfig, useWorkshopEvalConfig, useWorkshopSummarizationConfig) using TanStack Query select option, so components only re-render when their slice changes
- Migrate 15 components from useWorkshop() to selector hooks
- Fix mutation anti-patterns: remove unnecessary workshop invalidation from annotation submit; eliminate setQueryData + invalidateQueries in 4 mutations (toggle notes, JSONPath, span filter, summarization)
- Update 12 test files to mock new selector hooks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summarization jobs could get stuck with no way to stop them. This adds a cancel endpoint that cancels the asyncio background task and a cancel button in the SummarizationSettings progress UI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
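Cancelling an asyncio background task from an endpoint can be sketched as follows. The registry `_jobs`, the functions `start_job` and `cancel_job`, and the sleeping `_summarize` coroutine are all hypothetical stand-ins; the PR's actual endpoint and job bookkeeping are not shown in this conversation.

```python
import asyncio
from typing import Dict

# Hypothetical in-memory registry: workshop ID -> running background task.
_jobs: Dict[str, asyncio.Task] = {}


async def _summarize(workshop_id: str) -> None:
    """Stand-in for a long-running batch of LLM calls."""
    await asyncio.sleep(3600)


def start_job(workshop_id: str) -> asyncio.Task:
    """Start the background job and remember its task handle."""
    task = asyncio.create_task(_summarize(workshop_id))
    _jobs[workshop_id] = task
    return task


async def cancel_job(workshop_id: str) -> bool:
    """Cancel a running job; return False if nothing was running."""
    task = _jobs.pop(workshop_id, None)
    if task is None or task.done():
        return False
    task.cancel()
    try:
        await task  # wait until the cancellation has been delivered
    except asyncio.CancelledError:
        pass
    return True
```

Awaiting the task after `cancel()` ensures the job has actually stopped before the endpoint reports success, so the progress UI does not show a cancelled job that is still writing.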
- Deleted the .env.local file containing Databricks configuration.
- Updated justfile to include log level configuration for uvicorn commands.
- Removed unused databricks_host property from Body_upload_csv_and_log_to_mlflow_workshops model.
- Made experiment_id optional in MLflowIntakeConfigCreate type.
- Added cancelSummarizationJob API method to ApiService and WorkshopsService.
- Enhanced logging in summarization background tasks and services for better traceability.
- Refactored DiscoveryService to use unified SDK authentication for Databricks LLM calls.
…roup display and focus handling
- Added a section in DraftRubricPanel to display proposed groups for faster review, including an apply and dismiss button.
- Updated DraftRubricSidebar to handle focus changes, allowing the sidebar to expand when inputs are focused.
- Improved the layout and styling of proposed group displays in both components for better user experience.
Strip wrapping quotes and whitespace from experiment IDs loaded from env and request inputs so Databricks MLflow lookups resolve correctly.
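As an illustration of this normalization, a helper might look like the sketch below. The name `normalize_experiment_id` and the choice to return `None` for empty input are assumptions, not taken from the PR.

```python
from typing import Optional


def normalize_experiment_id(raw: Optional[str]) -> Optional[str]:
    """Strip surrounding whitespace and matching wrapping quotes."""
    if raw is None:
        return None
    value = raw.strip()
    # Peel off matching single or double quotes, e.g. from a quoted .env value.
    while len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        value = value[1:-1].strip()
    return value or None
```

Applying the same helper to both the MLFLOW_EXPERIMENT_ID environment value and request input keeps the two paths consistent.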
Improve discovery for one-at-a-time eval mode by removing trace-count selection at start, injecting workshop use-case and milestone summary context into follow-up and analysis prompts, and tracking milestone references through follow-up answers and findings evidence. Add clickable origin badges that scroll facilitators to the referenced trace or milestone in the feed.
Add question-level lineage refs and markdown link rendering so findings can cite trace milestones and follow-up questions inline, with navigation that resolves question links back to participant-selected context.
Add eval-mode workshop support with per-trace criteria CRUD, rubric rendering, scoring aggregation, and mode-gated routing/UI so eval and legacy workshop flows can evolve independently. (cherry picked from commit e85d269f7d53b67fa8fb8aeec1003da42ab0a72d)
…128) Define and normalize MLflow experiment IDs during ingestion, switch trace search to the non-deprecated locations API, and ensure Databricks hosts always include a protocol to prevent endpoint listing failures.
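The host fix can be pictured with a small helper like this one. It is a hedged sketch: `ensure_protocol` is a hypothetical name, and defaulting to `https://` and trimming a trailing slash are assumptions about the intended behavior.

```python
def ensure_protocol(host: str) -> str:
    """Return the host with a scheme, adding https:// only when none is present."""
    cleaned = host.strip().rstrip("/")
    if cleaned.lower().startswith(("http://", "https://")):
        return cleaned  # already has a protocol: do not prefix it again
    return f"https://{cleaned}"
```

Checking for an existing scheme first avoids producing values such as `https://https://...`, which would break endpoint listing just as a missing protocol does.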
Introduce a social discovery mode with threaded trace/milestone comments, voting, facilitator @assistant/@agent mentions, and SSE streaming updates while keeping analysis mode behind a toggle. Also include eval-mode regression hotfixes for Databricks/MLflow intake behavior with expanded unit coverage.
Add facilitator-only comment deletion and make social thread vote/delete interactions respond instantly with optimistic updates, so moderation and rating actions feel reliable under live SSE updates.
…ation timeout
- Extend get_display_text with optional milestone context enrichment so LLM judges can reason about agent trajectory, not just final response
- JudgeService now includes trace milestone summaries when evaluating (normal workshop mode)
- Fix summarization background job idle-in-transaction timeout by using short-lived DB sessions per write instead of one long-lived session across the entire batch of LLM calls
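The short-lived-session fix can be sketched as below. All names here (`short_session`, `summarize_batch`, and the `summarize`/`save` callables) are hypothetical; `session_factory` plays the role of something like SessionLocal, and any object with `commit`, `rollback`, and `close` will do.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterable, Iterator


@contextmanager
def short_session(session_factory: Callable[[], Any]) -> Iterator[Any]:
    """Open a session, commit on success, roll back on error, always close."""
    session = session_factory()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()


def summarize_batch(
    trace_ids: Iterable[str],
    summarize: Callable[[str], Any],
    save: Callable[[Any, str, Any], None],
    session_factory: Callable[[], Any],
) -> None:
    """Run the slow call with no session open, then write in a brief session."""
    for trace_id in trace_ids:
        summary = summarize(trace_id)  # slow LLM call: no transaction held
        with short_session(session_factory) as session:
            save(session, trace_id, summary)  # quick write, then commit and close
```

No transaction stays open while the LLM call is in flight, so the database's idle-in-transaction timeout has nothing to terminate.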
The lockfile was generated against pypi-proxy.dev.databricks.com which is unavailable in the deployment environment.
Drop client package-lock.json and add client .npmrc to avoid proxy-pinned tarball URLs during Databricks Apps npm installs.
Summary
- …resolve_databricks_token()). Removes DatabricksTokenDB model, token input fields, and DATABRICKS_TOKEN env var mutations. All services now use SDK auth.
- …SummarizationJob table, and resummarize capability.
- …do_connect token injection, fix connection pool settings, and update specs with pool requirements and service principal permissions.
- …convertTraceToTraceData for summary propagation, handle databricks_host with existing https:// prefix, resolve available-models without mlflow intake config.

Changes (63 files, +4839 / -1206)

Auth (12 commits)
- …resolve_databricks_token() utility using Databricks SDK
- …DatabricksTokenDB model and databricks_tokens table

Summarization (7 commits)
- …SummarizationJob model and migration (0018)
- …convertTraceToTraceData

Lakebase & Database (3 commits)
- …do_connect token injection for Lakebase

Docs (4 commits)

Test plan
- just test-server: all backend tests pass
- just ui-test-unit: all frontend tests pass
- just e2e: end-to-end tests pass

🤖 Generated with Claude Code