Commit 5528628

Feature/builder app skills update (#355)

* Update databricks-builder-app to support Lakebase autoscale: added/modified the builder-app setup and deployment to support autoscale instances alongside provisioned ones. app.yaml.example now documents how to configure autoscale, and the README explains how to deploy and set up Lakebase permissions.

* Minor change to the builder app deploy script: clean up the SP's src directory of snapshotted app source code. Every deployment made a new copy without removing old versions, so the deploy script now has a step that cleans up the old inactive directories.

* Delete databricks-builder-app/app.yaml: remove a file that should have been ignored in the commit.

* builder-app-skills update: deploy all skills (Databricks, MLflow, APX) via install_skills.sh and dynamically populate ENABLED_SKILLS in app.yaml at deploy time instead of maintaining a hardcoded list.

1 parent 017cd26 · commit 5528628

File tree: 10 files changed, +308 −77 lines


.claude/skills/databricks-python-sdk/SKILL.md

Lines changed: 10 additions & 0 deletions
```diff
@@ -613,3 +613,13 @@ If I'm unsure about a method, I should:
 | Pipelines | https://databricks-sdk-py.readthedocs.io/en/latest/workspace/pipelines/pipelines.html |
 | Secrets | https://databricks-sdk-py.readthedocs.io/en/latest/workspace/workspace/secrets.html |
 | DBUtils | https://databricks-sdk-py.readthedocs.io/en/latest/dbutils.html |
+
+## Related Skills
+
+- **[databricks-config](../databricks-config/SKILL.md)** - profile and authentication setup
+- **[databricks-bundles](../databricks-bundles/SKILL.md)** - deploying resources via DABs
+- **[databricks-jobs](../databricks-jobs/SKILL.md)** - job orchestration patterns
+- **[databricks-unity-catalog](../databricks-unity-catalog/SKILL.md)** - catalog governance
+- **[databricks-model-serving](../databricks-model-serving/SKILL.md)** - serving endpoint management
+- **[databricks-vector-search](../databricks-vector-search/SKILL.md)** - vector index operations
+- **[databricks-lakebase-provisioned](../databricks-lakebase-provisioned/SKILL.md)** - managed PostgreSQL via SDK
```

databricks-builder-app/.gitignore

Lines changed: 3 additions & 0 deletions
```diff
@@ -32,5 +32,8 @@ client/.vite/
 .vscode/
 *.swp
 
+# Local docs/notes
+docs/TODO.md
+
 # OS
 .DS_Store
```

databricks-builder-app/README.md

Lines changed: 102 additions & 35 deletions
````diff
@@ -503,18 +503,21 @@ databricks auth login --host https://your-workspace.cloud.databricks.com
 # 2. Create the app (first time only)
 databricks apps create my-builder-app
 
-# 3. Add Lakebase as a resource (first time only)
+# 3. Configure app.yaml (copy and edit the example)
+cp app.yaml.example app.yaml
+# Edit app.yaml — set LAKEBASE_ENDPOINT (autoscale) or LAKEBASE_INSTANCE_NAME (provisioned)
+
+# 4. (Provisioned Lakebase only) Add Lakebase as an app resource
+# Skip this step if using autoscale — it connects via OAuth directly.
 databricks apps add-resource my-builder-app \
   --resource-type database \
   --resource-name lakebase \
   --database-instance <your-lakebase-instance-name>
 
-# 4. Configure app.yaml (copy and edit the example)
-cp app.yaml.example app.yaml
-# Edit app.yaml with your Lakebase instance name and other settings
-
 # 5. Deploy
 ./scripts/deploy.sh my-builder-app
+
+# 6. Grant database permissions to the app's service principal (see Section 7)
 ```
 
 ### Step-by-Step Deployment Guide
````
````diff
@@ -564,6 +567,10 @@ The app requires a PostgreSQL database (Lakebase) for storing projects, conversa
 
 #### 4. Add Lakebase as an App Resource
 
+**Autoscale Lakebase**: Skip this step. Autoscale connects via OAuth using `LAKEBASE_ENDPOINT` — no app resource needed.
+
+**Provisioned Lakebase**: Add the instance as an app resource:
+
 ```bash
 databricks apps add-resource my-builder-app \
   --resource-type database \
````
````diff
@@ -647,25 +654,89 @@ The deploy script will:
 
 #### 7. Grant Database Permissions
 
-After the first deployment, grant table permissions to the app's service principal:
+After the first deployment, the app's service principal needs two things:
+1. A **Lakebase OAuth role** (so it can authenticate via OAuth tokens)
+2. **PostgreSQL grants** on the `builder_app` schema (so it can create/read/write tables)
 
-```sql
--- Run this in a Databricks notebook or SQL editor
--- Replace <service-principal-id> with your app's service principal
+##### Step 7a: Find the service principal's client ID
+
+```bash
+SP_CLIENT_ID=$(databricks apps get my-builder-app --output json | jq -r '.service_principal_client_id')
+echo $SP_CLIENT_ID
+```
+
+##### Step 7b: Create a Lakebase OAuth role for the SP
 
-GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public
-TO `<service-principal-id>`;
-GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public
-TO `<service-principal-id>`;
-ALTER DEFAULT PRIVILEGES IN SCHEMA public
-GRANT ALL ON TABLES TO `<service-principal-id>`;
+> **Important**: Do NOT use PostgreSQL `CREATE ROLE` directly. Lakebase Autoscaling requires
+> roles to be created through the Databricks API so the OAuth authentication layer recognizes them.
+
+```python
+from databricks.sdk import WorkspaceClient
+from databricks.sdk.service.postgres import Role, RoleRoleSpec, RoleAuthMethod, RoleIdentityType
+
+w = WorkspaceClient()
+
+# Replace with your branch path and SP client ID
+branch = "projects/<project-id>/branches/<branch-id>"
+sp_client_id = "<sp-client-id>"
+
+w.postgres.create_role(
+    parent=branch,
+    role=Role(
+        spec=RoleRoleSpec(
+            postgres_role=sp_client_id,
+            auth_method=RoleAuthMethod.LAKEBASE_OAUTH_V1,
+            identity_type=RoleIdentityType.SERVICE_PRINCIPAL,
+        )
+    ),
+).wait()
 ```
 
-To find your app's service principal ID:
+Or via CLI:
+
 ```bash
-databricks apps get my-builder-app --output json | jq '.service_principal_id'
+databricks postgres create-role \
+  "projects/<project-id>/branches/<branch-id>" \
+  --json '{
+    "spec": {
+      "postgres_role": "<sp-client-id>",
+      "auth_method": "LAKEBASE_OAUTH_V1",
+      "identity_type": "SERVICE_PRINCIPAL"
+    }
+  }'
+```
+
+**Provisioned Lakebase**: This step is not needed — adding the instance as an app resource
+(Step 4) automatically configures authentication.
+
+##### Step 7c: Grant PostgreSQL permissions
+
+Connect to your Lakebase database as your own user (via psql or a notebook) and run:
+
+```sql
+-- Replace <sp-client-id> with the service_principal_client_id
+
+-- 1. Allow the SP to create the builder_app schema
+GRANT CREATE ON DATABASE databricks_postgres TO "<sp-client-id>";
+
+-- 2. Create the schema and grant full access
+CREATE SCHEMA IF NOT EXISTS builder_app;
+GRANT USAGE ON SCHEMA builder_app TO "<sp-client-id>";
+GRANT ALL PRIVILEGES ON SCHEMA builder_app TO "<sp-client-id>";
+
+-- 3. Grant access to any existing tables/sequences (needed if you ran migrations locally)
+GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA builder_app TO "<sp-client-id>";
+GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA builder_app TO "<sp-client-id>";
+
+-- 4. Ensure the SP has access to future tables/sequences created by other users
+ALTER DEFAULT PRIVILEGES IN SCHEMA builder_app
+  GRANT ALL ON TABLES TO "<sp-client-id>";
+ALTER DEFAULT PRIVILEGES IN SCHEMA builder_app
+  GRANT ALL ON SEQUENCES TO "<sp-client-id>";
 ```
 
+After granting permissions, redeploy the app so it can run migrations with the new role.
+
 #### 8. Access Your App
 
 After successful deployment, the script will display your app URL:
````
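The Step 7c grant statements can be templated for a given service-principal client ID. A hypothetical helper, not part of the repo (Python for illustration; the schema and database names default to the `builder_app`/`databricks_postgres` values used in the README):

```python
# Hypothetical helper (not in the repo): template the Step 7c grants
# for a given service-principal client ID, schema, and database.
def grant_statements(sp_client_id: str, schema: str = "builder_app",
                     database: str = "databricks_postgres") -> list[str]:
    role = f'"{sp_client_id}"'  # Postgres identifiers are double-quoted
    return [
        f"GRANT CREATE ON DATABASE {database} TO {role};",
        f"CREATE SCHEMA IF NOT EXISTS {schema};",
        f"GRANT USAGE ON SCHEMA {schema} TO {role};",
        f"GRANT ALL PRIVILEGES ON SCHEMA {schema} TO {role};",
        f"GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA {schema} TO {role};",
        f"GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA {schema} TO {role};",
        f"ALTER DEFAULT PRIVILEGES IN SCHEMA {schema} GRANT ALL ON TABLES TO {role};",
        f"ALTER DEFAULT PRIVILEGES IN SCHEMA {schema} GRANT ALL ON SEQUENCES TO {role};",
    ]

if __name__ == "__main__":
    for stmt in grant_statements("<sp-client-id>"):
        print(stmt)
```

Quoting the client ID as a Postgres identifier matters: SP client IDs contain hyphens, which are invalid in unquoted identifiers.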
````diff
@@ -706,33 +777,29 @@ Skills are copied from the sibling `databricks-skills/` directory. Ensure:
 2. The skill name in `ENABLED_SKILLS` matches a directory in `databricks-skills/`
 3. The skill directory contains a `SKILL.md` file
 
-#### "Permission denied for table projects" or Database Errors
+#### "password authentication failed" or "Permission denied for table projects"
 
-When using a shared Lakebase instance, you need to grant the app's service principal permissions on the tables:
+See [Section 7: Grant Database Permissions](#7-grant-database-permissions) for the complete setup.
 
-```bash
-# 1. Get your app's service principal ID
-databricks apps get my-builder-app --output json | python3 -c "import sys, json; print(json.load(sys.stdin)['service_principal_id'])"
-```
-
-2. Connect to your Lakebase instance via psql or a Databricks notebook, then run:
+Common causes:
 
-```sql
--- Replace <service-principal-id> with the ID from step 1
-GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO "<service-principal-id>";
-GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO "<service-principal-id>";
-GRANT USAGE ON SCHEMA public TO "<service-principal-id>";
-ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO "<service-principal-id>";
-ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO "<service-principal-id>";
-```
+| Error | Cause | Fix |
+|-------|-------|-----|
+| `password authentication failed` | Lakebase OAuth role missing or created via SQL instead of API | Create the role via `w.postgres.create_role()` with `LAKEBASE_OAUTH_V1` auth (Step 7b) |
+| `permission denied for table` | SP lacks PostgreSQL grants on schema/tables | Run the GRANT statements (Step 7c) |
+| `schema "builder_app" does not exist` | SP lacks `CREATE` on the database | `GRANT CREATE ON DATABASE databricks_postgres TO "<sp-client-id>"` |
+| `relation does not exist` | Migrations haven't run | Redeploy the app, or run `alembic upgrade head` locally |
 
-Alternatively, if you have a fresh/private Lakebase instance, the app's migrations will create the tables with proper ownership automatically.
+> **Autoscale Lakebase pitfall**: Do NOT use `CREATE ROLE ... LOGIN` in PostgreSQL directly.
+> Lakebase Autoscaling requires roles to be created through the Databricks API so that OAuth
+> token authentication works. Manually created roles get `NO_LOGIN` auth and will fail with
+> "password authentication failed".
 
 #### App shows blank page or "Not Found"
 
 Check the app logs in Databricks:
 ```bash
-databricks apps get-logs my-builder-app
+databricks apps logs my-builder-app
 ```
 
 Common causes:
````

databricks-builder-app/alembic/env.py

Lines changed: 3 additions & 2 deletions
```diff
@@ -35,9 +35,10 @@
 def get_url_and_connect_args():
     """Get database URL and connect_args from environment.
 
-    Supports two modes:
+    Supports three modes:
     1. Static URL: Uses LAKEBASE_PG_URL directly
-    2. Dynamic OAuth: Builds URL from LAKEBASE_INSTANCE_NAME + generates token
+    2. Autoscale OAuth: Builds URL from LAKEBASE_ENDPOINT + generates token via client.postgres
+    3. Provisioned OAuth: Builds URL from LAKEBASE_INSTANCE_NAME + generates token via client.database
 
     Returns tuple of (url, connect_args) for psycopg2 driver.
     """
```

databricks-builder-app/app.yaml.example

Lines changed: 20 additions & 14 deletions
```diff
@@ -28,29 +28,35 @@ env:
   # =============================================================================
   # Skills Configuration
   # =============================================================================
-  # Comma-separated list of skills to enable
+  # Comma-separated list of skills to enable.
+  # AUTO-POPULATED by deploy.sh at deploy time based on installed skills.
+  # To override, set a specific list here before deploying.
   - name: ENABLED_SKILLS
-    value: "databricks-bundles,databricks-agent-bricks,databricks-aibi-dashboards,databricks-app-apx,databricks-app-python,databricks-config,databricks-docs,databricks-jobs,databricks-python-sdk,databricks-unity-catalog,databricks-mlflow-evaluation,databricks-spark-declarative-pipelines,databricks-synthetic-data-gen,databricks-unstructured-pdf-generation"
+    value: ""
   - name: SKILLS_ONLY_MODE
     value: "false"
 
   # =============================================================================
   # Database Configuration (Lakebase)
   # =============================================================================
-  # IMPORTANT: You must add Lakebase as an app resource for database connectivity.
-  #
-  # Steps:
-  # 1. Create a Lakebase instance in your workspace (if not exists)
-  # 2. Add it as an app resource:
-  #    databricks apps add-resource <app-name> \
-  #      --resource-type database \
-  #      --resource-name lakebase \
-  #      --database-instance <your-lakebase-instance-name>
+  # Choose ONE of the two options below.
   #
-  # When added as a resource, Databricks automatically sets:
-  #   - PGHOST, PGPORT, PGUSER, PGPASSWORD, PGDATABASE
+  # --- Option A: Autoscale Lakebase (recommended) ---
+  # Scales to zero when idle. No add-resource step needed — connects via OAuth.
+  # Find endpoint name in: Catalog → Lakebase → your project → Branches → Endpoints
+  #
+  # - name: LAKEBASE_ENDPOINT
+  #   value: "projects/<project-name>/branches/production/endpoints/<endpoint>"
+  # - name: LAKEBASE_DATABASE_NAME
+  #   value: "databricks_postgres"
+  #
+  # --- Option B: Provisioned Lakebase ---
+  # Fixed-capacity instance. Must add as an app resource:
+  #   databricks apps add-resource <app-name> \
+  #     --resource-type database \
+  #     --resource-name lakebase \
+  #     --database-instance <your-lakebase-instance-name>
   #
-  # You only need to specify the instance name for OAuth token generation:
   - name: LAKEBASE_INSTANCE_NAME
     value: "<your-lakebase-instance-name>"
   - name: LAKEBASE_DATABASE_NAME
```
