Commit 8cfb1ba

feat: add missing info on README
1 parent 1788fbe commit 8cfb1ba

1 file changed

Lines changed: 26 additions & 7 deletions

README.md

@@ -121,7 +121,7 @@ To run on Databricks, configure the following:
 
 #### Databricks Secrets
 
-Create a secrets scope and add credentials:
+Create a secrets scope and add credentials. These secrets are used at **runtime** by the Databricks jobs:
 
 ```bash
 databricks secrets create-scope migration-accelerator
@@ -133,30 +133,49 @@ databricks secrets put-secret migration-accelerator DATABRICKS_CLIENT_ID
 databricks secrets put-secret migration-accelerator DATABRICKS_CLIENT_SECRET
 ```
 
+> **Note:** The Snowflake credentials (`SNOWFLAKE_ACCOUNT`, `SNOWFLAKE_USER`, `SNOWFLAKE_PASSWORD`) are required for the ingestion job to connect to Snowflake. The Databricks credentials are used by the Job Executor App for API authentication.
+
 #### Cluster Environment Variables
 
 Set in **Cluster → Advanced Options → Spark → Environment Variables**:
 
 ```bash
+# Required - Unity Catalog configuration
 UC_CATALOG=your_catalog_name
 UC_SCHEMA=migration_accelerator
+
+# Required - Snowflake source context
 SNOWFLAKE_DATABASE=your_database
 SNOWFLAKE_SCHEMA=your_schema
+
+# Required - Translation output configuration
+DDL_OUTPUT_DIR=/Volumes/your_catalog/migration_accelerator/outputs
+DBX_ENDPOINT=databricks-llama-4-maverick
+
+# Optional - Override defaults if needed
+# SECRETS_SCOPE=migration-accelerator # Default: migration-accelerator
+# UC_RAW_VOLUME=snowflake_artifacts_raw # Default: snowflake_artifacts_raw
+# SNOWFLAKE_WAREHOUSE=COMPUTE_WH # Default: COMPUTE_WH
+# SNOWFLAKE_ROLE=SYSADMIN # Default: SYSADMIN
 ```
 
 #### GitHub Secrets (for CI/CD)
 
+These secrets are used by **GitHub Actions** to deploy the Databricks Asset Bundle (not at runtime):
+
 | Secret | Description |
 |--------|-------------|
-| `DATABRICKS_HOST` | Workspace URL |
-| `DATABRICKS_CLIENT_ID` | OAuth M2M client ID |
-| `DATABRICKS_CLIENT_SECRET` | OAuth M2M client secret |
-| `DATABRICKS_CLUSTER_ID` | Cluster ID for jobs |
-| `UC_CATALOG` | Unity Catalog name |
-| `DEVS_GROUP` | Group name for permissions |
+| `DATABRICKS_HOST` | Workspace URL (e.g., `https://your-workspace.cloud.databricks.com`) |
+| `DATABRICKS_CLIENT_ID` | Service principal OAuth M2M client ID |
+| `DATABRICKS_CLIENT_SECRET` | Service principal OAuth M2M client secret |
+| `DATABRICKS_CLUSTER_ID` | Existing all-purpose cluster ID for running job tasks |
+| `UC_CATALOG` | Unity Catalog name for schema and volume creation |
+| `DEVS_GROUP` | Databricks group name for job and catalog permissions |
 
 > **Note:** The `DEVS_GROUP` (e.g., `migration-accelerator-devs`) must exist in Databricks before deployment. Create it in **Admin Console → Groups → Create Group**.
 
+> **Secrets vs GitHub Secrets:** Databricks Secrets (in the scope) are read at **runtime** by the jobs. GitHub Secrets are used at **deploy time** by the CI/CD pipeline to authenticate and configure the bundle.
+
 #### After deployment
 
 Once deployed, get the service principal name from the Databricks App in Compute->Apps->dbx-job-executor-app->Authorization->App Authorization and the job ID from Jobs & Pipelines->snowflake_ingestion_job->Job Details->Job ID.
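
The split between required and optional cluster environment variables in this change can be illustrated with a short Python sketch. It is not code from the repository: `load_settings` is a hypothetical helper, and the assumption that a job would resolve its configuration this way is mine. Only the variable names and default values come from the README block in this commit.

```python
import os
from typing import Dict, Mapping

# Variable names copied from the README block; the README marks these as required.
REQUIRED = (
    "UC_CATALOG",
    "UC_SCHEMA",
    "SNOWFLAKE_DATABASE",
    "SNOWFLAKE_SCHEMA",
    "DDL_OUTPUT_DIR",
    "DBX_ENDPOINT",
)

# Optional variables and the defaults the README documents for them.
OPTIONAL_DEFAULTS = {
    "SECRETS_SCOPE": "migration-accelerator",
    "UC_RAW_VOLUME": "snowflake_artifacts_raw",
    "SNOWFLAKE_WAREHOUSE": "COMPUTE_WH",
    "SNOWFLAKE_ROLE": "SYSADMIN",
}


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Hypothetical helper (not part of the repository).

    Fails fast when a required variable is unset or empty, and fills in
    the documented default for each optional variable that is not set.
    """
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("Missing required environment variables: " + ", ".join(missing))

    settings = {name: env[name] for name in REQUIRED}
    for name, default in OPTIONAL_DEFAULTS.items():
        settings[name] = env.get(name) or default
    return settings
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise outside a cluster, for example in a unit test that checks the default `SNOWFLAKE_ROLE`.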
