Commit b003e29

Merge pull request #38 from thisisqubika/DC-393-fix-review-readme-secrets
feat: add missing info on README
2 parents c31260d + 8cfb1ba commit b003e29

1 file changed

Lines changed: 26 additions & 7 deletions

README.md

@@ -210,7 +210,7 @@ To run on Databricks, configure the following:
 
 #### Databricks Secrets
 
-Create a secrets scope and add credentials:
+Create a secrets scope and add credentials. These secrets are used at **runtime** by the Databricks jobs:
 
 ```bash
 databricks secrets create-scope migration-accelerator
@@ -222,30 +222,49 @@ databricks secrets put-secret migration-accelerator DATABRICKS_CLIENT_ID
 databricks secrets put-secret migration-accelerator DATABRICKS_CLIENT_SECRET
 ```
 
+> **Note:** The Snowflake credentials (`SNOWFLAKE_ACCOUNT`, `SNOWFLAKE_USER`, `SNOWFLAKE_PASSWORD`) are required for the ingestion job to connect to Snowflake. The Databricks credentials are used by the Job Executor App for API authentication.
+
 #### Cluster Environment Variables
 
 Set in **Cluster → Advanced Options → Spark → Environment Variables**:
 
 ```bash
+# Required - Unity Catalog configuration
 UC_CATALOG=your_catalog_name
 UC_SCHEMA=migration_accelerator
+
+# Required - Snowflake source context
 SNOWFLAKE_DATABASE=your_database
 SNOWFLAKE_SCHEMA=your_schema
+
+# Required - Translation output configuration
+DDL_OUTPUT_DIR=/Volumes/your_catalog/migration_accelerator/outputs
+DBX_ENDPOINT=databricks-llama-4-maverick
+
+# Optional - Override defaults if needed
+# SECRETS_SCOPE=migration-accelerator # Default: migration-accelerator
+# UC_RAW_VOLUME=snowflake_artifacts_raw # Default: snowflake_artifacts_raw
+# SNOWFLAKE_WAREHOUSE=COMPUTE_WH # Default: COMPUTE_WH
+# SNOWFLAKE_ROLE=SYSADMIN # Default: SYSADMIN
 ```
 
 #### GitHub Secrets (for CI/CD)
 
+These secrets are used by **GitHub Actions** to deploy the Databricks Asset Bundle (not at runtime):
+
 | Secret | Description |
 |--------|-------------|
-| `DATABRICKS_HOST` | Workspace URL |
-| `DATABRICKS_CLIENT_ID` | OAuth M2M client ID |
-| `DATABRICKS_CLIENT_SECRET` | OAuth M2M client secret |
-| `DATABRICKS_CLUSTER_ID` | Cluster ID for jobs |
-| `UC_CATALOG` | Unity Catalog name |
-| `DEVS_GROUP` | Group name for permissions |
+| `DATABRICKS_HOST` | Workspace URL (e.g., `https://your-workspace.cloud.databricks.com`) |
+| `DATABRICKS_CLIENT_ID` | Service principal OAuth M2M client ID |
+| `DATABRICKS_CLIENT_SECRET` | Service principal OAuth M2M client secret |
+| `DATABRICKS_CLUSTER_ID` | Existing all-purpose cluster ID for running job tasks |
+| `UC_CATALOG` | Unity Catalog name for schema and volume creation |
+| `DEVS_GROUP` | Databricks group name for job and catalog permissions |
 
 > **Note:** The `DEVS_GROUP` (e.g., `migration-accelerator-devs`) must exist in Databricks before deployment. Create it in **Admin Console → Groups → Create Group**.
 
+> **Secrets vs GitHub Secrets:** Databricks Secrets (in the scope) are read at **runtime** by the jobs. GitHub Secrets are used at **deploy time** by the CI/CD pipeline to authenticate and configure the bundle.
+
 #### After deployment
 
 Once deployed, get the service principal name from the Databricks App in **Compute → Apps → dbx-job-executor-app → Authorization → App Authorization** and the job ID from **Jobs & Pipelines → snowflake_ingestion_job → Job Details → Job ID**. Then add this service principal to the developers permission group specified by the `DEVS_GROUP` GitHub secret.
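
Because the runtime jobs depend on the secrets scope being complete, it may help to check a scope listing against the key names this diff mentions. The following is a minimal sketch, not part of the project: `missing_secrets` is a hypothetical helper, and the key list covers only the names visible in this diff (the README lines elided between the two hunks may define others).

```sh
#!/bin/sh
# Sketch only: report which expected runtime secrets are absent from a listing.
# EXPECTED_KEYS holds just the key names mentioned in this diff (assumption:
# the elided README lines may add more).
EXPECTED_KEYS="SNOWFLAKE_ACCOUNT SNOWFLAKE_USER SNOWFLAKE_PASSWORD DATABRICKS_CLIENT_ID DATABRICKS_CLIENT_SECRET"

# Reads a scope listing on stdin (any text containing key names as whole
# words) and prints each expected key that does not appear in it.
missing_secrets() {
  listing=$(cat)
  for key in $EXPECTED_KEYS; do
    if ! printf '%s\n' "$listing" | grep -qw -- "$key"; then
      printf '%s\n' "$key"
    fi
  done
}
```

Assuming the installed Databricks CLI provides `secrets list-secrets` (the same command family as the `put-secret` calls in the README), the helper could be fed with `databricks secrets list-secrets migration-accelerator | missing_secrets`; empty output would mean every listed key was found.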
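
The cluster environment block in this diff separates required variables from optional ones with defaults. One way a job entry point might enforce that split is sketched below, under the assumption that the job can run a shell preamble; `check_cluster_env` and `apply_optional_defaults` are hypothetical names, and the project may resolve these values differently (for example, inside its Python code).

```sh
#!/bin/sh
# Sketch only: validate required cluster variables and fill in the optional
# defaults listed in the README diff. Both function names are hypothetical.

# Returns non-zero and names each required variable that is unset or empty.
check_cluster_env() {
  missing=0
  for name in UC_CATALOG UC_SCHEMA SNOWFLAKE_DATABASE SNOWFLAKE_SCHEMA DDL_OUTPUT_DIR DBX_ENDPOINT; do
    # Indirect lookup: copy the value of the variable named by $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Assigns each documented default only when the variable is unset or empty.
apply_optional_defaults() {
  : "${SECRETS_SCOPE:=migration-accelerator}"
  : "${UC_RAW_VOLUME:=snowflake_artifacts_raw}"
  : "${SNOWFLAKE_WAREHOUSE:=COMPUTE_WH}"
  : "${SNOWFLAKE_ROLE:=SYSADMIN}"
}
```

Calling `apply_optional_defaults` before `check_cluster_env` keeps an explicitly set override (such as a different `SNOWFLAKE_ROLE`) intact while still failing fast on a missing required value.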
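
Since the diff distinguishes runtime secrets, cluster environment variables, and deploy-time GitHub secrets, a small lookup can make explicit where each name is expected to be set. This is an illustrative sketch built only from the names visible in this diff; `where_to_set` and the three location labels are hypothetical, and names from the elided README lines are not covered.

```sh
#!/bin/sh
# Sketch only: map a setting name to the place(s) the README diff says it
# belongs. Labels are made up for illustration:
#   databricks-scope = Databricks secrets scope (read at runtime)
#   cluster-env      = cluster environment variables
#   github-secret    = GitHub Actions secrets (used at deploy time)
where_to_set() {
  case "$1" in
    SNOWFLAKE_ACCOUNT|SNOWFLAKE_USER|SNOWFLAKE_PASSWORD)
      echo "databricks-scope" ;;
    DATABRICKS_CLIENT_ID|DATABRICKS_CLIENT_SECRET)
      # Listed both in the scope commands and in the GitHub secrets table.
      echo "databricks-scope github-secret" ;;
    DATABRICKS_HOST|DATABRICKS_CLUSTER_ID|DEVS_GROUP)
      echo "github-secret" ;;
    UC_CATALOG)
      # Listed both as a GitHub secret and as a cluster variable.
      echo "github-secret cluster-env" ;;
    UC_SCHEMA|SNOWFLAKE_DATABASE|SNOWFLAKE_SCHEMA|DDL_OUTPUT_DIR|DBX_ENDPOINT|SECRETS_SCOPE|UC_RAW_VOLUME|SNOWFLAKE_WAREHOUSE|SNOWFLAKE_ROLE)
      echo "cluster-env" ;;
    *)
      echo "unknown" ;;
  esac
}
```

For example, `where_to_set DATABRICKS_CLIENT_ID` reports two locations, which reflects the README's point that the same service principal credentials are needed both at deploy time and at runtime.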
