Commit 4306f62

fix: move Databricks Delta deploy key into secret scope (#20)
* fix: move databricks delta deploy key into secret scope
* fix: hide databricks delta secret sync value from ps
1 parent 7b63a3e commit 4306f62

10 files changed

Lines changed: 157 additions & 16 deletions

README.md

Lines changed: 1 addition & 0 deletions
@@ -144,6 +144,7 @@ Runtime split:
144144

145145
Packaged entrypoints:
146146

147+
- `just databricks-delta-sync-secret`
147148
- `just databricks-delta-deploy`
148149
- `just databricks-delta-run`
149150
- `just databricks-delta-smoke <warehouse_id>`

justfile

Lines changed: 3 additions & 0 deletions
@@ -75,6 +75,9 @@ databricks-sync-staging-views *args:
7575
databricks-apply-sql-dir profile warehouse_id sql_dir:
7676
./scripts/apply-databricks-sql-dir.sh {{profile}} {{warehouse_id}} {{sql_dir}}
7777

78+
databricks-delta-sync-secret *args:
79+
./scripts/ensure-databricks-delta-secret.sh {{args}}
80+
7881
databricks-delta-deploy profile="DEFAULT" target="dev":
7982
./scripts/deploy-databricks-delta.sh {{profile}} {{target}}
8083

platform/databricks/delta/README.md

Lines changed: 22 additions & 0 deletions
@@ -37,13 +37,34 @@ cannot overwrite key, ordering, or delete semantics.
3737

3838
Bundle lifecycle:
3939

40+
- `scripts/ensure-databricks-delta-secret.sh <profile> [scope] [key]`
4041
- `scripts/deploy-databricks-delta.sh <profile> <target>`
4142
- `scripts/run-databricks-delta-job.sh <profile> <target> [job_key]`
4243
- `scripts/run-databricks-delta-smoke.sh <profile> <target> <warehouse_id>`
4344

4445
These scripts default `DATABRICKS_BUNDLE_ENGINE=direct` so deployment does not
4546
depend on Terraform downloads.
4647

48+
## Secret Contract
49+
50+
The Databricks bundle never receives the raw Convex deploy key as a bundle
51+
variable.
52+
53+
- The deploy key lives in a Databricks secret scope.
54+
- The bundle only carries the secret scope name and secret key name.
55+
- The extractor resolves the deploy key inside Databricks at runtime with
56+
`dbutils.secrets.get(...)`.
57+
58+
Helper defaults:
59+
60+
- `DATABRICKS_DELTA_SECRET_SCOPE=convex-streaming-olap-export`
61+
- `DATABRICKS_DELTA_SECRET_KEY=convex-deploy-key`
62+
63+
If `CONVEX_DEPLOY_KEY` is available locally, the deploy and run helpers will
64+
create or update that Databricks secret automatically before validating,
65+
deploying, or running the job. If the local key is not available, the helpers
66+
require the target Databricks secret to already exist.
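
The helper behavior described above can be read as a small decision function. The following is a minimal Python sketch of that rule, not the repository's implementation (the real logic lives in the shell helper `scripts/ensure-databricks-delta-secret.sh`). The names `SecretAction` and `plan_secret_action` are hypothetical and exist only for illustration.

```python
from enum import Enum
from typing import Optional


class SecretAction(Enum):
    """Hypothetical labels for the two successful outcomes."""

    SYNC = "sync"                  # create the scope if needed, then write the secret
    USE_EXISTING = "use_existing"  # leave the stored secret untouched


def plan_secret_action(
    local_deploy_key: Optional[str],
    scope_exists: bool,
    secret_exists: bool,
) -> SecretAction:
    """Decide what a helper should do before validating, deploying, or running."""
    # A locally available key always wins: the secret is created or updated.
    if local_deploy_key:
        return SecretAction.SYNC
    # Without a local key, both the scope and the secret must already exist.
    if not scope_exists:
        raise RuntimeError("secret scope is missing and no local deploy key is available")
    if not secret_exists:
        raise RuntimeError("secret is missing and no local deploy key is available")
    return SecretAction.USE_EXISTING


if __name__ == "__main__":
    print(plan_secret_action("local-key", scope_exists=False, secret_exists=False))
    print(plan_secret_action(None, scope_exists=True, secret_exists=True))
```

Keeping the decision separate from the CLI calls makes the three outcomes (sync, reuse, fail) easy to check without a Databricks workspace.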
67+
4768
Bootstrap SQL can still be applied directly with:
4869

4970
- `scripts/apply-databricks-sql-dir.sh <profile> <warehouse_id> <rendered_sql_dir>`
@@ -67,6 +88,7 @@ flowchart LR
6788
Recommended operator entrypoints:
6889

6990
```bash
91+
just databricks-delta-sync-secret
7092
just databricks-delta-deploy
7193
just databricks-delta-run
7294
just databricks-delta-smoke <warehouse_id>
```

platform/databricks/delta/databricks.yml

Lines changed: 8 additions & 2 deletions
@@ -8,8 +8,10 @@ include:
88
variables:
99
convex_deployment_url:
1010
description: Convex deployment root URL.
11-
convex_deploy_key:
12-
description: Convex deploy key used by the extractor.
11+
convex_deploy_key_secret_scope:
12+
description: Databricks secret scope that stores the Convex deploy key.
13+
convex_deploy_key_secret_key:
14+
description: Secret key name inside the Databricks secret scope.
1315
source_id:
1416
description: Source identifier stored in the checkpoint table.
1517
table_name:
@@ -30,6 +32,8 @@ targets:
3032
workspace:
3133
profile: DEFAULT
3234
variables:
35+
convex_deploy_key_secret_scope: convex-streaming-olap-export
36+
convex_deploy_key_secret_key: convex-deploy-key
3337
source_id: convex-streaming-olap-export-dev
3438
table_name: ""
3539
catalog: workspace
@@ -41,6 +45,8 @@ targets:
4145
workspace:
4246
profile: DEFAULT
4347
variables:
48+
convex_deploy_key_secret_scope: convex-streaming-olap-export
49+
convex_deploy_key_secret_key: convex-deploy-key
4450
source_id: convex-streaming-olap-export
4551
table_name: ""
4652
catalog: workspace

platform/databricks/delta/extractor/README.md

Lines changed: 6 additions & 3 deletions
@@ -14,7 +14,9 @@ It mirrors the current Rust source/checkpoint behavior:
1414
## Required environment
1515

1616
- `CONVEX_DEPLOYMENT_URL`
17-
- `CONVEX_DEPLOY_KEY`
17+
- one of:
18+
- `CONVEX_DEPLOY_KEY`
19+
- `CONVEX_DEPLOY_KEY_SECRET_SCOPE` and `CONVEX_DEPLOY_KEY_SECRET_KEY`
1820

1921
## Optional environment
2022

@@ -25,5 +27,6 @@ It mirrors the current Rust source/checkpoint behavior:
2527
- `DATABRICKS_BRONZE_SCHEMA`: defaults to `bronze`
2628
- `DATABRICKS_CHECKPOINT_TABLE`: defaults to `connector_checkpoint`
2729

28-
In the bundled Databricks Delta path, these are usually passed as task
29-
parameters rather than exported manually.
30+
In the bundled Databricks Delta path, the job receives the secret scope/key
31+
names as task parameters and resolves the actual deploy key with
32+
`dbutils.secrets.get(...)` at runtime.
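
The lookup order implied by the required-environment list above can be sketched as follows. This is an illustrative version that runs outside Databricks, assuming the order "explicit value, then `CONVEX_DEPLOY_KEY`, then secret scope/key". The `environ` and `get_secret` parameters and the name `resolve_deploy_key_sketch` are hypothetical; `get_secret` stands in for the `dbutils.secrets.get(...)` call so that no Databricks runtime is needed.

```python
from typing import Callable, Mapping, Optional


def resolve_deploy_key_sketch(
    *,
    direct_value: Optional[str],
    secret_scope: Optional[str],
    secret_key: Optional[str],
    environ: Mapping[str, str],
    get_secret: Optional[Callable[[str, str], str]],
) -> str:
    """Resolve the deploy key from the first source that provides one."""
    # 1. An explicitly passed value takes precedence.
    if direct_value is not None:
        return direct_value

    # 2. Next, the plain environment variable.
    env_value = environ.get("CONVEX_DEPLOY_KEY")
    if env_value:
        return env_value

    # 3. Finally, a secret lookup, which needs both a scope and a key name.
    scope = secret_scope or environ.get("CONVEX_DEPLOY_KEY_SECRET_SCOPE")
    key = secret_key or environ.get("CONVEX_DEPLOY_KEY_SECRET_KEY")
    if not scope or not key:
        raise RuntimeError("missing deploy key: no direct value, env var, or scope/key pair")
    if get_secret is None:
        raise RuntimeError("no secret backend is available in this environment")
    return get_secret(scope, key)


if __name__ == "__main__":
    # Hypothetical in-memory store standing in for a Databricks secret scope.
    fake_store = {("convex-streaming-olap-export", "convex-deploy-key"): "stored-value"}
    resolved = resolve_deploy_key_sketch(
        direct_value=None,
        secret_scope="convex-streaming-olap-export",
        secret_key="convex-deploy-key",
        environ={},
        get_secret=lambda scope, key: fake_store[(scope, key)],
    )
    print(len(resolved))
```

Injecting the environment and the secret getter keeps the precedence rule testable locally; in the bundled job, only the scope and key names travel as task parameters.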

platform/databricks/delta/extractor/convex_cdc_job.py

Lines changed: 39 additions & 1 deletion
@@ -20,6 +20,11 @@
2020
StructType,
2121
)
2222

23+
try:
24+
from pyspark.dbutils import DBUtils
25+
except ImportError: # local syntax checks and non-Databricks execution
26+
DBUtils = None # type: ignore[assignment]
27+
2328

2429
spark = SparkSession.builder.getOrCreate()
2530

@@ -49,6 +54,33 @@ def opt(value: Optional[str], env_name: str, default: Optional[str] = None) -> s
4954
return env(env_name, default)
5055

5156

57+
def resolve_deploy_key(
58+
*,
59+
direct_value: Optional[str],
60+
secret_scope: Optional[str],
61+
secret_key: Optional[str],
62+
) -> str:
63+
if direct_value is not None:
64+
return direct_value
65+
66+
env_value = os.getenv("CONVEX_DEPLOY_KEY")
67+
if env_value:
68+
return env_value
69+
70+
scope = secret_scope or os.getenv("CONVEX_DEPLOY_KEY_SECRET_SCOPE")
71+
key = secret_key or os.getenv("CONVEX_DEPLOY_KEY_SECRET_KEY")
72+
if not scope or not key:
73+
raise RuntimeError(
74+
"missing Convex deploy key: provide --deploy-key, CONVEX_DEPLOY_KEY, "
75+
"or both deploy-key secret scope/key settings"
76+
)
77+
78+
if DBUtils is None:
79+
raise RuntimeError("pyspark.dbutils.DBUtils is unavailable outside Databricks runtime")
80+
81+
return DBUtils(spark).secrets.get(scope=scope, key=key)
82+
83+
5284
@dataclass
5385
class Checkpoint:
5486
phase: str
@@ -374,6 +406,8 @@ def parse_args() -> argparse.Namespace:
374406
parser = argparse.ArgumentParser(description="Convex CDC Databricks extractor")
375407
parser.add_argument("--deployment-url")
376408
parser.add_argument("--deploy-key")
409+
parser.add_argument("--deploy-key-secret-scope")
410+
parser.add_argument("--deploy-key-secret-key")
377411
parser.add_argument("--source-id")
378412
parser.add_argument("--table-name")
379413
parser.add_argument("--catalog")
@@ -441,7 +475,11 @@ def run_once() -> None:
441475
args = parse_args()
442476

443477
deployment_url = opt(args.deployment_url, "CONVEX_DEPLOYMENT_URL")
444-
deploy_key = opt(args.deploy_key, "CONVEX_DEPLOY_KEY")
478+
deploy_key = resolve_deploy_key(
479+
direct_value=args.deploy_key,
480+
secret_scope=args.deploy_key_secret_scope,
481+
secret_key=args.deploy_key_secret_key,
482+
)
445483
source_id = opt(args.source_id, "CONVEX_SOURCE_ID", deployment_url)
446484
table_name = args.table_name if args.table_name is not None else os.getenv("CONVEX_TABLE_NAME")
447485

platform/databricks/delta/resources/convex_delta_extract.job.yml

Lines changed: 4 additions & 2 deletions
@@ -11,8 +11,10 @@ resources:
1111
parameters:
1212
- --deployment-url
1313
- ${var.convex_deployment_url}
14-
- --deploy-key
15-
- ${var.convex_deploy_key}
14+
- --deploy-key-secret-scope
15+
- ${var.convex_deploy_key_secret_scope}
16+
- --deploy-key-secret-key
17+
- ${var.convex_deploy_key_secret_key}
1618
- --source-id
1719
- ${var.source_id}
1820
- --table-name

scripts/deploy-databricks-delta.sh

Lines changed: 8 additions & 4 deletions
@@ -28,23 +28,27 @@ read_env_file_value() {
2828
}
2929

3030
deployment_url="${CONVEX_DEPLOYMENT_URL:-$(read_env_file_value CONVEX_DEPLOYMENT_URL || true)}"
31-
deploy_key="${CONVEX_DEPLOY_KEY:-$(read_env_file_value CONVEX_DEPLOY_KEY || true)}"
3231

33-
if [[ -z "$deployment_url" || -z "$deploy_key" ]]; then
34-
echo "CONVEX_DEPLOYMENT_URL and CONVEX_DEPLOY_KEY are required" >&2
32+
if [[ -z "$deployment_url" ]]; then
33+
echo "CONVEX_DEPLOYMENT_URL is required" >&2
3534
exit 1
3635
fi
3736

3837
source_id="${CONVEX_SOURCE_ID:-$deployment_url}"
3938
table_name="${CONVEX_TABLE_NAME:-}"
39+
secret_scope="${DATABRICKS_DELTA_SECRET_SCOPE:-convex-streaming-olap-export}"
40+
secret_key="${DATABRICKS_DELTA_SECRET_KEY:-convex-deploy-key}"
4041
catalog="${DATABRICKS_DELTA_CATALOG:-workspace}"
4142
control_schema="${DATABRICKS_DELTA_CONTROL_SCHEMA:-convex_streaming_olap_export_control}"
4243
bronze_schema="${DATABRICKS_DELTA_BRONZE_SCHEMA:-convex_streaming_olap_export_bronze}"
4344
checkpoint_table="${DATABRICKS_DELTA_CHECKPOINT_TABLE:-connector_checkpoint}"
4445

46+
"$repo_root/scripts/ensure-databricks-delta-secret.sh" "$profile" "$secret_scope" "$secret_key"
47+
4548
bundle_args=(
4649
--var "convex_deployment_url=$deployment_url"
47-
--var "convex_deploy_key=$deploy_key"
50+
--var "convex_deploy_key_secret_scope=$secret_scope"
51+
--var "convex_deploy_key_secret_key=$secret_key"
4852
--var "source_id=$source_id"
4953
--var "table_name=$table_name"
5054
--var "catalog=$catalog"

scripts/ensure-databricks-delta-secret.sh

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
1+
#!/usr/bin/env bash
2+
set -euo pipefail
3+
4+
if [[ "$#" -lt 1 || "$#" -gt 3 ]]; then
5+
echo "usage: $0 <profile> [scope] [key]" >&2
6+
exit 1
7+
fi
8+
9+
profile="$1"
10+
scope_arg="${2:-}"
11+
key_arg="${3:-}"
12+
13+
repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
14+
15+
read_env_file_value() {
16+
local key="$1"
17+
local env_file="$repo_root/.env"
18+
if [[ ! -f "$env_file" ]]; then
19+
return 1
20+
fi
21+
local line
22+
line="$(grep -E "^${key}=" "$env_file" | tail -n 1 || true)"
23+
if [[ -z "$line" ]]; then
24+
return 1
25+
fi
26+
printf '%s' "${line#*=}"
27+
}
28+
29+
scope="${scope_arg:-${DATABRICKS_DELTA_SECRET_SCOPE:-convex-streaming-olap-export}}"
30+
key="${key_arg:-${DATABRICKS_DELTA_SECRET_KEY:-convex-deploy-key}}"
31+
deploy_key="${CONVEX_DEPLOY_KEY:-$(read_env_file_value CONVEX_DEPLOY_KEY || true)}"
32+
33+
scopes_json="$(databricks secrets list-scopes -p "$profile" -o json)"
34+
scope_exists=false
35+
if jq -e --arg scope "$scope" '.[] | select(.name == $scope)' <<<"$scopes_json" >/dev/null; then
36+
scope_exists=true
37+
fi
38+
39+
if [[ -n "$deploy_key" ]]; then
40+
if [[ "$scope_exists" == false ]]; then
41+
databricks secrets create-scope "$scope" -p "$profile" >/dev/null
42+
fi
43+
printf '%s' "$deploy_key" | databricks secrets put-secret "$scope" "$key" -p "$profile" >/dev/null
44+
echo "synced Databricks secret $scope/$key"
45+
exit 0
46+
fi
47+
48+
if [[ "$scope_exists" == false ]]; then
49+
echo "Databricks secret scope $scope does not exist and CONVEX_DEPLOY_KEY is not available to create it" >&2
50+
exit 1
51+
fi
52+
53+
if ! databricks secrets list-secrets "$scope" -p "$profile" -o json | jq -e --arg key "$key" '.[] | select(.key == $key)' >/dev/null; then
54+
echo "Databricks secret $scope/$key does not exist and CONVEX_DEPLOY_KEY is not available to create it" >&2
55+
exit 1
56+
fi
57+
58+
echo "using existing Databricks secret $scope/$key"
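
The script above pipes the deploy key into the CLI rather than passing it as a command-line argument, which matches the commit note about hiding the value from `ps`. The same idea can be sketched in Python with `subprocess.run(..., input=...)`. This is an illustration under stated assumptions: `run_with_secret_on_stdin` is a hypothetical helper, and the child process here is a stand-in Python one-liner, not the Databricks CLI.

```python
import subprocess
import sys
from typing import Sequence


def run_with_secret_on_stdin(command: Sequence[str], secret: str) -> str:
    """Run `command`, feed `secret` on stdin, and return the child's stdout.

    The secret never appears in the argument vector, so it is not visible
    in process listings that show command lines.
    """
    completed = subprocess.run(
        list(command),
        input=secret,          # delivered through a pipe, not argv
        capture_output=True,
        text=True,
        check=True,            # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout


if __name__ == "__main__":
    # Stand-in child: reports how many characters arrived on stdin.
    child = [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]
    print(run_with_secret_on_stdin(child, "example-secret").strip())
```

Only the command name and its non-secret arguments end up in `command`; anything sensitive goes through `input`.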

scripts/run-databricks-delta-job.sh

Lines changed: 8 additions & 4 deletions
@@ -29,23 +29,27 @@ read_env_file_value() {
2929
}
3030

3131
deployment_url="${CONVEX_DEPLOYMENT_URL:-$(read_env_file_value CONVEX_DEPLOYMENT_URL || true)}"
32-
deploy_key="${CONVEX_DEPLOY_KEY:-$(read_env_file_value CONVEX_DEPLOY_KEY || true)}"
3332

34-
if [[ -z "$deployment_url" || -z "$deploy_key" ]]; then
35-
echo "CONVEX_DEPLOYMENT_URL and CONVEX_DEPLOY_KEY are required" >&2
33+
if [[ -z "$deployment_url" ]]; then
34+
echo "CONVEX_DEPLOYMENT_URL is required" >&2
3635
exit 1
3736
fi
3837

3938
source_id="${CONVEX_SOURCE_ID:-$deployment_url}"
4039
table_name="${CONVEX_TABLE_NAME:-}"
40+
secret_scope="${DATABRICKS_DELTA_SECRET_SCOPE:-convex-streaming-olap-export}"
41+
secret_key="${DATABRICKS_DELTA_SECRET_KEY:-convex-deploy-key}"
4142
catalog="${DATABRICKS_DELTA_CATALOG:-workspace}"
4243
control_schema="${DATABRICKS_DELTA_CONTROL_SCHEMA:-convex_streaming_olap_export_control}"
4344
bronze_schema="${DATABRICKS_DELTA_BRONZE_SCHEMA:-convex_streaming_olap_export_bronze}"
4445
checkpoint_table="${DATABRICKS_DELTA_CHECKPOINT_TABLE:-connector_checkpoint}"
4546

47+
"$repo_root/scripts/ensure-databricks-delta-secret.sh" "$profile" "$secret_scope" "$secret_key"
48+
4649
bundle_args=(
4750
--var "convex_deployment_url=$deployment_url"
48-
--var "convex_deploy_key=$deploy_key"
51+
--var "convex_deploy_key_secret_scope=$secret_scope"
52+
--var "convex_deploy_key_secret_key=$secret_key"
4953
--var "source_id=$source_id"
5054
--var "table_name=$table_name"
5155
--var "catalog=$catalog"
