Commit dcd3ff4

Fix Databricks dashboard counts and auto-refresh (#28)
1 parent c8f86de commit dcd3ff4

10 files changed

Lines changed: 297 additions & 88 deletions

README.md

Lines changed: 17 additions & 4 deletions
@@ -207,14 +207,27 @@ just databricks-delta-publish-dashboard DEFAULT <warehouse-id>
 
 The first version focuses on:
 
-- latest checkpoint freshness
+- checkpoint freshness and write volume
 - bronze table count
 - silver table count
 - recent checkpoint history
-- bronze and silver table inventory
+- per-table bronze vs silver record counts
+- a side-by-side bronze/silver table map
 
-Silver remaining empty is expected until you deploy a real Lakeflow `AUTO CDC`
-pipeline that reads bronze and materializes silver.
+The dashboard now filters out internal Lakeflow objects from the bronze/silver
+counts and uses full-width tables for recent checkpoints, per-table record
+counts, and the bronze/silver map.
+
+To make the pipeline feel truly live:
+
+- the extractor job now deploys with a 5-minute Databricks schedule
+- the Lakeflow pipeline is configured as continuous after its first run
+- after `just databricks-delta-deploy-pipeline DEFAULT prod`, run `just databricks-delta-run-pipeline DEFAULT prod` once to start the continuous update loop
+
+If you want a self-animating demo, add a tiny heartbeat write in the source
+Convex app on a 1-minute cron. That creates a steady stream of real upstream
+changes that show up in checkpoints, bronze, and silver without any manual
+reruns.
 
 ## Suggested Screenshots
 

docs/monitoring.md

Lines changed: 15 additions & 2 deletions
@@ -20,11 +20,15 @@ dashboard ID.
 
 The first dashboard focuses on:
 
-- latest checkpoint freshness
+- checkpoint freshness and write volume
 - bronze table count
 - silver table count
 - recent checkpoint history
-- bronze and silver table inventory
+- per-table bronze vs silver record counts
+- a side-by-side bronze/silver table map
+
+The template filters out internal Lakeflow objects from the layer counts and
+uses full-width tables so the dashboard is easier to read in Lakeview.
 
 ## AUTO CDC Status
 
@@ -39,6 +43,15 @@ just databricks-delta-run-pipeline DEFAULT prod
 For a newly onboarded source, silver will stay empty until you run that same
 deploy/run sequence for that source.
 
+Once deployed:
+
+- the extractor job runs every 5 minutes on a Databricks job schedule
+- the Lakeflow pipeline stays continuous after its first `run`
+
+If you want a no-touch proof loop, add a tiny heartbeat write in the upstream
+Convex app on a 1-minute cron. That gives Databricks a steady stream of real
+changes to ingest and makes the dashboard visibly move without manual reruns.
+
 What exists today:
 
 - control schema and checkpoint tables
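
The deploy-then-run sequence these docs describe can be scripted. The following is a minimal sketch and not part of this commit: `start_continuous_pipeline` and the `DRY_RUN` variable are hypothetical names introduced for illustration, while the two `just` recipes and the `DEFAULT prod` arguments are the ones quoted in the README changes.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): deploy the pipeline, then run it
# once so the continuous update loop starts. With DRY_RUN set to a non-empty
# value, each command is printed instead of executed.
start_continuous_pipeline() {
  local profile="${1:-DEFAULT}"
  local target="${2:-prod}"
  local prefix="${DRY_RUN:+echo}"   # "echo" in dry-run mode, empty otherwise

  # The run step only happens if the deploy step succeeds.
  $prefix just databricks-delta-deploy-pipeline "$profile" "$target" &&
    $prefix just databricks-delta-run-pipeline "$profile" "$target"
}

# Preview the commands without touching a workspace.
DRY_RUN=1 start_continuous_pipeline DEFAULT prod
```

Chaining the two steps with `&&` means a failed deploy never triggers a run, which matches the "deploy, then run once" order the docs call for.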

justfile

Lines changed: 2 additions & 2 deletions
@@ -81,8 +81,8 @@ databricks-delta-sync-secret *args:
 databricks-delta-bootstrap warehouse_id profile="DEFAULT":
     ./scripts/bootstrap-databricks-delta.sh {{profile}} {{warehouse_id}}
 
-databricks-delta-render-dashboard output_file:
-    ./scripts/render-databricks-delta-dashboard.sh {{output_file}}
+databricks-delta-render-dashboard output_file profile="":
+    ./scripts/render-databricks-delta-dashboard.sh {{output_file}} {{profile}}
 
 databricks-delta-publish-dashboard profile warehouse_id dashboard_id="":
     ./scripts/publish-databricks-delta-dashboard.sh {{profile}} {{warehouse_id}} {{dashboard_id}}

platform/databricks/delta/README.md

Lines changed: 8 additions & 1 deletion
@@ -40,7 +40,7 @@ Bundle lifecycle:
 
 - `scripts/ensure-databricks-delta-secret.sh <profile> [scope] [key]`
 - `scripts/bootstrap-databricks-delta.sh <profile> <warehouse_id>`
-- `scripts/render-databricks-delta-dashboard.sh <output_file>`
+- `scripts/render-databricks-delta-dashboard.sh <output_file> [profile]`
 - `scripts/publish-databricks-delta-dashboard.sh <profile> <warehouse_id> [dashboard_id]`
 - `scripts/render-databricks-delta-pipeline.sh <profile> <output_file>`
 - `scripts/deploy-databricks-delta-pipeline.sh <profile> <target>`
@@ -118,3 +118,10 @@ just databricks-delta-deploy
 just databricks-delta-run
 just databricks-delta-smoke <warehouse_id>
 ```
+
+Auto-update behavior after deployment:
+
+- the extractor job is scheduled every 5 minutes
+- the Lakeflow pipeline is continuous once you start it with the first `run`
+- if you want a hands-free demo, add a simple heartbeat write in the source
+Convex app so fresh upstream changes keep landing automatically

platform/databricks/delta/dashboards/README.md

Lines changed: 5 additions & 1 deletion
@@ -10,10 +10,14 @@ Files:
 Use the helper scripts:
 
 ```bash
-./scripts/render-databricks-delta-dashboard.sh /tmp/convex-sync-overview.lvdash.json
+./scripts/render-databricks-delta-dashboard.sh /tmp/convex-sync-overview.lvdash.json DEFAULT
 ./scripts/publish-databricks-delta-dashboard.sh DEFAULT <warehouse_id>
 ```
 
+Pass the Databricks profile to the render helper if you want the generated
+Lakeview dashboard to include per-table bronze and silver row counts. Without a
+profile, the dashboard still renders, but the row-count dataset is left empty.
+
 If you pass an existing dashboard ID to the publish script, it updates and
 re-publishes that dashboard. Otherwise it creates a new one and prints the
 dashboard ID.
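
The optional-profile behavior described in this README change could be handled along the following lines. This is a sketch under assumptions, not the contents of `render-databricks-delta-dashboard.sh`: the function name `row_count_mode` and its `query:<profile>` / `empty` return values are hypothetical and only illustrate the branch between "profile given" and "profile omitted".

```shell
#!/usr/bin/env bash
# Hypothetical sketch of optional-profile handling; the real render script may
# be structured differently.
row_count_mode() {
  local profile="${1:-}"        # empty when the caller omits the profile
  if [ -n "$profile" ]; then
    echo "query:${profile}"     # a profile is available, so counts could be fetched
  else
    echo "empty"                # no profile: leave the row-count dataset empty
  fi
}

row_count_mode DEFAULT
row_count_mode
```

Because the `just` recipe declares `profile=""` and interpolates `{{profile}}` as plain text, omitting the profile should leave the script with only the output-file argument, which is the case the `${1:-}` default covers here.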
