Commit 26f783e

Data Lineage: Visual lineage via Catalog Explorer
1 parent a2f6e78 commit 26f783e

1 file changed

Lines changed: 60 additions & 0 deletions

lineage/lineage_overview.md

@@ -0,0 +1,60 @@
# Lineage Overview (Unity Catalog)

Unity Catalog automatically tracks **column-level** and **table-level** lineage across SQL queries, streaming tables, materialized views, dashboards, and Lakeflow jobs.

---

## 1. Accessing Lineage

In Catalog Explorer, open any table in the `demo` schema of the `dbsql_samuel` catalog and select the **Lineage** tab:
```
Catalog Explorer → dbsql_samuel → demo → <any table> → Lineage
```

The lineage view shows:
- **Upstream** sources (e.g., CSV files in Volumes, Bronze tables)
- **Downstream** dependencies (Silver/Gold materialized views, dashboards, alerts)

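The same edges can also be read programmatically. The following is a minimal sketch, assuming lineage system tables are enabled for the workspace and that `system.access.table_lineage` exposes the `source_table_full_name`, `target_table_full_name`, and `event_date` columns. The helper name `table_lineage_query` and the example table are illustrative and not part of the lab. It only builds the SQL text; in a notebook the string would be passed to `spark.sql(...)`.

```python
import re

# Matches a three-level Unity Catalog name such as catalog.schema.table.
_FULL_NAME = re.compile(r"^[A-Za-z0-9_]+\.[A-Za-z0-9_]+\.[A-Za-z0-9_]+$")


def table_lineage_query(full_name: str, days: int = 30) -> str:
    """Build SQL listing lineage edges that touch `full_name` (hypothetical helper)."""
    # Validate inputs before interpolating them into the SQL text.
    if not _FULL_NAME.match(full_name):
        raise ValueError(f"expected catalog.schema.table, got {full_name!r}")
    if days <= 0:
        raise ValueError("days must be positive")
    return (
        "SELECT DISTINCT source_table_full_name, target_table_full_name\n"
        "FROM system.access.table_lineage\n"
        f"WHERE (source_table_full_name = '{full_name}'\n"
        f"       OR target_table_full_name = '{full_name}')\n"
        f"  AND event_date >= date_sub(current_date(), {days})"
    )


# Example table name assumed from the lab's catalog and schema.
query = table_lineage_query("dbsql_samuel.demo.flights_silver_mv")
print(query)
```

Rows where the table appears as `target_table_full_name` correspond to its upstream sources, and rows where it appears as `source_table_full_name` correspond to its downstream dependencies.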
---

## 2. Airline Lab Lineage Structure

### Bronze Layer
Derived from CSV files in a Volume:
- `airports_bronze_st`
- `flights_bronze_st`
- `lookupcodes_bronze_st`

### Silver Layer
- `airports_silver_mv` ← `airports_bronze_st`
- `lookupcodes_silver_mv` ← `lookupcodes_bronze_st`
- `flights_silver_mv` ← `flights_bronze_st`

### Gold Layer
- `airports_by_city_mv` ← `airports_silver_mv`

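The Bronze, Silver, and Gold dependencies listed above form a small directed graph. As an illustrative sketch (the `DOWNSTREAM` mapping and the helper names are assumptions made for this example, not something Unity Catalog exposes), the graph can be walked in either direction to answer "what does this table feed?" and "where does this table come from?":

```python
from collections import deque

# Edges copied from the layer lists: source table -> direct downstream tables.
DOWNSTREAM = {
    "airports_bronze_st": ["airports_silver_mv"],
    "lookupcodes_bronze_st": ["lookupcodes_silver_mv"],
    "flights_bronze_st": ["flights_silver_mv"],
    "airports_silver_mv": ["airports_by_city_mv"],
}


def invert(edges: dict[str, list[str]]) -> dict[str, list[str]]:
    """Flip every edge so the mapping reads target -> direct upstream tables."""
    upstream: dict[str, list[str]] = {}
    for src, targets in edges.items():
        for tgt in targets:
            upstream.setdefault(tgt, []).append(src)
    return upstream


def reachable(start: str, edges: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk returning every node reachable from `start`."""
    seen: set[str] = set()
    order: list[str] = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


UPSTREAM = invert(DOWNSTREAM)

print(reachable("airports_bronze_st", DOWNSTREAM))
# ['airports_silver_mv', 'airports_by_city_mv']
print(reachable("airports_by_city_mv", UPSTREAM))
# ['airports_silver_mv', 'airports_bronze_st']
```

The first call is the kind of impact check that is useful when a Bronze ingest fails; the second traces a Gold view back to its raw source.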
### Dashboards & Alerts
- Dashboards appear as downstream dependencies of the tables they query
- Alerts (e.g., the ATL delay alert) show lineage that includes the dataset that triggers them

### Lakeflow Jobs
Once a job has run, its tasks appear in operational lineage.

---

## 3. Why Lineage Is Useful

- Debug workflow failures
- Validate dependency chains
- Ensure governance & compliance
- Remove unused assets
- Support documentation & audits

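To make "remove unused assets" concrete, here is a hedged sketch that flags tables with no downstream table in a lineage edge list. The `EDGES` list is transcribed from the airline lab tables and the function name is hypothetical. Dashboards, alerts, and jobs are not in this list, so a flagged table is only a candidate for review, not proof that it is unused.

```python
# Table-to-table lineage edges from the airline lab, as (source, target) pairs.
EDGES = [
    ("airports_bronze_st", "airports_silver_mv"),
    ("lookupcodes_bronze_st", "lookupcodes_silver_mv"),
    ("flights_bronze_st", "flights_silver_mv"),
    ("airports_silver_mv", "airports_by_city_mv"),
]


def tables_without_consumers(edges: list[tuple[str, str]]) -> list[str]:
    """Return tables that are written to but never read by another table."""
    sources = {src for src, _ in edges}
    targets = {tgt for _, tgt in edges}
    return sorted(targets - sources)


print(tables_without_consumers(EDGES))
# ['airports_by_city_mv', 'flights_silver_mv', 'lookupcodes_silver_mv']
```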
---

## 4. Notes

- Lineage is populated only after the first read from or write to an asset
- Dashboards create lineage when their visualizations query data
- AI/BI interactions are tracked only after they execute
