| 1 | +# Lineage Overview (Unity Catalog) |
| 2 | + |
| 3 | +Unity Catalog automatically tracks **column-level** and **table-level** lineage across SQL queries, streaming tables, materialized views, dashboards, and Lakeflow jobs. |
| 4 | + |
| 5 | +--- |
| 6 | + |
| 7 | +## 1. Accessing Lineage |
| 8 | + |
| 9 | +Open: |
| 10 | +``` |
| 11 | +Catalog Explorer → dbsql_samuel → demo → <any table> → Lineage |
| 12 | +``` |
| 13 | + |
| 14 | +You will see: |
| 15 | +- **Upstream** sources (e.g., CSV Volume paths, Bronze tables) |
| 16 | +- **Downstream** dependencies (Silver/Gold MVs, dashboards, alerts) |
| 17 | + |
| 18 | +--- |
| 19 | + |
| 20 | +## 2. Airline Lab Lineage Structure |
| 21 | + |
| 22 | +### Bronze Layer |
| 23 | +Ingested from CSV files in a Volume: |
| 24 | +- `airports_bronze_st` |
| 25 | +- `flights_bronze_st` |
| 26 | +- `lookupcodes_bronze_st` |
| 27 | + |
| 28 | +### Silver Layer |
| 29 | +- `airports_silver_mv` ← `airports_bronze_st` |
| 30 | +- `lookupcodes_silver_mv` ← `lookupcodes_bronze_st` |
| 31 | +- `flights_silver_mv` ← `flights_bronze_st` |
| 32 | + |
| 33 | +### Gold Layer |
| 34 | +- `airports_by_city_mv` ← `airports_silver_mv` |
| 35 | + |
| 36 | +### Dashboards & Alerts |
| 37 | +- Dashboards appear as downstream consumers of the tables they query |
| 38 | +- Alerts (e.g., the ATL delay alert) show lineage that includes the dataset that triggers them |
| 39 | + |
| 40 | +### Lakeflow Jobs |
| 41 | +Once a Lakeflow job has run, its tasks appear in operational lineage. |
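
The table dependencies listed in this section can be sketched as a small graph to make "upstream" and "downstream" concrete. This is only an illustrative model, not how Unity Catalog stores or exposes lineage: the `UPSTREAM` mapping and the helpers `all_upstream` and `all_downstream` are hypothetical names, and the Volume CSV sources, dashboards, and alerts are left out because their exact names are not given here.

```python
from collections import deque

# Direct upstream edges copied from the Silver and Gold lists above:
# each key is a table, each value lists the tables it reads from.
# (Hypothetical in-memory model; Bronze tables have no table-level parents.)
UPSTREAM = {
    "airports_silver_mv": ["airports_bronze_st"],
    "lookupcodes_silver_mv": ["lookupcodes_bronze_st"],
    "flights_silver_mv": ["flights_bronze_st"],
    "airports_by_city_mv": ["airports_silver_mv"],
}


def all_upstream(table):
    """Return every transitive upstream table, nearest first (breadth-first)."""
    seen, order = set(), []
    queue = deque(UPSTREAM.get(table, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        order.append(current)
        queue.extend(UPSTREAM.get(current, []))
    return order


def all_downstream(table):
    """Return every table that depends on `table`, directly or transitively."""
    seen, order = set(), []
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for child, parents in UPSTREAM.items():
            if current in parents and child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


print(all_upstream("airports_by_city_mv"))
print(all_downstream("airports_bronze_st"))
```

Walking upstream from `airports_by_city_mv` reaches `airports_silver_mv` and then `airports_bronze_st`, which mirrors what the Lineage tab shows when you expand the graph from the Gold table back toward its sources.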
| 42 | + |
| 43 | +--- |
| 44 | + |
| 45 | +## 3. Why Lineage Is Useful |
| 46 | + |
| 47 | +- Debug workflow failures |
| 48 | +- Validate dependency chains |
| 49 | +- Ensure governance & compliance |
| 50 | +- Remove unused assets |
| 51 | +- Support documentation & audits |
| 52 | + |
| 53 | +--- |
| 54 | + |
| 55 | +## 4. Notes |
| 56 | + |
| 57 | +- Lineage is populated only after the first read or write against an object |
| 58 | +- Dashboards create lineage once their visualizations query data |
| 59 | +- AI/BI interactions are tracked only after their queries execute |
| 60 | + |