|
| 1 | +--- |
| 2 | +name: dbt-develop |
| 3 | +description: Create and modify dbt models — staging, intermediate, marts, incremental, medallion architecture. Use when building new SQL models, extending existing ones, scaffolding YAML configs, or reorganizing project structure. Powered by altimate-dbt. |
| 4 | +--- |
| 5 | + |
| 6 | +# dbt Model Development |
| 7 | + |
| 8 | +## Requirements |
| 9 | +**Agent:** builder or migrator (requires file write access) |
| 10 | +**Tools used:** bash (runs `altimate-dbt` commands), read, glob, write, edit |
| 11 | + |
| 12 | +## When to Use This Skill |
| 13 | + |
| 14 | +**Use when the user wants to:** |
| 15 | +- Create a new dbt model (staging, intermediate, mart, OBT) |
| 16 | +- Add or modify SQL logic in an existing model |
| 17 | +- Generate sources.yml or schema.yml from warehouse metadata |
| 18 | +- Reorganize models into layers (staging/intermediate/mart or bronze/silver/gold) |
| 19 | +- Convert a model to incremental materialization |
| 20 | +- Scaffold a new dbt project structure |
| 21 | + |
| 22 | +**Do NOT use for:** |
| 23 | +- Adding tests to models → use `dbt-test` |
| 24 | +- Writing model/column descriptions → use `dbt-docs` |
| 25 | +- Debugging build failures → use `dbt-troubleshoot` |
| 26 | +- Analyzing change impact → use `dbt-analyze` |
| 27 | + |
| 28 | +## Core Workflow: Plan → Discover → Write → Validate |
| 29 | + |
| 30 | +### 1. Plan — Understand Before Writing |
| 31 | + |
| 32 | +Before writing any SQL: |
| 33 | +- Read the task requirements carefully |
| 34 | +- Identify which layer this model belongs to (staging, intermediate, mart) |
| 35 | +- Check existing models for naming conventions and patterns |
| 36 | +- **Check dependencies:** If `packages.yml` exists, check for `dbt_packages/` or `package-lock.yml`. Only run `dbt deps` if packages are declared but not yet installed. |
| 37 | + |
| 38 | +```bash |
| 39 | +altimate-dbt info # project name, adapter type |
| 40 | +altimate-dbt parents --model <upstream> # understand what feeds this model |
| 41 | +altimate-dbt children --model <downstream> # understand what consumes it |
| 42 | +``` |
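
The dependency check from the list above can be sketched as a small shell helper. This is a minimal sketch, not part of `altimate-dbt`: the function name `needs_dbt_deps` is hypothetical, and it assumes that either `dbt_packages/` or `package-lock.yml` counts as evidence of a previous install, which is one reading of the rule.

```bash
# Hypothetical helper: returns 0 (true) when packages are declared
# but no sign of a previous install is found.
needs_dbt_deps() {
  project_dir="${1:-.}"
  # No packages.yml means nothing is declared, so nothing to install.
  [ -f "$project_dir/packages.yml" ] || return 1
  # Assumption: either artifact is treated as evidence of an earlier `dbt deps`.
  if [ -d "$project_dir/dbt_packages" ] || [ -f "$project_dir/package-lock.yml" ]; then
    return 1
  fi
  return 0
}

# Usage (from the project root):
#   if needs_dbt_deps .; then dbt deps; fi
```

A committed `package-lock.yml` does not guarantee the packages are present locally, so a stricter variant could require `dbt_packages/` alone.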
| 43 | + |
| 44 | +### 2. Discover — Understand the Data Before Writing |
| 45 | + |
| 46 | +**Never write SQL without deeply understanding your data first.** The #1 cause of wrong results is writing SQL blind — assuming grain, relationships, column names, or values without checking. |
| 47 | + |
| 48 | +**Step 2a: Read all documentation and schema definitions** |
| 49 | +- Read `sources.yml`, `schema.yml`, and any YAML files that describe the source/parent models |
| 50 | +- These contain column descriptions, data types, tests, and business context |
| 51 | +- Pay special attention to: primary keys, unique constraints, relationships between tables, and what each column represents |
| 52 | + |
| 53 | +**Step 2b: Understand the grain of each parent model/source** |
| 54 | +- What does one row represent? (one customer? one event? one day per customer?) |
| 55 | +- What are the primary/unique keys? |
| 56 | +- This is critical for JOINs — joining on the wrong grain causes fan-out (too many rows) or missing rows |
| 57 | + |
| 58 | +```bash |
| 59 | +altimate-dbt columns --model <name> # existing model columns |
| 60 | +altimate-dbt columns-source --source <src> --table <tbl> # source table columns |
| 61 | +altimate-dbt execute --query "SELECT count(*) FROM {{ ref('<name>') }}" --limit 1 |
| 62 | +altimate-dbt execute --query "SELECT * FROM {{ ref('<name>') }}" --limit 5 |
| 63 | +altimate-dbt column-values --model <name> --column <col> # sample values for key columns |
| 64 | +``` |
| 65 | + |
| 66 | +**Step 2c: Query the actual data to verify your understanding** |
| 67 | +- Check row counts, NULLs, date ranges, cardinality of key columns |
| 68 | +- Verify foreign key relationships actually hold (do all IDs in child exist in parent?) |
| 69 | +- Check for duplicates in what you think are unique keys |
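
The duplicate-key and foreign-key checks above can be expressed as queries passed to `altimate-dbt execute`. The sketch below only builds the SQL strings; the helper names (`duplicate_key_query`, `orphan_key_query`) and the example models and columns (`orders`, `customers`, `order_id`, `customer_id`, `id`) are hypothetical, so adjust them to the project at hand.

```bash
# Hypothetical helpers: they print SQL only; nothing is executed here.

# $1 = model name, $2 = column expected to be unique
duplicate_key_query() {
  printf "SELECT %s, count(*) AS n FROM {{ ref('%s') }} GROUP BY %s HAVING count(*) > 1\n" \
    "$2" "$1" "$2"
}

# $1 = child model, $2 = child FK column, $3 = parent model, $4 = parent key column
orphan_key_query() {
  printf "SELECT count(*) AS orphans FROM {{ ref('%s') }} c LEFT JOIN {{ ref('%s') }} p ON c.%s = p.%s WHERE c.%s IS NOT NULL AND p.%s IS NULL\n" \
    "$1" "$3" "$2" "$4" "$2" "$4"
}

# Usage (model and column names are placeholders):
#   altimate-dbt execute --query "$(duplicate_key_query orders order_id)" --limit 10
#   altimate-dbt execute --query "$(orphan_key_query orders customer_id customers id)" --limit 1
```

Any rows returned by the first query mean the assumed key is not unique. A non-zero `orphans` count from the second means an inner join on that key would silently drop rows.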
| 70 | + |
| 71 | +**Step 2d: Read existing models that your new model will reference** |
| 72 | +- Read the actual SQL of parent models — understand their logic, filters, and transformations |
| 73 | +- Read 2-3 existing models in the same directory to match patterns and conventions |
| 74 | + |
| 75 | +```bash |
| 76 | +glob models/**/*.sql # agent tool (not a shell command): find all model files |
| 77 | +read <model_file> # agent tool (not a shell command): understand existing patterns and logic |
| 78 | +``` |
| 79 | + |
| 80 | +### 3. Write — Follow Layer Patterns |
| 81 | + |
| 82 | +See [references/layer-patterns.md](references/layer-patterns.md) for staging/intermediate/mart templates. |
| 83 | +See [references/medallion-architecture.md](references/medallion-architecture.md) for bronze/silver/gold patterns. |
| 84 | +See [references/incremental-strategies.md](references/incremental-strategies.md) for incremental materialization. |
| 85 | +See [references/yaml-generation.md](references/yaml-generation.md) for sources.yml and schema.yml. |
| 86 | + |
| 87 | +### 4. Validate — Build, Verify, Check Impact |
| 88 | + |
| 89 | +Never stop at writing the SQL. Always validate: |
| 90 | + |
| 91 | +**Build it:** |
| 92 | +```bash |
| 93 | +altimate-dbt compile --model <name> # catch Jinja errors |
| 94 | +altimate-dbt build --model <name> # materialize + run tests |
| 95 | +``` |
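
The compile-then-build sequence above can be wrapped so that a failed compile stops before anything is materialized. This is a sketch under stated assumptions: the function `validate_model` and the `ALTIMATE_DBT` override variable are hypothetical conveniences, and only the `compile --model` and `build --model` invocations come from this guide.

```bash
# Assumption: ALTIMATE_DBT lets the CLI name be overridden (e.g. for dry runs).
ALTIMATE_DBT="${ALTIMATE_DBT:-altimate-dbt}"

# Hypothetical wrapper: $1 = model name
validate_model() {
  model="$1"
  # Compile first so Jinja errors surface before anything is materialized.
  $ALTIMATE_DBT compile --model "$model" || {
    echo "compile failed: $model" >&2
    return 1
  }
  # Then build, which materializes the model and runs its tests.
  $ALTIMATE_DBT build --model "$model" || {
    echo "build failed: $model" >&2
    return 1
  }
}

# Usage (model name is a placeholder):
#   validate_model stg_orders
```

Setting `ALTIMATE_DBT=echo` prints the commands instead of running them, which is a cheap way to check the wrapper itself.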
| 96 | + |
| 97 | +**Verify the output:** |
| 98 | +```bash |
| 99 | +altimate-dbt columns --model <name> # confirm expected columns exist |
| 100 | +altimate-dbt execute --query "SELECT count(*) FROM {{ ref('<name>') }}" --limit 1 |
| 101 | +altimate-dbt execute --query "SELECT * FROM {{ ref('<name>') }}" --limit 10 # spot-check values |
| 102 | +``` |
| 103 | +- Do the columns match what schema.yml or the task expects? |
| 104 | +- Does the row count make sense? (no fan-out from bad joins, no missing rows from wrong filters) |
| 105 | +- Are values correct? (spot-check NULLs, aggregations, date ranges) |
| 106 | + |
| 107 | +**Check SQL quality** (on the compiled SQL from `altimate-dbt compile`): |
| 108 | +- `sql_analyze` — catches anti-patterns (SELECT *, cartesian products, missing filters) |
| 109 | +- `altimate_core_validate` — validates syntax and schema references |
| 110 | +- `altimate_core_column_lineage` — traces how source columns flow to output columns. Use this to verify your SELECT is pulling the right columns from the right sources, especially for complex JOINs or multi-CTE models. |
| 111 | + |
| 112 | +**Check downstream impact** (when modifying an existing model): |
| 113 | +```bash |
| 114 | +altimate-dbt children --model <name> # who depends on this? |
| 115 | +altimate-dbt build --model <name> --downstream # rebuild downstream to catch breakage |
| 116 | +``` |
| 117 | +Use `altimate-dbt children` and `altimate-dbt parents` to verify the DAG is intact when changes could affect downstream models. |
| 118 | + |
| 119 | +## Iron Rules |
| 120 | + |
| 121 | +1. **Never write SQL without reading the source columns first.** Use `altimate-dbt columns` or `altimate-dbt columns-source`. |
| 122 | +2. **Never stop at compile.** Always `altimate-dbt build` to catch runtime errors. |
| 123 | +3. **Match existing patterns.** Read 2-3 existing models in the same directory before writing. |
| 124 | +4. **One model, one purpose.** A staging model should not contain business logic. An intermediate model should not be materialized as a table unless it has consumers. |
| 125 | +5. **Fix ALL errors, not just yours.** After creating/modifying models, run a full `dbt build`. If ANY model fails — even pre-existing ones you didn't touch — fix them. Your job is to leave the project in a fully working state. |
| 126 | + |
| 127 | +## Common Mistakes |
| 128 | + |
| 129 | +| Mistake | Fix | |
| 130 | +|---------|-----| |
| 131 | +| Writing SQL without checking column names | Run `altimate-dbt columns` or `altimate-dbt columns-source` first | |
| 132 | +| Stopping at `compile` — "it compiled, ship it" | Always `altimate-dbt build` to materialize and run tests | |
| 133 | +| Hardcoding table references instead of `{{ ref() }}` | Always use `{{ ref('model') }}` or `{{ source('src', 'table') }}` | |
| 134 | +| Creating a staging model with JOINs | Staging = 1:1 with source. JOINs belong in intermediate or mart | |
| 135 | +| Not checking existing naming conventions | Read existing models in the same directory first | |
| 136 | +| Using `SELECT *` in final models | Explicitly list columns for clarity and contract stability | |
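
Hardcoded table references from the table above can often be spotted with a text search before they reach review. The sketch below is a rough heuristic, not an `altimate-dbt` feature: the function name `find_hardcoded_refs` is hypothetical, and the pattern only flags `FROM` or `JOIN` followed by a dotted identifier such as `analytics.orders`, so it can miss cases or raise false positives.

```bash
# Hypothetical helper: $1 = models directory (defaults to ./models).
# Prints file:line:text for FROM/JOIN clauses that name a dotted
# identifier directly instead of going through ref() or source().
find_hardcoded_refs() {
  models_dir="${1:-models}"
  find "$models_dir" -name '*.sql' -exec \
    grep -HnEi '(from|join)[[:space:]]+[A-Za-z_][A-Za-z0-9_]*\.[A-Za-z_]' {} +
}

# Usage (from the project root):
#   find_hardcoded_refs models
```

Lines that use `{{ ref('model') }}` or `{{ source('src', 'table') }}` are not matched, because the clause is followed by `{{` rather than an identifier.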
| 137 | + |
| 138 | +## Reference Guides |
| 139 | + |
| 140 | +| Guide | Use When | |
| 141 | +|-------|----------| |
| 142 | +| [references/altimate-dbt-commands.md](references/altimate-dbt-commands.md) | Need the full CLI reference | |
| 143 | +| [references/layer-patterns.md](references/layer-patterns.md) | Creating staging, intermediate, or mart models | |
| 144 | +| [references/medallion-architecture.md](references/medallion-architecture.md) | Organizing into bronze/silver/gold layers | |
| 145 | +| [references/incremental-strategies.md](references/incremental-strategies.md) | Converting to incremental materialization | |
| 146 | +| [references/yaml-generation.md](references/yaml-generation.md) | Generating sources.yml or schema.yml | |
| 147 | +| [references/common-mistakes.md](references/common-mistakes.md) | Extended anti-patterns catalog | |