|
| 1 | +# CLI |
| 2 | + |
| 3 | +This page works through the SQLMesh example project using the SQLMesh command-line interface. |
| 4 | + |
| 5 | +## 1. Create the SQLMesh project |
| 6 | +First, create a project directory and navigate to it: |
| 7 | + |
| 8 | +```bash |
| 9 | +mkdir sqlmesh-example |
| 10 | +``` |
| 11 | +```bash |
| 12 | +cd sqlmesh-example |
| 13 | +``` |
| 14 | + |
| 15 | +If you are using a Python virtual environment, make sure it is activated first by running the `source .env/bin/activate` command from the folder used during [installation](../installation.md). |
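
If you are unsure whether a virtual environment is active, a quick check like the one below can help. This is a minimal sketch: `venv_status` is a hypothetical helper written for this page, not a SQLMesh command. It relies only on the `VIRTUAL_ENV` variable that standard Python `activate` scripts export.

```bash
# Hypothetical helper (not part of SQLMesh): report whether a Python
# virtual environment is active in the current shell.
venv_status() {
  # Standard activate scripts export VIRTUAL_ENV; it is unset or empty otherwise.
  if [ -n "${VIRTUAL_ENV:-}" ]; then
    echo "virtual environment active: $VIRTUAL_ENV"
  else
    echo "no virtual environment active"
  fi
}

venv_status
```

If the helper reports that no environment is active, run the `source .env/bin/activate` command described above and check again.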
| 16 | + |
| 17 | +Create a SQLMesh scaffold with the following command: |
| 18 | + |
| 19 | +```bash |
| 20 | +sqlmesh init |
| 21 | +``` |
| 22 | + |
| 23 | +See the [quick start overview](../quick_start.md#project-directories-and-files) for more information about the project directories, files, data, and models. |
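
Before moving on, you may want to confirm that the scaffold was created in the current directory. The sketch below checks only for the two files this page refers to later (`models/incremental_model.sql` and `tests/test_full_model.yaml`); `check_project_files` is a hypothetical helper, and the scaffold contains other files that it does not inspect.

```bash
# Hypothetical helper (not part of SQLMesh): check that the files this
# walkthrough refers to exist under the given project directory.
check_project_files() {
  dir="${1:-.}"   # project directory, defaults to the current one
  missing=0
  for f in models/incremental_model.sql tests/test_full_model.yaml; do
    if [ -f "$dir/$f" ]; then
      echo "found: $f"
    else
      echo "missing: $f"
      missing=1
    fi
  done
  # Return non-zero if any expected file is absent.
  return "$missing"
}

check_project_files . || echo "scaffold looks incomplete; was 'sqlmesh init' run in this directory?"
```

The function returns a non-zero status when a file is missing, so it can also be used as a guard in a setup script.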
| 24 | + |
| 25 | +## 2. Plan and apply environments |
| 26 | +### 2.1 Create a prod environment |
| 27 | + |
| 28 | +SQLMesh's key actions are creating and applying *plans* to *environments*. At this point, the only environment is the empty `prod` environment. |
| 29 | + |
| 30 | +The first SQLMesh plan must execute every model to populate the production environment. Running `sqlmesh plan` will generate the plan and the following output: |
| 31 | + |
| 32 | +```bash linenums="1" |
| 33 | +$ sqlmesh plan |
| 34 | +====================================================================== |
| 35 | +Successfully Ran 1 tests against duckdb |
| 36 | +---------------------------------------------------------------------- |
| 37 | +New environment `prod` will be created from `prod` |
| 38 | +Summary of differences against `prod`: |
| 39 | +└── Added Models: |
| 40 | + ├── sqlmesh_example.seed_model |
| 41 | + ├── sqlmesh_example.incremental_model |
| 42 | + └── sqlmesh_example.full_model |
| 43 | +Models needing backfill (missing dates): |
| 44 | +├── sqlmesh_example.full_model: 2020-01-01 - 2023-05-31 |
| 45 | +├── sqlmesh_example.incremental_model: 2020-01-01 - 2023-05-31 |
| 46 | +└── sqlmesh_example.seed_model: 2023-05-31 - 2023-05-31 |
| 47 | +Apply - Backfill Tables [y/n]: |
| 48 | +``` |
| 49 | + |
| 50 | +Line 3 of the output notes that `sqlmesh plan` successfully executed the project's test `tests/test_full_model.yaml` with duckdb. |
| 51 | + |
| 52 | +Line 5 describes which environments the plan will affect when applied: a new `prod` environment in this case. |
| 53 | + |
| 54 | +Lines 7-10 of the output show that SQLMesh detected three new models relative to the current empty environment. |
| 55 | + |
| 56 | +Lines 11-14 list each model that will be executed by the plan, along with the date intervals that will be run. Note that `full_model` and `incremental_model` both show `2020-01-01` as their start date because: |
| 57 | + |
| 58 | +1. The incremental model specifies that date in the `start` property of its `MODEL` statement and |
| 59 | +2. The full model depends on the incremental model. |
| 60 | + |
| 61 | +The `seed_model` date range begins on the same day the plan was made because `SEED` models have no temporality associated with them other than whether they have been modified since the previous SQLMesh plan. |
| 62 | + |
| 63 | +Line 15 asks you whether to proceed with executing the model backfills described in lines 11-14. Enter `y` and press `Enter`, and SQLMesh will execute the models and return this output: |
| 64 | + |
| 65 | +```bash linenums="1" |
| 66 | +Apply - Backfill Tables [y/n]: y |
| 67 | + sqlmesh_example.seed_model ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 1/1 • 0:00:00 |
| 68 | +sqlmesh_example.incremental_model ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 1/1 • 0:00:00 |
| 69 | + sqlmesh_example.full_model ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 1/1 • 0:00:00 |
| 70 | + |
| 71 | +All model batches have been executed successfully |
| 72 | + |
| 73 | +Virtually Updating 'prod' ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 0:00:00 |
| 74 | + |
| 75 | +The target environment has been updated successfully |
| 76 | +``` |
| 77 | + |
| 78 | +Lines 2-4 show the completion percentage and run time for each model (very fast in this simple example). Line 8 shows that the `prod` environment now points to the tables created during model execution. |
| 79 | + |
| 80 | +You've now created a new production environment with its full history backfilled. |
| 81 | + |
| 82 | +### 2.2 Create a dev environment |
| 83 | +Now that you've created a production environment, it's time to create a development environment so that you can modify models without affecting production. Run `sqlmesh plan dev` to create a development environment called `dev`: |
| 84 | + |
| 85 | +```bash linenums="1" |
| 86 | +$ sqlmesh plan dev |
| 87 | +====================================================================== |
| 88 | +Successfully Ran 1 tests against duckdb |
| 89 | +---------------------------------------------------------------------- |
| 90 | +New environment `dev` will be created from `prod` |
| 91 | +Apply - Virtual Update [y/n]: |
| 92 | +``` |
| 93 | + |
| 94 | +The output does not list any added or modified models because `dev` is being created from the existing `prod` environment without modification. |
| 95 | + |
| 96 | +Line 6 shows that when you apply the plan creating the `dev` environment, it will only involve a Virtual Update. This is because SQLMesh is able to safely reuse the tables you've already backfilled in the `prod` environment. Enter `y` and press `Enter` to perform the Virtual Update: |
| 97 | + |
| 98 | +```bash linenums="1" |
| 99 | +Apply - Virtual Update [y/n]: y |
| 100 | +Virtually Updating 'dev' ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 0:00:00 |
| 101 | + |
| 102 | +The target environment has been updated successfully |
| 103 | + |
| 104 | + |
| 105 | +Virtual Update executed successfully |
| 106 | +``` |
| 107 | + |
| 108 | +The output confirms that the `dev` environment has been updated successfully. |
| 109 | + |
| 110 | +## 3. Make your first update |
| 111 | + |
| 112 | +Now that we have populated both `prod` and `dev` environments, let's modify one of the SQL models, validate it in `dev`, and push it to `prod`. |
| 113 | + |
| 114 | +### 3.1 Edit the model |
| 115 | +We modify the incremental SQL model by adding a new column to the query. Open the `models/incremental_model.sql` file and add `#!sql 'z' AS new_column` below `item_id` as follows: |
| 116 | + |
| 117 | +```sql linenums="1" hl_lines="13" |
| 118 | +MODEL ( |
| 119 | + name sqlmesh_example.incremental_model, |
| 120 | + kind INCREMENTAL_BY_TIME_RANGE ( |
| 121 | + time_column ds |
| 122 | + ), |
| 123 | + start '2020-01-01', |
| 124 | + cron '@daily', |
| 125 | +); |
| 126 | + |
| 127 | +SELECT |
| 128 | + id, |
| 129 | + item_id, |
| 130 | + 'z' AS new_column, -- Added column |
| 131 | + ds, |
| 132 | +FROM |
| 133 | + sqlmesh_example.seed_model |
| 134 | +WHERE |
| 135 | + ds between @start_ds and @end_ds |
| 136 | +``` |
| 137 | + |
| 138 | +## 4. Plan and apply updates |
| 139 | +We can preview the impact of the change using the `sqlmesh plan dev` command: |
| 140 | + |
| 141 | +```bash linenums="1" |
| 142 | +$ sqlmesh plan dev |
| 143 | +====================================================================== |
| 144 | +Successfully Ran 1 tests against duckdb |
| 145 | +---------------------------------------------------------------------- |
| 146 | +Summary of differences against `dev`: |
| 147 | +├── Directly Modified: |
| 148 | +│ └── sqlmesh_example.incremental_model |
| 149 | +└── Indirectly Modified: |
| 150 | + └── sqlmesh_example.full_model |
| 151 | +--- |
| 152 | + |
| 153 | ++++ |
| 154 | + |
| 155 | +@@ -1,6 +1,7 @@ |
| 156 | + |
| 157 | + SELECT |
| 158 | + id, |
| 159 | + item_id, |
| 160 | ++ 'z' AS new_column, |
| 161 | + ds |
| 162 | + FROM sqlmesh_example.seed_model |
| 163 | + WHERE |
| 164 | +Directly Modified: sqlmesh_example.incremental_model (Non-breaking) |
| 165 | +└── Indirectly Modified Children: |
| 166 | + └── sqlmesh_example.full_model |
| 167 | +Models needing backfill (missing dates): |
| 168 | +└── sqlmesh_example__dev.incremental_model: 2020-01-01 - 2023-05-31 |
| 169 | +Enter the backfill start date (eg. '1 year', '2020-01-01') or blank to backfill from the beginning of history: |
| 170 | +``` |
| 171 | + |
| 172 | +Lines 5-9 of the output summarize the differences between the modified project components and the existing `dev` environment, detecting that we directly modified `incremental_model` and that `full_model` was indirectly modified because it selects from the incremental model. |
| 173 | + |
| 174 | +Line 23 shows that SQLMesh recognized the change as additive (it added a column not used by `full_model`) and automatically classified it as a non-breaking change. |
| 175 | + |
| 176 | +Hit `Enter` at the prompt to backfill data from our start date `2020-01-01`. Another prompt will appear asking for a backfill end date; hit `Enter` to backfill until now. Finally, enter `y` and press `Enter` to apply the plan and execute the backfill: |
| 177 | + |
| 178 | +```bash linenums="1" |
| 179 | +Enter the backfill start date (eg. '1 year', '2020-01-01') or blank to backfill from the beginning of history: |
| 180 | +Enter the backfill end date (eg. '1 month ago', '2020-01-01') or blank to backfill up until now: |
| 181 | +Apply - Backfill Tables [y/n]: y |
| 182 | +sqlmesh_example__dev.incremental_model ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 1/1 • 0:00:00 |
| 183 | + |
| 184 | +All model batches have been executed successfully |
| 185 | + |
| 186 | +Virtually Updating 'dev' ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 0:00:00 |
| 187 | + |
| 188 | +The target environment has been updated successfully |
| 189 | +``` |
| 190 | +
|
| 191 | +SQLMesh applies the change to `sqlmesh_example.incremental_model` and backfills the model. SQLMesh did not need to backfill `sqlmesh_example.full_model` since the change was `non-breaking`. |
| 192 | +
|
| 193 | +### 4.1 Validate updates in dev |
| 194 | +You can now view this change by querying data from `incremental_model` with `sqlmesh fetchdf "select * from sqlmesh_example__dev.incremental_model"`. |
| 195 | +
|
| 196 | +Note that the environment name `__dev` is appended to the schema namespace `sqlmesh_example` in the query: `select * from sqlmesh_example__dev.incremental_model`. |
| 197 | +
|
| 198 | +```bash |
| 199 | +$ sqlmesh fetchdf "select * from sqlmesh_example__dev.incremental_model" |
| 200 | + |
| 201 | + id item_id new_column ds |
| 202 | +0 1 2 z 2020-01-01 |
| 203 | +1 2 1 z 2020-01-01 |
| 204 | +2 3 3 z 2020-01-03 |
| 205 | +3 4 1 z 2020-01-04 |
| 206 | +4 5 1 z 2020-01-05 |
| 207 | +5 6 1 z 2020-01-06 |
| 208 | +6 7 1 z 2020-01-07 |
| 209 | +``` |
| 210 | +
|
| 211 | +You can see that `new_column` was added to the dataset. The production table was not modified; you can validate this by querying the production table using `sqlmesh fetchdf "select * from sqlmesh_example.incremental_model"`. |
| 212 | +
|
| 213 | +Note that nothing has been appended to the schema namespace `sqlmesh_example` because `prod` is the default environment. |
| 214 | +
|
| 215 | +```bash |
| 216 | +$ sqlmesh fetchdf "select * from sqlmesh_example.incremental_model" |
| 217 | + |
| 218 | + id item_id ds |
| 219 | +0 1 2 2020-01-01 |
| 220 | +1 2 1 2020-01-01 |
| 221 | +2 3 3 2020-01-03 |
| 222 | +3 4 1 2020-01-04 |
| 223 | +4 5 1 2020-01-05 |
| 224 | +5 6 1 2020-01-06 |
| 225 | +6 7 1 2020-01-07 |
| 226 | +``` |
| 227 | +
|
| 228 | +The production table does not have `new_column` because the changes to `dev` have not yet been applied to `prod`. |
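
The schema naming used in the two queries above can be summarized with a small shell sketch. `env_schema` is a hypothetical helper written for illustration, not part of SQLMesh; it only reproduces the names shown on this page, where the environment name is appended after `__` for every environment except `prod`.

```bash
# Hypothetical helper (not part of SQLMesh): build the schema name that a
# model is queried under in a given environment, as shown in this walkthrough.
env_schema() {
  schema="$1"   # base schema, e.g. sqlmesh_example
  env="$2"      # environment name, e.g. dev or prod
  if [ "$env" = "prod" ]; then
    # prod is the default environment, so nothing is appended
    printf '%s\n' "$schema"
  else
    printf '%s__%s\n' "$schema" "$env"
  fi
}

env_schema sqlmesh_example dev    # prints sqlmesh_example__dev
env_schema sqlmesh_example prod   # prints sqlmesh_example
```

For example, the `dev` query above could be written as `sqlmesh fetchdf "select * from $(env_schema sqlmesh_example dev).incremental_model"`.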
| 229 | +
|
| 230 | +### 4.2 Apply updates to prod |
| 231 | +Now that we've tested the changes in dev, it's time to move them to prod. Run `sqlmesh plan` to plan and apply your changes to the prod environment. |
| 232 | +
|
| 233 | +Enter `y` and press `Enter` at the `Apply - Virtual Update [y/n]:` prompt to apply the plan and perform the Virtual Update:
| 234 | +
|
| 235 | +```bash |
| 236 | +$ sqlmesh plan |
| 237 | +====================================================================== |
| 238 | +Successfully Ran 1 tests against duckdb |
| 239 | +---------------------------------------------------------------------- |
| 240 | +Summary of differences against `prod`: |
| 241 | +├── Directly Modified: |
| 242 | +│ └── sqlmesh_example.incremental_model |
| 243 | +└── Indirectly Modified: |
| 244 | + └── sqlmesh_example.full_model |
| 245 | +--- |
| 246 | + |
| 247 | ++++ |
| 248 | + |
| 249 | +@@ -1,6 +1,7 @@ |
| 250 | + |
| 251 | + SELECT |
| 252 | + id, |
| 253 | + item_id, |
| 254 | ++ 'z' AS new_column, |
| 255 | + ds |
| 256 | + FROM (VALUES |
| 257 | + (1, 1, '2020-01-01'), |
| 258 | +Directly Modified: sqlmesh_example.incremental_model (Non-breaking) |
| 259 | +└── Indirectly Modified Children: |
| 260 | + └── sqlmesh_example.full_model |
| 261 | +Apply - Virtual Update [y/n]: y |
| 262 | +Virtually Updating 'prod' ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 0:00:00 |
| 263 | + |
| 264 | +The target environment has been updated successfully |
| 265 | + |
| 266 | + |
| 267 | +Virtual Update executed successfully |
| 268 | +``` |
| 269 | +
|
| 270 | +Note that a backfill was not necessary and only a Virtual Update occurred. |
| 271 | +
|
| 272 | +### 4.3 Validate updates in prod |
| 273 | +Double-check that the data was updated in `prod` by running `sqlmesh fetchdf "select * from sqlmesh_example.incremental_model"`:
| 274 | +
|
| 275 | +```bash |
| 276 | +$ sqlmesh fetchdf "select * from sqlmesh_example.incremental_model" |
| 277 | + |
| 278 | + id item_id new_column ds |
| 279 | +0 1 2 z 2020-01-01 |
| 280 | +1 2 1 z 2020-01-01 |
| 281 | +2 3 3 z 2020-01-03 |
| 282 | +3 4 1 z 2020-01-04 |
| 283 | +4 5 1 z 2020-01-05 |
| 284 | +5 6 1 z 2020-01-06 |
| 285 | +6 7 1 z 2020-01-07 |
| 286 | +``` |
| 287 | +
|
| 288 | +## 5. Next steps |
| 289 | +
|
| 290 | +Congratulations, you've now conquered the basics of using SQLMesh! |
| 291 | +
|
| 292 | +* [Learn more about SQLMesh concepts](../concepts/overview.md) |
| 293 | +* [Join our Slack community](https://tobikodata.com/slack) |