guides/developer/dbt-model-best-practices.mdx
28 additions & 41 deletions
@@ -71,69 +71,56 @@ If you're using a star schema, keep in mind:

 One approach is to maintain your star schema upstream for data modeling purposes, then materialize wide summary tables for specific business use cases as needed. This gives you the best of both worlds: clean data modeling practices upstream and optimized tables for BI consumption.

-## Query performance considerations
+## Optimizing query performance and warehouse costs

-All queries in Lightdash are executed against your data warehouse, so optimizing query performance directly impacts both user experience and warehouse costs.
+All Lightdash queries run against your data warehouse. These strategies help improve performance and reduce costs.

-### Minimize joins at query time
-
-For optimal query performance, handle data transformations and complex logic directly in your SQL models rather than relying heavily on joins at query time. Pre-joining related data during your data modeling process yields better performance than joining tables on-the-fly in dashboards and reports.
-
-If you do need joins, Lightdash offers [fanout protection](/references/joins#sql-fanouts) to help with complex relationships, but wide tables will generally perform better.
+| Strategy | Performance impact | Cost impact |
+|----------|-------------------|-------------|
+| [Materialize as tables](#materialize-models-as-tables) | High | High |
+| [Minimize joins](#minimize-joins-at-query-time) | High | Medium |
+| [Enable caching](#leverage-caching) | Medium | High |
+| [Limit exposed models](#limit-models-exposed-to-the-bi-layer) | Low | Medium |
We recommend materializing your dbt models as [tables](https://docs.getdbt.com/docs/build/materializations#table) instead of views, especially for models that are frequently queried in Lightdash. Views execute the underlying SQL each time they're queried, which increases query time and warehouse costs.
+Views re-execute SQL on every query. Tables store pre-computed results.

 ```yaml
-# In your dbt model config
+# Recommended for frequently queried models
 {{ config(materialized='table') }}
-```
-
-For large datasets, consider using [incremental models](https://docs.getdbt.com/docs/build/incremental-models) to reduce build times while still maintaining table materialization:

-```yaml
+# For large datasets with append-only updates
 {{ config(materialized='incremental') }}
 ```
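
Note on the incremental config: setting `materialized='incremental'` alone does not limit what each run processes; the model usually also needs a filter guarded by dbt's `is_incremental()` macro. A minimal sketch, assuming a hypothetical upstream model `stg_events` with an `event_id` key and an append-only `event_timestamp` column (none of these names come from the guide):

```sql
-- models/fct_events.sql (hypothetical model name)
{{ config(
    materialized='incremental',
    unique_key='event_id'
) }}

select
    event_id,
    user_id,
    event_timestamp
from {{ ref('stg_events') }}

{% if is_incremental() %}
-- On incremental runs, only pick up rows newer than what is already in the table.
where event_timestamp > (select max(event_timestamp) from {{ this }})
{% endif %}
```

On the first run, or with `dbt run --full-refresh`, the filter is skipped and the table is rebuilt from all rows.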

-Run your dbt models on a schedule (e.g., daily or hourly) to keep the materialized tables fresh while reducing the query load on your warehouse.
+Schedule dbt runs (daily/hourly) to keep tables fresh while avoiding on-demand computation.
+
+### Minimize joins at query time
+
+Pre-join data in your dbt models rather than joining at query time. Wide, flat tables outperform runtime joins, even with Lightdash's [fanout protection](/references/joins#sql-fanouts).
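
Note on pre-joining: one way to read this advice is a wide model that performs the join once at build time. A minimal sketch, assuming hypothetical `orders` and `customers` models related many-to-one on `customer_id` (model and column names are illustrative, not from the guide):

```sql
-- models/orders_wide.sql (hypothetical model name)
{{ config(materialized='table') }}

select
    o.order_id,
    o.ordered_at,
    o.amount,
    c.customer_id,
    c.customer_name,
    c.customer_segment
from {{ ref('orders') }} as o
-- Many-to-one join: each order keeps exactly one row, so measures on orders are not duplicated.
left join {{ ref('customers') }} as c
    on o.customer_id = c.customer_id
```

Queries in the BI layer can then read customer fields straight from `orders_wide` without a join at query time.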

 ### Leverage caching

-Lightdash supports [caching](/guides/developer/caching) to reduce the number of queries executed against your warehouse. Popular charts and dashboards load faster when caching is enabled, and subsequent visits use cached results instead of querying the warehouse again.
+[Caching](/guides/developer/caching) stores query results so repeat visits skip the warehouse entirely. Most effective for:

-Caching is particularly effective for:
 - Frequently accessed dashboards
-- Charts with stable queries (no dynamic time filters with second precision)
+- Charts without dynamic time filters
 - Scheduled deliveries

-### Reduce warehouse costs
-
-Since all Lightdash queries run against your data warehouse, consider these strategies to manage costs:
-
-- **Materialize frequently queried models as tables** to avoid repeated computation
-- **Use incremental models** for large datasets to reduce build times
-- **Enable caching** to reduce redundant queries
-- **Build wide, flat tables** to minimize expensive join operations at query time
-- **Schedule dbt runs** during off-peak hours when warehouse compute is cheaper (if your warehouse supports this)
-
-### Monitor query usage
-
-Use [query tags](/references/workspace/usage-analytics#query-tags) to monitor and analyze queries coming from Lightdash. Query tags help you:
-
-- **Identify heavily queried tables** that may benefit from materialization or indexing
-- **Spot expensive query execution plans** that could be optimized
-- **Track usage patterns** to inform decisions about caching and model structure
-- **Attribute warehouse costs** to specific dashboards, charts, or users
-
 ### Limit models exposed to the BI layer
### Limit models exposed to the BI layer

-Not every dbt model needs to be available in Lightdash. Limiting the models exposed to end users helps reduce confusion and ensures users are querying optimized tables.
+Only surface production-ready models to end users:

-**Option 1: Use dbt model tags**
+- **[dbt tags](/get-started/develop-in-lightdash/adding-tables-to-lightdash#limiting-the-tables-in-lightdash-using-dbt-tags)**: Control which models appear in Lightdash
+- **[User attributes](/references/workspace/user-attributes)**: Restrict model access by role
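
Note on dbt tags: a tag can be attached in the same `config` block that sets the materialization. A minimal sketch, assuming the Lightdash project is set up to show only models carrying a given tag as described in the linked guide; the tag name `lightdash` and the model names are hypothetical:

```sql
-- models/orders_wide.sql (hypothetical model name)
-- The tag name 'lightdash' is an assumption; use whichever tag your project filters on.
{{ config(
    materialized='table',
    tags=['lightdash']
) }}

select * from {{ ref('stg_orders') }}
```

Models without the tag still build in dbt as usual; under this assumed setup they simply would not be surfaced to end users.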

-Use [dbt tags to limit which tables appear in Lightdash](/get-started/develop-in-lightdash/adding-tables-to-lightdash#limiting-the-tables-in-lightdash-using-dbt-tags). This allows you to explicitly control which models are surfaced in the BI layer, ensuring only production-ready, optimized models are available for querying.
+### Monitor query usage

-**Option 2: Use user attributes**
+[Query tags](/references/workspace/usage-analytics#query-tags) help you identify optimization opportunities:

-Use [user attributes](/references/workspace/user-attributes) to restrict access to specific models based on user roles or groups. This approach lets you limit end users to only the models that have been optimized for query performance, while giving power users or analysts access to a broader set of tables when needed.