Commit d4d269b

docs: multi-stage grain directive for measures (follow-up to #10957)
1 parent 8fc0cf8 commit d4d269b

4 files changed

Lines changed: 94 additions & 129 deletions

docs-mintlify/docs/data-modeling/measures.mdx

Lines changed: 23 additions & 13 deletions
Original file line number | Diff line number | Diff line change
@@ -307,8 +307,8 @@ periods.
307307

308308
### Percent of total (fixed dimension)
309309

310-
Use the [`group_by`][ref-group-by] parameter to fix the inner aggregation to
311-
specific dimensions, enabling percent-of-total calculations:
310+
Use the [`grain`][ref-grain] parameter with `keep_only` to fix the inner
311+
aggregation to specific dimensions, enabling percent-of-total calculations:
312312

313313
```yaml
314314
measures:
@@ -320,8 +320,9 @@ measures:
320320
multi_stage: true
321321
sql: "{revenue}"
322322
type: sum
323-
group_by:
324-
- country
323+
grain:
324+
keep_only:
325+
- country
325326
326327
- name: country_revenue_percentage
327328
multi_stage: true
@@ -371,7 +372,7 @@ The `filter` parameter requires the [Tesseract SQL planner][ref-tesseract-env]
371372

372373
### Nested aggregates
373374

374-
Use the [`add_group_by`][ref-add-group-by] parameter to compute an aggregate
375+
Use the [`grain`][ref-grain] parameter with `include` to compute an aggregate
375376
of an aggregate, e.g., the average of per-customer averages:
376377

377378
```yaml
@@ -384,13 +385,15 @@ measures:
384385
multi_stage: true
385386
sql: "{avg_order_value}"
386387
type: avg
387-
add_group_by:
388-
- customer_id
388+
grain:
389+
include:
390+
- customer_id
389391
```
390392

391393
### Ranking
392394

393-
Use the [`reduce_by`][ref-reduce-by] parameter to rank items within groups:
395+
Use the [`grain`][ref-grain] parameter with `exclude` to rank items within
396+
groups:
394397

395398
```yaml
396399
measures:
@@ -403,11 +406,20 @@ measures:
403406
order_by:
404407
- sql: "{revenue}"
405408
dir: asc
406-
reduce_by:
407-
- product
409+
grain:
410+
exclude:
411+
- product
408412
type: rank
409413
```
410414

415+
<Note>
416+
417+
`grain` replaces the standalone `group_by`, `reduce_by`, and `add_group_by`
418+
parameters, which remain supported. See the [`grain`][ref-grain] reference for
419+
the migration mapping.
420+
421+
</Note>
422+
411423
### Conditional measures
412424

413425
Conditional measures depend on the value of a dimension, using the
@@ -463,9 +475,7 @@ measures:
463475
[ref-format]: /reference/data-modeling/measures#format
464476
[ref-rolling-window]: /reference/data-modeling/measures#rolling_window
465477
[ref-time-shift]: /reference/data-modeling/measures#time_shift
466-
[ref-group-by]: /reference/data-modeling/measures#group_by
467-
[ref-reduce-by]: /reference/data-modeling/measures#reduce_by
468-
[ref-add-group-by]: /reference/data-modeling/measures#add_group_by
478+
[ref-grain]: /reference/data-modeling/measures#grain
469479
[ref-filter]: /reference/data-modeling/measures#filter
470480
[ref-case]: /reference/data-modeling/measures#case
471481
[ref-switch-dim]: /reference/data-modeling/dimensions#type
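
The note added in this file says that `grain` replaces `group_by`, `reduce_by`, and `add_group_by`. As a rough illustration of that mapping, the sketch below rewrites a measure definition held as a plain JavaScript object. `migrateToGrain` and `LEGACY_TO_GRAIN` are hypothetical names invented for this example and are not part of Cube; only the three key pairs come from the commit.

```javascript
// Hypothetical helper, not a Cube API: rewrites legacy multi-stage
// parameters into the `grain` object described in this commit.
const LEGACY_TO_GRAIN = new Map([
  ["group_by", "keep_only"],
  ["reduce_by", "exclude"],
  ["add_group_by", "include"],
]);

function migrateToGrain(measure) {
  const migrated = {};
  // Start from any `grain` the measure already has (assumption: legacy
  // values overwrite a same-named key if both are present).
  const grain = { ...(measure.grain ?? {}) };

  for (const [key, value] of Object.entries(measure)) {
    if (key === "grain") continue;
    if (LEGACY_TO_GRAIN.has(key)) {
      grain[LEGACY_TO_GRAIN.get(key)] = value;
    } else {
      migrated[key] = value;
    }
  }

  if (Object.keys(grain).length > 0) {
    migrated.grain = grain;
  }
  return migrated;
}

// Example: the percent-of-total helper measure from the diff above.
const legacyMeasure = {
  multi_stage: true,
  sql: "{revenue}",
  type: "sum",
  group_by: ["country"],
};
const migratedMeasure = migrateToGrain(legacyMeasure);
```

Because the legacy parameters remain supported, a rewrite like this would be optional; it only shows that the change is a mechanical rename plus one level of nesting.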

docs-mintlify/recipes/data-modeling/share-of-total.mdx

Lines changed: 25 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -58,10 +58,10 @@ When the share measure needs to be part of the semantic model — so it is
5858
returned by the API, visible in Explore, or accessible to AI agents — define
5959
it using multi-stage measures powered by [Tesseract][link-tesseract].
6060

61-
The key building block is the [`group_by`][ref-group-by] parameter: when set
62-
to an empty list, the inner aggregation stage groups by _nothing_, computing
63-
the grand total across all rows. The outer stage then joins that total back and
64-
groups by the query's dimensions as usual.
61+
The key building block is the [`grain`][ref-grain] parameter with `keep_only`:
62+
when set to an empty list, the inner aggregation stage groups by _nothing_,
63+
computing the grand total across all rows. The outer stage then joins that total
64+
back and groups by the query's dimensions as usual.
6565

6666
<Warning>
6767

@@ -76,15 +76,16 @@ Calculating share of total requires three measures:
7676

7777
1. A **base measure** — the regular aggregate, e.g., `total_sale_price`.
7878
2. A **helper measure** — a multi-stage measure that re-aggregates the base
79-
measure with `group_by: []`, fixing the inner `GROUP BY` to nothing (the
80-
grand total). This measure is internal and should be hidden from views.
79+
measure with `grain` set to `keep_only: []`, fixing the inner `GROUP BY` to
80+
nothing (the grand total). This measure is internal and should be hidden from
81+
views.
8182
3. A **ratio measure** — a multi-stage measure that divides the base by the
8283
helper total.
8384

8485
The examples below extend the `order_items` cube from the
8586
[ecommerce demo model][link-ecommerce-demo]. The `brand` and `category`
8687
dimensions are proxied from the joined `products` cube so they can be
87-
referenced by `group_by`.
88+
referenced by `grain`.
8889

8990
### Share of grand total
9091

@@ -122,7 +123,8 @@ cubes:
122123
multi_stage: true
123124
sql: "{total_sale_price}"
124125
type: sum
125-
group_by: []
126+
grain:
127+
keep_only: []
126128

127129
- name: revenue_share
128130
multi_stage: true
@@ -165,7 +167,9 @@ cube(`order_items`, {
165167
multi_stage: true,
166168
sql: `${total_sale_price}`,
167169
type: `sum`,
168-
group_by: []
170+
grain: {
171+
keep_only: []
172+
}
169173
},
170174

171175
revenue_share: {
@@ -180,7 +184,7 @@ cube(`order_items`, {
180184

181185
</CodeGroup>
182186

183-
`group_by: []` tells Tesseract that the inner stage for `total_revenue_grand_total`
187+
`keep_only: []` tells Tesseract that the inner stage for `total_revenue_grand_total`
184188
should group by no dimensions, producing a single grand-total row. The outer stage
185189
joins it back and groups by whatever dimensions are in the query (e.g., `brand`),
186190
so every row receives the same total denominator.
@@ -216,9 +220,9 @@ Sometimes you want each row's share _within a category_ rather than the
216220
overall total — for example, each brand's share of its product category's
217221
revenue.
218222

219-
Use `group_by` with the dimension you want to _fix_ as the subtotal boundary.
220-
The inner stage will group only by that dimension, and the outer stage will
221-
group by the full set of query dimensions:
223+
Use `grain` with `keep_only` set to the dimension you want to _fix_ as the
224+
subtotal boundary. The inner stage will group only by that dimension, and the
225+
outer stage will group by the full set of query dimensions:
222226

223227
<CodeGroup>
224228

@@ -256,8 +260,9 @@ cubes:
256260
multi_stage: true
257261
sql: "{total_sale_price}"
258262
type: sum
259-
group_by:
260-
- category
263+
grain:
264+
keep_only:
265+
- category
261266

262267
- name: revenue_share_of_category
263268
multi_stage: true
@@ -304,7 +309,9 @@ cube(`order_items`, {
304309
multi_stage: true,
305310
sql: `${total_sale_price}`,
306311
type: `sum`,
307-
group_by: [`category`]
312+
grain: {
313+
keep_only: [`category`]
314+
}
308315
},
309316

310317
revenue_share_of_category: {
@@ -319,7 +326,7 @@ cube(`order_items`, {
319326

320327
</CodeGroup>
321328

322-
With `group_by: [category]`, the inner stage computes revenue per category.
329+
With `keep_only: [category]`, the inner stage computes revenue per category.
323330
The outer stage groups by both `category` and `brand`, so each brand row
324331
divides its revenue by the right category total. Exclude
325332
`category_revenue_grand_total` from the view the same way as shown above.
@@ -348,7 +355,7 @@ override)][ref-share-filter].
348355

349356
[link-tesseract]: https://cube.dev/blog/introducing-next-generation-data-modeling-engine
350357
[link-ecommerce-demo]: https://github.com/cubedevinc/ecommerce_demo
351-
[ref-group-by]: /reference/data-modeling/measures#group_by
358+
[ref-grain]: /reference/data-modeling/measures#grain
352359
[ref-filter]: /reference/data-modeling/measures#filter
353360
[ref-share-filter]: /docs/data-modeling/measures#share-of-total-filter-override
354361
[ref-dynamic-params]: /recipes/data-modeling/passing-dynamic-parameters-in-a-query
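
To make the two-stage behavior described in this recipe concrete, here is a small in-memory sketch of what `keep_only: []` and `keep_only: [category]` compute. It does not use Cube or Tesseract; the sample rows and the `sumBy` and `revenueShare` functions are made up for illustration, and the real engine does this in SQL.

```javascript
// Made-up sample data standing in for the order_items cube.
const rows = [
  { category: "Shoes", brand: "A", sale_price: 60 },
  { category: "Shoes", brand: "B", sale_price: 40 },
  { category: "Hats", brand: "A", sale_price: 100 },
];

// Sum sale_price grouped by the given dimensions.
function sumBy(data, dims) {
  const totals = new Map();
  for (const row of data) {
    const key = JSON.stringify(dims.map((d) => row[d]));
    totals.set(key, (totals.get(key) ?? 0) + row.sale_price);
  }
  return totals;
}

// Divide the base measure (grouped by the query dimensions) by the helper
// total (grouped only by the keepOnly dimensions).
function revenueShare(data, queryDims, keepOnly) {
  const base = sumBy(data, queryDims);
  const helper = sumBy(data, keepOnly);
  const seen = new Set();
  const result = [];
  for (const row of data) {
    const baseKey = JSON.stringify(queryDims.map((d) => row[d]));
    if (seen.has(baseKey)) continue; // one output row per query group
    seen.add(baseKey);
    const helperKey = JSON.stringify(keepOnly.map((d) => row[d]));
    const out = {};
    for (const d of queryDims) out[d] = row[d];
    out.share = base.get(baseKey) / helper.get(helperKey);
    result.push(out);
  }
  return result;
}

// keep_only: [] -> every row is divided by the grand total (200).
const shareOfGrandTotal = revenueShare(rows, ["category", "brand"], []);
// keep_only: [category] -> each row is divided by its category total.
const shareOfCategory = revenueShare(rows, ["category", "brand"], ["category"]);
```

With this data, the first call divides each brand's revenue by the same grand total, while the second divides it by the total of its own category, which is the distinction the recipe draws between the two helper measures.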

docs-mintlify/recipes/data-modeling/xirr.mdx

Lines changed: 6 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -72,8 +72,9 @@ cubes:
7272
multi_stage: true
7373
sql: "XIRR({total_payments}, {date__day})"
7474
type: number_agg
75-
add_group_by:
76-
- date__day
75+
grain:
76+
include:
77+
- date__day
7778

7879
pre_aggregations:
7980
- name: main_xirr
@@ -122,9 +123,9 @@ cube(`payments`, {
122123
multi_stage: true,
123124
sql: `XIRR(${CUBE.total_payments}, ${CUBE.date__day})`,
124125
type: `number_agg`,
125-
add_group_by: [
126-
date__day
127-
]
126+
grain: {
127+
include: [date__day]
128+
}
128129
}
129130
},
130131

docs-mintlify/reference/data-modeling/measures.mdx

Lines changed: 40 additions & 93 deletions
Original file line numberDiff line numberDiff line change
@@ -639,15 +639,21 @@ cube(`time_shift`, {
639639

640640
</CodeGroup>
641641

642-
### `group_by`
642+
### `grain`
643643

644-
The `group_by` parameter is used with [multi-stage measures][ref-multi-stage] to specify
645-
dimensions that should be used for the `GROUP BY` of the inner aggregation stage,
646-
*ignoring* any dimensions present in the query.
644+
The `grain` parameter is used with [multi-stage measures][ref-multi-stage] to control the
645+
dimensions of the inner aggregation stage's `GROUP BY` — the *grain* at which the base
646+
measure is computed before the outer aggregation is applied. It accepts an object with
647+
three keys, each taking a list of dimension names from the same cube:
647648

648-
This is commonly used for fixed dimension calculations — computing a measure at a fixed
649-
granularity regardless of the query's dimensions. For example, calculating percent of
650-
total or comparing individual items to a broader dataset.
649+
- `keep_only` — group the inner stage by *only* the listed dimensions, ignoring the
650+
query's dimensions. Use it for fixed-grain calculations such as percent of total.
651+
- `exclude` — group the inner stage by the query's dimensions *minus* the listed
652+
dimensions. Use it for ranking within groups.
653+
- `include` — group the inner stage by the query's dimensions *plus* the listed
654+
dimensions. Use it for nested aggregates (an aggregate of an aggregate).
655+
656+
`keep_only` and `exclude` are mutually exclusive.
651657

652658
<CodeGroup>
653659

@@ -657,8 +663,9 @@ measures:
657663
multi_stage: true
658664
sql: "{revenue}"
659665
type: sum
660-
group_by:
661-
- country
666+
grain:
667+
keep_only:
668+
- country
662669
```
663670
664671
```javascript title="JavaScript"
@@ -667,104 +674,44 @@ measures: {
667674
multi_stage: true,
668675
sql: `${revenue}`,
669676
type: `sum`,
670-
group_by: [country]
677+
grain: {
678+
keep_only: [country]
679+
}
671680
}
672681
}
673682
```
674683

675684
</CodeGroup>
676685

677-
`group_by` accepts a list of dimension names from the same cube. The inner stage will
678-
group by *only* these dimensions, while the outer aggregation will group by the query's
679-
dimensions.
680-
681-
| Parameter | Inner `GROUP BY` | Outer `GROUP BY` |
686+
| `grain` key | Inner `GROUP BY` | Outer `GROUP BY` |
682687
|---|---|---|
683-
| `group_by` | Only the listed dimensions | Query dimensions |
684-
| `reduce_by` | Query dimensions minus listed | Query dimensions |
685-
| `add_group_by` | Query dimensions plus listed | Query dimensions |
686-
687-
### `reduce_by`
688-
689-
The `reduce_by` parameter is used with [multi-stage measures][ref-multi-stage] to specify
690-
dimensions that should be *removed* from the `GROUP BY` of the inner aggregation stage.
691-
692-
This is commonly used for ranking calculations — computing a rank across a dimension
693-
while still allowing grouping by other dimensions in the query.
694-
695-
<CodeGroup>
696-
697-
```yaml title="YAML"
698-
measures:
699-
- name: product_rank
700-
multi_stage: true
701-
order_by:
702-
- sql: "{revenue}"
703-
dir: asc
704-
reduce_by:
705-
- product
706-
type: rank
707-
```
708-
709-
```javascript title="JavaScript"
710-
measures: {
711-
product_rank: {
712-
multi_stage: true,
713-
order_by: [{
714-
sql: `${revenue}`,
715-
dir: `asc`
716-
}],
717-
reduce_by: [product],
718-
type: `rank`
719-
}
720-
}
721-
```
722-
723-
</CodeGroup>
724-
725-
`reduce_by` accepts a list of dimension names. The inner stage will group by the query's
726-
dimensions *minus* the listed dimensions, while the outer aggregation will group by the
727-
query's dimensions.
728-
729-
### `add_group_by`
688+
| `keep_only` | Only the listed dimensions | Query dimensions |
689+
| `exclude` | Query dimensions minus listed | Query dimensions |
690+
| `include` | Query dimensions plus listed | Query dimensions |
730691

731-
The `add_group_by` parameter is used with [multi-stage measures][ref-multi-stage] to
732-
specify dimensions that should be *added* to the `GROUP BY` of the inner aggregation
733-
stage, in addition to any dimensions present in the query.
692+
<Note>
734693

735-
This is commonly used for [nested aggregate][ref-nested-aggregate] patterns — computing
736-
an aggregate of an aggregate. For example, averaging per-user metrics or counting how
737-
many groups exceed a threshold.
694+
`grain` replaces the standalone `group_by`, `reduce_by`, and `add_group_by` parameters,
695+
which remain supported. To migrate, use `grain.keep_only` instead of `group_by`,
696+
`grain.exclude` instead of `reduce_by`, and `grain.include` instead of `add_group_by`.
738697

739-
<CodeGroup>
698+
</Note>
740699

741-
```yaml title="YAML"
742-
measures:
743-
- name: avg_user_score
744-
multi_stage: true
745-
sql: "{avg_score}"
746-
type: avg
747-
add_group_by:
748-
- user_id
749-
```
700+
### `group_by`, `reduce_by`, and `add_group_by` (legacy)
750701

751-
```javascript title="JavaScript"
752-
measures: {
753-
avg_user_score: {
754-
multi_stage: true,
755-
sql: `${avg_score}`,
756-
type: `avg`,
757-
add_group_by: [user_id]
758-
}
759-
}
760-
```
702+
These three parameters were the original way to control the inner aggregation stage's
703+
`GROUP BY` for [multi-stage measures][ref-multi-stage]. They are still supported, but
704+
[`grain`](#grain) now covers all three and is the recommended way to express the grain of
705+
a multi-stage measure.
761706

762-
</CodeGroup>
707+
| Legacy parameter | `grain` equivalent | Effect on the inner stage's `GROUP BY` |
708+
|---|---|---|
709+
| `group_by` | [`grain.keep_only`](#grain) | Only the listed dimensions, ignoring query dimensions |
710+
| `reduce_by` | [`grain.exclude`](#grain) | Query dimensions minus the listed dimensions |
711+
| `add_group_by` | [`grain.include`](#grain) | Query dimensions plus the listed dimensions |
763712

764-
`add_group_by` accepts a list of dimension names from the same cube. The listed
765-
dimensions will be included in the inner stage's `GROUP BY` but will *not* appear
766-
in the outer aggregation — they are used only to define the granularity at which
767-
the base measure is computed before the outer aggregation is applied.
713+
Each accepts a list of dimension names from the same cube. For new data models, use
714+
[`grain`](#grain) instead.
768715

769716
### `filter`
770717
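
The `grain` table in this file can be read as a rule for deriving the inner stage's `GROUP BY` from the query's dimensions. The sketch below encodes that rule in plain JavaScript. `innerGroupBy` is a hypothetical function written for this example, not Cube's implementation, and the way `include` combines with `keep_only` here is an assumption: the reference only says that `keep_only` and `exclude` are mutually exclusive.

```javascript
// Hypothetical model of the grain table; not taken from Cube's source.
function innerGroupBy(queryDimensions, grain = {}) {
  const hasKeepOnly = grain.keep_only !== undefined;
  const hasExclude = grain.exclude !== undefined;
  if (hasKeepOnly && hasExclude) {
    throw new Error("keep_only and exclude are mutually exclusive");
  }

  // keep_only: start from only the listed dimensions, ignoring the query.
  let dims = hasKeepOnly ? [...grain.keep_only] : [...queryDimensions];

  // exclude: query dimensions minus the listed dimensions.
  if (hasExclude) {
    dims = dims.filter((d) => !grain.exclude.includes(d));
  }

  // include: add the listed dimensions (assumed to apply after the above).
  for (const d of grain.include ?? []) {
    if (!dims.includes(d)) dims.push(d);
  }
  return dims;
}

const queryDims = ["country", "product"];
const fixedGrain = innerGroupBy(queryDims, { keep_only: ["country"] });
const rankGrain = innerGroupBy(queryDims, { exclude: ["product"] });
const nestedGrain = innerGroupBy(queryDims, { include: ["customer_id"] });
const grandTotalGrain = innerGroupBy(queryDims, { keep_only: [] });
```

In every case the outer stage would still group by `queryDims`, matching the last column of the table; only the inner set changes.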
