| 1 | +# Aggregates |
| 2 | + |
| 3 | +> Prerequisites: `setup/server.md` |
| 4 | + |
| 5 | +Canonical runtime rules: |
| 6 | + |
| 7 | +- Use ORM scalar metrics (`aggregateIndex` + `count()`/`aggregate()`) for counts, sums, averages |
| 8 | +- Use `_count` relation loading instead of per-row `.count()` fanout loops |
| 9 | +- Use `rankIndex` + `rank()` for rankings, random access, sorted pagination |
| 10 | +- `aggregateIndex` and `rankIndex` backfill automatically via `kitcn dev` — no manual trigger wiring needed |
| 11 | + |
| 12 | +## ORM Scalar Metrics |
| 13 | + |
| 14 | +### `aggregateIndex` Schema Declaration |
| 15 | + |
| 16 | +Declare count/aggregate coverage in table definitions: |
| 17 | + |
| 18 | +```ts |
| 19 | +const orders = convexTable( |
| 20 | + "orders", |
| 21 | + { orgId: text(), amount: integer(), score: integer() }, |
| 22 | + (t) => [ |
| 23 | + aggregateIndex("by_org") |
| 24 | + .on(t.orgId) |
| 25 | + .sum(t.amount) |
| 26 | + .avg(t.amount) |
| 27 | + .min(t.score) |
| 28 | + .max(t.score), |
| 29 | + aggregateIndex("all_metrics").all().sum(t.amount).count(t.orgId), |
| 30 | + ] |
| 31 | +); |
| 32 | +``` |
| 33 | + |
| 34 | +- `.on(fields)` — filter key fields (namespaced counts) |
| 35 | +- `.all()` — unfiltered global metrics |
| 36 | +- `.count(field)` / `.sum(field)` / `.avg(field)` / `.min(field)` / `.max(field)` — chainable metrics |
| 37 | + |
| 38 | +After deploying, the CLI runs `aggregateBackfill` automatically. Wait for `aggregateBackfillStatus` to report `READY`. |
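To make the backfill requirement concrete, here is a minimal conceptual sketch of write-time maintained metrics. It is not kitcn's implementation: `BucketStore`, `applyInsert`, `applyDelete`, `backfill`, and `readStats` are hypothetical names, and the store is a plain in-memory `Map`. The point is only that reads touch one bucket instead of the rows, and that rows written before the index existed have to be folded in once.

```typescript
// Toy in-memory model of pre-aggregated buckets keyed by orgId.
// All names here are hypothetical and exist only for illustration.
type Order = { orgId: string; amount: number };
type Bucket = { count: number; sum: number };

class BucketStore {
  private buckets = new Map<string, Bucket>();

  // On insert: add the row's contribution to its bucket.
  applyInsert(row: Order): void {
    const bucket = this.buckets.get(row.orgId) ?? { count: 0, sum: 0 };
    bucket.count += 1;
    bucket.sum += row.amount;
    this.buckets.set(row.orgId, bucket);
  }

  // On delete: reverse the row's contribution.
  applyDelete(row: Order): void {
    const bucket = this.buckets.get(row.orgId);
    if (!bucket) return;
    bucket.count -= 1;
    bucket.sum -= row.amount;
    if (bucket.count === 0) this.buckets.delete(row.orgId);
  }

  // Backfill: fold pre-existing rows into the buckets once.
  backfill(rows: Order[]): void {
    for (const row of rows) this.applyInsert(row);
  }

  // Read: one bucket lookup, no row scan.
  readStats(orgId: string): { count: number; sum: number | null; avg: number | null } {
    const bucket = this.buckets.get(orgId);
    if (!bucket) return { count: 0, sum: null, avg: null };
    return { count: bucket.count, sum: bucket.sum, avg: bucket.sum / bucket.count };
  }
}

const store = new BucketStore();
store.backfill([
  { orgId: "org-1", amount: 100 },
  { orgId: "org-1", amount: 50 },
]);
store.applyInsert({ orgId: "org-2", amount: 7 });
const org1Stats = store.readStats("org-1");
```

Running counters like these cover count, sum, and average. Min and max need an ordered structure, which this sketch does not model.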
| 39 | + |
| 40 | +### `count()` — O(1) No-Scan Counts |
| 41 | + |
| 42 | +```ts |
| 43 | +const total = await ctx.orm.query.todos.count({ where: { projectId } }); |
| 44 | +``` |
| 45 | + |
| 46 | +Unfiltered `count()` uses the native Convex count syscall (no `aggregateIndex` required). |
| 47 | +Filtered `count()` accepts `eq`, `in`, `isNull`, `gt`, `gte`, `lt`, `lte`, conjunction via `AND`, and bounded finite DNF `OR` when every branch is index-plannable on one `aggregateIndex`. Requires a matching `aggregateIndex`. |
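The "bounded finite DNF `OR`" rule can be read as: the filter is a finite list of `AND` branches, and each branch resolves to one bucket key on the same index. The sketch below illustrates that idea under stated assumptions. `bucketCounts`, `keyFields`, `branchKey`, and `countDnf` are hypothetical, the bucket data is made up, and kitcn's actual planner may work differently.

```typescript
// One AND-branch of equality constraints, e.g. { orgId: "org-1", status: "paid" }.
type Branch = Record<string, string>;

// Hypothetical pre-aggregated counts for an index on (orgId, status).
const keyFields = ["orgId", "status"];
const bucketCounts = new Map<string, number>([
  ["org-1|paid", 3],
  ["org-1|open", 2],
  ["org-2|paid", 5],
]);

// A branch is plannable only if it pins every key field of the index.
function branchKey(branch: Branch): string {
  return keyFields
    .map((field) => {
      const value = branch[field];
      if (value === undefined) {
        throw new Error("branch does not constrain " + field);
      }
      return value;
    })
    .join("|");
}

// Count rows matching (branch1 OR branch2 OR ...) with one lookup per distinct key.
function countDnf(branches: Branch[]): number {
  // Deduplicate keys so identical branches are not counted twice.
  const keys = new Set<string>(branches.map(branchKey));
  let total = 0;
  keys.forEach((key) => {
    total += bucketCounts.get(key) ?? 0;
  });
  return total;
}

const paidInEitherOrg = countDnf([
  { orgId: "org-1", status: "paid" },
  { orgId: "org-2", status: "paid" },
]);
```

A branch that leaves a key field unconstrained has no single bucket to read, which matches the spirit of the `COUNT_NOT_INDEXED` and `COUNT_FILTER_UNSUPPORTED` errors listed below.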
| 48 | + |
| 49 | +Windowed count: `count({ where, orderBy, skip, take, cursor })` counts rows within a window. |
| 50 | + |
| 51 | +- `skip`/`take` for pagination windows, `cursor` for "after this value" counting (requires `orderBy`, single field in v1) |
| 52 | +- `count({ select: { field: true } })` with `skip`/`take`/`cursor` throws `COUNT_FILTER_UNSUPPORTED` in v1 |
| 53 | + |
| 54 | +| Error | Cause | |
| 55 | +| -------------------------- | -------------------------------------------- | |
| 56 | +| `COUNT_NOT_INDEXED` | No `aggregateIndex` matches the filter shape | |
| 57 | +| `COUNT_FILTER_UNSUPPORTED` | Uses unsupported operators | |
| 58 | +| `COUNT_INDEX_BUILDING` | Index still backfilling | |
| 59 | +| `COUNT_RLS_UNSUPPORTED` | Called in RLS-restricted context | |
| 60 | + |
| 61 | +### `aggregate()` — Prisma-style Aggregate Blocks |
| 62 | + |
| 63 | +```ts |
| 64 | +const stats = await ctx.orm.query.orders.aggregate({ |
| 65 | + where: { orgId: "org-1" }, |
| 66 | + _count: { _all: true }, |
| 67 | + _sum: { amount: true }, |
| 68 | + _avg: { amount: true }, |
| 69 | +}); |
| 70 | +``` |
| 71 | + |
| 72 | +Same filter rules as `count()`. Supports bounded finite DNF `OR` when every branch is index-plannable and resolves to one `aggregateIndex`. |
| 73 | +Windowed aggregate: |
| 74 | + |
| 75 | +- `orderBy` + `cursor` works for `_count/_sum/_avg/_min/_max` |
| 76 | +- `skip`/`take` are `_count`-only in v1 (`AGGREGATE_ARGS_UNSUPPORTED` for non-count metrics) because metric window skip/take is not bucket-computable under strict no-scan |
| 77 | + |
| 78 | +### `groupBy()` — Finite Indexed Groups Only |
| 79 | + |
| 80 | +`groupBy()` is supported with strict no-scan bounds: |
| 81 | + |
| 82 | +- `by` is required |
| 83 | +- every `by` field must be constrained in `where` via `eq`/`in`/`isNull` |
| 84 | +- `orderBy` supports `by` fields and selected metric fields |
| 85 | +- `skip`/`take`/`cursor` require explicit `orderBy` |
| 86 | +- `having` supports conjunction filters on `by` fields and selected metrics |
| 87 | +- `OR`/`NOT` in `having` are unsupported (`AGGREGATE_FILTER_UNSUPPORTED`) |
| 88 | + |
| 89 | +```ts |
| 90 | +const rows = await ctx.orm.query.orders.groupBy({ |
| 91 | + by: ["orgId"], |
| 92 | + where: { orgId: { in: ["org-1", "org-2"] }, status: "paid" }, |
| 93 | + _count: true, |
| 94 | + _sum: { amount: true }, |
| 95 | + orderBy: [{ _count: "desc" }, { _sum: { amount: "desc" } }], |
| 96 | + having: { _count: { gt: 0 } }, |
| 97 | + take: 10, |
| 98 | +}); |
| 99 | +``` |
| 100 | + |
| 101 | +#### When to use `groupBy` vs alternatives |
| 102 | + |
| 103 | +Use `groupBy` when you need **multi-bucket metrics in one call** where each bucket is a distinct field value: |
| 104 | + |
| 105 | +| Pattern | Use instead | Why | |
| 106 | +| ---------------------------------------------------------- | ----------------------------------------- | ------------------------------------- | |
| 107 | +| Multiple `.count()` calls with different filter values | `groupBy({ by, _count })` | One call replaces N sequential counts | |
| 108 | +| `findMany` + manual Map/reduce grouping in JS | `groupBy({ by, _count, _sum })` | O(log n) per bucket vs O(n) scan | |
| 109 | +| Sampling + estimation (e.g. "count admins from 100 users") | `groupBy({ by: ['role'], _count })` | Exact counts, no estimation | |
| 110 | +| Dashboard stats with breakdowns by category | `groupBy({ by: ['status'], _sum, _avg })` | Single query for full breakdown | |
| 111 | + |
| 112 | +Delta from parity: Unlike Prisma, `groupBy` requires every `by` field to be finite-constrained in `where` (`eq`/`in`/`isNull`) and backed by an `aggregateIndex`. Unconstrained `by` fields throw `AGGREGATE_ARGS_UNSUPPORTED`. |
| 113 | + |
| 114 | +### `findMany({ distinct })` (Unsupported) |
| 115 | + |
| 116 | +`findMany({ distinct })` is not available to keep strict no-scan/index-backed guarantees. |
| 117 | +If you need deduplication, use select-pipeline distinct: |
| 118 | + |
| 119 | +```ts |
| 120 | +const result = await ctx.orm.query.todos |
| 121 | + .select() |
| 122 | + .distinct({ fields: ["status"] }) |
| 123 | + .paginate({ cursor: null, limit: 100 }); |
| 124 | +``` |
| 125 | + |
| 126 | +### Relation `_count` — Best Practice |
| 127 | + |
| 128 | +**Always prefer `_count` relation loading over per-row `.count()` fanout loops.** A single query with an embedded count replaces N+1 separate count queries. |
| 129 | + |
| 130 | +```ts |
| 131 | +// ❌ BAD: N+1 count queries (one per tag) |
| 132 | +const tags = await ctx.orm.query.tags.findMany({ |
| 133 | + where: { createdBy: ctx.userId }, |
| 134 | +}); |
| 135 | +const usageCounts = await Promise.all( |
| 136 | + tags.map((tag) => ctx.orm.query.todoTags.count({ where: { tagId: tag.id } })) |
| 137 | +); |
| 138 | +return tags.map((tag, idx) => ({ |
| 139 | + ...tag, |
| 140 | + usageCount: usageCounts[idx] ?? 0, |
| 141 | +})); |
| 142 | + |
| 143 | +// ✅ GOOD: Single query with embedded _count |
| 144 | +const tags = await ctx.orm.query.tags.findMany({ |
| 145 | + where: { createdBy: ctx.userId }, |
| 146 | + with: { |
| 147 | + _count: { |
| 148 | + todos: true, |
| 149 | + }, |
| 150 | + }, |
| 151 | +}); |
| 152 | +return tags.map((tag) => ({ |
| 153 | + ...tag, |
| 154 | + usageCount: tag._count?.todos ?? 0, |
| 155 | +})); |
| 156 | +``` |
| 157 | + |
| 158 | +Filtered `_count`: |
| 159 | + |
| 160 | +```ts |
| 161 | +const users = await ctx.orm.query.user.findMany({ |
| 162 | + with: { |
| 163 | + _count: { |
| 164 | + todos: { |
| 165 | + where: { deletionTime: { isNull: true } }, |
| 166 | + }, |
| 167 | + }, |
| 168 | + }, |
| 169 | +}); |
| 170 | +const usersWithTodos = users.filter( |
| 171 | + (user) => (user._count?.todos ?? 0) > 0 |
| 172 | +).length; |
| 173 | +``` |
| 174 | + |
| 175 | +Through-filtered `_count` works for `through()` relations: |
| 176 | + |
| 177 | +```ts |
| 178 | +const users = await ctx.orm.query.users.findMany({ |
| 179 | + with: { |
| 180 | + _count: { |
| 181 | + memberTeams: { where: { name: "Core" } }, |
| 182 | + }, |
| 183 | + }, |
| 184 | +}); |
| 185 | +// users[0]._count?.memberTeams => 1 |
| 186 | +``` |
| 187 | + |
| 188 | +Works on `findMany`, `findFirst`, and `findFirstOrThrow`. Access via `row._count?.relation ?? 0`. |
| 189 | + |
| 190 | +### Mutation `returning({ _count })` |
| 191 | + |
| 192 | +```ts |
| 193 | +const [user] = await ctx.orm |
| 194 | + .insert(usersTable) |
| 195 | + .values({ name: "Alice" }) |
| 196 | + .returning({ |
| 197 | + id: usersTable.id, |
| 198 | + _count: { posts: true }, |
| 199 | + }); |
| 200 | +// user._count?.posts => 0 |
| 201 | + |
| 202 | +const [updated] = await ctx.orm |
| 203 | + .update(usersTable) |
| 204 | + .set({ name: "Bob" }) |
| 205 | + .where(eq(usersTable.id, userId)) |
| 206 | + .returning({ |
| 207 | + id: usersTable.id, |
| 208 | + _count: { posts: { where: { status: "published" } } }, |
| 209 | + }); |
| 210 | +// updated._count?.posts => 2 |
| 211 | +``` |
| 212 | + |
| 213 | +Works on `insert`, `update`, and `delete`. |
| 214 | + |
| 215 | +### `_sum` Nullability |
| 216 | + |
| 217 | +`_sum` returns `null` for empty sets or when all field values are `null` (Prisma-compatible): |
| 218 | + |
| 219 | +```ts |
| 220 | +// Empty table or all-null amounts → { _sum: { amount: null } } |
| 221 | +// Non-empty with values → { _sum: { amount: 1500 } } |
| 222 | +``` |
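The nullability rule above can be written as a small standalone function. This is an illustrative sketch of the semantics only. `sumOrNull` is a hypothetical helper, not part of the ORM.

```typescript
// Returns null when there is nothing to sum (empty input or only nulls),
// otherwise the sum of the non-null values.
function sumOrNull(values: Array<number | null>): number | null {
  let sum = 0;
  let sawValue = false;
  for (const value of values) {
    if (value !== null) {
      sum += value;
      sawValue = true;
    }
  }
  return sawValue ? sum : null;
}

const emptySum = sumOrNull([]);
const allNullSum = sumOrNull([null, null]);
const mixedSum = sumOrNull([1000, null, 500]);
const zeroSum = sumOrNull([0]);
```

The distinction matters because a real total of `0` and "no data" are different answers. Callers that want a number either way can apply `?? 0` at the call site.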
| 223 | + |
| 224 | +## Ranked Access With `rankIndex` |
| 225 | + |
| 226 | +For **rankings**, **random access**, and **sorted pagination**. ORM-native, no external dependency, backfills automatically. |
| 227 | + |
| 228 | +| Operation | Description | |
| 229 | +| ------------------------------------ | --------------------------- | |
| 230 | +| `rank().indexOf({ id })` | Position/rank of a document | |
| 231 | +| `rank().at(offset)` | Row at a specific position | |
| 232 | +| `rank().paginate({ cursor, limit })` | Ordered page traversal | |
| 233 | +| `rank().max()` / `rank().min()` | Extremes by rank order | |
| 234 | +| `rank().random()` | Random row from ranked set | |
| 235 | +| `rank().count()` / `rank().sum()` | Ranked-set count/sum | |
| 236 | + |
| 237 | +### Declaring `rankIndex` |
| 238 | + |
| 239 | +```ts |
| 240 | +const scores = convexTable( |
| 241 | + "scores", |
| 242 | + { |
| 243 | + gameId: text().notNull(), |
| 244 | + score: integer().notNull(), |
| 245 | + createdAt: timestamp().notNull(), |
| 246 | + userId: text().notNull(), |
| 247 | + }, |
| 248 | + (t) => [ |
| 249 | + rankIndex("leaderboard") |
| 250 | + .partitionBy(t.gameId) |
| 251 | + .orderBy({ column: t.score, direction: "desc" }) |
| 252 | + .orderBy({ column: t.createdAt, direction: "asc" }) |
| 253 | + .sum(t.score), |
| 254 | + |
| 255 | + rankIndex("global_leaderboard") |
| 256 | + .all() |
| 257 | + .orderBy({ column: t.score, direction: "desc" }), |
| 258 | + ] |
| 259 | +); |
| 260 | +``` |
| 261 | + |
| 262 | +`partitionBy(...)` isolates ranked sets per unique partition value. Use `.all()` for a global (unpartitioned) ranked set. |
| 263 | + |
| 264 | +### Ranked Queries |
| 265 | + |
| 266 | +```ts |
| 267 | +const leaderboard = ctx.orm.query.scores.rank("leaderboard", { |
| 268 | + where: { gameId }, |
| 269 | +}); |
| 270 | + |
| 271 | +const top10 = await leaderboard.paginate({ cursor: null, limit: 10 }); |
| 272 | +const userRank = await leaderboard.indexOf({ id: userId }); |
| 273 | +const thirdPlace = await leaderboard.at(2); |
| 274 | +const best = await leaderboard.max(); |
| 275 | +const worst = await leaderboard.min(); |
| 276 | +const randomPick = await leaderboard.random(); |
| 277 | +const total = await leaderboard.count(); |
| 278 | +const totalScore = await leaderboard.sum(); |
| 279 | +``` |
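To show how the two `orderBy` clauses of the `leaderboard` index combine, here is a toy model that uses a sorted array instead of an index. `compareScores`, `indexOfId`, and `atOffset` are hypothetical helpers, and offsets are assumed to be zero-based, as `thirdPlace = leaderboard.at(2)` above suggests. A real index would presumably avoid the linear scans this sketch uses.

```typescript
type Score = { id: string; score: number; createdAt: number };

// Order by score descending, then by createdAt ascending as the tie-breaker.
function compareScores(a: Score, b: Score): number {
  if (a.score !== b.score) return b.score - a.score;
  return a.createdAt - b.createdAt;
}

const ranked: Score[] = [
  { id: "a", score: 90, createdAt: 3 },
  { id: "b", score: 120, createdAt: 5 },
  { id: "c", score: 90, createdAt: 1 },
].sort(compareScores);

// Zero-based position of a row in the ranked order, or -1 if absent.
function indexOfId(id: string): number {
  for (let i = 0; i < ranked.length; i++) {
    if (ranked[i].id === id) return i;
  }
  return -1;
}

// Row at a zero-based offset, or undefined when out of range.
function atOffset(offset: number): Score | undefined {
  return ranked[offset];
}

const rankedIds = ranked.map((row) => row.id);
const positionOfA = indexOfId("a");
const thirdRow = atOffset(2);
```

Rows `a` and `c` tie on score, so the earlier `createdAt` places `c` ahead of `a`.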
| 280 | + |
| 281 | +### Leaderboard + User Stats |
| 282 | + |
| 283 | +```ts |
| 284 | +const lb = ctx.orm.query.scores.rank("leaderboard", { |
| 285 | + where: { gameId: input.gameId }, |
| 286 | +}); |
| 287 | +const globalRank = await lb.indexOf({ id: ctx.userId }); |
| 288 | +const totalPlayers = await lb.count(); |
| 289 | +``` |
| 290 | + |
| 291 | +### Best Practices |
| 292 | + |
| 293 | +```ts |
| 294 | +// ✅ Partition per tenant to isolate write hot spots |
| 295 | +rankIndex("tenant_scores") |
| 296 | + .partitionBy(t.tenantId) |
| 297 | + .orderBy({ column: t.score, direction: "desc" }); |
| 298 | + |
| 299 | +// ❌ Global rank can create cross-tenant contention |
| 300 | +rankIndex("global_scores") |
| 301 | + .all() |
| 302 | + .orderBy({ column: t.score, direction: "desc" }); |
| 303 | +``` |
| 304 | + |
| 305 | +## Repair |
| 306 | + |
| 307 | +If rank or aggregate state gets out of sync: |
| 308 | + |
| 309 | +```bash |
| 310 | +kitcn aggregate rebuild |
| 311 | +``` |
| 312 | + |
| 313 | +## When to Use |
| 314 | + |
| 315 | +| Need | Use | |
| 316 | +| ---------------------- | --------------------------------------------------------------- | |
| 317 | +| Counts, sums, averages | ORM Scalar Metrics (`aggregateIndex` + `count()`/`aggregate()`) | |
| 318 | +| Relation counts | `_count` relation loading (`with: { _count: { ... } }`) | |
| 319 | +| Rankings, leaderboards | `rankIndex` + `rank()` (`indexOf`, `at`, `paginate`) | |
| 320 | +| Random document access | `rankIndex` + `rank()` (`random()`, `at()`) | |
| 321 | +| Sorted pagination | `rankIndex` + `rank()` (`paginate({ cursor, limit })`) | |
| 322 | +| Non-table data | Model as a table, then use `aggregateIndex` or `rankIndex` | |
| 323 | + |
| 324 | +## Limitations |
| 325 | + |
| 326 | +| Consideration | Guideline | |
| 327 | +| ---------------- | ------------------------------------------------------ | |
| 328 | +| Update frequency | High-frequency updates to nearby keys cause contention | |
| 329 | +| Key size | Keep composite keys reasonable (3-4 components max) | |
| 330 | +| Namespace count | Each namespace has overhead | |
| 331 | +| Query patterns | Design keys for actual needs | |
| 332 | + |
| 333 | +## API Reference |
| 334 | + |
| 335 | +### Prisma Parity Matrix (No-Scan) |
| 336 | + |
| 337 | +| Prisma feature | Status | Notes | |
| 338 | +| ----------------------------------------------------------- | --------- | ------------------------------------------------------------------------------------------------------------------------------------- | |
| 339 | +| `aggregate({ _count/_sum/_avg/_min/_max, where })` | Supported | Bucket-backed, no base-table scan fallback | |
| 340 | +| `aggregate({ _sum })` nullability | Supported | Returns `null` for empty/all-null sets | |
| 341 | +| `groupBy({ by, where, _count/_sum/_avg/_min/_max })` | Supported | `by` fields must be finite-constrained (`eq/in/isNull`) in `where` | |
| 342 | +| `groupBy({ having/orderBy/skip/take/cursor })` | Partial | Supported for finite index-bounded groups with conjunction-only `having` | |
| 343 | +| `count()` | Supported | Native Convex count syscall | |
| 344 | +| `count({ where })` | Supported | Indexed scalar subset | |
| 345 | +| `count({ where, select: { _all, field } })` | Supported | Field counts require `aggregateIndex.count(field)` | |
| 346 | +| `findMany({ with: { _count: { relation: true } } })` | Supported | Indexed relation counts | |
| 347 | +| `findMany({ with: { _count: { relation: { where } } } })` | Supported | Direct relation scalar filters | |
| 348 | +| `aggregate({ orderBy/take/skip/cursor })` | Partial | `orderBy/cursor` supported; `skip/take` is `_count`-only in v1 | |
| 349 | +| Advanced aggregate/count filters (`OR/NOT/string/relation`) | Partial | Bounded finite DNF `OR` rewrite is supported when branches resolve to one `aggregateIndex`; `NOT`/string/relation filters are blocked | |
| 350 | +| Relation `_count` nested relation filter | Blocked | `RELATION_COUNT_FILTER_UNSUPPORTED` | |
| 351 | +| `findMany({ distinct })` | Blocked | Not available under strict no-scan contract. Use `select().distinct({ fields })` | |
| 352 | +| Relation `_count` filtered through relation | Supported | Indexed `through()` relation filters | |
| 353 | +| Mutation return `_count` parity | Supported | `returning({ _count })` on insert/update/delete | |