Commit c25f1ba
fix(weave): shard call_parts by id so call_start/call_end co-locate

call_parts was using the default rand() sharding key, so call_start and
call_end for the same call could land on different shards. Once split,
the partial-state rows can never merge in calls_merged_local (OPTIMIZE
runs per-shard), and queries that filter on an aggregated column see
inconsistent state.
Concretely: call_end doesn't carry parent_id, so its row defaults to
NULL. Filters like trace_roots_only (`parent_id IS NULL`) then match
the call_end row of every child call as if it were a root, inflating
counts.
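
A toy model of the failure above, as a sketch rather than the actual ClickHouse behavior. `CallPart`, `place`, `merge_per_shard`, and `count_roots` are hypothetical names. The merge rule (keep the first non-NULL `parent_id` per `id`) and counting distinct ids are simplifying assumptions standing in for the real aggregate state in calls_merged_local.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallPart:
    id: str
    parent_id: Optional[str]  # None models NULL; call_end never carries it
    kind: str                 # "start" or "end"


def place(parts, num_shards, key):
    """Distribute parts across shards using the given sharding key function."""
    shards = [[] for _ in range(num_shards)]
    for part in parts:
        shards[key(part) % num_shards].append(part)
    return shards


def merge_per_shard(shards):
    """Merge partial rows by call id, but only within each shard."""
    merged = []
    for rows in shards:
        by_id = {}
        for row in rows:
            prev = by_id.get(row.id)
            # Assumed merge rule: keep the first non-NULL parent_id seen.
            by_id[row.id] = prev if prev is not None else row.parent_id
        merged.extend(by_id.items())
    return merged


def count_roots(merged):
    """Distinct call ids with at least one merged row where parent_id IS NULL."""
    return len({call_id for call_id, parent_id in merged if parent_id is None})


parts = [
    CallPart("root", None, "start"),
    CallPart("root", None, "end"),
    CallPart("child", "root", "start"),
    CallPart("child", None, "end"),
]

# Deterministic stand-in for an unlucky rand(): starts and ends are split.
split = place(parts, 2, lambda p: 0 if p.kind == "start" else 1)
# Sharding by a hash of id keeps both parts of a call on one shard.
colocated = place(parts, 2, lambda p: zlib.crc32(p.id.encode()))

split_roots = count_roots(merge_per_shard(split))
colocated_roots = count_roots(merge_per_shard(colocated))
```

With the parts split, the child's call_end row never meets its call_start row, so the child is reported as a root; with both parts on one shard, only the real root matches.
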
Shard by `id` instead of `wf_clickhouse_calls_shard_key()` (which
defaults to trace_id): trace_id is Nullable on call_end so sipHash64
returns Nullable, which ClickHouse rejects as a sharding expression
(TYPE_MISMATCH). `id` is non-null on every call_part row and uniquely
identifies a call, so all parts of one call land together.
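
A rough analogy for the key choice, under stated assumptions: `shard_for` is a hypothetical helper, and BLAKE2b stands in for sipHash64. ClickHouse rejects the Nullable expression by type, before any row is written; the per-row check below only mimics that outcome.

```python
import hashlib
from typing import Optional


def shard_for(key: Optional[str], num_shards: int) -> int:
    """Map a non-NULL key to a shard index via a stable 64-bit hash."""
    if key is None:
        # Stand-in for ClickHouse refusing a Nullable sharding expression.
        raise TypeError("sharding key must not be NULL")
    digest = hashlib.blake2b(key.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_shards


# Hypothetical rows: call_end carries the id but no trace_id.
call_start = {"id": "call-1", "trace_id": "trace-9"}
call_end = {"id": "call-1", "trace_id": None}

# Both parts hash to the same shard when keyed by id.
same_shard = shard_for(call_start["id"], 4) == shard_for(call_end["id"], 4)

# Keying by trace_id fails for the call_end row.
try:
    shard_for(call_end["trace_id"], 4)
    trace_id_usable = True
except TypeError:
    trace_id_usable = False
```
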
The calls_merged Distributed table is intentionally left on rand(): its
only writes come from the MV, which fires on the local source/target pair
and never goes through the Distributed wrapper.

1 parent dbe0478 commit c25f1ba
1 file changed
Lines changed: 7 additions & 2 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 162-164 | 162-164 | context |
| 165-166 | | - (2 lines removed) |
| | 165-171 | + (7 lines added) |
| 167-169 | 172-174 | context |