Commit 5cd0700
feat(merge-executor): YAML flag to route regular merges through streaming engine (default off, no-op rollout) (#6441)
* test(merge): engine-parity tests + share MS-7 serial lock
Adds `merge::tests::parity` with two tests that run the same realistic
input fixture through both `merge_sorted_parquet_files` (in-memory
engine) and `execute_merge_operation` (streaming engine over the same
`LocalFileByteSource` the executor uses in production), then asserts
row-by-row equivalence on every visible column. These gate the upcoming
YAML flag that flips regular merges to the streaming engine: parity
must hold before the default is flipped in production.
The streaming engine writes a process-global atomic
(`PEAK_BODY_COL_PAGE_CACHE_LEN`) that the MS-7 tests reset-then-read.
Any test that runs a streaming merge must serialise against MS-7 or
inflate its readings. Move `ms7_serial_lock` from the streaming-tests
submodule to module scope (still `#[cfg(test)] pub(crate)`) so the new
parity tests acquire the same lock.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(merge-executor): YAML flag to route regular merges through the streaming engine
Wires the streaming Parquet merge engine into the regular (non-promotion)
merge path behind a node-level YAML flag,
`indexer.parquet_merge_use_streaming_engine`, defaulted to false. When
true, `ParquetMergeExecutor::handle` runs every merge through
`execute_merge_operation` (the column-major, page-bounded streaming
engine) instead of the in-memory `merge_sorted_parquet_files`.
Promotion merges (`target_prefix_len_override.is_some()`) continue to
take the streaming path unconditionally — the in-memory engine can't
handle mixed `rg_partition_prefix_len` inputs.
The in-memory engine stays in place as the runtime fallback. If the
streaming engine hits a bug in production, an operator can flip the
flag back to `false` via YAML without redeploying. Once the streaming
path has soaked, the fallback branch and `merge_sorted_parquet_files`
itself can be removed.
The flag is plumbed `IndexerConfig` → `IndexingService` →
`ParquetMergePipelineParams` → `ParquetMergeExecutor::new`, and exercised
end-to-end by the engine parity tests in the previous commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(config): exercise parquet_merge_use_streaming_engine in YAML/JSON/TOML fixtures
Adds the new flag to the YAML, JSON, and TOML node-config test fixtures
and bumps the expected `IndexerConfig` in `node_config_parse_*` to
`parquet_merge_use_streaming_engine: true`. Catches parse / serde
regressions on the field — e.g., a rename or a default-fn typo would
fail the test instead of silently parsing as `false`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(merge-pipeline): end-to-end streaming-engine flag verification
Adds `test_merge_pipeline_end_to_end_with_streaming_engine_flag`, an
integration test that runs the full actor chain (planner → downloader
→ executor → uploader → publisher) with
`ParquetMergePipelineParams::use_streaming_engine = true`. Asserts:
1. Publish fired with the right replaced_split_ids (merge ran
end-to-end through the executor).
2. `PEAK_BODY_COL_PAGE_CACHE_LEN > 0` after the merge. The streaming
engine increments this on every body-col page assembly; the
in-memory engine never touches it. Non-zero is direct evidence
the streaming path executed — not a silent fallback to in-memory.
3. The merge output row count and metric names are correct.
To make assertion (2) work cross-crate, exposes
`PEAK_BODY_COL_PAGE_CACHE_LEN` as `pub` under
`#[cfg(any(test, feature = "testsuite"))]`. The visibility widening is
test-only — production builds never see the symbol.
This is the closest analog to the sesh-mode "production-path" rule
that is feasible today: the metrics pipeline's OTLP gRPC ingest path
is not yet wired into `quickwit-serve`, so the closest end-to-end
test is the actor-chain integration test that this PR adds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(adr): record dual Parquet merge engines as deviation #1
Captures the intentional, time-bounded divergence from ADR-003 §4
introduced by the streaming-engine wire-in: two engines coexist in
production behind `IndexerConfig::parquet_merge_use_streaming_engine`,
with the in-memory engine retained as the runtime fallback.
Documents:
- The ADR-003 §4 quote the deviation diverges from (page-granular
streaming, bounded memory).
- The current dual-engine implementation and routing logic.
- Why this exists (production safety, staged rollout, parity is
strong-but-not-total).
- Explicit exit criteria: default flipped to `true`, ≥ 2-week
production soak with no merge-correctness incidents, no rollback.
When met, a follow-up PR deletes the in-memory branch and engine,
the flag, and the parity tests.
This is the first deviation recorded under the EVOLUTION.md framework.
Indexes the doc in `deviations/README.md`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(merge-pipeline): share full correctness contract between both engine tests
Extracts the steps-5-through-8 assertions (replaced_split_ids, staged
metadata, Parquet file content, Parquet KV headers) into
`assert_cpu_mem_merge_outputs_correct` and calls it from both
`test_merge_pipeline_end_to_end` (in-memory engine) and
`test_merge_pipeline_end_to_end_with_streaming_engine_flag` (streaming
engine). The streaming-engine test had been doing only a small subset
of the checks — row count and metric names. It now runs the full
contract: time_range, num_merge_ops, sort_fields, row_keys_proto,
zonemap_regexes, low_cardinality_tags, all 100 timestamps,
sorted_series monotonicity, cpu/mem sort-order semantics, and every
`qh.*` Parquet KV header.
By construction both engines must produce a file that satisfies the
same contract — the helper is the executable parity between engines
at the pipeline-integration level, complementing the column-level
parity tests in `quickwit-parquet-engine::merge::tests::parity`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(merge-pipeline,merge-engine): multi-metric + multi-RG + m:n disjointness
Expands test coverage along three axes the existing helpers didn't hit:
1. **Multi-input, multi-metric pipeline tests** (new file
`parquet_merge_pipeline_multi_metric_test.rs`). Three inputs, each
carrying three metrics with overlapping per-metric timeseries IDs
and overlapping-but-distinct timestamps — the merge must
row-by-row interleave across all three inputs. Output writer uses
`row_group_size = 50` so the 180-row merge output breaks into
four row groups, exercising the writer's multi-RG path in both
engines. Both engine variants (in-memory + streaming) covered.
Streaming-engine test asserts `PEAK_BODY_COL_PAGE_CACHE_LEN > 0`
to confirm the flag routed through the streaming path.
2. **Engine-level multi-output contract** in
`merge::tests::parity::assert_engine_parity`. Beyond the existing
engine-vs-engine column equivalence, every parity test now also
verifies on the in-memory engine's outputs (equivalent to the
streaming engine's): sum of per-output row counts equals total
input rows, each output internally monotonic on `sorted_series`,
and across outputs the partition is disjoint (no two outputs
share any `sorted_series` value). This is the m:n
non-overlap contract.
3. **Multi-metric overlapping-input m:n** test
`parity_multi_metric_overlapping_inputs_multi_output` exercises
the strengthened contract with three inputs × three metrics where
per-metric keyspaces overlap across inputs. n = 3 outputs target.
Honest scope note in the new pipeline test module's doc: the actor
pipeline today hardcodes `num_outputs = 1` in `ParquetMergeExecutor`,
so n > 1 is not reachable end-to-end through the actor system. The
new engine-level test covers the n > 1 correctness contract for now;
when the executor is taught to accept `num_outputs > 1` from the
merge policy, the pipeline tests can grow an n > 1 variant.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(merge-executor): compute num_outputs from target_split_size_bytes
Replaces the hardcoded `MergeConfig { num_outputs: 1, ... }` in
`ParquetMergeExecutor::handle` with a per-merge computation:
num_outputs = max(1, ceil(total_input_bytes / target_split_size_bytes))
So a merge that ingests more than one target's worth of data spreads
across multiple output files; merges that fit in one target keep
producing a single output (preserving today's behavior for the
common case). The engine clamps the request to the number of
`sorted_series` boundaries actually available, so the value is an
upper bound, not an exact count.
Plumbing: `IndexerConfig` already carries `target_split_size_bytes`
in `ParquetMergePolicyConfig`. Pass that through
`ParquetMergePipelineParams.target_split_size_bytes` →
`ParquetMergeExecutor::new`. Default for tests:
`256 * 1024 * 1024` (matches the production default).
Latent multi-output bug fixed at the same time: with n>1, the
executor used to assign the planner-supplied `merge_split_id` to
**every** output split, which would have collided on the rename to
`{split_id}.parquet`. First output keeps the planner ID for
observability continuity; subsequent outputs use the fresh IDs
generated by `merge_parquet_split_metadata`.
Also exposes `quickwit_parquet_engine::merge::streaming::ms7_serial_lock`
as `pub` under the `testsuite` feature so cross-crate streaming tests
(in `quickwit-indexing`) can serialise against the same global lock
the in-crate MS-7 tests use. The streaming engine writes to a
process-global atomic on every merge — without shared locking, the
existing pipeline streaming-engine test races `store(0)` against
other tests' merges. Adds the appropriate
`#[allow(clippy::await_holding_lock)]` to the in-crate
`test_merge_pipeline_end_to_end_with_streaming_engine_flag` to
match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(merge-pipeline): bonus — prefix_len=1 multi-RG inputs + m:n outputs
Adds the bonus scenario: three multi-metric inputs each written with
`rg_partition_prefix_len = 1` and one row group per distinct
metric_name (via `row_group_size = ROWS_PER_METRIC_PER_INPUT` so the
writer flushes at every metric boundary after sorting). Merged with a
small `ParquetMergePipelineParams::target_split_size_bytes = 500`
that forces the executor's `num_outputs` calculation to ask the
engine for multiple outputs — exercising the m:n merge path now
reachable through the actor pipeline (PR's earlier commit removed
the `num_outputs = 1` hardcode).
Both engines covered:
- `test_prefix_aligned_multi_metric_three_input_multi_output_in_memory_engine`
- `test_prefix_aligned_multi_metric_three_input_multi_output_streaming_engine`
The streaming-engine variant also asserts
`PEAK_BODY_COL_PAGE_CACHE_LEN > 0` (under `ms7_serial_lock`) so a
silent fallback to the in-memory path would fail.
The shared assertion helper
`assert_three_input_three_metric_multi_output_correct` checks the
m:n contract end-to-end at the pipeline level:
- All three input splits replaced.
- ≥ 2 output splits staged (proves splitting happened).
- Sum of per-output row counts = total input rows.
- Each output internally monotonic on `sorted_series`.
- Across outputs, the `sorted_series` partition is disjoint — no
two outputs share any key, which is the "non-overlapping output"
contract the engine promises.
- Union of metric_names / services across outputs = full input set.
- Every output has `num_merge_ops = 1`, `row_keys_proto`, and a
`metric_name` zonemap regex.
To pin the test to exactly one merge (not a cascade of merges over
the now-multiple staged outputs), `make_pipeline_params` now takes
`max_merge_ops` and the bonus tests set it to `1`: outputs land at
`num_merge_ops = 1`, equal to the policy ceiling, and the planner
refuses to merge them again. The existing n=1 tests stay at 5
(headroom — they produce a single output that can't trigger another
merge anyway, since `merge_factor = 3`).
Updates the module doc to drop the now-stale scope note about m:n
not being reachable through the pipeline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* style: nightly rustfmt + drop useless borrows in assert!
Reformats doc comments / format strings under nightly rustfmt
(`wrap_comments`, `format_strings`), and removes two redundant `&` in
`assert!` arguments flagged by clippy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>1 parent 4597b7a commit 5cd0700
18 files changed
Lines changed: 2134 additions & 70 deletions
File tree
- docs/internals/adr/deviations
- quickwit
- quickwit-config
- resources/tests/node_config
- src/node_config
- quickwit-indexing/src/actors
- parquet_pipeline
- quickwit-parquet-engine/src/merge
Lines changed: 178 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
94 | 94 | | |
95 | 95 | | |
96 | 96 | | |
97 | | - | |
98 | | - | |
| 97 | + | |
99 | 98 | | |
100 | 99 | | |
101 | 100 | | |
| |||
Lines changed: 2 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
54 | 54 | | |
55 | 55 | | |
56 | 56 | | |
57 | | - | |
| 57 | + | |
| 58 | + | |
58 | 59 | | |
59 | 60 | | |
60 | 61 | | |
| |||
Lines changed: 1 addition & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
45 | 45 | | |
46 | 46 | | |
47 | 47 | | |
| 48 | + | |
48 | 49 | | |
49 | 50 | | |
50 | 51 | | |
| |||
Lines changed: 1 addition & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
49 | 49 | | |
50 | 50 | | |
51 | 51 | | |
| 52 | + | |
52 | 53 | | |
53 | 54 | | |
54 | 55 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
163 | 163 | | |
164 | 164 | | |
165 | 165 | | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
166 | 176 | | |
167 | 177 | | |
168 | 178 | | |
| |||
201 | 211 | | |
202 | 212 | | |
203 | 213 | | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
204 | 218 | | |
205 | 219 | | |
206 | 220 | | |
| |||
213 | 227 | | |
214 | 228 | | |
215 | 229 | | |
| 230 | + | |
216 | 231 | | |
217 | 232 | | |
218 | 233 | | |
| |||
229 | 244 | | |
230 | 245 | | |
231 | 246 | | |
| 247 | + | |
232 | 248 | | |
233 | 249 | | |
234 | 250 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
657 | 657 | | |
658 | 658 | | |
659 | 659 | | |
| 660 | + | |
660 | 661 | | |
661 | 662 | | |
662 | 663 | | |
| |||
Lines changed: 10 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
113 | 113 | | |
114 | 114 | | |
115 | 115 | | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
116 | 122 | | |
117 | 123 | | |
118 | 124 | | |
| |||
178 | 184 | | |
179 | 185 | | |
180 | 186 | | |
| 187 | + | |
| 188 | + | |
181 | 189 | | |
182 | 190 | | |
183 | 191 | | |
| |||
723 | 731 | | |
724 | 732 | | |
725 | 733 | | |
| 734 | + | |
| 735 | + | |
726 | 736 | | |
727 | 737 | | |
728 | 738 | | |
| |||
Lines changed: 4 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
61 | 61 | | |
62 | 62 | | |
63 | 63 | | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
64 | 68 | | |
65 | 69 | | |
66 | 70 | | |
| |||
0 commit comments