Skip to content

Commit f6a6e9d

Browse files
g-talbotclaude
andcommitted
docs(adr): track legacy promotion planner gap as GAP-011
The streaming Parquet merge stack landing in #6424#6428 ships the full legacy-promotion *mechanism* (engine + adapter + executor wiring) but not the planner-level *trigger*. In production today, `MergePolicyState::record_split` buckets by `CompactionScope::from_split` which includes `rg_partition_prefix_len`, so legacy (prefix=0) and aligned (prefix>0) splits are separated before `ParquetMergePolicy::operations` runs. The policy only emits `ParquetMergeOperation::new`; a repo-wide search finds `promote_legacy` only in tests. Legacy splits therefore never migrate without an explicit trigger. Tracking this as GAP-011 so we pick it up at the right time. The gap doc walks three resolution options (merge buckets in the scope key, dedicated promotion pass, or hybrid prefer-multi-input-promotion) and the cost trade-offs between them, so the eventual implementation PR has a starting point. Raised by Codex review comment id 4311184497 on PR #6423. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 0a1446e commit f6a6e9d

2 files changed

Lines changed: 137 additions & 0 deletions

File tree

Lines changed: 136 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,136 @@
1+
# GAP-011: No Planner-Level Legacy Promotion
2+
3+
**Status**: Open
4+
**Discovered**: 2026-05-18
5+
**Context**: Codex review on PR #6423 (`feat(merge): legacy promotion path + body-col schema evolution`) flagged that the promotion path is wired end-to-end at the library + executor layer but has no production trigger at the planner / policy level.
6+
7+
## Problem
8+
9+
The streaming Parquet merge stack now contains a complete *legacy promotion* pipeline:
10+
11+
- `ParquetMergeOperation::promote_legacy(splits, target_prefix_len)` constructs an operation with
12+
`target_prefix_len_override = Some(target)`.
13+
- `merge::execute_merge_operation` routes each input through `LegacyInputAdapter` when its
14+
declared `rg_partition_prefix_len < target` and through `StreamingParquetReader` otherwise. The
15+
streaming engine then sees a homogeneous stream advertising `prefix_len = target` on every
16+
input.
17+
- `ParquetMergeExecutor` (in `quickwit-indexing`) detects `target_prefix_len_override.is_some()`
18+
and routes those merges through `execute_merge_operation` (with `LocalFileByteSource`) instead
19+
of the in-memory `merge_sorted_parquet_files` path.
20+
- `merge_parquet_split_metadata` accepts a `mixed_prefix_ok: bool` flag so the post-merge
21+
aggregator skips the input-side equality check.
22+
23+
What's missing: **nothing in the planner ever creates a `promote_legacy` operation in
24+
production**. `MergePolicyState::record_split` buckets each split by
25+
`CompactionScope::from_split`, and that scope key includes `rg_partition_prefix_len`. Legacy
26+
splits (`prefix_len = 0`) and aligned splits (`prefix_len > 0`) therefore land in *different*
27+
buckets before `ParquetMergePolicy::operations` ever runs. The production policy then iterates
28+
each bucket independently and emits `ParquetMergeOperation::new` (regular merge). A repo-wide
29+
search finds `promote_legacy` only in tests.
30+
31+
In a mixed deployment (legacy + aligned splits coexisting), legacy splits therefore stay in
32+
their `prefix_len = 0` bucket forever — never gaining the prefix alignment that downstream
33+
locality compaction depends on. The promotion plumbing is reachable only from tests.
34+
35+
## Evidence
36+
37+
- `quickwit-parquet-engine/src/merge/policy/mod.rs`: `ParquetMergePolicy::operations` calls
38+
`ParquetMergeOperation::new(...)` only. `promote_legacy` is constructed only by tests in the
39+
same file.
40+
- `MergePolicyState::record_split` keys its `BTreeMap` by `CompactionScope::from_split`. The
41+
scope derivation includes `rg_partition_prefix_len`, so a legacy split and a prefix-aligned
42+
split with otherwise identical sort fields / window / merge level are never compared by the
43+
policy.
44+
- The executor branch added in PR #6423 (`scratch.merge_operation.target_prefix_len_override
45+
.is_some()`) routes promotion through `execute_merge_operation`. Library coverage at
46+
`test_promote_legacy_executor_end_to_end` exercises a `prefix_len = 0` + `prefix_len = 1` pair
47+
successfully. But that operation is only ever constructed inside the test.
48+
49+
## State of the Art
50+
51+
- **Iceberg**: Compaction policies inspect file-level metadata (partitioning, sort order) and
52+
can rewrite files to align with the latest table partitioning even when individual files
53+
pre-date the change. The compaction service treats schema-evolution-style rewrites as
54+
first-class operations.
55+
- **Husky**: Background re-organization passes that promote files into newer storage layouts.
56+
Tracked separately from the size-tiered compaction policy so cost trade-offs can be tuned.
57+
58+
In both cases, the design separates the *trigger* (decision to promote) from the *mechanism*
59+
(how the promotion is performed). Quickwit currently has the mechanism but not the trigger.
60+
61+
## Potential Solutions
62+
63+
### Option A: Merge legacy + aligned buckets in `CompactionScope::from_split`
64+
65+
Drop `rg_partition_prefix_len` from the scope key (or normalize it to a target value before
66+
bucketing). The policy then sees legacy and aligned splits as candidates for the same
67+
compaction operation and `ParquetMergePolicy::operations` decides whether to emit a regular
68+
merge or a `promote_legacy` operation based on whether the bucket contains mixed prefix
69+
lengths.
70+
71+
Simplest change, but requires the policy to detect mixed-prefix buckets and choose between
72+
`new` and `promote_legacy` per operation.
73+
74+
### Option B: Dedicated promotion pass
75+
76+
Run a separate pass before the regular compaction policy that scans for legacy splits and emits
77+
`promote_legacy` operations for them. The regular policy then sees only aligned splits.
78+
79+
Cleaner separation of concerns, but means legacy splits are migrated *before* any opportunity
80+
to coalesce them with aligned neighbors in a single multi-input merge — possibly more work
81+
overall.
82+
83+
### Option C: Hybrid — bucket together, prefer single-pass promotion
84+
85+
Keep scope bucketing as in option A. Inside the policy, when a bucket contains mixed prefix
86+
lengths AND has enough splits to merit a multi-input merge, emit `promote_legacy`. When only
87+
legacy splits exist (no aligned neighbor), emit `promote_legacy` with the same target — single-
88+
input promotion is still valuable because it converts the file to the new format for future
89+
locality compaction.
90+
91+
Most flexible; gives the policy the freedom to amortize promotion cost when there are aligned
92+
neighbors AND to still promote isolated legacy splits in the background.
93+
94+
## Signal Impact
95+
96+
Primarily affects **metrics** in the near term: the legacy split format pre-dates the
97+
prefix-aligned RG layout, and only metrics has both formats in flight today. Traces and logs
98+
on the Parquet path will eventually reach the same state if a layout change ever happens; the
99+
same planner machinery would cover them.
100+
101+
## Cost Considerations
102+
103+
Promotion is strictly more expensive than a regular merge: the legacy adapter buffers the full
104+
input file in memory and re-encodes it as a single-RG stream before the merge engine sees it.
105+
For 50 MB metrics splits this is acceptable; for larger inputs the in-memory buffer is the
106+
gating cost.
107+
108+
The planner should account for this when scheduling — promotion is best amortized into a
109+
multi-input merge rather than performed as a standalone file rewrite. Option C's "prefer
110+
multi-input promotion, fall back to single-input" structure captures this.
111+
112+
## Impact
113+
114+
- **Severity**: Medium. Legacy splits accumulate cost (every query against them pays the
115+
prefix-less scan cost) but correctness is preserved — the locality compaction stack still
116+
works on aligned splits.
117+
- **Frequency**: Persistent. Legacy splits never migrate without an explicit trigger.
118+
- **Affected Areas**: `quickwit-parquet-engine/src/merge/policy/`, `quickwit-parquet-engine/src/merge/mod.rs` (`MergePolicyState::record_split` + `CompactionScope`).
119+
120+
## Next Steps
121+
122+
- [ ] Decide between options A / B / C based on operational priorities and benchmark data.
123+
- [ ] Design the policy-level "should promote?" heuristic: how many legacy splits before
124+
triggering, whether to wait for aligned neighbors, how to deprioritize promotion vs
125+
regular compaction.
126+
- [ ] Add metrics for `legacy_splits_pending_promotion` and `promotion_operations_emitted` so we
127+
can observe the policy in production.
128+
- [ ] Wire whichever option is chosen, with an integration test that exercises the full path
129+
(legacy split → planner → executor → published prefix-aligned split).
130+
131+
## References
132+
133+
- PR #6423 (legacy promotion path + body-col schema evolution).
134+
- Codex review comment id `4311184497` (raised the gap).
135+
- `test_promote_legacy_executor_end_to_end` in `quickwit-parquet-engine::merge::streaming`
136+
library-level coverage of the mechanism.

docs/internals/adr/gaps/README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -115,3 +115,4 @@ Gap files use sequential numbering: `001-short-description.md`
115115
| [008](./008-no-high-query-rate-optimization.md) | No High Query Rate Optimization | Open | High |
116116
| [009](./009-no-leading-edge-prioritization.md) | No Leading Edge Prioritization | Open | High |
117117
| [010](./010-no-data-caching-or-query-affinity.md) | No Multi-Level Data Caching or Query Affinity Optimization | Open | High |
118+
| [011](./011-no-legacy-promotion-planner.md) | No Planner-Level Legacy Promotion | Open | Medium |

0 commit comments

Comments
 (0)