|
| 1 | +# GAP-011: No Planner-Level Legacy Promotion |
| 2 | + |
| 3 | +**Status**: Open |
| 4 | +**Discovered**: 2026-05-18 |
| 5 | +**Context**: Codex review on PR #6423 (`feat(merge): legacy promotion path + body-col schema evolution`) flagged that the promotion path is wired end-to-end at the library + executor layer but has no production trigger at the planner / policy level. |
| 6 | + |
| 7 | +## Problem |
| 8 | + |
| 9 | +The streaming Parquet merge stack now contains a complete *legacy promotion* pipeline: |
| 10 | + |
| 11 | +- `ParquetMergeOperation::promote_legacy(splits, target_prefix_len)` constructs an operation with |
| 12 | + `target_prefix_len_override = Some(target)`. |
| 13 | +- `merge::execute_merge_operation` routes each input through `LegacyInputAdapter` when its |
| 14 | + declared `rg_partition_prefix_len < target` and through `StreamingParquetReader` otherwise. The |
| 15 | + streaming engine then sees a homogeneous stream advertising `prefix_len = target` on every |
| 16 | + input. |
| 17 | +- `ParquetMergeExecutor` (in `quickwit-indexing`) detects `target_prefix_len_override.is_some()` |
| 18 | + and routes those merges through `execute_merge_operation` (with `LocalFileByteSource`) instead |
| 19 | + of the in-memory `merge_sorted_parquet_files` path. |
| 20 | +- `merge_parquet_split_metadata` accepts a `mixed_prefix_ok: bool` flag so the post-merge |
| 21 | + aggregator skips the input-side equality check. |
| 22 | + |
| 23 | +What's missing: **nothing in the planner ever creates a `promote_legacy` operation in |
| 24 | +production**. `MergePolicyState::record_split` buckets each split by |
| 25 | +`CompactionScope::from_split`, and that scope key includes `rg_partition_prefix_len`. Legacy |
| 26 | +splits (`prefix_len = 0`) and aligned splits (`prefix_len > 0`) therefore land in *different* |
| 27 | +buckets before `ParquetMergePolicy::operations` ever runs. The production policy then iterates |
| 28 | +each bucket independently and emits `ParquetMergeOperation::new` (regular merge). A repo-wide |
| 29 | +search finds `promote_legacy` only in tests. |
| 30 | + |
| 31 | +In a mixed deployment (legacy + aligned splits coexisting), legacy splits therefore stay in |
| 32 | +their `prefix_len = 0` bucket forever — never gaining the prefix alignment that downstream |
| 33 | +locality compaction depends on. The promotion plumbing is reachable only from tests. |
| 34 | + |
| 35 | +## Evidence |
| 36 | + |
| 37 | +- `quickwit-parquet-engine/src/merge/policy/mod.rs`: `ParquetMergePolicy::operations` calls |
| 38 | + `ParquetMergeOperation::new(...)` only. `promote_legacy` is constructed only by tests in the |
| 39 | + same file. |
| 40 | +- `MergePolicyState::record_split` keys its `BTreeMap` by `CompactionScope::from_split`. The |
| 41 | + scope derivation includes `rg_partition_prefix_len`, so a legacy split and a prefix-aligned |
| 42 | + split with otherwise identical sort fields / window / merge level are never compared by the |
| 43 | + policy. |
| 44 | +- The executor branch added in PR #6423 (`scratch.merge_operation.target_prefix_len_override |
| 45 | + .is_some()`) routes promotion through `execute_merge_operation`. Library coverage at |
| 46 | + `test_promote_legacy_executor_end_to_end` exercises a `prefix_len = 0` + `prefix_len = 1` pair |
| 47 | + successfully. But that operation is only ever constructed inside the test. |
| 48 | + |
| 49 | +## State of the Art |
| 50 | + |
| 51 | +- **Iceberg**: Compaction policies inspect file-level metadata (partitioning, sort order) and |
| 52 | + can rewrite files to align with the latest table partitioning even when individual files |
| 53 | + pre-date the change. The compaction service treats schema-evolution-style rewrites as |
| 54 | + first-class operations. |
| 55 | +- **Husky**: Background re-organization passes that promote files into newer storage layouts. |
| 56 | + Tracked separately from the size-tiered compaction policy so cost trade-offs can be tuned. |
| 57 | + |
| 58 | +In both cases, the design separates the *trigger* (decision to promote) from the *mechanism* |
| 59 | +(how the promotion is performed). Quickwit currently has the mechanism but not the trigger. |
| 60 | + |
| 61 | +## Potential Solutions |
| 62 | + |
| 63 | +### Option A: Merge legacy + aligned buckets in `CompactionScope::from_split` |
| 64 | + |
| 65 | +Drop `rg_partition_prefix_len` from the scope key (or normalize it to a target value before |
| 66 | +bucketing). The policy then sees legacy and aligned splits as candidates for the same |
| 67 | +compaction operation and `ParquetMergePolicy::operations` decides whether to emit a regular |
| 68 | +merge or a `promote_legacy` operation based on whether the bucket contains mixed prefix |
| 69 | +lengths. |
| 70 | + |
| 71 | +Simplest change, but requires the policy to detect mixed-prefix buckets and choose between |
| 72 | +`new` and `promote_legacy` per operation. |
| 73 | + |
| 74 | +### Option B: Dedicated promotion pass |
| 75 | + |
| 76 | +Run a separate pass before the regular compaction policy that scans for legacy splits and emits |
| 77 | +`promote_legacy` operations for them. The regular policy then sees only aligned splits. |
| 78 | + |
| 79 | +Cleaner separation of concerns, but means legacy splits are migrated *before* any opportunity |
| 80 | +to coalesce them with aligned neighbors in a single multi-input merge — possibly more work |
| 81 | +overall. |
| 82 | + |
| 83 | +### Option C: Hybrid — bucket together, prefer single-pass promotion |
| 84 | + |
| 85 | +Keep scope bucketing as in option A. Inside the policy, when a bucket contains mixed prefix |
| 86 | +lengths AND has enough splits to merit a multi-input merge, emit `promote_legacy`. When only |
| 87 | +legacy splits exist (no aligned neighbor), emit `promote_legacy` with the same target — single- |
| 88 | +input promotion is still valuable because it converts the file to the new format for future |
| 89 | +locality compaction. |
| 90 | + |
| 91 | +Most flexible; gives the policy the freedom to amortize promotion cost when there are aligned |
| 92 | +neighbors AND to still promote isolated legacy splits in the background. |
| 93 | + |
| 94 | +## Signal Impact |
| 95 | + |
| 96 | +Primarily affects **metrics** in the near term: the legacy split format pre-dates the |
| 97 | +prefix-aligned RG layout, and only metrics has both formats in flight today. Traces and logs |
| 98 | +on the Parquet path will eventually reach the same state if a layout change ever happens; the |
| 99 | +same planner machinery would cover them. |
| 100 | + |
| 101 | +## Cost Considerations |
| 102 | + |
| 103 | +Promotion is strictly more expensive than a regular merge: the legacy adapter buffers the full |
| 104 | +input file in memory and re-encodes it as a single-RG stream before the merge engine sees it. |
| 105 | +For 50 MB metrics splits this is acceptable; for larger inputs the in-memory buffer is the |
| 106 | +gating cost. |
| 107 | + |
| 108 | +The planner should account for this when scheduling — promotion is best amortized into a |
| 109 | +multi-input merge rather than performed as a standalone file rewrite. Option C's "prefer |
| 110 | +multi-input promotion, fall back to single-input" structure captures this. |
| 111 | + |
| 112 | +## Impact |
| 113 | + |
| 114 | +- **Severity**: Medium. Legacy splits accumulate cost (every query against them pays the |
| 115 | + prefix-less scan cost) but correctness is preserved — the locality compaction stack still |
| 116 | + works on aligned splits. |
| 117 | +- **Frequency**: Persistent. Legacy splits never migrate without an explicit trigger. |
| 118 | +- **Affected Areas**: `quickwit-parquet-engine/src/merge/policy/`, `quickwit-parquet-engine/src/merge/mod.rs` (`MergePolicyState::record_split` + `CompactionScope`). |
| 119 | + |
| 120 | +## Next Steps |
| 121 | + |
| 122 | +- [ ] Decide between options A / B / C based on operational priorities and benchmark data. |
| 123 | +- [ ] Design the policy-level "should promote?" heuristic: how many legacy splits before |
| 124 | + triggering, whether to wait for aligned neighbors, how to deprioritize promotion vs |
| 125 | + regular compaction. |
| 126 | +- [ ] Add metrics for `legacy_splits_pending_promotion` and `promotion_operations_emitted` so we |
| 127 | + can observe the policy in production. |
| 128 | +- [ ] Wire whichever option is chosen, with an integration test that exercises the full path |
| 129 | + (legacy split → planner → executor → published prefix-aligned split). |
| 130 | + |
| 131 | +## References |
| 132 | + |
| 133 | +- PR #6423 (legacy promotion path + body-col schema evolution). |
| 134 | +- Codex review comment id `4311184497` (raised the gap). |
| 135 | +- `test_promote_legacy_executor_end_to_end` in `quickwit-parquet-engine::merge::streaming` — |
| 136 | + library-level coverage of the mechanism. |
0 commit comments