[phase-31 2/4] Compaction metadata types#6258
[phase-31 2/4] Compaction metadata types#6258g-talbot wants to merge 1 commit intogtt/phase-31-sort-schemafrom
Conversation
598de1a to
c95095b
Compare
3696bb4 to
a90ca2f
Compare
c95095b to
5b1c080
Compare
a90ca2f to
d60973e
Compare
5b1c080 to
6ebf40b
Compare
d60973e to
cf9484d
Compare
295f59c to
f3c03dc
Compare
cf9484d to
bb42b42
Compare
acb5d28 to
a8dc7a2
Compare
bb42b42 to
5b2304c
Compare
4ce16a3 to
b89e965
Compare
39f875f to
4d42fd9
Compare
b89e965 to
b9566a6
Compare
… model, field lookup Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
b9566a6 to
b6eb595
Compare
|
@codex review |
mattmkim
left a comment
There was a problem hiding this comment.
LGTM, let's see if codex finds anything
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b6eb5958d7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| low_cardinality_tags: HashMap<String, HashSet<String>>, | ||
| high_cardinality_tag_keys: HashSet<String>, | ||
| created_at: SystemTime, | ||
| parquet_file: String, |
There was a problem hiding this comment.
Add serde default for parquet_file in legacy metadata
MetricsSplitMetadataSerde now requires parquet_file, but metadata JSON produced before this commit did not contain that field, so deserialization will fail for existing rows during upgrades. PgMetricsSplit::to_metadata() parses split_metadata_json, and metastore.rs currently drops rows on parse failure via to_metadata().ok()?, which makes legacy splits disappear from list/query results instead of remaining readable.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
i mean, true if we were in production already, but i don't think this matters.
| low_cardinality_tags: self.low_cardinality_tags, | ||
| high_cardinality_tag_keys: self.high_cardinality_tag_keys, | ||
| created_at: SystemTime::now(), | ||
| parquet_file: self.parquet_file, |
There was a problem hiding this comment.
Derive parquet_file when builder callers omit it
The builder stores parquet_file as a default-initialized String and build() writes it through unchanged, so callers that do not explicitly call .parquet_file(...) now emit metadata with an empty parquet path. This silently records invalid split metadata for current construction paths and loses the file location that was previously derivable from split_id.
Useful? React with 👍 / 👎.
Summary
Extend MetricsSplitMetadata with 6 compaction fields and update PostgreSQL model types (Phase 31 Metadata Foundation, PR 2 of 4).
Stacks on
gtt/phase-31-sort-schema(PR #6242).What's included
split/metadata.rs:
MetricsSplitMetadata:window_start,window_duration_secs,sort_fields,num_merge_ops,row_keys_proto,zonemap_regexesparquet_files: Vec<String>field (replaces single-file assumption)debug_assert!invariants from ADR-003split/postgres.rs:
InsertableMetricsSplitandPgMetricsSplitfrom_metadata()populates new SQL columnsschema/fields.rs:
ParquetField::from_name()for name-based field lookup (used by sort field resolver)Verification
cargo build -p quickwit-parquet-engine✅cargo test -p quickwit-parquet-engine -- split::metadata split::postgres✅ (11 tests)cargo clippy -p quickwit-parquet-engine --all-features --tests✅Test plan
(Replaces #6243 which was accidentally auto-merged during a rebase)
🤖 Generated with Claude Code