Skip to content

Commit e4d92f7

Browse files
chiinlquahclaude
andauthored
refactor: deduplicate ListedLogFiles / LogSegment and clean up validation (delta-io#1983)
## What changes are proposed in this pull request? This PR cleans up the relationship between `ListedLogFiles` (now `LogSegmentFiles`) and `LogSegment`, removing several sources of duplication: - **Remove `ListedLogFilesBuilder`**: it was an exact structural duplicate of `ListedLogFiles`. Replaced with `#[derive(Default)]` and direct struct-literal construction + `..Default::default()`. - **Rename `ListedLogFiles` → `LogSegmentFiles`**: the struct combines filesystem-listed files _and_ catalog-provided log tail commits, so "listed" was a misnomer. Also renames the module. - **Embed `LogSegmentFiles` in `LogSegment`**: `LogSegment` previously declared the same 6 fields (`ascending_commit_files`, `ascending_compaction_files`, `checkpoint_parts`, `latest_crc_file`, `latest_commit_file`, `max_published_version`) that `LogSegmentFiles` already had. These are now accessed via `log_segment.listed.*`. Also removes `into_parts()` since `try_new()` now mutates the struct in place. - **Consolidate validation into `LogSegment::try_new`**: every `LogSegmentFiles` construction flows into `try_new`, so `validate()` on `LogSegmentFiles` was a redundant pre-check. The structural invariant checks (compaction sort order, checkpoint integrity) are now a `#[cfg(debug_assertions)]` block at the top of `try_new`, alongside the existing always-on checks (contiguity, checkpoint gap, end version). Follow up PR that consolidates all the validations to a single loop and enables them for prod. delta-io#1995 <!-- **Uncomment** this section if there are any changes affecting public APIs. Else, **delete** this section. ### This PR affects the following public APIs If there are breaking changes, please ensure the `breaking-changes` label gets added by CI, and describe why the changes are needed. Note that _new_ public APIs are not considered breaking. --> ## How was this change tested? unit test added --------- Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
1 parent 38a420e commit e4d92f7

15 files changed

Lines changed: 605 additions & 507 deletions

File tree

kernel/src/lib.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -147,7 +147,7 @@ pub mod last_checkpoint_hint;
147147
#[cfg(not(feature = "internal-api"))]
148148
pub(crate) mod last_checkpoint_hint;
149149

150-
pub(crate) mod listed_log_files;
150+
pub(crate) mod log_segment_files;
151151

152152
#[cfg(feature = "internal-api")]
153153
pub mod history_manager;

kernel/src/log_reader/checkpoint_manifest.rs

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -136,8 +136,8 @@ mod tests {
136136
) -> DeltaResult<()> {
137137
let log_segment = snapshot.log_segment();
138138
let log_root = log_segment.log_root.clone();
139-
assert_eq!(log_segment.checkpoint_parts.len(), 1);
140-
let checkpoint_file = &log_segment.checkpoint_parts[0];
139+
assert_eq!(log_segment.listed.checkpoint_parts.len(), 1);
140+
let checkpoint_file = &log_segment.listed.checkpoint_parts[0];
141141
let mut manifest_phase =
142142
CheckpointManifestReader::try_new(engine.clone(), checkpoint_file, log_root)?;
143143

@@ -218,7 +218,7 @@ mod tests {
218218

219219
let manifest_phase = CheckpointManifestReader::try_new(
220220
engine.clone(),
221-
&snapshot.log_segment().checkpoint_parts[0],
221+
&snapshot.log_segment().listed.checkpoint_parts[0],
222222
snapshot.log_segment().log_root.clone(),
223223
)?;
224224

kernel/src/log_segment.rs

Lines changed: 156 additions & 104 deletions
Large diffs are not rendered by default.

0 commit comments

Comments
 (0)