Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 12 additions & 4 deletions be/src/load/group_commit/wal/wal_reader.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -41,16 +41,16 @@ Status WalReader::init_reader(const TupleDescriptor* tuple_descriptor) {

// ---- Unified init_reader(ReaderInitContext*) overrides ----

Status WalReader::_open_file_reader(ReaderInitContext* /*ctx*/) {
Status WalReader::_open_file_reader(ReaderInitContext* ctx) {
auto* wal_ctx = checked_context_cast<WalInitContext>(ctx);
_tuple_descriptor = wal_ctx->output_tuple_descriptor;
Comment on lines +44 to +46
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wal_ctx->output_tuple_descriptor is dereferenced later via _tuple_descriptor->slots() (both in _do_get_next_block and in _get_columns_impl). If it is ever null, this will crash and/or silently reintroduce the “all columns missing” behavior. Please add an explicit validation here (return an error Status with a clear message) instead of relying on implicit invariants.

Copilot uses AI. Check for mistakes.
RETURN_IF_ERROR(_state->exec_env()->wal_mgr()->get_wal_path(_wal_id, _wal_path));
_wal_reader = std::make_shared<doris::WalFileReader>(_wal_path);
RETURN_IF_ERROR(_wal_reader->init());
return Status::OK();
}

Status WalReader::_do_init_reader(ReaderInitContext* base_ctx) {
auto* ctx = checked_context_cast<WalInitContext>(base_ctx);
_tuple_descriptor = ctx->output_tuple_descriptor;
Status WalReader::_do_init_reader(ReaderInitContext* /*base_ctx*/) {
return Status::OK();
}

Expand Down Expand Up @@ -131,6 +131,14 @@ Status WalReader::_get_columns_impl(std::unordered_map<std::string, DataTypePtr>
} catch (const std::invalid_argument& e) {
return Status::InvalidArgument("Invalid format, {}", e.what());
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a regression test for this path. This patch fixes the on_before_init_reader() / get_columns() ordering bug, but the nearby scanner-based WAL tests in be/test/format/wal/wal_manager_test.cpp are currently disabled, and this PR does not add replacement coverage. Without a FileScanner + FORMAT_WAL test (or equivalent replay coverage), this exact all columns marked missing -> replay writes NULL/defaults regression can slip back in unnoticed.

// Report WAL columns so on_before_init_reader does not mark them as missing.
if (_tuple_descriptor) {
for (auto* slot_desc : _tuple_descriptor->slots()) {
if (_column_pos_map.contains(slot_desc->col_unique_id())) {
name_to_type->emplace(slot_desc->col_name(), slot_desc->get_data_type_ptr());
}
}
}
Comment on lines +134 to +141
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change alters how WAL schema is reported to TableFormatReader::on_before_init_reader() (via _get_columns_impl), which was previously causing runtime “missing column” behavior. Please add/enable a unit/regression test that reproduces the 0-rows WAL replay issue and asserts the reader no longer marks present WAL columns as missing (e.g., via missing_cols() or by verifying the output block row count is preserved).

Copilot uses AI. Check for mistakes.
return Status::OK();
}

Expand Down
Loading