From c2c863a4140f895b43885cd85e86514a73f406e9 Mon Sep 17 00:00:00 2001 From: jackylee-ch Date: Tue, 19 May 2026 15:28:17 +0800 Subject: [PATCH] [Velox] Fix Iceberg writer aggregate-init build failure on macOS C++20 libc++ MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two call sites in cpp/velox/compute/iceberg/IcebergWriter.cc construct aggregate types via paren-init with multiple arguments: fields.emplace_back(name, type, transform, parameter); // line 311 return WriteStats(numWrittenBytes, // line 262 numWrittenFiles, writeIOTimeNs, writeWallNs); Both target types are pure aggregate structs without a user-defined constructor: // Velox: velox/connectors/hive/iceberg/PartitionSpec.h struct IcebergPartitionSpec::Field { std::string name; TypePtr type; TransformType transformType; std::optional parameter; }; // Gluten: cpp/velox/compute/iceberg/IcebergWriter.h struct WriteStats { uint64_t numWrittenBytes{0}; uint32_t numWrittenFiles{0}; uint64_t writeIOTimeNs{0}; uint64_t writeWallNs{0}; ... }; C++20 P0960R3 made aggregates initializable via paren-init at the language level, but the standard library implementation of std::construct_at (which std::vector::emplace_back routes through to do in-place construction) is uneven across stdlibs: - libstdc++ (Linux GCC): construct_at accepts aggregate paren-init, so emplace_back(a, b, c, d) compiles for aggregate types. - libc++ (Apple Clang 17, macOS): construct_at still requires a real constructor, so the substitution fails: error: no matching function for call to 'construct_at' note: candidate template ignored: substitution failure [with _Tp = IcebergPartitionSpec::Field, _Args = ...]: no matching constructor for initialization of 'IcebergPartitionSpec::Field' Result: builds with --enable_enhanced_features=ON on macOS abort hard at IcebergWriter.cc.cpp.o (~156/217) before producing libgluten.dylib, blocking any macOS user from running the Iceberg writer integration that this gate adds. The fix is a minimal style change at the two call sites — switch from paren-init to brace-init, which is aggregate initialization on every C++17/20 toolchain (libstdc++, libc++, MSVC): fields.push_back({name, type, transform, parameter}); return WriteStats{numWrittenBytes, numWrittenFiles, ...}; Linux libstdc++ continues to compile cleanly: brace-init resolves to the same aggregate initialization the paren-init form selected via P0960. Zero runtime change, identical field-by-field initialization. Verification: - macOS 14 arm64 + Apple Clang 17 + --enable_enhanced_features=ON: cpp build now reaches 217/217 (was 156/217 before this fix). - cpp ctest matrix on the same configuration: 5539 / 5539 pass, including the enhanced-only velox_iceberg_test executable. - Linux x86_64 build remains green; brace-init for aggregates is unchanged in libstdc++. --- cpp/velox/compute/iceberg/IcebergWriter.cc | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/cpp/velox/compute/iceberg/IcebergWriter.cc b/cpp/velox/compute/iceberg/IcebergWriter.cc index 576524d4996..e248fe4d279 100644 --- a/cpp/velox/compute/iceberg/IcebergWriter.cc +++ b/cpp/velox/compute/iceberg/IcebergWriter.cc @@ -259,11 +259,11 @@ WriteStats IcebergWriter::writeStats() const { const auto currentTimeNs = getCurrentTimeNano(); VELOX_CHECK_GE(currentTimeNs, createTimeNs_); const auto sinkStats = dataSink_->stats(); - return WriteStats( + return WriteStats{ sinkStats.numWrittenBytes, sinkStats.numWrittenFiles, sinkStats.writeIOTimeUs * 1000, - currentTimeNs - createTimeNs_); + currentTimeNs - createTimeNs_}; } std::shared_ptr @@ -308,7 +308,7 @@ parseIcebergPartitionSpec(const uint8_t* data, const int32_t length, RowTypePtr parameter = protoField.parameter(); } - fields.emplace_back(protoField.name(), rowType->findChild(protoField.name()), transform, parameter); + fields.push_back({protoField.name(), rowType->findChild(protoField.name()), transform, parameter}); } return std::make_shared(protoSpec.spec_id(), fields);