feat: add iceberg_data library alongside iceberg#631
feat: add iceberg_data library alongside iceberg#631zhjwpku wants to merge 2 commits intoapache:mainfrom
Conversation
c766bd8 to
889125a
Compare
| set(CMAKE_CXX_STANDARD 23) | ||
|
|
||
| find_package(iceberg CONFIG REQUIRED) | ||
| find_package(iceberg CONFIG REQUIRED COMPONENTS bundle rest) |
There was a problem hiding this comment.
This doesn't have to be changed, I updated it mainly to test the COMPONENTS keyword and thought it could be a useful reference for others. I will change it back if any objection.
There was a problem hiding this comment.
Pull request overview
This PR splits data-path functionality (writers, deletes, puffin) into a new iceberg_data library while keeping the existing iceberg target focused on core metadata/planning and utilities, and updates build/test wiring accordingly.
Changes:
- Introduce
iceberg_dataas a separate library (Meson + CMake) and movedata/,deletes/, andpuffin/sources to it. - Update unit test build definitions to link against
iceberg_datawhere needed. - Adjust packaging/examples/CI scripts for the new component structure and Windows environment detection.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/iceberg/test/meson.build | Adds per-test selection to link against iceberg vs iceberg_data. |
| src/iceberg/test/CMakeLists.txt | Adds USE_DATA option for tests to link iceberg_data where required. |
| src/iceberg/meson.build | Creates iceberg_data library target and adjusts dependencies/source lists. |
| src/iceberg/iceberg_export.h | Tweaks template export macro behavior for non-Windows builds. |
| src/iceberg/iceberg-config.cmake.in | Documents new iceberg_data CMake targets/components. |
| src/iceberg/file_writer.h | Exports WriterProperties for cross-library visibility. |
| src/iceberg/data/meson.build | Installs public headers for iceberg/data/*. |
| src/iceberg/arrow_c_data_guard_internal.h | Exports guard classes used across the new library boundary. |
| src/iceberg/CMakeLists.txt | Adds iceberg_data CMake target and wires bundle to depend on it. |
| example/CMakeLists.txt | Requests bundle and rest components explicitly in find_package(iceberg ...). |
| ci/scripts/build_iceberg.sh | Improves Windows detection to include Cygwin. |
| ci/scripts/build_example.sh | Improves Windows detection to include Cygwin. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| endif() | ||
|
|
||
| if(TARGET iceberg_data_static) | ||
| target_compile_definitions(iceberg_data_static PRIVATE ICEBERG_STATIC) |
There was a problem hiding this comment.
iceberg_data_static sets ICEBERG_STATIC as a PRIVATE compile definition, but consumers of the static library also need ICEBERG_STATIC so that ICEBERG_EXPORT in public headers (e.g., iceberg/data/writer.h) doesn’t expand to dllimport on Windows. As-is, linking an external target against iceberg::iceberg_data_static can fail due to incorrect import/export annotations. Consider making ICEBERG_STATIC an INTERFACE/PUBLIC compile definition on iceberg_data_static (potentially gated to WIN32), or switching the data headers to a dedicated ICEBERG_DATA_* export macro that add_iceberg_lib() already defines.
| target_compile_definitions(iceberg_data_static PRIVATE ICEBERG_STATIC) | |
| target_compile_definitions(iceberg_data_static | |
| PUBLIC $<$<PLATFORM_ID:Windows>:ICEBERG_STATIC>) |
|
Out of curiosity, what is the purpose of creating this |
Check this issue #627, we can have further discussion there. |
thank you. |
6ae47fd to
190c145
Compare
| ) | ||
|
|
||
| iceberg_data_interface_args = [] | ||
| if get_option('default_library') != 'shared' |
There was a problem hiding this comment.
Use == 'static' here, like iceberg_rest. With --default-library=both, consumers can link the shared iceberg_data library but still get ICEBERG_DATA_STATIC, so Windows builds won't import the DLL symbols correctly.
| visit_type_test.cc) | ||
|
|
||
| add_iceberg_test(roaring_test SOURCES roaring_test.cc) | ||
| add_iceberg_test(roaring_test USE_DATA SOURCES roaring_test.cc) |
There was a problem hiding this comment.
IMO, we can remove the executable roaring_test as well as the file roaring_test.cc.
|
|
||
| /// \brief Metadata about a Puffin file. | ||
| struct ICEBERG_EXPORT FileMetadata { | ||
| struct ICEBERG_DATA_EXPORT FileMetadata { |
There was a problem hiding this comment.
I'm not quite sure if puffin file metadata (and other associated metadata definitions) belong to the iceberg_data library. I'm fine to go with the current PR and change if necessary in the future.
| if(TARGET iceberg_data_static) | ||
| target_compile_definitions(iceberg_data_static PRIVATE ICEBERG_STATIC) | ||
| endif() |
There was a problem hiding this comment.
| if(TARGET iceberg_data_static) | |
| target_compile_definitions(iceberg_data_static PRIVATE ICEBERG_STATIC) | |
| endif() |
iceberg_data_static has linked iceberg_static so all symbols from the latter should be imported. Let's remove these lines.
Move data writers, deletes/, and puffin/ into a separate
iceberg_datalibrary that links the existingicebergtarget.delete_file_indexstays inicebergbecause manifest_group embeds DeleteFileIndex::Builder with only core dependencies.iceberg— unchanged target name for metadata/planning, expressions, manifests, catalog (incl. in-memory), utilities, file I/O abstractions, and delete_file_index.iceberg_data— data/, deletes/, puffin/; linksiceberg.iceberg_bundlelinksiceberg_datawhen the bundle is built.iceberg_restlinksicebergand cpr only.