Skip to content

feat: implement basic parquet writer and add roundtrip tests#198

Merged
zeroshade merged 9 commits intoapache:mainfrom
HuaHuaY:basic_parquet_writer
Sep 3, 2025
Merged

feat: implement basic parquet writer and add roundtrip tests#198
zeroshade merged 9 commits intoapache:mainfrom
HuaHuaY:basic_parquet_writer

Conversation

@HuaHuaY
Copy link
Copy Markdown
Contributor

@HuaHuaY HuaHuaY commented Aug 28, 2025

Add parquet writer factory and basic parquet writer without metrics.

Copilot AI review requested due to automatic review settings August 28, 2025 03:35
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a basic ParquetWriter class for writing ArrowArray data to Parquet files. The implementation follows the Writer interface pattern and uses Apache Arrow/Parquet libraries for the underlying functionality.

  • Adds ParquetWriter class with PIMPL pattern for encapsulation
  • Implements core writer operations: Open, Write, Close with Arrow/Parquet integration
  • Provides placeholder implementations for metrics, length, and split_offsets methods

Reviewed Changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.

File Description
src/iceberg/parquet/parquet_writer.h Header file defining the ParquetWriter class interface with PIMPL pattern
src/iceberg/parquet/parquet_writer.cc Implementation of ParquetWriter with Arrow/Parquet integration and factory registration
src/iceberg/CMakeLists.txt Adds parquet_writer.cc to the build configuration

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment thread src/iceberg/parquet/parquet_writer.cc
Comment thread src/iceberg/parquet/parquet_writer.cc Outdated
@HuaHuaY HuaHuaY force-pushed the basic_parquet_writer branch 2 times, most recently from 602b681 to 978dfcf Compare September 1, 2025 06:23
@HuaHuaY HuaHuaY changed the title feat: implement basic parquet writer feat: implement basic parquet writer and add roundtrip tests Sep 1, 2025
Comment thread src/iceberg/parquet/parquet_reader.cc Outdated
Comment thread src/iceberg/schema_internal.h Outdated
Comment thread test/parquet_test.cc Outdated
Comment thread src/iceberg/parquet/parquet_writer.cc Outdated
Comment thread src/iceberg/parquet/parquet_writer.cc
Comment thread src/iceberg/parquet/parquet_writer.cc
Comment thread src/iceberg/parquet/parquet_writer.cc Outdated
Comment thread test/parquet_test.cc Outdated
Comment thread test/parquet_test.cc Outdated
Comment thread test/parquet_test.cc Outdated
@HuaHuaY HuaHuaY force-pushed the basic_parquet_writer branch 2 times, most recently from 3039e80 to 90039fb Compare September 1, 2025 11:14
@HuaHuaY HuaHuaY force-pushed the basic_parquet_writer branch from 1f9507a to 02a05cc Compare September 2, 2025 02:30
Comment thread src/iceberg/parquet/parquet_writer.cc Outdated
Comment thread src/iceberg/parquet/parquet_writer.cc Outdated
Comment thread src/iceberg/parquet/parquet_writer.cc
Comment thread test/parquet_test.cc Outdated
Comment thread test/parquet_test.cc Outdated
Comment thread test/parquet_test.cc Outdated
Comment thread test/parquet_test.cc Outdated
Comment thread test/parquet_test.cc
Comment thread test/parquet_test.cc
Comment thread test/parquet_test.cc Outdated
Copy link
Copy Markdown
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left some nits.

Comment thread src/iceberg/parquet/parquet_writer.cc Outdated
Comment thread test/parquet_test.cc Outdated
Comment thread test/parquet_test.cc
@zeroshade zeroshade merged commit 1002270 into apache:main Sep 3, 2025
7 checks passed
@HuaHuaY HuaHuaY deleted the basic_parquet_writer branch September 4, 2025 09:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants