Commit 6bdfc9e

andygrove and claude committed
Revert documentation changes to parquet_scans.md
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 33d87cd commit 6bdfc9e

1 file changed

Lines changed: 1 addition & 5 deletions

docs/source/contributor-guide/parquet_scans.md

@@ -37,13 +37,9 @@ implementation:
 
 - Leverages the DataFusion community's ongoing improvements to `DataSourceExec`
 - Provides support for reading complex types (structs, arrays, and maps)
-- Delegates Parquet decoding to native Rust code rather than JVM-side decoding
+- Removes the use of reusable mutable-buffers in Comet, which is complex to maintain
 - Improves performance
 
-> **Note on mutable buffers:** Both `native_comet` and `native_iceberg_compat` use reusable mutable buffers
-> when transferring data from JVM to native code via Arrow FFI. The `native_iceberg_compat` implementation uses DataFusion's native Parquet reader for data columns, bypassing Comet's mutable buffer infrastructure entirely. However, partition columns still use `ConstantColumnReader`, which relies on Comet's mutable buffers that are reused across batches. This means native operators that buffer data (such as `SortExec` or `ShuffleWriterExec`) must perform deep copies to avoid data corruption.
-> See the [FFI documentation](ffi.md) for details on the `arrow_ffi_safe` flag and ownership semantics.
-
 The `native_datafusion` and `native_iceberg_compat` scans share the following limitations:
 
 - When reading Parquet files written by systems other than Spark that contain columns with the logical type `UINT_8`
