Commit 3fcebd3

docs: Mark native_comet scan as deprecated (#3274)
1 parent 5860dd2 commit 3fcebd3

4 files changed

Lines changed: 18 additions & 14 deletions

common/src/main/scala/org/apache/comet/CometConf.scala

Lines changed: 5 additions & 4 deletions
@@ -113,8 +113,9 @@ object CometConf extends ShimCometConf {

 // Deprecated: native_comet uses mutable buffers incompatible with Arrow FFI best practices
 // and does not support complex types. Use native_iceberg_compat or auto instead.
+// This will be removed in a future release.
 // See: https://github.com/apache/datafusion-comet/issues/2186
-@deprecated("Use SCAN_AUTO instead", "0.9.0")
+@deprecated("Use SCAN_AUTO instead. native_comet will be removed in a future release.", "0.9.0")
 val SCAN_NATIVE_COMET = "native_comet"
 val SCAN_NATIVE_DATAFUSION = "native_datafusion"
 val SCAN_NATIVE_ICEBERG_COMPAT = "native_iceberg_compat"
@@ -125,9 +126,9 @@ object CometConf extends ShimCometConf {
 .doc(
 s"The implementation of Comet Native Scan to use. Available modes are `$SCAN_NATIVE_COMET`," +
 s"`$SCAN_NATIVE_DATAFUSION`, and `$SCAN_NATIVE_ICEBERG_COMPAT`. " +
-s"`$SCAN_NATIVE_COMET` (DEPRECATED) is for the original Comet native scan which " +
-"uses a jvm based parquet file reader and native column decoding. " +
-"Supports simple types only. " +
+s"`$SCAN_NATIVE_COMET` (DEPRECATED - will be removed in a future release) is for the " +
+"original Comet native scan which uses a jvm based parquet file reader and native " +
+"column decoding. Supports simple types only. " +
 s"`$SCAN_NATIVE_DATAFUSION` is a fully native implementation of scan based on " +
 "DataFusion. " +
 s"`$SCAN_NATIVE_ICEBERG_COMPAT` is the recommended native implementation that " +
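
For readers skimming this hunk, the following is a minimal sketch of how calling code might validate a scan mode string and surface the deprecation at runtime. It is not code from this commit: the object `ScanModeSketch`, the helper `check`, and the case-insensitive normalization are all hypothetical, and only the four mode names are taken from the diff and its comments.

```scala
object ScanModeSketch {
  // Mode names as they appear in the diff above ("auto" is the value named in the comment).
  val ScanNativeComet = "native_comet"
  val ScanNativeDataFusion = "native_datafusion"
  val ScanNativeIcebergCompat = "native_iceberg_compat"
  val ScanAuto = "auto"

  private val known: Set[String] =
    Set(ScanNativeComet, ScanNativeDataFusion, ScanNativeIcebergCompat, ScanAuto)

  /**
   * Hypothetical helper (not part of Comet): returns the normalized mode together
   * with an optional deprecation warning, or an error message for unknown modes.
   * Lower-casing the input is an assumption made for this sketch only.
   */
  def check(mode: String): Either[String, (String, Option[String])] = {
    val normalized = mode.trim.toLowerCase(java.util.Locale.ROOT)
    if (!known.contains(normalized)) {
      Left(s"Unknown scan implementation: $mode")
    } else if (normalized == ScanNativeComet) {
      val warning =
        s"$ScanNativeComet is deprecated and will be removed in a future release; " +
          s"use $ScanAuto or $ScanNativeIcebergCompat instead."
      Right((normalized, Some(warning)))
    } else {
      Right((normalized, None))
    }
  }
}
```

Returning the warning as a value rather than logging it keeps the sketch free of any logging dependency; a real caller would presumably route it through its own logger.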

docs/source/contributor-guide/ffi.md

Lines changed: 3 additions & 2 deletions
@@ -177,8 +177,9 @@ message Scan {

 #### When ownership is NOT transferred to native:

-If the data originates from `native_comet` scan (or from `native_iceberg_compat` in some cases), then ownership is
-not transferred to native and the JVM may re-use the underlying buffers in the future.
+If the data originates from `native_comet` scan (deprecated, will be removed in a future release) or from
+`native_iceberg_compat` in some cases, then ownership is not transferred to native and the JVM may re-use the
+underlying buffers in the future.

 It is critical that the native code performs a deep copy of the arrays if the arrays are to be buffered by
 operators such as `SortExec` or `ShuffleWriterExec`, otherwise data corruption is likely to occur.
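
The corruption risk described in this hunk can be illustrated with a small, self-contained sketch. It uses plain `Array[Int]` values as a stand-in for Arrow arrays, and every name in it (`BufferReuseSketch`, `ReusingProducer`, `bufferShallow`, `bufferDeep`) is hypothetical; it does not reflect Comet's actual native or JVM code, only the general hazard of buffering references to a re-used mutable buffer.

```scala
object BufferReuseSketch {
  // Stand-in for a producer that re-uses one mutable buffer for every batch.
  // Assumes all batches have the same length.
  final class ReusingProducer(batches: Seq[Array[Int]]) {
    require(batches.nonEmpty, "at least one batch is required")
    private val shared = new Array[Int](batches.head.length)

    def iterator: Iterator[Array[Int]] = batches.iterator.map { b =>
      Array.copy(b, 0, shared, 0, b.length) // overwrite the shared buffer in place
      shared                                // the same array instance is returned each time
    }
  }

  // Buffers batches without copying: every stored element aliases the shared buffer.
  def bufferShallow(it: Iterator[Array[Int]]): Vector[Array[Int]] = it.toVector

  // Buffers batches with a deep copy taken before the producer overwrites the buffer.
  def bufferDeep(it: Iterator[Array[Int]]): Vector[Array[Int]] = it.map(_.clone()).toVector
}
```

With two batches `Array(1, 2)` and `Array(3, 4)`, `bufferShallow` ends up holding two references to the same array, so both entries show the last batch, while `bufferDeep` preserves each batch as it was produced.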

docs/source/contributor-guide/parquet_scans.md

Lines changed: 8 additions & 6 deletions
@@ -26,11 +26,11 @@ settings. Most users should not need to change this setting. However, it is poss
 a particular implementation for all scan operations by setting this configuration property to one of the following
 implementations.

-| Implementation | Description |
-| ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| `native_comet` | This implementation provides strong compatibility with Spark but does not support complex types. This is the original scan implementation in Comet and may eventually be removed. |
-| `native_iceberg_compat` | This implementation delegates to DataFusion's `DataSourceExec` but uses a hybrid approach of JVM and native code. This scan is designed to be integrated with Iceberg in the future. |
-| `native_datafusion` | This experimental implementation delegates to DataFusion's `DataSourceExec` for full native execution. There are known compatibility issues when using this scan. |
+| Implementation | Description |
+| ----------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `native_comet` | **Deprecated.** This implementation provides strong compatibility with Spark but does not support complex types. This is the original scan implementation in Comet and will be removed in a future release. |
+| `native_iceberg_compat` | This implementation delegates to DataFusion's `DataSourceExec` but uses a hybrid approach of JVM and native code. This scan is designed to be integrated with Iceberg in the future. |
+| `native_datafusion` | This experimental implementation delegates to DataFusion's `DataSourceExec` for full native execution. There are known compatibility issues when using this scan. |

 The `native_datafusion` and `native_iceberg_compat` scans provide the following benefits over the `native_comet`
 implementation:
@@ -71,7 +71,9 @@ The `native_datafusion` scan has some additional limitations:

 There are some differences in S3 support between the scan implementations.

-### `native_comet`
+### `native_comet` (Deprecated)
+
+> **Note:** The `native_comet` scan implementation is deprecated and will be removed in a future release.

 The `native_comet` Parquet scan implementation reads data from S3 using the [Hadoop-AWS module](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html), which
 is identical to the approach commonly used with vanilla Spark. AWS credential configuration and other Hadoop S3A

docs/source/contributor-guide/roadmap.md

Lines changed: 2 additions & 2 deletions
@@ -42,8 +42,8 @@ more Spark SQL tests and fully implementing ANSI support ([#313]) for all suppor

 ### Removing the native_comet scan implementation

-We are working towards deprecating ([#2186]) and removing ([#2177]) the `native_comet` scan implementation, which
-is the original scan implementation that uses mutable buffers (which is incompatible with best practices around
+The `native_comet` scan implementation is now deprecated and will be removed in a future release ([#2186], [#2177]).
+This is the original scan implementation that uses mutable buffers (which is incompatible with best practices around
 Arrow FFI) and does not support complex types.

 Now that the default `auto` scan mode uses `native_iceberg_compat` (which is based on DataFusion's `DataSourceExec`),
