chore: drop leftover JVM Parquet helpers and the native_datafusion scan name by andygrove · Pull Request #4385 · apache/datafusion-comet

andygrove · 2026-05-21T15:43:37Z

Which issue does this PR close?

Closes #.

Rationale for this change

After the JVM Parquet reader paths (native_comet and native_iceberg_compat) were retired, three Spark-derived helpers were left behind with no production callers. They were originally needed for the JVM vectorized reader to clip Spark request schemas against Parquet file schemas and to build predicate column descriptors; the native Parquet scan does all of that in Rust now.

The same cleanup pass surfaced something else: every reference to native_datafusion in code, comments, and test names is now misleading. There is only one scan, so naming it adds no information and makes the code read as if other options still exist.

What changes are included in this PR?

Delete three dead files:
- spark/src/main/scala/org/apache/spark/sql/comet/parquet/CometParquetReadSupport.scala
- spark/src/main/scala/org/apache/spark/sql/comet/parquet/CometSparkToParquetSchemaConverter.scala (only referenced by the file above)
- spark/src/main/java/org/apache/parquet/filter2/predicate/SparkFilterApi.java (untouched since the initial PR; no callers)
Rename CometScanRule.nativeDataFusionScan to nativeScan.
Update test names that contained native_datafusion to just say native scan (5 tests in ParquetReadV1Suite, 1 in CometTaskMetricsSuite).
Update comments in ParquetReadSuite, ParquetEncryptionITCase, CometNativeScan, parquet_exec.rs, and the bug-triage skill that referred to native_datafusion or native_iceberg_compat.

Net diff: 10 files, +17 / -767.

How are these changes tested?

Covered by existing tests; the renames are mechanical and the deleted files have no callers.

ParquetReadV1Suite "native scan rejects": 5/5 pass (the renamed type-mismatch regression tests).
make and cargo clippy --all-targets --workspace -- -D warnings both green.

…scan name After the JVM Parquet reader paths were removed, three Spark-derived helpers in `spark.sql.comet.parquet` and `parquet.filter2.predicate` were left behind with no production callers. The scan they used to support is also the only scan we still have, so the `native_datafusion` qualifier in code, comments, and test names is now redundant. Removed: - `CometParquetReadSupport.scala`: copy of Spark's `ParquetReadSupport` - `CometSparkToParquetSchemaConverter.scala`: only referenced by the file above - `SparkFilterApi.java`: copy of Spark 3.2 filter API, never referenced Renamed: - `CometScanRule.nativeDataFusionScan` -> `nativeScan` - Test cases and comments referring to `native_datafusion` now say "native scan" / "native Parquet scan" Also dropped the stale `native_iceberg_compat` mention from `parquet_exec.rs` and the bug-triage skill's area-indicator list.

mbutrovich

Thanks for finding the dead code, @andygrove!

andygrove and others added 2 commits May 21, 2026 09:43

Merge branch 'main' into parquet-reader-audit

c0c1828

mbutrovich approved these changes May 22, 2026

View reviewed changes

andygrove merged commit 354ad46 into apache:main May 22, 2026
125 checks passed

andygrove deleted the parquet-reader-audit branch May 22, 2026 01:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore: drop leftover JVM Parquet helpers and the native_datafusion scan name#4385

chore: drop leftover JVM Parquet helpers and the native_datafusion scan name#4385
andygrove merged 2 commits into
apache:mainfrom
andygrove:parquet-reader-audit

andygrove commented May 21, 2026

Uh oh!

mbutrovich left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

andygrove commented May 21, 2026

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

How are these changes tested?

Uh oh!

mbutrovich left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants