You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
# Comet write toggles (both off by default; can be turned on independently)
91
+
spark.comet.write.iceberg.splitOperator.enabled=true # layer 1: split-operator plan
92
+
spark.comet.write.iceberg.nativeAcceleration=true # layer 2: native parquet write (detection only at this stage)
84
93
```
85
94
86
-
`IcebergSparkSessionExtensions` is **mandatory on Spark 3.4** for `UPDATE` / `MERGE` on V2
87
-
tables — Spark 3.4's stock planner rejects `UPDATE TABLE` on V2 sources, and Iceberg's analyzer
88
-
extension (`RewriteUpdateTable` / `RewriteMergeIntoTable`) is what provides the rewrite. On
89
-
Spark 3.5+ the extension is optional but recommended (Spark added native row-level operation
90
-
support in 3.5). `INSERT INTO` / `INSERT OVERWRITE` / `DELETE FROM` work without the extension
91
-
on every Spark version.
95
+
`IcebergSparkSessionExtensions` is **mandatory on Spark 3.4** for `UPDATE` / `MERGE` on V2 tables —
96
+
Spark 3.4's stock planner rejects `UPDATE TABLE` on V2 sources, and Iceberg's analyzer extension
97
+
(`RewriteUpdateTable` / `RewriteMergeIntoTable`) is what provides the rewrite. On Spark 3.5+ the
98
+
extension is optional but recommended (Spark added native row-level operation support in 3.5).
99
+
`INSERT INTO` / `INSERT OVERWRITE` / `DELETE FROM` work without the extension on every Spark
100
+
version.
92
101
93
102
If `splitOperator.enabled=false`, Comet leaves Iceberg's stock plan alone — every write goes
94
103
straight through Iceberg-Java.
@@ -110,18 +119,55 @@ realisable benefit.
110
119
111
120
-[¹] Requires `IcebergSparkSessionExtensions` (see Configuration above).
112
121
122
+
## Fallback triggers
123
+
124
+
Before accepting a write into the native path, the serde checks for things iceberg-rust either
125
+
doesn't support or implements differently from iceberg-java. The first matching trigger short-
126
+
circuits the conversion and the plan stays on the JVM two-op path.
127
+
128
+
Each row attributes the gap to the layer that's the actual root cause: **iceberg-rust** (the
129
+
high-level Iceberg writer / commit / metadata logic) or **parquet-rs** (the underlying Apache
130
+
Parquet Rust encoder that iceberg-rust drives).
131
+
132
+
| Property / state | Falls back when … | Why |
133
+
|---|---|---|
134
+
| resolved write format |`SparkWrite.format` (i.e. effective `write-format` option overlaid on `write.format.default`) is not `parquet`|**iceberg-rust**: writer stack is parquet-only today. The check reads the resolved format, so per-write `option("write-format", "orc")` is honoured even when the table default is parquet. |
135
+
|`write.object-storage.enabled`|`true`|**iceberg-rust**: no hashed-prefix object-storage location generator |
136
+
|`write.location-provider.impl`| set |**iceberg-rust**: can't load a Java `LocationProvider` class |
|`encryption.*` (any key) | set |**iceberg-rust + parquet-rs**: no Parquet modular encryption support in either layer |
139
+
|`write.metadata.metrics.default`| mentions `counts`|**iceberg-rust**: doesn't implement Iceberg's `counts`-without-bounds metrics mode. parquet-rs supports stats on or off but has no "value counts only" knob, so iceberg-rust would need to track and emit those counts itself; it doesn't. |
140
+
|`write.metadata.metrics.default`|`none`|**iceberg-rust**: always populates `column_sizes` / `value_counts` / `null_value_counts` / bounds from the parquet footer regardless of the requested mode; iceberg-java's `none` emits an empty `DataFile.metrics`, so a native write would produce strictly richer manifests. |
141
+
|`write.metadata.metrics.column.<col>`| any column set to `counts`|**iceberg-rust**: same as default `counts`, per-column |
142
+
|`write.metadata.metrics.column.<col>`| any column set to `none`|**iceberg-rust**: same as default `none`, per-column |
143
+
|`write.parquet.bloom-filter-max-bytes`| explicitly set |**iceberg-rust**: doesn't enforce iceberg-java's bloom-filter byte cap. parquet-rs has no global cap parameter on `WriterProperties`, so iceberg-rust would need to clamp NDV / FPP itself before passing them through. |
144
+
|`write.parquet.bloom-filter-enabled.column.<col>`| any column set to `true`|**iceberg-rust**: bloom-filter sizing diverges from iceberg-java (same root cause as above — no cap clamping) |
145
+
|`write.parquet.row-group-check-min-record-count`| set to a non-default value (default `100`) |**parquet-rs**: row-group close cadence is byte-driven and doesn't expose parquet-mr's per-row-count knobs, so iceberg-rust can't honour this property |
146
+
|`write.parquet.row-group-check-max-record-count`| set to a non-default value (default `10000`) |**parquet-rs**: same as above |
147
+
|`write.parquet.page-version`| set to a non-default value (default `v1`) |**parquet-rs**: the default writer does not implement DataPageV2 encoding; iceberg-java with `v2` produces a different on-disk format. |
148
+
|`write.parquet.stats-enabled.column.<col>`| any column override set |**parquet-rs**: no per-column statistics-enabled override on `WriterProperties` matching parquet-mr's; iceberg-rust would silently ignore the request. |
149
+
|`parquet.enable.dictionary`| set |**parquet-rs**: explicit dictionary override is not translated to the native writer settings; parquet-rs would fall back to its own default and diverge from parquet-mr. |
150
+
|`write.metadata.metrics.max-inferred-column-defaults`| set to less than the number of fields in the schema (default `100`) |**iceberg-rust**: doesn't apply `MetricsModes.None` to columns past this limit; iceberg-java does, so the produced column statistics would diverge |
151
+
|`io-impl` (catalog property) | set |**iceberg-rust**: native FileIO is selected by URI scheme (`OpenDalStorageFactory`), so an explicit Java `FileIO` class can't be honoured. |
152
+
| Resolved data-location URI scheme | not in `{file, memory, s3, s3a, gs, oss}`|**iceberg-rust**: `iceberg_common::storage_factory_for` resolves only the listed schemes; `hdfs`, `abfs`/`abfss`, `wasb`/`wasbs`, and similar would crash at write time. |
153
+
154
+
Every trigger has a pinning test in `CometIcebergWriteDetectionSuite`.
155
+
113
156
## Tests
114
157
115
158
-**`CometIcebergWriteActionSuite`** — end-to-end scenarios against a temporary Hadoop catalog
116
159
covering the copy-on-write V2 logical-write variants on the split-operator path, plus parity
117
160
vs Iceberg-Java's writer, the disabled-config fallback, and the commit-once invariant. Passes
118
161
on Spark 3.4 / 3.5 / 4.0 with `IcebergSparkSessionExtensions` loaded.
162
+
-**`CometIcebergWriteDetectionSuite`** — tests that build a planned-but-not-executed
163
+
`IcebergWriteExec` per fall-back trigger and assert
164
+
`CometIcebergNativeWrite.getSupportLevel` returns `Unsupported` with the expected reason
165
+
fragment, plus a positive baseline (clean parquet V2 table → `Compatible`).
119
166
120
167
## Abort behaviour
121
168
122
-
The writer calls `writer.abort()` per task on failure to release task-level resources.
123
-
The committer (`IcebergCommitExec.run()`) calls `BatchWrite.abort(messages)` if `commit()`
124
-
raises.
169
+
The writer calls `writer.abort()` per task on failure to release task-level resources. The
170
+
committer (`IcebergCommitExec.run()`) calls `BatchWrite.abort(messages)` if `commit()` raises.
125
171
126
172
If task writers stage files locally but their commit messages never reach the driver (e.g.
127
173
driver crash mid-collect), the staged Parquet files become **unreferenced orphans**. Iceberg's
0 commit comments