CometNativeException: "arrays of different length" when using to_date on Iceberg Timestamp column #3255

@boudica-dev-eng

Describe the bug

I am encountering a CometNativeException when performing standard date transformations (to_date or datediff) on a Timestamp column read from an Iceberg table.

The error message "Cannot perform binary operation on arrays of different length" occurs even though the table schema contains no ArrayType columns (only scalar types).

The issue appears to be related to how Comet handles the vectorisation of the Timestamp column, possibly involving dictionary encoding in the underlying Parquet files.

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage X failed 4 times...
Caused by: org.apache.comet.CometNativeException: Compute error: Cannot perform binary operation on arrays of different length
    at org.apache.comet.Native.executePlan(Native Method)
    ...
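
The wording of the error suggests a length check between two columnar buffers rather than anything to do with ArrayType. The dictionary-encoding hypothesis above can be illustrated with a toy model in plain Python. This is only a sketch of how such a mismatch could arise in principle, not Comet's actual code path; `dictionary_encode`, `decode`, and `binary_op` are hypothetical helpers written for this example.

```python
def dictionary_encode(values):
    """Split a column into (distinct values, per-row indices into them)."""
    dictionary, positions, indices = [], {}, []
    for v in values:
        if v not in positions:
            positions[v] = len(dictionary)
            dictionary.append(v)
        indices.append(positions[v])
    return dictionary, indices


def decode(dictionary, indices):
    """Expand a dictionary-encoded column back to one value per row."""
    return [dictionary[i] for i in indices]


def binary_op(left, right, op):
    """Element-wise kernel that, like columnar kernels, requires equal lengths."""
    if len(left) != len(right):
        raise ValueError("Cannot perform binary operation on arrays of different length")
    return [op(a, b) for a, b in zip(left, right)]


# Four rows but only two distinct timestamps (microseconds since epoch).
ts = [1_700_000_000_000_000, 1_700_000_000_000_000,
      1_700_086_400_000_000, 1_700_000_000_000_000]
dictionary, indices = dictionary_encode(ts)
offsets = [0] * len(ts)  # some per-row operand, one entry per row

# Decoding first gives one value per row, so the kernel succeeds.
ok = binary_op(decode(dictionary, indices), offsets, lambda a, b: a + b)

# Passing the dictionary values directly gives 2 values against 4 rows.
try:
    binary_op(dictionary, offsets, lambda a, b: a + b)
    mismatch = None
except ValueError as exc:
    mismatch = str(exc)
```

If something like this is happening, the failure would depend on how many distinct timestamps a batch contains, which may be worth checking against the affected Parquet files.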

When Comet is disabled, Spark executes the job flawlessly.
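
One way to make that comparison inside a single session is to toggle Comet around the failing action. The sketch below assumes a PySpark-style session object (with `conf.get`, `conf.set`, `conf.unset`) and that the `spark.comet.enabled` setting can be changed at runtime in this deployment; the helper name `comet_disabled` is hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def comet_disabled(spark, key="spark.comet.enabled"):
    """Set `key` to "false" for the duration of the block, then restore it."""
    previous = spark.conf.get(key, None)  # None means the key was not set
    spark.conf.set(key, "false")
    try:
        yield spark
    finally:
        if previous is None:
            spark.conf.unset(key)
        else:
            spark.conf.set(key, previous)


# Usage against a live session (not executed here):
# with comet_disabled(spark):
#     df.withColumn("date_col", F.to_date(F.col("ts"))).count()
```

Running the same action once inside and once outside the block keeps everything else (data, cluster, session settings) identical, so any difference can be attributed to Comet.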

Steps to reproduce

  1. Read an Iceberg table containing a TIMESTAMPTZ column.
  2. Apply F.to_date() to the timestamp column.
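  3. Trigger an action (e.g., .count() or a write).

# Assumption: F is pyspark.sql.functions and spark is an active SparkSession
from pyspark.sql import functions as F

# Schema is simple: id (String), ts (Timestamp) - No Arrays present
df = spark.read.format("iceberg").load("db.table")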

# This crashes Comet:
df.withColumn("date_col", F.to_date(F.col("ts"))).count()

# This ALSO crashes Comet:
df.withColumn("diff", F.datediff(F.current_date(), F.col("ts"))).count()

Expected behavior

The new column is added as a date and the action completes, as it does when Comet is disabled.
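
For cross-checking values once the query runs, the expected results can be computed outside Spark. The following is a plain-Python reference sketch, assuming timestamps are stored as microseconds since the epoch and the session time zone is UTC; the function names mirror the Spark ones but are local helpers, not Spark APIs.

```python
from datetime import date, datetime, timezone


def to_date(ts_micros, tz=timezone.utc):
    """Calendar date of a microsecond-precision timestamp in time zone `tz`."""
    seconds = ts_micros // 1_000_000  # integer division avoids float rounding
    return datetime.fromtimestamp(seconds, tz).date()


def datediff(end, start):
    """Number of days from `start` to `end` (negative if `end` is earlier)."""
    return (end - start).days


d = to_date(1_700_000_000_000_000)        # 2023-11-14 in UTC
diff = datediff(date(2023, 11, 20), d)    # 6
```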

Additional context

  • Comet version: built from ea26629
  • Spark version: 4.0.1_2.13 (Spark Connect, Kubernetes, no Python/PySpark)

Labels

bug (Something isn't working), crash (Native engine crash/panic/segfault), priority:high (Crashes, panics, segfaults, major functional breakage)
