to_csv: Handle edge cases found during fuzz testing #3232

## Summary

During review of PR #3004 (which adds basic `to_csv` support), fuzz testing revealed several edge cases that are not handled correctly. These should be addressed in follow-up work after the initial implementation is merged.

## Bugs Found

### 1. Null value not quoted when it contains special characters
When the `nullValue` option contains the delimiter or other special characters (e.g., `"N,A"`), it's written unquoted, corrupting the CSV output.

| Expected (Spark) | Actual (Comet) |
|------------------|----------------|
| `"N,A",world` | `N,A,world` |
| `hello,"N,A"` | `hello,N,A` |

**Location:** `native/spark-expr/src/csv_funcs/to_csv.rs:164-171`

**Fix:** Check if `null_value` contains special characters and quote/escape it appropriately.
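A minimal sketch of that fix, not the actual Comet code: the null representation goes through the same quoting check as any other field. The helper names `quote_if_needed` and `write_row` are hypothetical, and the escaping rule (prefix only the quote character with the escape character) is a simplifying assumption rather than a full model of Spark's writer.

```rust
/// Hypothetical helper: quotes `value` when it contains the delimiter,
/// the quote character, or a line break; otherwise returns it unchanged.
fn quote_if_needed(value: &str, delimiter: char, quote: char, escape: char) -> String {
    let needs_quoting = value
        .chars()
        .any(|c| c == delimiter || c == quote || c == '\n' || c == '\r');
    if !needs_quoting {
        return value.to_string();
    }
    let mut out = String::with_capacity(value.len() + 2);
    out.push(quote);
    for c in value.chars() {
        // Assumption: only the quote character needs an escape prefix.
        if c == quote {
            out.push(escape);
        }
        out.push(c);
    }
    out.push(quote);
    out
}

/// Hypothetical row writer: nulls are replaced by `null_value`, and the
/// replacement is quoted by the same rule as regular field values.
fn write_row(
    fields: &[Option<&str>],
    null_value: &str,
    delimiter: char,
    quote: char,
    escape: char,
) -> String {
    fields
        .iter()
        .map(|f| quote_if_needed((*f).unwrap_or(null_value), delimiter, quote, escape))
        .collect::<Vec<_>>()
        .join(&delimiter.to_string())
}
```

With `nullValue` set to `N,A`, a row of `[null, "world"]` written this way yields `"N,A",world`, matching the expected column in the table above.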

### 2. Whitespace trimming applied incorrectly
When `ignoreLeadingWhiteSpace=false` or `ignoreTrailingWhiteSpace=false`, strings that contain both whitespace and special characters are written incorrectly. The code trims whitespace before checking whether quoting is needed, so whitespace that should be preserved is lost.

| Expected (Spark) | Actual (Comet) |
|------------------|----------------|
| `  \"` (preserved whitespace with escaped quote) | `""` (empty) |

**Location:** `native/spark-expr/src/csv_funcs/to_csv.rs:176-183`

**Fix:** Review the order of operations: the quoting decision should be based on the original (untrimmed) value.
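One possible ordering, sketched under assumptions: decide on quoting from the untrimmed value first, then trim only on the sides whose option is enabled. `PreparedField` and `prepare_field` are hypothetical names, Rust's `trim_start`/`trim_end` stand in for whatever whitespace definition Spark uses, and the final quote/escape step is deliberately left out.

```rust
/// Hypothetical result of preparing a field for writing.
struct PreparedField<'a> {
    /// Text after applying only the enabled trimming options.
    text: &'a str,
    /// Quoting decision, taken from the original (untrimmed) value.
    needs_quoting: bool,
}

/// Hypothetical helper: checks for special characters on the untrimmed
/// value, then trims each side only when its option is enabled.
fn prepare_field<'a>(
    value: &'a str,
    ignore_leading: bool,
    ignore_trailing: bool,
    delimiter: char,
    quote: char,
) -> PreparedField<'a> {
    let needs_quoting = value
        .chars()
        .any(|c| c == delimiter || c == quote || c == '\n' || c == '\r');
    // Assumption: Rust's Unicode-aware trimming approximates Spark's rule.
    let text = if ignore_leading { value.trim_start() } else { value };
    let text = if ignore_trailing { text.trim_end() } else { text };
    PreparedField { text, needs_quoting }
}
```

With both options set to `false`, a value of two spaces followed by a quote character keeps its whitespace and is still flagged for quote handling, rather than collapsing to an empty field.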

### 3. Decimal formatting mismatch
Spark writes some small decimal values in scientific notation (for example, a zero with scale 18), while Comet always uses fixed-point notation.

| Expected (Spark) | Actual (Comet) |
|------------------|----------------|
| `0E-18` | `0.000000000000000000` |

**Fix:** Align decimal-to-string casting with Spark's formatting behavior.
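As a sketch of what that alignment could look like, the function below formats a decimal from its unscaled value and scale using the rules of `java.math.BigDecimal.toString`. That Spark's CSV output follows these rules is an assumption, though it is consistent with the `0E-18` in the table above. The function name `java_style_decimal_string` is hypothetical.

```rust
/// Hypothetical formatter following java.math.BigDecimal.toString:
/// plain notation when scale >= 0 and the adjusted exponent is >= -6,
/// scientific notation otherwise.
fn java_style_decimal_string(unscaled: i128, scale: i32) -> String {
    let digits = unscaled.unsigned_abs().to_string();
    let sign = if unscaled < 0 { "-" } else { "" };
    // Exponent of the first digit once the value is written as d.ddd x 10^n.
    let adjusted = -(scale as i64) + (digits.len() as i64 - 1);

    if scale >= 0 && adjusted >= -6 {
        if scale == 0 {
            return format!("{sign}{digits}");
        }
        let scale = scale as usize;
        if digits.len() > scale {
            // Enough digits to place the decimal point inside the number.
            let (int_part, frac_part) = digits.split_at(digits.len() - scale);
            format!("{sign}{int_part}.{frac_part}")
        } else {
            // Pad with zeros between "0." and the digits.
            let zeros = "0".repeat(scale - digits.len());
            format!("{sign}0.{zeros}{digits}")
        }
    } else {
        let (first, rest) = digits.split_at(1);
        let mantissa = if rest.is_empty() {
            first.to_string()
        } else {
            format!("{first}.{rest}")
        };
        // Positive exponents carry an explicit '+'; negatives print their own '-'.
        let exp_sign = if adjusted > 0 { "+" } else { "" };
        format!("{sign}{mantissa}E{exp_sign}{adjusted}")
    }
}
```

Under these rules a zero with scale 18 has an adjusted exponent of -18, which falls below the -6 threshold and therefore renders as `0E-18` instead of `0.000000000000000000`.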

### 4. NPE with single-column struct (needs investigation)
`NullPointerException` occurs when processing single-column structs with certain null patterns. This may be a Spark-side issue with how Comet's output is handled, but needs investigation.

## Reproduction

Fuzz tests were added in `CometCsvExpressionSuite.scala` that reproduce these issues:
- `to_csv - edge case: delimiter in null value representation`
- `to_csv - fuzz test: comprehensive random data and options`
- `to_csv - edge case: numeric boundary values`
- `to_csv - edge case: single column struct`

## Related

- PR #3004 - Initial `to_csv` implementation
