Struct field nested directly inside another struct loses nullability on file round-trip #8348

@joseph-isaacs

Found by the Python API Hypothesis fuzzer (vortex-python/test/test_fuzz_file_roundtrip.py).

A nullable struct field nested directly inside another struct comes back non-nullable when read from a file, tripping the dtype debug_assert_eq! in vortex-array/src/stream/adapter.rs on debug builds:

import os
import tempfile

import pyarrow as pa
import vortex as vx

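# Outer struct with one field "a", which is itself a nullable struct holding an int32 field "b".
typ = pa.struct([("a", pa.struct([("b", pa.int32())]))])
# Row 0 has a null inner struct; row 1 is a null outer struct.
t = pa.table({"c0": pa.array([{"a": None}, None], type=typ)})
with tempfile.TemporaryDirectory() as d:
    p = os.path.join(d, "x.vortex")
    vx.io.write(t, p)
    # Reading the file back panics on debug builds.
    vx.open(p).scan().read_all()

pyo3_runtime.PanicException: assertion `left == right` failed:
ArrayStreamAdapter expected array with type {c0={a={b=i32}?}?}, actual {c0={a={b=i32}}?}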

The expected dtype has the inner field a as nullable ({b=i32}?) but the dtype read back from the file has dropped the ?. On release builds the debug_assert doesn't fire, so this presumably round-trips with silently altered nullability. The fuzzer currently assume()s away struct-of-struct schemas; that guard can be removed once this is fixed.
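To make the difference between the two dtype strings in the panic message easier to see, here is a minimal sketch in plain Python that parses both strings and reports the field paths whose nullability differs. It is not part of Vortex: the names DType, parse_dtype and nullability_diffs are hypothetical, and the grammar (braces for structs, name=type fields, a trailing ? for nullable) is only inferred from the single message above.

```python
from dataclasses import dataclass, field


@dataclass
class DType:
    """Hypothetical stand-in for a dtype: a primitive name or a struct of named fields."""
    name: str                                   # e.g. "i32", or "struct"
    nullable: bool = False
    fields: dict = field(default_factory=dict)  # field name -> DType (structs only)


def _parse(text, pos):
    """Parse one dtype starting at text[pos]; return (DType, next position)."""
    if text[pos] == "{":
        pos += 1
        fields = {}
        while text[pos] != "}":
            eq = text.index("=", pos)
            fname = text[pos:eq].strip()
            fields[fname], pos = _parse(text, eq + 1)
            if text[pos] == ",":
                pos += 1
                while text[pos] == " ":
                    pos += 1
        pos += 1  # consume the closing '}'
        dtype = DType("struct", fields=fields)
    else:
        end = pos
        while end < len(text) and text[end] not in ",}?":
            end += 1
        dtype = DType(text[pos:end])
        pos = end
    # A trailing '?' marks the dtype as nullable (assumed from the message format).
    if pos < len(text) and text[pos] == "?":
        dtype.nullable = True
        pos += 1
    return dtype, pos


def parse_dtype(text):
    """Parse a string such as '{c0={a={b=i32}?}?}' into a DType tree."""
    dtype, pos = _parse(text, 0)
    if pos != len(text):
        raise ValueError(f"unexpected trailing input at offset {pos}")
    return dtype


def nullability_diffs(expected, actual, path=""):
    """Return dotted field paths where the two dtypes disagree on nullability."""
    diffs = []
    if expected.nullable != actual.nullable:
        diffs.append(path or "<root>")
    for fname, sub in expected.fields.items():
        if fname in actual.fields:
            child = f"{path}.{fname}" if path else fname
            diffs.extend(nullability_diffs(sub, actual.fields[fname], child))
    return diffs


expected = parse_dtype("{c0={a={b=i32}?}?}")
actual = parse_dtype("{c0={a={b=i32}}?}")
print(nullability_diffs(expected, actual))
```

On the two strings from the panic message this reports only the path c0.a, matching the description: the outer struct c0 keeps its nullability and only the directly nested struct a loses it.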
