Documentation Gap
Documentation claims [1.0, NaN, 3.0] becomes [1.0, 0.0, 3.0] (element-wise) but code replaces the ENTIRE vector with [0.0, 0.0, 0.0] (whole-vector replacement).
Description
The docs incorrectly describe on_bad_vectors='fill' as element-wise NaN replacement. The actual code replaces the ENTIRE vector.
- Docs claim [1.0, NaN, 3.0] becomes [1.0, 0.0, 3.0] with fill_value=0.0, but the code at table.py:3177-3181 replaces the entire vector with [0.0, 0.0, 0.0], because the is_bad flag is per-vector, not per-element
- Users lose ALL valid elements in partially-bad vectors without knowing it
- Zero-filled vectors cause downstream issues: cosine similarity is undefined (division by zero), and L2 results cluster near the origin
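
The difference between the two behaviors can be sketched in plain Python. This is an illustration only, not the code in table.py: the helper names fill_elementwise, fill_whole_vector, and cosine_similarity are hypothetical, and the whole-vector logic is an assumption based on the per-vector is_bad flag described above.

```python
import math


def fill_elementwise(vec, fill_value=0.0):
    """Replace only the NaN elements (the behavior the docs describe)."""
    return [fill_value if math.isnan(x) else x for x in vec]


def fill_whole_vector(vec, fill_value=0.0):
    """Replace the entire vector if any element is NaN.

    Assumed model of the behavior this report attributes to the code:
    a single is_bad flag is computed per vector, not per element.
    """
    is_bad = any(math.isnan(x) for x in vec)
    return [fill_value] * len(vec) if is_bad else list(vec)


def cosine_similarity(a, b):
    """Return None when either norm is zero, where the value is undefined."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return None
    return dot / (norm_a * norm_b)


vec = [1.0, float("nan"), 3.0]
query = [1.0, 2.0, 3.0]

elementwise = fill_elementwise(vec)
whole = fill_whole_vector(vec)

print(elementwise)                       # [1.0, 0.0, 3.0]
print(whole)                             # [0.0, 0.0, 0.0]
print(cosine_similarity(query, whole))   # None
```

With element-wise filling, the two valid components survive and the vector still has a usable direction. With whole-vector filling, every component is lost and the cosine similarity against any query is undefined.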
How to Validate
Affected Files
python/python/lancedb/table.py
python/python/lancedb/db.py
docs/tables/consistency.mdx
Created by Oqoqo