Commit 2cf4232
fix(fsst): widen FSST output offsets to i64 to avoid i32 overflow
`fsst_compress_iter` previously hardcoded `VarBinBuilder::<i32>` for the
FSST output, panicking once cumulative compressed bytes crossed
`i32::MAX`. Switch to `VarBinBuilder::<i64>` so large inputs compress
without overflow. The `FSSTMetadata.codes_offsets_ptype` field already
records the offset PType, so existing serialized arrays continue to
deserialize unchanged.
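
To illustrate the overflow, here is a minimal standalone sketch. It is not the Vortex code: `cumulative_offsets` is a hypothetical helper that stands in for the offset bookkeeping a builder such as `VarBinBuilder` performs, and it returns `None` where the real code panicked.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: build cumulative end offsets for values of the
/// given byte lengths. Returns `None` as soon as an offset does not fit
/// in the offset type `O`.
fn cumulative_offsets<O: TryFrom<u64>>(lengths: &[u64]) -> Option<Vec<O>> {
    let mut total: u64 = 0;
    let mut offsets = Vec::with_capacity(lengths.len() + 1);
    // The first offset is always zero.
    offsets.push(O::try_from(0u64).ok()?);
    for &len in lengths {
        total = total.checked_add(len)?;
        offsets.push(O::try_from(total).ok()?);
    }
    Some(offsets)
}

fn main() {
    // Three values of 1 GiB each. Only the lengths are stored, so nothing
    // large is allocated. The running total passes i32::MAX at 2 GiB.
    let lengths = [1u64 << 30; 3];

    // i32 offsets cannot represent the cumulative byte count.
    assert!(cumulative_offsets::<i32>(&lengths).is_none());

    // i64 offsets hold the full 3 GiB total.
    let wide = cumulative_offsets::<i64>(&lengths).expect("fits in i64");
    assert_eq!(wide.last().copied(), Some(3i64 << 30));
}
```

The sketch only tracks lengths, which is why it can show the failure without the multi-gigabyte allocation the real regression test needs.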
Widening exposed a latent bug in `VarBin::compare`: with i64 offsets,
the LHS is converted to Arrow `LargeBinary`/`LargeUtf8` (per
`preferred_arrow_type`), but the RHS scalar was hardcoded to `Binary`/
`Utf8`. Arrow will not compare `LargeBinary` with `Binary` (nor
`LargeUtf8` with `Utf8`), so the comparison failed. The RHS scalar now
takes the matching Arrow type from the LHS Datum.
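
The type mismatch can be sketched without depending on Arrow. Everything below is an assumption made for illustration: the `DataType` enum is a local stand-in for the four Arrow types named above, and the function names are hypothetical rather than taken from the changed files.

```rust
/// Local stand-in for the four Arrow data types involved.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DataType {
    Binary,
    LargeBinary,
    Utf8,
    LargeUtf8,
}

/// Type the array (LHS) converts to: wide offsets select the Large variant.
fn array_type(is_utf8: bool, wide_offsets: bool) -> DataType {
    match (is_utf8, wide_offsets) {
        (false, false) => DataType::Binary,
        (false, true) => DataType::LargeBinary,
        (true, false) => DataType::Utf8,
        (true, true) => DataType::LargeUtf8,
    }
}

/// Old behaviour: the scalar (RHS) type ignores the offset width.
fn scalar_type_hardcoded(is_utf8: bool) -> DataType {
    if is_utf8 { DataType::Utf8 } else { DataType::Binary }
}

/// New behaviour: the scalar (RHS) type is copied from the LHS.
fn scalar_type_from_lhs(lhs: DataType) -> DataType {
    lhs
}

/// Strict check modelled on a comparison that requires identical types.
fn check_comparable(lhs: DataType, rhs: DataType) -> Result<(), String> {
    if lhs == rhs {
        Ok(())
    } else {
        Err(format!("cannot compare {:?} with {:?}", lhs, rhs))
    }
}

fn main() {
    let lhs = array_type(false, true); // binary data, i64 offsets
    // Hardcoded RHS: LargeBinary vs Binary is rejected.
    assert!(check_comparable(lhs, scalar_type_hardcoded(false)).is_err());
    // RHS derived from the LHS: the types always agree.
    assert!(check_comparable(lhs, scalar_type_from_lhs(lhs)).is_ok());
}
```

Deriving the RHS type from the LHS keeps a single source of truth, so any future change to the offset width cannot reintroduce the mismatch.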
The regression test `fsst_compress_offsets_overflow_i32` now passes
when run with `--ignored`. It still allocates ~5 GiB, so it stays
`#[ignore]`d.
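
As a reminder of how an opt-in test of this kind is usually set up, here is a small sketch. The helper and test names are hypothetical, and unlike the real regression test this one only does arithmetic and allocates nothing.

```rust
/// Hypothetical helper: does a cumulative byte count exceed what an
/// i32 offset can represent?
fn exceeds_i32(total_bytes: u64) -> bool {
    total_bytes > i32::MAX as u64
}

#[cfg(test)]
mod tests {
    use super::*;

    // Skipped by a plain `cargo test`. Run it with
    // `cargo test -- --ignored`, or with `-- --include-ignored`
    // to run ignored and regular tests together.
    #[test]
    #[ignore = "stands in for a test that allocates several GiB"]
    fn total_crosses_i32_max() {
        // 5 GiB of cumulative output is past the i32 offset range.
        assert!(exceeds_i32(5 * (1u64 << 30)));
    }
}

fn main() {
    // Exactly i32::MAX bytes still fits; one byte more does not.
    assert!(!exceeds_i32(i32::MAX as u64));
    assert!(exceeds_i32(i32::MAX as u64 + 1));
}
```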
Signed-off-by: Claude <noreply@anthropic.com>
1 parent d9bcd20 commit 2cf4232
3 files changed
Lines changed: 31 additions & 16 deletions