Commit fc5abd6
[SPARK-55754][PYTHON][TEST][FOLLOWUP] Fix pure_ints type mismatch in bench
### What changes were proposed in this pull request?
Refactor `MockDataFactory.NAMED_TYPE_POOLS` in `python/benchmarks/bench_eval_type.py` so the `pure_ints`, `pure_floats`, and `pure_strings` entries reuse the corresponding `TYPE_REGISTRY` entries instead of duplicating their factory lambdas.
### Why are the changes needed?
`NAMED_TYPE_POOLS["pure_ints"]` declared the column as `IntegerType()` (32-bit) but generated data with `np.int64`. Because every benchmark that uses this pool runs through serializers with `arrow_cast=True`, the mismatch was silently corrected by a 64-to-32 narrowing cast inside the pandas/arrow conversion path -- meaning the `pure_ints` scenario in seven mixins (`ArrowBatchedUDF`, `ArrowUDTF`, `ArrowTableUDF`, `MapArrowIterUDF`, `MapPandasIterUDF`, `ScalarArrowUDF`, `ScalarPandasUDF`) was measuring an extra narrowing step rather than a pure int32 baseline.
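A minimal sketch of the mismatch, using NumPy only. The factory names `make_ints_int64` and `make_ints_int32` are hypothetical stand-ins, not the lambdas in `bench_eval_type.py`, and the explicit `astype` call only approximates the cast that the serializer path performs.

```python
import numpy as np


def make_ints_int64(n):
    # Hypothetical stand-in for the old factory: 64-bit data
    # for a column that was declared as 32-bit IntegerType().
    return np.arange(n, dtype=np.int64)


def make_ints_int32(n):
    # Hypothetical stand-in for a factory whose dtype matches
    # the 32-bit declaration, so no narrowing is needed.
    return np.arange(n, dtype=np.int32)


old_data = make_ints_int64(1000)
new_data = make_ints_int32(1000)

# With the old data, something has to narrow 64-bit values to 32-bit
# before they fit the declared column type. This astype() call is an
# illustration of that extra step, not the actual serializer code.
narrowed = old_data.astype(np.int32)

print(old_data.dtype, narrowed.dtype, new_data.dtype)
```

With matching dtypes the narrowing step disappears, which is what a pure int32 baseline should measure.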
`pure_floats` and `pure_strings` had no such mismatch but duplicated the same lambdas as `TYPE_REGISTRY["double"]` / `TYPE_REGISTRY["string"]`, risking drift in future edits. Reusing the registry entries eliminates the duplication. `pure_ts` is left as-is because no matching `TYPE_REGISTRY` entry exists.
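The reuse pattern can be sketched roughly as follows. The data shapes here are assumptions for illustration: each registry entry is modeled as a `(type_name, factory)` tuple, each pool as a list of entries, and the `"int"` registry key is hypothetical. The real definitions in `bench_eval_type.py` may look different.

```python
# Hypothetical, simplified registry: type name plus a data factory.
TYPE_REGISTRY = {
    "int": ("IntegerType", lambda n: list(range(n))),
    "double": ("DoubleType", lambda n: [float(i) for i in range(n)]),
    "string": ("StringType", lambda n: [f"s{i}" for i in range(n)]),
}

# Instead of re-declaring the same lambdas, each pool points at the
# registry entry, so the declared type and the factory cannot drift apart.
NAMED_TYPE_POOLS = {
    "pure_ints": [TYPE_REGISTRY["int"]],
    "pure_floats": [TYPE_REGISTRY["double"]],
    "pure_strings": [TYPE_REGISTRY["string"]],
}

type_name, factory = NAMED_TYPE_POOLS["pure_floats"][0]
print(type_name, factory(3))
```

Because the pool holds the same object as the registry, an edit to a registry factory is picked up by the pool automatically.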
### Does this PR introduce _any_ user-facing change?
No. Test-only change in the benchmark module.
### How was this patch tested?
- Confirmed `NAMED_TYPE_POOLS["pure_ints"][0]` now produces a `pa.int32()` array matching its `IntegerType()` declaration (was `pa.int64()`).
- Confirmed `pure_floats` and `pure_strings` still produce `pa.float64()` and `pa.string()` arrays after the refactor.
- Ran `setup` + `time_worker` for the `pure_ints` scenario across all seven affected `*TimeBench` classes; all passed.
### Was this patch authored or co-authored using generative AI tooling?
Yes. Generated-by: Claude Code (claude-opus-4-7)
Closes apache#56169 from viirya/SPARK-55724-pure-ints-followup.
Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
1 parent 70469a2 commit fc5abd6
1 file changed
Lines changed: 3 additions & 5 deletions
| Original file lines | New file lines | Change |
|---|---|---|
| 264-268 | | removed (5 lines) |
| | 264-266 | added (3 lines) |