perf: Additional optimizations for cast from string to int#3048
Conversation
|
|
||
| macro_rules! cast_utf8_to_int { | ||
| ($array:expr, $eval_mode:expr, $array_type:ty, $cast_method:ident) => {{ | ||
| ($array:expr, $array_type:ty, $parse_fn:expr) => {{ |
There was a problem hiding this comment.
Pass in the parsing function as a closure rather than passing in the eval mode and then matching on that for each row.
| match eval_mode { | ||
| EvalMode::Legacy => do_parse_string_to_int_legacy(str, min_value), | ||
| EvalMode::Ansi => do_parse_string_to_int_ansi(str, type_name, min_value), | ||
| EvalMode::Try => do_parse_string_to_int_try(str, min_value), | ||
| } |
There was a problem hiding this comment.
This was happening once per row
|
Current performance compared to main branch: |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3048 +/- ##
============================================
+ Coverage 56.12% 59.50% +3.37%
- Complexity 976 1381 +405
============================================
Files 119 167 +48
Lines 11743 15522 +3779
Branches 2251 2575 +324
============================================
+ Hits 6591 9236 +2645
- Misses 4012 4989 +977
- Partials 1140 1297 +157 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| .map(|v| v as i16)) | ||
| fn parse_string_to_i8_legacy(str: &str) -> SparkResult<Option<i8>> { | ||
| match do_parse_string_to_int_legacy::<i32>(str, i32::MIN)? { | ||
| None => Ok(None), |
There was a problem hiding this comment.
| None => Ok(None), |
?
comphead
left a comment
There was a problem hiding this comment.
Thanks @andygrove it is similar to #2600 (comment) suggestion and thanks for addressing those issues
I'll take a look next. |
|
thank you @andygrove |
Which issue does this PR close?
N/A
Rationale for this change
This follows on from #3017 and makes some additional optimizations.
See comment #3048 (comment) for criterion benchmark results.
What changes are included in this PR?
trim_asciiinstead of custom trim logicdo_parse_string_to_int_legacyand remove mutableparse_sign_and_digitsHow are these changes tested?
Existing tests.