[Variant] Align cast logic for from/to_decimal for variant by klion26 · Pull Request #9689 · apache/arrow-rs

klion26 · 2026-04-10T08:26:48Z

Which issue does this PR close?

Closes #9688 (Align cast logic for from/to_decimal for variant to cast kernel).

What changes are included in this PR?

Extract some logic in arrow-cast
Reuse the extracted logic in arrow-cast and parquet-variant

Are these changes tested?

Reuse the existing tests in arrow-test

Are there any user-facing changes?

Yes, changed the docs

klion26

@scovich @sdf-jkl Please help to review this when you're free, thanks

klion26 · 2026-04-10T08:33:33Z

+                .ok()
+                .and_then(|x: i32| x.try_into().ok()),
+            Variant::ShortString(v) => {
+                parse_string_to_decimal_native::<Decimal32Type>(v.as_str(), 0usize)


Use v.as_str() instead of v because we match on *self. If we changed to match self, we would need to dereference in the other match arms, and there seems to be little benefit, so I stuck with the current match *self and used v.as_str() here.

scovich

General approach looks reasonable, but needs some tweaks to avoid regressing performance. Do we have benchmarks we can throw at this to verify?

scovich · 2026-04-10T16:17:30Z

+                    let v = cast_single_decimal_to_integer::<D, T::Native>(
+                        array.value(i),
+                        div,
+                        scale as _,


Why are we casting? Isn't it a trivial i16 -> i16 cast?
(again below)

To avoid overflow, cast_single_decimal_to_integer takes the scale as i16: the scale of an Arrow decimal is i8, while the scale of a VariantDecimal is u8.
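The widening described in this reply can be sketched as below. This is a minimal illustration, not code from the PR: `scale_action`, `from_arrow_scale`, and `from_variant_scale` are hypothetical names, and the only point is that both `i8` and `u8` convert losslessly into `i16`.

```rust
/// Hypothetical helper that takes the scale as i16, so that both an i8
/// (Arrow decimal) and a u8 (variant decimal) scale fit without loss.
fn scale_action(scale: i16) -> &'static str {
    if scale < 0 { "multiply" } else { "divide" }
}

/// Arrow decimal scales are i8; `i16::from` is infallible for them.
fn from_arrow_scale(scale: i8) -> i16 {
    i16::from(scale)
}

/// Variant decimal scales are u8; 255 would not fit in i8 but fits in i16.
fn from_variant_scale(scale: u8) -> i16 {
    i16::from(scale)
}
```

Choosing `i16` as the common parameter type means neither caller needs a fallible conversion.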

scovich · 2026-04-10T16:26:16Z

+        } else {
+            match cast_options.safe {
+                true => {
+                    let v = cast_single_decimal_to_integer::<D, T::Native>(


The original code hoisted checks for scale < 0 (mul_checked vs. div_checked) and cast_options.safe (NULL vs. error) outside the loop, producing four simple loop bodies. This was presumably done for performance reasons (minimizing branching inside the loop).

The new code pushes the cast_options.safe check inside a single loop and pushes scale < 0 check all the way down inside cast_single_decimal_to_integer. That triples the number of branches inside the loop body (the null check is per-row and so is always stuck inside the loop). Performance will almost certainly be impacted, possibly significantly.

It would be safer to just preserve the replication (even tho it duplicates logic with the new helper), and rely on the compiler's inlining and "jump threading" optimizations to eliminate that redundancy:

if scale < 0 {
    if cast_options.safe {
        for i in 0..array.len() {
            if array.is_null(i) {
                value_builder.append_null();
            } else {
                let v = cast_single_decimal_to_integer::<D, T::Native>(...);
                value_builder.append_option(v.ok());
            }
        }
    } else {
        for i in 0..array.len() {
            if array.is_null(i) {
                value_builder.append_null();
            } else {
                let v = cast_single_decimal_to_integer::<D, T::Native>(...);
                value_builder.append_value(v?);
            }
        }
    }
} else {
    if cast_options.safe {
        for i in 0..array.len() {
            if array.is_null(i) {
                value_builder.append_null();
            } else {
                let v = cast_single_decimal_to_integer::<D, T::Native>(...);
                value_builder.append_option(v.ok());
            }
        }
    } else {
        for i in 0..array.len() {
            if array.is_null(i) {
                value_builder.append_null();
            } else {
                let v = cast_single_decimal_to_integer::<D, T::Native>(...);
                value_builder.append_value(v?);
            }
        }
    }
}

If you wanted to simplify a bit, you could define and use a local macro inside this function:

// Helper macro for emitting nearly the same loop every time, so we can hoist branches out.
// The compiler will specialize the resulting code (inlining and jump threading)
macro_rules! cast_loop {
    (|$v:ident| $body:expr) => {{
        for i in 0..array.len() {
            if array.is_null(i) {
                value_builder.append_null();
            } else {
                let $v = cast_single_decimal_to_integer::<D, T::Native>(...);
                $body
            }
        }
    }};
}

if scale < 0 {
    if cast_options.safe {
        cast_loop!(|v| value_builder.append_option(v.ok()));
    } else {
        cast_loop!(|v| value_builder.append_value(v?));
    }
} else {
    if cast_options.safe {
        cast_loop!(|v| value_builder.append_option(v.ok()));
    } else {
        cast_loop!(|v| value_builder.append_value(v?));
    }
}

Note that the four loop bodies are almost syntactically identical -- differing only in whether they append_option(v.ok()) or append_value(v?) -- but the inlined body of cast_single_decimal_to_integer inside each loop will be specialized based on the scale < 0 check we already performed. Result: stand-alone calls to the helper function are always safe, but we still get maximum performance here.
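For readers who want to try the pattern outside arrow-rs, here is a self-contained sketch of the same idea. It is only an illustration under simplifying assumptions: `narrow` and `cast_all` are hypothetical names, the input is a slice of `Option<i64>` rather than an Arrow array, and only the safe/strict branch is hoisted.

```rust
use std::num::TryFromIntError;

/// Hypothetical stand-in for the per-value helper: narrow an i64 to i32.
fn narrow(value: i64) -> Result<i32, TryFromIntError> {
    i32::try_from(value)
}

/// Casts every element, choosing the failure policy once, outside the loop.
fn cast_all(values: &[Option<i64>], safe: bool) -> Result<Vec<Option<i32>>, TryFromIntError> {
    let mut out = Vec::with_capacity(values.len());

    // Emits the same loop each time; only the per-element action differs.
    macro_rules! cast_loop {
        (|$v:ident| $body:expr) => {{
            for value in values {
                match value {
                    None => out.push(None),
                    Some(raw) => {
                        let $v = narrow(*raw);
                        $body
                    }
                }
            }
        }};
    }

    if safe {
        // Safe mode: a failed cast becomes a null entry.
        cast_loop!(|v| out.push(v.ok()));
    } else {
        // Strict mode: the first failed cast aborts with an error.
        cast_loop!(|v| out.push(Some(v?)));
    }
    Ok(out)
}
```

The `safe` check runs once per call, and each expanded loop contains only the null check plus its own append action.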

Thanks for the detailed explanation. Fixed.

scovich · 2026-04-10T16:34:47Z

        }),
        Float64 => cast_decimal_to_float::<D, Float64Type, _>(array, |x| {
-            as_float(x) / 10_f64.powi(*scale as i32)
+            single_decimal_to_float_lossy::<D, F>(&as_float, x, *scale as _)


If we're anyway changing the code, i32::from(*scale) makes clear that this is a lossless conversion

(a bunch more similarly lossless as _ below)

Yes, i32::from(*scale) is better, fixed.
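A small sketch of why `From` documents losslessness better than `as`. The function names are hypothetical; the behavior shown is standard Rust integer conversion semantics.

```rust
/// Lossless widening: `From` is only implemented where every input value fits.
fn widen(scale: i8) -> i32 {
    i32::from(scale)
}

/// `as` also compiles for narrowing casts and silently wraps out-of-range values.
fn narrow_with_as(value: i32) -> u8 {
    value as u8
}
```

`u8::from(value)` for an `i32` would not compile, which is what makes `From` a useful signal in review.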

scovich · 2026-04-10T16:44:42Z

+            Variant::Decimal16(d) => Self::cast_decimal_to_num::<Decimal128Type, T, _>(
+                d.integer(),
+                d.scale(),
+                |x: i128| x as f64,


I'm a bit surprised those type annotations are necessary when the first arg takes D::Native and should thus constrain the third arg's F: Fn(D::Native) -> f64?

Compiler didn't need this, fixed.

scovich · 2026-04-10T16:46:05Z

+            .map(|(_, frac)| frac.len())
+            .unwrap_or(0);


Fixed, use map_or_else here as map_or_default is a nightly-only experimental API for now

scovich · 2026-04-10T16:48:10Z

+        parse_string_to_decimal_native::<D>(input, scale_usize)
+            .ok()
+            .and_then(|raw| VD::try_new(raw, scale).ok())


nit

Suggested change

-        parse_string_to_decimal_native::<D>(input, scale_usize)
-            .ok()
-            .and_then(|raw| VD::try_new(raw, scale).ok())
+        let raw = parse_string_to_decimal_native::<D>(input, scale_usize).ok()?;
+        VD::try_new(raw, scale).ok()

Fixed, more readable now
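The same readability trade-off in a self-contained form. This is a sketch with hypothetical function names and a toy validation step (accept only even numbers) standing in for VD::try_new.

```rust
/// Chained form: parse, discard the error, then validate.
fn parse_even_chained(input: &str) -> Option<u32> {
    input
        .parse::<u32>()
        .ok()
        .and_then(|n| if n % 2 == 0 { Some(n) } else { None })
}

/// Same logic with `?`: the early return on `None` is explicit.
fn parse_even_question_mark(input: &str) -> Option<u32> {
    let n = input.parse::<u32>().ok()?;
    if n % 2 == 0 { Some(n) } else { None }
}
```

Both functions return the same results; the second reads top to bottom without nesting.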

scovich · 2026-04-10T16:51:32Z

        self.as_num()
    }

+    fn convert_string_to_decimal<VD, D>(input: &str) -> Option<VD>


nit: If you swap the template order, callers would be a tad more readable, e.g.:

convert_string_to_decimal::<Decimal32Type, _>

scovich · 2026-04-10T16:53:25Z

+                .as_num::<i64>()
+                .map(|x| (x as i128).try_into().ok())


Out of curiosity, why not just as_num::<i128> directly?
But if you must keep the double cast, at least do i128::from(x) to make clear it's lossless.

There is no UInt128 arrow type. as_num needs the type to implement DecimalCastTarget, which needs a corresponding arrow type UInt128.

Changed to i128::from(..)

klion26

@scovich Thanks for the review. I've addressed the comments. For the benchmark, I'll verify locally and come back.

klion26 · 2026-04-13T12:38:49Z

        self.as_num()
    }

+    fn convert_string_to_decimal<VD, D>(input: &str) -> Option<VD>


klion26 · 2026-04-13T12:48:44Z

        ),
+        Variant::Float(f) => single_float_to_decimal::<O>(f64::from(*f), mul),
+        Variant::Double(f) => single_float_to_decimal::<O>(*f, mul),
+        Variant::String(v) if scale > 0 => parse_string_to_decimal_native::<O>(


It seems the string-to-decimal parsing logic in arrow-cast has a bug: it is called with scale: i8 cast to usize. I will validate this and file an issue to track it (it is not related to the current change).

Variant spec doesn't even allow negative scale in the first place:

Oh... but this code is converting from variant to arrow decimal, so negative scales are totally legit.

Meanwhile, perhaps a code comment would be good here? It might not be obvious to future source divers why the scale > 0 constraint.

Actually, scale >= 0 should be safe?

I reviewed the code today. arrow-cast already checks the scale in decimal.rs::cast_string_to_decimal (it only supports scale >= 0 when casting utf8 to decimal), so I need to update the code here from > 0 to >= 0. I will also add a comment here.

arrow-rs/arrow-cast/src/cast/decimal.rs

Lines 725 to 730 in 711fac8

{
    if scale < 0 {
        return Err(ArrowError::InvalidArgumentError(format!(
            "Cannot cast string to decimal with negative scale {scale}"
        )));
    }

scovich

Mostly nits, but one dangerous floating point order of operations issue that I overlooked while suggesting #9689 (comment). Probably best to back it out (see comment for details).

scovich · 2026-04-13T13:32:08Z

    F: Fn(D::Native) -> f64,
 {
-    f(x) / 10_f64.powi(scale)
+    f(x) * 10_f64.powi(-scale)


Rescuing #9689 (comment) from github oblivion...

I just remembered that floating point a * 10**-b is technically NOT equivalent to a / 10**b. The algebraic operators section of rust docs talks about it:

Algebraic operators of the form a.algebraic_*(b) allow the compiler to optimize floating point operations using all the usual algebraic properties of real numbers – despite the fact that those properties do not hold on floating point numbers. This can give a great performance boost since it may unlock vectorization.

The exact set of optimizations is unspecified but typically allows combining operations, rearranging series of operations based on mathematical properties, converting between division and reciprocal multiplication, and disregarding the sign of zero.

(emphasis mine)

Whether we think any difference matters for this specific case... I don't know. But we should probably defer a change like this to its own PR with appropriate performance evaluation and weighing of trade-offs.

At least there's now a narrow waist for such a future optimization to be made easily.

I had the same question before changing the code, but the Rust Playground code below shows that they are equal. I'm not sure if this is enough.

Yes, if the code below is not enough, we should keep it as is in the current PR.

let max: u8 = u8::MAX;
for i in 0..max {
    let left = 1f64 / 10_f64.powi(<i32 as From::<u8>>::from(i));
    let right = 1f64 * 10_f64.powi(-<i32 as From::<u8>>::from(i));
    if left != right {
        println!("No equal {:?}", i);
    }
}
println!("Over")

Pure luck that 1f didn't trigger anything. A more complete example finds plenty of values just in -10..=10:

a = -10, b = -30
(a as f64) / 10_f64.powi(b)  = -1.00000000000000007618e31
(a as f64) * 10_f64.powi(-b) = -9.99999999999999963590e30
a = -10, b = -26
(a as f64) / 10_f64.powi(b)  = -1.00000000000000015073e27
(a as f64) * 10_f64.powi(-b) = -1.00000000000000001329e27
a = -10, b = -23
(a as f64) / 10_f64.powi(b)  = -9.99999999999999849005e23
(a as f64) * 10_f64.powi(-b) = -9.99999999999999983223e23
a = -10, b = -17
(a as f64) / 10_f64.powi(b)  = -9.99999999999999872000e17
(a as f64) * 10_f64.powi(-b) = -1.00000000000000000000e18
a = -10, b = -5
(a as f64) / 10_f64.powi(b)  = -9.99999999999999883585e5
(a as f64) * 10_f64.powi(-b) = -1.00000000000000000000e6
a = -10, b = 6
(a as f64) / 10_f64.powi(b)  = -1.00000000000000008180e-5
(a as f64) * 10_f64.powi(-b) = -9.99999999999999912396e-6
a = -10, b = 11
(a as f64) / 10_f64.powi(b)  = -1.00000000000000003643e-10
(a as f64) * 10_f64.powi(-b) = -9.99999999999999907185e-11
a = -10, b = 15
(a as f64) / 10_f64.powi(b)  = -9.99999999999999998819e-15
(a as f64) * 10_f64.powi(-b) = -1.00000000000000015659e-14
a = -10, b = 17
(a as f64) / 10_f64.powi(b)  = -9.99999999999999979098e-17
(a as f64) * 10_f64.powi(-b) = -1.00000000000000010236e-16
Found 246 examples in all
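A search along these lines can be sketched as follows. This is not the exact program used in the thread: `differs` and `count_mismatches` are hypothetical names, and the number of mismatches found may depend on the platform's `powi` implementation, so no count is claimed here.

```rust
use std::ops::RangeInclusive;

/// Returns true when dividing by 10^scale and multiplying by 10^-scale
/// round to different f64 values.
fn differs(value: f64, scale: i32) -> bool {
    value / 10_f64.powi(scale) != value * 10_f64.powi(-scale)
}

/// Counts mismatching (value, scale) pairs over small integer ranges.
fn count_mismatches(values: RangeInclusive<i32>, scales: RangeInclusive<i32>) -> usize {
    values
        .flat_map(|a| scales.clone().map(move |b| (a, b)))
        .filter(|&(a, b)| differs(f64::from(a), b))
        .count()
}
```

A classic instance of the same effect is `3.0 / 10.0` versus `3.0 * 0.1`, which round to different f64 values because 0.1 is not exactly representable.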

scovich · 2026-04-13T13:34:50Z

+        Variant::Double(f) => single_float_to_decimal::<O>(*f, mul),
+        Variant::String(v) if scale > 0 => parse_string_to_decimal_native::<O>(
+            v,
+            <i8 as TryInto<usize>>::try_into(scale).expect("scale is positive, would never fail"),


as _ seems appropriate for cases like this?

(sorry if my earlier comments gave the impression that it was outright bad/forbidden)

Will fix it. Thanks for the feedback; I very much appreciate it, as it helps me learn more about the Rust type system and best practices.

I used from where the compiler can guarantee the conversion is infallible. In this case the compiler can't guarantee that (only the logic guarantees it is infallible).

scovich · 2026-04-13T13:36:07Z

+            fn arrow_type() -> DataType {
+                $arrow_type
+            }


Out of curiosity, why a function for this one, instead of a const like the other uses?

Suggested change

-            fn arrow_type() -> DataType {
-                $arrow_type
-            }
+            const ARROW_TYPE: DataType = $arrow_type;

Changed to const, This style is more consistent.

scovich · 2026-04-13T13:37:32Z

+        base.pow_checked(<u32 as From<u8>>::from(scale))
+            .ok()
+            .and_then(|div| match T::KIND {


Another place where and_then hurts readability

let div = base.pow_checked(<u32 as From<u8>>::from(scale)).ok()?;
match T::KIND { ... }

scovich · 2026-04-13T13:39:42Z

+            Variant::Decimal16(d) => {
+                Self::cast_decimal_to_num::<Decimal128Type, T, _>(d.integer(), d.scale(), |x| {
+                    x as f64
+                })
+            }


aside: Very odd choice by fmt... I would have expected

Suggested change

-            Variant::Decimal16(d) => {
-                Self::cast_decimal_to_num::<Decimal128Type, T, _>(d.integer(), d.scale(), |x| {
-                    x as f64
-                })
-            }
+            Variant::Decimal16(d) => Self::cast_decimal_to_num::<Decimal128Type, T, _>(
+                d.integer(),
+                d.scale(),
+                |x| x as f64,
+            )

🤷

I tried changing it manually, but fmt changed it back to this format.

scovich · 2026-04-13T13:40:53Z

+        // find the last '.'
+        let scale_usize = input
+            .rsplit_once('.')
+            .map_or_else(|| 0, |(_, frac)| frac.len());


Suggested change

-            .map_or_else(|| 0, |(_, frac)| frac.len());
+            .map_or(0, |(_, frac)| frac.len());

(sorry, I mixed up the two names in my earlier suggestion)

I'm actually surprised clippy didn't flag it.
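As a standalone illustration of the suggested form, the sketch below counts the digits after the last '.' with `Option::map_or`. The function name `fractional_digits` is hypothetical and not taken from the PR.

```rust
/// Number of characters after the last '.', or 0 when there is no '.'.
fn fractional_digits(input: &str) -> usize {
    input.rsplit_once('.').map_or(0, |(_, frac)| frac.len())
}
```

`map_or` fits here because the default is a plain constant; `map_or_else` is mainly useful when computing the default is expensive.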

scovich · 2026-04-13T13:45:43Z

+            Variant::Int8(_) | Variant::Int16(_) | Variant::Int32(_) | Variant::Int64(_) => self
+                .as_num::<i64>()
+                .map(|x| <i128 as From<i64>>::from(x).try_into().ok())
+                .unwrap(),


Why unwrap an option in a function that returns Option?

Suggested change

-            Variant::Int8(_) | Variant::Int16(_) | Variant::Int32(_) | Variant::Int64(_) => self
-                .as_num::<i64>()
-                .map(|x| <i128 as From<i64>>::from(x).try_into().ok())
-                .unwrap(),
+            Variant::Int8(_) | Variant::Int16(_) | Variant::Int32(_) | Variant::Int64(_) => {
+                let x = self.as_num::<i64>()?;
+                i128::from(x).try_into().ok()
+            }

Also: the <i128 as From<i64>> is surprising -- does the compiler actually need it for some reason?

Changed to the suggestion, it looks better

I used unwrap here because it is always safe. I didn't add a comment because I thought it was obvious.

Yes, the compiler needs it because multiple candidates for from are found:

Note: candidate #1 is defined in the trait `From`
Note: candidate #2 is defined in an impl of the trait `num_traits::NumCast` for the type `i128`
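The ambiguity can be reproduced without num_traits. In the sketch below, `CheckedFrom` is a hypothetical local trait standing in for `num_traits::NumCast`; the assumption is that any in-scope trait exposing an associated function named `from` on `i128` competes with `From::from` in the same way.

```rust
/// Hypothetical stand-in for a trait such as `num_traits::NumCast`, which
/// also exposes an associated function named `from`.
trait CheckedFrom: Sized {
    fn from(value: i64) -> Option<Self>;
}

impl CheckedFrom for i128 {
    fn from(value: i64) -> Option<Self> {
        Some(<i128 as From<i64>>::from(value))
    }
}

/// With both traits in scope, the call names the trait explicitly so the
/// compiler does not have to choose between two `from` candidates.
fn widen(x: i64) -> i128 {
    <i128 as From<i64>>::from(x)
}

/// The other candidate can be selected the same way.
fn widen_checked(x: i64) -> Option<i128> {
    <i128 as CheckedFrom>::from(x)
}
```

Fully qualified syntax keeps the lossless `From` conversion visible while sidestepping the name clash.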

klion26

@scovich, thanks for the review. I've addressed the comments.

I didn't find time to add and run the benchmark today; I will find time tomorrow.


klion26

@scovich I've updated the code and fixed a performance regression, please take a look when you're free. Thanks.

I've filed a PR for the benchmark in #9729. The benchmark is added in a separate PR so that we can compare the results before and after this PR is merged on the remote machine.

klion26 · 2026-04-15T13:08:32Z

@@ -833,11 +844,11 @@ where
                    if array.is_null(i) {
                        value_builder.append_null();
                    } else {


Changed back to the if/else and match pattern, because:

1. We need to distinguish the safe and non-safe paths for performance reasons. In the last commit, safe mode constructed an ArrowError (which calls format!) and then dropped it, which caused a 50%+ performance regression in the benchmark.
2. After step 1, there seems to be little gain in unifying the logic here.

Ah, I totally missed the spurious error allocation pitfall 🤦. Glad your benchmarking uncovered it!

If you really wanted to unify without the overhead, a helper that returns Result<T, D::Native> should do the trick: Ok(v) is a valid value, and Err(v) is the out of gamut value. The value would be super cheap, and the safe path does .ok() while the unsafe path does .map_err(|v| ArrowError::CastError(...)).

But probably not worth it, especially given that the checked mul/div also produce ArrowError via ?.
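The idea of returning the offending value instead of a formatted error can be sketched as below. This is only an illustration of the suggestion, with hypothetical function names and a plain `String` error in place of ArrowError.

```rust
/// Hypothetical helper: on failure it hands back the offending value
/// instead of building an error message.
fn narrow(value: i64) -> Result<i32, i64> {
    i32::try_from(value).map_err(|_| value)
}

/// Safe path: an out-of-range value becomes `None`; no message is formatted.
fn narrow_safe(value: i64) -> Option<i32> {
    narrow(value).ok()
}

/// Strict path: the error message is built only when the cast fails.
fn narrow_strict(value: i64) -> Result<i32, String> {
    narrow(value).map_err(|v| format!("value {v} is out of range for i32"))
}
```

The design choice is that the shared helper never allocates; only the strict caller pays for `format!`, and only on the failure path.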

klion26 · 2026-04-15T13:10:37Z

+    } else {
+        value.div_checked(div)?
+    };
+    T::from::<D::Native>(v).ok_or_else(|| {


I did not unify these two functions, because if I unify them with a common function like

fn cast_single_decimal_to_integer<D, T>(...) -> Result<Option<T>, ArrowError> {
    let v = if negative {
        value.mul_checked(div)?
    } else {
        value.div_checked(div)?
    };
    Ok(T::from::<D::Native>(v))
}

Then, in the caller function, I can't access the value of v above, which would make the error message in cast_single_decimal_to_integer_result wrong.

Ah tricky indeed.

scovich

LGTM, pending benchmark results once #9729 merges.


alamb · 2026-04-19T13:24:35Z

Are we ready to merge this one in?

scovich · 2026-04-20T16:58:43Z

Are we ready to merge this one in?

I believe @klion26 was planning to merge benchmarks first so we could validate the performance impact of this PR?

I've filed a PR for the benchmark in #9729. The benchmark is added in a separate PR so that we can compare the results before and after this PR is merged on the remote machine.

alamb · 2026-04-22T13:36:23Z

run benchmark cast_kernels

adriangbot · 2026-04-22T13:40:02Z

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4296656691-1746-zx2qs 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing variant-cast-decimal (15f175d) to 9a2b49c (merge-base) diff
BENCH_NAME=cast_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench cast_kernels
BENCH_FILTER=
Results will be posted here when complete

adriangbot · 2026-04-22T14:19:16Z

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                                              main                                   variant-cast-decimal
-----                                                              ----                                   --------------------
"cast decimal128 to float32"                                       1.00     27.3±0.01µs        ? ?/sec    1.00     27.4±0.02µs        ? ?/sec
"cast decimal128 to float64"                                       1.00     27.1±0.01µs        ? ?/sec    1.00     27.2±0.03µs        ? ?/sec
"cast decimal128 to int16"                                         1.00     52.9±0.63µs        ? ?/sec    1.00     52.9±0.74µs        ? ?/sec
"cast decimal128 to int32"                                         1.02     38.0±0.09µs        ? ?/sec    1.00     37.3±0.10µs        ? ?/sec
"cast decimal128 to int64"                                         1.00     36.3±0.09µs        ? ?/sec    1.00     36.3±0.11µs        ? ?/sec
"cast decimal128 to int8"                                          1.00     51.3±0.58µs        ? ?/sec    1.00     51.3±0.78µs        ? ?/sec
"cast decimal128 to uint16"                                        1.00     54.1±0.71µs        ? ?/sec    1.00     54.3±0.81µs        ? ?/sec
"cast decimal128 to uint32"                                        1.00     36.0±0.08µs        ? ?/sec    1.00     35.9±0.17µs        ? ?/sec
"cast decimal128 to uint64"                                        1.00     35.1±0.18µs        ? ?/sec    1.01     35.4±0.07µs        ? ?/sec
"cast decimal128 to uint8"                                         1.00     49.9±0.41µs        ? ?/sec    1.00     49.7±0.66µs        ? ?/sec
"cast decimal256 to float32"                                       1.00     70.4±0.04µs        ? ?/sec    1.00     70.4±0.03µs        ? ?/sec
"cast decimal256 to float64"                                       1.00     68.4±0.02µs        ? ?/sec    1.00     68.5±0.02µs        ? ?/sec
"cast decimal256 to int16"                                         1.01    165.8±0.77µs        ? ?/sec    1.00    164.9±0.86µs        ? ?/sec
"cast decimal256 to int32"                                         1.00    144.4±0.15µs        ? ?/sec    1.00    144.5±0.34µs        ? ?/sec
"cast decimal256 to int64"                                         1.01    142.3±0.72µs        ? ?/sec    1.00    141.0±0.38µs        ? ?/sec
"cast decimal256 to int8"                                          1.00    161.9±0.88µs        ? ?/sec    1.00    161.1±0.97µs        ? ?/sec
"cast decimal256 to uint16"                                        1.07    166.9±0.46µs        ? ?/sec    1.00    156.3±0.83µs        ? ?/sec
"cast decimal256 to uint32"                                        1.00    132.4±0.46µs        ? ?/sec    1.00    132.7±0.75µs        ? ?/sec
"cast decimal256 to uint64"                                        1.01    132.8±0.50µs        ? ?/sec    1.00    131.8±0.74µs        ? ?/sec
"cast decimal256 to uint8"                                         1.09    163.6±0.63µs        ? ?/sec    1.00    150.7±0.89µs        ? ?/sec
"cast decimal32 to float32"                                        1.01      6.9±0.00µs        ? ?/sec    1.00      6.8±0.00µs        ? ?/sec
"cast decimal32 to float64"                                        1.00      6.8±0.01µs        ? ?/sec    1.00      6.8±0.00µs        ? ?/sec
"cast decimal32 to int16"                                          1.00     26.3±0.29µs        ? ?/sec    1.01     26.5±0.29µs        ? ?/sec
"cast decimal32 to int32"                                          1.00     20.4±0.28µs        ? ?/sec    1.16     23.5±0.47µs        ? ?/sec
"cast decimal32 to int64"                                          1.00     20.4±0.17µs        ? ?/sec    1.20     24.6±0.46µs        ? ?/sec
"cast decimal32 to int8"                                           1.00     33.5±0.61µs        ? ?/sec    1.01     33.7±0.36µs        ? ?/sec
"cast decimal32 to uint16"                                         1.00     26.4±0.38µs        ? ?/sec    1.02     26.9±0.41µs        ? ?/sec
"cast decimal32 to uint32"                                         1.00     20.1±0.20µs        ? ?/sec    1.18     23.6±0.35µs        ? ?/sec
"cast decimal32 to uint64"                                         1.00     20.7±0.21µs        ? ?/sec    1.19     24.7±0.41µs        ? ?/sec
"cast decimal32 to uint8"                                          1.05     35.4±0.52µs        ? ?/sec    1.00     33.6±0.45µs        ? ?/sec
"cast decimal64 to float32"                                        1.00      6.8±0.00µs        ? ?/sec    1.00      6.8±0.00µs        ? ?/sec
"cast decimal64 to float64"                                        1.00      6.8±0.00µs        ? ?/sec    1.00      6.9±0.00µs        ? ?/sec
"cast decimal64 to int16"                                          1.00     34.0±0.31µs        ? ?/sec    1.90     64.7±0.36µs        ? ?/sec
"cast decimal64 to int32"                                          1.00     26.5±0.06µs        ? ?/sec    1.00     26.5±0.10µs        ? ?/sec
"cast decimal64 to int64"                                          1.00     26.4±0.10µs        ? ?/sec    1.01     26.7±0.09µs        ? ?/sec
"cast decimal64 to int8"                                           1.00     34.4±0.26µs        ? ?/sec    1.88     64.6±0.28µs        ? ?/sec
"cast decimal64 to uint16"                                         1.00     34.2±0.21µs        ? ?/sec    1.89     64.6±0.27µs        ? ?/sec
"cast decimal64 to uint32"                                         1.00     26.1±0.08µs        ? ?/sec    1.00     26.0±0.04µs        ? ?/sec
"cast decimal64 to uint64"                                         1.00     25.8±0.05µs        ? ?/sec    1.02     26.4±0.08µs        ? ?/sec
"cast decimal64 to uint8"                                          1.00     34.2±0.26µs        ? ?/sec    1.84     62.8±0.36µs        ? ?/sec
"cast float32 to decimal128(32, 3)"                                1.00     33.8±0.32µs        ? ?/sec    1.00     33.8±0.33µs        ? ?/sec
"cast float32 to decimal256(76, 4)"                                1.00    501.1±4.51µs        ? ?/sec    1.02   508.8±14.72µs        ? ?/sec
"cast float32 to decimal32(9, 2)"                                  1.05     20.8±0.99µs        ? ?/sec    1.00     19.8±0.66µs        ? ?/sec
"cast float32 to decimal64(18, 2"                                  1.00     21.8±0.96µs        ? ?/sec    1.00     21.8±0.66µs        ? ?/sec
"cast float64 to decimal128(32, 3)"                                1.00     32.2±0.30µs        ? ?/sec    1.00     32.1±0.41µs        ? ?/sec
"cast float64 to decimal256(76, 4)"                                1.00    500.3±5.69µs        ? ?/sec    1.02   511.2±15.91µs        ? ?/sec
"cast float64 to decimal32(9, 2)"                                  1.00     21.3±0.79µs        ? ?/sec    1.03     21.8±1.18µs        ? ?/sec
"cast float64 to decimal64(18, 2"                                  1.00     21.2±0.62µs        ? ?/sec    1.03     21.9±0.67µs        ? ?/sec
"cast invalid float32 to decimal128(32, 3)"                        1.03     23.4±1.07µs        ? ?/sec    1.00     22.8±1.15µs        ? ?/sec
"cast invalid float32 to decimal256(76, 4)"                        1.00     39.3±0.58µs        ? ?/sec    1.02     40.0±0.71µs        ? ?/sec
"cast invalid float32 to decimal32(9, 2)"                          1.07     22.1±2.78µs        ? ?/sec    1.00     20.7±1.72µs        ? ?/sec
"cast invalid float32 to decimal64(18, 2"                          1.00     22.8±1.07µs        ? ?/sec    1.05     24.0±1.23µs        ? ?/sec
"cast invalid float64 to decimal32(9, 2)"                          1.02     22.0±1.85µs        ? ?/sec    1.00     21.6±0.82µs        ? ?/sec
"cast invalid float64 to to decimal128(32, 3)"                     1.00     23.5±1.28µs        ? ?/sec    1.00     23.4±1.40µs        ? ?/sec
"cast invalid float64 to to decimal256(76, 4)"                     1.00     38.7±0.53µs        ? ?/sec    1.03     39.8±0.84µs        ? ?/sec
"cast invalid float64 to to decimal64(18, 2)"                      1.02     23.8±1.11µs        ? ?/sec    1.00     23.2±1.34µs        ? ?/sec
"cast invalid string to decimal128(38, 3)"                         1.00    713.9±0.64µs        ? ?/sec    1.00    711.2±1.39µs        ? ?/sec
"cast invalid string to decimal256(76, 4)"                         1.00    714.1±0.84µs        ? ?/sec    1.00    715.0±1.25µs        ? ?/sec
"cast invalid string to decimal32(9, 2)"                           1.00    682.6±1.09µs        ? ?/sec    1.00    684.9±2.33µs        ? ?/sec
"cast invalid string to decimal64(18, 2)"                          1.00    686.1±0.94µs        ? ?/sec    1.01    693.0±1.14µs        ? ?/sec
"cast string to decimal128(38, 3)"                                 1.00    642.3±0.69µs        ? ?/sec    1.00    641.4±0.52µs        ? ?/sec
"cast string to decimal256(76, 4)"                                 1.01    660.8±0.61µs        ? ?/sec    1.00    657.2±0.51µs        ? ?/sec
"cast string to decimal32(9, 2)"                                   1.01    794.2±0.46µs        ? ?/sec    1.00    789.7±0.58µs        ? ?/sec
"cast string to decimal64(18, 2)"                                  1.00    619.2±0.63µs        ? ?/sec    1.00    621.1±0.42µs        ? ?/sec
cast binary view to string                                         1.00     58.4±0.49µs        ? ?/sec    1.12     65.1±0.42µs        ? ?/sec
cast binary view to string view                                    1.00     65.0±0.36µs        ? ?/sec    1.00     64.9±0.28µs        ? ?/sec
cast binary view to wide string                                    1.00     59.0±0.51µs        ? ?/sec    1.00     58.7±0.49µs        ? ?/sec
cast date32 to date64 512                                          1.00    322.1±2.40ns        ? ?/sec    1.00    320.8±0.48ns        ? ?/sec
cast date64 to date32 512                                          1.00    405.0±2.24ns        ? ?/sec    1.01    409.9±0.49ns        ? ?/sec
cast decimal128 to decimal128 512                                  1.00      6.9±0.01µs        ? ?/sec    1.00      6.9±0.01µs        ? ?/sec
cast decimal128 to decimal128 512 lower precision                  1.00     13.5±0.02µs        ? ?/sec    1.00     13.5±0.04µs        ? ?/sec
cast decimal128 to decimal128 512 with lower scale (infallible)    1.00     45.9±0.08µs        ? ?/sec    1.00     45.9±0.05µs        ? ?/sec
cast decimal128 to decimal128 512 with same scale                  1.00     76.2±0.53ns        ? ?/sec    1.03     78.3±2.17ns        ? ?/sec
cast decimal128 to decimal256 512                                  1.00     26.2±0.03µs        ? ?/sec    1.00     26.3±0.03µs        ? ?/sec
cast decimal256 to decimal128 512                                  1.00    309.9±0.29µs        ? ?/sec    1.00    309.9±0.24µs        ? ?/sec
cast decimal256 to decimal256 512                                  1.00     82.0±0.07µs        ? ?/sec    1.00     82.0±0.07µs        ? ?/sec
cast decimal256 to decimal256 512 with same scale                  1.00     77.5±0.94ns        ? ?/sec    1.01     78.2±1.51ns        ? ?/sec
cast decimal32 to decimal32 512                                    1.00      8.6±0.01µs        ? ?/sec    1.00      8.7±0.02µs        ? ?/sec
cast decimal32 to decimal32 512 lower precision                    1.00     10.1±0.03µs        ? ?/sec    1.00     10.1±0.04µs        ? ?/sec
cast decimal32 to decimal64 512                                    1.04      3.5±0.05µs        ? ?/sec    1.00      3.4±0.00µs        ? ?/sec
cast decimal64 to decimal32 512                                    1.00     32.4±0.02µs        ? ?/sec    1.00     32.4±0.02µs        ? ?/sec
cast decimal64 to decimal64 512                                    1.00      2.8±0.00µs        ? ?/sec    1.00      2.9±0.00µs        ? ?/sec
cast dict to string view                                           1.00     40.7±1.48µs        ? ?/sec    1.01     41.0±1.98µs        ? ?/sec
cast f32 to string 512                                             1.00     11.6±0.03µs        ? ?/sec    1.01     11.7±0.06µs        ? ?/sec
cast f64 to string 512                                             1.00     15.4±0.04µs        ? ?/sec    1.00     15.4±0.04µs        ? ?/sec
cast float32 to int32 512                                          1.01   1376.6±5.95ns        ? ?/sec    1.00   1367.3±4.27ns        ? ?/sec
cast float64 to float32 512                                        1.01    705.6±3.66ns        ? ?/sec    1.00    697.9±2.06ns        ? ?/sec
cast float64 to uint64 512                                         1.00  1389.9±12.21ns        ? ?/sec    1.02  1414.1±32.50ns        ? ?/sec
cast i64 to string 512                                             1.00      8.9±0.03µs        ? ?/sec    1.00      8.9±0.04µs        ? ?/sec
cast int32 to float32 512                                          1.00    702.3±5.53ns        ? ?/sec    1.01    712.0±1.89ns        ? ?/sec
cast int32 to float64 512                                          1.02    722.5±3.96ns        ? ?/sec    1.00    708.8±2.20ns        ? ?/sec
cast int32 to int32 512                                            1.03    176.6±2.37ns        ? ?/sec    1.00    171.2±0.83ns        ? ?/sec
cast int32 to int64 512                                            1.06    715.9±5.18ns        ? ?/sec    1.00    672.6±2.47ns        ? ?/sec
cast int32 to uint32 512                                           1.01   1388.9±7.70ns        ? ?/sec    1.00   1375.1±1.61ns        ? ?/sec
cast int64 to int32 512                                            1.03   1513.6±3.56ns        ? ?/sec    1.00   1465.7±1.47ns        ? ?/sec
cast no runs of int32s to ree<int32>                               1.00     58.2±0.94µs        ? ?/sec    1.00     58.3±0.91µs        ? ?/sec
cast runs of 10 string to ree<int32>                               1.01      8.8±0.04µs        ? ?/sec    1.00      8.7±0.05µs        ? ?/sec
cast runs of 1000 int32s to ree<int32>                             1.00      3.5±0.01µs        ? ?/sec    1.00      3.5±0.01µs        ? ?/sec
cast string single run to ree<int32>                               1.00     27.5±0.12µs        ? ?/sec    1.00     27.4±0.03µs        ? ?/sec
cast string to binary view 512                                     1.02      2.3±0.02µs        ? ?/sec    1.00      2.3±0.02µs        ? ?/sec
cast string view to binary view                                    1.01     74.2±0.74ns        ? ?/sec    1.00     73.2±0.76ns        ? ?/sec
cast string view to dict                                           1.00    174.6±0.67µs        ? ?/sec    1.00    174.7±0.59µs        ? ?/sec
cast string view to string                                         1.02     45.3±2.03µs        ? ?/sec    1.00     44.4±2.04µs        ? ?/sec
cast string view to wide string                                    1.00     46.5±1.87µs        ? ?/sec    1.00     46.4±1.99µs        ? ?/sec
cast time32s to time32ms 512                                       1.00    140.6±2.26ns        ? ?/sec    1.04    145.7±0.25ns        ? ?/sec
cast time32s to time64us 512                                       1.00    322.4±2.18ns        ? ?/sec    1.00    321.5±0.92ns        ? ?/sec
cast time64ns to time32s 512                                       1.00    404.2±2.29ns        ? ?/sec    1.00    402.8±0.24ns        ? ?/sec
cast timestamp_ms to i64 512                                       1.00    251.2±0.86ns        ? ?/sec    1.01    253.2±3.82ns        ? ?/sec
cast timestamp_ms to timestamp_ns 512                              1.00   1822.9±2.71ns        ? ?/sec    1.02   1865.8±3.74ns        ? ?/sec
cast timestamp_ns to timestamp_s 512                               1.02    173.2±1.06ns        ? ?/sec    1.00    170.2±1.32ns        ? ?/sec
cast utf8 to date32 512                                            1.00      6.4±0.04µs        ? ?/sec    1.00      6.4±0.03µs        ? ?/sec
cast utf8 to date64 512                                            1.03     34.7±0.14µs        ? ?/sec    1.00     33.6±0.21µs        ? ?/sec
cast utf8 to f32                                                   1.00      5.6±0.03µs        ? ?/sec    1.00      5.6±0.02µs        ? ?/sec
cast wide string to binary view 512                                1.00      4.1±0.10µs        ? ?/sec    1.00      4.1±0.07µs        ? ?/sec

Resource Usage

base (merge-base)

Metric	Value
Wall time	1140.3s
Peak memory	3.1 GiB
Avg memory	3.0 GiB
CPU user	1137.2s
CPU sys	0.8s
Peak spill	0 B

branch

Metric	Value
Wall time	1135.2s
Peak memory	3.0 GiB
Avg memory	3.0 GiB
CPU user	1134.0s
CPU sys	0.2s
Peak spill	0 B

klion26 · 2026-04-22T14:28:53Z

The last force push did not change any code. I only rebased onto the main branch and tried to run a benchmark, and did not notice that @alamb had already helped trigger one. I apologize for any confusion this may have caused.

alamb · 2026-04-22T14:42:51Z

run benchmarks cast_kernels

alamb · 2026-04-22T14:43:16Z

Thanks @klion26

Several of the benchmarks seem to show this PR slowing things down. I will rerun the benchmarks to see if we can reproduce those results.
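
For anyone skimming the table, a rough way to pick out the rows where the branch is slower is to read the ratio column for the branch and filter on it. The sketch below is only an illustration: `parse_row`, `slower_on_branch`, and the 1.10 threshold are hypothetical names and values, not part of arrow-rs or the benchmark runner, and it assumes the row layout shown in this thread (a name followed by two `ratio time±err ? ?/sec` groups).

```rust
/// Parse one table row into (name, base_ratio, branch_ratio).
/// Assumption: the row ends with eight whitespace-separated tokens,
/// `ratio time±err ? ?/sec` for the base and then for the branch.
fn parse_row(line: &str) -> Option<(String, f64, f64)> {
    let tokens: Vec<&str> = line.split_whitespace().collect();
    if tokens.len() < 9 {
        // Header, separator, and blank lines do not have enough columns.
        return None;
    }
    let n = tokens.len();
    let base = tokens[n - 8].parse::<f64>().ok()?;
    let branch = tokens[n - 4].parse::<f64>().ok()?;
    Some((tokens[..n - 8].join(" "), base, branch))
}

/// Return the rows whose branch ratio is at or above `threshold`
/// (hypothetical cutoff; ratios are relative to the faster side).
fn slower_on_branch(report: &str, threshold: f64) -> Vec<(String, f64)> {
    report
        .lines()
        .filter_map(parse_row)
        .filter(|(_, _, branch)| *branch >= threshold)
        .map(|(name, _, branch)| (name, branch))
        .collect()
}

fn main() {
    // Three rows copied from the earlier run posted in this thread.
    let report = r#"
"cast decimal32 to int32"   1.00  20.4±0.28µs  ? ?/sec  1.16  23.5±0.47µs  ? ?/sec
"cast decimal64 to int16"   1.00  34.0±0.31µs  ? ?/sec  1.90  64.7±0.36µs  ? ?/sec
"cast decimal64 to int32"   1.00  26.5±0.06µs  ? ?/sec  1.00  26.5±0.10µs  ? ?/sec
"#;
    for (name, ratio) in slower_on_branch(report, 1.10) {
        println!("{name}: {ratio:.2}x slower on the branch");
    }
}
```

A single run can be noisy, so a filter like this only narrows down which rows are worth checking again in the rerun.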

adriangbot · 2026-04-22T14:46:49Z

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4297214665-1751-5fxnw 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing variant-cast-decimal (e222986) to 9a2b49c (merge-base) diff
BENCH_NAME=cast_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench cast_kernels
BENCH_FILTER=
Results will be posted here when complete

adriangbot · 2026-04-22T15:25:16Z

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                                              main                                   variant-cast-decimal
-----                                                              ----                                   --------------------
"cast decimal128 to float32"                                       1.00     27.3±0.01µs        ? ?/sec    1.00     27.3±0.01µs        ? ?/sec
"cast decimal128 to float64"                                       1.00     27.1±0.01µs        ? ?/sec    1.00     27.1±0.01µs        ? ?/sec
"cast decimal128 to int16"                                         1.00     53.0±0.63µs        ? ?/sec    1.00     53.1±0.85µs        ? ?/sec
"cast decimal128 to int32"                                         1.02     38.0±0.10µs        ? ?/sec    1.00     37.3±0.09µs        ? ?/sec
"cast decimal128 to int64"                                         1.00     36.4±0.09µs        ? ?/sec    1.00     36.2±0.11µs        ? ?/sec
"cast decimal128 to int8"                                          1.00     51.3±0.64µs        ? ?/sec    1.00     51.5±0.91µs        ? ?/sec
"cast decimal128 to uint16"                                        1.00     54.2±0.76µs        ? ?/sec    1.01     54.5±0.94µs        ? ?/sec
"cast decimal128 to uint32"                                        1.01     36.0±0.07µs        ? ?/sec    1.00     35.8±0.07µs        ? ?/sec
"cast decimal128 to uint64"                                        1.00     35.1±0.17µs        ? ?/sec    1.01     35.5±0.06µs        ? ?/sec
"cast decimal128 to uint8"                                         1.00     49.9±0.46µs        ? ?/sec    1.00     49.8±0.74µs        ? ?/sec
"cast decimal256 to float32"                                       1.00     70.4±0.03µs        ? ?/sec    1.00     70.4±0.03µs        ? ?/sec
"cast decimal256 to float64"                                       1.00     68.5±0.02µs        ? ?/sec    1.00     68.5±0.03µs        ? ?/sec
"cast decimal256 to int16"                                         1.01    165.6±0.85µs        ? ?/sec    1.00    164.7±0.73µs        ? ?/sec
"cast decimal256 to int32"                                         1.00    144.3±0.17µs        ? ?/sec    1.00    144.5±0.50µs        ? ?/sec
"cast decimal256 to int64"                                         1.01    142.2±0.75µs        ? ?/sec    1.00    141.1±0.45µs        ? ?/sec
"cast decimal256 to int8"                                          1.01    161.7±0.95µs        ? ?/sec    1.00    160.6±0.80µs        ? ?/sec
"cast decimal256 to uint16"                                        1.06    165.9±0.44µs        ? ?/sec    1.00    156.3±0.71µs        ? ?/sec
"cast decimal256 to uint32"                                        1.00    132.9±0.38µs        ? ?/sec    1.02    135.0±1.10µs        ? ?/sec
"cast decimal256 to uint64"                                        1.00    132.6±0.41µs        ? ?/sec    1.00    132.0±0.82µs        ? ?/sec
"cast decimal256 to uint8"                                         1.08    163.3±0.64µs        ? ?/sec    1.00    150.7±0.77µs        ? ?/sec
"cast decimal32 to float32"                                        1.00      6.8±0.00µs        ? ?/sec    1.00      6.8±0.00µs        ? ?/sec
"cast decimal32 to float64"                                        1.00      6.8±0.00µs        ? ?/sec    1.00      6.8±0.00µs        ? ?/sec
"cast decimal32 to int16"                                          1.00     25.9±0.25µs        ? ?/sec    1.02     26.5±0.39µs        ? ?/sec
"cast decimal32 to int32"                                          1.00     20.3±0.27µs        ? ?/sec    1.18     23.9±0.33µs        ? ?/sec
"cast decimal32 to int64"                                          1.00     20.4±0.21µs        ? ?/sec    1.20     24.5±0.64µs        ? ?/sec
"cast decimal32 to int8"                                           1.00     33.3±0.58µs        ? ?/sec    1.01     33.7±0.31µs        ? ?/sec
"cast decimal32 to uint16"                                         1.00     26.3±0.22µs        ? ?/sec    1.04     27.3±0.39µs        ? ?/sec
"cast decimal32 to uint32"                                         1.00     20.3±0.19µs        ? ?/sec    1.15     23.3±0.34µs        ? ?/sec
"cast decimal32 to uint64"                                         1.00     20.3±0.16µs        ? ?/sec    1.18     24.1±0.32µs        ? ?/sec
"cast decimal32 to uint8"                                          1.07     36.1±0.58µs        ? ?/sec    1.00     33.8±0.42µs        ? ?/sec
"cast decimal64 to float32"                                        1.00      6.8±0.00µs        ? ?/sec    1.00      6.8±0.00µs        ? ?/sec
"cast decimal64 to float64"                                        1.00      6.8±0.00µs        ? ?/sec    1.00      6.8±0.00µs        ? ?/sec
"cast decimal64 to int16"                                          1.00     34.0±0.34µs        ? ?/sec    1.91     64.8±0.43µs        ? ?/sec
"cast decimal64 to int32"                                          1.00     26.4±0.08µs        ? ?/sec    1.01     26.7±0.14µs        ? ?/sec
"cast decimal64 to int64"                                          1.00     26.3±0.09µs        ? ?/sec    1.00     26.4±0.09µs        ? ?/sec
"cast decimal64 to int8"                                           1.00     34.2±0.23µs        ? ?/sec    1.89     64.6±0.29µs        ? ?/sec
"cast decimal64 to uint16"                                         1.00     34.1±0.26µs        ? ?/sec    1.90     65.0±0.40µs        ? ?/sec
"cast decimal64 to uint32"                                         1.01     26.2±0.08µs        ? ?/sec    1.00     26.0±0.04µs        ? ?/sec
"cast decimal64 to uint64"                                         1.00     25.8±0.05µs        ? ?/sec    1.02     26.3±0.07µs        ? ?/sec
"cast decimal64 to uint8"                                          1.00     34.0±0.27µs        ? ?/sec    1.85     63.0±0.38µs        ? ?/sec
"cast float32 to decimal128(32, 3)"                                1.00     33.9±0.45µs        ? ?/sec    1.00     33.9±0.44µs        ? ?/sec
"cast float32 to decimal256(76, 4)"                                1.00    501.0±4.98µs        ? ?/sec    1.00    500.7±3.48µs        ? ?/sec
"cast float32 to decimal32(9, 2)"                                  1.06     21.0±0.90µs        ? ?/sec    1.00     19.8±1.01µs        ? ?/sec
"cast float32 to decimal64(18, 2"                                  1.00     21.6±0.76µs        ? ?/sec    1.03     22.1±0.80µs        ? ?/sec
"cast float64 to decimal128(32, 3)"                                1.00     32.1±0.45µs        ? ?/sec    1.00     32.0±0.50µs        ? ?/sec
"cast float64 to decimal256(76, 4)"                                1.00    499.2±6.29µs        ? ?/sec    1.01    502.5±5.42µs        ? ?/sec
"cast float64 to decimal32(9, 2)"                                  1.00     20.8±0.81µs        ? ?/sec    1.03     21.5±0.92µs        ? ?/sec
"cast float64 to decimal64(18, 2"                                  1.00     21.6±0.61µs        ? ?/sec    1.00     21.5±0.54µs        ? ?/sec
"cast invalid float32 to decimal128(32, 3)"                        1.03     22.8±0.79µs        ? ?/sec    1.00     22.1±1.01µs        ? ?/sec
"cast invalid float32 to decimal256(76, 4)"                        1.00     39.5±0.67µs        ? ?/sec    1.02     40.2±0.62µs        ? ?/sec
"cast invalid float32 to decimal32(9, 2)"                          1.11     22.9±2.33µs        ? ?/sec    1.00     20.7±1.76µs        ? ?/sec
"cast invalid float32 to decimal64(18, 2"                          1.06     24.5±2.23µs        ? ?/sec    1.00     23.2±1.05µs        ? ?/sec
"cast invalid float64 to decimal32(9, 2)"                          1.04     21.5±1.23µs        ? ?/sec    1.00     20.7±1.24µs        ? ?/sec
"cast invalid float64 to to decimal128(32, 3)"                     1.00     22.8±0.89µs        ? ?/sec    1.00     22.7±1.12µs        ? ?/sec
"cast invalid float64 to to decimal256(76, 4)"                     1.00     39.0±0.62µs        ? ?/sec    1.03     40.2±1.48µs        ? ?/sec
"cast invalid float64 to to decimal64(18, 2)"                      1.00     23.1±1.06µs        ? ?/sec    1.03     23.7±1.26µs        ? ?/sec
"cast invalid string to decimal128(38, 3)"                         1.01    713.4±0.96µs        ? ?/sec    1.00    709.7±0.93µs        ? ?/sec
"cast invalid string to decimal256(76, 4)"                         1.00    712.8±0.93µs        ? ?/sec    1.00    713.2±0.74µs        ? ?/sec
"cast invalid string to decimal32(9, 2)"                           1.00    682.2±1.34µs        ? ?/sec    1.00    680.8±0.92µs        ? ?/sec
"cast invalid string to decimal64(18, 2)"                          1.00    685.1±0.99µs        ? ?/sec    1.02    698.4±1.24µs        ? ?/sec
"cast string to decimal128(38, 3)"                                 1.00    642.6±0.41µs        ? ?/sec    1.00    641.6±0.58µs        ? ?/sec
"cast string to decimal256(76, 4)"                                 1.00    659.0±0.59µs        ? ?/sec    1.00    657.2±0.70µs        ? ?/sec
"cast string to decimal32(9, 2)"                                   1.00    792.3±0.56µs        ? ?/sec    1.00    789.5±0.50µs        ? ?/sec
"cast string to decimal64(18, 2)"                                  1.00    619.1±0.64µs        ? ?/sec    1.00    621.2±0.58µs        ? ?/sec
cast binary view to string                                         1.01     59.0±1.69µs        ? ?/sec    1.00     58.2±0.47µs        ? ?/sec
cast binary view to string view                                    1.00     65.0±0.35µs        ? ?/sec    1.00     64.9±0.28µs        ? ?/sec
cast binary view to wide string                                    1.00     58.5±0.51µs        ? ?/sec    1.00     58.7±0.49µs        ? ?/sec
cast date32 to date64 512                                          1.00    321.8±1.21ns        ? ?/sec    1.03    331.9±1.43ns        ? ?/sec
cast date64 to date32 512                                          1.00    399.7±0.32ns        ? ?/sec    1.04    415.2±1.68ns        ? ?/sec
cast decimal128 to decimal128 512                                  1.00      6.9±0.02µs        ? ?/sec    1.00      6.9±0.01µs        ? ?/sec
cast decimal128 to decimal128 512 lower precision                  1.08     14.5±0.12µs        ? ?/sec    1.00     13.5±0.04µs        ? ?/sec
cast decimal128 to decimal128 512 with lower scale (infallible)    1.00     45.9±0.05µs        ? ?/sec    1.00     45.9±0.06µs        ? ?/sec
cast decimal128 to decimal128 512 with same scale                  1.00     75.2±0.26ns        ? ?/sec    1.02     76.5±0.43ns        ? ?/sec
cast decimal128 to decimal256 512                                  1.00     26.3±0.01µs        ? ?/sec    1.00     26.2±0.01µs        ? ?/sec
cast decimal256 to decimal128 512                                  1.00    309.9±0.30µs        ? ?/sec    1.00    310.4±0.29µs        ? ?/sec
cast decimal256 to decimal256 512                                  1.00     82.1±0.08µs        ? ?/sec    1.00     82.1±0.07µs        ? ?/sec
cast decimal256 to decimal256 512 with same scale                  1.00     75.9±0.50ns        ? ?/sec    1.02     77.5±0.83ns        ? ?/sec
cast decimal32 to decimal32 512                                    1.00      8.6±0.01µs        ? ?/sec    1.01      8.6±0.01µs        ? ?/sec
cast decimal32 to decimal32 512 lower precision                    1.01     10.1±0.05µs        ? ?/sec    1.00     10.1±0.02µs        ? ?/sec
cast decimal32 to decimal64 512                                    1.00      3.4±0.01µs        ? ?/sec    1.00      3.3±0.00µs        ? ?/sec
cast decimal64 to decimal32 512                                    1.00     32.4±0.04µs        ? ?/sec    1.00     32.4±0.01µs        ? ?/sec
cast decimal64 to decimal64 512                                    1.00      2.8±0.01µs        ? ?/sec    1.00      2.9±0.00µs        ? ?/sec
cast dict to string view                                           1.00     41.0±1.78µs        ? ?/sec    1.00     40.8±2.01µs        ? ?/sec
cast f32 to string 512                                             1.00     11.6±0.05µs        ? ?/sec    1.01     11.7±0.04µs        ? ?/sec
cast f64 to string 512                                             1.01     15.4±0.04µs        ? ?/sec    1.00     15.2±0.04µs        ? ?/sec
cast float32 to int32 512                                          1.05  1398.2±16.41ns        ? ?/sec    1.00   1334.9±5.06ns        ? ?/sec
cast float64 to float32 512                                        1.00    666.7±2.37ns        ? ?/sec    1.03    689.0±2.63ns        ? ?/sec
cast float64 to uint64 512                                         1.02  1406.7±42.41ns        ? ?/sec    1.00  1385.9±10.85ns        ? ?/sec
cast i64 to string 512                                             1.00      8.9±0.04µs        ? ?/sec    1.00      8.9±0.03µs        ? ?/sec
cast int32 to float32 512                                          1.00    703.1±4.78ns        ? ?/sec    1.02    715.9±3.19ns        ? ?/sec
cast int32 to float64 512                                          1.00    693.4±3.27ns        ? ?/sec    1.02    708.5±2.13ns        ? ?/sec
cast int32 to int32 512                                            1.00    172.7±0.80ns        ? ?/sec    1.01    173.8±4.74ns        ? ?/sec
cast int32 to int64 512                                            1.03    692.4±3.26ns        ? ?/sec    1.00    670.2±2.95ns        ? ?/sec
cast int32 to uint32 512                                           1.00   1376.1±1.76ns        ? ?/sec    1.00   1378.2±1.72ns        ? ?/sec
cast int64 to int32 512                                            1.02   1492.5±1.38ns        ? ?/sec    1.00   1463.4±1.48ns        ? ?/sec
cast no runs of int32s to ree<int32>                               1.03     57.3±0.70µs        ? ?/sec    1.00     55.8±0.91µs        ? ?/sec
cast runs of 10 string to ree<int32>                               1.00      8.7±0.03µs        ? ?/sec    1.01      8.8±0.05µs        ? ?/sec
cast runs of 1000 int32s to ree<int32>                             1.00      3.4±0.01µs        ? ?/sec    1.00      3.4±0.01µs        ? ?/sec
cast string single run to ree<int32>                               1.00     27.4±0.04µs        ? ?/sec    1.00     27.4±0.02µs        ? ?/sec
cast string to binary view 512                                     1.04      2.4±0.02µs        ? ?/sec    1.00      2.3±0.02µs        ? ?/sec
cast string view to binary view                                    1.00     73.6±0.87ns        ? ?/sec    1.00     73.2±0.70ns        ? ?/sec
cast string view to dict                                           1.00    174.8±0.65µs        ? ?/sec    1.00    175.2±0.71µs        ? ?/sec
cast string view to string                                         1.00     44.4±2.13µs        ? ?/sec    1.02     45.2±1.72µs        ? ?/sec
cast string view to wide string                                    1.01     46.4±1.95µs        ? ?/sec    1.00     46.1±2.59µs        ? ?/sec
cast time32s to time32ms 512                                       1.00    138.0±0.47ns        ? ?/sec    1.09    150.2±1.56ns        ? ?/sec
cast time32s to time64us 512                                       1.00    321.1±0.66ns        ? ?/sec    1.00    322.4±1.38ns        ? ?/sec
cast time64ns to time32s 512                                       1.00    401.8±0.32ns        ? ?/sec    1.01    404.3±2.03ns        ? ?/sec
cast timestamp_ms to i64 512                                       1.00    251.8±0.86ns        ? ?/sec    1.00    251.3±1.48ns        ? ?/sec
cast timestamp_ms to timestamp_ns 512                              1.00   1828.7±3.00ns        ? ?/sec    1.01   1851.4±7.08ns        ? ?/sec
cast timestamp_ns to timestamp_s 512                               1.00    170.1±1.21ns        ? ?/sec    1.05    178.2±6.43ns        ? ?/sec
cast utf8 to date32 512                                            1.00      6.4±0.03µs        ? ?/sec    1.00      6.4±0.04µs        ? ?/sec
cast utf8 to date64 512                                            1.00     32.1±0.13µs        ? ?/sec    1.00     32.0±0.12µs        ? ?/sec
cast utf8 to f32                                                   1.00      5.6±0.02µs        ? ?/sec    1.01      5.7±0.03µs        ? ?/sec
cast wide string to binary view 512                                1.00      4.1±0.08µs        ? ?/sec    1.00      4.1±0.10µs        ? ?/sec

Resource Usage

base (merge-base)

Metric	Value
Wall time	1140.3s
Peak memory	3.0 GiB
Avg memory	3.0 GiB
CPU user	1135.2s
CPU sys	0.8s
Peak spill	0 B

branch

Metric	Value
Wall time	1140.2s
Peak memory	3.0 GiB
Avg memory	3.0 GiB
CPU user	1135.7s
CPU sys	0.2s
Peak spill	0 B

klion26 · 2026-04-23T03:28:28Z

I can't reproduce the benchmark regression on my laptop. Reviewing the change manually: it extracts the logic into an inline function so that it can be shared in multiple places. I can see two differences that might affect performance:

there may be one more branch to predict inside the loop
<T::Native as NumCast>::from changed to T::Native::from
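
To make the first point concrete, here is a minimal sketch of the two loop shapes being compared. It is not the arrow-cast code: rescale_one, cast_shared and cast_hoisted are hypothetical names, and plain i64/i32 stand in for the decimal and integer native types.

```rust
/// Hypothetical per-element helper, loosely modelled on a shared inline
/// function: it chooses between multiply and divide on every call.
#[inline]
fn rescale_one(value: i64, factor: i64, negative_scale: bool) -> Option<i32> {
    let scaled = if negative_scale {
        value.checked_mul(factor)?
    } else {
        value.checked_div(factor)?
    };
    // Narrowing fails (None) when the scaled value does not fit in i32.
    i32::try_from(scaled).ok()
}

/// Shape A: the shared helper is called inside the loop, so the scale
/// check is part of the loop body unless the optimizer moves it out.
fn cast_shared(values: &[i64], factor: i64, negative_scale: bool) -> Vec<Option<i32>> {
    values
        .iter()
        .map(|&v| rescale_one(v, factor, negative_scale))
        .collect()
}

/// Shape B: the scale check is hoisted out of the loop by hand, leaving
/// two simple loop bodies with no per-element scale branch.
fn cast_hoisted(values: &[i64], factor: i64, negative_scale: bool) -> Vec<Option<i32>> {
    if negative_scale {
        values
            .iter()
            .map(|&v| v.checked_mul(factor).and_then(|x| i32::try_from(x).ok()))
            .collect()
    } else {
        values
            .iter()
            .map(|&v| v.checked_div(factor).and_then(|x| i32::try_from(x).ok()))
            .collect()
    }
}
```

Both shapes return the same values for the same input. Whether they compile to equally fast code depends on inlining and on whether the optimizer hoists the check itself, which can differ per target type and per architecture, so only a benchmark on the target machine can settle it.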

All benchmarks on my laptop:

group                                                              main-cast-0935                         variant-cast-decimal-0902
-----                                                              --------------                         -------------------------
"cast decimal128 to float32"                                       1.00     20.4±0.27µs        ? ?/sec    1.00     20.5±0.29µs        ? ?/sec
"cast decimal128 to float64"                                       1.00     20.9±0.27µs        ? ?/sec    1.00     20.9±0.26µs        ? ?/sec
"cast decimal128 to int16"                                         1.01     44.3±0.71µs        ? ?/sec    1.00     44.1±0.60µs        ? ?/sec
"cast decimal128 to int32"                                         1.00     38.7±0.51µs        ? ?/sec    1.00     38.7±0.49µs        ? ?/sec
"cast decimal128 to int64"                                         1.00     38.7±0.60µs        ? ?/sec    1.00     38.7±0.55µs        ? ?/sec
"cast decimal128 to int8"                                          1.00     41.7±0.66µs        ? ?/sec    1.00     41.8±0.60µs        ? ?/sec
"cast decimal128 to uint16"                                        1.04     49.5±0.52µs        ? ?/sec    1.00     47.4±0.62µs        ? ?/sec
"cast decimal128 to uint32"                                        1.00     37.4±0.44µs        ? ?/sec    1.00     37.4±0.39µs        ? ?/sec
"cast decimal128 to uint64"                                        1.00     38.3±0.50µs        ? ?/sec    1.00     38.3±0.56µs        ? ?/sec
"cast decimal128 to uint8"                                         1.00     40.5±0.54µs        ? ?/sec    1.00     40.5±0.59µs        ? ?/sec
"cast decimal256 to float32"                                       1.00     57.0±1.05µs        ? ?/sec    1.00     57.1±1.17µs        ? ?/sec
"cast decimal256 to float64"                                       1.01     57.6±0.71µs        ? ?/sec    1.00     57.3±0.54µs        ? ?/sec
"cast decimal256 to int16"                                         1.00    195.8±2.30µs        ? ?/sec    1.01    198.3±3.24µs        ? ?/sec
"cast decimal256 to int32"                                         1.00    179.9±1.93µs        ? ?/sec    1.01    182.5±2.19µs        ? ?/sec
"cast decimal256 to int64"                                         1.00    180.5±2.19µs        ? ?/sec    1.00    180.4±2.35µs        ? ?/sec
"cast decimal256 to int8"                                          1.00    193.4±2.89µs        ? ?/sec    1.01    194.5±2.46µs        ? ?/sec
"cast decimal256 to uint16"                                        1.00    193.3±2.44µs        ? ?/sec    1.00    193.7±2.07µs        ? ?/sec
"cast decimal256 to uint32"                                        1.00    174.6±2.43µs        ? ?/sec    1.00    175.2±2.35µs        ? ?/sec
"cast decimal256 to uint64"                                        1.00    173.4±2.32µs        ? ?/sec    1.00    173.2±2.44µs        ? ?/sec
"cast decimal256 to uint8"                                         1.00    188.6±1.98µs        ? ?/sec    1.00    188.7±2.18µs        ? ?/sec
"cast decimal32 to float32"                                        1.00      2.2±0.03µs        ? ?/sec    1.00      2.2±0.04µs        ? ?/sec
"cast decimal32 to float64"                                        1.00      2.6±0.03µs        ? ?/sec    1.00      2.5±0.03µs        ? ?/sec
"cast decimal32 to int16"                                          1.00     18.1±0.22µs        ? ?/sec    1.00     18.2±0.24µs        ? ?/sec
"cast decimal32 to int32"                                          1.00     18.2±0.22µs        ? ?/sec    1.00     18.1±0.19µs        ? ?/sec
"cast decimal32 to int64"                                          1.00     18.8±0.20µs        ? ?/sec    1.00     18.9±0.27µs        ? ?/sec
"cast decimal32 to int8"                                           1.00     33.5±0.30µs        ? ?/sec    1.04     34.7±7.43µs        ? ?/sec
"cast decimal32 to uint16"                                         1.00     18.1±0.22µs        ? ?/sec    1.00     18.2±0.26µs        ? ?/sec
"cast decimal32 to uint32"                                         1.00     18.1±0.23µs        ? ?/sec    1.00     18.1±0.24µs        ? ?/sec
"cast decimal32 to uint64"                                         1.00     18.8±0.22µs        ? ?/sec    1.00     18.8±0.22µs        ? ?/sec
"cast decimal32 to uint8"                                          1.00     40.8±0.46µs        ? ?/sec    1.00     40.9±0.51µs        ? ?/sec
"cast decimal64 to float32"                                        1.00  1855.0±24.26ns        ? ?/sec    1.00  1853.3±22.62ns        ? ?/sec
"cast decimal64 to float64"                                        1.01      2.1±0.04µs        ? ?/sec    1.00      2.1±0.05µs        ? ?/sec
"cast decimal64 to int16"                                          1.00     27.1±0.45µs        ? ?/sec    1.00     27.2±0.34µs        ? ?/sec
"cast decimal64 to int32"                                          1.00     18.2±0.24µs        ? ?/sec    1.00     18.2±0.28µs        ? ?/sec
"cast decimal64 to int64"                                          1.00     18.9±0.23µs        ? ?/sec    1.00     19.0±0.33µs        ? ?/sec
"cast decimal64 to int8"                                           1.00     25.0±0.34µs        ? ?/sec    1.00     25.0±0.34µs        ? ?/sec
"cast decimal64 to uint16"                                         1.00     29.4±0.52µs        ? ?/sec    1.01     29.8±0.71µs        ? ?/sec
"cast decimal64 to uint32"                                         1.00     18.2±0.35µs        ? ?/sec    1.00     18.2±0.31µs        ? ?/sec
"cast decimal64 to uint64"                                         1.00     19.0±0.28µs        ? ?/sec    1.00     18.9±0.26µs        ? ?/sec
"cast decimal64 to uint8"                                          1.00     25.0±0.37µs        ? ?/sec    1.00     25.1±0.49µs        ? ?/sec
"cast float32 to decimal128(32, 3)"                                1.00     31.3±0.63µs        ? ?/sec    1.00     31.4±0.63µs        ? ?/sec
"cast float32 to decimal256(76, 4)"                                1.00   565.2±10.41µs        ? ?/sec    1.00    565.1±9.10µs        ? ?/sec
"cast float32 to decimal32(9, 2)"                                  1.00     22.6±0.27µs        ? ?/sec    1.00     22.7±0.33µs        ? ?/sec
"cast float32 to decimal64(18, 2"                                  1.00     21.6±0.32µs        ? ?/sec    1.00     21.7±0.33µs        ? ?/sec
"cast float64 to decimal128(32, 3)"                                1.00     30.6±0.31µs        ? ?/sec    1.00     30.6±0.37µs        ? ?/sec
"cast float64 to decimal256(76, 4)"                                1.00    559.3±7.19µs        ? ?/sec    1.00    560.0±7.11µs        ? ?/sec
"cast float64 to decimal32(9, 2)"                                  1.00     22.4±0.27µs        ? ?/sec    1.00     22.5±0.28µs        ? ?/sec
"cast float64 to decimal64(18, 2"                                  1.00     21.4±0.21µs        ? ?/sec    1.00     21.5±0.22µs        ? ?/sec
"cast invalid float32 to decimal128(32, 3)"                        1.00     22.3±0.19µs        ? ?/sec    1.00     22.2±0.25µs        ? ?/sec
"cast invalid float32 to decimal256(76, 4)"                        1.00     40.4±0.40µs        ? ?/sec    1.00     40.2±0.53µs        ? ?/sec
"cast invalid float32 to decimal32(9, 2)"                          1.01     20.9±0.24µs        ? ?/sec    1.00     20.8±0.22µs        ? ?/sec
"cast invalid float32 to decimal64(18, 2"                          1.01     22.0±0.56µs        ? ?/sec    1.00     21.8±0.25µs        ? ?/sec
"cast invalid float64 to decimal32(9, 2)"                          1.00     21.1±0.24µs        ? ?/sec    1.00     21.1±0.23µs        ? ?/sec
"cast invalid float64 to to decimal128(32, 3)"                     1.00     22.7±0.35µs        ? ?/sec    1.00     22.7±0.25µs        ? ?/sec
"cast invalid float64 to to decimal256(76, 4)"                     1.00     39.5±0.43µs        ? ?/sec    1.04     41.1±0.52µs        ? ?/sec
"cast invalid float64 to to decimal64(18, 2)"                      1.00     22.0±0.24µs        ? ?/sec    1.00     22.0±0.22µs        ? ?/sec
"cast invalid string to decimal128(38, 3)"                         1.00   794.7±10.77µs        ? ?/sec    1.01    803.4±9.88µs        ? ?/sec
"cast invalid string to decimal256(76, 4)"                         1.00   798.3±13.31µs        ? ?/sec    1.01    805.6±8.91µs        ? ?/sec
"cast invalid string to decimal32(9, 2)"                           1.00   752.3±14.47µs        ? ?/sec    1.01    757.7±9.14µs        ? ?/sec
"cast invalid string to decimal64(18, 2)"                          1.00    753.9±9.34µs        ? ?/sec    1.01   763.4±10.36µs        ? ?/sec
"cast string to decimal128(38, 3)"                                 1.00    722.1±8.64µs        ? ?/sec    1.01    730.3±9.34µs        ? ?/sec
"cast string to decimal256(76, 4)"                                 1.00    728.8±7.86µs        ? ?/sec    1.01    734.7±9.10µs        ? ?/sec
"cast string to decimal32(9, 2)"                                   1.00   894.8±10.82µs        ? ?/sec    1.00   897.8±11.37µs        ? ?/sec
"cast string to decimal64(18, 2)"                                  1.00   704.6±10.37µs        ? ?/sec    1.01   714.4±16.92µs        ? ?/sec
cast binary view to string                                         1.00     78.6±1.08µs        ? ?/sec    1.00     78.9±1.27µs        ? ?/sec
cast binary view to string view                                    1.00     59.8±0.70µs        ? ?/sec    1.00     59.9±0.77µs        ? ?/sec
cast binary view to wide string                                    1.00     75.5±1.05µs        ? ?/sec    1.00     75.6±1.04µs        ? ?/sec
cast date32 to date64 512                                          1.00    271.2±3.02ns        ? ?/sec    1.00    271.6±4.18ns        ? ?/sec
cast date64 to date32 512                                          1.00    289.6±4.27ns        ? ?/sec    1.00    290.4±4.88ns        ? ?/sec
cast decimal128 to decimal128 512                                  1.00      6.4±0.14µs        ? ?/sec    1.00      6.4±0.08µs        ? ?/sec
cast decimal128 to decimal128 512 lower precision                  1.00     36.8±0.65µs        ? ?/sec    1.01     37.0±0.67µs        ? ?/sec
cast decimal128 to decimal128 512 with lower scale (infallible)    1.00     36.1±0.58µs        ? ?/sec    1.00     36.0±0.59µs        ? ?/sec
cast decimal128 to decimal128 512 with same scale                  1.02     54.3±0.83ns        ? ?/sec    1.00     53.1±0.89ns        ? ?/sec
cast decimal128 to decimal256 512                                  1.00     21.5±0.29µs        ? ?/sec    1.00     21.6±0.37µs        ? ?/sec
cast decimal256 to decimal128 512                                  1.00    333.1±4.18µs        ? ?/sec    1.00    332.4±3.16µs        ? ?/sec
cast decimal256 to decimal256 512                                  1.03    89.3±11.01µs        ? ?/sec    1.00     86.5±1.27µs        ? ?/sec
cast decimal256 to decimal256 512 with same scale                  1.00     53.4±0.93ns        ? ?/sec    1.00     53.6±0.80ns        ? ?/sec
cast decimal32 to decimal32 512                                    1.00     21.1±0.27µs        ? ?/sec    1.00     21.1±0.30µs        ? ?/sec
cast decimal32 to decimal32 512 lower precision                    1.01     42.8±1.45µs        ? ?/sec    1.00     42.3±0.82µs        ? ?/sec
cast decimal32 to decimal64 512                                    1.00      3.5±0.05µs        ? ?/sec    1.02      3.6±0.06µs        ? ?/sec
cast decimal64 to decimal32 512                                    1.00     19.9±0.20µs        ? ?/sec    1.00     19.9±0.29µs        ? ?/sec
cast decimal64 to decimal64 512                                    1.00      3.4±0.06µs        ? ?/sec    1.00      3.4±0.05µs        ? ?/sec
cast dict to string view                                           1.00     37.5±0.53µs        ? ?/sec    1.00     37.3±0.47µs        ? ?/sec
cast f32 to string 512                                             1.00     11.5±0.14µs        ? ?/sec    1.00     11.5±0.22µs        ? ?/sec
cast f64 to string 512                                             1.00     14.1±0.18µs        ? ?/sec    1.00     14.2±0.21µs        ? ?/sec
cast float32 to int32 512                                          1.00    594.4±7.93ns        ? ?/sec    1.00    595.7±7.76ns        ? ?/sec
cast float64 to float32 512                                        1.00    546.9±9.78ns        ? ?/sec    1.01   553.3±10.16ns        ? ?/sec
cast float64 to uint64 512                                         1.00    626.6±9.52ns        ? ?/sec    1.00    625.0±6.96ns        ? ?/sec
cast i64 to string 512                                             1.00      9.8±0.16µs        ? ?/sec    1.01      9.8±0.15µs        ? ?/sec
cast int32 to float32 512                                          1.00    553.0±7.63ns        ? ?/sec    1.00    551.1±7.75ns        ? ?/sec
cast int32 to float64 512                                          1.00    575.3±8.51ns        ? ?/sec    1.00    577.1±9.23ns        ? ?/sec
cast int32 to int32 512                                            1.05    115.4±5.15ns        ? ?/sec    1.00    110.2±5.03ns        ? ?/sec
cast int32 to int64 512                                            1.00    571.6±8.65ns        ? ?/sec    1.00    573.6±8.61ns        ? ?/sec
cast int32 to uint32 512                                           1.00   912.9±15.86ns        ? ?/sec    1.02  935.4±158.71ns        ? ?/sec
cast int64 to int32 512                                            1.01  1564.3±27.36ns        ? ?/sec    1.00  1551.2±21.00ns        ? ?/sec
cast no runs of int32s to ree<int32>                               1.00     66.2±0.85µs        ? ?/sec    1.00     66.3±1.15µs        ? ?/sec
cast runs of 10 string to ree<int32>                               1.00      8.1±0.15µs        ? ?/sec    1.00      8.1±0.10µs        ? ?/sec
cast runs of 1000 int32s to ree<int32>                             1.00      2.7±0.03µs        ? ?/sec    1.00      2.7±0.04µs        ? ?/sec
cast string single run to ree<int32>                               1.00     41.4±0.44µs        ? ?/sec    1.00     41.5±0.83µs        ? ?/sec
cast string to binary view 512                                     1.00      2.3±0.04µs        ? ?/sec    1.01      2.3±0.04µs        ? ?/sec
cast string view to binary view                                    1.00     39.8±0.77ns        ? ?/sec    1.00     39.7±0.61ns        ? ?/sec
cast string view to dict                                           1.00    126.5±2.91µs        ? ?/sec    1.00    126.6±2.15µs        ? ?/sec
cast string view to string                                         1.00     56.5±0.69µs        ? ?/sec    1.00     56.6±0.80µs        ? ?/sec
cast string view to wide string                                    1.00     55.7±0.85µs        ? ?/sec    1.00     55.6±0.73µs        ? ?/sec
cast time32s to time32ms 512                                       1.00    108.6±1.64ns        ? ?/sec    1.00    108.4±1.65ns        ? ?/sec
cast time32s to time64us 512                                       1.00    274.1±3.38ns        ? ?/sec    1.01    275.7±4.03ns        ? ?/sec
cast time64ns to time32s 512                                       1.00    289.7±3.60ns        ? ?/sec    1.00    289.9±3.99ns        ? ?/sec
cast timestamp_ms to i64 512                                       1.05    152.2±2.07ns        ? ?/sec    1.00    144.8±2.69ns        ? ?/sec
cast timestamp_ms to timestamp_ns 512                              1.00  1808.7±29.47ns        ? ?/sec    1.00  1802.3±23.53ns        ? ?/sec
cast timestamp_ns to timestamp_s 512                               1.00    108.9±3.99ns        ? ?/sec    1.00    108.7±3.93ns        ? ?/sec
cast utf8 to date32 512                                            1.00      6.5±0.09µs        ? ?/sec    1.00      6.5±0.09µs        ? ?/sec
cast utf8 to date64 512                                            1.00     31.1±0.57µs        ? ?/sec    1.00     31.2±0.50µs        ? ?/sec
cast utf8 to f32                                                   1.00     13.2±0.16µs        ? ?/sec    1.00     13.2±0.16µs        ? ?/sec
cast wide string to binary view 512                                1.00      5.0±0.08µs        ? ?/sec    1.00      5.0±0.10µs        ? ?/sec

Rerun keeping only the decimal64 to * benchmarks:

group                              main-cast-0844                         variant-cast-decimal-0847
-----                              --------------                         -------------------------
"cast decimal64 to float32"        1.00  1835.6±32.51ns        ? ?/sec    1.01  1860.6±25.18ns        ? ?/sec
"cast decimal64 to float64"        1.00      2.1±0.02µs        ? ?/sec    1.00      2.1±0.03µs        ? ?/sec
"cast decimal64 to int16"          1.00     27.2±0.43µs        ? ?/sec    1.00     27.3±0.34µs        ? ?/sec
"cast decimal64 to int32"          1.00     18.1±0.25µs        ? ?/sec    1.00     18.1±0.22µs        ? ?/sec
"cast decimal64 to int64"          1.00     19.0±0.32µs        ? ?/sec    1.00     19.0±0.34µs        ? ?/sec
"cast decimal64 to int8"           1.00     25.1±0.43µs        ? ?/sec    1.00     24.9±0.31µs        ? ?/sec
"cast decimal64 to uint16"         1.00     29.2±0.34µs        ? ?/sec    1.01     29.5±0.39µs        ? ?/sec
"cast decimal64 to uint32"         1.00     18.1±0.20µs        ? ?/sec    1.00     18.2±0.31µs        ? ?/sec
"cast decimal64 to uint64"         1.00     19.0±0.25µs        ? ?/sec    1.00     18.9±0.28µs        ? ?/sec
"cast decimal64 to uint8"          1.00     25.0±0.28µs        ? ?/sec    1.11     27.6±0.46µs        ? ?/sec
cast decimal64 to decimal32 512    1.01     19.8±0.32µs        ? ?/sec    1.00     19.6±0.32µs        ? ?/sec
cast decimal64 to decimal64 512    1.00      3.4±0.07µs        ? ?/sec    1.00      3.4±0.05µs        ? ?/sec

klion26 · 2026-04-23T03:28:45Z

run benchmarks cast_kernels

adriangbot · 2026-04-23T03:32:23Z

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4301544456-1768-dffsx 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing variant-cast-decimal (c74d244) to 9a2b49c (merge-base) diff
BENCH_NAME=cast_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench cast_kernels
BENCH_FILTER=
Results will be posted here when complete

adriangbot · 2026-04-23T04:11:07Z

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                                              main                                   variant-cast-decimal
-----                                                              ----                                   --------------------
"cast decimal128 to float32"                                       1.00     27.4±0.02µs        ? ?/sec    1.00     27.3±0.02µs        ? ?/sec
"cast decimal128 to float64"                                       1.00     27.2±0.01µs        ? ?/sec    1.00     27.2±0.01µs        ? ?/sec
"cast decimal128 to int16"                                         1.00     53.2±0.84µs        ? ?/sec    1.00     53.1±0.79µs        ? ?/sec
"cast decimal128 to int32"                                         1.00     38.0±0.08µs        ? ?/sec    1.00     38.0±0.08µs        ? ?/sec
"cast decimal128 to int64"                                         1.00     36.4±0.07µs        ? ?/sec    1.02     37.1±0.10µs        ? ?/sec
"cast decimal128 to int8"                                          1.00     51.5±0.76µs        ? ?/sec    1.00     51.7±0.77µs        ? ?/sec
"cast decimal128 to uint16"                                        1.00     54.3±0.89µs        ? ?/sec    1.01     54.8±0.96µs        ? ?/sec
"cast decimal128 to uint32"                                        1.01     36.0±0.07µs        ? ?/sec    1.00     35.8±0.06µs        ? ?/sec
"cast decimal128 to uint64"                                        1.00     35.1±0.18µs        ? ?/sec    1.00     35.1±0.07µs        ? ?/sec
"cast decimal128 to uint8"                                         1.00     50.0±0.50µs        ? ?/sec    1.00     49.8±0.73µs        ? ?/sec
"cast decimal256 to float32"                                       1.00     70.5±0.08µs        ? ?/sec    1.01     70.9±0.15µs        ? ?/sec
"cast decimal256 to float64"                                       1.00     68.5±0.04µs        ? ?/sec    1.00     68.6±0.03µs        ? ?/sec
"cast decimal256 to int16"                                         1.00    165.6±0.75µs        ? ?/sec    1.00    165.2±0.64µs        ? ?/sec
"cast decimal256 to int32"                                         1.01    145.3±0.62µs        ? ?/sec    1.00    144.3±0.39µs        ? ?/sec
"cast decimal256 to int64"                                         1.01    142.4±0.89µs        ? ?/sec    1.00    141.1±0.45µs        ? ?/sec
"cast decimal256 to int8"                                          1.00    161.5±0.84µs        ? ?/sec    1.00    161.3±0.77µs        ? ?/sec
"cast decimal256 to uint16"                                        1.05    165.9±0.37µs        ? ?/sec    1.00    157.8±0.60µs        ? ?/sec
"cast decimal256 to uint32"                                        1.00    132.7±0.65µs        ? ?/sec    1.01    133.4±0.75µs        ? ?/sec
"cast decimal256 to uint64"                                        1.00    132.7±0.40µs        ? ?/sec    1.01    134.3±0.50µs        ? ?/sec
"cast decimal256 to uint8"                                         1.07    163.3±0.39µs        ? ?/sec    1.00    152.8±0.54µs        ? ?/sec
"cast decimal32 to float32"                                        1.00      6.8±0.00µs        ? ?/sec    1.00      6.8±0.00µs        ? ?/sec
"cast decimal32 to float64"                                        1.00      6.8±0.01µs        ? ?/sec    1.00      6.8±0.00µs        ? ?/sec
"cast decimal32 to int16"                                          1.00     25.4±0.62µs        ? ?/sec    1.05     26.7±0.49µs        ? ?/sec
"cast decimal32 to int32"                                          1.00     20.5±0.30µs        ? ?/sec    1.14     23.4±0.26µs        ? ?/sec
"cast decimal32 to int64"                                          1.00     20.3±0.20µs        ? ?/sec    1.19     24.3±0.43µs        ? ?/sec
"cast decimal32 to int8"                                           1.01     33.7±0.55µs        ? ?/sec    1.00     33.2±0.34µs        ? ?/sec
"cast decimal32 to uint16"                                         1.00     26.0±0.53µs        ? ?/sec    1.03     26.8±0.26µs        ? ?/sec
"cast decimal32 to uint32"                                         1.00     20.0±0.19µs        ? ?/sec    1.18     23.6±0.40µs        ? ?/sec
"cast decimal32 to uint64"                                         1.00     20.5±0.22µs        ? ?/sec    1.14     23.4±0.35µs        ? ?/sec
"cast decimal32 to uint8"                                          1.02     35.6±0.52µs        ? ?/sec    1.00     34.8±0.39µs        ? ?/sec
"cast decimal64 to float32"                                        1.00      6.8±0.00µs        ? ?/sec    1.00      6.8±0.00µs        ? ?/sec
"cast decimal64 to float64"                                        1.01      6.9±0.00µs        ? ?/sec    1.00      6.8±0.00µs        ? ?/sec
"cast decimal64 to int16"                                          1.00     33.9±0.32µs        ? ?/sec    1.93     65.5±0.31µs        ? ?/sec
"cast decimal64 to int32"                                          1.00     26.4±0.07µs        ? ?/sec    1.01     26.7±0.12µs        ? ?/sec
"cast decimal64 to int64"                                          1.00     26.4±0.09µs        ? ?/sec    1.00     26.4±0.08µs        ? ?/sec
"cast decimal64 to int8"                                           1.00     34.3±0.27µs        ? ?/sec    1.86     63.9±0.25µs        ? ?/sec
"cast decimal64 to uint16"                                         1.00     34.3±0.26µs        ? ?/sec    1.88     64.6±0.37µs        ? ?/sec
"cast decimal64 to uint32"                                         1.00     26.3±0.08µs        ? ?/sec    1.02     26.7±0.14µs        ? ?/sec
"cast decimal64 to uint64"                                         1.00     25.8±0.06µs        ? ?/sec    1.02     26.4±0.09µs        ? ?/sec
"cast decimal64 to uint8"                                          1.00     34.1±0.28µs        ? ?/sec    1.84     62.7±0.41µs        ? ?/sec
"cast float32 to decimal128(32, 3)"                                1.00     33.8±0.38µs        ? ?/sec    1.00     33.9±0.28µs        ? ?/sec
"cast float32 to decimal256(76, 4)"                                1.00    501.6±4.91µs        ? ?/sec    1.01    508.4±5.92µs        ? ?/sec
"cast float32 to decimal32(9, 2)"                                  1.03     21.6±1.40µs        ? ?/sec    1.00     20.9±0.98µs        ? ?/sec
"cast float32 to decimal64(18, 2"                                  1.03     22.3±1.07µs        ? ?/sec    1.00     21.5±0.80µs        ? ?/sec
"cast float64 to decimal128(32, 3)"                                1.00     32.1±0.48µs        ? ?/sec    1.00     32.1±0.38µs        ? ?/sec
"cast float64 to decimal256(76, 4)"                                1.00    499.9±6.42µs        ? ?/sec    1.01    506.7±6.07µs        ? ?/sec
"cast float64 to decimal32(9, 2)"                                  1.01     21.4±1.10µs        ? ?/sec    1.00     21.2±0.85µs        ? ?/sec
"cast float64 to decimal64(18, 2"                                  1.01     21.9±0.75µs        ? ?/sec    1.00     21.7±0.57µs        ? ?/sec
"cast invalid float32 to decimal128(32, 3)"                        1.00     22.9±0.74µs        ? ?/sec    1.00     22.9±0.81µs        ? ?/sec
"cast invalid float32 to decimal256(76, 4)"                        1.01     39.6±0.41µs        ? ?/sec    1.00     39.4±0.50µs        ? ?/sec
"cast invalid float32 to decimal32(9, 2)"                          1.00     21.0±1.88µs        ? ?/sec    1.19     25.0±2.03µs        ? ?/sec
"cast invalid float32 to decimal64(18, 2"                          1.00     22.9±1.11µs        ? ?/sec    1.06     24.3±1.40µs        ? ?/sec
"cast invalid float64 to decimal32(9, 2)"                          1.03     21.2±1.32µs        ? ?/sec    1.00     20.5±1.18µs        ? ?/sec
"cast invalid float64 to to decimal128(32, 3)"                     1.03     23.3±1.23µs        ? ?/sec    1.00     22.6±0.81µs        ? ?/sec
"cast invalid float64 to to decimal256(76, 4)"                     1.01     38.9±0.62µs        ? ?/sec    1.00     38.3±0.44µs        ? ?/sec
"cast invalid float64 to to decimal64(18, 2)"                      1.04     24.1±1.17µs        ? ?/sec    1.00     23.3±1.27µs        ? ?/sec
"cast invalid string to decimal128(38, 3)"                         1.00    713.3±0.73µs        ? ?/sec    1.00    716.2±0.85µs        ? ?/sec
"cast invalid string to decimal256(76, 4)"                         1.00    712.4±0.89µs        ? ?/sec    1.00    714.6±1.11µs        ? ?/sec
"cast invalid string to decimal32(9, 2)"                           1.00    683.4±2.04µs        ? ?/sec    1.00    683.6±0.77µs        ? ?/sec
"cast invalid string to decimal64(18, 2)"                          1.00    685.7±2.23µs        ? ?/sec    1.00    685.9±0.67µs        ? ?/sec
"cast string to decimal128(38, 3)"                                 1.00    642.4±0.65µs        ? ?/sec    1.01    647.3±0.56µs        ? ?/sec
"cast string to decimal256(76, 4)"                                 1.00    659.0±0.59µs        ? ?/sec    1.00    660.2±0.68µs        ? ?/sec
"cast string to decimal32(9, 2)"                                   1.00    792.0±0.56µs        ? ?/sec    1.00    789.0±0.49µs        ? ?/sec
"cast string to decimal64(18, 2)"                                  1.00    619.2±0.58µs        ? ?/sec    1.00    620.9±0.84µs        ? ?/sec
cast binary view to string                                         1.01     58.6±0.38µs        ? ?/sec    1.00     58.3±0.43µs        ? ?/sec
cast binary view to string view                                    1.00     65.1±0.37µs        ? ?/sec    1.00     64.8±0.09µs        ? ?/sec
cast binary view to wide string                                    1.00     58.8±0.47µs        ? ?/sec    1.01     59.3±0.85µs        ? ?/sec
cast date32 to date64 512                                          1.00    320.7±0.67ns        ? ?/sec    1.00    321.3±1.01ns        ? ?/sec
cast date64 to date32 512                                          1.00    408.6±0.38ns        ? ?/sec    1.01    413.2±0.42ns        ? ?/sec
cast decimal128 to decimal128 512                                  1.00      6.9±0.00µs        ? ?/sec    1.00      6.9±0.01µs        ? ?/sec
cast decimal128 to decimal128 512 lower precision                  1.09     14.8±0.22µs        ? ?/sec    1.00     13.5±0.04µs        ? ?/sec
cast decimal128 to decimal128 512 with lower scale (infallible)    1.00     45.9±0.06µs        ? ?/sec    1.00     45.9±0.05µs        ? ?/sec
cast decimal128 to decimal128 512 with same scale                  1.00     76.9±1.11ns        ? ?/sec    1.00     76.8±0.47ns        ? ?/sec
cast decimal128 to decimal256 512                                  1.00     26.3±0.01µs        ? ?/sec    1.00     26.3±0.08µs        ? ?/sec
cast decimal256 to decimal128 512                                  1.00    310.2±0.30µs        ? ?/sec    1.01    312.6±1.08µs        ? ?/sec
cast decimal256 to decimal256 512                                  1.00     82.1±0.09µs        ? ?/sec    1.01     82.8±0.09µs        ? ?/sec
cast decimal256 to decimal256 512 with same scale                  1.00     76.6±0.57ns        ? ?/sec    1.01     77.6±0.85ns        ? ?/sec
cast decimal32 to decimal32 512                                    1.00      8.6±0.01µs        ? ?/sec    1.00      8.6±0.01µs        ? ?/sec
cast decimal32 to decimal32 512 lower precision                    1.00     10.0±0.06µs        ? ?/sec    1.00     10.1±0.02µs        ? ?/sec
cast decimal32 to decimal64 512                                    1.00      3.3±0.00µs        ? ?/sec    1.00      3.4±0.00µs        ? ?/sec
cast decimal64 to decimal32 512                                    1.00     32.4±0.02µs        ? ?/sec    1.00     32.4±0.02µs        ? ?/sec
cast decimal64 to decimal64 512                                    1.00      2.8±0.00µs        ? ?/sec    1.00      2.9±0.01µs        ? ?/sec
cast dict to string view                                           1.02     41.1±2.07µs        ? ?/sec    1.00     40.4±1.56µs        ? ?/sec
cast f32 to string 512                                             1.00     11.6±0.04µs        ? ?/sec    1.00     11.7±0.04µs        ? ?/sec
cast f64 to string 512                                             1.02     15.5±0.06µs        ? ?/sec    1.00     15.2±0.12µs        ? ?/sec
cast float32 to int32 512                                          1.00   1353.7±8.65ns        ? ?/sec    1.00  1356.5±10.01ns        ? ?/sec
cast float64 to float32 512                                        1.00    692.8±1.83ns        ? ?/sec    1.02    708.2±2.43ns        ? ?/sec
cast float64 to uint64 512                                         1.00  1381.6±10.93ns        ? ?/sec    1.04  1432.6±26.19ns        ? ?/sec
cast i64 to string 512                                             1.01      8.9±0.04µs        ? ?/sec    1.00      8.8±0.03µs        ? ?/sec
cast int32 to float32 512                                          1.02    707.3±4.99ns        ? ?/sec    1.00    693.4±5.49ns        ? ?/sec
cast int32 to float64 512                                          1.00    707.1±2.91ns        ? ?/sec    1.00    705.9±2.84ns        ? ?/sec
cast int32 to int32 512                                            1.00    173.7±1.05ns        ? ?/sec    1.04    180.0±3.45ns        ? ?/sec
cast int32 to int64 512                                            1.00    696.0±4.17ns        ? ?/sec    1.02    708.2±4.28ns        ? ?/sec
cast int32 to uint32 512                                           1.00   1376.8±2.20ns        ? ?/sec    1.00   1382.8±5.14ns        ? ?/sec
cast int64 to int32 512                                            1.00   1492.3±1.00ns        ? ?/sec    1.00   1486.2±1.45ns        ? ?/sec
cast no runs of int32s to ree<int32>                               1.02     58.3±0.81µs        ? ?/sec    1.00     57.4±0.98µs        ? ?/sec
cast runs of 10 string to ree<int32>                               1.01      8.8±0.03µs        ? ?/sec    1.00      8.7±0.07µs        ? ?/sec
cast runs of 1000 int32s to ree<int32>                             1.00      3.4±0.01µs        ? ?/sec    1.00      3.4±0.01µs        ? ?/sec
cast string single run to ree<int32>                               1.00     27.4±0.02µs        ? ?/sec    1.00     27.4±0.02µs        ? ?/sec
cast string to binary view 512                                     1.00      2.3±0.02µs        ? ?/sec    1.02      2.3±0.01µs        ? ?/sec
cast string view to binary view                                    1.00     73.2±0.64ns        ? ?/sec    1.00     73.5±0.61ns        ? ?/sec
cast string view to dict                                           1.00    174.5±0.59µs        ? ?/sec    1.01    175.4±0.66µs        ? ?/sec
cast string view to string                                         1.00     44.7±2.39µs        ? ?/sec    1.02     45.8±1.98µs        ? ?/sec
cast string view to wide string                                    1.00     45.9±2.49µs        ? ?/sec    1.00     45.9±2.26µs        ? ?/sec
cast time32s to time32ms 512                                       1.00    145.4±0.25ns        ? ?/sec    1.02    148.9±0.60ns        ? ?/sec
cast time32s to time64us 512                                       1.00    320.5±0.67ns        ? ?/sec    1.01    322.3±0.79ns        ? ?/sec
cast time64ns to time32s 512                                       1.00    401.5±0.28ns        ? ?/sec    1.00    402.7±0.48ns        ? ?/sec
cast timestamp_ms to i64 512                                       1.01    253.3±1.87ns        ? ?/sec    1.00    251.6±0.67ns        ? ?/sec
cast timestamp_ms to timestamp_ns 512                              1.00   1850.0±1.95ns        ? ?/sec    1.00   1857.2±4.46ns        ? ?/sec
cast timestamp_ns to timestamp_s 512                               1.00    170.4±1.42ns        ? ?/sec    1.04    177.7±3.04ns        ? ?/sec
cast utf8 to date32 512                                            1.00      6.4±0.04µs        ? ?/sec    1.01      6.5±0.04µs        ? ?/sec
cast utf8 to date64 512                                            1.00     32.0±0.13µs        ? ?/sec    1.00     32.1±0.12µs        ? ?/sec
cast utf8 to f32                                                   1.01      5.6±0.03µs        ? ?/sec    1.00      5.6±0.03µs        ? ?/sec
cast wide string to binary view 512                                1.00      4.1±0.07µs        ? ?/sec    1.00      4.1±0.09µs        ? ?/sec

Resource Usage

base (merge-base)

Metric	Value
Wall time	1145.3s
Peak memory	3.0 GiB
Avg memory	3.0 GiB
CPU user	1139.6s
CPU sys	0.8s
Peak spill	0 B

branch

Metric	Value
Wall time	1130.3s
Peak memory	3.0 GiB
Avg memory	3.0 GiB
CPU user	1128.6s
CPU sys	0.2s
Peak spill	0 B

github-actions bot added the arrow (Changes to the arrow crate) and parquet-variant (parquet-variant* crates) labels Apr 10, 2026

klion26 commented Apr 10, 2026

scovich reviewed Apr 10, 2026

klion26 commented Apr 13, 2026

scovich self-requested a review April 13, 2026 13:49

scovich reviewed Apr 13, 2026

klion26 commented Apr 14, 2026

klion26 commented Apr 15, 2026

scovich approved these changes Apr 15, 2026

scovich mentioned this pull request Apr 21, 2026

Release arrow-rs / parquet Minor version 58.2.0 (April 2026) #9109 (Open, 13 tasks)

alamb mentioned this pull request Apr 22, 2026

Add benchmark for cast from/to decimals #9729 (Merged)

klion26 added 7 commits April 22, 2026 22:15

[Variant] Align cast logic for from/to_decimal for variant

87915a6

add some example for decimal from string

097e971

address comment

0d75884

address comment

98a4843

performance ok

153233f

improve performance back

1a34e05

fix erro msg

e222986

klion26 force-pushed the variant-cast-decimal branch from 15f175d to e222986 Compare April 22, 2026 14:23

change used to debug benchmark

c74d244

	.map(|(_, frac)| frac.len())
	.unwrap_or(0);
	.map_or_else(0, |(_, frac)| frac.len());

	.map(|(_, frac)| frac.len())
	.unwrap_or(0);
	.map_or_default(|(_, frac)| frac.len());

	{
	if scale < 0 {
	return Err(ArrowError::InvalidArgumentError(format!(
	"Cannot cast string to decimal with negative scale {scale}"
	)));
	}

	.map_or_else(|| 0, |(_, frac)| frac.len());
	.map_or(0, |(_, frac)| frac.len());
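
The suggestion fragments above all circle the same expression: counting the fractional digits of a decimal string, plus a guard that rejects negative scales. A minimal sketch of how the forms relate follows. It assumes the `(_, frac)` tuple comes from `str::split_once('.')`, which the fragments do not show, and `fractional_digits` and `check_scale` are hypothetical names rather than functions from the PR.

```rust
/// Hypothetical helper: number of digits after the decimal point.
/// Assumes the tuple in the review fragments comes from `split_once('.')`.
fn fractional_digits(s: &str) -> usize {
    // `map_or(default, f)` is the compact form of `.map(f).unwrap_or(default)`.
    s.split_once('.').map_or(0, |(_, frac)| frac.len())
}

/// Hypothetical helper mirroring the negative-scale guard from the fragment.
/// Uses a plain `String` error so the sketch needs no arrow dependency.
fn check_scale(scale: i8) -> Result<usize, String> {
    if scale < 0 {
        return Err(format!(
            "Cannot cast string to decimal with negative scale {scale}"
        ));
    }
    Ok(scale as usize)
}

fn main() {
    for s in ["123.45", "123", "0.", ".5"] {
        // The original two-step form.
        let long = s.split_once('.').map(|(_, frac)| frac.len()).unwrap_or(0);
        // `map_or_else` takes a closure for the default, hence `|| 0` and not `0`.
        let lazy = s.split_once('.').map_or_else(|| 0, |(_, frac)| frac.len());
        assert_eq!(long, fractional_digits(s));
        assert_eq!(lazy, fractional_digits(s));
    }
    println!("{:?}", check_scale(-1));
}
```

Because the default here is a constant, `map_or(0, ...)` is enough; the lazily evaluated `map_or_else(|| 0, ...)` only pays off when computing the default is expensive.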
