Skip to content

Opt-in overflow tracking for decimal fixed-point arithmetic#22356

Draft
Avinash-Raj wants to merge 4 commits into
rapidsai:mainfrom
Avinash-Raj:avi/decimal-overflow
Draft

Opt-in overflow tracking for decimal fixed-point arithmetic#22356
Avinash-Raj wants to merge 4 commits into
rapidsai:mainfrom
Avinash-Raj:avi/decimal-overflow

Conversation

@Avinash-Raj
Copy link
Copy Markdown
Contributor

@Avinash-Raj Avinash-Raj commented May 2, 2026

Summary

Introduces an opt-in path for detecting overflow in decimal fixed-point arithmetic, at two layers:

  1. Value-type level – new numeric::decimal32_safe / decimal64_safe / decimal128_safe aliases backed by a new fixed_point<Rep, Rad, overflow_tracking::on> instantiation. These carry a sticky per-value bool that is set whenever a constructor, arithmetic operator, rescaled(), or float→decimal conversion would overflow the underlying integer representation. The flag propagates through +, -, *, % and rescaled(), so any expression built on the value-level operators (binaryop transforms, scans, reductions
    expressed through value ops) carries it for free.

  2. Column / binaryop level – new public API cudf::binary_operation_safe(..., rmm::device_scalar<uint32_t>& overflow_flag, ...) for the seven base-10 decimal arithmetic operators (ADD, SUB, MUL, DIV, MOD, PMOD, PYMOD). It updates a single device-side flag via atomicMax whenever any active (non-null) row overflows during arithmetic or rescale to the output scale. Designed for callers (e.g. velox-cudf) that need a one-shot “did any row overflow?” signal without per-row storage.

Motivation

Existing decimal32 / decimal64 / decimal128 silently wrap on overflow. Downstream engines that need stricter semantics today have to either implement their own checked arithmetic or rebuild libcudf. This PR keeps the default behavior unchanged but lets opt-in callers ask the same libcudf build for overflow-aware decimals.

@Avinash-Raj Avinash-Raj requested review from a team as code owners May 2, 2026 07:58
@github-actions github-actions Bot added libcudf Affects libcudf (C++/CUDA) code. CMake CMake build issue labels May 2, 2026
@Avinash-Raj Avinash-Raj changed the title Add optional sticky fixed-point overflow tracking (CUDF_TRACK_FIXED_POINT_OVERFLOW) [Draft] Add optional sticky fixed-point overflow tracking (CUDF_TRACK_FIXED_POINT_OVERFLOW) May 2, 2026
@Avinash-Raj Avinash-Raj marked this pull request as draft May 2, 2026 07:59
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 2, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@Avinash-Raj Avinash-Raj changed the title [Draft] Add optional sticky fixed-point overflow tracking (CUDF_TRACK_FIXED_POINT_OVERFLOW) Add optional sticky fixed-point overflow tracking (CUDF_TRACK_FIXED_POINT_OVERFLOW) May 2, 2026
@bdice
Copy link
Copy Markdown
Contributor

bdice commented May 2, 2026

I would favor adding new operators/functions rather than a compile-time flag — we want to be able to ship one build of cuDF that works universally.

@devavret
Copy link
Copy Markdown
Contributor

devavret commented May 4, 2026

@bdice I'd imagine this is not limited to scalar functions like binaryops and such. We'd also need to detect overflow in aggregations like groupby and reduce. And those would require overloading the existing +,* operators.

What's your opinion on adding a new type maybe called safe_decimal.

Spark should also have a concept of arithmetic exception for overflows. How do they support this with libcudf decimal?

@mattgara
Copy link
Copy Markdown

mattgara commented May 4, 2026

What's your opinion on adding a new type maybe called safe_decimal.

Inline with this thinking, we could extend cudf's fixed_point to implement this:

enum class overflow_tracking { off, on };

template <typename Rep, Radix Rad,
          overflow_tracking Track = overflow_tracking::off>
class fixed_point{ ...

and use if constexpr (Track == overflow_tracking::on) to branch the implementation based on this configuration. Effectively, this should be compile-time semantically equivalent to having a safe_decimal, but with less code duplication.

@ttnghia
Copy link
Copy Markdown
Contributor

ttnghia commented May 4, 2026

This is related to ANSI support that I've recently filed: #21676
We can use some kind of input enum like mentioned above, and custom return type as my suggestion.

@Avinash-Raj
Copy link
Copy Markdown
Contributor Author

Ok, replaced the CMake CUDF_TRACK_FIXED_POINT_OVERFLOW flag with an opt-in overflow_tracking NTTP on numeric::fixed_point and add decimal*_safe / type_id::DECIMAL*_SAFE so velox-cudf gets per-element overflow detection without requiring a separate libcudf build.

@pmattione-nvidia
Copy link
Copy Markdown
Contributor

You might want to put overflow tracking into the floating <--> decimal conversion code as well (include/cudf/fixed_point/detail/floating_conversion.hpp). E.g. there are the guarded_left_shift() and guarded_right_shift() functions, and there are potential overflows in convert_floating_to_integral_shifting(), shift_to_decimal_pospow(), and shift_to_decimal_negpow(). All potential overflow sites are mentioned explicitly in the code comments in these functions.

@Avinash-Raj Avinash-Raj changed the title Add optional sticky fixed-point overflow tracking (CUDF_TRACK_FIXED_POINT_OVERFLOW) Opt-in overflow tracking for decimal fixed-point arithmetic May 11, 2026
@davidwendt
Copy link
Copy Markdown
Contributor

Could this be done without modifying the fixed-point class? The operator overloads are not really required for this to work since the dispatcher logic creates specialized code paths automatically. And somehow the overflow-awareness is communicated through to the appropriate operation/function and would generally require a special code path either way.

It seems the operators could be just free functions and not member functions or member operators. The return type would include an overflow flag along with the result. Similar to how the https://github.com/rapidsai/cudf/blob/main/cpp/include/cudf/fixed_point/detail/floating_conversion.hpp is a separate file handling specific operations for fixed-point.

The new free functions would be called as part of the specialized overflow-aware aggregation types for reduce, groupby, etc.
This would better isolate the specialized operation functions away from the current fixed-point implementations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CMake CMake build issue libcudf Affects libcudf (C++/CUDA) code.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants