Refactor Spark format_string integer conversion dispatch#22388
Open
kosiew wants to merge 3 commits into
Open
Conversation
…ize match arms - Added a local IntegerValue adapter to format_string functionality. - Collapsed duplicated match arms for %d, %x, %o, %s, %c to improve code efficiency. - Ensured preservation of signed width behavior for negative hex and octal values. - Introduced a regression table test covering integer widths and null values.
- Removed IntegerFormatValue. - Added direct IntegerValue helpers. - Introduced private From implementations via local macros. - Shortened integer ScalarValue arms for clarity. - Deduplicated invalid integer conversion error handling. - Made test argument counts explicit for better readability.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
%cformatting dispatch in format_string.rs #22163Rationale for this change
ConversionSpecifier::formatcontained substantial duplication across integerScalarValuevariants for%d,%x,%o,%s, and%chandling. Each integer width repeated nearly identical conversion logic, making the code harder to maintain and increasing the risk of inconsistent behavior across integer types.This change consolidates integer formatting behavior into shared internal helpers while preserving existing Spark-compatible semantics.
What changes are included in this PR?
Introduced a local
IntegerValueenum to normalize signed and unsigned integer handling while preserving width-specific unsigned bit behavior for%xand%o.Replaced repeated per-variant integer dispatch branches in
ConversionSpecifier::formatwith a sharedformat_integerhelper.Added shared helper methods for:
%cconversionAdded small
macro_rules!helpers to generateFrom<T> for IntegerValueimplementations for signed and unsigned integer families, reducing repetitive conversion boilerplate while preserving width-specific unsigned formatting semantics.Added
invalid_integer_conversionhelper to centralize integer conversion error generation.Added table-driven regression coverage for integer formatting behavior across:
%d,%x,%o,%s, and%cAre these changes tested?
Yes.
Added
test_integer_formatting_across_widthscovering:Int8,Int16,Int32, andInt64UInt8,UInt16,UInt32, andUInt64%d,%x,%o,%s, and%cformatting behaviorAre there any user-facing changes?
No intended user-facing behavior changes. This PR is a structural refactor intended to preserve existing Spark-compatible integer formatting semantics.
LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.