public/schema: add BigDecimal common type #421
Merged
BigDecimal carries an arbitrary-precision decimal value with no schema-level precision or scale, complementing the fixed-precision Decimal type. Use it for sources whose column metadata does not declare precision and scale: unparameterised Postgres NUMERIC, Oracle NUMBER without DATA_PRECISION, MongoDB Decimal128.

Adds:
- BigDecimal CommonType = 16 with its String/typeFromStr cases.
- NewBigDecimal, FormatBigDecimal, ParseBigDecimal helpers. ParseBigDecimal recovers the scale from the input rather than taking it as a parameter.
- A shared parseCanonicalDecimal helper between ParseDecimal and ParseBigDecimal so the accepted form stays consistent.
- A leaf-check in Common.Validate: every type other than Object, Map, Array, and Union must have empty Children. Removes a pre-existing inconsistency where Validate enforced parameter rules but ignored structural ones.

Decimal parsers (ParseDecimal, ParseBigDecimal) are now lenient on non-canonical-but-unambiguous inputs (leading plus, leading zeros, missing integer part as in ".5") and strict on ambiguous or malformed inputs (scientific notation, multiple decimal points, whitespace, thousands separators, non-digit characters). Canonical form is asserted on the way out by FormatDecimal/FormatBigDecimal; Postel applies.

Documents the BigDecimal type, the relaxed value contract, and the parse/emit asymmetry in public/schema/decimal_types.md.
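As a rough illustration of the parsing contract described above, the sketch below accepts the unambiguous non-canonical forms and rejects the malformed ones. It is not the PR's implementation: the name `parseLenientDecimal` is hypothetical, the scale is returned as a plain `int` to keep the example short, and rejecting a trailing point with no fractional digits (as in `"5."`) is an assumption, since the description does not say how that case is handled.

```go
package main

import (
	"fmt"
	"math/big"
	"strings"
)

// parseLenientDecimal is a hypothetical stand-in for a shared decimal parser.
// It returns the unscaled integer and the scale (count of fractional digits).
func parseLenientDecimal(s string) (*big.Int, int, error) {
	fail := func() (*big.Int, int, error) {
		return nil, 0, fmt.Errorf("failed to parse decimal value %q", s)
	}

	body, neg := s, false
	switch {
	case strings.HasPrefix(body, "-"):
		neg, body = true, body[1:]
	case strings.HasPrefix(body, "+"): // lenient: a leading plus is accepted
		body = body[1:]
	}

	// Split on the first point; a second point ends up in fracPart and
	// is rejected by the digit check below.
	intPart, fracPart, hasPoint := strings.Cut(body, ".")
	if intPart == "" && fracPart == "" {
		return fail() // no digits at all
	}
	if hasPoint && fracPart == "" {
		return fail() // assumption: a trailing point such as "5." is rejected
	}

	// Strict: only ASCII digits may remain. This rules out exponents,
	// whitespace, thousands separators and any other character.
	digits := intPart + fracPart
	for i := 0; i < len(digits); i++ {
		if digits[i] < '0' || digits[i] > '9' {
			return fail()
		}
	}

	// Lenient: leading zeros and a missing integer part parse normally.
	n, ok := new(big.Int).SetString(digits, 10)
	if !ok {
		return fail()
	}
	if neg {
		n.Neg(n)
	}
	return n, len(fracPart), nil
}

func main() {
	for _, in := range []string{"+01.50", ".5", "-12.340", "1e5", "1,000"} {
		n, scale, err := parseLenientDecimal(in)
		fmt.Println(in, n, scale, err)
	}
}
```

Keeping the value as an unscaled integer plus a scale means trailing zeros survive parsing (`"1.50"` keeps scale 2), which is one way a parser can recover the scale from the input rather than take it as a parameter.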
@@ -0,0 +1,64 @@
// Copyright 2025 Redpanda Data, Inc.
Contributor
Suggested change
- // Copyright 2025 Redpanda Data, Inc.
+ // Copyright 2026 Redpanda Data, Inc.
Collaborator
Author
Applied in 819a609 — thank you, @josephwoodward.
squiidz
approved these changes
Apr 28, 2026
		return nil, 0, fmt.Errorf("failed to parse decimal value %q", s)
	}

	return n, int32(len(fracPart)), nil
Contributor
Given len returns a 64-bit int, do we want to check it's within math.MaxInt32 and handle if not instead of it silently wrapping?
Collaborator
Author
Good catch, @josephwoodward — addressed in 79ae0ab. The same wrap-around lurked on the ParseDecimal side too (its int32(len(fracPart)) > scale comparison would have short-circuited rather than caught the overflow), so the bound now lives in the shared parseCanonicalDecimal helper and both parsers benefit from the explicit error.
len(fracPart) is a 64-bit int. The downstream casts to int32 (the scale type) in ParseDecimal and ParseBigDecimal would wrap silently on a fractional part longer than math.MaxInt32 — the BigDecimal path would return a negative scale and the Decimal path would short-circuit its "exceeds scale" check. Bound the fractional length in parseCanonicalDecimal with an explicit error so both parsers fail loudly rather than returning corrupt output.
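The bound described here can be sketched in isolation. The helper name `scaleFromFracLen` and its error text are hypothetical, not taken from commit 79ae0ab; the sketch takes the length as an `int64` so that it also compiles on 32-bit platforms, and a caller would pass `int64(len(fracPart))`.

```go
package main

import (
	"fmt"
	"math"
)

// scaleFromFracLen is a hypothetical helper: it converts a fractional-digit
// count into an int32 scale and returns an error instead of letting the
// conversion wrap.
func scaleFromFracLen(fracLen int64) (int32, error) {
	if fracLen < 0 || fracLen > math.MaxInt32 {
		return 0, fmt.Errorf("fractional part length %d exceeds maximum scale %d",
			fracLen, math.MaxInt32)
	}
	return int32(fracLen), nil
}

func main() {
	// The failure mode being guarded against: an unchecked conversion of a
	// value just past math.MaxInt32 wraps to a negative number.
	var over int64 = math.MaxInt32 + 1
	fmt.Println("unchecked:", int32(over))

	// The checked path reports the problem instead.
	_, err := scaleFromFracLen(over)
	fmt.Println("checked:", err)

	scale, err := scaleFromFracLen(3)
	fmt.Println(scale, err)
}
```

Placing a check like this in the shared helper, as the reply describes, means both parsers get the same explicit error without each repeating the comparison.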
josephwoodward
approved these changes
Apr 28, 2026
Summary
- BigDecimal common type for arbitrary-precision decimals, complementing the fixed-precision Decimal shipped in "public/schema: add Decimal common type with precision and scale" #420. Use it for sources whose column metadata does not declare precision and scale: unparameterised Postgres numeric, Oracle NUMBER with no DATA_PRECISION, MongoDB Decimal128.
- NewBigDecimal, FormatBigDecimal, and ParseBigDecimal. Unlike ParseDecimal, ParseBigDecimal recovers the scale from the input rather than taking it as a parameter.
- Common.Validate now enforces a leaf-check across all non-container types (every CommonType other than Object, Map, Array, Union). This removes a pre-existing inconsistency where Validate enforced parameter rules but ignored structural ones.
- ParseDecimal and ParseBigDecimal are now consistently lenient on non-canonical-but-unambiguous inputs (leading plus, leading zeros, missing integer part as in ".5") and strict on ambiguous or malformed inputs (scientific notation, multiple decimal points, whitespace, thousands separators, non-digit characters). Canonical form is asserted exclusively at the emit boundary by FormatDecimal/FormatBigDecimal; Postel applies.
- public/schema/decimal_types.md extended with a "BigDecimal: arbitrary-precision decimals" section, per-format converter expectations (Avro/Parquet/Iceberg reject; JSON Schema permissive pattern), and a note on the parse/emit asymmetry.

Non-decimal schema fingerprints remain byte-stable; the new BigDecimal type is encoded by its type identifier and never carries logical params.

Test plan
- task fmt
- task lint (0 issues)
- task test (full repo suite passes)
- BigDecimal ToAny/ParseFromAny round-trip, Validate rejection of children + Logical.Decimal, FormatBigDecimal/ParseBigDecimal happy and error paths, parse → format normalisation of non-canonical inputs, and the broader leaf-check across every leaf CommonType
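The leaf-check mentioned in the summary can be pictured with a small stand-alone sketch. Everything here is hypothetical: the `kind` enum, its values, the `node` type, and the recursive walk are illustrative assumptions and do not mirror the real CommonType constants or the Common struct.

```go
package main

import "fmt"

// kind is a hypothetical, trimmed-down stand-in for a schema type enum.
type kind int

const (
	kindString kind = iota
	kindBigDecimal
	kindObject
	kindMap
	kindArray
	kindUnion
)

// node is a hypothetical schema node with optional children.
type node struct {
	Type     kind
	Children []node
}

// isContainer reports whether a kind is allowed to carry children.
func isContainer(k kind) bool {
	switch k {
	case kindObject, kindMap, kindArray, kindUnion:
		return true
	}
	return false
}

// validate rejects any non-container node that has children, then checks
// the children themselves (the recursion is an assumption of this sketch).
func validate(n node) error {
	if !isContainer(n.Type) && len(n.Children) != 0 {
		return fmt.Errorf("type %d is a leaf and must not have children, got %d",
			n.Type, len(n.Children))
	}
	for _, c := range n.Children {
		if err := validate(c); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	ok := node{Type: kindObject, Children: []node{{Type: kindBigDecimal}}}
	bad := node{Type: kindBigDecimal, Children: []node{{Type: kindString}}}
	fmt.Println(validate(ok))
	fmt.Println(validate(bad))
}
```

Expressing the rule as "everything that is not a container is a leaf" means a newly added leaf type is covered by the check without a per-type case.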