
public/schema: add Decimal common type with precision and scale #420

Merged
Jeffail merged 2 commits into main from schema-decimal-types
Apr 28, 2026


@Jeffail Jeffail commented Apr 28, 2026

Summary

  • Adds a Decimal common type to public/schema, parameterised by precision and scale via a new LogicalParams struct so the existing common schema can describe fixed-precision decimals losslessly across Avro, Parquet, and database NUMBER/NUMERIC columns.
  • Bounds (precision ∈ [1, 38], scale ∈ [0, precision]) describe the lossless intersection of the targeted downstream formats. Oracle's negative-scale and unbounded-precision variants do not survive Avro/Parquet round-trips and are therefore not modelled.
  • Adds ergonomic helpers: NewDecimal, FormatDecimal / ParseDecimal, and DecimalParams.Format / Parse / ValidateValue. These codify the canonical-string value contract so every data-source plugin and converter shares a single formatter rather than reinventing it.
  • Adds Common.Validate as the public entry point for schema-parameter checks. ParseFromAny now validates once at the top level instead of O(depth) times.
  • Documents the schema changes plus per-format guidance for converters (Avro / Parquet / Oracle / Postgres / MySQL / JSON Schema) and the value-emission contract for data-source plugins (mysql_cdc, oracle_cdc, etc.) in public/schema/decimal_types.md.
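The bounds and the value check described in the bullets above can be sketched as follows. This is a minimal illustration, not the code from this PR: `decimalParams`, `validate`, and `validateValue` are hypothetical stand-ins for `DecimalParams` and its helpers, and the string format accepted here (optional leading `-`, digits, optional fractional part) is an assumption about the canonical-string contract rather than something the PR text specifies.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// decimalParams is a hypothetical stand-in for the PR's DecimalParams;
// the real type in public/schema may look different.
type decimalParams struct {
	Precision int // total number of significant digits
	Scale     int // digits to the right of the decimal point
}

// maxPrecision mirrors the upper bound stated in the PR summary.
const maxPrecision = 38

// validate enforces precision in [1, 38] and scale in [0, precision].
func (p decimalParams) validate() error {
	if p.Precision < 1 || p.Precision > maxPrecision {
		return fmt.Errorf("precision %d outside [1, %d]", p.Precision, maxPrecision)
	}
	if p.Scale < 0 || p.Scale > p.Precision {
		return fmt.Errorf("scale %d outside [0, %d]", p.Scale, p.Precision)
	}
	return nil
}

// validateValue checks that s is a plain decimal string that fits within
// the parameters. The accepted format is an assumption made for this sketch.
func (p decimalParams) validateValue(s string) error {
	if err := p.validate(); err != nil {
		return err
	}
	body := strings.TrimPrefix(s, "-")
	intPart, fracPart, hasPoint := strings.Cut(body, ".")
	if intPart == "" || (hasPoint && fracPart == "") {
		return errors.New("malformed decimal string")
	}
	for _, r := range intPart + fracPart {
		if r < '0' || r > '9' {
			return fmt.Errorf("invalid character %q", r)
		}
	}
	if len(fracPart) > p.Scale {
		return fmt.Errorf("%d fractional digits exceed scale %d", len(fracPart), p.Scale)
	}
	// Leading zeros do not count towards the integer digit budget.
	intDigits := len(strings.TrimLeft(intPart, "0"))
	if intDigits > p.Precision-p.Scale {
		return fmt.Errorf("%d integer digits exceed precision-scale %d", intDigits, p.Precision-p.Scale)
	}
	return nil
}
```

Under these assumptions, `decimalParams{Precision: 5, Scale: 2}` accepts `"123.45"` and rejects both `"1234.5"` (too many integer digits) and `"1.234"` (too many fractional digits).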

Non-decimal schemas keep their existing fingerprints byte-for-byte, so any cached conversions keyed on fingerprint remain valid.

Test plan

  • task fmt
  • task lint (0 issues)
  • task test (full repo suite passes)
  • New unit coverage for Decimal parse/serialise round-trips, validation rules, fingerprint sensitivity to precision and scale, and helper round-trip property

Introduces a new Decimal CommonType carrying its precision and scale via a
LogicalParams struct, enabling lossless schema conversion across Avro,
Parquet, and database NUMBER/NUMERIC decimals. Bounds (precision in [1, 38],
scale in [0, precision]) describe the lossless intersection of the targeted
downstream formats.

Adds NewDecimal, FormatDecimal/ParseDecimal, and DecimalParams.Format /
Parse / ValidateValue helpers so data sources and converters share a single
canonical-string contract. Also adds Common.Validate as the public entry
point for schema-parameter checks; ParseFromAny now validates once at the
top level rather than O(depth) times.

Documents the schema changes and the converter/data-source contracts in
public/schema/decimal_types.md.
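The "validates once at the top level" change can be pictured with a small sketch. Everything here is hypothetical: `node`, `parseNode`, `validate`, and `parse` are illustrative names, not the actual `Common`, `ParseFromAny`, or `Common.Validate` implementations. The assumption is that if a recursive parser validated each subtree as it returned, a node at depth d would be checked d times; separating parsing from a single validation walk avoids that.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical schema node, not the real schema.Common type.
type node struct {
	Name      string
	Type      string
	Precision int
	Scale     int
	Children  []node
}

// parseNode builds the tree recursively and deliberately performs no
// parameter validation; that is left to the top-level entry point.
func parseNode(m map[string]any) (node, error) {
	typ, ok := m["type"].(string)
	if !ok {
		return node{}, errors.New("missing field `type`")
	}
	n := node{Type: typ}
	n.Name, _ = m["name"].(string)
	n.Precision, _ = m["precision"].(int)
	n.Scale, _ = m["scale"].(int)
	children, _ := m["children"].([]any)
	for _, c := range children {
		cm, ok := c.(map[string]any)
		if !ok {
			return node{}, fmt.Errorf("child of %q is not an object", n.Name)
		}
		child, err := parseNode(cm)
		if err != nil {
			return node{}, err
		}
		n.Children = append(n.Children, child)
	}
	return n, nil
}

// validate walks the tree, visiting every node exactly once.
func (n node) validate() error {
	if n.Type == "decimal" {
		if n.Precision < 1 || n.Precision > 38 {
			return fmt.Errorf("%s: precision %d outside [1, 38]", n.Name, n.Precision)
		}
		if n.Scale < 0 || n.Scale > n.Precision {
			return fmt.Errorf("%s: scale %d outside [0, %d]", n.Name, n.Scale, n.Precision)
		}
	}
	for _, c := range n.Children {
		if err := c.validate(); err != nil {
			return err
		}
	}
	return nil
}

// parse is the public entry point in this sketch: parse the whole tree,
// then validate it a single time.
func parse(m map[string]any) (node, error) {
	n, err := parseNode(m)
	if err != nil {
		return node{}, err
	}
	if err := n.validate(); err != nil {
		return node{}, err
	}
	return n, nil
}
```

In this shape, `parseNode` accepts a nested decimal with precision 40 without complaint, while `parse` rejects the same input during its single validation pass.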
Comment thread public/schema/common.go
return 0, fmt.Errorf("missing field `%s`", key)
}

switch n := v.(type) {

@josephwoodward josephwoodward Apr 28, 2026


Do we need to handle json.Number in the case statement below as well?

@Jeffail (Collaborator, Author)

Excellent catch, @josephwoodward, addressed in d1e0b1a. anyIntField now handles json.Number via Int64(). A sweep also turned up a sibling gap in InferFromAny: any caller piping values through json.Decoder.UseNumber() would have hit "unsupported data type" for every numeric value. InferFromAny now classifies json.Number as Int64 when it parses as an integer and Float64 otherwise. End-to-end coverage exercises the full ToAny -> json.Marshal -> Decoder.UseNumber -> ParseFromAny round-trip.
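For readers following the thread, a type switch with a `json.Number` case might look roughly like the sketch below. `intField` and `precisionAndScale` are hypothetical names, and the set of other cases (`int`, `int64`, `float64`) is an assumption; only the "missing field" error text and the use of `Int64()` come from the thread itself.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"math"
)

// intField is a hypothetical reconstruction in the spirit of anyIntField;
// the real helper in public/schema/common.go may differ.
func intField(m map[string]any, key string) (int64, error) {
	v, ok := m[key]
	if !ok {
		return 0, fmt.Errorf("missing field `%s`", key)
	}
	switch n := v.(type) {
	case int:
		return int64(n), nil
	case int64:
		return n, nil
	case float64:
		// Plain json.Unmarshal yields float64; accept only integral values.
		if math.IsInf(n, 0) || n != math.Trunc(n) {
			return 0, fmt.Errorf("field `%s` is not an integer: %v", key, n)
		}
		return int64(n), nil
	case json.Number:
		// Decoders configured with UseNumber yield json.Number instead.
		i, err := n.Int64()
		if err != nil {
			return 0, fmt.Errorf("field `%s` is not an integer: %w", key, err)
		}
		return i, nil
	default:
		return 0, fmt.Errorf("field `%s` has unsupported type %T", key, v)
	}
}

// precisionAndScale decodes raw JSON with UseNumber and reads both fields,
// exercising the json.Number branch above.
func precisionAndScale(raw []byte) (precision, scale int64, err error) {
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.UseNumber()
	var m map[string]any
	if err = dec.Decode(&m); err != nil {
		return 0, 0, err
	}
	if precision, err = intField(m, "precision"); err != nil {
		return 0, 0, err
	}
	if scale, err = intField(m, "scale"); err != nil {
		return 0, 0, err
	}
	return precision, scale, nil
}
```

Without the `json.Number` case, every value decoded through `UseNumber` would fall into the `default` branch, which is the gap the review comment points at.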

Adds json.Number handling to anyIntField (used for decimal precision and
scale parsing in ParseFromAny) and to InferFromAny. Without this, schemas
pipelined through a json.Decoder configured with UseNumber lose round-trip
fidelity at the parsing boundary.

InferFromAny classifies json.Number as Int64 when it parses as an integer
and Float64 otherwise. Adds an end-to-end test exercising the
ToAny -> json.Marshal -> Decoder.UseNumber -> ParseFromAny path.
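The classification rule and the round-trip path described in this commit message can be illustrated with a short sketch. It uses only `encoding/json`; `numberKind`, `classifyNumber`, and `roundTripKinds` are hypothetical names that stand in for the behaviour attributed to `InferFromAny`, and the marshal/decode step only approximates the `ToAny -> json.Marshal -> Decoder.UseNumber -> ParseFromAny` chain.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// numberKind is a hypothetical label for the inferred type of a number.
type numberKind string

const (
	kindInt64   numberKind = "Int64"
	kindFloat64 numberKind = "Float64"
)

// classifyNumber returns Int64 when the literal parses as an integer and
// Float64 otherwise, mirroring the rule stated in the commit message.
func classifyNumber(n json.Number) (numberKind, error) {
	if _, err := n.Int64(); err == nil {
		return kindInt64, nil
	}
	if _, err := n.Float64(); err == nil {
		return kindFloat64, nil
	}
	return "", fmt.Errorf("%q is not a valid number", n.String())
}

// roundTripKinds marshals v, decodes it again with UseNumber, and reports
// the inferred kind of every numeric top-level field.
func roundTripKinds(v map[string]any) (map[string]numberKind, error) {
	data, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.UseNumber()
	var decoded map[string]any
	if err := dec.Decode(&decoded); err != nil {
		return nil, err
	}
	kinds := make(map[string]numberKind)
	for key, val := range decoded {
		n, ok := val.(json.Number)
		if !ok {
			continue // non-numeric fields are ignored in this sketch
		}
		k, err := classifyNumber(n)
		if err != nil {
			return nil, fmt.Errorf("field %q: %w", key, err)
		}
		kinds[key] = k
	}
	return kinds, nil
}
```

One consequence of classifying by literal form: `encoding/json` marshals a `float64` such as `2.0` as `2`, so in this sketch it comes back as Int64 after the round-trip.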
@Jeffail Jeffail merged commit 3e22241 into main Apr 28, 2026
3 of 4 checks passed