Apache Iceberg version
main (development)
Please describe the bug 🐞
Component: literals.go
Affected version: main (verified on current apache/iceberg-go main)
Severity: Low–Medium.
Type metadata returned by DecimalLiteral.Type() is wrong, but most internal call sites are insensitive to the precision field. One concrete functional consequence is confirmed in the Substrait conversion path.
DecimalLiteral.Type() always returns decimal(9, scale) regardless of the column's actual declared precision, because DecimalLiteral does not carry precision. This makes the type returned by .Type() unreliable as a source of truth about the originating column type.
The previous version of this report listed broad downstream impact on bounds checks, casts, predicate projection, and partition evaluation. On audit, those paths do not consume DecimalLiteral.Type().Precision() and are unaffected in practice. This rewrite narrows the claim to what is actually demonstrable.
Affected code
literals.go:
func (d DecimalLiteral) Type() Type { return DecimalTypeOf(9, d.Scale) }
DecimalLiteral is defined as:
type DecimalLiteral Decimal
// Decimal carries Val and Scale, but NOT precision.
type Decimal struct {
Val decimal128.Num
Scale int32
}
Root cause
DecimalLiteral (and Decimal) do not store the column's declared precision — only the scale. When Type() reconstructs a DecimalType, it has no precision to use, so it hardcodes 9.
9 happens to be a valid small precision, but it is wrong for any column with precision ≠ 9. Common production types like decimal(10,2), decimal(18,4), or decimal(38,10) all silently report decimal(9, scale) from lit.Type().
Reproduction
lit := iceberg.DecimalLiteral{
Val: decimal128.FromI64(12345678),
Scale: 4,
}
fmt.Println(lit.Type()) // prints: decimal(9, 4)
// Round-trip through LiteralFromBytes also drops precision:
typ := iceberg.DecimalTypeOf(18, 4)
b, _ := lit.MarshalBinary()
back, _ := iceberg.LiteralFromBytes(typ, b)
fmt.Println(back.Type().Equals(typ)) // false — back.Type() is decimal(9, 4)
Confirmed functional impact
1. Substrait conversion emits wrong precision
table/substrait/substrait.go:
func toDecimalLiteral(v iceberg.DecimalLiteral) expr.Literal {
byts, _ := v.MarshalBinary()
result, _ := expr.NewLiteral(&types.Decimal{
Scale: int32(v.Scale),
Value: byts,
Precision: int32(v.Type().(*iceberg.DecimalType).Precision()),
}, false)
return result
}
Here precision is read off v.Type() and embedded in the Substrait literal. For any decimal column whose precision is not 9, the emitted Substrait literal carries the wrong precision. (Note: v.Type() returns a value-type DecimalType, not a pointer, so the *iceberg.DecimalType assertion here is also questionable and worth a separate look.)
2. Type() cannot be trusted as column-type metadata
Any external caller that reflects on DecimalLiteral.Type() to recover the originating column type — for logging, schema introspection, or interop — gets decimal(9, scale) instead of the real precision.
Paths checked and found NOT affected
These were claimed in the original report but, on inspection, do not actually read precision off DecimalLiteral.Type():
DecimalLiteral.To(DecimalType) (literals.go): only compares Scale.
Source-precision is never consulted.
boundRef.evalToLiteral (exprs.go): does lit.Type().Equals(field.Type) and falls through to lit.To(field.Type) when unequal. Because To only checks scale, it succeeds with the same value. Wasteful, but functionally correct.
- Bounds / range checks on literals: the upper-bound machinery operates on the field type (which carries the correct precision), not on the literal's reported type.
- Predicate projection (
Inclusive / Exclusive): same — projection is driven by the bound term's field type, not by lit.Type().Precision().
validateAddedDataFilesMatchingFilter (table/conflict_validation.go):
partition evaluation uses partition-spec field types, not literal-reported types.
If anyone can demonstrate a failing case in one of these paths, please attach a reproducer — the original report did not include one and a code search did not turn one up.
Proposed fix
Two options:
Option A — store precision in DecimalLiteral (breaking change):
type DecimalLiteral struct {
Decimal
Precision int32
}
func (d DecimalLiteral) Type() Type { return DecimalTypeOf(int(d.Precision), int(d.Scale)) }
Breaks the public DecimalLiteral shape and every literal construction site.
Option B — fix the one known broken consumer:
Leave DecimalLiteral as-is. In toDecimalLiteral (Substrait), take precision from the bound field's DecimalType rather than from v.Type(). Document on DecimalLiteral.Type() that the returned precision is not the column's declared precision and callers needing real precision must consult the field type.
Given the narrow real impact, Option B is the lowest-risk path. Option A is justified only if more genuinely-affected call sites turn up.
Related
- Bug:
convertDecimalValue passes string length instead of declared precision to DecimalRequiredBytes (companion bug in manifest.go).
Apache Iceberg version
main (development)
Please describe the bug 🐞
Component:
literals.goAffected version:
main(verified on currentapache/iceberg-gomain)Severity: Low–Medium.
Type metadata returned by
DecimalLiteral.Type()is wrong, but most internal call sites are insensitive to the precision field. One concrete functional consequence is confirmed in the Substrait conversion path.DecimalLiteral.Type()always returnsdecimal(9, scale)regardless of the column's actual declared precision, becauseDecimalLiteraldoes not carry precision. This makes the type returned by.Type()unreliable as a source of truth about the originating column type.The previous version of this report listed broad downstream impact on bounds checks, casts, predicate projection, and partition evaluation. On audit, those paths do not consume
DecimalLiteral.Type().Precision()and are unaffected in practice. This rewrite narrows the claim to what is actually demonstrable.Affected code
literals.go:DecimalLiteralis defined as:Root cause
DecimalLiteral(andDecimal) do not store the column's declared precision — only the scale. WhenType()reconstructs aDecimalType, it has no precision to use, so it hardcodes9.9happens to be a valid small precision, but it is wrong for any column with precision ≠ 9. Common production types likedecimal(10,2),decimal(18,4), ordecimal(38,10)all silently reportdecimal(9, scale)fromlit.Type().Reproduction
Confirmed functional impact
1. Substrait conversion emits wrong precision
table/substrait/substrait.go:Here precision is read off
v.Type()and embedded in the Substrait literal. For any decimal column whose precision is not 9, the emitted Substrait literal carries the wrong precision. (Note:v.Type()returns a value-typeDecimalType, not a pointer, so the*iceberg.DecimalTypeassertion here is also questionable and worth a separate look.)2.
Type()cannot be trusted as column-type metadataAny external caller that reflects on
DecimalLiteral.Type()to recover the originating column type — for logging, schema introspection, or interop — getsdecimal(9, scale)instead of the real precision.Paths checked and found NOT affected
These were claimed in the original report but, on inspection, do not actually read precision off
DecimalLiteral.Type():DecimalLiteral.To(DecimalType)(literals.go): only comparesScale.Source-precision is never consulted.
boundRef.evalToLiteral(exprs.go): doeslit.Type().Equals(field.Type)and falls through tolit.To(field.Type)when unequal. BecauseToonly checks scale, it succeeds with the same value. Wasteful, but functionally correct.Inclusive/Exclusive): same — projection is driven by the bound term's field type, not bylit.Type().Precision().validateAddedDataFilesMatchingFilter(table/conflict_validation.go):partition evaluation uses partition-spec field types, not literal-reported types.
If anyone can demonstrate a failing case in one of these paths, please attach a reproducer — the original report did not include one and a code search did not turn one up.
Proposed fix
Two options:
Option A — store precision in
DecimalLiteral(breaking change):Breaks the public
DecimalLiteralshape and every literal construction site.Option B — fix the one known broken consumer:
Leave
DecimalLiteralas-is. IntoDecimalLiteral(Substrait), take precision from the bound field'sDecimalTyperather than fromv.Type(). Document onDecimalLiteral.Type()that the returned precision is not the column's declared precision and callers needing real precision must consult the field type.Given the narrow real impact, Option B is the lowest-risk path. Option A is justified only if more genuinely-affected call sites turn up.
Related
convertDecimalValuepasses string length instead of declared precision toDecimalRequiredBytes(companion bug inmanifest.go).