The schema-registry Avro decoder now honours every Avro logical type the
spec defines (previously only `decimal`), and the iceberg connector maps
the resulting schema metadata to the right column type and interprets
numeric values using the schema's declared unit.

Decoder (internal/impl/confluent/ecs_avro.go):
- Replace the single decimal-only branch with a dispatcher covering
timestamp-{millis,micros,nanos}, local-timestamp-{millis,micros,nanos},
date, time-{millis,micros}, and uuid. Per Avro 1.10 spec, unrecognised
logicalType annotations and primitive/logical-type mismatches fall
back silently to the base primitive.
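
The fallback rule above can be sketched as a small lookup. This is an illustrative sketch, not the code in ecs_avro.go: `resolveLogical` is a hypothetical name, the pairing table follows the primitive each logical type annotates in the Avro specification, and `decimal` (plus `uuid` on a fixed type) is left out for brevity.

```go
package main

import "fmt"

// resolveLogical returns the logical type to honour for an Avro field, or ""
// when the annotation must be ignored and the base primitive used instead.
// Hypothetical helper for illustration only.
func resolveLogical(primitive, logical string) string {
	// The primitive each logical type is allowed to annotate.
	want := map[string]string{
		"timestamp-millis":       "long",
		"timestamp-micros":       "long",
		"timestamp-nanos":        "long",
		"local-timestamp-millis": "long",
		"local-timestamp-micros": "long",
		"local-timestamp-nanos":  "long",
		"date":                   "int",
		"time-millis":            "int",
		"time-micros":            "long",
		"uuid":                   "string",
	}
	if base, ok := want[logical]; ok && base == primitive {
		return logical
	}
	// Unrecognised annotation or primitive mismatch: fall back silently.
	return ""
}

func main() {
	fmt.Printf("%q\n", resolveLogical("long", "timestamp-micros"))
	fmt.Printf("%q\n", resolveLogical("int", "timestamp-millis")) // mismatch
	fmt.Printf("%q\n", resolveLogical("long", "made-up-type"))    // unknown
}
```

A table keeps the "unknown" and "mismatch" cases on the same code path, which matches the silent fallback described above.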

Encoder (internal/impl/confluent/common_to_avro.go):
- Symmetric encode for the new common types. Timestamp without explicit
Logical params keeps emitting `timestamp-millis` via EffectiveTimestamp,
preserving pre-PR output for legacy schemas. Date and TimeOfDay
paths reject Avro-inexpressible shapes (e.g. nanos for time-of-day,
AdjustToUTC for time-of-day) with field-naming errors rather than
silently downcasting.
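
A minimal sketch of the time-of-day rejection path, assuming string-valued units for readability. `avroTimeLogicalType` is a hypothetical name, and the wording of the adjust-to-UTC error is an assumption; only the unit error mirrors the message used in this change.

```go
package main

import "fmt"

// avroTimeLogicalType picks the Avro logical type for a time-of-day field, or
// returns an error naming the field when Avro has no faithful representation.
// Hypothetical helper; the real encoder works on schema.Common values.
func avroTimeLogicalType(field, unit string, adjustToUTC bool) (string, error) {
	if adjustToUTC {
		// Assumed wording: Avro time-of-day carries no UTC adjustment.
		return "", fmt.Errorf("time-of-day field %q sets adjust-to-UTC, which Avro cannot express", field)
	}
	switch unit {
	case "MILLIS":
		return "time-millis", nil
	case "MICROS":
		return "time-micros", nil
	default:
		return "", fmt.Errorf("time-of-day field %q has unit %v which Avro cannot express; only MILLIS and MICROS are supported", field, unit)
	}
}

func main() {
	fmt.Println(avroTimeLogicalType("opens_at", "MICROS", false))
	fmt.Println(avroTimeLogicalType("opens_at", "NANOS", false))
}
```

Returning an error instead of rounding nanos down to micros keeps the loss of precision visible at the point where the schema is built.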

Iceberg type resolver (internal/impl/iceberg/type_resolver.go):
- Map schema.Timestamp through EffectiveTimestamp so legacy schemas
keep landing on TimestampTzType. Honour Logical.Timestamp.Unit and
AdjustToUTC to pick TimestampType / TimestampTzType / TimestampNsType
/ TimestampTzNsType. Add Date, TimeOfDay, UUID arms; reject TimeOfDay
shapes Iceberg can't faithfully represent (AdjustToUTC=true, nanos)
with field-naming errors.
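
The timestamp selection can be pictured as a two-input decision. This sketch returns the type names as strings rather than real Iceberg types, and it assumes that millis and micros both land on the microsecond-precision types while only nanos selects the Ns variants; `icebergTimestampType` is a hypothetical name.

```go
package main

import "fmt"

// icebergTimestampType names the Iceberg type chosen for a timestamp column
// from the declared unit and the adjust-to-UTC flag. Illustrative only.
func icebergTimestampType(unit string, adjustToUTC bool) string {
	nanos := unit == "NANOS"
	switch {
	case nanos && adjustToUTC:
		return "TimestampTzNsType"
	case nanos:
		return "TimestampNsType"
	case adjustToUTC:
		return "TimestampTzType"
	default:
		return "TimestampType"
	}
}

func main() {
	// A legacy schema resolves to millis with UTC adjustment, so it keeps
	// landing on TimestampTzType.
	fmt.Println(icebergTimestampType("MILLIS", true))
	fmt.Println(icebergTimestampType("MICROS", false))
	fmt.Println(icebergTimestampType("NANOS", true))
}
```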

Iceberg shredder (internal/impl/iceberg/shredder/{shredder,temporal}.go):
- Plumb a fieldID -> *schema.Common map onto RecordShredder via
SetFieldSchemaMetadata. The leaf-value converter looks up the
metadata for time-typed columns and uses the declared unit to scale
numeric inputs into the column's internal representation. Without
metadata, the converter accepts time.Time / time.Duration directly
and falls back to bloblang.ValueAsTimestamp's seconds-default for
bare numerics — preserving existing behaviour for callers that
genuinely store unix-seconds.
- This closes the silent-corruption case where a numeric millisecond
value declared by the schema as timestamp-millis would land
~50,000 years in the future when the column type flipped from BIGINT
to TIMESTAMPTZ. The schema's declared unit is now the source of
truth for unit interpretation.
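
To make the size of that error concrete, here is a rough sketch of unit-aware scaling into microseconds. `toMicros` and `yearsSinceEpoch` are hypothetical names, the string units are an assumption, and an empty unit stands in for the no-metadata fallback that treats bare numerics as Unix seconds.

```go
package main

import "fmt"

// toMicros scales a raw integer into microseconds since the epoch using the
// unit the schema declares. An empty unit models the seconds-default fallback.
func toMicros(v int64, unit string) (int64, error) {
	switch unit {
	case "MILLIS":
		return v * 1_000, nil
	case "MICROS":
		return v, nil
	case "NANOS":
		return v / 1_000, nil // truncates sub-microsecond precision
	case "":
		return v * 1_000_000, nil
	default:
		return 0, fmt.Errorf("unknown unit %q", unit)
	}
}

// yearsSinceEpoch is a rough conversion (mean Gregorian year) used only to
// show the order of magnitude.
func yearsSinceEpoch(micros int64) int64 {
	const microsPerYear = 31_556_952_000_000 // 365.2425 days
	return micros / microsPerYear
}

func main() {
	const raw = 1_700_000_000_000 // a late-2023 instant in milliseconds
	right, _ := toMicros(raw, "MILLIS")
	wrong, _ := toMicros(raw, "") // same digits misread as seconds
	fmt.Println("declared unit honoured, year:", 1970+yearsSinceEpoch(right))
	fmt.Println("misread as seconds, year:    ", 1970+yearsSinceEpoch(wrong))
}
```

Misreading milliseconds as seconds inflates the value by a factor of 1,000, which turns roughly 54 years since 1970 into roughly 54,000.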

Iceberg writer (internal/impl/iceberg/writer.go, router.go):
- NewWriter accepts the *typeResolver. The writer parses
schema_metadata from the first message of a batch and builds the
field-ID lookup the shredder consults. Internal API change only —
the only call site is the router.
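
One way to picture the field-ID lookup is a recursive walk that pairs resolved columns with schema metadata. Everything here is hypothetical: `fieldMeta` stands in for *schema.Common, `column` stands in for a resolved Iceberg field, and matching by name at each nesting level is an assumption about how the two trees line up.

```go
package main

import "fmt"

// fieldMeta is a hypothetical stand-in for *schema.Common, reduced to the
// parts needed to interpret numeric time values.
type fieldMeta struct {
	Name     string
	Unit     string // e.g. "MILLIS"; empty when the field is not time-typed
	Children []*fieldMeta
}

// column is a hypothetical stand-in for a resolved Iceberg field.
type column struct {
	ID       int
	Name     string
	Children []column
}

// buildLookup pairs columns with metadata by name at each nesting level and
// records each match under the column's field ID.
func buildLookup(cols []column, metas []*fieldMeta, out map[int]*fieldMeta) {
	byName := make(map[string]*fieldMeta, len(metas))
	for _, m := range metas {
		byName[m.Name] = m
	}
	for _, c := range cols {
		m, ok := byName[c.Name]
		if !ok {
			continue // no metadata: the converter keeps its default behaviour
		}
		out[c.ID] = m
		buildLookup(c.Children, m.Children, out)
	}
}

// demoLookup builds a small lookup for a struct column with one nested
// timestamp and one column that has no metadata.
func demoLookup() map[int]*fieldMeta {
	metas := []*fieldMeta{{Name: "event", Children: []*fieldMeta{{Name: "ts", Unit: "MILLIS"}}}}
	cols := []column{
		{ID: 1, Name: "event", Children: []column{{ID: 2, Name: "ts"}}},
		{ID: 3, Name: "untracked"},
	}
	out := map[int]*fieldMeta{}
	buildLookup(cols, metas, out)
	return out
}

func main() {
	lookup := demoLookup()
	fmt.Println(lookup[2].Unit, len(lookup))
}
```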

Breaking surface is documented in CHANGELOG.md under Unreleased.
Pipeline values flow through unchanged in both preserve_logical_types
modes; bloblang behaviour and JSON output bytes are unaffected.
The breakage is concentrated in (a) iceberg tables that already exist
with BIGINT/INT/STRING columns from this bug, which hit Iceberg's
schema-evolution wall, and (b) custom code that pattern-matches the
historical schema_metadata shape via meta() lookups.

Companion to redpanda-data/benthos#429 which adds the new schema.Common
types and parameter blocks. The go.mod replace directive is a
development crutch and must flip to a tagged release before merge.

Closes #4399
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

CHANGELOG.md (17 additions, 0 deletions)
@@ -3,6 +3,23 @@ Changelog
 
 All notable changes to this project will be documented in this file.
 
+## Unreleased
+
+### Fixed
+
+- **BREAKING:** schema_registry_decode (avro): Avro logical types (`timestamp-{millis,micros,nanos}`, `local-timestamp-{millis,micros,nanos}`, `date`, `time-{millis,micros}`, and `uuid`) are now preserved end-to-end in the schema metadata produced by the schema-registry decoder. Previously only `decimal` was honoured; every other logical type silently degraded to its base primitive (`long`, `int`, or `string`). Downstream sinks that consume `schema_metadata` (notably `iceberg`) now create the correct column type. ([#4399](https://github.com/redpanda-data/connect/issues/4399))
+
+Pipeline values flow through unchanged in both `preserve_logical_types=false` (the default; values stay numeric) and `preserve_logical_types=true` (values stay rich Go time types). Bloblang behaviour and JSON-output bytes are unaffected.
+
+**What this changes for existing pipelines:**
+- **iceberg with existing tables that have BIGINT / INT / STRING columns from this bug**: the connector now wants to create or evolve those columns to TIMESTAMP / TIMESTAMPTZ / DATE / TIME / UUID. Iceberg disallows BIGINT → TIMESTAMP schema evolution, so the first write after upgrade will fail loudly. Drop and re-create the table, or use Iceberg-native column-rename + add-new-column tooling to migrate before upgrading.
+- **Pipelines whose own code reads the `schema_metadata` bytes via `meta()`** and pattern-matches the historical INT64 shape: schemas now contain `TIMESTAMP` / `DATE` / `TIME_OF_DAY` / `UUID` along with new `unit` and `adjust_to_utc` fields. Update the pattern.
+- **iceberg shredder** is now schema-aware for numeric inputs: a numeric millisecond value declared by the schema as `timestamp-millis` is correctly interpreted as milliseconds rather than as Unix seconds. This closes a previously-silent corruption case where an int64 millis input into a TIMESTAMPTZ column would land ~50,000 years in the future.
+
+### Changed
+
+- iceberg: `NewWriter` now takes a `*typeResolver` argument so the writer can use schema metadata to interpret numeric inputs into time-typed columns at shredding time. Internal API change only.
return nil, fmt.Errorf("time-of-day field %q has unit %v which Avro cannot express; only MILLIS and MICROS are supported", c.Name, c.Logical.TimeOfDay.Unit)