Closed
4 changes: 4 additions & 0 deletions modules/manage/pages/schema-reg/schema-reg-overview.adoc
@@ -23,6 +23,10 @@ The Schema Registry is built directly into the Redpanda binary. It runs out of t

**Normalization**: Normalization is the process of converting a schema into a canonical form. When a schema is normalized, it can be compared and considered equivalent to another schema that may contain minor syntactic differences. Schema normalization allows you to more easily manage schema versions and compatibility by prioritizing meaningful logical changes. Normalization is supported for Avro, JSON, and Protobuf formats during both schema registration and lookup for a subject.

=== Avro normalization

When normalizing an Avro schema, Redpanda transforms the schema into Parsing Canonical Form as defined in the https://avro.apache.org/docs/++version++/specification/#transforming-into-parsing-canonical-form[Avro specification^], with the exception that it does not apply the STRIP transformation.
Contributor

At the moment, the link versioning is not working, and so the link is broken.

Member

@BenPope BenPope Apr 29, 2025

Neither Redpanda nor Confluent transforms it into Parsing Canonical Form.

> [PRIMITIVES] Convert primitive schemas to their simple form (e.g., int instead of {"type":"int"}).

Yes, this is true.

> [FULLNAMES] Replace short names with fullnames, using applicable namespaces to do so. Then eliminate namespace attributes, which are now redundant.

All names are converted to fullnames, but the redundant namespace attributes are still included.

> [STRIP] Keep only attributes that are relevant to parsing data, which are: type, name, fields, symbols, items, values, size. Strip all others (e.g., doc and aliases).

I think aliases are actually removed, because the parser didn't support them. That's not intentional; I think a library update may support them now, and we should probably pull it in.

> [ORDER] Order the appearance of fields of JSON objects as follows: name, type, fields, symbols, items, values, size. For example, if an object has type, name, and size fields, then the name field should appear first, followed by the type and then the size fields.

Yes, order is fixed up.

> [STRINGS] For all JSON string literals in the schema text, replace any escaped characters (e.g., \uXXXX escapes) with their UTF-8 equivalents.

Not sure about this.

> [INTEGERS] Eliminate quotes around and any leading zeros in front of JSON integer literals (which appear in the size attributes of fixed schemas).

This is probably the case.

> [WHITESPACE] Eliminate all whitespace in JSON outside of string literals.

Yes, this.
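
To make the difference concrete, here is a rough Python sketch of the three rules confirmed above (PRIMITIVES, ORDER, WHITESPACE), with STRIP as an opt-in flag. It is an illustration only, not Redpanda's or Confluent's implementation: the function names `normalize` and `canonical` and the sample schemas are made up for this example, and FULLNAMES and INTEGERS are deliberately left out.

```python
import json

# Attribute order from the [ORDER] rule of the Avro specification.
ORDER = ["name", "type", "fields", "symbols", "items", "values", "size"]
# Attributes whose values are themselves schemas (or lists of fields).
SCHEMA_KEYS = {"type", "fields", "items", "values"}
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def normalize(node, strip=False):
    """Apply PRIMITIVES and ORDER (and STRIP, if requested) to a parsed schema."""
    if isinstance(node, list):  # unions and lists of record fields
        return [normalize(item, strip) for item in node]
    if not isinstance(node, dict):  # type names and other scalars
        return node
    if strip:
        # [STRIP] keep only the attributes relevant to parsing data.
        node = {k: v for k, v in node.items() if k in ORDER}
    # [PRIMITIVES] {"type": "int"} becomes "int".
    if set(node) == {"type"} and isinstance(node["type"], str) and node["type"] in PRIMITIVES:
        return node["type"]
    out = {}
    # [ORDER] emit the known attributes first, in the specified order.
    for key in ORDER:
        if key in node:
            value = node[key]
            out[key] = normalize(value, strip) if key in SCHEMA_KEYS else value
    # Without STRIP, remaining attributes (doc, aliases, ...) keep their relative order.
    for key, value in node.items():
        if key not in out:
            out[key] = value
    return out


def canonical(schema_text, strip=False):
    """[WHITESPACE] Serialize the normalized schema with no insignificant whitespace."""
    return json.dumps(normalize(json.loads(schema_text), strip),
                      separators=(",", ":"), ensure_ascii=False)


# Two hypothetical schemas that differ only in attribute order, spacing,
# and the way the primitive type is spelled.
schema_a = '{"type": "record", "name": "User", "doc": "a user", "fields": [{"type": {"type": "long"}, "name": "id"}]}'
schema_b = '{"name":"User","type":"record","fields":[{"name":"id","type":"long"}],"doc":"a user"}'

print(canonical(schema_a))
print(canonical(schema_a, strip=True))
```

With `strip=False` both sample schemas serialize to the same text and keep their `doc` attribute; with `strip=True` the `doc` attribute is dropped, which is what full Parsing Canonical Form would do.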


== Redpanda design overview

Every broker allows mutating REST calls, so there's no need to configure leadership or failover strategies. Schemas are stored in a compacted topic, and the registry uses optimistic concurrency control at the topic level to detect and avoid collisions.
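
The following Python sketch illustrates the general idea of optimistic concurrency control over a shared log. It is a conceptual model under stated assumptions, not Redpanda's implementation: a plain in-memory list stands in for the compacted topic, and the `Registry` class, its methods, and the "expected offset" check are hypothetical names and mechanisms chosen for illustration.

```python
class Registry:
    """One broker's view of a shared, append-only schema log (hypothetical model)."""

    def __init__(self, log):
        self.log = log        # shared list standing in for the topic
        self.applied = 0      # offset of the next record to replay
        self.schemas = {}     # subject -> list of schema versions

    def replay(self):
        """Apply unseen records; skip any record written at an unexpected offset."""
        while self.applied < len(self.log):
            expected, subject, schema = self.log[self.applied]
            if expected == self.applied:  # the writer's assumption held
                self.schemas.setdefault(subject, []).append(schema)
            self.applied += 1
        return self.applied

    def try_write(self, expected, subject, schema):
        """Append optimistically, then check whether the write landed where expected."""
        self.log.append((expected, subject, schema))
        offset = len(self.log) - 1
        self.replay()
        return offset == expected

    def register(self, subject, schema, attempts=5):
        """Retry the optimistic write until it succeeds; return the new version."""
        for _ in range(attempts):
            if self.try_write(self.replay(), subject, schema):
                return len(self.schemas[subject])
        raise RuntimeError("too many write collisions")


log = []
broker_a, broker_b = Registry(log), Registry(log)

stale = broker_a.replay()                          # broker A reads the log end
v1 = broker_b.register("user", "schema-1")         # broker B writes first
collided = not broker_a.try_write(stale, "user", "schema-from-a")
v2 = broker_a.register("user", "schema-2")         # A retries with a fresh offset
broker_b.replay()
print(v1, collided, v2, broker_b.schemas)
```

Because every reader applies the same rule when replaying the log, a record that lost the race is ignored everywhere, so no leader election or lock is needed to keep the brokers' views consistent in this model.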