docs(serde): document and test supported serde patterns for IPv4 / IPv6 / FixedString#427
Open
gregvirgin-ls wants to merge 1 commit into
Open
Conversation
…/FixedString
Three changes that retire client-side workarounds for the spec items in
upstream-clickhouse-rs.md §1.4. None of these is a "fix" in the
demonstrated-bug sense — the wire format is already correct; what was
missing is documentation of the required-annotation contract and tests
that pin it byte-for-byte.
- Add `clickhouse::serde::ipv6` as a pass-through helper for IPv6
columns, mirroring the existing `clickhouse::serde::ipv4` helper.
The default `Ipv6Addr` serde already produces network-order bytes
that match the RowBinary wire format for `IPv6`, so the helper is
an explicit no-op pass-through. Two reasons to ship it:
1. Symmetry with the `ipv4` annotation makes mixed-family row
definitions read consistently.
2. Future-proofs against any wire-format change.
- Expand the `clickhouse::serde::ipv4` rustdoc to make the
required-annotation contract explicit. The default `Ipv4Addr`
serde produces 4 network-order bytes; the wire format is a
little-endian u32; the bytes do not match. Schema validation
rejects the mismatched row, which is the right behaviour but the
message did not point at the fix.
- Add unit-level round-trip tests in `src/rowbinary/tests.rs`
covering IPv4 (via helper), IPv6 (via default *and* helper), and
FixedString(4). Pin the wire format byte-for-byte so future
regressions are caught without needing a live ClickHouse server.
- Add a live integration test `tests/it/ip.rs::ipv4_ipv6_fixedstring_
round_trip` that creates a single-table schema with all three
types, inserts via `#[derive(Row, Serialize)]` with the documented
annotations, fetches with validation enabled, and asserts both
Rust-side equality and CH-side `toString(...)` equivalence. Pins
the contract end-to-end against a real server so a Rust-side serde
regression *or* a CH-side decoding regression surfaces here.
Variable-length-into-FixedString helpers (e.g. a `PaddedString<N>`
newtype) are intentionally out of scope here and left as a follow-up.
Motivated by downstream ClickFlow code that uses `String` columns to
hold IPs because the IPv4/IPv6 round-trip was opaque enough to be
distrusted in production. With the docstring + the live IT test, the
recommended pattern is discoverable from `cargo doc` and verified
against a real CH instance.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This is not a bug fix — the wire format is already correct. What was missing was:
Ipv4Addr(clickhouse::serde::ipv4is required because the defaultIpv4Addrserde produces 4 network-order bytes whereas theIPv4wire format is little-endian u32).Ipv6Addr(clickhouse::serde::ipv6) — pass-through, but available so mixed-family row definitions read consistently.What changed
src/serde.rs: rustdoc onclickhouse::serde::ipv4made explicit; newclickhouse::serde::ipv6pass-through helper.src/rowbinary/tests.rs: byte-level round-trip tests undermod ip_and_fixed_stringcovering IPv4, IPv6 (default and helper), andFixedString(4).tests/it/ip.rs: newipv4_ipv6_fixedstring_round_triptest creating a table with all three column types, inserting via#[derive(Row, Serialize)], fetching with validation enabled, and asserting both Rust-side equality and CH-sidetoString(...)equivalence.What is NOT in this PR
PaddedString<N>(or similar) newtype that would let users writeStringand have it auto-pad-or-truncate toFixedString(N). Out of scope; if maintainers want it, follow-up.Ipv4Addr/Ipv6Addrserde behaviour. The contract is unchanged; it's now documented.Behaviour change
None observable to existing code. The new
clickhouse::serde::ipv6helper is opt-in. Existing users who rely on the defaultIpv6Addrserde keep working.Test matrix
The live IT test (
tests/it/ip.rs::ipv4_ipv6_fixedstring_round_trip) requires a running ClickHouse instance, per the existing convention fortests/it/.Downstream motivation
A non-trivial fraction of clickhouse-rs consumers we've looked at use
Stringcolumns to hold IPs because the round-trip viaIpv4Addr/Ipv6Addrwas opaque enough to be distrusted in production. Putting the contract in rustdoc plus pinning it with a live IT test resolves the discoverability question.