feat(zstd): validate Content_Checksum frames (XXH64) #108

Merged: MagicalTux merged 1 commit into master from feat/zstd-content-checksum on Jun 30, 2026

@MagicalTux

Member

Problem

The zstd decoder refused any frame whose Frame_Header set Content_Checksum_Flag, returning Error::Unsupported, because the crate shipped no XXH64 implementation. The zstd CLI writes a content checksum by default, so the crate could decode zstd CLI output only when it had been compressed with --no-check.

Change

  • New streaming XXH64 (src/zstd/xxhash.rs) — canonical algorithm, seed 0, no_std. Verified against reference vectors ("", "a", "abc", 64-byte) and a streaming-vs-one-shot equivalence test.
  • Decoder validates the checksum — every decompressed byte is fed through XXH64 at each emit site (raw / RLE / compressed blocks), and the 4-byte little-endian trailer (low 32 bits of the digest) is checked at end of frame. Mismatch → Error::ChecksumMismatch. The frame-checksum state machine (ContentChecksum phase) already existed; this wires it up.
  • Hashing is gated on the frame actually advertising a checksum, so non-checksummed frames pay nothing.
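As a rough illustration of the hashing and trailer check described above, here is a minimal streaming XXH64 with seed 0 plus a trailer comparison. It is a sketch written from the public XXH64 algorithm, not the code in src/zstd/xxhash.rs; the names `Xxh64`, `update`, `digest`, and `trailer_matches` are hypothetical.

```rust
// XXH64 prime constants from the published algorithm.
const P1: u64 = 0x9E37_79B1_85EB_CA87;
const P2: u64 = 0xC2B2_AE3D_27D4_EB4F;
const P3: u64 = 0x1656_67B1_9E37_79F9;
const P4: u64 = 0x85EB_CA77_C2B2_AE63;
const P5: u64 = 0x27D4_EB2F_1656_67C5;

fn round(acc: u64, lane: u64) -> u64 {
    acc.wrapping_add(lane.wrapping_mul(P2)).rotate_left(31).wrapping_mul(P1)
}

fn merge(h: u64, v: u64) -> u64 {
    (h ^ round(0, v)).wrapping_mul(P1).wrapping_add(P4)
}

fn le64(b: &[u8]) -> u64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(&b[..8]);
    u64::from_le_bytes(a)
}

/// Hypothetical streaming state (seed fixed at 0); uses only `core` features.
pub struct Xxh64 {
    v: [u64; 4],   // four accumulator lanes
    buf: [u8; 32], // bytes not yet forming a full 32-byte stripe
    buf_len: usize,
    total: u64,    // total bytes hashed so far
}

impl Xxh64 {
    pub fn new() -> Self {
        let v = [P1.wrapping_add(P2), P2, 0, 0u64.wrapping_sub(P1)];
        Xxh64 { v, buf: [0; 32], buf_len: 0, total: 0 }
    }

    fn stripe(v: &mut [u64; 4], s: &[u8]) {
        for i in 0..4 {
            v[i] = round(v[i], le64(&s[i * 8..]));
        }
    }

    /// Feed more decompressed bytes; may be called once per emitted block.
    pub fn update(&mut self, mut data: &[u8]) {
        self.total = self.total.wrapping_add(data.len() as u64);
        if self.buf_len > 0 {
            // Top up the partial stripe left over from the previous call.
            let take = (32 - self.buf_len).min(data.len());
            self.buf[self.buf_len..self.buf_len + take].copy_from_slice(&data[..take]);
            self.buf_len += take;
            data = &data[take..];
            if self.buf_len < 32 {
                return;
            }
            Self::stripe(&mut self.v, &self.buf);
            self.buf_len = 0;
        }
        let mut chunks = data.chunks_exact(32);
        for s in &mut chunks {
            Self::stripe(&mut self.v, s);
        }
        let rest = chunks.remainder();
        self.buf[..rest.len()].copy_from_slice(rest);
        self.buf_len = rest.len();
    }

    pub fn digest(&self) -> u64 {
        let mut h = if self.total >= 32 {
            let [a, b, c, d] = self.v;
            let h = a.rotate_left(1)
                .wrapping_add(b.rotate_left(7))
                .wrapping_add(c.rotate_left(12))
                .wrapping_add(d.rotate_left(18));
            merge(merge(merge(merge(h, a), b), c), d)
        } else {
            P5 // seed (0) + P5 for inputs shorter than one stripe
        };
        h = h.wrapping_add(self.total);
        let mut tail = &self.buf[..self.buf_len];
        while tail.len() >= 8 {
            h = (h ^ round(0, le64(tail))).rotate_left(27).wrapping_mul(P1).wrapping_add(P4);
            tail = &tail[8..];
        }
        if tail.len() >= 4 {
            let mut w = [0u8; 4];
            w.copy_from_slice(&tail[..4]);
            let w = u32::from_le_bytes(w) as u64;
            h = (h ^ w.wrapping_mul(P1)).rotate_left(23).wrapping_mul(P2).wrapping_add(P3);
            tail = &tail[4..];
        }
        for &b in tail {
            h = (h ^ (b as u64).wrapping_mul(P5)).rotate_left(11).wrapping_mul(P1);
        }
        // Final avalanche.
        h ^= h >> 33;
        h = h.wrapping_mul(P2);
        h ^= h >> 29;
        h = h.wrapping_mul(P3);
        h ^ (h >> 32)
    }
}

/// The frame trailer is the low 32 bits of the digest, stored little-endian.
pub fn trailer_matches(content: &[u8], trailer: [u8; 4]) -> bool {
    let mut h = Xxh64::new();
    h.update(content);
    (h.digest() as u32).to_le_bytes() == trailer
}
```

In a decoder, the gating described in the last bullet could be expressed by holding the hasher in an `Option` that is `Some` only when the frame header sets Content_Checksum_Flag, so frames without a checksum never call `update`. That layout is an assumption, not something the PR text specifies.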

Tests

  • Replaced the obsolete decode_rejects_checksum_flag with decode_validates_correct_checksum and decode_rejects_bad_checksum (hand-built frames with a real / corrupted XXH64 trailer — no CLI dependency).
  • XXH64 unit tests.
  • Verified end-to-end against zstd CLI output across levels 1–19, checksummed and --no-check, over a range of sizes including block boundaries; corrupting the trailer is correctly rejected.
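For readers who want to picture the hand-built frames, the sketch below assembles a single-segment frame containing one raw block and a content checksum, following the layout in RFC 8878. The helper names `raw_frame` and `corrupt_trailer` are hypothetical, the use of `Vec` assumes `std` or `alloc` is available in tests, and the crate's decoder entry point is deliberately not called because the PR text does not show its signature.

```rust
/// Low 32 bits of XXH64("abc", seed 0) = 0x44BC2CF5AD770999, little-endian.
const ABC_TRAILER: [u8; 4] = [0x99, 0x09, 0x77, 0xAD];

/// Builds a single-segment zstd frame holding one raw block, followed by
/// the given 4-byte Content_Checksum trailer.
fn raw_frame(content: &[u8], trailer: [u8; 4]) -> Vec<u8> {
    assert!(content.len() < 256, "sketch uses the 1-byte Frame_Content_Size form");
    let mut f = Vec::new();
    f.extend_from_slice(&0xFD2F_B528u32.to_le_bytes()); // Magic_Number
    f.push(0x20 | 0x04); // Single_Segment_Flag | Content_Checksum_Flag
    f.push(content.len() as u8); // Frame_Content_Size (1 byte)
    // Block_Header (3 bytes, little-endian):
    // bit 0 = Last_Block, bits 1-2 = Block_Type (0 = Raw), bits 3.. = Block_Size.
    let hdr = ((content.len() as u32) << 3) | 1;
    f.extend_from_slice(&hdr.to_le_bytes()[..3]);
    f.extend_from_slice(content);
    f.extend_from_slice(&trailer);
    f
}

/// Flips one bit in the last trailer byte to produce a frame that a
/// checksum-validating decoder should reject.
fn corrupt_trailer(mut frame: Vec<u8>) -> Vec<u8> {
    let last = frame.len() - 1;
    frame[last] ^= 0x01;
    frame
}
```

A test in the style of decode_validates_correct_checksum would feed `raw_frame(b"abc", ABC_TRAILER)` to the decoder and expect `abc` back, while decode_rejects_bad_checksum would feed the `corrupt_trailer` variant and expect Error::ChecksumMismatch.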

Our encoder still does not emit a content checksum (unchanged).

🤖 Generated with Claude Code

The zstd decoder rejected any frame whose Frame_Header set
Content_Checksum_Flag with Error::Unsupported, because no XXH64 was
implemented. Since the zstd CLI writes a content checksum by default,
`zstd` output decoded only if it had been written with `--no-check`.

Add a streaming XXH64 (canonical, seed 0; verified against reference
vectors and against checksums produced by the zstd CLI). The decoder now
feeds every decompressed byte through it at each emit site (raw, RLE, and
compressed blocks) and validates the 4-byte little-endian trailer — the
low 32 bits of XXH64 over the decompressed content — at end of frame,
reporting Error::ChecksumMismatch on a mismatch.

Replaces the obsolete decode_rejects_checksum_flag test with
decode_validates_correct_checksum / decode_rejects_bad_checksum, and adds
XXH64 unit tests. Verified end-to-end against `zstd` CLI output across
levels 1-19 (checksummed and --no-check). Our encoder still does not emit
a content checksum.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

MagicalTux merged commit d90b9ed into master on Jun 30, 2026
42 checks passed
MagicalTux deleted the feat/zstd-content-checksum branch on June 30, 2026 09:20