Commit b64d92c

composefs-oci: Add OCI sealing spec, canonical tar, incremental pulls docs
Move the OCI sealing specification and two new design documents (canonical tar format and incremental pulls) into the rustdoc pattern established in the previous commit, using `#[cfg(doc)]` modules.

The sealing spec covers:
- The `fsverity-${DIGEST}-${BLOCKSIZEBITS}` digest identifier format
- Two composefs integrity metadata modes: artifact-based and inline annotations
- Full annotation key scheme with JSON examples
- PKCS#7 DER signature format and kernel fsverity integration
- Runtime verification (kernel fsverity signatures and digest-only paths)

The canonical tar spec defines a reproducible dumpfile→tar mapping needed for push-after-incremental-pull. The incremental pulls design builds on the composefs artifact to enable efficient partial fetches, comparable to zstd:chunked but without tar-split.

Assisted-by: OpenCode (claude-sonnet-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
1 parent 81e47d2 commit b64d92c

5 files changed

Lines changed: 748 additions & 199 deletions

Lines changed: 179 additions & 0 deletions
@@ -0,0 +1,179 @@
1+
//! # Canonical Tar Format
2+
//!
3+
//! This document defines a canonical, reproducible tar serialization for composefs filesystem trees. This is a prerequisite for pushing images after an [incremental pull](crate::incremental_pulls_spec) and complements the standardized EROFS metadata work.
4+
//!
5+
//! ## Motivation
6+
//!
7+
//! In the [incremental pull](crate::incremental_pulls_spec) model, a composefs-aware client fetches only the content objects it doesn't already have, using the EROFS metadata as a table of contents. The client does not download or store the original tar layer bytes. To push this image to another registry, or to verify the OCI `diff_id` if needed, the client must be able to regenerate a byte-identical tar stream from the EROFS metadata and local object store.
8+
//!
9+
//! Without a canonical tar format, the regenerated tar will almost certainly differ from the original (different header encoding, different entry ordering, different padding), producing different digests.
10+
//!
11+
//! ## Conceptual Model
12+
//!
13+
//! The canonical tar format is defined as a mapping from composefs dumpfile to tar. The dumpfile is a human-readable textual format that represents a complete filesystem tree and can be converted to/from EROFS. By defining dumpfile-to-tar, we complete a triangle of deterministic conversions:
14+
//!
15+
//! ```
16+
//! dumpfile ──→ canonical tar
17+
//!    ↑                │
18+
//!    │                │
19+
//!    └── EROFS (v1) ←─┘
20+
//!
21+
//! ```
22+
//!
23+
//! A client that has an EROFS can convert to dumpfile, then to canonical tar. A builder that has a tar can convert to dumpfile, then to EROFS.
24+
//!
25+
//! ## Specification
26+
//!
27+
//! ### Header Format: pax (POSIX.1-2001)
28+
//!
29+
//! The canonical format uses pax extended headers exclusively. pax supports long filenames, large file sizes, nanosecond timestamps, arbitrary xattrs, and large uid/gid values without the ambiguities of GNU extensions.
30+
//!
31+
//! Each entry consists of:
32+
//! 1. *(If pax records are needed)* A pax extended header entry (type `x`) followed by its data blocks
33+
//! 2. The ustar header entry followed by any content data blocks
34+
//!
35+
//! The pax extended header entry's name is `PaxHeaders.0/<basename>` where `<basename>` is the entry's filename component (truncated to 100 bytes if necessary).
36+
//!
37+
//! ### Global Header
38+
//!
39+
//! The archive begins with a single pax global extended header (typeflag `g`) containing one record:
40+
//!
41+
//! ```
42+
//! canonical-tar=1
43+
//! ```
44+
//!
45+
//! This allows any client to detect canonical tar format by reading the first entry. No other global extended headers are permitted in the archive.
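The detection step described above can be sketched as follows. This is a minimal illustration, not composefs code: `is_canonical_tar` and `TYPEFLAG_OFFSET` are hypothetical names, and the sketch assumes the caller has already read the first 512-byte header block and the data that follows it. The expected payload `19 canonical-tar=1\n` follows from the pax record format, where the leading length counts the whole record.

```rust
/// Offset of the `typeflag` byte inside a 512-byte ustar header
/// (it follows the 8-byte chksum field at offsets 148..156).
const TYPEFLAG_OFFSET: usize = 156;

/// Hypothetical helper: returns true when the first entry is a pax global
/// extended header (typeflag `g`) whose data starts with the single
/// `canonical-tar=1` record.
fn is_canonical_tar(first_header: &[u8; 512], first_payload: &[u8]) -> bool {
    first_header[TYPEFLAG_OFFSET] == b'g'
        && first_payload.starts_with(b"19 canonical-tar=1\n")
}
```

A stricter checker would also verify that the header's size field is exactly 19 and that no further `g` entries appear later in the archive.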
46+
//!
47+
//! ### Entry Ordering
48+
//!
49+
//! Entries appear in depth-first pre-order with children sorted by filename using byte-wise comparison. This matches the ordering produced by iterating a `BTreeMap<OsStr, Inode>`, which is the in-memory representation used by composefs.
50+
//!
51+
//! Example:
52+
//! ```
53+
//! ./
54+
//! ./a/
55+
//! ./a/x
56+
//! ./a/y
57+
//! ./b/
58+
//! ./b/z
59+
//! ./c
60+
//! ```
61+
//!
62+
//! The root directory entry comes first. Directories are emitted before their children.
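The ordering rule can be illustrated with a small sketch. The `Node` type here is a hypothetical stand-in (composefs's real inode types differ), and names are assumed to be printable only so the example can return strings. The point is that sorting happens per directory, by filename bytes, and that a directory is visited before its children.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified tree node used only for this illustration.
enum Node {
    File,
    Dir(BTreeMap<Vec<u8>, Node>),
}

/// Append tar paths in depth-first pre-order. `BTreeMap` iterates its keys
/// in ascending order, and `Vec<u8>` keys compare byte-wise.
fn walk(path: &str, node: &Node, out: &mut Vec<String>) {
    match node {
        Node::File => out.push(path.to_string()),
        Node::Dir(children) => {
            // The directory entry itself is emitted before anything beneath it.
            out.push(format!("{}/", path));
            for (name, child) in children {
                let name = String::from_utf8_lossy(name);
                walk(&format!("{}/{}", path, name), child, out);
            }
        }
    }
}

/// Returns the canonical entry order for a tree rooted at `root`.
fn canonical_order(root: &Node) -> Vec<String> {
    let mut out = Vec::new();
    walk(".", root, &mut out);
    out
}
```

Note that this is not the same as sorting full path strings: with a directory `a` and a sibling file `a.txt`, per-directory ordering yields `./a/`, `./a/x`, `./a.txt`, whereas a flat byte-wise sort of full paths would place `./a.txt` first because `.` (0x2E) sorts before `/` (0x2F).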
63+
//!
64+
//! ### Path Encoding
65+
//!
66+
//! All paths are relative to the archive root, prefixed with `./`. Directories have a trailing `/`. For example, the dumpfile path `/usr/bin/sh` becomes `./usr/bin/sh` in the tar stream; the dumpfile path `/usr/lib/` becomes `./usr/lib/`.
67+
//!
68+
//! Paths that fit within 100 bytes are stored entirely in the ustar `name` field. Paths longer than 100 bytes use a pax `path` record; the ustar `name` field is filled with a truncated form and the ustar `prefix` field is left empty. The ustar prefix/name split is never used, as different implementations split at different `/` boundaries, making it a source of non-reproducibility.
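A minimal sketch of the path mapping, assuming dumpfile paths are absolute and that directory paths already carry their trailing `/`. The helper names `tar_path` and `needs_pax_path` are hypothetical. The 100-byte threshold follows the text above literally; whether a path of exactly 100 bytes leaves room for a terminating NUL is not settled here.

```rust
/// Map an absolute dumpfile path to its tar form: drop the leading `/`
/// and prefix `./`. The root `/` becomes `./`.
fn tar_path(dumpfile_path: &[u8]) -> Vec<u8> {
    let rel = dumpfile_path.strip_prefix(b"/").unwrap_or(dumpfile_path);
    let mut out = b"./".to_vec();
    out.extend_from_slice(rel);
    out
}

/// True when the path does not fit the 100-byte ustar `name` field and a
/// pax `path` record is required. The ustar `prefix` field is never used.
fn needs_pax_path(tar_path: &[u8]) -> bool {
    tar_path.len() > 100
}
```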
69+
//!
70+
//! ### Ustar Header Fields
71+
//!
72+
//! All header fields use the ustar format (magic `ustar\0`, version `00`).
73+
//!
74+
//! | Field | Size | Encoding | Notes |
75+
//! |-------|------|----------|-------|
76+
//! | name | 100 | Bytes, null-terminated | See path encoding above |
77+
//! | mode | 8 | Octal, zero-padded, null-terminated | Permission bits only (no file-type bits). E.g. `0000755\0` |
78+
//! | uid | 8 | Octal, zero-padded, null-terminated | Values > 2,097,151 overflow to pax |
79+
//! | gid | 8 | Octal, zero-padded, null-terminated | Values > 2,097,151 overflow to pax |
80+
//! | size | 12 | Octal, zero-padded, null-terminated | File content size. 0 for directories, symlinks, devices, fifos. Values > 8,589,934,591 (8 GiB - 1) overflow to pax |
81+
//! | mtime | 12 | Octal, zero-padded, null-terminated | Seconds since epoch. Values > 8,589,934,591 overflow to pax |
82+
//! | chksum | 8 | Octal, zero-padded, null-terminated + space | Unsigned sum of all header bytes with chksum field treated as spaces |
83+
//! | typeflag | 1 | ASCII | See entry types below |
84+
//! | linkname | 100 | Bytes, null-terminated | Symlink/hardlink target; longer targets use pax `linkpath` |
85+
//! | magic | 6 | `ustar\0` | |
86+
//! | version | 2 | `00` | |
87+
//! | uname | 32 | Empty (null-filled) | Not stored in EROFS; omitted |
88+
//! | gname | 32 | Empty (null-filled) | Not stored in EROFS; omitted |
89+
//! | devmajor | 8 | Octal, zero-padded, null-terminated | For block/char devices only; 0 otherwise |
90+
//! | devminor | 8 | Octal, zero-padded, null-terminated | For block/char devices only; 0 otherwise |
91+
//! | prefix | 155 | Empty (null-filled) | Never used; long paths use pax `path` instead |
92+
//!
93+
//! Unused header bytes are zero-filled.
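The numeric encodings in the table can be sketched as below. This is an illustration written against the table, not the tar-core API; `octal_field`, `header_checksum`, and `checksum_field` are hypothetical names. The six-digit checksum layout (digits, NUL, space) is the conventional ustar form and is assumed to be what "null-terminated + space" means.

```rust
/// Encode `value` as zero-padded octal followed by a NUL, filling exactly
/// `width` bytes. Returns `None` when the value does not fit, which is the
/// signal to emit a pax record instead.
fn octal_field(value: u64, width: usize) -> Option<Vec<u8>> {
    let digits = width - 1; // one byte is reserved for the trailing NUL
    let s = format!("{:0width$o}", value, width = digits);
    if s.len() > digits {
        return None;
    }
    let mut out = s.into_bytes();
    out.push(0);
    Some(out)
}

/// Unsigned sum of all 512 header bytes, with the chksum field
/// (offsets 148..156) counted as eight ASCII spaces.
fn header_checksum(header: &[u8; 512]) -> u32 {
    header
        .iter()
        .enumerate()
        .map(|(i, &b)| if (148..156).contains(&i) { 32 } else { u32::from(b) })
        .sum()
}

/// Render the checksum as six octal digits, a NUL, and a space (assumed layout).
fn checksum_field(sum: u32) -> Vec<u8> {
    format!("{:06o}\0 ", sum).into_bytes()
}
```

The overflow thresholds in the table fall out of the field widths: an 8-byte field holds seven octal digits (maximum 2,097,151) and a 12-byte field holds eleven (maximum 8,589,934,591).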
94+
//!
95+
//! ### Entry Types
96+
//!
97+
//! | Dumpfile entry | typeflag | Notes |
98+
//! |----------------|----------|-------|
99+
//! | Regular file | `0` | Content follows header |
100+
//! | Directory | `5` | Size 0, path has trailing `/` |
101+
//! | Symlink | `2` | Target in linkname (or pax `linkpath`) |
102+
//! | Hardlink | `1` | Target in linkname as relative `./`-prefixed path |
103+
//! | Block device | `4` | devmajor/devminor set |
104+
//! | Char device | `3` | devmajor/devminor set |
105+
//! | FIFO | `6` | |
106+
//!
107+
//! ### Pax Extended Headers
108+
//!
109+
//! Pax records are used only when a value cannot be represented in the ustar header: it overflows a field, needs sub-second precision, or (as with xattrs) has no ustar field at all. The canonical format never emits a pax record for a value that fits in its ustar field.
110+
//!
111+
//! Pax records are emitted in the following order when present:
112+
//!
113+
//! 1. `path` (if the path exceeds the 100-byte ustar `name` field)
114+
//! 2. `linkpath` (if linkname exceeds 100 bytes)
115+
//! 3. `size` (if > 8,589,934,591, i.e. 8 GiB or larger)
116+
//! 4. `uid` (if > 2,097,151)
117+
//! 5. `gid` (if > 2,097,151)
118+
//! 6. `mtime` (if > 8,589,934,591, or if sub-second precision is needed)
119+
//! 7. `SCHILY.xattr.*` records, sorted by full key name (byte-wise)
120+
//!
121+
//! Each pax record is formatted as `<length> <key>=<value>\n` per POSIX.1-2001. The length field is the decimal byte count of the entire record, including the length digits themselves, the separating space, and the trailing newline.
122+
//!
123+
//! #### Xattr Encoding
124+
//!
125+
//! Extended attributes are encoded as `SCHILY.xattr.<name>` pax records. Values are binary-safe (the pax record length field handles arbitrary bytes). Xattr records are sorted by the full key string (`SCHILY.xattr.security.selinux` before `SCHILY.xattr.user.foo`), using byte-wise comparison.
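A short sketch of the xattr key construction and ordering. `sorted_xattr_records` is a hypothetical name, and xattr names are assumed to be valid UTF-8 for the purposes of the example; values stay as raw bytes.

```rust
/// Turn `(name, value)` xattr pairs into `(pax key, value)` pairs, sorted
/// byte-wise by the full `SCHILY.xattr.<name>` key.
fn sorted_xattr_records<'a>(xattrs: &[(&str, &'a [u8])]) -> Vec<(String, &'a [u8])> {
    let mut records: Vec<(String, &'a [u8])> = xattrs
        .iter()
        .map(|&(name, value)| (format!("SCHILY.xattr.{}", name), value))
        .collect();
    // Compare the raw key bytes explicitly to make the ordering rule visible.
    records.sort_by(|a, b| a.0.as_bytes().cmp(b.0.as_bytes()));
    records
}
```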
126+
//!
127+
//! #### Timestamp Precision
128+
//!
129+
//! If the dumpfile timestamp has a non-zero nanosecond component, the `mtime` pax record is emitted as `<seconds>.<fraction>`, where the fraction is the nanosecond count written as nine decimal digits with trailing zeros removed (so 1.5 s is `1.5`, not `1.500000000`). If the timestamp is integer seconds and fits in the ustar mtime field, no pax record is emitted.
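The timestamp rule can be sketched as follows. `pax_mtime` is a hypothetical helper; it assumes non-negative timestamps, `nanos` below one billion, and that the fractional part is the nanosecond count zero-padded to nine digits before trailing zeros are stripped.

```rust
/// Returns the value for a pax `mtime` record, or `None` when the ustar
/// mtime field alone is sufficient.
fn pax_mtime(secs: u64, nanos: u32) -> Option<String> {
    // Largest value that fits in eleven octal digits: 8,589,934,591.
    const USTAR_MAX: u64 = 0o77_777_777_777;
    if nanos == 0 {
        // Whole seconds: a pax record is needed only on overflow.
        return if secs > USTAR_MAX { Some(secs.to_string()) } else { None };
    }
    // Pad to nine digits first so that 5 ns becomes ".000000005".
    let frac = format!("{:09}", nanos);
    Some(format!("{}.{}", secs, frac.trim_end_matches('0')))
}
```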
130+
//!
131+
//! ### Content and Padding
132+
//!
133+
//! File content is the raw bytes from the object store (for external files, identified by fsverity digest) or the inline bytes (for files ≤ 64 bytes).
134+
//!
135+
//! Content is followed by zero-padding to the next 512-byte block boundary. The padding bytes are all zero.
136+
//!
137+
//! ### End of Archive
138+
//!
139+
//! The archive ends with two consecutive 512-byte blocks of zeros, per POSIX.
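The padding and end-of-archive rules reduce to simple block arithmetic. A minimal sketch, with hypothetical names:

```rust
/// Tar block size in bytes.
const BLOCK: u64 = 512;

/// Number of zero bytes to write after `content_len` bytes of file content
/// so that the next header starts on a 512-byte boundary.
fn padding_len(content_len: u64) -> u64 {
    (BLOCK - content_len % BLOCK) % BLOCK
}

/// Total bytes occupied by the content plus its padding.
fn padded_len(content_len: u64) -> u64 {
    content_len + padding_len(content_len)
}

/// End-of-archive marker: two consecutive all-zero 512-byte blocks.
const END_OF_ARCHIVE: [u8; 1024] = [0u8; 1024];
```

The outer `% BLOCK` ensures that content already aligned to a block boundary, including empty content, receives no padding at all.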
140+
//!
141+
//! ### Hardlink Handling
142+
//!
143+
//! When the dumpfile contains hardlinks (multiple paths sharing the same leaf ID), the first path encountered in depth-first sorted order is emitted as a regular entry with full content. Subsequent paths referencing the same leaf are emitted as hardlink entries (typeflag `1`) with the first path as the linkname target.
144+
//!
145+
//! The hardlink target path uses the same `./`-prefixed encoding as all other paths.
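The first-occurrence rule can be sketched as a single pass over the entries in canonical order. The `Emit` type and `plan_entries` function are hypothetical, and leaf IDs are modeled as plain integers for illustration.

```rust
use std::collections::HashMap;

/// What to write for one path (illustrative type, not a composefs one).
#[derive(Debug, PartialEq)]
enum Emit {
    Regular { path: String },
    Hardlink { path: String, target: String },
}

/// `entries` holds `(tar path, leaf id)` pairs already in canonical order.
/// The first path seen for a leaf carries the content; later paths for the
/// same leaf become typeflag `1` entries pointing at that first path.
fn plan_entries(entries: &[(&str, u64)]) -> Vec<Emit> {
    let mut first_seen: HashMap<u64, &str> = HashMap::new();
    let mut out = Vec::new();
    for &(path, leaf) in entries {
        if let Some(target) = first_seen.get(&leaf).copied() {
            out.push(Emit::Hardlink {
                path: path.to_string(),
                target: target.to_string(),
            });
        } else {
            first_seen.insert(leaf, path);
            out.push(Emit::Regular { path: path.to_string() });
        }
    }
    out
}
```

Because the input order is itself canonical, the choice of which path carries the content is deterministic, so two implementations walking the same tree pick the same hardlink target.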
146+
//!
147+
//! ### Whiteout Representation
148+
//!
149+
//! For per-layer (non-merged) tars, OCI whiteouts are represented as standard whiteout entries:
150+
//!
151+
//! - **File deletion**: a zero-length regular file named `.wh.<name>` in the parent directory
152+
//! - **Opaque directory**: a zero-length regular file named `.wh..wh..opq` in the directory
153+
//!
154+
//! Whiteout entries appear in sorted order alongside regular entries. Their mode is `0000644`, uid/gid are 0, mtime is 0.
155+
//!
156+
//! For merged/flattened tars, whiteouts do not appear (they have already been processed).
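The whiteout naming rules can be sketched as below. Both helpers are hypothetical and operate on `./`-prefixed tar paths as defined in the path encoding section; they only compute names and do not set the mode, uid/gid, or mtime values listed above.

```rust
/// Path of the whiteout entry that marks `deleted` as removed: a file named
/// `.wh.<name>` in the same parent directory.
fn whiteout_path(deleted: &str) -> String {
    // Ignore a trailing slash so deleted directories are handled too.
    let trimmed = deleted.trim_end_matches('/');
    match trimmed.rsplit_once('/') {
        Some((parent, name)) => format!("{}/.wh.{}", parent, name),
        None => format!(".wh.{}", trimmed),
    }
}

/// Path of the opaque-directory marker inside `dir`.
fn opaque_marker(dir: &str) -> String {
    format!("{}/.wh..wh..opq", dir.trim_end_matches('/'))
}
```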
157+
//!
158+
//! ## Compression
159+
//!
160+
//! This specification defines the uncompressed tar byte stream only. Compression (gzip, zstd, composefs-chunked framing) is a separate concern. The composefs-chunked format described in [`incremental_pulls_spec`](crate::incremental_pulls_spec) applies zstd frame boundaries on top of this canonical ordering without changing the entry order or content.
161+
//!
162+
//! ## Implementation Notes
163+
//!
164+
//! The [tar-core](https://github.com/composefs/tar-core) crate provides the building blocks for producing canonical tar output. It supports both pax and GNU extension modes, deterministic numeric encoding, and pax record construction. The canonical tar generator would use tar-core's `EntryBuilder` in pax mode (`ExtensionMode::Pax`), calling `build_pax_data()` to emit extended headers only when ustar fields overflow.
165+
//!
166+
//! tar-core does not impose entry ordering; the caller (composefs) controls the order by walking the dumpfile/EROFS tree in sorted depth-first order.
167+
//!
168+
//! ## Relationship to Other Specs
169+
//!
170+
//! The dumpfile is the canonical filesystem representation that bridges tar and EROFS. This spec defines dumpfile to tar; a future standardized EROFS metadata spec will define dumpfile to EROFS. Together they enable round-trip conversion.
171+
//!
172+
//! The OCI layer format (`application/vnd.oci.image.layer.v1.tar`) requires a standards-compliant tar stream. A canonical tar produced by this specification is a valid OCI layer. The `diff_id` is the SHA-256 of the uncompressed canonical tar stream.
173+
//!
174+
//! ## References
175+
//!
176+
//! - [Incremental pulls](crate::incremental_pulls_spec): the primary consumer of canonical tar
177+
//! - [tar-core](https://github.com/composefs/tar-core): sans-IO tar library used by composefs
178+
//! - [OCI image layer spec](https://github.com/opencontainers/image-spec/blob/main/layer.md): OCI tar layer requirements
179+
//! - [POSIX.1-2001 pax format](https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html): pax extended header specification
