Commit 2e09ff5

docs: Add driver implementation documentation
Create dedicated documentation files for each storage driver:

- containers-storage-driver-overlay.md (includes zstd:chunked and composefs)
- containers-storage-driver-vfs.md
- containers-storage-driver-btrfs.md
- containers-storage-driver-zfs.md

The composefs and zstd:chunked documentation is consolidated into the overlay driver doc since these are overlay-specific features.

Assisted-by: OpenCode (Opus 4.5)
Signed-off-by: Colin Walters <walters@verbum.org>
1 parent aa10ab7 commit 2e09ff5

6 files changed

Lines changed: 181 additions & 130 deletions

storage/docs/containers-storage-composefs.md

Lines changed: 0 additions & 72 deletions
This file was deleted.

storage/docs/containers-storage-driver-btrfs.md

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
# containers-storage 1 "February 2026"

## NAME
containers-storage-driver-btrfs - The btrfs storage driver

## DESCRIPTION

The btrfs driver uses native btrfs copy-on-write via subvolumes and snapshots.

## IMPLEMENTATION

The on-disk file layout is an internal implementation detail and may change between versions. The only stable interface is the Go library API.

Requires a btrfs filesystem. Layers are stored as subvolumes under `btrfs/subvolumes/`. New empty layers are created as subvolumes; child layers are created as btrfs snapshots, providing true CoW semantics. Quotas are supported via btrfs qgroups. Set `btrfs.min_space` to enable quota enforcement.
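
When inspecting storage by hand, it can help to think of the two layer-creation cases in terms of the equivalent `btrfs` command lines. The following is a minimal sketch that only builds those command lines and does not run them. The function name and the example paths are hypothetical, and nothing here implies that the driver itself invokes the `btrfs` tool.

```python
from typing import List, Optional


def btrfs_layer_argv(subvolumes_dir: str, layer_id: str,
                     parent_id: Optional[str] = None) -> List[str]:
    """Return a btrfs(8) command line equivalent to creating a layer.

    Hypothetical helper for illustration only: a layer without a parent
    becomes a new empty subvolume, a child layer becomes a snapshot of
    its parent's subvolume.
    """
    target = f"{subvolumes_dir}/{layer_id}"
    if parent_id is None:
        return ["btrfs", "subvolume", "create", target]
    source = f"{subvolumes_dir}/{parent_id}"
    return ["btrfs", "subvolume", "snapshot", source, target]
```

Returning an argument list instead of running a command keeps the sketch safe to experiment with on a machine that has no btrfs filesystem.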

Reference: `drivers/btrfs/btrfs.go`

## RUNTIME

Like VFS, there is no mount involved. Btrfs subvolumes are accessible as regular directories, so `Get()` returns the subvolume path directly. If a quota was configured, the qgroup limit is applied at this point. `Put()` is a no-op.

## BUGS

https://github.com/containers/storage/issues?q=is%3Aissue+is%3Aopen+label%3Aarea%2Fbtrfs

## FOOTNOTES
The Containers Storage project is committed to inclusivity, a core value of open source.
The `master` and `slave` mount propagation terminology is used in this repository.
This language is problematic and divisive, and should be changed.
However, these terms are currently used within the Linux kernel and must be used as-is at this time.
When the kernel maintainers rectify this usage, Containers Storage will follow suit immediately.

storage/docs/containers-storage-driver-overlay.md

Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
# containers-storage 1 "February 2026"

## NAME
containers-storage-driver-overlay - The overlay storage driver

## DESCRIPTION

The overlay driver uses Linux OverlayFS for copy-on-write semantics. This is the default and recommended driver for most use cases. See [containers-storage.conf.5.md](containers-storage.conf.5.md) for configuration options.

## IMPLEMENTATION

The on-disk file layout is an internal implementation detail and may change between versions. The only stable interface is the Go library API.
The description below is intended to aid debugging and recovery, but changing content directly is not supported.

The top-level overlay directory holds layers keyed by a [chain ID](https://github.com/opencontainers/image-spec/blob/main/config.md#layer-chainid) which identifies the precise sequence of parent layers leading to this one. A layer with the same DiffID can have multiple physical objects in this directory if it was created in different contexts (e.g. with or without zstd:chunked).
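
To make the chain ID concept concrete, the following sketch computes chain IDs from an ordered list of DiffIDs using the recurrence given in the linked OCI image specification: the chain ID of the base layer is its DiffID, and each later chain ID is the digest of the previous chain ID, a space, and the next DiffID. It assumes SHA-256 digests throughout, and the function name is hypothetical.

```python
import hashlib
from typing import List


def chain_ids(diff_ids: List[str]) -> List[str]:
    """Return the chain ID of every layer, base layer first.

    diff_ids are digest strings such as "sha256:<hex>", ordered from
    the base layer to the topmost layer.
    """
    result: List[str] = []
    for diff_id in diff_ids:
        if not result:
            # The base layer's chain ID is simply its DiffID.
            result.append(diff_id)
        else:
            data = f"{result[-1]} {diff_id}".encode("ascii")
            result.append("sha256:" + hashlib.sha256(data).hexdigest())
    return result
```

Because each chain ID folds in the previous one, two layers with the same DiffID but different parents always receive different chain IDs.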

Each layer has at least a `diff` directory and a `link` file. If there are lower layers, the layer also has a `lower` file, a `merged` directory, and a `work` directory. The `diff` directory holds the upper layer of the overlay and captures any changes to the layer. The `lower` file lists all lower layers, separated by `:` and ordered from uppermost to lowermost. The overlay itself is mounted on the `merged` directory, and the `work` directory is required by overlayfs for its own internal use.

The `link` file for each layer contains a unique string for the layer. Under the `l/` directory at the root there is a symbolic link named with that string, pointing to the layer's `diff` directory. These symbolic links are used to reference lower layers in the `lower` file and at mount time. They shorten the total length of a layer reference without requiring changes to the layer identifier or root directory. Mounts are always performed relative to the root and reference the symbolic links, so that the list of lower directories fits in the single page available for the mount syscall's options.
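
For debugging, the `lower` file and the `l/` links can be followed by hand. The sketch below reads a layer's `lower` file and resolves each entry to the `diff` directory it points at. It assumes that entries are stored relative to the overlay root (for example `l/<link>`), which is an inference from the description above and not a documented guarantee; the function name is hypothetical.

```python
import os
from typing import List


def resolve_lower_dirs(overlay_root: str, layer_dir: str) -> List[str]:
    """Return the diff directories named by a layer's `lower` file.

    The result is ordered from uppermost to lowermost, matching the
    order in the file. A layer without a `lower` file has no parents.
    """
    lower_path = os.path.join(layer_dir, "lower")
    if not os.path.exists(lower_path):
        return []
    with open(lower_path, encoding="utf-8") as f:
        entries = [e for e in f.read().strip().split(":") if e]
    # Each entry is assumed to be a symlink path relative to the root.
    return [os.path.realpath(os.path.join(overlay_root, e)) for e in entries]
```

This only reads metadata, in keeping with the note above that modifying the on-disk content directly is not supported.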

A hard upper limit of 500 lower layers is enforced.

The `overlay-layers/` directory alongside the per-layer directories contains metadata managed by the storage library. Each layer has a `${layerid}.tar-split.gz` file preserving the original tar stream structure (without file content) so that the original archive can be reconstructed exactly from the unpacked `diff/`. The directory also contains `layers.json` with global layer metadata and `layers.lock` for concurrency control.

The `overlay-containers/` directory holds running container state: `containers.json` for metadata and `containers.lock` for concurrency control.

Reference: `drivers/overlay/overlay.go`

## RUNTIME

When a container needs its filesystem, the driver performs a `mount(2)` with type `overlay`, passing the layer's `diff` directory as the upperdir and all parent layers' `diff` directories as lowerdirs. The kernel's overlayfs merges these at access time: no data is copied, and layers remain independent on disk. Writes go to the upperdir via copy-up. The mount is placed at the layer's `merged` directory, and the `work` directory is used internally by overlayfs for atomic operations like rename.

If a mount program is configured (e.g. `fuse-overlayfs` for rootless operation), it is invoked instead of the `mount(2)` syscall. When the mount option string exceeds the kernel's page size limit, the driver forks a child process that `chdir`s into the storage root and uses relative paths to shorten the options.
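
The effect of relative paths on the option string can be illustrated with a small sketch. It builds the standard overlayfs `lowerdir=`, `upperdir=` and `workdir=` options and compares the result against the page size. The function names and the example paths are hypothetical, and the exact length check performed by the driver may differ from the one shown.

```python
import os
from typing import List

MAX_LOWER_LAYERS = 500  # the limit stated earlier in this document


def overlay_mount_options(lowers: List[str], upper: str, work: str) -> str:
    """Build an overlayfs option string; lowers are ordered uppermost first."""
    if not lowers:
        raise ValueError("this sketch only covers layers with parents")
    if len(lowers) > MAX_LOWER_LAYERS:
        raise ValueError("too many lower layers")
    return f"lowerdir={':'.join(lowers)},upperdir={upper},workdir={work}"


def fits_in_one_page(options: str) -> bool:
    """Report whether the options fit in one page (assumed check)."""
    return len(options.encode()) < os.sysconf("SC_PAGE_SIZE")


# Example: 100 parents referenced by absolute path versus by short link.
absolute = overlay_mount_options(
    ["/var/lib/containers/storage/overlay/" + "x" * 64 + "/diff"] * 100,
    "top/diff", "top/work")
relative = overlay_mount_options(
    ["l/" + "X" * 26] * 100, "top/diff", "top/work")
```

With these example inputs the absolute form is several times longer than the relative form, which is the reason for both the `l/` links and the `chdir` into the storage root.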

On `Put()`, the overlayfs mount is unmounted.

### zstd:chunked

`zstd:chunked` is a variant of the `application/vnd.oci.image.layer.v1.tar+zstd` media type that uses zstd skippable frames to include a table of contents with SHA-256 digests and offsets of individual file chunks. This allows fetching only content not already present via HTTP range requests.
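
Skippable frames are part of the zstd frame format (RFC 8878): a 4-byte little-endian magic number in the range `0x184D2A50` to `0x184D2A5F`, a 4-byte little-endian length, and then that many bytes of user data, which ordinary decompressors ignore. The sketch below recognizes one such frame in a byte buffer. How zstd:chunked arranges its table of contents inside these frames is not covered here, and the function name is hypothetical.

```python
import struct
from typing import Optional, Tuple

SKIPPABLE_MAGIC_MIN = 0x184D2A50
SKIPPABLE_MAGIC_MAX = 0x184D2A5F


def read_skippable_frame(buf: bytes,
                         offset: int = 0) -> Optional[Tuple[bytes, int]]:
    """Return (user_data, next_offset) if a skippable frame starts at
    offset, or None if the bytes there are not a complete skippable frame.
    """
    if len(buf) - offset < 8:
        return None
    magic, size = struct.unpack_from("<II", buf, offset)
    if not SKIPPABLE_MAGIC_MIN <= magic <= SKIPPABLE_MAGIC_MAX:
        return None
    start = offset + 8
    end = start + size
    if end > len(buf):
        return None  # frame is truncated
    return buf[start:end], end
```

Because a standard zstd decoder skips these frames, a layer in this format can still be unpacked by tools that do not understand the extra metadata.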

Note: The zstd:chunked format is not standardized, though it is an eventual goal to do so.

Each layer has an associated big data key `chunked-manifest-cache` containing index metadata in a binary format suitable for mmap(). When pulling, existing layers are scanned for files with matching digests. Matching files are hardlinked if `use_hard_links = "true"`, otherwise reflinked (or copied if reflinks are unsupported).
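
The matching step can be pictured as a lookup from content digest to an existing file. The sketch below is deliberately naive: it re-hashes files on disk to build the index, whereas the description above says the real lookup uses the cached `chunked-manifest-cache` metadata. All function names are hypothetical.

```python
import hashlib
import os
from typing import Dict, Iterable, List, Tuple


def file_sha256(path: str) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB blocks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()


def index_by_digest(layer_dirs: Iterable[str]) -> Dict[str, str]:
    """Map digest -> path for every regular file under the given dirs."""
    index: Dict[str, str] = {}
    for layer_dir in layer_dirs:
        for dirpath, _dirs, files in os.walk(layer_dir):
            for name in files:
                path = os.path.join(dirpath, name)
                # Skip symlinks and special files such as FIFOs.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                index.setdefault(file_sha256(path), path)
    return index


def plan_pull(wanted: Dict[str, str],
              index: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Split wanted files (dest path -> digest) into reusable and missing."""
    reuse: Dict[str, str] = {}
    fetch: List[str] = []
    for dest, digest in wanted.items():
        if digest in index:
            reuse[dest] = index[digest]
        else:
            fetch.append(dest)
    return reuse, fetch
```

Only the entries in `fetch` would need to be downloaded; the entries in `reuse` are candidates for a hardlink, reflink, or copy from the existing file.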

Configuration (support is enabled by default in the code):

```
[storage.options.pull_options]
enable_partial_images = "true"
```

Configuration values must be string booleans (quoted), not native TOML booleans.

Reference: `pkg/chunked/internal/compression.go`

### composefs

composefs provides an immutable filesystem layer with optional integrity verification.

Configuration:

```
[storage.options.overlay]
use_composefs = "true"
```

Configuration values must be string booleans (quoted), not native TOML booleans.

composefs requires zstd:chunked images. For non-zstd:chunked images, set `convert_images = "true"` in `[storage.options.pull_options]` to enable dynamic conversion during pulls.

With composefs enabled, the `diff/` directory becomes an object hash directory where each filename is the sha256 of its contents. Each layer has a `composefs-data/composefs.blob` file containing the composefs superblock with all metadata.
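
Since object names are derived from content, an object directory can be checked for corruption by re-hashing each file. The following sketch does that, assuming names are lowercase hex SHA-256 digests. It also tolerates a layout in which the digest is split across a subdirectory prefix and a filename; that tolerance is an assumption made for robustness and is not something this document states. The function name is hypothetical.

```python
import hashlib
import os
from typing import List


def find_mismatched_objects(object_dir: str) -> List[str]:
    """Return relative paths whose name does not match their content.

    The expected digest is the relative path with separators removed,
    so both "<digest>" and "<prefix>/<rest>" layouts are accepted.
    """
    bad: List[str] = []
    for dirpath, _dirs, files in os.walk(object_dir):
        for name in files:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, object_dir)
            expected = rel.replace(os.sep, "")
            with open(path, "rb") as f:
                actual = hashlib.sha256(f.read()).hexdigest()
            if actual != expected:
                bad.append(rel)
    return sorted(bad)
```

An empty result means every object's name agrees with its content. This is a read-only check and does not replace fsverity.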

Existing layers are scanned for matching objects and reused via hardlink or reflink. An attempt is made to enable fsverity on backing files, but this is best-effort only; there is currently no support for enforced integrity verification.

Layers with or without composefs format can be mixed in the same overlay stack. Layers with a composefs blob are mounted and included in the final overlayfs stack, while layers without composefs format are reused as-is.

## BUGS

https://github.com/containers/storage/issues?q=is%3Aissue+is%3Aopen+label%3Aarea%2Foverlay

## FOOTNOTES
The Containers Storage project is committed to inclusivity, a core value of open source.
The `master` and `slave` mount propagation terminology is used in this repository.
This language is problematic and divisive, and should be changed.
However, these terms are currently used within the Linux kernel and must be used as-is at this time.
When the kernel maintainers rectify this usage, Containers Storage will follow suit immediately.

storage/docs/containers-storage-driver-vfs.md

Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,31 @@
# containers-storage 1 "February 2026"

## NAME
containers-storage-driver-vfs - The VFS storage driver

## DESCRIPTION

The VFS driver copies directories to create layers. No kernel overlay filesystem support is required.

## IMPLEMENTATION

The on-disk file layout is an internal implementation detail and may change between versions. The only stable interface is the Go library API.

Layers are stored under `vfs/dir/`. When creating a layer from a parent, the entire parent directory is copied. The copy uses reflinks (FICLONE) if supported by the filesystem, falling back to regular copying otherwise. The VFS driver works on any filesystem but is storage-inefficient without reflink support.
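
The reflink-with-fallback pattern can be sketched for a single file as follows. This is a Linux-only illustration using the `FICLONE` ioctl directly; it is not the code in `drivers/copy/copy_linux.go`, and the function name is hypothetical.

```python
import fcntl
import os
import shutil

FICLONE = 0x40049409  # Linux ioctl request number for FICLONE


def clone_or_copy(src: str, dst: str) -> str:
    """Copy src to dst, preferring a reflink.

    Returns "reflink" if the filesystem shared the extents, or "copy"
    if the ioctl failed and the data was copied byte for byte.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # Ask the filesystem to make dst share src's data blocks.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            method = "reflink"
        except OSError:
            # Unsupported filesystem, cross-device copy, and so on.
            shutil.copyfileobj(fsrc, fdst)
            method = "copy"
    shutil.copymode(src, dst)
    return method
```

On a filesystem with reflink support such as btrfs or XFS, the clone consumes almost no additional space; elsewhere every layer holds a full copy of its parent's data, which is the inefficiency mentioned above.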

Reference: `drivers/vfs/driver.go`, `drivers/copy/copy_linux.go`

## RUNTIME

There is no mount involved. When a container needs its filesystem, `Get()` simply returns the layer's directory path. All layer merging happened at create time when the parent was copied, so the directory is already a complete filesystem tree. `Put()` is a no-op since there is nothing to unmount.

## BUGS

https://github.com/containers/storage/issues?q=is%3Aissue+is%3Aopen+label%3Aarea%2Fvfs

## FOOTNOTES
The Containers Storage project is committed to inclusivity, a core value of open source.
The `master` and `slave` mount propagation terminology is used in this repository.
This language is problematic and divisive, and should be changed.
However, these terms are currently used within the Linux kernel and must be used as-is at this time.
When the kernel maintainers rectify this usage, Containers Storage will follow suit immediately.

storage/docs/containers-storage-driver-zfs.md

Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,33 @@
# containers-storage 1 "February 2026"

## NAME
containers-storage-driver-zfs - The ZFS storage driver

## DESCRIPTION

The ZFS driver uses ZFS datasets and clones for copy-on-write semantics.

## IMPLEMENTATION

The on-disk file layout is an internal implementation detail and may change between versions. The only stable interface is the Go library API.

Requires `/dev/zfs` and the `zfs` command. Configure the parent dataset via the `zfs.fsname` option.

Layers are stored as datasets under `zfs.fsname` (e.g., `tank/containers/storage/$id`). Mountpoints are at `zfs/graph/`. All datasets use `mountpoint=legacy` so containers-storage controls mounts directly. New root layers are created with `zfs create`. Child layers are created by snapshotting the parent dataset and cloning the snapshot; the snapshot is marked for deferred deletion after cloning.
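
The create, snapshot, clone, and deferred-destroy sequence maps onto `zfs` subcommands roughly as sketched below. The sketch only builds the command lines and does not execute them. The function name and the snapshot naming scheme are hypothetical, and the driver's actual invocations may differ.

```python
from typing import List, Optional


def zfs_layer_commands(fsname: str, layer_id: str,
                       parent_id: Optional[str] = None) -> List[List[str]]:
    """Return zfs(8) command lines equivalent to creating a layer."""
    dataset = f"{fsname}/{layer_id}"
    if parent_id is None:
        # Root layer: a fresh dataset that ZFS will not auto-mount.
        return [["zfs", "create", "-o", "mountpoint=legacy", dataset]]
    # Hypothetical snapshot name; any unique name would do.
    snapshot = f"{fsname}/{parent_id}@{layer_id}"
    return [
        ["zfs", "snapshot", snapshot],
        ["zfs", "clone", "-o", "mountpoint=legacy", snapshot, dataset],
        # -d defers destruction until the clone no longer needs it.
        ["zfs", "destroy", "-d", snapshot],
    ]
```

The deferred destroy matters because a clone depends on its origin snapshot; marking the snapshot for deferred deletion lets ZFS clean it up once the dependent clone is gone.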

Reference: `drivers/zfs/zfs.go`

## RUNTIME

When a container needs its filesystem, the driver performs `mount(2)` with type `zfs` to mount the dataset at a path under `zfs/graph/`. Because all datasets use `mountpoint=legacy`, ZFS does not auto-mount them, so the driver controls when and where each dataset is mounted. A reference counter tracks multiple users of the same mountpoint. On `Put()`, the last reference triggers an unmount.
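
The reference-counting behavior can be modeled in a few lines. In the sketch below, the mount and unmount actions are passed in as callables so that the logic can be exercised without a ZFS pool. The class and method names are hypothetical, and the sketch is not thread-safe.

```python
from collections import Counter
from typing import Callable


class MountRefCounter:
    """Mount on the first get() and unmount on the last put()."""

    def __init__(self, mount: Callable[[str], None],
                 unmount: Callable[[str], None]) -> None:
        self._mount = mount
        self._unmount = unmount
        self._refs: Counter = Counter()

    def get(self, mountpoint: str) -> str:
        if self._refs[mountpoint] == 0:
            self._mount(mountpoint)  # first user performs the mount
        self._refs[mountpoint] += 1
        return mountpoint

    def put(self, mountpoint: str) -> None:
        if self._refs[mountpoint] == 0:
            raise ValueError(f"{mountpoint} is not in use")
        self._refs[mountpoint] -= 1
        if self._refs[mountpoint] == 0:
            self._unmount(mountpoint)  # last user triggers the unmount
```

Two callers that both `get()` the same mountpoint share one mount, and the dataset stays mounted until both have called `put()`.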

## BUGS

https://github.com/containers/storage/issues?q=is%3Aissue+is%3Aopen+label%3Aarea%2Fzfs

## FOOTNOTES
The Containers Storage project is committed to inclusivity, a core value of open source.
The `master` and `slave` mount propagation terminology is used in this repository.
This language is problematic and divisive, and should be changed.
However, these terms are currently used within the Linux kernel and must be used as-is at this time.
When the kernel maintainers rectify this usage, Containers Storage will follow suit immediately.

storage/docs/containers-storage-zstd-chunked.md

Lines changed: 0 additions & 58 deletions
This file was deleted.
