Merged
19 commits
2211cc2
refactor(block): expand ConfigSpace to full virtio-blk layout
kalyazin Apr 24, 2026
bf4571f
refactor(block): add Discard request type and discard segment struct
kalyazin Apr 24, 2026
7833055
feat(block): add discard method to FileEngine using fallocate
kalyazin Apr 24, 2026
2196d22
style(seccomp): fix indentation and trailing whitespace in filter files
kalyazin Apr 24, 2026
ead656c
feat(seccomp): allow fallocate syscall in vmm thread filter
kalyazin Apr 24, 2026
88a1223
feat(block): handle VIRTIO_BLK_T_DISCARD requests
kalyazin Apr 24, 2026
2c8af81
chore(snapshot): fix ConfigSpace restore for VIRTIO_BLK_F_DISCARD
kalyazin Apr 24, 2026
ac34612
feat(block): advertise VIRTIO_BLK_F_DISCARD for non-read-only devices
kalyazin Apr 24, 2026
459864d
doc(block): document VIRTIO_BLK_F_DISCARD discard support
kalyazin Apr 24, 2026
ddecaf5
test(block): add unit tests for VIRTIO_BLK_F_DISCARD
kalyazin Apr 24, 2026
dbe1785
test(block): add pytest integration tests for VIRTIO_BLK_F_DISCARD
kalyazin Apr 24, 2026
cdd3917
refactor(block): extend ConfigSpace with write-zeroes fields
kalyazin May 6, 2026
93b20c4
refactor(block): add WriteZeroes request type and supporting variants
kalyazin May 6, 2026
cea19fe
feat(block): add write_zeroes method to FileEngine using fallocate
kalyazin May 6, 2026
2c033e4
feat(block): handle VIRTIO_BLK_T_WRITE_ZEROES requests
kalyazin May 6, 2026
fb04ebe
feat(block): advertise VIRTIO_BLK_F_WRITE_ZEROES for non-read-only de…
kalyazin May 6, 2026
17d83b9
test(block): add unit tests for VIRTIO_BLK_F_WRITE_ZEROES
kalyazin May 6, 2026
7261d1f
test(block): add pytest integration tests for VIRTIO_BLK_F_WRITE_ZEROES
kalyazin May 6, 2026
8a2503f
doc(block): document VIRTIO_BLK_F_WRITE_ZEROES support
kalyazin May 6, 2026
42 changes: 42 additions & 0 deletions docs/api_requests/block-discard.md
@@ -0,0 +1,42 @@
# Block device discard (TRIM)

Firecracker supports the `VIRTIO_BLK_F_DISCARD` feature, which allows the guest
to issue discard (TRIM) requests to the block device. Discard requests tell the
host that a range of sectors is no longer needed, enabling the host to reclaim
space on sparse or thin-provisioned backing files.

## How it works

For all non-read-only block devices, Firecracker automatically advertises the
`VIRTIO_BLK_F_DISCARD` feature to the guest driver. No API configuration is
required: discard support is always on for writable drives.

When the guest driver issues a `VIRTIO_BLK_T_DISCARD` request, Firecracker calls
`fallocate(2)` with `FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE` on the backing
file for each discard segment. This punches a hole in the file, freeing the
underlying disk blocks without changing the file size.
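
The translation from a discard segment to `fallocate(2)` arguments can be
sketched as follows. This is a minimal illustration, not Firecracker's code:
the function name `discard_to_fallocate_args` is hypothetical, the flag values
are those from `<linux/falloc.h>`, and the 512-byte sector size is the
virtio-blk sector unit.

```python
from typing import Tuple

# Hypothetical sketch: names below are illustrative, not Firecracker's API.
FALLOC_FL_KEEP_SIZE = 0x01   # value from <linux/falloc.h>
FALLOC_FL_PUNCH_HOLE = 0x02  # value from <linux/falloc.h>
SECTOR_SIZE = 512            # virtio-blk expresses sectors in 512-byte units


def discard_to_fallocate_args(sector: int, num_sectors: int) -> Tuple[int, int, int]:
    """Translate one discard segment into (mode, offset, length) for fallocate(2)."""
    mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
    offset = sector * SECTOR_SIZE       # byte offset into the backing file
    length = num_sectors * SECTOR_SIZE  # number of bytes to deallocate
    return mode, offset, length


mode, offset, length = discard_to_fallocate_args(sector=2048, num_sectors=8)
print(hex(mode), offset, length)  # prints: 0x3 1048576 4096
```

Because `FALLOC_FL_KEEP_SIZE` is always set, the apparent size of the backing
file seen by the guest stays the same; only the allocated block count drops.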

Guest tools that trigger discard include:

- `fstrim -v /` — manually trim a mounted filesystem
- `discard` mount option — automatic discard on file deletion
- `blkdiscard /dev/vda` — discard the entire block device

## Host requirements

The backing file must reside on a filesystem and kernel that support
`FALLOC_FL_PUNCH_HOLE`. ext4, xfs, btrfs, and tmpfs all support it on Linux
3.7 and later (tmpfs gained support in 3.5, btrfs in 3.7). On filesystems that
do not support hole-punching, `fallocate` returns `EOPNOTSUPP`. Firecracker
detects this on the first discard, logs a one-time warning, and replies to the
guest with `VIRTIO_BLK_S_UNSUPP`. The Linux virtio-blk driver propagates
`VIRTIO_BLK_S_UNSUPP` through the block layer and stops issuing further discard
requests. If any discard requests still arrive, Firecracker completes them
immediately with `VIRTIO_BLK_S_UNSUPP` without calling `fallocate` again.
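
The one-time fallback described above can be sketched as a latch around the
hole-punching call. This is a hedged illustration only: `DiscardHandler` and
its `punch_hole` callback are hypothetical names standing in for the real
`fallocate(2)` path, and mapping other errors to `VIRTIO_BLK_S_IOERR` is an
assumption. The status values are the ones defined by the virtio spec.

```python
import errno
from typing import Callable

VIRTIO_BLK_S_OK = 0
VIRTIO_BLK_S_IOERR = 1
VIRTIO_BLK_S_UNSUPP = 2


class DiscardHandler:
    """Hypothetical handler that remembers the first EOPNOTSUPP."""

    def __init__(self, punch_hole: Callable[[int, int], None]) -> None:
        self._punch_hole = punch_hole  # stand-in for the fallocate(2) call
        self.unsupported = False       # set once, never cleared
        self.warnings = 0              # counts how often a warning would be logged

    def discard(self, offset: int, length: int) -> int:
        if self.unsupported:
            # Short-circuit: no further fallocate calls after the first failure.
            return VIRTIO_BLK_S_UNSUPP
        try:
            self._punch_hole(offset, length)
        except OSError as err:
            if err.errno == errno.EOPNOTSUPP:
                self.unsupported = True
                self.warnings += 1  # the one-time warning
                return VIRTIO_BLK_S_UNSUPP
            # Assumption: any other failure is reported as an I/O error.
            return VIRTIO_BLK_S_IOERR
        return VIRTIO_BLK_S_OK
```

Injecting the callback keeps the sketch runnable on any filesystem; a real
implementation would call `fallocate` on the backing file descriptor instead.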

## Limitations

- Discard is only available for non-read-only block devices.
- At most one discard segment per request is supported (`max_discard_seg = 1`).
- The discard segment flags field must be zero; non-zero flags are rejected with
an I/O error.
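
The single-segment and zero-flags limits above could be checked along these
lines. This is a sketch under assumptions: `parse_discard_segment` is a
hypothetical helper, and the 16-byte little-endian layout (`sector`,
`num_sectors`, `flags`) follows the virtio-blk discard/write-zeroes segment
structure. How each failure maps to a virtio status is left to the caller.

```python
import struct
from typing import Tuple

SEGMENT_FORMAT = "<QII"  # le64 sector, le32 num_sectors, le32 flags
SEGMENT_SIZE = struct.calcsize(SEGMENT_FORMAT)  # 16 bytes


def parse_discard_segment(payload: bytes) -> Tuple[int, int]:
    """Return (sector, num_sectors) for a request carrying exactly one segment."""
    if len(payload) != SEGMENT_SIZE:
        # max_discard_seg = 1, so anything but one 16-byte segment is invalid.
        raise ValueError("expected exactly one 16-byte discard segment")
    sector, num_sectors, flags = struct.unpack(SEGMENT_FORMAT, payload)
    if flags != 0:
        # The documentation above states non-zero flags are rejected.
        raise ValueError("discard segment flags must be zero")
    return sector, num_sectors
```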
61 changes: 61 additions & 0 deletions docs/api_requests/block-write-zeroes.md
@@ -0,0 +1,61 @@
# Block device write-zeroes

Firecracker supports the `VIRTIO_BLK_F_WRITE_ZEROES` feature, which allows the
guest to ask the device to zero a range of sectors without transferring a buffer
of zeros over the virtqueue. Common consumers are `mkfs` (clearing inode tables
and journals), filesystem snapshots, the initial wipe of an encrypted volume,
and `blkdiscard -z` (the `BLKZEROOUT` ioctl) from userspace.

## How it works

For all non-read-only block devices, Firecracker automatically advertises the
`VIRTIO_BLK_F_WRITE_ZEROES` feature to the guest driver. No API configuration
is required: write-zeroes support is always on for writable drives.

Each `VIRTIO_BLK_T_WRITE_ZEROES` request carries a 16-byte segment with a
`flags` field. Bit 0 (`VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP`) tells the device
whether it may also deallocate the underlying backing-file blocks. Firecracker
advertises `write_zeroes_may_unmap=1`, so guests are free to set this flag.

Firecracker translates the guest's UNMAP bit into a `fallocate(2)` mode on the
backing file:

| UNMAP | fallocate mode | Effect |
|-------|---------------------------------------------|---------------------------------------|
| 0 | `FALLOC_FL_ZERO_RANGE \| FALLOC_FL_KEEP_SIZE` | zeros in place, no deallocation |
| 1 | `FALLOC_FL_PUNCH_HOLE \| FALLOC_FL_KEEP_SIZE` | zeros + deallocate (sparse holes) |

The virtio spec requires that when UNMAP is clear the device MUST NOT
deallocate sectors, so `ZERO_RANGE` is mandatory for that path. When UNMAP
is set the device MAY deallocate, and `PUNCH_HOLE` is sufficient because a
punched range reads back as zeros on every filesystem that supports it.
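
The flag-to-mode mapping in the table can be sketched as a small function.
The name `write_zeroes_mode` is hypothetical and the sketch is not
Firecracker's implementation; the constants are the values from
`<linux/falloc.h>` and the virtio spec.

```python
FALLOC_FL_KEEP_SIZE = 0x01   # value from <linux/falloc.h>
FALLOC_FL_PUNCH_HOLE = 0x02  # value from <linux/falloc.h>
FALLOC_FL_ZERO_RANGE = 0x10  # value from <linux/falloc.h>
VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP = 0x1  # bit 0 of the segment flags


def write_zeroes_mode(flags: int) -> int:
    """Pick the fallocate(2) mode for a write-zeroes segment (hypothetical helper)."""
    if flags & ~VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP:
        # Only bit 0 is defined; reserved bits must be clear.
        raise ValueError("reserved write-zeroes flag bits are set")
    if flags & VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP:
        # UNMAP=1: the device may deallocate, so punch a hole.
        return FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
    # UNMAP=0: zero in place without deallocating.
    return FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE


print(hex(write_zeroes_mode(0)), hex(write_zeroes_mode(1)))  # prints: 0x11 0x3
```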

## Host requirements

The backing file must reside on a filesystem that supports the corresponding
`fallocate` mode:

- `FALLOC_FL_PUNCH_HOLE` (UNMAP=1) is widely supported: ext4, xfs, btrfs, tmpfs.
- `FALLOC_FL_ZERO_RANGE` (UNMAP=0) is supported on ext4, xfs, btrfs; on tmpfs
it requires Linux 6.8+. Other filesystems may not support it.

If `fallocate` returns `EOPNOTSUPP` for either mode, Firecracker logs a one-time
warning and replies with `VIRTIO_BLK_S_UNSUPP`. The Linux virtio-blk driver
propagates that status through the block layer and stops issuing further
write-zeroes requests, so the guest falls back to zeroing with ordinary writes
of zero-filled buffers (`REQ_OP_WRITE`). For the rest of the device's lifetime,
Firecracker completes any write-zeroes requests that still arrive with
`VIRTIO_BLK_S_UNSUPP` immediately, without calling `fallocate` again.

The `EOPNOTSUPP` cache is shared across the UNMAP=0 and UNMAP=1 paths: a single
fallback flag disables both. This is conservative, because a filesystem that
supports `PUNCH_HOLE` but not `ZERO_RANGE` will see UNMAP=1 requests rejected
once an UNMAP=0 request fails. It matches the discard fallback design, however,
and avoids tracking separate per-mode state on the host.

## Limitations

- Write-zeroes is only available for non-read-only block devices.
- At most one segment per request is supported (`max_write_zeroes_seg = 1`).
- Only bit 0 (UNMAP) of the segment flags is allowed; non-zero reserved bits
are rejected with an I/O error.
14 changes: 9 additions & 5 deletions resources/seccomp/aarch64-unknown-linux-musl.json
@@ -42,6 +42,10 @@
{
"syscall": "fsync"
},
+{
+"syscall": "fallocate",
+"comment": "Used by the block device for VIRTIO_BLK_F_DISCARD (FALLOC_FL_PUNCH_HOLE)"
+},
{
"syscall": "close"
},
@@ -110,8 +114,8 @@
"comment": "sigaltstack is used by Rust stdlib to remove alternative signal stack during thread teardown."
},
{
-"syscall": "getrandom",
-"comment": "getrandom is used by aws-lc library which we consume in virtio-rng"
+"syscall": "getrandom",
+"comment": "getrandom is used by aws-lc library which we consume in virtio-rng"
},
{
"syscall": "accept4",
@@ -213,7 +217,7 @@
},
{
"syscall": "madvise",
-"comment": "Used by the VirtIO balloon device and by musl for some customer workloads. It is also used by aws-lc during random number generation. They setup a memory page that mark with MADV_WIPEONFORK to be able to detect forks. They also call it with -1 to see if madvise is supported in certain platforms."
+"comment": "Used by the VirtIO balloon device and by musl for some customer workloads. It is also used by aws-lc during random number generation. They setup a memory page that mark with MADV_WIPEONFORK to be able to detect forks. They also call it with -1 to see if madvise is supported in certain platforms."
},
{
"syscall": "msync",
@@ -544,8 +548,8 @@
"comment": "sigaltstack is used by Rust stdlib to remove alternative signal stack during thread teardown."
},
{
-"syscall": "getrandom",
-"comment": "getrandom is used by `HttpServer` to reinialize `HashMap` after moving to the API thread"
+"syscall": "getrandom",
+"comment": "getrandom is used by `HttpServer` to reinialize `HashMap` after moving to the API thread"
},
{
"syscall": "accept4",
8 changes: 6 additions & 2 deletions resources/seccomp/x86_64-unknown-linux-musl.json
@@ -45,6 +45,10 @@
{
"syscall": "fsync"
},
+{
+"syscall": "fallocate",
+"comment": "Used by the block device for VIRTIO_BLK_F_DISCARD (FALLOC_FL_PUNCH_HOLE)"
+},
{
"syscall": "close"
},
@@ -559,8 +563,8 @@
"comment": "sigaltstack is used by Rust stdlib to remove alternative signal stack during thread teardown."
},
{
-"syscall": "getrandom",
-"comment": "getrandom is used by `HttpServer` to reinialize `HashMap` after moving to the API thread"
+"syscall": "getrandom",
+"comment": "getrandom is used by `HttpServer` to reinialize `HashMap` after moving to the API thread"
},
{
"syscall": "accept4",