feat: reserve disk space for guest OS during template build#2082
Conversation
Add configurable reserved disk space (in MB) for the guest OS via feature flag BuildReservedDiskSpaceMB. Reserved blocks are applied in two places: - Base builder: via tune2fs -r after the final filesystem resize - Finalize layer: via tune2fs inside the sandbox before snapshotting This ensures the guest OS (running as root) always has space available even when non-root processes fill the disk.
- Use RunCommandWithLogger instead of RunCommand for tune2fs to capture stdout/stderr for diagnostics - Add explicit > 0 guard in finalize builder to avoid unnecessary RPC when feature flag is 0 - Add logger parameter to SetReservedDiskSpace
Add unit tests for parseFreeBlocks and parseReservedBlocks functions, and an integration test verifying reserved blocks are set on built templates via tune2fs inside a running sandbox.
Use assert.Positive and assert.Contains instead of assert.Greater and assert.True(strings.Contains(...)).
…ontext Rename SetReservedBlocks to SetReservedBlocksOnHost (runs tune2fs directly on the host rootfs file) and SetReservedDiskSpace to SetReservedBlocksInGuest (runs tune2fs inside the sandbox VM via envd). Both functions cross-reference each other in doc comments.
Explain that reserved blocks are configured post-creation via SetReservedBlocksOnHost (tune2fs -r) based on the BuildReservedDiskSpaceMB feature flag, not at mkfs.ext4 time, because the filesystem undergoes resize operations after creation.
BuildReservedDiskSpaceMB defaults to 0 in the offline LaunchDarkly store and there is no way to inject custom flag values into integration tests, so the test always passes trivially without asserting anything.
mkfs.ext4 >= 1.47.0 enables orphan_file by default [1]. This feature sets an RO_COMPAT flag that causes e2fsprogs < 1.47.0 to reject write operations with 'unsupported read-only feature(s)'. This broke SetReservedBlocksInGuest during the finalize build phase for images like Ubuntu 20.04/22.04 and Debian 11. The build would fail with a generic 'internal error' message. [1] https://e2fsprogs.sourceforge.net/e2fsprogs-release.html#1.47.0
7a8a4b3 to
24893a9
Compare
|
FP with rebased branch to resolve a conflict plus few minor changes on top:
|
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
| "github.com/stretchr/testify/require" | ||
| ) | ||
|
|
||
| func TestParseFreeBlocks(t *testing.T) { |
There was a problem hiding this comment.
I know it's not the point of this PR, but to get free blocks, instead of debugfs and string parsing, we could use stat -fs "%f" /, which just outputs a number.
There was a problem hiding this comment.
does stat -fs work on not mounted filesystem in a file?
There was a problem hiding this comment.
It works on the underlying filesystem, regardless of where it's mounted:
$ stat -fc "%f" /tmp/claude # sub folder of the mounted file system
3922238
$ stat -fc "%f" /tmp # root of the mount, same file system as /tmp/claude
3922238
$ stat -fc "%f" / # different mounted file system
44120469
There was a problem hiding this comment.
Right, but the filesystem is not mounted anywhere (at all), would that work too?
There was a problem hiding this comment.
The debugfs allows to execute certain action on not mounted filesystem. If you check the usage, we're passing just a file containing the ext4 fs there
|
added FF in launch darkly for it |
| } | ||
|
|
||
| // Set reserved disk space for the guest OS before syncing | ||
| if reservedDiskSpaceMB := int64(ppb.featureFlags.IntFlag(ctx, featureflags.BuildReservedDiskSpaceMB)); reservedDiskSpaceMB > 0 { |
There was a problem hiding this comment.
it would be better to move this to the place where the envd is updated -> that way old templates would have it already for the build, not only for the sandbox
it's just a different place where to put this code
There was a problem hiding this comment.
so in here in layer_executor?
// Update envd binary to the latest version
if cmd.UpdateEnvd {
...
}
There was a problem hiding this comment.
(it can be done in a follow up)
There was a problem hiding this comment.
merging as is then making PR for this
…ble slice Replace the single comma-separated string literal for mkfs.ext4 -O with strings.Join over a slice, so each ext4 feature flag can be commented individually. Flags are sorted alphabetically within the main group, with ^orphan_file separated at the end as a guest-compatibility workaround. Original comments describing the feature set rationale (tar2ext4 matching, kept defaults, orphan_file compat) are preserved in place. No functional change - produces the identical argument string at runtime. Addresses review feedback from #2082. Link: #2082 (comment)
…ble slice (#2183) Replace the single comma-separated string literal for mkfs.ext4 -O with strings.Join over a slice, so each ext4 feature flag can be commented individually. Flags are sorted alphabetically within the main group, with ^orphan_file separated at the end as a guest-compatibility workaround. Original comments describing the feature set rationale (tar2ext4 matching, kept defaults, orphan_file compat) are preserved in place. No functional change - produces the identical argument string at runtime. Addresses review feedback from #2082. Link: #2082 (comment)
Production orchestrators run e2fsprogs < 1.47.0 which does not recognize the orphan_file feature. Passing ^orphan_file to mkfs.ext4 on these older versions causes an "Invalid filesystem option set" error because the feature is entirely unknown, not just unsupported. Since orphan_file is only a default in e2fsprogs >= 1.47.0. Fixes: 5adcb7a ("feat: reserve disk space for guest OS during template build (#2082)")
) Production orchestrators run e2fsprogs < 1.47.0 which does not recognize the orphan_file feature. Passing ^orphan_file to mkfs.ext4 on these older versions causes an "Invalid filesystem option set" error because the feature is entirely unknown, not just unsupported. Since orphan_file is only a default in e2fsprogs >= 1.47.0. Fixes: 5adcb7a ("feat: reserve disk space for guest OS during template build (#2082)")
Summary
Add configurable reserved disk space (MB) for the guest OS via feature flag
BuildReservedDiskSpaceMB. Reserved blocks are set in two places to cover fresh and cached builds:tune2fs -rafter final filesystem resizetune2fs -r /dev/vdabefore snapshotThis ensures the guest OS (running as root) always has available space, protecting systemd and other critical services from disk-full conditions even when non-root processes fill the disk.
Default value is 0 (no reserved space), preserving current behavior.
Note
Medium Risk
Touches template build/finalize filesystem provisioning and relies on running
tune2fsagainst the root device, so misconfiguration or missing tooling could cause build failures or unexpected disk accounting; behavior is gated by a default-off feature flag.Overview
Adds a new feature flag (
BuildReservedDiskSpaceMB) to reserve an amount of ext4 space for root during template builds, applyingtune2fs -rboth on the host-built rootfs after resizing and again inside the finalize sandbox before snapshotting; includes unit tests for parsing ext4 metadata output and an integration test that validates the reserved block count viatune2fsin a sandbox created from the built template.Written by Cursor Bugbot for commit 7a8a4b3. This will update automatically on new commits. Configure here.