Use memfd to track sandbox memory#2522
Conversation
PR SummaryMedium Risk Overview Reviewed by Cursor Bugbot for commit 18ee751. Bugbot is set up for automated code reviews on this repo. Configure here. |
6d2b804 to
314abd0
Compare
1ab27f8 to
4f0b47b
Compare
❌ 4 Tests Failed:
View the full list of 6 ❄️ flaky test(s)
To view more test analytics, go to the Test Analytics Dashboard |
4ae4ebb to
9198a96
Compare
|
Update: I've removed the logic that punches holes in the memfd, progressively after copying data into the diff file. I've ran some experiments and got some signal about this causing increase in CPU utilization and slowing down PAUSE and RESUMEs. I think that we can proceed with adding support for memfd and revisiting after the deduplication work. |
ⓘ You've reached your Qodo monthly free-tier limit. Reviews pause until next month — upgrade your plan to continue now, or link your paid account if you already have one. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bed7ac7e41
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
328d277 to
ad966b6
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ad966b6c63
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e5aedb7b33
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Introduce a Cacher interface which abstracts the memory cache implementation, as seen by the diff/upload layer. Currently, it's only the Cache type that implements that. This is preparation for introducing a second cache type which is backed by the memfd used to map guest memory. Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
We are changing Firecracker to, optionally, back the guest memory using a memfd object. When enabled, Firecracker passes over the memfd file descriptor over the UFFD UDS, alongside the UFFD file descriptor, using SCM_RIGHTS. Change the UFFD serve logic to also parse the memfd file descriptor. When present, wrap the descriptor in a Memfd object. The object itself provides an interface that lets users access the guest memory from the memfd. UFFD logic exposes the Memfd object over a newly added method of the MemoryBackend interface, called Memfd(). The noop memory backend always returns nil for now, as Firecracker might only use memfd when resuming from a snapshot. Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
Change the ExportMemory() logic to export the memory via a MemfdCache when Firecracker has sent us a memfd file descriptor. Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
Add unit tests for memfd and MemfdCache functionality. Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
Add a feature flag that controls whether the orchestrator will instruct Firecracker to use memfd for backing the guest memory. Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
Trim verbose comments and drop trivial tests so the PR is easier to review. The behavioral changes are: - copyFromMemfd uses a fixed 2 MiB chunk (memfdCopyChunkSize) matching the source hugepage size, decoupled from cache.blockSize (which remains the dirty-tracking unit). - NewCacheFromMemfd no longer logs-and-swallows the memfd close error; it returns it. The size==0 fast path drops; the loop is a no-op when ranges sum to 0. - UseMemFdFlag comment matches the mmap-based implementation.
9c553f4 to
66be8a2
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 2bbbf3a. Configure here.
NewCacheFromMemfd already closes the memfd on every error path; calling memfd.Close() again here is a (harmless) double-close that muddies ownership semantics.
Trim the diff further: - MemfdCache embeds *Cache, dropping six delegate methods and the custom Close (Cache.Close handles the file; the memfd is already closed before the wrapper is returned). - copyFromMemfd performs one copy per range instead of a chunked inner loop; the memfdCopyChunkSize constant goes away. Cancellation checks move to range boundaries. - Memfd.Slice uses a simple nil-check instead of sync.Once for the lazy mmap; no concurrent callers in either the sync or the upcoming async path. - Drop TestMemfdCache_DirtyBitmap (covered by MultipleRanges + the BitsetRanges adapter is exercised elsewhere) and the chunk-boundary test (no chunked loop anymore). Consolidate SliceOutOfBounds.
NewFromFd now mmaps the fd eagerly via fstat and returns (*Memfd, error). The size field on Memfd goes away (use len(mmap)), the lazy-init nil-check in Slice goes away, and the explicit size computation in uffd.go is dropped — the kernel already knows the size. Also standardize on golang.org/x/sys/unix (drop the mixed syscall import).
Memfd has a single owner across all paths: NewCacheFromMemfd consumes it during construction, and the UFFD handshake transfers ownership via atomic Swap. There is no path that calls Close twice on the same Memfd, so the m.mmap=nil / m.fd=-1 / nil-check sentinels are dead weight. Drop them and document the single-use contract.
Instead of materializing a []Range in the caller just to iterate it once inside NewCacheFromMemfd, take the dirty *roaring.Bitmap and the block size directly. Total cache size comes from cardinality * blockSize, and copyFromMemfd iterates BitsetRanges in place. Drops the now-thin exportMemoryFromMemfd helper in fc/memory.go.
The gRPC handler already injects sandbox/team/template contexts via ctx, so team and template targeting for UseMemFdFlag already worked. Add a sandbox-type attribute (sandbox vs build) and pass the explicit sandboxLDContext to BoolFlag so flags can roll out to production sandboxes separately from template-builds.
487ed30 to
23f5d96
Compare
Cacher was a vague -er name for an interface with seven methods. The type exists purely so the diff/upload layer can accept either *Cache or *MemfdCache; DiffSource names that role.
- copyFromMemfd uses ctx.Err() instead of the select/default form. - pauseProcessMemory runs ExportMemory before ToDiffHeader so the memfd is owned by ExportMemory throughout; the conditional close on ToDiffHeader failure goes away. - fc.ExportMemory returns NewCacheFromMemfd directly; the "create MemfdCache" wrap is redundant with the inner error context.
PR 2522 introduced MemfdCache (wrapper around *Cache) and the DiffSource interface purely so the async-copy follow-up could attach extra state without churning callers. PR 2522 itself never uses the indirection — NewCacheFromMemfd returns *Cache, fc.ExportMemory returns *block.Cache, localDiff takes *block.Cache. Drop the scaffolding from this PR; the async PR introduces its own wrapper when it needs the override behavior. Also: - Inline the copyFromMemfd loop into NewCacheFromMemfd (single use). - Trim tests to the two that earn their keep: non-adjacent blocks and the non-zero range-start regression. - Tighten the use_memfd/firecracker-fd comments. - Loop over fds for cleanup instead of two manual closes.
FC < 1.14 rejects the use_memfd field on snapshot load (deny_unknown_fields on MemoryBackend), so combining FCSupportsMemfd(version) with the flag avoids hard-failing resumes when the flag is flipped on across a heterogeneous fleet.

What
In Unix OSs
memfdis an anonymous file that can be used to back memory. Firecracker uses this construct when it needs to share memory with external processes (currently, when using vhost-user devices).Currently, when we take a snapshot of the sandbox (for example, during
PAUSEoperations) we need to copy its memory usingprocess_vm_readv.memfdallows us to do this in a more idiomatic way.Why
memfdallows us to have a direct view of the sandbox memory from the orchestrator without having to copy memory across processes. Moreover, if the orchestrator holds a reference tomemfd, we can post process the sandbox memory after the Firecracker process is killed. This opens up possibilities for various latency and memory utilization optimizations.What we do in this PR is that we change the cache logic to use memfd to copy Firecracker memory into the diff file if the memfd is present.