
virtio-pmem: fixes and improvements#5789

Open
ShadowCurse wants to merge 7 commits into firecracker-microvm:main from
ShadowCurse:pmem_fixes

Conversation

Contributor

@ShadowCurse ShadowCurse commented Mar 24, 2026

Changes

  • Validate that the descriptor `len` field is 4, since the code expects this
  • Cache the `msync` result so that multiple flush requests submitted at once are served by a single `msync` call
  • Add rate-limiter support
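The descriptor-length validation from the first bullet might look roughly like this. This is a minimal sketch with simplified, hypothetical types (`Descriptor`, `PmemError`); Firecracker's actual descriptor-chain handling differs.

```rust
// The virtio-pmem request and status descriptors are expected to be
// exactly 4 bytes, so anything else is rejected up front.
const EXPECTED_DESCRIPTOR_LEN: u32 = 4;

#[derive(Debug, PartialEq)]
enum PmemError {
    InvalidDescriptorLen(u32),
}

// Simplified stand-in for a virtio descriptor.
struct Descriptor {
    len: u32,
}

fn validate_descriptor(desc: &Descriptor) -> Result<(), PmemError> {
    if desc.len != EXPECTED_DESCRIPTOR_LEN {
        return Err(PmemError::InvalidDescriptorLen(desc.len));
    }
    Ok(())
}
```

Validation happens before any request is processed, so a malformed chain fails fast with a descriptive error instead of being misinterpreted later.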

Reason

Edge case handling and addition of missing features

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkbuild --all to verify that the PR passes
    build checks on all supported architectures.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

@codecov

codecov bot commented Mar 24, 2026

Codecov Report

❌ Patch coverage is 39.70588% with 82 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.88%. Comparing base (054b647) to head (90f29cf).

Files with missing lines Patch % Lines
src/vmm/src/devices/virtio/pmem/device.rs 46.15% 35 Missing ⚠️
src/vmm/src/rpc_interface.rs 0.00% 18 Missing ⚠️
src/vmm/src/lib.rs 0.00% 12 Missing ⚠️
src/firecracker/src/api_server/request/pmem.rs 60.86% 9 Missing ⚠️
src/vmm/src/devices/virtio/pmem/event_handler.rs 0.00% 8 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5789      +/-   ##
==========================================
- Coverage   83.07%   82.88%   -0.20%     
==========================================
  Files         275      275              
  Lines       29462    29586     +124     
==========================================
+ Hits        24476    24522      +46     
- Misses       4986     5064      +78     
Flag Coverage Δ
5.10-m5n.metal 83.19% <39.70%> (-0.23%) ⬇️
5.10-m6a.metal 82.52% <39.70%> (-0.23%) ⬇️
5.10-m6g.metal 79.77% <39.70%> (-0.21%) ⬇️
5.10-m6i.metal 83.19% <39.70%> (-0.22%) ⬇️
5.10-m7a.metal-48xl 82.50% <39.70%> (-0.23%) ⬇️
5.10-m7g.metal 79.78% <39.70%> (-0.21%) ⬇️
5.10-m7i.metal-24xl 83.16% <39.70%> (-0.23%) ⬇️
5.10-m7i.metal-48xl 83.16% <39.70%> (-0.23%) ⬇️
5.10-m8g.metal-24xl 79.77% <39.70%> (-0.21%) ⬇️
5.10-m8g.metal-48xl 79.77% <39.70%> (-0.21%) ⬇️
5.10-m8i.metal-48xl 83.16% <39.70%> (-0.22%) ⬇️
5.10-m8i.metal-96xl 83.17% <39.70%> (-0.22%) ⬇️
6.1-m5n.metal 83.22% <39.70%> (-0.21%) ⬇️
6.1-m6a.metal 82.54% <39.70%> (-0.24%) ⬇️
6.1-m6g.metal 79.77% <39.70%> (-0.21%) ⬇️
6.1-m6i.metal 83.21% <39.70%> (-0.22%) ⬇️
6.1-m7a.metal-48xl 82.53% <39.70%> (-0.22%) ⬇️
6.1-m7g.metal 79.77% <39.70%> (-0.21%) ⬇️
6.1-m7i.metal-24xl 83.23% <39.70%> (-0.22%) ⬇️
6.1-m7i.metal-48xl 83.23% <39.70%> (-0.22%) ⬇️
6.1-m8g.metal-24xl 79.77% <39.70%> (-0.21%) ⬇️
6.1-m8g.metal-48xl 79.77% <39.70%> (-0.21%) ⬇️
6.1-m8i.metal-48xl 83.23% <39.70%> (-0.22%) ⬇️
6.1-m8i.metal-96xl 83.23% <39.70%> (-0.22%) ⬇️

@ShadowCurse ShadowCurse marked this pull request as ready for review March 24, 2026 17:15
@ShadowCurse ShadowCurse added Status: Awaiting review Indicates that a pull request is ready to be reviewed Type: Enhancement Indicates new feature requests Type: Fix Indicates a fix to existing code labels Mar 24, 2026
@ShadowCurse ShadowCurse force-pushed the pmem_fixes branch 2 times, most recently from 46cccd5 to 813efa7 on March 25, 2026 10:25
@ShadowCurse ShadowCurse force-pushed the pmem_fixes branch 2 times, most recently from d140771 to e48ab2a on March 26, 2026 09:41
@ShadowCurse ShadowCurse changed the title from "Pmem fixes" to "virtio-pmem: fixes and improvements" Mar 27, 2026
@ShadowCurse ShadowCurse force-pushed the pmem_fixes branch 5 times, most recently from e4ccee4 to 24441ed on March 31, 2026 11:08
docs/pmem.md Outdated
available for block devices. It throttles flush requests using two token
buckets:

- `bandwidth` — limits the total bytes flushed per refill interval. Each flush
Contributor

the very first sentence is incorrect as we don't know how much each msync is going to flush. We should make it more clear that it applies the full file size at all times. Maybe we should start with that or not have a bw limiter at all for this.

Maybe more clear would be: "bandwidth — consumes tokens equal to the full backing file size for each msync call, effectively limiting how often dirty pages can be flushed to disk."
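The semantics suggested here can be sketched with a bare-bones token bucket: each `msync` pays for the full backing-file size, regardless of how many pages are actually dirty. Names and the bucket itself are illustrative stand-ins, not Firecracker's actual `RateLimiter`.

```rust
// Minimal token bucket: `tokens` is the byte budget remaining in the
// current refill interval.
struct TokenBucket {
    tokens: u64,
}

impl TokenBucket {
    // Try to pay for one msync over the whole backing file; refuse once
    // the budget for this refill interval is spent.
    fn try_consume(&mut self, file_len: u64) -> bool {
        if self.tokens >= file_len {
            self.tokens -= file_len;
            true
        } else {
            false
        }
    }
}
```

With a budget of twice the file size per interval, at most two flushes go through before the guest is throttled until the next refill — the bucket effectively limits flush frequency, not actual bytes written.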

Contributor Author

Updated the first sentence a bit.

Contributor

what value does bandwidth actually add to the users (compared to ops) if it's imprecise? How would users know how to configure it properly?

Contributor

ok, I saw the guidance we give in the doc, but still not entirely convinced bandwidth adds a value

Contributor Author

our default rate-limiter configuration scheme gives the ability to set both (e.g. we use RateLimiterConfig everywhere), so at least for consistency it makes sense. Also, I think bandwidth maybe makes more sense, since figuring out how many requests the guest sends (for any device at this point) is a questionable business, but bandwidth is something people are more used to thinking about.

Contributor

Yes, uniformity here is an argument for me. Users can choose whatever they are more comfortable with.


// Rate limit: consume 1 op and file_len bytes for the coalesced msync.
// If the rate limiter is blocked, defer processing until the timer fires.
if !self.rate_limiter.consume(1, TokenType::Ops) {
Contributor

we're consuming tokens before even knowing if there's an item in the queue to process

Contributor Author

Considering there is only one valid type of request, I think this is fine. Valid guests will only notify if there is something for us to process.

Contributor

FC notifies on resume irrespective of whether there's a request in the queue or not

Contributor Author

ok, moved these checks past the queue processing step, before the notification.
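The reordering agreed on here can be sketched as: pop a request before spending any rate-limiter tokens, so a spurious notification (e.g. on resume) costs nothing. Types are simplified stand-ins, not Firecracker's actual queue or `RateLimiter`.

```rust
// Simplified ops-only limiter for illustration.
struct RateLimiter {
    ops: u64,
}

impl RateLimiter {
    fn consume_op(&mut self) -> bool {
        if self.ops > 0 {
            self.ops -= 1;
            true
        } else {
            false
        }
    }
}

/// Returns true if a request was processed (and the guest would be notified).
fn handle_queue_event(queue: &mut Vec<u32>, limiter: &mut RateLimiter) -> bool {
    // An empty queue means the notification was spurious: return before
    // touching the limiter at all.
    let Some(req) = queue.pop() else {
        return false;
    };
    // Only now pay for the work; if throttled, put the request back and
    // wait for the limiter's timer event to retry.
    if !limiter.consume_op() {
        queue.push(req);
        return false;
    }
    // ... perform the msync for `req` and signal the used ring here ...
    true
}
```

This way a resume-time kick against an empty queue leaves the token budget untouched.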

@ShadowCurse ShadowCurse force-pushed the pmem_fixes branch 7 times, most recently from a0aae35 to a9a3d01 on April 7, 2026 13:19
- name: body
in: body
description: Pmem rate limiter properties
required: true
Contributor

does it mean that users can't remove the rate limiter later on? Other devices seem to allow that.

Contributor Author

What do you mean? The description here seems to be the same as for other devices. The pmem logic for updating the rate-limiter is the same as for other devices.

Contributor

What I mean is, for example, block doesn't make rate limiter required:

  PartialDrive:
    type: object
    required:
      - drive_id
    properties:
      drive_id:
        type: string
      path_on_host:
        type: string
        description:
          Host level path for the guest drive.
          This field is optional for virtio-block config and should be omitted for vhost-user-block configuration.
      rate_limiter:
        $ref: "#/definitions/RateLimiter"

while pmem does:

  PartialPmem:
    type: object
    description:
      Defines a partial pmem device structure, used to update the rate limiter
      for that device, after microvm start.
    required:
      - id
      - rate_limiter <==== here
    properties:
      id:
        type: string
      rate_limiter:
        $ref: "#/definitions/RateLimiter"

Contributor Author

oh, this is simply because the rate-limiter is the only thing that can be changed. There is no reason to PATCH if not to change the rate-limiter. For other devices I was looking at the ...UpdateConfig types, and since there are always a couple of things that can change there, it makes sense to make them optional. Do you think we should allow empty PATCH requests for pmem?

Contributor

I'm just thinking it could be the only way to disable the rate limiter later on. Not sure if it's a common use case though. Would probably make sense to keep it on par with other devices in that sense.

Contributor Author

ok, made it optional

}
}

pub(crate) fn parse_patch_pmem(
Contributor

@kalyazin kalyazin Apr 9, 2026

we seem to have unittests for parse_put_pmem, but not for parse_patch_pmem

Contributor Author

fixed

CHANGELOG.md Outdated
information can be found in the
[docs](docs/vsock.md/#unix-domain-socket-renaming).
- [#5789](https://github.com/firecracker-microvm/firecracker/pull/5789): Add
rate-limiter support to virtio-pmem devie to allow control over I/O bandwidth
Contributor

s/devie/device/

Contributor Author

fixed

@ShadowCurse ShadowCurse requested a review from micz010 as a code owner April 10, 2026 09:42
@ShadowCurse ShadowCurse force-pushed the pmem_fixes branch 2 times, most recently from 2c56730 to eb02623 on April 10, 2026 10:55
Head and status descriptors must be 4 bytes long per the spec.
Add validation for this.

Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
There is only one type of request virtio-pmem accepts, but the guest can
still issue many of them, and Firecracker needs to process them all.
Instead of doing one `msync` per request, it is less resource intensive
to do it once for the first valid descriptor and then duplicate the
result to the other descriptors. This is safe since the guest only
learns the result of the execution after Firecracker signals it, which
only happens after all descriptors are processed.

Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
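The coalescing described in the commit message above can be sketched as follows. `process_flush_batch` and its callback are hypothetical stand-ins (the real device calls `libc::msync` and walks descriptor chains); only the caching pattern is what the commit describes.

```rust
// Run the (expensive) flush once for the first request in the batch and
// reuse the cached result for every other request. The guest cannot
// observe the difference because it is only signalled after the whole
// batch is processed.
fn process_flush_batch(requests: &[u32], msync: impl Fn() -> i32) -> Vec<i32> {
    let mut cached: Option<i32> = None;
    requests
        .iter()
        .map(|_req| *cached.get_or_insert_with(|| msync()))
        .collect()
}
```

`Option::get_or_insert_with` invokes the closure only on the first request, so a batch of N flushes triggers exactly one `msync`.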
Add rate-limiter support to the virtio-pmem device to allow users to
configure limits on the I/O bandwidth generated by the device's `msync`
calls, which are triggered by guest FLUSH requests.

Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Add a section in the pmem.md file describing a way of
limiting I/O usage of `msync` calls from a virtio-pmem device.

Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Mention new rate-limiter API for the virtio-pmem device in the
CHANGELOG.md

Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Fix incorrect NOTE section formatting in memory usage
section

Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Update device-api.md with rate-limiter for virtio-pmem.

Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>