
serial: Add TX backpressure to prevent guest soft lockup #5865

Closed

JackThomson2 wants to merge 2 commits into firecracker-microvm:main
from JackThomson2:fix/serial_flooding


Conversation

@JackThomson2
Contributor

Changes

...

Reason

...

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkbuild --all to verify that the PR passes
    build checks on all supported architectures.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

A guest tight loop writing to /dev/ttyS0 (e.g. `cat /dev/zero >
/dev/ttyS0`) saturates the vCPU thread in MMIO-exit handling and starves
the other vCPU, producing soft-lockup, workqueue-lockup and RCU-stall
warnings such as:

    BUG: workqueue lockup - pool cpus=0-1 ... stuck for 38s!
    watchdog: BUG: soft lockup - CPU#1 stuck for 86s! [swapper/1:0]
    rcu: INFO: rcu_preempt detected stalls on CPUs/tasks

The underlying vm-superio Serial keeps LSR_THR_EMPTY|LSR_IDLE set
unconditionally (the upstream comment is "we should always be ready to
receive more data"), so the guest 8250 driver never throttles and writes
each byte through MMIO. The vCPU thread then performs a 1-byte
write(2)+flush+eventfd-write per byte; with Firecracker's O_NONBLOCK
stdout the host-side pipe buffer no longer paces it either.

Wrap a soft TX FIFO in front of vm-superio:

  - Guest data writes are pushed onto a bounded VecDeque (no syscall on
    the vCPU thread) and a TimerFd is armed for one drain interval.
  - LSR reads mask THR_EMPTY|IDLE while the queue is at the modelled
    FIFO depth, so the guest sees a busy port and waits.
  - A drain runs on the event-manager thread (event subscriber on the
    timerfd), pops up to one FIFO per tick, and feeds each byte through
    the regular `Serial::write(DATA, b)` path (write+flush+IRQ raise).
  - Loopback writes bypass the queue so vm-superio's synchronous
    RX-FIFO routing and RDA interrupts stay in sync with the guest.
  - Overrun beyond `TX_QUEUE_CAPACITY` drops bytes via the existing
    `tx_lost_byte` event, matching real-hardware FIFO overrun.
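
The queueing and LSR-masking behaviour described above can be sketched as
follows. This is a minimal illustration, not the PR's actual code: the
names `TxFifo`, `FIFO_DEPTH`, `mask_lsr` and `drain_one_fifo` are
hypothetical, and the capacity, FIFO depth and LSR bit values are
assumptions based on the description (LSR bit positions follow the
standard 16550 layout).

```rust
// Hypothetical sketch of the bounded soft TX queue; not the actual
// Firecracker implementation.
use std::collections::VecDeque;

const TX_QUEUE_CAPACITY: usize = 64 * 1024; // assumed soft bound (64 KiB)
const FIFO_DEPTH: usize = 64;               // assumed modelled FIFO depth
const LSR_THR_EMPTY: u8 = 0x20;             // 16550 THR-empty bit
const LSR_IDLE: u8 = 0x40;                  // 16550 transmitter-idle bit

struct TxFifo {
    queue: VecDeque<u8>,
    lost_bytes: u64, // counterpart of the `tx_lost_byte` event
}

impl TxFifo {
    fn new() -> Self {
        TxFifo { queue: VecDeque::new(), lost_bytes: 0 }
    }

    /// Guest data-register write: enqueue only, no syscall on the
    /// vCPU thread. Overrun beyond the capacity drops the byte,
    /// matching real-hardware FIFO overrun.
    fn push(&mut self, byte: u8) {
        if self.queue.len() >= TX_QUEUE_CAPACITY {
            self.lost_bytes += 1;
        } else {
            self.queue.push_back(byte);
        }
    }

    /// Mask THR_EMPTY|IDLE while the queue is at the modelled FIFO
    /// depth, so the guest 8250 driver sees a busy port and throttles.
    fn mask_lsr(&self, lsr: u8) -> u8 {
        if self.queue.len() >= FIFO_DEPTH {
            lsr & !(LSR_THR_EMPTY | LSR_IDLE)
        } else {
            lsr
        }
    }

    /// Drain tick: pop up to one FIFO's worth of bytes, to be fed
    /// through the regular `Serial::write(DATA, b)` path on the
    /// event-manager thread.
    fn drain_one_fifo(&mut self) -> Vec<u8> {
        let n = self.queue.len().min(FIFO_DEPTH);
        self.queue.drain(..n).collect()
    }
}
```

The key property is that the vCPU thread only touches the in-memory
queue; the write/flush/IRQ work happens later, on the drain tick.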

Empirically a 60-second `cat /dev/zero > /dev/ttyS0` produces zero
soft-lockup messages; the flooding vCPU's host-side stime drops from
~76% to ~5%.

Signed-off-by: Jack Thomson <jackabt@amazon.com>

The soft TX FIFO added in the previous commit holds bytes the guest has
written to the data register but the host hasn't yet drained. With the
queue not part of `SerialState`, snapshot/restore (and live migration)
silently drops up to 64 KiB of pending console output — observable as
missing bytes in the destination's serial.log when a snapshot is taken
mid-burst.

Persist the queue:

  - Add `SerialState::tx_queue: Vec<u8>` with `#[serde(default)]`, so
    older snapshots restore as an empty queue and round-trip cleanly.
  - Bump `SNAPSHOT_VERSION` from 10.0.0 to 10.1.0. The format is a
    strict superset of v10.0.0; older Firecrackers reject the new
    snapshot via the existing minor-version gate in `Snapshot::load`.
  - Add `SerialWrapper::tx_queue_snapshot` and `restore_tx_queue` so
    `DeviceManager::serial_state` and the x86 / aarch64 restore paths
    can move the bytes through the persistence layer.
  - Truncate on restore to the live `TX_QUEUE_CAPACITY` to keep the
    runtime invariant `tx_queue.len() <= TX_QUEUE_CAPACITY` even if a
    snapshot was taken with a larger cap. Re-arm the drain timer if
    the restored queue is non-empty so bytes start flowing immediately
    on the destination side.
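
The restore-side handling can be sketched as below. This is an
illustrative reduction under stated assumptions: `SerialState` here
carries only the new field (the real struct has more), `restore_tx_queue`
is a hypothetical free function standing in for the PR's
`SerialWrapper::restore_tx_queue`, and the capacity value is assumed.

```rust
// Hypothetical sketch of truncate-on-restore plus timer re-arm; not
// the actual Firecracker persistence code.
use std::collections::VecDeque;

const TX_QUEUE_CAPACITY: usize = 64 * 1024; // assumed live capacity

/// Serialized form. In the real change this field carries
/// `#[serde(default)]`, so v10.0.0 snapshots (which lack it)
/// restore as an empty queue.
struct SerialState {
    tx_queue: Vec<u8>,
}

/// Rebuild the runtime queue from a snapshot and report whether the
/// drain timer must be re-armed so bytes start flowing immediately
/// on the destination side.
fn restore_tx_queue(state: &SerialState) -> (VecDeque<u8>, bool) {
    let mut bytes = state.tx_queue.clone();
    // Keep the runtime invariant len <= TX_QUEUE_CAPACITY even if
    // the snapshot was taken with a larger cap.
    bytes.truncate(TX_QUEUE_CAPACITY);
    let rearm = !bytes.is_empty();
    (bytes.into(), rearm)
}
```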

Signed-off-by: Jack Thomson <jackabt@amazon.com>
@codecov

codecov Bot commented May 5, 2026

Codecov Report

❌ Patch coverage is 78.94737% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.80%. Comparing base (e7e0efe) to head (c9bf18c).

Files with missing lines                   Patch %   Lines
src/vmm/src/devices/legacy/serial.rs       83.14%    15 Missing ⚠️
src/vmm/src/device_manager/persist.rs      14.28%    6 Missing ⚠️
src/vmm/src/device_manager/mod.rs          83.33%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5865      +/-   ##
==========================================
+ Coverage   82.62%   82.80%   +0.17%     
==========================================
  Files         275      277       +2     
  Lines       29750    29989     +239     
==========================================
+ Hits        24581    24831     +250     
+ Misses       5169     5158      -11     
Flag Coverage Δ
5.10-m5n.metal 83.12% <85.71%> (?)
5.10-m6a.metal 82.45% <85.71%> (?)
5.10-m6g.metal 79.78% <74.74%> (+0.04%) ⬆️
5.10-m6i.metal 83.11% <85.71%> (?)
5.10-m7a.metal-48xl 82.44% <85.71%> (?)
5.10-m7g.metal 79.78% <74.74%> (+0.04%) ⬆️
5.10-m7i.metal-24xl 83.08% <85.71%> (?)
5.10-m7i.metal-48xl 83.08% <85.71%> (?)
5.10-m8g.metal-24xl 79.78% <74.74%> (+0.04%) ⬆️
5.10-m8g.metal-48xl 79.78% <74.74%> (+0.04%) ⬆️
5.10-m8i.metal-48xl 83.09% <85.71%> (?)
5.10-m8i.metal-96xl 83.09% <85.71%> (?)
6.1-m5n.metal 83.14% <85.71%> (?)
6.1-m6a.metal 82.47% <85.71%> (?)
6.1-m6g.metal 79.78% <74.74%> (+0.04%) ⬆️
6.1-m6i.metal 83.13% <85.71%> (+0.01%) ⬆️
6.1-m7a.metal-48xl 82.46% <85.71%> (?)
6.1-m7g.metal 79.78% <74.74%> (+0.04%) ⬆️
6.1-m7i.metal-24xl 83.15% <85.71%> (?)
6.1-m7i.metal-48xl 83.15% <85.71%> (?)
6.1-m8g.metal-24xl 79.78% <74.74%> (+0.05%) ⬆️
6.1-m8g.metal-48xl 79.78% <74.74%> (+0.04%) ⬆️
6.1-m8i.metal-48xl 83.16% <85.71%> (?)
6.1-m8i.metal-96xl 83.15% <85.71%> (?)

Flags with carried forward coverage won't be shown.
