
Commit 5c92229

Merge branch 'main' into vmm-parallel
2 parents 12ef574 + d181eda commit 5c92229

27 files changed

Lines changed: 1335 additions & 348 deletions


CHANGELOG.md

Lines changed: 21 additions & 0 deletions
@@ -50,6 +50,27 @@ and this project adheres to
   balloon statistics descriptor length to prevent a guest-controlled oversized
   descriptor from temporarily stalling the VMM event loop. Only affects microVMs
   with `stats_polling_interval_s > 0`.
+- [#5809](https://github.com/firecracker-microvm/firecracker/pull/5809): Fixed a
+  bug on host Linux >= 5.16 for x86_64 guests using the `kvm-clock` clock source
+  causing the monotonic clock to jump on restore by the wall-clock time elapsed
+  since the snapshot was taken. Users using `kvm-clock` that want to explicitly
+  advance the clock with `KVM_CLOCK_REALTIME` can opt back in using the new
+  `clock_realtime` flag in `LoadSnapshot` API.
+- [#5738](https://github.com/firecracker-microvm/firecracker/pull/5738): Fixed
+  x86_64 snapshot serialization to cover the full KVM custom MSR range
+  (0x4b564d00-0x4b564dff) instead of a small subset. Previously, some KVM MSRs
+  such as MSR_KVM_ASYNC_PF_INT and MSR_KVM_ASYNC_PF_ACK were missing from
+  snapshots, which could cause issues on restore.
+- [#5818](https://github.com/firecracker-microvm/firecracker/pull/5818): Enforce
+  the virtio device initialization sequence in the PCI transport, matching the
+  existing MMIO transport behavior. The PCI transport now validates device
+  status transitions, rejects queue configuration writes outside the FEATURES_OK
+  to DRIVER_OK window, rejects feature negotiation outside the DRIVER state,
+  blocks re-initialization after a failed reset, and sets DEVICE_NEEDS_RESET
+  when device activation fails.
+- [#5818](https://github.com/firecracker-microvm/firecracker/pull/5818): Reject
+  device status writes that clear previously set bits in the MMIO transport,
+  except for reset.

 ## [1.15.0]
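The status-transition rules in the #5818 entries can be sketched as a small validator. This is a minimal illustration, not Firecracker's implementation: the function names are hypothetical, the bit values are the ones defined by the virtio specification, and the FAILED and DEVICE_NEEDS_RESET bits are deliberately left out to keep the sketch short.

```rust
// Device status bits as defined by the virtio specification.
const ACKNOWLEDGE: u8 = 1;
const DRIVER: u8 = 2;
const DRIVER_OK: u8 = 4;
const FEATURES_OK: u8 = 8;

/// Hypothetical helper: returns true if writing `new` over `current` follows
/// the initialization order ACKNOWLEDGE -> DRIVER -> FEATURES_OK -> DRIVER_OK.
fn is_valid_status_write(current: u8, new: u8) -> bool {
    // Writing zero is a reset and is always accepted.
    if new == 0 {
        return true;
    }
    // Any other write must not clear bits that are already set.
    if (current & !new) != 0 {
        return false;
    }
    // Exactly one new bit may be added, and only as the next step.
    let added = new & !current;
    match added {
        ACKNOWLEDGE => current == 0,
        DRIVER => current == ACKNOWLEDGE,
        FEATURES_OK => current == (ACKNOWLEDGE | DRIVER),
        DRIVER_OK => current == (ACKNOWLEDGE | DRIVER | FEATURES_OK),
        _ => false,
    }
}

/// Hypothetical helper: feature negotiation is only allowed in the DRIVER
/// state, before FEATURES_OK is set.
fn feature_negotiation_allowed(status: u8) -> bool {
    status == (ACKNOWLEDGE | DRIVER)
}

/// Hypothetical helper: queue configuration is only allowed after
/// FEATURES_OK and before DRIVER_OK.
fn queue_config_allowed(status: u8) -> bool {
    (status & FEATURES_OK) != 0 && (status & DRIVER_OK) == 0
}
```

With these helpers, a driver that jumps straight from reset to DRIVER_OK, or that tries to drop the DRIVER bit without resetting, is rejected, while a write of zero is accepted from any state.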

docs/RELEASE_POLICY.md

Lines changed: 2 additions & 2 deletions
@@ -90,8 +90,8 @@ v3.1 will be patched since were the last two Firecracker releases and less than

 | Release | Release Date | Latest Patch | Min. end of support | Official end of Support |
 | ------: | -----------: | -----------: | ------------------: | :------------------------------ |
-| v1.15 | 2026-03-09 | v1.15.0 | 2026-09-09 | Supported |
-| v1.14 | 2025-12-17 | v1.14.3 | 2026-06-17 | Supported |
+| v1.15 | 2026-03-09 | v1.15.1 | 2026-09-09 | Supported |
+| v1.14 | 2025-12-17 | v1.14.4 | 2026-06-17 | Supported |
 | v1.13 | 2025-08-28 | v1.13.2 | 2026-02-28 | 2026-03-09 (v1.15 released) |
 | v1.12 | 2025-05-07 | v1.12.1 | 2025-11-07 | 2025-12-17 (v1.14 released) |
 | v1.11 | 2025-03-18 | v1.11.0 | 2025-09-18 | 2025-09-18 (end of 6mo support) |

docs/design.md

Lines changed: 5 additions & 1 deletion
@@ -118,7 +118,11 @@ and/or creating their own custom CPU templates.

 #### Clocksources available to guests

-Firecracker only exposes kvm-clock to customers.
+Firecracker exposes the following clock sources to guests:
+
+- x86_64: kvm-clock and tsc. Linux guests >=5.10 will pick tsc by default if
+  stable.
+- aarch64: arch_sys_counter

 ### I/O: Storage, Networking and Rate Limiting
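To see which of these clock sources a Linux guest actually selected, the guest can read the standard sysfs clocksource files. The sketch below is a guest-side illustration only and is not part of Firecracker; the function names are hypothetical, while the sysfs paths are the usual Linux locations.

```rust
use std::fs;
use std::io;

// Standard Linux sysfs locations for clocksource information.
const CURRENT: &str = "/sys/devices/system/clocksource/clocksource0/current_clocksource";
const AVAILABLE: &str = "/sys/devices/system/clocksource/clocksource0/available_clocksource";

/// Hypothetical helper: splits the whitespace-separated sysfs content into
/// individual clock source names.
fn parse_clocksources(raw: &str) -> Vec<&str> {
    raw.split_whitespace().collect()
}

/// Reads the clock source currently in use, e.g. "tsc" or "kvm-clock".
fn current_clocksource() -> io::Result<String> {
    Ok(fs::read_to_string(CURRENT)?.trim().to_string())
}

/// Reads every clock source the guest kernel considers usable.
fn available_clocksources() -> io::Result<Vec<String>> {
    let raw = fs::read_to_string(AVAILABLE)?;
    Ok(parse_clocksources(&raw).into_iter().map(String::from).collect())
}
```

Comparing the two results shows whether an x86_64 guest fell back to kvm-clock even though tsc was offered.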

docs/jailer.md

Lines changed: 9 additions & 0 deletions
@@ -263,6 +263,15 @@ Note: default value for `<api-sock>` is `/run/firecracker.socket`.

 ### Observations

+- All inputs to the jailer are considered trusted, including the paths provided
+  via `--exec-file`, `--chroot-base-dir`, and `--netns`, as well as any
+  resources placed inside the jail root directory. Cgroup mount points are
+  discovered from `/proc/mounts` and are managed by the kernel, so they are
+  inherently trusted. The operator invoking the jailer is part of the trusted
+  computing base. It is the operator's responsibility to ensure that these paths
+  and their parent directories have appropriate ownership and permissions (e.g.,
+  root-owned, not world-writable) to prevent unauthorized modification by other
+  local users.
 - The user must create hard links for (or copy) any resources which will be
   provided to the VM via the API (disk images, kernel images, named pipes, etc)
   inside the jailed root folder. Also, permissions must be properly managed for

docs/prod-host-setup.md

Lines changed: 5 additions & 0 deletions
@@ -96,6 +96,11 @@ namespace isolation and drops privileges of the Firecracker process.

 To set up the jailer correctly, you'll need to:

+- Ensure that all paths provided to the jailer (`--exec-file`,
+  `--chroot-base-dir`, `--netns`) and their parent directories are not writable
+  by unprivileged users. The jailer treats all its inputs as trusted; it is the
+  operator's responsibility to ensure that these paths cannot be tampered with
+  by other local users.
 - Create a dedicated non-privileged POSIX user and group to run Firecracker
   under. Use the created POSIX user and group IDs in Jailer's `--uid <uid>` and
   `--gid <gid>` flags, respectively. This will run the Firecracker as the
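An operator could pre-check the path requirement added above before invoking the jailer. The following is a rough sketch under stated assumptions: both function names are hypothetical, the policy is the "root-owned, not world-writable" example from the jailer documentation, and a real deployment may also need to consider group write access and ACLs.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Hypothetical policy check: owned by root (uid 0) and without the
/// world-writable permission bit.
fn is_root_owned_not_world_writable(uid: u32, mode: u32) -> bool {
    uid == 0 && (mode & 0o002) == 0
}

/// Hypothetical helper: applies the policy to `path` and to each of its
/// parent directories. `fs::metadata` follows symlinks, so the check applies
/// to whatever each component resolves to.
fn path_chain_is_protected(path: &Path) -> io::Result<bool> {
    for ancestor in path.ancestors() {
        // A relative path ends its ancestor chain with an empty component.
        if ancestor.as_os_str().is_empty() {
            continue;
        }
        let meta = fs::metadata(ancestor)?;
        if !is_root_owned_not_world_writable(meta.uid(), meta.mode()) {
            return Ok(false);
        }
    }
    Ok(true)
}
```

A check like this is only a point-in-time audit. It does not replace keeping the directories under root's control for as long as the jailer runs.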

docs/snapshotting/snapshot-support.md

Lines changed: 5 additions & 0 deletions
@@ -493,6 +493,11 @@ resumed with the guest OS wall-clock continuing from the moment of the snapshot
 creation. For this reason, the wall-clock should be updated to the current time,
 on the guest-side. More details on how you could do this can be found at a
 [related FAQ](../../FAQ.md#my-guest-wall-clock-is-drifting-how-can-i-fix-it).
+When using `kvm-clock` as clock source on `x86_64`, it's possible to optionally
+set the `clock_realtime: true` in the `LoadSnapshot` request to advance the
+clock on the guest at restore time (host Linux >= 5.16 is required to support
+this feature). Note that this may cause issues within the guest as the clock
+will appear to suddenly jump.

 ## Provisioning host disk space for snapshots
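As an illustration of the `clock_realtime` option documented in this hunk, the sketch below builds a JSON body for a `PUT /snapshot/load` request. Only `resume_vm` and `clock_realtime` appear in this commit; the `snapshot_path` and `mem_backend` fields, the example paths, and the helper name are assumptions, and the helper does not escape special characters in the paths.

```rust
/// Hypothetical helper: formats a LoadSnapshot request body.
/// Assumes the paths contain no characters that need JSON escaping.
fn load_snapshot_body(
    snapshot_path: &str,
    mem_file_path: &str,
    resume_vm: bool,
    clock_realtime: bool,
) -> String {
    format!(
        r#"{{"snapshot_path": "{}", "mem_backend": {{"backend_path": "{}", "backend_type": "File"}}, "resume_vm": {}, "clock_realtime": {}}}"#,
        snapshot_path, mem_file_path, resume_vm, clock_realtime
    )
}
```

Calling `load_snapshot_body("/snap/vm.state", "/snap/vm.mem", true, true)` yields a body that opts in to advancing kvm-clock on restore. Leaving `clock_realtime` as `false` keeps the default behavior of resuming the clock from its snapshot-time value.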

src/firecracker/src/api_server/request/snapshot.rs

Lines changed: 6 additions & 0 deletions
@@ -111,6 +111,7 @@ fn parse_put_snapshot_load(body: &Body) -> Result<ParsedRequest, RequestError> {
         resume_vm: snapshot_config.resume_vm,
         network_overrides: snapshot_config.network_overrides,
         vsock_override: snapshot_config.vsock_override,
+        clock_realtime: snapshot_config.clock_realtime,
     };

     // Construct the `ParsedRequest` object.
@@ -189,6 +190,7 @@ mod tests {
             resume_vm: false,
             network_overrides: vec![],
             vsock_override: None,
+            clock_realtime: false,
         };
         let mut parsed_request = parse_put_snapshot(&Body::new(body), Some("load")).unwrap();
         assert!(
@@ -220,6 +222,7 @@ mod tests {
             resume_vm: false,
             network_overrides: vec![],
             vsock_override: None,
+            clock_realtime: false,
         };
         let mut parsed_request = parse_put_snapshot(&Body::new(body), Some("load")).unwrap();
         assert!(
@@ -251,6 +254,7 @@ mod tests {
             resume_vm: true,
             network_overrides: vec![],
             vsock_override: None,
+            clock_realtime: false,
         };
         let mut parsed_request = parse_put_snapshot(&Body::new(body), Some("load")).unwrap();
         assert!(
@@ -291,6 +295,7 @@ mod tests {
                 host_dev_name: String::from("vmtap2"),
             }],
             vsock_override: None,
+            clock_realtime: false,
         };
         let mut parsed_request = parse_put_snapshot(&Body::new(body), Some("load")).unwrap();
         assert!(
@@ -319,6 +324,7 @@ mod tests {
             resume_vm: true,
             network_overrides: vec![],
             vsock_override: None,
+            clock_realtime: false,
         };
         let parsed_request = parse_put_snapshot(&Body::new(body), Some("load")).unwrap();
         assert_eq!(

src/firecracker/swagger/firecracker.yaml

Lines changed: 9 additions & 1 deletion
@@ -1602,7 +1602,7 @@ definitions:
         type: string
         description:
           The new path for the backing Unix Domain Socket.
-
+
   SnapshotLoadParams:
     type: object
     description:
@@ -1650,6 +1650,14 @@ definitions:
           for restoring a snapshot with a different socket path than the one used
           when the snapshot was created. For example, when the original socket path
           is no longer available or when deploying to a different environment.
+      clock_realtime:
+        type: boolean
+        description:
+          "[x86_64 only] When set to true, passes KVM_CLOCK_REALTIME to
+          KVM_SET_CLOCK on restore, advancing kvmclock by the wall-clock time
+          elapsed since the snapshot was taken. When false (default), kvmclock resumes
+          from where it was at snapshot time. This option may be extended to other clock
+          sources and CPU architectures in the future."


   TokenBucket:
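The effect of `clock_realtime` described in the schema above can be modelled with simple arithmetic. This is a toy model of the documented behavior, not KVM or Firecracker code: the struct, its field names, and the function are hypothetical, and treating a backwards-moving host wall clock as zero elapsed time is an assumption of the sketch.

```rust
/// Hypothetical record of the clock state captured at snapshot time.
struct SavedClock {
    /// Guest kvmclock value when the snapshot was taken, in nanoseconds.
    clock_ns: u64,
    /// Host wall-clock time when the snapshot was taken, in nanoseconds.
    host_realtime_ns: u64,
}

/// Returns the kvmclock value the guest would observe right after restore.
fn restored_clock_ns(saved: &SavedClock, host_realtime_now_ns: u64, clock_realtime: bool) -> u64 {
    if clock_realtime {
        // Advance by the wall-clock time elapsed since the snapshot.
        // Assumption: if the host wall clock moved backwards, add nothing.
        let elapsed = host_realtime_now_ns.saturating_sub(saved.host_realtime_ns);
        saved.clock_ns.saturating_add(elapsed)
    } else {
        // Default: resume from where the clock was at snapshot time.
        saved.clock_ns
    }
}
```

For a snapshot taken at `clock_ns = 1_000` and restored 5_000 ns of wall-clock time later, the model returns 6_000 with the flag set and 1_000 without it. That difference is the jump the documentation warns about.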

src/jailer/src/env.rs

Lines changed: 5 additions & 2 deletions
@@ -519,8 +519,11 @@ impl Env {

     fn join_netns(path: &str) -> Result<(), JailerError> {
         // The fd backing the file will be automatically dropped at the end of the scope
-        let netns =
-            File::open(path).map_err(|err| JailerError::FileOpen(PathBuf::from(path), err))?;
+        let netns = OpenOptions::new()
+            .read(true)
+            .custom_flags(libc::O_NOFOLLOW)
+            .open(path)
+            .map_err(|err| JailerError::FileOpen(PathBuf::from(path), err))?;

         // SAFETY: Safe because we are passing valid parameters.
         SyscallReturnCode(unsafe { libc::setns(netns.as_raw_fd(), libc::CLONE_NEWNET) })

src/jailer/src/main.rs

Lines changed: 18 additions & 2 deletions
@@ -3,6 +3,9 @@

 use std::ffi::{CString, NulError, OsString};
 use std::fmt::{Debug, Display};
+use std::fs::OpenOptions;
+use std::io::Read;
+use std::os::unix::fs::OpenOptionsExt;
 use std::path::{Path, PathBuf};
 use std::{env as p_env, fs, io};

@@ -240,12 +243,25 @@ where
     T: AsRef<Path> + Debug,
     V: Display + Debug,
 {
-    fs::write(file_path, format!("{}\n", value))
+    let mut file = OpenOptions::new()
+        .write(true)
+        .create(true)
+        .truncate(true)
+        .custom_flags(libc::O_NOFOLLOW)
+        .open(file_path.as_ref())
+        .map_err(|err| JailerError::Write(PathBuf::from(file_path.as_ref()), err))?;
+    io::Write::write_all(&mut file, format!("{}\n", value).as_bytes())
         .map_err(|err| JailerError::Write(PathBuf::from(file_path.as_ref()), err))
 }

 pub fn readln_special<T: AsRef<Path> + Debug>(file_path: &T) -> Result<String, JailerError> {
-    let mut line = fs::read_to_string(file_path)
+    let mut file = OpenOptions::new()
+        .read(true)
+        .custom_flags(libc::O_NOFOLLOW)
+        .open(file_path.as_ref())
+        .map_err(|err| JailerError::ReadToString(PathBuf::from(file_path.as_ref()), err))?;
+    let mut line = String::new();
+    file.read_to_string(&mut line)
         .map_err(|err| JailerError::ReadToString(PathBuf::from(file_path.as_ref()), err))?;

     // Remove the newline character at the end (if any).
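The effect of the `O_NOFOLLOW` change can be reproduced outside the jailer with a std-only sketch. The helper names below are hypothetical and errors are plain `io::Error` values rather than `JailerError`. To avoid depending on the `libc` crate, the flag value is hardcoded for Linux on x86_64 and aarch64; real code should use `libc::O_NOFOLLOW` as the diff does.

```rust
use std::fs::OpenOptions;
use std::io::{self, Read, Write};
use std::os::unix::fs::OpenOptionsExt;
use std::path::Path;

// Assumption: Linux values of O_NOFOLLOW, hardcoded to keep the sketch std-only.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
const O_NOFOLLOW: i32 = 0o400000;
#[cfg(all(target_os = "linux", target_arch = "aarch64"))]
const O_NOFOLLOW: i32 = 0o100000;

/// Writes `value` plus a newline, refusing to open `path` if it is a symlink.
fn write_line_nofollow(path: &Path, value: &str) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .custom_flags(O_NOFOLLOW)
        .open(path)?;
    file.write_all(format!("{}\n", value).as_bytes())
}

/// Reads the file content, refusing symlinks, and strips one trailing newline.
fn read_line_nofollow(path: &Path) -> io::Result<String> {
    let mut file = OpenOptions::new()
        .read(true)
        .custom_flags(O_NOFOLLOW)
        .open(path)?;
    let mut line = String::new();
    file.read_to_string(&mut line)?;
    if line.ends_with('\n') {
        line.pop();
    }
    Ok(line)
}
```

Both helpers fail when the final path component is a symbolic link, so a link planted at the expected file name cannot redirect the read or the write. `O_NOFOLLOW` does not cover symlinks in parent directories, which is consistent with the documentation changes in this commit that place responsibility for parent directories on the operator.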
