
Commit 527ecf7

rgarcia and claude committed
Refresh memory-resize spec after clonefile + rosetta merges
Update file:line references that drifted when the clonefile and rosetta/multi-platform changes merged to main (createVM gained a device block, shifting computeMemorySize and the balloon block by +4; create.go balloon-policy wiring moved into guestMemoryConfig; HotplugSize moved).

Correct the integration-test path to lib/instances/, and note that the proposed MemoryCeilingBytes threading and derived-capability flag now have a merged precedent (Platform / derived EnableRosetta take the identical request -> VMConfig -> buildShimConfigFromVMConfig -> ShimConfig path).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 8ed1fd0 commit 527ecf7

1 file changed

Lines changed: 17 additions & 17 deletions


docs/proposals/memory-hotplug-resize.md

@@ -21,11 +21,11 @@ This is explicitly scoped to vz. Cloud Hypervisor and Firecracker already expose
2121

2222
- `cmd/vz-shim/vm.go:38` computes `memoryBytes := computeMemorySize(uint64(config.MemoryBytes))`.
2323
- `cmd/vz-shim/vm.go:42` passes it to `vz.NewVirtualMachineConfiguration(bootLoader, vcpus, memoryBytes)`. This is the *only* place memory size is set; there is no runtime `SetMemorySize`.
24-
- `cmd/vz-shim/vm.go:303-319` clamps the requested size into `[VirtualMachineConfigurationMinimumAllowedMemorySize(), VirtualMachineConfigurationMaximumAllowedMemorySize()]`.
24+
- `cmd/vz-shim/vm.go:307-323` clamps the requested size into `[VirtualMachineConfigurationMinimumAllowedMemorySize(), VirtualMachineConfigurationMaximumAllowedMemorySize()]`.
2525

26-
**The boot size is the instance's `Size`.** `lib/instances/create.go:891` sets `MemoryBytes: inst.Size`, threaded into `ShimConfig.MemoryBytes` at `lib/hypervisor/vz/starter.go:165`. `HotplugBytes`/`HotplugSize` exist in the config (`lib/hypervisor/config.go:9`, `lib/instances/types.go:84`) but vz ignores them.
26+
**The boot size is the instance's `Size`.** `lib/instances/create.go:878` sets `MemoryBytes: inst.Size`, threaded into `ShimConfig.MemoryBytes` at `lib/hypervisor/vz/starter.go:165`. `HotplugBytes`/`HotplugSize` exist in the config (`lib/hypervisor/config.go:9`, `lib/instances/types.go:97`) but vz ignores them.
2727

28-
**The only runtime memory lever is the balloon, and it only reclaims.** When `EnableMemoryBalloon` is set, the shim attaches a `VZVirtioTraditionalMemoryBalloonDevice` (`cmd/vz-shim/vm.go:75-87`), gated on `EnableMemoryBalloon` / `RequireMemoryBalloon` (`lib/hypervisor/vz/shimconfig/config.go:36-37`), which are populated from the guest-memory policy features (`lib/hypervisor/vz/starter.go:170-171`, `lib/instances/create.go:938-945`, `lib/guestmemory/policy.go:79-92`).
28+
**The only runtime memory lever is the balloon, and it only reclaims.** When `EnableMemoryBalloon` is set, the shim attaches a `VZVirtioTraditionalMemoryBalloonDevice` (`cmd/vz-shim/vm.go:79-91`), gated on `EnableMemoryBalloon` / `RequireMemoryBalloon` (`lib/hypervisor/vz/shimconfig/config.go:36-37`), which are populated from the guest-memory policy features (`lib/hypervisor/vz/starter.go:170-171`, `lib/instances/create.go:925-932`, `lib/guestmemory/policy.go:79-92`).
2929

3030
The balloon control plane:
3131

@@ -59,7 +59,7 @@ What Tart does:
5959

6060
What hypeman should adopt:
6161

62-
- **The minimum-resources floor guard.** Tart's `memorySizeMin` is exactly hypeman's `protected_floor` concept (`active_ballooning.protected_floor_*` → `protectedFloorBytes`, `planner.go:60-63`). hypeman should keep enforcing a hard lower bound on usable guest memory so a guest is never ballooned below a size it can function at, and should likewise clamp the *boot ceiling* into the framework's `[minimumAllowedMemorySize, maximumAllowedMemorySize]` range exactly as `computeMemorySize` already does (`vm.go:303-319`) and as Tart does in `setMemory`.
62+
- **The minimum-resources floor guard.** Tart's `memorySizeMin` is exactly hypeman's `protected_floor` concept (`active_ballooning.protected_floor_*` → `protectedFloorBytes`, `planner.go:60-63`). hypeman should keep enforcing a hard lower bound on usable guest memory so a guest is never ballooned below a size it can function at, and should likewise clamp the *boot ceiling* into the framework's `[minimumAllowedMemorySize, maximumAllowedMemorySize]` range exactly as `computeMemorySize` already does (`vm.go:307-323`) and as Tart does in `setMemory`.
6363
- **Treating "the number you give the VZ configuration" as the hard ceiling.** Tart's `configuration.memorySize` is the immovable boot size; hypeman's `NewVirtualMachineConfiguration` argument is the same. This RFC's central move — choose that number deliberately as a *ceiling* rather than a *baseline* — only works because both projects agree this value is fixed for the VM's lifetime.
6464

6565
Where hypeman should diverge:
@@ -91,7 +91,7 @@ The shim boots the machine at `ceiling`, then, before/at the moment the guest st
9191
- guest demand rises → deflate balloon → guest sees more, up to `ceiling`, no reboot;
9292
- host under pressure → inflate balloon → guest gives memory back, down to `floor`.
9393

94-
Because vz/Linux memory backing is lazy (pages are host-resident only once touched, which is exactly why `config.example.darwin.yaml` ships `kernel_page_init_mode` and why `assertLowIdleVZHostMemoryFootprint` in `guestmemory_darwin_test.go:143-166` asserts a low idle RSS), booting at a larger ceiling does **not** make the host pay for the ceiling while the guest sits at baseline. The cost of a higher ceiling is address-space/bookkeeping, not resident RAM, as long as the balloon holds the guest down and the guest doesn't touch the ballooned pages. This is the property that makes the technique pay off.
94+
Because vz/Linux memory backing is lazy (pages are host-resident only once touched, which is exactly why `config.example.darwin.yaml` ships `kernel_page_init_mode` and why `assertLowIdleVZHostMemoryFootprint` in `lib/instances/guestmemory_darwin_test.go:143-166` asserts a low idle RSS), booting at a larger ceiling does **not** make the host pay for the ceiling while the guest sits at baseline. The cost of a higher ceiling is address-space/bookkeeping, not resident RAM, as long as the balloon holds the guest down and the guest doesn't touch the ballooned pages. This is the property that makes the technique pay off.
9595

9696
### Config-time changes (vz-shim)
9797

@@ -128,7 +128,7 @@ bootBytes := uint64(config.MemoryBytes)
128128
if config.MemoryCeilingBytes > config.MemoryBytes {
129129
bootBytes = uint64(config.MemoryCeilingBytes)
130130
}
131-
memoryBytes := computeMemorySize(bootBytes) // existing min/max clamp, vm.go:303-319
131+
memoryBytes := computeMemorySize(bootBytes) // existing min/max clamp, vm.go:307-323
132132
// ...
133133
vmConfig, err := vz.NewVirtualMachineConfiguration(bootLoader, vcpus, memoryBytes)
134134
```
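
The boot-size selection in the snippet above can be exercised on its own as a pure function. The following is a minimal sketch, not code from the repository: `bootMemoryBytes` is a hypothetical name, the framework bounds are passed in as plain parameters instead of being read from the Virtualization framework, and the clamp is assumed to be a simple min/max bound (the actual body of `computeMemorySize` is not shown in this diff and may do more).

```go
package main

import "fmt"

// bootMemoryBytes is a hypothetical helper for illustration only.
// It boots at the ceiling when the ceiling exceeds the baseline, otherwise at
// the baseline, and then clamps the result into [minAllowed, maxAllowed].
// Inputs are assumed to be non-negative.
func bootMemoryBytes(baseline, ceiling int64, minAllowed, maxAllowed uint64) uint64 {
	boot := uint64(baseline)
	if ceiling > baseline {
		boot = uint64(ceiling)
	}
	if boot < minAllowed {
		return minAllowed
	}
	if boot > maxAllowed {
		return maxAllowed
	}
	return boot
}

func main() {
	const gib = uint64(1) << 30
	// Baseline 1 GiB, ceiling 4 GiB, assumed bounds of 512 MiB and 16 GiB.
	fmt.Println(bootMemoryBytes(1<<30, 4<<30, gib/2, 16*gib) / gib) // prints 4
	// No ceiling configured (0): the machine boots at the baseline.
	fmt.Println(bootMemoryBytes(1<<30, 0, gib/2, 16*gib) / gib) // prints 1
}
```

Keeping the selection separate from the framework calls would let the ceiling logic be unit-tested on any host, while the darwin-only code only supplies the two bounds.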
@@ -138,7 +138,7 @@ A ceiling requires the balloon. Boot-at-ceiling without a balloon would leave th
138138
```go
139139
//go:build darwin
140140

141-
// cmd/vz-shim/vm.go (balloon block, replacing vm.go:75-87)
141+
// cmd/vz-shim/vm.go (balloon block, replacing vm.go:79-91)
142142
ceilingActive := config.MemoryCeilingBytes > config.MemoryBytes
143143
if config.EnableMemoryBalloon || ceilingActive {
144144
balloonConfig, err := vz.NewVirtioTraditionalMemoryBalloonDeviceConfiguration()
@@ -235,7 +235,7 @@ func growthTargetBytes(cfg ActiveBallooningConfig, c candidateState, demand gues
235235
SupportsLiveMemoryCeiling bool
236236
```
237237

238-
For vz this is `true` exactly when a ceiling is configured; `SupportsHotplugMemory` stays `false` (we are not hotplugging — we are deflating a pre-sized balloon). No other hypervisor sets it.
238+
For vz this is `true` exactly when a ceiling is configured; `SupportsHotplugMemory` stays `false` (we are not hotplugging — we are deflating a pre-sized balloon). No other hypervisor sets it. Like the merged `EnableRosetta` flag, this is derived internally from config rather than surfaced as a user-facing request knob — `EnableRosetta` states that contract directly ("Derived internally … not a user-facing field", `lib/instances/types.go:163-166`), and the ceiling-implies-balloon requirement below follows the same derive-from-config approach.
239239

240240
### Why this reuses the existing machinery rather than adding knobs
241241

@@ -262,9 +262,9 @@ Add an optional per-instance memory ceiling. It defaults to the baseline (no cei
262262
MemoryCeilingBytes int64 // 0 = no ceiling (boot at Size)
263263
```
264264

265-
Threaded: `CreateInstanceRequest.MemoryCeilingBytes` → stored on the instance → `hypervisor.VMConfig` (new field) → `buildShimConfigFromVMConfig` (`starter.go:161-188`) → `ShimConfig.MemoryCeilingBytes`. The ceiling is persisted on the instance so it survives standby/restore (the shim config is already round-tripped through the snapshot manifest — `shimconfig.SnapshotManifest`, `server.go:186-205`; the restore path rebuilds `ShimConfig` at `starter.go:148-156`).
265+
Threaded: `CreateInstanceRequest.MemoryCeilingBytes` → stored on the instance → `hypervisor.VMConfig` (new field) → `buildShimConfigFromVMConfig` (`starter.go:161-188`) → `ShimConfig.MemoryCeilingBytes`. This is the same path the `Platform`/`EnableRosetta` fields already take, so it is proven plumbing rather than new machinery: the user-facing `Platform` rides `CreateInstanceRequest` → instance (`lib/instances/types.go:252`, `:93`), and the *derived* `EnableRosetta` flag is computed during create (`deriveEnableRosetta`, `lib/instances/rosetta.go:17`, called `create.go:119`), carried on `hypervisor.VMConfig` (`lib/hypervisor/config.go:36`), copied by `buildShimConfigFromVMConfig` (`starter.go:172`), and consumed as `ShimConfig.EnableRosetta` (`shimconfig/config.go:42`). `MemoryCeilingBytes` adds one field at each of those same hops. The ceiling is persisted on the instance so it survives standby/restore (the shim config is already round-tripped through the snapshot manifest — `shimconfig.SnapshotManifest`, `server.go:186-205`; the restore path rebuilds `ShimConfig` at `starter.go:148-156`).
266266

267-
Validation, mirroring Tart's `setMemory` floor guard (`VMConfig.swift:178-190`) and the existing `computeMemorySize` clamp (`vm.go:303-319`):
267+
Validation, mirroring Tart's `setMemory` floor guard (`VMConfig.swift:178-190`) and the existing `computeMemorySize` clamp (`vm.go:307-323`):
268268

269269
- `MemoryCeilingBytes == 0` → no ceiling.
270270
- `0 < MemoryCeilingBytes ≤ Size` → reject (ceiling below baseline is meaningless).
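
The two rules visible in this hunk, together with the upper bound named in the testing plan (a ceiling above `VirtualMachineConfigurationMaximumAllowedMemorySize()` is rejected), can be sketched as a small validator. This is an illustration under stated assumptions: `validateMemoryCeiling`, the two error values, and the `maxAllowed` parameter are hypothetical names, and any rules that fall outside this hunk are not represented.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical error values; the repository may report these differently.
var (
	errCeilingNotAboveBaseline = errors.New("memory ceiling must exceed the baseline size")
	errCeilingAboveMaximum     = errors.New("memory ceiling exceeds the maximum allowed memory size")
)

// validateMemoryCeiling is a hypothetical helper for illustration only.
// maxAllowed stands in for the framework's maximum allowed memory size.
func validateMemoryCeiling(size, ceiling, maxAllowed int64) error {
	switch {
	case ceiling == 0:
		// 0 means "no ceiling": boot at Size.
		return nil
	case ceiling <= size:
		// A ceiling at or below the baseline is meaningless. Negative values
		// also land here, which is an assumption of this sketch.
		return errCeilingNotAboveBaseline
	case ceiling > maxAllowed:
		return errCeilingAboveMaximum
	default:
		return nil
	}
}

func main() {
	const gib = int64(1) << 30
	fmt.Println(validateMemoryCeiling(1*gib, 0, 16*gib))      // <nil>
	fmt.Println(validateMemoryCeiling(1*gib, 1*gib, 16*gib))  // rejected: not above baseline
	fmt.Println(validateMemoryCeiling(1*gib, 4*gib, 16*gib))  // <nil>
	fmt.Println(validateMemoryCeiling(1*gib, 32*gib, 16*gib)) // rejected: above maximum
}
```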
@@ -304,29 +304,29 @@ The doc comment in `config.example.darwin.yaml:163-164` ("CPU/Memory Hotplug —
304304
## Platform constraints & edge cases
305305

306306
- **macOS version.** The traditional memory balloon device is macOS 11+ (`memory_balloon.go:39-44`); the whole vz backend already requires it. No new minimum is introduced by ballooning itself. Snapshots remain macOS 14+ on Apple Silicon (`config.example.darwin.yaml:169-172`), which matters only for ceiling VMs that are also snapshotted (the manifest already carries the shim config, so the ceiling is preserved across restore).
307-
- **Apple Silicon only.** vz Linux guests under hypeman target arm64 (`guestmemory_darwin_test.go:30-32`; `SupportsSnapshot = runtime.GOARCH == "arm64"`, `client.go:87`). No change.
308-
- **Ceiling is bounded by host RAM, not by APFS or disk.** `VirtualMachineConfigurationMaximumAllowedMemorySize()` (`configuration.go:312`) returns a value derived from physical host memory; the framework rejects configurations above it at `Validate()` time (`vm.go:89-91`). Memory here is RAM-backed, so there is **no** disk-image or APFS-volume-boundary concern for the memory feature specifically — unlike disk resizing, which Tart explicitly documents as one-directional to avoid data loss (`Set.swift:34-37`). The ceiling math is purely "sum of VM ceilings vs. host RAM," and admission control should reason about the **baseline** for packing (since that's the resident cost at idle) while treating the **ceiling** as the worst case the balloon controller must be able to claw back under pressure.
307+
- **Apple Silicon only.** vz Linux guests under hypeman target arm64 (`lib/instances/guestmemory_darwin_test.go:30-32`; `SupportsSnapshot = runtime.GOARCH == "arm64"`, `client.go:87`). No change.
308+
- **Ceiling is bounded by host RAM, not by APFS or disk.** `VirtualMachineConfigurationMaximumAllowedMemorySize()` (`configuration.go:312`) returns a value derived from physical host memory; the framework rejects configurations above it at `Validate()` time (`vm.go:93`). Memory here is RAM-backed, so there is **no** disk-image or APFS-volume-boundary concern for the memory feature specifically — unlike disk resizing, which Tart explicitly documents as one-directional to avoid data loss (`Set.swift:34-37`). The ceiling math is purely "sum of VM ceilings vs. host RAM," and admission control should reason about the **baseline** for packing (since that's the resident cost at idle) while treating the **ceiling** as the worst case the balloon controller must be able to claw back under pressure.
309309
- **Oversubscription risk.** Booting at the ceiling makes a guest *capable* of touching ceiling-many pages. If many guests simultaneously grow toward their ceilings while the host is healthy, then the host swings into pressure, the controller must reclaim fast enough. This is bounded by `per_vm_max_step_bytes` (reclaim is incremental) and the protected floor (a guest is never squeezed below a usable size). The honest failure mode: if aggregate demand exceeds host RAM faster than the balloon can inflate, the host swaps. Grow-on-demand is therefore off by default and rate-limited; the safe default deployment is "boot at ceiling, hold at baseline, only ever deflate toward ceiling under explicit opt-in."
310310
- **Balloon refusal / partial inflation.** `SetTargetVirtualMachineMemorySize` is a request; the guest's balloon driver fulfills it asynchronously and may lag or partially comply (e.g. under guest memory pressure with `deflate-on-OOM` semantics). The controller already reads back the *target* (`GetTargetVirtualMachineMemorySize`, `server.go:224`) — note this is the target, not the achieved size, so accounting is target-based, same as today. No new guarantee is claimed about instantaneous compliance.
311311
- **Lazy backing assumption.** The density win depends on the guest not touching ballooned pages. A guest configured with `kernel_page_init_mode: hardened` (`init_on_alloc=1 init_on_free=1`, `policy.go:64-76`) touches more pages on alloc/free; the `performance` mode preserves lazy host allocation. Ceiling VMs that care about density should run `performance` page-init, exactly the tradeoff the existing knob encodes.
312312
- **Interaction with `inst.Size + inst.HotplugSize`.** For vz, `HotplugSize` is always 0 today; the `Source` change uses `max(Size+HotplugSize, MemoryCeilingBytes)` so it stays correct if hotplug is ever populated on another backend without affecting vz.
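
The `max(Size+HotplugSize, MemoryCeilingBytes)` expression in the last bullet can be checked with a few concrete values. A minimal sketch, assuming plain `int64` byte counts; `assignedMemoryBytes` is a hypothetical name and not a function from the repository.

```go
package main

import "fmt"

// assignedMemoryBytes is a hypothetical helper for illustration only.
// It returns the larger of (size + hotplugSize) and ceiling.
func assignedMemoryBytes(size, hotplugSize, ceiling int64) int64 {
	assigned := size + hotplugSize
	if ceiling > assigned {
		return ceiling
	}
	return assigned
}

func main() {
	const gib = int64(1) << 30
	// vz with a ceiling: HotplugSize is 0, so the ceiling wins.
	fmt.Println(assignedMemoryBytes(1*gib, 0, 4*gib) / gib) // prints 4
	// Another backend with hotplug populated and no ceiling: unchanged behavior.
	fmt.Println(assignedMemoryBytes(1*gib, 2*gib, 0) / gib) // prints 3
}
```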
313313

314314
## Testing plan
315315

316-
Extend the existing darwin manual integration tests (gated by `requireGuestMemoryManualRun`, darwin, arm64 — `guestmemory_darwin_test.go:25-32`) and the unit tests for the controller/planner.
316+
Extend the existing darwin manual integration tests (gated by `requireGuestMemoryManualRun`, darwin, arm64 — `lib/instances/guestmemory_darwin_test.go:25-32`) and the unit tests for the controller/planner.
317317

318318
Unit (host-independent, run everywhere):
319319

320320
- `planner_test.go` (new cases) / extend `policy_test.go`: `growthTargetBytes` returns no-change when `GrowOnDemandEnabled` is false; grows to `AssignedMemoryBytes` only above `GrowUtilizationPercent`; never exceeds `AssignedMemoryBytes`; never goes below `protectedFloor`.
321321
- `controller_test.go`: with `AssignedMemoryBytes` = ceiling and current target = baseline, a healthy host with grow enabled raises the target by at most `per_vm_max_step_bytes` per reconcile and respects `per_vm_cooldown` (the clamps at `controller.go:243-258` already exist; assert they bound the grow path too).
322322
- `ActiveBallooningConfig.Normalize`: `GrowUtilizationPercent` clamps to `(0,100)` and defaults to 85 when unset/invalid.
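
The properties these unit cases assert can be made concrete with a small model. The sketch below is an assumption-laden illustration, not the proposal's implementation: `growConfig`, `normalizePercent`, `growthTarget`, and `stepToward` are hypothetical, simplified stand-ins for `ActiveBallooningConfig`, its `Normalize`, `growthTargetBytes`, and the controller's step clamp. It treats any percent outside `(0,100)` as invalid and replaces it with 85, reads "above" as strictly above, and assumes the protected floor does not exceed the assigned memory.

```go
package main

import "fmt"

// growConfig is a hypothetical, reduced stand-in for ActiveBallooningConfig.
type growConfig struct {
	GrowOnDemandEnabled    bool
	GrowUtilizationPercent int
}

// normalizePercent returns 85 for unset or out-of-range values.
func normalizePercent(p int) int {
	if p <= 0 || p >= 100 {
		return 85
	}
	return p
}

// growthTarget returns the planner-level target: unchanged unless growing is
// enabled and utilization of the current target is above the threshold, in
// which case it is the assigned memory, raised to the floor if needed.
func growthTarget(cfg growConfig, current, assigned, floor, used int64) int64 {
	if !cfg.GrowOnDemandEnabled || current <= 0 {
		return current
	}
	if used*100 <= current*int64(normalizePercent(cfg.GrowUtilizationPercent)) {
		return current
	}
	target := assigned
	if target < floor {
		target = floor
	}
	return target
}

// stepToward bounds one reconcile's change to at most maxStep bytes.
// A non-positive maxStep is treated as "no step limit" in this sketch.
func stepToward(current, target, maxStep int64) int64 {
	if maxStep <= 0 {
		return target
	}
	if target > current+maxStep {
		return current + maxStep
	}
	if target < current-maxStep {
		return current - maxStep
	}
	return target
}

func main() {
	const mib = int64(1) << 20
	cfg := growConfig{GrowOnDemandEnabled: true} // percent unset, so 85 applies
	current, assigned, floor := 1024*mib, 4096*mib, 512*mib

	target := growthTarget(cfg, current, assigned, floor, 900*mib)
	fmt.Println(target / mib)                               // prints 4096
	fmt.Println(stepToward(current, target, 512*mib) / mib) // prints 1536
}
```

Splitting the target from the step clamp mirrors the division the bullets describe, where the planner decides where to go and the controller bounds how far one reconcile may move.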
323323

324-
Integration (darwin/arm64, manual), extending `guestmemory_darwin_test.go`:
324+
Integration (darwin/arm64, manual), extending `lib/instances/guestmemory_darwin_test.go`:
325325

326-
- **Boot-at-ceiling.** Create a vz instance with `Size = 1 GiB`, `MemoryCeilingBytes = 4 GiB`. Assert `getVZVMInfo` reports a balloon device (`guestmemory_darwin_test.go:64-66`). Read `/proc/meminfo` `MemTotal` over the exec agent (`vzExecCommand`, used at `guestmemory_darwin_test.go:58-60`) and assert it reflects the *boot* size (~4 GiB) — the guest kernel sees the ceiling.
327-
- **Balloon-to-baseline.** After startup, assert `GET /api/v1/vm.balloon` target ≈ 1 GiB and that guest `MemAvailable` shrinks accordingly, while host RSS of the shim stays low (reuse `assertLowIdleVZHostMemoryFootprint`, `guestmemory_darwin_test.go:143-166`) — proving the ceiling didn't cost resident host RAM at baseline.
326+
- **Boot-at-ceiling.** Create a vz instance with `Size = 1 GiB`, `MemoryCeilingBytes = 4 GiB`. Assert `getVZVMInfo` reports a balloon device (`lib/instances/guestmemory_darwin_test.go:64-66`). Read `/proc/meminfo` `MemTotal` over the exec agent (`vzExecCommand`, used at `lib/instances/guestmemory_darwin_test.go:58-60`) and assert it reflects the *boot* size (~4 GiB) — the guest kernel sees the ceiling.
327+
- **Balloon-to-baseline.** After startup, assert `GET /api/v1/vm.balloon` target ≈ 1 GiB and that guest `MemAvailable` shrinks accordingly, while host RSS of the shim stays low (reuse `assertLowIdleVZHostMemoryFootprint`, `lib/instances/guestmemory_darwin_test.go:143-166`) — proving the ceiling didn't cost resident host RAM at baseline.
328328
- **Live grow.** `PUT /api/v1/vm.balloon` with target 4 GiB (or drive the controller with `GrowOnDemandEnabled` and a synthetic high-utilization signal); assert the guest's usable memory climbs toward 4 GiB *without a reboot* (no change in `getVZVMInfo` state transitions, instance not recreated).
329-
- **Live shrink under pressure.** Reuse `assertActiveBallooningLifecycle` (`guestmemory_darwin_test.go:72`) with an injected `PressureSampler` (the controller already supports injection — `NewControllerWithSampler`, `active_ballooning.go:198`) reporting `Stressed: true`; assert the target drops toward the floor and never below `protectedFloor`.
329+
- **Live shrink under pressure.** Reuse `assertActiveBallooningLifecycle` (`lib/instances/guestmemory_darwin_test.go:72`) with an injected `PressureSampler` (the controller already supports injection — `NewControllerWithSampler`, `active_ballooning.go:198`) reporting `Stressed: true`; assert the target drops toward the floor and never below `protectedFloor`.
330330
- **Ceiling validation.** Unit-level: `CreateInstanceRequest` with ceiling ≤ size is rejected; ceiling above `VirtualMachineConfigurationMaximumAllowedMemorySize()` is rejected.
331331

332332
## Risks & alternatives considered
