| 1 | +# Node Agent |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +`nodeagent` is the EVE microservice responsible for the lifecycle of the node |
| 6 | +itself — as opposed to the workloads it runs. Its main jobs are: |
| 7 | + |
| 8 | +* drive the **A/B baseos upgrade** process, in cooperation with `baseosmgr` and |
| 9 | + `zedagent`, including the post-upgrade *test* window during which the new |
| 10 | + image must prove itself by reaching the controller before being marked |
| 11 | + `active`, |
| 12 | +* **reboot, shut down or power off** the device when asked by the controller |
| 13 | + (via `zedagent`) or when an internal health timer expires, |
| 14 | +* watch a small set of **health signals** (controller reachability, vault |
| 15 | + state, TPM sanity, free disk space, certificate refusal, kubernetes node |
| 16 | + drain) and either trigger a reboot or push the device into |
| 17 | + *Maintenance Mode*, |
| 18 | +* on every boot, **reconstruct and report the previous reboot** (reason, boot |
| 19 | + reason, stack/dmesg, image, time) and bump the persistent `restartCounter`, |
| 20 | +* surface installer logs from the very first boot of a freshly installed image. |
| 21 | + |
| 22 | +`nodeagent` is intentionally small (a single `nodeagentContext` event loop, no |
| 23 | +sub-packages); most of its complexity lives in *when* to do something rather |
| 24 | +than *how*. The "how" — flipping partition states, reaching the controller — |
| 25 | +is delegated to `baseosmgr`/`zboot` and `zedagent`. |
| 26 | + |
| 27 | +## Key Input/Output |
| 28 | + |
| 29 | +**nodeagent consumes** (all via pubsub unless noted): |
| 30 | + |
| 31 | +* global configuration properties |
| 32 | + * `ConfigItemValueMap` from `zedagent` |
| 33 | + * supplies the four health-timer thresholds: |
| 34 | + `timer.reboot.no.network` (`ResetIfCloudGoneTime`), |
| 35 | + `timer.update.fallback.no.network` (`FallbackIfCloudGoneTime`), |
| 36 | + `timer.test.baseimage.update` (`MintimeUpdateSuccess`), |
| 37 | + `timer.vault.ready.cutoff` (`VaultReadyCutOffTime`) |
| 38 | +* zedagent status |
| 39 | + * `ZedAgentStatus` from `zedagent` |
| 40 | + * carries the controller-driven `RebootCmd` / `ShutdownCmd` / `PoweroffCmd` |
| 41 | + requests with a `RequestedRebootReason` and `RequestedBootReason`, |
| 42 | + * carries `ConfigGetStatus` (`Success`, `TemporaryFail`, `ReadSaved`, |
| 43 | + `Fail`) which is the heartbeat used to drive every "have we lost the |
| 44 | + controller?" timer, |
| 45 | + * carries `EdgeNodeCertsRefused` to drive the corresponding |
| 46 | + Maintenance Mode reason |
| 47 | +* zboot status |
| 48 | + * `ZbootStatus` per partition from `baseosmgr` |
| 49 | + * tells nodeagent when the *other* partition has been flipped to |
| 50 | + `updating` (→ schedule a reboot into the new image) and when the |
| 51 | + *current* partition has been flipped to `active` (→ upgrade fully |
| 52 | + committed) |
| 53 | +* domain status |
| 54 | + * `DomainStatus` from `domainmgr` |
| 55 | + * polled while a reboot/shutdown/poweroff is in flight to wait for all |
| 56 | + app domains to be halted |
| 57 | +* vault status |
| 58 | + * `VaultStatus` from `vaultmgr` — drives `MaintenanceModeReasonVaultLockedUp` |
| 59 | + and, if an upgrade is in progress and the vault never opens within |
| 60 | + `VaultReadyCutOffTime`, triggers a fallback reboot |
| 61 | + (`BootReasonVaultFailure`) |
| 62 | +* TPM sanity status |
| 63 | + * `TpmSanityStatus` from `tpmmgr` — drives |
| 64 | + `MaintenanceModeReasonTpmEncFailure` |
| 65 | +* volume manager status |
| 66 | + * `VolumeMgrStatus` from `volumemgr` — its `RemainingSpace` field drives |
| 67 | + `MaintenanceModeReasonNoDiskSpace` |
| 68 | +* node drain status (kubevirt builds only) |
| 69 | + * `kubeapi.NodeDrainStatus` from `zedkube` |
| 70 | + * keeps `WaitDrainInProgress` set in `NodeAgentStatus` so that `zedagent` |
| 71 | + holds back the controller-requested reboot/shutdown/poweroff until the |
| 72 | + kube node has finished draining its workloads |
| 73 | +* onboarding status |
| 74 | + * `OnboardingStatus` (via `wait.WaitForOnboarded`) — nodeagent blocks on |
| 75 | + this once before joining the main event loop |
| 76 | +* on-disk state (read at start) |
| 77 | + * `/persist/reboot-reason`, `/persist/boot-reason`, `/persist/reboot-stack`, |
| 78 | + `/persist/reboot-image` (via `agentlog.Get*`), used to reconstruct the |
| 79 | + previous reboot, |
| 80 | + * `/persist/SMART_details.json` and `/persist/SMART_details_previous.json` |
| 81 | + — SMART power-cycle counters from the storage controller; consulted |
| 82 | + when no reboot reason was recorded, to distinguish a dirty power-off |
| 83 | + (counter incremented) from a kernel panic / watchdog reset |
| 84 | + (counter unchanged), |
| 85 | + * `/run/global/first-boot` — marker dropped by the installer on the very |
| 86 | + first boot; presence sets the boot reason to `BootReasonFirst`, |
| 87 | + * `/persist/installer/installer.log` plus |
| 88 | + `/persist/installer/send-require` — installer output to be replayed |
| 89 | + into the regular log stream, |
| 90 | + * `/persist/status/restartcounter` — monotonic restart counter, |
| 91 | + * `/persist/fault-injection/readfile` — fault-injection knob. |
| 92 | + |
| 93 | +**nodeagent publishes**: |
| 94 | + |
| 95 | +* `NodeAgentStatus` (consumed by `zedagent`, ultimately surfaced to the |
| 96 | + controller) |
| 97 | + * current active partition (`IMGA` or `IMGB`), |
| 98 | + * `UpdateInprogress` plus the `RemainingTestTime` countdown shown to the |
| 99 | + operator during post-upgrade validation, |
| 100 | + * `DeviceReboot` / `DeviceShutdown` / `DevicePoweroff` plus |
| 101 | + `AllDomainsHalted` (the fine-grained progression of the operation), |
| 102 | + * `RebootReason`, `BootReason`, `RebootStack`, `RebootTime`, `RebootImage` |
| 103 | + from the previous boot, |
| 104 | + * `RestartCounter`, |
| 105 | + * `LocalMaintenanceMode` and the multi-reason |
| 106 | + `LocalMaintenanceModeReasons`, |
| 107 | + * `HVTypeKube`, `WaitDrainInProgress` |
| 108 | +* `ZbootConfig` — one entry per partition (`IMGA`, `IMGB`); the only |
| 109 | + meaningful field is `TestComplete`, which is flipped to `true` when the |
| 110 | + post-upgrade test window expires successfully and `baseosmgr` should |
| 111 | + commit the new image. This publication is **persistent**: it is read back |
| 112 | + on next boot. |
| 113 | +* `RebootReason` / `BootReason` files in `/persist/` — written via |
| 114 | + `agentlog.RebootReason()` just before issuing the actual `zboot.Reset()` |
| 115 | + or `zboot.Poweroff()` so that the *next* boot of nodeagent can pick them |
| 116 | + up. |
| 117 | + |
| 118 | +## Components |
| 119 | + |
| 120 | +Unlike NIM, nodeagent is not split into separately-testable components with |
| 121 | +Go interfaces between them. It is a single `nodeagentContext` running one |
| 122 | +goroutine for the main event loop. The logical responsibilities, however, |
| 123 | +are cleanly partitioned across the source files; that partitioning is the |
| 124 | +right place to think about coverage: |
| 125 | + |
| 126 | +### Lifecycle / pubsub wiring (`nodeagent.go`) |
| 127 | + |
| 128 | +`Run()` initializes the agent, creates the publishers and subscribers, |
| 129 | +starts a 10-second `tickerTimer` and a 25-second `stillRunning` watchdog, |
| 130 | +blocks until `GlobalConfig` and onboarding are available, and then enters
| 131 | +the main `select` loop. The same file also contains the handlers for the |
| 132 | +non-zboot subscriptions: `globalConfig`, `vaultStatus`, `volumeMgrStatus`, |
| 133 | +`tpmStatus`, and the `zedAgentStatus` ingress that translates controller |
| 134 | +device-ops into local `handleDeviceCmd()` calls. |
| 135 | + |
| 136 | +### Reboot-reason reconstruction (`nodeagent.go`) |
| 137 | + |
| 138 | +`handleLastRebootReason()` is called once at startup. It reads anything the |
| 139 | +*previous* boot left behind (`agentlog.GetRebootReason()`, |
| 140 | +`GetBootReason()`, `GetRebootImage()`), and if nothing was recorded it |
| 141 | +synthesizes a default using two side-channel signals: |
| 142 | + |
| 143 | +* `/run/global/first-boot` (set by the installer) → `BootReasonFirst`, |
| 144 | +* the `PowerCycleCount` delta between current and previous SMART snapshots |
| 145 | + → `BootReasonPowerFail` (count went up: dirty power cycle) versus |
| 146 | + `BootReasonKernel` (count unchanged: kernel panic / watchdog with no |
| 147 | + kdump), |
| 148 | +* fallback: `BootReasonUnknown`. |
| 149 | + |
| 150 | +The reboot stack, if any, is logged line-by-line and (if it is bigger than |
| 151 | +~1700 bytes) tail-truncated so that the publication fits in pubsub. This |
| 152 | +function is also where `restartCounter` gets read, incremented, and |
| 153 | +written back. |
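The default boot-reason heuristic described above can be sketched roughly as follows. This is a minimal illustration, not the actual `handleLastRebootReason()` code: the `smartSnapshot` type, the `defaultBootReason` helper and the string-valued `BootReason` are hypothetical stand-ins, and the handling of unreadable snapshots or a decreasing counter (both mapped to `BootReasonUnknown`) is an assumption.

```go
package main

import "fmt"

// BootReason is a simplified stand-in for the real boot-reason type.
type BootReason string

const (
	BootReasonFirst     BootReason = "BootReasonFirst"
	BootReasonPowerFail BootReason = "BootReasonPowerFail"
	BootReasonKernel    BootReason = "BootReasonKernel"
	BootReasonUnknown   BootReason = "BootReasonUnknown"
)

// smartSnapshot is a hypothetical reduction of one SMART report to the
// single counter used by the heuristic. valid is false when the
// snapshot file could not be read or parsed.
type smartSnapshot struct {
	powerCycleCount int64
	valid           bool
}

// defaultBootReason picks a boot reason when the previous boot left
// nothing behind. firstBoot reflects the presence of the installer's
// first-boot marker.
func defaultBootReason(firstBoot bool, cur, prev smartSnapshot) BootReason {
	if firstBoot {
		return BootReasonFirst
	}
	if !cur.valid || !prev.valid {
		// Assumption: without both snapshots there is nothing to compare.
		return BootReasonUnknown
	}
	switch {
	case cur.powerCycleCount > prev.powerCycleCount:
		// Counter went up: the device lost power.
		return BootReasonPowerFail
	case cur.powerCycleCount == prev.powerCycleCount:
		// Counter unchanged: kernel panic or watchdog reset.
		return BootReasonKernel
	default:
		// Assumption: a decreasing counter is treated as unknown.
		return BootReasonUnknown
	}
}

func main() {
	prev := smartSnapshot{powerCycleCount: 41, valid: true}
	fmt.Println(defaultBootReason(true, smartSnapshot{}, smartSnapshot{}))
	fmt.Println(defaultBootReason(false, smartSnapshot{42, true}, prev))
	fmt.Println(defaultBootReason(false, smartSnapshot{41, true}, prev))
	fmt.Println(defaultBootReason(false, smartSnapshot{}, prev))
}
```

The first-boot marker takes priority over the SMART comparison because a freshly installed image has no previous snapshot to compare against.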
| 154 | + |
| 155 | +### Health timers (`handletimers.go`) |
| 156 | + |
| 157 | +`handleDeviceTimers()` fires every 10 seconds and is the heart of the |
| 158 | +agent. It only operates on its own monotonic `timeTickCount` (incremented |
| 159 | +by the timer interval), never on wall-clock time, so that NTP jumping the |
| 160 | +clock by decades on first boot does not trip every timer at once. It runs |
| 161 | +four checks in order: |
| 162 | + |
| 163 | +1. **`handleFallbackOnCloudDisconnect`** — only when an upgrade is being |
| 164 | + tested. If the controller has been unreachable for |
| 165 | + `FallbackIfCloudGoneTime`, the new image is presumed bad: schedule a |
| 166 | + reboot with `BootReasonFallback`; `baseosmgr` will then flip the
| 167 | + partition back. |
| 168 | +2. **`handleRebootOnVaultLocked`** — if `vaultmgr` reports the vault as |
| 169 | + `DATASEC_AT_REST_ERROR`, wait at most `VaultReadyCutOffTime`. If an |
| 170 | + upgrade is in progress when the deadline fires, reboot with |
| 171 | + `BootReasonVaultFailure` (the upgrade fails); otherwise enter |
| 172 | + Maintenance Mode with `MaintenanceModeReasonVaultLockedUp`. |
| 173 | +3. **`handleResetOnCloudDisconnect`** — independently of any upgrade, if |
| 174 | + the controller has been unreachable for `ResetIfCloudGoneTime`, |
| 175 | + schedule a reboot with `BootReasonDisconnect`. This is the long-tail |
| 176 | + "we have lost the cloud, try a clean restart" timer, intended to |
| 177 | + recover from odd hardware/driver failures (for example a hung Ethernet |
| 178 | + adapter) that a reboot is likely to clear. |
| 179 | +4. **`handleUpgradeTestValidation`** — if a post-upgrade test is in flight |
| 180 | + (`testInprogress`) and `MintimeUpdateSuccess` has elapsed, declare the |
| 181 | + image good: clear the test, set `ZbootConfig.TestComplete = true` so |
| 182 | + `baseosmgr` commits the partition. |
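The ordering of these checks, and the use of a tick counter instead of wall-clock time, can be sketched as below. This is a simplified model under stated assumptions: the type and function names (`agentState`, `thresholds`, `check`, `runTicks`) are hypothetical, the vault check is omitted for brevity, the threshold values and the exact comparison operators are illustrative, and the restart/clear of the test window on `ConfigGetStatus` transitions is not modeled.

```go
package main

import "fmt"

// tickIntervalSec mirrors the 10-second ticker described above.
const tickIntervalSec = 10

type action string

const (
	actionNone             action = "none"
	actionFallbackReboot   action = "reboot: BootReasonFallback"
	actionDisconnectReboot action = "reboot: BootReasonDisconnect"
	actionTestComplete     action = "set TestComplete"
)

// thresholds holds the timer limits in seconds (illustrative subset).
type thresholds struct {
	fallbackIfCloudGone  uint32
	resetIfCloudGone     uint32
	minTimeUpdateSuccess uint32
}

// agentState keeps every timestamp in tick-counter seconds, so a jump
// of the wall clock cannot make a timer appear to have expired.
type agentState struct {
	timeTickCount           uint32
	lastControllerReachable uint32
	testStartTime           uint32
	updateInprogress        bool
	testInprogress          bool
}

// tick advances the counter by one interval and refreshes the
// reachability timestamp when the controller answered.
func (s *agentState) tick(controllerReachable bool) {
	s.timeTickCount += tickIntervalSec
	if controllerReachable {
		s.lastControllerReachable = s.timeTickCount
	}
}

// check runs the checks in the documented order (vault check omitted)
// and returns the first action that applies.
func (s *agentState) check(t thresholds) action {
	outage := s.timeTickCount - s.lastControllerReachable
	if s.updateInprogress && outage > t.fallbackIfCloudGone {
		return actionFallbackReboot
	}
	if outage > t.resetIfCloudGone {
		return actionDisconnectReboot
	}
	if s.testInprogress && s.timeTickCount-s.testStartTime >= t.minTimeUpdateSuccess {
		return actionTestComplete
	}
	return actionNone
}

// runTicks simulates up to n ticks and returns the first action taken.
func runTicks(s *agentState, t thresholds, n int, reachable bool) action {
	for i := 0; i < n; i++ {
		s.tick(reachable)
		if a := s.check(t); a != actionNone {
			return a
		}
	}
	return actionNone
}

func main() {
	t := thresholds{fallbackIfCloudGone: 300, resetIfCloudGone: 604800, minTimeUpdateSuccess: 600}
	s := agentState{updateInprogress: true, testInprogress: true}
	a := runTicks(&s, t, 100, false)
	fmt.Printf("after %d s: %s\n", s.timeTickCount, a)
}
```

Because the fallback check runs first, an image under test that cannot reach the controller is rolled back before the test window has a chance to declare it good.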
| 183 | + |
| 184 | +`updateZedagentCloudConnectStatus()` translates `ConfigGetStatus` |
| 185 | +transitions into `lastControllerReachableTime` updates and into |
| 186 | +start/restart/clear of the test window. |
| 187 | + |
| 188 | +`handleDeviceCmd()` and `scheduleNodeOperation()` are the entry points for |
| 189 | +both controller-driven (`RebootCmd`/`ShutdownCmd`/`PoweroffCmd`) and |
| 190 | +internally-driven device operations. They guard against double-trigger, |
| 191 | +update `NodeAgentStatus`, and spawn `handleNodeOperation()` in its own |
| 192 | +goroutine. |
| 193 | + |
| 194 | +`handleNodeOperation()` waits `minRebootDelay` (30s by default), persists |
| 195 | +the reboot reason via `agentlog.RebootReason()`, calls |
| 196 | +`waitForAllDomainsHalted()` (poll `DomainStatus` up to |
| 197 | +`maxDomainHaltTime`), `syscall.Sync()`, waits another `minRebootDelay`, |
| 198 | +flushes coverage data, and finally calls `zboot.Reset()` or |
| 199 | +`zboot.Poweroff()`. A 120-second backstop goroutine `os.Exit(0)`s the |
| 200 | +process if the zboot call hangs — the underlying `reboot` syscall has |
| 201 | +been seen to stall inside the kernel due to kernel bugs, so the backstop |
| 202 | +ensures the in-kernel watchdog takes over and restarts the node. |
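The backstop pattern can be illustrated with a small generic helper. This is only a sketch: `runWithBackstop` and `demoLimit` are hypothetical names, and in the agent the timeout action would be the `os.Exit(0)` mentioned above and the operation would be the `zboot.Reset()` or `zboot.Poweroff()` call. Here both are injected as plain functions so that the example can run without rebooting anything.

```go
package main

import (
	"fmt"
	"time"
)

// demoLimit is a short limit for demonstration; the text above
// describes a 120-second backstop.
const demoLimit = 50 * time.Millisecond

// runWithBackstop arms a timer, then runs op in the calling goroutine.
// If op has not returned when the timer expires, onTimeout runs in a
// separate goroutine. The result is true when op returned before the
// backstop fired.
func runWithBackstop(op func(), limit time.Duration, onTimeout func()) bool {
	timer := time.AfterFunc(limit, onTimeout)
	op()
	// Stop reports false if the timer already expired.
	return timer.Stop()
}

func main() {
	// An operation that returns immediately: the backstop never fires.
	ok := runWithBackstop(func() {}, demoLimit, func() {
		fmt.Println("backstop fired")
	})
	fmt.Println("returned before backstop:", ok)

	// An operation that hangs until the backstop releases it.
	release := make(chan struct{})
	ok = runWithBackstop(func() { <-release }, demoLimit, func() {
		fmt.Println("backstop fired")
		close(release)
	})
	fmt.Println("returned before backstop:", ok)
}
```

Running the timeout action in its own goroutine matters: the goroutine stuck in the hung call cannot be relied on to make progress.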
| 203 | + |
| 204 | +### A/B upgrade orchestration (`handlebaseos.go`, `handlezboot.go`) |
| 205 | + |
| 206 | +`handleZbootStatusImpl()` is the inbound side. When the *current* |
| 207 | +partition transitions to `active` while we still thought the upgrade was |
| 208 | +in progress, the agent latches the upgrade as fully committed |
| 209 | +(`updateInprogress=false`, etc.). It then dispatches: |
| 210 | + |
| 211 | +* `doZbootBaseOsInstallationComplete()` — the *other* partition just |
| 212 | + became `updating` (a new image was written): schedule a reboot with |
| 213 | + `BootReasonUpdate` and a friendly `NORMAL: baseos-update(...) to EVE |
| 214 | + version X reboot` message. |
| 215 | +* `doZbootBaseOsTestValidationComplete()` — the *current* partition's |
| 216 | + `TestComplete` flag has been observed back from `baseosmgr` after we |
| 217 | + set it; clear it on the config side and mark `updateComplete=true`. |
| 218 | + |
| 219 | +`handlezboot.go` contains the small lookup helpers (`lookupZbootConfig`, |
| 220 | +`lookupZbootStatus`, `getZbootOtherPartition`, |
| 221 | +`isZbootOtherPartitionStateUpdating`, `publishZbootConfig*`). All |
| 222 | +real partition operations are delegated to the `pillar/zboot` package |
| 223 | +which knows about `IMGA`/`IMGB`/`grubenv`. |
| 224 | + |
| 225 | +### Kube node-drain glue (`handlenodedrain.go`) |
| 226 | + |
| 227 | +Kubevirt builds receive a `kubeapi.NodeDrainStatus` from `zedkube`. As |
| 228 | +long as a drain initiated by *device-op* (reboot/shutdown/poweroff) is |
| 229 | +between `REQUESTED` and `COMPLETE`, nodeagent flips |
| 230 | +`WaitDrainInProgress` so that `zedagent` keeps the deferred device op |
| 231 | +deferred. On `COMPLETE`, the flag is cleared and the device op is |
| 232 | +allowed to proceed. |
| 233 | + |
| 234 | +### Maintenance Mode |
| 235 | + |
| 236 | +Maintenance Mode is a multi-reason flag (`MaintenanceModeMultiReason`) |
| 237 | +maintained via two helpers, `addMaintenanceModeReason()` and |
| 238 | +`removeMaintenanceModeReason()`. Each contributing handler (vault, TPM, |
| 239 | +disk space, certs-refused) calls these and re-publishes |
| 240 | +`NodeAgentStatus`. The mode is only fully cleared when *every* reason |
| 241 | +has been removed. |
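A minimal sketch of the multi-reason bookkeeping follows. The `reason` and `multiReason` types and their methods are simplified, hypothetical stand-ins for `MaintenanceModeMultiReason` and the two helpers; the slice representation is an assumption made for illustration.

```go
package main

import "fmt"

// reason is a simplified stand-in for a maintenance-mode reason.
type reason string

const (
	reasonVaultLockedUp reason = "MaintenanceModeReasonVaultLockedUp"
	reasonTpmEncFailure reason = "MaintenanceModeReasonTpmEncFailure"
	reasonNoDiskSpace   reason = "MaintenanceModeReasonNoDiskSpace"
)

// multiReason is the set of reasons currently holding the device in
// Maintenance Mode (assumed here to be a slice without duplicates).
type multiReason []reason

// add records r and reports whether the set changed.
func (m *multiReason) add(r reason) bool {
	for _, have := range *m {
		if have == r {
			return false
		}
	}
	*m = append(*m, r)
	return true
}

// remove drops r and reports whether the set changed.
func (m *multiReason) remove(r reason) bool {
	for i, have := range *m {
		if have == r {
			*m = append((*m)[:i], (*m)[i+1:]...)
			return true
		}
	}
	return false
}

// inMaintenanceMode is true while at least one reason remains.
func (m multiReason) inMaintenanceMode() bool {
	return len(m) > 0
}

func main() {
	var reasons multiReason
	reasons.add(reasonVaultLockedUp)
	reasons.add(reasonNoDiskSpace)
	reasons.remove(reasonVaultLockedUp)
	fmt.Println(reasons.inMaintenanceMode(), reasons)
	reasons.remove(reasonNoDiskSpace)
	fmt.Println(reasons.inMaintenanceMode(), reasons)
}
```

Returning whether the set changed lets a caller re-publish status only when something actually differs.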
| 242 | + |
| 243 | +## Control-flow |
| 244 | + |
| 245 | +There are four largely independent control paths through nodeagent. |
| 246 | + |
| 247 | +### 1. Boot and onboarding |
| 248 | + |
| 249 | +```text |
| 250 | +Run() |
| 251 | + └─ subscribe GlobalConfig |
| 252 | + └─ wait for GCInitialized (sets log levels) |
| 253 | + └─ parseSMARTData() |
| 254 | + └─ handleLastRebootReason() (publishes nothing yet, |
| 255 | + └─ handleInstallationLog() updates ctx fields) |
| 256 | + └─ create publications, ZbootConfig, NodeAgentStatus |
| 257 | + └─ subscribe vault/volume/tpm |
| 258 | + └─ publishZbootConfigAll() (one entry per partition) |
| 259 | + └─ ctx.updateInprogress = zboot.IsCurrentPartitionStateInProgress() |
| 260 | + └─ publishNodeAgentStatus() (first publication) |
| 261 | + └─ subscribe DomainStatus |
| 262 | + └─ wait.WaitForOnboarded() |
| 263 | + └─ setTestStartTime() (no-op unless updateInprogress) |
| 264 | + └─ subscribe ZbootStatus, ZedAgentStatus, NodeDrainStatus |
| 265 | + └─ event loop |
| 266 | +``` |
| 267 | + |
| 268 | +### 2. Periodic timer tick (every 10s) |
| 269 | + |
| 270 | +```text |
| 271 | +tickerTimer fires |
| 272 | + → updateTickerTime() (advance ctx.timeTickCount) |
| 273 | + → handleFallbackOnCloudDisconnect() (only if updateInprogress) |
| 274 | + → handleRebootOnVaultLocked() (only if vault disabled) |
| 275 | + → handleResetOnCloudDisconnect() (always) |
| 276 | + → handleUpgradeTestValidation() (only if testInprogress) |
| 277 | +``` |
| 278 | + |
| 279 | +### 3. Controller-driven device operation |
| 280 | + |
| 281 | +```text |
| 282 | +zedagent publishes ZedAgentStatus{RebootCmd:true, …} |
| 283 | + → handleZedAgentStatusImpl() |
| 284 | + → handleDeviceCmd(op=Reboot) |
| 285 | + → scheduleNodeOperation(reason, bootReason, op) |
| 286 | + → ctx.deviceReboot = true |
| 287 | + → publishNodeAgentStatus() (zedagent now sees DeviceReboot) |
| 288 | + → go handleNodeOperation(op) |
| 289 | + ├ wait minRebootDelay |
| 290 | + ├ agentlog.RebootReason(...) (persists reason for next boot) |
| 291 | + ├ waitForAllDomainsHalted() |
| 292 | + ├ ctx.allDomainsHalted = true; publish |
| 293 | + ├ syscall.Sync(); wait minRebootDelay |
| 294 | + ├ flushCoverage |
| 295 | + └ zboot.Reset() / zboot.Poweroff() |
| 296 | +``` |
| 297 | + |
| 298 | +The same `scheduleNodeOperation()` is what the health timers call when
| 299 | +they decide the device must be reset.
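The double-trigger guard in this path can be sketched as follows. The names (`agentCtx`, `schedule`, `nodeOp`) are hypothetical, and the rule that any in-flight operation blocks every later request is an assumption made for illustration. The sketch calls `start` synchronously to stay deterministic, whereas the flow above spawns `handleNodeOperation()` in its own goroutine.

```go
package main

import "fmt"

type nodeOp int

const (
	opReboot nodeOp = iota
	opShutdown
	opPoweroff
)

// agentCtx holds only the flags relevant to the guard.
type agentCtx struct {
	deviceReboot   bool
	deviceShutdown bool
	devicePoweroff bool
	rebootReason   string
}

// opInFlight reports whether any device operation was already scheduled.
func (c *agentCtx) opInFlight() bool {
	return c.deviceReboot || c.deviceShutdown || c.devicePoweroff
}

// schedule records the operation and hands it to start exactly once.
// It returns false when another operation is already in flight.
func (c *agentCtx) schedule(op nodeOp, reason string, start func(nodeOp)) bool {
	if c.opInFlight() {
		return false
	}
	switch op {
	case opReboot:
		c.deviceReboot = true
	case opShutdown:
		c.deviceShutdown = true
	case opPoweroff:
		c.devicePoweroff = true
	}
	c.rebootReason = reason
	// Status would be published here, before the operation starts.
	start(op)
	return true
}

func main() {
	ctx := &agentCtx{}
	start := func(op nodeOp) { fmt.Println("starting operation", op) }
	fmt.Println(ctx.schedule(opReboot, "controller request", start))
	fmt.Println(ctx.schedule(opPoweroff, "second request", start))
}
```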
| 300 | + |
| 301 | +### 4. Baseos upgrade |
| 302 | + |
| 303 | +```text |
| 304 | +(a) "other partition is updating" — new image just written |
| 305 | +baseosmgr → ZbootStatus(other = updating)
| 306 | + → handleZbootStatusImpl() |
| 307 | + → doZbootBaseOsInstallationComplete() |
| 308 | + → scheduleNodeOperation(BootReasonUpdate, Reboot) |
| 309 | +
| 310 | +(b) post-reboot, current partition still inprogress — test window |
| 311 | +Run() sets updateInprogress = true |
| 312 | + → setTestStartTime() once GlobalConfig is in |
| 313 | +ConfigGetStatus = Success keeps lastControllerReachableTime fresh |
| 314 | +After MintimeUpdateSuccess seconds: |
| 315 | + handleUpgradeTestValidation() |
| 316 | + → initiateBaseOsControllerTestComplete() |
| 317 | + → publish ZbootConfig{TestComplete:true} for curPart |
| 318 | +
| 319 | +(c) baseosmgr acknowledges by flipping curPart to active and clearing |
| 320 | + its TestComplete in ZbootStatus: |
| 321 | + handleZbootStatusImpl(): |
| 322 | + if curPart && updateInprogress && state==active: |
| 323 | + updateInprogress = false; testComplete = false; updateComplete = false |
| 324 | + doZbootBaseOsTestValidationComplete(): |
| 325 | + republish ZbootConfig with TestComplete=false; updateComplete=true |
| 326 | +``` |
| 327 | + |
| 328 | +If the test window times out without the controller being reachable, |
| 329 | +`handleFallbackOnCloudDisconnect()` instead schedules a fallback |
| 330 | +reboot (`BootReasonFallback`). `baseosmgr` then rolls the partition |
| 331 | +back on the next boot. |
| 332 | + |
| 333 | +## Debugging |
| 334 | + |
| 335 | +### PubSub |
| 336 | + |
| 337 | +On a running device: |
| 338 | + |
| 339 | +```sh |
| 340 | +cat /run/nodeagent/NodeAgentStatus/nodeagent.json | jq |
| 341 | +cat /persist/status/nodeagent/ZbootConfig/IMGA.json | jq |
| 342 | +cat /persist/status/nodeagent/ZbootConfig/IMGB.json | jq |
| 343 | +``` |
| 344 | + |
| 345 | +The first shows the agent's view of update/reboot state and the list of |
| 346 | +maintenance-mode reasons. The other two show whether nodeagent has |
| 347 | +asked `baseosmgr` to commit the new image (`TestComplete`). |
| 348 | + |
| 349 | +Persistent files of interest under `/persist/`: |
| 350 | + |
| 351 | +* `status/restartcounter` — number of restarts of pillar |
| 352 | +* `reboot-reason`, `boot-reason`, `reboot-stack`, `reboot-image` — |
| 353 | + written just before reboot; consumed and discarded on next boot |
| 354 | +* `SMART_details.json`, `SMART_details_previous.json` — power-cycle |
| 355 | + counter snapshots used by the boot-reason heuristic |
| 356 | +* `installer/installer.log`, `installer/send-require` — installer |
| 357 | + output to be replayed into the log stream on first post-install boot |
| 358 | + |
| 359 | +### Logs |
| 360 | + |
| 361 | +Useful `grep` patterns: |
| 362 | + |
| 363 | +```text |
| 364 | +"Current partition RebootReason" – previous boot's reason as read at startup |
| 365 | +"found bootReason" – previous boot's BootReason |
| 366 | +"Default RebootReason" – nodeagent had to synthesize one |
| 367 | +"Starting upgrade validation for" – post-upgrade test window opening |
| 368 | +"inprogress, waiting for" – periodic countdown of remaining test time |
| 369 | +"Upgrade Validation Test Complete" – post-upgrade test window expired OK |
| 370 | +"Exceeded fallback outage" – BootReasonFallback path firing |
| 371 | +"Exceeded outage for controller" – BootReasonDisconnect path firing |
| 372 | +"Exceeded time for vault to be ready" – BootReasonVaultFailure path firing |
| 373 | +"setting MaintenanceModeReason" – addMaintenanceModeReason() |
| 374 | +"clearing MaintenanceModeReason" – removeMaintenanceModeReason() |
| 375 | +"No reason to be in maintenance mode" – mode fully cleared |
| 376 | +"baseos-update(" – BootReasonUpdate scheduling |
| 377 | +"handleNodeOperation: minRebootDelay" – the 30s pre-reboot pause |
| 378 | +"waitForAllDomainsHalted" – polling DomainStatus before reboot |
| 379 | +"Doing a sync.." – just before zboot.Reset/Poweroff |
| 380 | +"nodedrain-step:" – kube node-drain glue |
| 381 | +``` |
| 382 | + |
| 383 | +### Forcing transitions for development |
| 384 | + |
| 385 | +* Reboot/shutdown/poweroff via the controller is the normal path; on |
| 386 | + a dev device it can also be exercised by making `zedagent` publish |
| 387 | + `ZedAgentStatus{RebootCmd:true,…}`. |
| 388 | +* The fallback / reset timers can be exercised by cutting controller |
| 389 | + reachability (`eden eve link down` in eden) for longer than the |
| 390 | + configured `timer.update.fallback.no.network` / |
| 391 | + `timer.reboot.no.network`. |
| 392 | +* The post-upgrade test window can be shortened with |
| 393 | + `timer.test.baseimage.update=30` (used by the |
| 394 | + `update_eve_image` eden test). |
| 395 | +* The fault-injection knob `/persist/fault-injection/readfile` causes |
| 396 | + nodeagent to read an arbitrary file at startup. Pointing it at a |
| 397 | + large file is the easiest way to drive pillar into an out-of-memory |
| 398 | + condition, which then triggers the OOM-killer and a watchdog reboot — |
| 399 | + useful for exercising the OOM/watchdog path end-to-end. |