Commit 828c31b

CP-312160: secureboot certificate update design doc (#7006)
Microsoft Secure Boot certificates from 2011 are reaching end-of-life, and legacy VMs may still contain only the old certificate set. We design an out-of-band mechanism to update per-VM UEFI Secure Boot variables safely and at scale.
2 parents dee989f + 88a695c commit 828c31b

1 file changed

Lines changed: 169 additions & 0 deletions

---
title: Handling Microsoft Secure Boot Certificate Expiry
layout: default
design_doc: true
revision: 1
status: draft
---

## 1. Background

Microsoft Secure Boot certificates from 2011 are reaching end-of-life, and legacy VMs may still contain only the old certificate set. XenServer needs an out-of-band mechanism to update per-VM UEFI Secure Boot variables safely and at scale.

Scope of this design:

- Track certificate state and define the update flow for VMs, snapshots, and templates
- Provide API support for scheduling certificate updates on VM boot
- Integrate xapi and varstored behavior for consistent state handling

## 2. System Overview

### 2.1 Out-of-band Update Mechanism

Certificate update is implemented as a dedicated API-driven workflow (not a plugin), so that:

- The interface is documented and SDK-generated
- RBAC can be assigned precisely
- xapi can route requests and coordinate host-side behavior consistently

### 2.2 Certificate State Tracking

A new VM field is introduced:

- `VM.secureboot_certificates_state` (enum, read-only)

States:

- `ok`: No update required (including non-applicable VM types)
- `update_available`: Update required
- `update_on_boot`: Update scheduled for next boot

~~~mermaid

stateDiagram
update_available --> update_on_boot : Admin marks VM for update
update_on_boot --> ok : VM boots, update succeeds
update_on_boot --> update_on_boot : VM boots, update fails (retain state)
ok --> update_available : recompute state (e.g. legacy VM import)

~~~

### 2.3 RBAC

The new update API requires VM-admin-level access, in line with existing NVRAM-related VM operations.

## 3. Design for Components

### 3.1 VM Certificate State Model

`VM.secureboot_certificates_state` applies to these VM-class objects:

- VMs
- Snapshots
- Templates

Transition intent:

- Admin marks a VM for update: `update_available -> update_on_boot`
- VM boots and update succeeds: `update_on_boot -> ok`
- VM boots and update fails: remains `update_on_boot` or is reset to `update_available`, depending on how the update result is handled

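The state model can be pictured as a small transition table. The following is a minimal Python sketch, not xapi code: the `CertState` class, the event names, and `next_state` are hypothetical names chosen for illustration, and the failure case is modelled as retaining `update_on_boot`, which is only one of the two outcomes the design allows.

```python
from enum import Enum


class CertState(Enum):
    """Values of VM.secureboot_certificates_state as listed in this design."""
    OK = "ok"
    UPDATE_AVAILABLE = "update_available"
    UPDATE_ON_BOOT = "update_on_boot"


# (current state, event) -> next state.
# Event names are hypothetical labels for the transitions in the design.
TRANSITIONS = {
    (CertState.UPDATE_AVAILABLE, "admin_marks_vm"): CertState.UPDATE_ON_BOOT,
    (CertState.UPDATE_ON_BOOT, "boot_update_succeeded"): CertState.OK,
    # Assumption: a failed update retains the state (the design also allows
    # resetting to update_available, depending on result handling).
    (CertState.UPDATE_ON_BOOT, "boot_update_failed"): CertState.UPDATE_ON_BOOT,
    (CertState.OK, "recompute_needs_update"): CertState.UPDATE_AVAILABLE,
}


def next_state(state, event):
    """Return the next state, or None if the event is not valid in this state."""
    return TRANSITIONS.get((state, event))
```

Keeping the transitions in one table makes it easy to see that `ok` can only be left through a recomputation, never through an admin action.
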
### 3.2 API: Mark/Unmark Update-on-Boot

New API:

- `VM.update_secureboot_certificates_on_boot(session, vm, mark)`

Behavior:

- `mark=true`: require current state `update_available`, then set `update_on_boot`
- `mark=false`: require current state `update_on_boot`, then set `update_available`

Validation:

- Reject invalid transitions with `OPERATION_NOT_ALLOWED`

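The mark/unmark rules amount to a precondition check followed by a state write. A rough Python sketch of that logic follows; it is not the xapi implementation. The VM is represented as a plain dict, the session argument is left out, and the `OperationNotAllowed` exception is a hypothetical stand-in for the `OPERATION_NOT_ALLOWED` API error.

```python
class OperationNotAllowed(Exception):
    """Hypothetical stand-in for the OPERATION_NOT_ALLOWED API error."""


def update_secureboot_certificates_on_boot(vm_record, mark):
    """Mark (True) or unmark (False) a VM for certificate update on next boot.

    vm_record is a dict standing in for the VM database record.
    """
    if mark:
        required, target = "update_available", "update_on_boot"
    else:
        required, target = "update_on_boot", "update_available"

    current = vm_record["secureboot_certificates_state"]
    if current != required:
        # Invalid transition: leave the record untouched and report the error.
        raise OperationNotAllowed(
            f"state is {current!r}, expected {required!r}"
        )
    vm_record["secureboot_certificates_state"] = target
```

Under these rules a VM in state `ok` can be neither marked nor unmarked, and marking the same VM twice is rejected rather than silently accepted.
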
### 3.3 DB Upgrade and Import Handling

On toolstack restart after upgrade:

- Initialize `secureboot_certificates_state` for all VM records to `ok`
- Re-evaluate NVRAM and set `update_available` where needed

Applied to:

- VMs
- Snapshots
- Non-default templates

Default templates remain `ok`.

For VM import and cross-pool migration:

- If imported metadata lacks `secureboot_certificates_state`, determine state from NVRAM and set it during import
- If imported metadata contains `secureboot_certificates_state`, preserve the state during import

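Both paths reduce to the same question: is there a trustworthy stored state, or must it be derived from NVRAM? The Python sketch below illustrates this under stated assumptions: records and import metadata are plain dicts, the `NVRAM` and `is_default_template` keys mirror the VM fields of the same names, and `check_nvram` is a hypothetical callable standing in for the certificate check, returning `ok` or `update_available`.

```python
def upgrade_states(records, check_nvram):
    """DB-upgrade pass: default every record to ok, then re-evaluate from NVRAM."""
    for rec in records:
        rec["secureboot_certificates_state"] = "ok"
        if rec.get("is_default_template", False):
            continue  # default templates remain ok
        rec["secureboot_certificates_state"] = check_nvram(rec.get("NVRAM", {}))


def state_on_import(metadata, check_nvram):
    """Import pass: keep a state carried in the metadata, else derive it."""
    state = metadata.get("secureboot_certificates_state")
    if state is not None:
        return state  # preserve what the exporting pool recorded
    return check_nvram(metadata.get("NVRAM", {}))
```

Note that preserving an imported state means a VM exported in `update_on_boot` stays scheduled for update after import.
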
### 3.4 NVRAM and State Consistency

The certificate state must stay consistent with actual NVRAM content.

Key interface change:

- Extend `VM.set_NVRAM_EFI_variables` with an optional parameter `update`; the extended call is referred to as `VM.set_NVRAM_EFI_variables_V2`

Rules:

- `update=yes` -> set state `ok`
- `update=no` -> do not update state
- omitted -> xapi runs certificate check helper and derives state

The omitted case keeps xapi compatible with old varstored instances that are still running during rolling update windows.

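The three rules form a small decision function. The sketch below expresses them in Python under assumptions: `state_after_nvram_write` and `check_nvram` are hypothetical names, the `update` values are modelled as the strings `"yes"` and `"no"`, and an omitted parameter is modelled as `None`.

```python
def state_after_nvram_write(current_state, nvram, check_nvram, update=None):
    """Return the certificate state to store after an NVRAM write.

    update=None models a caller that omits the parameter, for example an
    old varstored instance that does not know about it.
    """
    if update is None:
        # Caller gave no hint: derive the state from the NVRAM content.
        return check_nvram(nvram)
    if update == "yes":
        return "ok"
    if update == "no":
        return current_state  # leave the state as it is
    raise ValueError(f"unexpected update value: {update!r}")
```
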
### 3.5 Certificate Check Helper

A standalone program will be introduced, which xapi calls to determine the Secure Boot certificate state.

Inputs:

- Path to a temporary file that contains the NVRAM EFI-variables data

Behavior:

- This program reuses common functions shared with varstored.
- This program is launched by xapi and runs in a sandboxed, reduced-privilege environment.
- Xapi retrieves the VM's NVRAM content from the database and passes it to this program via the temporary file named on the command line.
- If this program outputs `update_required`, xapi sets `VM.secureboot_certificates_state` to `update_available`.
- If this program outputs `update_ok`, xapi sets `VM.secureboot_certificates_state` to `ok`.
- On toolstack restart, during DB upgrade, this program is invoked to compute `VM.secureboot_certificates_state`. Because the xapi process has not completed initialization at that point, this program cannot call any xapi services.

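The caller side of this contract could look roughly like the Python sketch below. It is an illustration only: xapi itself is not written in Python, the helper's name and location are not specified in this design (so the command is passed in as `helper_cmd`), the sandboxing is not shown, and the assumption that the helper prints exactly one of the two keywords on standard output is ours.

```python
import os
import subprocess
import tempfile

# Helper output -> VM.secureboot_certificates_state, as described above.
OUTPUT_TO_STATE = {
    "update_required": "update_available",
    "update_ok": "ok",
}


def derive_state(nvram_bytes, helper_cmd):
    """Write NVRAM data to a temp file, run the helper on it, map its output.

    helper_cmd is the helper's argv prefix (hypothetical; the real program
    name is not given in this design). The temp file path is appended to it.
    """
    with tempfile.NamedTemporaryFile(suffix=".nvram", delete=False) as f:
        f.write(nvram_bytes)
        path = f.name
    try:
        result = subprocess.run(
            [*helper_cmd, path],
            capture_output=True,
            text=True,
            check=True,   # a non-zero exit status raises CalledProcessError
            timeout=30,
        )
    finally:
        os.unlink(path)  # never leave NVRAM data behind on disk

    output = result.stdout.strip()
    if output not in OUTPUT_TO_STATE:
        raise ValueError(f"unexpected helper output: {output!r}")
    return OUTPUT_TO_STATE[output]
```

Because the helper only reads a file and prints a keyword, it needs no connection back to xapi, which is what allows it to run during DB upgrade.
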
### 3.6 Boot-time Automatic Update Path

When varstored initializes a VM and sees `secureboot_certificates_state=update_on_boot`, it:

- Performs the certificate update flow during boot-time initialization
- Writes the updated NVRAM and synchronizes state via `VM.set_NVRAM_EFI_variables_V2`

`VM.set_NVRAM_EFI_variables_V2` behaves like `VM.set_NVRAM_EFI_variables` and uses the existing varstored-guard process to make calls to xapi.

If `VM.set_NVRAM_EFI_variables_V2` fails (e.g. because communication with xapi is broken):

- xapi does not update the VM's NVRAM or `VM.secureboot_certificates_state`
- VM boot gets stuck at the firmware initialization stage; until the issue is fixed, rebooting the VM hits the same problem
- Once the issue is fixed, the admin can resume the Secure Boot certificate update by rebooting the VM

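The key property of this path is that NVRAM and state change together or not at all. The Python sketch below models that property only; it is not varstored code (varstored is not written in Python). The function names, the dict-based VM record, and the `apply_update` callable are hypothetical, and `xapi_set_nvram_v2` is a stand-in for the xapi side of `VM.set_NVRAM_EFI_variables_V2`.

```python
def xapi_set_nvram_v2(vm, nvram, update):
    """Stand-in for the xapi side: store NVRAM, set state ok if update == 'yes'."""
    vm["NVRAM"] = nvram
    if update == "yes":
        vm["secureboot_certificates_state"] = "ok"


def boot_time_update(vm, apply_update, set_nvram_v2):
    """Run the boot-time update if the VM is marked for it.

    Returns True if an update was performed, False if none was scheduled.
    An exception from set_nvram_v2 propagates, and the VM record keeps its
    old NVRAM and its update_on_boot state, so the next boot retries.
    """
    if vm["secureboot_certificates_state"] != "update_on_boot":
        return False
    # Compute the new content first; nothing is written until the sync call.
    new_nvram = apply_update(vm["NVRAM"])
    set_nvram_v2(vm, new_nvram, update="yes")
    return True
```
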
### 3.7 End-to-end Workflow

1. Upgrade packages (`xapi-core`, `varstored`, related components)
2. Restart toolstack
3. xapi DB upgrade initializes and recalculates `secureboot_certificates_state`
4. Admin marks selected VMs via `VM.update_secureboot_certificates_on_boot`
5. VM reboot triggers varstored certificate update
6. xapi updates state to reflect post-update NVRAM content

## 4. Out of Scope

- User-notification mechanism for certificate expiry
- Custom certificate workflow
- Template/snapshot feature expansion beyond state tracking and conversion behavior
- OS-specific test-process guidance
- VMs with Secure Boot PCR7 binding (e.g. Windows BitLocker); customer documentation will be provided to guide resolution of such issues
