
Commit 79fd6d0

Merge branch 'main' into fix/pattern-sh-macos-empty-arrays
2 parents 44c9cff + 5fb705e commit 79fd6d0

7 files changed

Lines changed: 588 additions & 2 deletions

Makefile

Lines changed: 9 additions & 0 deletions
@@ -3,3 +3,12 @@
33
# You can add custom targets above or below the include line
44

55
include Makefile-common
6+
7+
##@ Firmware Reference Values
8+
.PHONY: collect-firmware-refvals
9+
collect-firmware-refvals: ## Collect firmware reference values from bare metal cluster
10+
	@scripts/collect-firmware-refvals.sh
11+
12+
.PHONY: collect-firmware-refvals-merge
13+
collect-firmware-refvals-merge: ## Collect and merge with existing firmware refvals
14+
	@scripts/collect-firmware-refvals.sh --merge

docs/firmware-reference-values.md

Lines changed: 316 additions & 0 deletions
@@ -0,0 +1,316 @@
1+
# Firmware Reference Values for Bare Metal Attestation
2+
3+
This guide explains how to collect firmware reference values for bare metal confidential computing deployments (Intel TDX / AMD SEV-SNP).
4+
5+
## Overview
6+
7+
Firmware reference values are cryptographic measurements of the Trusted Computing Base (TCB) components:
8+
9+
- **Intel TDX**: `mr_td` (OVMF code hash), `rtmr_1` (kernel/initrd), `rtmr_2` (cmdline), `xfam` (extended features)
10+
- **AMD SEV-SNP**: `snp_launch_measurement` (firmware/kernel/initrd hash)
11+
12+
These values are used by the KBS attestation policy to verify that confidential workloads are running on approved firmware with expected security properties.
13+
14+
## Prerequisites
15+
16+
### 1. Veritas Tool
17+
18+
The [veritas](https://github.com/confidential-containers/veritas) tool collects attestation evidence from confidential VMs.
19+
20+
**Installation:** Veritas is automatically installed inside the collection pod by the script. No local installation required.
21+
22+
**Version requirement:** 0.2.0 or later
23+
24+
### 2. Bare Metal Cluster Access
25+
26+
You need:
27+
28+
- A running bare metal cluster with Intel TDX or AMD SEV-SNP hardware
29+
- KataConfig deployed and in Ready state
30+
- At least one kata pod successfully running (proves TEE is functional)
31+
- `oc` CLI logged in to the cluster
32+
- `jq` installed locally
33+
34+
### 3. Local Tools
35+
36+
```bash
37+
# Check prerequisites
38+
command -v oc && echo "✓ oc CLI installed"
39+
command -v jq && echo "✓ jq installed"
40+
oc whoami && echo "✓ Logged in to cluster"
41+
```
42+
43+
## Workflow
44+
45+
Firmware collection itself is automated by a single command (Step 1). The remaining steps publish the collected values to Vault and sync them to the KBS:
46+
47+
### Step 1: Collect Firmware Reference Values
48+
49+
```bash
50+
# From the coco-pattern repository root:
51+
make collect-firmware-refvals
52+
```
53+
54+
This command:
55+
56+
1. Launches a kata pod with `runtimeClassName: kata-cc`
57+
2. Installs veritas inside the pod
58+
3. Collects firmware measurements from the TEE
59+
4. Transforms output to RVPS format (JSON with arrays)
60+
5. Saves to `~/.coco-pattern/firmware-reference-values.json`
61+
6. Cleans up the pod
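
Step 4 above (wrapping each measurement in an array) can be sketched as follows. This is an illustration only, not the script's actual implementation: the flat `measurements` mapping is a hypothetical input shape, and the real veritas output may be structured differently.

```python
from typing import Mapping

# Field names taken from the output format documented in this guide.
RVPS_FIELDS = ("mr_td", "rtmr_1", "rtmr_2", "snp_launch_measurement", "xfam")


def to_rvps_format(measurements: Mapping[str, str]) -> dict:
    """Wrap each scalar measurement in a one-element array.

    `measurements` is a hypothetical flat mapping of field name to hex string.
    """
    result = {}
    for field in RVPS_FIELDS:
        value = measurements.get(field, "").strip().lower()
        # Fields the TEE did not report become empty arrays ("not available").
        result[field] = [value] if value else []
    return result
```

Normalizing to lowercase here keeps later string comparisons in the policy from failing on case differences alone.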
62+
63+
**Output format** (`~/.coco-pattern/firmware-reference-values.json`):
64+
65+
```json
66+
{
67+
"mr_td": ["a1b2c3d4..."],
68+
"rtmr_1": ["e5f6a7b8..."],
69+
"rtmr_2": ["c9d0e1f2..."],
70+
"snp_launch_measurement": ["f3e4d5c6..."],
71+
"xfam": ["e742060000000000"]
72+
}
73+
```
74+
75+
**Key points:**
76+
77+
- Each field is an **array** of strings (supports multiple valid values)
78+
- Hash values are lowercase hex strings (SHA-384 = 96 hex chars for TDX/SNP firmware)
79+
- Empty arrays `[]` mean "not available" - attestation will skip that check
80+
- Only the fields for the detected TEE type (TDX or SNP) are populated; the example above shows both sets for illustration
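
The key points above can be checked mechanically before loading the file into Vault. The following is a minimal sketch, assuming the four measurement fields are all SHA-384 values; `validate_refvals` is a hypothetical helper and is not part of the collection script.

```python
import re

HEX_RE = re.compile(r"[0-9a-f]+")
# Fields described in this guide as SHA-384 measurements (96 hex characters).
SHA384_FIELDS = {"mr_td", "rtmr_1", "rtmr_2", "snp_launch_measurement"}


def validate_refvals(doc: dict) -> list:
    """Return a list of problems; an empty list means the document looks well-formed."""
    problems = []
    for field, values in doc.items():
        if not isinstance(values, list):
            problems.append(f"{field}: expected an array, got {type(values).__name__}")
            continue
        for value in values:
            if not isinstance(value, str) or not HEX_RE.fullmatch(value):
                problems.append(f"{field}: {value!r} is not a lowercase hex string")
            elif field in SHA384_FIELDS and len(value) != 96:
                problems.append(f"{field}: expected 96 hex chars, got {len(value)}")
    return problems
```

To use it, load `~/.coco-pattern/firmware-reference-values.json` with `json.load` and pass the resulting dictionary in. Empty arrays pass, since they only mean "not available".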
81+
82+
### Step 2: Enable in values-secret.yaml
83+
84+
Uncomment the `firmwareReferenceValues` section in `~/values-secret-coco-pattern.yaml`:
85+
86+
```yaml
87+
- name: firmwareReferenceValues
88+
  vaultPrefixes:
89+
  - hub
90+
  fields:
91+
  - name: json
92+
    path: ~/.coco-pattern/firmware-reference-values.json
93+
```
94+
95+
### Step 3: Load Secrets to Vault
96+
97+
```bash
98+
make load-secrets
99+
```
100+
101+
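The validated patterns framework reads `values-secret-coco-pattern.yaml` and pushes the firmware values to Vault. On a KV v2 mount the API path is `secret/data/hub/firmwareReferenceValues`; the `vault kv` CLI addresses the same secret as `secret/hub/firmwareReferenceValues`.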
102+
103+
### Step 4: Verify Upload
104+
105+
```bash
106+
# Check the secret was written to Vault
107+
vault kv get secret/hub/firmwareReferenceValues
108+
```
109+
110+
Expected output shows a single `json` key containing the full JSON object.
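
To inspect the stored values rather than just confirm the key exists, the `json` field can be decoded. This sketch assumes a KV v2 mount and the output of `vault kv get -format=json`, where the secret's fields sit under `data.data`; `extract_refvals` is a hypothetical helper name.

```python
import json


def extract_refvals(vault_output: str) -> dict:
    """Decode the `json` field from `vault kv get -format=json` output (KV v2 layout)."""
    envelope = json.loads(vault_output)
    # KV v2 nests the secret's key/value pairs under data.data.
    fields = envelope["data"]["data"]
    # The secret stores the whole reference-value document as one JSON string.
    return json.loads(fields["json"])
```

The result should match the contents of the local `firmware-reference-values.json` file.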
111+
112+
### Step 5: Deploy/Sync KBS
113+
114+
If the KBS cluster is already running:
115+
116+
```bash
117+
# Force ExternalSecret to re-sync from Vault
118+
oc delete externalsecret firmware-refvals-eso -n trustee-operator-system
119+
120+
# Verify the secret synced
121+
oc get secret firmware-reference-values -n trustee-operator-system
122+
123+
# Check RVPS ConfigMap contains firmware entries
124+
oc get configmap rvps-reference-values -n trustee-operator-system -o yaml
125+
```
126+
127+
If deploying fresh:
128+
129+
```bash
130+
make install
131+
```
132+
133+
The RVPS will automatically reload reference values from the `rvps-reference-values` ConfigMap.
134+
135+
## Multi-OCP-Version Support
136+
137+
Different OpenShift versions may have different firmware measurements due to kernel/initrd changes. To support multiple versions:
138+
139+
1. **Collect from each version:**
140+
141+
```bash
142+
# OCP 4.18 cluster
143+
make collect-firmware-refvals
144+
145+
# OCP 4.19 cluster
146+
make collect-firmware-refvals-merge
147+
```
148+
149+
2. **The merge automatically deduplicates:**
150+
151+
The `--merge` flag (used by `collect-firmware-refvals-merge`) reads the existing file, unions the arrays, and deduplicates:
152+
153+
```json
154+
{
155+
"mr_td": ["<4.18-value>", "<4.19-value>"],
156+
"rtmr_2": ["<4.18-kernel>", "<4.19-kernel>"]
157+
}
158+
```
159+
160+
3. **Load merged values to Vault:**
161+
162+
```bash
163+
make load-secrets
164+
```
165+
166+
The attestation policy uses `in` checks - a pod passes if its measurement matches **any** value in the array.
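
The merge and the "match any value" semantics described above can be sketched together. This is an illustration under assumptions, not the script's or the policy's actual code: `merge_refvals` and `measurement_allowed` are hypothetical names, and the skip-on-empty behavior follows the note that empty arrays mean "not available".

```python
def merge_refvals(existing: dict, collected: dict) -> dict:
    """Union the arrays field by field, dropping duplicates but keeping first-seen order."""
    merged = {}
    for field in list(existing) + [f for f in collected if f not in existing]:
        combined = existing.get(field, []) + collected.get(field, [])
        # dict.fromkeys removes duplicates while preserving insertion order.
        merged[field] = list(dict.fromkeys(combined))
    return merged


def measurement_allowed(measurement: str, allowed: list) -> bool:
    """Mirror the policy's `in` check; an empty array means the check is skipped."""
    return not allowed or measurement in allowed
```

Because the union only ever grows, values from retired OCP versions stay accepted until they are removed from the file by hand.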
167+
168+
## Advanced Options
169+
170+
The collection script supports several options:
171+
172+
```bash
173+
# Merge with existing file
174+
./scripts/collect-firmware-refvals.sh --merge
175+
176+
# Use different namespace for collection pod
177+
./scripts/collect-firmware-refvals.sh --namespace my-namespace
178+
179+
# Override output file
180+
./scripts/collect-firmware-refvals.sh --output /custom/path/firmware.json
181+
182+
# Use different RuntimeClass (for peer-pods/Azure)
183+
./scripts/collect-firmware-refvals.sh --runtime-class kata-remote
184+
185+
# Use custom base image
186+
./scripts/collect-firmware-refvals.sh --pod-image myregistry.io/custom-ubi9:latest
187+
188+
# Show all options
189+
./scripts/collect-firmware-refvals.sh --help
190+
```
191+
192+
## Known Limitations (Veritas Gaps)
193+
194+
As of veritas 0.2.0, the following are **not** collected and must be added manually if needed:
195+
196+
### 1. TCB Version Numbers
197+
198+
Veritas does not extract minimum TCB version numbers (bootloader, microcode, SNP, TEE). These are available in the attestation evidence but not in the veritas JSON output.
199+
200+
**Workaround:** Extract from attestation quotes manually if needed. Add to the JSON file as:
201+
202+
```json
203+
{
204+
"tcb_bootloader_min": "3",
205+
"tcb_snp_min": "20",
206+
"tcb_microcode_min": "115"
207+
}
208+
```
209+
210+
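Then update the attestation policy to check the reported value against the minimum. The example JSON stores the minimums as strings, so make sure the policy compares numbers (for example by converting with Rego's `to_number`):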
211+
212+
```rego
213+
input.snp.report.reported_tcb.bootloader >= tcb_bootloader_min
214+
```
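
The reason the minimums need a numeric comparison is easiest to see outside Rego. In this sketch, `meets_min_tcb` is a hypothetical helper; the `bootloader` component name follows the Rego path above, while `snp` and `microcode` are assumed to be named the same way in the report.

```python
def meets_min_tcb(reported: dict, minimums: dict) -> bool:
    """Check every `tcb_<component>_min` entry against the reported TCB as integers."""
    for key, minimum in minimums.items():
        component = key.removeprefix("tcb_").removesuffix("_min")
        # Compare as integers: as strings, "115" sorts before "20".
        if int(reported[component]) < int(minimum):
            return False
    return True
```

A plain string comparison would reject microcode version `115` against a minimum of `20`, which is the opposite of the intended result.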
215+
216+
### 2. SNP Policy Bits
217+
218+
The SNP guest policy contains multiple flags (smt_allowed, migrate_ma, debug, etc.). Veritas reports the full policy word but does not break it into individual enforcement rules.
219+
220+
To enforce specific policy bits, add to attestation policy:
221+
222+
```rego
223+
input.snp.report.policy.smt_allowed == false
224+
input.snp.report.policy.debug == false
225+
```
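
When only the raw policy word is available, the individual flags can be decoded by bit position. The positions below follow the guest policy layout in AMD's SEV-SNP firmware ABI specification as best understood here; treat them as an assumption and verify them against the spec revision that matches your firmware. `decode_snp_policy` is a hypothetical helper.

```python
# Assumed bit positions from the SEV-SNP guest policy layout (verify against the spec).
POLICY_BITS = {
    "smt_allowed": 16,
    "migrate_ma": 18,
    "debug": 19,
    "single_socket": 20,
}


def decode_snp_policy(policy: int) -> dict:
    """Split a 64-bit SNP guest policy word into named flags and ABI version fields."""
    flags = {name: bool((policy >> bit) & 1) for name, bit in POLICY_BITS.items()}
    flags["abi_minor"] = policy & 0xFF
    flags["abi_major"] = (policy >> 8) & 0xFF
    return flags
```

For example, a policy word of `0x30000` decodes to SMT allowed with debug and migration disabled.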
226+
227+
### 3. Container Image Measurements
228+
229+
Veritas does not measure the application container image digest. Image policy enforcement is handled separately via:
230+
231+
- Confidential Data Hub (CDH) pulling image from KBS
232+
- Kyverno policies validating image signatures (cosign, Notary)
233+
234+
## Troubleshooting
235+
236+
### Collection script fails to launch pod
237+
238+
**Symptom:** `oc apply` fails or pod stuck in Pending
239+
240+
**Check:**
241+
242+
- RuntimeClass `kata-cc` exists: `oc get runtimeclass kata-cc`
243+
- KataConfig is Ready: `oc get kataconfig kata-config`
244+
- Node has sufficient resources
245+
246+
### Veritas collection fails
247+
248+
**Symptom:** `veritas collect` returns empty output or an error
249+
250+
**Check:**
251+
252+
1. Pod is using correct RuntimeClass (kata-cc for bare metal)
253+
2. Pod is actually running on bare metal hardware (not Azure peer-pods)
254+
3. TEE device exists inside pod: `oc exec <pod> -- ls /dev/tdx_guest` (TDX) or `oc exec <pod> -- ls /dev/sev-guest` (SNP)
255+
4. Veritas installed correctly: `oc exec <pod> -- veritas --version`
256+
257+
### KBS attestation still passes without firmware values
258+
259+
**Expected behavior:** The attestation policy has backwards-compatible fallback rules. If no firmware reference values are in RVPS, the policy only checks `init_data`.
260+
261+
To **enforce** firmware checks, remove the fallback rules from `attestation-policy.yaml`:
262+
263+
```rego
264+
# Remove these "hardware := 2 if count(query_reference_value(...)) == 0" rules
265+
```
266+
267+
### Hash mismatch after cluster upgrade
268+
269+
**Cause:** The kernel or firmware was updated, changing `rtmr_1`, `rtmr_2`, or `mr_td`
270+
271+
**Fix:** Re-collect firmware values from the upgraded cluster and merge them into the existing file:
272+
273+
```bash
274+
make collect-firmware-refvals-merge
275+
make load-secrets
276+
```
277+
278+
## SHA-256 vs SHA-384
279+
280+
You may notice different hash algorithms in different contexts:
281+
282+
- **init_data TOML**: SHA-256 (CoCo initdata spec, used for PCR8 extend)
283+
- **Bare metal TDX firmware**: SHA-384 (Intel TDX architecture requirement)
284+
- **Bare metal SNP firmware**: SHA-384 (AMD SEV-SNP architecture requirement)
285+
- **Azure vTPM PCRs**: SHA-256
286+
287+
These are **correct** - they're different mechanisms at different layers. The attestation policy checks these independently.
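
A quick way to tell which mechanism a value came from is its length. The sketch below uses Python's standard `hashlib` on an arbitrary placeholder input to show the two hex lengths.

```python
import hashlib

data = b"example initdata or firmware blob"  # placeholder input
sha256_hex = hashlib.sha256(data).hexdigest()
sha384_hex = hashlib.sha384(data).hexdigest()

# 32-byte and 48-byte digests, hex-encoded at two characters per byte.
print(len(sha256_hex), len(sha384_hex))  # prints: 64 96
```

A 64-character value in a bare metal firmware field, or a 96-character value in an `init_data` check, usually means the value was copied from the wrong source.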
288+
289+
## Security Considerations
290+
291+
### Threat Model
292+
293+
Firmware reference values protect against:
294+
295+
- Unauthorized firmware modifications (malicious OVMF, compromised bootloader)
296+
- Kernel tampering (different kernel than expected)
297+
- Debug mode enabled (allows memory inspection via hypervisor)
298+
299+
Choose the level of enforcement (fallback rules kept or removed, optional TCB and policy-bit checks) appropriate for your threat model.
300+
301+
### Debug Mode
302+
303+
The attestation policy enforces `debug == false` for both TDX and SNP. Debug mode allows:
304+
305+
- Memory inspection via hypervisor
306+
- Single-stepping the guest
307+
- Extracting secrets from guest memory
308+
309+
**Production workloads must run with debug disabled.** If attestation fails due to debug mode, do not disable the check - fix the KataConfig to disable debug.
310+
311+
## References
312+
313+
- [Veritas Documentation](https://github.com/confidential-containers/veritas)
314+
- [Intel TDX Attestation Spec](https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html)
315+
- [AMD SEV-SNP Attestation Spec](https://www.amd.com/en/developer/sev.html)
316+
- [Trustee Attestation Policy Reference](https://github.com/openshift/trustee-operator/tree/main/config/templates)
