| 1 | +# Firmware Reference Values for Bare Metal Attestation |
| 2 | + |
| 3 | +This guide explains how to collect firmware reference values for bare metal confidential computing deployments (Intel TDX / AMD SEV-SNP). |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +Firmware reference values are cryptographic measurements of the Trusted Computing Base (TCB) components: |
| 8 | + |
| 9 | +- **Intel TDX**: `mr_td` (OVMF code hash), `rtmr_1` (kernel/initrd), `rtmr_2` (cmdline), `xfam` (extended features) |
| 10 | +- **AMD SEV-SNP**: `snp_launch_measurement` (firmware/kernel/initrd hash) |
| 11 | + |
| 12 | +These values are used by the KBS attestation policy to verify that confidential workloads are running on approved firmware with expected security properties. |
| 13 | + |
| 14 | +## Prerequisites |
| 15 | + |
| 16 | +### 1. Veritas Tool |
| 17 | + |
| 18 | +The [veritas](https://github.com/confidential-containers/veritas) tool collects attestation evidence from confidential VMs. |
| 19 | + |
| 20 | +**Installation:** Veritas is automatically installed inside the collection pod by the script. No local installation required. |
| 21 | + |
| 22 | +**Version requirement:** 0.2.0 or later |
| 23 | + |
| 24 | +### 2. Bare Metal Cluster Access |
| 25 | + |
| 26 | +You need: |
| 27 | + |
| 28 | +- A running bare metal cluster with Intel TDX or AMD SEV-SNP hardware |
| 29 | +- KataConfig deployed and in Ready state |
| 30 | +- At least one kata pod successfully running (proves TEE is functional) |
| 31 | +- `oc` CLI logged in to the cluster |
| 32 | +- `jq` installed locally |
| 33 | + |
| 34 | +### 3. Local Tools |
| 35 | + |
| 36 | +```bash |
| 37 | +# Check prerequisites |
| 38 | +command -v oc && echo "✓ oc CLI installed" |
| 39 | +command -v jq && echo "✓ jq installed" |
| 40 | +oc whoami && echo "✓ Logged in to cluster" |
| 41 | +``` |
| 42 | + |
| 43 | +## Workflow |
| 44 | + |
| 45 | +Collection itself is automated by a single command (Step 1). The remaining steps publish the collected values to Vault and the KBS. |
| 46 | + |
| 47 | +### Step 1: Collect Firmware Reference Values |
| 48 | + |
| 49 | +```bash |
| 50 | +# From the coco-pattern repository root: |
| 51 | +make collect-firmware-refvals |
| 52 | +``` |
| 53 | + |
| 54 | +This command: |
| 55 | + |
| 56 | +1. Launches a kata pod that uses the `kata-cc` RuntimeClass |
| 57 | +2. Installs veritas inside the pod |
| 58 | +3. Collects firmware measurements from the TEE |
| 59 | +4. Transforms output to RVPS format (JSON with arrays) |
| 60 | +5. Saves to `~/.coco-pattern/firmware-reference-values.json` |
| 61 | +6. Cleans up the pod |
| 62 | + |
| 63 | +**Output format** (`~/.coco-pattern/firmware-reference-values.json`): |
| 64 | + |
| 65 | +```json |
| 66 | +{ |
| 67 | + "mr_td": ["a1b2c3d4..."], |
| 68 | + "rtmr_1": ["e5f6a7b8..."], |
| 69 | + "rtmr_2": ["c9d0e1f2..."], |
| 70 | + "snp_launch_measurement": ["f3e4d5c6..."], |
| 71 | + "xfam": ["e742060000000000"] |
| 72 | +} |
| 73 | +``` |
| 74 | + |
| 75 | +**Key points:** |
| 76 | + |
| 77 | +- Each field is an **array** of strings (supports multiple valid values) |
| 78 | +- Hash values are lowercase hex strings (SHA-384 = 96 hex chars for TDX/SNP firmware) |
| 79 | +- Empty arrays `[]` mean "not available" - attestation will skip that check |
| 80 | +- Only the fields for the detected TEE type (TDX or SNP) are populated |
| 81 | + |
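The key points above can be turned into a quick sanity check before the file is loaded into Vault. The following is a minimal sketch, not part of the collection script: the function name `validate_refvals` is hypothetical, and treating `mr_td`, `rtmr_1`, `rtmr_2`, and `snp_launch_measurement` as 96-character SHA-384 digests is an assumption drawn from the key points. Other fields such as `xfam` are only checked for being lowercase hex.

```python
import re

# Assumption: these four fields carry SHA-384 digests (96 lowercase hex chars).
DIGEST_FIELDS = ("mr_td", "rtmr_1", "rtmr_2", "snp_launch_measurement")
SHA384_HEX = re.compile(r"[0-9a-f]{96}")
LOWER_HEX = re.compile(r"[0-9a-f]+")


def validate_refvals(doc):
    """Return a list of problems; an empty list means the document looks well-formed."""
    problems = []
    for field, values in doc.items():
        # Every field must be an array, even when it holds a single value.
        if not isinstance(values, list):
            problems.append(f"{field}: expected an array, got {type(values).__name__}")
            continue
        pattern = SHA384_HEX if field in DIGEST_FIELDS else LOWER_HEX
        for value in values:
            if not isinstance(value, str) or not pattern.fullmatch(value):
                problems.append(f"{field}: unexpected value {value!r}")
    return problems
```

To use it, parse the output file with `json.load` and pass the resulting dictionary to `validate_refvals`. Empty arrays pass, matching the "not available" convention.
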
| 82 | +### Step 2: Enable in values-secret.yaml |
| 83 | + |
| 84 | +Uncomment the `firmwareReferenceValues` section in `~/values-secret-coco-pattern.yaml`: |
| 85 | + |
| 86 | +```yaml |
| 87 | +- name: firmwareReferenceValues |
| 88 | + vaultPrefixes: |
| 89 | + - hub |
| 90 | + fields: |
| 91 | + - name: json |
| 92 | + path: ~/.coco-pattern/firmware-reference-values.json |
| 93 | +``` |
| 94 | + |
| 95 | +### Step 3: Load Secrets to Vault |
| 96 | + |
| 97 | +```bash |
| 98 | +make load-secrets |
| 99 | +``` |
| 100 | + |
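| 101 | +The validated patterns framework reads `values-secret-coco-pattern.yaml` and pushes the firmware values to Vault at `secret/data/hub/firmwareReferenceValues`. This is the KV v2 API path; the `vault kv` CLI addresses the same secret without the `data/` segment. |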
| 102 | + |
| 103 | +### Step 4: Verify Upload |
| 104 | + |
| 105 | +```bash |
| 106 | +# Check the secret was written to Vault |
| 107 | +vault kv get secret/hub/firmwareReferenceValues |
| 108 | +``` |
| 109 | + |
| 110 | +Expected output shows a single `json` key containing the full JSON object. |
| 111 | + |
| 112 | +### Step 5: Deploy/Sync KBS |
| 113 | + |
| 114 | +If the KBS cluster is already running: |
| 115 | + |
| 116 | +```bash |
| 117 | +# Force ExternalSecret to re-sync from Vault |
| 118 | +oc delete externalsecret firmware-refvals-eso -n trustee-operator-system |
| 119 | + |
| 120 | +# Verify the secret synced |
| 121 | +oc get secret firmware-reference-values -n trustee-operator-system |
| 122 | + |
| 123 | +# Check RVPS ConfigMap contains firmware entries |
| 124 | +oc get configmap rvps-reference-values -n trustee-operator-system -o yaml |
| 125 | +``` |
| 126 | + |
| 127 | +If deploying fresh: |
| 128 | + |
| 129 | +```bash |
| 130 | +make install |
| 131 | +``` |
| 132 | + |
| 133 | +The RVPS will automatically reload reference values from the `rvps-reference-values` ConfigMap. |
| 134 | + |
| 135 | +## Multi-OCP-Version Support |
| 136 | + |
| 137 | +Different OpenShift versions may have different firmware measurements due to kernel/initrd changes. To support multiple versions: |
| 138 | + |
| 139 | +1. **Collect from each version:** |
| 140 | + |
| 141 | + ```bash |
| 142 | +   # While logged in to the OCP 4.18 cluster |
| 143 | +   make collect-firmware-refvals |
| 144 | + |
| 145 | +   # While logged in to the OCP 4.19 cluster (merges into the existing file) |
| 146 | +   make collect-firmware-refvals-merge |
| 147 | + ``` |
| 148 | + |
| 149 | +2. **The merge automatically deduplicates:** |
| 150 | + |
| 151 | + The `--merge` flag (used by `collect-firmware-refvals-merge`) reads the existing file, unions the arrays, and deduplicates: |
| 152 | + |
| 153 | + ```json |
| 154 | + { |
| 155 | + "mr_td": ["<4.18-value>", "<4.19-value>"], |
| 156 | +     "rtmr_1": ["<4.18-kernel>", "<4.19-kernel>"] |
| 157 | + } |
| 158 | + ``` |
| 159 | + |
| 160 | +3. **Load merged values to Vault:** |
| 161 | + |
| 162 | + ```bash |
| 163 | + make load-secrets |
| 164 | + ``` |
| 165 | + |
| 166 | +The attestation policy uses `in` checks - a pod passes if its measurement matches **any** value in the array. |
| 167 | + |
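The merge and the `in` check described above can be sketched as follows. This only illustrates the union-and-deduplicate idea and is not the script's actual implementation; `merge_refvals` and `measurement_allowed` are hypothetical names, and the policy's fallback for empty arrays is not modelled.

```python
def merge_refvals(existing, collected):
    """Union the arrays field by field, keeping first-seen order and dropping duplicates."""
    merged = {}
    # Keep the existing field order, then append fields that are new in this collection.
    fields = list(existing) + [f for f in collected if f not in existing]
    for field in fields:
        combined = existing.get(field, []) + collected.get(field, [])
        # dict.fromkeys preserves insertion order while removing repeats.
        merged[field] = list(dict.fromkeys(combined))
    return merged


def measurement_allowed(merged, field, measured):
    """Mirror of an `in` check: pass if the measurement equals any listed value."""
    return measured in merged.get(field, [])
```

Because the merge is a set union, running it again with the same input leaves the file unchanged, so collecting twice from one cluster is harmless.
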
| 168 | +## Advanced Options |
| 169 | + |
| 170 | +The collection script supports several options: |
| 171 | + |
| 172 | +```bash |
| 173 | +# Merge with existing file |
| 174 | +./scripts/collect-firmware-refvals.sh --merge |
| 175 | + |
| 176 | +# Use different namespace for collection pod |
| 177 | +./scripts/collect-firmware-refvals.sh --namespace my-namespace |
| 178 | + |
| 179 | +# Override output file |
| 180 | +./scripts/collect-firmware-refvals.sh --output /custom/path/firmware.json |
| 181 | + |
| 182 | +# Use different RuntimeClass (for peer-pods/Azure) |
| 183 | +./scripts/collect-firmware-refvals.sh --runtime-class kata-remote |
| 184 | + |
| 185 | +# Use custom base image |
| 186 | +./scripts/collect-firmware-refvals.sh --pod-image myregistry.io/custom-ubi9:latest |
| 187 | + |
| 188 | +# Show all options |
| 189 | +./scripts/collect-firmware-refvals.sh --help |
| 190 | +``` |
| 191 | + |
| 192 | +## Known Limitations (Veritas Gaps) |
| 193 | + |
| 194 | +As of veritas 0.2.0, the following are **not** collected and must be added manually if needed: |
| 195 | + |
| 196 | +### 1. TCB Version Numbers |
| 197 | + |
| 198 | +Veritas does not extract minimum TCB version numbers (bootloader, microcode, SNP, TEE). These are available in the attestation evidence but not in the veritas JSON output. |
| 199 | + |
| 200 | +**Workaround:** Extract from attestation quotes manually if needed. Add to the JSON file as: |
| 201 | + |
| 202 | +```json |
| 203 | +{ |
| 204 | + "tcb_bootloader_min": "3", |
| 205 | + "tcb_snp_min": "20", |
| 206 | + "tcb_microcode_min": "115" |
| 207 | +} |
| 208 | +``` |
| 209 | + |
| 210 | +Then update the attestation policy to check: |
| 211 | + |
| 212 | +```rego |
| 213 | +input.snp.report.reported_tcb.bootloader >= to_number(tcb_bootloader_min)  # the JSON above stores the minimum as a string |
| 214 | +``` |
| 215 | + |
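Because the workaround stores the minimums as strings, any comparison has to convert them to numbers first: compared as text, `"99"` sorts after `"115"`. The sketch below shows the intended numeric comparison. The function name `tcb_shortfalls` and the mapping from JSON keys to reported TCB component names are assumptions for illustration, not part of veritas or the attestation policy.

```python
# Assumed mapping from the JSON keys above to component names in the reported TCB.
TCB_FIELDS = {
    "tcb_bootloader_min": "bootloader",
    "tcb_snp_min": "snp",
    "tcb_microcode_min": "microcode",
}


def tcb_shortfalls(reported_tcb, minimums):
    """Return the components whose reported version is below the configured minimum."""
    failing = []
    for key, component in TCB_FIELDS.items():
        if key not in minimums:
            continue  # no minimum configured for this component
        # Convert both sides to int so "99" is compared as 99, not as text.
        if int(reported_tcb[component]) < int(minimums[key]):
            failing.append(component)
    return failing
```

An empty result means every configured minimum is met.
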
| 216 | +### 2. SNP Policy Bits |
| 217 | + |
| 218 | +The SNP guest policy contains multiple flags (smt_allowed, migrate_ma, debug, etc.). Veritas reports the full policy word but does not break it into individual enforcement rules. |
| 219 | + |
| 220 | +To enforce specific policy bits, add to attestation policy: |
| 221 | + |
| 222 | +```rego |
| 223 | +input.snp.report.policy.smt_allowed == false |
| 224 | +input.snp.report.policy.debug == false |
| 225 | +``` |
| 226 | + |
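If only the raw policy word is available, the individual flags can be recovered by bit masking. The sketch below assumes the policy is available as an integer and uses the bit positions from the guest policy layout in AMD's SEV-SNP firmware ABI specification (SMT at bit 16, MIGRATE_MA at bit 18, DEBUG at bit 19, SINGLE_SOCKET at bit 20). The function name `decode_snp_policy` is hypothetical; check the positions against the spec revision you target.

```python
# Bit positions per the SEV-SNP guest policy layout; bit 17 is reserved.
SNP_POLICY_BITS = {
    "smt_allowed": 16,
    "migrate_ma": 18,
    "debug": 19,
    "single_socket": 20,
}


def decode_snp_policy(policy):
    """Split a guest policy word into named boolean flags and ABI version fields."""
    flags = {name: bool((policy >> bit) & 1) for name, bit in SNP_POLICY_BITS.items()}
    flags["abi_minor"] = policy & 0xFF          # bits 7:0
    flags["abi_major"] = (policy >> 8) & 0xFF   # bits 15:8
    return flags
```

For example, a policy word of `0x30000` decodes to SMT allowed with debug and migration agent both disabled.
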
| 227 | +### 3. Container Image Measurements |
| 228 | + |
| 229 | +Veritas does not measure the application container image digest. Image policy enforcement is handled separately via: |
| 230 | + |
| 231 | +- Confidential Data Hub (CDH) pulling image from KBS |
| 232 | +- Kyverno policies validating image signatures (cosign, Notary) |
| 233 | + |
| 234 | +## Troubleshooting |
| 235 | + |
| 236 | +### Collection script fails to launch pod |
| 237 | + |
| 238 | +**Symptom:** `oc apply` fails or pod stuck in Pending |
| 239 | + |
| 240 | +**Check:** |
| 241 | + |
| 242 | +- RuntimeClass `kata-cc` exists: `oc get runtimeclass kata-cc` |
| 243 | +- KataConfig is Ready: `oc get kataconfig kata-config` |
| 244 | +- Node has sufficient resources |
| 245 | + |
| 246 | +### Veritas collection fails |
| 247 | + |
| 248 | +**Symptom:** `veritas collect` returns empty output or an error |
| 249 | + |
| 250 | +**Check:** |
| 251 | + |
| 252 | +1. Pod is using correct RuntimeClass (kata-cc for bare metal) |
| 253 | +2. Pod is actually running on bare metal hardware (not Azure peer-pods) |
| 254 | +3. TEE device exists inside pod: `oc exec <pod> -- ls /dev/tdx_guest` (TDX) or `oc exec <pod> -- ls /dev/sev-guest` (SNP) |
| 255 | +4. Veritas installed correctly: `oc exec <pod> -- veritas --version` |
| 256 | + |
| 257 | +### KBS attestation still passes without firmware values |
| 258 | + |
| 259 | +**Expected behavior:** The attestation policy has backwards-compatible fallback rules. If no firmware reference values are in RVPS, the policy only checks `init_data`. |
| 260 | + |
| 261 | +To **enforce** firmware, remove the fallback rules from `attestation-policy.yaml`: |
| 262 | + |
| 263 | +```rego |
| 264 | +# Remove these "hardware := 2 if count(query_reference_value(...)) == 0" rules |
| 265 | +``` |
| 266 | + |
| 267 | +### Hash mismatch after cluster upgrade |
| 268 | + |
| 269 | +**Cause:** The upgrade changed the firmware or kernel, which changes `mr_td` (firmware) or `rtmr_1` (kernel/initrd) |
| 270 | + |
| 271 | +**Fix:** Re-collect firmware values from the upgraded cluster and merge them into the existing file: |
| 272 | + |
| 273 | +```bash |
| 274 | +make collect-firmware-refvals-merge |
| 275 | +make load-secrets |
| 276 | +``` |
| 277 | + |
| 278 | +## SHA-256 vs SHA-384 |
| 279 | + |
| 280 | +You may notice different hash algorithms in different contexts: |
| 281 | + |
| 282 | +- **init_data TOML**: SHA-256 (CoCo initdata spec, used for PCR8 extend) |
| 283 | +- **Bare metal TDX firmware**: SHA-384 (Intel TDX architecture requirement) |
| 284 | +- **Bare metal SNP firmware**: SHA-384 (AMD SEV-SNP architecture requirement) |
| 285 | +- **Azure vTPM PCRs**: SHA-256 |
| 286 | + |
| 287 | +These are **correct** - they're different mechanisms at different layers. The attestation policy checks these independently. |
| 288 | + |
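A quick way to tell the two apart in practice is the length of the hex string: a SHA-256 digest is 64 hex characters and a SHA-384 digest is 96. The short sketch below confirms this with Python's standard `hashlib`; the helper name `hex_digest_lengths` is only for illustration.

```python
import hashlib


def hex_digest_lengths(data):
    """Return the hex-string length produced by each algorithm for the given bytes."""
    return {
        "sha256": len(hashlib.sha256(data).hexdigest()),
        "sha384": len(hashlib.sha384(data).hexdigest()),
    }
```

The lengths do not depend on the input, so a 64-character value in a firmware field is a sign that a value from the wrong layer was pasted in.
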
| 289 | +## Security Considerations |
| 290 | + |
| 291 | +### Threat Model |
| 292 | + |
| 293 | +Firmware reference values protect against: |
| 294 | + |
| 295 | +- Unauthorized firmware modifications (malicious OVMF, compromised bootloader) |
| 296 | +- Kernel tampering (different kernel than expected) |
| 297 | +- Debug mode enabled (allows memory inspection via hypervisor) |
| 298 | + |
| 299 | +Choose which of these checks to enforce based on your threat model. |
| 300 | + |
| 301 | +### Debug Mode |
| 302 | + |
| 303 | +The attestation policy enforces `debug == false` for both TDX and SNP. Debug mode allows: |
| 304 | + |
| 305 | +- Memory inspection via hypervisor |
| 306 | +- Single-stepping the guest |
| 307 | +- Extracting secrets from guest memory |
| 308 | + |
| 309 | +**Production workloads must run with debug disabled.** If attestation fails due to debug mode, do not disable the check - fix the KataConfig to disable debug. |
| 310 | + |
| 311 | +## References |
| 312 | + |
| 313 | +- [Veritas Documentation](https://github.com/confidential-containers/veritas) |
| 314 | +- [Intel TDX Attestation Spec](https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html) |
| 315 | +- [AMD SEV-SNP Attestation Spec](https://www.amd.com/en/developer/sev.html) |
| 316 | +- [Trustee Attestation Policy Reference](https://github.com/openshift/trustee-operator/tree/main/config/templates) |