|
| 1 | +# Firmware Reference Values for Bare Metal Attestation |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +Firmware reference values provide cryptographic measurements of the trusted computing base (TCB) for Intel TDX and AMD SEV-SNP confidential VMs running on bare metal. These values enable attestation policies to verify that workloads are running on known-good firmware with the expected security configuration. |
| 6 | + |
| 7 | +Without firmware reference values, attestation verifies only `init_data` (the runtime configuration hash). With firmware values, the Key Broker Service (KBS) can also enforce: |
| 8 | + |
| 9 | +- **Hardware integrity**: Verify firmware measurements (MRTD, RTMRs, launch measurement) |
| 10 | +- **TCB version**: Ensure minimum firmware/microcode versions |
| 11 | +- **Security configuration**: Enforce debug-disabled mode |
| 12 | + |
| 13 | +## Architecture |
| 14 | + |
| 15 | +### Intel TDX Measurements |
| 16 | + |
| 17 | +- **mr_td** (SHA-384): Initial contents of the TD (Trust Domain) - firmware + initial page tables |
| 18 | +- **rtmr_1** (SHA-384): Guest firmware + bootloader measurements |
| 19 | +- **rtmr_2** (SHA-384): Kernel + initrd measurements |
| 20 | +- **xfam** (hex): Extended feature mask - CPU features available to the TD |
| 21 | + |
| 22 | +### AMD SEV-SNP Measurements |
| 23 | + |
| 24 | +- **snp_launch_measurement** (SHA-384): Hash of initial guest memory contents + VMSA |
| 25 | +- **debug flag**: Policy bit indicating whether debug is allowed |
| 26 | + |
| 27 | +### Hash Algorithm Clarification |
| 28 | + |
| 29 | +Different layers use different hash algorithms - this is **correct and expected**: |
| 30 | + |
| 31 | +| Layer | Algorithm | Why | |
| 32 | +|-------|-----------|-----| |
| 33 | +| init_data (OSC TOML) | SHA-256 | CoCo initdata spec, extends into vTPM PCR8 | |
| 34 | +| TDX firmware (mr_td, rtmr_*) | SHA-384 | Intel TDX architecture requirement | |
| 35 | +| SNP firmware (launch_measurement) | SHA-384 | AMD SEV-SNP architecture requirement | |
| 36 | +| Azure vTPM PCRs | SHA-256 | TPM 2.0 default bank for virtual TPMs | |
| 37 | + |
| 38 | +The attestation policy verifies each independently - there is no conflict. |
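
The digest sizes in the table determine the string lengths to expect in reference values. A minimal Python sketch using the standard `hashlib` module illustrates this; the input bytes are a placeholder, not a real measurement:

```python
import hashlib

# Placeholder input; real measurements are produced by the TEE hardware/firmware.
sample = b"example measured content"

sha256_hex = hashlib.sha256(sample).hexdigest()
sha384_hex = hashlib.sha384(sample).hexdigest()

# SHA-256 yields 32 bytes (64 hex chars); SHA-384 yields 48 bytes (96 hex chars).
print(len(sha256_hex), len(sha384_hex))
```

A value of the wrong length for its layer is therefore a quick sign that a digest was copied into the wrong field.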
| 39 | + |
| 40 | +## Prerequisites |
| 41 | + |
| 42 | +### 1. Veritas Tool |
| 43 | + |
| 44 | +Veritas is a Python tool for collecting reference values from confidential VMs. Install via pip: |
| 45 | + |
| 46 | +```bash |
| 47 | +pip install veritas-collectd |
| 48 | +``` |
| 49 | + |
| 50 | +**Version requirement**: 0.2.0 or later |
| 51 | + |
| 52 | +### 2. Bare Metal Cluster Access |
| 53 | + |
| 54 | +You need: |
| 55 | +- A running bare metal cluster with Intel TDX or AMD SEV-SNP hardware |
| 56 | +- KataConfig deployed and in Ready state |
| 57 | +- At least one kata pod successfully running (proves TEE is functional) |
| 58 | + |
| 59 | +### 3. Vault Access |
| 60 | + |
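| 61 | +You need write access to the Vault secret at `secret/data/hub/firmwareReferenceValues`. This is the KV v2 API path; the `vault kv` CLI addresses the same secret as `secret/hub/firmwareReferenceValues`. |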
| 62 | + |
| 63 | +If using the pattern's default Vault setup: |
| 64 | +```bash |
| 65 | +# Get Vault root token from cluster |
| 66 | +oc get secret -n vault vault-init -o jsonpath='{.data.root_token}' | base64 -d |
| 67 | +``` |
| 68 | + |
| 69 | +## Workflow |
| 70 | + |
| 71 | +### Step 1: Collect Reference Values from Kata Pod |
| 72 | + |
| 73 | +Run a kata pod on the bare metal cluster and use veritas to extract firmware measurements: |
| 74 | + |
| 75 | +```bash |
| 76 | +# Create a test pod with kata-remote runtime |
| 77 | +oc apply -f - <<EOF |
| 78 | +apiVersion: v1 |
| 79 | +kind: Pod |
| 80 | +metadata: |
| 81 | + name: firmware-collector |
| 82 | + namespace: default |
| 83 | +spec: |
| 84 | + runtimeClassName: kata-remote |
| 85 | + containers: |
| 86 | + - name: busybox |
| 87 | + image: quay.io/quay/busybox:latest |
| 88 | + command: ["sleep", "3600"] |
| 89 | +EOF |
| 90 | + |
| 91 | +# Wait for pod to be Running |
| 92 | +oc wait --for=condition=Ready pod/firmware-collector -n default --timeout=300s |
| 93 | + |
| 94 | +# Exec into the pod and run veritas |
| 95 | +oc exec -it firmware-collector -n default -- sh |
| 96 | + |
| 97 | +# Inside the pod (the container image must include veritas; a stock busybox image does not): |
| 98 | +veritas collect --output /tmp/refvals.json |
| 99 | +cat /tmp/refvals.json |
| 100 | +exit |
| 101 | + |
| 102 | +# Copy the reference values out |
| 103 | +oc cp default/firmware-collector:/tmp/refvals.json ./refvals-$(oc get nodes -o jsonpath='{.items[0].status.nodeInfo.osImage}' | tr ' ' '-').json |
| 104 | +``` |
| 105 | + |
| 106 | +### Step 2: Transform to Vault Format |
| 107 | + |
| 108 | +The veritas output format differs from what KBS expects. The provided make target runs the conversion script: |
| 109 | + |
| 110 | +```bash |
| 111 | +# In the coco-pattern repository root (the glob should match exactly one file): |
| 112 | +make push-firmware-refvals REFVALS_FILE=./refvals-*.json |
| 113 | +``` |
| 114 | + |
| 115 | +This script: |
| 116 | +1. Extracts firmware measurements from veritas JSON |
| 117 | +2. Converts to the KBS/RVPS expected format (arrays of hex strings) |
| 118 | +3. Pushes to Vault at `secret/data/hub/firmwareReferenceValues` |
| 119 | + |
| 120 | +### Step 3: Vault Secret Format |
| 121 | + |
| 122 | +The script creates a secret with this structure: |
| 123 | + |
| 124 | +```json |
| 125 | +{ |
| 126 | + "mr_td": ["a1b2c3d4..."], |
| 127 | + "rtmr_1": ["e5f6a7b8..."], |
| 128 | + "rtmr_2": ["c9d0e1f2..."], |
| 129 | + "snp_launch_measurement": ["f3e4d5c6..."], |
| 130 | + "xfam": ["e742060000000000"] |
| 131 | +} |
| 132 | +``` |
| 133 | + |
| 134 | +**Key points:** |
| 135 | +- Each field is an **array** of strings (supports multiple valid values) |
| 136 | +- Hash values are lowercase hex strings (SHA-384 = 96 hex chars) |
| 137 | +- Empty arrays `[]` mean "not available" - attestation will skip that check |
| 138 | +- Missing keys are treated the same as empty arrays |
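
The key points above can be checked mechanically before anything is pushed to Vault. The following is a sketch only: `validate_refvals` and the `SHA384_FIELDS` tuple are hypothetical names introduced for illustration, and the rules encode just what this section states (arrays of lowercase hex strings, 96 characters for SHA-384 fields).

```python
import re

# Fields this document describes as SHA-384 digests (assumed list for illustration).
SHA384_FIELDS = ("mr_td", "rtmr_1", "rtmr_2", "snp_launch_measurement")
HEX_RE = re.compile(r"[0-9a-f]+")


def validate_refvals(secret: dict) -> list:
    """Return a list of problems; an empty list means the structure looks valid."""
    problems = []
    for key, values in secret.items():
        if not isinstance(values, list):
            problems.append(f"{key}: expected an array, got {type(values).__name__}")
            continue
        for value in values:
            if not isinstance(value, str) or not HEX_RE.fullmatch(value):
                problems.append(f"{key}: {value!r} is not a lowercase hex string")
            elif key in SHA384_FIELDS and len(value) != 96:
                problems.append(f"{key}: expected 96 hex chars, got {len(value)}")
    return problems


good = {"mr_td": ["ab" * 48], "rtmr_1": [], "xfam": ["e742060000000000"]}
bad = {"mr_td": "ab" * 48, "rtmr_2": ["ABCD"]}
print(validate_refvals(good))
print(validate_refvals(bad))
```

Empty arrays pass validation on purpose, since the format treats them as "not available" rather than as an error.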
| 139 | + |
| 140 | +### Step 4: Verify Upload |
| 141 | + |
| 142 | +```bash |
| 143 | +# Check the secret was written |
| 144 | +vault kv get secret/hub/firmwareReferenceValues |
| 145 | + |
| 146 | +# Expected output: |
| 147 | +# ====== Data ====== |
| 148 | +# Key Value |
| 149 | +# --- ----- |
| 150 | +# mr_td ["a1b2c3d4..."] |
| 151 | +# rtmr_1 ["e5f6a7b8..."] |
| 152 | +# ... |
| 153 | +``` |
| 154 | + |
| 155 | +### Step 5: Trigger KBS Sync |
| 156 | + |
| 157 | +The trustee-chart creates an ExternalSecret that pulls from this Vault path. Force a sync: |
| 158 | + |
| 159 | +```bash |
| 160 | +# On the cluster with KBS deployed: |
| 161 | +oc delete externalsecret firmware-refvals-eso -n trustee-operator-system |
| 162 | + |
| 163 | +# Wait for it to recreate (ArgoCD sync-wave or manual re-apply) |
| 164 | +# Verify the secret exists: |
| 165 | +oc get secret firmware-reference-values -n trustee-operator-system |
| 166 | +``` |
| 167 | + |
| 168 | +The RVPS will automatically reload reference values from the `rvps-reference-values` ConfigMap. |
| 169 | + |
| 170 | +## Multi-OCP-Version Support |
| 171 | + |
| 172 | +Different OpenShift versions may have different firmware measurements due to kernel/initrd changes. To support multiple versions: |
| 173 | + |
| 174 | +1. **Collect from each version:** |
| 175 | + ```bash |
| 176 | + # OCP 4.18 cluster |
| 177 | + veritas collect --output refvals-ocp-4.18.json |
| 178 | + |
| 179 | + # OCP 4.19 cluster |
| 180 | + veritas collect --output refvals-ocp-4.19.json |
| 181 | + ``` |
| 182 | + |
| 183 | +2. **Merge the arrays:** |
| 184 | + ```json |
| 185 | + { |
| 186 | + "mr_td": ["<4.18-value>", "<4.19-value>"], |
| 187 | + "rtmr_2": ["<4.18-kernel>", "<4.19-kernel>"] |
| 188 | + } |
| 189 | + ``` |
| 190 | + |
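| 191 | +3. **Push merged values to Vault** (`vault kv put` writes a new secret version containing only the keys you pass, so list every field you want to keep): |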
| 192 | + ```bash |
| 193 | + vault kv put secret/hub/firmwareReferenceValues \ |
| 194 | + mr_td='["val1","val2"]' \ |
| 195 | + rtmr_1='["val1","val2"]' \ |
| 196 | + rtmr_2='["val1","val2"]' |
| 197 | + ``` |
| 198 | + |
| 199 | +The attestation policy uses `in` checks - a pod passes if its measurement matches **any** value in the array. |
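
The merge step and the "any value matches" semantics can be sketched in Python. This is an illustration, not the pattern's actual tooling: `merge_refvals` and `check` are hypothetical helpers, and the short strings stand in for real 96-character digests.

```python
def merge_refvals(*collections: dict) -> dict:
    """Union the arrays of several reference-value documents, key by key."""
    merged: dict = {}
    for doc in collections:
        for key, values in doc.items():
            bucket = merged.setdefault(key, [])
            for value in values:
                if value not in bucket:  # drop duplicates, keep first-seen order
                    bucket.append(value)
    return merged


def check(measurement: str, allowed: list) -> str:
    """Mimic the described semantics: an empty array skips, otherwise any match passes."""
    if not allowed:
        return "skipped"
    return "pass" if measurement in allowed else "fail"


# Placeholder values standing in for per-version veritas output.
ocp_418 = {"mr_td": ["aa"], "rtmr_2": ["k418"]}
ocp_419 = {"mr_td": ["aa"], "rtmr_2": ["k419"]}
merged = merge_refvals(ocp_418, ocp_419)
print(merged)
print(check("k419", merged["rtmr_2"]), check("k420", merged["rtmr_2"]))
```

Deduplicating during the merge keeps the arrays short when a measurement such as `mr_td` is unchanged between versions.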
| 200 | + |
| 201 | +## Known Limitations (Veritas Gaps) |
| 202 | + |
| 203 | +As of veritas 0.2.0, the following are **not** collected and must be added manually if needed: |
| 204 | + |
| 205 | +### 1. TCB Version Numbers |
| 206 | + |
| 207 | +Veritas does not extract minimum required TCB levels (e.g., SNP microcode version). To enforce minimums, add values such as these to the reference values manually: |
| 208 | + |
| 209 | +```json |
| 210 | +{ |
| 211 | + "tcb_bootloader_min": "3", |
| 212 | + "tcb_snp_min": "20", |
| 213 | + "tcb_microcode_min": "115" |
| 214 | +} |
| 215 | +``` |
| 216 | + |
| 217 | +Then update the attestation policy to check: |
| 218 | +```rego |
| 219 | +input.snp.report.reported_tcb.bootloader >= tcb_bootloader_min |
| 220 | +``` |
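
One detail worth watching is that the minimums in the JSON example are strings, while a comparison like the one above needs numbers. The Python sketch below shows the intended logic under that assumption; `tcb_meets_minimum` and the shortened component names are hypothetical, and the reported values are made up.

```python
# Hypothetical minimums, stored as strings like the JSON example above.
TCB_MIN = {"bootloader": "3", "snp": "20", "microcode": "115"}


def tcb_meets_minimum(reported: dict, minimum: dict) -> bool:
    """True only if every component in `minimum` is present and high enough."""
    for component, floor in minimum.items():
        if component not in reported:
            return False
        # Convert both sides to int: compared as strings, "72" sorts after
        # "115", which would wrongly accept an older microcode level.
        if int(reported[component]) < int(floor):
            return False
    return True


new_enough = tcb_meets_minimum({"bootloader": 4, "snp": 22, "microcode": 209}, TCB_MIN)
too_old = tcb_meets_minimum({"bootloader": 4, "snp": 22, "microcode": 72}, TCB_MIN)
print(new_enough, too_old)
```

Treating a missing component as a failure is a deliberate choice in this sketch, so an incomplete report cannot pass by omission.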
| 221 | + |
| 222 | +### 2. SNP Policy Bits |
| 223 | + |
| 224 | +The SNP guest policy contains multiple flags (smt_allowed, migrate_ma, debug, etc.). Veritas reports the full policy word but does not break it into individual enforcement rules. |
| 225 | + |
| 226 | +To enforce specific policy bits, add to attestation policy: |
| 227 | +```rego |
| 228 | +input.snp.report.policy.smt_allowed == false |
| 229 | +input.snp.report.policy.debug == false |
| 230 | +``` |
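
If only the raw policy word is available, the individual flags can be derived by bit masking. The sketch below assumes the guest policy bit layout from the AMD SEV-SNP ABI specification (ABI minor in bits 7:0, ABI major in bits 15:8, SMT at bit 16, MIGRATE_MA at bit 18, DEBUG at bit 19, SINGLE_SOCKET at bit 20); verify the positions against the spec revision you target. `decode_snp_policy` is a hypothetical helper, not part of veritas.

```python
# Assumed bit positions from the SEV-SNP guest policy layout; confirm before use.
POLICY_BITS = {
    "smt_allowed": 16,
    "migrate_ma": 18,
    "debug": 19,
    "single_socket": 20,
}


def decode_snp_policy(policy: int) -> dict:
    """Split a raw guest policy word into named flags and ABI version fields."""
    flags = {name: bool((policy >> bit) & 1) for name, bit in POLICY_BITS.items()}
    flags["abi_minor"] = policy & 0xFF
    flags["abi_major"] = (policy >> 8) & 0xFF
    return flags


# 0x30000 sets bit 16 (SMT) and the reserved bit 17; 0xB0000 also sets bit 19 (debug).
decoded = decode_snp_policy(0x30000)
decoded_debug = decode_snp_policy(0xB0000)
print(decoded)
print(decoded_debug["debug"])
```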
| 231 | + |
| 232 | +### 3. Container Image Measurements |
| 233 | + |
| 234 | +Veritas does not measure the application container image digest. Image policy enforcement is handled separately via: |
| 235 | +- Confidential Data Hub (CDH) pulling image from KBS |
| 236 | +- Kyverno policies validating image signatures (cosign, Notary) |
| 237 | + |
| 238 | +## Troubleshooting |
| 239 | + |
| 240 | +### Veritas collection fails |
| 241 | + |
| 242 | +**Symptom:** `veritas collect` returns empty output or exits with an error |
| 243 | + |
| 244 | +**Check:** |
| 245 | +1. Pod is using `kata-remote` RuntimeClass |
| 246 | +2. Pod is actually running on bare metal (not Azure peer-pods) |
| 247 | +3. TEE device exists: `ls /dev/tdx_guest` (TDX) or `ls /dev/sev-guest` (SNP) |
| 248 | + |
| 249 | +### KBS attestation still passes without firmware values |
| 250 | + |
| 251 | +**Expected behavior:** The attestation policy has backwards-compatible fallback rules. If no firmware reference values are in RVPS, the policy only checks `init_data`. |
| 252 | + |
| 253 | +To **enforce** firmware, remove the fallback rules from `attestation-policy.yaml`: |
| 254 | +```rego |
| 255 | +# Remove these "hardware := 2 if count(query_reference_value(...)) == 0" rules |
| 256 | +``` |
| 257 | + |
| 258 | +### Hash mismatch after cluster upgrade |
| 259 | + |
| 260 | +**Cause:** Kernel/firmware updated, changing rtmr_2 or mr_td |
| 261 | + |
| 262 | +**Fix:** Re-collect firmware values from the upgraded cluster and merge them into the Vault arrays |
| 263 | + |
| 264 | +## Security Considerations |
| 265 | + |
| 266 | +### Firmware Reference Values Are Sensitive |
| 267 | + |
| 268 | +These values reveal the exact firmware/kernel configuration of your confidential cluster. Treat them as **confidential**: |
| 269 | + |
| 270 | +- Store in Vault with ACLs restricting read access |
| 271 | +- Do not commit to public Git repositories |
| 272 | +- Rotate if disclosed (re-image nodes with different firmware if possible) |
| 273 | + |
| 274 | +### Attestation Policy Trade-offs |
| 275 | + |
| 276 | +Strict firmware enforcement provides stronger security but reduces operational flexibility: |
| 277 | + |
| 278 | +| Policy | Security | Flexibility | |
| 279 | +|--------|----------|-------------| |
| 280 | +| init_data only | Medium | High - easy upgrades | |
| 281 | +| init_data + firmware | High | Low - every kernel update requires reference value refresh | |
| 282 | +| init_data + firmware + TCB min | Highest | Lowest - blocks old firmware entirely | |
| 283 | + |
| 284 | +Choose the level appropriate for your threat model. |
| 285 | + |
| 286 | +### Debug Mode |
| 287 | + |
| 288 | +The attestation policy enforces `debug == false` for both TDX and SNP. Debug mode allows: |
| 289 | +- Memory inspection via hypervisor |
| 290 | +- Single-stepping the guest |
| 291 | +- Extracting secrets from guest memory |
| 292 | + |
| 293 | +**Production workloads must run with debug disabled.** If attestation fails due to debug mode, do not disable the check - fix the KataConfig to disable debug. |
| 294 | + |
| 295 | +## References |
| 296 | + |
| 297 | +- [Veritas Documentation](https://github.com/confidential-containers/veritas) |
| 298 | +- [Intel TDX Attestation Spec](https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html) |
| 299 | +- [AMD SEV-SNP Attestation Spec](https://www.amd.com/system/files/TechDocs/56860.pdf) |
| 300 | +- [CoCo Attestation Architecture](https://github.com/confidential-containers/attestation-service) |