Commit b3b3e8c

initrd: fix TPM1 counter auth regression and defend lock cascade failure

PR #2068 (tpm_reseal_ux-integrity_report-detect_disk_and_tpm_swap, merged at d3d8053) changed increment_tpm_counter from hardcoded -pwdc '' (empty counter auth) to -pwdc "${tpm_passphrase:-}" (owner passphrase from cache/prompt), but left check_tpm_counter using empty -pwdc when called from kexec-sign-config.sh without a $3 passphrase argument. As a result, every counter increment computed SHA1(owner_pass) while the counter had been created with SHA1(""), which produced a persistent TPM_AUTHFAIL.

Per TCG TPM Main Spec Part 3, TPM_CreateCounter uses owner auth (-pwdo) but TPM_IncrementCounter uses the counter's own authData, not the owner password. The correct design for Heads' rollback counter is empty auth: rollback security comes from the signed /boot/kexec_rollback.txt and TPM sealing, not counter access control.

The repeated auth failures (3 per boot x ~5 boots via the _tpm_auth_retry loop) triggered TPM 1.2 dictionary-attack lockout (TPM_DEFEND_LOCK_RUNNING), which persisted through forceclear on some implementations, causing tpm takeown to fail and TPM reset to abort. The whole cascade traces back to the counter auth mismatch.

Changes:

- initrd/bin/tpmr.sh (_tpm_auth_retry, tpm2_counter_inc, tpm2_seal, tpm1_seal): add 'defend' to the auth detection grep patterns, and add '0x98e|0x149' to _tpm_auth_retry and tpm2_seal (tpm2_counter_inc already matched them), so defend lock and TPM2 RC codes are treated as retryable auth failures rather than fatal errors
- initrd/bin/tpmr.sh (tpm1_reset): detect "defend lock" after takeown failure and cycle physical presence to clear the lock state before retrying; a full AC power cycle remains the fallback if software presence is insufficient
- initrd/bin/tpmr.sh (tpm1_counter_increment): when the caller passes -pwdc '' (empty counter auth), call tpm counter_increment directly without _tpm_auth_retry. The previous code stripped -pwdc and always routed through _tpm_auth_retry, which appended the cached owner passphrase and defeated the empty-auth attempt. Use || return to prevent set -e from killing the script on expected auth failure.
- initrd/etc/functions.sh (check_tpm_counter): pass -pwdc '' (empty counter auth) instead of -pwdc "${tpm_passphrase:-}" so the counter is created with SHA1("") per TCG spec
- initrd/etc/functions.sh (increment_tpm_counter): try -pwdc '' first for TPM1 (correct behavior). If that fails on a readable counter (created by PR #2068 era code), prompt for owner passphrase and retry as migration fallback. Clear user-facing messages explain the one-time migration and that a TPM reset can switch to empty auth.
- initrd/etc/functions.sh (increment_tpm_counter): remove the TPM1-specific owner-passphrase prompt block added by PR #2068; it is no longer needed because new counters use empty auth
- initrd/etc/functions.sh (increment_tpm_counter): change the DIE-path fallback counter_create from -pwdc "${tpm_passphrase:-}" to -pwdc '' for consistency (it would have created a new counter with the old buggy auth on the off chance it was reached)
- doc/tpm.md: document TPM1 boot chain, tpmtotp tool selection, auth retry patterns, defend lock recovery, and physical presence

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
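
The auth mismatch described in the commit message can be illustrated with a minimal shell sketch. It assumes, as the message states, that the counter authData is the SHA-1 digest of the passphrase. `auth_digest` and the sample passphrase are hypothetical, and no TPM is touched.

```shell
#!/bin/sh
# Sketch only: models counter authData as SHA1(passphrase), per the commit
# message. auth_digest and the sample passphrase are hypothetical.
auth_digest() {
    printf '%s' "$1" | sha1sum | cut -d' ' -f1
}

created_with="$(auth_digest '')"                 # counter created with -pwdc ''
presented="$(auth_digest 'example-owner-pass')"  # increment sent the owner passphrase

if [ "$created_with" = "$presented" ]; then
    echo "auth matches"
else
    echo "auth mismatch: the TPM would report an auth failure"
fi
```

Because the two digests can only agree when the owner passphrase is itself empty, every increment under the PR #2068 behaviour counted as a failed authorization attempt.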
1 parent 7d3a28a commit b3b3e8c

3 files changed

Lines changed: 182 additions & 30 deletions

doc/tpm.md

Lines changed: 94 additions & 2 deletions
@@ -10,8 +10,35 @@ See also: [architecture.md](architecture.md), [boot-process.md](boot-process.md)
 ## tpmr — unified TPM abstraction
 
 `initrd/bin/tpmr.sh` is a shell script wrapper that presents a single interface
-over both TPM 1.2 (`tpm` / `trousers`) and TPM 2.0 (`tpm2-tools`). All Heads
-scripts call `tpmr.sh` rather than invoking `tpm` or `tpm2` directly.
+over both TPM 1.2 and TPM 2.0. All Heads scripts call `tpmr.sh` rather than
+invoking TPM tools directly.
+
+### Boot chain and TPM tool selection
+
+```text
+initrd/init (PID 1)
+  └─ CONFIG_BOOTSCRIPT → /bin/gui-init.sh [board config]
+       ├─ source /etc/functions.sh [shared TPM helpers]
+       ├─ source /etc/gui_functions.sh [whiptail wrappers]
+       └─ calls initrd/bin/tpmr.sh [TPM abstraction]
+            ├─ TPM1: calls `tpm` (tpmtotp util/tpm) [CONFIG_TPM2_TOOLS != y]
+            │    modules/tpmtotp → output: totp hotp qrenc util/tpm
+
+            └─ TPM2: calls tpm2_* (tpm2-tools) [CONFIG_TPM2_TOOLS=y]
+                 modules/tpm2-tss + modules/tpm2-tools
+```
+
+TPM1 support comes exclusively from the `tpmtotp` module (`modules/tpmtotp`),
+which builds `util/tpm` as part of its outputs. This binary is installed to
+the initrd as `tpm` and supports subcommands such as `physicalpresence`,
+`forceclear`, `takeown -pwdo`, `counter_create`, `counter_increment`, etc.
+
+TPM2 support comes from `modules/tpm2-tss` (TSS software stack) and
+`modules/tpm2-tools` (command-line tools like `tpm2_nvdefine`,
+`tpm2_getcap`, `tpm2_nvincrement`).
+
+Both TPM1 and TPM2 boards may also enable `CONFIG_TPMTOTP=y` for the
+`totp` and `hotp` utilities, which are independent of the TPM version.
 
 ### PCR sizes
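
The tool selection shown in the boot-chain diagram of this hunk can be sketched as a small shell helper. `tpm_backend` is a hypothetical name for illustration and is not part of `tpmr.sh`; only the `CONFIG_TPM2_TOOLS` variable and its `y` value come from the diff.

```shell
#!/bin/sh
# Sketch only: tpm_backend is a hypothetical helper, not part of tpmr.sh.
# It mirrors the diagram: CONFIG_TPM2_TOOLS=y selects tpm2-tools, anything
# else selects the `tpm` binary built by the tpmtotp module.
tpm_backend() {
    if [ "${CONFIG_TPM2_TOOLS:-n}" = "y" ]; then
        echo "tpm2"
    else
        echo "tpm1"
    fi
}

CONFIG_TPM2_TOOLS=y
echo "CONFIG_TPM2_TOOLS=y selects: $(tpm_backend)"
CONFIG_TPM2_TOOLS=n
echo "CONFIG_TPM2_TOOLS=n selects: $(tpm_backend)"
```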

@@ -398,3 +425,68 @@ To verify that a new board's coreboot config matches the expected RoT:
 | Auth sessions | Not used | Required for policy-based unseal |
 | `kexec_finalize` | No-op | Extends PCRs, then `tpm2 shutdown` |
 | `startsession` | No-op | Creates encryption session |
+
+### TPM1 auth retry and error detection
+
+`_tpm_auth_retry()` in `initrd/bin/tpmr.sh` provides shared retry logic for
+both TPM1 and TPM2 operations that need authorization. On auth failure
+(wrong passphrase), the passphrase cache is shredded and the user is
+re-prompted up to 3 times before giving up.
+
+Auth failure is detected by grepping the command output for known error
+patterns. TPM1 (tpmtotp) errors go to stdout via `printf()` with
+`TPM_GetErrMsg()` strings. TPM2 (tpm2-tools) errors go to stderr via
+`LOG_ERR()` and may include raw TPM response codes.
+
+| Pattern | Type | TPM version | Example error |
+| --- | --- | --- | --- |
+| `authorization|auth|bad|permission` | English words | TPM1+TPM2 | `TPM_AUTHFAIL`, `bad passphrase` |
+| `defend` | English word | TPM1 | `Defend lock running` |
+| `0x98e|0x149` | Hex codes | TPM2 | `TPM2_RC_AUTH_FAIL`, `TPM2_RC_NV_AUTHORIZATION` |
+
+### TPM1 reset defend lock
+
+`TPM_DEFEND_LOCK_RUNNING` (`tpm_error.h`: `TPM_BASE + TPM_NON_FATAL + 3`)
+is a standard TPM 1.2 error raised when the TPM's dictionary-attack
+protection is active. After too many failed authorization attempts, the
+TPM enters a time-out period and refuses all authorization operations —
+including `tpm takeown` even after a successful `tpm forceclear`
+(forceclear clears the owner but not the dictionary attack counter on
+some implementations).
+
+tpmtotp's `tpm takeown` outputs:
+```
+Error Defend lock running from TPM_TakeOwnership
+```
+
+`tpm1_reset()` in `initrd/bin/tpmr.sh` detects "defend lock" in the
+`takeown` output and cycles physical presence (`physicaldisable` /
+`physicalenable` / `physicalpresence` / `physicalsetdeactivated`) to
+reset the TPM state machine and clear the lock on chips that honour
+software presence. `TPM_ResetLockValue` (in tpmtotp's `util/resetlockvalue.c`)
+exists but requires owner auth — after forceclear there is no owner,
+so it cannot be used.
+
+If the cycling also fails, only a full AC power cycle (not just reboot)
+will clear the defend lock. The timeout duration is chip-specific and
+not documented in the tpmtotp source.
+
+### TPM1 physical presence
+
+TPM1.2 forceclear requires physical presence to be asserted. The
+`tpm1_reset()` function does this with `tpm physicalpresence -s` (software
+presence). On some platforms (e.g., Dell OptiPlex, some Infineon TPMs),
+software physical presence may not work — the TPM firmware only accepts
+hardware-asserted presence (GPIO set by BIOS). In that case, `forceclear`
+returns success but may not fully reset the TPM, or `takeown` may fail
+with unexpected errors.
+
+When software physical presence fails, the LOG shows:
+```
+tpm1_reset: unable to set physical presence
+```
+
+This is logged but not fatal — `tpm forceclear` is still attempted.
+If the TPM firmware ignores software physical presence, the reset fails
+and the user must use the platform's hardware TPM reset mechanism
+(typically a BIOS option or jumper).
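
The pattern table added to doc/tpm.md can be exercised with a short shell sketch. `is_auth_failure` is a hypothetical helper, and the second sample message is made up for illustration; the first sample is the `takeown` output quoted in the doc, and the third follows the "out of resources" case named in the `_tpm_auth_retry` comments.

```shell
#!/bin/sh
# Sketch only: is_auth_failure is a hypothetical helper that applies the
# combined pattern from this commit to one captured error message.
AUTH_PATTERN='authorization|auth|bad|permission|defend|0x98e|0x149'

is_auth_failure() {
    printf '%s\n' "$1" | grep -qiE "$AUTH_PATTERN"
}

classify() {
    if is_auth_failure "$1"; then
        echo "retryable: $1"
    else
        echo "fatal:     $1"
    fi
}

classify "Error Defend lock running from TPM_TakeOwnership"
classify "sample tpm2 failure, rc 0x98e"   # made-up message for illustration
classify "Error out of resources"
```

Two properties of the pattern are worth noting: `auth` already matches every string that `authorization` matches, and `bad` is broad enough that any message containing it is treated as retryable.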

initrd/bin/tpmr.sh

Lines changed: 63 additions & 12 deletions
@@ -354,7 +354,7 @@ tpm2_counter_inc() {
 rm -f "$tmp_err_file"
 shred -n 10 -z -u /tmp/secret/tpm_owner_passphrase 2>/dev/null || true
 DEBUG "tpm2_counter_inc attempt $attempt failed. Stderr: $tmp_err_content"
-if ! echo "$tmp_err_content" | grep -qiE 'authorization|auth|bad|permission|0x98e|0x149'; then
+if ! echo "$tmp_err_content" | grep -qiE 'authorization|auth|bad|permission|defend|0x98e|0x149'; then
     DIE "Can't increment TPM counter for $index, access denied."
 fi
 WARN "Authentication failed, retrying..."
@@ -370,16 +370,26 @@ tpm2_counter_inc() {
 # Caching: prompt_tpm_owner_password reuses cached passphrase if available.
 # On auth failure the cache is shredded; next prompt will ask the user.
 #
+# Error stream selection:
+# TPM1 (tpmtotp): errors go to stdout via printf() — capture stdout+stderr
+# TPM2 (tpm2-tools): errors go to stderr via LOG_ERR() — capture stderr only
+#
+# Auth detection grep patterns:
+# English words — TPM1 (TPM_GetErrMsg returns "Authentication failed...")
+# — TPM2 (tpm2-tools LOG_ERR returns "TPM2_RC_AUTH_FAIL...")
+# defend — TPM1 "Defend lock running" (TPM_DEFEND_LOCK_RUNNING)
+# 0x98e, 0x149 — TPM2 raw hex codes (TPM2_RC_AUTH_FAIL, TPM2_RC_NV_AUTHORIZATION)
+#
 # Usage: _tpm_auth_retry <label> <error_stream> <tpm_type> <pw_flag> <cmd...>
 # <label>: short name for debug (e.g. "counter_create")
-# <error_stream>: "stdout" (TPM1) or "stderr" (TPM2)
+# <error_stream>: "stdout" (TPM1: tpmtotp printf) or "stderr" (TPM2: tpm2-tools LOG_ERR)
 # <tpm_type>: "tpm1" or "tpm2"
 # <pw_flag>: passphrase flag for TPM1 (-pwdo or -pwdc), ignored for TPM2
 # <cmd...>: the tpm command and its non-auth arguments
 #
 # Exit codes:
 # 0: success
-# 1: non-auth error (e.g., "out of resources" 0x15) — caller should check
+# 1: non-auth error (e.g., TPM1 "out of resources" 0x15) — caller should check
 _tpm_auth_retry() {
     local label="$1" error_stream="$2" tpm_type="$3" pw_flag="$4"
     shift 4
@@ -417,7 +427,7 @@ _tpm_auth_retry() {
 DEBUG "_tpm_auth_retry $label attempt $attempt failed: $out_content"
 rm -f "$tmp_file"
 shred -n 10 -z -u /tmp/secret/tpm_owner_passphrase 2>/dev/null || true
-if echo "$out_content" | grep -qiE 'authorization|auth|bad|permission'; then
+if echo "$out_content" | grep -qiE 'authorization|auth|bad|permission|defend|0x98e|0x149'; then
     WARN "$label failed (bad passphrase?). Retrying..."
 else
     # Non-auth error (e.g., out of resources 0x15)
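
The retry behaviour touched by the two `_tpm_auth_retry` hunks above can be approximated with a bounded loop. This is a minimal sketch under stated assumptions: `retry_on_auth_failure` and `fake_tpm_cmd` are hypothetical, the passphrase prompt and cache shredding are omitted, and exit code 2 for "still failing after 3 attempts" is an assumption of this sketch, since the diff documents only codes 0 and 1.

```shell
#!/bin/sh
# Sketch only: retry_on_auth_failure and fake_tpm_cmd are hypothetical
# stand-ins; the real helper also prompts for and caches a passphrase.
AUTH_PATTERN='authorization|auth|bad|permission|defend|0x98e|0x149'

# Stub command: reports an auth-style error until attempt number $succeed_on.
fake_tpm_cmd() {
    if [ "$attempt" -lt "$succeed_on" ]; then
        echo "Error Authentication failed (stub)"
        return 1
    fi
    echo "stub command succeeded"
}

retry_on_auth_failure() {
    attempt=1
    while [ "$attempt" -le 3 ]; do
        if out="$("$@" 2>&1)"; then
            return 0
        fi
        if ! printf '%s\n' "$out" | grep -qiE "$AUTH_PATTERN"; then
            return 1    # non-auth error: the caller decides what to do
        fi
        attempt=$((attempt + 1))
    done
    return 2            # assumption: auth errors persisted for 3 attempts
}

succeed_on=3
retry_on_auth_failure fake_tpm_cmd && echo "succeeded on attempt $attempt"
```

Each failed pass through such a loop is one more failed authorization as far as the TPM is concerned, which is how the repeated retries described in the commit message contributed to the defend lock.
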
@@ -443,17 +453,33 @@ tpm1_counter_read() {
 
 # tpm1_counter_increment - Increment a TPM1 rollback counter.
 # Args: -ix <index> [ -pwdc <passphrase> ]
+#
+# Auth behaviour:
+# -pwdc '' : empty counter auth (SHA1 of ""), correct per TCG spec.
+# Calls tpm directly without retry — caller handles fallback.
+# -pwdc <pass> : owner passphrase auth via _tpm_auth_retry (migration).
+# (no -pwdc) : owner passphrase auth via _tpm_auth_retry (backward compat).
 tpm1_counter_increment() {
     TRACE_FUNC
     local counter_id=""
+    local pwdc_provided="n"
+    local pwdc_value=""
     while [ $# -gt 0 ]; do
         case "$1" in
         -ix) counter_id="$2"; shift 2 ;;
-        -pwdc) shift 2 ;; # passphrase handled by _tpm_auth_retry
+        -pwdc) pwdc_provided="y"; pwdc_value="$2"; shift 2 ;;
         *) shift ;;
         esac
     done
-    _tpm_auth_retry "counter_increment" "stdout" "tpm1" "-pwdc" tpm counter_increment -ix "$counter_id"
+    if [ "$pwdc_provided" = "y" ] && [ -z "$pwdc_value" ]; then
+        # Empty counter auth per TCG spec.
+        # Call tpm directly: no retry needed, caller handles fallback.
+        # Use || return so set -e doesn't kill the script on auth failure.
+        tpm counter_increment -ix "$counter_id" -pwdc '' || return $?
+    else
+        _tpm_auth_retry "counter_increment" "stdout" "tpm1" "-pwdc" \
+            tpm counter_increment -ix "$counter_id"
+    fi
 }
 
 tpm2_counter_create() {
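
The three `-pwdc` cases listed in the new `tpm1_counter_increment` comment can be checked in isolation. In this sketch `pwdc_mode` is a hypothetical helper that reuses the argument parsing from the hunk above and only reports which path would be taken; it never calls `tpm`.

```shell
#!/bin/sh
# Sketch only: pwdc_mode is a hypothetical helper mirroring the argument
# handling in tpm1_counter_increment; it prints the path that would be taken.
pwdc_mode() {
    pwdc_provided="n"
    pwdc_value=""
    while [ $# -gt 0 ]; do
        case "$1" in
        -ix) shift 2 ;;
        -pwdc) pwdc_provided="y"; pwdc_value="$2"; shift 2 ;;
        *) shift ;;
        esac
    done
    if [ "$pwdc_provided" = "y" ] && [ -z "$pwdc_value" ]; then
        echo "direct-empty-auth"
    else
        echo "auth-retry"
    fi
}

pwdc_mode -ix 0 -pwdc ''        # explicit empty counter auth, no retry
pwdc_mode -ix 0 -pwdc secret    # non-empty value: routed through the retry helper
pwdc_mode -ix 0                 # no -pwdc at all: also the retry helper
```

Tracking `pwdc_provided` separately from `pwdc_value` is what lets the function tell an explicit empty passphrase apart from a missing flag.
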
@@ -641,7 +667,7 @@ tpm2_seal() {
 rm -f "$tmp_err_file"
 DEBUG "Failed attempt $attempt to write sealed secret to NVRAM from tpm2_seal. Stderr: $tmp_err_content"
 shred -n 10 -z -u /tmp/secret/tpm_owner_passphrase 2>/dev/null || true
-if echo "$tmp_err_content" | grep -qiE 'authorization|auth|bad|permission'; then
+if echo "$tmp_err_content" | grep -qiE 'authorization|auth|bad|permission|defend|0x98e|0x149'; then
     if [ "$attempt" -ge 3 ]; then
         DIE "Unable to write sealed secret to TPM NVRAM after 3 attempts. Reset the TPM and try again."
     fi
@@ -759,7 +785,7 @@ tpm1_seal() {
 rm -f "$tmp_def_out"
 DEBUG "tpm1_seal nv_definespace failed (attempt $attempt): $def_out_content"
 # If auth failure, retry after re-prompt; otherwise bail out.
-if echo "$def_out_content" | grep -qiE 'authorization|auth|bad|permission'; then
+if echo "$def_out_content" | grep -qiE 'authorization|auth|bad|permission|defend'; then
     shred -n 10 -z -u /tmp/secret/tpm_owner_passphrase 2>/dev/null || true
     WARN "nv_definespace failed (bad passphrase?). Retrying..."
     continue
@@ -788,7 +814,7 @@ tpm1_seal() {
 fi
 DEBUG "tpm1_seal nv_writevalue(post-define) output: $tmp_out_content"
 shred -n 10 -z -u /tmp/secret/tpm_owner_passphrase 2>/dev/null || true
-if echo "$tmp_out_content" | grep -qiE 'authorization|auth|bad|permission'; then
+if echo "$tmp_out_content" | grep -qiE 'authorization|auth|bad|permission|defend'; then
     if [ "$attempt" -ge 3 ]; then
         DIE "Unable to write sealed secret to TPM NVRAM after 3 attempts"
     fi
@@ -1075,9 +1101,34 @@ tpm1_reset() {
 DO_WITH_DEBUG tpm physicalenable >/dev/null 2>&1 || LOG "tpm1_reset: unable to physicalenable after clear"
 
 # 3. Take ownership with the new TPM owner passphrase.
-if ! DO_WITH_DEBUG --mask-position 3 tpm takeown -pwdo "$tpm_owner_passphrase" >/dev/null 2>&1; then
-    LOG "tpm1_reset: tpm takeown failed after forceclear"
-    return 1
+# TPM_DEFEND_LOCK_RUNNING is a standard TPM 1.2 error raised after
+# too many failed authorization attempts (see tpm_error.h). The TPM
+# enters a time-out period and refuses all authorization operations —
+# including takeown, even after a successful forceclear (forceclear
+# clears the owner but not the dictionary attack counter on some
+# implementations).
+# TPM_ResetLockValue requires owner auth, which does not exist after
+# forceclear, so we cannot call it. Cycle physical presence
+# (physicaldisable + physicalenable) to reset the TPM state machine
+# on chips that honour software presence. If the lock persists,
+# only a full AC power cycle (not just reboot) will clear it.
+local takeown_rc takeown_out
+takeown_out="$(DO_WITH_DEBUG --mask-position 3 tpm takeown -pwdo "$tpm_owner_passphrase" 2>&1)" && takeown_rc=0 || takeown_rc=$?
+if [ $takeown_rc -ne 0 ]; then
+    if echo "$takeown_out" | grep -qi "defend lock"; then
+        LOG "tpm1_reset: defend lock detected after forceclear — cycling physical presence to clear"
+        DO_WITH_DEBUG tpm physicaldisable >/dev/null 2>&1 || true
+        DO_WITH_DEBUG tpm physicalenable >/dev/null 2>&1 || true
+        DO_WITH_DEBUG tpm physicalpresence -s >/dev/null 2>&1 || true
+        DO_WITH_DEBUG tpm physicalsetdeactivated -c >/dev/null 2>&1 || true
+        if ! DO_WITH_DEBUG --mask-position 3 tpm takeown -pwdo "$tpm_owner_passphrase" >/dev/null 2>&1; then
+            LOG "tpm1_reset: tpm takeown still failed after defend lock recovery"
+            return 1
+        fi
+    else
+        LOG "tpm1_reset: tpm takeown failed after forceclear"
+        return 1
+    fi
 fi
 
 # 4. Leave TPM enabled, present, and not deactivated.
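
The `tpm1_reset` hunk above captures both the output and the exit status of `takeown` in one statement. The idiom can be tried without a TPM; in this sketch `fake_takeown` is a hypothetical stub that prints the error line quoted in doc/tpm.md and fails.

```shell
#!/bin/sh
set -e
# Sketch only: fake_takeown is a hypothetical stub that fails the way the
# doc says tpmtotp's `tpm takeown` does while the defend lock is active.
fake_takeown() {
    echo "Error Defend lock running from TPM_TakeOwnership"
    return 1
}

# Capture output and exit status together. The failing command sits inside
# an AND-OR list, so `set -e` does not abort the script here.
takeown_out="$(fake_takeown 2>&1)" && takeown_rc=0 || takeown_rc=$?

if [ "$takeown_rc" -ne 0 ] && echo "$takeown_out" | grep -qi "defend lock"; then
    echo "defend lock detected (rc=$takeown_rc)"
fi
```

A plain `takeown_out="$(fake_takeown)"` on its own line would terminate a `set -e` script before the status could be inspected, which is why the `&& ... || ...` form is used.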

initrd/etc/functions.sh

Lines changed: 25 additions & 16 deletions
@@ -1872,7 +1872,7 @@ check_tpm_counter() {
 (
     set +e
     tpmr.sh counter_create \
-        -pwdc "${tpm_passphrase:-}" \
+        -pwdc '' \
         -la "$LABEL" \
         >/tmp/counter 2> >(tee >(SINK_LOG "tpm counter_create stderr") >&2)
     echo $? > /tmp/counter_create_rc
@@ -2051,23 +2051,13 @@ increment_tpm_counter() {
 fi
 
 # Prefer explicit passphrase, otherwise reuse cached TPM owner passphrase.
+# TPM2 uses owner-auth fallback in tpm2_counter_inc; TPM1 uses empty counter
+# auth (SHA1("")) per TCG spec — no owner passphrase needed for increment.
 if [ -z "$tpm_passphrase" ] && [ -s /tmp/secret/tpm_owner_passphrase ]; then
     tpm_passphrase="$(cat /tmp/secret/tpm_owner_passphrase)"
     DEBUG "increment_tpm_counter: using cached TPM owner passphrase"
 fi
 
-# TPM1 counter_increment requires owner auth in practice on this path.
-# origin/master typically reached this with cached owner passphrase already set,
-# but the newer reseal/update flows can call this later in the session after
-# that cache is absent. Prompt once and cache to avoid empty -pwdc failures.
-if [ "$CONFIG_TPM2_TOOLS" != "y" ] && [ -z "$tpm_passphrase" ]; then
-    WARN "TPM Owner Passphrase is required to update rollback counter before signing updated boot hashes."
-    DEBUG "increment_tpm_counter: TPM1 path has no cached/provided owner passphrase; prompting now"
-    prompt_tpm_owner_password
-    tpm_passphrase="$tpm_owner_passphrase"
-    DEBUG "increment_tpm_counter: TPM1 owner passphrase obtained and cached"
-fi
-
 # Try to increment the counter. We normally hide the verbose
 # output of tpmr.sh commands to avoid overwhelming the console, but we
 # must *not* swallow any interactive prompts. The previous implementation
@@ -2094,7 +2084,11 @@ increment_tpm_counter() {
         increment_ok="y"
     fi
 else
-    # TPM1 path uses owner auth in practice.
+    # TPM1 counter uses empty auth (SHA1 of "") per TCG spec.
+    # The counter's auth is separate from the owner passphrase.
+    # If empty auth fails on a readable counter, the counter was
+    # created by pre-fix code with owner-passphrase auth — prompt
+    # for owner passphrase and retry as migration fallback.
     # NOTE: tpmtotp C code prints ALL output (success + errors) to stdout.
     # We must capture stdout to detect failures properly.
     # DO_WITH_DEBUG internally captures the command's stderr (tee /dev/stderr
@@ -2104,10 +2098,25 @@ increment_tpm_counter() {
     if (
         set -o pipefail
         DO_WITH_DEBUG --mask-position 5 \
-            tpmr.sh counter_increment -ix "$counter_id" -pwdc "${tpm_passphrase:-}" \
+            tpmr.sh counter_increment -ix "$counter_id" -pwdc '' \
             2>/dev/null | tee /tmp/counter-"$counter_id" >/dev/null
     ); then
         increment_ok="y"
+    elif [ "$counter_present" = "y" ]; then
+        if [ -z "$tpm_passphrase" ]; then
+            WARN "TPM Owner Passphrase required to increment counter created by previous Heads version"
+            prompt_tpm_owner_password
+            tpm_passphrase="$tpm_owner_passphrase"
+        fi
+        if (
+            set -o pipefail
+            DO_WITH_DEBUG --mask-position 5 \
+                tpmr.sh counter_increment -ix "$counter_id" -pwdc "${tpm_passphrase}" \
+                2>/dev/null | tee /tmp/counter-"$counter_id" >/dev/null
+        ); then
+            increment_ok="y"
+            WARN "TPM counter created by older firmware (uses owner passphrase). This is a one-time migration; operation continues with owner-passphrase auth. Reset TPM in menu (Options -> TPM/TOTP/HOTP Options -> Reset the TPM) to create a new empty-auth counter (recommended), or leave as-is."
+        fi
     fi
 fi
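
The migration fallback added in this hunk (empty auth first, owner passphrase second) can be simulated without a TPM. In this sketch `fake_increment` and `increment_with_fallback` are hypothetical, the simulated counter simply compares strings, and the `counter_present` check and the passphrase prompt from the real code are left out.

```shell
#!/bin/sh
# Sketch only: fake_increment stands in for `tpmr.sh counter_increment`;
# it accepts only the auth value the simulated counter was created with.
counter_auth="legacy-owner-pass"   # hypothetical counter from the PR #2068 era

fake_increment() {
    [ "$1" = "$counter_auth" ]
}

increment_with_fallback() {
    owner_pass="$1"
    if fake_increment ''; then
        echo "empty-auth"
    elif fake_increment "$owner_pass"; then
        echo "owner-auth (migration fallback)"
    else
        echo "failed"
        return 1
    fi
}

increment_with_fallback "legacy-owner-pass"
```

Trying empty auth first means counters created after this commit never need the owner passphrase, while counters from the affected period cost one failed attempt before the fallback succeeds.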

@@ -2126,7 +2135,7 @@ increment_tpm_counter() {
 if (
     set -o pipefail
     DO_WITH_DEBUG --mask-position 3 \
-        tpmr.sh counter_create -pwdc "${tpm_passphrase:-}" -la 3135106223 \
+        tpmr.sh counter_create -pwdc '' -la 3135106223 \
         2> >(tee >(SINK_LOG "tpm counter_create stderr") >&2) |
         tee /tmp/new-counter >/dev/null
 ); then
