|
| 1 | +# STM32 Bare-Metal (`WOLFSSL_STM32_BARE`) Board Status |
| 2 | + |
| 3 | +Generated 2026-05-04. Tracks the boards exercised by the `STM32_Bare_Test` |
| 4 | +multi-board example in `wolfssl-examples-stm32` against the corresponding |
| 5 | +direct-register support in `wolfssl/wolfcrypt/src/port/st/stm32.c`. |
| 6 | + |
| 7 | +Columns: |
| 8 | + |
| 9 | +- **HASH HW** — chip has a HASH peripheral (MD5/SHA-1/SHA-2/...). "yes" = |
| 10 | + the BARE driver routes `wc_Sha*` to the HASH IP. "-" = no HASH silicon; |
| 11 | + SHA falls back to software in all configs. |
| 12 | +- **AES HW** — chip has an AES/CRYP peripheral. "CRYP" = FIFO-based AES |
| 13 | + on F4/F7/H7/MP13 (`wc_Stm32_Aes_*` -> CRYP HW). "TinyAES" = single-reg |
| 14 | + AES on L4/L5/U3/U5/H5/G4/WB/WL/G0/WBA. "-" = no AES; software path. |
| 15 | +- **PKA HW** — chip has a public-key accelerator. "yes" + Tested = the |
| 16 | + bare-metal PKA driver (added 2026-05-04) is wired up and validated end |
| 17 | + to end (V1/V2 names the register layout). "yes" + Compile = silicon present
| 18 | + and the driver builds, but no runtime validation this session. "-" = no PKA silicon.
| 19 | +- **Status** — Validated = `make BOARD=<x> CONFIG=bare TARGET=test` runs |
| 20 | + the full `wolfcrypt_test` and exits with `Result: 0 (PASS)` on real |
| 21 | + hardware in this session. |
| 22 | + |
| 23 | +## Validated boards |
| 24 | + |
| 25 | +| BOARD | Chip | Cortex / Clock | HASH HW | AES HW | PKA HW | Status | |
| 26 | +|--------|---------------|--------------------|---------|----------|----------------|------------| |
| 27 | +| `h7` | STM32H753ZI | M7F / 480 MHz PLL | yes | CRYP | - | Validated | |
| 28 | +| `f439` | STM32F439ZI | M4F / 144 MHz PLL | yes | CRYP | - | Validated | |
| 29 | +| `wb55` | STM32WB55RG | M4F / 64 MHz PLL | - | TinyAES | yes (Tested V1)| Validated | |
| 30 | +| `u3` | STM32U385RG | M33 / 96 MHz | yes | TinyAES | yes (Tested V2)| Validated | |
| 31 | +| `u5`   | STM32U575ZI   | M33 / 160 MHz      | yes     | -        | -\*\*          | Validated  |
| 32 | +| `h5` | STM32H563ZI | M33 / 250 MHz | yes | - | yes (Compile) | Build OK\* | |
| 33 | +| `g491` | STM32G491RE | M4F / 170 MHz PLL | - | - | - | Validated | |
| 34 | + |
| 35 | +\* H5 PKA driver is enabled for `BUILD_BARE` and **builds cleanly**. |
| 36 | +**Runtime validation is blocked by a flash ECC fault.** See the |
| 37 | +"H5 reproduction steps" section below for the full repro recipe. |
| 38 | + |
| 39 | +\*\* U5 is the STM32U575 NUCLEO -- that silicon does **not** have PKA |
| 40 | +(only U585+ does). The HASH + RNG bare-metal paths are validated. |
| 41 | +For PKA validation on U5 we'd need a NUCLEO-U585AI-Q. |
| 42 | + |
| 43 | +## Bench HW used |
| 44 | + |
| 45 | +These results are from `make BOARD=<x> CONFIG=bare TARGET=bench`. Numbers |
| 46 | +are from the wolfcrypt `benchmark.c` block-1024 default. Bold marks the
| 47 | +best result in each column.
| 48 | + |
| 49 | +| Board | Clock | AES-128-CBC enc (BARE) | SHA-256 (BARE) | ECDHE secp256r1 (BARE) | |
| 50 | +|--------|---------|------------------------|----------------|------------------------| |
| 51 | +| h7 | 480 MHz | **19.165 MiB/s** | **25.928 MiB/s**| (no PKA HW; SP-SW) | |
| 52 | +| f439 | 144 MHz | 11.401 MiB/s | 25.757 MiB/s | (no PKA HW; SP-SW) | |
| 53 | +| g491 | 170 MHz | 1.017 MiB/s (sw) | 3.037 MiB/s | 11.8 ops/s (sw) | |
| 54 | +| wb55 | 64 MHz | 7.237 MiB/s | 1.243 MiB/s sw | 4.83 ops/s (PKA HW)\** | |
| 55 | +| u3 | 96 MHz | (TinyAES BARE -- prior)| HASH HW (prior)| 1.115 ops/s (PKA HW)\**| |
| 56 | + |
| 57 | +\** WB55 and U3 PKA HW perform similarly to (or slightly slower than) the
| 58 | +SP-ECC software path at P-256 on these clocks. Both ST docs and direct
| 59 | +measurement (U3 SW = 1.106 vs PKA = 1.115 ops/s) confirm that PKA HW gives
| 60 | +no meaningful speedup at P-256 on these specific chips; here it serves as
| 61 | +a correctness check. Larger curves (P-384/521), where SP-ECC scales worse,
| 62 | +and faster-clocked PKA (H5 at 250 MHz, eventual U585) should let PKA pull
| 63 | +meaningfully ahead. The driver covers V1 (WB) and V2 (U3 / H5 / U5 / WBA /
| 64 | +G4A1) register layouts; the V2 path is exercised end-to-end on U3.
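
To put the U3 figures quoted in the footnote above in proportion, the relative difference can be worked out directly. This is only an arithmetic sketch over the two numbers already given; the variable names are illustrative, and `awk` is used because POSIX shell arithmetic is integer-only.

```sh
#!/bin/sh
# U3 ECDHE secp256r1 rates quoted in the footnote (ops/s).
sw=1.106
pka=1.115

# Relative change of PKA versus software, in percent.
delta=$(awk -v sw="$sw" -v pka="$pka" \
    'BEGIN { printf "%.1f", (pka - sw) / sw * 100 }')

echo "PKA vs SW on U3: ${delta}% (positive means PKA is faster)"
```

On these inputs the difference comes out below one percent, which is consistent with treating the two paths as equivalent at P-256.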
| 65 | + |
| 66 | +## TODO -- not yet wired up |
| 67 | + |
| 68 | +| BOARD candidate | Chip | Cortex / Clock max | What lights up | Notes | |
| 69 | +|-----------------|---------------|--------------------|------------------------|----------------------------------------------| |
| 70 | +| `f437` | STM32F437IIHx | M4F / 168 MHz | CRYP + HASH + RNG | STM32439I-EVAL. Parity check vs F439 | |
| 71 | +| `f767` / `f779` | STM32F767ZI | M7F / 216 MHz | CRYP + HASH + RNG | NUCLEO-F767ZI | |
| 72 | +| `mp135` | STM32MP135F | A7 / 650 MHz | CRYP + HASH + RNG + PKA| STM32MP135F-DK. Linux/bare-metal split | |
| 73 | +| `l4r5` | STM32L4R5ZI | M4F / 120 MHz | TinyAES + HASH + RNG | NUCLEO-L4R5ZI | |
| 74 | +| `l552` | STM32L552ZE | M33 / 110 MHz | TinyAES + HASH + RNG + SAES | NUCLEO-L552ZE-Q | |
| 75 | +| `h573` / `h533` | STM32H573ZI | M33 / 250 MHz | TinyAES + HASH + RNG + SAES | NUCLEO-H573ZI -- H5 with AES added | |
| 76 | +| `u585` | STM32U585AI | M33 / 160 MHz | TinyAES + HASH + RNG + SAES + PKA | NUCLEO-U585AI-Q | |
| 77 | +| `wba` | STM32WBA52CG | M33 / 100 MHz | TinyAES + HASH + RNG + PKA | NUCLEO-WBA52CG. Same V2 PKA layout | |
| 78 | +| `wl55` | STM32WL55JC | M4F / 48 MHz | TinyAES + RNG | NUCLEO-WL55JC. Sub-GHz radio | |
| 79 | +| `g0b1` | STM32G0B1RE | M0+ / 64 MHz | TinyAES + RNG | NUCLEO-G0B1RE | |
| 80 | +| `g474` / `g484` | STM32G474RE | M4F / 170 MHz | TinyAES + RNG | NUCLEO-G474RE -- G4 sibling that DOES have AES | |
| 81 | +| `g4a1` | STM32G4A1RE | M4F / 170 MHz | TinyAES + RNG + PKA + AES | G491 sibling that has the full crypto block | |
| 82 | +| `c5a3` | STM32C5A3ZG | M0+ / ~48 MHz | - | NUCLEO-C5A3ZG -- entry-level; software only | |
| 83 | + |
| 84 | +The bare-metal PKA driver in `wolfcrypt/src/port/st/stm32.c` already |
| 85 | +covers the V1 (WB) and V2 (H5/U3/U5/G4A1/WBA) PKA register layouts. New
| 86 | +boards that have PKA need only board bring-up files (startup, linker, |
| 87 | +hw_init, system_*.c) plus `WOLFSSL_STM32_PKA` in `user_settings.h` -- |
| 88 | +no driver changes. |
| 89 | + |
| 90 | +## Repository checkpoints (this session) |
| 91 | + |
| 92 | +`wolfssl@stm32_bare`: |
| 93 | +- `7a8ee7d` H7 PLL bring-up to 480 MHz |
| 94 | +- `06530195b` WB55 AES1 + CCF macro abstraction |
| 95 | +- `8e838294b` G4 family clock-enable maps |
| 96 | +- `112e7f929` PKA BARE driver (V1+V2 register layouts; WB55 validated) |
| 97 | +- `8383907c1` H5 HASH digest read fix (`HRA` not `HR`) |
| 98 | + |
| 99 | +`wolfssl-examples-stm32@stm32_bare`: |
| 100 | +- H7 480 MHz hw_init + benches in README |
| 101 | +- WB55 PLL64 + bench |
| 102 | +- G491 board files + bench + README correction (G491RE has no PKA) |
| 103 | +- WB55 PKA enable |
| 104 | +- H5 cube path wildcard |
| 105 | + |
| 106 | +## H5 reproduction steps (NUCLEO-H563ZI bare-metal flash ECC fault) |
| 107 | + |
| 108 | +### Symptom |
| 109 | + |
| 110 | +After flashing the wolfcrypt test build to NUCLEO-H563ZI, the board |
| 111 | +emits zero bytes on USART3 (PD8 / ST-LINK VCP at 115200 8N1) and |
| 112 | +the CPU spins inside the default NMI handler (`Infinite_Loop` / |
| 113 | +`b .`). Halting via SWD shows xPSR.IPSR = 2 (NMI active). |
| 114 | + |
| 115 | +### Root cause |
| 116 | + |
| 117 | +Flash ECC double-bit detection fires on read of flash address |
| 118 | +**0x08002000**. The status latches in `FLASH_ECCDETR`: |
| 119 | + |
| 120 | +``` |
| 121 | +FLASH_ECCDETR = 0x80000200 |
| 122 | +                  ^ bit 31 ECCD = 1 (uncorrectable error)
| 123 | +                       ^^^ bits[15:0] ADDR_ECC = 0x0200
| 124 | +``` |
| 125 | + |
| 126 | +ADDR_ECC is in 16-byte (128-bit quad-word) units: `0x200 * 16 = |
| 127 | +0x2000`, so the failing flash word is at `0x08002000`. The H5 |
| 128 | +flash interface raises NMI on uncorrectable ECC errors. |
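
The decode described above can be reproduced with plain shell arithmetic. This is a minimal sketch that assumes only the field layout given in this section (bit 31 = ECCD, bits[15:0] = ADDR_ECC in 16-byte units) and the `0x08000000` load address used in the flash command; the variable names are illustrative.

```sh
#!/bin/sh
# Decode a latched FLASH_ECCDETR value into the failing flash address.
eccdetr=0x80000200       # value read back over SWD
flash_base=0x08000000    # load address used when flashing

eccd=$(( ($eccdetr >> 31) & 1 ))       # bit 31: uncorrectable error flag
addr_ecc=$(( $eccdetr & 0xFFFF ))      # bits[15:0]: quad-word index
fault_addr=$(( $flash_base + $addr_ecc * 16 ))   # 16 bytes per quad-word

printf 'ECCD=%d ADDR_ECC=0x%04X fault=0x%08X\n' \
    "$eccd" "$addr_ecc" "$fault_addr"
```

Substituting a different `eccdetr` reading shows quickly whether the fault stays on the same quad-word across reflashes.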
| 129 | + |
| 130 | +### Reproducer |
| 131 | + |
| 132 | +```sh |
| 133 | +cd ~/GitHub/wolfssl-examples-stm32/STM32_Bare_Test |
| 134 | +make BOARD=h5 CONFIG=bare TARGET=test |
| 135 | +PROG=/opt/st/stm32cubeide_*/plugins/com.st.stm32cube.ide.mcu.externaltools.cubeprogrammer.linux64_*/tools/bin/STM32_Programmer_CLI |
| 136 | +$PROG -c port=SWD reset=HWrst -e all -d build/h5-test-bare/app.bin 0x08000000 -v -rst |
| 137 | +# UART log will be empty (0 bytes) at /dev/ttyACM<n> |
| 138 | +``` |
| 139 | + |
| 140 | +To inspect the latched ECC state via OpenOCD: |
| 141 | + |
| 142 | +```sh |
| 143 | +OPENOCD=/home/davidgarske/GitHub/OpenOCD/src/openocd |
| 144 | +SCRIPTS=/home/davidgarske/GitHub/OpenOCD/tcl |
| 145 | +$OPENOCD -s $SCRIPTS -f interface/stlink-dap.cfg \ |
| 146 | + -f target/stm32h5x.cfg \ |
| 147 | + -c "init; halt" \ |
| 148 | + -c "echo {ECCDETR}; mdw 0x40022104" \ |
| 149 | + -c "echo {ECCCORR}; mdw 0x40022100" \ |
| 150 | + -c "shutdown" |
| 151 | +# Expected output: |
| 152 | +# ECCDETR |
| 153 | +# 0x40022104: 80000200 |
| 154 | +# ECCCORR |
| 155 | +# 0x40022100: 00000000 |
| 156 | +``` |
| 157 | + |
| 158 | +### What I tried that did NOT help |
| 159 | + |
| 160 | +- Mass erase + reprogram via STM32_Programmer_CLI (`-e all -d ...`) |
| 161 | +- Mass erase + reprogram via OpenOCD `flash erase_sector ; flash write_image` |
| 162 | +- Padding the `.bin` to 16-byte (128-bit quad-word) alignment |
| 163 | +- Two physical NUCLEO-H563ZI boards (different STLINK serials) |
| 164 | +- Option-byte verification: TZEN = 0xC3 (TZ disabled), SRAM2/3 ECC |
| 165 | + disabled, HDP1_STRT/END set such that no HDP region is configured |
| 166 | + (STRT=1 > END=0, i.e. RM-documented "no protected area"), no WRP |
| 167 | +- Clearing `FLASH_ECCDETR` via an OpenOCD write to bit 31 -- value is
| 168 | + re-latched as soon as the CPU runs again |
| 169 | +- Building `CONFIG=c` (pure software, no BARE drivers, no PKA, no |
| 170 | + wolfssl HW paths) -- same fault, so it is not a wolfssl regression |
| 171 | +- Replacing `printf` with direct USART writes inside `main()` -- |
| 172 | + same fault, so it is not a newlib stdio init issue per se |
| 173 | + |
| 174 | +### What does work on the same hardware |
| 175 | + |
| 176 | +- A standalone direct-USART "Hello %d" program (built with the |
| 177 | + same `--specs=nano.specs --specs=nosys.specs`, ~151 KB) boots |
| 178 | + and prints. The wolfssl-linked test (~260 KB) does not. |
| 179 | +- The programmed image is intact: `STM32_Programmer_CLI -r32 0x08002000 16`
| 180 | + reads flash content that matches the bin byte-for-byte. |
| 181 | + |
| 182 | +### Hypothesis |
| 183 | + |
| 184 | +Either the H5 flash interface stages an ECC error at a fixed |
| 185 | +quad-word in this code-size range that neither programmer is |
| 186 | +clearing, or the chip-erase sequence as currently invoked leaves
| 187 | +that quad-word's ECC bits in an inconsistent state that subsequent
| 188 | +programming does not refresh. The same code path validates end-to-end
| 189 | +on STM32U385 (V2 PKA, identical register sequence), so the wolfssl
| 190 | +PKA driver itself is not the cause. |
| 191 | + |
| 192 | +### What to try next
| 193 | + |
| 194 | +- Test with a different programming tool (J-Link, STM32CubeIDE GUI) |
| 195 | + to rule out CLI / OpenOCD behavior |
| 196 | +- Try writing the same image to BANK2 (0x08100000) and switching |
| 197 | + SWAP_BANK -- if the fault follows the bank, it is silicon; if it |
| 198 | + follows the address, it is the programmer |
| 199 | +- Try a smaller wolfssl build that does not cross 0x08002000 to |
| 200 | + confirm the dependency is on the physical flash address rather than
| 201 | + on the content stored there
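
For the last experiment it helps to confirm up front whether a given image reaches the faulting quad-word at all. The following is a sketch under stated assumptions: the image is a raw `.bin` loaded at `0x08000000`, and `covers_fault` is a hypothetical helper name rather than part of the example repository. The demo uses a throwaway 16 KiB file in place of a real build output.

```sh
#!/bin/sh
# Report whether a raw image loaded at flash_base covers the faulting address.
flash_base=$((0x08000000))
fault_addr=$((0x08002000))

covers_fault() {
    # $1 = path to the .bin; wc -c gives its size in bytes
    size=$(wc -c < "$1")
    [ $(( flash_base + $size )) -gt "$fault_addr" ]
}

# Stand-in image: 16 KiB of zeros (a real run would pass the built app.bin).
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1024 count=16 2>/dev/null

if covers_fault "$img"; then result=covers; else result=clear; fi
echo "image $result 0x08002000"
rm -f "$img"
```

An image reported as `clear` ends before the faulting address, so any remaining fault would have to come from something other than fetching code placed there.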