| 1 | +# Hardware PID Benchmark |
| 2 | + |
| 3 | +Cycle-accurate PID benchmark for Cortex-M hardware. On Cortex-M targets it |
| 4 | +reads the ARM DWT `CYCCNT` register for cycle-exact measurement; on |
| 5 | +`native_sim` it falls back to `clock_gettime`. |
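
The timer selection described above could look roughly like the sketch below. It is not the benchmark's actual source: the `bench_timer_*` names are hypothetical, and the addresses are the architecturally defined ARMv7-M locations of `DEMCR`, `DWT_CTRL`, and `DWT_CYCCNT`. Some Cortex-M7 parts may additionally require unlocking the DWT through its lock access register, which is omitted here.

```c
#define _POSIX_C_SOURCE 200809L /* exposes clock_gettime under strict -std=c11 */
#include <assert.h>
#include <stdint.h>
#include <time.h>

#if defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
/* Architecturally defined debug/DWT register addresses (ARMv7-M). */
#define BENCH_DEMCR      (*(volatile uint32_t *)0xE000EDFCu)
#define BENCH_DWT_CTRL   (*(volatile uint32_t *)0xE0001000u)
#define BENCH_DWT_CYCCNT (*(volatile uint32_t *)0xE0001004u)

static inline void bench_timer_init(void)
{
    BENCH_DEMCR |= (1u << 24);  /* TRCENA: enable the DWT block */
    BENCH_DWT_CYCCNT = 0u;      /* start counting from zero */
    BENCH_DWT_CTRL |= 1u;       /* CYCCNTENA: start the cycle counter */
}

static inline uint64_t bench_timer_now(void)
{
    return BENCH_DWT_CYCCNT;    /* 32-bit cycle count, widened */
}

static inline const char *bench_timer_unit(void) { return "cycles"; }
#else
/* Host fallback: monotonic wall-clock time in nanoseconds. */
static inline void bench_timer_init(void) {}

static inline uint64_t bench_timer_now(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static inline const char *bench_timer_unit(void) { return "ns"; }
#endif
```

Keeping both back ends behind the same three functions lets the measurement loop stay identical across boards; only the reported unit changes.
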
| 6 | + |
| 7 | +## Supported Boards |
| 8 | + |
| 9 | +- **nucleo_f446re** — STM32F446RE (Cortex-M4, 180 MHz) |
| 10 | +- **nucleo_h743zi** — STM32H743ZI (Cortex-M7, 480 MHz) |
| 11 | +- **native_sim** — Host execution (nanosecond timing) |
| 12 | + |
| 13 | +## Building |
| 14 | + |
| 15 | +```sh |
| 16 | +# For real hardware |
| 17 | +west build -b nucleo_f446re tests/benchmarks/hw_benchmark |
| 18 | +west build -b nucleo_h743zi tests/benchmarks/hw_benchmark |
| 19 | + |
| 20 | +# For simulation |
| 21 | +west build -b native_sim tests/benchmarks/hw_benchmark |
| 22 | +``` |
| 23 | + |
| 24 | +## Flashing |
| 25 | + |
| 26 | +Connect the Nucleo board via USB (ST-Link) and run: |
| 27 | + |
| 28 | +```sh |
| 29 | +west flash |
| 30 | +``` |
| 31 | + |
| 32 | +On Linux, make sure a udev rule grants your user access to the ST-Link: |
| 33 | + |
| 34 | +```sh |
| 35 | +# /etc/udev/rules.d/99-stlink.rules |
| 36 | +SUBSYSTEM=="usb", ATTR{idVendor}=="0483", ATTR{idProduct}=="374b", MODE="0666" |
| 37 | +``` |
| 38 | + |
| 39 | +## Capturing Results |
| 40 | + |
| 41 | +Open a serial terminal at 115200 baud on the board's virtual COM port: |
| 42 | + |
| 43 | +```sh |
| 44 | +# Linux — typical device path for Nucleo boards |
| 45 | +picocom -b 115200 /dev/ttyACM0 |
| 46 | + |
| 47 | +# Note: `west espressif monitor` applies to ESP boards only. |
| 48 | +# For STM32 Nucleo, use any serial terminal on the ST-Link VCP, |
| 49 | +# such as the picocom command above. |
| 50 | +``` |
| 51 | + |
| 52 | +Press the board's reset button to re-run the benchmark. |
| 53 | + |
| 54 | +### Expected Output |
| 55 | + |
| 56 | +``` |
| 57 | +[00:00:00.000,000] <inf> hw_bench: === HW PID Benchmark === |
| 58 | +[00:00:00.000,000] <inf> hw_bench: Timer unit: cycles |
| 59 | +[00:00:00.000,000] <inf> hw_bench: Iterations: 10000 (warmup: 500) |
| 60 | +[00:00:00.xxx,xxx] <inf> hw_bench: --- Hand-coded PID --- |
| 61 | +[00:00:00.xxx,xxx] <inf> hw_bench: Total: NNNN cycles (NN cycles/tick) |
| 62 | +[00:00:00.xxx,xxx] <inf> hw_bench: RAM (struct): 24 bytes |
| 63 | +[00:00:00.xxx,xxx] <inf> hw_bench: --- arbiter Engine PID --- |
| 64 | +[00:00:00.xxx,xxx] <inf> hw_bench: Total: NNNN cycles (NN cycles/tick) |
| 65 | +[00:00:00.xxx,xxx] <inf> hw_bench: RAM (ctx): NNN bytes |
| 66 | +[00:00:00.xxx,xxx] <inf> hw_bench: === Comparison === |
| 67 | +[00:00:00.xxx,xxx] <inf> hw_bench: Engine overhead: NN% (NN vs NN cycles/tick) |
| 68 | +[00:00:00.xxx,xxx] <inf> hw_bench: HW Benchmark complete |
| 69 | +``` |
| 70 | + |
| 71 | +## Running via Twister |
| 72 | + |
| 73 | +```sh |
| 74 | +# Host run on native_sim (no hardware required; add --build-only to skip execution) |
| 75 | +west twister -T tests/benchmarks/hw_benchmark -p native_sim |
| 76 | + |
| 77 | +# With real hardware (requires a connected board; adjust the serial device path) |
| 78 | +west twister -T tests/benchmarks/hw_benchmark -p nucleo_f446re --device-testing --device-serial /dev/ttyACM0 |
| 79 | +``` |
| 80 | + |
| 81 | +## How It Works |
| 82 | + |
| 83 | +### DWT CYCCNT (Cortex-M) |
| 84 | + |
| 85 | +The Data Watchpoint and Trace (DWT) unit provides a 32-bit free-running |
| 86 | +cycle counter (`CYCCNT`). At typical Cortex-M clock speeds: |
| 87 | + |
| 88 | +- **180 MHz (F446RE)**: wraps every ~23.9 seconds |
| 89 | +- **480 MHz (H743ZI)**: wraps every ~8.9 seconds |
| 90 | + |
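| 91 | +The benchmark completes well within these limits: even at several hundred cycles per tick, 10,000 measured iterations total only a few million cycles, far below the 2^32 (about 4.29 billion) wrap point. |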
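
The wrap intervals follow from dividing 2^32 counts by the core clock. The sketch below reproduces that arithmetic and shows why a single wrap between two samples is harmless when the delta is taken with unsigned subtraction. The function names are illustrative and not taken from the benchmark source.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a 32-bit free-running counter wraps at the given clock. */
static inline double cyccnt_wrap_seconds(double core_hz)
{
    assert(core_hz > 0.0);
    return 4294967296.0 / core_hz; /* 2^32 counts */
}

/* Elapsed cycles between two CYCCNT samples. Unsigned subtraction is
 * modulo 2^32, so the result is still correct if the counter wrapped
 * once between the two reads. */
static inline uint32_t cyccnt_elapsed(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}
```

For example, `cyccnt_wrap_seconds(180e6)` is about 23.86 and `cyccnt_wrap_seconds(480e6)` is about 8.95. A window that starts at `0xFFFFFF00` and ends at `0x00000100` yields an elapsed count of `0x200`.
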
| 92 | + |
| 93 | +### Measurement Methodology |
| 94 | + |
| 95 | +1. **Warmup**: 500 iterations to stabilize caches and branch predictors. |
| 96 | +2. **Measured window**: 10,000 iterations of the PID loop. |
| 97 | +3. **Both implementations** (hand-coded and arbiter engine) use identical |
| 98 | + inputs and produce identical control outputs. |
| 99 | +4. The cycle count delta is divided by the iteration count for per-tick cost. |
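
The steps above can be sketched as a small harness. This is a minimal illustration under stated assumptions, not the benchmark's code: `bench_run`, `per_tick`, and `overhead_percent` are hypothetical names, the overhead formula (engine cost relative to the hand-coded baseline) is an assumed reading of the "Engine overhead" line in the log, and `fake_tick`/`fake_now` are stand-ins that make the example deterministic on a host.

```c
#include <assert.h>
#include <stdint.h>

#define WARMUP_ITERS   500u
#define MEASURED_ITERS 10000u

typedef uint64_t (*now_fn)(void);     /* returns the current timer value */
typedef void (*tick_fn)(void *ctx);   /* one PID update */

/* Run the warmup, then time the measured window; returns the timer delta. */
static uint64_t bench_run(tick_fn tick, void *ctx, now_fn now)
{
    for (uint32_t i = 0; i < WARMUP_ITERS; i++) {
        tick(ctx);
    }
    uint64_t start = now();
    for (uint32_t i = 0; i < MEASURED_ITERS; i++) {
        tick(ctx);
    }
    return now() - start;
}

/* Average cost of one tick over the measured window. */
static uint32_t per_tick(uint64_t total)
{
    return (uint32_t)(total / MEASURED_ITERS);
}

/* Assumed formula: engine cost relative to the hand-coded baseline, in %. */
static int32_t overhead_percent(uint32_t engine, uint32_t baseline)
{
    assert(baseline > 0u);
    return (int32_t)(((int64_t)engine - (int64_t)baseline) * 100 / (int64_t)baseline);
}

/* Stand-ins for illustration only: each fake tick costs 42 timer units. */
static uint64_t fake_clock;
static void fake_tick(void *ctx) { (void)ctx; fake_clock += 42u; }
static uint64_t fake_now(void) { return fake_clock; }
```

With the stand-ins, `bench_run(fake_tick, NULL, fake_now)` returns 420000 and `per_tick` reports 42. Warmup ticks advance the fake clock too, but they fall outside the measured window and so do not affect the delta.
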
| 100 | + |
| 101 | +### ROM Comparison |
| 102 | + |
| 103 | +ROM usage cannot be measured at runtime. After building, inspect the `.elf` section sizes (flash usage is roughly `text` + `data`): |
| 104 | + |
| 105 | +```sh |
| 106 | +arm-none-eabi-size build/zephyr/zephyr.elf |
| 107 | +``` |