|
| 1 | +# wolfIP port: Xilinx ZCU102 (UltraScale+ MPSoC) |
| 2 | + |
| 3 | +Bare-metal wolfIP port for the AMD/Xilinx Zynq UltraScale+ MPSoC, demoed |
| 4 | +on the ZCU102 dev board. Targets a single Cortex-A53 core (APU 0) at |
| 5 | +EL3, GCC bare-metal, no Xilinx Standalone BSP, no FreeRTOS, no wolfBoot. |
| 6 | + |
| 7 | +This first milestone is aimed at a deterministic UDP-only profile |
| 8 | +suitable for DO-178C DAL-C qualification. The application opens a |
| 9 | +UDP echo socket on port 7 and runs a DHCP client to acquire a lease. |
| 10 | + |
| 11 | +## What this port covers |
| 12 | + |
| 13 | +- PS-GEM3 (on-board RJ45) at 1 Gbps via the TI DP83867IR PHY (RGMII). |
| 14 | +- Polled RX + polled TX (GIC-400 SPI 63 latches correctly, but the |
| 15 | + Cortex-A53 IRQ exception is not delivered in our EL3 single-EL setup; |
| 16 | + `eth_poll()` drives `gem_isr()` directly from the main loop). |
| 17 | +- Clean-room Cadence GEM driver - no XEmacPs, no Xilinx Standalone BSP, |
| 18 | + no `xparameters.h`. All register base addresses live in `board.h`. |
| 19 | +- MMU at EL3 with a static page table: DDR Normal WB, a Normal-NC DMA |
| 20 | + carve-out for GEM BDs/buffers, peripherals Device-nGnRnE, and an |
| 21 | + OCM (0xFFFC0000+) Normal-WB executable block. |
| 22 | +- PS-UART0 polled console (USB-UART on the ZCU102 board, channel 0). |
| 23 | +- DHCP client and a UDP echo demo (port 7); ICMP echo reply works |
| 24 | + through the wolfIP core. |
| 25 | + |
| 26 | +## What is explicitly NOT in this port yet |
| 27 | + |
| 28 | +- Software VLAN (Daniele has a separate wolfIP-core PR in flight). |
| 29 | +- uC/OS-II socket port (planned follow-up; trivially adapts an existing |
| 30 | + `bsd_socket.c`). |
| 31 | +- Additional GEM instances (GEM0/1/2). Driver is single-instance. |
| 32 | +- Versal Gen 1, Zynq-7000. |
| 33 | +- wolfBoot integration. Stock Xilinx FSBL hands control directly to |
| 34 | + `app.elf`. |
| 35 | +- TLS / wolfSSL. |
| 36 | + |
| 37 | +## Hardware |
| 38 | + |
| 39 | +- AMD/Xilinx ZCU102 evaluation board (XCZU9EG-2FFVB1156). Rev 1.0 and |
| 40 | + 1.1 both work. |
| 41 | +- USB-UART via the on-board FTDI FT4232 (host sees four `/dev/ttyUSB*` |
| 42 | + channels; UART0 is the standard one, typically `/dev/ttyUSB0` or the |
| 43 | + channel labelled "MIO" depending on board / udev). |
| 44 | +- Ethernet via the on-board RJ45 (PS-GEM3 -> DP83867 PHY @ MDIO 0x0C). |
| 45 | + |
| 46 | +## Build |
| 47 | + |
| 48 | +Toolchain: Arm GNU `aarch64-none-elf-gcc`, found on `$PATH` by default; |
| 49 | +override with `CROSS_COMPILE=...-` if needed. |
| 50 | + |
| 51 | +``` |
| 52 | +cd src/port/zcu102 |
| 53 | +make CROSS_COMPILE=aarch64-none-elf- |
| 54 | +``` |
| 55 | + |
| 56 | +Output: `app.elf`. Section sizes are printed at the end of the build. |
| 57 | + |
| 58 | +## Build BOOT.BIN |
| 59 | + |
| 60 | +You need a pre-built ZCU102 FSBL ELF. The simplest way to obtain one |
| 61 | +is the Vitis "zynqmp_fsbl" template (single-click build), or PetaLinux |
| 62 | +`petalinux-build -c bootloader`. We deliberately do NOT vendor FSBL |
| 63 | +sources here; FSBL is a Xilinx-provided component and the stock build works. |
| 64 | + |
| 65 | +Source Vitis first (so `bootgen` is on `$PATH`), then: |
| 66 | + |
| 67 | +``` |
| 68 | +FSBL_ELF=/path/to/zynqmp_fsbl.elf make bootbin |
| 69 | +``` |
| 70 | + |
| 71 | +Output: `BOOT.BIN` in the port directory. |
| 72 | + |
| 73 | +## Boot |
| 74 | + |
| 75 | +### SD card boot |
| 76 | + |
| 77 | +1. Format a microSD as FAT32. |
| 78 | +2. Copy `BOOT.BIN` to the root of the SD card. |
| 79 | +3. Set ZCU102 boot mode DIP SW6 to SD (positions 1-4 = ON, OFF, OFF, OFF). |
| 80 | +4. Insert the card and power-cycle the board. |
| 81 | + |
| 82 | +### JTAG boot (Vitis xsct) |
| 83 | + |
| 84 | +``` |
| 85 | +xsct |
| 86 | +% connect |
| 87 | +% targets -set -filter {name =~ "PSU"} |
| 88 | +% rst -system |
| 89 | +% loadhw -hw /path/to/your-design.xsa |
| 90 | +% targets -set -filter {name =~ "Cortex-A53 #0"} |
| 91 | +% dow /path/to/wolfip/src/port/zcu102/app.elf |
| 92 | +% con |
| 93 | +``` |
| 94 | + |
| 95 | +If you do not have an XSA from your own design, the stock ZCU102 base |
| 96 | +design from Vitis is fine - we only depend on the PS configuration |
| 97 | +(DDR controller, MIO pinmuxing, IOPLL clocks) which is identical |
| 98 | +across base designs. |
| 99 | + |
| 100 | +### JTAG iteration (no SD swap) |
| 101 | + |
| 102 | +This port ships a self-contained xsdb loader under `jtag/` that |
| 103 | +power-cycles the board (via remote Pi GPIO, optional), forces JTAG |
| 104 | +boot mode, runs `psu_init`, loads `app.elf` into OCM, and releases |
| 105 | +A53-0 at the OCM entry. The whole app + BSS + page tables + DMA |
| 106 | +buffers fit in the 256 KB OCM, so DDR-via-JTAG flakiness is avoided. |
| 107 | + |
| 108 | +``` |
| 109 | +./jtag/boot.sh # one-shot |
| 110 | +./jtag/boot_iter.sh # build + power-cycle + load loop |
| 111 | +``` |
| 112 | + |
| 113 | +See `jtag/boot.tcl` for the actual xsdb sequence. |
| 114 | + |
| 115 | +## Expected UART output |
| 116 | + |
| 117 | +``` |
| 118 | +=== wolfIP ZCU102 (UltraScale+ A53-0 EL3) === |
| 119 | +MMU on, caches on. Bringing up GIC-400... |
| 120 | +Initializing wolfIP stack... |
| 121 | +Bringing up GEM3 (RGMII, DP83867)... |
| 122 | +GEM3: PHY at MDIO addr=0x0000000C |
| 123 | +DP83867: ID1=0x00002000 ID2=0x0000A231 |
| 124 | +DP83867 link: 1000 Mbps FD |
| 125 | + link UP, PHY=0x0000000C |
| 126 | +Starting DHCP client... |
| 127 | +DHCP bound: |
| 128 | + IP: 192.168.1.50 |
| 129 | + Mask: 255.255.255.0 |
| 130 | + GW: 192.168.1.1 |
| 131 | +Opening UDP echo socket on port 7 |
| 132 | +Ready. Try: nc -u <leased-ip> 7 |
| 133 | +``` |
| 134 | + |
| 135 | +## Verification |
| 136 | + |
| 137 | +From a host on the same subnet as the board: |
| 138 | + |
| 139 | +``` |
| 140 | +$ ping -c 3 192.168.1.50 |
| 141 | +$ echo "hello wolfip" | nc -u -w1 192.168.1.50 7 |
| 142 | +hello wolfip |
| 143 | +``` |
| 144 | + |
| 145 | +Capture the UART output with the `uart-monitor` skill (add a board entry |
| 146 | +pointing at `/dev/ttyUSB0`, 115200 8N1). |
| 147 | + |
| 148 | +## Files |
| 149 | + |
| 150 | +| File | Purpose | |
| 151 | +|---------------------|---------| |
| 152 | +| `Makefile` | Build app.elf and BOOT.BIN | |
| 153 | +| `target.ld` | aarch64 EL3 linker script - separate RX/RW segments, 2 MB DMA region | |
| 154 | +| `startup.S` | EL3 vectors, BSS clear, MMU/main bring-up, IRQ trampoline | |
| 155 | +| `board.h` | PS register base addresses, GIC SPI IDs | |
| 156 | +| `mmu.c` / `.h` | EL3 page tables (T0SZ=32, 1 GB L1 + 2 MB L2 for DDR + DMA carve-out) | |
| 157 | +| `gic.c` / `.h` | GIC-400 (GICv2) minimal driver | |
| 158 | +| `uart.c` / `.h` | PS-UART0 polled console | |
| 159 | +| `gem.c` / `.h` | Cadence GEM driver (PS-GEM3): BDs, polled-RX/TX, MDIO, cache maintenance | |
| 160 | +| `phy_dp83867.c` / `.h` | TI DP83867IR init + RGMII skew + AN + RX_CTRL strap quirk | |
| 161 | +| `main.c` | wolfIP init, DHCP client, UDP echo on port 7, memset/memcpy wrappers | |
| 162 | +| `config.h` | wolfIP build profile (UDP-only intent) | |
| 163 | +| `bootgen/boot.bif` | bootgen template (substitutes `${FSBL_ELF}` and `${APP_ELF}`) | |
| 164 | +| `bootgen/build_bootbin.sh` | renders the bif and invokes bootgen | |
| 165 | +| `jtag/boot.sh` / `.tcl` | xsdb loader for OCM-only JTAG iteration | |
| 166 | + |
| 167 | +## Notes for cert / DAL-C |
| 168 | + |
| 169 | +- No Xilinx Standalone BSP linked in. `aarch64-none-elf-gcc` newlib |
| 170 | + provides `memcpy`/`memset` only. |
| 171 | +- No dynamic allocation. All buffers static in BSS or `.dma_buffers`. |
| 172 | +- No floating point (`-mgeneral-regs-only`). |
| 173 | +- The MAC address is hard-coded in `board.h`. Replace with a |
| 174 | + per-board value (e.g., read from EEPROM or PS_VERSION fuses) for |
| 175 | + production; we keep it static for repeatability in the lab. |
| 176 | +- The wolfIP core currently sizes its timer heap as |
| 177 | + `MAX_TIMERS = MAX_TCPSOCKETS * 3`. This port sets `MAX_TCPSOCKETS=2` |
| 178 | + in `config.h` so DHCP / ARP can schedule timers; the application |
| 179 | + does not open any TCP sockets. A core wolfIP follow-up should |
| 180 | + decouple the timer count from TCP so the TCP code can be fully |
| 181 | + excluded from a DAL-C build. |
| 182 | +- The wolfIP core triggers two false-positive GCC warnings |
| 183 | + (`-Wzero-length-bounds`, `-Wtype-limits`) when `MAX_TCPSOCKETS` |
| 184 | + reaches its lower bound. We suppress them on the wolfip.c compile |
| 185 | + only; the diagnostics on this port's source remain at `-Wall -Wextra |
| 186 | + -Werror`. |
| 187 | +- newlib's aarch64 `memset`/`memcpy` use `dc zva`, which hangs on this |
| 188 | + Cortex-A53 setup even with `SCTLR_EL3.DZE=1`. We override both with |
| 189 | + bytewise versions in `main.c` via `-Wl,--wrap`. |
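
The `--wrap` override in the last note can be pictured roughly as follows. This is a minimal sketch, not the code in `main.c`: it assumes the link line carries `-Wl,--wrap=memset -Wl,--wrap=memcpy`, so that GNU ld redirects calls to `memset`/`memcpy` to the `__wrap_` symbols. The `volatile` pointers are an assumption of this sketch; they stop GCC from recognizing the loop as a memset/memcpy idiom and emitting a call back into the wrapped function.

```c
#include <stddef.h>

/* Sketch only: bytewise replacements that never issue `dc zva`.
 * With -Wl,--wrap=memset, every reference to memset resolves to
 * __wrap_memset (and likewise for memcpy). */
void *__wrap_memset(void *dst, int c, size_t n)
{
    /* volatile keeps the compiler from turning this loop back into
     * a memset call, which would recurse into this function. */
    volatile unsigned char *d = dst;

    while (n--)
        *d++ = (unsigned char)c;
    return dst;
}

void *__wrap_memcpy(void *dst, const void *src, size_t n)
{
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;

    /* Forward byte copy; like memcpy, overlapping regions are not supported. */
    while (n--)
        *d++ = *s++;
    return dst;
}
```

Bytewise loops are slow compared with newlib's optimized routines, but they are easy to review and contain no cache-maintenance instructions.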
| 190 | + |
| 191 | +## Known issues |
| 192 | + |
| 193 | +- The A53 IRQ exception is not delivered (GIC latches the SPI/SGI and |
| 194 | + `GICC_IAR` ack works when polled, but the IRQ vector never fires). |
| 195 | + Worked around by polling `gem_isr()` from `eth_poll()` in the main |
| 196 | + loop. The root cause is still unknown. |
| 197 | +- `MAX_TCPSOCKETS=2` is the minimum for the current wolfIP core - see |
| 198 | + the timer-heap note above. |
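
The polling workaround from the first known issue can be sketched as below. This is an illustration under stated assumptions, not the driver's code: the `void gem_isr(void)` and `void eth_poll(void)` signatures are guesses, and `fake_int_status`, `FAKE_RX_COMPLETE`, and `rx_events` are hypothetical stand-ins for the memory-mapped GEM interrupt status register and the real RX handling.

```c
#include <stdint.h>

/* Hypothetical stand-in for the GEM interrupt status register; on
 * hardware this would be a memory-mapped register, not a variable. */
volatile uint32_t fake_int_status;
#define FAKE_RX_COMPLETE (1u << 1)   /* illustrative bit, not from board.h */

unsigned rx_events;                  /* RX events serviced so far */

/* The same handler an IRQ vector would invoke if the exception fired. */
void gem_isr(void)
{
    uint32_t status = fake_int_status;

    fake_int_status = 0;             /* acknowledge everything we read */
    if (status & FAKE_RX_COMPLETE)
        rx_events++;
}

/* Called once per main-loop iteration instead of waiting for an IRQ. */
void eth_poll(void)
{
    gem_isr();
}
```

Because the handler only acts on status bits that are set, calling it on every loop iteration is harmless when nothing is pending.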