
Commit bcaaf37

ZynqMP ZCU102 SD-card Linux boot: EL2 cleanup, DTS bootargs, SDHCI init
Add the pieces needed to boot Linux end-to-end from the ZCU102 SD card with wolfBoot at EL2:

* src/boot_aarch64_start.S: new el2_flush_and_disable_mmu helper that cleans D-cache to PoC, invalidates I-cache to PoU, and clears SCTLR_EL2.{M,C,I}, then returns. Satisfies the ARM64 Linux boot protocol and is also correct for any other payload that sets up its own translation (hypervisor, RTOS, later bootloader stage).
* src/boot_aarch64.c: call el2_flush_and_disable_mmu from do_boot() on the EL2 direct-jump path before falling through to the br x4 block. Also pull in hal/zynq.h and hal/nxp_ls1028a.h so the EL_HYPERVISOR / BOOT_EL1 guards compile for those targets.
* hal/zynq.c: implement hal_dts_fixup(): set /chosen/bootargs from LINUX_BOOTARGS (with a LINUX_BOOTARGS_ROOT default of /dev/mmcblk0p4) and grow DTB totalsize by 512 bytes to give fdt_setprop() headroom (matches hal/versal.c). Add hal_get_timer_us() via CNTPCT_EL0.
* src/sdhci.c: add a 1 ms settling delay after sdhci_platform_init() and a CMD0 retry loop (up to 10 x 10 ms) so the ZCU102 Arasan controller reliably detects the card after the slot-type change + soft reset.
* config/examples/zynqmp_sdcard.config: stay at EL2 by default (comment out BOOT_EL1), default rootfs to /dev/mmcblk0p4, turn DEBUG off.
* hal/versal.c: correct the default LINUX_BOOTARGS_ROOT to /dev/mmcblk0p4 to match the shipped MBR layout.
* docs/Targets.md: note the unconditional EL2 cleanup in the ZynqMP and Versal SD-card sections.

Behavior change: non-Linux AArch64 EL2 payloads now enter with MMU off and caches clean instead of inheriting wolfBoot's tables. No in-tree payload relies on the old state leakage.
1 parent 877ffea commit bcaaf37


7 files changed: +263 -14 lines changed


config/examples/zynqmp_sdcard.config

Lines changed: 14 additions & 5 deletions
@@ -18,7 +18,7 @@ HASH?=SHA3
 IMAGE_HEADER_SIZE?=1024
 
 # Debug options
-DEBUG?=1
+DEBUG?=0
 DEBUG_SYMBOLS=1
 DEBUG_UART=1
 CFLAGS_EXTRA+=-DDEBUG_ZYNQ=1
@@ -39,8 +39,12 @@ NO_XIP=1
 # ELF loading support
 ELF?=1
 
-# Boot Exception Level: transition from EL2 -> EL1 before jumping to app
-BOOT_EL1?=1
+# Boot Exception Level: leave wolfBoot at EL2 for handoff to Linux (matches
+# the standard PetaLinux U-Boot flow and preserves KVM/hypervisor use of
+# EL2). The EL2 Linux-cleanup path in do_boot() will clean dcache/disable
+# MMU before jumping to the kernel. To drop to EL1 via ERET instead, set
+# BOOT_EL1?=1 (requires EL2_HYPERVISOR=1, which is the hal/zynq.h default).
+#BOOT_EL1?=1
 
 # General options
 VTOR?=1
@@ -78,8 +82,13 @@ CFLAGS_EXTRA+=-DBOOT_PART_B=2
 # Disk read chunk size (512KB)
 CFLAGS_EXTRA+=-DDISK_BLOCK_SIZE=0x80000
 
-# Linux rootfs is on partition 4 (SD1 = mmcblk1)
-CFLAGS_EXTRA+=-DLINUX_BOOTARGS_ROOT=\"/dev/mmcblk1p4\"
+# Linux rootfs is on partition 4. Device naming depends on whether both
+# ZynqMP SDHCI controllers are enabled in the XSA / device tree:
+#   * both sdhci0 + sdhci1 enabled -> SD1 = /dev/mmcblk1
+#   * only sdhci1 enabled (ZCU102 default -> only external SD populated)
+#       -> SD1 = /dev/mmcblk0
+# Check `ls /sys/class/mmc_host/` on your running target to confirm.
+CFLAGS_EXTRA+=-DLINUX_BOOTARGS_ROOT=\"/dev/mmcblk0p4\"
 
 # ============================================================================
 # Boot Memory Layout

docs/Targets.md

Lines changed: 4 additions & 0 deletions
@@ -2599,6 +2599,8 @@ qemu-system-aarch64 -machine xlnx-zcu102 -cpu cortex-a53 -serial stdio -display
 
 Use `config/examples/zynqmp_sdcard.config`. This uses the Arasan SDHCI controller (SD1 - external SD card slot on ZCU102) and an **MBR** partitioned SD card.
 
+When wolfBoot stays at EL2 (the default in this config), it always cleans and invalidates the D-cache, invalidates the I-cache, and disables the EL2 MMU and caches before handoff (see `el2_flush_and_disable_mmu` in `src/boot_aarch64_start.S`). This satisfies the ARM64 Linux boot protocol with no extra config flag required.
+
 **Partition layout**
 | Partition | Name | Size | Type | Contents |
 |-----------|--------|-----------|-------------------------------|-------------------------------------------|
@@ -3005,6 +3007,8 @@ Typical boot timing with ECC384/SHA384 signing:
 
 Use `config/examples/versal_vmk180_sdcard.config`. This uses the Arasan SDHCI controller and an **MBR** partitioned SD card.
 
+Versal defaults to `BOOT_EL1`, so the handoff goes through `el2_to_el1_boot` (ERET to EL1). Custom `BOOT_EL2` Versal configs get the same EL2 cache/MMU teardown as ZynqMP via `el2_flush_and_disable_mmu` in `src/boot_aarch64_start.S`, so no extra config flag is needed to boot Linux directly at EL2.
+
 **Partition layout**
 | Partition | Name | Size | Type | Contents |
 |-----------|------|------|------|----------|

hal/versal.c

Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@
 /* Linux kernel command line arguments */
 #ifndef LINUX_BOOTARGS
 #ifndef LINUX_BOOTARGS_ROOT
-#define LINUX_BOOTARGS_ROOT "/dev/mmcblk0p2"
+#define LINUX_BOOTARGS_ROOT "/dev/mmcblk0p4"
 #endif
 
 #define LINUX_BOOTARGS \

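The `#ifndef` guards in the hunk above make the built-in root device a fallback only: a build that already defines `LINUX_BOOTARGS_ROOT` (for example through a `-D` flag in `CFLAGS_EXTRA`) skips the default. A minimal standalone sketch of that pattern follows. `EXAMPLE_BOOTARGS` and `example_bootargs()` are hypothetical names, and the template string is illustrative only, not the actual Versal command line (which is truncated in this diff).

```c
/* Sketch of an overridable string macro, assuming no -D override is given.
 * A build flag such as -DLINUX_BOOTARGS_ROOT=\"/dev/mmcblk1p4\" would define
 * the macro before this point, so the default below would be skipped. */
#ifndef LINUX_BOOTARGS_ROOT
#define LINUX_BOOTARGS_ROOT "/dev/mmcblk0p4"
#endif

/* Hypothetical template: adjacent string literals are concatenated by the
 * compiler, so the root device is spliced in at compile time. */
#define EXAMPLE_BOOTARGS "root=" LINUX_BOOTARGS_ROOT " rootwait"

/* Hypothetical accessor so the composed string can be inspected. */
const char *example_bootargs(void)
{
    return EXAMPLE_BOOTARGS;
}
```

Because the splice happens in the preprocessor, changing the root device needs a rebuild but no runtime string handling in the bootloader.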
hal/zynq.c

Lines changed: 68 additions & 3 deletions
@@ -57,6 +57,20 @@
 /* QSPI bare-metal */
 #endif
 
+/* DTB fixup for kernel command line. Override LINUX_BOOTARGS or
+ * LINUX_BOOTARGS_ROOT in your config to customize.
+ *
+ * Note: console=ttyPS0 is ZynqMP-specific (PS UART0). Versal's default
+ * (hal/versal.c) omits the console= token because Versal relies on
+ * earlycon alone plus a DT-declared stdout-path. */
+#ifndef LINUX_BOOTARGS
+#ifndef LINUX_BOOTARGS_ROOT
+#define LINUX_BOOTARGS_ROOT "/dev/mmcblk0p4"
+#endif
+#define LINUX_BOOTARGS \
+    "earlycon console=ttyPS0,115200 root=" LINUX_BOOTARGS_ROOT " rootwait"
+#endif
+
 /* QSPI Slave Device Information */
 typedef struct QspiDev {
     uint32_t mode; /* GQSPI_GEN_FIFO_MODE_SPI, GQSPI_GEN_FIFO_MODE_DSPI or GQSPI_GEN_FIFO_MODE_QSPI */
@@ -1795,7 +1809,20 @@ void RAMFUNCTION ext_flash_unlock(void)
 
 }
 
-#ifdef MMU
+#if defined(MMU) && defined(__WOLFBOOT)
+/* Get current time in microseconds using ARMv8 generic timer */
+uint64_t hal_get_timer_us(void)
+{
+    uint64_t count, freq;
+    __asm__ volatile("mrs %0, CNTPCT_EL0" : "=r"(count));
+    __asm__ volatile("mrs %0, CNTFRQ_EL0" : "=r"(freq));
+    if (freq == 0)
+        return 0;
+    /* Use __uint128_t to avoid overflow of (count * 1e6) at long uptimes
+     * (would overflow uint64_t after ~51h at 100MHz). */
+    return (uint64_t)(((__uint128_t)count * 1000000ULL) / freq);
+}
+
 void* hal_get_dts_address(void)
 {
 #ifdef WOLFBOOT_DTS_BOOT_ADDRESS
@@ -1809,8 +1836,46 @@ void* hal_get_dts_address(void)
 
 int hal_dts_fixup(void* dts_addr)
 {
-    /* place FDT fixup specific to ZynqMP here */
-    //fdt_set_boot_cpuid_phys(buf, fdt_boot_cpuid_phys(fdt));
+    int off, ret;
+    struct fdt_header *fdt = (struct fdt_header *)dts_addr;
+
+    /* Verify FDT header */
+    ret = fdt_check_header(dts_addr);
+    if (ret != 0) {
+        wolfBoot_printf("FDT: Invalid header! %d\n", ret);
+        return ret;
+    }
+
+    wolfBoot_printf("FDT: Version %d, Size %d\n",
+        fdt_version(fdt), fdt_totalsize(fdt));
+
+    /* Expand totalsize so fdt_setprop() has in-blob free space to place
+     * a new/larger bootargs property. Physical headroom is already
+     * guaranteed by the load-address layout (DTB at WOLFBOOT_LOAD_DTS_ADDRESS,
+     * kernel loaded much higher), so growing the header is safe. Matches
+     * the pattern used in hal/versal.c:hal_dts_fixup. */
+    fdt_set_totalsize(fdt, fdt_totalsize(fdt) + 512);
+
+    /* Find /chosen node */
+    off = fdt_find_node_offset(fdt, -1, "chosen");
+    if (off < 0) {
+        /* Create /chosen node if it doesn't exist */
+        off = fdt_add_subnode(fdt, 0, "chosen");
+    }
+    if (off < 0) {
+        wolfBoot_printf("FDT: Failed to find/create chosen node (%d)\n", off);
+        return off;
+    }
+
+    /* Set bootargs property - overrides PetaLinux default root= with
+     * the wolfBoot partition layout. */
+    wolfBoot_printf("FDT: Setting bootargs: %s\n", LINUX_BOOTARGS);
+    ret = fdt_fixup_str(fdt, off, "chosen", "bootargs", LINUX_BOOTARGS);
+    if (ret < 0) {
+        wolfBoot_printf("FDT: Failed to set bootargs (%d)\n", ret);
+        return ret;
+    }
+
     return 0;
 }
 #endif

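The overflow note in `hal_get_timer_us()` above can be checked on a host machine. The sketch below is a hedged, host-side model of the same arithmetic: `ticks_to_us()` and `ticks_to_us_naive()` are hypothetical helper names, the counter and frequency are passed in as plain arguments instead of being read from `CNTPCT_EL0` / `CNTFRQ_EL0`, and `__uint128_t` is assumed to be available (a GCC/Clang extension on 64-bit targets).

```c
#include <stdint.h>

/* Convert a free-running counter value to microseconds using a 128-bit
 * intermediate, so count * 1000000 cannot wrap. Returns 0 when the
 * frequency is reported as 0, mirroring the guard in the diff. */
uint64_t ticks_to_us(uint64_t count, uint64_t freq_hz)
{
    if (freq_hz == 0)
        return 0;
    return (uint64_t)(((__uint128_t)count * 1000000ULL) / freq_hz);
}

/* Naive variant, kept only for comparison: the 64-bit multiply wraps
 * modulo 2^64 once count exceeds roughly 1.8e13 ticks. */
uint64_t ticks_to_us_naive(uint64_t count, uint64_t freq_hz)
{
    if (freq_hz == 0)
        return 0;
    return (count * 1000000ULL) / freq_hz;
}
```

At 100 MHz, 52 hours of uptime is 18,720,000,000,000 ticks. The 128-bit version returns 187,200,000,000 us for that input, while the naive version returns a much smaller, wrapped value.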
src/boot_aarch64.c

Lines changed: 36 additions & 3 deletions
@@ -26,9 +26,16 @@
 #include "printf.h"
 #include "wolfboot/wolfboot.h"
 
-/* Include platform-specific header for EL configuration defines */
-#ifdef TARGET_versal
+/* Include platform-specific header for EL configuration defines
+ * (EL2_HYPERVISOR, etc.). Must be visible here so the BOOT_EL1 /
+ * EL2_HYPERVISOR guards around the EL2->EL1 ERET transition below
+ * compile in for the active target. */
+#if defined(TARGET_versal)
 #include "hal/versal.h"
+#elif defined(TARGET_zynq)
+#include "hal/zynq.h"
+#elif defined(TARGET_ls1028a)
+#include "hal/nxp_ls1028a.h"
 #endif
 
 /* Linker exported variables */
@@ -43,6 +50,17 @@ extern unsigned int _end_data;
 extern void main(void);
 extern void gicv2_init_secure(void);
 
+/* Asm helper in boot_aarch64_start.S: cleans the entire D-cache to PoC,
+ * invalidates the I-cache to PoU, and disables MMU + I-cache + D-cache
+ * via SCTLR_EL2, then returns. Required before handoff to any payload
+ * that sets up its own translation (Linux kernel, hypervisor, bare-metal
+ * RTOS, later bootloader stage), and mandatory for the ARM64 Linux boot
+ * protocol. Only built when EL2_HYPERVISOR == 1 is visible to
+ * boot_aarch64_start.S (e.g. via hal/zynq.h on ZynqMP). */
+#if defined(EL2_HYPERVISOR) && EL2_HYPERVISOR == 1
+extern void el2_flush_and_disable_mmu(void);
+#endif
+
 /* SKIP_GIC_INIT - Skip GIC initialization before booting app
  * This is needed for:
  * - Versal: Uses GICv3, not GICv2. BL31 handles GIC setup.
@@ -163,7 +181,22 @@ void RAMFUNCTION do_boot(const uint32_t *app_offset)
         el2_to_el1_boot((uintptr_t)app_offset, dts);
     }
 #else
-    /* Stay at current EL (EL2 or EL3) and jump directly to application */
+    /* Stay at current EL (EL2 or EL3) and jump directly to application.
+     *
+     * Before the jump, tear down wolfBoot's EL2 MMU/caches so the next
+     * stage enters with a clean state. Mandatory for the ARM64 Linux
+     * boot protocol (a non-EFI Linux boot panics with
+     * "Non-EFI boot detected with MMU and caches enabled" otherwise),
+     * and correct for any payload that sets up its own translation
+     * (hypervisor, RTOS, later bootloader stage). */
+#if defined(MMU) && defined(EL2_HYPERVISOR) && EL2_HYPERVISOR == 1
+    if (current_el() == 2) {
+        wolfBoot_printf("do_boot: flushing caches, disabling MMU\n");
+        el2_flush_and_disable_mmu();
+    }
+#endif
+
+    /* EL2 and EL3 direct-jump path: br x4 */
 
     /* Set application address via x4 */
     asm volatile("mov x4, %0" : : "r"(app_offset));

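The teardown that `do_boot()` requests ends with clearing three enable bits in `SCTLR_EL2`: M (bit 0, MMU), C (bit 2, D-cache), and I (bit 12, I-cache). As a rough host-side illustration of that bit manipulation only, the sketch below works on a plain integer. `sctlr_mmu_caches_off()` and the `SCTLR_*` macro names are hypothetical, and real code must read and write the system register with `mrs`/`msr` plus the barriers the helper uses.

```c
#include <stdint.h>

/* Enable bits in SCTLR_ELx (hypothetical macro names for this sketch). */
#define SCTLR_M (1ULL << 0)   /* stage 1 MMU enable */
#define SCTLR_C (1ULL << 2)   /* data cache enable */
#define SCTLR_I (1ULL << 12)  /* instruction cache enable */

/* Return the register value with MMU, D-cache and I-cache disabled.
 * All other bits are left untouched. */
uint64_t sctlr_mmu_caches_off(uint64_t sctlr)
{
    return sctlr & ~(SCTLR_M | SCTLR_C | SCTLR_I);
}
```

Applying the function twice gives the same result as applying it once, which is why running the teardown on a core whose MMU is already off is harmless at the register level.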
src/boot_aarch64_start.S

Lines changed: 114 additions & 0 deletions
@@ -1334,4 +1334,118 @@ el2_to_el1_boot:
     b .
 #endif /* BOOT_EL1 && EL2_HYPERVISOR */
 
+
+/*
+ * Clean entire D-cache to the Point of Coherency (PoC), invalidate the
+ * I-cache to the Point of Unification (PoU), and disable MMU + I/D-cache
+ * at EL2. Returns normally to the caller.
+ *
+ * Terminology (ARM ARM B2.8):
+ *   PoC - Point of Coherency: the point at which all observers (CPUs,
+ *         DMA masters, etc.) see the same memory. Cleaning to PoC
+ *         guarantees the image bytes we memcpy'd are visible to the
+ *         next stage's first uncached instruction fetches.
+ *   PoU - Point of Unification: the point at which instruction and data
+ *         caches converge. Invalidating I-cache to PoU ensures stale
+ *         fetches are discarded before we hand off.
+ *
+ * wolfBoot's startup (line ~347 above) enables MMU+I+D cache at EL2 for
+ * its own use. Any payload we hand off to (Linux kernel, hypervisor,
+ * bare-metal RTOS, a later bootloader stage) expects to enter without
+ * inheriting wolfBoot's translation tables, and the ARM64 Linux boot
+ * protocol (Documentation/arch/arm64/booting.rst) explicitly REQUIRES
+ * MMU off, D-cache off, and the loaded image cleaned to PoC. This
+ * helper performs that teardown and returns; the caller then performs
+ * the actual jump with whatever ABI the payload expects.
+ *
+ * Safe to return because wolfBoot's .text is identity-mapped (VA=PA)
+ * at EL2, so instruction fetch keeps working after SCTLR_EL2.M is
+ * cleared.
+ *
+ * AAPCS64: clobbers x0-x11; x30 (LR) is preserved because the
+ * set/way loop body does not touch it.
+ */
+#if defined(EL2_HYPERVISOR) && EL2_HYPERVISOR == 1
+.global el2_flush_and_disable_mmu
+el2_flush_and_disable_mmu:
+    /* ---- 1. Clean & invalidate entire data cache to PoC by set/way ----
+     * Standard ARMv8 routine, adapted from arm-trusted-firmware /
+     * U-Boot / Linux. Iterates every (level, set, way) triple and
+     * issues `dc cisw` on it. Terminates at the Level of Coherency
+     * (LoC) read from CLIDR_EL1. */
+    mrs     x0, clidr_el1
+    and     x3, x0, #0x07000000     /* x3 = LoC (level of coherency) */
+    lsr     x3, x3, #23             /* x3 = LoC * 2 */
+    cbz     x3, .Ldcache_done
+    mov     x10, #0                 /* x10 = current cache level << 1 */
+
+.Ldcache_level_loop:
+    add     x2, x10, x10, lsr #1    /* x2 = level * 3 */
+    lsr     x1, x0, x2              /* x1 = ctype field for this level */
+    and     x1, x1, #7
+    cmp     x1, #2
+    b.lt    .Ldcache_skip_level     /* No data cache at this level */
+    msr     csselr_el1, x10         /* Select cache level (instruction = 0) */
+    isb
+    mrs     x1, ccsidr_el1
+    and     x2, x1, #7              /* x2 = log2(line length) - 4 */
+    add     x2, x2, #4              /* x2 = log2(line length) */
+    mov     x4, #0x3ff
+    and     x4, x4, x1, lsr #3      /* x4 = max way number */
+    clz     w5, w4                  /* x5 = bit position of way size */
+    mov     x7, #0x7fff
+    and     x7, x7, x1, lsr #13     /* x7 = max set number */
+
+.Ldcache_set_loop:
+    mov     x9, x4                  /* x9 = current way */
+.Ldcache_way_loop:
+    lsl     x6, x9, x5
+    orr     x11, x10, x6            /* level | way */
+    lsl     x6, x7, x2
+    orr     x11, x11, x6            /* level | way | set */
+    dc      cisw, x11               /* clean & invalidate by set/way */
+    subs    x9, x9, #1
+    b.ge    .Ldcache_way_loop
+    subs    x7, x7, #1
+    b.ge    .Ldcache_set_loop
+
+.Ldcache_skip_level:
+    add     x10, x10, #2
+    cmp     x3, x10
+    b.gt    .Ldcache_level_loop
+
+.Ldcache_done:
+    mov     x10, #0
+    msr     csselr_el1, x10
+    dsb     sy
+    isb
+
+    /* ---- 2. Invalidate entire I-cache to PoU ----
+     * `ic iallu` invalidates all instruction cache to the Point of
+     * Unification for the local PE. */
+    ic      iallu
+    dsb     ish
+    isb
+
+    /* ---- 3. Disable MMU + I-cache + D-cache at EL2 ----
+     * SCTLR_EL2.M (bit 0)  = MMU enable
+     * SCTLR_EL2.C (bit 2)  = D-cache enable
+     * SCTLR_EL2.I (bit 12) = I-cache enable
+     *
+     * ARM ARM (B2.7.2) requires `dsb sy` before `isb` when modifying
+     * SCTLR_ELx.M so the system register write is observable before the
+     * pipeline is re-synchronized. Matches the MMU-enable sequence used
+     * earlier in this file.
+     */
+    mrs     x0, SCTLR_EL2
+    bic     x0, x0, #(1 << 0)       /* M */
+    bic     x0, x0, #(1 << 2)       /* C */
+    bic     x0, x0, #(1 << 12)      /* I */
+    msr     SCTLR_EL2, x0
+    dsb     sy
+    isb
+
+    ret
+#endif /* EL2_HYPERVISOR */
+
 .end

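The set/way loop above packs three fields into the operand of `dc cisw`: the cache level in bits [3:1], the set number shifted left by log2(line length), and the way number left-aligned in the top bits of a 32-bit word. The C model below is a sketch of that encoding for host-side experimentation, not a replacement for the assembly. `struct cache_geom`, `ccsidr_decode()`, `clidr_loc()` and `setway_operand()` are hypothetical names, the 32-bit `CCSIDR_EL1` layout (no FEAT_CCIDX) is assumed, and the register values used with it are illustrative rather than read from hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Cache geometry decoded from a CCSIDR_EL1 value (32-bit layout assumed). */
struct cache_geom {
    unsigned line_shift;  /* log2(line length in bytes) */
    unsigned max_way;     /* associativity - 1 */
    unsigned max_set;     /* number of sets - 1 */
};

struct cache_geom ccsidr_decode(uint32_t ccsidr)
{
    struct cache_geom g;
    g.line_shift = (ccsidr & 0x7u) + 4;        /* LineSize field + 4 */
    g.max_way = (ccsidr >> 3) & 0x3ffu;        /* Associativity field */
    g.max_set = (ccsidr >> 13) & 0x7fffu;      /* NumSets field */
    return g;
}

/* Level of Coherency: CLIDR_EL1 bits [26:24]. */
unsigned clidr_loc(uint64_t clidr)
{
    return (unsigned)((clidr >> 24) & 0x7u);
}

/* Count leading zeros of a 32-bit value; returns 32 for 0, like CLZ Wd. */
static unsigned clz32(uint32_t v)
{
    unsigned n = 0;
    if (v == 0)
        return 32;
    while ((v & 0x80000000u) == 0) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Build the set/way operand for a zero-based cache level index. */
uint64_t setway_operand(struct cache_geom g, unsigned level,
                        unsigned way, unsigned set)
{
    unsigned way_shift = clz32(g.max_way);     /* same role as x5 above */
    uint64_t op;

    assert(way <= g.max_way && set <= g.max_set);
    op = (uint64_t)level << 1;                 /* level in bits [3:1] */
    op |= (uint64_t)way << way_shift;          /* way, left-aligned */
    op |= (uint64_t)set << g.line_shift;       /* set above line offset */
    return op;
}
```

For an illustrative 32 KiB, 4-way cache with 64-byte lines (128 sets), the CCSIDR geometry fields encode as `0xFE01A`, and the operand for level 0, way 3, set 127 is `0xC0001FC0`.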
src/sdhci.c

Lines changed: 26 additions & 2 deletions
@@ -581,6 +581,7 @@ static uint32_t sdhci_get_response_bits(int from, int count)
 /* voltage: 0=off or SDHCI_SRS10_BVS_[X_X]V */
 static int sdcard_power_init_seq(uint32_t voltage)
 {
+    int retries;
     /* Set power to specified voltage */
     int status = sdhci_set_power(voltage);
 #ifdef DEBUG_SDHCI
@@ -590,9 +591,24 @@ static int sdcard_power_init_seq(uint32_t voltage)
         SDHCI_REG(SDHCI_SRS09), SDHCI_REG(SDHCI_SRS10),
         SDHCI_REG(SDHCI_SRS11), SDHCI_REG(SDHCI_SRS12));
 #endif
-    if (status == 0) {
-        /* send CMD0 (go idle) to reset card */
+    if (status != 0)
+        return status;
+    /* SD spec requires >= 1ms after power stabilizes before CMD0. */
+    udelay(1000);
+    /* Some cards and the ZynqMP Arasan controller need more settling
+     * time after the slot-type change + soft reset in sdhci_platform_init().
+     * Use a retry loop: if CMD0 fails, wait and retry (self-calibrating). */
+    for (retries = 0; retries < 10; retries++) {
         status = sdhci_cmd(MMC_CMD0_GO_IDLE, 0, SDHCI_RESP_NONE);
+        if (status == 0)
+            break;
+        udelay(10000); /* 10ms between retries */
+    }
+    if (status != 0) {
+        wolfBoot_printf("SD: CMD0 failed after %d retries\n", retries);
+    }
+    else if (retries > 0) {
+        wolfBoot_printf("SD: CMD0 succeeded after %d retries\n", retries);
     }
     if (status == 0) {
         /* send the operating conditions command */
@@ -1387,6 +1403,11 @@ int sdhci_init(void)
     /* Call platform-specific initialization (clocks, resets, pin mux) */
     sdhci_platform_init();
 
+    /* Allow controller to settle after platform init (slot type change,
+     * soft reset, clock configuration). Without this, the controller may
+     * not be ready to accept register writes on some platforms. */
+    udelay(1000); /* 1ms */
+
     /* Reset the host controller */
     sdhci_reg_or(SDHCI_HRS00, SDHCI_HRS00_SWR);
     /* Bit will clear when reset is done */
@@ -1482,6 +1503,9 @@ int sdhci_init(void)
     /* Setup 400khz starting clock */
     sdhci_set_clock(SDHCI_CLK_400KHZ);
 
+    /* Allow clock to stabilize before issuing first command */
+    udelay(1000); /* 1ms */
+
 #ifdef DISK_EMMC
     /* Run full eMMC card initialization */
     status = emmc_card_full_init();

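The CMD0 change in `sdcard_power_init_seq()` is a bounded retry with a fixed delay after each failure. The sketch below models that control flow on a host so the retry count and total delay can be checked without hardware. Everything in it is hypothetical: `retry_cmd()`, the `cmd_fn` / `delay_fn` callback types, and the `flaky_card` test double stand in for `sdhci_cmd()` and `udelay()` and are not part of wolfBoot.

```c
#include <stddef.h>

typedef int (*cmd_fn)(void *ctx);       /* returns 0 on success */
typedef void (*delay_fn)(unsigned us);  /* busy-wait stand-in */

/* Issue cmd up to max_tries times, delaying after every failed attempt.
 * Returns the last status; *retries_out receives the number of failed
 * attempts (equal to max_tries when every attempt failed). */
int retry_cmd(cmd_fn cmd, void *ctx, int max_tries,
              delay_fn delay, unsigned delay_us, int *retries_out)
{
    int status = -1;
    int retries;

    for (retries = 0; retries < max_tries; retries++) {
        status = cmd(ctx);
        if (status == 0)
            break;
        delay(delay_us);
    }
    if (retries_out != NULL)
        *retries_out = retries;
    return status;
}

/* Test double: a card that fails a fixed number of times, then responds. */
struct flaky_card {
    int failures_left;
    int calls;
};

int flaky_cmd0(void *ctx)
{
    struct flaky_card *card = (struct flaky_card *)ctx;
    card->calls++;
    if (card->failures_left > 0) {
        card->failures_left--;
        return -1;
    }
    return 0;
}

/* Test double: records requested delay instead of sleeping. */
unsigned total_delay_us;

void fake_udelay(unsigned us)
{
    total_delay_us += us;
}
```

With 10 tries and a 10 ms delay, the worst case adds about 100 ms to a boot where no card responds, while a card that answers on the first attempt pays no extra delay from the loop itself.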