Commit 08b7e80, committed by psiddh and claude

Add Zephyr + Alif E8 tutorial and Alif board conf

Add a step-by-step tutorial for running MobileNetV2 on the Alif Ensemble E8 DevKit with ExecuTorch, Zephyr RTOS, and Ethos-U55 NPU delegation. Covers workspace setup, model export, FVP validation on Corstone-320, and Alif flashing with SE Tools (including the mramAddress gotcha). Also adds the missing Alif E8 board conf that enables CONFIG_ETHOS_U and CONFIG_ET_ARM_MODEL_PTE_DMA_ACCESSIBLE (MRAM is DMA-accessible).

Co-authored-by: Claude <noreply@anthropic.com>

1 parent a7e44bf commit 08b7e80

3 files changed: 268 additions & 0 deletions

File tree

docs/source/embedded-section.md (2 additions & 0 deletions)
@@ -27,6 +27,7 @@ Start here for C++ development with ExecuTorch runtime APIs and essential tutori
 - {doc}`backends/arm-ethos-u/tutorials/ethos-u-getting-started` — Export a simple PyTorch model for the ExecuTorch Ethos-U backend
 - {doc}`raspberry_pi_llama_tutorial` — Deploy a LLaMA model on a Raspberry Pi
 - {doc}`pico2_tutorial` — Deploy a demo MNIST model on the Raspberry Pi Pico 2
+- {doc}`zephyr_alif_tutorial` — Deploy MobileNetV2 on Alif Ensemble E8 with Zephyr and Ethos-U NPU

 ```{toctree}
@@ -41,3 +42,4 @@ embedded-backends
 backends/arm-ethos-u/tutorials/ethos-u-getting-started
 raspberry_pi_llama_tutorial
 pico2_tutorial
+zephyr_alif_tutorial
New file: 256 additions & 0 deletions
# Zephyr: MobileNetV2 on Alif Ensemble E8 with Ethos-U NPU

Run a quantized MobileNetV2 image classifier on the Alif Ensemble E8 DevKit using ExecuTorch, Zephyr RTOS, and the Arm Ethos-U55 NPU. The same build flow also works on the Arm Corstone-320 FVP for development without hardware.

## What You'll Build

- A quantized INT8 MobileNetV2 model fully delegated to the Ethos-U55 NPU (110 ops, ~19 ms inference on the Alif E8)
- A Zephyr RTOS application that loads the `.pte` model, runs inference on a static test image, and prints the top-5 ImageNet predictions over UART

## Prerequisites

### Hardware (choose one)

| Target | Description |
|--------|-------------|
| **Alif Ensemble E8 DevKit** | Cortex-M55 HP core + Ethos-U55 (256 MACs), 4.5 MB HP SRAM, MRAM |
| **Corstone-320 FVP** | Virtual platform simulating Cortex-M85 + Ethos-U85 (no hardware needed, Linux only) |

### Software

- Linux x86_64 (the FVP and Arm toolchain are Linux-only; macOS can export models but cannot run the FVP or flash)
- Python 3.10+
- Alif SE Tools for flashing (Alif hardware only)

## Step 1: Set Up the Zephyr Workspace

Create a workspace, install `west`, and initialize the Zephyr tree:

```bash
mkdir ~/zephyr_workspace && cd ~/zephyr_workspace
python3 -m venv .venv && source .venv/bin/activate
pip install west "cmake<4.0.0" pyelftools ninja jsonschema
west init --manifest-rev v4.3.0
```

Install the Zephyr SDK (compiler toolchain):

```bash
wget https://github.com/zephyrproject-rtos/sdk-ng/releases/download/v0.17.4/zephyr-sdk-0.17.4_linux-x86_64.tar.xz
tar -xf zephyr-sdk-0.17.4_linux-x86_64.tar.xz && rm -f zephyr-sdk-0.17.4_linux-x86_64.tar.xz
./zephyr-sdk-0.17.4/setup.sh -c -t arm-zephyr-eabi
export ZEPHYR_SDK_INSTALL_DIR=$(realpath ./zephyr-sdk-0.17.4)
```

## Step 2: Add ExecuTorch as a Zephyr Module

Create the submanifest, configure `west` to pull only the modules we need, and update:

```bash
mkdir -p zephyr/submanifests
cat > zephyr/submanifests/executorch.yaml << 'EOF'
manifest:
  projects:
    - name: executorch
      url: https://github.com/pytorch/executorch
      revision: main
      path: modules/lib/executorch
EOF

west config manifest.project-filter -- -.*,+zephyr,+executorch,+cmsis,+cmsis_6,+cmsis-nn,+hal_ethos_u
west update
```

For Alif boards, also add the Alif HAL:

```bash
west config manifest.project-filter -- -.*,+zephyr,+executorch,+cmsis,+cmsis_6,+cmsis-nn,+hal_ethos_u,+hal_alif
west update
```

## Step 3: Install ExecuTorch and Arm Tools

```bash
cd modules/lib/executorch
git submodule sync && git submodule update --init --recursive
./install_executorch.sh
cd ../../..
```

Install the Arm toolchain, Vela compiler, and Corstone FVPs:

```bash
modules/lib/executorch/examples/arm/setup.sh --i-agree-to-the-contained-eula
source modules/lib/executorch/examples/arm/arm-scratch/setup_path.sh
```

## Step 4: Export the MobileNetV2 Model

Export a quantized INT8 MobileNetV2 with Ethos-U delegation. Choose the target that matches your hardware:

**For Alif E8 (Ethos-U55 with 256 MACs):**

```bash
python -m modules.lib.executorch.backends.arm.scripts.aot_arm_compiler \
    --model_name=mv2_untrained \
    --quantize --delegate \
    --target=ethos-u55-256 \
    --output=mv2_ethosu.pte
```

**For Corstone-320 FVP (Ethos-U85 with 256 MACs):**

```bash
python -m modules.lib.executorch.backends.arm.scripts.aot_arm_compiler \
    --model_name=mv2_untrained \
    --quantize --delegate \
    --target=ethos-u85-256 \
    --output=mv2_u85_256.pte
```

The `--delegate` flag routes all compatible ops through the Ethos-U backend. The Vela compiler converts the TOSA intermediate representation into an optimized command stream for the NPU. Use `mv2` instead of `mv2_untrained` for meaningful predictions (requires torchvision pretrained weights).
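
As a rough sanity check on the export, the ~3.5 MB blob size follows directly from INT8 quantization: one byte per weight for MobileNetV2's roughly 3.5 million parameters. A back-of-envelope sketch (the parameter count is approximate, not read from the tooling):

```python
# Back-of-envelope: why a quantized INT8 MobileNetV2 .pte lands near 3.5 MB.
params = 3_500_000               # approximate MobileNetV2 parameter count
fp32_mb = params * 4 / 2**20     # four bytes per float32 weight
int8_mb = params * 1 / 2**20     # one byte per weight after INT8 quantization
print(f"fp32: {fp32_mb:.1f} MB, int8: {int8_mb:.1f} MB")
```

The real file is somewhat larger than the weight total alone, since it also carries program metadata and the compiled NPU command stream.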

## Step 5: Build the Zephyr Application

**For Alif E8:**

```bash
west build -b alif_e8_dk/ae822fa0e5597xx0/rtss_hp \
    -S ethos-u55-enable \
    modules/lib/executorch/zephyr/samples/mv2-ethosu -- \
    -DET_PTE_FILE_PATH=mv2_ethosu.pte
```

**For Corstone-320 FVP:**

```bash
west build -b mps4/corstone320/fvp \
    modules/lib/executorch/zephyr/samples/mv2-ethosu -- \
    -DET_PTE_FILE_PATH=mv2_u85_256.pte
```

## Step 6a: Run on Corstone-320 FVP

Set up the FVP paths and run:

```bash
export FVP_ROOT=$PWD/modules/lib/executorch/examples/arm/arm-scratch/FVP-corstone320
export ARMFVP_BIN_PATH=${FVP_ROOT}/models/Linux64_GCC-9.3
export LD_LIBRARY_PATH=${FVP_ROOT}/python/lib:${ARMFVP_BIN_PATH}:${LD_LIBRARY_PATH}
export ARMFVP_EXTRA_FLAGS="-C mps4_board.uart0.shutdown_on_eot=1 -C mps4_board.subsystem.ethosu.num_macs=256"

west build -t run
```

The FVP simulates the NPU in detail, so MV2 inference takes 10-20 minutes of wall-clock time. You should see output like:

```
========================================
ExecuTorch MobileNetV2 Classification Demo
========================================
Ethos-U backend registered successfully
Model loaded, has 1 methods
Inference completed in <N> ms
--- Classification Results ---
Top-5 predictions:
[1] class <id>: <score>
...
MobileNetV2 Demo Complete
========================================
```
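
The top-5 listing in this log is an ordinary descending sort over the class scores. A minimal sketch of the same selection in plain Python (the sample app itself does this in C++; the names and toy scores here are illustrative):

```python
def top5(logits):
    """Indices of the five highest-scoring classes, best first."""
    return sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:5]

# Toy scores standing in for the model's 1000 ImageNet logits
scores = [0.1, 2.3, 0.7, 5.0, 1.1, 4.2, 0.0]
for rank, cls in enumerate(top5(scores), start=1):
    print(f"[{rank}] class {cls}: {scores[cls]}")
```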

## Step 6b: Flash and Run on Alif E8

### Flash with Alif SE Tools

The Alif SE Tools (`app-gen-toc.py` and `app-write-mram.py`) program the binary into the E8's MRAM. You need a `zephyr.json` configuration file.

Create `zephyr.json` in the build output directory:

```bash
cat > build/zephyr.json << 'EOF'
{
    "binary": "zephyr.bin",
    "mramAddress": "0x80008000",
    "cpu": "M55_HP"
}
EOF
```

> **Important:** Use `mramAddress: "0x80008000"` (FLASH_LOAD_OFFSET=0x8000),
> **not** the default `0x80200000`. The default offset does not leave enough
> MRAM for the ~3.5 MB MV2 model blob.
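
The arithmetic behind this gotcha can be sketched in a few lines. The MRAM capacity below is an assumption for illustration (check the Ensemble E8 datasheet for your part); the base address and the two offsets are the ones discussed above:

```python
# Hypothetical MRAM budget check; the capacity is assumed, not read from the part.
MRAM_BASE = 0x8000_0000
MRAM_SIZE = 5_767_168            # assumed 5.5 MiB of MRAM

def free_after(load_addr):
    """Bytes of MRAM available at and above the image load address."""
    return MRAM_BASE + MRAM_SIZE - load_addr

for addr in (0x8000_8000, 0x8020_0000):
    print(f"0x{addr:08X}: {free_after(addr) / 2**20:.2f} MiB for code + model")
```

Under this assumed capacity, the default offset leaves about 3.5 MiB, which the ~3.5 MB model blob alone consumes before any code is placed; the 0x8000 offset leaves roughly 5.5 MiB.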

Generate the table of contents and flash:

```bash
cd build
python /path/to/alif-se-tools/app-gen-toc.py
python /path/to/alif-se-tools/app-write-mram.py
cd ..
```

### Connect Serial Console

Connect to UART4 at 115200 baud. On Linux:

```bash
picocom -b 115200 /dev/ttyUSB0
```

Press the reset button on the E8 DevKit. You should see the classification output within a few seconds (inference itself takes ~19 ms).

## Troubleshooting

| Symptom | Cause | Fix |
|---------|-------|-----|
| Linker: `region 'FLASH' overflowed` | Model PTE too large for ITCM | Use the DDR overlay (FVP) or verify `mramAddress` (Alif) |
| Linker: `region 'RAM' overflowed` | Pools + model copy exceed SRAM | Set `CONFIG_ET_ARM_MODEL_PTE_DMA_ACCESSIBLE=y` to skip the SRAM copy |
| FVP hangs after "Ethos-U backend registered" | MV2 simulation is slow | Wait 10-20 min, or use Corstone-320 (faster than Corstone-300) |
| No serial output on Alif | Wrong UART or baud rate | Use UART4 at 115200 baud |
| `app-write-mram.py` fails | Wrong `mramAddress` | Use `0x80008000`, not `0x80200000` |
| Runtime: method allocator OOM | Pool size too small | Increase `CONFIG_EXECUTORCH_METHOD_ALLOCATOR_POOL_SIZE` in the board conf |

## Memory Layout

| Region | Corstone-320 FVP | Alif E8 |
|--------|------------------|---------|
| Code + `.rodata` | ITCM (512 KB) | MRAM |
| `.data` + `.bss` + pools | ISRAM (4 MB) | HP SRAM (4.5 MB) |
| Model PTE (~3.5 MB) | DDR (16 MB, via overlay) | MRAM (DMA-accessible) |
| NPU delegation | Ethos-U85 (256 MACs) | Ethos-U55 (256 MACs) |

## Using Claude Code with Zephyr

If you use [Claude Code](https://docs.anthropic.com/en/docs/claude-code), the ExecuTorch repo ships a `/zephyr` skill that can help with:

- **Workspace setup** — scaffolds the Zephyr workspace, west manifests, and SDK install
- **Board bringup** — generates DTS overlays, board confs, and linker snippets for new boards
- **Memory debugging** — diagnoses linker overflow errors and runtime allocation failures, with the exact pool sizes your model needs

Type `/zephyr` in Claude Code while working in the ExecuTorch repo to activate it. Related skills: `/export` for model conversion, `/cortex-m` for baremetal Cortex-M builds, `/executorch-kb` for backend-specific debugging.

## Next Steps

- Swap `mv2_untrained` for `mv2` (with torchvision) to get real ImageNet predictions
- Try other models: `resnet18`, or bring your own `.py` model file
- Explore the [hello-executorch sample](https://github.com/pytorch/executorch/tree/main/zephyr/samples/hello-executorch) for a minimal starting point
- See the [Ethos-U Getting Started tutorial](backends/arm-ethos-u/tutorials/ethos-u-getting-started.md) for the baremetal (non-Zephyr) flow
New file: 10 additions & 0 deletions

# Copyright (c) Meta Platforms, Inc. and affiliates.
#
# Copyright 2026 Arm Limited and/or its affiliates.
#
# SPDX-License-Identifier: Apache-2.0
#
# Alif Ensemble E8 DevKit (HP core): Ethos-U55 with 256 MACs.
# MRAM is DMA-accessible by the NPU so no SRAM copy is needed.
CONFIG_ETHOS_U=y
CONFIG_ET_ARM_MODEL_PTE_DMA_ACCESSIBLE=y
