Commit 2828406

psiddh and claude committed
Add Zephyr + Alif E8 tutorial and Alif board conf
Add a step-by-step tutorial for running MobileNetV2 on the Alif Ensemble E8 DevKit with ExecuTorch, Zephyr RTOS, and Ethos-U55 NPU delegation. Covers workspace setup, model export, FVP validation on Corstone-320, and Alif flashing with SE Tools (including the mramAddress gotcha).

Also adds the missing Alif E8 board conf that enables CONFIG_ETHOS_U and CONFIG_ET_ARM_MODEL_PTE_DMA_ACCESSIBLE (MRAM is DMA-accessible).

Co-authored-by: Claude <noreply@anthropic.com>
1 parent a7e44bf commit 2828406

3 files changed

Lines changed: 318 additions & 0 deletions

docs/source/embedded-section.md

Lines changed: 2 additions & 0 deletions
@@ -27,6 +27,7 @@ Start here for C++ development with ExecuTorch runtime APIs and essential tutori
 - {doc}`backends/arm-ethos-u/tutorials/ethos-u-getting-started` — Export a simple PyTorch model for the ExecuTorch Ethos-U backend
 - {doc}`raspberry_pi_llama_tutorial` — Deploy a LLaMA model on a Raspberry Pi
 - {doc}`pico2_tutorial` — Deploy a demo MNIST model on the Raspberry Pi Pico 2
+- {doc}`zephyr_alif_tutorial` — Deploy MobileNetV2 on Alif Ensemble E8 with Zephyr and Ethos-U NPU


 ```{toctree}
@@ -41,3 +42,4 @@ embedded-backends
 backends/arm-ethos-u/tutorials/ethos-u-getting-started
 raspberry_pi_llama_tutorial
 pico2_tutorial
+zephyr_alif_tutorial
docs/source/zephyr_alif_tutorial.md (new file)

Lines changed: 306 additions & 0 deletions
# Zephyr: MobileNetV2 on Alif Ensemble E8 with Ethos-U NPU

Run a quantized MobileNetV2 image classifier on the Alif Ensemble E8 DevKit
using ExecuTorch, Zephyr RTOS, and the Arm Ethos-U55 NPU. The same build flow
also works on the Arm Corstone-320 FVP for development without hardware.

## What You'll Build

- A quantized INT8 MobileNetV2 model fully delegated to the Ethos-U55 NPU
  (110 ops, ~19 ms inference on Alif E8)
- A Zephyr RTOS application that loads the `.pte` model, runs inference on a
  static test image, and prints the top-5 ImageNet predictions over UART

## Prerequisites

### Hardware (choose one)

| Target | Description |
|--------|-------------|
| **Alif Ensemble E8 DevKit** | Cortex-M55 HP core + Ethos-U55 (256 MACs), 4.5 MB HP SRAM, MRAM |
| **Corstone-320 FVP** | Virtual platform simulating Cortex-M85 + Ethos-U85 (no hardware needed, Linux only) |

### Software

- Linux x86_64 (the FVP and Arm toolchain are Linux-only; macOS can export
  models but cannot run the FVP or flash)
- Python 3.10+
- Alif SE Tools for flashing (Alif hardware only)

## Step 1: Set Up the Zephyr Workspace

Create a workspace, install `west`, and initialize the Zephyr tree:

```bash
mkdir ~/zephyr_workspace && cd ~/zephyr_workspace
python3 -m venv .venv && source .venv/bin/activate
pip install west "cmake<4.0.0" pyelftools ninja jsonschema
west init --manifest-rev v4.3.0
```

Install the Zephyr SDK (compiler toolchain):

```bash
wget https://github.com/zephyrproject-rtos/sdk-ng/releases/download/v0.17.4/zephyr-sdk-0.17.4_linux-x86_64.tar.xz
tar -xf zephyr-sdk-0.17.4_linux-x86_64.tar.xz && rm -f zephyr-sdk-0.17.4_linux-x86_64.tar.xz
./zephyr-sdk-0.17.4/setup.sh -c -t arm-zephyr-eabi
export ZEPHYR_SDK_INSTALL_DIR=$(realpath ./zephyr-sdk-0.17.4)
```

## Step 2: Add ExecuTorch as a Zephyr Module

Create the submanifest, configure `west` to pull only the modules we need, and
update:

```bash
mkdir -p zephyr/submanifests
cat > zephyr/submanifests/executorch.yaml << 'EOF'
manifest:
  projects:
    - name: executorch
      url: https://github.com/pytorch/executorch
      revision: main
      path: modules/lib/executorch
EOF

west config manifest.project-filter -- -.*,+zephyr,+executorch,+cmsis,+cmsis_6,+cmsis-nn,+hal_ethos_u
west update
```

For Alif boards, also add the Alif HAL:

```bash
west config manifest.project-filter -- -.*,+zephyr,+executorch,+cmsis,+cmsis_6,+cmsis-nn,+hal_ethos_u,+hal_alif
west update
```

## Step 3: Install ExecuTorch and Arm Tools

```bash
cd modules/lib/executorch
git submodule sync && git submodule update --init --recursive
./install_executorch.sh
cd ../../..
```

Install the Arm toolchain, Vela compiler, and Corstone FVPs:

```bash
modules/lib/executorch/examples/arm/setup.sh --i-agree-to-the-contained-eula
source modules/lib/executorch/examples/arm/arm-scratch/setup_path.sh
```

## Step 4: Export the MobileNetV2 Model

Export a quantized INT8 MobileNetV2 with Ethos-U delegation. Choose the target
that matches your hardware:

**For Alif E8 (Ethos-U55 with 256 MACs):**

```bash
python -m modules.lib.executorch.backends.arm.scripts.aot_arm_compiler \
    --model_name=mv2_untrained \
    --quantize --delegate \
    --target=ethos-u55-256 \
    --output=mv2_ethosu.pte
```

**For Corstone-320 FVP (Ethos-U85 with 256 MACs):**

```bash
python -m modules.lib.executorch.backends.arm.scripts.aot_arm_compiler \
    --model_name=mv2_untrained \
    --quantize --delegate \
    --target=ethos-u85-256 \
    --output=mv2_u85_256.pte
```

The `--delegate` flag routes all compatible ops through the Ethos-U backend.
The Vela compiler converts the TOSA intermediate representation into an
optimized command stream for the NPU. Use `mv2` instead of `mv2_untrained` for
meaningful predictions (requires torchvision pretrained weights).
123+
124+
## Step 5: Build the Zephyr Application
125+
126+
**For Alif E8:**
127+
128+
```bash
129+
west build -b alif_e8_dk/ae822fa0e5597xx0/rtss_hp \
130+
-S ethos-u55-enable \
131+
modules/lib/executorch/zephyr/samples/mv2-ethosu -- \
132+
-DET_PTE_FILE_PATH=mv2_ethosu.pte
133+
```
134+
135+
**For Corstone-320 FVP:**
136+
137+
```bash
138+
west build -b mps4/corstone320/fvp \
139+
modules/lib/executorch/zephyr/samples/mv2-ethosu -- \
140+
-DET_PTE_FILE_PATH=mv2_u85_256.pte
141+
```
142+
143+
## Step 6a: Run on Corstone-320 FVP
144+
145+
Set up the FVP paths and run:
146+
147+
```bash
148+
export FVP_ROOT=$PWD/modules/lib/executorch/examples/arm/arm-scratch/FVP-corstone320
149+
export ARMFVP_BIN_PATH=${FVP_ROOT}/models/Linux64_GCC-9.3
150+
export LD_LIBRARY_PATH=${FVP_ROOT}/python/lib:${ARMFVP_BIN_PATH}:${LD_LIBRARY_PATH}
151+
export ARMFVP_EXTRA_FLAGS="-C mps4_board.uart0.shutdown_on_eot=1 -C mps4_board.subsystem.ethosu.num_macs=256"
152+
153+
west build -t run
154+
```
155+
156+
MV2 inference is cycle-accurate on the FVP and takes 10-20 minutes of wall
157+
clock. You should see output like:
158+
159+
```
160+
========================================
161+
ExecuTorch MobileNetV2 Classification Demo
162+
========================================
163+
Ethos-U backend registered successfully
164+
Model loaded, has 1 methods
165+
Inference completed in <N> ms
166+
--- Classification Results ---
167+
Top-5 predictions:
168+
[1] class <id>: <score>
169+
...
170+
MobileNetV2 Demo Complete
171+
========================================
172+
```
173+
174+
## Step 6b: Flash and Run on Alif E8
175+
176+
### Flash with Alif SE Tools
177+
178+
Use the Alif SE Tools to program the binary into the E8's MRAM. Create a
179+
`zephyr.json` in the build output directory:
180+
181+
```bash
182+
cat > build/zephyr/zephyr.json << 'EOF'
183+
{
184+
"HP_img_class": {
185+
"binary": "zephyr.bin",
186+
"version": "1.0.0",
187+
"mramAddress": "0x80008000",
188+
"cpu_id": "M55_HP",
189+
"flags": ["boot"],
190+
"signed": false
191+
},
192+
"DEVICE": {
193+
"disabled": false,
194+
"binary": "app-device-config.json",
195+
"version": "0.5.00",
196+
"signed": true
197+
}
198+
}
199+
EOF
200+
```
201+
202+
> **Important:** Use `mramAddress: "0x80008000"` (FLASH_LOAD_OFFSET=0x8000),
203+
> **not** the default `0x80200000`. The default offset does not leave enough
204+
> MRAM for the ~3.5 MB MV2 model blob.
205+
206+
Generate the table of contents and flash using the SE Tools:
207+
208+
```bash
209+
cd build/zephyr
210+
python <path-to-alif-se-tools>/app-gen-toc.py
211+
python <path-to-alif-se-tools>/app-write-mram.py
212+
cd ../..
213+
```
214+
215+
Refer to the Alif SE Tools documentation for installation and detailed usage.
216+
217+
### Connect Serial Console
218+
219+
Connect to UART4 at 115200 baud. On Linux:
220+
221+
```bash
222+
picocom -b 115200 /dev/ttyUSB0
223+
```
224+
225+
Press the reset button on the E8 DevKit. You should see:
226+
227+
```
228+
Booting Zephyr OS build ff8b8697c0f5 ***
229+
230+
========================================
231+
ExecuTorch MobileNetV2 Classification Demo
232+
========================================
233+
234+
I [executorch:main.cpp] Ethos-U backend registered successfully
235+
I [executorch:main.cpp] Model PTE at 0x8004b290, Size: 3490912 bytes
236+
I [executorch:main.cpp] Model loaded, has 1 methods
237+
I [executorch:main.cpp] Running method: forward
238+
I [executorch:main.cpp] Method allocator pool size: 1572864 bytes.
239+
I [executorch:main.cpp] Setting up planned buffer 0, size 752640.
240+
I [executorch:main.cpp] Loading method...
241+
I [executorch:main.cpp] Method 'forward' loaded successfully
242+
I [executorch:main.cpp] Preparing input: static RGB image (150528 bytes)
243+
I [executorch:main.cpp]
244+
--- Starting inference ---
245+
I [executorch:main.cpp] Inference completed in 19 ms
246+
I [executorch:main.cpp]
247+
--- Classification Results ---
248+
I [executorch:main.cpp] Top-5 predictions:
249+
I [executorch:main.cpp] [1] class 0: 0.0000
250+
I [executorch:main.cpp] [2] class 1: 0.0000
251+
I [executorch:main.cpp] [3] class 2: 0.0000
252+
I [executorch:main.cpp] [4] class 3: 0.0000
253+
I [executorch:main.cpp] [5] class 4: 0.0000
254+
I [executorch:main.cpp]
255+
========================================
256+
I [executorch:main.cpp] MobileNetV2 Demo Complete
257+
I [executorch:main.cpp] Model size: 3490912 bytes
258+
I [executorch:main.cpp] Input: 224x224x3 RGB image (150528 bytes)
259+
I [executorch:main.cpp] Output: 1000 ImageNet classes (top-5 shown)
260+
I [executorch:main.cpp] Inference time: 19 ms
261+
I [executorch:main.cpp] ========================================
262+
```
263+
264+
All predictions show `0.0000` because `mv2_untrained` has random weights.
265+
Use `mv2` (with torchvision pretrained weights) for meaningful class scores.
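The top-5 readout in the log can be reproduced on the host. A minimal plain-Python sketch of the post-processing (an illustration, not the sample's actual `main.cpp` code):

```python
import math

def top5(logits):
    """Softmax over class logits, then the five highest-probability classes."""
    m = max(logits)                              # subtract max for stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    order = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(i, probs[i]) for i in order[:5]]

# With identical logits -- roughly what untrained weights tend to produce --
# the softmax is uniform: each of the 1000 classes scores 1/1000.
for rank, (cls, score) in enumerate(top5([0.0] * 1000), start=1):
    print(f"[{rank}] class {cls}: {score:.4f}")
```

A trained model produces a sharply peaked distribution instead, so the top entry carries most of the probability mass.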
## Troubleshooting

| Symptom | Cause | Fix |
|---------|-------|-----|
| Linker: `region 'FLASH' overflowed` | Model PTE too large for ITCM | Use the DDR overlay (FVP) or verify `mramAddress` (Alif) |
| Linker: `region 'RAM' overflowed` | Pools + model copy exceed SRAM | Set `CONFIG_ET_ARM_MODEL_PTE_DMA_ACCESSIBLE=y` to skip the SRAM copy |
| FVP hangs after "Ethos-U backend registered" | Cycle-accurate MV2 simulation is slow | Wait 10-20 min, or use Corstone-320 (faster than Corstone-300) |
| No serial output on Alif | Wrong UART or baud rate | Use UART4 at 115200 baud |
| `app-write-mram.py` fails | Wrong `mramAddress` | Use `0x80008000`, not `0x80200000` |
| Runtime: method allocator OOM | Pool size too small | Increase `CONFIG_EXECUTORCH_METHOD_ALLOCATOR_POOL_SIZE` in the board conf |

## Memory Layout

| Region | Corstone-320 FVP | Alif E8 |
|--------|------------------|---------|
| Code + .rodata | ITCM (512 KB) | MRAM |
| .data + .bss + pools | ISRAM (4 MB) | HP SRAM (4.5 MB) |
| Model PTE (~3.5 MB) | DDR (16 MB, via overlay) | MRAM (DMA-accessible) |
| NPU delegation | Ethos-U85 (256 MACs) | Ethos-U55 (256 MACs) |
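To see why `CONFIG_ET_ARM_MODEL_PTE_DMA_ACCESSIBLE` matters on the E8, tally the SRAM consumers against the 4.5 MB of HP SRAM. A rough sketch using figures from the sample's boot log (the overhead figure for stacks, `.data`/`.bss`, and heap is a placeholder, and the planned buffer may be carved from the method pool depending on the sample, so treat this as an upper bound; read your map file for real numbers):

```python
# Rough HP-SRAM budget for the Alif E8, figures from the sample's boot log.
HP_SRAM       = 4_718_592  # 4.5 MB
method_pool   = 1_572_864  # CONFIG_EXECUTORCH_METHOD_ALLOCATOR_POOL_SIZE
planned_buf   =   752_640  # "Setting up planned buffer 0, size 752640"
misc_overhead =   262_144  # PLACEHOLDER: stacks, .data/.bss, heap

used = method_pool + planned_buf + misc_overhead
print(f"SRAM used: {used} / {HP_SRAM} bytes")

# Without the DMA-accessible option, the runtime would also copy the ~3.5 MB
# model PTE into SRAM -- which is when "region 'RAM' overflowed" appears.
MODEL_SIZE = 3_490_912
print("overflows with SRAM model copy:", used + MODEL_SIZE > HP_SRAM)
```

Running the PTE directly from DMA-accessible MRAM removes the single largest SRAM consumer, which is what the board conf below enables.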
286+
287+
## Using Claude Code with Zephyr
288+
289+
If you use [Claude Code](https://docs.anthropic.com/en/docs/claude-code), the
290+
ExecuTorch repo ships a `/zephyr` skill that can help with:
291+
292+
- **Workspace setup** — scaffolds the Zephyr workspace, west manifests, and SDK install
293+
- **Board bringup** — generates DTS overlays, board confs, and linker snippets for new boards
294+
- **Memory debugging** — diagnoses linker overflow errors and runtime allocation failures,
295+
with the exact pool sizes your model needs
296+
297+
Type `/zephyr` in Claude Code while working in the ExecuTorch repo to activate
298+
it. Related skills: `/export` for model conversion, `/cortex-m` for baremetal
299+
Cortex-M builds, `/executorch-kb` for backend-specific debugging.
300+
301+
## Next Steps
302+
303+
- Swap `mv2_untrained` for `mv2` (with torchvision) to get real ImageNet predictions
304+
- Try other models: `resnet18`, or bring your own `.py` model file
305+
- Explore the [hello-executorch sample](https://github.com/pytorch/executorch/tree/main/zephyr/samples/hello-executorch) for a minimal starting point
306+
- See the [Ethos-U Getting Started tutorial](backends/arm-ethos-u/tutorials/ethos-u-getting-started.md) for the baremetal (non-Zephyr) flow
Lines changed: 10 additions & 0 deletions

# Copyright (c) Meta Platforms, Inc. and affiliates.
#
# Copyright 2026 Arm Limited and/or its affiliates.
#
# SPDX-License-Identifier: Apache-2.0
#
# Alif Ensemble E8 DevKit (HP core): Ethos-U55 with 256 MACs.
# MRAM is DMA-accessible by the NPU so no SRAM copy is needed.
CONFIG_ETHOS_U=y
CONFIG_ET_ARM_MODEL_PTE_DMA_ACCESSIBLE=y
