Commit 1b33653

Author: ssjia
Update on "[ET-VK][conv1d] Implement height-packed depthwise conv1d operator"
Implement a depthwise conv1d operator using height-packed layout where channels are the packed dimension (WHCN dim 1). Depthwise conv applies a separate filter to each channel independently (groups=C), so 4 channels can be processed in parallel using element-wise vec4 FMA over kernel positions.

Thread mapping: X=C/4, Y=L_out, Z=N. Each thread computes one output texel (4 channels at one spatial position). The inner loop iterates over kernel positions K with bounds-checked input access for padding. Weight [C,1,K] is prepacked as channels-packed so each vec4 load gives 4 channels' weights at one kernel position.

Supports both buffer and texture3d storage, fp32/fp16, optional bias, and arbitrary stride/padding/dilation. Registered as et_vk.conv1d_dw.default (standalone custom op).

Performance on Adreno 750 (S24):
- [1,128,4096] K=31 buffer f16: 231 GFLOP/s
- [1,128,4096] K=31 buffer f32: 155 GFLOP/s
- [1,512,2048] K=5 buffer f32: 66 GFLOP/s

Differential Revision: [D97344091](https://our.internmc.facebook.com/intern/diff/D97344091/)

[ghstack-poisoned]
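The thread mapping described in the commit message can be sketched on the CPU. The following is a minimal pure-Python reference, not the Vulkan shader from this commit: the function name `conv1d_dw_reference` and the nested-list tensor layout are assumptions made for illustration, and the output-length formula is the standard convolution one, which is assumed to match the operator's L_out. Each iteration of the three outer loops stands in for one thread, and the group of up to 4 channels stands in for one vec4 texel.

```python
from typing import List, Optional

Tensor3 = List[List[List[float]]]  # indexed as [N][C][L]


def conv1d_dw_reference(
    x: Tensor3,
    weight: List[List[float]],
    bias: Optional[List[float]] = None,
    stride: int = 1,
    padding: int = 0,
    dilation: int = 1,
) -> Tensor3:
    """Depthwise conv1d; `weight` is [C][K], i.e. [C,1,K] with the unit dim dropped."""
    n_batch, n_ch, l_in = len(x), len(x[0]), len(x[0][0])
    k_size = len(weight[0])
    # Standard conv output length (assumed to match the operator's L_out).
    l_out = (l_in + 2 * padding - dilation * (k_size - 1) - 1) // stride + 1
    out = [[[0.0] * l_out for _ in range(n_ch)] for _ in range(n_batch)]

    # Each (c4, out_pos, n) triple plays the role of one GPU thread.
    for n in range(n_batch):
        for out_pos in range(l_out):
            for c4 in range((n_ch + 3) // 4):
                chans = range(4 * c4, min(4 * c4 + 4, n_ch))
                # Accumulator "texel": up to 4 channels at one output position.
                acc = [bias[c] if bias is not None else 0.0 for c in chans]
                for k in range(k_size):
                    in_pos = out_pos * stride - padding + k * dilation
                    if in_pos < 0 or in_pos >= l_in:
                        continue  # zero padding: out-of-range taps add nothing
                    for i, c in enumerate(chans):
                        acc[i] += x[n][c][in_pos] * weight[c][k]
                for i, c in enumerate(chans):
                    out[n][c][out_pos] = acc[i]
    return out
```

Because every channel only reads its own filter, the innermost channel loop has no cross-channel dependency, which is what lets the shader replace it with a single vec4 FMA per kernel position.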
2 parents 88343dd + 8fac673 commit 1b33653

32 files changed

Lines changed: 615 additions & 116 deletions

.lintrunner.toml

Lines changed: 0 additions & 2 deletions
@@ -376,7 +376,6 @@ exclude_patterns = [
 'scripts/check_binary_dependencies.py',
 'profiler/test/test_profiler_e2e.py',
 'backends/arm/test/ops/*.py',
-'backends/cortex_m/test/**/*.py',
 ]
 command = [
 'python',
@@ -410,7 +409,6 @@ include_patterns = [
 exclude_patterns = [
 'third-party/**',
 '**/third-party/**',
-'backends/cortex_m/test/**/*.py',
 ]
 command = [
 'python',

backends/arm/README.md

Lines changed: 4 additions & 4 deletions
@@ -186,7 +186,7 @@ This approach is useful for checking your change against this workflow on your o
 These scripts also install the necessary dependencies to run the tests.
 Below is an overview of some of the testing options this script provides:

-| Command | Description |
+| Command | Description |
 | ---------------------------------------------------- | ------------------------------------------------------------ |
 | `test_arm_baremetal.sh test_pytest_ops_no_target` | Runs operator unit tests for non-target specific use-cases. |
 | `test_arm_baremetal.sh test_pytest_models_no_target` | Runs model unit tests for non-target specific use-cases. |
@@ -201,9 +201,9 @@ Below is an overview of some of the testing options this script provides:
 | `test_arm_baremetal.sh test_run_ethos_u85` | Runs end-to-end unit tests for Ethos-U85 specific use-cases. |
 | `test_arm_baremetal.sh test_pytest_ops_vkml` | Runs operator unit tests for VGF specific use-cases. |
 | `test_arm_baremetal.sh test_pytest_models_vkml` | Runs model unit tests for VGF specific use-cases. |
-| `test_arm_baremetal.sh test_run_vkml` | Runs end-to-end unit tests for VGF specific use-cases. |
-| `test_arm_baremetal.sh test_model_smollm2-135M` | Runs some models with Corstone FVP. |
-| `test_arm_baremetal.sh test_smaller_stories_llama` | Runs E2E model tests on Corstone FVP. |
+| `test_arm_baremetal.sh test_run_vkml` | Runs end-to-end unit tests for VGF specific use-cases. |
+| `test_arm_baremetal.sh test_model_smollm2-135M` | Runs some models with Corstone FVP. |
+| `test_arm_baremetal.sh test_smaller_stories_llama` | Runs E2E model tests on Corstone FVP. |
 | `test_arm_baremetal.sh test_memory_allocation` | Runs memory allocation tests for Ethos-U specific targets |

 For more information, please refer to the `backends/arm/test/test_arm_baremetal.sh` script.

backends/arm/__init__.py

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+# Copyright 2026 Arm Limited and/or its affiliates.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""Public entry points for the Arm backend.
+
+Public API is defined by explicit module exports (e.g., ``.vgf``, ``.ethosu``,
+``.quantizer``). Selected symbols are re-exported here for convenience.
+
+"""
+
+from __future__ import annotations
+
+import importlib
+from typing import Any
+
+# Public for tooling (manifest generation and API validation).
+LAZY_IMPORTS = {
+    "EthosUBackend": ("executorch.backends.arm.ethosu", "EthosUBackend"),
+    "EthosUCompileSpec": ("executorch.backends.arm.ethosu", "EthosUCompileSpec"),
+    "EthosUPartitioner": ("executorch.backends.arm.ethosu", "EthosUPartitioner"),
+    "VgfBackend": ("executorch.backends.arm.vgf", "VgfBackend"),
+    "VgfCompileSpec": ("executorch.backends.arm.vgf", "VgfCompileSpec"),
+    "VgfPartitioner": ("executorch.backends.arm.vgf", "VgfPartitioner"),
+    "EthosUQuantizer": ("executorch.backends.arm.quantizer", "EthosUQuantizer"),
+    "VgfQuantizer": ("executorch.backends.arm.quantizer", "VgfQuantizer"),
+    ("get_symmetric_quantization_config"): (
+        "executorch.backends.arm.quantizer",
+        "get_symmetric_quantization_config",
+    ),
+    ("get_symmetric_a16w8_quantization_config"): (
+        "executorch.backends.arm.quantizer",
+        "get_symmetric_a16w8_quantization_config",
+    ),
+}
+
+
+def __getattr__(name: str) -> Any:
+    if name in LAZY_IMPORTS:
+        module_name, attr = LAZY_IMPORTS[name]
+        module = importlib.import_module(module_name)
+        value = getattr(module, attr)
+        globals()[name] = value
+        return value
+    raise AttributeError(f"module '{__name__}' has no attribute '{name}'")
+
+
+def __dir__() -> list[str]:
+    return sorted(list(globals()) + list(LAZY_IMPORTS))
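The module-level `__getattr__` and `__dir__` hooks above follow the lazy-attribute pattern from PEP 562. As a rough, self-contained illustration of how that pattern behaves, the sketch below builds a throwaway module object and attaches equivalent hooks to it. The helper `make_lazy_module`, the module name `demo`, and the stdlib targets in `LAZY` are all hypothetical stand-ins chosen so the example runs without ExecuTorch installed.

```python
import importlib
import types
from typing import Any, Dict, List, Tuple

# Hypothetical mapping for illustration: public name -> (module, attribute).
LAZY: Dict[str, Tuple[str, str]] = {
    "sqrt": ("math", "sqrt"),
    "OrderedDict": ("collections", "OrderedDict"),
}


def make_lazy_module(name: str, lazy: Dict[str, Tuple[str, str]]) -> types.ModuleType:
    """Build a module whose listed attributes are imported on first access."""
    mod = types.ModuleType(name)

    def __getattr__(attr: str) -> Any:
        # Only called when normal attribute lookup on the module fails.
        if attr in lazy:
            module_name, target = lazy[attr]
            value = getattr(importlib.import_module(module_name), target)
            setattr(mod, attr, value)  # cache so later lookups skip this hook
            return value
        raise AttributeError(f"module '{name}' has no attribute '{attr}'")

    def __dir__() -> List[str]:
        # Advertise lazy names even before they have been resolved.
        return sorted(set(mod.__dict__) | set(lazy))

    mod.__getattr__ = __getattr__
    mod.__dir__ = __dir__
    return mod


demo = make_lazy_module("demo", LAZY)
print("sqrt" in demo.__dict__)  # not resolved yet
print(demo.sqrt(9.0))           # triggers the import and caches the result
print("sqrt" in demo.__dict__)  # now cached
```

Caching the resolved value in the module namespace means the hook runs at most once per name, which mirrors the `globals()[name] = value` line in the diff.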
Lines changed: 177 additions & 0 deletions
@@ -0,0 +1,177 @@
+# Copyright 2026 Arm Limited and/or its affiliates.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+#
+# This file is generated by
+# backends/arm/scripts/generate_public_api_manifest.py
+
+[python]
+
+[python.EthosUBackend]
+kind = "class"
+signature = "EthosUBackend()"
+
+[python.EthosUBackend.preprocess]
+kind = "function"
+signature = "EthosUBackend.preprocess(edge_program: torch.export.exported_program.ExportedProgram, compile_specs: List[executorch.exir.backend.compile_spec_schema.CompileSpec]) -> executorch.exir.backend.backend_details.PreprocessResult"
+
+[python.EthosUCompileSpec]
+kind = "class"
+signature = "EthosUCompileSpec(target: str, system_config: str | None = None, memory_mode: str | None = None, extra_flags: list[str] | None = None, config_ini: str | None = 'Arm/vela.ini')"
+
+[python.EthosUCompileSpec.DebugMode]
+kind = "enum"
+signature = "EthosUCompileSpec.DebugMode(value, names=None, *, module=None, qualname=None, type=None, start=1)"
+
+[python.EthosUCompileSpec.__eq__]
+kind = "function"
+signature = "EthosUCompileSpec.__eq__(self, other)"
+
+[python.EthosUCompileSpec.__repr__]
+kind = "function"
+signature = "EthosUCompileSpec.__repr__(self)"
+
+[python.EthosUCompileSpec.dump_debug_info]
+kind = "function"
+signature = "EthosUCompileSpec.dump_debug_info(self, debug_mode: executorch.backends.arm.common.arm_compile_spec.ArmCompileSpec.DebugMode | None)"
+
+[python.EthosUCompileSpec.dump_intermediate_artifacts_to]
+kind = "function"
+signature = "EthosUCompileSpec.dump_intermediate_artifacts_to(self, output_path: str | None)"
+
+[python.EthosUCompileSpec.set_pass_pipeline_config]
+kind = "function"
+signature = "EthosUCompileSpec.set_pass_pipeline_config(self, config: executorch.backends.arm.common.pipeline_config.ArmPassPipelineConfig) -> None"
+
+[python.EthosUPartitioner]
+kind = "class"
+signature = "EthosUPartitioner(compile_spec: executorch.backends.arm.ethosu.compile_spec.EthosUCompileSpec, additional_checks: Optional[Sequence[torch.fx.passes.operator_support.OperatorSupportBase]] = None) -> None"
+
+[python.EthosUPartitioner.ops_to_not_decompose]
+kind = "function"
+signature = "EthosUPartitioner.ops_to_not_decompose(self, ep: torch.export.exported_program.ExportedProgram) -> Tuple[List[torch._ops.OpOverload], Optional[Callable[[torch.fx.node.Node], bool]]]"
+
+[python.EthosUPartitioner.partition]
+kind = "function"
+signature = "EthosUPartitioner.partition(self, exported_program: torch.export.exported_program.ExportedProgram) -> executorch.exir.backend.partitioner.PartitionResult"
+
+[python.EthosUQuantizer]
+kind = "class"
+signature = "EthosUQuantizer(compile_spec: 'EthosUCompileSpec', use_composable_quantizer: 'bool' = False) -> 'None'"
+
+[python.EthosUQuantizer.annotate]
+kind = "function"
+signature = "EthosUQuantizer.annotate(self, model: 'GraphModule') -> 'GraphModule'"
+
+[python.EthosUQuantizer.set_global]
+kind = "function"
+signature = "EthosUQuantizer.set_global(self, quantization_config: 'Optional[QuantizationConfig]') -> 'TOSAQuantizer'"
+
+[python.EthosUQuantizer.set_io]
+kind = "function"
+signature = "EthosUQuantizer.set_io(self, quantization_config: 'Optional[QuantizationConfig]') -> 'TOSAQuantizer'"
+
+[python.EthosUQuantizer.set_module_name]
+kind = "function"
+signature = "EthosUQuantizer.set_module_name(self, module_name: 'str', quantization_config: 'Optional[QuantizationConfig]') -> 'TOSAQuantizer'"
+
+[python.EthosUQuantizer.set_module_type]
+kind = "function"
+signature = "EthosUQuantizer.set_module_type(self, module_type: 'Callable', quantization_config: 'Optional[QuantizationConfig]') -> 'TOSAQuantizer'"
+
+[python.EthosUQuantizer.transform_for_annotation]
+kind = "function"
+signature = "EthosUQuantizer.transform_for_annotation(self, model: 'GraphModule') -> 'GraphModule'"
+
+[python.EthosUQuantizer.validate]
+kind = "function"
+signature = "EthosUQuantizer.validate(self, model: 'GraphModule') -> 'None'"
+
+[python.VgfBackend]
+kind = "class"
+signature = "VgfBackend()"
+
+[python.VgfBackend.preprocess]
+kind = "function"
+signature = "VgfBackend.preprocess(edge_program: torch.export.exported_program.ExportedProgram, compile_specs: List[executorch.exir.backend.compile_spec_schema.CompileSpec]) -> executorch.exir.backend.backend_details.PreprocessResult"
+
+[python.VgfCompileSpec]
+kind = "class"
+signature = "VgfCompileSpec(tosa_spec: executorch.backends.arm.tosa.specification.TosaSpecification | str | None = None, compiler_flags: list[str] | None = None)"
+
+[python.VgfCompileSpec.DebugMode]
+kind = "enum"
+signature = "VgfCompileSpec.DebugMode(value, names=None, *, module=None, qualname=None, type=None, start=1)"
+
+[python.VgfCompileSpec.__eq__]
+kind = "function"
+signature = "VgfCompileSpec.__eq__(self, other)"
+
+[python.VgfCompileSpec.__repr__]
+kind = "function"
+signature = "VgfCompileSpec.__repr__(self)"
+
+[python.VgfCompileSpec.dump_debug_info]
+kind = "function"
+signature = "VgfCompileSpec.dump_debug_info(self, debug_mode: executorch.backends.arm.common.arm_compile_spec.ArmCompileSpec.DebugMode | None)"
+
+[python.VgfCompileSpec.dump_intermediate_artifacts_to]
+kind = "function"
+signature = "VgfCompileSpec.dump_intermediate_artifacts_to(self, output_path: str | None)"
+
+[python.VgfCompileSpec.set_pass_pipeline_config]
+kind = "function"
+signature = "VgfCompileSpec.set_pass_pipeline_config(self, config: executorch.backends.arm.common.pipeline_config.ArmPassPipelineConfig) -> None"
+
+[python.VgfPartitioner]
+kind = "class"
+signature = "VgfPartitioner(compile_spec: executorch.backends.arm.vgf.compile_spec.VgfCompileSpec, additional_checks: Optional[Sequence[torch.fx.passes.operator_support.OperatorSupportBase]] = None) -> None"
+
+[python.VgfPartitioner.ops_to_not_decompose]
+kind = "function"
+signature = "VgfPartitioner.ops_to_not_decompose(self, ep: torch.export.exported_program.ExportedProgram) -> Tuple[List[torch._ops.OpOverload], Optional[Callable[[torch.fx.node.Node], bool]]]"
+
+[python.VgfPartitioner.partition]
+kind = "function"
+signature = "VgfPartitioner.partition(self, exported_program: torch.export.exported_program.ExportedProgram) -> executorch.exir.backend.partitioner.PartitionResult"
+
+[python.VgfQuantizer]
+kind = "class"
+signature = "VgfQuantizer(compile_spec: 'VgfCompileSpec', use_composable_quantizer: 'bool' = False) -> 'None'"
+
+[python.VgfQuantizer.annotate]
+kind = "function"
+signature = "VgfQuantizer.annotate(self, model: 'GraphModule') -> 'GraphModule'"
+
+[python.VgfQuantizer.set_global]
+kind = "function"
+signature = "VgfQuantizer.set_global(self, quantization_config: 'Optional[QuantizationConfig]') -> 'TOSAQuantizer'"
+
+[python.VgfQuantizer.set_io]
+kind = "function"
+signature = "VgfQuantizer.set_io(self, quantization_config: 'Optional[QuantizationConfig]') -> 'TOSAQuantizer'"
+
+[python.VgfQuantizer.set_module_name]
+kind = "function"
+signature = "VgfQuantizer.set_module_name(self, module_name: 'str', quantization_config: 'Optional[QuantizationConfig]') -> 'TOSAQuantizer'"
+
+[python.VgfQuantizer.set_module_type]
+kind = "function"
+signature = "VgfQuantizer.set_module_type(self, module_type: 'Callable', quantization_config: 'Optional[QuantizationConfig]') -> 'TOSAQuantizer'"
+
+[python.VgfQuantizer.transform_for_annotation]
+kind = "function"
+signature = "VgfQuantizer.transform_for_annotation(self, model: 'GraphModule') -> 'GraphModule'"
+
+[python.VgfQuantizer.validate]
+kind = "function"
+signature = "VgfQuantizer.validate(self, model: 'GraphModule') -> 'None'"
+
+[python.get_symmetric_a16w8_quantization_config]
+kind = "function"
+signature = "get_symmetric_a16w8_quantization_config(is_per_channel: 'bool' = True, is_qat: 'bool' = False, is_dynamic: 'bool' = False, weight_qmin: 'int' = -127, weight_qmax: 'int' = 127, epsilon: 'float' = 0.000244140625) -> 'QuantizationConfig'"
+
+[python.get_symmetric_quantization_config]
+kind = "function"
+signature = "get_symmetric_quantization_config(is_per_channel: 'bool' = True, is_qat: 'bool' = False, is_dynamic: 'bool' = False, act_qmin: 'int' = -128, act_qmax: 'int' = 127, weight_qmin: 'int' = -127, weight_qmax: 'int' = 127, eps: 'float' = 1.52587890625e-05) -> 'QuantizationConfig'"

backends/arm/requirements-arm-tosa.txt

Lines changed: 1 addition & 0 deletions
@@ -9,3 +9,4 @@ tosa-adapter-model-explorer == 0.1.0
 ai-edge-model-explorer >= 0.1.16
 # NOTE: Will be removed when tosa-tools is installed via pypi
 pybind11 == 2.10.4
+pytest-timeout
