
Commit 33acf17 (2 parents: e51b858 + 60d57e5)

Update base for Update on "[ET Device Support] TensorImpl carries device info"

This diff extends `TensorImpl` to carry device information, enabling the runtime tensor to track which device its data resides on (CPU, CUDA, etc.). This is a prerequisite for parsing device info from the schema and allocating device memory.

Differential Revision: D93635655 (https://our.internmc.facebook.com/intern/diff/D93635655/)

[ghstack-poisoned]
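The idea in the commit message can be illustrated with a small conceptual sketch. This is Python with entirely hypothetical names (`Device`, `FakeTensorImpl`); the real change is to ExecuTorch's C++ `TensorImpl` and nothing here mirrors the actual API:

```python
# Conceptual sketch only: a tensor impl that carries device info so the
# runtime can tell where its data lives. Hypothetical names throughout.
from dataclasses import dataclass, field


@dataclass
class Device:
    type: str = "cpu"  # e.g. "cpu", "cuda"
    index: int = 0


@dataclass
class FakeTensorImpl:
    sizes: tuple
    # The tensor now tracks the device its data resides on.
    device: Device = field(default_factory=Device)


cpu_t = FakeTensorImpl(sizes=(2, 3))
assert cpu_t.device.type == "cpu"

gpu_t = FakeTensorImpl(sizes=(2, 3), device=Device("cuda", 0))
assert gpu_t.device.type == "cuda"
```

With the device recorded on the impl, downstream code (schema parsing, allocation) can branch on `device.type` instead of assuming CPU.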

60 files changed: 3986 additions and 1294 deletions


.ci/scripts/wheel/pre_build_script.sh

Lines changed: 19 additions & 12 deletions
@@ -9,34 +9,41 @@ set -euxo pipefail
 
 # This script is run before building ExecuTorch binaries
 
-if [[ "$(uname -m)" == "aarch64" ]]; then
-  # On some Linux aarch64 systems, the "atomic" library is not found during linking.
-  # To work around this, replace "atomic" with the literal ${ATOMIC_LIB} so the
-  # build system uses the full path to the atomic library.
-  file="extension/llm/tokenizers/third-party/sentencepiece/src/CMakeLists.txt"
-  sed 's/list(APPEND SPM_LIBS "atomic")/list(APPEND SPM_LIBS ${ATOMIC_LIB})/' \
-    "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
-
-  grep -n 'list(APPEND SPM_LIBS ${ATOMIC_LIB})' "$file" && \
-    echo "the file $file has been modified for atomic to use full path"
+# Initialize submodules here instead of during checkout so we can use OpenSSL
+# on Windows (schannel fails with SEC_E_ILLEGAL_MESSAGE on some gitlab hosts).
+UNAME_S=$(uname -s)
+if [[ $UNAME_S == *"MINGW"* || $UNAME_S == *"MSYS"* ]]; then
+  git -c http.sslBackend=openssl submodule update --init
+else
+  git submodule update --init
 fi
 
 # Clone nested submodules for tokenizers - this is a workaround for recursive
 # submodule clone failing due to path length limitations on Windows. Eventually,
 # we should update the core job in test-infra to enable long paths before
 # checkout to avoid needing to do this.
 pushd extension/llm/tokenizers
-UNAME_S=$(uname -s)
 if [[ $UNAME_S == *"MINGW"* || $UNAME_S == *"MSYS"* ]]; then
   git -c http.sslBackend=openssl submodule update --init
 else
   git submodule update --init
 fi
 popd
 
+if [[ "$(uname -m)" == "aarch64" ]]; then
+  # On some Linux aarch64 systems, the "atomic" library is not found during linking.
+  # To work around this, replace "atomic" with the literal ${ATOMIC_LIB} so the
+  # build system uses the full path to the atomic library.
+  file="extension/llm/tokenizers/third-party/sentencepiece/src/CMakeLists.txt"
+  sed 's/list(APPEND SPM_LIBS "atomic")/list(APPEND SPM_LIBS ${ATOMIC_LIB})/' \
+    "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
+
+  grep -n 'list(APPEND SPM_LIBS ${ATOMIC_LIB})' "$file" && \
+    echo "the file $file has been modified for atomic to use full path"
+fi
+
 # On Windows, enable symlinks and re-checkout the current revision to create
 # the symlinked src/ directory. This is needed to build the wheel.
-UNAME_S=$(uname -s)
 if [[ $UNAME_S == *"MINGW"* || $UNAME_S == *"MSYS"* ]]; then
   echo "Enabling symlinks on Windows"
   git config core.symlinks true
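The sed substitution in the aarch64 branch above can be sketched in isolation. This is a Python stand-in for the sed call (the CMake line is quoted from the script; the replacement logic is the same textual substitution):

```python
# Stand-in for the script's sed invocation: swap the quoted "atomic"
# library name for the ${ATOMIC_LIB} placeholder in a CMakeLists line,
# so the build later links against the full path to libatomic.
cmake_line = 'list(APPEND SPM_LIBS "atomic")'

patched = cmake_line.replace(
    'list(APPEND SPM_LIBS "atomic")',
    'list(APPEND SPM_LIBS ${ATOMIC_LIB})',
)

assert patched == 'list(APPEND SPM_LIBS ${ATOMIC_LIB})'
```

The script's `grep -n ... && echo ...` afterwards is just a sanity check that the substitution actually landed in the file.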

.github/workflows/build-wheels-windows.yml

Lines changed: 3 additions & 1 deletion
@@ -64,4 +64,6 @@ jobs:
       smoke-test-script: ${{ matrix.smoke-test-script }}
       trigger-event: ${{ github.event_name }}
       wheel-build-params: "--verbose"
-      submodules: true
+      # Submodules are initialized in pre_build_script.sh with OpenSSL to avoid
+      # schannel SSL errors on Windows when cloning from non-GitHub hosts.
+      submodules: false

.github/workflows/cuda.yml

Lines changed: 3 additions & 0 deletions
@@ -135,6 +135,9 @@ jobs:
           # Run CUDA backend Python tests
           python -m pytest backends/cuda/tests backends/cuda/passes/tests -v -o "addopts="
 
+          # Build Qwen3.5 MoE runner (ExecuTorch already built above)
+          cd examples/models/qwen3_5_moe && cmake --workflow --preset qwen3-5-moe-cuda
+
   export-model-cuda-artifact:
     name: export-model-cuda-artifact
     # Skip this job if the pull request is from a fork (HuggingFace secrets are not available)

Makefile

Lines changed: 11 additions & 1 deletion
@@ -91,7 +91,7 @@
 #
 # ==============================================================================
 
-.PHONY: voxtral-cuda voxtral-cpu voxtral-metal voxtral_realtime-cuda voxtral_realtime-cpu voxtral_realtime-metal whisper-cuda whisper-cuda-debug whisper-cpu whisper-metal parakeet-cuda parakeet-cuda-debug parakeet-cpu parakeet-metal parakeet-vulkan dinov2-cuda dinov2-cuda-debug sortformer-cuda sortformer-cpu silero-vad-cpu llama-cuda llama-cuda-debug llama-cpu llava-cpu gemma3-cuda gemma3-cpu clean help
+.PHONY: voxtral-cuda voxtral-cpu voxtral-metal voxtral_realtime-cuda voxtral_realtime-cpu voxtral_realtime-metal whisper-cuda whisper-cuda-debug whisper-cpu whisper-metal parakeet-cuda parakeet-cuda-debug parakeet-cpu parakeet-metal parakeet-vulkan dinov2-cuda dinov2-cuda-debug sortformer-cuda sortformer-cpu silero-vad-cpu llama-cuda llama-cuda-debug llama-cpu llava-cpu gemma3-cuda gemma3-cpu qwen3_5_moe-cuda clean help
 
 help:
 	@echo "This Makefile adds targets to build runners for various models on various backends. Run using \`make <target>\`. Available targets:"
@@ -121,6 +121,7 @@ help:
 	@echo "  llava-cpu - Build Llava runner with CPU backend"
 	@echo "  gemma3-cuda - Build Gemma3 runner with CUDA backend"
 	@echo "  gemma3-cpu - Build Gemma3 runner with CPU backend"
+	@echo "  qwen3_5_moe-cuda - Build Qwen3.5 MoE runner with CUDA backend"
 	@echo "  clean - Clean build artifacts"
 
 voxtral-cuda:
@@ -362,6 +363,15 @@ gemma3-cpu:
 	@echo "✓ Build complete!"
 	@echo "  Binary: cmake-out/examples/models/gemma3/gemma3_e2e_runner"
 
+qwen3_5_moe-cuda:
+	@echo "==> Building and installing ExecuTorch with CUDA..."
+	cmake --workflow --preset llm-release-cuda
+	@echo "==> Building Qwen3.5 MoE runner with CUDA..."
+	cd examples/models/qwen3_5_moe && cmake --workflow --preset qwen3-5-moe-cuda
+	@echo ""
+	@echo "✓ Build complete!"
+	@echo "  Binary: cmake-out/examples/models/qwen3_5_moe/qwen3_5_moe_runner"
+
 clean:
 	rm -rf cmake-out \
 		extension/llm/tokenizers/build \

backends/arm/common/arm_compile_spec.py

Lines changed: 8 additions & 8 deletions
@@ -117,9 +117,9 @@ def from_list(cls, compile_specs: list[CompileSpec]):  # noqa: C901
             raise ValueError("No tosa_spec in compile spec.")
         if output_format is None:
             raise ValueError("No output_format in compile spec.")
-        if output_format != cls.get_output_format():
+        if output_format != cls._get_output_format():
             raise ValueError(
-                f"Incorrect output format '{output_format}' for {cls.__name__}, expected '{cls.get_output_format()}'"
+                f"Incorrect output format '{output_format}' for {cls.__name__}, expected '{cls._get_output_format()}'"
             )
         if compiler_flags is None:
             compiler_flags = []
@@ -134,17 +134,17 @@ def from_list(cls, compile_specs: list[CompileSpec]):  # noqa: C901
             output_order_workaround=output_order_workaround,
             pipeline_config=pipeline_config,
         )
-        cls.from_list_hook(compile_spec, unknown_specs)
-        compile_spec.validate()
+        cls._from_list_hook(compile_spec, unknown_specs)
+        compile_spec._validate()
         return compile_spec
 
     @classmethod
-    def from_list_hook(cls, compile_spec, specs: dict[str, str]):  # noqa: B027
+    def _from_list_hook(cls, compile_spec, specs: dict[str, str]):  # noqa: B027
         """Allows subclasses to hook into parsing compile spec lists."""
         pass
 
     @abstractmethod
-    def validate(self):
+    def _validate(self):
         """Throws an error if the compile spec is not valid."""
 
     def to_list(self):
@@ -170,7 +170,7 @@ def to_list(self):
         # Add output format to identify kind of compile spec.
         compile_spec.append(
             CompileSpec(
-                ArmCompileSpec._OUTPUT_FORMAT_KEY, self.get_output_format().encode()
+                ArmCompileSpec._OUTPUT_FORMAT_KEY, self._get_output_format().encode()
             )
         )
@@ -285,5 +285,5 @@ def get_output_order_workaround(self) -> bool:
 
     @classmethod
     @abstractmethod
-    def get_output_format(cls) -> str:
+    def _get_output_format(cls) -> str:
         """Returns a constant string that is the output format of the class."""

backends/arm/ethosu/compile_spec.py

Lines changed: 4 additions & 4 deletions
@@ -119,7 +119,7 @@ def __init__(
         )
         tosa_spec = self._tosa_spec_for_target(target_lower)
         self._set_compile_specs(tosa_spec, compiler_flags)
-        self.validate()
+        self._validate()
 
     def to_list(self):
         """Return compile specs including the encoded Ethos-U target."""
@@ -128,11 +128,11 @@ def to_list(self):
         return compile_specs
 
     @classmethod
-    def from_list_hook(cls, compile_spec, specs: dict[str, str]):
+    def _from_list_hook(cls, compile_spec, specs: dict[str, str]):
         """Restore target-specific metadata from serialized compile specs."""
         compile_spec.target = specs.get(cls._TARGET_KEY, None)
 
-    def validate(self):
+    def _validate(self):
         """Validate the configuration against supported Ethos-U settings."""
         if len(self.compiler_flags) == 0:
             raise ValueError(
@@ -144,7 +144,7 @@ def validate(self):
         )
 
     @classmethod
-    def get_output_format(cls) -> str:
+    def _get_output_format(cls) -> str:
         """Return the artifact format emitted by this compile spec."""
         return "vela"

backends/arm/quantizer/arm_quantizer.py

Lines changed: 39 additions & 2 deletions
@@ -71,6 +71,7 @@
     SharedQspecQuantizer,
 )
 from executorch.backends.arm.vgf import VgfCompileSpec
+from executorch.exir._warnings import experimental
 from torch.fx import GraphModule, Node
 from torchao.quantization.pt2e import (
     FakeQuantize,
@@ -441,14 +442,26 @@ def _for_each_filtered_node(
 
 
 class TOSAQuantizer(Quantizer):
-    """Manage quantization annotations for TOSA-compatible backends."""
+    """Manage quantization annotations for TOSA-compatible backends.
+
+    .. warning::
+        Setting ``use_composable_quantizer=True`` enables an experimental API
+        surface that may change without notice.
+
+    """
 
     def __init__(
         self,
         compile_spec_or_tosa_spec,
         use_composable_quantizer: bool = False,
    ) -> None:
-        """Create a TOSA quantizer from a TOSA spec or Arm compile spec."""
+        """Create a TOSA quantizer from a TOSA spec or Arm compile spec.
+
+        .. warning::
+            Setting ``use_composable_quantizer=True`` enables an experimental
+            API surface that may change without notice.
+
+        """
         self.use_composable_quantizer = use_composable_quantizer
         self.quantizer: _TOSAQuantizerV1 | _TOSAQuantizerV2
         if use_composable_quantizer:
@@ -606,6 +619,10 @@ def set_io(
         self.quantizer.set_io(quantization_config)
         return self
 
+    @experimental(
+        "This API is experimental and may change without notice. "
+        "It is only available when use_composable_quantizer=True."
+    )
     def add_quantizer(self, quantizer: Quantizer) -> TOSAQuantizer:
         """Insert a quantizer with highest precedence."""
         if self.use_composable_quantizer:
@@ -614,6 +631,10 @@ def add_quantizer(self, quantizer: Quantizer) -> TOSAQuantizer:
             "add_quantizer is only supported in the composable quantizer implementation."
         )
 
+    @experimental(
+        "This API is experimental and may change without notice. "
+        "It is only available when use_composable_quantizer=True."
+    )
     def set_node_finder(
         self, quantization_config: Optional[QuantizationConfig], node_finder: NodeFinder
     ) -> TOSAQuantizer:
@@ -631,6 +652,10 @@ def set_node_finder(
             "set_node_finder is only supported in the composable quantizer implementation."
         )
 
+    @experimental(
+        "This API is experimental and may change without notice. "
+        "It is only available when use_composable_quantizer=True."
+    )
     def set_node_target(
         self, node_target: OpOverload, quantization_config: Optional[QuantizationConfig]
     ) -> TOSAQuantizer:
@@ -641,6 +666,10 @@ def set_node_target(
             "set_node_target is only supported in the composable quantizer implementation."
         )
 
+    @experimental(
+        "This API is experimental and may change without notice. "
+        "It is only available when use_composable_quantizer=True."
+    )
     def set_node_name(
         self, node_name: str, quantization_config: Optional[QuantizationConfig]
     ) -> TOSAQuantizer:
@@ -1167,6 +1196,10 @@ def set_io(
 class EthosUQuantizer(TOSAQuantizer):
     """Quantizer supported by the Arm Ethos-U backend.
 
+    .. warning::
+        Setting ``use_composable_quantizer=True`` enables an experimental API
+        surface that may change without notice.
+
     Args:
         compile_spec (EthosUCompileSpec): Backend compile specification for
             Ethos-U targets.
@@ -1185,6 +1218,10 @@ def __init__(
 class VgfQuantizer(TOSAQuantizer):
     """Quantizer supported by the Arm Vgf backend.
 
+    .. warning::
+        Setting ``use_composable_quantizer=True`` enables an experimental API
+        surface that may change without notice.
+
     Args:
         compile_spec (VgfCompileSpec): Backend compile specification for Vgf
             targets.
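A decorator in the spirit of the `experimental` marker applied above can be sketched as follows. This is illustrative only: it is not the implementation in `executorch.exir._warnings`, whose actual behavior may differ:

```python
# Sketch of an @experimental-style decorator: wrap a function so that
# calling it emits a warning with the given message. Illustrative only.
import functools
import warnings


def experimental(message: str):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            warnings.warn(message, UserWarning, stacklevel=2)
            return fn(*args, **kwargs)
        return wrapper
    return decorator


@experimental(
    "This API is experimental and may change without notice. "
    "It is only available when use_composable_quantizer=True."
)
def add_quantizer(quantizer):
    return quantizer


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    add_quantizer("dummy")

assert any("experimental" in str(w.message) for w in caught)
```

Marking the methods at the decorator level keeps the warning next to the API surface it guards, rather than scattering warning calls through the method bodies.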

backends/arm/scripts/docgen/docgen.py

Lines changed: 2 additions & 2 deletions
@@ -135,7 +135,7 @@ def generate_ethos_u_docs():
     """Generates documentation for the Ethos-U components in the backend."""
     compilespec_string = get_class_docstring(
         EthosUCompileSpec,
-        ("DebugMode", "to_list", "from_list", "from_list_hook", "validate"),
+        ("DebugMode", "to_list", "from_list"),
     )
     partitioner_string = get_class_docstring(EthosUPartitioner)
     quantizer_string = get_class_docstring(
@@ -190,7 +190,7 @@ def generate_vgf_docs():
     """Generates documentation for the VGF components in the backend."""
     compilespec_string = get_class_docstring(
         VgfCompileSpec,
-        ("DebugMode", "to_list", "from_list", "from_list_hook", "validate"),
+        ("DebugMode", "to_list", "from_list"),
     )
     partitioner_string = get_class_docstring(VgfPartitioner)
     quantizer_string = get_class_docstring(
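The shrunken exclusion tuples make sense once `from_list_hook` and `validate` are underscore-private: a doc generator that skips private members no longer needs to list them explicitly. A sketch with a hypothetical helper (the real `get_class_docstring` may work differently):

```python
# Sketch: a member lister that skips _-prefixed names automatically,
# so only still-public members need an explicit exclusion. The helper
# below is hypothetical, not the docgen module's actual code.
import inspect


class Demo:
    """Example class."""

    def to_list(self): ...
    def _validate(self): ...          # private: skipped automatically
    def _from_list_hook(self): ...    # private: skipped automatically


def public_methods(cls, exclude=()):
    return [
        name
        for name, _ in inspect.getmembers(cls, inspect.isfunction)
        if not name.startswith("_") and name not in exclude
    ]


assert public_methods(Demo) == ["to_list"]
assert public_methods(Demo, exclude=("to_list",)) == []
```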
