Commit 300a70f

docs(install): note aarch64 wheels are aarch64-sbsa, not L4T (Jetson) (#1941)
Adds a WARNING callout after the Linux aarch64 row of the PyPI build-targets table, explaining that:

1. Wheels are built on aarch64-sbsa runners (standard CUDA Toolkit), not the L4T / JetPack runtime that Jetson Orin / Xavier / Thor (on CUDA 12) use.
2. The mismatch surfaces as 'Error named symbol not found in /src/csrc/ops.cu' on the first CUDA op: a symbol-resolution error, NOT a kernel-image-for-device error. The cubins ARE binary-compatible with the device per Ampere-family binary compat (sm_80 SASS runs on sm_87 hardware natively).
3. Working options on Jetson: on-device source build, or third-party prebuilt from Jetson AI Lab.

References #1218 and #1930 for the original error reports, and #1939 for the empirical confirmation that the fault is the toolchain delta, not the arch list (sm_80-only cubin built on-device runs cleanly on sm_87 hardware).

Co-authored-by: neil-the-nowledgable <254185769+neil-the-nowledgable@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 4396187 commit 300a70f

1 file changed

Lines changed: 8 additions & 0 deletions

docs/source/installation.mdx

@@ -66,6 +66,14 @@ Use `pip` or `uv` to install the latest release:
 pip install bitsandbytes
 ```

+> [!WARNING]
+> **NVIDIA Jetson (L4T / JetPack): source build required.** The `Linux aarch64` wheels above are built on aarch64-sbsa runners (server-class ARM with the standard CUDA Toolkit). They are **not compatible** with the L4T runtime on Jetson devices (Orin Nano / NX / AGX, Xavier, Thor on CUDA 12), even though both are aarch64 and even though the cubins are binary-compatible with the device's compute capability (e.g., `sm_80` cubin runs on `sm_87` hardware via Ampere-family binary compat; see [NVIDIA's docs on binary compatibility](https://developer.nvidia.com/blog/understanding-ptx-the-assembly-language-of-cuda-gpu-computing/#binary_compatibility)). The mismatch is at the CUDA library / ABI layer (JetPack ships its own CUDA Toolkit and system libraries), and surfaces as a runtime symbol-resolution error like `Error named symbol not found in /src/csrc/ops.cu` on the first CUDA op.
+>
+> **Two working options on Jetson:**
+>
+> 1. **Source build on-device.** Use the [Compile from Source](#cuda-compile) instructions below, passing your device's compute capability explicitly (sm_87 for Orin family, sm_72 for Xavier). On an Orin Nano Super: `cmake -DCOMPUTE_BACKEND=cuda -DCOMPUTE_CAPABILITY=87 . && make -j4 && pip install .`
+> 2. **Third-party prebuilt** from [Jetson AI Lab's package index](https://pypi.jetson-ai-lab.io/) (e.g., `pypi.jetson-ai-lab.io/jp6/cu126/bitsandbytes/`).
+
 ### Compile from Source[[cuda-compile]]

> [!TIP]
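
For readers who want to script the on-device source build described in the callout, here is a minimal sketch, not part of the commit. It assembles the same `cmake`, `make` and `pip` commands from a device family name, using only the compute capabilities the callout states (87 for Orin, 72 for Xavier); Thor is left out because the callout gives no value for it. The helper names (`is_l4t`, `source_build_steps`, `as_shell`) are hypothetical, and the `/etc/nv_tegra_release` marker check is an assumption about typical JetPack images rather than something the commit documents.

```python
import os
import shlex

# Compute capabilities named in the callout: sm_87 for Orin, sm_72 for Xavier.
JETSON_COMPUTE_CAPABILITY = {"orin": "87", "xavier": "72"}

# Assumption: L4T / JetPack images ship this marker file. Treat as a heuristic.
L4T_MARKER = "/etc/nv_tegra_release"


def is_l4t(marker_path=L4T_MARKER):
    """Return True if the (assumed) L4T marker file exists."""
    return os.path.exists(marker_path)


def source_build_steps(family, jobs=4):
    """Return the callout's three build commands as argument lists."""
    capability = JETSON_COMPUTE_CAPABILITY[family.lower()]
    return [
        ["cmake", "-DCOMPUTE_BACKEND=cuda", f"-DCOMPUTE_CAPABILITY={capability}", "."],
        ["make", f"-j{jobs}"],
        ["pip", "install", "."],
    ]


def as_shell(steps):
    """Join the steps into one '&&'-chained shell line for display."""
    return " && ".join(shlex.join(step) for step in steps)


if __name__ == "__main__":
    if is_l4t():
        print(as_shell(source_build_steps("orin")))
    else:
        print("No L4T marker found; the standard PyPI wheel may apply.")
```

Returning argument lists rather than one shell string lets each step be passed to `subprocess.run` separately, so a failed `cmake` step stops the build before `make` runs.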
