Merged
1 change: 0 additions & 1 deletion .github/workflows/ci_linux.yml
@@ -163,7 +163,6 @@ jobs:
--exclude cust
'

# Exclude cust_raw because it triggers hundreds of warnings.
- name: Check documentation
run: |
docker exec "$CONTAINER_NAME" bash -lc 'set -euo pipefail
1 change: 0 additions & 1 deletion .github/workflows/ci_windows.yml
@@ -106,7 +106,6 @@ jobs:
run: cargo test --workspace --exclude blastoff --exclude cudnn --exclude cudnn-sys --exclude cust

# Exclude crates that require cuDNN, not available on Windows CI: cudnn, cudnn-sys.
# Exclude cust_raw because it triggers hundreds of warnings.
- name: Check documentation
env:
RUSTDOCFLAGS: -Dwarnings
10 changes: 10 additions & 0 deletions .github/workflows/container_images.yml
@@ -33,9 +33,15 @@ jobs:
- name: Ubuntu-24.04/CUDA-12.8.1
image: "rust-gpu/rust-cuda-ubuntu24-cuda12"
dockerfile: ./container/ubuntu24-cuda12/Dockerfile
- name: Ubuntu-24.04/CUDA-13.0.2
image: "rust-gpu/rust-cuda-ubuntu24-cuda13"
dockerfile: ./container/ubuntu24-cuda13/Dockerfile
- name: RockyLinux-9/CUDA-12.8.1
image: "rust-gpu/rust-cuda-rockylinux9-cuda12"
dockerfile: ./container/rockylinux9-cuda12/Dockerfile
- name: RockyLinux-9/CUDA-13.0.2
image: "rust-gpu/rust-cuda-rockylinux9-cuda13"
dockerfile: ./container/rockylinux9-cuda13/Dockerfile
steps:
- name: Free up space
# Without this the job will likely run out of disk space.
@@ -153,8 +159,12 @@ jobs:
variance:
- name: Ubuntu-24.04/CUDA-12.8.1
image: "rust-gpu/rust-cuda-ubuntu24-cuda12"
- name: Ubuntu-24.04/CUDA-13.0.2
image: "rust-gpu/rust-cuda-ubuntu24-cuda13"
- name: RockyLinux-9/CUDA-12.8.1
image: "rust-gpu/rust-cuda-rockylinux9-cuda12"
- name: RockyLinux-9/CUDA-13.0.2
image: "rust-gpu/rust-cuda-rockylinux9-cuda13"
steps:
- name: Set artifact name
run: |
106 changes: 13 additions & 93 deletions README.md
@@ -2,116 +2,36 @@
<h1>The Rust CUDA Project</h1>

<p>
<strong>An ecosystem of libraries and tools for writing and executing extremely fast GPU code fully in
<a href="https://www.rust-lang.org/">Rust</a></strong>
<strong>An ecosystem of libraries and tools for writing and executing extremely fast GPU code
fully in <a href="https://www.rust-lang.org/">Rust</a></strong>
</p>

<h3>
<a href="https://rust-gpu.github.io/rust-cuda/index.html">Guide</a>
<span> | </span>
<a href="https://rust-gpu.github.io/rust-cuda/guide/getting_started.html">Getting Started</a>
<span> | </span>
<a href="https://rust-gpu.github.io/rust-cuda/features.html">Features</a>
</h3>
<strong>⚠️ The project is still in early development, expect bugs, safety issues, and things that don't work ⚠️</strong>
</div>

<br/>

> [!IMPORTANT]
> This project is no longer dormant and is [being
> rebooted](https://rust-gpu.github.io/blog/2025/01/27/rust-cuda-reboot). Read the [latest status update](https://rust-gpu.github.io/blog/2025/08/11/rust-cuda-update).
> Please contribute!
>
> The project is still in early development, however. Expect bugs, safety issues, and things that
> don't work.

## Goal

The Rust CUDA Project is a project aimed at making Rust a tier-1 language for extremely fast GPU computing
using the CUDA Toolkit. It provides tools for compiling Rust to extremely fast PTX code as well as libraries
for using existing CUDA libraries with it.

## Background

Historically, general purpose high performance GPU computing has been done using the CUDA toolkit. The CUDA toolkit primarily
provides a way to use Fortran/C/C++ code for GPU computing in tandem with CPU code with a single source. It also provides
many libraries, tools, forums, and documentation to supplement the single-source CPU/GPU code.

CUDA is exclusively an NVIDIA-only toolkit. Many tools have been proposed for cross-platform GPU computing such as
OpenCL, Vulkan Computing, and HIP. However, CUDA remains the most used toolkit for such tasks by far. This is why it is
imperative to make Rust a viable option for use with the CUDA toolkit.

However, CUDA with Rust has been a historically very rocky road. The only viable option until now has been to use the LLVM PTX
backend, however, the LLVM PTX backend does not always work and would generate invalid PTX for many common Rust operations, and
in recent years it has been shown time and time again that a specialized solution is needed for Rust on the GPU with the advent
of projects such as rust-gpu (for Rust -> SPIR-V).

Our hope is that with this project we can push the Rust GPU computing industry forward and make Rust an excellent language
for such tasks. Rust offers plenty of benefits such as `__restrict__` performance benefits for every kernel, An excellent module/crate system,
delimiting of unsafe areas of CPU/GPU code with `unsafe`, high level wrappers to low level CUDA libraries, etc.

## Structure

The scope of the Rust CUDA Project is quite broad, it spans the entirety of the CUDA ecosystem, with libraries and tools to make it
usable using Rust. Therefore, the project contains many crates for all corners of the CUDA ecosystem.

The current line-up of libraries is the following:

- `rustc_codegen_nvvm` Which is a rustc backend that targets NVVM IR (a subset of LLVM IR) for the [libnvvm](https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html) library.
- Generates highly optimized PTX code which can be loaded by the CUDA Driver API to execute on the GPU.
- For the near future it will be CUDA-only, but it may be used to target amdgpu in the future.
- `cuda_std` for GPU-side functions and utilities, such as thread index queries, memory allocation, warp intrinsics, etc.
- _Not_ a low level library, provides many utility functions to make it easier to write cleaner and more reliable GPU kernels.
- Closely tied to `rustc_codegen_nvvm` which exposes GPU features through it internally.
- [`cudnn`](https://github.com/Rust-GPU/rust-cuda/tree/master/crates/cudnn) for a collection of GPU-accelerated primitives for deep neural networks.
- `cust` for CPU-side CUDA features such as launching GPU kernels, GPU memory allocation, device queries, etc.
- High level with features such as RAII and Rust Results that make it easier and cleaner to manage the interface to the GPU.
- A high level wrapper for the CUDA Driver API, the lower level version of the more common CUDA Runtime API used from C++.
- Provides much more fine grained control over things like kernel concurrency and module loading than the C++ Runtime API.
- `gpu_rand` for GPU-friendly random number generation, currently only implements xoroshiro RNGs from `rand_xoshiro`.
- `optix` for CPU-side hardware raytracing and denoising using the CUDA OptiX library.

In addition to many "glue" crates for things such as high level wrappers for certain smaller CUDA libraries.

## Related Projects

Other projects related to using Rust on the GPU:

- 2016: [glassful](https://github.com/kmcallister/glassful) Subset of Rust that compiles to GLSL.
- 2017: [inspirv-rust](https://github.com/msiglreith/inspirv-rust) Experimental Rust MIR -> SPIR-V Compiler.
- 2018: [nvptx](https://github.com/japaric-archived/nvptx) Rust to PTX compiler using the `nvptx` target for rustc (using the LLVM PTX backend).
- 2020: [accel](https://github.com/termoshtt/accel) Higher-level library that relied on the same mechanism that `nvptx` does.
- 2020: [rlsl](https://github.com/MaikKlein/rlsl) Experimental Rust -> SPIR-V compiler (predecessor to rust-gpu)
- 2020: [rust-gpu](https://github.com/Rust-GPU/rust-gpu) `rustc` compiler backend to compile Rust to SPIR-V for use in shaders, similar mechanism as our project.

## Usage
```bash
## setup your environment like:
### export OPTIX_ROOT=/opt/NVIDIA-OptiX-SDK-9.0.0-linux64-x86_64
### export OPTIX_ROOT_DIR=/opt/NVIDIA-OptiX-SDK-9.0.0-linux64-x86_64

## build proj
cargo build
```

## Use Rust CUDA in Container Environments

The distribution related Dockerfile are located in `container` folder.
Taking ubuntu 24.04 as an example, run the following command in repository root:
```bash
docker build -f ./container/ubuntu24-cuda12/Dockerfile -t rust-cuda-ubuntu24 .
docker run --rm --runtime=nvidia --gpus all -it rust-cuda-ubuntu24
```
## Documentation

A sample `.devcontainer.json` file is also included, configured for Ubuntu 24.02. Copy this to `.devcontainer/devcontainer.json` to make additional customizations.
Please see [The Rust CUDA Guide](https://rust-gpu.github.io/rust-cuda/) for documentation on Rust
CUDA.

## License

Licensed under either of

- Apache License, Version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
- Apache License, Version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or
http://www.apache.org/licenses/LICENSE-2.0)
- MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)

at your discretion.

### Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in
the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any
additional terms or conditions.
92 changes: 92 additions & 0 deletions container/rockylinux9-cuda13/Dockerfile
@@ -0,0 +1,92 @@
FROM nvcr.io/nvidia/cuda:13.0.2-cudnn-devel-rockylinux9 AS llvm-builder

RUN dnf -y install \
--nobest \
--allowerasing \
--setopt=install_weak_deps=False \
openssl-devel \
pkgconfig \
which \
xz \
zlib-devel \
libffi-devel \
ncurses-devel \
libxml2-devel \
libedit-devel \
python3 \
make \
cmake && \
dnf clean all

WORKDIR /data/llvm7

# Download and build LLVM 7.1.0 for all architectures.
RUN curl -sSf -L -O https://github.com/llvm/llvm-project/releases/download/llvmorg-7.1.0/llvm-7.1.0.src.tar.xz && \
tar -xf llvm-7.1.0.src.tar.xz && \
cd llvm-7.1.0.src && \
mkdir build && cd build && \
ARCH=$(uname -m) && \
if [ "$ARCH" = "x86_64" ]; then \
TARGETS="X86;NVPTX"; \
else \
TARGETS="AArch64;NVPTX"; \
fi && \
cmake \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_TARGETS_TO_BUILD="$TARGETS" \
-DLLVM_BUILD_LLVM_DYLIB=ON \
-DLLVM_LINK_LLVM_DYLIB=ON \
-DLLVM_ENABLE_ASSERTIONS=OFF \
-DLLVM_ENABLE_BINDINGS=OFF \
-DLLVM_INCLUDE_EXAMPLES=OFF \
-DLLVM_INCLUDE_TESTS=OFF \
-DLLVM_INCLUDE_BENCHMARKS=OFF \
-DLLVM_ENABLE_ZLIB=ON \
-DLLVM_ENABLE_TERMINFO=ON \
-DCMAKE_INSTALL_PREFIX=/opt/llvm-7 \
.. && \
make -j$(nproc) && \
make install && \
cd ../.. && \
rm -rf llvm-7.1.0.src* && \
dnf clean all

FROM nvcr.io/nvidia/cuda:13.0.2-cudnn-devel-rockylinux9

RUN dnf -y install \
--nobest \
--allowerasing \
--setopt=install_weak_deps=False \
clang \
openssl-devel \
fontconfig-devel \
libX11-devel \
libXcursor-devel \
libXi-devel \
libXrandr-devel \
libxml2-devel \
ncurses-devel \
pkgconfig \
which \
xz \
zlib-devel \
cmake && \
dnf clean all

COPY --from=llvm-builder /opt/llvm-7 /opt/llvm-7
RUN ln -s /opt/llvm-7/bin/llvm-config /usr/bin/llvm-config && \
ln -s /opt/llvm-7/bin/llvm-config /usr/bin/llvm-config-7

# Get Rust (install rustup; toolchain installed from rust-toolchain.toml below)
RUN curl -sSf -L https://sh.rustup.rs | bash -s -- -y --profile minimal --default-toolchain none
ENV PATH="/root/.cargo/bin:${PATH}"

# Setup the workspace
WORKDIR /data/rust-cuda
RUN --mount=type=bind,source=rust-toolchain.toml,target=/data/rust-cuda/rust-toolchain.toml \
rustup show

# Add nvvm to LD_LIBRARY_PATH.
ENV LD_LIBRARY_PATH="/usr/local/cuda/nvvm/lib64:${LD_LIBRARY_PATH}"
ENV LLVM_LINK_STATIC=1
ENV RUST_LOG=info
89 changes: 89 additions & 0 deletions container/ubuntu24-cuda13/Dockerfile
@@ -0,0 +1,89 @@
FROM nvcr.io/nvidia/cuda:13.0.2-cudnn-devel-ubuntu24.04 AS llvm-builder

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -qq -y install \
build-essential \
clang \
curl \
libffi-dev \
libedit-dev \
libncurses5-dev \
libssl-dev \
libtinfo-dev \
libxml2-dev \
cmake \
ninja-build \
pkg-config \
python3 \
xz-utils \
zlib1g-dev && \
rm -rf /var/lib/apt/lists/*

WORKDIR /data/llvm7

# Download and build LLVM 7.1.0 for all architectures.
RUN curl -sSf -L -O https://github.com/llvm/llvm-project/releases/download/llvmorg-7.1.0/llvm-7.1.0.src.tar.xz && \
tar -xf llvm-7.1.0.src.tar.xz && \
cd llvm-7.1.0.src && \
mkdir build && cd build && \
ARCH=$(dpkg --print-architecture) && \
if [ "$ARCH" = "amd64" ]; then \
TARGETS="X86;NVPTX"; \
else \
TARGETS="AArch64;NVPTX"; \
fi && \
cmake -G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_TARGETS_TO_BUILD="$TARGETS" \
-DLLVM_BUILD_LLVM_DYLIB=ON \
-DLLVM_LINK_LLVM_DYLIB=ON \
-DLLVM_ENABLE_ASSERTIONS=OFF \
-DLLVM_ENABLE_BINDINGS=OFF \
-DLLVM_INCLUDE_EXAMPLES=OFF \
-DLLVM_INCLUDE_TESTS=OFF \
-DLLVM_INCLUDE_BENCHMARKS=OFF \
-DLLVM_ENABLE_ZLIB=ON \
-DLLVM_ENABLE_TERMINFO=ON \
-DCMAKE_INSTALL_PREFIX=/opt/llvm-7 \
.. && \
ninja -j$(nproc) && \
ninja install && \
cd ../.. && \
rm -rf llvm-7.1.0.src*

FROM nvcr.io/nvidia/cuda:13.0.2-cudnn-devel-ubuntu24.04

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -qq -y install \
build-essential \
clang \
curl \
libssl-dev \
libtinfo-dev \
pkg-config \
xz-utils \
zlib1g-dev \
cmake \
libfontconfig-dev \
libx11-xcb-dev \
libxcursor-dev \
libxi-dev \
libxinerama-dev \
libxrandr-dev && \
rm -rf /var/lib/apt/lists/*

COPY --from=llvm-builder /opt/llvm-7 /opt/llvm-7
RUN ln -s /opt/llvm-7/bin/llvm-config /usr/bin/llvm-config && \
ln -s /opt/llvm-7/bin/llvm-config /usr/bin/llvm-config-7

# Get Rust (install rustup; toolchain installed from rust-toolchain.toml below)
RUN curl -sSf -L https://sh.rustup.rs | bash -s -- -y --profile minimal --default-toolchain none
ENV PATH="/root/.cargo/bin:${PATH}"

# Setup the workspace
WORKDIR /data/rust-cuda
RUN --mount=type=bind,source=rust-toolchain.toml,target=/data/rust-cuda/rust-toolchain.toml \
rustup show

# Add nvvm to LD_LIBRARY_PATH.
ENV LD_LIBRARY_PATH="/usr/local/cuda/nvvm/lib64:${LD_LIBRARY_PATH}"
ENV LLVM_LINK_STATIC=1
ENV RUST_LOG=info
6 changes: 4 additions & 2 deletions crates/nvvm/src/lib.rs
@@ -312,8 +312,10 @@ pub enum NvvmArch {
/// This default value of 7.5 corresponds to Turing and later devices. We default to this
/// because it is the minimum supported by CUDA 13.0 while being in the middle of the range
/// supported by CUDA 12.x.
// WARNING: If you change the default, consider updating the `--target-arch` values used for
// compiletests in `ci_linux.yml` and `.github/workflows/ci_{linux,windows}.yml`.
// WARNING: If you change the default, consider updating:
// - The `--target-arch` values used for compiletests in `ci_linux.yml` and
// `.github/workflows/ci_{linux,windows}.yml`.
// - The CUDA versions used in `setup_cuda_environment` in `compiletests`.
#[default]
Compute75,
Compute80,
3 changes: 0 additions & 3 deletions guide/src/README.md

This file was deleted.

10 changes: 7 additions & 3 deletions guide/src/SUMMARY.md
@@ -1,8 +1,6 @@
# Summary

- [Introduction](README.md)
- [Supported Features](features.md)
- [Frequently Asked Questions](faq.md)
- [Introduction](introduction.md)
- [Guide](guide/README.md)
- [Getting Started](guide/getting_started.md)
- [Compute Capability Gating](guide/compute_capabilities.md)
@@ -18,3 +16,9 @@
- [Types](nvvm/types.md)
- [PTX Generation](nvvm/ptxgen.md)
- [Debugging](nvvm/debugging.md)

----

[Supported Features](features.md)
[Frequently Asked Questions](faq.md)

File renamed without changes
File renamed without changes