🚒 CUDA & PyTorch

🧸 Prerequisites

gcc --version
conda --version

🚗 Nvidia Driver

nvidia-smi # skip this step if installed

Check the recommended drivers:

sudo ubuntu-drivers devices

Install a specific driver (based on the recommendation):

sudo apt install -y nvidia-driver-550
sudo reboot

# With Secure Boot enabled, you may need to enroll a MOK (Machine Owner Key) to permit the third-party driver
# If anything goes wrong, remove all NVIDIA drivers and reinstall:
# sudo apt purge 'nvidia-*'
# sudo apt autoremove

nvidia-smi

The CUDA version shown in the nvidia-smi output is the maximum version the driver supports, not the version of any installed toolkit.
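If you want that maximum version in a script rather than reading it off the table, the header of the nvidia-smi output can be parsed. The following is a minimal sketch: the function name `driver_max_cuda` is hypothetical, and it assumes the header contains the text `CUDA Version: X.Y`.

```shell
# Hypothetical helper: read nvidia-smi output on stdin and print the
# "CUDA Version: X.Y" field from its header (first match only).
# Assumption: the header line contains the literal text "CUDA Version: X.Y".
driver_max_cuda() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage on a machine with the driver installed:
#   nvidia-smi | driver_max_cuda
```

The function reads from stdin instead of calling nvidia-smi itself, so it can be tried on saved output as well.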

🔧 CUDA Toolkit

nvcc --version # skip this step if installed

Since we want to use PyTorch, first check here which CUDA versions the latest PyTorch release (or whichever release you want) supports. Then download the matching CUDA Toolkit here; we use CUDA 12.6. We recommend the runfile installer. When it asks which components to install, deselect the driver so that it is not installed again.

wget https://developer.download.nvidia.com/compute/cuda/12.6.3/local_installers/cuda_12.6.3_560.35.05_linux.run
sudo sh cuda_12.6.3_560.35.05_linux.run

Add the CUDA paths for the current shell session (to make this permanent, add them to your shell configuration file as described under Switch CUDA below).

export CUDA_HOME=/usr/local/cuda-12.6
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH

nvcc --version

The CUDA version that nvcc reports is the toolkit version currently in use.
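If nvcc reports an unexpected version, a common cause is that a different nvcc is found first on PATH. As a quick check, the sketch below succeeds only when the nvcc being resolved lives under `$CUDA_HOME/bin`. The function name `cuda_path_ok` is hypothetical and not part of any CUDA tooling.

```shell
# Hypothetical helper: return 0 only if the nvcc found on PATH is the one
# under $CUDA_HOME/bin, i.e. the exports above took effect in this shell.
cuda_path_ok() {
    local nvcc_path
    nvcc_path=$(command -v nvcc) || return 1   # nvcc not on PATH at all
    [ "$nvcc_path" = "$CUDA_HOME/bin/nvcc" ]
}

# Usage:
#   cuda_path_ok && echo "nvcc comes from $CUDA_HOME" || echo "nvcc mismatch"
```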

🧱 Multi-version CUDA

As an example, we install CUDA 11.8 alongside the version above. Again, we recommend the runfile installer; deselect the driver so that it is not installed again.

wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
sudo sh cuda_11.8.0_520.61.05_linux.run

Then switch to it by exporting its paths:

# CUDA 11.8
export CUDA_HOME=/usr/local/cuda-11.8
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH

Switch CUDA

For convenience, we define shell functions that switch the CUDA version. Put the following script in your shell configuration file (by default vim ~/.bashrc; we use vim ~/.zshrc).

# Default CUDA version
export CUDA_DEFAULT_VERSION=12.6

# Function to list available CUDA versions
cuda_list() {
    ls /usr/local | grep -oP 'cuda-\K[\d.]+'
}

# Function to switch CUDA versions (with validation)
cuda_switch() {
    if [ -z "$1" ]; then
        echo "Please specify a version: cuda_switch <VERSION>"
        return 1
    fi

    # Check if the requested CUDA version is available
    # (-x matches the whole line, -F treats the dots in the version literally)
    if ! cuda_list | grep -qxF "$1"; then
        echo "Error: CUDA $1 is not installed. Available versions:"
        cuda_list
        return 1
    fi

    export CUDA_HOME="/usr/local/cuda-$1"
    export PATH="$CUDA_HOME/bin:$PATH"
    export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"

    # Print message only if NOT running in silent mode
    if [ "$2" != "silent" ]; then
        echo "Switched to CUDA $1"
    fi
}

# Set default CUDA version at startup
cuda_switch "$CUDA_DEFAULT_VERSION" silent

Restart the shell, or reload the configuration with source ~/.bashrc (or source ~/.zshrc if you use zsh). This adds two functions:

  • cuda_list: Print available CUDA versions in /usr/local/.
  • cuda_switch <VERSION>: Switch to CUDA <VERSION>.
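One side effect worth knowing: each call to cuda_switch prepends new entries to PATH and LD_LIBRARY_PATH without removing the ones from the previously selected version. The newest entry still wins, but the lists grow with every switch. A minimal sketch of a cleanup helper follows; `strip_cuda_paths` is a hypothetical name and not part of the script above, and it assumes all toolkits live under /usr/local/cuda-<version>.

```shell
# Hypothetical helper: print the colon-separated list given as $1 with every
# entry that starts with /usr/local/cuda- removed. Other entries keep their order.
# Assumption: toolkits are installed under /usr/local/cuda-<version>.
strip_cuda_paths() {
    printf '%s' "$1" | tr ':' '\n' | grep -v '^/usr/local/cuda-' | paste -sd ':' -
}

# Possible use inside cuda_switch, replacing the two plain exports:
#   export PATH="$CUDA_HOME/bin:$(strip_cuda_paths "$PATH")"
#   export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$(strip_cuda_paths "$LD_LIBRARY_PATH")"
```

Entries that point at an unversioned /usr/local/cuda symlink are left untouched by this pattern.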

🔥 PyTorch

We test it in the conda base environment.

conda activate base

Install a CUDA-enabled PyTorch build from here. If you followed the instructions above, install the CUDA 12.6 build.

# CUDA 12.6 
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
# CUDA 11.8
# pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
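The two index URLs above follow a visible pattern: the CUDA version with the dot removed, prefixed by `cu`. The sketch below builds the URL from a version string. The function name `torch_index_url` is hypothetical, and the pattern is only inferred from the two commands above; PyTorch does not publish wheels for every CUDA release, so a generated URL may not exist.

```shell
# Hypothetical helper: map a CUDA version such as 12.6 to the wheel index URL
# pattern used above (12.6 -> cu126). Does not check that the URL exists.
torch_index_url() {
    local tag
    tag=$(printf '%s' "$1" | tr -d '.')
    printf 'https://download.pytorch.org/whl/cu%s\n' "$tag"
}

# Usage:
#   pip3 install torch torchvision torchaudio --index-url "$(torch_index_url 12.6)"
```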

Run python and check that the installation succeeded.

import torch
print(torch.version.cuda) # 12.6
print(torch.cuda.is_available()) # True
print(torch.cuda.get_device_name(0))