
Jetson-Example: Run GPT-OSS 20B on NVIDIA Jetson

This project provides one-click deployment for GPT-OSS 20B on NVIDIA Jetson devices. It uses the prebuilt Docker image:

chenduola6/got-oss-20b:jp6

Docker image size: 31.28 GB

Hardware Requirements

  • NVIDIA Jetson device with at least 16 GB of memory (Jetson uses shared CPU/GPU memory)
  • At least 50GB available disk space

Supported JetPack/L4T versions:

  • JetPack 6.1 -> L4T 36.4.0
  • JetPack 6.2 -> L4T 36.4.3
  • JetPack 6.2.1 -> L4T 36.4.4

[GPT-OSS demo screenshot]

Getting Started

Installation

PyPI (recommended):

pip install jetson-examples

GitHub (developer):

git clone https://github.com/Seeed-Projects/jetson-examples
cd jetson-examples
pip install .

Usage

One-line deployment

reComputer run gpt-oss

This command pulls the image and starts llama-server in a detached container. The script waits for /v1/models to become ready before exiting.

Note: The script auto-detects the available GPU run mode on your Jetson (--runtime nvidia or --gpus all).

Note: If prompted by the script, allow adding your user to the docker group so future runs do not require sudo docker. After adding the group, log out and log back in once.
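If the script does not prompt you, the group membership can be added manually with the standard Docker post-install command:

```shell
# add the current user to the docker group so docker works without sudo
sudo usermod -aG docker "$USER"
# log out and log back in (or start a new login shell) for the change to apply
```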

Note: If curl /v1/models returns 503 {"message":"Loading model"}, the model is still loading. First startup can take several minutes.
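The wait for readiness can be automated with a small polling loop. This is a sketch: the `wait_for_server` helper name is ours, and the URL, attempt count, and delay are assumptions matching the defaults above.

```shell
# poll the models endpoint until the server stops returning errors (e.g. the
# 503 "Loading model" response), or give up after tries * delay seconds
wait_for_server() {
  url=${1:-http://127.0.0.1:8080/v1/models}
  tries=${2:-60}   # number of attempts
  delay=${3:-10}   # seconds between attempts
  i=0
  while [ "$i" -lt "$tries" ]; do
    # curl -f exits nonzero on HTTP errors such as 503, so a success
    # here means the model has finished loading
    if curl -sf "$url" >/dev/null 2>&1; then
      echo "server ready"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "server not ready after $((tries * delay))s" >&2
  return 1
}
```

Run it as `wait_for_server` after `reComputer run gpt-oss` returns.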

Note: If startup fails because of memory pressure, add swap space and try again:

sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
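The swap file above lasts only until reboot. To keep it across reboots, the usual pattern is an /etc/fstab entry (shown as a sketch; adjust the path if you used a different one):

```shell
# persist the swap file across reboots
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
# confirm the swap is active
swapon --show
```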

You can lower memory usage by shrinking the context window (LLAMA_CTX) and offloading fewer layers to the GPU (LLAMA_NGL) when launching:

LLAMA_CTX=512 LLAMA_NGL=16 reComputer run gpt-oss

Verify service

curl http://127.0.0.1:8080/v1/models
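Beyond listing models, you can exercise the OpenAI-compatible chat endpoint that llama-server exposes. The `chat` helper below is our own illustration, not part of the project's tooling:

```shell
# send a minimal chat completion request to the local llama-server (sketch)
chat() {
  prompt=$1
  curl -s http://127.0.0.1:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d "{\"messages\":[{\"role\":\"user\",\"content\":\"$prompt\"}],\"max_tokens\":64}"
}
```

Example: `chat "Say hello in one sentence."` prints a JSON response with a `choices` array.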

Check logs

docker logs -f gpt-oss

Manual Deployment (inside Docker)

docker pull chenduola6/got-oss-20b:jp6

docker run -it --rm \
  --runtime nvidia \
  --network host \
  --ipc=host \
  chenduola6/got-oss-20b:jp6

# inside the container
cd /root/gpt-oss/llama.cpp

./build/bin/llama-server \
  -m /root/gpt-oss/gguf/gpt-oss-20b-Q4_K.gguf \
  -ngl 20 -c 1024 \
  --host 0.0.0.0 --port 8080

Cleanup

Remove only the container (the image cache is kept):

reComputer clean gpt-oss
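To also reclaim the roughly 31 GB the image occupies, you can remove it with plain docker commands. The `full_clean` helper is our own sketch; the container name gpt-oss matches the logs command above:

```shell
# remove the stopped container (if any), then delete the cached image
full_clean() {
  docker rm -f gpt-oss 2>/dev/null || true
  docker rmi chenduola6/got-oss-20b:jp6
}
```

Run `full_clean` only if you do not plan to redeploy, since the next run will re-pull the full image.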

References