TRELLIS-AMD

TRELLIS running on AMD GPUs with ROCm - Image to 3D Asset Generation

This is a fork of Microsoft TRELLIS modified to run on AMD consumer GPUs (tested on RX 7800 XT with ROCm 7.2.1, torch 2.10.0+rocm7.0).

Status (May 2026): Fully operational on RX 7800 XT. Tested end-to-end: image → 3D asset → textured GLB, including mesh rendering, hole filling, and texture baking on a 16 GB consumer card. The multi-month rasterizer investigation that unblocked this is documented in experiments/raster/.

Features

Feature	Status	Timing
✅ 3D Model Generation	Working	~45 seconds
✅ Gaussian Splatting	Working (145+ it/s)	~30 seconds
✅ Gaussian Export (.ply)	Working	Instant
✅ Mesh Extraction	Working	~60 seconds
✅ GLB Export with Textures	Working	5-10 minutes

⚠️ GLB Export Takes 5-10 Minutes: This is normal! The console will show progress through 5 steps. Your system will be under heavy load during texture baking - this is expected.

Requirements

AMD GPU (tested: RX 7800 XT, RDNA3 / gfx1101)
ROCm 7.0+ (tested on system ROCm 7.2.1, torch 2.10.0+rocm7.0)
Python 3.10+
16 GB VRAM (the pipeline is split into staged phases to fit; see example.py)
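
A quick way to confirm the PyTorch requirement is to inspect the installed version string. The sketch below is not part of this repository: the helper names (`parse_torch_version`, `is_rocm_build`, `describe_local_torch`) are hypothetical, and it assumes ROCm wheels carry a `+rocmX.Y` local tag, as the tested `2.10.0+rocm7.0` build does.

```python
import re


def parse_torch_version(version: str):
    """Split a PyTorch version string into (release tuple, local build tag)."""
    release, _, local = version.partition("+")
    numbers = tuple(int(part) for part in re.findall(r"\d+", release)[:3])
    return numbers, local


def is_rocm_build(version: str) -> bool:
    """True when the local build tag marks a ROCm wheel, e.g. '+rocm7.0'."""
    return parse_torch_version(version)[1].startswith("rocm")


def describe_local_torch() -> str:
    """Summarize the installed PyTorch build; torch is imported lazily."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    summary = f"torch {torch.__version__} (ROCm build: {is_rocm_build(torch.__version__)})"
    # ROCm devices are exposed through the torch.cuda API.
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        summary += f", {props.name}, {props.total_memory / 2**30:.1f} GiB VRAM"
    return summary
```

Running `print(describe_local_torch())` inside the activated environment should report a ROCm build and the detected GPU. A 16 GB card typically reports slightly less than 16 GiB.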

Quick Start

Install libsparsehash-dev (required for building torchsparse):

Ubuntu/Debian:

sudo apt-get install libsparsehash-dev

Fedora:

sudo dnf install sparsehash-devel

Arch Linux:

sudo pacman -S google-sparsehash

# Clone the repository
git clone https://github.com/CalebisGross/TRELLIS-AMD
cd TRELLIS-AMD

# Run the installation script
chmod +x install_amd.sh
./install_amd.sh

# Activate environment and run
source .venv/bin/activate
ATTN_BACKEND=sdpa XFORMERS_DISABLED=1 SPARSE_BACKEND=torchsparse \
  TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1 python app.py

Then open http://localhost:7860 in your browser.
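
If you prefer to launch from Python rather than typing the environment variables each time, a small wrapper can set them for you. This is a sketch, not a script shipped in this repository: `build_env` and `launch` are hypothetical names, and it assumes it is run from the repository root with the virtual environment active.

```python
import os
import subprocess
import sys

# The four variables from the shell command above.
TRELLIS_AMD_ENV = {
    "ATTN_BACKEND": "sdpa",
    "XFORMERS_DISABLED": "1",
    "SPARSE_BACKEND": "torchsparse",
    "TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL": "1",
}


def build_env(base=None):
    """Return a copy of the environment with the AMD settings applied."""
    env = dict(os.environ if base is None else base)
    env.update(TRELLIS_AMD_ENV)
    return env


def launch(script="app.py"):
    """Run `script` with the current interpreter and the AMD environment."""
    result = subprocess.run([sys.executable, script], env=build_env(), check=False)
    return result.returncode
```

Calling `launch()` starts the app in a child process, so the variables are in place before any library is imported.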

What's Different from Original TRELLIS?

Custom Extensions (AMD-compatible)

Extension	Modification
nvdiffrast-hip	AMD-safe coarse rasterizer, HIP warp intrinsic macros
diff-gaussian-rasterization	Manual HIP build script, buffer initialization fixes
torchsparse	Built with `FORCE_CUDA=1` for HIP GPU backend

Application Modifications

HIP rasterizer (CoarseRaster + FineRaster) bounds-check fix for the out-of-bounds triHeader[i].misc access on RDNA3; see experiments/raster/findings.md
_fill_holes clamps the Hammersley camera distribution away from the poles, so views directly above or below the mesh no longer produce NaNs in view_look_at and hang coarseRaster (the default num_views was also reduced from 1000 to 100 for speed)
Added progress logging for GLB export
example.py splits the pipeline into staged phases, moving idle submodels to CPU between phases so the full run fits in 16 GB VRAM
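
The staged-phase idea in the last item can be sketched as a context manager that keeps a submodel on the GPU only while its phase runs. This illustrates the pattern and is not the code in example.py: `staged`, `run_phases`, and the `FakeModule` stand-in are hypothetical, and the stand-in exists only so the sketch runs without PyTorch.

```python
from contextlib import contextmanager


@contextmanager
def staged(module, device="cuda", idle_device="cpu"):
    """Move `module` to `device` for one phase, then park it on `idle_device`."""
    module.to(device)
    try:
        yield module
    finally:
        module.to(idle_device)
        # With real PyTorch modules, torch.cuda.empty_cache() could follow here
        # to return the freed blocks to the driver.


class FakeModule:
    """Stand-in for torch.nn.Module: only tracks which device it is on."""

    def __init__(self, name):
        self.name, self.device = name, "cpu"

    def to(self, device):
        self.device = device
        return self


def run_phases(modules):
    """Run each phase with only its own module resident on the GPU."""
    trace = []
    for module in modules:
        with staged(module) as active:
            trace.append((active.name, active.device))
    return trace
```

Because the move back to the CPU happens in a `finally` block, a phase that raises still releases its VRAM before the error propagates.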

Processing Time Reference

Operation	Expected Time	Notes
3D Generation (Sampling)	~45s	12 steps of diffusion
Gaussian Export	Instant	Saves .ply file
GLB Export	5-10 min	Heavy CPU+GPU load is normal

The GLB export shows progress in console:

[GLB Export] Starting GLB extraction (this takes 5-10 minutes)...
[GLB Export] Step 1/5: Mesh postprocessing...
[GLB Export] Step 2/5: UV parametrization...
[GLB Export] Step 3/5: Rendering multiview observations (100 views)...
[GLB Export] Step 4/5: Baking texture (2500 optimization steps)...
[GLB Export] Step 5/5: Finalizing GLB mesh...
[GLB Export] Complete!
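
For reference, messages in this format can be produced with a few lines of Python. This is a sketch of the logging pattern, not the repository's implementation: `GLB_EXPORT_STEPS`, `format_step`, and `progress_lines` are hypothetical names, and the step labels are copied from the output above.

```python
GLB_EXPORT_STEPS = [
    "Mesh postprocessing",
    "UV parametrization",
    "Rendering multiview observations (100 views)",
    "Baking texture (2500 optimization steps)",
    "Finalizing GLB mesh",
]


def format_step(index, label, total, prefix="[GLB Export]"):
    """Format one progress line, e.g. '[GLB Export] Step 1/5: ...'."""
    return f"{prefix} Step {index}/{total}: {label}..."


def progress_lines(steps=GLB_EXPORT_STEPS):
    """Return the full sequence of console messages for one export."""
    lines = ["[GLB Export] Starting GLB extraction (this takes 5-10 minutes)..."]
    lines += [format_step(i, label, len(steps)) for i, label in enumerate(steps, start=1)]
    lines.append("[GLB Export] Complete!")
    return lines
```

In a real export, each line would be printed just before its step starts, so a long pause after a message tells you which step is running.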

Known Limitations

Performance: Coarse rasterizer is serialized and slower than NVIDIA's warp-parallel version
~7% silent triangle culls: The Bug 6 bounds-check fix culls triangles with an out-of-range triHeader[i].misc from triangleSetup. Visual impact is small but the underlying invariant violation is unresolved. See experiments/raster/findings.md for the Phase C root-cause hypothesis.
fill_holes uses 100 views, not 1000: Upstream TRELLIS rasterizes 1000 Hammersley-distributed views to detect invisible faces. This fork clamps views away from the world-up poles (otherwise the HIP rasterizer hangs on degenerate view matrices) and uses 100 views. Hole detection quality is visually indistinguishable from the 1000-view result, and step 1 of the GLB export is now ~10x faster.
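
The pole clamp can be sketched in plain Python. This is an illustration, not the code in this repository: the function names, the base-2 radical inverse used for the azimuth, and the 5° margin are assumptions. In this sketch, sample 0 lands exactly on the pole, where the view direction is parallel to the up vector and a look-at matrix degenerates; the clamp pulls it back.

```python
import math


def radical_inverse(index: int, base: int = 2) -> float:
    """Van der Corput radical inverse of `index` in the given base."""
    value, scale = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * scale
        scale /= base
    return value


def clamped_hammersley_views(num_views: int = 100, pole_margin_deg: float = 5.0):
    """Return (yaw, pitch) pairs in radians, kept away from the poles."""
    max_pitch = math.radians(90.0 - pole_margin_deg)
    views = []
    for i in range(num_views):
        u = i / num_views               # first Hammersley coordinate
        v = radical_inverse(i)          # second Hammersley coordinate
        pitch = math.asin(1.0 - 2.0 * u)  # elevation; u = 0 gives exactly +90 degrees
        yaw = 2.0 * math.pi * v
        # Clamp so no camera looks straight down or up the world-up axis.
        pitch = max(-max_pitch, min(max_pitch, pitch))
        views.append((yaw, pitch))
    return views
```

With the default arguments only the first sample is affected by the clamp; all other views keep their original elevation.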

Troubleshooting

GPU Hang/Crash

Ensure you're using ROCm 7.0+ and PyTorch built for ROCm (torch 2.10.0+rocm7.0 or newer is recommended).

Empty Mesh

Confirm the input image actually has a foreground subject after rembg background removal. If it does, move the Mesh Simplify slider toward 0 in the UI to keep more triangles, or pass simplify=0.0 to to_glb().
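
One way to check for a foreground subject is to measure how much of the alpha channel is non-transparent after background removal. The sketch below is a generic check, not something this repository provides: `foreground_fraction`, `has_foreground`, and the 1% cutoff are assumptions, and the input is a plain list of rows of alpha values (0-255).

```python
def foreground_fraction(alpha_rows, threshold=0):
    """Fraction of pixels whose alpha value is above `threshold`."""
    total = sum(len(row) for row in alpha_rows)
    if total == 0:
        return 0.0
    opaque = sum(1 for row in alpha_rows for alpha in row if alpha > threshold)
    return opaque / total


def has_foreground(alpha_rows, min_fraction=0.01):
    """True when at least `min_fraction` of the image is non-transparent."""
    return foreground_fraction(alpha_rows) >= min_fraction
```

If `has_foreground` returns False for the alpha channel of the processed image, the background removal has likely erased the subject, and changing mesh settings will not help.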

CUDA Symbol Errors

Make sure you're using the AMD-modified extensions in this repo, not the original CUDA ones.

torchsparse "no attribute" Error

Rebuild torchsparse against ROCm:

cd extensions/torchsparse && CUDA_HOME=/opt/rocm FORCE_CUDA=1 pip install . --no-build-isolation

Credits

Original TRELLIS by Microsoft
nvdiffrast by NVIDIA
AMD GPU modifications developed through extensive debugging of HIP compatibility issues

License

See original licenses for TRELLIS, nvdiffrast, and diff-gaussian-rasterization.
