Deployment Guide: Running LandmarkDiff on your own GPU #231
dreamlessx asked this question in Q&A
A quick guide for deploying LandmarkDiff locally with GPU support. This covers Docker, pip install, and common troubleshooting.
Option 1: Docker (recommended)
The easiest path if you have the NVIDIA Container Toolkit installed.
For training:
Requires Docker 24+ and NVIDIA Container Toolkit. Minimum 6 GB VRAM for inference, 25 GB for training.
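If you want to confirm the Docker 24+ requirement before pulling anything, a small preflight check along these lines may help. This is only a sketch: the function names are made up for illustration, it assumes `docker --version` prints its usual "Docker version X.Y.Z, build ..." line, and it does not verify that the NVIDIA Container Toolkit is installed.

```python
import re
import shutil
import subprocess

MIN_DOCKER_MAJOR = 24  # from the requirement stated in this guide


def docker_major(version_output: str):
    """Return the major version parsed from `docker --version` output, or None."""
    match = re.search(r"version\s+(\d+)\.", version_output)
    return int(match.group(1)) if match else None


def docker_ok() -> bool:
    """True if a Docker CLI is on PATH and reports a major version >= 24."""
    if shutil.which("docker") is None:
        return False
    result = subprocess.run(
        ["docker", "--version"], capture_output=True, text=True, check=False
    )
    major = docker_major(result.stdout)
    return major is not None and major >= MIN_DOCKER_MAJOR


if __name__ == "__main__":
    print("Docker 24+ found" if docker_ok() else "Docker 24+ not found")
```

The parsing is kept separate from the subprocess call so it can be checked against a version string without Docker being installed.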
Option 2: pip install
Option 3: HPC with Apptainer/Singularity
For clusters that do not allow Docker (most HPC environments):
See GPU_TRAINING_GUIDE.md for SLURM job scripts and multi-node configurations.
GPU requirements
TPS mode runs on CPU only and needs no GPU at all.
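To see where a given machine falls against the figures in this guide (6 GB VRAM for inference, 25 GB for training, none for TPS), something like the following could be used. It is a rough sketch: the function names and the "inference"/"training" labels are illustrative rather than LandmarkDiff options, and it reads only the first GPU that `nvidia-smi` lists. Note that `nvidia-smi` reports MiB, so a card can show slightly less than its nominal size.

```python
import shutil
import subprocess

INFERENCE_MIN_GB = 6   # minimum VRAM for inference, per this guide
TRAINING_MIN_GB = 25   # minimum VRAM for training, per this guide


def supported_workloads(vram_gb):
    """Map total VRAM in GB (None means no GPU) to the workloads that fit."""
    workloads = ["tps"]  # TPS mode is CPU-only, so it is always available
    if vram_gb is not None and vram_gb >= INFERENCE_MIN_GB:
        workloads.append("inference")
    if vram_gb is not None and vram_gb >= TRAINING_MIN_GB:
        workloads.append("training")
    return workloads


def detect_vram_gb():
    """Return total memory of the first GPU in GiB via nvidia-smi, or None."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None
    try:
        return float(lines[0]) / 1024  # nvidia-smi reports MiB
    except ValueError:
        return None


if __name__ == "__main__":
    print(supported_workloads(detect_vram_gb()))
```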
Common issues
CUDA out of memory: Lower num_inference_steps to 20 or use --mode tps for CPU-only. You can also enable CPU offloading via pipeline.enable_model_cpu_offload().
MediaPipe fails to detect face: Ensure the input image has a clearly visible face at reasonable resolution (512x512 minimum). Extreme angles or heavy occlusion can cause detection failure.
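Models not downloading: On first run, LandmarkDiff downloads ~6 GB of model weights from Hugging Face. If you are behind a firewall, pre-download the models with huggingface-cli download while HF_HUB_OFFLINE=0 (the default, which allows network access). Once the weights are in the local cache, setting HF_HUB_OFFLINE=1 stops the Hugging Face libraries from attempting network requests.
Share your deployment setup or ask questions below.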
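For the CUDA out-of-memory case under Common issues, the "lower num_inference_steps" advice can be automated with a small retry wrapper. This is a sketch under assumptions: `generate` is a hypothetical callable you write around your own pipeline call, the starting schedule of 50 and 30 steps is arbitrary, and out-of-memory errors are detected heuristically by looking for "out of memory" in a RuntimeError message, which is how PyTorch normally reports them.

```python
def is_cuda_oom(exc: BaseException) -> bool:
    """Heuristic: treat a RuntimeError mentioning 'out of memory' as a CUDA OOM."""
    return isinstance(exc, RuntimeError) and "out of memory" in str(exc).lower()


def generate_with_fallback(generate, step_schedule=(50, 30, 20)):
    """Call generate(num_inference_steps=n) for each n until one succeeds.

    `generate` is a hypothetical wrapper around your pipeline call.
    Returns (result, steps_used). Non-OOM errors are re-raised immediately;
    if every attempt runs out of memory, the last OOM error is re-raised.
    """
    if not step_schedule:
        raise ValueError("step_schedule must contain at least one value")
    last_error = None
    for steps in step_schedule:
        try:
            return generate(num_inference_steps=steps), steps
        except RuntimeError as exc:
            if not is_cuda_oom(exc):
                raise
            last_error = exc  # remember the failure and try fewer steps
    raise last_error
```

If even 20 steps fails, the remaining options from the list above still apply: pipeline.enable_model_cpu_offload() or --mode tps.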