Commit 0452fca

shadeform committed
fix(nim): use nvcr.io/nim/nvidia NIM image path and v1.5 model defaults
- Point llm-nim compose image to nvcr.io/nim/nvidia/llama-3.3-nemotron-super-49b-v1.5:latest
- Align DEPLOYMENT.md self-host docker example with v1.5 and nim org path
- Refresh setup notebook: disk checks, compose injection, nvcr login docs, deployment_strategy copy, device_ids/compose flags

Made-with: Cursor
1 parent 757cea5 commit 0452fca

3 files changed

Lines changed: 560 additions & 75 deletions

DEPLOYMENT.md

Lines changed: 1 addition & 1 deletion
@@ -304,7 +304,7 @@ Deploy NIMs on your own infrastructure for data privacy, cost control, and custo
 ```bash
 # Example: Deploy LLM NIM on port 8000
 docker run --gpus all -p 8000:8000 \
-  nvcr.io/nvidia/nim/llama-3.3-nemotron-super-49b:latest
+  nvcr.io/nim/nvidia/llama-3.3-nemotron-super-49b-v1.5:latest
 
 # Example: Deploy Embedding NIM on port 8001
 docker run --gpus all -p 8001:8001 \
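
The hunk above moves the `nim` segment ahead of the org name in the image path. As a rough illustration, a reference could be checked against the new layout as sketched below. The helper name `is_nim_org_path` and the regular expression are assumptions made for this example and are not part of the repository; the pattern encodes only the layout visible in this diff (`nvcr.io/nim/<org>/<model>:<tag>`).

```python
import re

# Assumed layout, inferred only from the image path in this diff:
#   nvcr.io/nim/<org>/<model>:<tag>
_NIM_IMAGE = re.compile(
    r"^nvcr\.io/nim/(?P<org>[a-z0-9._-]+)/(?P<model>[a-z0-9._-]+):(?P<tag>[\w.-]+)$"
)


def is_nim_org_path(image: str) -> bool:
    """Return True if image follows the nvcr.io/nim/<org>/<model>:<tag> layout."""
    return _NIM_IMAGE.match(image) is not None


# The two references from the DEPLOYMENT.md hunk.
old = "nvcr.io/nvidia/nim/llama-3.3-nemotron-super-49b:latest"
new = "nvcr.io/nim/nvidia/llama-3.3-nemotron-super-49b-v1.5:latest"

print(is_nim_org_path(old))  # False: "nvidia" comes before "nim"
print(is_nim_org_path(new))  # True
```

A check like this could run over docs and compose files to catch leftover references in the old order.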

deploy/compose/docker-compose.dev.yaml

Lines changed: 33 additions & 0 deletions
@@ -158,5 +158,38 @@ services:
       - backend
     restart: unless-stopped
 
+  llm-nim:
+    image: nvcr.io/nim/nvidia/llama-3.3-nemotron-super-49b-v1.5:latest
+    container_name: wosa-llm-nim
+    ports:
+      - "${LLM_NIM_PORT:-8000}:8000"
+    environment:
+      - NVIDIA_API_KEY=${NVIDIA_API_KEY:-}
+      - CUDA_VISIBLE_DEVICES=0,1,2,3
+      - NVIDIA_VISIBLE_DEVICES=all
+      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              device_ids:
+                - "0"
+                - "1"
+                - "2"
+                - "3"
+              capabilities: [gpu]
+    volumes:
+      - nim_cache:/root/.cache
+      - nim_models:/models
+    restart: unless-stopped
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+      start_period: 180s
 volumes:
   kafka_data:
+  nim_cache:
+  nim_models:
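
The healthcheck in the compose hunk above probes `http://localhost:8000/health` every 30s with a 10s timeout. A client that needs to wait for the NIM before sending requests could poll the same endpoint. The following is a minimal sketch, assuming the default host port 8000 (that is, `LLM_NIM_PORT` left unset). The names `http_ok` and `wait_until_healthy`, the default of 10 attempts, and the injectable `probe` and `sleep` parameters are hypothetical and do not come from the repository.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 10.0) -> bool:
    """Return True if a GET on url succeeds, roughly mirroring `curl -f`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:
        # URLError, HTTPError (status >= 400) and socket timeouts
        # are all OSError subclasses.
        return False


def wait_until_healthy(
    url: str = "http://localhost:8000/health",  # assumes default LLM_NIM_PORT
    attempts: int = 10,                         # hypothetical default
    interval: float = 30.0,                     # matches the compose interval
    probe: Callable[[str], bool] = http_ok,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll url up to `attempts` times, sleeping `interval` seconds between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(interval)
    return False
```

Passing `probe` and `sleep` as parameters keeps the loop testable without a running container; in normal use `wait_until_healthy()` is called with no arguments.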
