Commit 057e945

Merge pull request #1 from GPULab-AI/codex/serverless-cli-update
[codex] add serverless CLI and updater
2 parents 8ac134e + 52e8daf commit 057e945

6 files changed

Lines changed: 2485 additions & 0 deletions

README.md

Lines changed: 39 additions & 0 deletions
@@ -137,6 +137,40 @@ gpulab deploy \
 -- python train.py --epochs 100 --lr 0.001 --batch-size 32
 ```
 
+## Serverless GPUs
+
+Serverless endpoints use the same API key auth as containers.
+
+```bash
+# See available serverless templates, GPU types, regions, volumes, and policy templates
+gpulab serverless options
+
+# Create an endpoint
+gpulab serverless create \
+--name llama-api \
+--template pytorch \
+--gpu-type "RTX 4090" \
+--memory 32 \
+--port 8000 \
+--min-replicas 0 \
+--max-replicas 2 \
+--concurrency 1 \
+-e HF_TOKEN \
+--command "python app.py"
+
+# Inspect, invoke, and read logs/history
+gpulab serverless inspect llama-api
+gpulab serverless invoke llama-api /v1/chat/completions -d '{"prompt":"hello"}' --wait
+gpulab serverless requests llama-api
+gpulab serverless autoscaling-logs llama-api
+gpulab serverless logs llama-api --replica all
+gpulab serverless logs llama-api --deploy
+
+# Update or delete
+gpulab serverless update llama-api --max-replicas 4 --autoscaling-template pending_requests_linear
+gpulab serverless delete llama-api --force
+```
+
 ## Commands
 
 | Command | Description |
@@ -158,6 +192,11 @@ gpulab deploy \
 | `gpulab templates` | List templates |
 | `gpulab gpus types` | List GPU types |
 | `gpulab volumes` | List volumes |
+| `gpulab serverless` | Manage serverless GPU endpoints |
+| `gpulab serverless logs <endpoint>` | View serverless replica container logs |
+| `gpulab serverless requests <endpoint>` | View serverless request logs |
+| `gpulab serverless autoscaling-logs <endpoint>` | View autoscaling history |
+| `gpulab update` | Update the CLI from GitHub Releases |
 
 ## Global Flags
 
bin/gpulab

411 KB
Binary file not shown.
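
Note on the scaling flags in the README hunk: the diff does not say how `--min-replicas`, `--max-replicas`, and `--concurrency` interact, nor what the `pending_requests_linear` template computes. The sketch below is only one plausible reading, assuming a linear policy that sizes the fleet so each replica serves at most `concurrency` pending requests, clamped to the configured bounds. The function name `desired_replicas` and the formula itself are hypothetical and are not taken from the gpulab source.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a "linear in pending requests" scaling rule.
# This is an assumption for illustration, NOT gpulab's actual autoscaler.
set -eu

# desired_replicas PENDING CONCURRENCY MIN MAX
# Prints the replica count this assumed policy would target.
desired_replicas() {
  local pending=$1 concurrency=$2 min=$3 max=$4
  # Ceiling division: enough replicas that none exceeds `concurrency` requests.
  local want=$(( (pending + concurrency - 1) / concurrency ))
  # Clamp to the configured bounds.
  if (( want < min )); then want=$min; fi
  if (( want > max )); then want=$max; fi
  echo "$want"
}

# Values from the README example: --min-replicas 0 --max-replicas 2 --concurrency 1
desired_replicas 0 1 0 2   # idle endpoint: prints 0 (scale to zero)
desired_replicas 5 1 0 2   # burst of 5 requests: prints 2 (capped at max)
# After `gpulab serverless update llama-api --max-replicas 4`
desired_replicas 5 1 0 4   # prints 4
```

Under this reading, `--min-replicas 0` lets an idle endpoint scale to zero, and raising `--max-replicas` only matters once pending requests exceed the old cap. The real behavior should be checked against `gpulab serverless autoscaling-logs`.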