Commit 6454bc9

abrichr and claude authored
feat: add SGLang local model serving to comparison framework (#190)
Add support for serving models via SGLang on remote GPU hosts, enabling comparison of API models (GPT, Claude) against locally-served models (e.g. Qwen3.5-9B) that vLLM cannot serve.

Key changes:
- New scripts/sglang_server.py: SGLangServerManager handles full lifecycle (SSH install, server start, readiness polling, SSH tunnel, cleanup)
- Extended ModelConfig with provider="sglang", serve config, max_new_tokens
- New --gpu-host and --ssh-key CLI flags (optional; sglang models skipped without --gpu-host)
- SGLang server auto-starts per model, tunneled as OpenAI-compatible API
- Environment variables (OPENAI_BASE_URL, OPENAI_API_KEY) saved/restored between models so API models remain unaffected
- New example_comparisons/unified_agents.yaml demonstrating mixed config

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
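The save/restore of OPENAI_BASE_URL and OPENAI_API_KEY described in the commit message could be sketched as a context manager. This is a minimal illustration, not the code from this commit: the name temporary_env is hypothetical, and the local URL (port 8080 with a /v1 path) and the placeholder key "EMPTY" are assumptions about how the tunneled server is addressed.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_env(overrides):
    """Apply environment overrides, then restore the previous values on exit.

    Hypothetical helper: the commit's actual save/restore code is not shown here.
    """
    # Remember the old value of each key (None means "was not set").
    saved = {key: os.environ.get(key) for key in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        for key, old_value in saved.items():
            if old_value is None:
                # The variable did not exist before, so remove it again.
                os.environ.pop(key, None)
            else:
                os.environ[key] = old_value


# Assumed values for a locally tunneled, OpenAI-compatible server.
SGLANG_ENV = {
    "OPENAI_BASE_URL": "http://localhost:8080/v1",
    "OPENAI_API_KEY": "EMPTY",
}
```

Wrapping only the sglang model's run in `with temporary_env(SGLANG_ENV):` means the next API model sees the original variables again, even if the run raises.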
1 parent c877588 commit 6454bc9

3 files changed

Lines changed: 605 additions & 333 deletions

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+# Example comparison config: API + local (SGLang) models as unified agents
+#
+# Usage:
+#   # API models only (no --gpu-host, sglang models are skipped):
+#   python scripts/compare_models.py --config example_comparisons/unified_agents.yaml
+#
+#   # API + local models (SGLang on remote GPU):
+#   python scripts/compare_models.py \
+#     --config example_comparisons/unified_agents.yaml \
+#     --gpu-host user@gpu-server
+#
+#   # With specific SSH key:
+#   python scripts/compare_models.py \
+#     --config example_comparisons/unified_agents.yaml \
+#     --gpu-host user@gpu-server --ssh-key ~/.ssh/gpu_key
+
+name: "Unified Agent Comparison (API + Local)"
+description: "Compare API models vs locally-served models as unified desktop agents"
+
+tasks:
+  - example_tasks/notepad-hello.yaml
+  - example_tasks/clear-browsing-data-chrome.yaml
+
+models:
+  - name: gpt-5.4-mini
+    provider: openai
+    type: unified
+
+  - name: Qwen/Qwen3.5-9B
+    provider: sglang
+    type: unified
+    max_new_tokens: 2048
+    serve:
+      engine: sglang
+      port: 8080
+
+server_url: http://localhost:5001
+max_steps: 10
+runs_per_config: 1
+save_screenshots: true
+output_dir: comparison_results/unified/
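The config's usage comments say that sglang models are skipped when --gpu-host is not given. A rough sketch of that selection step follows, with the models list written as plain dictionaries mirroring the YAML above. The function name select_models and the message text are hypothetical; the real logic in scripts/compare_models.py may differ.

```python
def select_models(models, gpu_host=None):
    """Return the models that can run with the given options.

    Hypothetical sketch: models with provider "sglang" need a remote GPU host,
    so they are skipped when gpu_host is missing.
    """
    runnable = []
    for model in models:
        if model.get("provider") == "sglang" and not gpu_host:
            print(f"Skipping {model['name']}: requires --gpu-host")
            continue
        runnable.append(model)
    return runnable


# The two model entries from the example config, as Python dictionaries.
MODELS = [
    {"name": "gpt-5.4-mini", "provider": "openai", "type": "unified"},
    {
        "name": "Qwen/Qwen3.5-9B",
        "provider": "sglang",
        "type": "unified",
        "max_new_tokens": 2048,
        "serve": {"engine": "sglang", "port": 8080},
    },
]

api_only = select_models(MODELS)
with_local = select_models(MODELS, gpu_host="user@gpu-server")
```

Filtering up front keeps the same config file usable on machines with and without access to a GPU host.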