---
title: "Highlights of SGLang at NVIDIA GTC 2026"
author: "The SGLang Team"
date: "March 31, 2026"
previewImg: /images/blog/gtc2026/happyhour-crowd.jpg
type: news
---

SGLang came to NVIDIA GTC 2026 with panels, a happy hour, a 200-person meetup, and a hands-on training lab. Three days, five events, and one packed week at the center of the LLM ecosystem; we left with a lot to share. If you missed it, here's the full recap.

<img src="/images/blog/gtc2026/hero-collage.jpg" style="display: block; margin: 20px auto 0; width: 100%; max-width: 100%; height: auto;" />
<p style="text-align: center; color: #666; font-style: italic;">SGLang at GTC 2026: five events, three days.</p>

## At the Main Conference

### SGLang Featured in the GTC Keynote

SGLang was featured on the NVIDIA AI ecosystem slide during Jensen Huang's GTC keynote. We are honored to be recognized as part of the infrastructure stack behind AI-native applications.

<img src="/images/blog/gtc2026/keynote-slide.jpg" style="display: block; margin: 20px auto 0; width: 75%; max-width: 100%; height: auto;" />
<p style="text-align: center; color: #666; font-style: italic;">SGLang on NVIDIA's AI ecosystem slide during the GTC 2026 keynote.</p>

📝 [X recap post](https://x.com/lmsysorg/status/2033675233765499277)

### Open-Source AI Panel at GTC

On Tuesday, **Ying Sheng** joined the GTC panel **"The State of Open-Source AI"** alongside Vartika Singh (Strategic AI Lead, NVIDIA), Jonathan Cohen (VP of Applied Research, NVIDIA), Ion Stoica (Professor, EECS, UC Berkeley), Jeff Boudier (VP of Product, Hugging Face), and Ranjay Krishna (Director of Multimodal and Embodied AI, Ai2).

The panel examined open-source AI's growing role as the primary R&D engine for sophisticated AI systems: what makes open ecosystems trustworthy, scalable, and production-ready, and the community infrastructure enabling reproducible, auditable research.

<img src="/images/blog/gtc2026/panel-ying.jpg" style="display: block; margin: 20px auto 0; width: 75%; max-width: 100%; height: auto;" />
<p style="text-align: center; color: #666; font-style: italic;">Ying Sheng (second from left) on the "State of Open-Source AI" panel at GTC 2026.</p>

🎬 [Watch the recording on NVIDIA On-Demand](https://www.nvidia.com/en-us/on-demand/session/gtc26-s81791/)

### SGLang Training Lab at GTC 2026

On Thursday morning, the **RadixArk team** led an official GTC training lab: **"High-Performance LLM Serving and Training with SGLang"**.

The lab covered three areas:

1. **Performance tuning with the SGLang Cookbook**: practical techniques for improving serving throughput and latency in real deployments
2. **Profiling and bottleneck analysis**: a developer-oriented walkthrough of identifying and resolving performance bottlenecks in LLM serving systems
3. **SGLang × Miles RL integration**: a live demonstration of running SGLang as the inference backend inside a real RL training loop using the Miles framework
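
For context, the kind of serving launch tuned in that first segment looks roughly like the sketch below. This is not the lab's actual configuration; the model path and every flag value are illustrative placeholders to show where the main tuning knobs live, and the right values depend entirely on your hardware and workload.

```shell
# Launch an SGLang server (all values illustrative, tune per deployment):
#   --tp-size               tensor-parallel degree across GPUs
#   --mem-fraction-static   fraction of GPU memory reserved for weights + KV cache
#   --chunked-prefill-size  prefill chunk cap, trading first-token latency vs throughput
python -m sglang.launch_server \
  --model-path meta-llama/Llama-3.1-8B-Instruct \
  --tp-size 2 \
  --mem-fraction-static 0.85 \
  --chunked-prefill-size 8192
```

Raising `--mem-fraction-static` generally grows the KV cache budget (and thus batch size), at the cost of headroom for activation memory.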

<img src="/images/blog/gtc2026/traininglab-instructor.jpg" style="display: block; margin: 20px auto 0; width: 75%; max-width: 100%; height: auto;" />
<p style="text-align: center; color: #666; font-style: italic;">The SGLang Training Lab at GTC 2026: hands-on LLM performance tuning and RL training.</p>

🎬 [Watch the full recording on NVIDIA On-Demand](https://www.nvidia.com/en-us/on-demand/session/gtc26-dlit82143/)

📁 [Download the training lab materials](https://drive.google.com/drive/folders/1ByXsdu0n03-sYR8MfxFv1dK_2GGfB0Ku?usp=drive_link)

## Side Events

### SGLang × RadixArk GTC Happy Hour

On Tuesday evening, SGLang and RadixArk co-hosted a GTC Happy Hour that brought together builders, researchers, and founders from across the inference and training ecosystem, including friends from OpenAI, xAI, DeepMind, Meta, NVIDIA, Ollama, and more.

<img src="/images/blog/gtc2026/happyhour-crowd.jpg" style="display: block; margin: 20px auto 0; width: 75%; max-width: 100%; height: auto;" />
<p style="text-align: center; color: #666; font-style: italic;">SGLang × RadixArk Happy Hour.</p>

The evening featured two technical spotlights:

- **Banghua Zhu (RadixArk)** introduced RadixArk and **Miles**, SGLang's native RL training framework purpose-built for large-scale MoE post-training workloads.
- **Jason Zhao (ScitiX)** presented **SiMM**, an open-source in-memory KV cache engine integrated with SGLang for long-context serving.

<img src="/images/blog/gtc2026/happyhour-banghua.jpg" style="display: block; margin: 20px auto 0; width: 75%; max-width: 100%; height: auto;" />
<p style="text-align: center; color: #666; font-style: italic;">Banghua introducing RadixArk and the Miles RL framework.</p>

Thank you to **Z Potentials** and **ScitiX** for sponsoring the event and making it possible.

📝 [X recap post](https://x.com/lmsysorg/status/2034398140510720289?s=20)

### Banghua at Novita's GTC Event

**Banghua Zhu** joined Novita's GTC event with over 700 attendees. The discussion covered Jensen Huang's remarks on the inflection point between inference cost and demand, the key drivers behind the agentic AI movement, and what it takes for AI products to deliver real value. Banghua shared his perspective on how SGLang is shaping the future of inference infrastructure, enabling next-generation use cases from OpenClaw to agentic inference, and driving the evolution of open models and open infrastructure.

<img src="/images/blog/gtc2026/novita-banghua.jpg" style="display: block; margin: 20px auto 0; width: 75%; max-width: 100%; height: auto;" />
<p style="text-align: center; color: #666; font-style: italic;">Banghua presenting at Novita's GTC event.</p>

Partners represented included NVIDIA, RadixArk, OpenRouter, Google DeepMind, Kimi (Moonshot AI), Alibaba Cloud, MiniMax, Z.ai, Hugging Face, and Kilo Code.

### LinkedIn × SGLang Meetup: LLMs for Search & Recommendation

On Wednesday evening, we hosted approximately 200 engineers at LinkedIn's Mountain View headquarters alongside teams from LinkedIn, TikTok, Meta, and NVIDIA for a deep dive into production LLM systems for search and recommendation.

<img src="/images/blog/gtc2026/linkedin-venue.jpg" style="display: block; margin: 20px auto 0; width: 75%; max-width: 100%; height: auto;" />
<p style="text-align: center; color: #666; font-style: italic;">SGLang swag at the LinkedIn meetup.</p>

#### LinkedIn Engineering Talks

LinkedIn opened with three engineering presentations:

- **Fedor Borisyuk**: Semantic search at scale
- **Zhipeng Wang**: Modeling optimizations for LLM-driven ranking
- **Sundara Raman Ramachandran**: LLM inference infrastructure optimizations, including a prefill-only serving path delivering **2–3× throughput gains on H100s**, upstreamed back to SGLang

<img src="/images/blog/gtc2026/linkedin-talks.jpg" style="display: block; margin: 20px auto 0; width: 75%; max-width: 100%; height: auto;" />
<p style="text-align: center; color: #666; font-style: italic;">LinkedIn engineers presenting on semantic search, ranking, and inference infrastructure.</p>

Relevant work from LinkedIn's engineering team:
[[1]](https://arxiv.org/abs/2502.14305) [[2]](https://arxiv.org/abs/2602.07309) [[3]](https://arxiv.org/abs/2510.22101) [[4]](https://arxiv.org/abs/2512.07846) [[5]](https://github.com/linkedin/fmchisel) [[6]](https://openreview.net/forum?id=tyGfwG6xTh) [[7]](https://www.linkedin.com/blog/engineering/ai/scaling-llm-based-ranking-systems-with-sglang-at-linkedin)

#### SGLang: Roadmap and Miles Framework

SGLang core developer **Liangsheng Yin** walked through SGLang's H1 2026 roadmap.

**Mao Cheng** then presented the **Miles RL framework**, addressing training–inference mismatch in production through three core techniques:

1. **Importance sampling corrections**: compensating for distribution shift between training and inference
2. **Inference-training alignment**: ensuring consistency between rollout behavior and gradient updates
3. **Rollout Routing Replay (R3)**: replay-based routing for efficient use of generated rollout data
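
To make the first technique concrete: when rollouts come from a serving engine and gradients from a separate trainer, the two can assign slightly different log-probabilities to the same tokens (different kernels, precision, batching), so the rollout data is mildly off-policy. A standard remedy is to reweight each token by the (truncated) probability ratio between the two policies. The sketch below is an illustrative, simplified version of that idea, not Miles's actual implementation; the function name and clipping constant are hypothetical.

```python
import math

def is_corrected_loss(logp_train, logp_rollout, advantages, clip=2.0):
    """REINFORCE-style per-token loss with a truncated importance-sampling
    correction: each token is reweighted by the probability ratio between
    the training policy and the rollout (inference) policy, clipped to
    bound variance when the two distributions drift apart."""
    terms = []
    for lt, lr, adv in zip(logp_train, logp_rollout, advantages):
        ratio = min(math.exp(lt - lr), clip)  # pi_train / pi_rollout, truncated
        terms.append(-ratio * adv * lt)       # reweighted policy-gradient term
    return sum(terms) / len(terms)
```

When the trainer and the serving engine agree exactly, every ratio is 1 and the loss reduces to the ordinary on-policy estimator; the clip only kicks in where the mismatch is large.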

<img src="/images/blog/gtc2026/linkedin-miles.jpg" style="display: block; margin: 20px auto 0; width: 75%; max-width: 100%; height: auto;" />
<p style="text-align: center; color: #666; font-style: italic;">Mao Cheng presenting the Miles RL framework and its approach to training–inference alignment.</p>

#### Industry Speakers

- **Hongyu Lu (TikTok)**: LLM search at scale
- **Luke Simon and Xi Liu (Meta)**: Generative Reasoning Reranker [[paper link]](https://lnkd.in/gGFwdkJw)
- **Anish Maddipoti (NVIDIA)**: Dynamo + NeMoRL

#### Panel Discussion

The closing panel, hosted by Qing Lan, featured Wenfeng Zhuo, Fedor Borisyuk, Luke Simon, and Mao Cheng. Topics included:

- Semantic ID vs. embedding retrieval
- Whether unified retrieval + ranking (OneRec-style systems) is production-ready
- Inference and training challenges in LLM recsys
- Recent breakthroughs accelerating LLM adoption for recommendations
- The role of continuous learning in production recommendation systems

<img src="/images/blog/gtc2026/linkedin-panel.jpg" style="display: block; margin: 20px auto 0; width: 75%; max-width: 100%; height: auto;" />
<p style="text-align: center; color: #666; font-style: italic;">The closing panel: Wenfeng Zhuo, Fedor Borisyuk, Luke Simon, and Mao Cheng, moderated by Qing Lan.</p>

This is exactly the kind of collaboration that will define the next generation of recommendation systems: production teams and open-source infrastructure co-evolving.

📝 [LinkedIn recap post](https://www.linkedin.com/feed/update/urn:li:activity:7440847574133190656)

## Looking Ahead

GTC 2026 made clear how much the production ecosystem is converging around open-source infrastructure. From semantic search at LinkedIn scale to RL post-training for frontier MoE models, SGLang is increasingly the shared layer underneath.

We'll keep building in the open. Follow our [Luma calendar](https://luma.com/SGLang) for future meetups, office hours, and community events.