feat(blog): MI355X Qwen3.5 SGLang v0.5.12 up-to-19x + spline interp helper by functionstackx · Pull Request #380 · SemiAnalysisAI/InferenceX-app

functionstackx · 2026-05-25T23:54:20Z

Summary

Two logical changes bundled together — both touch the blog-writing workflow.

Blog post

New post: 13 weeks after the 2026-02-16 Qwen3.5-397B-A17B release, MI355X SGLang FP8 throughput per GPU on 8k/1k has improved by up to 19.0x at an iso-interactivity of 40 tok/s/user (192 → 3,660 tok/s/GPU between the v0.5.8.post1 baseline and v0.5.12), with peak per-GPU throughput climbing 1.3k → 6.4k tok/s/GPU. Three AMD-authored upstream SGLang PRs (#20736, #21188, #21421) drove the April jump; v0.5.12 alone adds 1.44–1.68x on top.
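
As a quick sanity check on the headline ratios, the arithmetic can be redone from the rounded figures quoted above. This is only a sketch: the variable names are illustrative, and the inputs are the rounded values from this description, not the unrounded benchmark data (which the PR does not list).

```python
# Figures as quoted in the PR description (already rounded for publication).
baseline_tput = 192.0    # tok/s/GPU at 40 tok/s/user, v0.5.8.post1 baseline
latest_tput = 3660.0     # tok/s/GPU at 40 tok/s/user, v0.5.12
iso_speedup = latest_tput / baseline_tput

peak_baseline = 1300.0   # "1.3k" tok/s/GPU
peak_latest = 6400.0     # "6.4k" tok/s/GPU
peak_speedup = peak_latest / peak_baseline

print(f"iso-interactivity speedup: {iso_speedup:.2f}x")  # 19.06x
print(f"peak throughput speedup:   {peak_speedup:.2f}x")  # 4.92x
```

The rounded endpoints give about 19.06x, in line with the roughly 19x headline; the exact 19.0x presumably comes from the unrounded spline outputs.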

Skill / infra updates

.claude/skills/write-inferencex-blog/iso-interactivity.py — Python port of the dashboard's interpolation pipeline (Pareto upper-left frontier + Steffen 1990 monotone cubic Hermite + no extrapolation + Y-clamping). 1:1 with packages/app/src/components/calculator/interpolation.ts. Blog tables now produce exactly the same numbers readers see when they hover the rendered chart.
.claude/skills/write-inferencex-blog/editor.mjs — portable browser-based MDX editor with auto-save, ~/-normalized path display, 127.0.0.1-only bind. Takes the file path as argv[2] so it works for any draft on any machine.
.claude/skills/write-inferencex-blog/SKILL.md — replaces the linear-interp guidance with the spline mandate; adds a "How the Pareto frontier behaves between the knots" subsection explaining why the Hermite cubic with Steffen tangents can't overshoot between knots; gates PR creation on human review; documents the browser-editor launch step.
AGENTS.md — new "Chart Interpolation — TS and Python Helpers MUST Stay in Sync" section. Any PR touching paretoFrontUpperLeft, monotoneSlopes, hermiteInterpolate, or interpolateMetricAtInteractivity MUST also update iso-interactivity.py in the same commit, otherwise blog tables silently drift from the chart.
.vercelignore — excludes .claude/ from Vercel uploads (belt-and-suspenders since Vercel project root is packages/app/).
.gitignore — excludes Python bytecode so ad-hoc imports of iso-interactivity.py don't leak __pycache__/ into the working tree.
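
The interpolation pipeline named in the first bullet (frontier, Steffen tangents, Hermite evaluation, no extrapolation, Y-clamping) can be sketched roughly as below. This is not the code from `iso_interactivity.py` or `interpolation.ts`. All function names are hypothetical, the dominance rule (a point is dropped when another point has at least as much interactivity and throughput) is an assumption about the frontier, and the end-knot tangents use the adjacent secant instead of Steffen's boundary formula.

```python
from bisect import bisect_right
from typing import Optional

Point = tuple[float, float]  # (interactivity tok/s/user, throughput tok/s/GPU)


def pareto_front(points: list[Point]) -> list[Point]:
    """Drop dominated points; result is sorted by ascending x with distinct x."""
    front: list[Point] = []
    best_y = float("-inf")
    # Walk from the highest x down: a point survives only if its y beats
    # every point that has at least as much x.
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            front.append((x, y))
            best_y = y
    return front[::-1]


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def steffen_slopes(xs: list[float], ys: list[float]) -> list[float]:
    """Steffen-limited tangents at each knot; requires len(xs) >= 2."""
    n = len(xs)
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    s = [(ys[i + 1] - ys[i]) / h[i] for i in range(n - 1)]
    # Simplification: end knots take the adjacent secant slope.
    m = [s[0]] + [0.0] * (n - 2) + [s[-1]]
    for i in range(1, n - 1):
        p = (s[i - 1] * h[i] + s[i] * h[i - 1]) / (h[i - 1] + h[i])
        limit = min(abs(s[i - 1]), abs(s[i]), 0.5 * abs(p))
        m[i] = (sign(s[i - 1]) + sign(s[i])) * limit  # 0 at local extrema
    return m


def interpolate(points: list[Point], x: float) -> Optional[float]:
    """Throughput on the frontier at interactivity x, or None if out of range."""
    front = pareto_front(points)
    if not front:
        return None
    xs = [p[0] for p in front]
    ys = [p[1] for p in front]
    if x < xs[0] or x > xs[-1]:
        return None  # no extrapolation beyond the measured frontier
    if len(xs) == 1:
        return float(ys[0])
    m = steffen_slopes(xs, ys)
    i = min(bisect_right(xs, x) - 1, len(xs) - 2)
    h = xs[i + 1] - xs[i]
    t = (x - xs[i]) / h
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    y = h00 * ys[i] + h10 * h * m[i] + h01 * ys[i + 1] + h11 * h * m[i + 1]
    return min(max(y, min(ys)), max(ys))  # clamp to the observed y range


samples = [(10, 1000), (15, 500), (20, 800), (40, 300), (40, 250)]
print(pareto_front(samples))     # [(10, 1000), (20, 800), (40, 300)]
print(interpolate(samples, 40))  # 300.0
print(interpolate(samples, 50))  # None
```

Returning `None` outside the frontier's x range is one way to represent the `_unreachable_` cells the blog tables use for out-of-frontier interactivities.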

Test plan

Replace the dark-theme chart export — benchmark-dark.png is currently a copy of the light theme. Drop a real dark export from the dashboard at the linked preset before merging.
Verify the dashboard preset URL (g_model=Qwen-3.5-397B-A17B&g_rundate=2026-05-19&i_dstart=2026-02-20&i_dend=2026-05-19&i_prec=fp8) lands on the right cross-date MI355X view.
Spot-check 2–3 cells of the iso-interactivity table by piping a JSON request through iso-interactivity.py and confirming the spline output matches the table.
Click-through preview when Vercel builds — verify the rendered Figure shows the chart, the tables render properly, and the JsonLd FAQ doesn't appear inline.
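
Before clicking through, the preset query string from the second item can be checked mechanically. The sketch below only parses the parameters and confirms the date window; what each parameter means (model, run date, date-range start and end, precision) is inferred from its name and is an assumption, and it does not prove the dashboard renders the intended view.

```python
from datetime import date
from urllib.parse import parse_qs

# Query string copied from the test plan above.
query = (
    "g_model=Qwen-3.5-397B-A17B&g_rundate=2026-05-19"
    "&i_dstart=2026-02-20&i_dend=2026-05-19&i_prec=fp8"
)

# parse_qs returns a list per key; each key appears once here.
params = {key: values[0] for key, values in parse_qs(query).items()}

start = date.fromisoformat(params["i_dstart"])
end = date.fromisoformat(params["i_dend"])
window_days = (end - start).days

print(params["g_model"], params["i_prec"])  # Qwen-3.5-397B-A17B fp8
print(window_days)                          # 88
```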

🤖 Generated with Claude Code

Note

Medium Risk
Published benchmark ratios and iso-interactivity tables are high-visibility claims; drift between TS chart code and the new Python helper would mislead readers, though runtime app behavior is unchanged.

Overview
Adds a new MI355X Qwen3.5 / SGLang v0.5.12 benchmark post (19x iso-interactivity headline, three-date tables, spline-derived comparison with _unreachable_ rows) and expands the write-inferencex-blog workflow so authors match the live dashboard.

Blog workflow & tooling: iso_interactivity.py ports the chart’s Pareto + Steffen Hermite pipeline; SKILL.md mandates that helper (replacing linear interp), documents frontier behavior, requires a hero + repeated <Figure>, human approval before git/PR, and a local editor.mjs preview server. AGENTS.md requires TS interpolation changes to update the Python helper in the same commit. .vercelignore / .gitignore keep .claude/ and __pycache__ out of deploys and the tree.

^{Reviewed by Cursor Bugbot for commit 01f0a62. Bugbot is set up for automated code reviews on this repo. Configure here.}

…elper

Blog post: 13 weeks after the 2026-02-16 Qwen3.5-397B-A17B release, AMD MI355X SGLang FP8 throughput per GPU on 8k/1k has moved up to 19.0x at iso-interactivity at 40 tok/s/user (192 -> 3,660 tok/s/GPU between v0.5.8.post1 baseline and v0.5.12), with peak per-GPU throughput climbing 1.3k -> 6.4k tok/s/GPU. Three AITER MoE PRs (sglang #20736, #21188, #21421) drove the April jump; v0.5.12 alone adds 1.44-1.68x on top.

Skill updates (.claude/skills/write-inferencex-blog/):

- editor.mjs: portable Node http server for browser-based MDX editing with auto-save, ~/-normalized path display, and 127.0.0.1-only bind. Takes the file path as argv[2] so any draft can be edited on any machine.
- iso-interactivity.py: Python port of the dashboard's chart interpolation pipeline (paretoFrontUpperLeft + Steffen 1990 monotone cubic Hermite slopes + Hermite evaluation, no extrapolation, Y-clamped to prevent spline overshoot). 1:1 with the canonical TS at packages/app/src/components/calculator/interpolation.ts. Blog iso-interactivity tables must use this helper so published numbers match the rendered chart exactly.
- SKILL.md: replaces the linear-interpolation guidance with the spline mandate and the helper invocation; new "How the Pareto frontier behaves between the knots" subsection explains why the Hermite cubic with Steffen tangents never overshoots and why blog tables can diverge from naive linear by 10%+ on steep segments. Adds the human-review gate before PR creation and the browser-editor launch step.

AGENTS.md: new "Chart Interpolation - TS and Python Helpers MUST Stay in Sync" section. Any PR touching paretoFrontUpperLeft, monotoneSlopes, hermiteInterpolate, or interpolateMetricAtInteractivity in TypeScript MUST also update iso-interactivity.py in the same commit, otherwise blog tables silently drift from the chart.

.vercelignore: excludes .claude/ from Vercel uploads (belt-and-suspenders since the project root is packages/app/).

.gitignore: excludes Python bytecode (__pycache__/, *.pyc) so ad-hoc imports of iso-interactivity.py don't leak.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel · 2026-05-25T23:54:25Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
inferencemax-app	Ready	Preview, Comment	May 26, 2026 12:09am

Mandates reading every packages/app/content/blog/*.mdx file before writing a new post (not just "when in doubt"). Highlights two foundational posts that must be read heavily for tone and structure:

- inferencex-v2-nvidia-blackwell-vs-amd-vs-hopper.mdx: the v2 launch piece that sets the editorial voice (composability framing, rack-scale vs single-node, TCO discussion, first-name engineer acknowledgments).
- inferencemax-open-source-inference-benchmarking.mdx: the origin story for the open-source benchmark and the "speed is the moat" framing about software cadence.

Also adds the Qwen3.5 post itself to the template reference list as the canonical "three-date version-bump time series" template, with the spline iso-interactivity comparison and the `_unreachable_` cell convention for out-of-frontier interactivities.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Blog: moves the <Figure> block from below the iso-interactivity table up to immediately after the top DashboardCTA so the hero chart is the first thing readers see. Removes the duplicate Figure placement. Reworks the [Live chart] line below the iso-interactivity table to call out that it's the interactive version of the figure at the top.

Skill: adds a new "<Figure> hero image immediately after the top DashboardCTA" subsection in Step 4 specifying the new placement convention, and rewrites the "<Figure> with the chart image" section further down into a "[Live chart] link after the iso-interactivity tables" section so the structure stops implying two Figure blocks per post.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…table

The earlier commit deleted the bottom Figure when hoisting the chart to the top; the intent was to have both placements, not to move it. The same chart asset now appears twice: once as the hero immediately after the top DashboardCTA so readers see the curves before the prose, and once again directly below the iso-interactivity table so readers don't have to scroll back up to map the data rows to the chart.

Skill mandates both placements explicitly with copy-paste identical <Figure> blocks; the [Live chart] link stays below the second Figure unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Five issues from Cursor Bugbot review:

1. Blog: fix "lower of TP=2 and TP=4 throughput" -> "higher of" in the iso-interactivity intro. Throughput is higher-is-better; the Pareto frontier picks the maximum throughput at each interactivity, not the minimum.
2. Helper: rename iso-interactivity.py -> iso_interactivity.py so `import iso_interactivity` actually works (Python identifiers can't contain hyphens). The CLI invocation works either way but the module-import pattern shown in the docstring was broken. Updates the docstring's CLI example, SKILL.md, and AGENTS.md to reference the new filename. git mv preserves history.
3. Helper: guard against KeyError when a frontier point is missing the requested metric_key. Now uses p.get(metric_key) with a fallback to 0, matching the TS `extractMetric(...) ?? 0` behavior in interpolateMetricAtInteractivity. CLI returns `{"value": 0}` cleanly instead of dying with a traceback.
4. Helper: clarify the clamp asymmetry between the two TS analogs. The Python helper keeps the tighter [min(ys), max(ys)] clamp that matches interpolateForGPU in the calculator (the closest analog to blog iso-interactivity tables), and the docstring now explicitly notes the asymmetry with the trend-chart hook which only does max(0, raw) and lets the spline overshoot up.
5. Editor: fix Cmd+S race condition. doSave() now reschedules a debounced save instead of silently dropping the keypress when another save is already in flight. Latest buffer never gets stranded.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
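
Items 2 to 4 of this commit message can be illustrated with a small sketch. It is not the helper's actual code: `extract_metric`, `clamp_tight`, and `clamp_floor` are hypothetical names, and treating an explicit `None` like a missing key (to mirror the nullish `?? 0` in TS) is an assumption.

```python
from typing import Any


def extract_metric(point: dict[str, Any], metric_key: str) -> float:
    """Return the metric, or 0.0 when the key is missing or None."""
    value = point.get(metric_key)
    return float(value) if value is not None else 0.0


def clamp_tight(raw: float, ys: list[float]) -> float:
    """Calculator-style clamp: stay inside the observed [min(ys), max(ys)]."""
    return min(max(raw, min(ys)), max(ys))


def clamp_floor(raw: float) -> float:
    """Trend-chart-style clamp: only forbid negative values."""
    return max(0.0, raw)


# Item 2: a hyphenated file name is not a valid module identifier.
print("iso-interactivity".isidentifier())  # False
print("iso_interactivity".isidentifier())  # True

# Item 3: a frontier point without the requested key yields 0.0, not KeyError.
print(extract_metric({"tput_per_gpu": 3660}, "tput_per_gpu"))  # 3660.0
print(extract_metric({}, "tput_per_gpu"))                      # 0.0

# Item 4: the same overshooting raw value under the two clamps.
knots = [300.0, 800.0, 1000.0]
print(clamp_tight(1040.0, knots))  # 1000.0
print(clamp_floor(1040.0))         # 1040.0
```

The last two lines show the asymmetry: the tight clamp pins an overshoot to the highest knot, while the floor-only clamp lets it through.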

cursor

Cursor Bugbot has reviewed your changes and found 3 potential issues.

cursor · 2026-05-26T00:11:12Z

+from __future__ import annotations
+import json
+import sys
+from typing import Callable, Iterable, Optional


Unused Iterable import added

Low Severity

The new module imports Iterable from typing but never references it anywhere in the file, leaving dead import noise in a helper that is meant to stay a clean 1:1 port of the TypeScript sources.

cursor · 2026-05-26T00:11:12Z

+      if (editor.getValue() !== lastSaved) {
+        scheduleSave();
+      }
+    }


Stale save can overwrite edits

Medium Severity

The MDX editor’s doSave posts a snapshot captured at save start with no generation check. If the user keeps typing while that request is in flight, a later successful response can write the older buffer after a beforeunload sendBeacon already persisted newer text, reverting the file on disk.

Additional Locations (1)

.claude/skills/write-inferencex-blog/editor.mjs#L171-L177
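
One common way to guard against this kind of stale write is a generation counter checked on the writing side. The sketch below is a toy Python model of that idea, not the `editor.mjs` code and not necessarily the fix adopted in this PR; `SaveCoordinator` and its methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    generation: int
    text: str


class SaveCoordinator:
    """Toy model of an editor buffer plus the file it saves to."""

    def __init__(self) -> None:
        self.buffer = ""
        self.generation = 0           # bumped on every edit
        self.written_generation = -1  # generation of the text now on disk
        self.disk = ""

    def edit(self, text: str) -> None:
        self.buffer = text
        self.generation += 1

    def begin_save(self) -> Snapshot:
        # Capture the buffer together with the generation it belongs to.
        return Snapshot(self.generation, self.buffer)

    def commit(self, snap: Snapshot) -> bool:
        # Refuse snapshots that are not newer than what is already on disk.
        if snap.generation <= self.written_generation:
            return False
        self.disk = snap.text
        self.written_generation = snap.generation
        return True


doc = SaveCoordinator()
doc.edit("draft v1")
slow_save = doc.begin_save()    # in-flight request holding the old buffer
doc.edit("draft v2")
beacon_save = doc.begin_save()  # e.g. the final save on page unload

print(doc.commit(beacon_save))  # True
print(doc.commit(slow_save))    # False
print(doc.disk)                 # draft v2
```

In this model the slow request arrives last but cannot revert the file, because its generation is older than the one already written.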

cursor · 2026-05-26T00:11:12Z

 - `packages/app/content/blog/mi355x-kimi-k2-5-vllm-aiter-7x-speedup.mdx` — Single-PR speedup story, 25-day cadence, iso-throughput interpolation. Closest template for "one PR moved the curve" posts.
 - `packages/app/content/blog/sglang-0-5-6-b200-deepseek-r1-fp4-up-to-1-8x.mdx` — Same-hardware version-bump story. Closest template for "framework release X is N% faster than X-1" posts.
 - `packages/app/content/blog/gb200-nvl72-kimi-k2-5-vllm-wide-ep-3x-vs-b200.mdx` — Rack-scale wide EP story. Closest template for "scale-up fabric unlocks a new operating regime" posts.
+- `packages/app/content/blog/mi355x-qwen3-5-sglang-v0-5-12-up-to-17x.mdx` — Three-date version-bump time series with the spline iso-interactivity comparison and the `_unreachable_` cell convention for out-of-frontier interactivities.


Chart-only path still linear

Low Severity

This PR adds a mandatory spline-based iso_interactivity.py workflow for iso-interactivity tables, but the “When the user has only a chart image” section still instructs authors to linearly interpolate comparison points, which produces different numbers than the dashboard Hermite/Pareto pipeline the rest of the skill requires.

vercel Bot deployed to Preview May 25, 2026 23:54 View deployment

functionstackx and others added 2 commits May 25, 2026 19:55

style(editor): satisfy oxfmt --check

3e33361

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel Bot deployed to Preview May 25, 2026 23:56 View deployment

cursor Bot reviewed May 25, 2026

Comment thread .claude/skills/write-inferencex-blog/iso_interactivity.py

Comment thread packages/app/content/blog/mi355x-qwen3-5-sglang-v0-5-12-up-to-17x.mdx Outdated

Comment thread .claude/skills/write-inferencex-blog/iso_interactivity.py

vercel Bot deployed to Preview May 25, 2026 23:59 View deployment

cursor Bot reviewed May 26, 2026

Comment thread .claude/skills/write-inferencex-blog/editor.mjs Outdated

vercel Bot deployed to Preview May 26, 2026 00:02 View deployment

cursor Bot reviewed May 26, 2026

Comment thread .claude/skills/write-inferencex-blog/iso-interactivity.py Outdated

functionstackx merged commit 6ceb08d into master May 26, 2026
14 of 15 checks passed

functionstackx deleted the blog/qwen3-5-mi355x-sglang-v0-5-12-19x branch May 26, 2026 00:09

vercel Bot deployed to Preview May 26, 2026 00:09 View deployment

cursor Bot reviewed May 26, 2026

functionstackx merged 6 commits into master from blog/qwen3-5-mi355x-sglang-v0-5-12-19x
