Commit c1873d6

edit(blog,skill): hoist chart figure to top of post (after lede CTA)
Blog: moves the <Figure> block from below the iso-interactivity table up to immediately after the top DashboardCTA so the hero chart is the first thing readers see. Removes the duplicate Figure placement. Reworks the [Live chart] line below the iso-interactivity table to call out that it's the interactive version of the figure at the top.

Skill: adds a new "<Figure> hero image immediately after the top DashboardCTA" subsection in Step 4 specifying the new placement convention, and rewrites the "<Figure> with the chart image" section further down into a "[Live chart] link after the iso-interactivity tables" section so the structure stops implying two Figure blocks per post.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 3e33361 commit c1873d6

2 files changed

Lines changed: 28 additions & 16 deletions

.claude/skills/write-inferencex-blog/SKILL.md

Lines changed: 20 additions & 8 deletions
@@ -128,6 +128,21 @@ Bold the peak ratio in the lede. Second paragraph: name the upstream PRs that ma
 
 Use the preset URL the user provided so clicking lands on the exact comparison view, not the bare dashboard. Format: `https://inferencex.semianalysis.com/inference?g_model=...&i_prec=...&g_rundate=...&g_runid=...&i_active={hw1}_{fw1}%2C{hw2}_{fw2}&i_metric=y_costh&i_linelabel=1`.
 
+### `<Figure>` hero image immediately after the top DashboardCTA
+
+The chart image is the **hero** of the post — it goes right after the top `<DashboardCTA>`, **before** the model / architecture paragraph, so readers see the curves before they read the prose. Do not bury the figure halfway down the post next to the iso-interactivity table.
+
+```mdx
+<Figure
+srcLight="/images/{slug}/benchmark-light.png"
+srcDark="/images/{slug}/benchmark-dark.png"
+alt="Plain-English description of the chart including model, precision, ISL/OSL, both compared SKUs/frameworks, and any toggles (MTP/non-MTP)"
+caption="Short caption. Note any non-obvious labeling convention used on the chart (e.g. 'Labels denote GPU count per config.')."
+/>
+```
+
+Use the chart asset only once in the body — show it at the top and don't repeat it lower down. Below the iso-interactivity table, place a small `[Live chart](...)` link that points at the same preset URL and tells the reader the figure at the top is interactive when clicked through. That's where readers will go to drill into specific points.
+
 ### Model / architecture paragraph
 
 One paragraph naming the model, vendor, release date (use it to compute "N weeks after release" if it sharpens the cadence framing), total/active parameters, expert count + top-K routing, attention mechanism (MLA, NSA/DSA, GQA, etc.), and context window. **Always WebSearch to verify these numbers** — don't carry over from a prior generation. Cite a source URL inline if the number is non-obvious.
@@ -169,18 +184,15 @@ Columns: `Interactivity (tok/s/user) | {NVIDIA} $/M tok | {AMD} $/M tok | {NVIDI
 
 Follow with one paragraph explaining _why_ the gap peaks where it does (e.g. "the MI355X 4-GPU TP=4 recipe plateaus at $0.22 while B200 is still climbing"), and one sentence noting where the gap inverts (e.g. "Above 90 tok/s/user the comparison flips marginally back to B200 because there is no MI355X recipe matching B200's TP=8 conc 4 at 100+ tok/s/user."). **Don't paper over the inversion** — call it out.
 
-### `<Figure>` with the chart image
+### `[Live chart]` link after the iso-interactivity tables
+
+The hero `<Figure>` already shipped at the top of the post. Down here, just a one-line link that points at the same preset URL so readers can drill into the interactive version of what they saw at the top:
 
 ```mdx
-<Figure
-srcLight="/images/{slug}/benchmark-light.png"
-srcDark="/images/{slug}/benchmark-dark.png"
-alt="Plain-English description of the chart including model, precision, ISL/OSL, both compared SKUs/frameworks, and any toggles (MTP/non-MTP)"
-caption="Short caption. Note any non-obvious labeling convention used on the chart (e.g. 'Labels denote GPU count per config.')."
-/>
+[Live chart](https://inferencex.semianalysis.com/inference?...) — same view as the figure at the top, pre-filtered to {hardware/framework/model/precision} and interactive.
 ```
 
-Immediately followed by a `[Live chart]({preset URL})` link with the same preset as the `DashboardCTA` so readers can drill into a single point.
+Do not embed a second `<Figure>` here. One chart asset, shown once at the top.
 
 ### `## What's Next for {SKU/framework} on {Model}` (or similar)

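The placement convention this change adds to the skill (one `<Figure>`, after the top `</DashboardCTA>`, with a `[Live chart]` link further down) could be checked mechanically. The following is a minimal sketch, assuming each post is available as plain MDX text. `check_figure_placement` is a hypothetical helper and not part of the repository, and its "before the first `## ` heading" test is a looser stand-in for "before the model / architecture paragraph".

```python
import re


def check_figure_placement(mdx: str) -> list[str]:
    """Return a list of convention violations for one MDX post body.

    Hypothetical lint helper: the rules below are a loose reading of the
    SKILL.md convention, not an official check.
    """
    problems = []

    # Rule 1: the chart asset is shown exactly once.
    figures = [m.start() for m in re.finditer(r"<Figure\b", mdx)]
    if len(figures) != 1:
        problems.append(f"expected exactly 1 <Figure>, found {len(figures)}")
        return problems
    figure_pos = figures[0]

    # Rule 2: the figure comes after the top DashboardCTA closes.
    cta_close = mdx.find("</DashboardCTA>")
    if cta_close == -1:
        problems.append("no </DashboardCTA> found")
    elif figure_pos < cta_close:
        problems.append("<Figure> appears before the top DashboardCTA")

    # Rule 3 (approximation): the figure precedes the first H2 heading.
    first_h2 = re.search(r"^## ", mdx, flags=re.MULTILINE)
    if first_h2 and figure_pos > first_h2.start():
        problems.append("<Figure> appears after the first ## heading")

    # Rule 4: a [Live chart] link exists and sits below the figure.
    live_chart = mdx.find("[Live chart](")
    if live_chart == -1:
        problems.append("no [Live chart] link found")
    elif live_chart < figure_pos:
        problems.append("[Live chart] link appears before the <Figure>")

    return problems
```

An empty list means the post follows the convention as approximated here. Returning messages rather than raising keeps the helper usable from a pre-commit hook or a test, whichever the project prefers.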
packages/app/content/blog/mi355x-qwen3-5-sglang-v0-5-12-up-to-17x.mdx

Lines changed: 8 additions & 8 deletions
@@ -22,6 +22,13 @@ The story is software-only — same MI355X CDNA4 silicon at $1.48/GPU/hr the who
 Click to see the full InferenceX dashboard →
 </DashboardCTA>
 
+<Figure
+srcLight="/images/mi355x-qwen3-5-sglang-v0-5-12-up-to-17x/benchmark-light.png"
+srcDark="/images/mi355x-qwen3-5-sglang-v0-5-12-up-to-17x/benchmark-dark.png"
+alt="Qwen3.5 FP8 8k/1k tok/s/GPU vs interactivity on MI355X SGLang across three dates: 2026-02-20 (v0.5.8.post1), 2026-04-16 (v0.5.10rc0), 2026-05-19 (v0.5.12). Each curve labeled with its date and the TP value at each point."
+caption="Qwen3.5-397B-A17B FP8 8k/1k on MI355X SGLang. Three runs over 3 months: v0.5.8.post1 (Feb 20, TP=8), v0.5.10rc0 (Apr 16, TP=2/4), v0.5.12 (May 19, TP=2/4). Point labels denote the TP value used for that config."
+/>
+
 Qwen3.5-397B-A17B is Alibaba's MoE flagship, released 2026-02-16, with 397B total parameters and 17B activated per token across **512 experts** (top-K routing) and a hybrid attention stack interleaving Gated DeltaNet and Gated Attention layers. The first InferenceX benchmark ran on MI355X four days after the release.
 
 ## What Shipped to Make This Happen
@@ -103,14 +110,7 @@ Each date is interpolated on its Pareto frontier (the lower of TP=2 and TP=4 thr
 
 The 19x peak at 40 tok/s/user is partly a regime extension — the Feb TP=8 recipe had a 24.5 ms TPOT floor at conc 4 (40.86 tok/s/user) and couldn't run cheaper than that on this workload, so the comparison band tops out where the old recipe was already in collapse. By 50 tok/s/user the v0.5.8 curve doesn't exist at all; by 75 tok/s/user only the v0.5.12 curve still has a point. The May v0.5.12 image alone adds 1.44x to 1.68x on top of the April baseline across the entire shared band — a clean version-bump win.
 
-<Figure
-srcLight="/images/mi355x-qwen3-5-sglang-v0-5-12-up-to-17x/benchmark-light.png"
-srcDark="/images/mi355x-qwen3-5-sglang-v0-5-12-up-to-17x/benchmark-dark.png"
-alt="Qwen3.5 FP8 8k/1k tok/s/GPU vs interactivity on MI355X SGLang across three dates: 2026-02-20 (v0.5.8.post1), 2026-04-16 (v0.5.10rc0), 2026-05-19 (v0.5.12). Each curve labeled with its date and the TP value at each point."
-caption="Qwen3.5-397B-A17B FP8 8k/1k on MI355X SGLang. Three runs over 3 months: v0.5.8.post1 (Feb 20, TP=8), v0.5.10rc0 (Apr 16, TP=2/4), v0.5.12 (May 19, TP=2/4). Point labels denote the TP value used for that config."
-/>
-
-[Live chart](https://inferencex.semianalysis.com/inference?g_model=Qwen-3.5-397B-A17B&g_rundate=2026-05-19&i_gpus=mi355x_sglang&i_dstart=2026-02-20&i_dend=2026-05-19&i_prec=fp8), pre-filtered to MI355X SGLang Qwen3.5 FP8 across all three runs.
+[Live chart](https://inferencex.semianalysis.com/inference?g_model=Qwen-3.5-397B-A17B&g_rundate=2026-05-19&i_gpus=mi355x_sglang&i_dstart=2026-02-20&i_dend=2026-05-19&i_prec=fp8) — same view as the figure at the top, pre-filtered to MI355X SGLang Qwen3.5 FP8 across all three runs and interactive.
 
 ## What's Next for MI355X on Qwen3.5

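Because the `[Live chart]` link is meant to carry the same preset as the figure, it may help to build the URL from named values rather than paste it by hand. This is a rough sketch, assuming only the query keys visible in the blog link above. `live_chart_url` and its parameter names are hypothetical, and what each key means to the dashboard is inferred from the URL rather than documented.

```python
from urllib.parse import urlencode

# Base path taken from the [Live chart] link in the diff above.
BASE_URL = "https://inferencex.semianalysis.com/inference"


def live_chart_url(model, run_date, gpus, date_start, date_end, precision):
    """Build a preset dashboard URL (hypothetical helper).

    The query keys mirror those in the blog post's link; their meaning
    is an assumption based on their names.
    """
    params = {
        "g_model": model,
        "g_rundate": run_date,
        "i_gpus": gpus,
        "i_dstart": date_start,
        "i_dend": date_end,
        "i_prec": precision,
    }
    # urlencode percent-escapes any unsafe characters in the values.
    return f"{BASE_URL}?{urlencode(params)}"


url = live_chart_url(
    model="Qwen-3.5-397B-A17B",
    run_date="2026-05-19",
    gpus="mi355x_sglang",
    date_start="2026-02-20",
    date_end="2026-05-19",
    precision="fp8",
)
markdown_link = f"[Live chart]({url})"
```

With these inputs the helper reproduces the URL used in the post, so the figure caption, the link, and the `DashboardCTA` can all be driven from one set of values.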