edit(blog,skill): hoist chart figure to top of post (after lede CTA)

functionstackx · claude · functionstackx · commit c1873d6c3488 · 2026-05-25T19:58:58.000-04:00
Blog: moves the <Figure> block from below the iso-interactivity
table up to immediately after the top DashboardCTA so the hero chart
is the first thing readers see. Removes the duplicate Figure
placement. Reworks the [Live chart] line below the iso-interactivity
table to call out that it's the interactive version of the figure
at the top.

Skill: adds a new "<Figure> hero image immediately after the top
DashboardCTA" subsection in Step 4 specifying the new placement
convention, and rewrites the "<Figure> with the chart image"
section further down into a "[Live chart] link after the
iso-interactivity tables" section so the structure stops implying
two Figure blocks per post.
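
The placement convention described above could be checked mechanically. The following is a minimal TypeScript sketch, assuming a post is available as a raw MDX string. `checkHeroFigure` and `FigureCheck` are hypothetical names used for illustration and are not part of this commit.

```typescript
// Hypothetical lint helper (not part of this commit): checks that a post
// has exactly one <Figure> and that it directly follows the first
// </DashboardCTA> closing tag.
interface FigureCheck {
  figureCount: number; // number of <Figure ...> opening tags in the post
  heroAfterCta: boolean; // true if <Figure is the first content after the CTA
  ok: boolean; // true only if both conditions hold
}

function checkHeroFigure(mdx: string): FigureCheck {
  // Count opening <Figure tags; \b keeps longer names such as
  // <FigureGroup from matching.
  const figureCount = (mdx.match(/<Figure\b/g) ?? []).length;

  // Locate the end of the first DashboardCTA block.
  const ctaClose = "</DashboardCTA>";
  const ctaIndex = mdx.indexOf(ctaClose);

  let heroAfterCta = false;
  if (ctaIndex !== -1) {
    // Skip whitespace after the CTA; the next thing must be the figure.
    const rest = mdx.slice(ctaIndex + ctaClose.length).trimStart();
    heroAfterCta = /^<Figure\b/.test(rest);
  }

  return { figureCount, heroAfterCta, ok: figureCount === 1 && heroAfterCta };
}
```

A check like this would flag both the old layout (figure below the iso-interactivity table) and a post that accidentally keeps two figure blocks. It works on plain text only and does not parse MDX, so a `<Figure` that appears inside a fenced code sample would be counted as well.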

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
diff --git a/.claude/skills/write-inferencex-blog/SKILL.md b/.claude/skills/write-inferencex-blog/SKILL.md
@@ -128,6 +128,21 @@ Bold the peak ratio in the lede. Second paragraph: name the upstream PRs that ma
 
 Use the preset URL the user provided so clicking lands on the exact comparison view, not the bare dashboard. Format: `https://inferencex.semianalysis.com/inference?g_model=...&i_prec=...&g_rundate=...&g_runid=...&i_active={hw1}_{fw1}%2C{hw2}_{fw2}&i_metric=y_costh&i_linelabel=1`.
 
+### `<Figure>` hero image immediately after the top DashboardCTA
+
+The chart image is the **hero** of the post — it goes right after the top `<DashboardCTA>`, **before** the model / architecture paragraph, so readers see the curves before they read the prose. Do not bury the figure halfway down the post next to the iso-interactivity table.
+
+```mdx
+<Figure
+  srcLight="/images/{slug}/benchmark-light.png"
+  srcDark="/images/{slug}/benchmark-dark.png"
+  alt="Plain-English description of the chart including model, precision, ISL/OSL, both compared SKUs/frameworks, and any toggles (MTP/non-MTP)"
+  caption="Short caption. Note any non-obvious labeling convention used on the chart (e.g. 'Labels denote GPU count per config.')."
+/>
+```
+
+Use the chart asset only once in the body — show it at the top and don't repeat it lower down. Below the iso-interactivity table, place a small `[Live chart](...)` link that points at the same preset URL and tells the reader the figure at the top is interactive when clicked through. That's where readers will go to drill into specific points.
+
 ### Model / architecture paragraph
 
 One paragraph naming the model, vendor, release date (use it to compute "N weeks after release" if it sharpens the cadence framing), total/active parameters, expert count + top-K routing, attention mechanism (MLA, NSA/DSA, GQA, etc.), and context window. **Always WebSearch to verify these numbers** — don't carry over from a prior generation. Cite a source URL inline if the number is non-obvious.
@@ -169,18 +184,15 @@ Columns: `Interactivity (tok/s/user) | {NVIDIA} $/M tok | {AMD} $/M tok | {NVIDI
 
 Follow with one paragraph explaining _why_ the gap peaks where it does (e.g. "the MI355X 4-GPU TP=4 recipe plateaus at $0.22 while B200 is still climbing"), and one sentence noting where the gap inverts (e.g. "Above 90 tok/s/user the comparison flips marginally back to B200 because there is no MI355X recipe matching B200's TP=8 conc 4 at 100+ tok/s/user."). **Don't paper over the inversion** — call it out.
 
-### `<Figure>` with the chart image
+### `[Live chart]` link after the iso-interactivity tables
+
+The hero `<Figure>` already shipped at the top of the post. Down here, just a one-line link that points at the same preset URL so readers can drill into the interactive version of what they saw at the top:
 
 ```mdx
-<Figure
-  srcLight="/images/{slug}/benchmark-light.png"
-  srcDark="/images/{slug}/benchmark-dark.png"
-  alt="Plain-English description of the chart including model, precision, ISL/OSL, both compared SKUs/frameworks, and any toggles (MTP/non-MTP)"
-  caption="Short caption. Note any non-obvious labeling convention used on the chart (e.g. 'Labels denote GPU count per config.')."
-/>
+[Live chart](https://inferencex.semianalysis.com/inference?...) — same view as the figure at the top, pre-filtered to {hardware/framework/model/precision} and interactive.
 ```
 
-Immediately followed by a `[Live chart]({preset URL})` link with the same preset as the `DashboardCTA` so readers can drill into a single point.
+Do not embed a second `<Figure>` here. One chart asset, shown once at the top.
 
 ### `## What's Next for {SKU/framework} on {Model}` (or similar)
 
diff --git a/packages/app/content/blog/mi355x-qwen3-5-sglang-v0-5-12-up-to-17x.mdx b/packages/app/content/blog/mi355x-qwen3-5-sglang-v0-5-12-up-to-17x.mdx
@@ -22,6 +22,13 @@ The story is software-only — same MI355X CDNA4 silicon at $1.48/GPU/hr the who
   Click to see the full InferenceX dashboard →
 </DashboardCTA>
 
+<Figure
+  srcLight="/images/mi355x-qwen3-5-sglang-v0-5-12-up-to-17x/benchmark-light.png"
+  srcDark="/images/mi355x-qwen3-5-sglang-v0-5-12-up-to-17x/benchmark-dark.png"
+  alt="Qwen3.5 FP8 8k/1k tok/s/GPU vs interactivity on MI355X SGLang across three dates: 2026-02-20 (v0.5.8.post1), 2026-04-16 (v0.5.10rc0), 2026-05-19 (v0.5.12). Each curve labeled with its date and the TP value at each point."
+  caption="Qwen3.5-397B-A17B FP8 8k/1k on MI355X SGLang. Three runs over 3 months: v0.5.8.post1 (Feb 20, TP=8), v0.5.10rc0 (Apr 16, TP=2/4), v0.5.12 (May 19, TP=2/4). Point labels denote the TP value used for that config."
+/>
+
 Qwen3.5-397B-A17B is Alibaba's MoE flagship, released 2026-02-16, with 397B total parameters and 17B activated per token across **512 experts** (top-K routing), and a hybrid attention stack interleaving Gated DeltaNet and Gated Attention layers. The first InferenceX benchmark ran on MI355X four days after the release.
 
 ## What Shipped to Make This Happen
@@ -103,14 +110,7 @@ Each date is interpolated on its Pareto frontier (the lower of TP=2 and TP=4 thr
 
 The 19x peak at 40 tok/s/user is partly a regime extension — the Feb TP=8 recipe had a 24.5 ms TPOT floor at conc 4 (40.86 tok/s/user) and couldn't run cheaper than that on this workload, so the comparison band tops out where the old recipe was already in collapse. By 50 tok/s/user the v0.5.8 curve doesn't exist at all; by 75 tok/s/user only the v0.5.12 curve still has a point. The May v0.5.12 image alone adds 1.44x to 1.68x on top of the April baseline across the entire shared band — a clean version-bump win.
 
-<Figure
-  srcLight="/images/mi355x-qwen3-5-sglang-v0-5-12-up-to-17x/benchmark-light.png"
-  srcDark="/images/mi355x-qwen3-5-sglang-v0-5-12-up-to-17x/benchmark-dark.png"
-  alt="Qwen3.5 FP8 8k/1k tok/s/GPU vs interactivity on MI355X SGLang across three dates: 2026-02-20 (v0.5.8.post1), 2026-04-16 (v0.5.10rc0), 2026-05-19 (v0.5.12). Each curve labeled with its date and the TP value at each point."
-  caption="Qwen3.5-397B-A17B FP8 8k/1k on MI355X SGLang. Three runs over 3 months: v0.5.8.post1 (Feb 20, TP=8), v0.5.10rc0 (Apr 16, TP=2/4), v0.5.12 (May 19, TP=2/4). Point labels denote the TP value used for that config."
-/>
-
-[Live chart](https://inferencex.semianalysis.com/inference?g_model=Qwen-3.5-397B-A17B&g_rundate=2026-05-19&i_gpus=mi355x_sglang&i_dstart=2026-02-20&i_dend=2026-05-19&i_prec=fp8), pre-filtered to MI355X SGLang Qwen3.5 FP8 across all three runs.
+[Live chart](https://inferencex.semianalysis.com/inference?g_model=Qwen-3.5-397B-A17B&g_rundate=2026-05-19&i_gpus=mi355x_sglang&i_dstart=2026-02-20&i_dend=2026-05-19&i_prec=fp8) — same view as the figure at the top, pre-filtered to MI355X SGLang Qwen3.5 FP8 across all three runs and interactive.
 
 ## What's Next for MI355X on Qwen3.5