
feat(compare): 3 narrative paragraphs per slug page with templated variety #384

Merged
functionstackx merged 2 commits into master from feat/compare-narrative-3-paragraphs
May 26, 2026

@functionstackx functionstackx commented May 26, 2026

Summary

The SSR'd table on each `/compare/` and `/compare-per-dollar/` page has 3 default interactivity targets (25th / 50th / 75th percentile). Before this PR, the narrative above the table only described the first overlapping target — leaving the other two undescribed and the prose feeling thin.

Now: one narrative paragraph per interactivity target (3 paragraphs per page), each picked from a pool of templates so the catalog reads with variety instead of repeating one sentence shape.

Example on `/compare-per-dollar/deepseek-v4-b200-vs-mi355x`:

Push DeepSeek V4 Pro to 17 tok/s/user and B200 lands at $0.21 per million tokens against MI355X's $0.25 — B200 pulls ahead by 20%.

B200: $0.38 per million tokens. MI355X: $1.32. Both at 30 tok/s/user on DeepSeek V4 Pro, with B200 248% cheaper.

Toward the upper edge of the 5–54 tok/s/user interactivity band — at 42 tok/s/user — B200 runs $0.40 per million tokens on DeepSeek V4 Pro while MI355X runs $2.78. B200 is the cheaper choice by 587%. (Numbers reflect the default 1k/1k · fp4 selection for this URL — table and chart below update if you change sequence, precision, or model in the controls.)

Example on `/compare-per-dollar/kimi-k26-mi300x-vs-mi355x` (different starting template, different rotation):

At 21 tok/s/user on Kimi K2.5/K2.6, MI300X costs $2.56 per million tokens; MI355X costs $3.78. MI300X is 48% more cost-efficient at this operating point.

MI300X edges MI355X at 28 tok/s/user on Kimi K2.5/K2.6 — $2.83 per million tokens versus $6.14, a 117% cost-per-token gap.

Push Kimi K2.5/K2.6 to 36 tok/s/user and MI300X lands at $3.75 per million tokens against MI355X's $8.46 — MI300X pulls ahead by 126%. (default selection caveat)
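The percentage in each example paragraph appears to be the cost gap measured relative to the cheaper GPU. A minimal sketch of that arithmetic, assuming this reading is right (`costGapPercent` is a hypothetical name, not a function from the PR):

```typescript
// Hypothetical helper: relative cost gap, measured against the cheaper GPU.
// Returns null when the cheaper cost is zero or missing, since no ratio exists.
function costGapPercent(costA: number, costB: number): number | null {
  const cheaper = Math.min(costA, costB);
  const pricier = Math.max(costA, costB);
  if (!(cheaper > 0)) return null;
  return Math.round(((pricier - cheaper) / cheaper) * 100);
}

// Kimi K2.5/K2.6 example rows quoted above (MI300X vs MI355X):
console.log(costGapPercent(2.56, 3.78)); // 48
console.log(costGapPercent(2.83, 6.14)); // 117
console.log(costGapPercent(3.75, 8.46)); // 126
```

The displayed dollar figures are rounded, so recomputing from them will not always reproduce the page's percentage exactly (the DeepSeek rows above are off by a point or more).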

Template pools

| Variant | Pool | Templates |
| --- | --- | --- |
| per-dollar | both GPUs, non-tied, non-zero | 6 |
| per-dollar | near-tie (≤1% cost gap) | 3 |
| per-dollar | zero-cost edge case | 2 |
| per-dollar | single-GPU row | 3 |
| full (/compare) | both GPUs, non-tied, non-zero | 6 |
| full | single-GPU row | 3 |
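One way to picture how a per-dollar row maps to one of these pools is the sketch below. The type names, the use of `null` for a missing GPU, the order of the checks, and measuring the 1% gap against the cheaper cost are all assumptions for illustration; the PR does not show the actual branching code.

```typescript
// Hypothetical pool classifier for a per-dollar row.
type PerDollarPool = "both" | "near-tie" | "zero-cost" | "single-gpu";

interface RowCosts {
  costA: number | null; // null = GPU A has no data at this interactivity target
  costB: number | null;
}

function choosePerDollarPool(row: RowCosts): PerDollarPool {
  const { costA, costB } = row;
  // Only one GPU has a data point for this row.
  if (costA === null || costB === null) return "single-gpu";
  // A zero (or negative) cost cannot be turned into a meaningful ratio.
  if (costA <= 0 || costB <= 0) return "zero-cost";
  // Relative gap against the cheaper GPU; 1% or less counts as a near-tie.
  const gap = Math.abs(costA - costB) / Math.min(costA, costB);
  return gap <= 0.01 ? "near-tie" : "both";
}

console.log(choosePerDollarPool({ costA: 0.21, costB: 0.25 })); // "both"
console.log(choosePerDollarPool({ costA: 1.0, costB: 1.005 })); // "near-tie"
```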

SSR-deterministic selection

Template selection uses `pickRotated(pool, pageSeed, rowIndex)`:

  1. `pageSeed` = `` `${variant}|${modelLabel}|${aLabel}|${bLabel}` ``, which is stable across renders and varies per page
  2. `start = hashStr(pageSeed) % pool.length` — different pages start at different points in the rotation
  3. `templateIndex = (start + rowIndex) % pool.length` — within a page, 3 paragraphs use 3 consecutive templates (always distinct)
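The three steps above can be sketched as follows. The PR names `hashStr` and `pickRotated` but does not show their bodies, so the hash here (32-bit FNV-1a) and the empty-pool guard are assumptions, not the merged implementation.

```typescript
// Assumed hash: 32-bit FNV-1a over UTF-16 code units. Any stable string hash works.
function hashStr(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0; // force unsigned so the modulo below is never negative
}

function pickRotated<T>(pool: readonly T[], pageSeed: string, rowIndex: number): T {
  if (pool.length === 0) throw new Error("template pool is empty");
  const start = hashStr(pageSeed) % pool.length; // per-page starting point
  return pool[(start + rowIndex) % pool.length] as T; // advance by one per row
}

// Placeholder template names; the real pools hold sentence templates.
const pool = ["T0", "T1", "T2", "T3", "T4", "T5"];
const seed = ["per-dollar", "DeepSeek V4 Pro", "B200", "MI355X"].join("|");
const picks = [0, 1, 2].map((row) => pickRotated(pool, seed, row));
console.log(picks); // three consecutive templates, identical on every run
```

Because the only inputs are the seed string and the row index, the server render and the client hydration pass compute the same indices.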

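Why rotation and not random sampling? Drawing 3 templates independently from a pool of 6 repeats at least one template about 44% of the time (1 − (6·5·4)/6³ ≈ 0.44, the birthday problem). Rotation guarantees zero within-page collisions for rows drawn from the same pool, as long as that pool holds at least as many templates as the page has paragraphs, while still varying across pages.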
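The collision chance for independent uniform sampling can be checked with a few lines of arithmetic. This is a standalone sketch (`collisionProbability` is a hypothetical name) and is not part of the PR:

```typescript
// Probability that at least two of `draws` independent uniform picks
// from `poolSize` templates land on the same template.
function collisionProbability(poolSize: number, draws: number): number {
  let allDistinct = 1;
  for (let i = 0; i < draws; i++) {
    // The i-th pick must avoid the i templates already used.
    allDistinct *= Math.max(poolSize - i, 0) / poolSize;
  }
  return 1 - allDistinct;
}

console.log(collisionProbability(6, 3).toFixed(3)); // "0.444"
console.log(collisionProbability(3, 3).toFixed(3)); // "0.778"
```

For the 3-template pools the repeat rate under random sampling would be even higher, which is where rotation helps most.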

SEO stability: Same URL → same prose every request → crawlers see consistent content. No `Math.random()`, no `Date.now()`, no client state.

Files

  • `packages/app/src/lib/compare-ssr.ts` — `compareTableNarrative` now returns `string[]` instead of `string`. Adds `hashStr`, `bandFor`, `pickRotated`, the 6 template pools, and helper input types.
  • `packages/app/src/app/compare/[slug]/page-client.tsx`: renders the array as `<p>` elements, with the caveat appended to the last one

  • `packages/app/src/app/compare-per-dollar/[slug]/page-client.tsx` — same change for per-dollar route
  • (Both `page.tsx` callers unchanged structurally — `compareTableNarrative` returns `string[]` now, prop type changed accordingly.)
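The "caveat on the last paragraph only" behaviour can be expressed as a small pure function, separate from the JSX. This is a sketch under assumed names (`withCaveatOnLast` is hypothetical); the actual `page-client.tsx` code is not shown in the PR.

```typescript
// Hypothetical helper: append the default-selection caveat to the final
// paragraph only, leaving the earlier paragraphs untouched.
function withCaveatOnLast(paragraphs: readonly string[], caveat: string): string[] {
  const last = paragraphs.length - 1;
  return paragraphs.map((p, i) => (i === last ? `${p} ${caveat}` : p));
}

const rendered = withCaveatOnLast(
  ["First target.", "Second target.", "Third target."],
  "(Numbers reflect the default selection for this URL.)",
);
console.log(rendered);
```

Each returned string would then map to one paragraph element in the client component.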

Verification

Local dev (Neon URL, blob tokens stripped to bypass dev-only cache issue):

```
/compare-per-dollar/deepseek-v4-b200-vs-mi355x → 3 paragraphs at 17/30/42 tok/s/user, all distinct templates
/compare-per-dollar/kimi-k26-mi300x-vs-mi355x → 3 paragraphs at 21/28/36 tok/s/user, starts at a different template
/compare/deepseek-v4-b200-vs-mi355x → 3 paragraphs in 'full' variant (cost + throughput per paragraph)
```

Lint, format, typecheck all green via pre-commit.

Test plan

  • Click-through Vercel preview: visit a few slug pages on both `/compare` and `/compare-per-dollar`, confirm 3 distinct prose paragraphs above each table
  • Pages for different (model, GPU pair) combos pick different starting templates → catalog reads with variety
  • Same page hard-refreshed renders the same 3 paragraphs (deterministic, no flicker)

🤖 Generated with Claude Code


Note

Low Risk
Presentation and SEO copy only; benchmark math and routing are unchanged aside from narrative shape and display labels.

Overview
Compare slug pages now ship one SSR narrative paragraph per default interactivity row (typically three) instead of a single headline blurb. compareTableNarrative returns string[] and builds prose from template pools for /compare (cost + throughput) and /compare-per-dollar (cost-focused), with tie, zero-cost, and single-GPU branches. Deterministic hashStr + pickRotated picks distinct consecutive templates per page and stable copy per URL for SEO/hydration.

UI: /compare and /compare-per-dollar slug clients render a stacked block of paragraphs; the default sequence/precision caveat stays on the last paragraph only.

Copy/labels: COMPARE_MODEL_SLUGS labels and index metadata strings add scale hints (e.g. 1.6T, 1T, 397B-A17B); one test expectation updated for Kimi.

Reviewed by Cursor Bugbot for commit 847bffe. Bugbot is set up for automated code reviews on this repo. Configure here.

…templated variety

compareTableNarrative now returns string[] — one paragraph per ssrRow (3
default targets, so 3 paragraphs per slug page). Each paragraph picks a
template from a per-situation pool:

  - PER_DOLLAR_BOTH_TEMPLATES — 6 variants for the cost-only narrative
  - PER_DOLLAR_TIED_TEMPLATES — 3 variants for the near-tie case
  - PER_DOLLAR_ZERO_TEMPLATES — 2 variants for zero-cost edge case
  - PER_DOLLAR_SINGLE_TEMPLATES — 3 variants for one-GPU-only rows
  - FULL_BOTH_TEMPLATES — 6 variants for cost+throughput narrative
  - FULL_SINGLE_TEMPLATES — 3 variants for one-GPU-only rows in full mode

Template selection is SSR-deterministic via pickRotated(pool, pageSeed,
rowIndex): the per-page hash picks where to START in the rotation; rowIndex
advances by 1 from there. Result:

  - Same page renders the same 3 paragraphs every request (SSR + hydration
    agree, crawlers see stable text)
  - Different pages start at different points in the rotation → catalog
    reads with variety
  - Within a page, the 3 paragraphs always use 3 *distinct* templates —
    rotation avoids the birthday-problem collisions that random sampling
    of 3 from 6 produces

Page-client renders the paragraph array, with the "(default selection)"
caveat appended only to the last paragraph so the block reads as one piece
of prose instead of three separate footnotes.

Verified live: /compare-per-dollar/deepseek-v4-b200-vs-mi355x renders three
different templates at 17/30/42 tok/s/user;
/compare-per-dollar/kimi-k26-mi300x-vs-mi355x starts at a different template
and also rotates through three distinct ones at 21/28/36 tok/s/user.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel Bot commented May 26, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| inferencemax-app | Ready | Preview, Comment | May 26, 2026 2:17am |

- DeepSeek V4 Pro → "DeepSeek V4 Pro 1.6T"
- Kimi K2.5/K2.6 → "Kimi K2.5/K2.6 1T"
- Qwen 3.5 → "Qwen 3.5 397B-A17B" (397B total, 17B active per forward pass)

The other models already have their scale in the name (gpt-oss 120B,
Llama 3.3 70B) or weren't in the user's spec for this update (DeepSeek R1,
GLM 5/5.1, MiniMax M2.5/M2.7). The CompareModelSlug.label drives section
headings, slug-page H1, page title, OG eyebrow, JSON-LD ItemList/Dataset
name, and the SSR narrative prose — so a single registry edit propagates
to every surface on /compare/* and /compare-per-dollar/*.

The two static DESCRIPTION constants used for meta descriptions also got
the param-count form for consistency with the rendered headings.

32/32 unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@functionstackx functionstackx merged commit 5fbe082 into master May 26, 2026
18 checks passed
@functionstackx functionstackx deleted the feat/compare-narrative-3-paragraphs branch May 26, 2026 02:20
@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


```ts
const both = [costPart, tputPart].filter(Boolean).join('; ');
return both.length > 0
  ? `${both.charAt(0).toUpperCase()}${both.slice(1)}`
  : 'numbers are too close to call';
```

Misleading fallback text when data is missing, not tied

Low Severity

The fullSummary fallback returns 'numbers are too close to call' when both costPart and tputPart are null. But these are null when cost or throughput data is zero/missing (e.g. costOk = false and tputOk = false), not when numbers are close. The old code gracefully omitted the comparison sentence with just a space; the new code inserts factually incorrect prose that contradicts the actual zero values ($0.00, 0 tok/s/GPU) displayed in the same template sentence.
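One possible shape for a fix, sketched under assumptions: the real signature of `fullSummary` is not shown in this PR, so the parameters and the `null` return below are hypothetical. The idea is to signal "nothing to say" and let the template omit the comparison sentence, as the old code did, rather than claim a tie.

```typescript
// Hypothetical signature; only the body's filter/join/capitalize logic
// comes from the snippet quoted in the review.
function fullSummary(costPart: string | null, tputPart: string | null): string | null {
  const both = [costPart, tputPart].filter(Boolean).join("; ");
  // Both parts missing means data is zero or absent, not that the GPUs are tied.
  if (both.length === 0) return null;
  return `${both.charAt(0).toUpperCase()}${both.slice(1)}`;
}

console.log(fullSummary("b200 is cheaper", null)); // "B200 is cheaper"
console.log(fullSummary(null, null)); // null
```

A caller would then drop the sentence when the result is `null`.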

