
Commit c5b496b

use our own numbers, and fixes
1 parent ad28f24 commit c5b496b

File tree

1 file changed: +17 −23 lines changed


src/blog/tanstack-start-ssr-performance-600-percent.md

Lines changed: 17 additions & 23 deletions
@@ -3,26 +3,26 @@ published: 2026-02-01
 authors:
   - Manuel Schiller
   - Florian Pellet
-title: 'From 3000ms to 14ms: CPU profiling of TanStack Start SSR under heavy load'
-# title: 'Profile, Fix, Repeat: 2x SSR Throughput in 20 PRs'
-# title: '99.5% Latency Reduction in 20 PRs'
-# title: '231x Latency Drop: SSR Flamegraphs under heavy load'
-# title: '343x Faster Latency p95: Profiling SSR Hot Paths in TanStack Start'
+title: '5x SSR Throughput: CPU profiling of TanStack Start SSR under heavy load'
+# title: 'Profile, Fix, Repeat: 5x SSR Throughput in 20 PRs'
+# title: '10x Latency Reduction in 20 PRs'
+# title: '10x Latency Drop: SSR Flamegraphs under heavy load'
+# title: '5x SSR Throughput: Profiling SSR Hot Paths in TanStack Start'
 ---

 ## TL;DR

 We improved TanStack Start's SSR performance dramatically. Under sustained load (100 concurrent connections, 30 seconds):

-<!-- these are matteo's numbers, they don't look amazing (low throughput), maybe we should use our own numbers? we'll cite his in the conclusion anyway. -->
-
-- **Throughput**: 477 req/s → 1,041 req/s (**2.2x**)
-- **Average latency**: 3,171ms → 14ms (**231x faster**)
-- **p95 latency**: 10,001ms (timeout) → 29ms (**343x faster**)
-- **Success rate**: 75% → 100% (the server stopped failing under load)
+- **Throughput**: 427 req/s → 2357 req/s (**5.5x**)
+- **Average latency**: 424ms → 43ms (**9.9x faster**)
+- **p99 latency**: 6558ms → 928ms (**7.1x faster**)
+- **Success rate**: 99.96% → 100% (the server stopped failing under load)

 For SSR-heavy deployments, this translates directly to lower hosting costs, the ability to handle traffic spikes without scaling, and eliminating user-facing errors.

+This work started after `v1.154.4` and targets server-side rendering performance. The goal was to increase throughput and reduce server CPU time per request while keeping correctness guarantees.
+
 We did it with a repeatable process, not a single clever trick:

 - **Measure under load**, not in microbenchmarks.
@@ -35,20 +35,14 @@ We did it with a repeatable process, not a single clever trick:

 The changes span over [20 PRs](https://github.com/TanStack/router/compare/v1.154.4...v1.157.18); we highlight the highest-impact patterns below.

-<!-- the "What we optimized" section and "Methodology" feel a little redundant because "what we optimized" doesn't actually say what we optimized, just *how* we did it, which is part of the methodology. -->
-
-## What we optimized (and what we did not)
-
-This work started after `v1.154.4` and targets server-side rendering performance. The goal was to increase throughput and reduce server CPU time per request while keeping correctness guarantees.
+## Methodology: feature-focused endpoints + flamegraphs

-We are not claiming that any single line of code is "the" reason. This work spanned over 20 PRs, with still more to come. And every change was validated by:
+We are not claiming that any single line of code is "the" reason. This work spanned over 20 PRs, with still more to come. Every change was validated by:

-- a stable load test
-- a CPU profile (flamegraph)
+- a stable load test (same endpoint, same load)
+- a CPU profile (flamegraph) that explains the delta
 - a before/after comparison on the same benchmark endpoint

-## Methodology: feature-focused endpoints + flamegraphs
-
 ### Why feature-focused endpoints

 We did not benchmark "a representative app page". We used endpoints that exaggerate a feature so the profile is unambiguous:
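[Editor's note: the "feature-focused endpoint" idea in the hunk above can be sketched as follows. This is illustrative Node code, not the benchmark endpoints actually used in these PRs; the endpoint does one thing many times over so that one feature dominates the CPU profile instead of being lost in general rendering noise.]

```typescript
// Illustrative "feature-focused" benchmark endpoint (hypothetical code,
// not from the TanStack repo): render one feature -- links -- thousands
// of times so it is the unambiguous hot path in a flamegraph.
import http from 'node:http'

function renderManyLinks(count: number): string {
  let html = '<ul>'
  for (let i = 0; i < count; i++) {
    html += `<li><a href="/posts/${i}">post ${i}</a></li>`
  }
  return html + '</ul>'
}

const server = http.createServer((_req, res) => {
  res.setHeader('content-type', 'text/html')
  // Exaggerated on purpose: 10,000 links drown out everything else
  // in the profile, so any change to link handling is easy to see.
  res.end(renderManyLinks(10_000))
})

server.listen(0, () => {
  console.log('benchmark endpoint listening', server.address())
})
```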
@@ -195,7 +189,7 @@ Taking the example of the `useRouterState` hook, we can see that most of the cli

 ### The mechanism

-Client code cares about bundle size. Server code cares about CPU time per request. Those constraints are different (this is a _general_ rule, not a _universal_ one).
+As a general rule, client code cares about bundle size, while server code cares about CPU time per request. Those constraints are different.

 If you can guard a branch with a **build-time constant** like `isServer`, you can:
@@ -294,7 +288,7 @@ Benchmark: placeholder text, should link to Matteo's article.

 The "before" numbers show a server under severe stress: 25% of requests failed (likely timeouts), and p90/p95 hit the 10s timeout ceiling. After the optimizations, the server handles the same load comfortably with sub-30ms tail latency and zero failures.

-To be clear: TanStack Start was not broken before these changes. Under normal traffic, SSR worked fine. These numbers reflect behavior under _sustained heavy load_the kind you see during traffic spikes or load testing. The optimizations ensure the server degrades gracefully instead of falling over.
+To be clear: TanStack Start was not broken before these changes. Under normal traffic, SSR worked fine. These numbers reflect behavior under _sustained heavy load_ (the kind you see during traffic spikes or load testing). The optimizations ensure the server degrades gracefully instead of falling over.

 ### Event-loop utilization