src/blog/tanstack-start-ssr-performance-600-percent.md (+17 −23 lines)
@@ -3,26 +3,26 @@ published: 2026-02-01
authors:
  - Manuel Schiller
  - Florian Pellet
title: '5x SSR Throughput: CPU profiling of TanStack Start SSR under heavy load'
# title: 'Profile, Fix, Repeat: 5x SSR Throughput in 20 PRs'
# title: '10x Latency Reduction in 20 PRs'
# title: '10x Latency Drop: SSR Flamegraphs under heavy load'
# title: '5x SSR Throughput: Profiling SSR Hot Paths in TanStack Start'
---

## TL;DR

We improved TanStack Start's SSR performance dramatically. Under sustained load (100 concurrent connections, 30 seconds):

- **Success rate**: 99.96% → 100% (the server stopped failing under load)

For SSR-heavy deployments, this translates directly to lower hosting costs, the ability to handle traffic spikes without scaling, and eliminating user-facing errors.

This work started after `v1.154.4` and targets server-side rendering performance. The goal was to increase throughput and reduce server CPU time per request while keeping correctness guarantees.

We did it with a repeatable process, not a single clever trick:
- **Measure under load**, not in microbenchmarks.
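As an illustration, a sustained-load run matching the stated parameters might look like this (autocannon, the port, and the route are assumptions here, not necessarily the harness the post used):

```shell
# Illustrative only: 100 concurrent connections for 30 seconds against a
# locally running SSR server. Any HTTP load generator works; autocannon is
# convenient because it reports throughput and latency percentiles directly.
npx autocannon --connections 100 --duration 30 http://localhost:3000/
```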
@@ -35,20 +35,14 @@ We did it with a repeatable process, not a single clever trick:

The changes span over [20 PRs](https://github.com/TanStack/router/compare/v1.154.4...v1.157.18); we highlight the highest-impact patterns below.

We are not claiming that any single line of code is "the" reason. This work spanned over 20 PRs, with still more to come. Every change was validated by:

- a stable load test (same endpoint, same load)
- a CPU profile (flamegraph) that explains the delta
- a before/after comparison on the same benchmark endpoint
We did not benchmark "a representative app page". We used endpoints that exaggerate a feature so the profile is unambiguous:
@@ -195,7 +189,7 @@ Taking the example of the `useRouterState` hook, we can see that most of the cli

### The mechanism

As a general rule, client code cares about bundle size, while server code cares about CPU time per request. Those constraints are different.

If you can guard a branch with a **build-time constant** like `isServer`, you can:
@@ -294,7 +288,7 @@ Benchmark: placeholder text, should link to Matteo's article.

The "before" numbers show a server under severe stress: 25% of requests failed (likely timeouts), and p90/p95 hit the 10s timeout ceiling. After the optimizations, the server handles the same load comfortably with sub-30ms tail latency and zero failures.

To be clear: TanStack Start was not broken before these changes. Under normal traffic, SSR worked fine. These numbers reflect behavior under _sustained heavy load_ (the kind you see during traffic spikes or load testing). The optimizations ensure the server degrades gracefully instead of falling over.