title: '600% faster SSR: profiling and eliminating server hot paths in TanStack Router'
3
-
published: 2026-02-04
4
-
authors:
5
-
- Manuel Schiller
6
-
- Florian Pellet
2
+
id: ssr-performance-600-percent
3
+
title: "From 3000ms to 14ms: profiling hot paths and eliminating bottlenecks in TanStack Start"
7
4
---
8
5
9
6
## Executive summary
10
7
11
-
We improved TanStack Router’s SSR request throughput by about **600%** (placeholder: **~16k → ~96k requests in 30s**). We did it with a repeatable process, not a single clever trick:
8
+
We improved TanStack Router's SSR performance dramatically. Under sustained load:
- **Success rate**: 75% → 100% (the server stopped failing under load)
14
+
15
+
For SSR-heavy deployments, this translates directly into lower hosting costs, the ability to absorb traffic spikes without scaling, and the elimination of user-facing errors.
16
+
17
+
We did it with a repeatable process, not a single clever trick:
12
18
13
19
- **Measure under load**, not in microbenchmarks.
14
20
- Use CPU profiling to find the highest-impact work.
@@ -18,13 +24,13 @@ We improved TanStack Router’s SSR request throughput by about **600%** (placeh
18
24
- add server-only fast paths behind a build-time `isServer` flag
19
25
- avoid `delete` in performance-sensitive code
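To make the `isServer` item above concrete, here is a minimal sketch. It is hypothetical and not TanStack Router's code: `nextState` and `reuseIfShallowEqual` are made-up names, and skipping a reference-reuse step on the server is an assumed example of a fast path. `isServer` is computed at runtime here so the snippet runs standalone; in a real build, the bundler would inline a literal so the unused branch can be dropped as dead code.

```typescript
// Stand-in for a build-time constant (assumption). A bundler would normally
// replace this with a literal `true` or `false` so one branch disappears.
const isServer: boolean =
  typeof (globalThis as { document?: unknown }).document === 'undefined'

// Hypothetical client-side helper: return `prev` by reference when it is
// shallowly equal to `next`, so later identity checks stay cheap.
function reuseIfShallowEqual<T extends Record<string, unknown>>(
  prev: T | undefined,
  next: T,
): T {
  if (!prev) return next
  const nextKeys = Object.keys(next)
  if (Object.keys(prev).length !== nextKeys.length) return next
  for (const key of nextKeys) {
    const hasKey = Object.prototype.hasOwnProperty.call(prev, key)
    if (!hasKey || !Object.is(prev[key], next[key])) return next
  }
  return prev
}

// Hypothetical state update with a server-only fast path.
function nextState<T extends Record<string, unknown>>(
  prev: T | undefined,
  next: T,
  server: boolean = isServer,
): T {
  // Server: a request renders once, so preserving identity buys nothing.
  if (server) return next
  // Client: keep the old reference when nothing changed.
  return reuseIfShallowEqual(prev, next)
}
```

The `server` parameter exists only so both branches can be exercised in one process; with a true build-time constant, the client-only comparison would not ship in the server bundle at all.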
20
26
21
-
This article focuses on methodology and mechanisms you can reuse in any SSR framework.
27
+
The changes span ~20 PRs; we highlight the highest-impact patterns below. This article focuses on methodology and mechanisms you can reuse in any SSR framework.
22
28
23
29
## What we optimized (and what we did not)
24
30
25
31
This work started after `v1.154.4` and targets server-side rendering performance. The goal was to increase throughput and reduce server CPU time per request while keeping correctness guarantees.
26
32
27
-
We are not claiming that any single line of code is “the” reason. Every change was validated by:
33
+
We are not claiming that any single line of code is "the" reason. This work spanned over 20 PRs, with more still to come, and every change was validated by:
28
34
29
35
- a stable load test
30
36
- a CPU profile (flamegraph)
@@ -34,7 +40,7 @@ We are not claiming that any single line of code is “the” reason. Every chan
34
40
35
41
### Why feature-focused endpoints
36
42
37
-
We did not benchmark “a representative app page”. We used endpoints that exaggerate a feature so the profile is unambiguous:
43
+
We did not benchmark "a representative app page". We used endpoints that exaggerate a feature so the profile is unambiguous:
38
44
39
45
- **`links-100`**: renders ~100 links to stress link rendering and location building.
40
46
- **`layouts-26-with-params`**: deep nesting + params to stress matching and path/param work.
@@ -72,6 +78,27 @@ Placeholders you should replace with real screenshots:
72
78
- `<!-- FLAMEGRAPH: layouts-26-with-params before -->`
73
79
- `<!-- FLAMEGRAPH: layouts-26-with-params after -->`
74
80
81
+
### Reproducing these benchmarks
82
+
83
+
**Environment:**
84
+
85
+
Our benchmarks were stable enough to produce very similar results across a range of setups. For reference, here is the exact environment we used to run them:
86
+
- Node.js: v24.12.0
87
+
- Hardware: MacBook Pro M3
88
+
- OS: macOS 15.7
89
+
90
+
**Running the benchmark:**
91
+
92
+
For fast iteration, we set up a single `pnpm bench` command that would concurrently:
93
+
- start the built server through `@platformatic/flame` to profile it
94
+
```sh
95
+
flame run ./dist/server.mjs
96
+
```
97
+
- run `autocannon` to stress the server by firing many requests at it
Modern engines optimize property access using object “shapes” (e.g. V8 HiddenClasses / JSC Structures) and inline caches. `delete` changes an object’s shape and can force a slower internal representation (e.g. dictionary/slow properties), which can disable or degrade those optimizations and deopt optimized code.[^v8-fast-properties][^webkit-delete-ic]
217
+
Modern engines optimize property access using object "shapes" (e.g. V8 HiddenClasses / JSC Structures) and inline caches. `delete` changes an object's shape and can force a slower internal representation (e.g. dictionary/slow properties), which can disable or degrade those optimizations and deopt optimized code.[^v8-fast-properties][^webkit-delete-ic]
142
218
143
219
### The transferable pattern
144
220
145
-
Avoid `delete` in hot paths. Prefer patterns that don’t mutate object shapes in-place:
221
+
Avoid `delete` in hot paths. Prefer patterns that don't mutate object shapes in-place:
146
222
147
223
- set a property to `undefined` (when semantics allow)
148
-
- create a new object without the key (object rest destructuring) when you need a “key removed” shape
224
+
- create a new object without the key (object rest destructuring) when you need a "key removed" shape
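The two alternatives in the list above can be sketched as follows. This is an illustration under stated assumptions, not code from the router: `MatchState`, `clearErrorInPlace`, and `withoutError` are hypothetical names chosen for the example.

```typescript
// Hypothetical state shape, for illustration only.
interface MatchState {
  id: string
  status: string
  pendingError?: unknown
}

// Alternative 1: keep the key and clear its value. The object keeps the
// same set of keys, which avoids the shape change `delete` would cause.
// Only valid when "present but undefined" is acceptable to callers.
function clearErrorInPlace(state: MatchState): MatchState {
  state.pendingError = undefined
  return state
}

// Alternative 2: build a new object without the key via rest destructuring.
// The input object is left untouched.
function withoutError(state: MatchState): Omit<MatchState, 'pendingError'> {
  const { pendingError: _omitted, ...rest } = state
  return rest
}
```

The first form is the cheaper one, but it leaves the key visible to `in` checks and `Object.keys`. The second allocates a new object, so it fits cases where a genuinely key-free result is needed.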
149
225
150
-
## Results (placeholders)
226
+
### What we changed
151
227
152
-
Replace the placeholders below with your final measurements and keep the raw `autocannon` output in your internal notes.
The "before" numbers show a server under severe stress: 25% of requests failed (likely timeouts), and p90/p95 hit the 10s timeout ceiling. After the optimizations, the server handles the same load comfortably with sub-30ms tail latency and zero failures.
169
255
170
256
### Flamegraph evidence slots
171
257
@@ -182,13 +268,15 @@ There were many other improvements (client and server) not covered here. SSR per
182
268
183
269
## Fill-in checklist before publishing
184
270
185
-
- [ ] Replace throughput placeholders (req/30s) with final numbers.
186
-
- [ ] Replace latency placeholders (avg/p95/p99) with final numbers.
187
-
- [ ] Insert flamegraph screenshots and annotate the “before” hotspots and “after” removal.
271
+
- [x] Replace throughput placeholders with final numbers.
272
+
- [x] Replace latency placeholders (avg/p90/p95) with final numbers.
273
+
- [ ] Insert flamegraph screenshots and annotate the "before" hotspots and "after" removal.
188
274
- [ ] Ensure every external claim has a citation and every internal claim has evidence.
[^v8-fast-properties]: V8 team, “Fast properties in V8” `https://v8.dev/blog/fast-properties`
193
-
194
-
[^webkit-delete-ic]: WebKit, “A Tour of Inline Caching with Delete” `https://webkit.org/blog/10298/inline-caching-delete/`
279
+
[^v8-fast-properties]: V8 team, "Fast properties in V8" `https://v8.dev/blog/fast-properties`
280
+
[^webkit-delete-ic]: WebKit, "A Tour of Inline Caching with Delete" `https://webkit.org/blog/10298/inline-caching-delete/`
281
+
[^structural-sharing]: Structural sharing is a pattern from immutable data libraries (Immer, React Query, TanStack Store) where unchanged portions of data structures are reused by reference to minimize allocation and enable cheap equality checks.
282
+
[^ssr-streaming]: With streaming SSR and Suspense, the server may render multiple chunks, but each chunk is still a single-pass render with no reactive updates.