|
1 | 1 | --- |
2 | 2 | id: ssr-performance-600-percent |
3 | 3 | title: 'From 3000ms to 14ms: profiling hot paths and eliminating bottlenecks in TanStack Start' |
| 4 | +title: 'From 3000ms to 14ms: CPU profiling of TanStack Start SSR under heavy load' |
| 5 | +title: 'Profile, Fix, Repeat: 2x SSR Throughput in 20 PRs' |
4 | 6 | --- |
5 | 7 |
|
6 | 8 | ## Executive summary |
7 | 9 |
|
8 | | -We improved TanStack Router's SSR performance dramatically. Under sustained load: |
| 10 | +We improved TanStack Router's SSR performance dramatically. Under sustained load (100 concurrent connections, 30 seconds): |
9 | 11 |
|
10 | 12 | - **Throughput**: 477 req/s → 1,041 req/s (**2.2x**) |
11 | 13 | - **Average latency**: 3,171ms → 14ms (**231x faster**) |
@@ -121,14 +123,16 @@ Use cheap predicates first, then fall back to heavyweight parsing only when need |
121 | 123 | const url = new URL(to, base) |
122 | 124 |
|
123 | 125 | // After: check first, parse only if needed |
124 | | -if (isAbsoluteUrl(to)) { |
| 126 | +if (safeInternalUrl(to)) { |
| 127 | + // fast path: internal navigation, no parsing needed |
| 128 | +} else { |
125 | 129 | const url = new URL(to, base) |
126 | 130 | // ...external URL handling |
127 | | -} else { |
128 | | - // fast path: internal navigation, no parsing needed |
129 | 131 | } |
130 | 132 | ``` |
131 | 133 |
|
| 134 | +The `safeInternalUrl` check can be orders of magnitude cheaper than constructing a `URL` object[^url-cost], as long as we accept occasional false negatives: a URL the check does not recognize as internal simply takes the slower parsing path.
| 135 | + |
132 | 136 | See: [#6442](https://github.com/TanStack/router/pull/6442), [#6447](https://github.com/TanStack/router/pull/6447), [#6516](https://github.com/TanStack/router/pull/6516) |
133 | 137 |
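To make the "cheap predicate first" idea concrete, here is a minimal sketch. The names `looksInternal` and `isInternal` are hypothetical, and this is not the actual `safeInternalUrl` implementation from the PRs above; it only illustrates a conservative character scan that falls back to `new URL` whenever it is unsure.

```typescript
// Hypothetical cheap predicate (NOT TanStack Router's safeInternalUrl).
// Returns true only for strings that are obviously same-origin paths.
function looksInternal(to: string): boolean {
  // Must start with exactly one "/"; "//host" is protocol-relative (external).
  if (to.charCodeAt(0) !== 47 || to.charCodeAt(1) === 47) return false
  for (let i = 1; i < to.length; i++) {
    const c = to.charCodeAt(i)
    // Reject backslashes, whitespace and control characters: URL parsers may
    // rewrite or strip them, which could turn the path into "//host".
    if (c === 92 || c <= 32 || c === 127) return false
  }
  return true
}

// Hypothetical wrapper: fast path first, full parsing only when needed.
function isInternal(to: string, base: string): boolean {
  if (looksInternal(to)) return true // fast path, no URL object allocated
  try {
    // Slow path: also catches the predicate's false negatives, e.g. "about".
    return new URL(to, base).origin === new URL(base).origin
  } catch {
    return false // unparsable input is treated as not internal
  }
}
```

The predicate is deliberately strict: anything it cannot classify at a glance is handed to the parser, so a false negative costs time but not correctness.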
|
134 | 138 | ### How we proved it internally |
@@ -186,7 +190,7 @@ If you can guard a branch with a **build-time constant** like `isServer`, you ca |
186 | 190 | - keep the general algorithm for correctness and edge cases |
187 | 191 | - allow bundlers to delete the server-only branch from client builds |
188 | 192 |
|
189 | | -In TanStack Router, `isServer` is provided via build-time resolution (client: `false`, server: `true`, dev/test: `undefined` with fallback), so dead code elimination can remove entire blocks. |
| 193 | +In TanStack Router, `isServer` is provided via build-time resolution (client: `false`, server: `true`, dev/test: `undefined` with fallback). Modern bundlers like Vite, Rollup, and esbuild perform dead code elimination (DCE)[^dce], removing unreachable branches when the condition is a compile-time constant. |
190 | 194 |
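As a rough illustration of the guard shape, the sketch below uses a hypothetical helper, `shareIfEqual`, and a local stand-in for `isServer`. In a real build the bundler would substitute a literal `true` or `false` for the identifier, and the branch that can never run would be dropped from that bundle; running this snippet does not demonstrate the elimination itself.

```typescript
// Hypothetical stand-in for a build-time constant. The cast keeps the
// declared type wide, mirroring the `undefined` dev/test case.
const isServer = true as boolean | undefined

// Hypothetical helper: reuse `prev` when nothing changed so client-side
// consumers can rely on reference equality.
function shareIfEqual<T>(prev: T, next: T): T {
  if (isServer) {
    // Server-only shortcut: a single-pass render never compares references
    // later, so the equality work is skipped. With isServer === false at
    // build time, a bundler can delete this whole block from client builds.
    return next
  }
  // General path kept for correctness (JSON comparison is a simplification).
  return JSON.stringify(prev) === JSON.stringify(next) ? prev : next
}
```

Keeping the guard as a plain `if (isServer)` on a bare identifier matters: a condition the bundler cannot fold to a constant cannot be eliminated.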
|
191 | 195 | ### The transferable pattern |
192 | 196 |
|
@@ -257,6 +261,8 @@ Benchmark: placeholder text, should link to Matteo's article. |
257 | 261 |
|
258 | 262 | The "before" numbers show a server under severe stress: 25% of requests failed (likely timeouts), and p90/p95 hit the 10s timeout ceiling. After the optimizations, the server handles the same load comfortably with sub-30ms tail latency and zero failures. |
259 | 263 |
|
| 264 | +To be clear: TanStack Router was not broken before these changes. Under normal traffic, SSR worked fine. These numbers reflect behavior under *sustained heavy load*—the kind you see during traffic spikes or load testing. The optimizations ensure the server degrades gracefully instead of falling over. |
| 265 | + |
260 | 266 | ### Flamegraph evidence slots |
261 | 267 |
|
262 | 268 | - `<!-- FLAMEGRAPH: links-100 before -->` |
@@ -287,3 +293,7 @@ There were many other improvements (client and server) not covered here. SSR per |
287 | 293 | [^structural-sharing]: Structural sharing is a pattern from immutable data libraries (Immer, React Query, TanStack Store) where unchanged portions of data structures are reused by reference to minimize allocation and enable cheap equality checks. |
288 | 294 |
|
289 | 295 | [^ssr-streaming]: With streaming SSR and Suspense, the server may render multiple chunks, but each chunk is still a single-pass render with no reactive updates. |
| 296 | + |
| 297 | +[^url-cost]: The WHATWG URL Standard requires significant parsing work: scheme detection, authority parsing, path normalization, query string handling, and percent-encoding. See the [URL parsing algorithm](https://url.spec.whatwg.org/#url-parsing) for the full state machine. |
| 298 | + |
| 299 | +[^dce]: Dead code elimination is a standard compiler optimization. See esbuild's documentation on [tree shaking](https://esbuild.github.io/api/#tree-shaking) and Rollup's [tree-shaking guide](https://rollupjs.org/introduction/#tree-shaking). |