
Commit 7a01b60

more images
1 parent ea5b309 commit 7a01b60

7 files changed

+38
-20
lines changed


public/blog-assets/tanstack-start-ssr-performance-600-percent/after-build-location.png renamed to public/blog-assets/tanstack-start-ssr-performance-600-percent/build-location-after.png

File renamed without changes.

public/blog-assets/tanstack-start-ssr-performance-600-percent/before-build-location.png renamed to public/blog-assets/tanstack-start-ssr-performance-600-percent/build-location-before.png

File renamed without changes.

src/blog/tanstack-start-ssr-performance-600-percent.md

Lines changed: 38 additions & 20 deletions
@@ -59,8 +59,8 @@ This is transferable: isolate the subsystem you want to improve, and benchmark t
5959

6060
We used [`autocannon`](https://github.com/mcollina/autocannon) to generate a 30s sustained load. We tracked:
6161

62-
- req/s
63-
- latency distribution (avg, p95, p99)
62+
- requests per second (req/s)
63+
- latency distribution (average, p95, p99)
6464

6565
Example command (adjust concurrency and route):
6666

@@ -133,12 +133,12 @@ See: [#6442](https://github.com/TanStack/router/pull/6442), [#6447](https://gith
133133

134134
### How we proved it internally
135135

136-
Like every PR in this series, this was profiling the impacted method before and after the change. For example we can see in the example below that the `buildLocation` method went from being one of the major bottlenecks of a navigation to being a very small part of the overall cost:
136+
Like every PR in this series, this change was validated by profiling the impacted method before and after. For example, the profiles below show that the `buildLocation` method went from being one of the major bottlenecks of a navigation to being a very small part of the overall cost:
137137

138138
| | |
139139
| ------ | --------------------------------------------------------------------------------------------------------------------------------------- |
140-
| Before | ![CPU profiling of buildLocation before the changes](/blog-assets/tanstack-start-ssr-performance-600-percent/before-build-location.png) |
141-
| After | ![CPU profiling of buildLocation after the changes](/blog-assets/tanstack-start-ssr-performance-600-percent/after-build-location.png) |
140+
| Before | ![CPU profiling of buildLocation before the changes](/blog-assets/tanstack-start-ssr-performance-600-percent/build-location-before.png) |
141+
| After | ![CPU profiling of buildLocation after the changes](/blog-assets/tanstack-start-ssr-performance-600-percent/build-location-after.png) |
142142

143143
## Finding 2: SSR does not need reactivity
144144

@@ -147,15 +147,15 @@ Like every PR in this series, this was profiling the impacted method before and
147147
SSR renders once per request.[^ssr-streaming] There is no ongoing UI to reactively update, so on the server:
148148

149149
- store subscriptions add overhead but provide no benefit
150-
- structural sharing[^structural-sharing] (replace-equal) reduces re-renders, but SSR does not re-render
151-
- batching reactive notifications is irrelevant if nothing is subscribed
150+
- structural sharing[^structural-sharing] reduces re-renders, but SSR does not re-render
151+
- batching reactive updates is irrelevant if nothing is subscribed
152152

153153
### The transferable pattern
154154

155-
If you have a runtime that supports both client reactivity and SSR, separate them:
155+
If your code supports both client reactivity and SSR, gate the reactive machinery so the server can skip it entirely:
156156

157-
- on the server: compute a snapshot and return it
158-
- on the client: subscribe and use structural sharing to reduce render churn
157+
- on the server: return state directly, with no subscriptions and less immutability overhead
158+
- on the client: subscribe normally
159159

160160
This is the difference between "server = a function" and "client = a reactive system".
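
A minimal sketch of that split, under stated assumptions: the `Store` class and the `readState` helper are hypothetical and are not the TanStack Store or TanStack Router API, and `isServer` is passed as a plain parameter here instead of being resolved at build time. The server branch reads the current state without registering a listener; the client branch subscribes.

```typescript
// Hypothetical minimal store for illustration; not the TanStack Store API.
type Listener<T> = (state: T) => void

class Store<T> {
  private state: T
  private listeners = new Set<Listener<T>>()

  constructor(initial: T) {
    this.state = initial
  }

  getState(): T {
    return this.state
  }

  setState(next: T): void {
    this.state = next
    // Notify every subscriber of the new state.
    this.listeners.forEach((listener) => listener(next))
  }

  subscribe(listener: Listener<T>): () => void {
    this.listeners.add(listener)
    return () => {
      this.listeners.delete(listener)
    }
  }

  get listenerCount(): number {
    return this.listeners.size
  }
}

// Assumption: `isServer` is an explicit argument in this sketch.
function readState<T>(
  store: Store<T>,
  isServer: boolean,
  onChange: Listener<T>,
): T {
  if (isServer) {
    // Server: one render per request, so a plain read is enough.
    return store.getState()
  }
  // Client: subscribe so later updates reach the UI.
  store.subscribe(onChange)
  return store.getState()
}
```

On the server path the store never gains a listener, so there is no notification work to do and nothing to clean up at the end of the request.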
161161

@@ -176,6 +176,13 @@ function useRouterState() {
176176

177177
See: [#6497](https://github.com/TanStack/router/pull/6497), [#6482](https://github.com/TanStack/router/pull/6482)
178178

179+
### How we proved it internally
180+
181+
| | |
182+
| ------ | -------------------------------------------------------------------------------------------------------------------------------------- |
183+
| Before | ![CPU profiling of useRouterState before the changes](/blog-assets/tanstack-start-ssr-performance-600-percent/router-state-before.png) |
184+
| After | ![CPU profiling of useRouterState after the changes](/blog-assets/tanstack-start-ssr-performance-600-percent/router-state-after.png) |
185+
179186
## Finding 3: server-only fast paths are worth it (when gated correctly)
180187

181188
### The mechanism
@@ -188,7 +195,7 @@ If you can guard a branch with a **build-time constant** like `isServer`, you ca
188195
- keep the general algorithm for correctness and edge cases
189196
- allow bundlers to delete the server-only branch from client builds
190197

191-
In TanStack Start, `isServer` is provided via build-time resolution (client: `false`, server: `true`, dev/test: `undefined` with fallback). Modern bundlers like Vite, Rollup, and esbuild perform dead code elimination (DCE)[^dce], removing unreachable branches when the condition is a compile-time constant.
198+
In TanStack Start, `isServer` is provided via build-time resolution of export conditions[^export-conditions] (client: `false`, server: `true`, dev/test: `undefined` with fallback). Modern bundlers like Vite, Rollup, and esbuild perform dead code elimination (DCE)[^dce], removing unreachable branches when the condition is a compile-time constant.
192199

193200
### The transferable pattern
194201

@@ -197,7 +204,7 @@ Write two implementations:
197204
- **fast path** for the common case
198205
- **general path** for correctness
199206

200-
And gate them behind a build-time constant so you don't ship server-only logic to clients.
207+
And gate them behind a build-time constant so you don't inflate the bundle size for clients.
201208

202209
### What we changed
203210

@@ -208,9 +215,11 @@ And gate them behind a build-time constant so you don't ship server-only logic t
208215

209216
if (isServer) {
210217
// server-only fast path (removed from client bundle)
211-
return fastServerPath(input)
218+
if (isCommonCase(input)) {
219+
return fastServerPath(input)
220+
}
212221
}
213-
// general algorithm (used on client, fallback on server in dev)
222+
// general algorithm that handles all cases
214223
return generalPath(input)
215224
```
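
As one concrete but hypothetical instance of this shape (it is not the code from the linked PRs), consider decoding a path segment. A string without `%` contains no percent-escapes, so decoding it is a no-op and the server can return it unchanged. The name `decodeSegment` and the bodies of `isCommonCase`, `fastServerPath`, and `generalPath` are illustrative assumptions, and `isServer` is a local constant standing in for the build-time one.

```typescript
// Assumption: stands in for the build-time `isServer` constant.
const isServer: boolean = true

// Common case: no percent-escapes, so there is nothing to decode.
function isCommonCase(input: string): boolean {
  return !input.includes('%')
}

// Fast path: only valid when isCommonCase(input) is true.
function fastServerPath(input: string): string {
  return input
}

// General path: handles every input, including malformed escapes.
function generalPath(input: string): string {
  try {
    return decodeURIComponent(input)
  } catch {
    // Malformed escape sequence: fall back to the raw input.
    return input
  }
}

function decodeSegment(input: string): string {
  if (isServer) {
    if (isCommonCase(input)) {
      return fastServerPath(input)
    }
  }
  // General algorithm that handles all cases.
  return generalPath(input)
}
```

The fast path is safe only because it returns exactly what the general path would return for the same input; that equivalence is the property to test whenever a shortcut like this is added.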
216225

@@ -243,6 +252,13 @@ return rest
243252

244253
See: [#6456](https://github.com/TanStack/router/pull/6456), [#6515](https://github.com/TanStack/router/pull/6515)
245254

255+
### How we proved it internally
256+
257+
| | |
258+
| ------ | ------------------------------------------------------------------------------------------------------------------------------------- |
259+
| Before | ![CPU profiling of startViewTransition before the changes](/blog-assets/tanstack-start-ssr-performance-600-percent/delete-before.png) |
260+
| After | ![CPU profiling of startViewTransition after the changes](/blog-assets/tanstack-start-ssr-performance-600-percent/delete-after.png) |
261+
246262
## Results
247263

248264
Benchmark: placeholder text, should link to Matteo's article.
@@ -267,15 +283,15 @@ The following graphs show event-loop utilization against throughput for each fea
267283

268284
#### links-100
269285

270-
![Event-loop utilization vs throughput for links-100, before and after](/blog-assets/tanstack-start-ssr-performance-600-percent/links-after.png)
286+
![Event-loop utilization vs throughput for links-100, before and after](/blog-assets/tanstack-start-ssr-performance-600-percent/elu-links.png)
271287

272288
#### layouts-26-with-params
273289

274-
![Event-loop utilization vs throughput for nested routes, before and after](/blog-assets/tanstack-start-ssr-performance-600-percent/nested-after.png)
290+
![Event-loop utilization vs throughput for nested routes, before and after](/blog-assets/tanstack-start-ssr-performance-600-percent/elu-nested.png)
275291

276292
#### empty (baseline)
277293

278-
![Event-loop utilization vs throughput for minimal route, before and after](/blog-assets/tanstack-start-ssr-performance-600-percent/nothing-after.png)
294+
![Event-loop utilization vs throughput for minimal route, before and after](/blog-assets/tanstack-start-ssr-performance-600-percent/elu-empty.png)
279295

280296
### Flamegraph evidence slots
281297

@@ -304,10 +320,12 @@ There were many other improvements (client and server) not covered here. SSR per
304320

305321
[^webkit-delete-ic]: WebKit, "A Tour of Inline Caching with Delete" `https://webkit.org/blog/10298/inline-caching-delete/`
306322

307-
[^structural-sharing]: Structural sharing is a pattern from immutable data libraries (Immer, React Query, TanStack Store) where unchanged portions of data structures are reused by reference to minimize allocation and enable cheap equality checks.
323+
[^structural-sharing]: Structural sharing is a pattern from immutable data libraries (Immer, React Query, TanStack Store) where unchanged portions of data structures are reused by reference to enable cheap equality checks. See [Structural Sharing](https://tanstack.com/query/latest/docs/framework/react/guides/render-optimizations#structural-sharing) in the TanStack Query documentation.
308324

309-
[^ssr-streaming]: With streaming SSR and Suspense, the server may render multiple chunks, but each chunk is still a single-pass render with no reactive updates.
325+
[^ssr-streaming]: With streaming SSR and Suspense, the server may render multiple chunks, but each chunk is still a single-pass render with no reactive updates. See [renderToPipeableStream](https://react.dev/reference/react-dom/server/renderToPipeableStream) in the React documentation.
310326

311327
[^url-cost]: The WHATWG URL Standard requires significant parsing work: scheme detection, authority parsing, path normalization, query string handling, and percent-encoding. See the [URL parsing algorithm](https://url.spec.whatwg.org/#url-parsing) for the full state machine.
312328

313-
[^dce]: Dead code elimination is a standard compiler optimization. See esbuild's documentation on [tree shaking](https://esbuild.github.io/api/#tree-shaking) and Rollup's [tree-shaking guide](https://rollupjs.org/introduction/#tree-shaking).
329+
[^export-conditions]: Conditional exports are a Node.js feature that allows packages to define different entry points based on environment or import method. See [Conditional exports](https://nodejs.org/api/packages.html#conditional-exports) in the Node.js documentation.
330+
331+
[^dce]: Dead code elimination is a standard compiler optimization. See esbuild's documentation on [tree shaking](https://esbuild.github.io/api/#tree-shaking), Rollup's [tree-shaking guide](https://rollupjs.org/introduction/#tree-shaking) and Rich Harris's article on [dead code elimination](https://medium.com/@Rich_Harris/tree-shaking-versus-dead-code-elimination-d3765df85c80).
