-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathbinary_formats_are_better_than_json_in_browsers.html
More file actions
566 lines (522 loc) · 30.1 KB
/
binary_formats_are_better_than_json_in_browsers.html
File metadata and controls
566 lines (522 loc) · 30.1 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
<head>
<title> Binary Formats are Better Than JSON in Browsers! </title>
<style type="text/css">
body {
margin:40px auto;
max-width: 800px;
line-height:1.6;
font-size:18px;
padding:0 10px
}
h1,h2,h3{
line-height:1.2
}
img {
width: 800px;
}
</style>
<script src="https://code.highcharts.com/highcharts.js"></script>
<script type="text/javascript">
// This is the data loaded from https://github.com/adamfaulkner/serialization_bakeoff
const benchmarkData =
[{"name":"json","endToEndMaterializeUnverifiedPojoDuration":4754.9,"deserializeDuration":1178.5680000000007,"bodyReadDuration":1543.5920000000012,"serializeDuration":455.9,"scanForIdPropertyDuration":20.473999999999798,"materializeAsUnverifiedPojoDuration":553.7039999999993,"size":340507421,"zstdCompressedSize":49934519,"zstdDuration":845.6,"materializeAsVerifiedPojoDuration":796.7939999999979},{"name":"proto","endToEndMaterializeUnverifiedPojoDuration":3829.5599999999986,"deserializeDuration":1818.5500000000006,"bodyReadDuration":390.31999999999897,"serializeDuration":283.3,"scanForIdPropertyDuration":10.34800000000032,"materializeAsUnverifiedPojoDuration":642.5419999999998,"size":136972707,"zstdCompressedSize":44935390,"zstdDuration":492.2,"materializeAsVerifiedPojoDuration":501.0120000000032},{"name":"updated proto","endToEndMaterializeUnverifiedPojoDuration":3439.474000000002,"deserializeDuration":1455.7920000000026,"bodyReadDuration":382.12600000000384,"serializeDuration":285.6,"scanForIdPropertyDuration":9.52800000000425,"materializeAsUnverifiedPojoDuration":638.1940000000031,"size":136972707,"zstdCompressedSize":44935390,"zstdDuration":494,"materializeAsVerifiedPojoDuration":524.4259999999965},{"name":"pbf","endToEndMaterializeUnverifiedPojoDuration":3389.4140000000043,"deserializeDuration":1576.5660000000003,"bodyReadDuration":441.4059999999998,"serializeDuration":298.8,"scanForIdPropertyDuration":18.84999999999127,"materializeAsUnverifiedPojoDuration":405.44400000000024,"size":136972707,"zstdCompressedSize":44935390,"zstdDuration":503.8,"materializeAsVerifiedPojoDuration":342.0499999999971},{"name":"msgpack","endToEndMaterializeUnverifiedPojoDuration":4785.529999999999,"deserializeDuration":2406.5960000000023,"bodyReadDuration":683.351999999996,"serializeDuration":290.1,"scanForIdPropertyDuration":33.27599999999511,"materializeAsUnverifiedPojoDuration":641.1160000000003,"size":275082775,"zstdCompressedSize":44123267,"zstdDuration":595.8,"materializeAsVerifiedPojoDuration":890.9400000000053},{"name":"cbor","endToEndMaterializeUnverifiedPojoDuration":4978.405999999997,"deserializeDuration":2557.9020000000137,"bodyReadDuration":684.5460000000021,"serializeDuration":344.4,"scanForIdPropertyDuration":30.539999999996507,"materializeAsUnverifiedPojoDuration":648.0280000000027,"size":275511485,"zstdCompressedSize":44153140,"zstdDuration":587.1,"materializeAsVerifiedPojoDuration":806.8940000000118},{"name":"bebop","endToEndMaterializeUnverifiedPojoDuration":2939.0140000000015,"deserializeDuration":1616.7299999999989,"bodyReadDuration":455.4539999999979,"serializeDuration":186.3,"scanForIdPropertyDuration":17.135999999998603,"materializeAsUnverifiedPojoDuration":0,"size":164174471,"zstdCompressedSize":47341400,"zstdDuration":566,"materializeAsVerifiedPojoDuration":8.885999999998603},{"name":"flatbuffers","endToEndMaterializeUnverifiedPojoDuration":5378.229999999993,"deserializeDuration":0.06399999999557622,"bodyReadDuration":470.1159999999974,"serializeDuration":330.4,"scanForIdPropertyDuration":271.37600000000674,"materializeAsUnverifiedPojoDuration":3804.5739999999932,"size":186317000,"zstdCompressedSize":55135144,"zstdDuration":640.8,"materializeAsVerifiedPojoDuration":3798.344000000006},{"name":"avro","endToEndMaterializeUnverifiedPojoDuration":3641.855999999988,"deserializeDuration":2575.097999999998,"bodyReadDuration":273.13200000000654,"serializeDuration":279.4,"scanForIdPropertyDuration":10.759999999991852,"materializeAsUnverifiedPojoDuration":0.003999999992083758,"size":129227481,"zstdCompressedSize":41469145,"zstdDuration":436.8,"materializeAsVerifiedPojoDuration":0},{"name":"updated avro","endToEndMaterializeUnverifiedPojoDuration":2762.4179999999933,"deserializeDuration":1501.0039999999979,"bodyReadDuration":355.04000000000235,"serializeDuration":288.4,"scanForIdPropertyDuration":10.537999999994645,"materializeAsUnverifiedPojoDuration":0.002000000001862645,"size":129227481,"zstdCompressedSize":41469145,"zstdDuration":476.7,"materializeAsVerifiedPojoDuration":0.001999999996041879},{"name":"capnp","serializeDuration":247,"bodyReadDuration":405.3199999999997,"deserializeDuration":199.77999999999884,"endToEndMaterializeUnverifiedPojoDuration":37012.119999999995,"scanForIdPropertyDuration":3309.679999999993,"materializeAsUnverifiedPojoDuration":35423.72,"size":223671080,"zstdCompressedSize":53001720,"zstdDuration":613,"materializeAsVerifiedPojoDuration":40814.520000000004}];
function pickPerformanceStats(picks) {
const result = [];
for (const pick of picks) {
const found = benchmarkData.find((stat) => stat.name === pick);
if (found === undefined) {
throw new Error(`No ${pick}`);
}
result.push(found);
}
return result;
}
const sizeStats = pickPerformanceStats(["json", "updated proto", "bebop", "updated avro", "msgpack", "cbor", "flatbuffers", "flatbuffers"]);
const frontRunnerStats = pickPerformanceStats(["json", "updated proto", "bebop", "updated avro", "msgpack", "cbor", "flatbuffers"])
const avroStats = pickPerformanceStats(["avro", "updated avro"])
const protobufStats = pickPerformanceStats(["proto", "updated proto"])
const capnpStats = pickPerformanceStats(["json", "updated proto", "capnp"])
const colorsMapping = {
json: "#CC3333", // Darker red
"updated proto": "#CCCC33", // Darker yellow
proto: "#999999", // Medium gray
msgpack: "#CC33CC", // Darker magenta
cbor: "#33CCCC", // Darker cyan
bebop: "#33CC33", // Darker green
"updated avro": "#3333CC", // Darker blue
capnp: "#999999", // Medium gray
flatbuffers: "#CC7733", // Muted orange
avro: "#AAAAAA" // Light gray
};
document.addEventListener('DOMContentLoaded', function() {
Highcharts.chart('deserializeDuration', {
chart: {
type: 'column'
},
title: {
text: 'Deserialize Duration'
},
xAxis: {
type: 'category',
labels: {
autoRotation: [-45, -90],
style: {
fontSize: '13px',
fontFamily: 'Verdana, sans-serif'
}
}
},
yAxis: {
min: 0,
title: {
text: 'Milliseconds (lower is better)'
}
},
plotOptions: {
column: {
pointPadding: 0.2,
borderWidth: 0
}
},
legend: {
enabled: false
},
tooltip: {
pointFormat: '{point.y:.1f} ms'
},
series: [
{
name: 'Duration',
colors: frontRunnerStats.map((s) => colorsMapping[s.name]),
colorByPoint: true,
data: frontRunnerStats.map((s) => [s.name, s.endToEndMaterializeUnverifiedPojoDuration - s.serializeDuration - s.zstdDuration]),
},
],
});
Highcharts.chart('messageSize', {
chart: {
type: 'column'
},
title: {
text: 'Message Size'
},
xAxis: {
type: 'category',
labels: {
autoRotation: [-45, -90],
style: {
fontSize: '13px',
fontFamily: 'Verdana, sans-serif'
}
}
},
yAxis: {
min: 0,
title: {
text: 'Bytes (lower is better)'
}
},
plotOptions: {
column: {
pointPadding: 0.2,
borderWidth: 0
}
},
legend: {
enabled: false
},
tooltip: {
pointFormat: '{point.y:.0f} bytes'
},
series: [
{
name: 'size',
colors: sizeStats.map((s) => colorsMapping[s.name]),
colorByPoint: true,
data: sizeStats.map((s) => [s.name, s.size]),
},
],
});
Highcharts.chart('durationToReadBody', {
chart: {
type: 'column'
},
title: {
text: 'Duration To Read Message Body'
},
xAxis: {
type: 'category',
labels: {
autoRotation: [-45, -90],
style: {
fontSize: '13px',
fontFamily: 'Verdana, sans-serif'
}
}
},
yAxis: {
min: 0,
title: {
text: 'Milliseconds (lower is better)'
}
},
plotOptions: {
column: {
pointPadding: 0.2,
borderWidth: 0
}
},
legend: {
enabled: false
},
tooltip: {
pointFormat: '{point.y:.1f} ms'
},
series: [
{
name: 'duration',
colors: frontRunnerStats.map((s) => colorsMapping[s.name]),
colorByPoint: true,
data: frontRunnerStats.map((s) => [s.name, s.bodyReadDuration]),
},
],
});
Highcharts.chart('compressedSizes', {
chart: {
type: 'column'
},
title: {
text: 'Compressed Message Sizes'
},
xAxis: {
type: 'category',
labels: {
autoRotation: [-45, -90],
style: {
fontSize: '13px',
fontFamily: 'Verdana, sans-serif'
}
}
},
yAxis: {
min: 0,
title: {
text: 'Bytes (lower is better)'
}
},
plotOptions: {
column: {
pointPadding: 0.2,
borderWidth: 0
}
},
legend: {
enabled: false
},
tooltip: {
pointFormat: '{point.y:.0f} bytes'
},
series: [
{
name: 'size',
colors: frontRunnerStats.map((s) => colorsMapping[s.name]),
colorByPoint: true,
data: frontRunnerStats.map((s) => [s.name, s.zstdCompressedSize]),
},
],
});
Highcharts.chart('verified', {
chart: {
type: 'column'
},
title: {
text: 'Deserialize and Verify Latency'
},
xAxis: {
type: 'category',
labels: {
autoRotation: [-45, -90],
style: {
fontSize: '13px',
fontFamily: 'Verdana, sans-serif'
}
}
},
yAxis: {
min: 0,
title: {
text: 'Milliseconds (lower is better)'
}
},
plotOptions: {
column: {
pointPadding: 0.2,
borderWidth: 0
}
},
legend: {
enabled: false
},
tooltip: {
pointFormat: '{point.y:.1f} ms'
},
series: [
{
name: 'duration',
colors: frontRunnerStats.map((s) => colorsMapping[s.name]),
colorByPoint: true,
data: frontRunnerStats.map((s) => [s.name,
s.endToEndMaterializeUnverifiedPojoDuration -
s.serializeDuration -
s.zstdDuration -
s.materializeAsUnverifiedPojoDuration +
s.materializeAsVerifiedPojoDuration
]),
},
],
});
Highcharts.chart('avro', {
chart: {
type: 'column'
},
title: {
text: 'Avro Deserialize Latency'
},
xAxis: {
type: 'category',
labels: {
autoRotation: [-45, -90],
style: {
fontSize: '13px',
fontFamily: 'Verdana, sans-serif'
}
}
},
yAxis: {
min: 0,
title: {
text: 'Milliseconds (lower is better)'
}
},
plotOptions: {
column: {
pointPadding: 0.2,
borderWidth: 0
}
},
legend: {
enabled: false
},
tooltip: {
pointFormat: '{point.y:.1f} ms'
},
series: [
{
name: 'duration',
colors: avroStats.map((s) => colorsMapping[s.name]),
colorByPoint: true,
data: avroStats.map((s) => [s.name,
s.endToEndMaterializeUnverifiedPojoDuration -
s.serializeDuration -
s.zstdDuration
]),
},
],
});
Highcharts.chart('protobuf', {
chart: {
type: 'column'
},
title: {
text: 'Protobuf Deserialize Latency'
},
xAxis: {
type: 'category',
labels: {
autoRotation: [-45, -90],
style: {
fontSize: '13px',
fontFamily: 'Verdana, sans-serif'
}
}
},
yAxis: {
min: 0,
title: {
text: 'Milliseconds (lower is better)'
}
},
plotOptions: {
column: {
pointPadding: 0.2,
borderWidth: 0
}
},
legend: {
enabled: false
},
tooltip: {
pointFormat: '{point.y:.1f} ms'
},
series: [
{
name: 'duration',
colors: protobufStats.map((s) => colorsMapping[s.name]),
colorByPoint: true,
data: protobufStats.map((s) => [s.name,
s.endToEndMaterializeUnverifiedPojoDuration -
s.serializeDuration -
s.zstdDuration
]),
},
],
});
Highcharts.chart('capnp', {
chart: {
type: 'column'
},
title: {
text: 'Deserialize Latency'
},
xAxis: {
type: 'category',
labels: {
autoRotation: [-45, -90],
style: {
fontSize: '13px',
fontFamily: 'Verdana, sans-serif'
}
}
},
yAxis: {
min: 0,
title: {
text: 'Milliseconds (lower is better)'
}
},
plotOptions: {
column: {
pointPadding: 0.2,
borderWidth: 0
}
},
legend: {
enabled: false
},
tooltip: {
pointFormat: '{point.y:.1f} ms'
},
series: [
{
name: 'duration',
colors: capnpStats.map((s) => colorsMapping[s.name]),
colorByPoint: true,
data: capnpStats.map((s) => [s.name,
s.endToEndMaterializeUnverifiedPojoDuration -
s.serializeDuration -
s.zstdDuration
]),
},
],
});
});
</script>
<h1>Binary Formats are Better Than JSON in Browsers!</h1>
<p>2025-04-23</p>
<h2>TL;DR</h2>
<p>JSON used to be faster than alternatives in browsers, but that's not the case anymore. For performance sensitive web apps, it is worth considering Avro, Protobuf, or Bebop.</p>
<h2>Introduction</h2>
<p>Back in 2023, the company I was working for needed an alternative to JSON for large responses from the server to our Web App. We chose to use MessagePack primarily because it was convenient. During this process, I benchmarked a number of different binary encodings. To my surprise, I found that all of them were much slower than JSON in browsers. My conclusion was that MessagePack was a reasonable fit for our needs, but that MessagePack, as well as other reasonable alternatives, came with a large performance cost. We decided this was acceptable, as our bottlenecks were primarily on the server side.</p>
<p>However, I think a number of recent trends have made deserialization performance more important:</p>
<ol>
<li>Internet speeds have gotten much faster in the past few years, as residential gigabit internet is more broadly rolled out, and as 5G is offering increased performance. This eliminates some of the network IO bottleneck. For business software, this effect is greatly amplified, since many large companies will prioritize providing fast internet to their employees.</li>
<li>Web apps are becoming more complicated; there's increased desire to use richer and larger datasets in web apps. For many web apps, it may be desirable to send large datasets than to build a smart backend, as this may save development time and simplify the architecture of the application.</li>
<li>Users care about responsiveness in web apps. Due to JavaScript's single threaded nature, slow deserialization can block all other interactions with a web app, which will frustrate users.</li>
</ol>
<p>In the past 3 months, <a href="https://github.com/adamfaulkner/serialization_bakeoff">I've been experimenting with JavaScript binary encoding libraries</a>, and I've found that there are now several options that outperform JSON and otherwise seem really solid.</p>
<p>I'm writing this post to share my experience and my conclusions.</p>
<div id="deserializeDuration"></div>
<p>This graph shows the client side latency for receiving and deserializing a large (340 MB of JSON) message using various different libraries.</p>
<h2>Challenges of Benchmarking Deserialization in a Browser</h2>
<p>It's easy to make mistakes when doing these kinds of benchmarks. Here are some common things I noticed when researching:</p>
<h3>Using Node.js Rather than a Browser for Benchmarking</h3>
<p>The most common mistake that I saw when researching this was that most people benchmarking JavaScript use Node.js rather than a browser. There are two major differences between browsers and Node, which have a big impact on this performance:</p>
<ol>
<li>Node.js has a built in class called <code>Buffer</code>, which predates when <code>Uint8Array</code> was standardized. <code>Buffer</code> is still widely used in the Node ecosystem, but since it doesn't exist in the browser, any serialization library that uses it will need to find an alternative. The <code>Buffer</code> class performs very differently from <code>Uint8Array</code>, as it seems to have faster methods for slicing and converting UTF-8 data to a string.</li>
<li>Node can easily incorporate compiled binary code, and libraries like <code>msgpackr</code> and <code>cbor-x</code> can use these to accelerate their performance.</li>
</ol>
<p>I suspect that using Node here has caused both <code>avsc</code> and <code>protobuf.js</code> to overstate their performance historically.</p>
<h3>Finding Representative Datasets</h3>
<p>Other benchmarks (for example, this <a href="https://github.com/Adelost/javascript-serialization-benchmark">JavaScript Serialization Benchmark</a>) tend towards datasets with a high proportion of numeric data, which tend to be easier to optimize with binary encodings. This is great if your use case involves a bunch of floating point numbers, but for most use cases I've seen in my career, strings are more important.</p>
<p>When making decisions using benchmarks, it's important that the data involved be representative of what you're trying to do.</p>
<h3>Accounting for Differences in Input Data Type and Size</h3>
<p>In my earlier benchmarks, I compared <code>JSON.parse(some string)</code> directly against <code>binaryEncoding.parse(some buffer)</code>. This was a mistake for several reasons.</p>
<p>Firstly, decoding a string from a buffer is expensive. In this comparison, JSON gets to skip this expensive step, but in reality, the bytes coming into a browser have to be decoded somewhere.</p>
<p>Secondly, JSON messages are much larger than the binary encoded messages. This means that many parts of the browser that touch the message will need to spend more time processing it, beyond just decoding. For example, the networking code inside the browser would need to spend more time copying memory. By limiting the comparison to just deserialization, we don't capture this cost.</p>
<p>This graph shows how large the uncompressed message sizes were across different encodings:</p>
<div id="messageSize"></div>
<p>This graph shows how long it took to read the message body from the server. It's obvious that JSON is at a huge disadvantage before deserialization even starts. JSON is slow here not only because the message body is larger, but also because the body needs to be decoded to a string before it can be deserialized:</p>
<div id="durationToReadBody"></div>
<p>I corrected for both of these effects in my benchmarks by measuring the end-to-end latency from server request to client processing. I then deducted time spent purely on the server side from this end-to-end latency to get the pure client side latency.</p>
<h4>Aside: Compression</h4>
<p>You might think that compression could help us here. Indeed, compression almost totally wipes out the size differences between different encodings:</p>
<div id="compressedSizes"></div>
<p>This is very helpful when it comes to addressing network bandwidth concerns, however, the browser still needs to decompress and process more bytes. By measuring the end to end latency for deserializing messages, we capture all of this extra time needed for decompression and processing.</p>
<h3>Schema vs Schema-less</h3>
<p>Some of these encodings have a schema; some are schema-less. Encodings that have a schema implicitly perform some validation of the data. If the developer wants the decoded data to be validated, then we need to capture the additional time needed to do this when using a schema-less encoding.</p>
<p>This graph shows how long it took to decode and validate data from each encoding:</p>
<div id="verified"></div>
<p>In my test, this didn't really change things much. But I think it's an important call-out, since schema encodings provide a lot of useful safety for free.</p>
<h3>Lazy Decoding and Type Differences</h3>
<p>Several libraries, like Flatbuffers and Cap'n Proto, implement some form of "lazy decoding", where the deserialized object does not actually do any deserialization until needed. This can significantly improve performance for scenarios where not all parts of a serialized message actually need to be read.</p>
<p>It would be problematic if we were to only compare Flatbuffer's "deserialize" method against something like <code>JSON.parse</code>, since JSON actually does all of the heavy lifting to allocate real values up front.</p>
<p>Another related problem is that different encodings support different sets of types. For example, Bebop supports <code>Date</code>s directly, while something like JSON needs to serialize a number or a string and manually convert it to a date after deserialization.</p>
<p>In order to compare these fairly, I had my benchmark materialize "Plain Old JavaScript objects" from deserialized messages. These objects include <code>Date</code> objects and force evaluation of properties from lazy encodings.</p>
<p>The time needed to materialize a "Plain Old JavaScript object" is included in the end-to-end time that we measure.</p>
<h2>Recent(ish) Developments that Make Libraries Faster</h2>
<p>As far as I can tell, the browser features needed to support fast binary encodings have been available <a href="https://caniuse.com/textencoder">since at least early 2020</a>, but haven't been widely adopted until more recently. In the past couple of years, there have been a few developments that make binary encodings in the browser much more tenable:</p>
<h3>Bebop</h3>
<p><a href="https://github.com/betwixt-labs/bebop">Bebop</a> is an encoding format very similar to protobuf that was developed in <a href="https://github.com/betwixt-labs/bebop/commit/b123649a5bfc4b5d19c31f42676ec6dd546d7ae2">2020</a>. It seems really great; the tooling works well, and the performance across different supported languages seems good. It also supports <code>Date</code> types out of the box. <a href="https://web.archive.org/web/20220826230212/https://rainway.com/blog/2020/12/09/bebop-an-efficient-schema-based-binary-serialization-format/">Bebop was originally designed to target browsers with high performance</a>.</p>
<p>The main downside of Bebop is that it seems relatively new and unknown. In fact, the blog post linked above now redirects to "text-os.com", which makes me feel uncomfortable about the future of the company behind this library. I'm not sure Bebop is as safe of a bet as Avro or Protobuf.</p>
<h3>Avro (<code>avsc</code> library)</h3>
<p>By default, as of April of 2025, the released version of <a href="https://github.com/mtth/avsc">avsc</a> uses a very slow <code>Buffer</code> polyfill in browsers, which causes extremely bad deserialization performance. However, <a href="https://github.com/mtth/avsc/commit/c80c670e81b0f7ba020f72db63f081d41dfd3c49">recently</a> the latest version on the <code>master</code> branch began using <code>Uint8Array</code> directly, and does not suffer these performance problems:</p>
<div id="avro"></div>
<p>With this big improvement in performance, <code>avsc</code> becomes the most compelling option for pure performance that I tested.</p>
<h3>Protobuf (<code>protobuf.js</code> library)</h3>
<p>In my tests, <a href="github.com/protobufjs/protobuf.js">Protobuf.js</a> did not perform very well at deserialization by default. The main problem was that its algorithm for decoding strings is not as efficient as other options. Fortunately, this was easily fixed, and I've submitted <a href="https://github.com/protobufjs/protobuf.js/pull/2062">a pull request to the protobuf.js project</a>.</p>
<div id="protobuf"></div>
<p>I also tried alternative Protobuf libraries, including <a href="https://github.com/bufbuild/protobuf-es/">protobuf-es</a>, which did not perform well, and <a href="https://github.com/mapbox/pbf">pbf</a>, which had fantastic performance but did not feature much flexibility or configurability with generated code.</p>
<h2>Other Libraries</h2>
<p>I also tested a few other libraries. I don't think they are a good fit for the browser at this point in time:</p>
<h3>Flatbuffers</h3>
<p><a href="https://github.com/google/flatbuffers">Flatbuffers</a> had a nice tooling and developer experience story. Flatbuffers in JavaScript uses a lazy approach to deserialization, so Flatbuffers end up being a good choice for scenarios where the entire message will not be deserialized.</p>
<p>Otherwise, when materializing a full blown "fat" JavaScript object, I found Flatbuffer performance to be less than alternatives.</p>
<h3>Cap'n Proto</h3>
<p><a href="https://github.com/capnproto/node-capnp">Cap'n Proto's recommended JavaScript implementation</a> only targets Node.js, as it is simply a wrapper around a C++ module. It also seems like it's suffered some bitrot, as it was not possible for me to get it to compile on a recent Node.js version. Also, it had plenty of caveats around performance being bad in the documentation.</p>
<p> Instead, I opted to test <a href="https://github.com/unjs/capnp-es">capnp-es</a>. This library also uses a lazy approach to deserialization. Performance was so bad with Cap'n Proto that I had to drop it from my investigation to be able to iterate more quickly on the other options.</p>
<div id="capnp"></div>
<p>With so many similar alternatives that feature better performance and better browser support, I'm not sure it makes sense to use Cap'n Proto in 2025 for targeting browsers.</p>
<h3>MessagePack and Cbor</h3>
<p>I also tested <a href="https://github.com/kriszyp/msgpackr">msgpackr</a> and <a href="https://github.com/kriszyp/cbor-x">cbor-x</a>. I've seen great performance with both of these libraries in the past, on the server side, but in the browser, they were some of the slowest libraries that I tested.</p>
<h2>Non-Performance Reasons to Avoid JSON</h2>
<p>So far this post has been focused on performance, but I think it's also worth mentioning the myriad of downsides to JSON beyond performance:</p>
<h3>Required String Input</h3>
<p><code>JSON.parse</code> requires a string as input. This can be problematic with large inputs, as <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/length">some JavaScript implementations cap the length of a string at 512MB (Node.js) or 1GB (Chrome)</a>.</p>
<h3>Limited Type Support & Lack of a Schema</h3>
<p>JSON does not support many types that are important for many types of programs. For example, JSON does not have a 64 bit integer type. Some implementations will use a 64 bit floating point type for these values, which will silently mangle values greater than or equal to 2 to the 53rd power. The workaround for this is typically to represent these values using strings, which can be parsed into a BigInt value, or just kept in a string representation if numeric operations will not be performed on these values.</p>
<p>Since JSON is schema-less, decoding a JSON message performs no validation on the shape or type of the decoded data. If we want these sorts of validations, we must validate messages ourselves when we receive them.</p>
<p>Many serialization formats, like Protobuf, perform this sort of validation implicitly when deserializing messages.</p>
<h4>A Short Anecdote: Broken Ad Campaigns</h4>
<p>At one point in my career, I broke an important client's ad campaign because the company I was at used strings to represent numbers to avoid BigInt issues, and I accidentally added 500 to "500", resulting in a request to fetch the client's 500,500th ad instead of the 1000th ad. This was embarrassing and cost the company some amount of money.</p>
<p>Nowadays, most people use TypeScript, which can avoid some of these kinds of issues. But JSON does nothing to help us here, and indeed actively harms us. We have to actually use a library like <a href="https://zod.dev/">zod</a> to get good end to end protections here.</p>
<h3>Poor Performance on the Server Side</h3>
<p>In my benchmarks, using a Rust server, JSON serialization was slower than most alternatives.</p>
<h2>Conclusions & Future Work</h2>
<p>In conclusion, I think that Bebop, Avro, and Protobuf are all good solutions that can outperform JSON for encoding messages from a server to a browser. (Hopefully Avro has a a release soon, and Protobuf.js accepts my PR!)</p>
<p>Next, I'd like to write a similar post about the server side implementations of these technologies, where I think things are a bit less nuanced and simpler.</p>
<h2>Notes on Benchmarking</h2>
<ul>
<li><a href="https://github.com/adamfaulkner/serialization_bakeoff">The code for the benchmarks is here.</a></li>
<li>All of these benchmarks were for processing a payload of 1 million trips of <a href="https://citibikenyc.com/system-data">NYC Citibike data</a>. It was 340MB when encoded using JSON.</li>
<li>All of these benchmarks except Cap'n Proto were run 10 times, and an average was taken. Cap'n Proto was only run once on account of how slow it was.</li>
</ul>