Commit e814763

Address claude.ai review suggestions for post 1

- Move p99 explanation before first passthrough table where percentiles are first encountered; remove duplicate from encryption section
- Expand Layer 7 point with one sentence of context for non-technical readers: most Kafka proxies operate at L4, Kroxylicious parses every message yet still adds only 0.2 ms
- Add distribution board analogy for independent connection handling vs broker shared resource contention
- Simplify replication factor caveat to one sentence, linking to companion post for detail
- Fix "Most proxies" → "Most proxies operate on Kafka" for accuracy

Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Sam Barker <sam@quadrocket.co.uk>

1 parent 23b8c4e commit e814763

1 file changed

Lines changed: 5 additions & 5 deletions

_posts/2026-05-21-benchmarking-the-proxy.md

Original file line number | Diff line number | Diff line change
@@ -48,6 +48,8 @@ One important caveat: this Kafka cluster is deliberately untuned. We're not tryi
4848

4949
Good news first. The proxy itself — with no filter chain, just routing traffic — adds almost nothing.
5050

51+
A quick note on percentiles for anyone not steeped in performance benchmarking: p99 latency is the value that 99% of requests complete within — meaning 1 in 100 requests takes longer. Averages flatter; the p99 is what your slowest clients actually experience, and it's usually the number that matters.
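
As a rough illustration of why the average and the p99 can tell different stories, here is a small Python sketch. The latency values are invented for the example (they are not benchmark results), and `percentile` is a hypothetical helper that uses the simple nearest-rank definition; benchmarking tools often interpolate or use histograms instead.

```python
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Made-up publish latencies in milliseconds: 98 fast requests, 2 slow ones.
latencies_ms = [2.0] * 98 + [50.0] * 2

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.2f} ms  p50={p50_ms:.1f} ms  p99={p99_ms:.1f} ms")
```

With this made-up data the mean stays under 3 ms while the p99 lands on the slow requests, which is the sense in which averages flatter.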
52+
5153
**10 topics, 1 KB messages (5,000 msg/s per topic):**
5254

5355
| Metric | Baseline | Proxy | Delta |
@@ -70,9 +72,9 @@ Good news first. The proxy itself — with no filter chain, just routing traffic
7072

7173
**The headline: ~0.2 ms additional average publish latency. Throughput is unaffected.**
7274

73-
What did I take away from this entirely unsurprising result? Not much, honestly — without filters the proxy boils the latency-sensitive path down to little more than a couple of hops through the TCP stack. We replaced a hunch with data. The remarkable part: the proxy is doing this at Layer 7.
75+
What did I take away from this entirely unsurprising result? Not much, honestly — without filters the proxy boils the latency-sensitive path down to little more than a couple of hops through the TCP stack. We replaced a hunch with data. The remarkable part: the proxy is doing this at Layer 7. Most proxies operate on Kafka at Layer 4 — they shuffle bytes without ever understanding what those bytes mean. Kroxylicious works at Layer 7, parsing every Kafka message, yet still adds only 0.2 ms. That's the design working.
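
To make the Layer 4 versus Layer 7 distinction concrete, here is a minimal Python sketch of the least a protocol-aware proxy has to do that a byte-forwarding one does not: decode the header of each length-prefixed Kafka request. It is an illustration only, not Kroxylicious code. The function name and the sample frame are made up, and it assumes the older, non-flexible request header layout (API key, API version, correlation id, nullable client id).

```python
import struct


def parse_request_header(frame: bytes) -> dict:
    """Decode the leading fields of one length-prefixed Kafka request frame."""
    # Big-endian: int32 frame size, int16 api_key, int16 api_version, int32 correlation_id.
    size, api_key, api_version, correlation_id = struct.unpack_from(">ihhi", frame, 0)
    # client_id is a nullable string: int16 length (-1 means null), then UTF-8 bytes.
    (client_len,) = struct.unpack_from(">h", frame, 12)
    client_id = None if client_len < 0 else frame[14:14 + client_len].decode("utf-8")
    return {
        "size": size,
        "api_key": api_key,
        "api_version": api_version,
        "correlation_id": correlation_id,
        "client_id": client_id,
    }


# Hypothetical frame: api_key 0 (Produce), version 7, correlation id 42, client id "demo".
body = struct.pack(">hhih", 0, 7, 42, 4) + b"demo"
frame = struct.pack(">i", len(body)) + body

header = parse_request_header(frame)
print(header)
```

A Layer 4 proxy would forward `frame` without ever looking inside it; a Layer 7 proxy has to do at least this much decoding for every request before any filter can act on it.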
7476

75-
The overhead holding across 10 and 100 topics makes sense for the same reason: the proxy doesn't contend between topics. A Kafka broker juggles disk I/O, partition leaders, and replication across everything it manages; the proxy treats each connection independently. Topics don't contend for shared resources: throughput scales linearly across them, and the connection sweep validates it.
77+
The overhead holding across 10 and 100 topics makes sense for the same reason: the proxy doesn't contend between topics. Think of the proxy as independent circuits on a distribution board — switching the breaker for lights doesn't cut power to the fridge. A Kafka broker is more like the mains supply itself — every circuit draws from the same source, so heavy load anywhere reduces what's available everywhere. Topics don't contend for shared resources: throughput scales linearly across them, and the connection sweep validates it.
7678

7779
The end-to-end p99 figure is dominated by Kafka consumer fetch timeouts, as it should be. That said, it is reassuring to have a sub-ms impact on the p99.
7880

@@ -84,8 +86,6 @@ Ok, so let's make the proxy smarter — make it do something people actually car
8486

8587
### Latency at sub-saturation rates
8688

87-
A quick note on percentiles for anyone not steeped in performance benchmarking: p99 latency is the value that 99% of requests complete within — meaning 1 in 100 requests takes longer. Averages flatter; the p99 is what your slowest clients actually experience, and it's usually the number that matters.
88-
8989
So we know encryption is doing a lot of work, but to find out the real impact we need to compare it to a plain Kafka cluster (and yes, people do run Kroxylicious without filters — TLS termination, stable client endpoints, virtual clusters — but that's a different post). The table below tells us that above a certain inflection point the numbers get really, really noisy — especially in the p99 range.
9090

9191
**1 topic, 1 KB messages — baseline vs encryption:**
@@ -162,7 +162,7 @@ Numbers without guidance aren't very useful, so here's how to translate these re
162162
These are real results from real hardware, but they don't tell a story for your workload. A few things worth knowing before you put these numbers in a slide deck:
163163

164164
- **Message size**: all results use 1 KB messages. The coefficient is message-size-dependent — encryption overhead as a percentage is likely lower for larger messages.
165-
- **Replication factor**: the 1-topic rate sweep ran at RF=3. At that replication factor, Kafka's ISR replication traffic creates a per-partition ceiling that sits close to where proxy CPU also saturates — the two limits are entangled in those results. The sizing coefficient was derived from RF=1 multi-topic workloads specifically to isolate proxy CPU. The [companion engineering post]({% post_url 2026-05-28-benchmarking-the-proxy-under-the-hood %}) has that detail.
165+
- **Replication factor**: the encryption numbers assume traffic isn't already hitting Kafka's own replication limits — the [companion post]({% post_url 2026-05-28-benchmarking-the-proxy-under-the-hood %}) explains why that matters.
166166
- **Horizontal scaling**: linear scaling has been validated across CPU allocations on a single pod; multi-pod horizontal scaling hasn't been measured but is expected to follow the same coefficient.
167167

168168
For the engineering story — why we built a custom harness on top of OMB, what the CPU flamegraphs actually show, and the bugs we found in our own tooling along the way — that's in the [companion post]({% post_url 2026-05-28-benchmarking-the-proxy-under-the-hood %}).
