Add Kemal (Crystal) framework #14
cc @sdogruyol
The CI validation runs `shards install --production`, which requires `shard.lock` to exist. Added the generated lock file and updated the Dockerfile COPY to include it.
d4c8f1e to 7c19119
- Vendor Crystal shards in `lib/` to avoid network-dependent builds
- Remove `--static` flag (incompatible with Process.fork)
- Drop multi-process forking (causes port binding conflicts)
- Add runtime dependencies (pcre2, libevent, gc, etc.)
- All 18 validation tests passing locally
Yo @BennyFranciscus 20k LOC?! Do we really need to include all this code?
Benchmark Results (Framework: Kemal). Full log
@BennyFranciscus seems like this is running single-threaded (only 1 CPU core at 100% usage in Docker)
Yeah, the 20k LOC is the vendored Crystal shards (the `lib/` directory). The Docker build needs them because `shards install` fetches from GitHub at build time, and the CI runner's Docker network can't resolve github.com. Options:
I'll look into option 2 or 3 to slim this down.
You're right — it's single-threaded. I had a fork-based multi-worker setup, but Process.fork in Crystal can't rebind the same port without SO_REUSEPORT, and Kemal doesn't expose socket options. I'll rework it. Working on it.
- Remove 20k LOC of vendored Crystal shards
- Add `build.sh` (same pattern as caddy) for network-dependent builds
- Dockerfile now fetches deps at build time via `shards install`
- All 18 validation tests still passing
Benchmark Results (Framework: Kemal). Full log
@BennyFranciscus looks like it ran but failed on one of the tests. Maybe disable that test?
Yeah, I saw that. Looks like Kemal is failing to handle the rapid connection churn. I'll dig into it. My guess is either the connection recycling is tripping up Crystal's fiber scheduler, or there's a socket-exhaustion issue when connections are torn down and re-established that rapidly. The other test profiles (baseline, pipelined, json) seem to run fine, so it's specific to the short-lived connection pattern. For now I'll disable it.
…ection recycling)
Done — disabled it. Re: the 20k LOC — that's gone now too. Switched to a `build.sh` that fetches deps at build time via `shards install` instead of vendoring.
- Each CPU core gets its own Kemal server instance with SO_REUSEPORT
- Bypasses `Kemal.run` to manually create `HTTP::Server` with `reuse_port: true`
- Properly wires up 404 error handler (`setup_404` equivalent)
- Parent process manages worker lifecycle with signal forwarding
- All 18 validation tests passing
Fixed the single-thread issue. Now using one forked Kemal server instance per CPU core, each bound with SO_REUSEPORT. Had to bypass `Kemal.run` and construct the `HTTP::Server` manually with `reuse_port: true`, since Kemal doesn't expose socket options. Should see proper multi-core utilization now. All 18 validation tests passing locally.
Benchmark Results (Framework: Kemal). Full log
Now we're talking 🔥 — 2.2M req/s baseline at 4096 connections with ~82 CPU cores lit up. That's a massive jump from the 39K single-threaded run.

The SO_REUSEPORT + Process.fork approach is working exactly as expected — each core gets its own accept loop with zero contention. Crystal's event loop is genuinely fast once you let it spread across cores.

Memory is a bit chunky (2.1 GB at 4096c, 4.5 GB at 16384c) since each forked process gets its own heap, but that's the trade-off for zero-contention parallelism. Could look into a shared-nothing arena allocator later, but honestly, for a full framework with routing + middleware these numbers are solid.

Still waiting on the rest of the profiles to run — curious to see how pipelined and json look with multi-core.
Benchmark Results (Framework: Kemal). Full log
Full profile results are in! Quick breakdown:

🚀 Pipelined: 6.3M req/s at 512c — Crystal's event loop really shines with pipeline depth 16. This is legitimately fast.
✅ Baseline: 2.2M at 4096c — solid, consistent with the earlier run.
✅ Noisy: 1.6M at 4096c — good resilience under mixed valid/invalid traffic.
📊 JSON: 371K — reasonable for a framework doing actual JSON serialization.
📊 Compression: 68-89K — Crystal's built-in gzip isn't the fastest, but it works.

The upload memory is the main thing I'd want to improve. Everything else looks solid for a full-featured framework. Happy to iterate on the upload handling if you want to get those memory numbers down before merging.
Crystal's Process.fork doesn't work reliably in Docker containers. Switch to LD_PRELOAD SO_REUSEPORT shim + shell wrapper spawning N independent processes (same approach as other entries).
hey! pushed a cleaner approach — ditched Crystal's Process.fork entirely (it's unreliable in Docker containers). now using an LD_PRELOAD SO_REUSEPORT shim + shell wrapper that spawns N independent processes, same pattern as the other entries. should scale properly across all cores now 🤞
Benchmark Results (Framework: Kemal). Full log
LD_PRELOAD approach is working great! 🔥
Memory usage climbs a bit under high concurrency (30 GB at 512c upload), but that's expected with per-process Crystal GC. Ready for merge whenever you are 🚀
Benchmark Results (Framework: Kemal). Full log
Adds Kemal, a web framework for Crystal — a compiled language with Ruby-like syntax and C-like performance.
What's included
Why Kemal / Crystal?
Crystal is currently unrepresented in HttpArena. It compiles to native code via LLVM and is known for performance competitive with Go and Rust in many workloads. Kemal is Crystal's most popular web framework (3.8K stars).
Would be interesting to see where it lands relative to the existing Rust, Go, and C++ frameworks.