
feat: replace deprecated http-parser with llhttp (#331) #400

Open
bushidocodes wants to merge 1 commit into master from feat/llhttp-replace-http-parser

@bushidocodes

Contributor

Closes #331.

What

Replaces Node's deprecated, frozen http-parser (vendored at v2.9.4) with its maintained successor, llhttp (~2x faster, MIT, same incremental callback model).

Build toolchain: unchanged

The usual objection to llhttp is that it's generated from TypeScript via llparse, which would drag Node/npm into the build. We sidestep that entirely by vendoring llhttp's pre-generated C release artifacts from the upstream release branch (tag release/v9.4.2):

runtime/thirdparty/llhttp/
├── include/llhttp.h
└── src/{llhttp,api,http}.c

These compile with the normal C toolchain — no Node, npm, or llparse. The old http-parser git submodule is dropped (removed from .gitmodules), so this also removes one submodule from make submodules. See runtime/thirdparty/llhttp/VENDOR.md for provenance and update steps.

Changes

  • thirdparty: vendor llhttp release sources; compile the three .c files into dist/ (built with -O3). One http_parser.o CFILE → three llhttp*.o CFILEs.
  • http_parser_settings.{c,h}: port callbacks to the llhttp_t / llhttp_settings_t API (llhttp_settings_init, settings bound at llhttp_init time). Callback signatures are otherwise identical.
  • http_session.h: llhttp_execute returns an llhttp_errno_t error code, not a byte count. Bytes consumed is the buffer length on HPE_OK, or the offset of llhttp_get_error_pos() from the start of the buffer on HPE_PAUSED/HPE_PAUSED_UPGRADE.
  • completion semantics: http-parser's return 2 (skip body) → llhttp return 1; on_message_complete now returns HPE_PAUSED so the parser stops cleanly after one message rather than parsing trailing bytes as a pipelined request (which would re-enter on_message_begin and trip its asserts). This faithfully mirrors http-parser, which RETURNed immediately after firing on_message_complete for our request shapes.
  • .vscode: repoint include paths / file associations at the new sources.
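
The bytes-consumed rule from the http_session.h item can be sketched as a pure function. This is a minimal illustration under stated assumptions, not the code in this PR: every `sketch_*` name is hypothetical, and the enum is a stand-in for llhttp's HPE_OK / HPE_PAUSED / HPE_PAUSED_UPGRADE so that the snippet compiles without llhttp.h. In the runtime, `error_pos` would be the pointer returned by llhttp_get_error_pos().

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for llhttp's HPE_OK, HPE_PAUSED, HPE_PAUSED_UPGRADE and "any other
 * error". The numeric values are arbitrary and do NOT match llhttp.h. */
enum sketch_errno {
	SKETCH_OK,
	SKETCH_PAUSED,
	SKETCH_PAUSED_UPGRADE,
	SKETCH_OTHER_ERROR
};

/* Returns the number of bytes the parser consumed from buf, or -1 on a real
 * parse error. error_pos is only inspected for the two paused states. */
long
sketch_bytes_consumed(enum sketch_errno err, const char *buf, size_t len, const char *error_pos)
{
	switch (err) {
	case SKETCH_OK:
		/* The whole buffer was parsed without pausing. */
		return (long)len;
	case SKETCH_PAUSED:
	case SKETCH_PAUSED_UPGRADE:
		/* The parser stopped part-way; the stop position marks the boundary. */
		assert(error_pos >= buf && error_pos <= buf + len);
		return (long)(error_pos - buf);
	default:
		return -1;
	}
}
```

With on_message_complete returning HPE_PAUSED, the paused branch is the one taken whenever a complete request arrives, so any trailing bytes after the message stay unconsumed.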

Verification

  • llhttp release sources compile cleanly (llhttp.o, api.o, http.o).
  • Both touched translation units (http_parser_settings.c, and a TU including http_session.h) pass clang -std=gnu11 -Wall -fsyntax-only with no errors.
  • Not yet run: a full make / runtime integration test. The project builds inside its Docker dev container; my local tree can't complete the ck (Concurrency Kit) submodule build because its committed Makefile has the container's /sledge path baked in (pre-existing, unrelated). Left as draft pending a full in-container build + load test against GET and POST/PUT (body) workloads.

Notes for reviewers

  • llhttp is stricter than http-parser 2.9 by default (RFC 7230 conformance). If any benchmark client sends technically malformed requests that http-parser tolerated, we may need the llhttp_set_lenient_* toggles. Worth a sanity load test.
  • const char *http_methods[] in http_parser_settings.c is pre-existing dead code (unused, and not keyed to any parser enum); left untouched to keep the diff focused.

🤖 Generated with Claude Code

Node's http-parser (vendored at v2.9.4) is deprecated and frozen. Its
maintained successor is llhttp, which is ~2x faster and keeps the same
incremental, callback-based parsing model.

To avoid pulling Node/npm/llparse into the build, we vendor llhttp's
*pre-generated* C release artifacts (the `release` branch of nodejs/llhttp,
tag release/v9.4.2): include/llhttp.h plus src/{llhttp,api,http}.c. These
compile with the normal C toolchain, so the build toolchain is unchanged.

- thirdparty: drop the http-parser git submodule, vendor llhttp sources,
  and compile the three release .c files into dist/ (see VENDOR.md).
- http_parser_settings: port callbacks to the llhttp_t / llhttp_settings_t
  API (llhttp_settings_init, settings bound at init). The signatures are
  otherwise identical.
- on_headers_complete: http-parser's "return 2" (skip body) maps to llhttp's
  "return 1"; on_message_complete now returns HPE_PAUSED so the parser stops
  cleanly after one message instead of parsing trailing bytes as a pipelined
  request (which would re-enter on_message_begin and trip its asserts). This
  matches http-parser, which returned immediately after firing
  on_message_complete for our requests.
- http_session: llhttp_execute returns an errno rather than a byte count;
  derive bytes-consumed from the buffer length (HPE_OK) or llhttp_get_error_pos
  (HPE_PAUSED / HPE_PAUSED_UPGRADE).
- .vscode: repoint include paths/associations at the new llhttp sources.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@bushidocodes

Contributor Author

✅ Full in-container build + load test passed

Built and tested inside the project's sledge-dev Docker container (repo mounted at /sledge, clang 13).

Build

  • make -C runtime → ck + jsmn + the three vendored llhttp objects (llhttp.o, llhttp_api.o, llhttp_http.o) compiled, runtime linked. No warnings or errors.
  • nm confirms llhttp_execute / llhttp_init / llhttp_settings_init are linked into sledgert.

Functional smoke test (both parser paths)

| Test | Result |
| --- | --- |
| GET /empty (no-body path) | 200 OK, Content-Length: 0 |
| POST /fib body=10 | 200 OK → 55 |
| POST /fib body=30 | 200 OK → 832040 |
| GET /nope (unknown route) | 404 Not Found |

The no-body path exercises on_headers_complete → 1 + on_message_complete → HPE_PAUSED; the POST path exercises Content-Length + on_body delivery to the module (fib values are correct).

Load / stability

  • hey -disable-keepalive -n 2000 -c 50 GET /empty → 2000/2000 [200], ~12.6k req/s
  • hey -disable-keepalive -n 2000 -c 50 POST /fib → 2000/2000 [200], ~10.3k req/s
  • Runtime stayed alive (no crash); zero error/panic/assert/malformed lines in the log.

(Keepalive disabled matches SLEdge's Connection: close responses — one request per connection — which is exactly the shape the HPE_PAUSED-after-message design targets.)

Marking ready for review.

@bushidocodes bushidocodes marked this pull request as ready for review June 22, 2026 23:21
@bushidocodes

Contributor Author

✅ From-scratch rebuild + full experiment suite — 16/16 pass

Per request, did a complete clean rebuild and ran the canonical experiment suite (make test, which runs test.mk all, plus trap_divzero and stack_overflow) inside the sledge-dev container.

From-scratch rebuild

Deleted runtime/bin/sledgert, runtime/thirdparty/dist/, and ran cargo clean on awsm — verified all gone — then rebuilt the whole pipeline: awsm → libsledge → runtime (ck/jsmn/llhttp + sledgert) → all wasm app modules. Clean, no warnings/errors; llhttp objects compiled and llhttp_* symbols linked into the fresh sledgert.

Experiment results (all PASS)

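| Experiment | Result | Time |
| --- | --- | --- |
| gocr/by_dpi | PASS | 249s |
| gocr/by_font | PASS | 249s |
| gocr/by_word | PASS | 116s |
| gocr/fivebyeight | PASS | 8s |
| gocr/handwriting | PASS | 9s |
| gocr/hyde | PASS | 29s |
| TinyEKF/by_iteration | PASS | 53s |
| TinyEKF/one_iteration | PASS | 11s |
| CMSIS_5_NN/imageclassification (cifar10) | PASS | 547s |
| sod/image_resize/test | PASS | 14s |
| sod/image_resize/by_resolution | PASS | 94s |
| sod/lpd/by_plate_count | PASS | 128s |
| fibonacci/bimodal | PASS | 378s |
| empty/concurrency | PASS | 99s |
| traps | PASS | |
| stack_overflow | PASS | |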

16 / 16 passed. This spans GET, POST with text bodies, POST with large binary image bodies (sod resize / LPD), heavy CNN inference (cifar10), scheduler experiments under EDF/FIFO load (fibonacci bimodal), high-concurrency GET load (empty: 10k requests × concurrency 1–100), and the error path — traps confirms divide-by-zero returns 500 and valid requests 200 through the new parser.

Note on the two "traps" experiments

traps and stack_overflow initially failed — but not due to llhttp. Their wasm modules (trap_divzero.wasm.so, stack_overflow.wasm.so) aren't part of the default make applications target; they're built only by their dedicated test.mk targets. With the modules missing, sledgert asserts at startup on the unresolved .so (Assertion 'module != NULL'), well before any HTTP parsing. After building the two modules via make -f test.mk, both pass.

@bushidocodes

Contributor Author

📊 Focused parser microbenchmark — llhttp vs http-parser

Isolated the parser from WASM/scheduling by benchmarking raw request parsing in a standalone harness (container clang-13, the project's toolchain). Each parser was compiled to its own object separately from the harness (no LTO) so callbacks aren't cross-inlined — matching how the runtime actually links. Identical callbacks (count headers / span url+body) for both. Best of 5 runs, pinned to one core, 4M iterations per request type.
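
The harness itself was not committed, so the following is only a hypothetical sketch of the "best of N runs, ns per iteration" measurement described above. All `bench_*` names are invented for illustration, and it times with the standard C `clock()` (process CPU time) as an assumption; the real harness may have used a different clock. The function pointer call mirrors the idea of keeping the parser out of line from the harness.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical workload signature: one call stands for one parsed request. */
typedef void (*bench_fn)(void *ctx);

/* Example workload that only counts invocations; a real harness would call
 * the parser on a fixed request buffer here instead. */
void
bench_count_call(void *ctx)
{
	(*(unsigned long *)ctx)++;
}

/* Runs fn iters times per run, for the given number of runs, and returns the
 * lowest observed cost in nanoseconds per iteration (CPU time). */
double
bench_best_ns_per_iter(bench_fn fn, void *ctx, size_t iters, int runs)
{
	assert(fn != NULL && iters > 0 && runs > 0);

	double best = -1.0;
	for (int r = 0; r < runs; r++) {
		clock_t start = clock();
		for (size_t i = 0; i < iters; i++) fn(ctx);
		clock_t end = clock();

		double ns = (double)(end - start) * 1e9 / (double)CLOCKS_PER_SEC / (double)iters;
		if (best < 0.0 || ns < best) best = ns;
	}
	return best;
}
```

Taking the minimum rather than the mean discards runs disturbed by interrupts or cache warm-up, which is the usual reason for reporting a best-of-N figure.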

I measured two comparisons, because the old vendored http-parser was compiled by the thirdparty Makefile with empty CFLAGS (i.e. -O0), whereas the new llhttp rule compiles at -O3:

  • llhttp@O3 vs http-parser@O3 — isolates the parser algorithm/codegen difference.
  • llhttp@O3 vs http-parser@O0 — the actual change as deployed (unoptimized old → optimized new).

Results (ns per request; lower is better)

| Request | Bytes | llhttp@O3 | hp@O3 | hp@O0 | Speedup vs O3 | Speedup vs O0 (deployed) |
| --- | --- | --- | --- | --- | --- | --- |
| GET-tiny (4 headers) | 92 | 108.6 | 149.9 | 429.2 | 1.38× | 3.95× |
| GET-realistic (8 headers + query) | 318 | 262.6 | 441.3 | 1307.8 | 1.68× | 4.98× |
| POST-small (numeric body) | 92 | 127.3 | 171.0 | 482.1 | 1.34× | 3.79× |
| POST-4KB-body (binary) | 4206 | 135.4 | 196.2 | 562.6 | 1.45× | 4.16× |

(No parse failures — both parsers validated HPE_OK consuming the full buffer on every input.)
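
For anyone re-deriving the speedup columns, each one is simply the http-parser time divided by the llhttp time for the same request. A small sketch of that arithmetic, using the ns-per-request figures from the table above (the `bench_*` names are hypothetical and not part of the PR):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One row of the results table: ns per request for each build. */
struct bench_row {
	const char *request;
	double      llhttp_o3_ns;
	double      hp_o3_ns;
	double      hp_o0_ns;
};

/* Values copied from the results table. */
static const struct bench_row bench_rows[] = {
	{ "GET-tiny", 108.6, 149.9, 429.2 },
	{ "GET-realistic", 262.6, 441.3, 1307.8 },
	{ "POST-small", 127.3, 171.0, 482.1 },
	{ "POST-4KB-body", 135.4, 196.2, 562.6 },
};

/* Speedup of the candidate over the baseline; values above 1 mean faster. */
double
bench_speedup(double baseline_ns, double candidate_ns)
{
	assert(candidate_ns > 0.0);
	return baseline_ns / candidate_ns;
}

/* Prints both speedup columns for every row. */
void
bench_print_speedups(void)
{
	size_t n = sizeof(bench_rows) / sizeof(bench_rows[0]);
	for (size_t i = 0; i < n; i++) {
		const struct bench_row *r = &bench_rows[i];
		printf("%-14s %.2fx vs O3, %.2fx vs O0\n", r->request,
		       bench_speedup(r->hp_o3_ns, r->llhttp_o3_ns),
		       bench_speedup(r->hp_o0_ns, r->llhttp_o3_ns));
	}
}
```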

Takeaways

  • Parser-for-parser, llhttp is ~1.3–1.7× faster at the same -O3. Header-heavy requests benefit most (1.68×), consistent with llhttp's auto-generated multi-character matching vs http-parser's hand-rolled byte loop.
  • As actually shipped, parsing is ~3.8–5.0× faster, because the swap also fixed an optimization gap: the retired vendored http-parser was being built unoptimized (-O0), while the new llhttp rule uses -O3.
  • Absolute cost is tiny either way (~100–270 ns/req), which is exactly why this never showed up in the end-to-end experiments — parsing is a rounding error next to WASM execution. The win is real but it's a constant-factor cleanup, not a throughput unlock for these workloads.

Methodology note: representative request shapes (hey-style GET, fibonacci POST, sod-style 4KB binary POST); the harness was a throwaway (not committed — it would have re-vendored the very http_parser.c this PR removes). Happy to commit a trimmed llhttp-only version as a perf-regression guard if useful.

@emil916

emil916 commented Jun 23, 2026

Contributor

This is great! Had no idea the existing one was using node.js.

However, I think this PR will have to wait until I run my final experiments these days. I will come back to this later.
