feat: replace deprecated http-parser with llhttp (#331) #400
bushidocodes wants to merge 1 commit into
Node's http-parser (vendored at v2.9.4) is deprecated and frozen. Its
maintained successor is llhttp, which is ~2x faster and keeps the same
incremental, callback-based parsing model.
To avoid pulling Node/npm/llparse into the build, we vendor llhttp's
*pre-generated* C release artifacts (the `release` branch of nodejs/llhttp,
tag release/v9.4.2): include/llhttp.h plus src/{llhttp,api,http}.c. These
compile with the normal C toolchain, so the build toolchain is unchanged.
- thirdparty: drop the http-parser git submodule, vendor llhttp sources,
and compile the three release .c files into dist/ (see VENDOR.md).
- http_parser_settings: port callbacks to the llhttp_t / llhttp_settings_t
API (llhttp_settings_init, settings bound at init). The signatures are
otherwise identical.
- on_headers_complete: http-parser's "return 2" (skip body) maps to llhttp's
"return 1"; on_message_complete now returns HPE_PAUSED so the parser stops
cleanly after one message instead of parsing trailing bytes as a pipelined
request (which would re-enter on_message_begin and trip its asserts). This
matches http-parser, which returned immediately after firing
on_message_complete for our requests.
- http_session: llhttp_execute returns an llhttp_errno_t status code rather
than a byte count; derive bytes-consumed from the buffer length (HPE_OK) or
llhttp_get_error_pos (HPE_PAUSED / HPE_PAUSED_UPGRADE).
- .vscode: repoint include paths/associations at the new llhttp sources.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
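The bytes-consumed rule from the `http_session` item can be sketched in isolation as below. This is a minimal illustration, not the code in this PR: to keep it compilable without the vendored header, `sketch_rc` is a hypothetical stand-in for the `HPE_OK` / `HPE_PAUSED` / `HPE_PAUSED_UPGRADE` results of `llhttp_execute`, and `error_pos` stands in for the pointer returned by `llhttp_get_error_pos`. The function name and the `-1` error convention are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for llhttp's HPE_OK / HPE_PAUSED / HPE_PAUSED_UPGRADE
 * and "any other error" results, so the sketch builds without llhttp.h.
 * Real code would switch on the llhttp_errno_t returned by llhttp_execute. */
enum sketch_rc { SKETCH_OK, SKETCH_PAUSED, SKETCH_PAUSED_UPGRADE, SKETCH_INVALID };

/* Returns how many bytes of buf the parser consumed, or -1 on a parse error.
 * error_pos models the pointer llhttp_get_error_pos() would return; it is
 * only read when the parser paused. */
ptrdiff_t
sketch_bytes_consumed(enum sketch_rc rc, const char *buf, size_t len, const char *error_pos)
{
    switch (rc) {
    case SKETCH_OK:
        /* The whole buffer was accepted. */
        return (ptrdiff_t)len;
    case SKETCH_PAUSED:
    case SKETCH_PAUSED_UPGRADE:
        /* The parser stopped part-way; the position must lie inside buf. */
        assert(error_pos >= buf && error_pos <= buf + len);
        return error_pos - buf;
    default:
        /* Any other code is treated as a parse failure in this sketch. */
        return -1;
    }
}
```

In this sketch, an 18-byte request followed by 5 trailing bytes with a paused position of `buf + 18` yields 18, so the trailing bytes are left unconsumed.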
✅ Full in-container build + load test passed
Built and tested inside the project's
Build
Functional smoke test (both parser paths)
The no-body path exercises
Load / stability
(Keepalive disabled matches SLEdge's
Marking ready for review.
✅ From-scratch rebuild + full experiment suite: 16/16 pass
Per request, did a complete clean rebuild and ran the canonical experiment suite (
From-scratch rebuild
Deleted
Experiment results (all PASS)
16 / 16 passed. This spans GET, POST with text bodies, POST with large binary image bodies (sod resize / LPD), heavy CNN inference (cifar10), scheduler experiments under EDF/FIFO load (fibonacci bimodal), high-concurrency GET load (empty: 10k requests × concurrency 1–100), and the error path.
Note on the two "traps" experiments
📊 Focused parser microbenchmark: llhttp vs http-parser
Isolated the parser from WASM/scheduling by benchmarking raw request parsing in a standalone harness (container clang-13, the project's toolchain). Each parser was compiled to its own object separately from the harness (no LTO) so callbacks aren't cross-inlined, matching how the runtime actually links. Identical callbacks (count headers / span url+body) for both. Best of 5 runs, pinned to one core, 4M iterations per request type. I measured two comparisons, because the old vendored
Results (ns per request; lower is better)
(No parse failures: both parsers validated
Takeaways
Methodology note: representative request shapes (hey-style GET, fibonacci POST, sod-style 4KB binary POST); the harness was a throwaway (not committed; it would have re-vendored the very
This is great! Had no idea the existing one was using node.js. However, I think this PR will have to wait until I run my final experiments these days. I will come back to this later.
Closes #331.
What
Replaces Node's deprecated, frozen http-parser (vendored at v2.9.4) with its maintained successor, llhttp (~2x faster, MIT, same incremental callback model).
Build toolchain: unchanged
The usual objection to llhttp is that it's generated from TypeScript via llparse, which would drag Node/npm into the build. We sidestep that entirely by vendoring llhttp's pre-generated C release artifacts from the upstream `release` branch (tag `release/v9.4.2`): `include/llhttp.h` plus `src/{llhttp,api,http}.c`.
These compile with the normal C toolchain: no Node, npm, or llparse. The old `http-parser` git submodule is dropped (removed from `.gitmodules`), so this also removes one submodule from `make submodules`. See `runtime/thirdparty/llhttp/VENDOR.md` for provenance and update steps.
Changes
- `.c` files into `dist/` (built with `-O3`). One `http_parser.o` CFILE → three `llhttp*.o` CFILEs.
- `http_parser_settings.{c,h}`: port callbacks to the `llhttp_t` / `llhttp_settings_t` API (`llhttp_settings_init`, settings bound at `llhttp_init` time). Callback signatures are otherwise identical.
- `http_session.h`: `llhttp_execute` returns an `llhttp_errno_t` status code, not a byte count. Bytes-consumed is derived from the buffer length on `HPE_OK`, or `llhttp_get_error_pos()` on `HPE_PAUSED` / `HPE_PAUSED_UPGRADE`.
- `return 2` (skip body) → llhttp `return 1`; `on_message_complete` now returns `HPE_PAUSED` so the parser stops cleanly after one message rather than parsing trailing bytes as a pipelined request (which would re-enter `on_message_begin` and trip its asserts). This faithfully mirrors http-parser, which `RETURN`ed immediately after firing `on_message_complete` for our request shapes.
Verification
- `llhttp.o`, `api.o`, `http.o`).
- `http_parser_settings.c`, and a TU including `http_session.h`) pass `clang -std=gnu11 -Wall -fsyntax-only` with no errors.
- `make` / runtime integration test. The project builds inside its Docker dev container; my local tree can't complete the `ck` (Concurrency Kit) submodule build because its committed Makefile has the container's `/sledge` path baked in (pre-existing, unrelated). Left as draft pending a full in-container build + load test against GET and POST/PUT (body) workloads.
Notes for reviewers
- `llhttp_set_lenient_*` toggles. Worth a sanity load test.
- `const char *http_methods[]` in `http_parser_settings.c` is pre-existing dead code (unused, and not keyed to any parser enum); left untouched to keep the diff focused.
🤖 Generated with Claude Code