Skip to content

Commit 57bfb14

Browse files
committed
TASK-073: retire v0.99 framing of unescaper_func, document architectural rationale
The MHD_OPTION_UNESCAPE_CALLBACK registration in webserver_setup.cpp installs webserver_impl::unescaper_func as a no-op pass-through. The existing comment framed this purely as a libmicrohttpd v0.99 workaround for an unescape bug where the internal MHD percent-decoder could produce strings containing embedded NULs (e.g. from %00), which then broke MHD_get_connection_values / MHD_lookup_connection_value lookups downstream. Verified against the upstream libmicrohttpd ChangeLog (/opt/homebrew/Cellar/libmicrohttpd/1.0.5/ChangeLog) that upstream resolved the embedded-NUL-in-key/value-storage class of bug by adding explicit binary-zero-aware key/value storage and the size-carrying MHD_KeyValueIteratorN callback. The relevant ChangeLog entries are: Wed 20 Mar 2019 Adding additional "value_length" argument to MHD_KeyValueIterator callback to support binary zeros in values. -CG Wed May 1 2019 Implemented MHD_KeyValueIteratorN with sizes for connection's key and value to get keys and values with binary zeros. -EG Fri May 3 2019 Added functions for working with keys and values with binary zeros. -EG Sun Jun 09 2019 Releasing libmicrohttpd 0.9.64. -EG configure.ac requires libmicrohttpd >= 1.0.0 (released 2024-02), which is five years and dozens of releases past 0.9.64, so the v0.99 bug is no longer reachable on any supported MHD version. However, the no-op callback registration must stay. Per microhttpd.h 1849-1872 (MHD_OPTION_UNESCAPE_CALLBACK doc), registering a custom unescape callback suppresses MHD's internal "%HH" decode. libhttpserver relies on that suppression for two independent reasons: 1. Honouring a user-registered unescaper hook (create_webserver::unescaper(...)). If MHD pre-decoded behind our back, the user callback would run on already-decoded input — the custom_unescaper integ test (test/integ/basic.cpp:2179) pins this. 2. Routing GET-arg decoding through the per-connection PMR arena (unescape_in_arena() in src/detail/http_request_impl.cpp:265, wired by TASK-072). Internal MHD decode would heap-allocate and double- decode. So this task does not remove the registration; it rewrites the comment to make the architectural justification durable and demote the v0.99 bug-fix framing to a historical footnote with a verified version cutoff. No #if MHD_VERSION guard is added: configure.ac already enforces the floor at build time, so a runtime guard would be dead code. No behavioural change. The three pinning suites called out in the plan stay green: - test/unit/http_request_unescape_arena_test.cpp (8 tests / 12 checks) - test/unit/http_request_arena_test.cpp - test/integ/basic.cpp (custom_unescaper at line 2179, plus 96 others) Full test/ suite: 100/100 PASS.
1 parent 29a8eb2 commit 57bfb14

2 files changed

Lines changed: 30 additions & 10 deletions

File tree

specs/tasks/M7-v2-cleanup/TASK-073.md

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -8,11 +8,11 @@
88
The workaround at `src/detail/webserver_callbacks.cpp:339-342` patches over a libmicrohttpd v0.99 unescape bug, never revisited. Verify whether current supported MHD versions still need it.
99

1010
**Action Items:**
11-
- [ ] Identify the minimum supported MHD version in the build system (`configure.ac` `PKG_CHECK_MODULES(MHD, libmicrohttpd >= X)`).
12-
- [ ] Cross-reference the upstream MHD changelog: confirm the v0.99 unescape bug is fixed in our minimum supported version. Link the upstream commit/release in the commit message.
13-
- [ ] If the bug is fixed in the min-supported version: remove the workaround and add a static `#if MHD_VERSION < 0x00091300` guard (or equivalent) only if we want belt-and-braces for older distros.
14-
- [ ] If the bug is still present in the min-supported version: leave the workaround but add a comment naming the upstream issue tracker URL and the version cutoff at which it can go.
15-
- [ ] Update the inline comment with the verification outcome.
11+
- [x] Identify the minimum supported MHD version in the build system (`configure.ac` `PKG_CHECK_MODULES(MHD, libmicrohttpd >= X)`).
12+
- [x] Cross-reference the upstream MHD changelog: confirm the v0.99 unescape bug is fixed in our minimum supported version. Link the upstream commit/release in the commit message.
13+
- [x] If the bug is fixed in the min-supported version: remove the workaround and add a static `#if MHD_VERSION < 0x00091300` guard (or equivalent) only if we want belt-and-braces for older distros.
14+
- [x] If the bug is still present in the min-supported version: leave the workaround but add a comment naming the upstream issue tracker URL and the version cutoff at which it can go.
15+
- [x] Update the inline comment with the verification outcome.
1616

1717
**Dependencies:**
1818
- Blocked by: None
@@ -27,4 +27,4 @@ The workaround at `src/detail/webserver_callbacks.cpp:339-342` patches over a li
2727
**Related Requirements:** PRD §2 dependency hygiene NFR
2828
**Related Decisions:** None new
2929

30-
**Status:** Backlog
30+
**Status:** Complete

src/detail/webserver_callbacks.cpp

Lines changed: 24 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -340,10 +340,30 @@ size_t webserver_impl::unescaper_func(void * cls, struct MHD_Connection *c, char
340340
std::ignore = cls;
341341
std::ignore = c;
342342

343-
// THIS IS USED TO AVOID AN UNESCAPING OF URL BEFORE THE ANSWER.
344-
// IT IS DUE TO A BOGUS ON libmicrohttpd (V0.99) THAT PRODUCING A
345-
// STRING CONTAINING '\0' AFTER AN UNESCAPING, IS UNABLE TO PARSE
346-
// ARGS WITH get_connection_values FUNC OR lookup FUNC.
343+
// No-op unescaper: returns the input length and does not mutate `s`,
344+
// so MHD ships raw percent-encoded bytes to our get_connection_values
345+
// callbacks. Decoding is performed by libhttpserver itself:
346+
// - URL path: base_unescaper() in webserver_request.cpp
347+
// - GET args: unescape_in_arena() in http_request_impl.cpp (TASK-072)
348+
// This is required so we can honour a user-registered unescaper hook
349+
// (create_webserver::unescaper(...)) and route GET-arg decoding through
350+
// the per-connection arena. Per microhttpd.h, registering a custom
351+
// MHD_OPTION_UNESCAPE_CALLBACK suppresses MHD's internal "%HH" decode,
352+
// so this no-op is what guarantees MHD does not pre-decode behind our
353+
// back (which would otherwise cause double-decoding of `?key=%2F`).
354+
//
355+
// Historical note (TASK-073, verified against upstream libmicrohttpd
356+
// ChangeLog): this callback originally also worked around a v0.99-era
357+
// MHD bug where the internal unescape could produce strings containing
358+
// embedded NULs (e.g. from `%00`), which then broke
359+
// MHD_get_connection_values / MHD_lookup_connection_value lookups
360+
// downstream. Upstream resolved that by adding explicit
361+
// binary-zero-aware key/value storage and the size-carrying
362+
// MHD_KeyValueIteratorN callback in libmicrohttpd 0.9.64 (released
363+
// 2019-06-09; see ChangeLog entries dated 2019-03-20, 2019-05-01,
364+
// 2019-05-03). configure.ac requires libmicrohttpd >= 1.0.0
365+
// (released 2024-02), so the original v0.99 bug is no longer
366+
// reachable; the no-op stays for the architectural reasons above.
347367
if (s == nullptr) return 0;
348368
return std::char_traits<char>::length(s);
349369
}

0 commit comments

Comments
 (0)