Commit d577b9e

MDA2AV and claude committed
Add deep analysis, verified RFC evidence, and ABNF grammar to all 140 glossary pages
Each test page now includes:
- Deep Analysis section with ABNF grammar, RFC evidence chain, chain of reasoning, and scored/unscored justification
- All RFC quotes verified against actual RFC text downloads from rfc-editor.org
- Corrected several section references (OBS-FOLD §5.1→§5.2, CONNECTION-CLOSE §9.3→§9.6, TE-OBS-FOLD §5.1→§5.2, TE-IDENTITY §7→§6.1, MULTIPLE-HOST-COMMA §7.2→§3.2)
- Fixed requirement levels (EXPECT-UNKNOWN MUST→MAY, WHITESPACE-BEFORE-HEADERS SHOULD→MUST, CHUNKED-EXTENSION SHOULD→MUST, LEADING-CRLF MAY→SHOULD)
- Updated obsolete RFC 7231 terminology to RFC 9110 (payload body→content)
- Fixed scoring descriptions to match code (BARE-LF Pass/Warn→Pass/Fail)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 55db343 commit d577b9e

140 files changed

Lines changed: 7500 additions & 384 deletions

docs/content/docs/body/chunked-body.md

Lines changed: 54 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -29,14 +29,66 @@ hello\r\n
2929

3030
## What the RFC says
3131

32-
> "The chunked transfer coding wraps the payload body in order to transfer it as a series of chunks, each with its own size indicator, followed by an OPTIONAL trailer section containing trailer fields." — RFC 9112 Section 7.1
32+
> "The chunked transfer coding wraps content in order to transfer it as a series of chunks, each with its own size indicator, followed by an OPTIONAL trailer section containing trailer fields." — RFC 9112 Section 7.1
3333
34-
A server that supports HTTP/1.1 must be able to decode chunked transfer encoding.
34+
> "A recipient MUST be able to parse and decode the chunked transfer coding." — RFC 9112 Section 7.1
35+
36+
> "A recipient MUST be able to parse the chunked transfer coding because it plays a crucial role in framing messages when the content size is not known in advance." — RFC 9112 Section 6.1
37+
38+
A server that supports HTTP/1.1 must be able to decode chunked transfer encoding. This is a MUST-level requirement.
3539

3640
## Why it matters
3741

3842
Chunked encoding is fundamental to HTTP/1.1 — it enables streaming, server-sent data, and requests where the body size isn't known in advance. If a server can't decode a basic chunked body, it cannot fully participate in HTTP/1.1.
3943

44+
## Deep Analysis
45+
46+
### Relevant ABNF grammar
47+
48+
From RFC 9112 Section 7.1:
49+
50+
```
chunked-body    = *chunk
                  last-chunk
                  trailer-section
                  CRLF

chunk           = chunk-size [ chunk-ext ] CRLF
                  chunk-data CRLF
chunk-size      = 1*HEXDIG
last-chunk      = 1*("0") [ chunk-ext ] CRLF

chunk-data      = 1*OCTET ; a sequence of chunk-size octets
trailer-section = *( field-line CRLF )
```
64+
65+
### Direct RFC quotes
66+
67+
> "The chunked transfer coding wraps content in order to transfer it as a series of chunks, each with its own size indicator, followed by an OPTIONAL trailer section containing trailer fields." -- RFC 9112 Section 7.1
68+
69+
> "A recipient MUST be able to parse and decode the chunked transfer coding." -- RFC 9112 Section 7.1
70+
71+
> "A recipient MUST be able to parse the chunked transfer coding because it plays a crucial role in framing messages when the content size is not known in advance." -- RFC 9112 Section 6.1
72+
73+
### Chain of reasoning
74+
75+
1. The test sends `Transfer-Encoding: chunked`, which triggers chunked body parsing per RFC 9112 Section 6.1.
76+
2. Per the ABNF, the server must parse the chunk-size `5` (1*HEXDIG = "5"), read the CRLF, then read exactly 5 octets of chunk-data (`hello`), then read the trailing CRLF.
77+
3. The next line is `0\r\n`, which matches `last-chunk = 1*("0") [ chunk-ext ] CRLF` -- this signals the end of chunked data.
78+
4. The final `\r\n` satisfies the trailing CRLF in the `chunked-body` production.
79+
5. The entire message is syntactically valid against the ABNF grammar. The server has no grounds to reject it.
80+
6. RFC 9112 Section 7.1 uses "MUST be able to parse and decode" -- the strongest normative keyword. Failure to accept this request is a protocol violation.
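
The parsing steps above can be sketched as a small in-memory decoder. This is an illustrative sketch only, not the test suite's or any server's implementation: the function name `decode_chunked` is hypothetical, the whole body is assumed to be already buffered, and trailer fields are skipped rather than validated.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Decode a fully buffered chunked body; raise ValueError if malformed."""
    body = bytearray()
    pos = 0
    while True:
        # Read the chunk-size line up to its CRLF.
        eol = raw.find(b"\r\n", pos)
        if eol == -1:
            raise ValueError("missing CRLF after chunk-size")
        # Drop any chunk-ext and the optional whitespace (BWS) before ";".
        size_field = raw[pos:eol].split(b";", 1)[0].rstrip(b" \t")
        if not size_field or any(
            c not in b"0123456789abcdefABCDEF" for c in size_field
        ):
            raise ValueError("chunk-size is not 1*HEXDIG")
        size = int(size_field, 16)
        pos = eol + 2
        if size == 0:
            break  # last-chunk reached
        # Read exactly `size` octets of chunk-data, then the trailing CRLF.
        data = raw[pos:pos + size]
        if len(data) != size or raw[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk-data truncated or not followed by CRLF")
        body += data
        pos += size + 2
    # Skip trailer fields (if any) until the empty line ending chunked-body.
    while True:
        eol = raw.find(b"\r\n", pos)
        if eol == -1:
            raise ValueError("missing final CRLF")
        if eol == pos:
            return bytes(body)
        pos = eol + 2
```

Applied to the body this test sends, `decode_chunked(b"5\r\nhello\r\n0\r\n\r\n")` yields `b"hello"`.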
81+
82+
### Scored / Unscored justification
83+
84+
**Scored.** The requirement uses MUST ("A recipient MUST be able to parse and decode the chunked transfer coding"). This is a non-negotiable RFC requirement. Any server claiming HTTP/1.1 support that rejects a syntactically valid single-chunk body is non-compliant. The test expects `2xx` with no fallback to `400` because there is no ambiguity in the grammar or the requirement level.
85+
86+
### Edge cases
87+
88+
- Some servers reject chunked encoding on POST if they expect `Content-Length` only -- this violates RFC 9112 Section 6.1 which mandates chunked parsing support.
89+
- Servers behind load balancers may never see chunked requests if the LB de-chunks first, but the server itself must still support it.
90+
- A few lightweight embedded HTTP servers omit chunked support entirely and in effect behave as HTTP/1.0-only implementations. This test correctly flags that deficiency.
91+
4092
## Sources
4193

4294
- [RFC 9112 Section 7.1](https://www.rfc-editor.org/rfc/rfc9112#section-7.1)

docs/content/docs/body/chunked-empty.md

Lines changed: 54 additions & 2 deletions
@@ -27,14 +27,66 @@ Transfer-Encoding: chunked\r\n
2727

2828
## What the RFC says
2929

30-
> "The last chunk has a chunk size of zero, indicating the end of the chunk data." — RFC 9112 Section 7.1
30+
> "A recipient MUST be able to parse and decode the chunked transfer coding." — RFC 9112 Section 7.1
3131
32-
A zero-size first chunk is valid and indicates an empty body. The server must recognize the terminator and not block waiting for additional data.
32+
The chunked grammar defines `last-chunk = 1*("0") [ chunk-ext ] CRLF`. A zero-size first chunk is the terminator and indicates an empty body. The server must recognize it and not block waiting for additional data.
33+
34+
The grammar allows `*chunk` (zero or more data chunks) before the `last-chunk`, so a chunked body containing only the zero terminator is syntactically valid.
3335

3436
## Why it matters
3537

3638
Empty chunked bodies occur when a client starts a chunked transfer but has nothing to send, or when a proxy rewrites a zero-length CL body into chunked encoding. The server must handle this edge case cleanly.
3739

40+
## Deep Analysis
41+
42+
### Relevant ABNF grammar
43+
44+
From RFC 9112 Section 7.1:
45+
46+
```
chunked-body    = *chunk
                  last-chunk
                  trailer-section
                  CRLF

chunk           = chunk-size [ chunk-ext ] CRLF
                  chunk-data CRLF
chunk-size      = 1*HEXDIG
last-chunk      = 1*("0") [ chunk-ext ] CRLF

chunk-data      = 1*OCTET ; a sequence of chunk-size octets
trailer-section = *( field-line CRLF )
```
60+
61+
### Direct RFC quotes
62+
63+
> "A recipient MUST be able to parse and decode the chunked transfer coding." -- RFC 9112 Section 7.1
64+
65+
> "A recipient MUST be able to parse the chunked transfer coding because it plays a crucial role in framing messages when the content size is not known in advance." -- RFC 9112 Section 6.1
66+
67+
> "The chunked transfer coding wraps content in order to transfer it as a series of chunks, each with its own size indicator, followed by an OPTIONAL trailer section containing trailer fields." -- RFC 9112 Section 7.1
68+
69+
### Chain of reasoning
70+
71+
1. The test sends `Transfer-Encoding: chunked`, activating chunked body parsing per RFC 9112 Section 6.1.
72+
2. The ABNF production `chunked-body = *chunk last-chunk trailer-section CRLF` uses `*chunk`, meaning **zero or more** data chunks are valid before the `last-chunk`.
73+
3. The first (and only) line of the body is `0\r\n`, which matches `last-chunk = 1*("0") [ chunk-ext ] CRLF`. This is the zero-length terminator with no preceding data chunks.
74+
4. The `trailer-section` production is `*( field-line CRLF )` -- zero or more trailer fields. In this test, there are none.
75+
5. The final `\r\n` satisfies the trailing CRLF in the `chunked-body` production.
76+
6. The complete body `0\r\n\r\n` is a valid instance of `chunked-body` with zero data chunks, zero trailer fields. The grammar explicitly permits this.
77+
7. A server that blocks waiting for additional data after seeing the zero-length chunk has failed to correctly implement the chunked decoder.
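
To illustrate why a correct decoder does not need more bytes after `0\r\n\r\n`, here is a hedged sketch of a completeness check over a receive buffer. The name `chunked_end_offset` is hypothetical, and strict validation of the chunk-size field is deliberately omitted (Python's `int()` is more lenient than `1*HEXDIG`), so this shows only the framing logic.

```python
from typing import Optional


def chunked_end_offset(buf: bytes) -> Optional[int]:
    """Return the offset just past a complete chunked-body, or None if
    more bytes are needed. Validation is intentionally minimal."""
    pos = 0
    while True:
        eol = buf.find(b"\r\n", pos)
        if eol == -1:
            return None  # chunk-size line is not complete yet
        size = int(buf[pos:eol].split(b";", 1)[0], 16)
        pos = eol + 2
        if size == 0:
            break  # last-chunk: zero data chunks may precede it
        if len(buf) < pos + size + 2:
            return None  # chunk-data or its CRLF has not arrived yet
        pos += size + 2
    # Zero or more trailer lines, then the empty line that ends the body.
    while True:
        eol = buf.find(b"\r\n", pos)
        if eol == -1:
            return None
        if eol == pos:
            return eol + 2
        pos = eol + 2
```

With the five bytes `0\r\n\r\n` in the buffer the function returns `5`, so the body is complete and the server can respond. With only `0\r\n` it returns `None`, the one case where waiting for more data is justified.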
78+
79+
### Scored / Unscored justification
80+
81+
**Scored.** The requirement uses MUST ("A recipient MUST be able to parse and decode the chunked transfer coding"). The `*chunk` production (zero or more) explicitly allows an empty body. The server must accept this and respond with `2xx` or close the connection cleanly. The `AllowConnectionClose` flag is set because some servers may close the connection after processing a zero-length chunked body, which is acceptable behavior.
82+
83+
### Edge cases
84+
85+
- Some servers interpret a zero-length chunked body as "no body at all" and respond with `411 Length Required`, which is incorrect because the framing headers (Transfer-Encoding: chunked) are present and well-formed.
86+
- Proxies may rewrite `Content-Length: 0` into chunked encoding, producing exactly this payload. Servers must handle it.
87+
- A server that hangs waiting for data after the `0\r\n\r\n` terminator has a bug in its chunked state machine -- it is not recognizing the last-chunk production.
88+
- Some implementations require at least one non-zero chunk before the terminator, which contradicts the `*chunk` (zero-or-more) ABNF.
89+
3890
## Sources
3991

4092
- [RFC 9112 Section 7.1](https://www.rfc-editor.org/rfc/rfc9112#section-7.1)

docs/content/docs/body/chunked-extension.md

Lines changed: 53 additions & 4 deletions
@@ -9,8 +9,8 @@ weight: 10
99
| **Test ID** | `COMP-CHUNKED-EXTENSION` |
1010
| **Category** | Compliance |
1111
| **RFC** | [RFC 9112 Section 7.1.1](https://www.rfc-editor.org/rfc/rfc9112#section-7.1.1) |
12-
| **Requirement** | SHOULD accept |
13-
| **Expected** | `2xx` = Pass; `400` = Warn |
12+
| **Requirement** | MUST ignore unrecognized extensions |
13+
| **Expected** | `2xx` or `400` |
1414

1515
## What it sends
1616

@@ -29,14 +29,63 @@ hello\r\n
2929

3030
## What the RFC says
3131

32-
> "The chunked encoding allows each chunk to include zero or more chunk extensions, immediately following the chunk-size, for the sake of supplying per-chunk metadata." — RFC 9112 Section 7.1.1
32+
> "The chunked coding allows each chunk to include zero or more chunk extensions, immediately following the chunk-size, for the sake of supplying per-chunk metadata (such as a signature or hash), mid-message control information, or randomization of message body size." — RFC 9112 Section 7.1.1
3333
34-
Chunk extensions are part of the chunked encoding grammar. A compliant parser should skip unrecognized extensions and process the chunk data normally.
34+
> "A recipient MUST ignore unrecognized chunk extensions." — RFC 9112 Section 7.1.1
35+
36+
> "A server ought to limit the total length of chunk extensions received in a request to an amount reasonable for the services provided, in the same way that it applies length limitations and timeouts for other parts of a message, and generate an appropriate 4xx (Client Error) response if that amount is exceeded." — RFC 9112 Section 7.1.1
37+
38+
Chunk extensions are part of the chunked encoding grammar. A compliant parser must ignore unrecognized extensions and process the chunk data normally.
3539

3640
## Why it matters
3741

3842
While chunk extensions are rarely used in practice, they are syntactically valid. A server that rejects them has an overly strict chunk parser that may break with legitimate clients or proxies that add extensions for metadata (e.g., checksums, signatures).
3943

44+
## Deep Analysis
45+
46+
### Relevant ABNF grammar
47+
48+
From RFC 9112 Section 7.1 and 7.1.1:
49+
50+
```
chunk          = chunk-size [ chunk-ext ] CRLF
                 chunk-data CRLF
chunk-size     = 1*HEXDIG

chunk-ext      = *( BWS ";" BWS chunk-ext-name
                    [ BWS "=" BWS chunk-ext-val ] )

chunk-ext-name = token
chunk-ext-val  = token / quoted-string
```
61+
62+
### Direct RFC quotes
63+
64+
> "The chunked coding allows each chunk to include zero or more chunk extensions, immediately following the chunk-size, for the sake of supplying per-chunk metadata (such as a signature or hash), mid-message control information, or randomization of message body size." -- RFC 9112 Section 7.1.1
65+
66+
> "A recipient MUST ignore unrecognized chunk extensions." -- RFC 9112 Section 7.1.1
67+
68+
> "A server ought to limit the total length of chunk extensions received in a request to an amount reasonable for the services provided, in the same way that it applies length limitations and timeouts for other parts of a message, and generate an appropriate 4xx (Client Error) response if that amount is exceeded." -- RFC 9112 Section 7.1.1
69+
70+
### Chain of reasoning
71+
72+
1. The test sends chunk-size line `5;ext=value\r\n`. Parsing this against the ABNF: `chunk-size` matches `5`, then `chunk-ext` matches `;ext=value` where `ext` is the `chunk-ext-name` (a token) and `value` is the `chunk-ext-val` (also a token).
73+
2. The `chunk` production explicitly includes `[ chunk-ext ]` -- chunk extensions are an optional but grammatically valid part of every chunk.
74+
3. RFC 9112 Section 7.1.1 states recipients "MUST ignore unrecognized chunk extensions". The word "ignore" means the server must parse past them and process the chunk-data normally.
75+
4. However, the RFC also says servers "ought to limit the total length of chunk extensions" and may generate a 4xx response if limits are exceeded. This introduces a legitimate reason for a `400` response.
76+
5. The extension in this test (`ext=value`) is short (9 bytes), so a length-limit rejection would be unreasonable. But the RFC permits it in principle.
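
The two behaviors discussed above, ignoring unrecognized extensions and limiting their total length, can be sketched together. All names here (`chunk_size_from_line`, `ChunkExtTooLong`, `MAX_CHUNK_EXT_LEN`) are hypothetical, and the 1024-byte limit is an arbitrary assumption; the RFC leaves the actual limit to the server.

```python
HEXDIGITS = b"0123456789abcdefABCDEF"
MAX_CHUNK_EXT_LEN = 1024  # assumed limit; the RFC does not prescribe a value


class ChunkExtTooLong(ValueError):
    """Raised when chunk extensions exceed the configured limit."""


def chunk_size_from_line(line: bytes) -> int:
    """Return the chunk size from a chunk-size line (without its CRLF),
    ignoring any chunk extensions that follow the size."""
    size_part, sep, ext = line.partition(b";")
    if sep and len(ext) > MAX_CHUNK_EXT_LEN:
        raise ChunkExtTooLong("chunk extensions exceed configured limit")
    size_part = size_part.rstrip(b" \t")  # BWS may precede ";"
    if not size_part or any(c not in HEXDIGITS for c in size_part):
        raise ValueError("chunk-size is not 1*HEXDIG")
    return int(size_part, 16)
```

For the line this test sends, `chunk_size_from_line(b"5;ext=value")` returns `5`, and the extension is never interpreted. A server built this way might map `ChunkExtTooLong` to a 4xx response, but only for extensions far longer than the 9 bytes used here.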
77+
78+
### Scored / Unscored justification
79+
80+
**Unscored.** The MUST keyword applies to *ignoring unrecognized* extensions, which implies the server should parse and skip them. However, the RFC also explicitly permits servers to reject requests with excessive chunk extensions via a 4xx response. Because the boundary between "acceptable" and "excessive" is left to the server's discretion, there is room for a compliant server to reject even short extensions. The test therefore accepts either `2xx` or `400`: `2xx` is the preferred behavior, while `400` is not a clear violation.
81+
82+
### Edge cases
83+
84+
- Some servers strip chunk extensions before passing data to the application layer -- this is correct behavior per "MUST ignore unrecognized chunk extensions."
85+
- A few servers fail to parse the semicolon delimiter and treat `5;ext=value` as an invalid chunk-size, returning `400`. This is a parser bug, not a policy decision.
86+
- Chunk extensions with quoted-string values (e.g., `5;ext="hello world"`) are also valid per the ABNF but may trigger additional parser failures in implementations that only handle token values.
87+
- The BWS (bad whitespace) allowance means `5 ; ext = value` is also technically valid, though rarely seen in practice.
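
The quoted-string and BWS variants listed above can be checked against a regular expression transcribed from the ABNF. This is a sketch under assumptions: the helper name `is_valid_chunk_line` is hypothetical, and the `token` and `quoted-string` patterns are hand-transcribed from the RFC 9110 definitions rather than taken from any existing library.

```python
import re

# token and quoted-string, transcribed from RFC 9110 (assumed correct here).
TOKEN = rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
QUOTED = rb'"(?:[\t \x21\x23-\x5b\x5d-\x7e\x80-\xff]|\\[\t \x21-\x7e\x80-\xff])*"'
BWS = rb"[ \t]*"

# chunk-ext = *( BWS ";" BWS chunk-ext-name [ BWS "=" BWS chunk-ext-val ] )
CHUNK_EXT = (
    rb"(?:" + BWS + rb";" + BWS + TOKEN
    + rb"(?:" + BWS + rb"=" + BWS + rb"(?:" + TOKEN + rb"|" + QUOTED + rb"))?)*"
)
CHUNK_LINE = re.compile(rb"[0-9A-Fa-f]+" + CHUNK_EXT)


def is_valid_chunk_line(line: bytes) -> bool:
    """Check a chunk-size line (without its CRLF) against the grammar."""
    return CHUNK_LINE.fullmatch(line) is not None
```

Under this transcription, `5;ext=value`, `5;ext="hello world"`, and `5 ; ext = value` all match, while `5;` (no extension name) and `5;ext=hello world` (a space inside an unquoted value) do not.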
88+
4089
## Sources
4190

4291
- [RFC 9112 Section 7.1.1](https://www.rfc-editor.org/rfc/rfc9112#section-7.1.1)

docs/content/docs/body/chunked-hex-uppercase.md

Lines changed: 51 additions & 2 deletions
@@ -31,14 +31,63 @@ The chunk size `A` is uppercase hex for 10. The chunk data `helloworld` is exact
3131

3232
## What the RFC says
3333

34-
> "chunk-size = 1*HEXDIG" -- RFC 9112 Section 7.1
34+
> "A recipient MUST be able to parse and decode the chunked transfer coding." — RFC 9112 Section 7.1
3535
36-
`HEXDIG` is defined as `DIGIT / "A" / "B" / "C" / "D" / "E" / "F"` (case-insensitive per RFC 5234 ABNF conventions). Both `a` and `A` represent the decimal value 10. A compliant chunked parser must accept hex digits in any case.
36+
The chunked grammar defines `chunk-size = 1*HEXDIG`. `HEXDIG` is defined in RFC 5234 (ABNF) as `DIGIT / "A" / "B" / "C" / "D" / "E" / "F"`, and ABNF string matching is case-insensitive by definition. Both `a` and `A` represent the decimal value 10. A compliant chunked parser must accept hex digits in any case.
37+
38+
> "Recipients MUST anticipate potentially large hexadecimal numerals and prevent parsing errors due to integer conversion overflows or precision loss due to integer representation." — RFC 9112 Section 7.1
3739
3840
## Why it matters
3941

4042
While most chunk sizes in practice are small hexadecimal numbers (like `5` or `1a`), the grammar allows any combination of uppercase and lowercase hex digits. A parser that only handles lowercase hex, or only decimal digits, will fail on legitimate chunked bodies. This is a basic interoperability requirement for any HTTP/1.1 implementation.
4143

44+
## Deep Analysis
45+
46+
### Relevant ABNF grammar
47+
48+
From RFC 9112 Section 7.1:
49+
50+
```
chunk-size = 1*HEXDIG
```
53+
54+
From RFC 5234 Appendix B.1 (Core ABNF):
55+
56+
```
HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
DIGIT  = %x30-39 ; 0-9
```
60+
61+
Note: RFC 5234 Section 2.3 states that ABNF strings are case-insensitive. The HEXDIG definition listing uppercase `"A"` through `"F"` implicitly includes `"a"` through `"f"`.
62+
63+
### Direct RFC quotes
64+
65+
> "A recipient MUST be able to parse and decode the chunked transfer coding." -- RFC 9112 Section 7.1
66+
67+
> "Recipients MUST anticipate potentially large hexadecimal numerals and prevent parsing errors due to integer conversion overflows or precision loss due to integer representation." -- RFC 9112 Section 7.1
68+
69+
> `HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"` -- RFC 5234 Appendix B.1
70+
71+
### Chain of reasoning
72+
73+
1. The test sends chunk-size `A\r\n` followed by exactly 10 bytes of data (`helloworld`).
74+
2. The `chunk-size` ABNF production is `1*HEXDIG`, requiring one or more hexadecimal digits.
75+
3. `HEXDIG` is defined in RFC 5234 as `DIGIT / "A" / "B" / "C" / "D" / "E" / "F"`. Per RFC 5234 Section 2.3, ABNF string comparison is case-insensitive, so both `A` and `a` are valid HEXDIG values.
76+
4. `A` in hexadecimal equals 10 in decimal. The test provides exactly 10 bytes of chunk-data, satisfying the `chunk-data = 1*OCTET` production with the correct length.
77+
5. The `0\r\n\r\n` terminator satisfies `last-chunk` and the trailing CRLF.
78+
6. The entire message is a valid `chunked-body`. The MUST requirement to "parse and decode" chunked encoding necessarily includes correctly interpreting hex digits of any case.
79+
80+
### Scored / Unscored justification
81+
82+
**Scored.** The MUST requirement ("A recipient MUST be able to parse and decode the chunked transfer coding") encompasses correct hex parsing. Since `chunk-size = 1*HEXDIG` and HEXDIG is case-insensitive by ABNF rules, rejecting uppercase hex is a failure to parse valid chunked encoding. There is no SHOULD or MAY ambiguity -- the grammar is unambiguous and the requirement is MUST-level.
83+
84+
### Edge cases
85+
86+
- Some implementations use `strtol()` or equivalent with base 16, which naturally handles both cases. These pass without issue.
87+
- Implementations that use a hand-rolled hex parser with only `0-9` and `a-f` ranges (missing `A-F`) will fail this test. This is a common bug in minimal HTTP parsers.
88+
- Chunk sizes such as `1a`, `1A`, and `1b3F` are all equally valid because HEXDIG is case-insensitive. This test uses pure uppercase to catch the most common parser limitation.
89+
- The RFC also warns about large hex numerals causing integer overflow. While this test uses a small value (`A` = 10), the parser must be robust against both case variation and large values.
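
A hand-rolled chunk-size parser that covers both points from the edge cases above, case variation and large values, might look like the following sketch. The name `parse_chunk_size` and the cap `MAX_CHUNK_SIZE` are assumptions for illustration. The explicit digit loop is used because Python's `int(x, 16)` also accepts input such as `0x1A`, `+a`, or surrounding whitespace, which `1*HEXDIG` does not allow.

```python
MAX_CHUNK_SIZE = 2**31 - 1  # assumed cap; choose one that fits the service


def parse_chunk_size(field: bytes) -> int:
    """Parse 1*HEXDIG strictly, accepting both upper- and lowercase digits."""
    if not field:
        raise ValueError("empty chunk-size")
    value = 0
    for byte in field:
        if 0x30 <= byte <= 0x39:    # "0"-"9"
            digit = byte - 0x30
        elif 0x41 <= byte <= 0x46:  # "A"-"F"
            digit = byte - 0x41 + 10
        elif 0x61 <= byte <= 0x66:  # "a"-"f"
            digit = byte - 0x61 + 10
        else:
            raise ValueError("chunk-size contains a non-hex octet")
        value = value * 16 + digit
        # Checking the cap on every digit keeps huge inputs cheap to reject.
        if value > MAX_CHUNK_SIZE:
            raise ValueError("chunk-size exceeds configured limit")
    return value
```

Python integers do not overflow, so the cap here stands in for the fixed-width check that a C or Go parser would need before each multiply-and-add step.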
90+
4291
## Sources
4392

4493
- [RFC 9112 §7.1 -- Chunked Transfer Coding](https://www.rfc-editor.org/rfc/rfc9112#section-7.1)
