Add agentic classification to the headers workflow #3400
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
Adds an OpenAI-based RFC classifier and response-only header lookup, refactors the header generator to incrementally classify new IANA header names and regenerate two modules with a stored timestamp, updates the well-known headers module and test, and updates workflow and dev dependencies for model access.
Changes
HTTP Header Incremental Classification
Sequence Diagram(s)
sequenceDiagram
participant ToolsHeaders as tools/headers.ts
participant RFC_Agent as tools/rfc-agent.classifyHeaders
participant GitHubModels as GitHub Models API
participant lookupRfc as tools/rfc-agent.lookupRfc
participant RFCEditor as rfc-editor.org
ToolsHeaders->>RFC_Agent: call classifyHeaders(headers[])
RFC_Agent->>GitHubModels: chat.completions.create (with lookup_rfc tool schema)
loop tool calls requested
GitHubModels->>RFC_Agent: requests lookup_rfc(number)
RFC_Agent->>lookupRfc: lookupRfc(number)
lookupRfc->>RFCEditor: GET /rfc/rfc{number}.txt
RFCEditor-->>lookupRfc: RFC text
lookupRfc-->>RFC_Agent: excerpt(s)
RFC_Agent->>GitHubModels: append tool result
end
GitHubModels-->>RFC_Agent: final assistant content (JSON)
RFC_Agent->>RFC_Agent: JSON.parse + ResponseSchema.parse
RFC_Agent-->>ToolsHeaders: return classification result
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Co-authored-by: Anna Bocharova <robin_tail@me.com>
Automated updates to well-known HTTP headers. New headers classified via GitHub Models API (GPT-4.1). Response-only headers moved to `tools/response-only-headers.yml`. Co-authored-by: RobinTail <13189514+RobinTail@users.noreply.github.com>
♻️ Duplicate comments (2)
tools/rfc-agent.ts (2)
115-130: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Cap the tool-calling loop.
If the model keeps returning `finish_reason: "tool_calls"`, this loop can run until the Actions job times out.
Suggested fix
    + const maxToolTurns = 8;
      let completion = await client.chat.completions.create(agentConfig);
      let toolCallCount = 0;
    - while (completion.choices[0].finish_reason === "tool_calls") {
    + for (let turn = 0; completion.choices[0].finish_reason === "tool_calls"; turn++) {
    +   if (turn >= maxToolTurns) {
    +     throw new Error("Exceeded max tool-call turns");
    +   }
        const choice = completion.choices[0];
        messages.push(choice.message);
        for (const toolCall of choice.message.tool_calls ?? []) {
          if (!("function" in toolCall)) continue;
          const { number } = JSON.parse(toolCall.function.arguments);
    @@
        }
        if (toolCallCount >= headers.length) agentConfig.tool_choice = "none";
        completion = await client.chat.completions.create(agentConfig);
      }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/rfc-agent.ts` around lines 115 - 130, The loop that re-invokes client.chat.completions.create while completion.choices[0].finish_reason === "tool_calls" can spin indefinitely; add a hard cap (e.g., maxToolCalls) and enforce it using the existing toolCallCount (increment per processed tool call) and/or an iteration counter so the while loop breaks when the cap is reached, set agentConfig.tool_choice = "none" when capped, and log an error/warning; update references in this block (completion, toolCallCount, headers, agentConfig, client.chat.completions.create, messages) to check the cap before making another request to ensure the loop always terminates.
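The cap described in this finding can be sketched in isolation. This is a minimal illustration, not the PR's code: `ModelTurn`, `runToolLoop`, and the injected `step` and `lookup` functions are hypothetical stand-ins for the `client.chat.completions.create` call and `lookupRfc`, and the default of 8 turns is taken from the suggested fix.

```typescript
// Hypothetical, simplified shape of one model turn (not the OpenAI SDK types).
type ModelTurn =
  | { finishReason: "tool_calls"; rfcNumbers: number[] }
  | { finishReason: "stop"; content: string };

// Runs the tool-calling loop but gives up after `maxToolTurns` round trips,
// so a model that never stops requesting tools cannot stall the job.
async function runToolLoop(
  step: (toolResults: string[]) => Promise<ModelTurn>,
  lookup: (rfc: number) => Promise<string>,
  maxToolTurns = 8,
): Promise<string> {
  let turn = await step([]);
  for (let used = 0; turn.finishReason === "tool_calls"; used++) {
    if (used >= maxToolTurns) {
      throw new Error(`Exceeded ${maxToolTurns} tool-call turns`);
    }
    // Resolve every requested lookup, then hand the results back to the model.
    const results = await Promise.all(turn.rfcNumbers.map(lookup));
    turn = await step(results);
  }
  return turn.content;
}
```

Throwing on the cap (rather than silently breaking out) keeps a truncated conversation from being parsed as if it were a final answer.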
135-136: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Verify the model classified exactly the requested headers.
`ResponseSchema.parse()` only checks shape. A partial response, duplicate names, or extra names still passes and can silently corrupt the incremental header lists.
Suggested fix
    - const parsed = JSON.parse(raw);
    - return ResponseSchema.parse(parsed);
    + const parsed = ResponseSchema.parse(JSON.parse(raw));
    + const expected = new Set(headers.map((header) => header.toLowerCase()));
    + const actual = new Set(parsed.map(({ name }) => name.toLowerCase()));
    +
    + if (parsed.length !== headers.length || actual.size !== expected.size) {
    +   throw new Error("LLM response must classify each requested header exactly once");
    + }
    + for (const name of actual) {
    +   if (!expected.has(name)) {
    +     throw new Error(`LLM returned an unexpected header: ${name}`);
    +   }
    + }
    +
    + return parsed;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/rfc-agent.ts` around lines 135 - 136, The current code simply does const parsed = JSON.parse(raw); return ResponseSchema.parse(parsed); which only enforces shape and can allow partial, duplicate, or extra header names; after parsing (parsed) and schema validation (ResponseSchema.parse), explicitly validate that the returned header list exactly matches the requested header names: verify same length, no duplicates (use a Set on the names), and set equality between the parsed header names and the expected/requested header names (fail fast if any extra, missing, or duplicate names are present), and throw or return a clear validation error instead of accepting the result.
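As a rough sketch of the "exactly once" check this finding asks for, the helper below rejects extra, duplicate, and missing names with separate error messages. `Classified` and `assertExactHeaders` are hypothetical names, and the item shape is reduced to `name` for illustration; the real schema presumably carries more fields.

```typescript
// Hypothetical item shape; only `name` matters for this check.
interface Classified {
  name: string;
}

// Throws unless `items` names every requested header exactly once
// (case-insensitively). Returns the items unchanged on success.
function assertExactHeaders<T extends Classified>(requested: string[], items: T[]): T[] {
  const expected = new Set(requested.map((header) => header.toLowerCase()));
  const seen = new Set<string>();
  for (const { name } of items) {
    const key = name.toLowerCase();
    if (!expected.has(key)) throw new Error(`Unexpected header: ${name}`);
    if (seen.has(key)) throw new Error(`Duplicate header: ${name}`);
    seen.add(key);
  }
  const missing = Array.from(expected).filter((header) => !seen.has(header));
  if (missing.length > 0) throw new Error(`Missing headers: ${missing.join(", ")}`);
  return items;
}
```

Because duplicates are detected per item, a response that repeats one name and omits another fails even though its length matches the request.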
ℹ️ Review info
📒 Files selected for processing (4)
- express-zod-api/src/well-known-headers.ts
- tools/headers.ts
- tools/response-only-headers.ts
- tools/rfc-agent.ts
✅ Files skipped from review due to trivial changes (2)
- tools/response-only-headers.ts
- express-zod-api/src/well-known-headers.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- tools/headers.ts
No new issues. Reviewed the following changes:
- Wrapped `lookupRfc`'s `fetch` and `resp.text()` in a try/catch and added `AbortSignal.timeout(5000)` to the fetch call in `tools/rfc-agent.ts`, returning a bounded `Error: Failed to fetch RFC ${number}` payload on any thrown error. The existing `!resp.ok` branch and the caller's `content.startsWith("Error:")` detection still work for both error paths. Directly closes the prior `tools/rfc-agent.ts:54-58` finding about the unbounded RFC fetch.
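The hardening described above might look roughly like the following. This is a sketch under assumptions: the `FetchLike` type and the injectable `fetchImpl` parameter are added here only so the function can be exercised without network access, the wording of the `!resp.ok` message is invented, and the real `lookupRfc` returns excerpts rather than the whole RFC text.

```typescript
// Minimal structural type so a fake can be injected (an assumption, not part
// of the PR); the global `fetch` satisfies it.
type FetchLike = (
  url: string,
  init: { signal: AbortSignal },
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>;

async function lookupRfc(number: number, fetchImpl: FetchLike = fetch): Promise<string> {
  try {
    const resp = await fetchImpl(`https://www.rfc-editor.org/rfc/rfc${number}.txt`, {
      signal: AbortSignal.timeout(5000), // abort a stalled request after 5 s
    });
    // Hypothetical message; the source only says this branch exists.
    if (!resp.ok) return `Error: RFC ${number} responded with ${resp.status}`;
    return await resp.text();
  } catch {
    // Network failure, timeout, or body read error: return a bounded payload.
    return `Error: Failed to fetch RFC ${number}`;
  }
}
```

Both failure paths produce a string starting with `Error:`, so a caller that checks `content.startsWith("Error:")` handles them uniformly.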
Claude Opus | 𝕏
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tools/rfc-agent.ts`:
- Around line 51-52: The current filter using /^[\w-]+$/ drops RFC9110-valid
tchar characters; update the logic around headerPattern/rfcLookupRegex so no
valid header is silently omitted: either 1) escape every header when building
headerPattern (e.g., map headers through a regex-escape function before join) so
rfcLookupRegex = new RegExp(`\\b(${escapedHeaderPattern})\\b`, "gi"), or 2)
validate each header against the full RFC token charset (tchars:
!#$%&'*+-.^_`|~0-9A-Za-z) and throw an error if any header fails validation
(instead of filtering) so the code fails fast; apply the change to the headers
-> headerPattern construction and keep rfcLookupRegex creation unchanged except
for using the escaped/validated headerPattern.
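Both options from this comment can be combined in a few lines. The sketch below is illustrative only: `tokenPattern`, `escapeRegExp`, and `buildHeaderRegex` are hypothetical names, and the function both validates against the RFC 9110 `tchar` set (failing fast) and escapes each name before building the alternation.

```typescript
// RFC 9110 token characters (tchar): ! # $ % & ' * + - . ^ _ ` | ~ DIGIT ALPHA
const tokenPattern = /^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$/;

// Escape regex metacharacters so a header name is matched literally.
const escapeRegExp = (value: string): string =>
  value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Fails fast on invalid names instead of silently filtering them out.
function buildHeaderRegex(headers: string[]): RegExp {
  const invalid = headers.filter((header) => !tokenPattern.test(header));
  if (invalid.length > 0) {
    throw new Error(`Invalid header names: ${invalid.join(", ")}`);
  }
  const headerPattern = headers.map(escapeRegExp).join("|");
  return new RegExp(`\\b(${headerPattern})\\b`, "gi");
}
```

One caveat: `\b` only anchors reliably for names that begin and end with a word character, so names that start or end with punctuation would need a different boundary check.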
Note
Mirrors the lookupRfc hardening from f705c32 onto the IANA fetch. No new issues.
Reviewed the following changes:
- Added `AbortSignal.timeout(5000)` and a `response.ok` check around the IANA `field-names.csv` fetch in `tools/headers.ts`, so a slow/broken IANA endpoint now fails fast at module top-level rather than hanging the workflow. Addresses CodeRabbit's prior outside-diff feedback on the IANA fetch.
- Moved the `exceptionNames` set construction down next to its only use site (just above the `newHeaders` filter), purely a locality cleanup.
Claude Opus | 𝕏
No new issues. Reviewed the following changes:
- Made `ResponseSchema` dynamic per batch in `tools/rfc-agent.ts` via a new `makeResponseSchema(names)` factory called inside `classifyHeaders`. The schema now constrains `name: z.literal(names)` and `.length(names.length)`, so the LLM response is rejected when it contains an unknown header name, a missing entry, or an extra entry. The interpolated `z.toJSONSchema(ResponseSchema)` in the user prompt also carries the enum + `minItems`/`maxItems`, so the model sees the constraints at generation time. This substantially closes the long-tracked "validate that returned `name` set matches the requested batch" finding. One narrow case (a response that duplicates one batch name and omits another) still slips through both checks, but the gap is strictly narrower than before.
Claude Opus | 𝕏
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tools/rfc-agent.ts`:
- Around line 130-135: The code currently JSON.parse()s
toolCall.function.arguments and coerces Number(number) || 0 which can throw or
mask bad inputs; change it to safely parse and validate using a Zod schema
(e.g., schema = z.object({ number: z.preprocess(n => Number(n),
z.number().int().positive()) }) ), catch JSON.parse errors, validate via
schema.parse or safeParse, and on validation failure return/emit a clear tool
error for the lookup_rfc invocation (do not call lookupRfc with 0); update the
loop handling around toolCall and lookupRfc to only call
lookupRfc(Number(number)) after successful validation and log/return the schema
validation error when invalid.
- Around line 47-49: The function classifyHeaders currently uses z.infer<typeof
ResponseSchema> in its return type while ResponseSchema is declared inside the
function body, causing a block-scoped symbol error; fix by hoisting the Zod
schema (ResponseSchema) and/or a top-level type alias (e.g., ResponseType =
z.infer<typeof ResponseSchema>) above the classifyHeaders declaration so the
signature can legally reference it, then update classifyHeaders to return
Promise<ResponseType> (or Promise<z.infer<typeof ResponseSchema>> if you hoisted
the schema).
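The first finding boils down to treating tool arguments as untrusted input. The sketch below uses plain TypeScript as a stand-in for the Zod schema the comment suggests; `ParsedArgs` and `parseLookupArgs` are hypothetical names, and unlike the suggested `z.preprocess` variant it does not coerce strings to numbers.

```typescript
// Result type: either a validated RFC number or an error string that can be
// sent back to the model as the tool result.
type ParsedArgs = { ok: true; number: number } | { ok: false; error: string };

function parseLookupArgs(rawArguments: string): ParsedArgs {
  let value: unknown;
  try {
    value = JSON.parse(rawArguments); // model output may not be valid JSON
  } catch {
    return { ok: false, error: "Error: tool arguments are not valid JSON" };
  }
  const candidate =
    typeof value === "object" && value !== null
      ? (value as Record<string, unknown>).number
      : undefined;
  if (typeof candidate !== "number" || !Number.isInteger(candidate) || candidate <= 0) {
    return { ok: false, error: "Error: `number` must be a positive integer" };
  }
  return { ok: true, number: candidate };
}
```

A caller would invoke `lookupRfc` only when `ok` is true and otherwise append the `error` string as the tool result, so a malformed call never turns into an RFC 0 lookup.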
Actionable comments posted: 1
♻️ Duplicate comments (3)
tools/rfc-agent.ts (3)
129-133: ⚠️ Potential issue | 🟠 Major
Validate `lookup_rfc` arguments before parsing or fetching.
`toolCall.function.arguments` is model output. Raw `JSON.parse()` can abort the whole run, and `Number(number) || 0` silently turns malformed input into an RFC 0 lookup.
47-49: ⚠️ Potential issue | 🔴 Critical
`ResponseSchema` is out of scope in the exported return type.
This signature references a block-scoped symbol that is only declared later inside `classifyHeaders`, so the file cannot type-check as written. Hoist a stable interface or the schema/type alias above the function.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/rfc-agent.ts` around lines 47 - 49, The function classifyHeaders currently references ResponseSchema in its return type even though ResponseSchema is declared later and is out of scope; fix this by hoisting a stable type declaration above classifyHeaders—either move the ResponseSchema zod schema (and its corresponding export) above the classifyHeaders function or create and export a standalone TypeScript interface/type alias (e.g., ResponseSchemaType or IResponseSchema) above the function and use that in classifyHeaders' Promise return type so the symbol is in scope when classifyHeaders is declared.
146-147: ⚠️ Potential issue | 🟠 Major
Reject duplicate, missing, or extra header names before returning.
`ResponseSchema.parse(parsed)` only validates shape. It still allows repeated names to satisfy `.length(...)`, which can silently drop a requested header from the final classification set.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/rfc-agent.ts` around lines 146 - 147, The code currently does JSON.parse(raw) and immediately returns ResponseSchema.parse(parsed), but ResponseSchema.parse only checks shape and can accept repeated header names; add an explicit validation before returning: extract the parsed.headers (or the relevant header-names array) after JSON.parse(raw), ensure there are no duplicates (use a Set to compare size), and verify the set of header names exactly equals the expected header-name list (compare lengths and membership or compare Sets) so missing or extra names are rejected; if validation fails, throw or return a clear error before calling ResponseSchema.parse(parsed) so duplicates/missing/extra header names are not accepted.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tools/rfc-agent.ts`:
- Around line 11-15: The schema currently uses z.literal(names) which validates
the whole array instead of individual header strings; update the
makeResponseSchema signature to accept a non-empty tuple type ([string,
...string[]]) and replace z.literal(names) with z.enum(names) when building the
object schema (the object with the name property inside makeResponseSchema) so
the name field validates against each individual string in names; ensure
TypeScript typing for makeResponseSchema and any callers is adjusted to the new
tuple type.
No new issues. Reviewed the following changes:
- Reclassified `public`, `public-key-pins`, and `public-key-pins-report-only` as response-only headers (moved out of `getWellKnownHeaders()` and into `tools/response-only-headers.ts`). `public-key-pins` and `-report-only` are cleanly response-only per RFC 7469 §2.3.1/§2.3.2.
- Drove the `lookup_rfc` tool definition from a Zod schema in `tools/rfc-agent.ts`: added `lookupRfcSchema = z.function({ input: [z.object({ number: z.number() })], output: z.string() })`, derived the OpenAI tool `parameters` JSON Schema from it, and wrapped `lookupRfc` via `lookupRfcSchema.implementAsync(...)` so the model's tool args are Zod-validated at runtime instead of coerced through `Number(x) || 0`.
Claude Opus | 𝕏
24c03df to 97ca344
Important
The new v28.0.1 "Removed (not expected in Request)" list is missing four headers that this PR actually drops from getWellKnownHeaders() — accept-ch, alternates, authentication-control, authentication-info. They were present on master (faa3e4b) and now live in tools/response-only-headers.ts, so the published changelog will undercount the behaviour change of the Documentation generator.
Reviewed the following changes:
- Added a `### v28.0.1` entry summarising the well-known header set changes (added `cdn-loop`; removed thirteen response-only reclassifications).
Claude Opus | 𝕏
No new issues. Reviewed the following changes:
- Added `accept-ch`, `alternates`, `authentication-control`, `authentication-info` to the "Removed (not expected in Request)" line of the `### v28.0.1` CHANGELOG entry, closing the prior `[!IMPORTANT]` finding about the four removals missing from the release notes. The Removed list now matches the full set of headers reclassified to `tools/response-only-headers.ts`.
Claude Opus | 𝕏
1e45b01 to 1f2c446

This PR adds agentic automation to the workflow that updates the well-known headers.
The goal is to delegate classification work that previously required manual effort.
It uses a GitHub-hosted LLM equipped with an RFC lookup tool to classify headers.
Response-only headers are extracted into a maintained list of exceptions within dedicated files.
Both lists are imported statically so that later runs can update them incrementally.
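The incremental step can be pictured as a set difference: only names that appear in the IANA registry but in neither stored list need to be sent to the classifier. The sketch below is an illustration under assumptions; `findNewHeaders` and its parameter names are hypothetical, and the actual generator in `tools/headers.ts` may organise this differently.

```typescript
// Returns IANA names that are in neither stored list, lowercased and deduplicated.
// All three inputs are plain string arrays here; the real lists are generated modules.
function findNewHeaders(
  ianaNames: string[],
  wellKnown: string[],
  responseOnly: string[],
): string[] {
  const classified = new Set(
    wellKnown.concat(responseOnly).map((name) => name.toLowerCase()),
  );
  const fresh = new Set<string>();
  for (const name of ianaNames) {
    const key = name.trim().toLowerCase();
    if (key && !classified.has(key)) fresh.add(key);
  }
  return Array.from(fresh);
}
```

With this shape, a run where the registry has not changed produces an empty list and no model call is needed.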