Skip to content

fix: address QA defects found by 10-spec public-API corpus#8

Merged
nic-6443 merged 3 commits intomainfrom
fix/qa-bugs-v103
Apr 23, 2026
Merged

fix: address QA defects found by 10-spec public-API corpus#8
nic-6443 merged 3 commits intomainfrom
fix/qa-bugs-v103

Conversation

@jarvis9443
Copy link
Copy Markdown
Contributor

@jarvis9443 jarvis9443 commented Apr 23, 2026

Summary

Multi-spec QA against 10 widely-adopted public OpenAPI specs (github, twilio, box, cloudflare, plaid, openai, square, discourse, adyen-checkout, notion — 13,148 cases) surfaced four validator-side defects plus one jsonschema-side defect (the latter is fixed in api7/jsonschema#100). This PR fixes all four validator-side bugs and adds regression coverage.

QA report (with verification numbers): api7ee/qa/lua-resty-openapi-validator-v1.0.3.md.

Bugs fixed

Bug 1 — router: path parameter dropped when followed by a literal suffix

lua-resty-radixtree's fetch_pat() treats any /-segment whose first byte is : as a single-segment param, discarding the literal suffix. Templates like /users/{id}.json or /files/{name}.{ext} became regex users/([^/]+) with the param stored under name id.json (or name}.{ext}), so match() looked up the param under its declared name and got nil.

Effect on corpus: 195/197 Twilio operations and ~30 Discourse operations could never validate a positive request.

Fix: detect mixed segments at compile time, build a per-route PCRE ^(prefix)<literal>(?P<id>[^/]+?)<literal>...$ and use ngx.re.match to re-extract path params after radixtree returns. Non-greedy +? is critical for {name}.{ext} patterns. base_paths are stripped so the PCRE only sees the spec-relative path.

Bug 2 — normalize: nullable enum with explicit null silently disables the enum check

For schemas like {type: string, nullable: true, enum: ["a", null]}, normalize wrapped them as anyOf [original, {type: null}] but kept cjson.null inside the cloned enum. The cjson.null userdata in the enum array caused api7/jsonschema to silently skip the enum check entirely — any string value passed (Plaid corpus: 12 false negatives).

Fix: when wrapping nullable schemas, strip cjson.null from the cloned enum (and clear const if it equals cjson.null). The added {type: null} branch already permits null, so the cjson.null entries were redundant.

Bug 4 — params: pipeDelimited / spaceDelimited / form?explode=false crash on table value

The array-style branches called split() on raw_value without checking whether it was already a Lua table (multi-value query input). Crashed with attempt to index a table value on Box's deepObject-adjacent operations (48 crashes).

Fix: coerce table raw_value to a delimiter-joined string before splitting. The form/explode=true branch was already table-safe.

Bug 5 — body: cjson.null content_type crash

if route.body_content and content_type then accepted cjson.null because userdata is truthy in Lua, then dereferenced it as a string. Easy to hit when callers decode requests via cjson.decode and pass fields through without sanitization.

Fix: guard with type(content_type) == "string".

Verification

Re-ran the full 13,148-case corpus with these fixes (out/*.fixed.results.jsonl). Highlights:

Spec / metric Before After
discourse positive 62/93 (66.7%) 91/93 (97.8%)
box positive 246/296 (83.1%) 277/296 (93.6%)
box crashes 48 0
cloudflare crashes 17 0 (via jsonschema#100)
plaid bad_enum_body 42/54 54/54
plaid missing_required_body 264/271 271/271

Twilio positive remains at 2/197, but the failures changed from "path param missing" to legitimate "value too short for minLength: 34" — that's a test-data artefact (Twilio SIDs are 34-char prefixed IDs; the generator emits "1"), not a validator defect.

All existing unit + conformance tests still pass.

Tests

Adds t/conformance/test_qa_bugs_v103.lua with one focused case per bug (compile-and-validate, no real HTTP).

Summary by CodeRabbit

  • Bug Fixes

    • Fixed crashes when validating request body content-type with non-string values
    • Improved nullable schema validation for enum properties with null values
    • Fixed array parameter deserialization with delimiter-based formats
    • Enhanced path parameter extraction in templates with mixed literal text and parameters
  • Tests

    • Added comprehensive test suite for v1.0.3 QA regression cases

Multi-spec QA against 10 widely-adopted public OpenAPI specs (github,
twilio, box, cloudflare, plaid, openai, square, discourse,
adyen-checkout, notion — 13,148 cases) surfaced four validator-side
defects plus one jsonschema-side defect (the latter fixed in
api7/jsonschema#100). This commit fixes all four validator-side bugs
and adds regression coverage.

Bug 1 — router: path parameter dropped when followed by literal suffix
====================================================================
lua-resty-radixtree's fetch_pat() treats any '/'-segment whose first
byte is ':' as a single-segment param, discarding the literal suffix.
Templates like '/users/{id}.json' or '/files/{name}.{ext}' became
regex 'users/([^/]+)' with the param stored under name 'id.json' (or
'name}.{ext}'), so match() looked up the param under its declared
name and found nil. Effect on the corpus: 195/197 Twilio operations
and ~30 Discourse operations could never validate a positive request.

Fix: detect mixed segments at compile time, build a per-route PCRE
'^(prefix)<literal>(?P<id>[^/]+?)<literal>...$' and use ngx.re.match
to re-extract path params after radixtree returns. Non-greedy '+?' is
critical for '{name}.{ext}' patterns. base_paths are stripped so the
PCRE only sees the spec-relative path.

Bug 2 — normalize: nullable enum with explicit null silently disables
====================================================================
For schemas like '{type: string, nullable: true, enum: ["a", null]}',
normalize wrapped them as 'anyOf [original, {type: null}]' but kept
cjson.null inside the cloned enum. The cjson.null userdata in the
enum array caused api7/jsonschema to silently skip the enum check
entirely — any string value passed (Plaid corpus: 12 false negatives).

Fix: when wrapping nullable schemas, strip cjson.null from the cloned
enum (and clear const if it equals cjson.null). The added '{type:null}'
branch already permits null, so the cjson.null entries were redundant.

Bug 4 — params: pipeDelimited / spaceDelimited / form?explode=false
====================================================================
The array-style branches called split() on raw_value without checking
whether it was already a Lua table (multi-value query input). Crashed
with 'attempt to index a table value' on Box's deepObject-adjacent
operations (48 crashes).

Fix: coerce table raw_value to a delimiter-joined string before
splitting. The form/explode=true branch was already table-safe.

Bug 5 — body: cjson.null content_type crash
==========================================
'if route.body_content and content_type then' accepted cjson.null
because userdata is truthy in Lua, then dereferenced it as a string.
Easy to hit when callers decode requests via cjson.decode and pass
fields through without sanitization.

Fix: guard with 'type(content_type) == "string"'.

Verification
============
Re-ran the full 13,148-case corpus with these fixes. Highlights:
- discourse positive:  62/93 → 91/93 (Bug 1)
- box positive:        246/296 → 277/296; crashes 48 → 0 (Bug 4)
- cloudflare crashes:  17 → 0 (Bug 3, via jsonschema#100)
- plaid bad_enum_body: 42/54 → 54/54 (Bug 2)
- All existing unit + conformance tests still pass.

Tests
=====
Adds t/conformance/test_qa_bugs_v103.lua with one focused case per
bug (compile-and-validate, no real HTTP).
Copilot AI review requested due to automatic review settings April 23, 2026 11:42
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 23, 2026

Warning

Rate limit exceeded

@jarvis9443 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 5 minutes and 15 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 5 minutes and 15 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 85cca1e6-c735-4272-957c-5935778bd18a

📥 Commits

Reviewing files that changed from the base of the PR and between 86f7c34 and d7e6c85.

📒 Files selected for processing (4)
  • lib/resty/openapi_validator/body.lua
  • lib/resty/openapi_validator/normalize.lua
  • lib/resty/openapi_validator/router.lua
  • t/conformance/test_qa_bugs_v103.lua
📝 Walkthrough

Walkthrough

The PR fixes multiple issues: content-type validation crashes when non-string values are encountered, nullable schemas with enum/const containing null sentinels are incorrectly normalized, array parameters arriving as tables in delimited styles lack proper coercion, and path templates with mixed literal text and parameters are not matched accurately. A new conformance test suite validates all fixes.

Changes

Cohort / File(s) Summary
Body Validation
lib/resty/openapi_validator/body.lua
Adds type check to ensure content_type is a string before attempting lowercase and matching against route.body_content, preventing crashes from non-string sentinel values.
Schema Normalization
lib/resty/openapi_validator/normalize.lua
Updates nullable-schema handling for OpenAPI 3.0 to filter out cjson.null from enum arrays and clear const when it equals cjson.null, preventing silent validation breaks from userdata values while preserving null-type allowance.
Parameter Deserialization
lib/resty/openapi_validator/params.lua
Introduces coercion logic that converts table-typed raw_value into a delimiter-joined string before passing to split() for delimiter-based array parameter styles (simple, form with explode:false, pipeDelimited, spaceDelimited).
Router Path Matching
lib/resty/openapi_validator/router.lua
Adds detection and specialized matching for OpenAPI path templates with mixed /-bounded segments containing both {param} placeholders and literal prefix/suffix text; builds template-specific PCRE patterns and performs secondary authoritative matching during route resolution.
Conformance Tests
t/conformance/test_qa_bugs_v103.lua
Introduces v1.0.3 QA regression test suite covering path parameter extraction in mixed-segment routes, nullable enum validation, delimited array parameter coercion, and safe handling of null content-types.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Router
    participant Radixtree
    participant PCRE
    participant ParamExtractor

    Client->>Router: Request with path /users/{id}.json
    Router->>Radixtree: Match path against tree
    Radixtree-->>Router: Return route candidate (with base_path)
    
    Note over Router: Detect mixed segment<br/>(contains {param} + literal suffix)
    
    Router->>PCRE: Execute template-specific PCRE<br/>pattern with path
    PCRE-->>Router: Capture groups with<br/>parameter mappings
    
    alt PCRE Match Success
        Router->>ParamExtractor: Extract params from<br/>PCRE captures
        ParamExtractor-->>Router: Return accurate<br/>parameter values
        Router-->>Client: Route match + params
    else PCRE Match Fails
        Router-->>Client: No match (override<br/>radixtree result)
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • nic-6443
🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
E2e Test Quality Review ⚠️ Warning PR has critical implementation and test quality issues: incomplete content_type guard in body.lua (lines 385, 390, 420, 438, 439 lack type protection), fundamentally flawed Bug 5 test that doesn't exercise vulnerable code paths, and unresolved router.lua lint warnings. Implement comprehensive content_type guard using normalized string variable, fix Bug 5 test with non-empty body scenario, and resolve router.lua lint warnings by removing unused variable assignments.
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title "fix: address QA defects found by 10-spec public-API corpus" directly describes the main purpose of the PR: fixing QA defects discovered through testing. However, it is somewhat generic and does not specify which defects or what areas were fixed, making it broad rather than precisely focused on the primary technical change.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Security Check ✅ Passed Security review of pull request changes across 7 vulnerability categories (data exposure, secrets, auth bypass, insecure resource access, TLS misconfiguration, resource isolation, and secret reference resolution) found no issues. Modified files contain only OpenAPI validation logic without logging sensitive data, storing secrets, bypassing authorization, or creating security gaps.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/qa-bugs-v103

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes validator defects discovered by running a large public OpenAPI corpus, improving routing correctness, schema normalization, and parameter/body handling; adds targeted regression tests.

Changes:

  • Router: add mixed-segment detection and PCRE-based re-extraction of path params for templates like /users/{id}.json.
  • Normalization: ensure nullable enum/const normalization doesn’t leak cjson.null into jsonschema validation.
  • Params/body: make delimiter-style array params table-safe; add a guard for non-string content_type; add conformance regression tests.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
t/conformance/test_qa_bugs_v103.lua Adds regression cases covering the QA-found defects.
lib/resty/openapi_validator/router.lua Adds mixed-segment detection and fallback param extraction via ngx.re.match.
lib/resty/openapi_validator/params.lua Coerces table query values for delimiter-based array styles before splitting.
lib/resty/openapi_validator/normalize.lua Cleans cjson.null out of nullable enum and adjusts nullable wrapping behavior.
lib/resty/openapi_validator/body.lua Adds a type guard around the content-type declared-in-spec check.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread lib/resty/openapi_validator/body.lua Outdated
Comment thread t/conformance/test_qa_bugs_v103.lua
Comment thread lib/resty/openapi_validator/normalize.lua Outdated
Comment thread lib/resty/openapi_validator/router.lua
Comment thread lib/resty/openapi_validator/router.lua
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/resty/openapi_validator/body.lua`:
- Around line 363-367: The current guard only checks content_type before one
str_lower call, but non-string content_type can still reach other str_lower
usages (e.g., later checks that assume a string). Sanitize content_type once at
the start of the validation scope (e.g., in the function that references
route.body_content) by replacing non-strings with nil: if type(content_type) ~=
"string" then content_type = nil end; then use the sanitized content_type for
all subsequent calls to str_lower and helper functions so no non-string value
ever reaches string.lower (refer to the variables content_type,
route.body_content and the str_lower calls).

In `@lib/resty/openapi_validator/router.lua`:
- Around line 323-338: Remove the unused local ngx_re and the unused err from
the fallback matcher: delete the unused require("ngx.re") assignment (remove the
ngx_re local) and change the re_match call that currently does "local mm, err =
re_match(rel, route.param_pcre, 'jo')" to either capture only mm or use a
throwaway variable (e.g., local mm, _ = re_match(...)) so that err is not an
unused local; this touches the fallback matcher that references re_match and
route.param_pcre.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2bf2b5ef-dcc9-4b71-b156-055394f5b922

📥 Commits

Reviewing files that changed from the base of the PR and between bf4c423 and 86f7c34.

📒 Files selected for processing (5)
  • lib/resty/openapi_validator/body.lua
  • lib/resty/openapi_validator/normalize.lua
  • lib/resty/openapi_validator/params.lua
  • lib/resty/openapi_validator/router.lua
  • t/conformance/test_qa_bugs_v103.lua

Comment thread lib/resty/openapi_validator/body.lua Outdated
Comment thread lib/resty/openapi_validator/router.lua Outdated
- body.lua: normalize non-string content_type to nil at the top of
  validate() so all downstream sites (find_body_schema_for_content_type,
  is_json_content_type, ...) treat it like an absent header. The
  previous spot-fix only guarded the 'declared in spec' check.

- normalize.lua: previously, nullable+const-equals-null cleared the
  const and produced an empty 'original' branch in the anyOf wrapper,
  which would accept ANY value. Collapse to {type:'null'} instead so
  the schema correctly accepts only null. Same collapse applied when
  cleaning enum leaves an empty array (i.e. the enum was [null]).

- t/conformance/test_qa_bugs_v103.lua: the Bug 5 case used to early-
  return on empty body, missing the actual crash path. Now sends a
  non-empty body so content_type is evaluated. Added Bug 2b case for
  the nullable+const=null collapse.
@jarvis9443
Copy link
Copy Markdown
Contributor Author

Addressed in d7e6c85:

  • Bug 5 spot-fix → top-of-function normalization: content_type is now coerced to nil at the top of body.validate(), so find_body_schema_for_content_type() and is_json_content_type() (and any future content_type consumer) see nil, not the cjson.null userdata. The single guard wasn't enough — good catch.
  • Bug 2 const-null fix was wrong: I was clearing original.const when it equaled cjson.null, which collapsed the schema to anyOf [{}, {type:"null"}] (matches anything). Replaced with: when the only allowed value was null, collapse the wrapper to just {type:"null"}. Same collapse when stripping null leaves an empty enum.
  • Bug 5 test didn't exercise the crash path: it short-circuited on empty body. Now sends a JSON body so content_type is actually evaluated.
  • Added a Bug 2b test for the nullable+const=null collapse.
  • Lint warnings already cleaned up in ab57baf.

@nic-6443 nic-6443 merged commit ff9db5c into main Apr 23, 2026
3 checks passed
@nic-6443 nic-6443 deleted the fix/qa-bugs-v103 branch April 23, 2026 12:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants