
feat(fuzz): mutation fuzzer with schema-conformance oracle #9

Merged
nic-6443 merged 7 commits into main from feat/fuzz-harness on Apr 24, 2026

jarvis9443 (Contributor) commented Apr 23, 2026

What

Adds a mutation-based fuzzer (fuzz/mutate_fuzz.py) that runs the validator against AST-mutated copies of real-world OpenAPI specs and checks two oracles:

  1. No crashes: validate_request must not throw a Lua error (caught with pcall).
  2. Schema conformance: a request generated to satisfy an operation's schema must be accepted by the validator. Rejections are candidate false-negative bugs.
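
The two oracles can be read as a small classifier over validator results. The following is a minimal sketch, not the code in fuzz/mutate_fuzz.py: the `Result` shape and its field names (`accepted`, `lua_error`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    """One validator outcome for a generated request (hypothetical shape)."""
    case_id: str
    accepted: bool                    # validator reported the request as valid
    lua_error: Optional[str] = None   # set when pcall caught a Lua error


def classify(result: Result) -> str:
    """Map a result onto the two oracles."""
    if result.lua_error is not None:
        return "crash"            # oracle 1: no Lua errors allowed
    if not result.accepted:
        return "false_negative"   # oracle 2: a conforming request was rejected
    return "ok"


results = [
    Result("a", accepted=True),
    Result("b", accepted=False),
    Result("c", accepted=False, lua_error="attempt to index a nil value"),
]
labels = [classify(r) for r in results]
```

Checking the crash oracle first means a Lua error is never misreported as a plain rejection.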

Changes

  • fuzz/mutate_fuzz.py — Python orchestrator with 6 mutation strategies, $ref resolution, schema-conforming request generator, and Lua subprocess harness
  • fuzz/seeds/ — Two real-world OpenAPI specs (Discourse, Notion) as mutation seeds
  • fuzz/README.md — Architecture, usage, and extension docs
  • .github/workflows/fuzz.yml — PR CI: 120s fuzz budget
  • .github/workflows/fuzz-nightly.yml — Nightly: 600s budget, auto-opens tracking issues on failure
  • Makefile: adds a make fuzz target

Review feedback addressed

  • Removed simple from query param styles (invalid per OAS 3.0)
  • Handle enum/const generically in sample_value() before type-specific branches
  • Clamp/swap integer bounds to prevent ValueError
  • Style-aware array serialization for query params (pipeDelimited, spaceDelimited)
  • Resolve $ref pointers before generating test cases
  • Wrap fuzz loop in try/finally to always write summary.json
  • Validate FUZZ_BUDGET as numeric in nightly workflow
  • Fix gh issue list null handling with // empty
  • Add text language tag to README fenced code block
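
Two of the fixes above (generic enum/const handling and clamped integer bounds) can be sketched as follows. This is an illustrative subset under assumed defaults, not the actual `sample_value()` from fuzz/mutate_fuzz.py; the fallback value and the default range are assumptions.

```python
import random


def sample_value(schema: dict, rng: random.Random):
    """Return a value intended to satisfy `schema` (illustrative subset)."""
    # const and enum pin the value regardless of the declared type,
    # so they are handled before any type-specific branch.
    if "const" in schema:
        return schema["const"]
    if schema.get("enum"):
        return rng.choice(schema["enum"])

    if schema.get("type") == "integer":
        lo = int(schema.get("minimum", 0))
        hi = int(schema.get("maximum", lo + 100))  # assumed default range
        if lo > hi:
            # A mutated spec can invert the bounds; swap them so that
            # rng.randint does not raise ValueError.
            lo, hi = hi, lo
        return rng.randint(lo, hi)
    if schema.get("type") == "boolean":
        return rng.choice([True, False])
    return "x"  # fallback for types this sketch does not cover
```

With inverted bounds no value can satisfy the schema, so the swap only keeps the generator from crashing; whether such a case should count toward the conformance oracle is a separate decision.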

Summary by CodeRabbit

  • New Features

    • Adds mutation-based fuzzing with configurable run budget, CI and nightly executions, artifact uploads on failures, and automated issue creation/updating for findings.
  • Documentation

    • Adds a detailed fuzzing guide covering mutators, request-case generation, outputs, failure handling, and CI/local usage.
  • Chores

    • Make/build and ignore-rule updates to support fuzz outputs and local/CI execution.

A small mutation-based fuzzer that runs the validator against AST-mutated
copies of real-world OpenAPI specs and checks two oracles:

1. No crashes (validate_request must not throw a Lua error).
2. Schema conformance — a request generated to satisfy an operation's
   schema must be accepted by the validator.

This is the productionised form of the harness used during v1.0.3 QA.
Locally it reproduces the path-extension Bug 1 against pre-fix v1.0.3
and the utf8_len(table) Bug 3 against unpatched jsonschema, and is
clean against the current main + jsonschema main.

Wired into CI:

- fuzz.yml         — runs on every PR / push to main, 120s budget.
                     Fails the job on any crash or candidate
                     false-negative; uploads fuzz/out/ as an artifact.
- fuzz-nightly.yml — runs daily at 18:00 UTC, 600s budget. On failure
                     uploads findings, then opens (or comments on) a
                     fuzz-nightly tracking issue assigned to @jarvis9443.

See fuzz/README.md for architecture, mutator list, and how to extend.
Copilot AI review requested due to automatic review settings April 23, 2026 14:35
coderabbitai Bot commented Apr 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds a mutation-based OpenAPI fuzzer: Python harness, seed spec, README, Makefile integration, .gitignore updates, and two GitHub Actions workflows (PR/main and nightly) that install OpenResty/LuaRocks, run the fuzzer with configurable budget, upload artifacts, and create/comment issues on failures.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CI Workflows: .github/workflows/fuzz.yml, .github/workflows/fuzz-nightly.yml | New GitHub Actions workflows that install OpenResty and LuaRocks, configure OpenSSL paths, install Lua deps, run make fuzz with FUZZ_BUDGET, upload fuzz/out/ artifacts (nightly retains longer and creates/updates fuzz-nightly issues on failures). |
| Makefile: Makefile | Add FUZZ_BUDGET ?= 60, new fuzz target invoking python3 fuzz/mutate_fuzz.py --budget $(FUZZ_BUDGET) --out fuzz/out, and update clean to remove fuzz/out. |
| Fuzzer code & ignore rules: fuzz/mutate_fuzz.py, fuzz/.gitignore, .gitignore | Add Python mutation-fuzz harness (main entrypoint), output path handling and exit codes; ignore fuzz/out/ and fuzz/__pycache__/. |
| Docs & seeds: fuzz/README.md, fuzz/seeds/notion.json | Add README describing harness, mutators, generators, CI behavior and outputs; add notion.json OpenAPI seed for fuzzing. |

Sequence Diagram(s)

sequenceDiagram
    participant PythonFuzzer as Python Fuzzer
    participant FS as File System
    participant Resty as Resty CLI
    participant Lua as Lua Validator

    PythonFuzzer->>FS: Load seed OpenAPI spec
    PythonFuzzer->>PythonFuzzer: Apply AST mutations & generate request cases
    loop per batch/request
        PythonFuzzer->>Resty: Invoke resty CLI (spec + JSONL cases via stdin)
        Resty->>Lua: Compile spec and call v.validate_request inside pcall
        Lua-->>Resty: Emit validation result or Lua error (JSONL)
        Resty-->>PythonFuzzer: Stream JSONL results
    end
    PythonFuzzer->>FS: Write fuzz/out/crashes.jsonl and fuzz/out/summary.json
    PythonFuzzer->>PythonFuzzer: Exit non-zero if crashes or non-noisy false-negatives found
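
The stdin/stdout exchange in the diagram can be sketched in Python as below. This is a rough illustration only: the child process here is a stand-in Python one-liner rather than the resty CLI, and the case and result fields (`id`, `ok`) are hypothetical.

```python
import json
import subprocess
import sys


def run_batch(cmd: list, cases: list, timeout: float = 30.0) -> list:
    """Send cases to a child process as JSONL on stdin; parse JSONL from stdout."""
    payload = "".join(json.dumps(c) + "\n" for c in cases)
    proc = subprocess.run(
        cmd, input=payload, capture_output=True, text=True, timeout=timeout
    )
    return [json.loads(line) for line in proc.stdout.splitlines() if line.strip()]


# Stand-in child that echoes each case back with an "ok" flag. In the real
# harness this role is played by the resty CLI running the Lua runner.
ECHO_CHILD = (
    "import sys, json\n"
    "for line in sys.stdin:\n"
    "    case = json.loads(line)\n"
    "    print(json.dumps({'id': case['id'], 'ok': True}))\n"
)

results = run_batch([sys.executable, "-c", ECHO_CHILD], [{"id": 1}, {"id": 2}])
```

Batching many cases into one child invocation amortizes process start-up and spec compilation, which is presumably why the diagram shows one resty call per batch rather than per request.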

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| E2e Test Quality Review | ⚠️ Warning | PR lacks error handling for malformed seed JSON files, missing resty command validation, and has no unit/E2E tests for fuzzer implementation. | Add try/except for JSON parsing, validate resty exists, check Lua dependencies, and create test_mutate_fuzz.py with comprehensive unit and E2E test coverage. |

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely describes the main change: introducing a mutation fuzzer with schema-conformance oracle validation for the fuzzing system. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Security Check | ✅ Passed | The pull request introduces mutation fuzzing infrastructure with no critical security vulnerabilities detected. |


Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@jarvis9443 jarvis9443 requested a review from Copilot April 23, 2026 14:42

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Makefile (1)

10-10: ⚠️ Potential issue | 🟠 Major

Add fuzz to .PHONY declaration.

The fuzz target is not listed in .PHONY, but a fuzz/ directory exists. Make will consider the target up-to-date based on the directory's existence, causing make fuzz to skip execution.

🐛 Proposed fix
-.PHONY: test test-unit test-conformance lint dev install clean help
+.PHONY: test test-unit test-conformance lint dev install clean help fuzz
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` at line 10, The .PHONY declaration is missing the fuzz target so
Make may treat the existing fuzz/ directory as an up-to-date target; update the
.PHONY line (the .PHONY declaration that currently lists test test-unit
test-conformance lint dev install clean help) to also include fuzz so the fuzz
target always runs regardless of the fuzz/ directory's existence.
🧹 Nitpick comments (4)
fuzz/mutate_fuzz.py (3)

247-248: max_per_op parameter is unused.

The max_per_op parameter is declared but never used in the function body. Either implement the limiting logic or remove the parameter.

♻️ Option 1: Remove unused parameter
-def gen_cases(spec: dict, rng: random.Random, max_per_op: int = 2) -> list[dict]:
+def gen_cases(spec: dict, rng: random.Random) -> list[dict]:

And update the call site at line 426:

-            cases = gen_cases(spec, rng, max_per_op=2)
+            cases = gen_cases(spec, rng)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@fuzz/mutate_fuzz.py` around lines 247 - 248, The gen_cases function declares
max_per_op but never uses it; either implement per-operation limiting inside
gen_cases by counting or slicing generated cases for each operation (use the
function name gen_cases and the parameter max_per_op to locate where to apply
the limit) so no more than max_per_op cases are returned per op, or remove the
max_per_op parameter from gen_cases and update all call sites that pass that
argument to call the new signature (search for gen_cases(...) calls to update).
Ensure whichever path you take keeps the function signature and callers
consistent.
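
If the limit is implemented rather than removed, one possible shape is a post-filter over the generated cases. This is a sketch under assumptions: the helper name `limit_per_op` and the `method`/`path` keys on each case are hypothetical, not taken from the PR.

```python
import random


def limit_per_op(cases: list, rng: random.Random, max_per_op: int = 2) -> list:
    """Keep at most `max_per_op` randomly chosen cases for each operation."""
    by_op = {}
    for case in cases:
        # (method, path) identifies the operation in this sketch.
        by_op.setdefault((case["method"], case["path"]), []).append(case)

    kept = []
    for group in by_op.values():
        if len(group) > max_per_op:
            group = rng.sample(group, max_per_op)
        kept.extend(group)
    return kept
```

Sampling with the shared `rng` keeps a run reproducible from its seed, which matters when a finding has to be replayed.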

351-352: Use explicit Optional type hint.

PEP 484 prohibits implicit Optional. The extra_includes parameter should use explicit union syntax.

♻️ Proposed fix
-def run_validator(spec: dict, cases: list[dict], deps: str, lib: str,
-                  extra_includes: list[str] = None, timeout: float = 30.0):
+def run_validator(spec: dict, cases: list[dict], deps: str, lib: str,
+                  extra_includes: list[str] | None = None, timeout: float = 30.0):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@fuzz/mutate_fuzz.py` around lines 351 - 352, The type hint for
run_validator's extra_includes is currently implicit None; change it to an
explicit Optional type (e.g., Optional[List[str]]) and ensure typing.Optional
(or from typing import Optional, List) is imported so the signature becomes
run_validator(..., extra_includes: Optional[List[str]] = None, ...); update the
function definition and add the necessary typing import if missing to satisfy
PEP 484.
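
A small sketch of the explicit form follows. The helper `build_include_paths` is hypothetical and exists only to show the annotation; it is not part of the PR. Note that `Optional[list[str]]` works on Python 3.9+, while the `list[str] | None` spelling in the proposed fix needs Python 3.10+ (or `from __future__ import annotations`) when used in a function signature.

```python
from typing import Optional


def build_include_paths(deps: str, lib: str,
                        extra_includes: Optional[list[str]] = None) -> list[str]:
    """Combine the fixed include paths with any optional extras."""
    # None means "no extras"; normalizing here avoids a mutable default.
    extras = list(extra_includes) if extra_includes is not None else []
    return [deps, lib] + extras
```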

370-374: Catch specific exception instead of bare Exception.

Silently swallowing all exceptions masks unexpected errors. Since you're parsing JSON, catch json.JSONDecodeError specifically.

♻️ Proposed fix
         try:
             out.append(json.loads(line))
-        except Exception:
+        except json.JSONDecodeError:
             pass
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@fuzz/mutate_fuzz.py` around lines 370 - 374, Replace the bare except
Exception around the json.loads(line) call with an except json.JSONDecodeError
to avoid swallowing unrelated errors: locate the try/except that wraps
json.loads(line) (which appends to out) and change the exception handler to only
catch json.JSONDecodeError so malformed JSON lines are skipped while other
exceptions propagate; ensure json is imported where mutate_fuzz.py uses
json.loads if not already.
fuzz/README.md (1)

19-27: Add language specifier to fenced code block.

The architecture diagram code block lacks a language specifier. While it's ASCII art, adding text or plaintext satisfies linters and improves rendering consistency.

📝 Suggested fix
-```
+```text
 mutate_fuzz.py (Python orchestrator)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@fuzz/README.md` around lines 19 - 27, The fenced code block that shows the
architecture diagram (the block starting with "mutate_fuzz.py (Python
orchestrator)") lacks a language specifier; update the opening fence from ``` to
```text (or ```plaintext) so linters/renderers recognize it as plain text and
the README's code block renders consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/fuzz-nightly.yml:
- Around line 40-42: The "Install LuaRocks" step currently pipes an external
script into sh (curl ... | sh), which creates a supply-chain risk; change the
step so it first downloads the script (the command invoked in the Install
LuaRocks step), pins a specific commit or tag for the script URL, verifies its
integrity (e.g., compare a checked-in or workflow-provided SHA256 or GPG
signature) and only then executes the verified file, or alternatively vendor a
known-good installer into the repo or use an official action/installer instead
of piping to shell; update the Install LuaRocks step to reference the pinned URL
and add explicit verification before execution.

In @.github/workflows/fuzz.yml:
- Around line 32-34: The GitHub Actions step that installs LuaRocks currently
pipes an external script to sh (the "Install LuaRocks" run step), which is a
supply-chain risk; replace that by either vendoring the apache/apisix
utils/linux-install-luarocks.sh script into this repo and invoking the local
copy, or pin the curl URL to a specific commit SHA (use
raw.githubusercontent.com/.../<commit_sha>/utils/linux-install-luarocks.sh)
before executing it; if you vendor the script, add a short comment in the
workflow referencing the original upstream URL and commit SHA and add a periodic
checklist entry to review upstream updates so the workflow uses a known-good,
auditable script rather than an unpinned remote one.

In `@fuzz/README.md`:
- Around line 85-87: The README's Nightly entry references the GitHub handle
"@jarvis-api7" which is inconsistent with the actual assignee "jarvis9443" used
in .github/workflows/fuzz-nightly.yml; update the README.md Nightly line (the
string "@jarvis-api7") to the correct handle "jarvis9443" (or alternatively
update the workflow to match the README) so both the Nightly description and the
fuzz-nightly.yml assignee are the same.
- Around line 12-15: The markdown link
"../../../qa/lua-resty-openapi-validator-v1.0.3.md" in the README is a broken
relative path; update that link to point to the correct location of the QA
document (use the actual repo-relative path or an absolute URL), or if the QA
doc doesn't exist or is external, remove the link or replace it with a valid URL
and adjust the surrounding text accordingly so the reference no longer 404s.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4558101f-da50-48d4-b4a2-190146535dc5

📥 Commits

Reviewing files that changed from the base of the PR and between ff9db5c and 68c8b73.

📒 Files selected for processing (8)
  • .github/workflows/fuzz-nightly.yml
  • .github/workflows/fuzz.yml
  • Makefile
  • fuzz/.gitignore
  • fuzz/README.md
  • fuzz/mutate_fuzz.py
  • fuzz/seeds/discourse.json
  • fuzz/seeds/notion.json

Comment thread .github/workflows/fuzz-nightly.yml Outdated
Comment thread .github/workflows/fuzz.yml Outdated
Comment thread fuzz/README.md Outdated
Comment thread fuzz/README.md Outdated

Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 9 comments.



Comment thread fuzz/mutate_fuzz.py Outdated
Comment thread fuzz/mutate_fuzz.py
Comment thread fuzz/mutate_fuzz.py
Comment thread fuzz/mutate_fuzz.py Outdated
Comment thread fuzz/README.md Outdated
Comment thread fuzz/README.md Outdated
Comment thread fuzz/seeds/discourse.json Outdated
Comment thread fuzz/mutate_fuzz.py Outdated
Comment thread fuzz/README.md Outdated
- Remove unused 'import copy' (Thread 6)
- Fix module docstring: remove unimplemented mutators (scalar↔array,
  $ref cycle), clarify only positive cases are generated (Thread 5, 12)
- Enforce max_per_op limit in gen_cases (Thread 7)
- Separate crash_count and false_negative_count in summary (Thread 8)
- Fix README: mutator contract returns bool not string (Thread 9)
- Fix README: broken QA doc link removed (Thread 3)
- Fix README: align LITERAL_EXTS with actual code (Thread 13)
- Fix README: @jarvis-api7 → @jarvis9443 (Thread 4, 10)
- Redact API key in discourse.json seed (Thread 11)
- Replace curl-pipe-sh LuaRocks install with pinned tarball (Thread 1, 2)
- Update summary.json format in README docs (Thread 8)

The previous 3.11.1 tarball from luarocks.org failed because LuaJIT
hits the 65536 constants limit when parsing the large manifest file.

Switch to the same approach used by apache/apisix CI: luarocks 3.12.0
from GitHub releases with proper OpenSSL path configuration.
Copilot AI review requested due to automatic review settings April 24, 2026 02:52

Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.



Comment thread fuzz/mutate_fuzz.py
Comment thread fuzz/mutate_fuzz.py
Comment thread fuzz/mutate_fuzz.py
Comment thread .github/workflows/fuzz-nightly.yml Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 7

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/fuzz-nightly.yml:
- Around line 50-53: The Install Lua dependencies step currently installs
floating versions of jsonschema and lua-resty-radixtree; change the commands to
pin exact versions (e.g. replace `sudo luarocks install jsonschema` and `sudo
luarocks install lua-resty-radixtree` with `sudo luarocks install jsonschema
<version>` and `sudo luarocks install lua-resty-radixtree <version>`) or use
`luarocks install --pin` to produce a lockfile and install from that to make the
nightly job reproducible; update the step that runs these commands (the "Install
Lua dependencies" run block) to include the chosen version strings or the --pin
workflow and commit the generated lockfile.
- Around line 22-23: Ensure FUZZ_BUDGET is validated as a safe numeric value and
quoted when passed to the shell: validate env.FUZZ_BUDGET in the run step (e.g.,
reject or default if it does not match ^[0-9]+$) and call the target with
quoting like make fuzz FUZZ_BUDGET="$FUZZ_BUDGET" so user-controlled
workflow_dispatch.inputs.budget cannot inject shell metacharacters; reference
the FUZZ_BUDGET variable and the make fuzz invocation for where to add the
numeric check and the quoted use.

In `@fuzz/mutate_fuzz.py`:
- Around line 426-432: The spec loaded into variable spec is mutated then passed
to gen_cases before $ref resolution, causing referenced schemas to be collapsed
to fallbacks; resolve all $ref references on spec (e.g., via the same OV/OpenAPI
compile/resolution used by run_validator/ov.compile or your existing resolver)
immediately after mutate(...) and before calling gen_cases(spec, ...), so
gen_cases operates on the fully-resolved spec; ensure the resolver mutates or
returns a resolved spec and use that resolved spec for gen_cases and later
run_validator.
- Around line 423-491: The loop can raise exceptions and currently skips writing
summary.json; wrap the main fuzz loop (the block using crashes_path.open and
iterating while time.time() - t0 < args.budget, which updates rounds, cases_run,
crashes, false_negatives) in a try/finally (or try/except/finally) so that in
the finally block you build the summary dict using the current rounds,
cases_run, t0, crashes, false_negatives and always call
summary_path.write_text(json.dumps(summary, indent=2)) and print it before
exiting; ensure variables referenced (rounds, cases_run, t0, crashes,
false_negatives, crashes_path) are in scope for the finally block and preserve
the same exit code logic (sys.exit(1 if (crashes or false_negatives) else 0)).
- Around line 209-242: sample_value() currently only honors enums inside
_sample_string(), causing integer/number/boolean values to ignore schema["enum"]
and produce invalid samples; fix by checking for schema.get("enum") near the top
of sample_value (after the nullable check and before the type-specific branches)
and, if present, return rng.choice(schema["enum"]) (handling nullability if enum
contains None or schema.get("nullable") is true) so all types respect enum
constraints; keep the existing _sample_string() behavior but remove its
special-case-only enum handling.
- Line 46: Remove "simple" from the ARRAY_STYLES list so m_param_style() no
longer selects an invalid style for query parameters; update the constant
ARRAY_STYLES to only include "form", "pipeDelimited", and "spaceDelimited" and
ensure any references in m_param_style() continue to use that constant for
random selection of query parameter styles.

In `@fuzz/README.md`:
- Around line 18-26: The fenced code block in the README describing the fuzz
orchestrator is unlabeled which triggers markdownlint MD040; update the fence by
adding a language tag (e.g., "text") to the opening triple-backticks around the
mutate_fuzz.py architecture block so the block is labeled (refer to the block
that mentions mutate_fuzz.py, fuzz/seeds/, RUNNER_LUA and the JSONL output) and
re-run lint to confirm the MD040 violation is resolved.
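
The `$ref` resolution requested for the step between `mutate(...)` and `gen_cases(...)` could look roughly like this. It is a minimal sketch, not the resolver that landed in the PR: it handles only local `#/...` pointers into objects, breaks cycles by substituting an empty schema, and lets a dangling pointer raise `KeyError`.

```python
def resolve_refs(node, root, _seen=()):
    """Return a copy of `node` with local '#/...' $ref pointers inlined."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/"):
            if ref in _seen:
                return {}  # break reference cycles with an empty schema
            target = root
            for part in ref[2:].split("/"):
                # Undo JSON Pointer escaping (RFC 6901): ~1 -> /, ~0 -> ~
                part = part.replace("~1", "/").replace("~0", "~")
                target = target[part]
            return resolve_refs(target, root, _seen + (ref,))
        return {k: resolve_refs(v, root, _seen) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve_refs(v, root, _seen) for v in node]
    return node  # scalars are immutable and can be shared
```

Because the function returns a new structure instead of editing in place, the mutated spec sent to the validator stays untouched while the generator works on the resolved copy.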

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d63b5207-39b3-45df-b968-59be4ac1a784

📥 Commits

Reviewing files that changed from the base of the PR and between 68c8b73 and ae88834.

📒 Files selected for processing (5)
  • .github/workflows/fuzz-nightly.yml
  • .github/workflows/fuzz.yml
  • fuzz/README.md
  • fuzz/mutate_fuzz.py
  • fuzz/seeds/discourse.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/fuzz.yml

Comment thread .github/workflows/fuzz-nightly.yml
Comment thread .github/workflows/fuzz-nightly.yml
Comment thread fuzz/mutate_fuzz.py Outdated
Comment thread fuzz/mutate_fuzz.py
Comment thread fuzz/mutate_fuzz.py Outdated
Comment thread fuzz/mutate_fuzz.py Outdated
Comment thread fuzz/README.md Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (3)
.github/workflows/fuzz-nightly.yml (3)

22-23: ⚠️ Potential issue | 🟠 Major

Validate and quote FUZZ_BUDGET before invoking make.

workflow_dispatch.inputs.budget is user-controlled text. Expanding it unquoted on Line 73 lets shell metacharacters change what runs on the runner. Since fuzz/mutate_fuzz.py already accepts a numeric budget, validate that format first and then quote the make assignment.

🛡️ Proposed hardening
       - name: Run mutation fuzzer
         id: fuzz
         continue-on-error: true
         run: |
           export PATH=$OPENRESTY_PREFIX/nginx/sbin:$OPENRESTY_PREFIX/bin:$PATH
-          make fuzz FUZZ_BUDGET=$FUZZ_BUDGET
+          if ! printf '%s\n' "$FUZZ_BUDGET" | grep -Eq '^[0-9]+([.][0-9]+)?$'; then
+            echo "FUZZ_BUDGET must be a numeric number of seconds" >&2
+            exit 2
+          fi
+          make fuzz "FUZZ_BUDGET=$FUZZ_BUDGET"

Also applies to: 71-73

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/fuzz-nightly.yml around lines 22 - 23, Validate that the
user-provided workflow input for FUZZ_BUDGET is a positive integer and fail fast
if it isn't, then ensure the value is quoted when passed to the make invocation;
specifically, add a step that checks the workflow_dispatch input (the
FUZZ_BUDGET value) matches a numeric regex (e.g. only digits), set a sanitized
variable with that validated value, and update the make invocation to use the
quoted variable (e.g., FUZZ_BUDGET="${{ env.SANITIZED_FUZZ_BUDGET }}" or
similar) so shell metacharacters cannot be expanded.
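
The same numeric check can also be enforced on the Python side as defense in depth. This sketch assumes an argparse `--budget` option and a hypothetical `budget_seconds` type function; it complements, and does not replace, quoting the value in the workflow shell step.

```python
import argparse
import re

# Same pattern as the shell check in the proposed hardening.
_BUDGET_RE = re.compile(r"[0-9]+(?:\.[0-9]+)?")


def budget_seconds(text: str) -> float:
    """argparse type: accept only plain non-negative decimal numbers."""
    if not _BUDGET_RE.fullmatch(text):
        raise argparse.ArgumentTypeError(
            f"budget must be a numeric number of seconds, got {text!r}")
    return float(text)


parser = argparse.ArgumentParser()
parser.add_argument("--budget", type=budget_seconds, default=60.0)
args = parser.parse_args(["--budget", "120"])
```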

42-47: ⚠️ Potential issue | 🟠 Major

Verify the LuaRocks tarball before building it.

Pinning LUAROCKS_VER helps, but this still builds whatever the remote archive serves and then installs it with sudo. Please verify a pinned checksum or signature before make build.

🛡️ Proposed hardening
           LUAROCKS_VER=3.12.0
-          wget -q "https://github.com/luarocks/luarocks/archive/v${LUAROCKS_VER}.tar.gz"
-          tar xzf "v${LUAROCKS_VER}.tar.gz"
+          LUAROCKS_TARBALL="luarocks-${LUAROCKS_VER}.tar.gz"
+          LUAROCKS_SHA256="<expected-sha256>"
+          curl -fsSLo "$LUAROCKS_TARBALL" \
+            "https://github.com/luarocks/luarocks/archive/refs/tags/v${LUAROCKS_VER}.tar.gz"
+          echo "${LUAROCKS_SHA256}  ${LUAROCKS_TARBALL}" | sha256sum -c -
+          tar xzf "$LUAROCKS_TARBALL"
           cd "luarocks-${LUAROCKS_VER}"
           ./configure --with-lua=$OPENRESTY_PREFIX/luajit
           make build && sudo make install
-          cd .. && rm -rf "luarocks-${LUAROCKS_VER}" "v${LUAROCKS_VER}.tar.gz"
+          cd .. && rm -rf "luarocks-${LUAROCKS_VER}" "$LUAROCKS_TARBALL"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/fuzz-nightly.yml around lines 42 - 47, The workflow
currently downloads and builds the LUAROCKS tarball without integrity
verification; change the steps around LUAROCKS_VER so the job first fetches the
tarball and its checksum or signature (use the repo's .sha256/.sha512 or .asc
signature), verify the archive (e.g., sha256sum -c against the pinned checksum
or gpg --verify against a trusted key) and only if verification succeeds proceed
to tar xzf, cd "luarocks-${LUAROCKS_VER}", ./configure
--with-lua=$OPENRESTY_PREFIX/luajit and make build && sudo make install; fail
the job on checksum/signature mismatch so unverified archives are never
built/installed.
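
For readers less familiar with `sha256sum -c`, the verification step amounts to the following, shown here in Python as a sketch. The function names are hypothetical and the demo hashes a throwaway temporary file, not a real LuaRocks archive.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large archives need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path: str, expected_hex: str) -> bool:
    """Raise ValueError unless the file matches the pinned checksum."""
    actual = sha256_of(path)
    if not hmac.compare_digest(actual, expected_hex.lower()):
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
    return True


# Demo on a temporary file standing in for the downloaded tarball.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example archive bytes")
pinned = hashlib.sha256(b"example archive bytes").hexdigest()
ok = verify_sha256(tmp.name, pinned)
try:
    verify_sha256(tmp.name, "0" * 64)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
os.unlink(tmp.name)
```

The important property is ordering: the archive is extracted and built only after the check passes, so a mismatch stops the job before any untrusted code runs.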

63-66: ⚠️ Potential issue | 🟠 Major

Pin the Lua rock versions used by nightly CI.

These installs float to whatever versions are latest that day, so a nightly failure can come from upstream drift rather than this repo. Please install exact known-good versions or switch this step to a lockfile-driven flow.

📌 Minimal fix
-          sudo luarocks install jsonschema
-          sudo luarocks install lua-resty-radixtree
+          sudo luarocks install jsonschema <known-good-version>
+          sudo luarocks install lua-resty-radixtree <known-good-version>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/fuzz-nightly.yml around lines 63 - 66, The Install Lua
dependencies step currently installs floating versions of jsonschema and
lua-resty-radixtree; change it to install exact pinned versions or switch to a
lockfile-driven install. Update the commands that install jsonschema and
lua-resty-radixtree to include precise version specifiers (pin known-good
versions) or replace the step with a luarocks lockfile install flow (using a
generated luarocks.lock and invoking luarocks to install from it). Ensure you
modify the "Install Lua dependencies" step to reference the rock names
jsonschema and lua-resty-radixtree with those pinned versions or the
lockfile-based installation command.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/fuzz-nightly.yml:
- Around line 114-116: The variable existing is being set from the gh issue list
pipeline and ends up containing the string "null" when no issues are returned;
update the jq expression in the assignment to use the fallback operator so null
becomes empty (e.g., change the jq filter '.[0].number' to '.[0].number //
empty') so that existing is an empty string when no issue is found and the
subsequent if [ -n "$existing" ] check behaves correctly.
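
The null-handling problem can be mirrored in Python to show the intended behavior. This is only a sketch: `first_issue_number` is a hypothetical helper, and it assumes the input is the JSON array printed by `gh issue list --json number`.

```python
import json


def first_issue_number(gh_json_output):
    """Return the first issue number as a string, or "" when there is none.

    Mirrors the effect of the jq filter '.[0].number // empty': an empty
    result list yields an empty string rather than the literal "null".
    """
    issues = json.loads(gh_json_output or "[]")
    if not issues:
        return ""
    number = issues[0].get("number")
    return "" if number is None else str(number)
```

With this shape, a truthiness check on the result behaves like the `[ -n "$existing" ]` test is meant to: it is false exactly when no tracking issue exists.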

---

Duplicate comments:
In @.github/workflows/fuzz-nightly.yml:
- Around line 22-23: Validate that the user-provided workflow input for
FUZZ_BUDGET is a positive integer and fail fast if it isn't, then ensure the
value is quoted when passed to the make invocation; specifically, add a step
that checks the workflow_dispatch input (the FUZZ_BUDGET value) matches a
numeric regex (e.g. only digits), set a sanitized variable with that validated
value, and update the make invocation to use the quoted variable (e.g.,
FUZZ_BUDGET="${{ env.SANITIZED_FUZZ_BUDGET }}" or similar) so shell
metacharacters cannot be expanded.
- Around line 42-47: The workflow currently downloads and builds the LUAROCKS
tarball without integrity verification; change the steps around LUAROCKS_VER so
the job first fetches the tarball and its checksum or signature (use the repo's
.sha256/.sha512 or .asc signature), verify the archive (e.g., sha256sum -c
against the pinned checksum or gpg --verify against a trusted key) and only if
verification succeeds proceed to tar xzf, cd "luarocks-${LUAROCKS_VER}",
./configure --with-lua=$OPENRESTY_PREFIX/luajit and make build && sudo make
install; fail the job on checksum/signature mismatch so unverified archives are
never built/installed.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f4dc1868-fdba-4c5c-87d0-ca26419ca97e

📥 Commits

Reviewing files that changed from the base of the PR and between ae88834 and 7d9af9d.

📒 Files selected for processing (2)
  • .github/workflows/fuzz-nightly.yml
  • .github/workflows/fuzz.yml
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/fuzz.yml

Comment thread .github/workflows/fuzz-nightly.yml
- Remove 'simple' from ARRAY_STYLES (invalid for query params per OAS 3.0)
- Handle enum/const generically in sample_value() before type branches
- Clamp/swap bounds in integer sampling to prevent ValueError
- Encode query arrays using style-appropriate delimiter
- Resolve $ref pointers before generating cases
- Wrap main fuzz loop in try/finally to always write summary.json
- Validate FUZZ_BUDGET as numeric in nightly workflow
- Fix gh issue list jq to use '// empty' for null handling
- Add 'text' language tag to README fenced code block
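
Two of the fixes listed above, bound clamping and style-aware array encoding, can be sketched as follows. This is not the code from `fuzz/mutate_fuzz.py`: the names `STYLE_DELIMITERS`, `sample_integer`, and `encode_query_array` are illustrative, the default range is an arbitrary assumption, and only the non-exploded form of each style is covered.

```python
import random

# Delimiters for non-exploded query arrays under OAS 3.0 styles.
STYLE_DELIMITERS = {"form": ",", "pipeDelimited": "|", "spaceDelimited": " "}


def sample_integer(schema, rng):
    """Pick an integer within the schema bounds, tolerating mutated bounds."""
    lo = int(schema.get("minimum", 0))
    hi = int(schema.get("maximum", lo + 100))
    if lo > hi:
        # A mutation may have inverted the bounds; randint(lo, hi) would
        # raise ValueError on an empty range, so swap them first.
        lo, hi = hi, lo
    return rng.randint(lo, hi)


def encode_query_array(values, style="form"):
    """Join array items with the delimiter that matches the parameter style."""
    delimiter = STYLE_DELIMITERS.get(style, ",")
    return delimiter.join(str(v) for v in values)
```

For example, `encode_query_array([1, 2, 3], "pipeDelimited")` gives `1|2|3`, which still needs URL encoding before it is placed in a query string.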
@jarvis9443 jarvis9443 changed the title from "feat(fuzz): mutation fuzzer with schema-conformance oracle for CI" to "feat(fuzz): mutation fuzzer with schema-conformance oracle" Apr 24, 2026
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (1)
.github/workflows/fuzz-nightly.yml (1)

63-66: ⚠️ Potential issue | 🟠 Major

Pin LuaRocks dependency versions for deterministic nightly runs.

Lines 65 and 66 still install floating latest packages, so nightly failures can come from upstream releases rather than repo changes.

#!/bin/bash
set -euo pipefail
# Verify whether nightly workflow uses floating LuaRocks installs.
rg -n '^\s*sudo luarocks install (jsonschema|lua-resty-radixtree)\s*$' .github/workflows/fuzz-nightly.yml
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/fuzz-nightly.yml around lines 63 - 66, The workflow
installs floating LuaRocks packages in the "Install Lua dependencies" step;
replace the two commands that currently run "sudo luarocks install jsonschema"
and "sudo luarocks install lua-resty-radixtree" with pinned installs (explicit
version strings or exact rockspecs) or reference workflow variables (e.g.,
JSONSCHEMA_VERSION and RADIXTREE_VERSION) and use them like "sudo luarocks
install jsonschema <version>" and "sudo luarocks install lua-resty-radixtree
<version>" so nightly runs are deterministic; update the step commands and add
corresponding variables/inputs for the chosen versions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@fuzz/mutate_fuzz.py`:
- Around line 304-305: The code builds path_params from only operation-level
parameters (path_params = {p["name"]: p for p in (op.get("parameters") or [])
...}) but ignores path-item parameters; update the parameter collection to merge
path-item and operation-level parameters so path_params includes parameters from
both sources (e.g., combine path_item.get("parameters") and op.get("parameters")
before filtering), deduplicate by name+in, and then use that merged path_params
wherever the generator uses path_params (the blocks around the existing
path_params assignment and the later logic at lines ~323-346) so “positive”
requests include required path-item params as well.
- Around line 189-212: The resolver resolve_refs (and its inner helper _resolve)
currently follows $ref links without tracking which ref targets have been
visited, causing RecursionError on cyclical refs; update _resolve to accept and
propagate a visited set (e.g., visited_refs) and, when encountering a "$ref"
string, compute a stable key for the target (the ref string or the resolved path
parts) and check visited_refs before descending—if already visited, return the
node as-is (or a shallow copy) to break the cycle; otherwise add the key to
visited_refs before recursing into the resolved target and remove it after
returning so other branches remain unaffected.
- Around line 416-421: The code currently swallows JSON parsing errors in the
loop over r.stdout (for line in r.stdout.splitlines(): ... except Exception:
pass), which can hide malformed validator output; update the loop to catch
json.JSONDecodeError specifically, record the offending line(s) (e.g., add to a
parse_errors list or append raw lines to out_errors), and include that condition
when deciding failure: change the final check that uses r.returncode and out (if
r.returncode != 0 and not out) to also fail when parse_errors is non-empty
(e.g., if r.returncode != 0 or not out or parse_errors), and log the
parse_errors with context so malformed lines are visible (reference
variables/functions: r, r.stdout, r.returncode, out, json.loads).
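
The cycle-detection idea from the `resolve_refs` comment above could look roughly like this. It is a sketch under assumptions, not the merged implementation: only local `#/...` pointers are handled, `lookup_pointer` is a hypothetical helper, and a `$ref` that is already being expanded (or cannot be resolved) is left in place.

```python
def lookup_pointer(root, ref):
    """Follow a local JSON pointer such as '#/components/schemas/Node'."""
    node = root
    for part in ref[2:].split("/"):
        node = node[part.replace("~1", "/").replace("~0", "~")]
    return node


def resolve_refs(spec):
    """Return a new tree with local $refs inlined, stopping at cycles."""

    def _resolve(node, active):
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith("#/"):
                if ref in active:
                    return dict(node)  # cycle: keep the $ref as-is
                try:
                    target = lookup_pointer(spec, ref)
                except (KeyError, IndexError, TypeError):
                    return dict(node)  # dangling pointer: keep the $ref
                active.add(ref)
                try:
                    return _resolve(target, active)
                finally:
                    # Remove on the way out so sibling branches can
                    # still expand the same target.
                    active.discard(ref)
            return {key: _resolve(value, active) for key, value in node.items()}
        if isinstance(node, list):
            return [_resolve(item, active) for item in node]
        return node

    return _resolve(spec, set())
```

Tracking only the refs on the current expansion path, rather than every ref ever seen, lets the same schema be inlined at several unrelated use sites while still breaking self-references.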

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7021f23a-df6a-48b9-803b-6d08bd84a07d

📥 Commits

Reviewing files that changed from the base of the PR and between 7d9af9d and 0b09794.

📒 Files selected for processing (4)
  • .github/workflows/fuzz-nightly.yml
  • .gitignore
  • fuzz/README.md
  • fuzz/mutate_fuzz.py
✅ Files skipped from review due to trivial changes (1)
  • .gitignore

Comment thread fuzz/mutate_fuzz.py Outdated
Comment thread fuzz/mutate_fuzz.py Outdated
Comment thread fuzz/mutate_fuzz.py
- Add cycle detection in $ref resolution to prevent recursion blowups
- Merge path-item and operation-level parameters for complete coverage
- Report malformed JSONL output instead of silently swallowing parse errors
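
The parameter merge mentioned in this commit can be sketched as below. `merged_parameters` is a hypothetical name, and the sketch assumes parameter objects have already had their `$ref`s resolved; it follows the OpenAPI rule that an operation-level parameter overrides a path-item parameter with the same `name` and `in`.

```python
def merged_parameters(path_item, operation):
    """Combine path-item and operation parameters, deduplicated by (name, in)."""
    merged = {}
    for param in path_item.get("parameters") or []:
        merged[(param.get("name"), param.get("in"))] = param
    # Operation-level entries are applied last so they win on conflict.
    for param in operation.get("parameters") or []:
        merged[(param.get("name"), param.get("in"))] = param
    return list(merged.values())
```

A request generator can then filter the merged list by `in == "path"` (or `"query"`, `"header"`) so that required path-item parameters are not silently dropped from positive cases.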
Copilot AI review requested due to automatic review settings April 24, 2026 06:55
@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (2)
fuzz/mutate_fuzz.py (2)

491-495: ⚠️ Potential issue | 🟠 Major

Use the same resolved spec for both case generation and validation.

Line 492 generates “positive” requests from resolved, but Line 495 validates against spec. This can desync the oracle and introduce false-negative noise. Pass resolved to run_validator too.

Suggested fix
-                results = run_validator(spec, cases, args.deps, args.lib, args.include)
+                results = run_validator(resolved, cases, args.deps, args.lib, args.include)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@fuzz/mutate_fuzz.py` around lines 491 - 495, The validation is using the
original spec variable while generated cases come from the transformed resolved
spec, which can cause mismatches; update the call to run_validator to pass
resolved instead of spec so the oracle matches the generated positive
cases—i.e., after calling resolve_refs(spec) and gen_cases(resolved,...), call
run_validator(resolved, cases, args.deps, args.lib, args.include) (referencing
resolve_refs, gen_cases, run_validator, spec, resolved).

429-437: ⚠️ Potential issue | 🟠 Major

Don’t ignore malformed JSONL when some lines are valid.

Current logic only emits subprocess_error when bad_lines > 0 and no valid JSON lines exist. Mixed output still passes silently, which can hide runner/protocol issues.

Suggested fix
-    if bad_lines and not out:
+    if bad_lines:
         out.append({"phase": "subprocess_error", "rc": r.returncode,
                      "stderr": f"malformed validator JSONL output ({bad_lines} lines)"})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@fuzz/mutate_fuzz.py` around lines 429 - 437, The current parsing loop in
mutate_fuzz.py collects valid JSON lines into out but drops information when
there are malformed lines if any valid lines exist; update the post-loop logic
(the block handling r.stdout, out, bad_lines and the subsequent if r.returncode
checks) so that whenever bad_lines > 0 you also append or merge an additional
result entry documenting the malformed JSONL (e.g.,
{"phase":"subprocess_malformed_lines","bad_lines":bad_lines,"stderr":"malformed
validator JSONL output"}) instead of only emitting when out is empty; keep
existing valid parsed entries but ensure the malformed-lines entry is always
added when bad_lines > 0 so mixed output is not silently ignored.
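
A minimal sketch of the stricter JSONL handling suggested above, assuming the validator prints one JSON object per line. `parse_validator_output` is a hypothetical stand-in for the parsing portion of `run_validator`, and the error record reuses the `subprocess_error` shape shown in the suggested fix.

```python
import json


def parse_validator_output(stdout, returncode):
    """Parse JSONL results and always surface malformed lines."""
    results, bad_lines = [], 0
    for line in stdout.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            results.append(json.loads(line))
        except json.JSONDecodeError:
            bad_lines += 1
    if bad_lines:
        # Report even when some lines parsed, so mixed output is not hidden.
        results.append({
            "phase": "subprocess_error",
            "rc": returncode,
            "stderr": f"malformed validator JSONL output ({bad_lines} lines)",
        })
    return results
```

Keeping the valid entries and appending one extra record means a partially broken run still reports the cases that did complete.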

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 321bf6eb-5bc6-4ee2-b34e-d038f43d7239

📥 Commits

Reviewing files that changed from the base of the PR and between 0b09794 and 3cc7110.

📒 Files selected for processing (1)
  • fuzz/mutate_fuzz.py

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 9 changed files in this pull request and generated 5 comments.


Comment thread fuzz/mutate_fuzz.py Outdated
Comment thread fuzz/mutate_fuzz.py Outdated
Comment thread fuzz/mutate_fuzz.py
Comment thread fuzz/README.md Outdated
Comment thread fuzz/mutate_fuzz.py Outdated
- Update docstring: OpenAPI 3.x (not just 3.0)
- Clarify crashes.jsonl contains both crashes and false negatives
- Fix resolve_refs docstring (returns new tree, not in-place)
- Fix README: length_on_array targets array only
- Generate and log RNG seed for reproducibility
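
Seed generation and logging for reproducibility might be sketched like this. The function name `make_rng` and the log format are assumptions for illustration, not the code that was merged.

```python
import random
import secrets
import sys


def make_rng(seed=None):
    """Create a seeded RNG, generating and logging a seed when none is given."""
    if seed is None:
        seed = secrets.randbits(32)
    # Logging the seed lets a failing run be replayed with the same mutations.
    print(f"fuzz seed: {seed}", file=sys.stderr)
    return seed, random.Random(seed)
```

Passing the logged seed back in on a later run should reproduce the same sequence of random choices, provided the rest of the fuzzer draws all randomness from the returned `random.Random` instance.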
@nic-6443 nic-6443 merged commit 53d4799 into main Apr 24, 2026
4 checks passed
@nic-6443 nic-6443 deleted the feat/fuzz-harness branch April 24, 2026 07:40
3 participants