| 1 | +# Mutation Fuzzer for `lua-resty-openapi-validator` |
| 2 | + |
| 3 | +A small mutation fuzzer that runs the validator against AST-mutated copies of |
| 4 | +real-world OpenAPI specs and checks two oracles: |
| 5 | + |
| 6 | +1. **No crashes**: `validate_request` must not throw a Lua error |
| 7 | + (caught with `pcall`). |
| 8 | +2. **Schema conformance**: a request **generated to satisfy** an operation's |
| 9 | + schema must be **accepted** by the validator. A rejection is a candidate |
| 10 | + false-negative bug (the validator wrongly rejects a valid request). |
| 11 | + |
| 12 | +The fuzzer is the productionised form of the harness used during |
| 13 | +[v1.0.3 QA](../../../qa/lua-resty-openapi-validator-v1.0.3.md). It reproduces |
| 14 | +the bugs that QA found (path-extension Bug 1 against the unfixed validator, |
| 15 | +`utf8_len(table)` Bug 3 against the unfixed jsonschema). |
| 16 | + |
| 17 | +## Architecture |
| 18 | + |
| 19 | +``` |
| 20 | +mutate_fuzz.py (Python orchestrator) |
| 21 | + ├─ pick a seed spec from fuzz/seeds/ |
| 22 | + ├─ apply N random mutations (mutators below) |
| 23 | + ├─ generate schema-conforming positive requests |
| 24 | + └─ resty -e RUNNER_LUA (validator subprocess, one per round) |
| 25 | + └─ for each case: pcall(v.validate_request, v, req) |
| 26 | + └─ JSONL result on stdout: {phase, accepted, err} |
| 27 | +``` |
| 28 | + |
| 29 | +Mutators (`fuzz/mutate_fuzz.py`): |
| 30 | + |
| 31 | +| name | what it does | targets | |
| 32 | +|---|---|---| |
| 33 | +| `path_extension` | append `.json` / `.csv` etc. to a random path | path-routing edge cases (Bug 1) | |
| 34 | +| `nullable_enum` | inject `null` into an enum + flip `nullable: true` | nullable-enum handling (Bug 2) | |
| 35 | +| `length_on_array` | move `maxLength` onto an `array`/`object` schema | type-inappropriate keywords (Bug 3) | |
| 36 | +| `param_style` | flip parameter `style`/`explode` | parameter parsing (Bug 4 family) | |
| 37 | +| `required_phantom` | add a non-existent property name to `required` | schema-validation edge cases | |
| 38 | +| `swap_scalar_type` | swap `type: integer` ↔ `type: string` | coercion paths | |
| 39 | + |
| 40 | +Generator (`sample_value`): produces JSON values that match a JSON Schema |
| 41 | +fragment (string / integer / number / boolean / array / object / enum), with |
| 42 | +a depth limit. Path/query/header parameters that are `required: true` are |
| 43 | +filled in; the request body is sampled from the operation's |
| 44 | +`requestBody` schema if present. |
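
The generator described above can be sketched roughly as follows. This is a minimal illustration, not the code in `mutate_fuzz.py`: the `(schema, rng, depth, max_depth)` signature, the `_bounds` helper, and the choice to return empty containers at the depth limit are all assumptions made for the example.

```python
import math
import random


def _bounds(schema, span=10):
    """Pick a numeric range that honours `minimum` / `maximum` when present."""
    lo, hi = schema.get("minimum"), schema.get("maximum")
    if lo is None and hi is None:
        return 0, span
    if lo is None:
        return hi - span, hi
    if hi is None:
        return lo, lo + span
    return lo, hi


def sample_value(schema, rng, depth=0, max_depth=4):
    """Return a JSON-compatible value intended to satisfy `schema` (hypothetical sketch)."""
    if schema.get("enum"):
        return rng.choice(schema["enum"])
    kind = schema.get("type")
    if kind == "string":
        n = max(schema.get("minLength", 0), 1)
        if "maxLength" in schema:
            n = min(n, schema["maxLength"])
        return "a" * n
    if kind == "integer":
        lo, hi = _bounds(schema)
        lo, hi = math.ceil(lo), math.floor(hi)
        # If the range contains no integer the schema is unsatisfiable; return lo anyway.
        return rng.randint(lo, hi) if lo <= hi else lo
    if kind == "number":
        lo, hi = _bounds(schema)
        return rng.uniform(lo, hi)
    if kind == "boolean":
        return rng.choice([True, False])
    if kind == "array":
        if depth >= max_depth:
            return []  # depth limit reached; may under-satisfy minItems
        n = schema.get("minItems", 1)
        if "maxItems" in schema:
            n = min(n, schema["maxItems"])
        items = schema.get("items", {})
        return [sample_value(items, rng, depth + 1, max_depth) for _ in range(n)]
    if kind == "object":
        if depth >= max_depth:
            return {}  # depth limit reached; may under-satisfy required
        props = schema.get("properties", {})
        return {name: sample_value(sub, rng, depth + 1, max_depth)
                for name, sub in props.items()}
    return None  # untyped or unsupported fragment
```

Passing an explicit `random.Random` instance rather than using the module-level functions keeps sampling reproducible for a given `--seed`.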
| 45 | + |
| 46 | +## Run locally |
| 47 | + |
| 48 | +```bash |
| 49 | +make fuzz # 60s budget |
| 50 | +make fuzz FUZZ_BUDGET=300 # 5 min |
| 51 | +python3 fuzz/mutate_fuzz.py --budget 60 --seed 7 # reproducible |
| 52 | +``` |
| 53 | + |
| 54 | +Output: |
| 55 | + |
| 56 | +- `fuzz/out/crashes.jsonl` — one JSON object per finding |
| 57 | +- `fuzz/out/summary.json` — `{rounds, cases_run, elapsed_s, crash_count}` |
| 58 | +- exits non-zero on any crash or candidate false-negative (CI-friendly) |
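
For triage, the two output files can be loaded with a few lines of Python. The sketch below assumes only what is listed above: `summary.json` is a single JSON object and `crashes.jsonl` holds one JSON object per line. The helper name `load_fuzz_output` is hypothetical, and the fields inside each finding are not assumed.

```python
import json
from pathlib import Path


def load_fuzz_output(out_dir="fuzz/out"):
    """Return (summary, findings) read from a fuzz output directory."""
    out = Path(out_dir)
    summary = json.loads((out / "summary.json").read_text())
    findings = []
    crashes = out / "crashes.jsonl"
    if crashes.exists():
        with crashes.open() as fh:
            for line in fh:
                line = line.strip()
                if line:  # tolerate blank lines
                    findings.append(json.loads(line))
    return summary, findings
```

Calling `summary, findings = load_fuzz_output()` after a run and printing `summary["crash_count"]` alongside `len(findings)` gives a quick overview of the run.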
| 59 | + |
| 60 | +## Add a seed |
| 61 | + |
| 62 | +Drop any OpenAPI 3.x spec into `fuzz/seeds/`. Smaller specs (50–100 ops) |
| 63 | +give more mutation rounds per second; very large specs (>500 ops) slow |
| 64 | +each round. Recommended size: 30 KB – 300 KB. |
| 65 | + |
| 66 | +## Add a mutator |
| 67 | + |
| 68 | +1. Add `def my_mutator(spec, rng): ...` near the other mutators. |
| 69 | +2. Append it to the `MUTATORS` list. |
| 70 | +3. The mutator must modify `spec` **in place** and return a label string |
| 71 | + (its own function name is fine) for logging. |
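
As an illustration of these steps, here is a hypothetical mutator that sets `additionalProperties: false` on one randomly chosen object schema. The mutator name, the `_walk_dicts` helper, and the stand-in `MUTATORS` list are assumptions for the example; only the `(spec, rng)` signature, the in-place mutation, and the returned label follow the contract above.

```python
import random


def _walk_dicts(node):
    """Yield every dict nested anywhere inside a parsed spec."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from _walk_dicts(value)
    elif isinstance(node, list):
        for value in node:
            yield from _walk_dicts(value)


def additional_props_false(spec, rng):
    """Hypothetical mutator: forbid extra properties on one object schema."""
    candidates = [
        d for d in _walk_dicts(spec)
        if d.get("type") == "object" and isinstance(d.get("properties"), dict)
    ]
    if candidates:
        rng.choice(candidates)["additionalProperties"] = False  # in-place edit
    return "additional_props_false"  # label used for logging


# Stand-in for the real list in mutate_fuzz.py.
MUTATORS = []
MUTATORS.append(additional_props_false)
```

Returning the label even when no candidate schema exists keeps the mutator safe to call on any seed spec.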
| 72 | + |
| 73 | +## Noise filter |
| 74 | + |
| 75 | +`gen_cases` does not try to satisfy every JSON Schema construct — `oneOf` / |
| 76 | +`allOf` / `discriminator` / complex `pattern` are common in real specs but |
| 77 | +hard to satisfy generically. Errors mentioning these are filtered as |
| 78 | +generator artefacts, not validator bugs. The list lives near the bottom |
| 79 | +of `mutate_fuzz.py`. If the filter masks a real bug that you found by other |
| 80 | +means, narrow it; if it lets through too much noise, widen it. |
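
A substring-based filter of this kind might look like the sketch below. The marker tuple and the function name `is_generator_noise` are assumptions for illustration; the real list in `mutate_fuzz.py` may differ in both content and matching strategy.

```python
# Hypothetical marker list; the actual one lives in mutate_fuzz.py.
NOISE_MARKERS = ("oneOf", "allOf", "discriminator", "pattern")


def is_generator_noise(err):
    """True if a rejection message points at a construct the generator cannot satisfy."""
    if not err:
        return False
    lowered = err.lower()
    return any(marker.lower() in lowered for marker in NOISE_MARKERS)
```

Narrowing the filter then means removing a marker from the tuple or matching on a longer, more specific phrase; widening it means adding one.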
| 81 | + |
| 82 | +## CI |
| 83 | + |
| 84 | +- **PR**: `.github/workflows/fuzz.yml` — 120s budget, fails the PR on any finding. |
| 85 | +- **Nightly**: `.github/workflows/fuzz-nightly.yml` — 600s budget, on |
| 86 | + failure uploads `fuzz/out/` as an artifact and opens (or comments on) |
| 87 | + a `fuzz-nightly` issue assigned to `@jarvis-api7`. |