fix: don't share datatype across recursive validator calls#100
fix: don't share datatype across recursive validator calls#100nic-6443 merged 2 commits intoapi7:masterfrom
Conversation
The codegen stored each generated function's per-call `datatype` / `datakind` variables as fields on a shared `locals` table (introduced by api7#41 to dodge LuaJIT's 200-local limit). Because every generated schema validator is its own Lua function, recursing into a nested validator (e.g. via `items` or `properties`) overwrote the outer function's datatype with the inner value's type. After the recursion returned, the outer function would take the wrong branch. Concrete crash: a schema like { type = "array", maxLength = 10, items = { type = "string" } } generated an outer validator that, after looping over string items, saw `datatype == "string"` (the last item's type) and ran the string-only `maxLength` check on the array value, calling `utf8_len` on a table and crashing with 'string expected, got table'. Fix: use real `local` slots for `datatype` / `datakind`. Each generated validator function has its own 200-local budget, so the original LuaJIT workaround is unnecessary for these per-call vars. `localvartab` is still used for the cross-function validator forward declarations (line 171) and propset cache (line 746) where shared storage is intentional. Adds a regression test in t/default.lua.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughEmits Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 6✅ Passed checks (6 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Pull request overview
Fixes a codegen reentrancy bug where per-call datatype/datakind state was stored in a shared locals table, allowing nested validator recursion to clobber the caller’s type state and trigger wrong-branch validation (including crashes).
Changes:
- Switch
datatype/datakindin generated validators fromctx:localvartab(...)(shared table fields) toctx:localvar(...)(true per-function locals). - Add a regression test covering the recursive clobbering scenario involving
itemsrecursion andmaxLength.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
lib/jsonschema.lua |
Moves per-call type state to real locals to prevent recursion from mutating outer validator state. |
t/default.lua |
Adds regression coverage for the recursive datatype clobbering crash scenario. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@t/default.lua`:
- Around line 249-250: The test currently only checks that calling
pcall(validator, { "a", "b", "c" }) did not error (local ok, err = ...), which
can hide a validator failure (validator returns false, error_msg); update the
assertion to capture pcall success and the validator result separately (e.g.,
local pcall_ok, valid, val_err = pcall(validator, { "a", "b", "c" })) and then
assert both pcall_ok and valid, including both pcall error and validator error
in the failure message so you fail the test when validation returns false as
well as when pcall throws.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2748523f-4f76-43bb-a7d5-4e26b9c89cc6
📒 Files selected for processing (2)
lib/jsonschema.luat/default.lua
…rrors Per coderabbit nit: the previous `assert(ok, ...)` only caught pcall crashes; if the validator returned `false, err` instead of erroring, the test would silently pass.
|
Addressed in 249a917 — split the assertion to check pcall success and the validator's boolean result separately. |
What
The codegen stored each generated function's per-call
datatype/datakindvariables as fields on a sharedlocalstable (introduced by #41 to dodge LuaJIT's 200-local limit). Because every generated schema validator is its own Lua function, recursing into a nested validator (e.g. viaitemsorproperties) overwrote the outer function's datatype with the inner value's type. After the recursion returned, the outer function would take the wrong branch.Concrete crash
The outer validator looped over string items (each call set
datatype = 'string'on the sharedlocalstable), and after the loop returned,if datatype == 'string' thenwas true, so the string-onlymaxLengthbranch ran on the array and calledutf8_lenon a table.Fix
Use real
localslots fordatatype/datakind. Each generated validator function has its own 200-local budget, so the original LuaJIT workaround from #41 is unnecessary for these per-call vars.localvartabis still used for the cross-function validator forward declarations and the propset cache where shared storage is intentional.Tests
t/default.lua(test case 9).t/200more_variables.luawhich guards bugfix: workaround for 200 local variables limitation in luajit #41's original concern.Summary by CodeRabbit
Bug Fixes
Tests