Changes from 13 commits
3 changes: 3 additions & 0 deletions .gitignore
@@ -7,6 +7,8 @@ CMakeFiles
cmake_install.cmake
json.so
json.dll
rapidjson.so
rapidjson.so.dSYM/
deps/
/*.src.rock
/rapidjson/test/
@@ -26,3 +28,4 @@ rapidjson/rapidjson.autopkg
rapidjson/travis-doxygen.sh
/cmake-build-debug
/.idea
/docs/superpowers/
58 changes: 58 additions & 0 deletions Makefile
@@ -0,0 +1,58 @@
.PHONY: fuzz

LUA ?= lua
DURATION ?= 3600
INTERVAL ?= 5
WORKERS ?= 1
SEED ?= $(shell date +%s)
SORT_KEYS ?= 1

fuzz:
	@set -u; \
	tmpdir=$$(mktemp -d "$${TMPDIR:-/tmp}/lua-rapidjson-fuzz.XXXXXX"); \
	pids=""; \
	cleanup() { rm -rf "$$tmpdir"; }; \
	stop_workers() { for pid in $$pids; do kill "$$pid" 2>/dev/null || true; done; cleanup; }; \
	trap cleanup EXIT; \
	trap stop_workers INT TERM; \
Comment on lines +17 to +19


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify trap + loop dependency on tmpdir markers.
rg -n -C2 'stop_workers\(\)|trap stop_workers|tmpdir|done_count|fail\.\*|done\.\*' Makefile

Repository: api7/lua-rapidjson

Length of output: 954


Stop the fuzz polling loop from continuing after INT/TERM cleanup
In the Makefile's fuzz target, stop_workers() (lines ~15–17) calls cleanup(), which runs rm -rf on $$tmpdir. The trap handler then returns, and the parent while :; do polling loop (lines ~39–50) resumes, still checking ls "$$tmpdir"/fail.* and ls "$$tmpdir"/done.*. Once $$tmpdir has been removed, fail.* can never appear and done_count stays at 0, so the loop can sleep indefinitely.

Suggested fix
-	stop_workers() { for pid in $$pids; do kill "$$pid" 2>/dev/null || true; done; cleanup; }; \
+	stop_workers() { \
+		for pid in $$pids; do kill "$$pid" 2>/dev/null || true; done; \
+		exit 130; \
+	}; \
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-	stop_workers() { for pid in $$pids; do kill "$$pid" 2>/dev/null || true; done; cleanup; }; \
-	trap cleanup EXIT; \
-	trap stop_workers INT TERM; \
+	stop_workers() { \
+		for pid in $$pids; do kill "$$pid" 2>/dev/null || true; done; \
+		exit 130; \
+	}; \
+	trap cleanup EXIT; \
+	trap stop_workers INT TERM; \
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Makefile` around lines 15 - 17, The polling loop in the fuzz target can hang
after stop_workers() calls cleanup() and removes $$tmpdir; update the loop that
waits on ls "$$tmpdir"/fail.* / done.* (the while :; do ... done loop) so it
breaks if the temp directory is gone (e.g., add a check if [ ! -d "$$tmpdir" ];
then break; fi) or also check for a sentinel file; alternatively, have
stop_workers() create a sentinel outside $$tmpdir before calling cleanup() and
make the loop exit when that sentinel is present. Reference: stop_workers(),
cleanup, the fuzz target while loop, $$tmpdir, fail.*, and done_count.
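
The directory check described in the prompt above could look roughly like the following. This is a minimal sketch, not the change made in this PR: `poll_markers` is a hypothetical helper name, and the return code 130 for a missing directory is an assumption borrowed from the suggested `exit 130`. It is written as plain POSIX shell, so inside the Makefile recipe every `$` would need to be doubled.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): poll_markers DIR WORKERS
# Returns 0 once WORKERS done.* markers exist, 1 as soon as any fail.*
# marker exists, and 130 if DIR is gone (for example, removed by cleanup).
poll_markers() {
    dir=$1
    workers=$2
    while :; do
        # Stop instead of sleeping forever once the marker directory is removed.
        if [ ! -d "$dir" ]; then
            return 130
        fi
        if ls "$dir"/fail.* >/dev/null 2>&1; then
            return 1
        fi
        done_count=$(ls "$dir"/done.* 2>/dev/null | wc -l | tr -d ' ')
        if [ "$done_count" -ge "$workers" ]; then
            return 0
        fi
        sleep 1
    done
}

# Demo: simulate cleanup having already removed the directory.
demo_dir=$(mktemp -d "${TMPDIR:-/tmp}/poll-demo.XXXXXX")
rm -rf "$demo_dir"
rc=0
poll_markers "$demo_dir" 2 || rc=$?
echo "exit status: $rc"   # prints "exit status: 130"
```

Checking for the directory first keeps the loop from treating "no markers yet" and "markers can never arrive" as the same state.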

	worker=1; \
	while [ "$$worker" -le "$(WORKERS)" ]; do \
		seed=$$(( $(SEED) + $$worker - 1 )); \


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify mismatch between Makefile arithmetic requirement and fuzz lib seed normalization.
rg -n -C2 'seed=\$\$\(\(|SEED' Makefile
rg -n -C3 'normalize_seed|tonumber\(|SEED' tools/fuzz_encode_lib.lua

Repository: api7/lua-rapidjson

Length of output: 1770


🏁 Script executed:

#!/bin/bash
# Inspect Makefile around SEED and seed arithmetic.
sed -n '1,60p' Makefile

# Find other SEED usage in repo.
rg -n 'SEED\b' Makefile tools -S

Repository: api7/lua-rapidjson

Length of output: 1750


🏁 Script executed:

#!/bin/bash
# Inspect how SEED is passed into Lua and normalized.
sed -n '1,120p' tools/fuzz_encode.lua
sed -n '1,120p' tools/fuzz_encode_lib.lua

Repository: api7/lua-rapidjson

Length of output: 4798


Coerce SEED to a number (matching Lua) before the shell arithmetic
The fuzz recipe computes per-worker seeds with shell arithmetic (seed=$$(( $(SEED) + $$worker - 1 ))), which requires $(SEED) to be numeric; SEED=abc fails before Lua starts. Lua’s tools/fuzz_encode_lib.lua normalize_seed() already falls back to os.time() for non-numeric inputs, so the Makefile should apply the same numeric fallback (e.g., use date +%s when SEED isn’t an integer) instead of failing early.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Makefile` at line 20, The Makefile's fuzz recipe must coerce SEED to a
numeric fallback before doing shell arithmetic; modify the recipe to compute a
numeric SEED value (e.g., set SEED_NUM from SEED if it matches an integer,
otherwise use the current epoch like date +%s) and then compute per-worker seed
using that numeric SEED_NUM (replace the existing seed=$$(( $(SEED) + $$worker -
1 )) with arithmetic based on SEED_NUM). Reference the fuzz recipe, the SEED
variable, the per-worker seed calculation, and
tools/fuzz_encode_lib.lua::normalize_seed() so the behavior matches Lua's
fallback to os.time().
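
A numeric fallback of this kind might be sketched as below. `numeric_seed` and `seed_base` are hypothetical names, and accepting only non-negative decimal integers is an assumption of this sketch; the exact rules of `normalize_seed()` are not shown here, so the two may differ for inputs such as negative or fractional values. As plain POSIX shell, each `$` would have to be doubled inside the Makefile recipe.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): numeric_seed VALUE
# Prints VALUE when it consists only of decimal digits; otherwise prints
# the current epoch time, as an assumed stand-in for Lua's os.time() fallback.
numeric_seed() {
    value=$1
    case "$value" in
        ''|*[!0-9]*)
            date +%s
            return
            ;;
    esac
    # Drop leading zeros so shell arithmetic does not treat the value as octal.
    while [ "${#value}" -gt 1 ] && [ "${value#0}" != "$value" ]; do
        value=${value#0}
    done
    printf '%s\n' "$value"
}

# Derive per-worker seeds from the sanitized base, as the recipe does.
seed_base=$(numeric_seed "${SEED:-}")
worker=1
while [ "$worker" -le 3 ]; do
    seed=$(( seed_base + worker - 1 ))
    echo "worker=$worker seed=$seed"
    worker=$(( worker + 1 ))
done
```

With the base sanitized once, an input such as `SEED=abc` reaches the arithmetic as an epoch timestamp instead of aborting the recipe.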

		( \
			DURATION="$(DURATION)" \
			INTERVAL="$(INTERVAL)" \
			WORKERS="$(WORKERS)" \
			WORKER_ID="$$worker" \
			SEED="$$seed" \
			SORT_KEYS="$(SORT_KEYS)" \
			"$(LUA)" tools/fuzz_encode.lua; \
			rc=$$?; \
			if [ "$$rc" -ne 0 ]; then \
				echo "$$rc" > "$$tmpdir/fail.$$worker"; \
			fi; \
			echo "$$rc" > "$$tmpdir/done.$$worker"; \
		) & \
		pids="$$pids $$!"; \
		worker=$$(( $$worker + 1 )); \
	done; \
	status=0; \
	while :; do \
		if ls "$$tmpdir"/fail.* >/dev/null 2>&1; then \
			status=1; \
			for pid in $$pids; do \
				kill "$$pid" 2>/dev/null || true; \
			done; \
			break; \
		fi; \
		done_count=$$(ls "$$tmpdir"/done.* 2>/dev/null | wc -l | tr -d ' '); \
		if [ "$$done_count" -ge "$(WORKERS)" ]; then \
			break; \
		fi; \
		sleep 1; \
	done; \
	for pid in $$pids; do \
		if ! wait "$$pid" 2>/dev/null; then \
			status=1; \
		fi; \
	done; \
	exit "$$status"
303 changes: 303 additions & 0 deletions spec/fuzz_encode_lib_spec.lua
@@ -0,0 +1,303 @@
require 'busted.runner'()

describe('tools.fuzz_encode_lib', function()
  local fuzz = require('tools.fuzz_encode_lib')
  local rapidjson = require('rapidjson')

  describe('parse_config', function()
    it('uses production defaults', function()
      local cfg = fuzz.parse_config({})

      assert.are.equal(3600, cfg.duration)
      assert.are.equal(5, cfg.interval)
      assert.are.equal(1, cfg.workers)
      assert.are.equal(1, cfg.worker_id)
      assert.are.equal(true, cfg.sort_keys)
      assert.are.equal('number', type(cfg.seed))
    end)

    it('accepts numeric and boolean overrides', function()
      local cfg = fuzz.parse_config({
        DURATION = '12',
        INTERVAL = '3',
        WORKERS = '2',
        WORKER_ID = '2',
        SEED = '99',
        SORT_KEYS = '0',
      })

      assert.are.equal(12, cfg.duration)
      assert.are.equal(3, cfg.interval)
      assert.are.equal(2, cfg.workers)
      assert.are.equal(2, cfg.worker_id)
      assert.are.equal(99, cfg.seed)
      assert.are.equal(false, cfg.sort_keys)
    end)

    it('treats numeric zero as disabling sorted keys', function()
      local cfg = fuzz.parse_config({ SORT_KEYS = 0 })

      assert.are.equal(false, cfg.sort_keys)
    end)
  end)

  describe('env_from_args', function()
    it('turns KEY=VALUE args into config environment entries', function()
      local env = fuzz.env_from_args({
        'DURATION=2',
        'INTERVAL=1',
        'SEED=123',
        'WORKERS=1',
      })

      assert.are.equal('2', env.DURATION)
      assert.are.equal('1', env.INTERVAL)
      assert.are.equal('123', env.SEED)
      assert.are.equal('1', env.WORKERS)
    end)
  end)

  describe('new_rng', function()
    it('is deterministic for the same seed', function()
      local a = fuzz.new_rng(123)
      local b = fuzz.new_rng(123)

      assert.are.equal(a:int(1, 1000000), b:int(1, 1000000))
      assert.are.equal(a:int(1, 1000000), b:int(1, 1000000))
      assert.are.equal(a:bool(), b:bool())
    end)
  end)

  describe('format_summary', function()
    it('formats the progress counters', function()
      local line = fuzz.format_summary({
        elapsed = 5,
        total = 100,
        encoded = 99,
        encode_errors = 1,
        validation_failures = 0,
        rate = 20,
        seed = 123,
        last_case_id = 100,
        worker_id = 1,
      })

      assert.matches('worker=1', line, 1, true)
      assert.matches('elapsed=5s', line, 1, true)
      assert.matches('total=100', line, 1, true)
      assert.matches('encoded=99', line, 1, true)
      assert.matches('encode_errors=1', line, 1, true)
      assert.matches('validation_failures=0', line, 1, true)
      assert.matches('rate=20.00/s', line, 1, true)
      assert.matches('seed=123', line, 1, true)
      assert.matches('last_case=100', line, 1, true)
    end)
  end)

  describe('generate_case', function()
    it('generates deterministic schema-guided cases with selected metadata', function()
      local a = fuzz.generate_case(fuzz.new_rng(321), 1, rapidjson)
      local b = fuzz.generate_case(fuzz.new_rng(321), 1, rapidjson)

      assert.are.same(a.value, b.value)
      assert.are.same(a.expected, b.expected)
      assert.are.equal('number', type(a.id))
      assert.are.equal('string', type(a.schema))
      assert.are.equal('schema_guided', a.kind)
      assert.are.equal('object', a.expected.top_level_kind)
      assert.are.equal('table', type(a.value.fuzz))
      assert.is_true(#a.expected.objects >= 1)
      assert.is_true(#a.expected.arrays >= 1)
      assert.is_true(#a.expected.scalars >= 1)
    end)

    it('adds pure recursive random cases with nested objects and arrays', function()
      local case = fuzz.generate_case(fuzz.new_rng(98765), 3, rapidjson)

      assert.are.equal('recursive_random', case.kind)
      assert.are.equal('recursive_random', case.schema)
      assert.are.equal('table', type(case.value))
      assert.are.equal('table', type(case.value.random))
      assert.are.equal('table', type(case.expected.random))
      assert.is_true(case.expected.random.max_depth >= 3)
      assert.is_true(case.expected.random.object_count >= 2)
      assert.is_true(case.expected.random.array_count >= 1)
      assert.is_true(#case.expected.objects >= case.expected.random.object_count)
      assert.is_true(#case.expected.arrays >= case.expected.random.array_count)
    end)

    it('tracks recursive random arrays from the generated core', function()
      local case = fuzz.generate_case(fuzz.new_rng(98765), 3, rapidjson)
      local saw_core_array = false

      for _, entry in ipairs(case.expected.arrays) do
        if entry.path:match('^%$%.random') then
          saw_core_array = true
        end
      end

      assert.is_true(saw_core_array)
    end)

    it('emits rapidjson null sentinels that round-trip as JSON null', function()
      local case = fuzz.generate_case(fuzz.new_rng(100), 10, rapidjson)

      assert.are.equal('paginated_list', case.schema)
      assert.are.equal(rapidjson.null, case.value.links.previous)

      local encoded = rapidjson.encode(case.value)
      local decoded = rapidjson.decode(encoded)

      assert.matches('"previous":null', encoded, 1, true)
      assert.are.equal(rapidjson.null, decoded.links.previous)
    end)

    it('requires a real rapidjson null sentinel', function()
      assert.has_error(function()
        fuzz.generate_case(fuzz.new_rng(1), 1, {})
      end, 'rapidjson.null is required')
    end)

    it('rejects fake table null sentinels', function()
      assert.has_error(function()
        fuzz.generate_case(fuzz.new_rng(1), 1, { null = {} })
      end, 'rapidjson.null is required')
    end)

    it('runs pure recursive random cases at least as often as schema-guided cases', function()
      local rng = fuzz.new_rng(1)
      local seen = {}
      local counts = {
        schema_guided = 0,
        recursive_random = 0,
      }

      for case_id = 1, 30 do
        local case = fuzz.generate_case(rng, case_id, rapidjson)
        counts[case.kind] = counts[case.kind] + 1
        if case.kind == 'schema_guided' then
          seen[case.schema] = true
        end
      end

      assert.is_true(counts.recursive_random >= counts.schema_guided)
      assert.are.equal(10, counts.schema_guided)
      assert.are.equal(20, counts.recursive_random)
      assert.is_true(seen.llm_response)
      assert.is_true(seen.github_issue)
      assert.is_true(seen.social_feed)
      assert.is_true(seen.paginated_list)
      assert.is_true(seen.metadata_config)
    end)
  end)

  describe('validate_encoded_case', function()
    it('accepts a generated case encoded with sorted keys', function()
      local case = fuzz.generate_case(fuzz.new_rng(77), 1, rapidjson)
      local json = rapidjson.encode(case.value, { sort_keys = true })

      local ok, err = fuzz.validate_encoded_case(rapidjson, case, json)

      assert.is_true(ok)
      assert.is_nil(err)
    end)

    it('rejects unsorted encoded object keys for tracked objects', function()
      local case = {
        id = 1,
        kind = 'manual',
        schema = 'manual',
        value = { b = 1, a = 2 },
        expected = {
          top_level_kind = 'object',
          objects = {
            { path = '$', key_count = 2, keys = { 'a', 'b' } },
          },
          arrays = {},
          scalars = {},
        },
      }

      local ok, err = fuzz.validate_encoded_case(rapidjson, case, '{"b":1,"a":2}')

      assert.is_false(ok)
      assert.matches('key order', err, 1, true)
    end)

    it('rejects unsorted nested object keys for tracked object paths', function()
      local case = {
        id = 2,
        kind = 'manual',
        schema = 'manual',
        value = { a = { b = 1, a = 2 } },
        expected = {
          top_level_kind = 'object',
          objects = {
            { path = '$.a', key_count = 2, keys = { 'a', 'b' } },
          },
          arrays = {},
          scalars = {},
        },
      }

      local ok, err = fuzz.validate_encoded_case(rapidjson, case, '{"a":{"b":1,"a":2}}')

      assert.is_false(ok)
      assert.matches('key order', err, 1, true)
    end)

    it('validates recursive_random core metadata after encode and decode', function()
      local case = fuzz.generate_case(fuzz.new_rng(98765), 2, rapidjson)
      local json = rapidjson.encode(case.value, { sort_keys = true })

      assert.are.equal('recursive_random', case.kind)
      assert.are.equal('recursive_random', case.schema)
      assert.are.equal('table', type(case.expected.random))

      local ok, err = fuzz.validate_encoded_case(rapidjson, case, json)

      assert.is_true(ok)
      assert.is_nil(err)
    end)

    it('returns decode diagnostics when JSON cannot be decoded', function()
      local ok, err = fuzz.validate_encoded_case(rapidjson, { expected = {} }, '{"a":}')

      assert.is_false(ok)
      assert.matches('decode failed:', err, 1, true)
    end)
  end)

  describe('format_failure', function()
    it('is reproducible and includes fuzz failure diagnostics', function()
      local case = {
        id = 42,
        kind = 'manual',
        schema = 'manual_schema',
        value = { b = 1, a = { true, rapidjson.null } },
      }
      local details = {
        seed = 12345,
        worker_id = 2,
        case = case,
        reason = 'key order mismatch at $',
        json = '{"b":1,"a":[true,null]}',
      }

      local first = fuzz.format_failure(details)
      local second = fuzz.format_failure(details)

      assert.are.equal(first, second)
      assert.matches('FUZZ FAILURE', first, 1, true)
      assert.matches('seed=12345', first, 1, true)
      assert.matches('worker=2', first, 1, true)
      assert.matches('case=42', first, 1, true)
      assert.matches('kind=manual', first, 1, true)
      assert.matches('schema=manual_schema', first, 1, true)
      assert.matches('reason=key order mismatch at $', first, 1, true)
      assert.matches('value=', first, 1, true)
      assert.matches('"a"', first, 1, true)
      assert.matches('json={"b":1,"a":[true,null]}', first, 1, true)
    end)
  end)
end)