[FuzzMutate] Reject invalid inputs in the IR fuzzers#202348

Open
chfast wants to merge 1 commit into llvm:main from chfast:fuzzer-ignore-broken-input

@chfast
Member

chfast commented Jun 8, 2026

In llvm-isel-fuzzer and llvm-opt-fuzzer, return -1 from LLVMFuzzerTestOneInput to ignore an invalid input. This keeps the input out of the in-memory corpus and prevents it from being sent to the custom mutators, which is useless and crashes llvm-isel-fuzzer. libFuzzer truncates inputs to -max_len, so inputs above the limit are automatically invalid.
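As a rough illustration of the pattern described above (not the code from this PR), the sketch below shows a fuzz target that returns -1 for inputs it cannot use. `parseAndVerify` and `runPipeline` are hypothetical stand-ins: the real fuzzers parse and verify an LLVM module, while here a two-byte prefix check plays that role so the example stays self-contained.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for parsing and verifying an IR module. The real
// fuzzers work on LLVM bitcode; this only checks a made-up two-byte prefix.
static bool parseAndVerify(const uint8_t *Data, size_t Size) {
  static const char Magic[] = {'B', 'C'};
  return Size >= sizeof(Magic) && std::memcmp(Data, Magic, sizeof(Magic)) == 0;
}

// Hypothetical stand-in for the optimization or instruction-selection run.
static void runPipeline(const uint8_t *, size_t) {}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  if (!parseAndVerify(Data, Size))
    return -1; // Reject: libFuzzer will not add this input to the corpus.
  runPipeline(Data, Size);
  return 0; // Accept: the input may be added to the corpus.
}
```

With libFuzzer, a return value of 0 accepts the input and -1 rejects it; other return values are reserved.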

@mgcarrasco
Contributor

I see that ignoring broken inputs can be useful, but depending on the objective, allowing invalid inputs is not necessarily wrong. Invalid inputs can still find crashes in the parser or verifier.

Would it be possible to put this change behind a flag?
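One way such a flag could look, purely as a sketch: the flag name `-reject-invalid-inputs`, the `RejectInvalidInputs` variable, and the `looksValid` check are all made up for illustration and are not part of the LLVM fuzzers. Only `LLVMFuzzerInitialize` and `LLVMFuzzerTestOneInput` are real libFuzzer entry points. How a custom flag coexists with the fuzzing engine's own option parsing is not addressed here.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical switch; off by default so invalid inputs are kept as before.
static bool RejectInvalidInputs = false;

// Hypothetical validity check standing in for module parsing/verification.
static bool looksValid(const uint8_t *Data, size_t Size) {
  return Size > 0 && Data[0] == 'B';
}

// Called once by the fuzzing engine before any input is run.
extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) {
  for (int I = 1; I < *argc; ++I)
    if (std::strcmp((*argv)[I], "-reject-invalid-inputs") == 0)
      RejectInvalidInputs = true; // Made-up flag name, for illustration only.
  return 0;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  if (!looksValid(Data, Size))
    return RejectInvalidInputs ? -1 : 0; // -1 only when the flag is set.
  // A real harness would run the pipeline on the valid module here.
  return 0;
}
```

Keeping the default at 0 preserves the existing behavior for anyone who does not opt in.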

@DataCorrupted
Member

I agree that we should put the change under a flag, but for a different reason:

Re @mgcarrasco: I don't see the value of invalid inputs for the files being changed here (opt-fuzzer and isel-fuzzer), whose purpose is to find deeper bugs in the pipeline through valid input. Aside from the fact that LLParser's correctness has low priority, even if you wish to stress-test the parser/verifier (say, after we invent a new intrinsic or IR opcode), llvm-dis-fuzzer and/or llvm-as-fuzzer should be the ones to use (though they lack some development, which supports my "lower priority" point).

On the change itself, I can't speak for the effect on libFuzzer, but on AFL, IIUC, it does little:

  • return 0: tells AFL this harness is not crashing, which is true. AFL will store the invalid input in the queue for later mutation, although at very low priority, because AFL lowers an input's priority once it finds that the input (being invalid) triggers no execution paths beyond the error message.
  • return -1: tells AFL this harness is crashing, which deviates from the program's true behavior and tricks AFL into storing the seed in the crash directory. That directory is very important for post-fuzzing analysis, where people categorize the crashes and patch their optimization/backend. Adding invalid inputs to the crash directory is noise that I do not want to introduce.

So from AFL's perspective, this is a slight regression, though not totally unacceptable if it's guarded under a flag and your usage justifies it. In which use case do you find return -1 useful?
