Cranelift CLIF-level differential fuzzer #3038
Conversation
Fixed, thanks! @bjorn3
Thanks so much for taking this on, @afonso360!
I have a bunch of comments below, but I think the general direction here is good.
As a meta-comment, it's pretty clear to me that this is an effort that can take place over multiple PRs over a span of time; it is valuable even to have the first step of "straight-line code with arithmetic operators only". I think that we'll want to outline the next steps, maybe even in a centralized TODO comment. Specifically, I can think of a few things:
- Generating test inputs with control flow
- Generating calls (just to builtins/intrinsics? or just between multiple functions in one testcase?)
- Generating memory loads/stores (stack and heap)
- Full coverage of arithmetic ops
- Full coverage of SIMD ops (this will take a while undoubtedly!)
And at the testing-harness level, we need a bit more thought about (i) infinite loops and (ii) what to do about traps, though I have some thoughts below.
Anyway, I think we can iterate here to get just straightline testcases with a few ops working, then expand from there!
use cranelift_interpreter::step::ControlFlow;
use cranelift_interpreter::step::CraneliftTrap;

enum RunResult {
Should there be an error-case for "timeout", i.e. ran too long? I'll keep reading to see how infinite loops are caught but this seems like a separate logical case from explicit trap.
I think it makes sense. I'm not seeing a quick way to constrain the interpreter to run just N instructions, so let's add that when we implement branches and control flow.
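For illustration, here is a hedged sketch of how the result type could distinguish that timeout case. Everything beyond the `enum RunResult {` line quoted below (variant names, payloads, the DataValue import) is an assumption, not the PR's actual definition:

```rust
// Hedged sketch only: the variants and payloads here are assumptions.
use cranelift_codegen::data_value::DataValue;
use cranelift_interpreter::step::CraneliftTrap;

#[derive(Debug)]
enum RunResult {
    /// The function returned normally with these results.
    Success(Vec<DataValue>),
    /// The function hit an explicit trap (e.g. integer division by zero).
    Trap(CraneliftTrap),
    /// The interpreter exceeded its step budget; the input gets discarded
    /// rather than compared against the compiled side.
    Timeout,
    /// A host-side failure unrelated to the generated code.
    Error(String),
}
```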
fn run_in_interpreter(interpreter: &mut Interpreter, args: &[DataValue]) -> RunResult {
    // The entrypoint function is always 0
    let index = FuncIndex::from_u32(0);
    let res = interpreter.call_by_index(index, args);
Ah, it seems we might want a "run for max N steps" method on the interpreter here. Otherwise, if we have an infinite loop (when fuzz-case generation becomes sophisticated enough for that, if it's not already?), we tie up a core indefinitely.
There's a question whether we step for N steps on both sides (interp and compiled) and then compare state at the end, or if we just reject inputs that run that long. I suspect probably the latter is good enough; the former would require instrumentation in the compiler to count ops and stop. We have "fuel" at the Wasmtime level but it's not built-in to Cranelift itself. And, anything a testcase could do to run indefinitely would be tested by other valid test cases as well, so I think it's fine to avoid these inputs altogether.
> when fuzz-case generation becomes sophisticated enough for that, if it's not already?
It isn't, simply because we limit the number of generated instructions to a maximum of 16. Otherwise I think we are simply limited by the input length. On a related note, I moved all of the arbitrary restrictions that we place on the generator into a Config module so that this becomes more obvious (and so that we can eventually do swarm fuzzing).
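For illustration, a hedged sketch of what such a Config might look like; the field names and the variable limit are assumptions, and only the instruction cap of 16 comes from the discussion above:

```rust
/// Hypothetical sketch of centralized generator limits, not the PR's actual
/// Config. Collecting the knobs in one place makes the arbitrary restrictions
/// visible and lets a future swarm-fuzzing mode vary them per input.
#[derive(Debug, Clone)]
pub struct Config {
    /// Maximum number of instructions generated per function (16 per the discussion above).
    pub max_instructions: usize,
    /// Maximum number of variables available to the generator (assumed value).
    pub max_vars: usize,
}

impl Default for Config {
    fn default() -> Self {
        Config { max_instructions: 16, max_vars: 16 }
    }
}
```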
> There's a question whether we step for N steps on both sides (interp and compiled) and then compare state at the end, or if we just reject inputs that run that long. I suspect probably the latter is good enough
Rejecting inputs that run that long on the interpreter seems like the way to go to me.
// Reuse var
if self.vars_of_type(ty).len() != 0 {
    opts.push(|fg, _, ty| {
        let opts = fg.vars_of_type(ty);
This "available variables" approach is nice, but gets somewhat complicated with control flow, and you can see e.g. in the regalloc2 SSA generator there's a two-staged approach where we decide what values are defined in each block first, then (i) pick among values defined in blocks that dominate us for uses, (ii) ensure we define all vars we said we would in a given block.
So, in this implementation, what happens if the variable is defined, but only in one predecessor? Do we unconditionally re-define it (and let the frontend handle the SSA fixup for that)?
Another workaround is to decide at the top of the function which variables will exist, and assign initial values to all of them. Then all the rest of the instruction outputs are just redefinitions. This is actually probably the easiest and least buggy (which is what we want for a fuzztest generator!) and still shouldn't sacrifice any coverage, since it can turn into arbitrary SSA.
The idea here was that this set of "available variables" is defined per block, not per function (it's not currently separated because we generate just one block). When branching or calling return, we choose a random number of variables from the current block to transfer onto the next block (depending on the block "signature").
I don't think we have issues with picking variables defined in predecessors because we never do that, and I don't think we can do that with clif? (I may be wrong here.)
> Another workaround is to decide at the top of the function which variables will exist, and assign initial values to all of them.
Ah! I didn't know that we could transform non-SSA code into SSA code. This might be something that we actually want to fuzz as well. In that case I think this is a good way to go.
> when branching ... we choose a random number of variables ... to transfer onto the next block
This definitely works for a straightline sequence of blocks, but how does this work for loops? More generally, if you decide at the end of a block which variables to see as "defined" in the next block, you need to see all in-edges to a block before you know for sure that a variable is defined on every in-edge.
This is the origin of the two-stage approach -- it breaks the cycle: if you pin down what should be defined in each location, you can then, in parallel, both rely on that (in uses) and ensure that (in defs).
I think the easiest answer is to rely on the SSA-builder algorithm built into the variable helper layer which you're using here, and def everything at the top first, then select any variable for each use. The key thing (which I think may be flying under the radar here given the "transfer to the next block" phrasing above) is that "variables" as you're using them are not SSA values or blockparams; the latter are defined by the former depending on where you place the one or more definitions of a variable. (And indeed, a variable can have more than one def.) It's always correct to use a variable that has been defined at least once on every path from the entry block (has some dominating def), so the trivially correct thing to do is define everything (for the first time) in the entry block :-)
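A minimal sketch of that pattern using the cranelift-frontend Variable API of the time (illustrative only; the structure and constants are not the fuzzer's actual generator code, just the declare_var/def_var/use_var calls the comment refers to):

```rust
use cranelift_codegen::entity::EntityRef;
use cranelift_codegen::ir::{types, AbiParam, ExternalName, Function, InstBuilder, Signature};
use cranelift_codegen::isa::CallConv;
use cranelift_frontend::{FunctionBuilder, FunctionBuilderContext, Variable};

fn build_example() -> Function {
    let mut sig = Signature::new(CallConv::SystemV);
    sig.returns.push(AbiParam::new(types::I32));
    let mut func = Function::with_name_signature(ExternalName::user(0, 0), sig);
    let mut fb_ctx = FunctionBuilderContext::new();
    let mut builder = FunctionBuilder::new(&mut func, &mut fb_ctx);

    let block0 = builder.create_block();
    builder.append_block_params_for_function_params(block0);
    builder.switch_to_block(block0);

    // 1. Declare every variable up front and give it an initial definition in
    //    the entry block, so any later use has a dominating def.
    let x = Variable::new(0);
    builder.declare_var(x, types::I32);
    let zero = builder.ins().iconst(types::I32, 0);
    builder.def_var(x, zero);

    // 2. Later instruction results are just *re*definitions of some variable;
    //    the SSA builder inserts block params/phis behind the scenes as needed.
    let ten = builder.ins().iconst(types::I32, 10);
    builder.def_var(x, ten);

    // 3. A use simply asks the builder for the variable's current value.
    let val = builder.use_var(x);
    builder.ins().return_(&[val]);

    builder.seal_all_blocks();
    builder.finalize();
    func
}
```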
I've updated the fuzzer to perform this upfront allocation of variables. I'm slightly concerned that it now seems to be reusing the same variable a lot consecutively, but I think this may be the fuzzer exploring the non-SSA-to-SSA conversion?
I hadn't seen your reply when writing my previous comment. I've changed it to the upfront allocation and I think it should be better, because we are probably going to stress the variable helper a bit more with the fuzzer.
But I'm still not sure I'm following what you are saying.
> This definitely works for a straightline sequence of blocks, but how does this work for loops? More generally, if you decide at the end of a block which variables to see as "defined" in the next block, you need to see all in-edges to a block before you know for sure that a variable is defined on every in-edge.
The way I'm thinking about this is as if every block is sort of a function.
function %test(i32, i32) -> i32 {
block0(v0: i32, v1: i32):
    v2 = iconst.i32 10
    jump block1(v1)      ; Here we can choose any of v0, v1, v2
block1(v4: i32):
    v5 = iconst.i32 12
    v6 = iconst.i32 13
    jump block0(v4, v5)  ; Here we can choose either v4, v5 or v6, but never v0, v1, v2
}
So in a sense, what I was thinking was: we generate the next block (or reuse an existing one), see what types it has in its signature, and choose variables available in the current block to "call" it with (either block params or values locally defined by us).
This way, we don't need to know who called this block, because we never use variables defined by anyone else.
This is more of a curiosity now, because I've changed to the other approach and think it is better, but I'd like to understand why it wouldn't work.
> We generate the next block (or reuse an existing one) see what types it has in its signature and choose variables available in the current block to "call" it with (either block params or values locally defined by us).
Ah, I see! The key thing here, I think, is that "what types it has in its signature" is decided before code is generated, and so breaks the cycle: this is the equivalent to the pre-pass I mentioned that decides where to define things so that defs and uses can both rely on those decisions. So this approach would also work.
Given that the cranelift-frontend logic is built to accept multi-def vars and generate SSA, I think that the approach this PR takes now (upfront initializations) is still the right one, but your approach would work well for generating the SSA directly, if we decide to do that instead at some point.
@afonso360, just wanted to say this is shaping up to be a good contribution -- nice work!
@cfallin Thanks for taking the time to review this!
I think it would be good to have a tracking issue for this so that we could also keep the discussion there (I can open one if so). Besides the steps that you mentioned here, we should also add the TODOs that were left pending from this PR.
// Define variables for the function signature
for (i, param) in sig.params.iter().enumerate() {
    let var = self.create_var(&mut builder, param.value_type)?;
Now that I'm looking at this again, I don't think it's quite right; we probably should be assigning the block params to the pool of variables instead of creating new ones.
I'll look at this again tomorrow.
I think this should be right? At least, my understanding is that the blockparams are SSA values while the variables are a higher-level abstraction; it would be more confusing to have a pool of both and then selectively lower vars to SSA values or use an SSA value directly, I think.
Ah, this works only because the first block is special and has the function params. I think for the other blocks we have to associate the block params with existing variables? Otherwise we run into the issue you were pointing out earlier, where we may not have defined a variable depending on where we came from (i.e. we created it for a blockparam in a block that we haven't visited yet).
Anyway, I'll probably start working on the control flow improvements today, and should be in a more informed position to address this when I submit the next PR.
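For reference, a hedged sketch of the alternative being discussed, binding a block's parameters to variables in the pool; the helper and its numbering scheme are hypothetical, while the builder calls are the regular cranelift-frontend API:

```rust
use cranelift_codegen::entity::EntityRef;
use cranelift_codegen::ir::Block;
use cranelift_frontend::{FunctionBuilder, Variable};

/// Hypothetical helper: bind each parameter of the entry block to a variable,
/// so later generated instructions can reach it through use_var like any other
/// variable in the pool.
fn bind_entry_params_to_vars(builder: &mut FunctionBuilder, block0: Block) {
    builder.append_block_params_for_function_params(block0);
    let params = builder.block_params(block0).to_vec();
    for (i, &param) in params.iter().enumerate() {
        let var = Variable::new(i); // assumed numbering scheme
        let ty = builder.func.dfg.value_type(param);
        builder.declare_var(var, ty);
        builder.def_var(var, param);
    }
}
```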
Thanks for the updates -- this looks good to me and is a great starting point for further work! I'm excited to see what this turns into.
I think there's just one issue with the verify-publish CI job because of the new crate (cranelift-fuzzgen) -- you'll have to add it to a list somewhere. Once that's fixed, I'm happy to merge this.
Regarding the followup work, yes, please do open an issue or issues to track; a single top-level issue with a checklist (features until v1 -- control flow, memory, ... as listed above) would be fine for now, I think.
I assume it is probably not super-useful to add this to our oss-fuzz configuration now, but once it is more featureful, we can talk about that as well!
It should be ok to merge now. I've disabled publishing the new cranelift-fuzzgen crate.
Thanks to everyone for reviewing and discussing this! 🎉
Hey!
Here's an initial version of the cranelift fuzzer.
We generate a random cranelift function based on input bytes from the fuzzer. Currently we can generate 7 instructions, but it is already enough to get us into trouble.
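For orientation, a hedged sketch of roughly what the fuzz target's shape looks like; `TestCase` and `run_test_case` are placeholders standing in for whatever the crate actually defines, and only the `libfuzzer_sys` / `arbitrary` usage is real API:

```rust
use arbitrary::{Arbitrary, Unstructured};
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    // Every random choice (types, instructions, operands) is driven by the
    // input bytes, bounded by the generator's Config limits.
    let mut u = Unstructured::new(data);
    let testcase = match TestCase::arbitrary(&mut u) {
        Ok(tc) => tc,
        // Not enough input bytes to finish generating: just discard the case.
        Err(_) => return,
    };
    // Run the generated function in the interpreter and as compiled host code,
    // then compare the results: the differential part of the fuzzer.
    run_test_case(&testcase);
});
```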
There is also a gen_testcase util that grabs fuzzer artifacts and converts them into clif files for easier testing.
In order for the fuzzer to be useful in testing DivByZero traps, we need to disable overflow checks; otherwise, the interpreter crashes immediately with overflows (we also need to fix this).
I've based my design a lot on @fitzgen's blog post about wasm-smith. It's what made me start thinking about building this for cranelift in the first place. Thanks!
Organization
There is a separate cargo-fuzz package for cranelift; should I merge the fuzzer with wasmtime's fuzzers?
Design questions:
How do we deal with arbitrary memory accesses?
How do we deal with traps?
How do we deal with infinite loops?
Is there any other bad behavior that we need to handle specially?
Should we distinguish between pointer types and i64s in the generator? We can just use any i64 and the program will crash until it does a valid thing. This might not be great in terms of fuzzing efficiency, but it might generate more interesting programs.
CC: @cfallin, @andrewbrown, @fitzgen, @bjorn3 from the earlier fuzzer discussion