Cranelift CLIF-level differential fuzzer #3038
Conversation
Fixed, thanks! @bjorn3
Thanks so much for taking this on, @afonso360!
I have a bunch of comments below, but I think the general direction here is good.
As a meta-comment, it's pretty clear to me that this is an effort that can take place over multiple PRs over a span of time; it is valuable even to have the first step of "straight-line code with arithmetic operators only". I think that we'll want to outline the next steps, maybe even in a centralized TODO comment. Specifically, I can think of a few things:
- Generating test inputs with control flow
- Generating calls (just to builtins/intrinsics? or just between multiple functions in one testcase?)
- Generating memory loads/stores (stack and heap)
- Full coverage of arithmetic ops
- Full coverage of SIMD ops (this will take a while undoubtedly!)
And at the testing-harness level, we need a bit more thought about (i) infinite loops and (ii) what to do about traps, though I have some thoughts below.
Anyway, I think we can iterate here to get just straightline testcases with a few ops working, then expand from there!
use cranelift_interpreter::step::ControlFlow;
use cranelift_interpreter::step::CraneliftTrap;

enum RunResult {
Should there be an error-case for "timeout", i.e. ran too long? I'll keep reading to see how infinite loops are caught but this seems like a separate logical case from explicit trap.
I think it makes sense. I'm not seeing a quick way to constrain the interpreter to run just N instructions, so let's add that when we implement branches and control flow.
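For illustration, here is a hedged sketch of how the result type could distinguish that timeout case. Everything beyond the `enum RunResult {` line quoted below (variant names, payloads, the DataValue import) is an assumption, not the PR's actual definition:

```rust
// Hedged sketch only: the variants and payloads here are assumptions.
use cranelift_codegen::data_value::DataValue;
use cranelift_interpreter::step::CraneliftTrap;

#[derive(Debug)]
enum RunResult {
    /// The function returned normally with these results.
    Success(Vec<DataValue>),
    /// The function hit an explicit trap (e.g. integer division by zero).
    Trap(CraneliftTrap),
    /// The interpreter exceeded its step budget; the input gets discarded
    /// rather than compared against the compiled side.
    Timeout,
    /// A host-side failure unrelated to the generated code.
    Error(String),
}
```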
fn run_in_interpreter(interpreter: &mut Interpreter, args: &[DataValue]) -> RunResult {
    // The entrypoint function is always 0
    let index = FuncIndex::from_u32(0);
    let res = interpreter.call_by_index(index, args);
Ah, it seems we might want a "run for max N steps" method on the interpreter here. Otherwise, if we have an infinite loop (when fuzz-case generation becomes sophisticated enough for that, if it's not already?), we tie up a core indefinitely.
There's a question whether we step for N steps on both sides (interp and compiled) and then compare state at the end, or if we just reject inputs that run that long. I suspect probably the latter is good enough; the former would require instrumentation in the compiler to count ops and stop. We have "fuel" at the Wasmtime level but it's not built-in to Cranelift itself. And, anything a testcase could do to run indefinitely would be tested by other valid test cases as well, so I think it's fine to avoid these inputs altogether.
> when fuzz-case generation becomes sophisticated enough for that, if it's not already?
It isn't, simply because we limit the number of generated instructions to a maximum of 16. Otherwise I think we are simply limited by the input length. On a related note, I moved all of the arbitrary restrictions that we place on the generator into a Config module so that this becomes more obvious (and so that we can eventually do swarm fuzzing).
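For illustration, a hedged sketch of what such a Config might look like; the field names and the variable limit are assumptions, and only the instruction cap of 16 comes from the discussion above:

```rust
/// Hypothetical sketch of centralized generator limits, not the PR's actual
/// Config. Collecting the knobs in one place makes the arbitrary restrictions
/// visible and lets a future swarm-fuzzing mode vary them per input.
#[derive(Debug, Clone)]
pub struct Config {
    /// Maximum number of instructions generated per function (16 per the discussion above).
    pub max_instructions: usize,
    /// Maximum number of variables available to the generator (assumed value).
    pub max_vars: usize,
}

impl Default for Config {
    fn default() -> Self {
        Config { max_instructions: 16, max_vars: 16 }
    }
}
```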
> There's a question whether we step for N steps on both sides (interp and compiled) and then compare state at the end, or if we just reject inputs that run that long. I suspect probably the latter is good enough
Rejecting inputs that run that long on the interpreter seems like the way to go to me.
// Reuse var
if self.vars_of_type(ty).len() != 0 {
    opts.push(|fg, _, ty| {
        let opts = fg.vars_of_type(ty);
This "available variables" approach is nice, but gets somewhat complicated with control flow, and you can see e.g. in the regalloc2 SSA generator there's a two-staged approach where we decide what values are defined in each block first, then (i) pick among values defined in blocks that dominate us for uses, (ii) ensure we define all vars we said we would in a given block.
So, in this implementation, what happens if the variable is defined, but only in one predecessor? Do we unconditionally re-define it (and let the frontend handle the SSA fixup for that)?
Another workaround is to decide at the top of the function which variables will exist, and assign initial values to all of them. Then all the rest of the instruction outputs are just redefinitions. This is actually probably the easiest and least buggy (which is what we want for a fuzztest generator!) and still shouldn't sacrifice any coverage, since it can turn into arbitrary SSA.
The idea here was that this set of "available variables" is defined per block, not per function (it's not currently separated because we generate just one block). When branching or calling return, we choose a random number of variables from the current block to transfer onto the next block (depending on the block "signature").
I don't think we have issues with picking variables defined in predecessors because we never do that, and I don't think we can do that with clif? (I may be wrong here.)
> Another workaround is to decide at the top of the function which variables will exist, and assign initial values to all of them.
Ah! I didn't know that we could transform non-SSA code into SSA code. This might be something that we actually want to fuzz as well. In that case I think this is a good way to go.
> when branching ... we choose a random number of variables ... to transfer onto the next block
This definitely works for a straightline sequence of blocks, but how does this work for loops? More generally, if you decide at the end of a block which variables to see as "defined" in the next block, you need to see all in-edges to a block before you know for sure that a variable is defined on every in-edge.
This is the origin of the two-stage approach -- it breaks the cycle: if you pin down what should be defined in each location, you can then, in parallel, both rely on that (in uses) and ensure that (in defs).
I think the easiest answer is to rely on the SSA-builder algorithm built into the variable helper layer which you're using here, and def everything at the top first, then select any variable for each use. The key thing (which I think may be flying under the radar here given the "transfer to the next block" phrasing above) is that "variables" as you're using them are not SSA values or blockparams; the latter are defined by the former depending on where you place the one or more definitions of a variable. (And indeed, a variable can have more than one def.) It's always correct to use a variable that has been defined at least once on every path from the entry block (has some dominating def), so the trivially correct thing to do is define everything (for the first time) in the entry block :-)
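A minimal sketch of that pattern using the cranelift-frontend Variable API of the time (illustrative only; the structure and constants are not the fuzzer's actual generator code, just the declare_var/def_var/use_var calls the comment refers to):

```rust
use cranelift_codegen::entity::EntityRef;
use cranelift_codegen::ir::{types, AbiParam, ExternalName, Function, InstBuilder, Signature};
use cranelift_codegen::isa::CallConv;
use cranelift_frontend::{FunctionBuilder, FunctionBuilderContext, Variable};

fn build_example() -> Function {
    let mut sig = Signature::new(CallConv::SystemV);
    sig.returns.push(AbiParam::new(types::I32));
    let mut func = Function::with_name_signature(ExternalName::user(0, 0), sig);
    let mut fb_ctx = FunctionBuilderContext::new();
    let mut builder = FunctionBuilder::new(&mut func, &mut fb_ctx);

    let block0 = builder.create_block();
    builder.append_block_params_for_function_params(block0);
    builder.switch_to_block(block0);

    // 1. Declare every variable up front and give it an initial definition in
    //    the entry block, so any later use has a dominating def.
    let x = Variable::new(0);
    builder.declare_var(x, types::I32);
    let zero = builder.ins().iconst(types::I32, 0);
    builder.def_var(x, zero);

    // 2. Later instruction results are just *re*definitions of some variable;
    //    the SSA builder inserts block params/phis behind the scenes as needed.
    let ten = builder.ins().iconst(types::I32, 10);
    builder.def_var(x, ten);

    // 3. A use simply asks the builder for the variable's current value.
    let val = builder.use_var(x);
    builder.ins().return_(&[val]);

    builder.seal_all_blocks();
    builder.finalize();
    func
}
```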
I've updated the fuzzer to perform this upfront allocation of variables. I'm slightly concerned that it now seems to be reusing the same variable a lot consecutively, but I think this may be the fuzzer exploring the non-SSA-to-SSA conversion?
I hadn't seen your reply when writing my previous comment. I've changed it to the upfront allocation and I think it should be better, because we are probably going to stress the variable helper a bit more with the fuzzer.
But I'm still not sure I'm following what you are saying.
> This definitely works for a straightline sequence of blocks, but how does this work for loops? More generally, if you decide at the end of a block which variables to see as "defined" in the next block, you need to see all in-edges to a block before you know for sure that a variable is defined on every in-edge.
The way I'm thinking about this is as if every block is sort of a function.
function %test(i32, i32) -> i32 {
block0(v0: i32, v1: i32):
    v2 = iconst.i32 10
    jump block1(v1)      ; Here we can choose any of v0, v1, v2
block1(v4: i32):
    v5 = iconst.i32 12
    v6 = iconst.i32 13
    jump block0(v4, v5)  ; Here we can choose either v4, v5 or v6, but never v0, v1, v2
}
So in a sense, what I was thinking was: we generate the next block (or reuse an existing one), see what types it has in its signature, and choose variables available in the current block to "call" it with (either block params or values locally defined by us).
This way, we don't need to know who called this block, because we never use variables defined by anyone else.
This is more of a curiosity now, because I've changed to the other approach and think it is better, but I'd like to understand why it wouldn't work.
> We generate the next block (or reuse an existing one) see what types it has in its signature and choose variables available in the current block to "call" it with (either block params or values locally defined by us).
Ah, I see! The key thing here, I think, is that "what types it has in its signature" is decided before code is generated, and so breaks the cycle: this is the equivalent to the pre-pass I mentioned that decides where to define things so that defs and uses can both rely on those decisions. So this approach would also work.
Given that the cranelift-frontend logic is built to accept multi-def vars and generate SSA, I think that the approach this PR takes now (upfront initializations) is still the right one, but your approach would work well for generating the SSA directly, if we decide to do that instead at some point.
@afonso360, just wanted to say this is shaping up to be a good contribution -- nice work!
@cfallin Thanks for taking the time to review this!
I think it would be good to have a tracking issue for this so that we could also keep the discussion there (I can open one if so). Besides the steps that you mentioned here, we should also add the TODOs that were left pending from this PR.
// Define variables for the function signature
for (i, param) in sig.params.iter().enumerate() {
    let var = self.create_var(&mut builder, param.value_type)?;
Now that I'm looking at this again, I don't think it's quite right; we probably should be assigning the block params to the pool of variables instead of creating new ones.
I'll look at this again tomorrow.
I think this should be right? At least, my understanding is that the blockparams are SSA values while the variables are a higher-level abstraction; it would be more confusing to have a pool of both and then selectively lower vars to SSA values or use an SSA value directly, I think.
Ah, this works only because the first block is special and has the function params. I think for the other blocks we have to associate the block params with existing variables? Otherwise we run into the issue you were pointing out earlier, where we may not have defined a variable depending on where we came from (i.e. we created it for a blockparam in a block that we haven't visited yet).
Anyway, I'll probably start working on the control flow improvements today, and should be in a more informed position to address this when I submit the next PR.
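For reference, a hedged sketch of the alternative being discussed, binding a block's parameters to variables in the pool; the helper and its numbering scheme are hypothetical, while the builder calls are the regular cranelift-frontend API:

```rust
use cranelift_codegen::entity::EntityRef;
use cranelift_codegen::ir::Block;
use cranelift_frontend::{FunctionBuilder, Variable};

/// Hypothetical helper: bind each parameter of the entry block to a variable,
/// so later generated instructions can reach it through use_var like any other
/// variable in the pool.
fn bind_entry_params_to_vars(builder: &mut FunctionBuilder, block0: Block) {
    builder.append_block_params_for_function_params(block0);
    let params = builder.block_params(block0).to_vec();
    for (i, &param) in params.iter().enumerate() {
        let var = Variable::new(i); // assumed numbering scheme
        let ty = builder.func.dfg.value_type(param);
        builder.declare_var(var, ty);
        builder.def_var(var, param);
    }
}
```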
Thanks for the updates -- this looks good to me and is a great starting point for further work! I'm excited to see what this turns into.
I think there's just one issue with the verify-publish CI job because of the new crate (cranelift-fuzzgen) -- you'll have to add it to a list somewhere. Once that's fixed, I'm happy to merge this.
Regarding the followup work, yes, please do open an issue or issues to track; a single top-level issue with a checklist (features until v1 -- control flow, memory, ... as listed above) would be fine for now, I think.
I assume it is probably not super-useful to add this to our oss-fuzz configuration now, but once it is more featureful, we can talk about that as well!
It should be ok to merge now. I've disabled publishing the new cranelift-fuzzgen crate.
Thanks to everyone for reviewing and discussing this! 🎉
Hey!
Here's an initial version of the cranelift fuzzer.
We generate a random cranelift function based on input bytes from the fuzzer. Currently we can generate 7 instructions, but it is already enough to get us into trouble.
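For orientation, a hedged sketch of roughly what the fuzz target's shape looks like; `TestCase` and `run_test_case` are placeholders standing in for whatever the crate actually defines, and only the `libfuzzer_sys` / `arbitrary` usage is real API:

```rust
use arbitrary::{Arbitrary, Unstructured};
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    // Every random choice (types, instructions, operands) is driven by the
    // input bytes, bounded by the generator's Config limits.
    let mut u = Unstructured::new(data);
    let testcase = match TestCase::arbitrary(&mut u) {
        Ok(tc) => tc,
        // Not enough input bytes to finish generating: just discard the case.
        Err(_) => return,
    };
    // Run the generated function in the interpreter and as compiled host code,
    // then compare the results: the differential part of the fuzzer.
    run_test_case(&testcase);
});
```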
There is also a gen_testcase util that grabs fuzzer artifacts and converts them into clif files for easier testing.
In order for the fuzzer to be useful in testing DivByZero traps, we need to disable overflow checks; otherwise, the interpreter crashes immediately with overflows (we also need to fix this).
I've based my design a lot on @fitzgen's blog post about wasm-smith. It's what made me start thinking about building this for cranelift in the first place. Thanks!
Organization
There is a separate cargo-fuzz package for cranelift; should I merge the fuzzer with wasmtime's fuzzers?
Design questions:
How do we deal with arbitrary memory accesses?
How do we deal with traps?
How do we deal with infinite loops?
Is there any other bad behavior that we need to handle specially?
Should we distinguish between pointer types and i64s in the generator? We can just use any i64 and the program will crash until it does a valid thing. This might not be great in terms of fuzzing efficiency, but it might generate more interesting programs.
CC: @cfallin, @andrewbrown, @fitzgen, @bjorn3 from the earlier fuzzer discussion