Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate types are not stale, and fix many issues there #1011

Closed
wants to merge 50 commits into from

Conversation

kripken
Copy link
Member

@kripken kripken commented May 13, 2017

afl-fuzz hinted in this direction, but this is much bigger than a fuzz bug fix. I think the full story is this:

  • We had our old type system before block types changed in wasm.
  • Wasm changed block types and we decided to change with them, in particular, changed what unreachable meant in blocks, that they are not actually exited.
  • Recently, fuzzing found we didn't actually do that fully - there were blocks that should have been unreachable, but not marked as such. Those were fixed.
  • Those fixes opened up new bugs that showed we didn't do the same for most other nodes, like e.g. drop of an unreachable. Those were also fixed.
  • At this point things seemed stable, but I realized those fuzz bugs would have been found earlier if we checked during validation whether finalize() on the node changes its type. It can do that in only one legal way, turning a concrete value into unreachable, e.g. finalizing (block i32 (unreachable)) will remove the extra i32 annotation, and set the more natural unreachable type. But aside from that, if finalize would have changed the type, we have a bug, generally that the type is 'stale', it hasn't been recomputed. So I added that to validation.
  • This found a large number of bugs, fixed in this PR. Interestingly, meanwhile afl-fuzz, still working on the old binary, found a few more testcases, all covered by the fixes in this PR.

A sad thing in all of this is that we need to do more work. We have passes that may turn a node's type to unreachable, and logically that means we need to update the types of all its parents. That we didn't notice this before is due to a combination of not having the new extra validation, and that optimizations tend to work around it, in particular dce. I don't think we can avoid this, unless we get rid of the unreachable type.

@kripken
Copy link
Member Author

kripken commented May 13, 2017

This looks stable now. Sadly it is 6% slower than master (tested on poppler).

The core issue is that full and proper handling of unreachable means that any time you remove a branch or alter a type to unreachable, you must recompute types for the entire potentially-affected tree underneath it. Perhaps we should reconsider our options here, but this PR should probably land since it fixes a large amount of fuzz bugs + adds a lot more testing to prevent more.

@kripken
Copy link
Member Author

kripken commented May 15, 2017

This is getting out of hand. May need to rethink the entire approach here - it is going to be very hard to actually fix all these issues.

@kripken kripken closed this May 15, 2017
@jgravelle-google
Copy link
Contributor

Hm. Sounds like we need to re-rethink what unreachable means in the type system. #903

My gut feeling is that the complexity stems from:

  1. us overloading unreachable to mean both "this node has unreachable type", and "this node was marked unreachable as part of an IR pass and we need to fix it up when we want to output wasm"
  2. unreachability can propagate both up and down the layers of the AST

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants