-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove unnecessary, too strict assertion. Fix for 3161. #3209
Conversation
Assertion was unnecessary and incorrect. It assumed input types had to align with a very specific Wasm instruction which is not the right thing to do. |
fix for #3161 |
Also seems to fix #3160 |
@@ -4579,12 +4579,6 @@ fn lower_insn_to_regs<C: LowerCtx<I = Inst>>( | |||
}; | |||
let src = put_input_in_reg(ctx, uwiden_input); | |||
let dst = get_output_reg(ctx, outputs[0]).only_reg().unwrap(); | |||
let input_ty = ctx.input_ty(uwiden, 0); | |||
let output_ty = ctx.output_ty(insn, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like, without this check, all of the lowering below is fully independent of the input and output types -- is that right? If so, I'm not sure I fully understand how the lowering remains correct: shouldn't it be different if, e.g., we want to do a widen-and-convert from I16X8
to F64X2
, vs. I32X4
to F64X2
, or I16X8
to F32X4
, or any other combination?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cfallin .. Hi thanks. Yes, the lowering and assertion was there because the lowering was for a very specific instruction F64x2ConvertLowI32x4U, that expects input of I32. The input can't be I16X8. So I see, I think the change is needed at the instruction definition? Or better yet, just put this code in a path that targets I32 input and marks anything else as unimplemented? What does the fuzzer use to generate it's valid input?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fuzzer generates arbitrary CLIF; generally, if the CLIF validator allows an instruction with a particular type, then we should support lowering it. (In other words, the failure mode should always be "fails to validate" rather than "assertion triggered".) If we have a lowering that only supports particular types, then we can add those constraints to the instruction, yes -- that would bring things into correspondence.
This is a bit of a wider net than Wasm-based fuzzing -- I definitely recognize that we've left several corners unfilled in several lowerings because "the Wasm translator would never do that". I guess that's sort of the point now -- we're fuzzing at the CLIF level to find these corners and ensure completeness :-)
And, since we're talking about the fcvt_from_uint
instruction generally, it seems to me that we don't really want to limit it; e.g. the aarch64 implementation here supports all vector types.
So my question is: how much work is it to actually build the other cases out? Is there a simple non-optimized lowering we could use for the other types? Or is it as substantial as this whole 50-line sequence for each case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And, since we're talking about the fcvt_from_uint instruction generally, it seems to me that we don't really want to limit it; e.g. the aarch64 implementation here supports all vector types.
So my question is: how much work is it to actually build the other cases out? Is there a simple non-optimized lowering we could use for the other types? Or is it as substantial as this whole 50-line sequence for each case?
@cfallin .. Yes, unfortunately there is no simple lowering to convert I8 or even I16 .. even with AVX-512. It will take finding some accepted sequence. I can look deeper into this, but in general doing thing with packed in particular bytes is never so straight forward. So ... that is a problem then, right? If we constrain the CLIF to only allow I32 because X64 is just handling the lowering it expects for F64x2ConvertLowI32x4U, while the aarch64 implementation these other type cases those other cases would never get tested by the fuzzer. This is the type of thing that I need to document. Any thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, OK -- thank you for looking into this, and sorry that it's a bit more than we expected! I do think that filling this out to support the other type cases is the right thing to do; the alternative is to restrict the instruction's acceptable types, but then that is a weird incongruity in the CLIF instruction set (most other SIMD instructions are fully polymorphic over vector types). And the general direction of "restrict CLIF to only exactly what Wasm needs" feels wrong too, because it's already (much) more general: e.g. it supports 8/16 bit types and booleans, which aren't used in Wasm at all. If this becomes a huge burden, of course, then we can always reconsider how broadly we define the CLIF ops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, I don't think that we have anything generating arbitrary clif right now and running it through the backends. #3161 was a fuzz-bug generated by generating wasm and feeding it through Cranelift, so this is an assertion being tripped from wasm, not just arbitrary CLIF that a compiler could hypothetically generate.
Along those lines I'm not sure if this is the right fix if this only works for the pair of types that were previously listed here? The conversion here that's tripping the assertion is types::I16X8 => types::F32X4
from adding some debug prints, so if this code only handles the 32x4 => 64x2 case then is it incorrect that the 16x8 => 32x4 case is getting here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, my apologies, for some reason I had the understanding that the fuzzbug was from the CLIF fuzzer, but you're right, it's from a Wasm translation. Everything said above re: covering CLIF semantics generally is still true, but this is even more short-term important if reachable from Wasm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks all. Then sounds like the solution is to create separate paths, failing as unimplemented/todo's in the path for types i8 and i16. I will make this change if agreed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I agree with @cfallin that the backends should implement all of clif, and clif should be changed if we otherwise don't want some form of instructions to make it to the backend. I would also agree, though, that fixing the wasm input is most important relative to otherwise ensuring all of clif is handled when there are no actual producers of many clif constructs today.
Given that all simd instructions have already been implemented for wasm in clif I'd probably say the fix here would be to ensure that everything "stays on the rails" where possible in that each individual wasm instruction is implemented but it appears combinations of them are hitting different blocks/panics/etc. Ensuring that each instruction hits its own individual instruction would probably be best, and then future improvements where more cases are handled in more places makes sense to fill out for more clif semantics as well as optimizing wasm.
0x00, 0x00, 0x00, | ||
]; | ||
} else if output_ty == types::F64X2 { | ||
if let Some(uwiden) = matches_input(ctx, inputs[0], Opcode::UwidenLow) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This and the line above can be written together with a let_chain feature:
#![feature(let_chains)]
to the crate attributes to enable`
but the feature is experimental.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that's a nice feature!
Thanks @alexcrichton. This really just needed refactoring to avoid entering prematurely the branch that implements F64x2ConvertLowI32x4U. Above is a fix .. hopefully nothing is amiss. |
I'll defer to @cfallin's review of the code here, I'm not super familiar with the x64 backend myself (although I do think it would be good to add some tests) |
i8x16.extract_lane_s 10)) | ||
|
||
(assert_return (invoke "f32x4.convert_i32x4_u" (v128.const i32x4 0x00000000 0x00000000 0x00000000 0x00000000)) | ||
(v128.const f32x4 0 0 0 0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to remove the extract lane part above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I also added more tests. Thanks.
Assertion was intended for SIMD lowering of F64x2ConvertLowI32x4U
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me now -- thanks for bearing with us through all the discussion! As long as we're covering what the Wasm translator generates, this is good incremental progress; ideally of course it'd be nice to cover all cases (and remove the assert on the uwiden
's input type) but we can save that for another day, maybe once patterns are easier to write.
0x00, 0x00, 0x00, | ||
]; | ||
} else if output_ty == types::F64X2 { | ||
if let Some(uwiden) = matches_input(ctx, inputs[0], Opcode::UwidenLow) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that's a nice feature!
@cfallin No, thank you. Simply needed to refactor after interpreting correctly what the bug was saying. Highlights the need for fuzz testing when lowering becomes branchy. |
Assertion was intended for SIMD lowering of F64x2ConvertLowI32x4U