Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unnecessary, too strict assertion. Fix for 3161. #3209

Merged
merged 1 commit into from
Aug 26, 2021

Conversation

jlb6740
Copy link
Contributor

@jlb6740 jlb6740 commented Aug 18, 2021

Assertion was intended for SIMD lowering of F64x2ConvertLowI32x4U

@jlb6740
Copy link
Contributor Author

jlb6740 commented Aug 18, 2021

Assertion was unnecessary and incorrect. It assumed input types had to align with a very specific Wasm instruction which is not the right thing to do.

@jlb6740 jlb6740 requested review from abrown and alexcrichton August 18, 2021 22:17
@jlb6740
Copy link
Contributor Author

jlb6740 commented Aug 18, 2021

fix for #3161

@jlb6740
Copy link
Contributor Author

jlb6740 commented Aug 18, 2021

Also seems to fix #3160

@@ -4579,12 +4579,6 @@ fn lower_insn_to_regs<C: LowerCtx<I = Inst>>(
};
let src = put_input_in_reg(ctx, uwiden_input);
let dst = get_output_reg(ctx, outputs[0]).only_reg().unwrap();
let input_ty = ctx.input_ty(uwiden, 0);
let output_ty = ctx.output_ty(insn, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like, without this check, all of the lowering below is fully independent of the input and output types -- is that right? If so, I'm not sure I fully understand how the lowering remains correct: shouldn't it be different if, e.g., we want to do a widen-and-convert from I16X8 to F64X2, vs. I32X4 to F64X2, or I16X8 to F32X4, or any other combination?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cfallin .. Hi thanks. Yes, the lowering and assertion was there because the lowering was for a very specific instruction F64x2ConvertLowI32x4U, that expects input of I32. The input can't be I16X8. So I see, I think the change is needed at the instruction definition? Or better yet, just put this code in a path that targets I32 input and marks anything else as unimplemented? What does the fuzzer use to generate it's valid input?

Copy link
Member

@cfallin cfallin Aug 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fuzzer generates arbitrary CLIF; generally, if the CLIF validator allows an instruction with a particular type, then we should support lowering it. (In other words, the failure mode should always be "fails to validate" rather than "assertion triggered".) If we have a lowering that only supports particular types, then we can add those constraints to the instruction, yes -- that would bring things into correspondence.

This is a bit of a wider net than Wasm-based fuzzing -- I definitely recognize that we've left several corners unfilled in several lowerings because "the Wasm translator would never do that". I guess that's sort of the point now -- we're fuzzing at the CLIF level to find these corners and ensure completeness :-)

And, since we're talking about the fcvt_from_uint instruction generally, it seems to me that we don't really want to limit it; e.g. the aarch64 implementation here supports all vector types.

So my question is: how much work is it to actually build the other cases out? Is there a simple non-optimized lowering we could use for the other types? Or is it as substantial as this whole 50-line sequence for each case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, since we're talking about the fcvt_from_uint instruction generally, it seems to me that we don't really want to limit it; e.g. the aarch64 implementation here supports all vector types.

So my question is: how much work is it to actually build the other cases out? Is there a simple non-optimized lowering we could use for the other types? Or is it as substantial as this whole 50-line sequence for each case?

@cfallin .. Yes, unfortunately there is no simple lowering to convert I8 or even I16 .. even with AVX-512. It will take finding some accepted sequence. I can look deeper into this, but in general doing thing with packed in particular bytes is never so straight forward. So ... that is a problem then, right? If we constrain the CLIF to only allow I32 because X64 is just handling the lowering it expects for F64x2ConvertLowI32x4U, while the aarch64 implementation these other type cases those other cases would never get tested by the fuzzer. This is the type of thing that I need to document. Any thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, OK -- thank you for looking into this, and sorry that it's a bit more than we expected! I do think that filling this out to support the other type cases is the right thing to do; the alternative is to restrict the instruction's acceptable types, but then that is a weird incongruity in the CLIF instruction set (most other SIMD instructions are fully polymorphic over vector types). And the general direction of "restrict CLIF to only exactly what Wasm needs" feels wrong too, because it's already (much) more general: e.g. it supports 8/16 bit types and booleans, which aren't used in Wasm at all. If this becomes a huge burden, of course, then we can always reconsider how broadly we define the CLIF ops.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, I don't think that we have anything generating arbitrary clif right now and running it through the backends. #3161 was a fuzz-bug generated by generating wasm and feeding it through Cranelift, so this is an assertion being tripped from wasm, not just arbitrary CLIF that a compiler could hypothetically generate.

Along those lines I'm not sure if this is the right fix if this only works for the pair of types that were previously listed here? The conversion here that's tripping the assertion is types::I16X8 => types::F32X4 from adding some debug prints, so if this code only handles the 32x4 => 64x2 case then is it incorrect that the 16x8 => 32x4 case is getting here as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, my apologies, for some reason I had the understanding that the fuzzbug was from the CLIF fuzzer, but you're right, it's from a Wasm translation. Everything said above re: covering CLIF semantics generally is still true, but this is even more short-term important if reachable from Wasm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thanks all. Then sounds like the solution is to create separate paths, failing as unimplemented/todo's in the path for types i8 and i16. I will make this change if agreed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I agree with @cfallin that the backends should implement all of clif, and clif should be changed if we otherwise don't want some form of instructions to make it to the backend. I would also agree, though, that fixing the wasm input is most important relative to otherwise ensuring all of clif is handled when there are no actual producers of many clif constructs today.

Given that all simd instructions have already been implemented for wasm in clif I'd probably say the fix here would be to ensure that everything "stays on the rails" where possible in that each individual wasm instruction is implemented but it appears combinations of them are hitting different blocks/panics/etc. Ensuring that each instruction hits its own individual instruction would probably be best, and then future improvements where more cases are handled in more places makes sense to fill out for more clif semantics as well as optimizing wasm.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen labels Aug 18, 2021
0x00, 0x00, 0x00,
];
} else if output_ty == types::F64X2 {
if let Some(uwiden) = matches_input(ctx, inputs[0], Opcode::UwidenLow) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the line above can be written together with a let_chain feature:

#![feature(let_chains)] to the crate attributes to enable`

but the feature is experimental.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's a nice feature!

@jlb6740
Copy link
Contributor Author

jlb6740 commented Aug 24, 2021

Thanks @alexcrichton. This really just needed refactoring to avoid entering prematurely the branch that implements F64x2ConvertLowI32x4U. Above is a fix .. hopefully nothing is amiss.

@jlb6740 jlb6740 requested a review from cfallin August 24, 2021 05:26
@alexcrichton
Copy link
Member

I'll defer to @cfallin's review of the code here, I'm not super familiar with the x64 backend myself

(although I do think it would be good to add some tests)

i8x16.extract_lane_s 10))

(assert_return (invoke "f32x4.convert_i32x4_u" (v128.const i32x4 0x00000000 0x00000000 0x00000000 0x00000000))
(v128.const f32x4 0 0 0 0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to remove the extract lane part above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also added more tests. Thanks.

Assertion was intended for SIMD lowering of F64x2ConvertLowI32x4U
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me now -- thanks for bearing with us through all the discussion! As long as we're covering what the Wasm translator generates, this is good incremental progress; ideally of course it'd be nice to cover all cases (and remove the assert on the uwiden's input type) but we can save that for another day, maybe once patterns are easier to write.

0x00, 0x00, 0x00,
];
} else if output_ty == types::F64X2 {
if let Some(uwiden) = matches_input(ctx, inputs[0], Opcode::UwidenLow) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's a nice feature!

@cfallin cfallin merged commit 0771abf into bytecodealliance:main Aug 26, 2021
@jlb6740
Copy link
Contributor Author

jlb6740 commented Aug 26, 2021

This looks good to me now -- thanks for bearing with us through all the discussion!

@cfallin No, thank you. Simply needed to refactor after interpreting correctly what the bug was saying. Highlights the need for fuzz testing when lowering becomes branchy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants