-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
riscv64: Improve icmp codegen #7113
Conversation
This makes it a bit more precise than `Reg`.
I've been curious to poke around more at riscv64 and it looks like the handling of branches/comparisons can be improved in our backend, so this is the first of what may be a number of commits to improve the situation here. This commit specifically targets the `icmp` Cranelift instruction when producing a 0 or 1 value as a result. This is unlikely to be used all that much in normal programs since most of the time a comparison is fed into a branch for example. Nevertheless I was hoping to start simple and work my way out towards branches eventually. My hope is that by improving this codegen this can be extracted to helpers later on to assist branches and various other lowerings. One part that this commit removes is various sign-extensions around `icmp` because, at least according to RISC-V's ABI, values are always sign extended when sitting at rest in their registers. I'm not sure if Cranelift respects this everywhere just yet, but it seems like a good rule of thumb to follow and if it causes issues it may be best to track down other lowering rules to fix the problems. Additionally this does not update `icmp` for 128-bit integers just yet. This is only the comparisons necessary for register-size values or smaller.
Ok looks like this won't work for at least one reason which is that @afonso360 do you know if this is a known issue with the risc-v backend? I'd be happy to help fix this up, but I'm not sure how deep this necessarily goes. |
The approach we were using in the past is to use |
Subscribe to Label Action
This issue or pull request has been labeled: "cranelift", "isle"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
Ok so I've looked around a bit, my conclusion is I don't know enough to be making changes here yet. I'm going to go back to the drawing board on some of this and see what I can figure out. |
This is something that we really need! Currently we emit at least 4 instructions when we could do just one! Thanks for looking into this. Also this just reminded me that I had a icmp PR that I had forgot about. Feel free to pick out anything you need from there!
We actually don't follow this at all. We follow the general cranelift convention that upper bits are undefined and we clear / sign extend them in the ops that need it. I looked up the ABI document and I could only find that sign extend requirement with regards to the calling convention, but as far as I know that is handled somewhere else in cranelift right? |
Oh dear apologies for missing that! I'll probably restart from there. I also need to investigate what's up with the ABI business and all that, but ok makes sense that Cranelift considers the upper bits undefined state. I know the ABI bits should do sign-extension as necessary, so I'll try to hunt that down and confirm and make sure it's all working. |
This commit is the first in a few steps to reimplement bytecodealliance#6112 and bytecodealliance#7113. The `Icmp` pseudo-instruction is removed here and necessary functionality is all pushed into ISLE lowerings. This enables deleting the `lower_br_icmp` helper in `emit.rs` as it's no longer necessary, meaning that all conditional branches should ideally be generated in lowering rather than pseudo-instructions to benefit from various optimizations. Currently the lowering is the bare-bones minimum to get things working. This involved creating a helper to lower an `IntegerCompare` into an `XReg` via negation/swapping args/etc. In generated code this removes branches and makes `icmp` a straight-line instruction for non-128-bit arguments.
* riscv64: Constants are always sign-extended Skip generating extra sign-extension instructions in this case because the materialization of a constant will implicitly sign-extend into the entire register. * riscv64: Rename `lower_int_compare` helper Try to reserve `lower_*` as taking a thing and producing a `*Reg`. Rename this helper to `is_nonzero_cmp` to represent how it's testing for nonzero and producing a comparison. * riscv64: Rename some float comparison helpers * `FCmp` -> `FloatCompare` * `emit_fcmp` -> `fcmp_to_float_compare` * `lower_fcmp` -> `lower_float_compare` Make some room for upcoming integer comparison functions. * riscv64: Remove `ordered` helper This is only used by one lowering so inline its definition directly. * riscv64: Remove the `Icmp` pseudo-instruction This commit is the first in a few steps to reimplement #6112 and #7113. The `Icmp` pseudo-instruction is removed here and necessary functionality is all pushed into ISLE lowerings. This enables deleting the `lower_br_icmp` helper in `emit.rs` as it's no longer necessary, meaning that all conditional branches should ideally be generated in lowering rather than pseudo-instructions to benefit from various optimizations. Currently the lowering is the bare-bones minimum to get things working. This involved creating a helper to lower an `IntegerCompare` into an `XReg` via negation/swapping args/etc. In generated code this removes branches and makes `icmp` a straight-line instruction for non-128-bit arguments. * riscv64: Remove an unused helper * riscv64: Optimize comparisons with 0 Use the `x0` register which is always zero where possible and avoid unnecessary `xor`s against this register. * riscv64: Specialize `a < $imm` * riscv64: Optimize Equal/NotEqual against constants * riscv64: Optimize LessThan with constant argument * Optimize `a <= $imm` * riscv64: Optimize `a >= $imm` * riscv64: Add comment for new helper * Use i64 in icmp optimizations Matches the sign-extension that happens at the hardware layer. * Correct some sign extensions * riscv64: Don't assume immediates are extended * riscv64: Fix encoding for `c.addi4spn` * riscv64: Remove `icmp` lowerings which modify constants These aren't correct and will need updating * Add regression test * riscv64: Fix handling unsigned comparisons with constants --------- Co-authored-by: Afonso Bordado <afonsobordado@az8.co>
I've been curious to poke around more at riscv64 and it looks like the
handling of branches/comparisons can be improved in our backend, so this
is the first of what may be a number of commits to improve the situation
here. This commit specifically targets the
icmp
Cranelift instructionwhen producing a 0 or 1 value as a result. This is unlikely to be used
all that much in normal programs since most of the time a comparison is
fed into a branch for example.
Nevertheless I was hoping to start simple and work my way out towards
branches eventually. My hope is that by improving this codegen this can
be extracted to helpers later on to assist branches and various other
lowerings.
One part that this commit removes is various sign-extensions around
icmp
because, at least according to RISC-V's ABI, values are alwayssign extended when sitting at rest in their registers. I'm not sure if
Cranelift respects this everywhere just yet, but it seems like a good
rule of thumb to follow and if it causes issues it may be best to track
down other lowering rules to fix the problems.
Additionally this does not update
icmp
for 128-bit integers just yet.This is only the comparisons necessary for register-size values or
smaller.