-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial ISLE integration for x64 #3506
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks cool! Some of my comments below are subjective; feel free to decline my suggestions :-)
913567f
to
f712b80
Compare
For some user-feedback information, my understanding was that one of the major benefits of ISLE is to make lowerings easier and more understandable. One of the knock-on-effects of this I was curious to poke at today was to see if this applies to any of our fuzz bugs. I know very little about the existing backends, so I was curious if someone like me could jump in and "just fix the issue" by "simply translate to ISLE". To that end I chose #3327 since it only related to 2 instructions. I had no idea what the bug was before I started. The tl;dr; is that translating to ISLE did indeed "simply fix the bug", but not necessarily in the most obvious way you might expect. The precise bug is that the Overall I found it surprisingly easy to get up-and-running with ISLE. I made one mistake early on which resulted in a bug in the ISLE compiler but other than that I found it quite easy to envision how to translate the handwritten lowerings to ISLE. I had a lot of complexity to boot-up on related to x86_64 and how SSE works but none of that was related to ISLE, and overall the ISLE I produced in the commit felt clear, easy to read, and easy to understand. I hope this adds at least a concrete data point of someone who knows very little about the backends was able to pretty easily fix an outstanding codegen bug simply by translating things to ISLE in the long run. I personally felt that the idioms in the |
@alexcrichton Thanks for giving the new framework some real-world testing with that bug! I'm very happy to hear that the secondhand effects of making small helpers and factoring easier have actually worked out here; that was indeed the goal of having a concise language, where we are not distracted by a lot of repetitive boilerplate. I see in your commit that the fix involved expanding out the glue (support for another instruction format) as well as the actual helpers and patterns; for anyone else looking at this diff, I'll note that an eventual goal is to auto-generate more of that glue; and once most of the backend is in ISLE, we'll have a large library of helpers for all of the instructions and other common idioms, so adding specific lowering patterns should look mostly like just the ~10 lines of code added to Really excited to see this coming together! |
…isters There were a few previous code paths that attempted to handle this, but this new check handles it for all callers. Rematerializing constants, rather than assigning and reusing a register, allows for lower register pressure.
I did some benchmarking overnight comparing my Using Benchmark Results
TLDR:
This provides strong evidence that ISLE is a small win for compile times--certainly not a regression--and that there is also no regression in code quality and execution times. |
On the build side, this commit introduces two things: 1. The automatic generation of various ISLE definitions for working with CLIF. Specifically, it generates extern type definitions for clif opcodes and the clif instruction data `enum`, as well as extractors for matching each clif instructions. This happens inside the `cranelift-codegen-meta` crate. 2. The compilation of ISLE DSL sources to Rust code, that can be included in the main `cranelift-codegen` compilation. Next, this commit introduces the integration glue code required to get ISLE-generated Rust code hooked up in clif-to-x64 lowering. When lowering a clif instruction, we first try to use the ISLE code path. If it succeeds, then we are done lowering this instruction. If it fails, then we proceed along the existing hand-written code path for lowering. Finally, this commit ports many lowering rules over from hand-written, open-coded Rust to ISLE. In the process of supporting ISLE, this commit also makes the x64 `Inst` capable of expressing SSA by supporting 3-operand forms for all of the existing instructions that only have a 2-operand form encoding: dst = src1 op src2 Rather than only the typical x86-64 2-operand form: dst = dst op src This allows `MachInst` to be in SSA form, since `dst` and `src1` are disentangled. ("3-operand" and "2-operand" are a little bit of a misnomer since not all operations are binary operations, but we do the same thing for, e.g., unary operations by disentangling the sole operand from the result.) There are two motivations for this change: 1. To allow ISLE lowering code to have value-equivalence semantics. We want ISLE lowering to translate a CLIF expression that evaluates to some value into a `MachInst` expression that evaluates to the same value. We want both the lowering itself and the resulting `MachInst` to be pure and referentially transparent. This is both a nice paradigm for compiler writers that are authoring and maintaining lowering rules and is a prerequisite to any sort of formal verification of our lowering rules in the future. 2. Better align `MachInst` with `regalloc2`'s API, which requires that the input be in SSA form.
@cfallin very helpfully did some git history juggling and now we have the full history for On that note, over the last few weeks I've been familiarizing myself with the ISLE compiler and the code is very high quality. For the few things it wasn't handling super well (like when we keep type checking to find more errors after the first error has been found, but then run into missing state that "should" be filled in for error-free programs but wasn't because of the first error, and then we panic) I've made commits to address them and a fuzz target to exercise those code paths. So consider this my r+ stamp of code review approval for the ISLE compiler code that @cfallin wrote. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fantastic! I'm very excited to have a clean way of generating instruction lowerings; looking again at e.g. the wide-multiply or shift code, the concise sequence of ops is just so much better than the handwritten ctx.emit(Machinst::Alu... { ... })
and the fiddly matching/destructuring.
I also want to say that the ergonomics of the helpers you've defined are really nice -- (has_type (fits_in_64 ...))
just reads really cleanly. Well done!
All of my comments below are little bikeshed bits / nitpicks; hopefully they're fast to address.
Also, it would probably be good to go through the discussion so far and record GitHub issues for any "we can fix that after this PR" sort of answers; deferring v2 stuff is perfectly valid, just want to make sure we capture it.
(I've been looking at this code at various points over the past month, so for anyone concerned about an insta-fast review following RFC merge, no worries; just did a final pass now!)
Anyway, thanks again!
Currently, the `build.rs` script that generates Rust source from the ISLE DSL will only do this generation if the `rebuild-isle` Cargo feature is specified. By default, it is not. This is based on the principle that we (the build script) do not modify the source tree as managed by git; git-managed files are strictly a human-managed and human-edited resource. By adding the opt-in Cargo feature, a developer is requesting the build script to perform an explicit action. (In my understanding at least, this principle comes from the general philosophy of hermetic builds: the output should be a pure function of the input, and part of this is that the input is read-only. If we modify the source tree, then all bets are off.) Unfortunately, requiring the opt-in feature also creates a footgun that is easy to hit: if a developer modifies the ISLE DSL source, but forgets to specify the Cargo feature, then the compiler will silently be built successfully with stale source, and will silently exclude any changes that were made. The generated source is checked into git for a good reason: we want DSL compiler to not affect build times for the overwhelmingly common case that Cranelift is used as a dependency but the backends are not being actively developed. (This overhead comes mainly from building `islec` itself.) So, what to do? This PR implements a middle ground first described in [this conversation](bytecodealliance#3506 (comment)), in which we: - Generate a hash (SHA-512) of the ISLE DSL source and produce a "manifest" of ISLE inputs alongside the generated source; and - Always read the ISLE DSL source, and see if the manifest is still valid, on builds that do not have the opt-in "rebuild" feature. This allows us to know whether the ISLE compiler output would have been the same (modulo changes to the DSL compiler itself, which are out-of-scope here), without actually building the ISLE compiler and running it. If the compiler-backend developer modifies an ISLE source file and then tries to build `cranelift-codegen` without adding the `rebuild-isle` Cargo feature, they get the following output: ```text Error: the ISLE source files that resulted in the generated Rust source * src/isa/x64/lower/isle/generated_code.rs have changed but the generated source was not rebuilt! These ISLE source files are: * src/clif.isle * src/prelude.isle * src/isa/x64/inst.isle * src/isa/x64/lower.isle Please add `--features rebuild-isle` to your `cargo build` command if you wish to rebuild the generated source, then include these changes in any git commits you make that include the changes to the ISLE. For example: $ cargo build -p cranelift-codegen --features rebuild-isle (This build script cannot do this for you by default because we cannot modify checked-into-git source without your explicit opt-in.) ``` which will tell them exactly what they need to do to fix the problem! Note that I am leaving the "Rebuild ISLE" CI job alone for now, because otherwise, we are trusting whomever submits a PR to generate the correct generated source.
Currently, the `build.rs` script that generates Rust source from the ISLE DSL will only do this generation if the `rebuild-isle` Cargo feature is specified. By default, it is not. This is based on the principle that we (the build script) do not modify the source tree as managed by git; git-managed files are strictly a human-managed and human-edited resource. By adding the opt-in Cargo feature, a developer is requesting the build script to perform an explicit action. (In my understanding at least, this principle comes from the general philosophy of hermetic builds: the output should be a pure function of the input, and part of this is that the input is read-only. If we modify the source tree, then all bets are off.) Unfortunately, requiring the opt-in feature also creates a footgun that is easy to hit: if a developer modifies the ISLE DSL source, but forgets to specify the Cargo feature, then the compiler will silently be built successfully with stale source, and will silently exclude any changes that were made. The generated source is checked into git for a good reason: we want DSL compiler to not affect build times for the overwhelmingly common case that Cranelift is used as a dependency but the backends are not being actively developed. (This overhead comes mainly from building `islec` itself.) So, what to do? This PR implements a middle ground first described in [this conversation](bytecodealliance#3506 (comment)), in which we: - Generate a hash (SHA-512) of the ISLE DSL source and produce a "manifest" of ISLE inputs alongside the generated source; and - Always read the ISLE DSL source, and see if the manifest is still valid, on builds that do not have the opt-in "rebuild" feature. This allows us to know whether the ISLE compiler output would have been the same (modulo changes to the DSL compiler itself, which are out-of-scope here), without actually building the ISLE compiler and running it. If the compiler-backend developer modifies an ISLE source file and then tries to build `cranelift-codegen` without adding the `rebuild-isle` Cargo feature, they get the following output: ```text Error: the ISLE source files that resulted in the generated Rust source * src/isa/x64/lower/isle/generated_code.rs have changed but the generated source was not rebuilt! These ISLE source files are: * src/clif.isle * src/prelude.isle * src/isa/x64/inst.isle * src/isa/x64/lower.isle Please add `--features rebuild-isle` to your `cargo build` command if you wish to rebuild the generated source, then include these changes in any git commits you make that include the changes to the ISLE. For example: $ cargo build -p cranelift-codegen --features rebuild-isle (This build script cannot do this for you by default because we cannot modify checked-into-git source without your explicit opt-in.) ``` which will tell them exactly what they need to do to fix the problem! Note that I am leaving the "Rebuild ISLE" CI job alone for now, because otherwise, we are trusting whomever submits a PR to generate the correct generated source. In other words, the manifest is a communication from the checked-in tree to the developer, but we still need to verify that the checked-in generated Rust source and the manifest are correct with respect to the checked-in ISLE source.
Currently, the `build.rs` script that generates Rust source from the ISLE DSL will only do this generation if the `rebuild-isle` Cargo feature is specified. By default, it is not. This is based on the principle that we (the build script) do not modify the source tree as managed by git; git-managed files are strictly a human-managed and human-edited resource. By adding the opt-in Cargo feature, a developer is requesting the build script to perform an explicit action. (In my understanding at least, this principle comes from the general philosophy of hermetic builds: the output should be a pure function of the input, and part of this is that the input is read-only. If we modify the source tree, then all bets are off.) Unfortunately, requiring the opt-in feature also creates a footgun that is easy to hit: if a developer modifies the ISLE DSL source, but forgets to specify the Cargo feature, then the compiler will silently be built successfully with stale source, and will silently exclude any changes that were made. The generated source is checked into git for a good reason: we want DSL compiler to not affect build times for the overwhelmingly common case that Cranelift is used as a dependency but the backends are not being actively developed. (This overhead comes mainly from building `islec` itself.) So, what to do? This PR implements a middle ground first described in [this conversation](bytecodealliance#3506 (comment)), in which we: - Generate a hash (SHA-512) of the ISLE DSL source and produce a "manifest" of ISLE inputs alongside the generated source; and - Always read the ISLE DSL source, and see if the manifest is still valid, on builds that do not have the opt-in "rebuild" feature. This allows us to know whether the ISLE compiler output would have been the same (modulo changes to the DSL compiler itself, which are out-of-scope here), without actually building the ISLE compiler and running it. If the compiler-backend developer modifies an ISLE source file and then tries to build `cranelift-codegen` without adding the `rebuild-isle` Cargo feature, they get the following output: ```text Error: the ISLE source files that resulted in the generated Rust source * src/isa/x64/lower/isle/generated_code.rs have changed but the generated source was not rebuilt! These ISLE source files are: * src/clif.isle * src/prelude.isle * src/isa/x64/inst.isle * src/isa/x64/lower.isle Please add `--features rebuild-isle` to your `cargo build` command if you wish to rebuild the generated source, then include these changes in any git commits you make that include the changes to the ISLE. For example: $ cargo build -p cranelift-codegen --features rebuild-isle (This build script cannot do this for you by default because we cannot modify checked-into-git source without your explicit opt-in.) ``` which will tell them exactly what they need to do to fix the problem! Note that I am leaving the "Rebuild ISLE" CI job alone for now, because otherwise, we are trusting whomever submits a PR to generate the correct generated source. In other words, the manifest is a communication from the checked-in tree to the developer, but we still need to verify that the checked-in generated Rust source and the manifest are correct with respect to the checked-in ISLE source.
Currently, the `build.rs` script that generates Rust source from the ISLE DSL will only do this generation if the `rebuild-isle` Cargo feature is specified. By default, it is not. This is based on the principle that we (the build script) do not modify the source tree as managed by git; git-managed files are strictly a human-managed and human-edited resource. By adding the opt-in Cargo feature, a developer is requesting the build script to perform an explicit action. (In my understanding at least, this principle comes from the general philosophy of hermetic builds: the output should be a pure function of the input, and part of this is that the input is read-only. If we modify the source tree, then all bets are off.) Unfortunately, requiring the opt-in feature also creates a footgun that is easy to hit: if a developer modifies the ISLE DSL source, but forgets to specify the Cargo feature, then the compiler will silently be built successfully with stale source, and will silently exclude any changes that were made. The generated source is checked into git for a good reason: we want DSL compiler to not affect build times for the overwhelmingly common case that Cranelift is used as a dependency but the backends are not being actively developed. (This overhead comes mainly from building `islec` itself.) So, what to do? This PR implements a middle ground first described in [this conversation](bytecodealliance#3506 (comment)), in which we: - Generate a hash (SHA-512) of the ISLE DSL source and produce a "manifest" of ISLE inputs alongside the generated source; and - Always read the ISLE DSL source, and see if the manifest is still valid, on builds that do not have the opt-in "rebuild" feature. This allows us to know whether the ISLE compiler output would have been the same (modulo changes to the DSL compiler itself, which are out-of-scope here), without actually building the ISLE compiler and running it. If the compiler-backend developer modifies an ISLE source file and then tries to build `cranelift-codegen` without adding the `rebuild-isle` Cargo feature, they get the following output: ```text Error: the ISLE source files that resulted in the generated Rust source * src/isa/x64/lower/isle/generated_code.rs have changed but the generated source was not rebuilt! These ISLE source files are: * src/clif.isle * src/prelude.isle * src/isa/x64/inst.isle * src/isa/x64/lower.isle Please add `--features rebuild-isle` to your `cargo build` command if you wish to rebuild the generated source, then include these changes in any git commits you make that include the changes to the ISLE. For example: $ cargo build -p cranelift-codegen --features rebuild-isle (This build script cannot do this for you by default because we cannot modify checked-into-git source without your explicit opt-in.) ``` which will tell them exactly what they need to do to fix the problem! Note that I am leaving the "Rebuild ISLE" CI job alone for now, because otherwise, we are trusting whomever submits a PR to generate the correct generated source. In other words, the manifest is a communication from the checked-in tree to the developer, but we still need to verify that the checked-in generated Rust source and the manifest are correct with respect to the checked-in ISLE source.
This commit adds a test from bytecodealliance#3337 which is an issue that was fixed in bytecodealliance#3506 due to moving `imul` lowering rules to ISLE which fixed the underlying issue of accidentally not falling through to the necessary case for general `i64x2.mul` multiplication. Closes bytecodealliance#3337
This pull request introduces initial ISLE integration for the x64 backend. It is blocked on bytecodealliance/rfcs#15 finishing its final comment period and being accepted.
I'm opening this up early since (a) I have a bunch of working code here that already ports a significant amount of the hand-written clif-to-MachInst lowering for x64, and (b) so that folks who are interested in how ISLE will work can get a chance to dig into more details and actual code rather than just slides I've presented at Cranelift meetings.
I'll keep pushing to this branch/PR as I continue porting more hand-written lowering code to ISLE.
Feel free to comment and give feedback or ask questions!