Pre-RFC: Instruction Selector DSL for Cranelift. #13

cfallin · 2021-08-05T20:42:06Z

Summary

This pre-RFC aims to describe the case for developing or adopting a DSL to write instruction selection/lowering rules in Cranelift backends, and to introduce discussion points.

The goal is not (yet) to design a concrete DSL and start work. Rather, the goal here is to collect requirements, discuss different design choices and how they might work in our context, and generally to see what the community thinks about this and what folks might prefer.

bjorn3 · 2021-08-05T21:03:47Z

accepted/cranelift-isel-pre-rfc.md

+Now that we have established the need for some sort of DSL, let's
+examine what requirements are imposed by the problem at hand.
+
+## Requirement 1: First-Class Destructuring/Matching


Removing InstructionFormat in favor of a big enum with a single variant for each instruction would already help with this even if there is no DSL by making it easy to match on said enum instead of the opcode, which allows binding the fields.

Possibly, but that's probably an orthogonal change to consider; or rather, it would make more sense to think about concrete quality-of-life refactors like this if we decide not to do a DSL, since whether to do a DSL is the high-order bit for developer experience.
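To make the suggestion in this thread concrete, here is a minimal sketch of matching on a per-instruction enum. The `Value` and `Inst` types and the `describe` function are hypothetical and heavily simplified; they are not Cranelift's actual data structures, and only illustrate how a single `match` can both select the instruction and bind its fields.

```rust
// Hypothetical, simplified instruction data: one variant per instruction,
// each carrying its own operands. These are NOT Cranelift's real types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Value(pub u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Inst {
    Iadd { x: Value, y: Value },
    IaddImm { x: Value, imm: i64 },
    Load { addr: Value, offset: i32 },
}

/// Render an instruction by matching on its variant. The match binds the
/// operands directly, so no separate opcode-then-format lookup is needed.
pub fn describe(inst: Inst) -> String {
    match inst {
        Inst::Iadd { x, y } => format!("iadd v{}, v{}", x.0, y.0),
        Inst::IaddImm { x, imm } => format!("iadd_imm v{}, {}", x.0, imm),
        Inst::Load { addr, offset } => format!("load v{}+{}", addr.0, offset),
    }
}
```

With this shape, a lowering rule written by hand gets destructuring from the language itself, which is the quality-of-life gain being described; whether that is enough without a DSL is the open question in the reply above.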

cfallin · 2021-08-05T21:04:07Z

A note to frame things a bit: this pre-RFC contains what I think is a reasonable set of requirements to consider, but the requirements themselves are very much up for discussion, and I'm interested to hear what others think about their relative importance, or whether any important requirements are missing.

The discussion questions in the last part are also what I believe to be relevant design axes to talk about, but there may be others as well; please discuss!

bjorn3 · 2021-08-05T21:10:21Z

accepted/cranelift-isel-pre-rfc.md

+
+  This is reminiscent of "unsafe" code in Rust: it allows one to build
+  axiomatic building blocks with flexibility, but it requires one to
+  carefully define *what* the building blocks do.


I think that even after the migration is complete it should be possible to write arbitrary code for the output generation, but not necessarily the input pattern matching. This would allow for complex output code to handle for example turning fixed divisions into multiplications or shifts while still preserving analyzability and optimizability of the patterns if desired.

Right, there are a number of cases where "little building block of arbitrary logic" is useful. Creation of immediate operands is another good example: aarch64 has a very interesting logical-immediate format that can support only some values, with a complex algorithm to derive it. We'd want to make that a "primitive" in some sense by calling out to the existing implementation.
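As a rough illustration of the "building block of arbitrary logic" idea in this thread, the sketch below keeps the input side declarative (imagine a rule that matches an unsigned division by a constant) and puts the decision about what to emit into ordinary Rust. The `Lowered` enum and `lower_udiv_by_const` are hypothetical names invented for this example, and only the power-of-two case is handled; the general multiply-by-reciprocal transformation is deliberately left out.

```rust
/// Hypothetical lowered operation; not a real Cranelift machine instruction.
#[derive(Debug, PartialEq, Eq)]
pub enum Lowered {
    /// Logical shift right by a constant amount.
    ShiftRight { amount: u32 },
    /// Keep an actual divide instruction.
    Divide { divisor: u64 },
}

/// Arbitrary-Rust output helper for a rule whose input pattern is
/// "unsigned divide by a known constant". It chooses a shift when the
/// divisor is a power of two and a plain divide otherwise. A zero divisor
/// returns `None` so the caller can keep the trapping division.
pub fn lower_udiv_by_const(divisor: u64) -> Option<Lowered> {
    if divisor == 0 {
        return None;
    }
    if divisor.is_power_of_two() {
        // For a power of two, the shift amount is the index of its only set bit.
        Some(Lowered::ShiftRight { amount: divisor.trailing_zeros() })
    } else {
        Some(Lowered::Divide { divisor })
    }
}
```

The point of the split is that the pattern side stays analyzable while the helper is treated as an opaque primitive, much like the aarch64 logical-immediate encoder mentioned above.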

fitzgen

Looks great, thanks @cfallin!

fitzgen · 2021-08-05T21:29:59Z

accepted/cranelift-isel-pre-rfc.md

+expressed in some sort of DSL-internal type system, so that we do not
+have to hardcode lowering rules into the DSL design itself.
+
+## Requirement 6: Helps to Advance Verification Efforts


Related to verification efforts: I think we should call out superoptimization as something we would like to support for our isel. We should be able to do the superoptimization offline, based on CLIF patterns that we've harvested from real-world programs, and then take the learned CLIF->vcode pairs and dump them into our new DSL. We should be able to effectively fold the preopt pass (and many more peepholes!) into isel lowering.

This would give us

- better compilation throughput because we have fewer passes over the clif,
- correct by construction CLIF->vcode patterns, and
- optimal (according to some cost function) code generation for these CLIF->vcode patterns.

That's a good point, for sure!

I would hope that the format in which we express our basic lowering patterns is general enough to support arbitrary superoptimizer-derived patterns (i.e., if this is not even technically possible then something is very wrong), so in practice it seems this boils down to, I think:

- We should have or build a translator/bridge from a superoptimizer format into the lowering DSL (like peepmatic-souper)
- We should ensure that the whole infrastructure supports the "enormous pile of complex rules" case efficiently

Exactly, well said :)

sparker-arm · 2021-08-16T14:38:21Z

My gut feeling is that using an existing DSL (peepmatic), and having a layer that tries the generated patterns first, falling back to the existing matchers, would provide the lowest barrier to entry for users as well as keeping the build system simple. Any finger-in-the-air guesses as to how compilation time would be affected compared with the simple match strategy used currently? I also wonder whether a DSL for just isel would be a big enough carrot to get people to port over... The amount of C++ in LLVM used for isel, even though tablegen has been there for as long as I know, should not be underestimated :) So, I feel at least half of tablegen's value is the ease with which it enables the encodings to be described, and it would be great if we had something like that too.

fitzgen · 2021-08-17T20:52:34Z

> having a layer that tries the generated patterns first, falling back to the existing matchers, would provide the lowest barrier to entry for users as well as keeping the build system simple.

This is the "horizontal" integration in the pre-RFC, and I am inclined to agree, unless the vertical integration happens to fall out of the DSL's compilation/execution model "for free".

> Any finger in the air guesses how compilation time would be affected against the simple match strategy used currently?

I wouldn't expect any significant slowdowns, assuming that the DSL also compiles down to Rust code similar to what you would otherwise have written by hand (e.g. a match that switches on opcode). Maaaayyyybe a little bit of overhead related to icache because there are now two code paths rather than one, but I wouldn't expect too much.
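The layering discussed in this comment (try the generated patterns first, then fall back to the existing matchers) can be sketched roughly as follows. Everything here is a hypothetical stand-in: `Opcode`, `lower_generated`, `lower_handwritten`, and the string results are invented for illustration and do not reflect Cranelift's real lowering API. The generated side is written as the kind of `match` on opcode that a DSL compiler might plausibly emit.

```rust
/// Hypothetical opcode set, far smaller than any real IR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Opcode {
    Iadd,
    Imul,
    Udiv,
}

/// Stand-in for code generated from DSL rules. In this sketch only `Iadd`
/// has been ported, so every other opcode reports "no rule matched".
pub fn lower_generated(op: Opcode) -> Option<&'static str> {
    match op {
        Opcode::Iadd => Some("add (generated)"),
        _ => None,
    }
}

/// Stand-in for the existing handwritten matchers.
pub fn lower_handwritten(op: Opcode) -> Option<&'static str> {
    match op {
        Opcode::Iadd => Some("add (handwritten)"),
        Opcode::Imul => Some("mul (handwritten)"),
        Opcode::Udiv => None, // unsupported in this toy backend
    }
}

/// Horizontal integration: consult the generated rules first and fall back
/// to the handwritten path only when no generated rule applies.
pub fn lower(op: Opcode) -> Option<&'static str> {
    lower_generated(op).or_else(|| lower_handwritten(op))
}
```

Because the fallback only runs when the generated matcher returns `None`, rules can be ported one opcode at a time without changing behavior for anything not yet ported.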

cfallin · 2021-08-17T21:02:43Z

@sparker-arm the goal is certainly to generate code equivalent to what we have today, so ideally we have zero slowdown in the Cranelift compile time, and in the future, possibly improvements that are enabled by more centralized control of the backend code's idioms (i.e., right now if we come up with a new way of matching, we have to modify all of the open-coded use sites; but if we generate this from patterns then we can transition instantly). The latter is especially interesting to me as it will let us eventually move to the native SSA-based API of regalloc2 which should give some speedups.

I've got a reasonable design down on paper now and am working on refining the writeup before posting the RFC -- hope to have it up in the next few days :-)

cfallin · 2021-08-19T05:51:15Z

I will go ahead and close this pre-RFC, as I think it has served its purpose well in starting discussions and getting early feedback on ideas that have gone into a now more fully-formed RFC, #15. Thanks all for the input and please do give any thoughts you might have on the new RFC!

Pre-RFC: Instruction Selector DSL for Cranelift.

f45b445

bjorn3 reviewed Aug 5, 2021

Fix footnote.

0d80990

bjorn3 reviewed Aug 5, 2021

Footnotes don't work in GitHub Markdown...

7a3f3b5

fitzgen reviewed Aug 5, 2021

cfallin mentioned this pull request Aug 19, 2021

RFC: design of ISLE instruction-selector DSL. #15

Merged

cfallin closed this Aug 19, 2021
