Inline assembly #2850

Closed
wants to merge 29 commits into from

@Amanieu (Member) commented Jan 13, 2020

A redesigned asm! macro, with a path to stabilization.

Rendered

Thanks to everyone involved in the inline asm project group for their feedback which helped make this RFC possible!

The discussion in this thread has grown rather large, so a new thread has been opened at #2873 where the discussion should continue.

@jonas-schievink added the labels A-ASM (Proposals related to embedding assembly into Rust) and T-lang (Relevant to the language team, which will review and decide on the RFC) on Jan 13, 2020

This RFC specifies a new syntax for inline assembly which is suitable for eventual stabilization.

The initial implementation of this feature will focus on the ARM, x86 and RISC-V architectures. Support for more architectures will be added based on user demand.

Contributor

The initial implementation of this feature will focus on the ARM, x86 and RISC-V architectures.

A reasonable question to ask of this would be "is there anything in the design of this feature that precludes support for additional architectures, or is the feature sufficiently general that we do not (to the best of our ability) foresee any difficulty supporting additional architectures in a backwards-compatible way?" For example, elsewhere the document discusses how registers are highly architecture-specific; are registers the only place we would expect such different behavior, or are there other potential points of divergence? (I'm also not suggesting that we answer this question in the summary; perhaps in the Future Possibilities section.)

Member Author

Yes, register definitions are basically the only thing needed to add support for a new architecture. This should be fairly straightforward once the basic infrastructure for inline asm is implemented.

Contributor

register definitions are basically the only thing needed to add support for a new architecture. This should be fairly straightforward

Unless it's something alien, like e.g. Intel GPU ISA 😄

Member

In practice the backend must have the inline assembly support for said target too. Not all of them do.

We can see that `inout` is used to specify an argument that is both input and output.
This is different from specifying an input and output separately in that it is guaranteed to assign both to the same register.

It is also possible to specify different variables for the input and output parts of an `inout` operand:

Contributor

It is also possible to specify different variables for the input and output parts of an inout operand

What sort of pattern necessitates the existence of this construction? Under what circumstances would one find themselves reaching for this? Does this need to exist if the same behavior can be achieved by either a mov inside the assembly or a let outside of it?

Member

Similarly what happens to the "input" end of the variable? Is it considered "move"d? If it is Copy, is it copied before the invocation of asm! (the same way as function calls would work?)

Member Author

One of the main uses for this is to indicate an input register which is clobbered. This is represented as inout(reg) some_val => _.

Member Author

Currently only Copy types are supported as asm operands. But otherwise the input part is essentially treated the same way as a function argument.

Member

@bstrie Even if you could do it with a mov instruction, telling the compiler where you left the variable's value allows the compiler to just treat that as the new location of the variable, which means the resulting assembly won't need a mov at all.
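
To make the `inout(reg) a => b` and `inout(reg) a => _` forms from this thread concrete, here is a minimal sketch. It assumes an x86-64 target and the `asm!` syntax as later stabilized in `std::arch::asm!` (including `options(...)`), which may differ in detail from the draft under review. The function names `add_five` and `double_plus_one` are hypothetical, and the non-x86-64 fallbacks exist only so the sketch compiles elsewhere.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Returns `x + 5`. Input and output share one register but are bound to
/// two different Rust variables through `inout(reg) x => y`.
#[cfg(target_arch = "x86_64")]
pub fn add_five(x: u64) -> u64 {
    let y: u64;
    unsafe {
        asm!("add {0}, 5", inout(reg) x => y, options(pure, nomem, nostack));
    }
    y
}

/// Returns `2 * x + 1`. The register holding `x` is overwritten by the asm,
/// so it is declared as a clobbered input with `inout(reg) x => _`.
#[cfg(target_arch = "x86_64")]
pub fn double_plus_one(x: u64) -> u64 {
    let result: u64;
    unsafe {
        asm!(
            "add {tmp}, {tmp}",       // tmp = 2 * x, destroying the input value
            "lea {dst}, [{tmp} + 1]", // dst = tmp + 1
            tmp = inout(reg) x => _,
            dst = out(reg) result,
            options(pure, nomem, nostack),
        );
    }
    result
}

// Portable fallbacks (assumption: only needed to build on other targets).
#[cfg(not(target_arch = "x86_64"))]
pub fn add_five(x: u64) -> u64 {
    x.wrapping_add(5)
}

#[cfg(not(target_arch = "x86_64"))]
pub fn double_plus_one(x: u64) -> u64 {
    x.wrapping_mul(2).wrapping_add(1)
}
```

In both functions no `mov` is written by hand: the compiler is told where the value ends up and can treat that register as the variable's new home.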

The assembler template uses the same syntax as [format strings][format-syntax] (i.e. placeholders are specified by curly braces). The corresponding arguments are accessed in order, by index, or by name. However, implicit named arguments (introduced by [RFC #2795][rfc-2795]) are not supported.

The assembly code syntax used is that of the GNU assembler (GAS). The only exception is on x86 where the Intel syntax is used instead of GCC's AT&T syntax.

Contributor

The only exception is on x86 where the Intel syntax is used instead of GCC's AT&T syntax.

Can we see an example of Intel syntax being used with this macro?

Member Author

All examples in this RFC use Intel syntax.

Member

In order to aid in mechanical translation of code, could we expose the appropriate syntax flags here, so that people can ask for AT&T syntax on x86, for instance? (That would make it easy for others to implement an asm_att! macro, for instance.)

And, of course, the alternatives section should mention the possibility of using AT&T syntax on all platforms and providing a flag for Intel syntax, in which case people could easily implement an asm_intel! macro for Intel syntax.

Member Author

This is tricky to support reliably. In particular, GCC requires that all asm code in a compilation unit use the same syntax, so this would at least exclude the possibility of inline asm support with a GCC backend.

I would prefer if we simply chose a single asm syntax and stuck with it. Also note that this only affects x86 which has 2 asm syntaxes. Every other architecture only has a single standardized asm syntax.

Contributor

Could it be a crate level setting, since each crate is a single translation unit?

Member Author

I don't think that a crate-level setting will be enough if you take LTO into account. And even then, a crate-level setting won't work with inline functions.

@comex, Jan 15, 2020

No need for that. You can switch syntaxes on-the-fly using assembler directives, .att_syntax and .intel_syntax noprefix: example.

Edit: This is supported by both GNU as and LLVM's assembler.

Member Author

The issue is more with the register placeholders, since GCC needs to know whether to emit eax or %eax.

Copy link

Oh... good point. For the record, this only affects a hypothetical GCC backend, not LLVM, which supports inteldialect per asm block.

I suppose such a backend could always tell GCC to compile in Intel mode, and then just add the %s and $s itself.

@joshtriplett (Member), Jan 15, 2020

@Amanieu GCC supports prefixed registers even when in Intel syntax mode, and even with noprefix; noprefix just makes the prefixes optional. So a hypothetical GCC backend can (and should) always emit prefixes like % and $, always leave GCC in AT&T mode, and just wrap assembly blocks that use Intel syntax in .intel_syntax noprefix and .att_syntax.
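
For readers comparing the two x86 dialects discussed above, the sketch below writes the same addition once in Intel syntax and once in AT&T syntax. It assumes an x86-64 target and uses the `att_syntax` option that the stabilized `std::arch::asm!` macro provides; that option is not part of the draft text quoted in this thread. The function names are hypothetical.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Intel syntax (the default): destination operand first.
#[cfg(target_arch = "x86_64")]
pub fn add_intel(a: u64, b: u64) -> u64 {
    let mut acc = a;
    unsafe {
        asm!(
            "add {0}, {1}", // acc += b
            inout(reg) acc,
            in(reg) b,
            options(pure, nomem, nostack),
        );
    }
    acc
}

/// AT&T syntax: source operand first; placeholders expand to `%`-prefixed
/// register names.
#[cfg(target_arch = "x86_64")]
pub fn add_att(a: u64, b: u64) -> u64 {
    let mut acc = a;
    unsafe {
        asm!(
            "add {1}, {0}", // acc += b, operands swapped relative to Intel
            inout(reg) acc,
            in(reg) b,
            options(att_syntax, pure, nomem, nostack),
        );
    }
    acc
}

// Portable fallbacks (assumption: only needed to build on other targets).
#[cfg(not(target_arch = "x86_64"))]
pub fn add_intel(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}

#[cfg(not(target_arch = "x86_64"))]
pub fn add_att(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}
```

The only differences are the operand order in the template and the per-block option; the operand specifications are identical.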

```
dir_spec := "in" / "out" / "lateout" / "inout" / "inlateout"
reg_spec := <arch specific register class> / "<arch specific register name>"
operand_expr := expr / "_" / expr "=>" expr / expr "=>" "_"
```

Contributor

operand_expr := expr / "_" / expr "=>" expr / expr "=>" "_"

Should the exprs in this line be idents, or do we really support arbitrary expressions in all these locations?

Member Author

We do want these to be full expressions. This allows for example using my_struct.field as an asm operand.
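
As an illustration of operands being full expressions, the sketch below uses a struct field as an `inout` operand and a method call as an `in` operand. It assumes an x86-64 target and the stabilized `std::arch::asm!` syntax; `Counter` and `add_twice` are hypothetical names introduced only for this example.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Hypothetical struct, used only to show a field as an operand.
pub struct Counter {
    pub value: u64,
}

/// Adds `2 * step` to `counter.value`.
#[cfg(target_arch = "x86_64")]
pub fn add_twice(counter: &mut Counter, step: u64) {
    unsafe {
        asm!(
            "add {0}, {1}",
            inout(reg) counter.value,     // a place expression, not a bare ident
            in(reg) step.wrapping_mul(2), // an arbitrary value expression
            options(pure, nomem, nostack),
        );
    }
}

// Portable fallback (assumption: only needed to build on other targets).
#[cfg(not(target_arch = "x86_64"))]
pub fn add_twice(counter: &mut Counter, step: u64) {
    counter.value = counter.value.wrapping_add(step.wrapping_mul(2));
}
```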

- `nomem`: The `asm` block does not read or write to any memory. This allows the compiler to cache the values of modified global variables in registers across the `asm` block since it knows that they are not read or written to by the `asm`.
- `readonly`: The `asm` block does not write to any memory. This allows the compiler to cache the values of unmodified global variables in registers across the `asm` block since it knows that they are not written to by the `asm`.
- `preserves_flags`: The `asm` block does not modify the flags register (defined below). This allows the compiler to avoid recomputing the condition flags after the `asm` block.
- `noreturn`: The `asm` block never returns, and its return type is defined as `!` (never). Behavior is undefined if execution falls through past the end of the asm code.
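
A minimal sketch of how the `nomem`, `readonly`, and `preserves_flags` promises might be attached to concrete blocks. It assumes an x86-64 target and the stabilized `std::arch::asm!` spelling `options(...)`, which also adds `pure` and `nostack`; those two are not in the list quoted above. The function names are hypothetical.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Returns `3 * x`. `lea` touches neither memory nor the flags register,
/// so both `nomem` and `preserves_flags` hold.
#[cfg(target_arch = "x86_64")]
pub fn triple(x: u64) -> u64 {
    let res: u64;
    unsafe {
        asm!(
            "lea {res}, [{x} + {x}*2]",
            x = in(reg) x,
            res = lateout(reg) res,
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    res
}

/// Reads `*src`. The block reads memory but never writes it, so it is
/// `readonly` rather than `nomem`.
#[cfg(target_arch = "x86_64")]
pub fn load_u64(src: &u64) -> u64 {
    let val: u64;
    unsafe {
        asm!(
            "mov {val}, qword ptr [{ptr}]",
            ptr = in(reg) src as *const u64,
            val = lateout(reg) val,
            options(pure, readonly, nostack, preserves_flags),
        );
    }
    val
}

// Portable fallbacks (assumption: only needed to build on other targets).
#[cfg(not(target_arch = "x86_64"))]
pub fn triple(x: u64) -> u64 {
    x.wrapping_mul(3)
}

#[cfg(not(target_arch = "x86_64"))]
pub fn load_u64(src: &u64) -> u64 {
    *src
}
```

Each option is a promise made by the programmer; violating it (for example, writing memory inside a `readonly` block) is undefined behavior rather than a compile error.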

Contributor

its return type is defined as ! (never)

Ordinarily does asm! have a return type of ()?

Contributor

Also, is there any difference between using noreturn and putting a call to unreachable_unchecked after the asm! block?

Member Author

Yes, asm! has a return type of ().

Also, is there any difference between using noreturn and putting a call to unreachable_unchecked after the asm! block?

If you look at the "Mapping to LLVM IR" section, you'll see that's exactly what it does.

Contributor

Yes, I saw this:

If the noreturn flag is set then an unreachable LLVM instruction is inserted after the asm invocation.

But I think unreachable_unchecked will result in the same code (modulo inlining, etc.). Do you think this has a performance impact, or is there another reason other than brevity to prefer the noreturn modifier?

Member Author

I added the noreturn flag because I feel that it makes it more explicit that the asm never returns.
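
The two return types discussed in this thread can be sketched as follows, assuming an x86-64 target and the stabilized `std::arch::asm!` syntax. `nop` and `trap` are hypothetical names; `trap` executes `ud2`, which raises an invalid-opcode exception, so it must not be called in a program that is expected to continue.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// An ordinary `asm!` block is an expression of type `()`.
#[cfg(target_arch = "x86_64")]
pub fn nop() {
    let unit: () = unsafe { asm!("nop", options(nomem, nostack, preserves_flags)) };
    unit
}

/// With `noreturn`, the `asm!` expression has type `!`, so it can be the
/// tail expression of a diverging function with no extra
/// `unreachable_unchecked()` call.
#[cfg(target_arch = "x86_64")]
pub fn trap() -> ! {
    unsafe { asm!("ud2", options(noreturn)) }
}

// Portable fallbacks (assumption: only needed to build on other targets).
#[cfg(not(target_arch = "x86_64"))]
pub fn nop() {}

#[cfg(not(target_arch = "x86_64"))]
pub fn trap() -> ! {
    std::process::abort()
}
```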

Member Author

The main advantage is that the compiler handles name mangling for you.

Member

IMO, the main advantage is that it is inline, which is much easier to read than jumping to another file and makes it easier to deal with things like accessing globals or modifying rust variables.

@roblabla Is this documented somewhere? I've used a lot of naked functions with function calls etc, and it seems to work fine as long as your asm sets up the stack properly.

@mark-i-m (Member), Jan 14, 2020

Btw, I'm glad to discuss naked fn further, but perhaps we should do it in the tracking issue instead?

@bjorn3 (Member), Jan 14, 2020

IMO, the main advantage is that it is inline, which is much easier to read than jumping to another file

You don't have to place global_asm!(); in a different file.

and makes it easier to deal with things like accessing globals or modifying rust variables.

👍

I've used a lot of naked functions with function calls etc, and it seems to work fine as long as your asm sets up the stack properly.

One example is the Redox interrupt handling.

@roblabla, Jan 15, 2020

@mark-i-m

it is generally unsafe to write anything but inline assembly inside a naked function. The LLVM language reference describes this feature as having "very system-specific consequences", which the programmer must be aware of.

And

It is easy to quietly generate wrong code in naked functions, such as by causing the compiler to allocate stack space for temporaries where none were anticipated. There is currently no restriction on writing Rust statements inside a naked function, while most compilers supporting similar features either require or strongly recommend that authors write only inline assembly inside naked functions to ensure no code is generated that assumes a particular stack layout. It may be desirable to place further restrictions on what statements are permitted in the body of a naked function, such as permitting only asm! statements.

There is an open PR to amend the naked fn RFC to explicitly deny various common misuses of it. Let's move naked function discussion there.

@matklad (Member) commented Jan 13, 2020

As someone who hasn't really used inline assembly in any language, I have a stupid question about how this feature works as a whole. I think I inferred this from the RFC text, but I am not sure my understanding is correct (perhaps this is content for the future user-level docs?).

This is how I think asm works, is this right ballpark?

So, what we fundamentally hope to achieve here is an ability to insert arbitrary instruction sequences into the generated machine code. I naively expect the solution along the lines of D DSL: we invent a syntax for specifying arbitrary instructions, and compiler produces the binary for us. From this point of view, specifying assembly as a string literal seems very odd.

Now, the reason why this doesn't work too well is that there are numerous fundamentally different CPU architectures (which are themselves moving targets), with fundamentally different instructions, and it is unreasonable to expect that the compiler would fully support all of them. Instead, we rely on the fact that each architecture has a dedicated external tool, an assembler, which is capable of turning assembler-specific syntax into machine code. The compiler more or less invokes the assembler as a black box. The compiler does not understand what mov in "mov {}, 5" means. The only thing the compiler needs to know is the interface. That's why the RFC rigorously specifies the grammar for register specification and flags, but doesn't actually tell us what can be written between "".

However (and I think this is not spelled out explicitly, or have I just missed it?) we also ship a specific assembler (namely llvm assembler), with specific syntax, which rustc will use for asm! calls. So, to learn what I can actually put into "", I will need to read the docs of this assembler to understand the syntax, and then maybe the docs for the specific CPU/instruction set, to learn which instructions are available and provide which semantics.

Is this all at least somewhat reasonable description of the reality? :)
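
The "black box" model described above can be sketched as plain string substitution: the compiler picks registers, splices their names into the template, and hands the resulting text to an assembler. The function below is a toy illustration under that assumption, not rustc's implementation; `expand_template` is a hypothetical name, and it ignores `{{`/`}}` escapes, named operands, and format modifiers.

```rust
/// Toy model: replace `{}` and `{N}` placeholders with register names.
/// Returns `None` for an unterminated placeholder, a non-numeric index,
/// or an index with no matching register.
pub fn expand_template(template: &str, regs: &[&str]) -> Option<String> {
    let mut out = String::new();
    let mut next = 0usize; // index consumed by the next bare `{}`
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        out.push_str(&rest[..open]);
        let close = open + rest[open..].find('}')?;
        let inner = &rest[open + 1..close];
        let idx = if inner.is_empty() {
            let i = next;
            next += 1;
            i
        } else {
            inner.parse::<usize>().ok()?
        };
        out.push_str(regs.get(idx)?);
        rest = &rest[close + 1..];
    }
    out.push_str(rest);
    Some(out)
}
```

Under this model the compiler never needs to know what `mov` means; it only needs the operand interface to decide which register names to substitute.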

@Ixrec (Contributor) commented Jan 13, 2020

My understanding of past inline asm discussions is that the biggest issue was always the stability and portability implications of supporting inline asm at all (and to a lesser extent of exposing LLVM's asm syntax, which thankfully is no longer being proposed). Unless the Rust teams think these issues are now self-evident, we should probably make some explicit statements about them in the RFC.

Specifically, I think all of the following are intended to be true:

  • The syntax of an asm!() invocation is part of the Rust language grammar, and would go into a future Rust spec, so all Rust compilers must be able to parse it. I'm not sure if this has any practical consequences today, but presumably we want to leave the door open for doing something like #[cfg(backend = "llvm")] where an asm!() might need to get parsed just so it can be discarded successfully.
  • No Rust compiler is required to "actually implement" asm!(), because asm!()'s semantics are inherently target-specific and cannot be specified in terms of any Rust Abstract Machine. Also, implementing asm!() for one target/backend combination does not require implementing it for any other target/backend combinations. Every toolchain gets to choose exactly how much inline asm it does or doesn't provide, independently of rustc, and still be a spec-compliant Rust compiler.
  • rustc will commit to providing asm!() on the architectures listed in this RFC for as long as LLVM provides any sort of inline asm feature that makes this feasible to implement. So if LLVM makes backwards incompatible changes to its inline asm, rustc will be committed to adapt to those changes as part of its regular LLVM upgrades.
  • But because LLVM has no stability promise on its inline asm (that hasn't changed, right?), and in theory could simply delete that feature without our consent, rustc cannot unconditionally guarantee stability for inline asm the way it can for most other language features (unless we do something "crazy" like fork LLVM or make Cranelift part of rustc). Or does this non-stability come from the system assemblers, and LLVM is just a middleman?
  • At this time, we are not making any guarantees about alternate backends for rustc. So rustc is allowed to add backends that have no inline asm support. And if Cranelift ever provides inline asm, making stable versions of rustc use it would constitute an additional stability commitment (i.e. it'd need an RFC or an FCP on a PR).

If I am correct and all of these are intended, I think at least some of them should be made explicit in the RFC. Especially that part in bold.

Also, should we consider how this interacts with the target tier policy? At first blush, I would think any changes to inline asm support on tier 3 targets require no special approval, but any changes to inline asm support on tier 1 and 2 targets probably should get... I dunno, compiler team approval? And be included in release notes?


As for what's actually in the feature proposal, I have no objections, and (as another person who's never used inline asm in anger) @matklad's elaboration of the constraints matches my understanding of the constraints involved.

@Amanieu (Member Author) commented Jan 13, 2020

@matklad Yes, that sounds about right.

@mark-i-m (Member) commented

But because LLVM has no stability promise on its inline asm (that hasn't changed, right?), and in theory could simply delete that feature without our consent, rustc cannot unconditionally guarantee stability for inline asm the way it can for most other language features (unless we do something "crazy" like fork LLVM or make Cranelift part of rustc).

In theory, yes, I think, but in practice, I would be pretty shocked if any mature C compiler (e.g. clang) dropped support for inline assembly, and since LLVM is the backend for clang, it's hard to imagine LLVM dropping support altogether.

That said, I'm not sure how stable the LLVM asm interface is. I haven't seen it change in the last few years though...

@Ixrec (Contributor) commented Jan 13, 2020

In theory, yes, I think, but in practice, I would be pretty shocked if any mature C compiler (e.g. clang) dropped support for inline assembly, and since LLVM is the backend for clang, it's hard to imagine LLVM dropping support altogether.

That said, I'm not sure how stable the LLVM asm interface is. I haven't seen it change in the last few years though...

I completely agree that it's "stable in practice", but since this has always been the thing explicitly cited by Rust teams in past inline asm discussions to answer "why isn't inline asm on stable yet?", it seems like something we need an official statement on.

I'm hoping the answer is that they were specifically concerned about LLVM changing their inline asm syntax (because independently specifying our own syntax completely solves that), or about muddying the messaging on Rust's stability promise too close to 1.0 (which I'd assume is no longer a concern), but I just don't know for sure.

@Amanieu (Member Author) commented Jan 14, 2020

  • The syntax of an asm!() invocation is part of the Rust language grammar, and would go into a future Rust spec, so all Rust compilers must be able to parse it. I'm not sure if this has any practical consequences today, but presumably we want to leave the door open for doing something like #[cfg(backend = "llvm")] where an asm!() might need to get parsed just so it can be discarded successfully.

  • No Rust compiler is required to "actually implement" asm!(), because asm!()'s semantics are inherently target-specific and cannot be specified in terms of any Rust Abstract Machine. Also, implementing asm!() for one target/backend combination does not require implementing it for any other target/backend combinations. Every toolchain gets to choose exactly how much inline asm it does or doesn't provide, independently of rustc, and still be a spec-compliant Rust compiler.

  • At this time, we are not making any guarantees about alternate backends for rustc. So rustc is allowed to add backends that have no inline asm support. And if Cranelift ever provides inline asm, making stable versions of rustc use it would constitute an additional stability commitment (i.e. it'd need an RFC or an FCP on a PR).

The intent is that all backends will support asm!() in some form. If the backend doesn't support inline assembly natively then it can fall back to invoking an external assembler and generate an out-of-line call to the compiled asm blob. See the "Difficulty of support" section in the RFC for an example of how this would work.

  • rustc will commit to providing asm!() on the architectures listed in this RFC for as long as LLVM provides any sort of inline asm feature that makes this feasible to implement. So if LLVM makes backwards incompatible changes to its inline asm, rustc will be committed to adapt to those changes as part of its regular LLVM upgrades.

  • But because LLVM has no stability promise on its inline asm (that hasn't changed, right?), and in theory could simply delete that feature without our consent, rustc cannot unconditionally guarantee stability for inline asm the way it can for most other language features (unless we do something "crazy" like fork LLVM or make Cranelift part of rustc). Or does this non-stability come from the system assemblers, and LLVM is just a middleman?

There are two distinct parts to LLVM's inline assembly support:

  • The first part is the constraint specification, which is the syntax that LLVM uses to specify operands in the asm code. AFAIK this isn't stable, but this RFC does not expose any of this directly, so this is fine.
  • The second part is how the final asm string is interpreted, after all the string substitution to insert register names is completed. This is done by LLVM's integrated assembler, which implements the GNU assembler syntax (e.g. .section directive). We do expose this directly, but this is fine since non-LLVM backends can use an external GNU assembler to parse the generated asm code.

Finally, LLVM doesn't exist in a vacuum. As @mark-i-m said, clang is a big user, but keep in mind that Rust itself is also a big user of LLVM. Once Rust gets stable inline assembly support, LLVM will have a strong incentive to keep this support working correctly.


## Memory operands

We could support `mem` as an alternative to specifying a register class which would leave the operand in memory and instead produce a memory address when inserted into the asm string. This would allow generating more efficient code by taking advantage of addressing modes instead of using an intermediate register to hold the computed address.

@recmo, Jan 14, 2020

It's very common to have instructions (like add or mov) that accept inputs as either registers or memory. Specifying this as an rm constraint allows the compiler to do better register allocation/spilling.

That being said, LLVM currently ignores any freedom in the constraints and always picks memory:

Thus, it simply tries to make a choice that’s most likely to compile, not one that will be optimal performance. (e.g., given “rm”, it’ll always choose to use memory, not registers).

Member Author

Also LLVM only really supports memory addressing modes for inline asm on x86. On ARM/AArch64 it will just perform the address calculation and put the result in a register.
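
Without a `mem` operand, the usual workaround is to pass the address in a register and write the addressing mode by hand in the template. The sketch below shows that pattern for an indexed load; it assumes an x86-64 target and the stabilized `std::arch::asm!` syntax, and `load_at` is a hypothetical name.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Loads `values[idx]` using an x86 scaled-index addressing mode.
/// The base pointer and the index each occupy a register chosen by the
/// compiler; the address computation itself happens inside the `mov`.
#[cfg(target_arch = "x86_64")]
pub fn load_at(values: &[u64], idx: usize) -> u64 {
    // The asm performs no bounds check, so check before entering it.
    assert!(idx < values.len(), "index out of bounds");
    let val: u64;
    unsafe {
        asm!(
            "mov {val}, qword ptr [{base} + {idx}*8]",
            base = in(reg) values.as_ptr(),
            idx = in(reg) idx,
            val = lateout(reg) val,
            options(pure, readonly, nostack, preserves_flags),
        );
    }
    val
}

// Portable fallback (assumption: only needed to build on other targets).
#[cfg(not(target_arch = "x86_64"))]
pub fn load_at(values: &[u64], idx: usize) -> u64 {
    values[idx]
}
```

A real `mem` operand would let the compiler choose the addressing mode itself (for example, a stack slot or a global) instead of first materializing the address in a register.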


## Flag outputs

GCC supports a special type of output which allows an asm block to return a `bool` encoded in the condition flags register. This allows the compiler to branch directly on the condition flag instead of materializing the condition as a `bool`.

@recmo, Jan 14, 2020

This would be super useful for Intel ADCX/ADOX instructions for multi-precision integer math like in this assembly multiplication routine. Here it would be useful if I could specify the carry and overflow flags as inputs/outputs.

This is in line with what the LLVM intrinsics for these instructions do. See core::arch::x86_64::_addcarryx_u64.

Unfortunately, again it seems like LLVM just ignores it. As mentioned in rust-lang/stdarch#666 (comment), there is an open LLVM issue and it does not seem easy:

The X86 backend isn't currently set up to model the C flag and O flag separately. We model all of the flags as one register. Because of this we can't interleave the flag dependencies. We would need to do something about that before it makes sense to implement _addcarryx_u64 as anything other than plain adc.

I would have used intrinsics instead of asm if it wasn't for this issue.
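
For context on what flag outputs would save: without them, a condition flag has to be materialized into a register before Rust code can see it. The sketch below does this for the carry flag with `setc`. It assumes an x86-64 target and the stabilized `std::arch::asm!` syntax; `add_with_carry` is a hypothetical name, and this is not the ADCX/ADOX routine mentioned above.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Returns `(a + b, carry_out)`. The carry flag is copied into a byte
/// register with `setc`, which is exactly the extra step a flag output
/// would let the compiler skip.
#[cfg(target_arch = "x86_64")]
pub fn add_with_carry(a: u64, b: u64) -> (u64, bool) {
    let mut sum = a;
    let carry: u8;
    unsafe {
        asm!(
            "add {sum}, {rhs}",
            "setc {carry}", // materialize CF as 0 or 1
            sum = inout(reg) sum,
            rhs = in(reg) b,
            carry = out(reg_byte) carry,
            options(pure, nomem, nostack),
        );
    }
    (sum, carry != 0)
}

// Portable fallback (assumption: only needed to build on other targets).
#[cfg(not(target_arch = "x86_64"))]
pub fn add_with_carry(a: u64, b: u64) -> (u64, bool) {
    a.overflowing_add(b)
}
```

A caller that branches on the returned `bool` will typically test the byte register again, whereas a flag output would allow a branch directly on the flag set by `add`.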

@comex commented Jan 14, 2020

I’m a little surprised this is already being RFCed. I was expecting we’d come up with a prototype implementation as a procedural macro first, within the working group. This could serve two purposes:

  • If it supported transforming asm! to external assembly, it would serve as proof that that transformation is possible, settling the controversy over Cranelift. It could also serve as a basis for the actual implementation of that in rustc.
  • It would let us get hands-on experience with what the new syntax feels like to use, and iterate on it if desired.

...Well, I say “we”, but I haven’t been contributing other than responding in some GitHub threads. I really appreciate the energy investment in moving this forward! I just think it might be good to start on that before finalizing the design… though I guess that can still happen before stabilization.

Edit: And I haven’t even expressed my wish for an implementation before, so I can’t really complain. I just thought we were still in a somewhat early phase in the working group.

@LukasKalbertodt (Member) left a comment

As a general note: I'd like more "this is UB!" explanations in several places. Maybe the guide-level section could use a complete subsection explicitly explaining again that many things can lead to UB.

Currently only in a few places it explicitly says "UB". In most places it just says "you cannot" and "you must" and that the compiler can assume something. While people used to inline assembly will most certainly know that all of this is about UB, newcomers might not and might instead expect a compiler error or something.

- `nomem`: The `asm` block does not read or write to any memory. This allows the compiler to cache the values of modified global variables in registers across the `asm` block since it knows that they are not read or written to by the `asm`.
- `readonly`: The `asm` block does not write to any memory. This allows the compiler to cache the values of unmodified global variables in registers across the `asm` block since it knows that they are not written to by the `asm`.
- `preserves_flags`: The `asm` block does not modify the flags register (defined below). This allows the compiler to avoid recomputing the condition flags after the `asm` block.
- `noreturn`: The `asm` block never returns, and its return type is defined as `!` (never). Behavior is undefined if execution falls through past the end of the asm code.

Member

I was actually confused by the name noreturn. I initially assumed this meant that there is no ret instruction (or similar) in the assembly. How about using diverging or diverge as the name instead?

Member Author

I feel that `noreturn` expresses this better: execution never returns from the `asm` block.

| Architecture | Unsupported register | Reason |
| ------------ | -------------------- | ------ |
| All | `bp` (x86), `r11` (ARM), `x29` (AArch64), `x8` (RISC-V) | The frame pointer cannot be used as an input or output. |
| x86 | `ah`, `bh`, `ch`, `dh` | These are poorly supported by compiler backends. Use 16-bit register views (e.g. `ax`) instead. |
| x86 | `k0` | This is a constant zero register which can't be modified. |
| x86 | `ip` | This is the program counter, not a real register. |

Why does `ip` have aliases if it is unusable?

Member Author

So that the compiler can provide better error messages ("unknown register" vs "disallowed register").

| Architecture | Unsupported register | Reason |
| ------------ | -------------------- | ------ |
| AArch64 | `xzr` | This is a constant zero register which can't be modified. |
| ARM | `pc` | This is the program counter, not a real register. |
| RISC-V | `x0` | This is a constant zero register which can't be modified. |
| RISC-V | `gp`, `tp` | These registers are reserved and cannot be used as inputs or outputs. |

I am confused about this list. It is written as though it is an exhaustive list, and even goes so far as to mention a register only used in a fairly recent instruction set extension (`k0` for x86).

Wouldn't it make more sense to just list a few examples here and keep an up-to-date list in the documentation?

Member Author

The documentation for inline assembly will most likely be a verbatim copy of the contents of this RFC.


In this example we call the `out` instruction to output the content of the `cmd` variable
to port `0x64`. Since the `out` instruction only accepts `eax` (and its sub registers) as operand
we had to use the `eax` constraint specifier.
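
A hedged sketch of the explicit-register form described above, assuming an x86-64 target and the stabilized `std::arch::asm!` path. `send_cmd` mirrors the `out` example but is never called here, because `out` is a privileged instruction that faults in user mode. `mul128` is a hypothetical, runnable companion: `mul` also hard-codes its registers (`rax` and `rdx`), so it needs the same kind of constraint.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Writes `cmd` to I/O port 0x64. `out` only accepts `al`/`ax`/`eax` as its
/// source, so the operand is pinned to `eax`. Requires ring 0; shown, not run.
#[cfg(target_arch = "x86_64")]
#[allow(dead_code)]
pub unsafe fn send_cmd(cmd: u32) {
    unsafe {
        asm!("out 0x64, eax", in("eax") cmd);
    }
}

/// 64x64 -> 128-bit multiply. `mul` reads `rax` implicitly and writes the
/// result to `rdx:rax`, so both outputs name explicit registers.
#[cfg(target_arch = "x86_64")]
pub fn mul128(a: u64, b: u64) -> u128 {
    let lo: u64;
    let hi: u64;
    unsafe {
        asm!(
            "mul {}",
            in(reg) a,
            inlateout("rax") b => lo,
            lateout("rdx") hi,
            options(pure, nomem, nostack),
        );
    }
    ((hi as u128) << 64) | lo as u128
}

/// Plain-Rust fallback so the sketch still builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn mul128(a: u64, b: u64) -> u128 {
    (a as u128) * (b as u128)
}
```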

One of the problems I have with the current asm!() macro is that it does not at all feel like the error reporting story is on the same level of quality as the rest of rustc. See e.g. rust-lang/rust#15402

If I accidentally use `reg` here instead of `"eax"`, will I get poor error messages from deep inside the belly of LLVM?

Worth noting that your specific example won't be possible anymore, since as noted in the RFC, the input/output parameters will only accept raw numeric types (and pointers where it makes sense). Your wrapper struct would get rejected by the type checker.

There are other types of weird "low-level" LLVM errors that can get reported when the feature is misused though. Notably, I remember getting really weird aborts when misusing constraints. Rustc will probably need to sanitize them before passing them to LLVM, making sure they make sense. Some things off the top of my head:

  • Using the same named register twice as an input parameter (it's UB; LLVM accepts it and will either abort or do something weird)
  • Using the same named register as both a clobber and an input or an output (UB; LLVM generates garbage, sometimes aborts)

Member Author

The specification in this RFC and the validation it requires should eliminate the possibility of any LLVM internal error. For example, the RFC explicitly says this:

> It is a compile-time error to use the same explicit register for two input operands or two output operands.

In this case, the `out` instruction must use the `al`, `ax` or `eax` register per the instruction set reference.

If I accidentally wrote `reg` instead of `"eax"` in the example, would the error message be of the same quality as normal Rust error messages (or at least using the same infrastructure wrt highlighting the span in question etc)?

Another concern is that it would potentially compile some of the time, when LLVM happens to allocate eax as its register of choice. Would that be possible?

Member Author

> If I accidentally wrote `reg` instead of `"eax"` in the example, would the error message be of the same quality as normal Rust error messages (or at least using the same infrastructure wrt highlighting the span in question etc)?

LLVM has support for sending error messages from inline assembly back to rustc so that they can be displayed through rustc's normal error message functionality. This already works with the current asm! macro: https://play.rust-lang.org/?version=nightly&mode=release&edition=2018&gist=e5d9c3c74edc6e02858ec965abfd4d98

> Another concern is that it would potentially compile some of the time, when LLVM happens to allocate eax as its register of choice. Would that be possible?

Yes, it would sometimes compile fine and sometimes not, depending on the register selected by LLVM. There isn't much we can do about that.

@khimru khimru Mar 3, 2020

More than cryptic error messages I fear silent errors without any error messages.

See here: https://godbolt.org/z/5d0MxL

As you can see, if you pass a one-byte variable and request an "r" constraint (instead of a "q" or "Q" constraint), LLVM is all too happy to make a mess of your assembly (note how LLVM tries to return both y1 and y4 in the same eax register).

Will rustc be able to detect and report that somehow?

Member Author

That seems to be a bug in LLVM. It should not allocate the `*H` registers for inline assembly.

The emitted code is correct, though: after the assembly block, it first stores `al` somewhere else, only then `ah` in the entire `eax` register. Technically, since you only asked for the short registers, you shouldn't have a way to e.g. zero-extend one of them and clobber the other. Should this be a strict guarantee, that you don't get the `*h` registers?

Member Author

Yes, the way this RFC is currently written the `*h` registers are never chosen for operands (and I don't intend to change it).

The reason why I argue that LLVM's behavior is buggy is that Clang's inline asm is designed to emulate GCC's inline asm, and GCC never allocates `*h` for register operands.

I will submit a fix to LLVM as part of the asm! work.

@valarauca valarauca left a comment

Small comment on source/destination semantics.


The assembler template uses the same syntax as [format strings][format-syntax] (i.e. placeholders are specified by curly braces). The corresponding arguments are accessed in order, by index, or by name. However, implicit named arguments (introduced by [RFC #2795][rfc-2795]) are not supported.

The assembly code syntax used is that of the GNU assembler (GAS). The only exception is on x86 where the Intel syntax is used instead of GCC's AT&T syntax.
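
To make the placeholder rules concrete, here is a minimal sketch, assuming an x86-64 target and the stabilized `std::arch::asm!` path. `add_five` is a hypothetical helper. It uses explicit indices (`{0}`, `{1}`) and a named operand (`{number}`), and the template is written in Intel operand order (destination first).

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Hypothetical helper: returns `i + 5` (wrapping).
#[cfg(target_arch = "x86_64")]
pub fn add_five(i: u64) -> u64 {
    let o: u64;
    unsafe {
        asm!(
            "mov {0}, {1}",      // {0} = first operand (output), {1} = second (input)
            "add {0}, {number}", // named operand, declared after the positional ones
            out(reg) o,
            in(reg) i,
            number = in(reg) 5u64,
            options(pure, nomem, nostack),
        );
    }
    o
}

/// Plain-Rust fallback so the sketch still builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_five(i: u64) -> u64 {
    i.wrapping_add(5)
}
```

Because implicit named arguments are not supported, `{number}` has to be bound with `number = ...` in the operand list; a local variable called `number` would not be picked up.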

Does this imply that x86 will use the format of mnemonic destination, source (like Intel Syntax) while every other platform will use mnemonic source, destination (like AT&T/GAS)? Because that would seriously hurt portability and readability.

Member Author

Every other architecture already uses mnemonic destination, source. x86 is the exception with AT&T syntax that reverses this order.

Okay, so every platform is standardized to mnemonic dst, src?

This is a bit confusing, as I'm not sure if the Rust-Source is being specified, or the IR which is being transferred to the compiler.

So does this imply that on non-x86 platforms rustc will be responsible for re-ordering mnemonic arguments to the backend?

Member Author

There is no reordering needed, all non-x86 platforms already use the correct ordering. The Intel/ATT syntax split is only a thing on x86.

Amanieu commented Jan 14, 2020

> I’m a little surprised this is already being RFCed. I was expecting we’d come up with a prototype implementation as a procedural macro first, within the working group.

Unfortunately we can't implement this with just a proc macro that wraps the existing asm! implementation because we need type information for validation and to emit the correct template modifiers for LLVM.

This was referenced May 22, 2020
phansch pushed a commit to phansch/rust-clippy that referenced this pull request May 24, 2020
Implement new asm! syntax from RFC 2850

This PR implements the new `asm!` syntax proposed in rust-lang/rfcs#2850.

# Design

A large part of this PR revolves around taking an `asm!` macro invocation and plumbing it through all of the compiler layers down to LLVM codegen. Throughout the various stages, an `InlineAsm` generally consists of 3 components:

- The template string, which is stored as an array of `InlineAsmTemplatePiece`. Each piece represents either a literal or a placeholder for an operand (just like format strings).
```rust
pub enum InlineAsmTemplatePiece {
    String(String),
    Placeholder { operand_idx: usize, modifier: Option<char>, span: Span },
}
```

- The list of operands to the `asm!` (`in`, `[late]out`, `in[late]out`, `sym`, `const`). These are represented differently at each stage of lowering, but follow a common pattern:
  - `in`, `out` and `inout` all have an associated register class (`reg`) or explicit register (`"eax"`).
  - `inout` has 2 forms: one with a single expression that is both read from and written to, and one with two separate expressions for the input and output parts.
  - `out` and `inout` have a `late` flag (`lateout` / `inlateout`) to indicate that the register allocator is allowed to reuse an input register for this output.
  - `out` and the split variant of `inout` allow `_` to be specified for an output, which means that the output is discarded. This is used to allocate scratch registers for assembly code.
  - `sym` is a bit special since it only accepts a path expression, which must point to a `static` or a `fn`.

- The options set at the end of the `asm!` macro. The only one that is particularly of interest to rustc is `NORETURN` which makes `asm!` return `!` instead of `()`.
```rust
bitflags::bitflags! {
    pub struct InlineAsmOptions: u8 {
        const PURE = 1 << 0;
        const NOMEM = 1 << 1;
        const READONLY = 1 << 2;
        const PRESERVES_FLAGS = 1 << 3;
        const NORETURN = 1 << 4;
        const NOSTACK = 1 << 5;
    }
}
```
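
For readers who want to see how such flags combine, here is a self-contained sketch that mimics the bit layout above without the `bitflags` crate. The `Options` type and `check_options` function are hypothetical, and the rules encoded are the ones I understand the RFC to state (`nomem`/`readonly` are mutually exclusive, `pure` needs one of them and at least one real output, `noreturn` forbids outputs); this is not the compiler's actual validation code.

```rust
/// Hypothetical stand-in for `InlineAsmOptions`, using the same bit positions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Options(u8);

impl Options {
    pub const PURE: Options = Options(1 << 0);
    pub const NOMEM: Options = Options(1 << 1);
    pub const READONLY: Options = Options(1 << 2);
    pub const PRESERVES_FLAGS: Options = Options(1 << 3);
    pub const NORETURN: Options = Options(1 << 4);
    pub const NOSTACK: Options = Options(1 << 5);

    pub const fn union(self, other: Options) -> Options {
        Options(self.0 | other.0)
    }

    pub const fn contains(self, other: Options) -> bool {
        self.0 & other.0 == other.0
    }
}

/// `has_output` is true when at least one non-discarded output operand exists.
pub fn check_options(opts: Options, has_output: bool) -> Result<(), &'static str> {
    let pure = opts.contains(Options::PURE);
    let nomem = opts.contains(Options::NOMEM);
    let readonly = opts.contains(Options::READONLY);
    let noreturn = opts.contains(Options::NORETURN);

    if nomem && readonly {
        return Err("`nomem` and `readonly` are mutually exclusive");
    }
    if pure && !(nomem || readonly) {
        return Err("`pure` must be combined with `nomem` or `readonly`");
    }
    if pure && !has_output {
        return Err("`pure` requires at least one non-discarded output");
    }
    if noreturn && has_output {
        return Err("`noreturn` cannot be combined with outputs");
    }
    Ok(())
}
```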

## AST

`InlineAsm` is represented as an expression in the AST:

```rust
pub struct InlineAsm {
    pub template: Vec<InlineAsmTemplatePiece>,
    pub operands: Vec<(InlineAsmOperand, Span)>,
    pub options: InlineAsmOptions,
}

pub enum InlineAsmRegOrRegClass {
    Reg(Symbol),
    RegClass(Symbol),
}

pub enum InlineAsmOperand {
    In {
        reg: InlineAsmRegOrRegClass,
        expr: P<Expr>,
    },
    Out {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        expr: Option<P<Expr>>,
    },
    InOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        expr: P<Expr>,
    },
    SplitInOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        in_expr: P<Expr>,
        out_expr: Option<P<Expr>>,
    },
    Const {
        expr: P<Expr>,
    },
    Sym {
        expr: P<Expr>,
    },
}
```

The `asm!` macro is implemented in librustc_builtin_macros and outputs an `InlineAsm` AST node. The template string is parsed using libfmt_macros, and positional and named operands are resolved to explicit operand indices. Since target information is not available to macro invocations, validation of the registers and register classes is deferred to AST lowering.
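
The resolution step can be sketched roughly as follows. This is a simplified, standalone illustration and not the libfmt_macros-based code in the PR: `Piece` is a hypothetical stand-in for `InlineAsmTemplatePiece` without the `Span`, and `parse_template` only understands `{}`, `{N}`, `{name}`, an optional single-character `:m` modifier, and `{{`/`}}` escapes.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum Piece {
    String(String),
    Placeholder { operand_idx: usize, modifier: Option<char> },
}

/// `named` maps operand names to their index in the operand list.
pub fn parse_template(template: &str, named: &HashMap<&str, usize>) -> Result<Vec<Piece>, String> {
    let mut pieces = Vec::new();
    let mut lit = String::new();
    let mut next_positional = 0;
    let mut chars = template.chars().peekable();

    while let Some(c) = chars.next() {
        match c {
            // `{{` and `}}` are escapes for literal braces.
            '{' if chars.peek() == Some(&'{') => { chars.next(); lit.push('{'); }
            '}' if chars.peek() == Some(&'}') => { chars.next(); lit.push('}'); }
            '}' => return Err("unmatched `}`".to_string()),
            '{' => {
                // Collect everything up to the closing brace.
                let mut body = String::new();
                loop {
                    match chars.next() {
                        Some('}') => break,
                        Some(ch) => body.push(ch),
                        None => return Err("unterminated placeholder".to_string()),
                    }
                }
                let (arg, modifier) = match body.split_once(':') {
                    Some((a, m)) if m.chars().count() == 1 => (a, m.chars().next()),
                    Some((_, m)) => return Err(format!("bad modifier `{}`", m)),
                    None => (body.as_str(), None),
                };
                // Resolve implicit positional, explicit index, or name to an index.
                let operand_idx = if arg.is_empty() {
                    next_positional += 1;
                    next_positional - 1
                } else if let Ok(i) = arg.parse::<usize>() {
                    i
                } else {
                    *named.get(arg).ok_or_else(|| format!("unknown operand `{}`", arg))?
                };
                if !lit.is_empty() {
                    pieces.push(Piece::String(std::mem::take(&mut lit)));
                }
                pieces.push(Piece::Placeholder { operand_idx, modifier });
            }
            other => lit.push(other),
        }
    }
    if !lit.is_empty() {
        pieces.push(Piece::String(lit));
    }
    Ok(pieces)
}
```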

## HIR

`InlineAsm` is represented as an expression in the HIR:

```rust
pub struct InlineAsm<'hir> {
    pub template: &'hir [InlineAsmTemplatePiece],
    pub operands: &'hir [InlineAsmOperand<'hir>],
    pub options: InlineAsmOptions,
}

pub enum InlineAsmRegOrRegClass {
    Reg(InlineAsmReg),
    RegClass(InlineAsmRegClass),
}

pub enum InlineAsmOperand<'hir> {
    In {
        reg: InlineAsmRegOrRegClass,
        expr: Expr<'hir>,
    },
    Out {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        expr: Option<Expr<'hir>>,
    },
    InOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        expr: Expr<'hir>,
    },
    SplitInOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        in_expr: Expr<'hir>,
        out_expr: Option<Expr<'hir>>,
    },
    Const {
        expr: Expr<'hir>,
    },
    Sym {
        expr: Expr<'hir>,
    },
}
```

AST lowering is where `InlineAsmRegOrRegClass` is converted from `Symbol`s to an actual register or register class. If any modifiers are specified for a template string placeholder, these are validated against the set allowed for that operand type. Finally, explicit registers for inputs and outputs are checked for conflicts (same register used for different operands).
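
The conflict check in the last sentence could look roughly like the sketch below. Everything here is hypothetical (`ExplicitOperand`, `find_conflict`), registers are compared by name only, and sub-register aliasing (for example `ax` versus `eax`) is ignored. The rule encoded is an assumption based on the RFC wording: two inputs or two outputs may not share a register, and an input may share a register with an output only if that output is `late`.

```rust
#[derive(Clone, Copy, Debug)]
pub struct ExplicitOperand<'a> {
    pub reg: &'a str,
    pub is_input: bool,
    pub is_output: bool,
    pub late: bool,
}

impl<'a> ExplicitOperand<'a> {
    pub fn input(reg: &'a str) -> Self {
        ExplicitOperand { reg, is_input: true, is_output: false, late: false }
    }
    pub fn output(reg: &'a str, late: bool) -> Self {
        ExplicitOperand { reg, is_input: false, is_output: true, late }
    }
}

fn conflicts(a: &ExplicitOperand, b: &ExplicitOperand) -> bool {
    if a.reg != b.reg {
        return false;
    }
    (a.is_input && b.is_input)
        || (a.is_output && b.is_output)
        // A non-late output is written before inputs are consumed,
        // so it cannot share a register with an input.
        || (a.is_input && b.is_output && !b.late)
        || (b.is_input && a.is_output && !a.late)
}

/// Returns the indices of the first conflicting pair, if any.
pub fn find_conflict(ops: &[ExplicitOperand]) -> Option<(usize, usize)> {
    for i in 0..ops.len() {
        for j in (i + 1)..ops.len() {
            if conflicts(&ops[i], &ops[j]) {
                return Some((i, j));
            }
        }
    }
    None
}
```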

## Type checking

Each register class has a whitelist of types that it may be used with. After the types of all operands have been determined, the `intrinsicck` pass will check that these types are in the whitelist. It also checks that split `inout` operands have compatible types and that `const` operands are integers or floats. Suggestions are emitted where needed if a template modifier should be used for an operand based on the type that was passed into it.

## HAIR

`InlineAsm` is represented as an expression in the HAIR:

```rust
crate enum ExprKind<'tcx> {
    // [..]
    InlineAsm {
        template: &'tcx [InlineAsmTemplatePiece],
        operands: Vec<InlineAsmOperand<'tcx>>,
        options: InlineAsmOptions,
    },
}
crate enum InlineAsmOperand<'tcx> {
    In {
        reg: InlineAsmRegOrRegClass,
        expr: ExprRef<'tcx>,
    },
    Out {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        expr: Option<ExprRef<'tcx>>,
    },
    InOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        expr: ExprRef<'tcx>,
    },
    SplitInOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        in_expr: ExprRef<'tcx>,
        out_expr: Option<ExprRef<'tcx>>,
    },
    Const {
        expr: ExprRef<'tcx>,
    },
    SymFn {
        expr: ExprRef<'tcx>,
    },
    SymStatic {
        expr: ExprRef<'tcx>,
    },
}
```

The only significant change compared to HIR is that `Sym` has been lowered to either a `SymFn` whose `expr` is a `Literal` ZST of the `fn`, or a `SymStatic` whose `expr` is a `StaticRef`.

## MIR

`InlineAsm` is represented as a `Terminator` in the MIR:

```rust
pub enum TerminatorKind<'tcx> {
    // [..]

    /// Block ends with an inline assembly block. This is a terminator since
    /// inline assembly is allowed to diverge.
    InlineAsm {
        /// The template for the inline assembly, with placeholders.
        template: &'tcx [InlineAsmTemplatePiece],

        /// The operands for the inline assembly, as `Operand`s or `Place`s.
        operands: Vec<InlineAsmOperand<'tcx>>,

        /// Miscellaneous options for the inline assembly.
        options: InlineAsmOptions,

        /// Destination block after the inline assembly returns, unless it is
        /// diverging (InlineAsmOptions::NORETURN).
        destination: Option<BasicBlock>,
    },
}

pub enum InlineAsmOperand<'tcx> {
    In {
        reg: InlineAsmRegOrRegClass,
        value: Operand<'tcx>,
    },
    Out {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        place: Option<Place<'tcx>>,
    },
    InOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        in_value: Operand<'tcx>,
        out_place: Option<Place<'tcx>>,
    },
    Const {
        value: Operand<'tcx>,
    },
    SymFn {
        value: Box<Constant<'tcx>>,
    },
    SymStatic {
        value: Box<Constant<'tcx>>,
    },
}
```

As part of HAIR lowering, `InOut` and `SplitInOut` operands are lowered to a split form with a separate `in_value` and `out_place`.
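
A toy model of that lowering step, with strings standing in for expressions and places. The names `HairInOut`, `MirInOut` and `lower_inout` are hypothetical and only meant to show the shape of the transformation, not the real compiler types.

```rust
#[derive(Debug, PartialEq)]
pub enum HairInOut {
    /// `inout(reg) x`: one expression is both read and written.
    InOut { late: bool, expr: String },
    /// `inout(reg) a => b`: separate input and (optional) output expressions.
    SplitInOut { late: bool, in_expr: String, out_expr: Option<String> },
}

/// The single split form used after lowering.
#[derive(Debug, PartialEq)]
pub struct MirInOut {
    pub late: bool,
    pub in_value: String,
    pub out_place: Option<String>,
}

pub fn lower_inout(op: HairInOut) -> MirInOut {
    match op {
        // The same expression supplies the input value and the output place.
        HairInOut::InOut { late, expr } => MirInOut {
            late,
            in_value: expr.clone(),
            out_place: Some(expr),
        },
        // Already split; a discarded output (`_`) stays `None`.
        HairInOut::SplitInOut { late, in_expr, out_expr } => MirInOut {
            late,
            in_value: in_expr,
            out_place: out_expr,
        },
    }
}
```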

Semantically, the `InlineAsm` terminator is similar to the `Call` terminator except that it has multiple output places where a `Call` only has a single return place output.

The constant promotion pass is used to ensure that `const` operands are actually constants (using the same logic as `#[rustc_args_required_const]`).

## Codegen

Operands are lowered one more time before being passed to LLVM codegen:

```rust
pub enum InlineAsmOperandRef<'tcx, B: BackendTypes + ?Sized> {
    In {
        reg: InlineAsmRegOrRegClass,
        value: OperandRef<'tcx, B::Value>,
    },
    Out {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        place: Option<PlaceRef<'tcx, B::Value>>,
    },
    InOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        in_value: OperandRef<'tcx, B::Value>,
        out_place: Option<PlaceRef<'tcx, B::Value>>,
    },
    Const {
        string: String,
    },
    SymFn {
        instance: Instance<'tcx>,
    },
    SymStatic {
        def_id: DefId,
    },
}
```

The operands are lowered to LLVM operands and constraint codes as follows:
- `out` and the output part of `inout` operands are added first, as required by LLVM. Late output operands have a `=` prefix added to their constraint code; non-late output operands have a `=&` prefix added to their constraint code.
- `in` operands are added normally.
- `inout` operands are tied to the matching output operand.
- `sym` operands are passed as function pointers or pointers, using the `"s"` constraint.
- `const` operands are formatted to a string and directly inserted in the template string.
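
The ordering rules above can be sketched as a small function that builds an LLVM constraint string. This is a simplified illustration with hypothetical names (`Operand`, `build_constraints`); it takes the per-operand constraint code (such as `r` or `{eax}`) as given, assumes a tied input is expressed by the index of its matching output, and leaves out the clobbers that the options add.

```rust
pub enum Operand {
    In { code: &'static str },
    Out { code: &'static str, late: bool },
    InOut { code: &'static str, late: bool },
    Sym,
    Const,
}

pub fn build_constraints(operands: &[Operand]) -> String {
    let mut constraints: Vec<String> = Vec::new();
    // For each Rust operand, the index of its LLVM output (if it has one).
    let mut output_index: Vec<Option<usize>> = vec![None; operands.len()];

    // Pass 1: outputs come first.
    for (i, op) in operands.iter().enumerate() {
        match op {
            Operand::Out { code, late } | Operand::InOut { code, late } => {
                let prefix = if *late { "=" } else { "=&" };
                output_index[i] = Some(constraints.len());
                constraints.push(format!("{}{}", prefix, code));
            }
            _ => {}
        }
    }

    // Pass 2: inputs, with `inout` inputs tied to their output by index.
    for (i, op) in operands.iter().enumerate() {
        match op {
            Operand::In { code } => constraints.push(code.to_string()),
            Operand::InOut { .. } => {
                let idx = output_index[i].expect("inout operands always have an output part");
                constraints.push(idx.to_string());
            }
            Operand::Sym => constraints.push("s".to_string()),
            // Pure outputs were handled above; `const` is pasted into the template.
            Operand::Out { .. } | Operand::Const => {}
        }
    }

    constraints.join(",")
}
```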

The template string is converted to LLVM form:
- `$` characters are escaped as `$$`.
- `const` operands are converted to strings and inserted directly.
- Placeholders are formatted as `${X:M}` where `X` is the operand index and `M` is the modifier character. Modifiers are converted from the Rust form to the LLVM form.
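
A hedged sketch of the template conversion, again with hypothetical names (`Piece`, `Lowered`, `to_llvm_template`). It assumes each Rust operand has already been mapped either to an LLVM operand index or, for `const`, to its formatted text. It also assumes a placeholder without a modifier is written as `${X}`, and it skips the Rust-to-LLVM modifier translation.

```rust
pub enum Piece {
    String(String),
    Placeholder { operand_idx: usize, modifier: Option<char> },
}

pub enum Lowered {
    /// Index of the operand in LLVM's operand list (outputs first).
    Llvm(usize),
    /// A `const` operand, already formatted as text.
    Const(String),
}

pub fn to_llvm_template(pieces: &[Piece], operands: &[Lowered]) -> String {
    let mut out = String::new();
    for piece in pieces {
        match piece {
            // `$` introduces operands in LLVM templates, so literal `$` is doubled.
            Piece::String(s) => out.push_str(&s.replace('$', "$$")),
            Piece::Placeholder { operand_idx, modifier } => match &operands[*operand_idx] {
                Lowered::Const(text) => out.push_str(text),
                Lowered::Llvm(idx) => match modifier {
                    Some(m) => out.push_str(&format!("${{{}:{}}}", idx, m)),
                    None => out.push_str(&format!("${{{}}}", idx)),
                },
            },
        }
    }
    out
}
```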

The various options are converted to clobber constraints or LLVM attributes; refer to the [RFC](https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#mapping-to-llvm-ir) for more details.

Note that LLVM is sometimes rather picky about what types it accepts for certain constraint codes so we sometimes need to insert conversions to/from a supported type. See the target-specific ISelLowering.cpp files in LLVM for details.

# Adding support for new architectures

Adding inline assembly support to an architecture is mostly a matter of defining the registers and register classes for that architecture. All the definitions for register classes are located in `src/librustc_target/asm/`.

Additionally you will need to implement lowering of these register classes to LLVM constraint codes in `src/librustc_codegen_llvm/asm.rs`.
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 14, 2021
Stabilize asm! and global_asm!

Tracking issue: rust-lang#72016

It's been almost 2 years since the original [RFC](rust-lang/rfcs#2850) was posted and we're finally ready to stabilize this feature!

The main changes in this PR are:
- Removing `asm!` and `global_asm!` from the prelude as per the decision in rust-lang#87228.
- Stabilizing the `asm` and `global_asm` features.
- Removing the unstable book pages for `asm` and `global_asm`. The contents are moved to the [reference](rust-lang/reference#1105) and [rust by example](rust-lang/rust-by-example#1483).
  - All links to these pages have been removed to satisfy the link checker. In a later PR these will be replaced with links to the reference or rust by example.
- Removing the automatic suggestion for using `llvm_asm!` instead of `asm!` if you're still using the old syntax, since it doesn't work anymore with `asm!` no longer being in the prelude. This only affects code that predates the old LLVM-style `asm!` being renamed to `llvm_asm!`.
- Updating `stdarch` and `compiler-builtins`.
- Updating all the tests.

r? `@joshtriplett`
flip1995 pushed a commit to flip1995/rust-clippy that referenced this pull request Dec 17, 2021
Stabilize asm! and global_asm!
