codegen #[naked] functions using global asm #128004

Open. folkertdev wants to merge 7 commits into master from naked-fn-asm.

@folkertdev
Contributor

folkertdev commented Jul 20, 2024

tracking issue: #90957

Fixes #124375

This implements the approach suggested in the tracking issue: use the existing global assembly infrastructure to emit the body of #[naked] functions. The main advantage is that we now have full control over what gets generated, and are no longer dependent on LLVM not sneakily messing with our output (inlining, adding extra instructions, etc).
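To make the idea concrete, here is a minimal sketch (not the code in this PR) of what rendering a naked function as a block of global assembly text could look like for an ELF target. `NakedFn` and `render_global_asm` are hypothetical names invented for illustration, and the exact directives the PR emits may well differ.

```rust
/// Hypothetical description of a naked function; this is not a rustc type.
struct NakedFn<'a> {
    /// Symbol name the function is exported under.
    symbol: &'a str,
    /// Whether the symbol should get hidden visibility.
    hidden: bool,
    /// Alignment of the function in bytes.
    align: u64,
    /// The user-written assembly body, one instruction per line.
    body: &'a str,
}

/// Render the function as GNU-style assembler text (ELF flavored).
fn render_global_asm(f: &NakedFn) -> String {
    let name = f.symbol;
    let mut out = String::new();

    // Prologue: put the function in its own section and declare the symbol.
    out.push_str(&format!(".pushsection .text.{name},\"ax\",@progbits\n"));
    out.push_str(&format!(".balign {}\n", f.align));
    out.push_str(&format!(".globl {name}\n"));
    if f.hidden {
        out.push_str(&format!(".hidden {name}\n"));
    }
    out.push_str(&format!(".type {name},@function\n"));
    out.push_str(&format!("{name}:\n"));

    // Body: emitted verbatim, so nothing can be inserted before or after it.
    for line in f.body.lines() {
        out.push_str("    ");
        out.push_str(line.trim());
        out.push('\n');
    }

    // Epilogue: record the symbol size and restore the previous section.
    out.push_str(&format!(".size {name}, . - {name}\n"));
    out.push_str(".popsection\n");
    out
}
```

Because the whole function is plain text handed to the assembler, the body is reproduced exactly as written, which is the "full control" mentioned above.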

I discussed this approach with @Amanieu and while I think the general direction is correct, there is probably a bunch of stuff that needs to change or move around here. I'll leave some inline comments on things that I'm not sure about.

Combined with #127853, if both are accepted, I think that resolves all steps from the tracking issue.

r? @Amanieu

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 20, 2024

@tgross35
Contributor

On the tracking issue a naked_asm! macro was proposed that would more closely follow global_asm! (cc @Lokathor since you liked this idea). It sounds like this PR effectively turns #[naked] + asm! into exactly what naked_asm would do, so that would no longer be necessary?

The changes in this PR seem like a good direction.

@Lokathor
Contributor

I think that separately from any internal implementation change, the surface syntax of rust should use naked_asm separately from asm, because the two have different enough user interface and semantics.

@folkertdev
Contributor Author

I agree that it is still a good idea to add naked_asm! as a public api, if only because it makes the documentation much more straightforward: instead of having to explain the interaction between #[naked] and asm!, we just mandate that #[naked] must use a naked_asm! block and the naked_asm! docs can give the exact details and restrictions.

But that is separate from how the codegen works, which is what this PR is for.

Comment on lines +182 to +177
    if let Visibility::Hidden = item_data.visibility {
        writeln!(begin, ".hidden {asm_name}").unwrap();
    }
Contributor Author

I've not been able to actually generate code that triggers this if. Visibility just seems to always be Default. So this is entirely untested at the moment.

Contributor

For anything like this, drop a question on Zulip (https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp). Unfortunately, that enum doesn't seem to be very well documented.

}

fn inline_to_global_operand<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>>(
cx: &'a Bx::CodegenCx,
Member

This function can probably just take TyCtxt.

cx.tcx(),
value.span,
const_value,
cx.layout_of(value.ty()),
Member

Ah, this layout_of call uses the CodegenCx. This is during codegen though, so RevealAllLayoutCx (a wrapper around TyCtxt) is enough: https://github.com/rust-lang/rustc_codegen_cranelift/blob/b70ad2defd4bb5fba6af7958893e22be0f33dfdd/src/common.rs#L450-L518 Maybe it should be uplifted out of cg_clif?

Contributor Author

seems useful, should that be part of this PR though?

@folkertdev folkertdev marked this pull request as ready for review July 21, 2024 15:05
@rustbot rustbot added the A-run-make Area: port run-make Makefiles to rmake.rs label Jul 26, 2024
@folkertdev folkertdev force-pushed the naked-fn-asm branch 2 times, most recently from f54a458 to 6783e26 Compare July 26, 2024 17:32
@bors
Contributor

bors commented Jul 28, 2024

☔ The latest upstream changes (presumably #128298) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Contributor

bors commented Jul 29, 2024

☔ The latest upstream changes (presumably #125443) made this pull request unmergeable. Please resolve the merge conflicts.

rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Aug 7, 2024
Rollup merge of rust-lang#128362 - folkertdev:naked-function-symbol-visibility, r=bjorn3

add test for symbol visibility of `#[naked]` functions

tracking issue: rust-lang#90957

This test is extracted from rust-lang#128004

That PR attempts to generate naked functions as an extern function declaration, combined with a global asm block that provides the implementation for that declaration.

In order to link declaration and definition together, some flavor of external linking must be used: LLVM will error for other linkage types. Specifically the allowed options are `#[linkage = "external"]` and `#[linkage = "extern_weak"]`. That is kind of an implementation detail though: to the user, a naked function should just behave like a normal function.

Hence it should be visible to the linker under the same circumstances as a normal, vanilla function and have the same attributes (Weak, External). Getting this behavior right will require some care, so I think it's a good idea to lock it in now, before making any changes, to make sure we don't regress.

Are there any interesting cases that I missed here? E.g. is checking on different architectures worth it? I don't think the other binary types (rlib etc) are relevant here, but may be missing something.

r? ``@bjorn3``

@rustbot
Collaborator

rustbot commented Aug 11, 2024

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

@folkertdev
Contributor Author

I think this is as far as I can take it without some external feedback

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 11, 2024
Member

@Amanieu Amanieu left a comment

Overall LGTM. I'm still not 100% sure about the linkage-related things, but I think it's good enough to merge. We can handle any issues that are discovered later.

Member

Also test that at the end of each function the assembler mode is restored to the default for the target.

Contributor Author

@folkertdev folkertdev Sep 5, 2024

what are these assembler modes based on? I don't see .arm here at all, and for thumb only .thumb_func appears to be used.

https://godbolt.org/z/rjjc4KxnG

and even then only at the start, not at the end

        .section        .text.test_thumb,"ax",%progbits
        .globl  test_thumb
        .p2align        1
        .type   test_thumb,%function
        .code   16
        .thumb_func
test_thumb:
.Lfunc_begin0:
        .fnstart
        .cfi_sections .debug_frame
        .cfi_startproc
        .file   1 "/app" "example.rs"
        .loc    1 19 5 prologue_end
        bx      lr
        .inst.n 0xdefe
.Lfunc_end0:
        .size   test_thumb, .Lfunc_end0-test_thumb
        .cfi_endproc
        .cantunwind
        .fnend

Contributor Author

this is confusing, the assert is triggered in this example

    let is_thumb = tcx.sess.unstable_target_features.contains(&sym::thumb_mode);

    let attrs = tcx.codegen_fn_attrs(instance.def_id());
    let link_section = attrs.link_section.map(|symbol| symbol.as_str().to_string());
    let align = attrs.alignment.map(|a| a.bytes()).unwrap_or(4);

    let (arch_prefix, arch_suffix) = if is_arm {
        (
            match attrs.instruction_set {
                None => match is_thumb {
                    true => ".thumb\n.thumb_func",
                    false => ".arm",
                },
                Some(InstructionSetAttr::ArmT32) => {
                    assert!(is_thumb);
                    ".thumb\n.thumb_func"
                }

that conflicts with

/// Computes the set of target features used in a function for the purposes of
/// inline assembly.
fn asm_target_features(tcx: TyCtxt<'_>, did: DefId) -> &FxIndexSet<Symbol> {
    let mut target_features = tcx.sess.unstable_target_features.clone();
    if tcx.def_kind(did).has_codegen_attrs() {
        let attrs = tcx.codegen_fn_attrs(did);
        target_features.extend(attrs.target_features.iter().map(|feature| feature.name));
        match attrs.instruction_set {
            None => {}
            Some(InstructionSetAttr::ArmA32) => {
                // FIXME(#120456) - is `swap_remove` correct?
                target_features.swap_remove(&sym::thumb_mode);
            }
            Some(InstructionSetAttr::ArmT32) => {
                target_features.insert(sym::thumb_mode);
            }
        }
    }

    tcx.arena.alloc(target_features)
}

so information is getting lost somewhere I think?
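A small model of the mismatch, as far as it can be read from the two snippets above. This is a sketch using plain string sets, not compiler code: `function_features` and the `InstructionSet` enum are hypothetical stand-ins for `asm_target_features` and `InstructionSetAttr`, and the string `"thumb-mode"` stands in for `sym::thumb_mode`.

```rust
use std::collections::BTreeSet;

/// Stand-in for `InstructionSetAttr` (hypothetical, for illustration only).
#[derive(Clone, Copy)]
enum InstructionSet {
    ArmA32,
    ArmT32,
}

/// Per-function feature set: start from the session-wide features and let
/// the `#[instruction_set]` attribute override the thumb-mode feature.
fn function_features(
    session: &BTreeSet<&'static str>,
    attr: Option<InstructionSet>,
) -> BTreeSet<&'static str> {
    let mut features = session.clone();
    match attr {
        // No attribute: the function inherits the session default.
        None => {}
        // Explicit A32: thumb-mode is removed for this function only.
        Some(InstructionSet::ArmA32) => {
            features.remove("thumb-mode");
        }
        // Explicit T32: thumb-mode is added for this function only.
        Some(InstructionSet::ArmT32) => {
            features.insert("thumb-mode");
        }
    }
    features
}
```

On an Arm-mode target with an explicit T32 attribute, the session-wide set has no `thumb-mode` while the per-function set does. An `is_thumb` computed from the session alone would then presumably make `assert!(is_thumb)` fire, which matches the behavior described above.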

Contributor

I don't have deep assembly knowledge, but I do know a little bit about the hardware.

It's good to separate out two things:

  • There is the hardware that runs the machine code
  • Then there are the instructions making up that machine code

The hardware can run in two modes: Arm mode and Thumb mode. Some cpus can run both and some can only run one, but they all behave the same way. The cpus can switch which mode they're running in at runtime on the fly.
What's important to know is that the hardware does not know which kind of machine code it is running. It must be told when to switch.

Afaik, this can only be done in branch (b) instructions that also have the x (for eXchange) in them, like bx and blx. All functions (and I guess labels) are 16-bit aligned. So the lowest bit has no significance in determining the address to jump to. Instead the lowest bit is known as the 'thumb' bit. On a bx instruction, if the thumb bit is not set then the cpu will switch to (or remain in) Arm mode. If the thumb bit is set, then the cpu will switch to Thumb mode.

This is where .thumb_func comes in. It implies a .thumb directive and gives information to the assembler and linker that this is a thumb symbol and as such the thumb bit must be set when branching to it. This is called 'interworking' and there's a bit more to it because the cpu must also possibly switch modes when returning from a function.

Then over to the instructions themselves. They are assembled to machine code using either the Arm or the Thumb encoding. In the assembly you can tell the assembler which one is to be used.

When adding .arm or .code 32, it means the instructions after it are to be assembled using the Arm mode.
When adding .thumb or .code 16, then the instructions will be encoded as Thumb.

Then another thing is which syntax is used. It can be .syntax [unified | divided]. When unified is used, then a newer assembly is used that can specify both Arm and Thumb instructions. The older divided has slightly different syntax for the two modes. I'm not sure what the default is when you don't specify it. But I believe that when looking at the instructions using e.g. objdump it's always shown with the unified syntax. I don't know... I also don't know if it's relevant for you now.

I found this nice page with directives with some short explanations:
https://sourceware.org/binutils/docs/as/ARM-Directives.html

Note though that for .thumb_func it says: "This directive is not necessary when generating EABI objects. On these targets the encoding is implicit when generating Thumb code."
I think that may not be true for the assembly we put into LLVM and that .thumb_func is required. But I've not tested that.

TLDR:

  • .arm or .code 32 switch the instruction encoding to Arm mode
  • .thumb or .code 16 switch the instruction encoding to Thumb mode
  • Use .thumb_func to mark a symbol as a Thumb symbol. It sets up the 'interworking' so the cpu will switch to thumb mode when branching to the symbol or returning to it. It also implies .thumb
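The thumb-bit rule described above can be sketched as a tiny decoder. This is only a model of the description in this comment, not of any particular cpu: `Mode` and `decode_interworking_target` are made-up names, and alignment rules beyond the lowest bit are ignored.

```rust
/// Execution mode selected by an interworking branch such as `bx` or `blx`.
#[derive(Debug, PartialEq, Eq)]
enum Mode {
    Arm,
    Thumb,
}

/// Split a branch target into the mode it selects and the address that is
/// actually jumped to. The lowest bit is the 'thumb' bit: it picks the mode
/// and is not part of the address.
fn decode_interworking_target(target: u32) -> (Mode, u32) {
    let mode = if target & 1 == 1 { Mode::Thumb } else { Mode::Arm };
    // Clear the thumb bit to get the real instruction address.
    (mode, target & !1)
}
```

A symbol marked with .thumb_func is one whose address gets the lowest bit set when something branches to it, so a `bx` to that address lands in Thumb mode.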

I hope this helps understanding! (Or at least didn't make it worse)

Contributor Author

nice, I've just added two

//@ revisions: arm-mode thumb-mode
//@ [arm-mode] compile-flags: --target armv5te-none-eabi
//@ [thumb-mode] compile-flags: --target thumbv5te-none-eabi

// <snip>

// CHECK: .arm
// CHECK-LABEL: test_unspecified:
// CHECK: bx lr
// CHECK: .popsection
// arm-mode: .arm
// thumb-mode: .thumb

i.e., on both arm and thumb mode, the .code value is reset to the default for the target at the end of the global asm block
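A rough sketch of the rule being tested here: the directives before the function depend on the function's own instruction set, while the directive after it depends only on the target default. `mode_directives` and `InstructionSet` are hypothetical names, and the actual logic in naked_asm.rs may be structured differently.

```rust
/// Stand-in for the `#[instruction_set]` attribute value (hypothetical).
#[derive(Clone, Copy)]
enum InstructionSet {
    ArmA32,
    ArmT32,
}

/// Returns `(prefix, suffix)`: the directives emitted before the function
/// body and the directive emitted after it.
fn mode_directives(
    target_is_thumb: bool,
    attr: Option<InstructionSet>,
) -> (&'static str, &'static str) {
    // The function's own mode: an explicit attribute wins over the target default.
    let function_is_thumb = match attr {
        None => target_is_thumb,
        Some(InstructionSet::ArmA32) => false,
        Some(InstructionSet::ArmT32) => true,
    };
    let prefix = if function_is_thumb { ".thumb\n.thumb_func" } else { ".arm" };
    // The suffix restores the target default, whatever the function used.
    let suffix = if target_is_thumb { ".thumb" } else { ".arm" };
    (prefix, suffix)
}
```

With this shape, a T32 function on an Arm-mode target switches to Thumb for its body and back to `.arm` afterwards, so later assembly in the same module is encoded in the target's default mode.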

That seems to fix the final open question?

@bors
Contributor

bors commented Sep 7, 2024

☔ The latest upstream changes (presumably #130066) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Contributor

bors commented Oct 7, 2024

☔ The latest upstream changes (presumably #128651) made this pull request unmergeable. Please resolve the merge conflicts.

first steps of codegen for `#[naked]` functions using `global_asm!`

configure external linkage if no linkage is explicitly set

create a `FunctionCx` and properly evaluate consts

inline attribute is no longer relevant for naked functions

the naked attribute no longer needs to be set by llvm/...

we're generating global asm now, so this attribute is meaningless for the codegen backend
correctly emit `.hidden`

this test was added in rust-lang#105193

but actually NO_COVERAGE is no longer a thing in the compiler. Sadly,
the link to the issue is broken, so I don't know what the problem was
originally, but I don't think this is relevant any more with the global
asm approach

rename test file

because it now specifically checks for directives only used by
non-macos, non-windows x86_64

add codegen tests for 4 interesting platforms

add codegen test for the `#[instruction_set]` attribute

add test for `#[link_section]`

use `tcx.codegen_fn_attrs` to get attribute info

Fix rust-lang#124375

inline const monomorphization/evaluation

getting rid of FunctionCx

mark naked functions as `InstantiatedMode::GloballyShared`

this makes sure that the function prototype is defined correctly, and we don't see LLVM complaining about a global value with invalid linkage

monomorphize type given to `SymFn`

remove hack that always emits `.globl`

monomorphize type given to `Const`

remove `linkage_directive`

make naked functions always have external linkage

mark naked functions as `#[inline(never)]`

add test file for functional generics/const/impl/trait usage of naked functions
… it earlier, then some other logic causes invalid visibility for the item (exporting when it shouldn't).
- codegen tests: change `windows` to `win`
- cleanup
- fix review comments
    - better way of checking for thumb
    - get the mangled name from the codegen backend
- propagate function alignment
- fix gcc backend
- fix asan test
- check that assembler mode restored
Labels
A-run-make Area: port run-make Makefiles to rmake.rs S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

ICE: codegen: index out of bounds: the len is 3 but the index is 4
10 participants