Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT] intrinsics for all architectures appear in rustdoc #1104

Merged
merged 45 commits into from
Apr 17, 2021

Conversation

Byron
Copy link
Member

@Byron Byron commented Apr 1, 2021

I will keep pushing changes until all files mentioned in #1055 have been adjusted, and would be glad if you could let me know early if I generally get it right 😅.

This PR starts out with a sample on how I would go about this.
Thanks for your help.

@rust-highfive
Copy link

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Amanieu (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

Copy link
Member

@Amanieu Amanieu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking back at #1055, I now realize that we don't need to add any doc(cfg) at all. What we need to do is reorganize the arm/aarch64 code so that:

  • acle contains code shared between ARM and AArch64.
  • arm contains ARM-specific code.
  • aarch64 contains AArch64-specific code. It also needs to only re-export acle instead of all of arm.

This shouldn't be too difficult to do:

  • The neon, crc and crypto submodules of arm need to be moved to acle.
  • The sat, dsp and simd32 submodules of acle need to be moved to arm.
  • Some functions in neon/mod.rs are marked with #[cfg(target_arch = "arm")], these should be moved to a neon module under arm.
  • Don't worry about the v6, v7 and armclang modules for now: I'm actually considering deleting them since they aren't in the ARM spec.

Once this reorganization is done, we still need to add doc to any cfg blocks that match on target_feature or target_arch otherwise these functions will be excluded from docs generated from x86.

Comment on lines 12 to 13
#[cfg(any(target_feature = "v6", target_arch = "aarch64", doc))]
#[doc(cfg(target_arch = "arm"))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding doc to the cfg is correct: we want this function to be included in the documentation even if we aren't compiling for arm/aarch64 (e.g. the current target is x86).

However the doc(cfg) is not needed here. We don't need to mark this function as ARM-only (which is incorrect: this function is available on both ARM and AArch64). We already have a blanket doc(cfg) over the entire arm and aarch64 top level modules.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I correct in understanding that aarch64 and arm will each re-export all of acle, so all the available intrinsics for a given target will get rendered together?

I'm not sure if there is known rationale for this, so please excuse my ignorance, but the naming of acle is a little awkward; it usually stands for "Arm C Language Extension", which would be fine if it were simple bindings, but it looks like a subset. Does a name like arm_shared make more sense here?

Out of curiosity, why does the same approach not work for v6 and the like? When you say that they "aren't in the ARM spec", which specification are you referring to? There are definitely intrinsics available at v7, but not v6, for example. One argument for keeping them all is so that run-time feature detection (std::detect) can be used to guard intrinsics not available in target_feature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, arm_shared would make more sense here. Let's rename acle to that.

I actually misread the ACLE spec, the v6 and v7 modules can stay in arm.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for the hint, I found the blanket doc(cfg) and hope to slowly get the idea :). Should be fixed with 8c6fdc5 .

@Byron
Copy link
Member Author

Byron commented Apr 1, 2021

Thanks a lot, I will make the changes accordingly, in small steps with about a step per day. This should help to change course early while remaining fast to review.

@Byron
Copy link
Member Author

Byron commented Apr 2, 2021

I pushed the first tiny step towards moving modules to acle which should eventually be renamed to arm_shared if I understand the comments correctly.
The directives used in the moved modules are unchanged and I wonder if that makes sense. If it does, I can continue with moving modules 'mechanically' and when deemed correct I would continue with the addition of the #[cfg(…, doc) adjustment started in 21b4259 .

@@ -119,6 +119,9 @@ mod simd32;
))]
pub use self::simd32::*;

#[cfg(any(target_arch = "aarch64", target_feature = "v7"))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to enable doc here.

mod crc;
#[cfg(any(target_arch = "aarch64", target_feature = "v7"))]
pub use self::crc::*;
pub use crate::core_arch::acle::crc::*;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be needed, there is already a blanket re-export of acle below.

@@ -119,6 +119,9 @@ mod simd32;
))]
pub use self::simd32::*;

#[cfg(any(target_arch = "aarch64", target_feature = "v7"))]
pub mod crc;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a pub use here to re-export the contents of the module and make the module private.

@Byron
Copy link
Member Author

Byron commented Apr 3, 2021

When moving acle/(arm|simd32) I encountered test failures as arm seems to be governed by an auto-test suite. For now I thought it was best to teach it to skip the newly added functions.

@Byron
Copy link
Member Author

Byron commented Apr 4, 2021

  • The neon, crc and crypto submodules of arm need to be moved to acle.
  • The sat, dsp and simd32 submodules of acle need to be moved to arm.
  • Some functions in neon/mod.rs are marked with #[cfg(target_arch = "arm")], these should be moved to a neon module under arm.

I was just about to move on to the last point when I noticed that there is a 135 of functions that are qualified with exactly #[cfg(target_arch = "arm")] and I wanted to clarify that this is indeed the right move. Lastly, there also is a generator at play which when executed via OUT_DIR=pwd/crates/core_arch cargo run -p stdarch-gen -- crates/stdarch-gen/neon.spec puts its output into the old location. I will adjust that for now.

Tomorrow is tackling bullet point number three and finally renaming acle to arm_shared (while assuring the generator places its file in arm_shared then).

@Amanieu
Copy link
Member

Amanieu commented Apr 5, 2021

I was just about to move on to the last point when I noticed that there is a 135 of functions that are qualified with exactly #[cfg(target_arch = "arm")] and I wanted to clarify that this is indeed the right move.

That's correct for now. These intrinsics were only implemented for arm, and still need to be ported to aarch64. But this should be done in a separate PR.

@Byron
Copy link
Member Author

Byron commented Apr 7, 2021

I have started with…

Some functions in neon/mod.rs are marked with #[cfg(target_arch = "arm")], these should be moved to a neon module under arm.

… and would still have to remove now unused #[cfg(target_arch="arm")] as that directive now governs the whole module. While getting started with it I noticed that there are a few functions like this:

/// Unsigned Add and Accumulate Long Pairwise.
#[inline]
#[target_feature(enable = "neon")]
#[cfg_attr(target_arch = "arm", target_feature(enable = "v7"))]
#[cfg_attr(all(test, target_arch = "arm"), assert_instr(vpadal.u32))]
#[cfg_attr(all(test, target_arch = "aarch64"), assert_instr(uadalp))]
pub unsafe fn vpadal_u32(a: uint64x1_t, b: uint32x2_t) -> uint64x1_t {
    #[cfg(target_arch = "arm")]
    {
        vpadal_u32_(a, b)
    }
    #[cfg(target_arch = "aarch64")]
    {
        simd_add(vpaddl_u32_(b), a)
    }
}

Do I assume correctly that these would rather stay in ACLE/ARM_shared?

Furthermore, the version you see here is just to 'make it work' and I would very much welcome any feedback to drive the impending improvements. Thanks a lot :)!

PS: It looks like I have truly added some breakage this time which unfortunately I can't see locally - any suggestions on how to see these issues on my machine would be welcome, too.

@Amanieu
Copy link
Member

Amanieu commented Apr 7, 2021

These functions are implemented for both arm and aarch64 and therefore should stay in arm_shared.

You can run the same tests as CI by using the scripts: TARGET=thumbv6m-none-eabi ci/run.sh.

Refer to .github/workflows/main.yml for the full list of TARGET options. For arm targets you sometimes need to add RUSTFLAGS="-C target-feature=+neon".

@Byron
Copy link
Member Author

Byron commented Apr 17, 2021

Thanks a lot for your help!

Now that CI seems to be well, all that's left on my list is to rename acle to arm_shared if this is still a name you find acceptable. Then I would hope to iterate towards merging this PR as I live in fear of merge-conflicts, for now auto-merge seems to work fine though.

I suggest to look into f14ce23 particularly - it maybe partially fixed thanks to your suggestions, but it's definitely where I might have committed crimes to the source code to somehow make CI happy (I was desperate).

crates/core_arch/src/arm/neon.rs Outdated Show resolved Hide resolved
crates/core_arch/src/arm/mod.rs Outdated Show resolved Hide resolved
@Amanieu
Copy link
Member

Amanieu commented Apr 17, 2021

I will squash the commits, so I'm mainly using the overall diff to review rather than individual commits.

Please go ahead and rename acle to arm_shared, then do a final merge with master if needed.

@Byron
Copy link
Member Author

Byron commented Apr 17, 2021

Alright, I believe to have addressed the change requests and renamed acle to arm_shared. Even though I am happy to address all issues that come up I express my appreciation for making fixes during the review to avoid hand-offs in favour of merging this beast more swiftly 😁.

@Amanieu
Copy link
Member

Amanieu commented Apr 17, 2021

A few last minute issues, then I think we can merge:

  • armclang.rs needs to be split into arm/aarch64 versions.
  • Unnecessary cfg(target_arch) on udf in arm/mod.rs
  • __dbg should be moved to arm.

@Amanieu
Copy link
Member

Amanieu commented Apr 17, 2021

  • arm_shared/ex.rs is missing doc and target_arch = "aarch64" in the cfgs.

…ually

I thought it would/should have worked to put it on the top of the module
then, at least the target_arch one
@Byron
Copy link
Member Author

Byron commented Apr 17, 2021

I have addressed the following:

  • [x] __dbg should be moved to arm.
  • arm_shared/ex.rs is missing doc and target_arch = "aarch64" in the cfgs.
  • Unnecessary cfg(target_arch) on udf in arm/mod.rs
  • armclang.rs needs to be split into arm/aarch64 version

Even though it appears CI points out an issue with __dbg that I couldn't solve with two attempts of adjusting cfg's.
Edit: reverted breaking __dbg change.

@Amanieu Maybe you could take a stab at the __dbg move yourself assuming CI is clear now. Whatever I was doing wasn't it.

/// `VAL` is a compile-time constant integer in range `[0, 65535]`.
///
/// The breakpoint instruction inserted is `BRK` on A64.
#[cfg(all(target_arch = "aarch64", not(doc)))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cfg is no longer needed.

static_assert_imm16!(VAL);
asm!("brk {}", const VAL);
}

/// Inserts a breakpoint instruction.
///
/// `VAL` is a compile-time constant integer in range `[0, 255]`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the cfg below.

Comment on lines 8 to 13
#[cfg(target_feature = "aarch64")]
#[cfg(any(
all(target_feature = "v6k", not(target_feature = "mclass")), // excludes v6-M
all(target_feature = "v7", target_feature = "mclass"), // v7-M
doc
))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[cfg(target_feature = "aarch64")]
#[cfg(any(
all(target_feature = "v6k", not(target_feature = "mclass")), // excludes v6-M
all(target_feature = "v7", target_feature = "mclass"), // v7-M
doc
))]
#[cfg(any(
target_feature = "aarch64",
all(target_feature = "v6k", not(target_feature = "mclass")), // excludes v6-M
all(target_feature = "v7", target_feature = "mclass"), // v7-M
doc
))]

Same with the others below.

@Amanieu
Copy link
Member

Amanieu commented Apr 17, 2021

For dbg you also need to copy this from a bit lower in the file:

    #[link_name = "llvm.arm.dbg"]
    fn dbg(_: i32);

Make sure to remove the cfg(target_arch = "arm").

@Amanieu
Copy link
Member

Amanieu commented Apr 17, 2021

Actually, don't worry about it. I can see you're having trouble running the tests so I'll finish cleaning up this PR and merge it.

@Amanieu Amanieu merged commit 3fd6dd1 into rust-lang:master Apr 17, 2021
@Byron
Copy link
Member Author

Byron commented Apr 17, 2021

Thanks so much, this is awesome!

@Amanieu
Copy link
Member

Amanieu commented Apr 17, 2021

Note that actually shipping the updated stdarch with rustc is currently still blocked on rust-lang/rust#83278 getting resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants