Tracking Issue for Missing BMI1, AVX2, SSE2, SSE4.1, SSE4a and TBM intrinsics #126936

Closed
2 tasks
sayantn opened this issue Jun 25, 2024 · 7 comments
Labels
C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. O-x86_32 Target: x86 processors, 32 bit (like i686-*) O-x86_64 Target: x86-64 processors (like x86_64-*) T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@sayantn
Contributor

sayantn commented Jun 25, 2024

The feature gate is #![feature(simd_x86_updates)].

The Public API is 13 new intrinsics (probably overlooked in the simd_x86 feature). See rust-lang/stdarch#1178.

Steps

  • Final Comment Period (FCP)
  • Stabilization PR

Implementation History

We cannot add _mm_malloc and _mm_free because they need access to the OS, whereas core_arch is a no_std environment.
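
For code that is allowed to depend on std, an aligned allocation similar in spirit to _mm_malloc / _mm_free can be sketched with the standard allocator API. This is only an illustration, not something proposed in this issue; the `AlignedBuf` type and its methods are hypothetical names, and the sketch assumes a non-zero size.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Hypothetical owner of a heap allocation with a caller-chosen alignment.
pub struct AlignedBuf {
    ptr: *mut u8,
    layout: Layout,
}

impl AlignedBuf {
    /// Returns `None` for a zero size, an invalid alignment
    /// (not a power of two), or an allocation failure.
    pub fn new(size: usize, align: usize) -> Option<Self> {
        if size == 0 {
            return None; // `alloc` must not be called with a zero-sized layout
        }
        let layout = Layout::from_size_align(size, align).ok()?;
        // SAFETY: `layout` has a non-zero size, checked above.
        let ptr = unsafe { alloc(layout) };
        if ptr.is_null() {
            None
        } else {
            Some(Self { ptr, layout })
        }
    }

    pub fn as_ptr(&self) -> *mut u8 {
        self.ptr
    }
}

impl Drop for AlignedBuf {
    fn drop(&mut self) {
        // SAFETY: `ptr` was returned by `alloc` with exactly this `layout`.
        unsafe { dealloc(self.ptr, self.layout) }
    }
}
```

Tying the deallocation to `Drop` keeps the layout used for freeing identical to the one used for allocating, which the allocator API requires.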

@sayantn sayantn added the C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. label Jun 25, 2024
@rustbot rustbot added O-x86_32 Target: x86 processors, 32 bit (like i686-*) O-x86_64 Target: x86-64 processors (like x86_64-*) T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Jun 25, 2024
@Amanieu
Member

Amanieu commented Jul 6, 2024

These intrinsics were supposed to be part of an already-stabilized set, but were previously overlooked.

@rfcbot fcp merge

@rfcbot

rfcbot commented Jul 6, 2024

Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jul 6, 2024
@sayantn sayantn changed the title Tracking Issue for Missing BMI1, AVX2, SSE2 and SSE4.1 intrinsics Tracking Issue for Missing BMI1, AVX2, SSE2, SSE4.1, SSE4a and TBM intrinsics Jul 7, 2024
@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Jul 7, 2024
@rfcbot

rfcbot commented Jul 7, 2024

🔔 This is now entering its final comment period, as per the review above. 🔔

@RalfJung
Member

Oh no, more non-temporal operations... quoting from a recent x86 memory model paper:

In addition to several non-temporal store instructions, Intel-x86 architectures provide a single
non-temporal load instruction. However, as our private correspondence with the lead architect of
the Intel instruction set system architecture has revealed, the non-temporal load instruction has
been a source of implementation issues, it has not been implemented consistently, and there has
been ambiguity regarding its semantics.

Are we sure we can just pretend that those are regular loads, for the purpose of language semantics?

@Amanieu
Member

Amanieu commented Jul 11, 2024

Non-temporal loads are not allowed to violate normal memory ordering rules, at least when accessing normal (i.e. write-back cacheable) memory. x86 of course allows some regions of memory to be marked as write-combining, at which point the normal memory ordering rules go out the window, but this only happens for memory-mapped I/O, not normal memory. The problem with non-temporal stores on x86 is that they violate normal memory ordering rules even when used on normal (write-back) memory.

See this answer on SO for more details.
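
To make the store-side problem concrete, here is a minimal sketch, assuming x86_64 and using the long-stable SSE2 intrinsics `_mm_stream_si128` and `_mm_sfence` rather than any of the intrinsics tracked in this issue. The `Aligned` type and `fill_non_temporal` function are hypothetical names for illustration. The point is the trailing fence: because non-temporal stores do not follow the usual ordering rules, they are fenced before any other code can observe the buffer.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m128i, _mm_set1_epi8, _mm_sfence, _mm_stream_si128};

/// Hypothetical 64-byte buffer with the 16-byte alignment that
/// `_mm_stream_si128` requires of its destination address.
#[repr(C, align(16))]
pub struct Aligned(pub [u8; 64]);

#[cfg(target_arch = "x86_64")]
pub fn fill_non_temporal(buf: &mut Aligned, byte: u8) {
    // SSE2 is part of the x86_64 baseline, so no runtime detection is needed.
    unsafe {
        let v = _mm_set1_epi8(byte as i8);
        let p = buf.0.as_mut_ptr() as *mut __m128i;
        for i in 0..4 {
            // Non-temporal (streaming) store of 16 bytes.
            _mm_stream_si128(p.add(i), v);
        }
        // Order the streaming stores before anything that follows,
        // so they behave like ordinary stores from here on.
        _mm_sfence();
    }
}

/// Portable fallback for other targets: a plain store.
#[cfg(not(target_arch = "x86_64"))]
pub fn fill_non_temporal(buf: &mut Aligned, byte: u8) {
    buf.0 = [byte; 64];
}
```

Calling `fill_non_temporal(&mut buf, 0xAB)` on an `Aligned([0u8; 64])` leaves every byte of the buffer set to `0xAB`.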

@RalfJung
Member

Thanks; I will get in touch with the authors of the paper to clarify whether the architect they spoke with was referring to non-temporal loads behaving in odd ways only for "non-standard" memory regions or also for write-back memory.

Meanwhile, would it be worth warning people about using these intrinsics on non-write-back memory? Though that warning is probably better placed at whatever operation creates such memory. It's not really well-defined to access such memory with Rust operations (i.e., outside of inline assembly) anyway...

sayantn added a commit to sayantn/stdarch that referenced this issue Jul 12, 2024
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Jul 17, 2024
@rfcbot

rfcbot commented Jul 17, 2024

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

@rfcbot rfcbot added the to-announce Announce this issue on triage meeting label Jul 17, 2024
@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Jul 25, 2024
@sayantn sayantn closed this as completed Aug 4, 2024
6 participants