Missing x86 vendor intrinsics (SSE2, SSE 4.1, AVX2) #1178

Open
10 tasks
newpavlov opened this issue Jun 7, 2021 · 8 comments

Comments

@newpavlov
Contributor

newpavlov commented Jun 7, 2021

Previous issue: #40

AVX2

MMX

EDIT(@workingjubilee): Direct MMX support is no longer in scope for std::arch, see:

SSE

SSE2

SSE4.1

Personally, I am interested only in _mm_stream_load_si128 and _mm256_stream_load_si256, but I think it's worth properly tracking all unimplemented intrinsics. Some of those intrinsics (e.g. _mm_malloc and _mm_free) probably should not be exposed, but, in my opinion, the motivation behind such a decision should be explicitly recorded somewhere (ideally in comments in the relevant source files).

@newpavlov newpavlov changed the title Missing x86 vendor intrinsics Missing x86 vendor intrinsics (MMX, SSE2, SSE 4.1, AVX2) Jun 7, 2021
@Lokathor
Contributor

Lokathor commented Jun 7, 2021

I was under the impression that we'd deliberately removed the MMX stuff.

@bjorn3
Member

bjorn3 commented Jun 7, 2021

Indeed. It requires special handling in the compiler to emit the right type for MMX vectors, as they are a different type from regular vectors. In addition, it is pretty much impossible to use correctly, as LLVM can reorder MMX usage before the intrinsic that enables MMX.

@newpavlov
Contributor Author

What about the streaming load intrinsics? Is there a reason why they have been omitted?

@Lokathor
Contributor

Lokathor commented Jun 7, 2021

Some of the streaming ops are already in, and stabilized (e.g. _mm_stream_pd).

Given this, I'd guess that any missing streaming ops are likely an oversight (at least for 128 or 256 bit).
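
For reference, a minimal sketch of calling one of the already-stabilized streaming stores, `_mm_stream_pd`, from `std::arch`. The `Aligned` wrapper and the `stream_pair` helper are hypothetical names used only for illustration; the sketch assumes an x86_64 target (where SSE2 is part of the baseline) and falls back to a plain copy elsewhere.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{_mm_set_pd, _mm_sfence, _mm_stream_pd};

/// Hypothetical wrapper: `_mm_stream_pd` needs a 16-byte aligned destination.
#[repr(align(16))]
struct Aligned([f64; 2]);

/// Hypothetical helper: writes `[lo, hi]` with a non-temporal store.
#[cfg(target_arch = "x86_64")]
fn stream_pair(lo: f64, hi: f64) -> [f64; 2] {
    let mut out = Aligned([0.0; 2]);
    // SAFETY: SSE2 is always available on x86_64, and `out` is 16-byte aligned.
    unsafe {
        // `_mm_set_pd` takes the high element first.
        let v = _mm_set_pd(hi, lo);
        _mm_stream_pd(out.0.as_mut_ptr(), v);
        // Fence so the non-temporal store is ordered before later accesses.
        _mm_sfence();
    }
    out.0
}

/// Fallback for other targets (an assumption of this sketch, not part of std::arch).
#[cfg(not(target_arch = "x86_64"))]
fn stream_pair(lo: f64, hi: f64) -> [f64; 2] {
    let out = Aligned([lo, hi]);
    out.0
}

fn main() {
    println!("{:?}", stream_pair(1.0, 2.0));
}
```

The trailing `_mm_sfence` is the conventional way to order non-temporal stores with respect to later memory operations.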

@jhorstmann
Contributor

_mm_broadcastsi128_si256 seems to be an alias for _mm256_broadcastsi128_si256 which is implemented. The intrinsics guide lists both as translating to the same instruction and with the same description.
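
If that reading is right, code that wants the shorter name could forward to the implemented intrinsic. A rough sketch under that assumption: `broadcastsi128_si256` and `broadcast_bytes` are hypothetical names for illustration, not part of `std::arch`, and the example only does real work on x86_64 with AVX2 detected at run time.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{
    __m128i, __m256i, _mm256_broadcastsi128_si256, _mm256_storeu_si256, _mm_loadu_si128,
};

/// Hypothetical stand-in for the `_mm_broadcastsi128_si256` name:
/// it simply forwards to the intrinsic that is already implemented.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn broadcastsi128_si256(a: __m128i) -> __m256i {
    _mm256_broadcastsi128_si256(a)
}

/// Copies 16 bytes into both halves of a 32-byte array.
/// Returns `None` when AVX2 is unavailable (or on non-x86_64 targets).
fn broadcast_bytes(src: [u8; 16]) -> Option<[u8; 32]> {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            let mut out = [0u8; 32];
            // SAFETY: AVX2 was detected above; unaligned load/store are used,
            // so no alignment requirement applies to `src` or `out`.
            unsafe {
                let a = _mm_loadu_si128(src.as_ptr() as *const __m128i);
                let v = broadcastsi128_si256(a);
                _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, v);
            }
            return Some(out);
        }
    }
    let _ = src;
    None
}

fn main() {
    let src: [u8; 16] = core::array::from_fn(|i| i as u8);
    match broadcast_bytes(src) {
        Some(out) => println!("{:?}", out),
        None => println!("AVX2 not available"),
    }
}
```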

@workingjubilee workingjubilee changed the title Missing x86 vendor intrinsics (MMX, SSE2, SSE 4.1, AVX2) Missing x86 vendor intrinsics (SSE2, SSE 4.1, AVX2) Aug 31, 2022
@workingjubilee
Member

_mm_malloc and _mm_free seem like they would need to be implemented in libstd.
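
For illustration only, here is roughly what an aligned allocation looks like with what `std::alloc` already offers. `AlignedBuf` is a hypothetical type, not a proposed API. One visible mismatch with the C interface: `_mm_free` takes only the pointer, whereas `dealloc` also needs the original `Layout`, so the sketch stores it alongside the pointer.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Hypothetical owner of an aligned allocation (a rough `_mm_malloc` analogue).
struct AlignedBuf {
    ptr: *mut u8,
    layout: Layout,
}

impl AlignedBuf {
    /// Allocates `size` bytes aligned to `align` (which must be a power of two).
    /// Returns `None` for a zero size, an invalid layout, or allocation failure.
    fn new(size: usize, align: usize) -> Option<Self> {
        if size == 0 {
            return None; // `alloc` must not be called with a zero-sized layout
        }
        let layout = Layout::from_size_align(size, align).ok()?;
        // SAFETY: `layout` has a non-zero size.
        let ptr = unsafe { alloc(layout) };
        if ptr.is_null() {
            None
        } else {
            Some(Self { ptr, layout })
        }
    }

    fn as_ptr(&self) -> *mut u8 {
        self.ptr
    }
}

impl Drop for AlignedBuf {
    fn drop(&mut self) {
        // SAFETY: `ptr` came from `alloc` with exactly this `layout`.
        unsafe { dealloc(self.ptr, self.layout) }
    }
}

fn main() {
    let buf = AlignedBuf::new(64, 32).expect("allocation failed");
    println!("aligned to 32: {}", buf.as_ptr() as usize % 32 == 0);
}
```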

@Noratrieb
Member

Note that there are issues with these streaming intrinsics, as they have nontemporal hints that are not properly modelled: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Non-temporal.20stores

@workingjubilee
Member

They've been converted into assembly.


6 participants