Add core::intrinsics::simd #118853

calebzulawski · 2023-12-12T04:24:10Z

Intended to close rust-lang/portable-simd#381.

r? ralfjung

RalfJung · 2023-12-12T17:28:30Z

Cc @rust-lang/opsem

library/core/src/intrinsics/simd.rs

workingjubilee · 2023-12-14T08:14:11Z

🤨 I forgot about that "fun" detail. Never mind, I guess.

workingjubilee · 2023-12-14T08:16:50Z

actually, while we're here, does anyone know if these are actually incompatible signatures, or is that lint just overly strict...?

digama0 · 2023-12-15T00:52:59Z

doesn't that mean the rename should also happen in portable-simd?

I can't imagine that the generic parameter names would matter here, but it still seems like a good idea to ensure the names are the same for clarity.

RalfJung · 2023-12-15T10:32:25Z

Yeah that looks like an overeager lint to me. So it can be allowed temporarily until the names are back in sync.

library/core/src/intrinsics/simd.rs

RalfJung · 2023-12-17T09:01:34Z

library/core/src/intrinsics/simd.rs

+    /// The bitmask is always packed into the smallest/first bits, but the order is LSB-first for
+    /// little endian and MSB-first for big endian.
+    /// In other words, the LSB corresponds to the first vector element for little endian,
+    /// and the last vector element for big endian.


I have no idea what this means.^^ There are too many different notions of "first" here. Also, there are two cases to consider (output a packed int vs output an array).

What about something like this:
No matter whether the output is an array or an unsigned integer, it is treated as a single contiguous list of bits. The bitmask is always packed on the least-significant side of the output, and padded with 0s in the most-significant bits. The order of the bits depends on endianness:

On little endian, the least significant bit corresponds to the first vector element.

On big endian, the least significant bit corresponds to the last vector element.

I think this also needs examples to have any chance of being comprehensible.
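
A minimal sketch of the ordering proposed above, written as a plain scalar model rather than the intrinsic itself. The name `model_bitmask` and its `big_endian` flag are hypothetical and only for illustration; each lane is a `bool` standing in for a `0` / `!0` mask element, and the output is assumed to be a `u8`.

```rust
/// Hypothetical scalar model of the bit order described above.
/// This is NOT the real intrinsic; it only mirrors the proposed wording.
fn model_bitmask(lanes: [bool; 4], big_endian: bool) -> u8 {
    let n = lanes.len();
    let mut bits = 0u8;
    for (i, &set) in lanes.iter().enumerate() {
        if set {
            // Little endian: the least significant bit is the first element.
            // Big endian: the least significant bit is the last element.
            let pos = if big_endian { n - 1 - i } else { i };
            bits |= 1u8 << pos;
        }
    }
    // Only the low `n` bits are ever set; the remaining
    // most-significant bits stay 0, which is the padding.
    bits
}
```

Under this model, `model_bitmask([true, false, true, true], false)` gives `0b1101` and the same lanes with `big_endian = true` give `0b1011`; in both cases the upper four bits of the `u8` are zero.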

As always, please first state the types and then start discussing the details; without knowing what U is the rest of this is even harder to understand.

This is much better. I took your text and added a simple example

library/core/src/intrinsics/simd.rs

Co-authored-by: Ralf Jung <post@ralfj.de>

calebzulawski · 2023-12-18T04:46:09Z

I thought those intrinsics were added? Or is this something bootstrap compiler related?

RalfJung · 2023-12-18T18:17:36Z

Yeah this probably needs #[cfg(not(bootstrap))]

RalfJung · 2023-12-19T07:17:18Z

Looking good. :) We can always refine this later if more things come up.
I assume the next step is to change portable-simd to import this module rather than directly importing the intrinsics?

@bors r+

bors · 2023-12-19T07:17:21Z

📌 Commit e61aaf9 has been approved by RalfJung

It is now in the queue for this repository.

calebzulawski · 2023-12-19T12:00:09Z

Looking good. :) We can always refine this later if more things come up. I assume the next step is to change portable-simd to import this module rather than directly importing the intrinsics?

Yep! And one more feature gate to remove

bors · 2023-12-19T12:42:36Z

⌛ Testing commit e61aaf9 with merge 558ac1c...

bors · 2023-12-19T14:39:50Z

☀️ Test successful - checks-actions
Approved by: RalfJung
Pushing 558ac1c to master...

rust-timer · 2023-12-19T16:04:23Z

Finished benchmarking commit (558ac1c): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.4%	[0.5%, 3.9%]	5
Regressions ❌ (secondary)	3.9%	[2.6%, 4.5%]	3
Improvements ✅ (primary)	-2.0%	[-2.0%, -2.0%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.8%	[-2.0%, 3.9%]	6

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.5%	[-0.5%, -0.5%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.5%	[-0.5%, -0.5%]	1

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 672.757s -> 673.009s (0.04%)
Artifact size: 312.46 MiB -> 312.42 MiB (-0.01%)

RalfJung · 2023-12-21T16:39:57Z

library/core/src/intrinsics/simd.rs

+    ///
+    /// `mask` must only contain `0` or `!0` values.
+    #[cfg(not(bootstrap))]
+    pub fn simd_masked_load<V, U, T>(mask: V, ptr: U, val: T) -> T;


In implementing this, I am confused. Isn't this equivalent to simd_gather?

gather accepts a vector of N pointers, where each element will be loaded from its corresponding pointer.

masked_load accepts a single pointer, and all elements of the resulting vector (when unmasked) are loaded from a constant offset from that pointer, i.e. the first element will be loaded from ptr, the second from ptr.offset(1), and so on.
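
The difference can be sketched with a scalar model in safe Rust. This is only an illustration of the two access patterns, not the intrinsics themselves: `model_gather` and `model_masked_load` are hypothetical names, slice indices stand in for pointers, `bool` lanes stand in for `0` / `!0` mask elements, and masked-off lanes are assumed to keep the value passed in `val`.

```rust
/// Hypothetical model of a gather: every lane has its own source
/// location (`idx[lane]` stands in for one pointer per lane).
fn model_gather(mask: [bool; 4], data: &[i32], idx: [usize; 4], val: [i32; 4]) -> [i32; 4] {
    let mut out = val;
    for (lane, slot) in out.iter_mut().enumerate() {
        if mask[lane] {
            *slot = data[idx[lane]];
        }
    }
    out
}

/// Hypothetical model of a masked load: a single base location,
/// and lane `i` reads from `base + i` (like `ptr.offset(i)`).
fn model_masked_load(mask: [bool; 4], data: &[i32], base: usize, val: [i32; 4]) -> [i32; 4] {
    let mut out = val;
    for (lane, slot) in out.iter_mut().enumerate() {
        if mask[lane] {
            *slot = data[base + lane];
        }
    }
    out
}
```

With `data = [10, 20, 30, 40, 50, 60]`, the gather model can pull lanes from arbitrary positions such as `[5, 0, 3, 3]`, while the masked-load model with `base = 1` can only read the contiguous run starting at `data[1]`.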

Then the documentation here is wrong. For masked_load it says

/// `U` must be a vector of pointers to the element type of `T`, with the same length as `T`.

Yeah, looks like we missed this. I'll create a follow-up PR soon.

rustbot assigned RalfJung Dec 12, 2023

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Dec 12, 2023

RalfJung mentioned this pull request Dec 13, 2023

Add more SIMD platform-intrinsics #117953

Merged

6 tasks

farnoy reviewed Dec 13, 2023


workingjubilee reviewed Dec 14, 2023
