AVX512 code will need to change to handle the whole AVX10 business #693

workingjubilee · 2024-08-27T14:38:22Z

Hello.

You heard about the whole AVX10 business, I presume? With AVX10/256 and AVX10/512?

We will likely need to adjust what the "avx512" features even mean, unfortunately, as a result of this. Fortunately, they're unstable, so we can do that! So it might need some modifications in dalek, but it's not like there will be any unforeseen complications from changing nightly features in code that didn't enable it intentionally... right?

curve25519-dalek/curve25519-dalek/build.rs

Lines 38 to 44 in 0964f80

    
           if rustc_version::version_meta() 
        
               .expect("failed to detect rustc version") 
        
               .channel 
        
               == rustc_version::Channel::Nightly 
        
           { 
        
               println!("cargo:rustc-cfg=nightly"); 
        
           }

...right?

RalfJung · 2024-08-27T18:46:41Z

In other words: building this crate may fail on nightly when the work on redoing these features happens. The compiler team considers automatic use of nightly features in crates that otherwise work on stable to be a breach of the nightly compiler contract, so this is not considered a breaking change -- but if this happens, it will mean that everyone using nightly to build this crate (or any crate that depends on this crate) will get compilation failures.

For context:

Feature gates for renamed features hard error, breaking "nightly-detecting" crates rust-lang/rust#120804
allow overwriting the output of rustc --version rust-lang/rust#124339
warn against automatically using nightly features SergioBenitez/version_check#23

The preferred way to avoid this issue is to not enable nightly features automatically, but instead make them cargo features that need to be opt-in. That makes it possible to build the crate on nightly in exactly the same way that it would build on stable, which is extremely valuable for doing compiler development and testing.

Another option is to use something like autocfg to test whether the concrete kind of code you need works on the current compiler. If that code stops working, the crate can then automatically fall back to the stable code path. However, this reduces the usefulness of nightly as a testing vehicle: the point of nightly is that we can know that things will break long before they break on stable. To achieve this, however, it is important that testing a crate on nightly actually tests the thing that would happen when the nightly became stable. By making the same crate build different code on nightly vs stable, this is not the case any more.

tarcieri · 2024-08-27T18:50:39Z

I can't speak to auto-enabling nightly features, but in general I think nightly users need to expect that upstream rustc changes can break nightly builds at any time, and all we can do is release new versions of these crates when such upstream breaking changes happen, and nightly users need to upgrade to the latest versions which such breaking changes occur.

RalfJung · 2024-08-27T19:14:23Z

Of course nightly changes can break nightly crates (as in, crates that need nightly to build), but it should generally be the case that if a crate builds on stable, it also builds on nightly. This is crucial e.g. for narrowing down a stable-to-nightly regression.

tarcieri · 2024-08-27T20:11:05Z

I agree that the nightly-related functionality should probably be opt-in (though my comment wasn’t related to that).

pinkforest · 2024-08-28T18:53:39Z

Ref issues I raised earlier:

AVX512-IFMA & AVX10 status #635

I've pointed the AVX512 probably could be default-removed because it cannot be tested and maintained reliably.

IMO it probably should be made opt-in until the support stabilises better and it can be tested reliably between AVX10?

Only consumer thing that benefits of it is zen4 / 7950X now that intel slid up with it

Also I raised issue now to create the opt-in mechanism:

Experimental supports cfg #694

For deciding the opt-in mechanism - we've declared the backends as implementation detail so it's non-breaking

However I would still recommend doing minor bump if we implement the opt-in.

workingjubilee · 2024-08-28T21:05:25Z

Yes, I most certainly have an AVX512 CPU that can run the code in question, despite it being a desktop, consumer-oriented CPU, not even a workstation CPU, so I do not think you should disable it entirely, necessarily. :^)

And I am happy to help people run tests if necessary.

The main thing is that upcoming Intel CPUs will implement what Intel is calling "AVX10", which is specified here: https://cdrdv2-public.intel.com/784267/355989-intel-avx10-spec.pdf

Essentially, it allows new instruction patterns to be used while not requiring the full 512-bit zmm registers, but rather independently specifying that. This is especially useful to code which has exceptionally specialized instructions for accelerating it beyond just things like basic arithmetic ops, where widening the ops means computing faster. So, cryptography instructions.

Thus, we need to figure out some way of allowing people to say "I want to use this AVX{version} intrinsic, but I only need {M} bits" (or "no, I need the full {N} bits") which basically means we have to mess up something for someone somewhere.

pinkforest · 2024-08-28T22:33:46Z

How can we CI it? The original idea was that github used to have AVX512 capable runners at reasonable ratio and the nightly job tested it out. But now that GitHub has way less AVX512 capable runners testing it in CI has become impossible. Qemu has support but I don't know if it can be relied to a level to provide it as "Tier1 backend" or "Tier2" 🤔

tgross35 · 2024-08-28T23:03:03Z

I believe in stdarch we use the Intel SDE (wow that website has terrible scrolling), not sure how easy that is to get off the ground. @sayantn may know more.

pinkforest · 2024-08-29T01:35:45Z

PR's up:

Make AVX512IFMA opt-in backend #695

sayantn · 2024-08-29T03:27:45Z

We are actually kind of stuck with SDE being our only option - qemu-user doesn't support AVX512, qemu-system does, but needs KVM support, which we don't have in the runners. SDE has some bugs too, when we tested with stdarch, v9.38 was quite buggy on linux (but surprisingly, the windows version was fine), so just be a little careful.

pinkforest mentioned this issue Aug 28, 2024

Experimental supports cfg #694

Closed

pinkforest mentioned this issue Aug 29, 2024

Make AVX512IFMA opt-in backend #695

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AVX512 code will need to change to handle the whole AVX10 business #693

AVX512 code will need to change to handle the whole AVX10 business #693

workingjubilee commented Aug 27, 2024

RalfJung commented Aug 27, 2024

tarcieri commented Aug 27, 2024

RalfJung commented Aug 27, 2024

tarcieri commented Aug 27, 2024

pinkforest commented Aug 28, 2024 •

edited

Loading

workingjubilee commented Aug 28, 2024 •

edited

Loading

pinkforest commented Aug 28, 2024

tgross35 commented Aug 28, 2024 •

edited

Loading

pinkforest commented Aug 29, 2024

sayantn commented Aug 29, 2024

AVX512 code will need to change to handle the whole AVX10 business #693

AVX512 code will need to change to handle the whole AVX10 business #693

Comments

workingjubilee commented Aug 27, 2024

RalfJung commented Aug 27, 2024

tarcieri commented Aug 27, 2024

RalfJung commented Aug 27, 2024

tarcieri commented Aug 27, 2024

pinkforest commented Aug 28, 2024 • edited Loading

workingjubilee commented Aug 28, 2024 • edited Loading

pinkforest commented Aug 28, 2024

tgross35 commented Aug 28, 2024 • edited Loading

pinkforest commented Aug 29, 2024

sayantn commented Aug 29, 2024

pinkforest commented Aug 28, 2024 •

edited

Loading

workingjubilee commented Aug 28, 2024 •

edited

Loading

tgross35 commented Aug 28, 2024 •

edited

Loading