Support for CPUs which do not have SSE2 extensions? #3118

Closed
strega-nil-ms opened this issue Sep 21, 2022 · 8 comments · Fixed by #4741
Labels
affects redist (Results in changes to separately compiled bits); fixed (Something works now, yay!); performance (Must go faster)

Comments

@strega-nil-ms
Contributor

strega-nil-ms commented Sep 21, 2022

Currently, on x86, we support /arch:IA32, and build our separately compiled sources with SSE2 support disabled. Is this still necessary, or can we allow ourselves to assume SSE2 hardware?

Notes:

  • XP and Vista support has been dropped
  • Windows 7 supported CPUs without SSE2 in its initial release, but dropped support in a 2018 update
  • Windows 8 has never supported CPUs without SSE2
  • SSE2 doesn't work in 32-bit kernels (this is a problem as long as we support Windows 10)
@barcharcraz
Member

Given that /arch:IA32 is the default on x86, we probably have to support it; however, we may be able to get away with building the DLLs with SSE2.

@Alcaro
Contributor

Alcaro commented Sep 21, 2022

No, /arch:IA32 is not the default on x86. Proof: This code gives different results on IA32 vs no flags. (It also proves that IA32 automatically promotes every float32 to float64 before doing any math.) https://godbolt.org/z/Pv7Go5Te8

The relevant 2018 update is https://support.microsoft.com/en-us/topic/may-8-2018-kb4103718-monthly-rollup-c4c01989-faca-af5f-46f4-2bdc2d0171fd.

@AlexGuteniev
Contributor

Might be an issue for 32-bit kernel mode usage.

@barcharcraz
Member

> No, /arch:IA32 is not the default on x86. Proof: This code gives different results on IA32 vs no flags. (It also proves that IA32 automatically promotes every float32 to float64 before doing any math.) https://godbolt.org/z/Pv7Go5Te8
>
> The relevant 2018 update is https://support.microsoft.com/en-us/topic/may-8-2018-kb4103718-monthly-rollup-c4c01989-faca-af5f-46f4-2bdc2d0171fd.

You're right, although the floating-point difference arises because /arch:IA32 uses x87 floating-point instructions, which operate on 80-bit registers.
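To make the x87 versus SSE2 difference concrete, here is a minimal sketch. It is not the code behind the Compiler Explorer link above, and the helper names `rounding_probe` and `float_eval_method` are hypothetical. It assumes only the standard `FLT_EVAL_METHOD` macro from `<cfloat>`, which reports whether `float` expressions are evaluated in `float` (0) or in a wider format (1 or 2).

```cpp
#include <cfloat>

// Hypothetical helper: computes (2^24 + 1) - 2^24 through float operands.
// volatile forces the arithmetic to happen at run time instead of being
// folded by the compiler.
inline float rounding_probe() {
    volatile float big = 16777216.0f; // 2^24; 2^24 + 1 is not representable as a float
    volatile float one = 1.0f;
    float r = (big + one) - big;
    return r;
}

// Hypothetical helper: exposes the implementation's evaluation method.
// 0 means float expressions are evaluated in float precision;
// 1 or 2 means intermediates are kept in double or long double.
inline int float_eval_method() {
    return FLT_EVAL_METHOD;
}
```

When intermediates are rounded to `float` after every operation, as with SSE2 scalar arithmetic, the sum rounds back to 2^24 and the probe returns 0. When the intermediate sum is held in a wider x87 register, the probe may return 1 instead. Which of the two a given /arch:IA32 build produces also depends on the /fp mode and on where the compiler chooses to store intermediates, so treat this as an illustration rather than a guaranteed reproduction.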

@CaseyCarter
Contributor

From https://learn.microsoft.com/en-us/cpp/build/reference/arch-x86?view=msvc-170:

/arch:SSE2
Enables the use of SSE2 instructions. This option is the default instruction set on x86 platforms if no /arch option is specified.
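A translation unit can check which baseline the compiler was told to assume. The sketch below relies on MSVC's predefined `_M_IX86_FP` macro (0 for /arch:IA32, 1 for /arch:SSE, 2 for /arch:SSE2 and above, defined only for 32-bit x86 targets) and on `__SSE2__` for GCC and Clang. The function name `target_fp_baseline` is hypothetical and is not something the STL provides.

```cpp
#include <string>

// Hypothetical helper: describes the floating-point instruction set that
// the compiler assumed when building this translation unit.
inline std::string target_fp_baseline() {
#if defined(_M_IX86_FP)
    // MSVC (and clang-cl) targeting 32-bit x86.
#  if _M_IX86_FP >= 2
    return "x86 with SSE2";
#  elif _M_IX86_FP == 1
    return "x86 with SSE";
#  else
    return "x86 with x87 only";
#  endif
#elif defined(_M_X64) || defined(__x86_64__)
    // SSE2 is part of the x64 baseline, so there is nothing to choose.
    return "x64 (SSE2 is part of the baseline)";
#elif defined(__SSE2__)
    return "x86 with SSE2";
#elif defined(__i386__)
    return "x86 without SSE2";
#else
    return "not an x86 target";
#endif
}
```

Building the same file for 32-bit x86 once with no /arch option and once with /arch:IA32 and comparing the returned strings is one way to confirm the documented default locally.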

StephanTLavavej added the "decision needed" (We need to choose something before working on this), "affects redist" (Results in changes to separately compiled bits), and "performance" (Must go faster) labels, and removed the "decision needed" label, on Sep 21, 2022.
@StephanTLavavej
Member

We talked about this at the weekly maintainer meeting - although the potentially affected set of users is extremely small, if installing an updated redist caused code to fail at runtime, that would be very severe. In general, we have very little code affected by /arch:IA32 / the availability of SSE2 (from a quick scan, it's Special Math, vectorized algorithms, and the __vectorcall calling convention), so the benefits of making such a general change would be relatively small (e.g. in comparison to dropping Vista support which allowed us to remove a massive amount of code and significant runtime logic for Win7+ users).

However, Special Math is a special case - that is implemented in a separate "satellite DLL", and @strega-nil-ms has found that the availability of SSE2 impacts its precision (and presumably its performance). @CaseyCarter noted that we could change just the Special Math satellite DLL to use SSE2, which would be an extremely safe change - only programs actually using Special Math would be affected, as it is a pure leaf of the STL, and this satellite DLL was added relatively recently (VS 2017) so it is extraordinarily unlikely that machines with ancient processors are running code that uses this.

Note: such a change would need to happen in both the GitHub/CMake and internal/MSBuild build systems.

@AlexGuteniev
Contributor

Does building Special Math with /fp:strict or /fp:precise fix the precision issue?

@strega-nil-ms
Contributor Author

@AlexGuteniev no, we already build with /fp:strict, and that doesn't really have anything to do with why the result is different on non-SSE2 chips. The implementation of the special math functions does quite a bit of logic, and that logic is (necessarily) different on machines without SSE2, and on machines with SSE2.
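The thread does not say which Special Math functions differ or by how much. As a rough sketch of how one could measure such a difference, the helpers below compare `std::riemann_zeta(2.0)` with its closed form, pi^2 / 6. The names `has_special_math` and `zeta2_abs_error` are hypothetical, and the choice of this particular function is an assumption made for illustration. The code is guarded by the standard `__cpp_lib_math_special_functions` feature-test macro because not every standard library ships the C++17 special math functions.

```cpp
#include <cmath>

// Hypothetical helper: true when the standard library advertises the
// C++17 mathematical special functions.
inline bool has_special_math() {
#if defined(__cpp_lib_math_special_functions)
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: absolute error of std::riemann_zeta(2.0) against
// the closed form pi^2 / 6. Returns -1.0 when the special math functions
// are not available.
inline double zeta2_abs_error() {
#if defined(__cpp_lib_math_special_functions)
    const double pi = 3.14159265358979323846;
    const double expected = pi * pi / 6.0;
    return std::fabs(std::riemann_zeta(2.0) - expected);
#else
    return -1.0;
#endif
}
```

Running the same probe against a build that uses x87 code and one that uses SSE2 code, and comparing the reported errors, would show whether the two code paths agree for this input. It says nothing about other functions or arguments.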
