Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add windows ssse3,sse4_1,sse4_2 detection for non avx path #251

Merged
merged 4 commits into from
Jul 21, 2022

Conversation

ajax16384
Copy link
Contributor

@ajax16384 ajax16384 commented Jul 14, 2022

This PR allows to detect >sse3 for non-avx processors by using only windows methods. Which brings identical detection windows caps as of linux or macos.

@ajax16384
Copy link
Contributor Author

BTW this detection may superseed WESTMERE arch detect

#if (_WIN32_WINNT >= 0x0601) // Win7+

so it might be removed at all (and _WIN32_WINNT compile time dependency seems fragile)

@toor1245
Copy link
Contributor

BTW this detection may superseed WESTMERE arch detect

#if (_WIN32_WINNT >= 0x0601) // Win7+

so it might be removed at all (and _WIN32_WINNT compile time dependency seems fragile)

ah, my bad, when I was working on this problem I only referred to the Microsoft docs. You are right, there is no point in _WIN32_WINNT, it can be removed. Thanks for the PR 👍

Comment on lines 23 to 36

// modern WinSDK winnt.h contains newer features detection definitions
#if !defined(PF_SSSE3_INSTRUCTIONS_AVAILABLE)
#define PF_SSSE3_INSTRUCTIONS_AVAILABLE 36
#endif

#if !defined(PF_SSE4_1_INSTRUCTIONS_AVAILABLE)
#define PF_SSE4_1_INSTRUCTIONS_AVAILABLE 37
#endif

#if !defined(PF_SSE4_2_INSTRUCTIONS_AVAILABLE)
#define PF_SSE4_2_INSTRUCTIONS_AVAILABLE 38
#endif

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we move this code to a common place to avoid duplication?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid code duplication some windows specific header should added to project and referenced instead of original windows.h like :
include\internal\windows.h

#ifndef CPU_FEATURES_INCLUDE_INTERNAL_WINDOWS_H_
#define CPU_FEATURES_INCLUDE_INTERNAL_WINDOWS_H_
#include <windows.h>  // IsProcessorFeaturePresent
// modern WinSDK winnt.h contains newer features detection definitions
#if !defined(PF_SSSE3_INSTRUCTIONS_AVAILABLE)
#define PF_SSSE3_INSTRUCTIONS_AVAILABLE             36
#endif
......
#endif

test\cpuinfo_x86_test.cc

...
#if defined(CPU_FEATURES_OS_WINDOWS)
#include "internal/windows.h"
...

If such change worth it then I can fix PR in that way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no objections, @gchatelet what do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes go ahead with the introduction of the new header, I think it's the correct way forward,
We will need to adapt the bazel file as well. I suspect that this will be a bit of a pain but I can help you with this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

GetWindowsIsProcessorFeaturePresent(PF_SSE4_2_INSTRUCTIONS_AVAILABLE);

// do not bother checking PF_AVX*
// cause AVX enabled processor will have XCR0 be exposed and this function will be skipped at all

// https://github.com/google/cpu_features/issues/200
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code can be removed as the new functionality replaces detection properly

Copy link
Contributor Author

@ajax16384 ajax16384 Jul 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, will fix PR

@@ -0,0 +1,33 @@
// Copyright 2017 Google LLC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2022?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

EXPECT_TRUE(info.features.ssse3);
EXPECT_TRUE(info.features.sse4_1);
EXPECT_TRUE(info.features.sse4_2);
#endif

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comment, extra space

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@gchatelet gchatelet merged commit 38ae5d0 into google:main Jul 21, 2022
@gchatelet
Copy link
Collaborator

Thx @ajax16384 and @toor1245 !

@toor1245
Copy link
Contributor

Created PR to update IsProcessorFeaturePresent docs MicrosoftDocs/sdk-api#1251

@gchatelet gchatelet added this to the v0.8.0 milestone Apr 27, 2023
@gchatelet gchatelet added the enhancement New feature or request label Apr 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants