add windows ssse3,sse4_1,sse4_2 detection for non avx path #251
BTW, this detection may supersede the WESTMERE arch detection (cpu_features/src/impl_x86_windows.c, line 48 in 8eb944f),
so that code could be removed entirely (the compile-time dependency on _WIN32_WINNT also seems fragile).
Ah, my bad, when I was working on this problem I only referred to the Microsoft docs. You are right, there is no point in _WIN32_WINNT, it can be removed. Thanks for the PR 👍
test/cpuinfo_x86_test.cc
Outdated
// modern WinSDK winnt.h contains newer features detection definitions
#if !defined(PF_SSSE3_INSTRUCTIONS_AVAILABLE)
#define PF_SSSE3_INSTRUCTIONS_AVAILABLE 36
#endif

#if !defined(PF_SSE4_1_INSTRUCTIONS_AVAILABLE)
#define PF_SSE4_1_INSTRUCTIONS_AVAILABLE 37
#endif

#if !defined(PF_SSE4_2_INSTRUCTIONS_AVAILABLE)
#define PF_SSE4_2_INSTRUCTIONS_AVAILABLE 38
#endif
should we move this code to a common place to avoid duplication?
To avoid code duplication, a Windows-specific header could be added to the project and included instead of the original windows.h, for example:
include\internal\windows.h
#ifndef CPU_FEATURES_INCLUDE_INTERNAL_WINDOWS_H_
#define CPU_FEATURES_INCLUDE_INTERNAL_WINDOWS_H_
#include <windows.h> // IsProcessorFeaturePresent
// modern WinSDK winnt.h contains newer features detection definitions
#if !defined(PF_SSSE3_INSTRUCTIONS_AVAILABLE)
#define PF_SSSE3_INSTRUCTIONS_AVAILABLE 36
#endif
......
#endif
test\cpuinfo_x86_test.cc
...
#if defined(CPU_FEATURES_OS_WINDOWS)
#include "internal/windows.h"
...
If such a change is worth it, I can update the PR that way.
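A minimal sketch of the shared-header idea, assuming the fallback values quoted above (36, 37, 38). The names `SseFeatures`, `FeatureProbe`, `DetectSseFeatures`, `WindowsProbe`, and `FakeProbe` are hypothetical and not part of cpu_features; the probe callback is only there so the logic can also be exercised off Windows:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#if defined(_WIN32)
#include <windows.h>  // IsProcessorFeaturePresent
#endif

// Older SDK headers may lack these IDs; define them only when missing.
#if !defined(PF_SSSE3_INSTRUCTIONS_AVAILABLE)
#define PF_SSSE3_INSTRUCTIONS_AVAILABLE 36
#endif
#if !defined(PF_SSE4_1_INSTRUCTIONS_AVAILABLE)
#define PF_SSE4_1_INSTRUCTIONS_AVAILABLE 37
#endif
#if !defined(PF_SSE4_2_INSTRUCTIONS_AVAILABLE)
#define PF_SSE4_2_INSTRUCTIONS_AVAILABLE 38
#endif

// Hypothetical result type, for illustration only.
typedef struct {
  bool ssse3;
  bool sse4_1;
  bool sse4_2;
} SseFeatures;

// Hypothetical callback: returns true when the OS reports the feature ID.
typedef bool (*FeatureProbe)(unsigned long feature_id);

// Asks the probe about each feature ID and records the answers.
SseFeatures DetectSseFeatures(FeatureProbe probe) {
  assert(probe != NULL);
  SseFeatures f;
  f.ssse3 = probe(PF_SSSE3_INSTRUCTIONS_AVAILABLE);
  f.sse4_1 = probe(PF_SSE4_1_INSTRUCTIONS_AVAILABLE);
  f.sse4_2 = probe(PF_SSE4_2_INSTRUCTIONS_AVAILABLE);
  return f;
}

#if defined(_WIN32)
// Real probe: forwards the ID to the Win32 API.
bool WindowsProbe(unsigned long feature_id) {
  return IsProcessorFeaturePresent((DWORD)feature_id) != 0;
}
#endif

// Fake probe for tests: pretends only SSSE3 and SSE4.1 are present.
bool FakeProbe(unsigned long feature_id) {
  return feature_id == PF_SSSE3_INSTRUCTIONS_AVAILABLE ||
         feature_id == PF_SSE4_1_INSTRUCTIONS_AVAILABLE;
}
```

On Windows, `DetectSseFeatures(WindowsProbe)` would query the OS; elsewhere, `DetectSseFeatures(FakeProbe)` exercises the same code path with canned answers.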
no objections, @gchatelet what do you think?
Yes, go ahead with the introduction of the new header; I think it's the correct way forward.
We will need to adapt the Bazel file as well. I suspect that this will be a bit of a pain, but I can help you with this.
ok
src/impl_x86_windows.c
Outdated
GetWindowsIsProcessorFeaturePresent(PF_SSE4_2_INSTRUCTIONS_AVAILABLE);

// do not bother checking PF_AVX*
// cause AVX enabled processor will have XCR0 be exposed and this function will be skipped at all

// https://github.com/google/cpu_features/issues/200
this code can be removed as the new functionality replaces detection properly
Great, will fix PR
include/internal/windows_utils.h
Outdated
@@ -0,0 +1,33 @@
// Copyright 2017 Google LLC
2022?
ok
test/cpuinfo_x86_test.cc
Outdated
EXPECT_TRUE(info.features.ssse3);
EXPECT_TRUE(info.features.sse4_1);
EXPECT_TRUE(info.features.sse4_2);
#endif

minor comment, extra space
ok
Thx @ajax16384 and @toor1245 !
Created PR to update
This PR allows detecting features beyond SSE3 (SSSE3, SSE4.1, SSE4.2) on non-AVX processors using only Windows methods, which brings Windows detection capabilities in line with those on Linux and macOS.
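The split between the XCR0 path and the non-AVX path can be pictured roughly as follows. This is a sketch under stated assumptions, not the cpu_features implementation: `SseBits`, `BitSet`, and `SelectSseBits` are hypothetical names, and `os_reported` stands in for whatever the OS query returned. The bit positions used (CPUID leaf 1 ECX bit 9 for SSSE3, 19 for SSE4.1, 20 for SSE4.2, 27 for OSXSAVE, and XCR0 bit 1 for SSE state) follow the Intel SDM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical result type, for illustration only.
typedef struct {
  bool ssse3;
  bool sse4_1;
  bool sse4_2;
} SseBits;

// Returns true when the given bit of value is set.
static bool BitSet(uint64_t value, unsigned bit) {
  assert(bit < 64);  // shifting by 64 or more would be undefined
  return ((value >> bit) & 1u) != 0;
}

// leaf1_ecx:   ECX from CPUID leaf 1.
// xcr0:        value of XCR0; only consulted when OSXSAVE is set.
// os_reported: what the OS query (for example IsProcessorFeaturePresent)
//              said; only consulted when OSXSAVE is clear.
SseBits SelectSseBits(uint32_t leaf1_ecx, uint64_t xcr0, SseBits os_reported) {
  if (!BitSet(leaf1_ecx, 27)) {
    // OSXSAVE clear: XCR0 is not exposed, so rely on the OS answers.
    return os_reported;
  }
  // OSXSAVE set: combine the CPUID bits with the SSE state bit in XCR0.
  const bool sse_state_enabled = BitSet(xcr0, 1);
  SseBits bits;
  bits.ssse3 = sse_state_enabled && BitSet(leaf1_ecx, 9);
  bits.sse4_1 = sse_state_enabled && BitSet(leaf1_ecx, 19);
  bits.sse4_2 = sse_state_enabled && BitSet(leaf1_ecx, 20);
  return bits;
}
```

The point of the sketch is only the branch: when XCR0 is not exposed, the OS-reported values are the sole source for these three flags.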