AVX2 64-bit support #93

Merged 2 commits on Oct 25, 2023

sterrettm2

This adds AVX2 support for sorting 64-bit signed and unsigned integers as well as doubles.

Benchmark                                                                 Time             CPU      Time Old      Time New       CPU Old       CPU New
------------------------------------------------------------------------------------------------------------------------------------------------------
[scalarsort.*random vs. simdsort.*random]_128/uint64_t                 +0.2167         +0.2168          1473          1792          1474          1794
[scalarsort.*random vs. simdsort.*random]_256/uint64_t                 +0.4015         +0.4024          2238          3137          2239          3140
[scalarsort.*random vs. simdsort.*random]_512/uint64_t                 +0.4101         +0.4104          4017          5665          4018          5667
[scalarsort.*random vs. simdsort.*random]_1k/uint64_t                  -0.2849         -0.2847         16574         11852         16576         11857
[scalarsort.*random vs. simdsort.*random]_5k/uint64_t                  -0.7358         -0.7358        234470         61939        234491         61948
[scalarsort.*random vs. simdsort.*random]_100k/uint64_t                -0.7067         -0.7067       6414865       1881636       6414637       1881570
[scalarsort.*random vs. simdsort.*random]_1m/uint64_t                  -0.7023         -0.7023      75780531      22560374      75776030      22558344
[scalarsort.*random vs. simdsort.*random]_10m/uint64_t                 -0.7024         -0.7024     901927451     268434143     901823797     268403663
[scalarsort.*random vs. simdsort.*random]_128/int64_t                  +0.1570         +0.1574          1438          1664          1439          1665
[scalarsort.*random vs. simdsort.*random]_256/int64_t                  +0.2673         +0.2679          2197          2784          2198          2787
[scalarsort.*random vs. simdsort.*random]_512/int64_t                  +0.2628         +0.2638          3930          4963          3930          4967
[scalarsort.*random vs. simdsort.*random]_1k/int64_t                   -0.2445         -0.2443         13535         10225         13536         10229
[scalarsort.*random vs. simdsort.*random]_5k/int64_t                   -0.7711         -0.7711        231237         52920        231261         52928
[scalarsort.*random vs. simdsort.*random]_100k/int64_t                 -0.7410         -0.7410       6311722       1634694       6311216       1634541
[scalarsort.*random vs. simdsort.*random]_1m/int64_t                   -0.7326         -0.7327      74504489      19919243      74499187      19915838
[scalarsort.*random vs. simdsort.*random]_10m/int64_t                  -0.7342         -0.7342     889359086     236422366     889271533     236395618
[scalarsort.*random vs. simdsort.*random]_128/double                   -0.1461         -0.1457          1515          1294          1517          1296
[scalarsort.*random vs. simdsort.*random]_256/double                   -0.1358         -0.1350          2477          2140          2478          2143
[scalarsort.*random vs. simdsort.*random]_512/double                   -0.2540         -0.2533          4435          3309          4436          3313
[scalarsort.*random vs. simdsort.*random]_1k/double                    -0.5494         -0.5497         15236          6865         15251          6867
[scalarsort.*random vs. simdsort.*random]_5k/double                    -0.8510         -0.8509        248527         37041        248550         37047
[scalarsort.*random vs. simdsort.*random]_100k/double                  -0.8067         -0.8067       6705854       1296268       6705526       1296134
[scalarsort.*random vs. simdsort.*random]_1m/double                    -0.7992         -0.7992      81622570      16392520      81611182      16388987
[scalarsort.*random vs. simdsort.*random]_10m/double                   -0.7868         -0.7869     957027419     203991024     956927995     203957561
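
For reading the table: the first two columns appear to be the relative change `(new - old) / |old|`, which is how Google Benchmark's comparison tooling reports differences, with scalarsort as "old" and simdsort as "new". A negative value therefore means simdsort was faster. The helper names below are hypothetical and only illustrate the arithmetic; they are not part of the library or the benchmark suite.

```cpp
#include <cmath>

// Hypothetical helper: relative change as shown in the Time/CPU columns,
// computed as (new - old) / |old|. Negative means "new" took less time.
double relative_change(double old_time, double new_time)
{
    return (new_time - old_time) / std::fabs(old_time);
}

// Hypothetical helper: speedup factor of "new" over "old".
// Values above 1 mean "new" is faster; values below 1 mean it is slower.
double speedup(double old_time, double new_time)
{
    return old_time / new_time;
}
```

For example, the `10m/uint64_t` row (901927451 old, 268434143 new) gives a relative change of about -0.70, which corresponds to a speedup of roughly 3.4x, while the `128/uint64_t` row (1473 old, 1792 new) gives about +0.22, meaning the AVX2 path was slower than the scalar sort at that size.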

@r-devulap left a comment

Minor comments, LGTM otherwise. Could you also run clang-format and fix the formatting?

Review threads on src/avx2-64bit-qsort.hpp: 4 (3 outdated)

@r-devulap left a comment

LGTM. Thanks for the awesome work @sterrettm2!

@r-devulap r-devulap merged commit 3c9bf9a into intel:main Oct 25, 2023
5 of 6 checks passed