Add an optional argument "hasnan" for qselect and partialsort #49

Merged
merged 1 commit into from
Jun 22, 2023


r-devulap
Contributor

Handling NaNs slows down the algorithm, so we make it optional through a new "hasnan" argument. This also fixes a bug in qselect and partialsort where NaNs were handled incorrectly. Benchmarks improve for float and double arrays that do not contain NaNs:

Benchmark                                                                  Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------------------------------------------------
[avx512_qselect<double> vs. avx512_qselect<double>]/10                  -0.1488         -0.1488          8922          7594          8928          7600
[avx512_qselect<double> vs. avx512_qselect<double>]/100                 -0.1542         -0.1540          8963          7582          8970          7588
[avx512_qselect<double> vs. avx512_qselect<double>]/1000                -0.1615         -0.1617          8664          7265          8672          7269
[avx512_qselect<double> vs. avx512_qselect<double>]/5000                -0.1472         -0.1472          9705          8276          9711          8281
[avx512_qselect<float> vs. avx512_qselect<float>]/10                    -0.1708         -0.1706          6049          5016          6052          5019
[avx512_qselect<float> vs. avx512_qselect<float>]/100                   -0.1691         -0.1690          6171          5128          6174          5131
[avx512_qselect<float> vs. avx512_qselect<float>]/1000                  -0.1645         -0.1643          6229          5204          6231          5207
[avx512_qselect<float> vs. avx512_qselect<float>]/5000                  -0.1586         -0.1581          6140          5166          6143          5172
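
To make the trade-off behind the optional "hasnan" argument concrete, here is a minimal scalar sketch. It is not the library's AVX-512 code: the function name qselect_sketch, its parameter order, and the use of std::partition and std::nth_element are assumptions made for illustration only. The idea is that the extra pass over the array is paid only when the caller says NaNs may be present.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical scalar stand-in for a select routine with an optional NaN pass.
// This is an illustration, not the implementation merged in this PR.
template <typename T>
void qselect_sketch(T *arr, int64_t k, int64_t arrsize, bool hasnan = false)
{
    assert(k >= 0 && k < arrsize);
    int64_t n = arrsize;
    if (hasnan) {
        // Extra O(n) pass: move every NaN to the tail so that the
        // comparisons below only ever see ordered values.
        T *first_nan = std::partition(
                arr, arr + arrsize, [](T x) { return !std::isnan(x); });
        n = first_nan - arr;
    }
    // If k falls inside the NaN tail, the k-th element is already a NaN.
    if (k < n) { std::nth_element(arr, arr + k, arr + n); }
}
```

With hasnan left at its default of false, the sketch skips the partition pass entirely, which is the kind of saving the benchmark above measures; calling it that way on data that does contain NaNs gives unspecified ordering.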

Compared to the standard library (the tables below report stdpartialsort vs. avx512_partial_qsort):

Benchmark                                                                      Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------------
[stdpartialsort<float> vs. avx512_partial_qsort<float>]/10                  -0.2713         -0.2706          6952          5066          6953          5071
[stdpartialsort<float> vs. avx512_partial_qsort<float>]/100                 -0.7868         -0.7865         25051          5341         25053          5350
[stdpartialsort<float> vs. avx512_partial_qsort<float>]/1000                -0.9652         -0.9652        253841          8834        253840          8842
[stdpartialsort<float> vs. avx512_partial_qsort<float>]/5000                -0.9686         -0.9686        755803         23749        755779         23756
OVERALL_GEOMEAN                                                             -0.8858         -0.8857             0             0             0             0

Benchmark                                                                        Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------------------------------------------------------
[stdpartialsort<double> vs. avx512_partial_qsort<double>]/10                  -0.0438         -0.0429          7958          7609          7957          7616
[stdpartialsort<double> vs. avx512_partial_qsort<double>]/100                 -0.6967         -0.6963         25841          7839         25843          7848
[stdpartialsort<double> vs. avx512_partial_qsort<double>]/1000                -0.9505         -0.9505        243181         12033        243190         12043
[stdpartialsort<double> vs. avx512_partial_qsort<double>]/5000                -0.9480         -0.9480        694575         36117        694556         36134
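
For readers unfamiliar with this table layout: the Time and CPU columns are consistent with a relative change computed as (new - old) / old, so a negative value means the new run is faster. The helper names below are hypothetical and the snippet is only a sketch for checking rows by hand.

```cpp
#include <cmath>

// Relative change between two timings; negative means the new run is faster.
double relative_change(double old_time, double new_time)
{
    return (new_time - old_time) / std::fabs(old_time);
}

// Speedup factor implied by the same pair of timings.
double speedup(double old_time, double new_time)
{
    return old_time / new_time;
}
```

For example, the avx512_qselect<double> /10 row gives relative_change(8922, 7594), about -0.1488, and the float partial sort /100 row (25051 vs. 5341) corresponds to a speedup of roughly 4.7x.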

ping @mosullivan93

@r-devulap r-devulap merged commit a4e57cb into intel:main Jun 22, 2023
@mosullivan93
Contributor

Sorry, this notification got buried in my inbox until I saw the merge email. Thanks for patching up the partial functions, too.
