Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use uint32_t instead of size_t for object sort #126

Merged
merged 3 commits into from
Feb 1, 2024

Conversation

r-devulap
Copy link
Contributor

Improves performance especially when the metric to sort is a 32-bit dtype:

Benchmark                                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
[scalarobjsort.*float vs. simdobjsort.*float],x>>/100                             +0.0933         +0.0947          1419          1551          1421          1555
[scalarobjsort.*float vs. simdobjsort.*float],x>>/1000                            -0.3962         -0.3965         15418          9309         15431          9313
[scalarobjsort.*float vs. simdobjsort.*float],x>>/10000                           -0.8120         -0.8119        576470        108403        576505        108422
[scalarobjsort.*float vs. simdobjsort.*float],x>>/100000                          -0.7364         -0.7364       7084507       1867259       7084966       1867333
[scalarobjsort.*float vs. simdobjsort.*float],x>>/1000000                         -0.5463         -0.5463      86243039      39130817      86238537      39128228
[scalarobjsort.*float vs. simdobjsort.*float],x>>/10000000                        +0.0628         +0.0628    1018919772    1082899576    1018866896    1082838013
[scalarobjsort.*float vs. simdobjsort.*float],taxicab>>/100                       -0.2089         -0.2088          2028          1604          2031          1607
[scalarobjsort.*float vs. simdobjsort.*float],taxicab>>/1000                      -0.6529         -0.6529         28076          9744         28085          9749
[scalarobjsort.*float vs. simdobjsort.*float],taxicab>>/10000                     -0.8666         -0.8666        852263        113676        852319        113696
[scalarobjsort.*float vs. simdobjsort.*float],taxicab>>/100000                    -0.8211         -0.8211      10474638       1873979      10474826       1873929
[scalarobjsort.*float vs. simdobjsort.*float],taxicab>>/1000000                   -0.6921         -0.6921     126190844      38854176     126185333      38851333
[scalarobjsort.*float vs. simdobjsort.*float],taxicab>>/10000000                  -0.2535         -0.2535    1460305765    1090189561    1460270229    1090125905
[scalarobjsort.*float vs. simdobjsort.*float],euclidean>>/100                     -0.3014         -0.3004          2273          1588          2275          1591
[scalarobjsort.*float vs. simdobjsort.*float],euclidean>>/1000                    -0.7477         -0.7476         36658          9247         36672          9254
[scalarobjsort.*float vs. simdobjsort.*float],euclidean>>/10000                   -0.8951         -0.8951       1073122        112611       1073198        112625
[scalarobjsort.*float vs. simdobjsort.*float],euclidean>>/100000                  -0.8582         -0.8581      13320127       1889451      13320036       1889454
[scalarobjsort.*float vs. simdobjsort.*float],euclidean>>/1000000                 -0.7599         -0.7599     162466510      39011741     162464089      39008496
[scalarobjsort.*float vs. simdobjsort.*float],euclidean>>/10000000                -0.4183         -0.4183    1865672145    1085333325    1865605675    1085226497
[scalarobjsort.*float vs. simdobjsort.*float],chebyshev>>/100                     -0.1406         -0.1400          1822          1566          1825          1570
[scalarobjsort.*float vs. simdobjsort.*float],chebyshev>>/1000                    -0.6403         -0.6403         25983          9346         25999          9352
[scalarobjsort.*float vs. simdobjsort.*float],chebyshev>>/10000                   -0.8596         -0.8596        808734        113518        808769        113526
[scalarobjsort.*float vs. simdobjsort.*float],chebyshev>>/100000                  -0.8172         -0.8172      10281025       1879824      10281239       1879870
[scalarobjsort.*float vs. simdobjsort.*float],chebyshev>>/1000000                 -0.6841         -0.6841     122766074      38785146     122764627      38783435
[scalarobjsort.*float vs. simdobjsort.*float],chebyshev>>/10000000                -0.2506         -0.2507    1441875046    1080508739    1441843964    1080441164
OVERALL_GEOMEAN                                                                   -0.6447         -0.6447             0             0             0             0

@r-devulap r-devulap merged commit 38f89b0 into intel:main Feb 1, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant