Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark explicit template specializations inline #112

Merged
merged 3 commits into from
Dec 12, 2023

Conversation

r-devulap
Copy link
Contributor

@Flamefire please review if you can :) This alone won't fix the NumPy build, I will have a patch for that soon.

@r-devulap r-devulap force-pushed the np-build branch 2 times, most recently from 1b73813 to 28740e7 Compare December 12, 2023 20:24
@r-devulap r-devulap merged commit a372569 into intel:main Dec 12, 2023
7 checks passed
Copy link

@Flamefire Flamefire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this includes the inline for the specializations similar to what I already tested this works.

As for the implementation:

  • Why use X86_SIMD_SORT_INLINE_ONLY instead of inline? Doesn't it make the code less readable?
  • And as mentioned in the numpy issue: Why keep the templates at all? Overloads are simpler and would allow using the X86_SIMD_SORT_INLINE define (or IMO preferably simply static) to create separate instances per TU allowing the Numpy multiple-compilation mode without ODR issues.

@@ -47,14 +48,17 @@
* Force inline in cygwin to work around a compiler bug. See
* https://github.com/numpy/numpy/pull/22315#issuecomment-1267757584
*/
#define X86_SIMD_SORT_INLINE_ONLY inline

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the comment above misplaced as it applies to the define below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point. Will fix the order :)

@r-devulap
Copy link
Contributor Author

r-devulap commented Dec 13, 2023

Why use X86_SIMD_SORT_INLINE_ONLY instead of inline? Doesn't it make the code less readable?

I think the macro is pretty self-explanatory. Having a macro gives the ability to make modifications to it easily in one place, if ever needed in the future.

And as mentioned in the numpy issue: Why keep the templates at all?

Without templates, won't I would need 9 identical copies of all the functions? (qsort, qselect, partialsort, argsort, argselect, etc.) Ex:

X86_SIMD_SORT_INLINE void xss_qsort(T *arr, arrsize_t arrsize, bool hasnan)
.

@Flamefire
Copy link

I think the macro is pretty self-explanatory. Having a macro gives the ability to make modifications to it easily in one place, if ever needed in the future.

It is much longer and e.g. X86_SIMD_SORT_INLINE could be confused with that. Given the name "INLINE_ONLY" what modification would be possibly without changing the name?
Anyway I have to admit that I dislike macros and try to avoid them whenever possible. Here it seems very odd to have it defined empty. So I'm wondering if that is even correct but I don't see which compilers would actually use that code path.

And as mentioned in the numpy issue: Why keep the templates at all?

Without templates, won't I would need 9 identical copies of all the functions? (qsort, qselect, partialsort, argsort, argselect, etc.)

True yes, there it is well suited. I was more referring to functions which basically only have specializations and no base implementation. This seems to have changed. Previously this was related to e.g. avx512_qselect but now I don't even see the base template anymore (which was only declared but not defined). Only the specialization in src/avx512fp16-16bit-qsort.hpp. How does that work? What am I missing?

Anyway that comment doesn't seem to apply anymore as I missed what was changed in other PRs/commits such as 472c7d0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants