Mark explicit template specializations inline #112

r-devulap · 2023-12-12T19:11:18Z

@Flamefire please review if you can :) This alone won't fix the NumPy build, I will have a patch for that soon.
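
For context on the title, here is a minimal sketch of why an explicit specialization in a header needs `inline`. The header name and the function `sketch_is_simd_supported` are hypothetical and are not taken from x86-simd-sort; only the language rule is being illustrated.

```cpp
// sketch_support.hpp: hypothetical header included from several .cpp files.
#ifndef SKETCH_SUPPORT_HPP
#define SKETCH_SUPPORT_HPP

// Primary template. Implicit instantiations of a function template may
// appear in many translation units; the linker merges them.
template <typename T>
bool sketch_is_simd_supported()
{
    return false;
}

// An explicit (full) specialization is an ordinary function definition.
// Without `inline`, every translation unit that includes this header would
// emit its own strong definition and the link would fail with a
// duplicate-symbol error.
template <>
inline bool sketch_is_simd_supported<float>()
{
    return true;
}

#endif // SKETCH_SUPPORT_HPP
```

With `inline` in place the specialization can be included from any number of source files, provided every copy is token-for-token identical.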

Flamefire

This adds inline to the specializations, similar to what I already tested, so it works.

As for the implementation:

Why use X86_SIMD_SORT_INLINE_ONLY instead of inline? Doesn't it make the code less readable?
And as mentioned in the numpy issue: Why keep the templates at all? Overloads are simpler and would allow using the X86_SIMD_SORT_INLINE define (or IMO preferably simply static) to create separate instances per TU allowing the Numpy multiple-compilation mode without ODR issues.
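
The overload-based alternative suggested here could look roughly like the sketch below. The name `sketch_qsort` is hypothetical and `std::sort` stands in for the real SIMD kernels; this is an illustration of the idea, not the library's code.

```cpp
#include <algorithm>
#include <cstddef>

// Plain overloads instead of template specializations (hypothetical names).
// `static` gives each function internal linkage, so every translation unit
// that includes this header gets its own private copy. The same header can
// then be compiled several times with different flags, and the copies never
// meet at link time.
static void sketch_qsort(float *arr, std::size_t n)
{
    std::sort(arr, arr + n); // stand-in for a float-specific kernel
}

static void sketch_qsort(double *arr, std::size_t n)
{
    std::sort(arr, arr + n); // stand-in for a double-specific kernel
}
```

The trade-off is code size: each translation unit that uses a function carries its own copy, whereas `inline` definitions are merged by the linker and therefore must be identical everywhere.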

Flamefire · 2023-12-13T08:24:08Z

src/xss-common-includes.h

@@ -47,14 +48,17 @@
 * Force inline in cygwin to work around a compiler bug. See
 * https://github.com/numpy/numpy/pull/22315#issuecomment-1267757584
 */
+#define X86_SIMD_SORT_INLINE_ONLY inline


This makes the comment above misplaced as it applies to the define below

Ah, good point. Will fix the order :)

r-devulap · 2023-12-13T18:20:46Z

Why use X86_SIMD_SORT_INLINE_ONLY instead of inline? Doesn't it make the code less readable?

I think the macro is pretty self-explanatory. Having a macro gives the ability to make modifications to it easily in one place, if ever needed in the future.
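
A minimal sketch of the "change it in one place" argument, using hypothetical `SKETCH_*` macros rather than the real definitions in src/xss-common-includes.h:

```cpp
// Hypothetical macros; the real definitions in the repository may differ.
// All compiler-specific spelling lives here, so call sites never change.
#if defined(_MSC_VER)
#define SKETCH_FORCE_INLINE __forceinline
#elif defined(__GNUC__)
#define SKETCH_FORCE_INLINE inline __attribute__((always_inline))
#else
#define SKETCH_FORCE_INLINE inline
#endif

// Plain `inline`, used only to satisfy the one-definition rule.
#define SKETCH_INLINE_ONLY inline

// A small helper that should always be inlined.
SKETCH_FORCE_INLINE int sketch_add_one(int x)
{
    return x + 1;
}

// Primary template plus an explicit specialization marked with the macro.
template <typename T>
int sketch_rank(T)
{
    return 0;
}

template <>
SKETCH_INLINE_ONLY int sketch_rank<float>(float)
{
    return 1;
}
```

If a compiler ever needed a different spelling, only the macro block at the top would have to change.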

And as mentioned in the numpy issue: Why keep the templates at all?

Without templates, wouldn't I need 9 identical copies of all the functions (qsort, qselect, partialsort, argsort, argselect, etc.)? Ex:

x86-simd-sort/src/xss-common-qsort.h

Line 554 in 7060e3c

X86_SIMD_SORT_INLINE void xss_qsort(T *arr, arrsize_t arrsize, bool hasnan)

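
To illustrate the point about duplication, here is a sketch of a single function template serving several element types. `sketch_sort` is a hypothetical name and `std::sort` stands in for the real implementation.

```cpp
#include <algorithm>
#include <cstddef>

// One generic definition (hypothetical) covers every element type that
// supports operator<. Without the template, each type would need its own
// hand-written copy of the same body.
template <typename T>
void sketch_sort(T *arr, std::size_t n)
{
    std::sort(arr, arr + n);
}
```

Calling `sketch_sort` on a `double` array and on an `unsigned short` array instantiates two functions from the one definition.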

Flamefire · 2023-12-13T19:04:30Z

I think the macro is pretty self-explanatory. Having a macro gives the ability to make modifications to it easily in one place, if ever needed in the future.

It is much longer, and it could be confused with e.g. X86_SIMD_SORT_INLINE. Given the name "INLINE_ONLY", what modification would be possible without changing the name?
Anyway, I have to admit that I dislike macros and try to avoid them whenever possible. Here it seems very odd to have it defined empty, so I'm wondering whether that is even correct, but I don't see which compilers would actually use that code path.

And as mentioned in the numpy issue: Why keep the templates at all?

Without templates, wouldn't I need 9 identical copies of all the functions (qsort, qselect, partialsort, argsort, argselect, etc.)?

True, templates are well suited there. I was referring more to functions that basically only have specializations and no base implementation. This seems to have changed: previously it applied to e.g. avx512_qselect, but now I don't even see the base template anymore (which was only declared, not defined), only the specialization in src/avx512fp16-16bit-qsort.hpp. How does that work? What am I missing?

Anyway, that comment doesn't seem to apply anymore, as I missed what was changed in other PRs/commits such as 472c7d0.
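
The "declared but not defined" pattern mentioned above can be sketched as follows. `sketch_select` is a hypothetical name and `std::nth_element` stands in for the real kernel. One language rule is certain here: an explicit specialization requires the primary template to be declared earlier in the translation unit. Where that declaration lives in the repository is not shown in this conversation.

```cpp
#include <algorithm>
#include <cstddef>

// Primary template: declared, never defined. Calling it with a type that
// has no specialization compiles but fails at link time.
template <typename T>
void sketch_select(T *arr, std::size_t k, std::size_t n);

// Explicit specialization for float, marked inline so the header can be
// included from several translation units.
template <>
inline void sketch_select<float>(float *arr, std::size_t k, std::size_t n)
{
    // Places the k-th smallest element at index k.
    std::nth_element(arr, arr + k, arr + n);
}
```

With this layout, supporting a new element type means adding another specialization rather than touching a generic body.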

r-devulap force-pushed the np-build branch 2 times, most recently from 1b73813 to 28740e7 Compare December 12, 2023 20:24

r-devulap added 3 commits December 12, 2023 12:24

Add inline keyword for template specializations

28740e7

Add CI job to build numpy with SPR baseline

118ed88

typo fix

abe9974

r-devulap merged commit a372569 into intel:main Dec 12, 2023
7 checks passed

Flamefire reviewed Dec 13, 2023
