Improvements for intel -m32 builds. #62

jkbonfield · 2022-08-17T15:40:59Z

On this platform _mm256_extract_epi64 isn't defined, but the rest of
AVX2 is, so AVX2 auto-detection needs to fail here.
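
A minimal sketch of the kind of probe that would catch this, assuming it is compiled with `-mavx2` (the function name `probe_avx2_extract` is hypothetical and this is not the project's actual configure test). Because the probe itself calls `_mm256_extract_epi64`, a `-m32` build that lacks the intrinsic should fail to compile or link it, and detection can then reject AVX2:

```c
#if defined(__AVX2__)
#include <immintrin.h>
#endif

/* Hypothetical probe, for illustration only.
 * Returns  1 if the AVX2 path (including _mm256_extract_epi64) built and
 *            produced the expected value,
 *          0 if it built but produced an unexpected value,
 *         -1 if this file was compiled without AVX2 enabled. */
static int probe_avx2_extract(void) {
#if defined(__AVX2__)
    /* Eight 32-bit lanes, each holding 1. */
    __m256i v = _mm256_set1_epi32(1);
    /* The intrinsic reported missing on -m32; the index must be a constant. */
    long long lo = _mm256_extract_epi64(v, 0);
    return lo == 0x0000000100000001LL ? 1 : 0;
#else
    return -1;
#endif
}
```

Including the problematic intrinsic in the probe means the check tests exactly what the SIMD code needs, rather than inferring it from the presence of other AVX2 intrinsics.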

Also, we get unaligned accesses in the SSE4 code with tbuf, because
32-bit pointers (instead of 64-bit) change the data alignment.
This exposes an underlying problem of using aligned SIMD writes on
tbuf without explicitly asking for alignment. (The new code is also
sometimes a little faster.)
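
The alignment issue can be sketched as follows. This is an illustration under assumptions, not the code from this PR: `demo_state`, `is_aligned16` and `store16_aligned` are hypothetical names, and the struct only stands in for whatever precedes tbuf in the real code. Without the explicit `_Alignas(16)`, the offset of the buffer depends on the size of the pointer before it (4 bytes with `-m32`, 8 bytes on 64-bit), so an aligned store may only work by accident:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical stand-in for a structure holding tbuf. */
typedef struct {
    void *ptr;                      /* 4 bytes with -m32, 8 bytes on 64-bit */
    _Alignas(16) uint8_t tbuf[64];  /* explicitly request 16-byte alignment */
} demo_state;

/* True if p is on a 16-byte boundary. */
static int is_aligned16(const void *p) {
    return ((uintptr_t)p & 15) == 0;
}

/* Copy 16 bytes into dst using an aligned SIMD store where available.
 * dst must be 16-byte aligned; src may be unaligned. */
static void store16_aligned(uint8_t *dst, const uint8_t *src) {
    assert(is_aligned16(dst));
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    _mm_store_si128((__m128i *)dst, v);  /* faults if dst is misaligned */
#else
    memcpy(dst, src, 16);                /* portable fallback */
#endif
}
```

With the alignment requested explicitly, `store16_aligned(s.tbuf, src)` is valid for a `demo_state s` regardless of pointer width. The alternative is to use unaligned stores such as `_mm_storeu_si128` wherever alignment is not guaranteed.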

See also samtools/htslib#1500

daviesrob merged commit 843d4f6 into samtools:master Aug 25, 2022
