Improvements for intel -m32 builds. #62

Merged, 1 commit, Aug 25, 2022

jkbonfield (Collaborator)

On this platform _mm256_extract_epi64 isn't defined, but the rest of
AVX2 is, so AVX2 auto-detection needs to fail here.
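
To illustrate why the intrinsic goes missing, here is a minimal sketch using a compile-time preprocessor guard. This is a different mechanism from the configure-time auto-detection this PR changes, and it is not taken from the commit. It assumes a GCC-style toolchain where _mm256_extract_epi64 is only declared for x86-64 targets; `HAVE_AVX2_EXTRACT64` and `lane0` are hypothetical names.

```c
#include <stdint.h>

/* Assumption: the compiler defines __AVX2__ when AVX2 is enabled and
 * __x86_64__ only for 64-bit x86 targets, so an -m32 build takes the
 * #else branch.  HAVE_AVX2_EXTRACT64 is a hypothetical macro name. */
#if defined(__AVX2__) && defined(__x86_64__)
#include <immintrin.h>
#define HAVE_AVX2_EXTRACT64 1
#else
#define HAVE_AVX2_EXTRACT64 0
#endif

/* Hypothetical helper: return the lowest of four 64-bit lanes. */
static int64_t lane0(const int64_t v[4]) {
#if HAVE_AVX2_EXTRACT64
    __m256i x = _mm256_loadu_si256((const __m256i *)v);
    return _mm256_extract_epi64(x, 0);
#else
    return v[0];   /* scalar path: -m32, non-x86, or AVX2 not enabled */
#endif
}
```

A configure probe achieves the same effect earlier: if a test program calling the intrinsic fails to compile, the AVX2 code path is switched off for the whole build.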

Also we get unaligned accesses in the SSE4 code with tbuf due to
differing data alignment caused by 32-bit pointers instead of 64-bit.
This exposes an underlying problem of using aligned SIMD writes on
tbuf without explicitly asking for alignment. (The new code is also
sometimes a little faster.)
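
The two usual ways out of this are sketched below: use unaligned stores, or explicitly request alignment for the buffer. This is an illustrative sketch only and makes no claim about which route the commit takes. The helper names and the buffer size are hypothetical; only the name tbuf comes from the description above.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: true if p sits on a 16-byte boundary. */
static int is_aligned16(const void *p) {
    return ((uintptr_t)p & 15) == 0;
}

/* Option 1: unaligned store, valid for any destination address. */
static void store16_any(uint8_t *dst, const uint8_t *src) {
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    _mm_storeu_si128((__m128i *)dst, v);
#else
    memcpy(dst, src, 16);              /* portable fallback */
#endif
}

/* Option 2: aligned store; the caller must guarantee the alignment. */
static void store16_aligned(uint8_t *dst, const uint8_t *src) {
    assert(is_aligned16(dst));         /* catch misuse before the store */
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    _mm_store_si128((__m128i *)dst, v);
#else
    memcpy(dst, src, 16);              /* portable fallback */
#endif
}

/* Returns 1 if both store paths round-trip 16 bytes correctly. */
static int tbuf_store_demo(void) {
    alignas(16) uint8_t tbuf[64] = {0};   /* alignment asked for explicitly */
    uint8_t src[16];
    for (int i = 0; i < 16; i++)
        src[i] = (uint8_t)i;
    store16_aligned(tbuf, src);           /* offset 0: aligned */
    store16_any(tbuf + 17, src);          /* offset 17: deliberately unaligned */
    return memcmp(tbuf, src, 16) == 0 && memcmp(tbuf + 17, src, 16) == 0;
}
```

Without the alignas, a plain uint8_t array is only guaranteed byte alignment, so whether an aligned store happens to work depends on the surrounding layout, which is how a change in pointer size can expose the problem.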

See also samtools/htslib#1500

@daviesrob daviesrob merged commit 843d4f6 into samtools:master Aug 25, 2022