Provide a SIMD implementation of swisstable_group_query suitable for ARM #17

Open
wesleywiser opened this issue Sep 20, 2021 · 2 comments

Comments

@wesleywiser
Member

Briefly mentioned in #16, but as ARM devices become more popular, it would be great to have an accelerated implementation for them as well.

@michaelwoerister
Member

According to this comment in hashbrown it might not be worth the trouble:

// Use the SSE2 implementation if possible: it allows us to scan 16 buckets
// at once instead of 8. We don't bother with AVX since it would require
// runtime dispatch and wouldn't gain us much anyways: the probability of
// finding a match drops off drastically after the first few buckets.
//
// I attempted an implementation on ARM using NEON instructions, but it
// turns out that most NEON instructions have multi-cycle latency, which in
// the end outweighs any gains over the generic implementation.

Also, according to local benchmarks someone ran for me on an M1 Mac mini, the non-SIMD version there still easily outperformed the SIMD version running on an AMD Ryzen 5900X 😃
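
For context, here is a rough sketch of the kind of "generic" (non-SIMD) group match that the quoted comment compares NEON against: eight control bytes packed into a `u64` and compared against a tag in one pass using ordinary integer arithmetic. The names `GROUP_WIDTH`, `repeat`, `match_byte`, and `matching_lanes` are illustrative only and are not taken from this crate; the bit trick is the well-known "find a zero byte in a word" idiom and may differ in detail from what either library actually ships.

```rust
/// Number of control bytes scanned at once by this portable sketch
/// (the SSE2 path mentioned above scans 16).
const GROUP_WIDTH: usize = 8;

/// Broadcast one byte into all eight lanes of a u64.
fn repeat(byte: u8) -> u64 {
    u64::from_ne_bytes([byte; GROUP_WIDTH])
}

/// Returns a mask with the high bit set in every lane whose control byte
/// matches `byte`. Caveat: a lane directly above a true match can be
/// flagged spuriously (borrow propagation), so a caller would still have
/// to compare the full key afterwards.
fn match_byte(group: [u8; GROUP_WIDTH], byte: u8) -> u64 {
    // Little-endian load so that lane 0 is the lowest byte of the word.
    let word = u64::from_le_bytes(group);
    // Lanes equal to `byte` become 0x00 after the XOR.
    let cmp = word ^ repeat(byte);
    // Classic zero-byte test: (x - 0x01..) & !x & 0x80..
    cmp.wrapping_sub(repeat(0x01)) & !cmp & repeat(0x80)
}

/// Turn the mask into lane indices, lowest lane first.
fn matching_lanes(mut mask: u64) -> Vec<usize> {
    let mut lanes = Vec::new();
    while mask != 0 {
        lanes.push(mask.trailing_zeros() as usize / 8);
        mask &= mask - 1; // clear the lowest set bit
    }
    lanes
}

fn main() {
    let group = [0x12, 0x7f, 0x12, 0x80, 0xff, 0x33, 0x00, 0x12];
    println!("{:?}", matching_lanes(match_byte(group, 0x12)));
}
```

Everything here is a handful of single-cycle integer operations on one register, which is presumably why a NEON version with multi-cycle instruction latencies has a hard time beating it.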

@michaelwoerister
Member

I just found this PR/discussion in the hashbrown repo: rust-lang/hashbrown#269
Very interesting!
