unionArrayBy: Find next 1-bits with countTrailingZeros #395
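For readers unfamiliar with the technique named in the title, here is a minimal sketch of finding each next 1-bit of a bitmap with `countTrailingZeros`. It is not the code from this PR: the function name `setBitPositions` and the `Word32` bitmap type are assumptions made for illustration only.

```haskell
import Data.Bits (countTrailingZeros, (.&.))
import Data.Word (Word32)

-- | Positions of the 1-bits in a bitmap, lowest position first.
-- (Hypothetical helper, not part of the library.)
setBitPositions :: Word32 -> [Int]
setBitPositions 0 = []
setBitPositions b =
  -- countTrailingZeros gives the position of the lowest set bit;
  -- b .&. (b - 1) clears that bit so the next call finds the next one.
  countTrailingZeros b : setBitPositions (b .&. (b - 1))

main :: IO ()
main = print (setBitPositions 0x29)
```

For the bitmap `0x29` (binary `101001`) this prints `[0,3,5]`. The loop runs once per set bit rather than once per bit position, which is where a speed-up on sparse bitmaps could come from.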
The benchmark results are pretty surprising: The mean runtime of the [...] I think this is due to the weird benchmark data of contiguous [...] So, I'll need to improve the benchmark data, maybe simply try with [...] Also, check the generated code.
These numbers are from my desktop with an i7-4790K CPU. I'm now using my old laptop with an i3-2350M CPU. For the [...] That's pretty terrible. I'll have to look at the generated code. At least I know that it matters!
…and bring back array indices.
0f4f623 to eedc326
The latest version (eedc326) finally brings a speed-up for the [...] The [...] (Both measurements are from the ancient laptop.) I should try another version that computes the indices instead of having them as arguments in the loop. It might also be interesting to combine the various bitmap checks into a single case expression. Maybe GHC can generate something like a lookup table from this?!
Much worse, even if I keep [...]
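To make the two ideas above concrete, here is a rough sketch of a union loop that computes the source indices from the bitmaps with `popCount` instead of carrying them as loop arguments, and that folds the bitmap membership checks into a single `case` expression. This is not the PR's code: `Sparse`, `sparseIndex` and `unionBy` are hypothetical names, and plain lists stand in for the library's arrays.

```haskell
import Data.Bits (bit, countTrailingZeros, popCount, testBit, (.&.), (.|.))
import Data.Word (Word32)

-- Hypothetical stand-in for a bitmap-indexed array: if bit p is set in
-- the bitmap, the element for position p is stored at 'sparseIndex'.
data Sparse a = Sparse Word32 [a]

-- Index of position p: the number of set bits below p.
sparseIndex :: Word32 -> Int -> Int
sparseIndex bm p = popCount (bm .&. (bit p - 1))

unionBy :: (a -> a -> a) -> Sparse a -> Sparse a -> Sparse a
unionBy f (Sparse b1 xs) (Sparse b2 ys) = Sparse b' (go b')
  where
    b' = b1 .|. b2
    go 0 = []
    go b =
      let p = countTrailingZeros b       -- next 1-bit of the union bitmap
          x = xs !! sparseIndex b1 p     -- lazy: only forced if bit p is in b1
          y = ys !! sparseIndex b2 p     -- lazy: only forced if bit p is in b2
          v = case (testBit b1 p, testBit b2 p) of
                (True, True)  -> f x y
                (True, False) -> x
                _             -> y       -- p is in b', so it must be in b2
      in v : go (b .&. (b - 1))          -- clear the bit just handled

main :: IO ()
main = do
  let Sparse bm vs = unionBy (+) (Sparse 0x5 [10, 20]) (Sparse 0x6 [1, 2 :: Int])
  print (bm, vs)
```

This prints `(7,[10,1,22])`. Whether computing the indices this way beats passing them through the loop depends on the generated code, so it would need to be measured rather than assumed.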
On my relatively beefier machine with the i7-4790K I'm getting the following numbers:
I tried this:
Unfortunately it's slower (measured on the old laptop):
I think I'll leave it at this for now. Ultimately it would be nice to recover the perf losses for cases like the [...]
Closes #374.