-
Notifications
You must be signed in to change notification settings - Fork 155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Benchmark custom maps #2573
Benchmark custom maps #2573
Conversation
5282504
to
24c1564
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cleans up quite a bit of loose ends, so a good job. I have some concerns about using arrays for Path in every function. keyPath computes the list lazily in in most functions, if we use a list, on most calls we will look at only the first 6-8 items in the path, while the full path may contain 32+ items. So laziness is our friend. There is 1 exception to the rule in insertWithKey! where we need to access particular locations in the path with VP.!. We might have two kinds of paths, one an array for use in insertWithKey' and one a lazy list for every where else a path is used.
We ought to talk about this.
twoLeaf path1 leaf1 path2 k2 v2 = go path1 path2 | ||
where | ||
leaf2 = Leaf k2 v2 | ||
go p1 p2 | ||
| Just (i, is) <- VP.uncons p1, | ||
Just (j, js) <- VP.uncons p2 = | ||
if i == j |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be some tension between thinking of Path as a list or as an array. Sometimes I think a list is better (like here, where we are pattern matching) and other times an array (when we need something like (path !! n)) The 'n' is usually needed to handle the (Leaf __ ) case because unlike all the other constructors Leaf's do not track which segement they are under. So we do some adustment to the Path, to recover this lost information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand your reasoning, but I was using benchmarks as a guide, rather than logical reasoning. In the end what we think the performance should be is not necessarily the performance that we'll get. Using lists for Path makes the worst case 5 times slower than when using primitive vector.
We can certainly discuss it in more detail. If you can come up with the worst case example where vector would perform worse that would be tremendously useful, because then we could benchmark it and reason about the performance hit if any. One way or another insert
is one of the most important operations in this map, and that is the one that I've optimized.
accum !ans (k, v) = insert k v ans | ||
{-# INLINE fromList #-} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't fold' strict in the accumulator argument (that is what the ' is used for), so is the ! on ans needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't technically necessary to fix memory leaks, but there is a subtle difference between just foldl'
and foldl'
with bang on accumulator: https://www.well-typed.com/blog/2014/04/fixing-foldl/#a-postscript-which-foldl
With all honesty I added it because I was benchmarking the function and I was tapping into all possible optimizations I could find. So this change is not strictly necessary (pun intended), but it won't hurt either.
…Also: * Manual export of KeyMap interface * Make tests compile, rewrite histogram as debugging function with Vector and move it away from tests * Improve maxSuccess logic
…benhcmark * Add tests and bench to hie. Fix regression. Better name for deleteInternal function * Rename `wordSize` -> `segementMaxValue` * Fix Show instance * Rename `insertWithKey'` -> `insertWithKeyInternal`
99ff492
to
6fb198a
Compare
This PR adds benchmarks, applies hlint and some renameing for consistency.
Most importantly it also improves speed of insertion into the KeyMap and fixes a bug in
filterArrayWithBitmap
that resulted inintersection
throwing an exception on some input.