Potential memory optimization for IntMap and IntSet #991
Comments
Does it? Transparent conversion between the old and the proposed new representation seems to me like the perfect use case for a pattern synonym, one that GHC should be able to inline away. I am skeptical of the overall effect on containers (certainly the original author(s) must have discussed this idea), but at least it could serve as a GHC test case.
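To make the pattern synonym idea concrete, here is a minimal sketch, not code from containers. The type `Tree`, the constructor `PBin` and the helper `unpack` are hypothetical names. It assumes the packed word is simply prefix OR mask, as described in the issue text, and exposes the old four-field `Bin` view on top of it.

```haskell
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE ViewPatterns #-}

import Data.Bits ((.&.), (.|.))

-- Hypothetical packed tree: a single Int holds prefix and mask together.
data Tree a
  = PBin {-# UNPACK #-} !Int !(Tree a) !(Tree a)
  | Tip {-# UNPACK #-} !Int a
  | Nil

-- Split the packed word. Clearing the lowest set bit yields the prefix;
-- the bit that was cleared is the mask.
unpack :: Int -> (Int, Int)
unpack pm = (p, pm - p)
  where
    p = pm .&. (pm - 1)

-- Old-style view of a branch: prefix, mask, left subtree, right subtree.
pattern Bin :: Int -> Int -> Tree a -> Tree a -> Tree a
pattern Bin p m l r <- PBin (unpack -> (p, m)) l r
  where
    Bin p m l r = PBin (p .|. m) l r

{-# COMPLETE Bin, Tip, Nil #-}

-- Existing code written against the four-field Bin keeps compiling.
size :: Tree a -> Int
size (Bin _ _ l r) = size l + size r
size (Tip _ _) = 1
size Nil = 0
```

Whether GHC actually inlines the view away in the hot paths is exactly the open question raised here, so this only validates the semantics. The definitions can be loaded in GHCi and exercised with something like `size (Bin 8 4 (Tip 9 'a') (Tip 13 'b'))`.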
Thanks, that should help incrementally test and benchmark it.
Do you mean checking how it affects GHC tests?
Documenting some ideas: An alternative to the code above is making

    i2w :: Int -> Word
    i2w = fromIntegral

    nomatch k pm = i2w k < i2w p || i2w (pm+m) < i2w k
      where
        p = pm .&. (pm-1)
        m = pm - p

In fact it can even be done today

    nomatch k p m = i2w k < i2w p || i2w (p+m+m) < i2w k

I wonder if this is better than the current version, which, expanded, looks like

    nomatch k p m = (k .&. (-m `xor` m)) /= p
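A small sketch for comparing the two styles of `nomatch` side by side. The names `nomatchMask` and `nomatchRange` are made up for this example. One caveat: the sketch uses the inclusive upper bound `p + 2m - 1`. As far as I can tell, comparing against `p + 2m` as in the snippet above would accept the key `p + 2m` itself, and `p + 2m` wraps around to 0 when `m` is the top bit, so treat the exact bound as something to check before benchmarking.

```haskell
import Data.Bits (xor, (.&.))

i2w :: Int -> Word
i2w = fromIntegral

-- Mask-based test, the expanded current version quoted above.
nomatchMask :: Int -> Int -> Int -> Bool
nomatchMask k p m = (k .&. (negate m `xor` m)) /= p

-- Range-based test: a Bin with prefix p and mask m covers the keys
-- p .. p + 2m - 1 when compared as unsigned words.
-- Int arithmetic wraps, so the bound is also right when m is the top bit.
nomatchRange :: Int -> Int -> Int -> Bool
nomatchRange k p m = i2w k < i2w p || i2w (p + m + (m - 1)) < i2w k
```

For example, with `p = 8` and `m = 4` both functions accept the keys 8 to 15 and reject 7 and 16.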
Definitely worth trying, if there really are enough bits! Give a pattern synonym a go; that should at least validate the semantics in a hurry. My experience with pattern synonyms is a little bit old. When I was messing around with them a bunch, I kept running into performance problems where they wouldn't inline well (IIRC, join point issues again). But now there are |
that's what I meant: the proposed transformation may serve as a test case for GHC's inliner. |
Cool, I'll test this out, maybe some time in the next couple of weeks. |
I assume this will make the merges (and everything similar to them) worse. As mentioned I wonder what the slowdown would be and whether bothering with it is even worth the meager 8 byte savings on each branch. |
I have tested out the changes on
What do you think? |
I think keeping it as a pattern synonym would probably be a good thing, but we can decide later. |
This isn't particularly novel, so maybe it has been proposed before. Let me know if it has. I only found #340 in my search, which is a bigger idea.
Current definition
Today, Bin for IntMap and IntSet is represented as
containers/containers/src/Data/IntMap/Internal.hs
Lines 355 to 362 in 3c13e0b
Potential new definition
The prefix and the mask can be merged so that we save one word per Bin.
The mask bit is always zero in the prefix, so this isn't throwing away any information. The lowest set bit of the new int is the current mask and the rest of it is the current prefix.
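The packing described above comes down to a few bit operations. A minimal sketch follows. The helper names `packPrefixMask`, `maskOf` and `prefixOf` are hypothetical and not part of containers.

```haskell
import Data.Bits ((.&.), (.|.))

-- Merge prefix and mask into one word. The mask bit is zero in the
-- prefix, so OR-ing them loses nothing.
packPrefixMask :: Int -> Int -> Int
packPrefixMask p m = p .|. m

-- The mask is the lowest set bit of the packed word.
maskOf :: Int -> Int
maskOf pm = pm .&. negate pm

-- The prefix is the packed word with its lowest set bit cleared.
prefixOf :: Int -> Int
prefixOf pm = pm .&. (pm - 1)
```

As a worked case, prefix 8 (binary 1000) and mask 4 (binary 0100) pack to 12 (binary 1100). The lowest set bit of 12 is 4, and clearing it gives back 8.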
Current branching on Bin
Branching on Bin is currently done like this:
containers/containers/src/Data/IntMap/Internal.hs
Lines 813 to 817 in 3c13e0b
containers/containers/src/Data/IntMap/Internal.hs
Lines 3527 to 3528 in 3c13e0b
containers/containers/src/Data/IntMap/Internal.hs
Lines 3519 to 3520 in 3c13e0b
New branching on Bin
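One possible shape for branching on the packed word, given here as a sketch under the assumption that the word is prefix OR mask. This is an illustration, not necessarily the code originally posted under this heading. `nomatch` checks the bits above the mask and `left` becomes a single unsigned comparison.

```haskell
import Data.Bits (xor, (.&.))

i2w :: Int -> Word
i2w = fromIntegral

-- pm `xor` negate pm has exactly the bits above the mask bit set, so the
-- key fails to match when it differs from the prefix in any of those bits.
nomatch :: Int -> Int -> Bool
nomatch k pm = (k `xor` pm) .&. (pm `xor` negate pm) /= 0

-- For a key that matches the prefix, pm is the smallest key of the right
-- subtree, so the key goes left exactly when it is below pm as an
-- unsigned word.
left :: Int -> Int -> Bool
left k pm = i2w k < i2w pm
```

The unsigned comparison in `left` matters when the mask is the sign bit: `pm` is then `minBound`, and a signed comparison would send every key to the right.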
Performance impact
I don't know yet, I need to make the change and benchmark. And it will involve changing every function, so it will take a while.
Memory is certainly saved.
nomatch gets a little more expensive, but left is cheaper than zero, so I'm hoping there is zero or positive overall effect.
What do you think? Is it worth checking out how this will fare? And is there any bad consequence of this representation, that I didn't think of?