Alternative optimisations for insertion #980
base: master
Conversation
The current logic uses unsafe pointer equality to determine whether a rebalance is required, but that case is niche, and we can instead pass this information upwards. This performs well in benchmarks.
The pointer equality check is used to decide whether we can avoid doing anything as we retraverse back up the chain; if we turn it around and use an explicit continuation instead, we can simply choose not to run it.
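As an illustration of the idea (a toy sketch on an unbalanced search tree, not the actual patch — the real code also threads balancing information through the recursion), the insert can report upwards whether anything changed, so parents keep their original nodes instead of detecting "no change" via pointer comparison afterwards:

```haskell
-- Toy sketch, not the actual containers code: an unbalanced search tree
-- where insert reports upwards whether the tree changed.
data Tree k v = Tip | Bin (Tree k v) k v (Tree k v)

-- Nothing means "nothing changed": the key was already present, so every
-- parent on the path can keep sharing its original node.
insertIfAbsent :: Ord k => k -> v -> Tree k v -> Maybe (Tree k v)
insertIfAbsent k v = go
  where
    go Tip = Just (Bin Tip k v Tip)
    go (Bin l kx vx r) = case compare k kx of
      LT -> fmap (\l' -> Bin l' kx vx r) (go l)
      GT -> fmap (\r' -> Bin l kx vx r') (go r)
      EQ -> Nothing  -- key present: report "no change" upwards
```

A caller would then write `maybe t id (insertIfAbsent k v t)`; the actual patch threads a rebalance flag rather than `Maybe`, but the shape of the control flow is the same.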
I'm surprised you're getting good results with explicit continuations. Do you know how GHC is transforming those?
Yeah, I couldn't really see any differences in the union benchmarks (maybe one was marginally faster?). From trying an explicit continuation on the other inserts, I expect that it is slower when there is a value inserted. I haven't looked at the core yet. I also considered just throwing a sentinel exception if EQ is found, then catching and returning the top map. (Actually, delimited continuations are in GHC now; we could use those.)
@@ -84,6 +86,9 @@ main = do
      , bench "mapMaybeWithKey" $ whnf (M.mapMaybeWithKey (const maybeDel)) m
      , bench "lookupIndex" $ whnf (lookupIndex keys) m
      , bench "union" $ whnf (M.union m_even) m_odd
+     , bench "union_identical" $ whnf (M.union m_even) m_even
+     , bench "union_sparse" $ whnf (M.union m_even) m_sparse
+     , bench "union_into_sparse" $ whnf (M.union m_sparse) m_even
Two more with m_sparse and m_odd would show if there's a slowdown in insertR.
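Following the naming pattern already used in the diff, the two extra cases suggested here might look like this (the benchmark names are hypothetical; `m_sparse` and `m_odd` are the maps already defined in the suite):

```haskell
-- Hypothetical additions to the same benchmark group:
      , bench "union_sparse_odd" $ whnf (M.union m_sparse) m_odd
      , bench "union_odd_sparse" $ whnf (M.union m_odd) m_sparse
```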
I would expect exceptions to be slow. I think I might have tried? I don't
remember for sure. If you try, use unsafeDupablePerformIO to catch the
exception. Also consider using raise# and catch# manually, to allow an
exception type other than SomeException (this may require
reallyUnsafePtrEquality# to determine whether you got the not-present
exception).
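A sketch of that sentinel-exception approach on a toy unbalanced tree, using the safe `throw`/`try` from `Control.Exception` rather than the `raise#`/`catch#` primops suggested above (`NotPresent` and `insertRSketch` are made-up names):

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Exception (Exception, evaluate, throw, try)
import System.IO.Unsafe (unsafeDupablePerformIO)

-- Toy unbalanced tree standing in for Data.Map's balanced one.
data Tree k v = Tip | Bin (Tree k v) k v (Tree k v)

data NotPresent = NotPresent deriving Show
instance Exception NotPresent

-- insertR-style: the worker throws a sentinel on EQ, and the wrapper
-- catches it and returns the untouched top-level map. The strict lets
-- force the rebuilt spine inside `evaluate`, so the throw cannot escape
-- later as a lazy thunk.
insertRSketch :: Ord k => k -> v -> Tree k v -> Tree k v
insertRSketch k v top = unsafeDupablePerformIO $ do
  r <- try (evaluate (go top))
  pure $ case r of
    Left NotPresent -> top  -- key already present: keep the original map
    Right t'        -> t'
  where
    go Tip = Bin Tip k v Tip
    go (Bin l kx vx r) = case compare k kx of
      LT -> let !l' = go l in Bin l' kx vx r
      GT -> let !r' = go r in Bin l kx vx r'
      EQ -> throw NotPresent
```

As the comment above notes, a real implementation would likely want `raise#`/`catch#` directly to avoid the `SomeException` wrapping; this safe version only shows the control flow.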
…On Thu, Jan 11, 2024, 10:31 PM Huw Campbell ***@***.***> wrote:
Yeah I couldn't really see any differences in the union benchmarks (maybe
one was marginally faster?). From trying an explicit continuation on the
other inserts, I expect that it is slower when there is a value inserted.
I haven't looked at the core yet.
I also considered just throwing a sentinel exception if EQ is found, then
catching and returning the top map. It would probably be fast, but
exceptions for control flow are a bit gross.
I'm not as concerned with that. I guess the question is: do we want a 10-15% sacrifice on insertion when the key is not already present, but a 20-30% improvement when it is? How would that decision be made?
This sounds reasonable. The current code seems optimized for a very rare case. We have a comment acknowledging the same:
containers/containers/src/Data/Map/Internal.hs, lines 785 to 788 in 41783ed
But why is it likely in […]? There are a few variants of […]:
A. Pointer equality (current)
[…]
Curiously, strict map uses C instead of A:
containers/containers/src/Data/Map/Strict/Internal.hs, lines 503 to 506 in 41783ed
And there are 3 situations that can occur: […]
So, there's a bunch of benchmarking involved, if you want to figure this out.
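For reference, the pointer-equality variant hinges on `reallyUnsafePtrEquality#`: after the recursive call, the parent compares the child it got back against the child it passed in, and reuses its own node when they are the same heap object. A simplified sketch on a toy unbalanced tree (the real code also rebalances):

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE MagicHash   #-}
import GHC.Exts (isTrue#, reallyUnsafePtrEquality#)

data Tree k v = Tip | Bin (Tree k v) k v (Tree k v)

ptrEq :: a -> a -> Bool
ptrEq x y = isTrue# (reallyUnsafePtrEquality# x y)

-- Pointer-equality sketch: detect "nothing changed" on the way back up
-- by comparing heap pointers, so inserting an already-present value
-- returns the original tree with full sharing.
insertPE :: Ord k => k -> v -> Tree k v -> Tree k v
insertPE k v = go
  where
    go Tip = Bin Tip k v Tip
    go t@(Bin l kx vx r) = case compare k kx of
      LT -> let !l' = go l in if ptrEq l' l then t else Bin l' kx vx r
      GT -> let !r' = go r in if ptrEq r' r then t else Bin l kx vx r'
      EQ -> if ptrEq v vx then t else Bin l kx v r
```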
This is to open the conversation. I haven't looked at the strict version or gone 100% at this stage.
I'm not convinced that the (unsafe) optimisations for identical pointers are worth the complexity, and they hurt performance for inserts of new elements and for replacement of elements compared to this version.
Here, instead of checking pointer equality, we pass upwards whether a rebalance is required.
So inserting when the key is missing (common), or a new item at the same key (also common), is faster, while only inserting the literally identical item (i.e., a no-op, probably less common) is slower.
I've redone the benchmarks, adding some extra calls to rnf in the suite. It looks like insert is about 22% faster if the key is matched and 14% slower if it's not. I've also done the same trick for insertWith, where it's closer to 25% faster and 10% slower respectively.
For insertR, I've instead manually built the continuation, and just return the top-level map if the key is found. I'm not sure if this helps much at this stage; union uses it, but it doesn't lean on it too hard.
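The manually built continuation for insertR can be sketched like this on a toy unbalanced tree (`insertRCont` is a made-up name): the continuation rebuilds the spine above the insertion point on the way out, and on a hit we simply discard it and hand back the untouched top-level map.

```haskell
-- Explicit-continuation sketch of insertR: `rebuild` reconstructs the
-- path above the insertion point; when the key is already present we
-- drop `rebuild` entirely and return the original map.
data Tree k v = Tip | Bin (Tree k v) k v (Tree k v)

insertRCont :: Ord k => k -> v -> Tree k v -> Tree k v
insertRCont k v top = go top id
  where
    go Tip rebuild = rebuild (Bin Tip k v Tip)
    go (Bin l kx vx r) rebuild = case compare k kx of
      LT -> go l (\l' -> rebuild (Bin l' kx vx r))
      GT -> go r (\r' -> rebuild (Bin l kx vx r'))
      EQ -> top  -- found: skip the continuation, keep the original map
```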