-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
don't perform unnecessary work when diffing maps #45
Conversation
(A ∩ B) ∪ (B - A) = B
@@ -108,20 +108,10 @@ | |||
(map ->Insertion) | |||
(remove #(contains? exp %) act))) | |||
|
|||
(let [exp {false 0, 0 0} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a stray top level form that probably should've been in a comment block. I just deleted it because it was a test of the impl below which is changing here and so it's no longer applicable.
Thanks for digging in and profiling! I didn't intuitively understand the identity (perhaps due to unfamiliarity with the notation) until I thought about it more. It's basically saying the intersection (i.e. what's in A and B): Combined with the difference (i.e., what's only in B): is just B: The code looks good and it passes our tests (including property-based tests), so this seems like a solid change. I have a nagging worry there's some implementation detail that would make these two not equivalent. |
I also had some uneasiness (why was it like that in the first place if it could be so simple?). However, as far as I can tell the order of |
I think enough due diligence has been done here. @latacora-paul could you still add a CHANGELOG entry? then this is ready to go out. |
done! |
I will note that I didn't notice tests running any faster after this change, but there's a lot of noise on our tests and I don't think they involve many large maps. I did a quick test using criterium on my computer on a 100-element hash-map, and it was several times faster and somewhat faster for a small hash-map. |
I was seeing some pretty bad performance when performing a large diff and tracked down the primary cause. I'm certain there are other improvements that can be made but I'll keep the change small for now :). This change runs ~15x faster for my use case.
Math identity illustrating why this change is okay:
Before:
After: