FIX-#4564: Register custom serializer to avoid Ray race condition #4568
Codecov Report
@@             Coverage Diff             @@
##           master    #4568       +/-   ##
===========================================
+ Coverage   70.08%   89.38%   +19.30%
===========================================
  Files         228      229        +1
  Lines       18438    18714      +276
===========================================
+ Hits        12922    16728     +3806
+ Misses       5516     1986     -3530
LGTM! Pleased to release notes!
This pull request introduces 1 alert when merging f2031ed00a1f5b660a185cd1d972b5253a0cb9ab into d5ff94e - view on LGTM.com new alerts:
Please fix the PR title, link an issue to be resolved and add a release note with your GH nickname.
…e condition (modin-project#4568) Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com>
Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com>
f2031ed to f5fe1f4
@devin-petersohn You have "customer" instead of "custom" in the PR title and first commit description.
LGTM, Thanks!
LGTM! We should conduct a perf check to see how pickle compares to Ray's serializer.
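A minimal sketch of the pickle half of such a perf check, using only the standard library. The helper name `time_pickle_roundtrip` and the 8 MiB payload are illustrative assumptions, not code from this PR, and Ray's serializer is not measured here; a real comparison would time the same payload through Ray under the same conditions.

```python
import pickle
import time


def time_pickle_roundtrip(payload, repeats=3):
    """Return (best_dumps_seconds, best_loads_seconds, serialized_size_bytes).

    Hypothetical helper: times pickle.dumps and pickle.loads on `payload`
    and keeps the fastest of `repeats` runs to reduce timing noise.
    """
    best_dump = float("inf")
    best_load = float("inf")
    blob = b""
    for _ in range(repeats):
        start = time.perf_counter()
        blob = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)
        best_dump = min(best_dump, time.perf_counter() - start)

        start = time.perf_counter()
        restored = pickle.loads(blob)
        best_load = min(best_load, time.perf_counter() - start)

        # Sanity check: the round trip must preserve the data.
        assert restored == payload
    return best_dump, best_load, len(blob)


if __name__ == "__main__":
    # Assumed payload: 8 MiB of zero bytes, chosen only to keep the run short.
    payload = bytes(8 * 1024 * 1024)
    dump_s, load_s, size = time_pickle_roundtrip(payload)
    print(f"dumps: {dump_s * 1e3:.2f} ms, loads: {load_s * 1e3:.2f} ms, "
          f"size: {size / 2**20:.2f} MiB")
```

Raw bytes are close to a best case for pickle; DataFrame partitions would likely behave differently, so numbers from this sketch should not be read as representative of Modin workloads.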
🤣 fixed
@devin-petersohn "customer" is still in the first commit description. It's fine by me to leave it that way.
Please add your name to the contributors in the release notes!
Co-authored-by: Yaroslav Igoshev <Poolliver868@mail.ru>
Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com>
LGTM!
@modin-project/modin-ray is this the recommended way of fixing this? It seems like a bad idea, but it is the only way we could use the latest Ray release without a race condition.
LGTM, though I would like to find a better way... I wish it existed 😞
@devin-petersohn a good point here:
Could someone please check some benchmarks?
Drive-by-ing here, IDK if you have a good way of running benchmarks, but perhaps the github-actions bot that we wrote for the dedupe library could be useful to you. Take a look at this comment to see how it is used. Makes it really easy for us to test performance changes on a PR.
Thanks @NickCrews, we do have asv running on every pulled-in commit: https://modin.org/modin-bench/#/ Having it run as a job for pull requests would be nice, but with our current setup we can't do that. I will definitely take a look!
@vnlitvinov I have run some performance benchmarks and the numbers don't look good at all. On small to moderate datasets we see a 2x performance degradation across the board. The extra serialization and storage costs 200ms for every 50MB serialized, and memory usage is higher across the board. I think we need to pin Ray<1.13, as much as I hate doing that. I cannot justify the performance penalty.
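To put the quoted figure in perspective, here is a back-of-the-envelope sketch. It assumes the overhead scales linearly with the amount of data serialized, which the comment above does not state; the function and constant names are hypothetical.

```python
# Figures quoted in the comment above: 200 ms of extra cost per 50 MB serialized.
OVERHEAD_MS_PER_CHUNK = 200.0
CHUNK_MB = 50.0


def estimated_overhead_ms(serialized_mb):
    """Estimate the extra serialization + storage cost in milliseconds.

    Assumption (not confirmed in this thread): cost grows linearly with size.
    """
    return serialized_mb / CHUNK_MB * OVERHEAD_MS_PER_CHUNK


if __name__ == "__main__":
    for size_mb in (50, 250, 1000):
        print(f"{size_mb} MB -> ~{estimated_overhead_ms(size_mb):.0f} ms extra")
```

Under that linear assumption, serializing 1 GB (taken as 1000 MB) would add roughly 4 seconds.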
I have converted this pull request to a draft, hoping we can get some response from @modin-project/modin-ray. Linking to the ray issue I created here: ray-project/ray#25675
Closing because we chose to go a different direction.
Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com>
What do these changes do?
flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
git commit -s
docs/development/architecture.rst is up-to-date