-
-
Notifications
You must be signed in to change notification settings - Fork 719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deserialization: zero-copy merge subframes when possible #5208
Deserialization: zero-copy merge subframes when possible #5208
Conversation
Use NumPy to get the pointers of the memoryviews and do pointer arithmetic. This is way more reliable and self-contained. TODO unit test for `merge_memoryviews` and deal with NumPy situation
It's very annoying: `from_buffer` requires the buffer to be writeable, because ctypes is inconsistently strict about const correctness (see https://bugs.python.org/issue11427#msg148586). I don't think there's any clever pointer-casting we could do as mentioned in that thread, since we need to access through the buffer protocol—we can't just cast `mv.obj` to `c_void_p`, since we need the pointer into the current location within the buffer, not just at the start of the buffer. So for read-only buffers, we try to use NumPy, and if it's not available, we'll end up silently(!) falling back on copying.
Co-authored-by: crusaderky <crusaderky@gmail.com>
@jakirkham do you have any more thoughts on this? @jcrist mentioned he thought frame splitting itself might be unnecessary except for websockets, and should be handled at the transport level instead. I think this PR has become somewhat low priority because there's discussion around significantly reworking the protocol, but I do have a feeling it would make a difference for users right now. |
I think this is still useful. Have some WIP review comments that I've fallen behind. Will try to catch back up later this week or next week. Yeah at least on the To Jim's point, I think only frame splitting we do on for communication is baked into the transport layers already. One case is websockets. Another is working around a bug in OpenSSL 1.0.2. In either case neither of these should be in serialization. Jim if you see other issues please let us know. |
I think this is a high impact issue. Are there any major concerns left before we can merge? I get that some parts of the protocol are currently in review and might change but until then this should deliver a significant improvement. The changes are well enough isolated that we can easily revert if necessary. friendly ping @jakirkham @madsbk @jcrist If there are no objections within the next 24h I will go ahead and merge. |
I've given this a read through and it seems fine to me to merge as is. I still think that splitting large buffers should be done at the transport level instead of here, but for now this seems fine. Test failure is unrelated, should should be good-to-go. |
Have some draft review comments above that I haven't had a chance to finish. Can try to finish them, but unfortunately it won't happen for a few more days. That said, feel free to merge anyways if you need this soon and can try to follow up later |
@fjetter shall we merge this then? I'd love to see it in the next release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the delay. Had a few minor comments below
Co-authored-by: jakirkham <jakirkham@gmail.com>
I merged main and fixed a linting issue. will merge once green-ish |
17 tests broke on the |
…copy-merge-memoryviews
@fjetter the tests you just added in #5160 are failing on 3.7 everywhere, and ubuntu 3.8, with odd TypeErrors related to the mocks > assert fut2.key in mocked_gather.call_args.kwargs["to_gather"]
E TypeError: tuple indices must be integers or slices, not str Then there's known flaky #3574 and one |
…copy-merge-memoryviews
This is still open. I've merged main and the test failures feel flaky to me, but I'm not sure. |
FWIW merged @pentschev's PR ( #5484 ), which included some test fixes for UCX that should clear out the gpuCI failures seen here. May need to merge again to get those Did see a couple unrelated test failures over there as well. Maybe @jrbourbeau will be able to let us know if these are flaky tests or if they need further investigation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work @gjoseph92, clever approach that is useful even if we move the splitting/merging to the transport layer!
rerun tests |
gpuCI now passed, thanks for checking this @jakirkham ! |
…copy-merge-memoryviews
This is green besides xref #5494 |
Apologies for the delay here. A few folks have approved this PR, so I'll plan to merge this in later today if no further comments |
Thanks @jrbourbeau and everyone for reviewing! Very excited to have this in. |
Better implementation of #5112.
The point of this change (copied from original PR; note I tested and the screencap behavior works with this PR too):
I'd always wanted to merge the memoryviews by comparing their pointers, but couldn't find an API to access that information, hence the gnarly previous PR passing around byte offsets.
Thanks to @jakirkham on #5112 (comment), I wrote the pointer arithmetic using NumPy to access the buffer interface. Then I found @mattgwwalker's great post explaining how to do exactly what we needed just using
ctypes
: https://mattgwwalker.wordpress.com/2020/10/15/address-of-a-buffer-in-python/.black distributed
/flake8 distributed
/isort distributed