Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core] Ray breaks alignment for NumPy arrays of doubles #11760

Closed
1 of 2 tasks
ienkovich opened this issue Nov 2, 2020 · 8 comments · Fixed by #11888
Closed
1 of 2 tasks

[Core] Ray breaks alignment for NumPy arrays of doubles #11760

ienkovich opened this issue Nov 2, 2020 · 8 comments · Fixed by #11888
Assignees
Labels
bug Something that is supposed to be working; but isn't P1 Issue that should be fixed within a few weeks

Comments

@ienkovich
Copy link

What is the problem?

I'm using Ray 1.0.0 with Python 3.8.3 on Ubuntu 18.04

When I use Ray to serialize/deserialize a NumPy array of doubles, I get an unaligned NumPy array (data is not 8-byte aligned).

Reproduction

Here is a reproducer I use

import numpy as np
import ray
ray.init()

arr1 = np.array([10.1, 20.2, 30.3], dtype = np.double)
oid = ray.put(arr1)
arr2 = ray.get(oid)

ptr1 = arr1.__array_interface__["data"][0]
print(f"Original address: 0x{hex(ptr1)} - {'un' if ptr1 % 8 != 0 else ''}aligned")
assert(ptr1 % 8 == 0)
ptr2 = arr2.__array_interface__["data"][0]
print(f"Deserialized address: 0x{hex(ptr2)} - {'un' if ptr2 % 8 != 0 else ''}aligned")
assert(ptr2 % 8 == 0)

The output is:

Original address: 0x0x56150a88c9d0 - aligned
Deserialized address: 0x0x7f738756010c - unaligned
Traceback (most recent call last):
  File "1.py", line 14, in <module>
    assert(ptr2 % 8 == 0)
AssertionError
  • I have verified my script runs in a clean environment and reproduces the issue.
  • I have verified the issue also occurs with the latest wheels.
@ienkovich ienkovich added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Nov 2, 2020
@ericl
Copy link
Contributor

ericl commented Nov 2, 2020

Is this a regression from Ray 0.8 series?

@ericl ericl removed the triage Needs triage (eg: priority, bug/not-bug, and owning component) label Nov 2, 2020
@robertnishihara
Copy link
Collaborator

I think we actually want the arrays to be 64-byte aligned for performance reasons. I know we guaranteed that when we were using pyarrow for serialization. @pcmoritz @suquark

@ericl ericl added the triage Needs triage (eg: priority, bug/not-bug, and owning component) label Nov 2, 2020
@ericl
Copy link
Contributor

ericl commented Nov 2, 2020

Interesting, so this may have been broken for some time now.

@vnlitvinov
Copy link
Contributor

Food for thought: I've tried pickling and unpickling back (by using builtin pickle module and calling pickle.loads(pickle.dumps(arr, -1))), and unpickled array is properly aligned, while retrieved via Ray is not.

@rkooo567 rkooo567 changed the title Ray breaks alignment for NumPy arrays of doubles [Core] Ray breaks alignment for NumPy arrays of doubles Nov 3, 2020
@rkooo567 rkooo567 added the core label Nov 3, 2020
@rkooo567
Copy link
Contributor

rkooo567 commented Nov 3, 2020

@ericl What's the action item here?

@ericl ericl added P1 Issue that should be fixed within a few weeks and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Nov 5, 2020
@suquark
Copy link
Member

suquark commented Nov 6, 2020

@robertnishihara I almost sure when moving from pyarrow to pickle5, the buffer is still aligned:

# Increase buffer address.
if view.len < kMajorBufferSize:
self._curr_buffer_addr = padded_length(
self._curr_buffer_addr, kMinorBufferAlign)
else:
self._curr_buffer_addr = padded_length(
self._curr_buffer_addr, kMajorBufferAlign)

For large chunks, we align the buffer to 64 bytes(kMajorBufferAlign); for small chunks, we align it to 8 bytes (kMinorBufferAlign).

Maybe it is related to the msgpack header which is introduced recently?

@pcmoritz
Copy link
Contributor

pcmoritz commented Nov 9, 2020

@suquark Could it be that the relative addresses are aligned, but not the start? Maybe we need to make sure to make ptr aligned in here: https://github.com/ray-project/ray/blob/master/python/ray/includes/serialization.pxi#L360

Can you test that and create a PR for it?

@robertnishihara
Copy link
Collaborator

robertnishihara commented Nov 9, 2020

@suquark it's definitely not aligned right now, e.g.,

import ray
import numpy as np
ray.init()
x = ray.get(ray.put(np.zeros(10**8)))
assert x.ctypes.data % 64 == 0  # This is often (always?) 12 for me. Not that I tried larger arrays in case the different code paths for smaller objects change things.

For some reason I thought we had a test for this, but I can't find it now.

EDIT: I guess we added the test in Arrow.

https://github.com/apache/arrow/blob/599b458c68dfcba38fe5448913d4bb69723e1439/python/pyarrow/tests/test_serialization.py#L1156-L1176

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something that is supposed to be working; but isn't P1 Issue that should be fixed within a few weeks
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants