Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support mutable tensor on oscar #2432

Merged
merged 49 commits into from
Oct 19, 2021
Merged

Support mutable tensor on oscar #2432

merged 49 commits into from
Oct 19, 2021

Conversation

Coco58323
Copy link
Contributor

@Coco58323 Coco58323 commented Sep 6, 2021

What do these changes do?

  1. add an API to the client/session to create a mutable tensor with given shape
  2. starts a "MutableActor" for every mutable tensor, which, will first create a set of chunks according to the shape, and distribute those tiled chunks to all workers, and, on each worker, create an actor to manage to write/read requests on the chunks being placed on that worker.
  3. maintain a chunk -> actor worker mapping.
  4. when read/write to the mutable tensor, first figure out which chunk to read/write, and forward the request to that worker. Now it can read/write a single point of a tensor.

Related issue number

Fixes #2195

@qinxuye
Copy link
Collaborator

qinxuye commented Sep 9, 2021

Just merge the master branch.

Copy link
Collaborator

@qinxuye qinxuye left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments

.gitignore Outdated Show resolved Hide resolved
@abstractmethod
async def create_mutable_tensor(self,
shape: tuple,
dtype: str,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type should be Union[np.dtype, str]

mars/services/mutable/supervisor/core.py Outdated Show resolved Hide resolved


class MutableTensorActor(mo.Actor):
def __init__(self, shape: tuple, dtype: str, chunk_size, worker_pools, name: str = None, default_value=0):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dtype should be Union[np.dtype, str], and how about just getting all bands when needed?

mars/services/mutable/supervisor/service.py Outdated Show resolved Hide resolved
mars/services/mutable/supervisor/service.py Outdated Show resolved Hide resolved
mars/services/mutable/supervisor/service.py Outdated Show resolved Hide resolved
mars/services/mutable/worker/service.py Outdated Show resolved Hide resolved
@@ -109,6 +109,20 @@ async def destroy_remote_object(self,
session = await self._get_session_ref(session_id)
return await session.destroy_remote_object(name)

async def create_mutable_tensor(self,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend to add this API to mutable API instead of session API.

mars/services/session/supervisor/core.py Outdated Show resolved Hide resolved
@qinxuye
Copy link
Collaborator

qinxuye commented Sep 14, 2021

Besides, now, the data stored in worker is just a numpy ndarray, please use storage API to store data. And seal_mutable_data can be added in order to convert a mutable tensor with a given version into a vanilla mars tensor.

Signed-off-by: Tao He <sighingnow@gmail.com>
Signed-off-by: Tao He <sighingnow@gmail.com>
Signed-off-by: Tao He <sighingnow@gmail.com>
Signed-off-by: Tao He <sighingnow@gmail.com>
Signed-off-by: Tao He <sighingnow@gmail.com>
qinxuye
qinxuye previously approved these changes Oct 11, 2021
Copy link
Collaborator

@qinxuye qinxuye left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

res = await oscar_api.get_mutable_tensor(name)
self.write(serialize_serializable(res))

@web_api('(?P<name>[^/]+)', method='delete')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use DELETE method to handle seal? Maybe POST is better. I'm also wondering why this is not covered in tests. A parameterized test with different APIs may be applied.

if name is None:
name = str(uuid.uuid1())
if name in self._mutable_objects:
raise ValueError("Mutable tensor %s already exists!" % name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use f-string to format error messages.

self._session_id = session_id
self._address = address
self._cluster_api = cluster_api
self._mutable_objects = dict()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it proper to reference mutable objects locally, or if we need to manage it at service side?

Copy link
Contributor

@sighingnow sighingnow Oct 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a get_mutable_tensor method on session.

We shouldn't record the state here, actually. I missed it in previous review.

from .service import MutableTensorActor


class MutableTensor:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an object passed to clients, I recommend putting it in mars.services.mutable.core.

from ....typing import ChunkType


class Chunk:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use more explicit class name, for inst., MutableTensorChunk.

Signed-off-by: Tao He <sighingnow@gmail.com>
@sighingnow
Copy link
Contributor

All concerns in previous review comments and in DingDing talks have been addressed.

Signed-off-by: Tao He <sighingnow@gmail.com>
Signed-off-by: Tao He <sighingnow@gmail.com>
Signed-off-by: Tao He <sighingnow@gmail.com>
Signed-off-by: Tao He <sighingnow@gmail.com>
Signed-off-by: Tao He <sighingnow@gmail.com>
Signed-off-by: Tao He <sighingnow@gmail.com>
Signed-off-by: Tao He <sighingnow@gmail.com>
Signed-off-by: Tao He <sighingnow@gmail.com>
Copy link
Member

@wjsi wjsi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wjsi wjsi merged commit a5e4044 into mars-project:master Oct 19, 2021
@qinxuye
Copy link
Collaborator

qinxuye commented Oct 19, 2021

Thanks, look forward to seeing more contributions from you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mutable tensor/dataframe on oscar
4 participants