Is your feature request related to a problem? Please describe.
We previously implemented "mutable tensor" in #415 (see also the linked pull requests) as follows:
When a mutable tensor is created, a corresponding actor is created on the scheduler node.
Intermediate writes happen on this MutableTensorActor.
When the mutable tensor is sealed, this actor splits the tensor into a set of chunks and distributes them to workers as an executed Tensor.
The previous design has several limitations. Because the tensor is stored on the scheduler, it does not scale well, especially when the tensor is large and not sparse.
Describe the solution you'd like
In this task we are going to revisit the mutable tensor/mutable dataframe design as follows:
(Please feel free to correct me if there are any mistakes or misunderstandings.)
Add an API to the client/session to create a mutable tensor/dataframe with a given shape (or given columns and indexes for a dataframe).
Start a "MutableActor" for every mutable tensor/dataframe. It first creates a set of chunks according to the shape and distributes those tiled chunks to all workers; on each worker, it creates an actor to manage read/write requests on the chunks placed on that worker.
Maintain a chunk -> actor worker mapping as well.
When reading from or writing to the mutable tensor/dataframe, first figure out which chunk is involved, then forward the request to the worker that holds it.
When sealing the mutable tensor/dataframe, put the intermediate chunks on the workers into the storage backends (and maybe into some other storage endpoints), register metadata for those chunks, destroy the actors on the workers, register the metadata for the tensor/dataframe itself, and finally destroy the managing actor.
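To make the proposed flow concrete, here is a minimal single-process sketch of the tiling, the chunk -> worker mapping, request routing, and sealing. It is only an illustration under stated assumptions: every name in it (`tile_chunks`, `assign_workers`, `locate`, `ChunkStore`, `MutableTensorSketch`) is hypothetical and not part of the existing code base, plain Python objects stand in for actors, round-robin placement is just one possible policy, and `seal` returns the buffered writes instead of talking to a real storage backend or metadata service.

```python
from itertools import product


def tile_chunks(shape, chunk_size):
    """Map each chunk index to its (start, stop) range along every dimension."""
    ranges = [
        [(start, min(start + size, dim)) for start in range(0, dim, size)]
        for dim, size in zip(shape, chunk_size)
    ]
    return {
        idx: tuple(ranges[d][i] for d, i in enumerate(idx))
        for idx in product(*(range(len(r)) for r in ranges))
    }


def assign_workers(chunk_indices, workers):
    """Round-robin placement (an assumed policy, chosen only for simplicity)."""
    return {
        idx: workers[n % len(workers)]
        for n, idx in enumerate(sorted(chunk_indices))
    }


def locate(index, chunk_size):
    """Translate a global element index into (chunk index, offset in chunk)."""
    chunk_idx = tuple(i // c for i, c in zip(index, chunk_size))
    offset = tuple(i % c for i, c in zip(index, chunk_size))
    return chunk_idx, offset


class ChunkStore:
    """Stand-in for the per-worker actor that buffers writes to its chunks."""

    def __init__(self):
        self.data = {}

    def write(self, chunk_idx, offset, value):
        self.data.setdefault(chunk_idx, {})[offset] = value

    def read(self, chunk_idx, offset, default=0):
        return self.data.get(chunk_idx, {}).get(offset, default)


class MutableTensorSketch:
    """Stand-in for the managing actor that owns the chunk -> worker mapping."""

    def __init__(self, shape, chunk_size, workers):
        self.chunk_size = chunk_size
        self.chunks = tile_chunks(shape, chunk_size)
        self.chunk_to_worker = assign_workers(self.chunks, workers)
        self.stores = {w: ChunkStore() for w in workers}
        self.sealed = False

    def _route(self, index):
        if self.sealed:
            raise RuntimeError("tensor is sealed")
        chunk_idx, offset = locate(index, self.chunk_size)
        return self.stores[self.chunk_to_worker[chunk_idx]], chunk_idx, offset

    def write(self, index, value):
        store, chunk_idx, offset = self._route(index)
        store.write(chunk_idx, offset, value)

    def read(self, index):
        store, chunk_idx, offset = self._route(index)
        return store.read(chunk_idx, offset)

    def seal(self):
        """Collect buffered chunk data, then drop the per-worker stores."""
        result = {
            idx: dict(self.stores[w].data.get(idx, {}))
            for idx, w in self.chunk_to_worker.items()
        }
        self.sealed = True
        self.stores.clear()
        return result
```

In this sketch the managing object never holds element data itself; it only keeps the mapping and forwards each request, which is the property that should let the design scale beyond the scheduler-resident tensor of the previous implementation.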
Additional context
See also #415 for the previous design, and see the related pull requests for details about the previous implementation.