Mars Actors is the key component of the entire distributed scheduling layer. In summary, the following enhancements are needed.
Support stateful and stateless actors. Stateful actors can only be created on the main process, while stateless ones can be created on both the main process and subprocesses. This ensures that any subprocess can be killed without leaving the system in an inconsistent state, which is important for reaching the goal of being cancel-free: when a user wants to cancel a job, a task running on a worker's subprocess can be cancelled by killing that subprocess.
More sophisticated error handling. In the older Mars Actors, if a subprocess crashes, for example because of OOM, the actors that sent messages to actors on that subprocess eventually get a timeout or a broken pipe. We need to raise an ActorDead error instead, to indicate that the actor is dead because its subprocess died.
Deadlock detection. If actor A sends a message to actor B and, in B's on_message, B sends another message back to actor A, a deadlock occurs. We need to be able to detect such potential deadlocks. The solution is to embed the calling chain into the message and raise an error if a cyclic call is detected.
API tuning. Previously, we provided only basic APIs like send and tell, and calling an actor's method remotely was implemented with an inherited actor built on the basic one. We can support this internally. For more details, refer to the API examples shown below.
Promise support internally. A promise is useful when an actor sends a message and expects a callback to another actor. The key point is that once the message is sent, the first actor must be able to process other messages, which requires it to be reentrant. However, for now, promises are supported via a separate module, which is quite complicated, and the usage is not very natural either.
Support for multiple backends, the first of which should be Ray. Actors can then be created on Ray instead of on Mars Actors itself.
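The deadlock detection idea above (embed the calling chain into the message, raise on a cycle) can be sketched as follows. This is only an illustration under stated assumptions: `Message`, `make_message` and `DeadlockError` are hypothetical names, not part of Mars or Oscar, and the real message layout may differ.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


class DeadlockError(RuntimeError):
    """Hypothetical error raised when a cyclic call chain is detected."""


@dataclass(frozen=True)
class Message:
    # Hypothetical message envelope, for illustration only.
    target: str                        # uid of the receiving actor
    payload: object
    call_chain: Tuple[str, ...] = ()   # uids of actors waiting on this call


def make_message(sender: str, target: str, payload,
                 incoming: Optional[Message] = None) -> Message:
    """Build a message from `sender`, extending the chain of the message
    that `sender` is currently handling (if any)."""
    previous = incoming.call_chain if incoming is not None else ()
    chain = previous + (sender,)
    if target in chain:
        # The target is already waiting somewhere up the chain.
        cycle = " -> ".join(chain + (target,))
        raise DeadlockError(f"cyclic call detected: {cycle}")
    return Message(target=target, payload=payload, call_chain=chain)


# A sends to B; while handling that message, B tries to send back to A.
first = make_message("A", "B", "ping")
try:
    make_message("B", "A", "pong", incoming=first)
except DeadlockError as exc:
    print(exc)  # cyclic call detected: A -> B -> A
```

A check like this would only apply to calls where the sender waits for a reply; fire-and-forget messages do not block the sender and would not need to extend the chain.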
import mars.oscar as ma

class MyActor3(ma.Actor):
    def method_3(self):
        # some process
        do_some_operations
        # send message to other Actor, and
        # quit the function to process other messages,
        # when callback comes, resume
        yield actor_ref.method_1.async_wait(1, 2, a=1, b=2), \
            actor_ref2.method_2.async_wait(1, 2)
        # resume to process
        do_other_operations
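To make the yield-and-resume behaviour above more concrete, here is a minimal sketch of how a driver could run such a generator-style method on top of `asyncio`. It is not Oscar's implementation: `run_actor_method` and `fake_remote_call` are hypothetical helpers, and `fake_remote_call` merely stands in for something like `actor_ref.method_1.async_wait(...)`.

```python
import asyncio
import inspect


async def run_actor_method(method, *args, **kwargs):
    """Drive a generator-style method: await whatever it yields, so the
    event loop can process other work, then resume it with the result."""
    result = method(*args, **kwargs)
    if not inspect.isgenerator(result):
        return result  # plain method, nothing to drive
    gen, to_send = result, None
    while True:
        try:
            yielded = gen.send(to_send)
        except StopIteration as stop:
            return stop.value  # the generator's return value
        if isinstance(yielded, tuple):
            # Several calls yielded at once: wait for all of them.
            to_send = tuple(await asyncio.gather(*yielded))
        else:
            to_send = await yielded


async def fake_remote_call(value, delay=0.01):
    # Stand-in for a remote actor call; not a real Oscar API.
    await asyncio.sleep(delay)
    return value


class DemoActor:
    def method_3(self):
        # Yield two calls, get both results back when resumed.
        a, b = yield fake_remote_call(1), fake_remote_call(2)
        return a + b


print(asyncio.run(run_actor_method(DemoActor().method_3)))  # 3
```

While the driver is awaiting the yielded calls, the method holds no thread and the event loop can process other messages, which is the property the promise support is meant to provide.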
Long running annotation
import mars.oscar as ma

class MyActor4(ma.Actor):
    @ma.long_running
    def method_4(self):
        # CPU intensive operation,
        # if not annotated with long running,
        # this function may block other coroutines,
        # `long_running` will let the method run in a thread,
        # and other coroutines could proceed
        pass
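As a rough illustration of what a decorator like `long_running` might do, the sketch below offloads a blocking method to the event loop's default thread pool so that other coroutines keep running. The `long_running` defined here is a hypothetical stand-in written for this example, not the actual `ma.long_running`.

```python
import asyncio
import functools
import time


def long_running(func):
    """Hypothetical stand-in: run `func` in a worker thread."""
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        loop = asyncio.get_running_loop()
        call = functools.partial(func, *args, **kwargs)
        # Passing None selects the loop's default ThreadPoolExecutor.
        return await loop.run_in_executor(None, call)
    return wrapper


class DemoActor:
    @long_running
    def method_4(self, seconds):
        time.sleep(seconds)  # blocking call standing in for heavy work
        return "done"


async def main():
    ticks = 0

    async def heartbeat():
        # Another coroutine that should keep running meanwhile.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    result = await DemoActor().method_4(0.1)
    beat.cancel()
    return result, ticks


result, ticks = asyncio.run(main())
print(result, ticks > 0)
```

Without the decorator, `time.sleep` would block the event loop and the heartbeat would not tick until the method returned. Note that under CPython's GIL a thread only helps when the work releases the GIL (for example I/O or NumPy calls); pure-Python CPU-bound code would still compete with the event loop.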
Oscar: Mars Actors 2.0
Background
APIs
Oscar will change to
Basic APIs
Actor Worker-level API
User-defined actor pool.
Creating actor pool.
Actor driver API
Other backends like Ray could implement this method in order to create a Mars cluster.