Skip to content

Implementation Notes: Event Flow

xiaoyunwu edited this page Nov 25, 2014 · 3 revisions

TaskGraph Framework Implementation Notes: Event Flow

In TaskGraph, each node only runs one task. A task always has one primary copy running, some time it has some backup copy running. TaskGraph try to handle fault tolerance for both cases: parent/child primary failure (between tasks), and primary/backup copy failure (for single task).

The simplest configuration is each task has one primary copy (or running on one node only). Under this assumption, we only need to handle failures where parent or child node fails. The framework implementation should work in the following sense.

As node starts in cloud, framework code inside it should grab the etcd information to decide which task it should take on. Use this information, along with TaskBuilder supplied by app developer, framework code can figure out which task implementation should be running on current node. It should call Init on the task implementation, so that task implementation know which task id it takes on, and topology can define local connection for it. As soon as driver calls framework.Start, framework code should go into the loop where it does the following and then wait for event to handle.

We maintain global epoch in etcd, it was set to 0 when job was started by controller. All nodes will setup watch for epoch. Inside Start, framework implementation will check with etcd to figure out what is the current global epoch. When some node calls IncEpoch on framework, all node will be notified with new epoch value. Upon receiving the new value, framework does the following things:

  1. figure out master copy of parents/children for this epoch, and setup watches for these status. (Using topology supplied by app developer)
  2. call SetEpoch on task implementation so that it can setup application dependent logic.

After this, framework basically reacts to all etcd events: parent failure/restart, and child failure/restart, it simply setup watches based on these information, and call respective method on task implementation so that they have chance to do these application dependent recovery.

The framework also handles the communication between nodes and its parent and children. We follows a pull model in TaskGraph. Basically, parent or children of a node will let framework know that it has something ready, node will then be notified by framework that even along with some meta data which might contain some application dependent info. Task implementation on the node will be notified with such status change on parent/children, it can then use meta data to decide what to do. In case he need to pull the data from parent/children, it simply call framework.DataRequest, which will pull the data down (with retries), and asynchronous let task know when the data is ready for it to consume.

The implementation is mostly event driven. TaskGraph hides some implementation details associated with distributed systems, so it is easy to develop application on top of it.

Say A is parent of B
When A says FlagChildMetaReady
B got ChildMetaReady called.
B then call DataRequest on taskid of A
framework then invoke a rpc (or something else) on A
now serveAsParent on A is called.
when it returns, ParentDataReady will be called on B.