Skip to content
Andre Cronje edited this page Oct 25, 2018 · 2 revisions

Introduction

Blockchain has emerged as a technology for secure decentralized transaction ledgers with broad applications in financial systems, supply chains and health care. Byzantine fault tolerance is addressed in distributed database systems, in which up to one-third of the participant nodes may be compromised. Consensus algorithms ensures the integrity of transactions between participants over a distributed network and is equivalent to the proof of Byzantine fault tolerance in distributed database systems .

A large number of consensus algorithms have been proposed. For example; the original Nakamoto consensus protocol in Bitcoin uses Proof of Work (PoW) . Proof Of Stake (PoS) uses participants’ stakes to generate the blocks respectively. Our previous paper gives a survey of previous DAG-based approaches .

In the previous paper , we introduced a new consensus protocol, called L0. The protocol L0 is a DAG-based asynchronous non-deterministic protocol that guarantees pBFT. L0 generates each block asynchronously and uses the OPERA chain (DAG) for faster consensus by confirming how many nodes share the blocks.

The Lachesis protocol as previously proposed is a set of protocols that create a directed acyclic graph for distributed systems. Each node can receive transactions and batch them into an event block. An event block is then shared with it’s peers. When peers communicate they share this information again and thus spread this information through the network. In BFT systems we would use a broadcast voting approach and ask each node to vote on the validity of each block. This event is synchronous in nature. Instead we proposed an asynchronous system where we leverage the concepts of distributed common knowledge, dominator relations in graph theory and broadcast based gossip to achieve a local view with high probability of being a global view. It accomplishes this asynchronously, meaning that we can increase throughput near linearly as nodes enter the network.

In this work, we propose a further enhancement on these concepts and we formalize them so that they can be applied to any asynchronous distributed system.

Contributions

In summary, this paper makes the following contributions:

  • We introduce the n-row flag table for faster root selection of the Lachesis Protocol

  • We define continuous consistent cuts of a local view to achieve consensus

  • We present proof of how domination relationships can be used for share information

  • We formalize our proofs that can be applied to any generic asynchronous DAG solution

Paper structure

The rest of this paper is organised as follows. Section [se:prelim] describes our Fantom framework. Section [se:lca] presents the protocol implementation. Section [se:con] concludes our paper with some future directions. Proof of Byzantine fault tolerance is described in Section [se:proof].

Preliminaries

The protocol is run via nodes representing users’ machines which together create a network. The basic units of the protocol are called event blocks - a data structure created by a single node to share transaction and user information with the rest of the network. These event blocks reference previous event blocks that are known to the node. This flow or stream of information creates a sequence of history.

The history of the protocol can be represented by a directed acyclic graph G = (V, E), where V is a set of vertices and E is a set of edges. Each vertex in a row (node) represents an event. Time flows left-to-right of the graph, so left vertices represent earlier events in history.

For a graph G, a path p in G is a sequence of vertices (v1, v2, …, vk) by following the edges in E. Let vc be a vertex in G. A vertex vp is the parent of vc if there is an edge from vp to vc. A vertex va is an ancestor of vc if there is a path from va to vc.

Figure [fig:operachain] shows an example of an OPERA chain (DAG) constructed through the Lachesis protocol. Event blocks are representated by circles. Blocks of the same frame have the same color.

Basic Definitions

(Lachesis) The set of protocols

(Node) Each machine that participates in the Lachesis protocol is called a node. Let n denote the total number of nodes.

(k) A constant defined in the system.

(Peer node) A node ni has k peer nodes.

(Process) A process pi represents a machine or a node. The process identifier of pi is i. A set P = {1,...,n} denotes the set of process identifiers.

(Channel) A process i can send messages to process j if there is a channel (i,j). Let C ⊆ {(i,j) s.t. i, j ∈ P} denote the set of channels.

(Event block) Each node can create event blocks, send (receive) messages to (from) other nodes.The structure of an event block includes the signature, generation time, transaction history, and hash information to references.

All nodes can create event blocks. The information of the referenced event blocks can be copied by each node. The first event block of each node is called a leaf event.

Suppose a node ni creates an event vc after an event vs in ni. Each event block has exactly k references. One of the references is self-reference, and the other k-1 references point to the top events of ni’s k-1 peer nodes.

(Top event) An event v is a top event of a node ni if there is no other event in ni referencing v.

(Height Vector) The height vector is the number of event blocks created by the it**h node.

(In-degree Vector) The in-degree vector refers to the number of edges from other event blocks created by other nodes to the top event block of this node. The top event block indicates the most recently created event block by this node.

(Ref) An event vr is called “ref" of event vc if the reference hash of vc points to the event vr. Denoted by vcrvr. For simplicity, we can use ↪ to denote a reference relationship (either ↪r or ↪s).

(Self-ref) An event vs is called “self-ref" of event vc, if the self-ref hash of vc points to the event vs. Denoted by vcsvs.

(k references) Each event block has at least k references. One of the references is self-reference, and the other k-1 references point to the top events of ni’s k-1 peer nodes.

(Self-ancestor) An event block va is self-ancestor of an event block vc if there is a sequence of events such that vcsv1s…↪svmsva. Denoted by vcs**ava.

(Ancestor) An event block va is an ancestor of an event block vc if there is a sequence of events such that vc ↪ v1 ↪ … ↪ vm ↪ va. Denoted by vcava.

For simplicity, we simply use vcavs to refer both ancestor and self-ancestor relationship, unless we need to distinguish the two cases.

(Flag Table) The flag table is a nxk matrix, where n is the number of nodes and k is the number of roots that an event block can reach. If an event block e created by it**h node can reach jt**h root, then the flag table stores the hash value of the jt**h root.

Lamport timestamps

Our Lachesis protocols relies on Lamport timestamps to define a topological ordering of event blocks in OPERA chain. By using Lamport timestamps, we do not rely on physical clocks to determine a partial ordering of events.

The “happened before“ relation, denoted by →, gives a partial ordering of events from a distributed system of nodes. Each node ni (also called a process) is identified by its process identifier i. For a pair of event blocks v and v′, the relation ”→" satisfies: (1) If v and v′ are events of process Pi, and v comes before v′, then b → v′. (2) If v is the send(m) by one process and v′ is the receive(m) by another process, then v → v′. (3) If v → v′ and v′→v″ then v → v″. Two distinct events v and v′ are said to be concurrent if v ↛ v′ and v′↛v.

For an arbitrary total ordering ≺ of the processes, a relation ⇒ is defined as follows: if v is an event in process Pi and v′ is an event in process Pj, then v ⇒ v′ if and only if either (i) Ci(v)<Cj(v′) or (ii) C(v)=Cj(v′) and Pi ≺ Pj. This defines a total ordering, and that the Clock Condition implies that if v → v′ then v ⇒ v′.

We use this total ordering in our Lachesis protocol. This ordering is used to determine consensus time, as described in Section [se:lca].

(Happened-Immediate-Before) An event block vx is said Happened-Immediate-Before an event block vy if vx is a (self-) ref of vy. Denoted by vx ↦ vy.

(Happened-before) An event block vx is said Happened-Before an event block vy if vx is a (self-) ancestor of vy. Denoted by vx → vy.

Happened-before is the relationship between nodes which have event blocks. If there is a path from an event block vx to vy, then vx Happened-before vy. “vx Happened-before vy" means that the node creating vy knows event block vx. This relation is the transitive closure of happens-immediately-before. Thus, an event vx happened before an event vy if one of the followings happens: (a) vysvx, (b) vyrvx, or (c) vyavx. The happened-before relation of events form an acyclic directed graph G′=(V, E′) such that an edge (vi, vj)∈E′ has a reverse direction of the same edge in E.

(Concurrent) Two event blocks vx and vy are said concurrent if neither of them happened before the other. Denoted by vx ∥ vy.

Given two vertices vx and vy both contained in two OPERA chains (DAGs) G1 and G2 on two nodes. We have the following: (1) vx → vy in G1 if vx → vy in G2; (2)vx ∥ vy in G1 if vx ∥ vy in G2.

State Definitions

Each node has a local state, a collection of histories, messages, event blocks, and peer information, we describe the components of each.

(State) A state of a process i is denoted by sji.

(Local State) A local state consists of a sequence of event blocks sji = v0i, v1i, …, vji.

In a DAG-based protocol, each vji event block is valid only if the reference blocks exist before it. From a local state sji, one can reconstruct a unique DAG. That is, the mapping from a local state sji into a DAG is injective or one-to-one. Thus, for Fantom, we can simply denote the j-th local state of a process i by the DAG gji (often we simply use Gi to denote the current local state of a process i).

(Action) An action is a function from one local state to another local state.

Generally speaking, an action can be one of: a sen**d(m) action where m is a message, a receive(m) action, and an internal action. A message m is a triple ⟨i, j, B⟩ where i ∈ P is the sender of the message, j ∈ P is the message recipient, and B is the body of the message. Let M denote the set of messages. In the Lachesis protocol, B consists of the content of an event block v. Semantics-wise, in Lachesis, there are two actions that can change a process’s local state: creating a new event and receiving an event from another process. (Event) An event is a tuple ⟨s, α, s′⟩ consisting of a state, an action, and a state. Sometimes, the event can be represented by the end state s′.

The j-th event in history hi of process i is ⟨sj − 1i, α, sji⟩, denoted by vji.

(Local history) A local history hi of process i is a (possibly infinite) sequence of alternating local states — beginning with a distinguished initial state. A set Hi of possible local histories for each process i in P.

The state of a process can be obtained from its initial state and the sequence of actions or events that have occurred up to the current state. In the Lachesis protocol, we use append-only semantics. The local history may be equivalently described as either of the following: hi = s0i, α1i, α2i, α3ihi = s0i, v1i, v2i, v3ihi = s0i, s1i, s2i, s3i, …

In Lachesis, a local history is equivalently expressed as: hi = g0i, g1i, g2i, g3i, … where gji is the j-th local DAG (local state) of the process i. (Run) Each asynchronous run is a vector of local histories. Denoted by σ = ⟨h1, h2, h3, ...hN⟩.

Let Σ denote the set of asynchronous runs. We can now use Lamport’s theory to talk about global states of an asynchronous system. A global state of run σ is an n-vector of prefixes of local histories of σ, one prefix per process. The happens-before relation can be used to define a consistent global state, often termed a consistent cut, as follows.

Consistent Cut

Consistent cuts represent the concept of scalar time in distributed computation, it is possible to distinguish between a “before" and an “after"

In the Lachesis protocol, an OPERA chain G = (V, E) is a directed acyclic graph (DAG). V is a set of vertices and E is a set of edges. DAG is a directed graph with no cycle. There is no path that has source and destination at the same vertex. A path is a sequence of vertices (v1, v2, ..., vk − 1, vk) that uses no edge more than once.

An asynchronous system consists of the following sets: a set P of process identifiers; a set C of channels; a set Hi is the set of possible local histories for each process i; a set A of asynchronous runs; a set M of all messages.

Each process / node in Lachesis selects k other nodes as peers. For certain gossip protocol, nodes may be constrained to gossip with its k peers. In such a case, the set of channels C can be modelled as follows. If node i selects node j as a peer, then (i, j)∈C. In general, one can express the history of each node in DAG-based protocol in general or in Lachesis protocol in particular, in the same manner as in the CCK paper .

(Consistent cut) A consistent cut of a run σ is any global state such that if vxi → vyj and vyj is in the global state, then vxi is also in the global state. Denoted by c(σ).

The concept of consistent cut formalizes such a global state of a run. A consistent cut consists of all consistent DAG chains. A received event block exists in the global state implies the existence of the original event block. Note that a consistent cut is simply a vector of local states; we will use the notation c(σ)[i] to indicate the local state of i in cut c of run σ.

A message chain of an asynchronous run is a sequence of messages m1, m2, m3, …, such that, for all i, receive(mi) → sen**d(mi + 1). Consequently, sen**d(m1) → receive(m1) → sen**d(m2) → receive(m2) → sen**d(m3) ….

The formal semantics of an asynchronous system is given via the satisfaction relation ⊢. Intuitively c(σ)⊢ϕ, “c(σ) satisfies ϕ,” if fact ϕ is true in cut c of run σ.

We assume that we are given a function π that assigns a truth value to each primitive proposition p. The truth of a primitive proposition p in c(σ) is determined by π and c. This defines c(σ)⊢p. (Equivalent cuts) Two cuts c(σ) and c**′(σ′) are equivalent with respect to i if: c(σ)∼ic**′(σ′) ⇔ c(σ)[i]=c**′**(σ′)[i] .

(i knows ϕ) Ki(ϕ) represents the statement “ϕ is true in all possible consistent global states that include i’s local state”. c(σ)⊢Ki(ϕ)⇔∀c****′(σ′)(c**′(σ′)∼ic(σ)  ⇒  c**′(σ′) ⊢ ϕ)

(i partially knows ϕ) Pi(ϕ) represents the statement “there is some consistent global state in this run that includes i’s local state, in which ϕ is true.” c(σ)⊢Pi(ϕ)⇔∃c****′(σ)(c**′(σ)∼ic(σ)  ∧  c**′(σ)⊢ϕ)

(Majority concurrently knows)

The next modal operator is written MC and stands for “majority concurrently knows.” The definition of MC(ϕ) is as follows.

MC(ϕ)=defi ∈ SKiPi(ϕ), where S ⊆ P and |S|>2n/3.

This is adapted from the “everyone concurrently knows” in CCK paper . In the presence of one-third of faulty nodes, the original operator “everyone concurrently knows” is sometimes not feasible. Our modal operator MC(ϕ) fits precisely the semantics for BFT systems, in which unreliable processes may exist. (Concurrent common knowledge) The last modal operator is concurrent common knowledge (CCK), denoted by CC. CC(ϕ) is defined as a fixed point of MC(ϕ ∧ X).

CCK defines a state of process knowledge that implies that all processes are in that same state of knowledge, with respect to ϕ, along some cut of the run. In other words, we want a state of knowledge X satisfying: X = MC(ϕ ∧ X). CC will be defined semantically as the weakest such fixed point, namely as the greatest fixed-point of MC(ϕ ∧ X).It therefore satisfies:

CC(ϕ)⇔MC(ϕ ∧ CC(ϕ))

Thus, Pi(ϕ) states that there is some cut in the same asynchronous run σ including i’s local state, such that ϕ is true in that cut. Note that ϕ implies Pi(ϕ). But it is not the case, in general, that Pi(ϕ) implies ϕ or even that MC(ϕ) implies ϕ. The truth of MC(ϕ) is determined with respect to some cut c(σ). A process cannot distinguish which cut, of the perhaps many cuts that are in the run and consistent with its local state, satisfies ϕ; it can only know the existence of such a cut. (Global fact) Fact ϕ is valid in system Σ, denoted by Σ ⊢ ϕ, if ϕ is true in all cuts of all runs of Σ. Σ ⊢ ϕ ⇔ (∀σ ∈ Σ)(∀c)(c(a)⊢ϕ)

Fact ϕ is valid, denoted ⊢ϕ, if ϕ is valid in all systems, i.e. (∀Σ)(Σ ⊢ ϕ). (Local fact) A fact ϕ is local to process i in system Σ if Σ ⊢ (ϕ ⇒ Kiϕ).

Dominator (graph theory)

In a graph G = (V, E, r) a dominator is the relation between two vertices. A vertex v is dominated by another vertex w, if every path in the graph from the root r to v have to go through w. Furthermore, the immediate dominator for a vertex v is the last of v’s dominators, which every path in the graph have to go through to reach v.

(Pseudo top) A pseudo vertex, called top, is the parent of all top event blocks. Denoted by ⊤.

(Pseudo bottom) A pseudo vertex, called bottom, is the child of all leaf event blocks. Denoted by ⊥. With the pseudo vertices, we have ⊥ happened-before all event blocks. Also all event blocks happened-before ⊤. That is, for all event vi, ⊥ → vi and vi → ⊤.

(Dom) An event vd dominates an event vx if every path from ⊤ to vx must go through vd. Denoted by vd ≫ vx.

(Strict dom) An event vd strictly dominates an event vx if vd ≫ vx and vd does not equal vx. Denoted by vdsvx.

(Domfront) A vertex vd is said “domfront” a vertex vx if vd dominates an immediate predecessor of vx, but vd does not strictly dominate vx. Denoted by vdfvx.

(Dominance frontier) The dominance frontier of a vertex vd is the set of all nodes vx such that vdfvx. Denoted by D**F(vd).

From the above definitions of domfront and dominance frontier, the following holds. If vdfvx, then vx ∈ D**F(vd).

OPERA chain (DAG)

The core idea of the Lachesis protocol is to use a DAG-based structure, called the OPERA chain for our consensus algorithm. In the Lachesis protocol, a (participant) node is a server (machine) of the distributed system. Each node can create messages, send messages to, and receive messages from, other nodes. The communication between nodes is asynchronous.

Let n be the number of participant nodes. For consensus, the algorithm examines whether an event block is dominated by 2n/3 nodes, where n is the number of all nodes. The Happen-before relation of event blocks with 2n/3 nodes means that more than two-thirds of all nodes in the OPERA chain know the event block.

The OPERA chain (DAG) is the local view of the DAG held by each node, this local view is used to identify topological ordering, select Clotho, and create time consensus through Atropos selection. OPERA chain is a DAG graph G = (V, E) consisting of V vertices and E edges. Each vertex vi ∈ V is an event block. An edge (vi, vj)∈E refers to a hashing reference from vi to vj; that is, vi ↪ vj.

(Leaf) The first created event block of a node is called a leaf event block.

(Root) The leaf event block of a node is a root. When an event block v can reach more than 2n/3 of the roots in the previous frames, v becomes a root.

(Root set) The set of all first event blocks (leaf events) of all nodes form the first root set R1 (|R1| = n). The root set Rk consists of all roots ri such that riRi, ∀ i = 1..(k-1) and ri can reach more than 2n/3 other roots in the current frame, i = 1..(k-1).

(Frame) Frame fi is a natural number that separates Root sets. The root set at frame fi is denoted by Ri.

(Consistent chains) OPERA chains G1 and G2 are consistent if for any event v contained in both chains, G1[v]=G2[v]. Denoted by G1 ∼ G2.

When two consistent chains contain the same event v, both chains contain the same set of ancestors for v, with the same reference and self-ref edges between those ancestors.

If two nodes have OPERA chains containing event v, then they have the same k hashes contained within v. A node will not accept an event during a sync unless that node already has k references for that event, so both OPERA chains must contain k references for v. The cryptographic hashes are assumed to be secure, therefore the references must be the same. By induction, all ancestors of v must be the same. Therefore, the two OPERA chains are consistent. (Creator) If a node nx creates an event block v, then the creator of v, denoted by c**r(v), is nx.

(Consistent chain) A global consistent chain GC is a chain if GC ∼ Gi for all Gi.

We denote G ⊑ G′ to stand for G is a subgraph of G′. Some properties of GC are given as follows:

  1. Gi (GC ⊑ Gi).

  2. v ∈ GCGi (GC[v]⊑Gi[v]).

  3. (∀vc ∈ GC) (∀vp ∈ Gi) ((vp → vc)⇒vp ∈ GC).

(Consistent root) Two chains G1 and G2 are root consistent, if for every v contained in both chains, v is a root of j-th frame in G1, then v is a root of j-th frame in G2.

By consistent chains, if G1 ∼ G2 and v belongs to both chains, then G1[v] = G2[v]. We can prove the proposition by induction. For j = 0, the first root set is the same in both G1 and G2. Hence, it holds for j = 0. Suppose that the proposition holds for every j from 0 to k. We prove that it also holds for j= k + 1. Suppose that v is a root of frame fk + 1 in G1. Then there exists a set S reaching 2/3 of members in G1 of frame fk such that ∀u ∈ S (u → v). As G1 ∼ G2, and v in G2, then ∀u ∈ S (u ∈ G2). Since the proposition holds for j=k, As u is a root of frame fk in G1, u is a root of frame fk in G2. Hence, the set S of 2/3 members u happens before v in G2. So v belongs to fk + 1 in G2.

Thus, all nodes have the same consistent root sets, which are the root sets in GC. Frame numbers are consistent for all nodes. (Flag table) A flag table stores reachability from an event block to another root. The sum of all reachabilities, namely all values in flag table, indicates the number of reachabilities from an event block to other roots.

(Consistent flag table) For any top event v in both OPERA chains G1 and G2, and G1 ∼ G2, then the flag tables of v are consistent if they are the same in both chains.

From the above, the root sets of G1 and G2 are consistent. If v contained in G1, and v is a root of j-th frame in G1, then v is a root of j-th frame in Gi. Since G1 ∼ G2, G1[v]=G2[v]. The reference event blocks of v are the same in both chains. Thus the flag tables of v of both chains are the same. (Clotho) A root rk in the frame fa + 3 can nominate a root ra as Clotho if more than 2n/3 roots in the frame fa + 1 dominate ra and rk dominates the roots in the frame fa + 1.

Each node nominates a root into Clotho via the flag table. If all nodes have an OPERA chain with same shape, the values in flag table will be equal to each other in OPERA chain. Thus, all nodes nominate the same root into Clotho since the OPERA chain of all nodes has same shape.

(Atropos) An Atropos is assigned consensus time through the Lachesis consensus algorithm and is utilized for determining the order between event blocks. Atropos blocks form a Main-chain, which allows time consensus ordering and responses to attacks.

For any root set R in the frame fi, the time consensus algorithm checks whether more than 2n/3 roots in the frame fi − 1 selects the same value. However, each node selects one of the values collected from the root set in the previous frame by the time consensus algorithm and Reselection process. Based on the Reselection process, the time consensus algorithm can reach agreement. However, there is a possibility that consensus time candidate does not reach agreement . To solve this problem, time consensus algorithm includes minimal selection frame per next h frame. In minimal value selection algorithm, each root selects minimum value among values collected from previous root set. Thus, the consensus time reaches consensus by time consensus algorithm.

(Main-chain (Blockchain)) For faster consensus, Main-chain is a special sub-graph of the OPERA chain (DAG).

The Main chain — a core subgraph of OPERA chain, plays the important role of ordering the event blocks. The Main chain stores shortcuts to connect between the Atropos. After the topological ordering is computed over all event blocks through the Lachesis protocol, Atropos blocks are determined and form the Main chain. To improve path searching, we use a flag table — a local hash table structure as a cache that is used to quickly determine the closest root to a event block.

In the OPERA chain, an event block is called a root if the event block is linked to more than two-thirds of previous roots. A leaf vertex is also a root itself. With root event blocks, we can keep track of “vital” blocks that 2n/3 of the network agree on.

[H]

Figure [fig:mainchain] shows an example of the Main chain composed of Atropos event blocks. In particular, the Main chain consists of Atropos blocks that are derived from root blocks and so are agreed by 2n/3 of the network nodes. Thus, this guarantees that at least 2n/3 of nodes have come to consensus on this Main chain.

Each participant node has a copy of the Main chain and can search consensus position of its own event blocks. Each event block can compute its own consensus position by checking the nearest Atropos event block. Assigning and searching consensus position are introduced in the consensus time selection section.

The Main chain provides quick access to the previous transaction history to efficiently process new incoming event blocks. From the Main chain, information about unknown participants or attackers can be easily viewed. The Main chain can be used efficiently in transaction information management by providing quick access to new event blocks that have been agreed on by the majority of nodes. In short, the Main-chain gives the following advantages:

  • All event blocks or nodes do not need to store all information. It is efficient for data management.

  • Access to previous information is efficient and fast.

Based on these advantages, OPERA chain can respond strongly to efficient transaction treatment and attacks through its Main-chain.

Lachesis Protocol

loop: A, B = k-node Selection algorithm() Request sync to node A and B Sync all known events by Lachesis protocol Event block creation (optional) Broadcast out the message Root selection Clotho selection Atropos time consensus loop: Request sync from a node Sync all known events by Lachesis protocol

Algorithm [al:main] shows the pseudo algorithm for the Lachesis core procedure. The algorithm consists of two parts and runs them in parallel.

  • In part one, each node requests synchronization and creates event blocks. In line 3, a node runs the Node Selection Algorithm. The Node Selection Algorithm returns the k IDs of other nodes to communicate with. In line 4 and 5, the node synchronizes the OPERA chain (DAG) with the other nodes. Line 6 runs the Event block creation, at which step the node creates an event block and checks whether it is a root. The node then broadcasts the created event block to all other known nodes in line 7. The step in this line is optional. In line 8 and 9, Clotho selection and Atropos time consensus algorithms are invoked. The algorithms determines whether the specified root can be a Clotho, assign the consensus time, and then confirm the Atropos.

  • The second part is to respond to synchronization requests. In line 10 and 11, the node receives a synchronization request and then sends its response about the OPERA chain.

Peer selection algorithm

In order to create an event block, a node needs to select k other nodes. Lachesis protocols does not depend on how peer nodes are selected. One simple approach can use a random selection from the pool of n nodes. The other approach is to define some criteria or cost function to select other peers of a node.

Within distributed system, a node can select other nodes with low communication costs, low network latency, high bandwidth, high successful transaction throughputs.

Dynamic participants

Our Lachesis protocol allows an arbitrary number of participants to dynamically join the system. The OPERA chain (DAG) can still operate with new participants. Computation on flag tables is set based and independent of which and how many participants have joined the system. Algorithms for selection of Roots, Clothos and Atroposes are flexible enough and not dependence on a fixed number of participants.

Peer synchronization

We describe an algorithm that synchronizes events between the nodes.

Node n1 selects random peer to synchronize with n1 gets local known events (map[int]int) n1 sends RPC request Sync request to peer n2 receives RPC requestSync request n2 does an EventDiff check on the known map (map[int]int)

n2 returns unknown events, and map[int]int of known events to n1

The algorithm assumes that a node always needs the events in topological ordering (specifically in reference to the lamport timestamps), an alternative would be to use an inverse bloom lookup table (IBLT) for completely potential randomized events. Alternatively, one can simply use a fixed incrementing index to keep track of the top event for each node.

Node Structure

This section gives an overview of the node structure in Lachesis.

Each node has a height vector, in-degree vector, flag table, frames, clotho check list, max-min value, main-chain (blockchain), and their own local view of the OPERA chain (DAG). The height vector is the number of event blocks created by the it**h node. The in-degree vector refers to the number of edges from other event blocks created by other nodes to the top event block of this node. The top event block indicates the most recently created event block by this node. The flag table is a nxk matrix, where n is the number of nodes and k is the number of roots that an event block can reach. If an event block e created by it**h node can reach jt**h root, then the flag table stores the hash value of the jt**h root. Each node maintains the flag table of each top event block.

Frames store the root set in each frame. Clotho check list has two types of check points; Clotho candidate (C**C) and Clotho (C). If a root in a frame is a C**C, a node check the C**C part and if a root becomes Clotho, a node check C part. Max-min value is timestamp that addresses for Atropos selection. The Main-chain is a data structure storing hash values of the Atropos blocks.

Figure [fig:node] shows an example of the node structure component of a node A. In the figure, each value excluding self height in the height vector is 1 since the initial state is shared to all nodes. In the in-degree vector, node A stores the number of edges from other event blocks created by other nodes to the top event block. The in-degrees of node A, B, and C are 1. In flag table, node A knows other two root hashes since the top event block can reach those two roots. Node A also knows that other nodes know their own roots. In the example situation there is no clotho candidate and Clotho, and thus clotho check list is empty. The main-chain and max-min value are empty for the same reason as clotho check list.

Peer selection algorithm via Cost function

We define three versions of the Cost Function (CF). Version one is focused around updated information share and is discussed below. The other two versions are focused on root creation and consensus facilitation, these will be discussed in a following paper.

We define a Cost Function (CF) for preventing the creation of lazy nodes. A lazy node is a node that has a lower work portion in the OPERA chain (has created fewer event blocks). When a node creates an event block, the node selects other nodes with low value outputs from the cost function and refers to the top event blocks of the reference nodes. An equation ([eq1]) of CF is as follows,

CF = I/H

where I and H denote values of in-degree vector and height vector respectively. If the number of nodes with the lowest CF is more than k, one of the nodes is selected at random. The reason for selecting high H is that we can expect a high possibility to create a root because the high H indicates that the communication frequency of the node had more opportunities than others with low H. Otherwise, the nodes that have high CF (the case of I > H) have generated fewer event blocks than the nodes that have low CF..

Figure [fig:costfunction_1] shows an example of the node selection based on the cost function after the creation of leaf events by all nodes. In this example, there are five nodes and each node created leaf events. All nodes know other leaf events. Node A creates an event block v1 and A calculates the cost functions. Step 2 in Figure [fig:costfunction_1] shows the results of cost functions based on the height and in-degree vectors of node A. In the initial step, each value in the vectors are same because all nodes have only leaf events. Node A randomly selects k nodes and connects v1 to the leaf events of selected nodes. In this example, we set k=3 and assume that node A selects node B and C.

Figure [fig:costfunction_2] shows an example of the node selection after a few steps of the simulation in Figure [fig:costfunction_1]. In Figure [fig:costfunction_2], the recent event block is v5 created by node A. Node A calculates the cost function and selects the other two nodes that have the lowest results of the cost function. In this example, node B has 0.5 as the result and other nodes have the same values. Because of this, node A first selects node B and randomly selects other nodes among nodes C, D, and E.

The height of node D in the example is 2 (leaf event and event block v4). On the other hand, the height of node D in node structure of A is 1. Node A is still not aware of the presence of the event block v4. It means that there is no path from the event blocks created by node A to the event block v4. Thus, node A has 1 as the height of node D.

Input: Height Vector H, In-degree Vector I Output: reference node ref min_cost ← INF sref ← None cf$\frac{I_k}{H_k}$ min_cost ← cf srefk srefsrefk ref ← random select in sref

Algorithm [al:ns] shows the selecting algorithm for selecting reference nodes. The algorithm operates for each node to select a communication partner from other nodes. Line 4 and 5 set min_cost and Sref to initial state. Line 7 calculates the cost function cf for each node. In line 8, 9, and 10, we find the minimum value of the cost function and set min_cost and Sref to cf and the ID of each node respectively. Line 11 and 12 append the ID of each node to Sref if min_cost equals cf. Finally, line 13 selects randomly k node IDs from Sref as communication partners. The time complexity of Algorithm 2 is O(n), where n is the number of nodes.

After the reference node is selected, each node communicates and shares information of all event blocks known by them. A node creates an event block by referring to the top event block of the reference node. The Lachesis protocol works and communicates asynchronously. This allows a node to create an event block asynchronously even when another node creates an event block. The communication between nodes does not allow simultaneous communication with the same node.

Figure [fig:node_selection] shows an example of the node selection in Lachesis protocol. In this example, there are five nodes (A, B, C, D, and E) and each node generates the first event blocks, called leaf events. All nodes share other leaf events with each other. In the first step, node A generates new event block a1. Then node A calculates the cost function to connect other nodes. In this initial situation, all nodes have one event block called leaf event, thus the height vector and the in-degree vector in node A has same values. In other words, the heights of each node are 1 and in-degrees are 0. Node A randomly select the other two nodes and connects a1 to the top two event blocks from the other two nodes. Step 2 shows the situation after connections. In this example, node A select node B and C to connect a1 and the event block a1 is connected to the top event blocks of node B and C. Node A only knows the situation of the step 2.

After that, in the example, node B generates a new event block b1 and also calculates the cost function. B randomly select the other two nodes; A, and D, since B only has information of the leaf events. Node B requests to A and D to connect b1, then nodes A and D send information for their top event blocks to node B as response. The top event block of node A is a1 and node D is the leaf event. The event block b1 is connected to a1 and leaf event from node D. Step 4 shows these connections.

Event block creation

In the Lachesis protocol, every node can create an event block. Each event block refers to other k event blocks using their hash values. In the Lachesis protocol, a new event block refers to k-neighbor event blocks under the following conditions:

  1. Each of the k reference event blocks is the top event blocks of its own node.

  2. One reference should be made to a self-ref that references to an event block of the same node.

  3. The other k-1 reference refers to the other k-1 top event nodes on other nodes.

Figure [fig:ex_ebc] shows the example of an event block creation with a flag table. In this example the recent created event block is b1 by node B. The figure shows the node structure of node B. We omit the other information such as height and in-degree vectors since we only focus on the change of the flag table with the event block creation in this example. The flag table of b1 in Figure [fig:ex_ebc] is updated with the information of the previous connected event blocks a1, b0, and c1. Thus, the set of the flag table is the results of OR operation among the three root sets for a1 (a0, b0, and c0), b0 (b0), and c1 (b0, c0, and d0).

Figure [fig:communication process], shows the communication process is divided into five steps for two nodes to create an event block. Simply, a node A requests to B. then, B responds to A directly.

Topological ordering of events using Lamport timestamps

Every node has a physical clock and it needs physical time to create an event block. However, for consensus, Lachesis protocols relies on a logical clock for each node. For the purpose, we use “Lamport timestamps” to determine the time ordering between event blocks in a asynchronous distributed system.

The Lamport timestamps algorithm is as follows:

  1. Each node increments its count value before creating an event block.

  2. When sending a message include its count value, receiver should consider which sender’s message is received and increments its count value.

  3. If current counter is less than or equal to the received count value from another node, then the count value of the recipient is updated.

  4. If current counter is greater than the received count value from another node, then the current count value is updated.

We use the Lamport’s algorithm to enforce a topological ordering of event blocks and use it in the Atropos selection algorithm.

Since an event block is created based on logical time, the sequence between each event blocks is immediately determined. Because the Lamport timestamps algorithm gives a partial order of all events, the whole time ordering process can be used for Byzantine fault tolerance.

Domination Relation

Here, we introduce a new idea that extends the concept of domination. For a vertex v in a DAG G, let G[v]=(Vv, Ev) denote an induced-subgraph of G such that Vv consists of all ancestors of v including v, and Ev is the induced edges of Vv in G. For a set S of vertices, an event vd $\frac{2}{3}$-dominates S if there are more than 2/3 of vertices vx in S such that vd dominates vx. Recall that R1 is the set of all leaf vertices in G. The $\frac{2}{3}$-dom set D0 is the same as the set R1.The $\frac{2}{3}$-dom set Di is defined as follows: A vertex vd belongs to a $\frac{2}{3}$-dom set within the graph G[vd], if vd $\frac{2}{3}$-dominates R1. The $\frac{2}{3}$-dom set Dk consists of all roots di such that diDi, ∀ i = 1..(k-1), and di $\frac{2}{3}$-dominates Di − 1. The $\frac{2}{3}$-dom set Di is the same with the root set Ri, for all nodes.

Examples of domination relation in DAGs

This section gives several examples of DAGs and the domination relation between their event blocks.

(a)Examples of OPERA chain and dominator tree (b)Examples of OPERA chain and dominator tree

Figure [fig:domtrees1] shows an examples of a DAG and dominator trees.

An example of OPERA chain and its 2/3 domination graph. The \frac{2}{3}-dom sets are shown in grey.

Figure [fig:domset1] depicts an example of a DAG and 2/3 dom sets.

(a) An example of dependency graphs on individual nodes. From (a)-(c) there is one new event block appended. There is no fork, the simplified dependency graphs become trees. (b) An example of dependency graphs on individual nodes. From (a)-(c) there is one new event block appended. There is no fork, the simplified dependency graphs become trees. (c) An example of dependency graphs on individual nodes. From (a)-(c) there is one new event block appended. There is no fork, the simplified dependency graphs become trees.

Figure [fig:deptreesmod1] shows an example an dependency graphs. On each row, the left most figure shows the latest OPERA chain. The left figures on each row depict the dependency graphs of each node, which are in their compact form. When no fork presents, each of the compact dependency graphs is a tree.

(a)An example of a pair of fork events in an OPERA chain. The fork events are shown in red and green. The OPERA chains from (a) to (d) are different by adding one single event at a time. (b)An example of a pair of fork events in an OPERA chain. The fork events are shown in red and green. The OPERA chains from (a) to (d) are different by adding one single event at a time. (c)An example of a pair of fork events in an OPERA chain. The fork events are shown in red and green. The OPERA chains from (a) to (d) are different by adding one single event at a time.

Figure [fig:deptreesfork1] shows an example of a pair of fork events. Each row shows an OPERA chain (left most) and the compact dependency graphs on each node (right). The fork events are shown in red and green vertices

Root Selection

All nodes can create event blocks and an event block can be a root when satisfying specific conditions. Not all event blocks can be roots. First, the first created event blocks are themselves roots. These leaf event blocks form the first root set RS1 of the first frame f1. If there are total n nodes and these nodes create the event blocks, then the cardinality of the first root set |RS1| is n. Second, if an event block e can reach at least 2n/3 roots, then e is called a root. This event e does not belong to RS1, but the next root set RS2 of the next frame f2. Thus, excluding the first root set, the range of cardinality of root set RSk is 2n/3 < |RSk|≤n. The event blocks including RSk before RSk + 1 is in the frame fk. The roots in RSk + 1 does not belong to the frame fk. Those are included in the frame fk + 1 when a root belonging to RSk + 2 occurs.

We introduce the use of a flag table to quickly determine whether a new event block becomes a root. Each node maintains a flag table of the top event block. Every event block that is newly created is assigned k hashes for its k referenced event blocks. We apply an O**R operation on each set in the flag table of the referenced event blocks.

[H]

Figure [fig:ex_ft] shows an example of how to use flag tables to determine a root. In this example, b1 is the most recently created event block. We apply an O**R operation on each set of the flag tables for b1’s k referenced event blocks. The result is the flag table of b1. If the cardinality of the root set in b1’s flag table is more than 2n/3, b1 is a root. In this example, the cardinality of the root set in b1 is 4, which is greater than 2n/3 (n=5). Thus, b1 becomes root. In this example, b1 is added to frame f2 since b1 becomes new root.

The root selection algorithm is as follows:

  1. The first event blocks are considered as roots.

  2. When a new event block is added in the OPERA chain (DAG), we check whether the event block is a root by applying an O**R operation on each set of the flag tables connected to the new event block. If the cardinality of the root set in the flag table for the new event block is more than 2n/3, the new event block becomes a root.

  3. When a new root appears on the OPERA chain, nodes update their frames. If one of the new event blocks becomes a root, all nodes that share the new event block add the hash value of the event block to their frames.

  4. The new root set is created if the cardinality of the previous root set RSp is more than 2n/3 and the new event block can reach 2n/3 roots in RSp.

  5. When the new root set RSk + 1 is created, the event blocks from the previous root set RSk to before RSk + 1 belong to the frame fk.

Clotho Selection

A Clotho is a root that satisfies the Clotho creation conditions. Clotho creation conditions are that more than 2n/3 nodes know the root and a root knows this information.

In order for a root r in frame fi to become a Clotho, r must be reached by more than n/3 roots in the frame fi + 1. Based on the definition of the root, each root reaches more than 2n/3 roots in previous frames. If more than n/3 roots in the frame fi + 1 can reach r, then r is spread to all roots in the frame fi + 2. It means that all nodes know the existence of r. If we have any root in the frame fi + 3, a root knows that r is spread to more than 2n/3 nodes. It satisfies Clotho creation conditions.

[H]

In the example in Figure [fig:frame4], n is 5 and each circle indicates a root in a frame. Each arrow means one root can reach (happened-before) to the previous root. Each root has 4 or 5 arrows (out-degree) since n is 5 (more than 2n/3 ≥ 4). b1 and c2 in frame f2 are roots that can reach c0 in frame f1. d1 and e1 also can reach c0, but we only marked b1 and c2 (when n is 5, more than n/3 ≥ 2) since we show at least more than n/3 conditions in this example. And it was marked with a blue bold arrow (Namely, the roots that can reach root c0 have the blue bold arrow). In this situation, an event block must be able to reach b1 or c2 in order to become a root in frame f3 (In our example, n=5, more than n/3 ≥ 2, and more than 2n/3 ≥ 4. Thus, to be a root, either must be reached). All roots in frame f3 reach c0 in frame f1.

To be a root in frame f4, an event block must reach more than 2n/3 roots in frame f3 that can reach c0. Therefore, if any of the root in frame f4 exists, the root must have happened-before more than 2n/3 roots in frame f3. Thus, the root of f4 knows that c0 is spread over more than 2n/3 of the entire nodes. Thus, we can select c0 as Clotho.

Figure [fig:Clotho] shows an example of a Clotho. In this example, all roots in the frame f1 have happened-before more than n/3 roots in the frame f2. We can select all roots in the frame f1 as Clotho since the recent frame is f4.

[H]

Input: a root r c.i**s_cloth**onil c.yes ← 0 c.yes ← c.yes + 1 c.i**s_cloth**oyes

Algorithm [al:acs] shows the pseudo code for Clotho selection. The algorithm takes a root r as input. Line 4 and 5 set c.i**s_cloth**o and c.yes to nil and 0 respectively. Line 6-8 checks whether any root c′ in frame(i − 3, r) has happened-before with the 2n/3 condition c where i is the current frame. In line 9-10, if the number of roots in frame(i − 2, r) which happened-before c is more than 2n/3, the root c is set as a Clotho. The time complexity of Algorithm 3 is O(n2), where n is the number of nodes.

Figure [fig:ClothoNodeA] shows the state of node A when a Clotho is selected. In this example, node A knows all roots in the frame f1 become Clotho’s. Node A prunes unnecessary information on its own structure. In this case, node A prunes the root set in the frame f1 since all roots in the frame f1 become Clotho and the Clotho Check list stores the Clotho information.

Atropos Selection

Atropos selection algorithm is the process in which the candidate time generated from Clotho selection is shared with other nodes, and each root re-selects candidate time repeatedly until all nodes have same candidate time for a Clotho.

After a Clotho is nominated, each node then computes a candidate time of the Clotho. If there are more than two-thirds of the nodes that compute the same value for candidate time, that time value is recorded. Otherwise, each node reselects candidate time. By the reselection process, each node reaches time consensus for candidate time of Clotho as the OPERA chain (DAG) grows. The candidate time reaching the consensus is called Atropos consensus time. After Atropos consensus time is computed, the Clotho is nominated to Atropos and each node stores the hash value of Atropos and Atropos consensus time in Main-Chain (blockchain). The Main-chain is used for time order between event blocks. The proof of Atropos consensus time selection is shown in the section [se:proof].

Figure [fig:Atropos] shows the example of Atropos selection. In Figure [fig:Clotho], all roots in the frame f1 are selected as Clotho through the existence of roots in the frame f4. Each root in the frame f5 computes candidate time using timestamps of reachable roots in the frame f4. Each root in the frame f5 stores the candidate time to min-max value space. The root r6 in the frame f6 can reach more than 2n/3 roots in f5 and r6 can know the candidate time of the reachable roots that f5 takes. If r6 knows the same candidate time than more than 2n/3, we select the candidate time as Atropos consensus time. Then all Clotho in the frame f1 become Atropos.

Figure [fig:Atropos_Node] shows the state of node B when Atropos is selected. In this example, node B knows all roots in the frame f1 become Atropos. Then node B prunes information of the frame f1 in clotho check list since all roots in the frame f1 become Atropos and main chain stores Atropos information.

[H]

Input: c.Cloth**o in frame fi c.consensus_tim**enil m ← the index of the last frame fm R ← be the Root set RSi + d in frame fi + d r.tim**e(c) ← r.lamport_tim**e s ← the set of Root in fj − 1 that r can be happened-before with 2n/3 condition t ← RESELECTION(s, c) k ← the number of root having t in s c.consensus_tim**et r.tim**e(c) ← t r.tim**e(c) ← t r.tim**e(c) ← the minimum value in s

[H]

ALG@cmd@@L @Function @currentfunctionReselection bsphack @writeauxout esphack Input: Root set R, and Clotho c Output: candidate time t τ ← set of all ti = r.tim**e(c) for all r in R D ← set of tuples (ti, ci) computed from τ, where ci = count(ti) max_countmax(ci) tinfinit**e tti return t

Algorithm [al:atc] and [al:resel] show pseudo code of Atropos consensus time selection and Consensus time reselection. In Algorithm [al:atc], at line 6, d saves the deference of relationship between root set of c and w. Thus, line 8 means that w is one of the elements in root set of the frame fi + 3, where the frame fi includes c. Line 10, each root in the frame fj selects own Lamport timestamp as candidate time of c when they confirm root c as Cltoho. In line 12, 13, and 14, s, t, and k save the set of root that w can be happened-before with 2n/3 condition c, the result of RESELECTION function, and the number of root in s having t. Line 15 is checking whether there is a difference as much as h between i and j where h is a constant value for minimum selection frame. Line 16-20 is checking whether more than two-thirds of root in the frame fj − 1 nominate the same candidate time. If two-thirds of root in the frame fj − 1 nominate the same candidate time, the root c is assigned consensus time as t. Line 22 is minimum selection frame. In minimum selection frame, minimum value of candidate time is selected to reach byzantine agreement. Algorithm [al:resel] operates in the middle of Algorithm [al:atc]. In Algorithm [al:resel], input is a root set W and output is a reselected candidate time. Line 4-5 computes the frequencies of each candidate time from all the roots in W. In line 6-11, a candidate time which is smallest time that is the most nomitated. The time complexity of Algorithm [al:resel] is O(n) where n is the number of nodes. Since Algorithm [al:atc] includes Algorithm [al:resel], the time complexity of Algorithm [al:atc] is O(n2) where n is the number of nodes.

In the Atropos Consensus Time Selection algorithm, nodes reach consensus agreement about candidate time of a Clotho without additional communication (i.e., exchanging candidate time) with each other. Each node communicates with each other through the Lachesis protocol, the OPERA chain of all nodes grows up into same shape. This allows each node to know the candidate time of other nodes based on its OPERA chain and reach a consensus agreement. The proof that the agreement based on OPERA chain become agreement in action is shown in the section [se:proof].

Atropos can be determined by the consensus time of each Clotho. It is an event block that is determined by finality and is non-modifiable. Furthermore, all event blocks can be reached from Atropos guarantee finality.

Lachesis Consensus

Consensus Method in a DAG (combines chain with consensus process of pBFT)

Figure [fig:pBFTtoPath] illustrates how consensus is reached through the domination relation in the OPERA chain. In the figure, leaf set, denoted by Rs0, consists of the first event blocks created by individual participant nodes. V is the set of event blocks that do not belong neither in Rs0 nor in any root set Rs**i. Given a vertex v in V ∪ Rs**i, there exists a path from v that can reach a leaf vertex u in Rs0. Let r1 and r2 be root event blocks in root set Rs1 and Rs2, respectively. r1 is the block where a quorum or more blocks exist on a path that reaches a leaf event block. Every path from r1 to a leaf vertex will contain a vertex in V1. Thus, if there exists a vertex r in V1 such that r is created by more than a quorum of participants, then r is already included in Rs1. Likewise, r2 is a block that can be reached for Rs1 including r1 through blocks made by a quorum of participants. For all leaf event blocks that could be reached by r1, they are connected with more than quorum participants through the presence of r1. The existence of the root r2 shows that information of r1 is connected with more than a quorum. This kind of a path search allows the chain to reach consensus in a similar manner as the pBFT consensus processes. It is essential to keep track of the blocks satisfying the pBFT consensus process for quicker path search; our OPERA chain and Main-chain keep track of these blocks.

The sequential order of each event block is an important aspect for Byzantine fault tolerance. In order to determine the pre-and-post sequence between all event blocks, we use Atropos consensus time, Lamport timestamp algorithm and the hash value of the event block.

First, when each node creates event blocks, they have a logical timestamp based on Lamport timestamp. This means that they have a partial ordering between the relevant event blocks. Each Clotho has consensus time to the Atropos. This consensus time is computed based on the logical time nominated from other nodes at the time of the 2n/3 agreement.

Each event block is based on the following three rules to reach an agreement:

  1. If there are more than one Atropos with different times on the same frame, the event block with smaller consensus time has higher priority.

  2. If there are more than one Atropos having any of the same consensus time on the same frame, determine the order based on the own logical time from Lamport timestamp.

  3. When there are more than one Atropos having the same consensus time, if the local logical time is same, a smaller hash value is given priority through hash function.

Figure [fig:sequence of operachain] shows the part of OPERA chain in which the final consensus order is determined based on these 3 rules. The number represented by each event block is a logical time based on Lamport timestamp. Final topological consensus order containing the event blocks are based on agreement from the apropos. Based on each Atropos, they will have different colors depending on their range.

Detecting Forks

(Fork) A pair of events (vx, vy) is a fork if vx and vy have the same creator, but neither is a self-ancestor of the other. Denoted by vx ⋔ vy.

For example, let vz be an event in node n1 and two child events vx and vy of vz. if vxsvz, vysvz, $v_x \not {\hookrightarrow^{s}}v_y$, $v_y \not {\hookrightarrow^{s}}v_z$, then (vx, vy) is a fork. The fork relation is symmetric; that is vx ⋔ vy iff vy ⋔ vx.

By definition, (vx, vy) is a fork if c**r(vx)=c**r(vy), $v_x \not {\hookrightarrow^{a}}v_y$ and $v_y \not {\hookrightarrow^{a}}v_x$. Using Happened-Before, the second part means $v_x \not \rightarrow v_y$ and $v_y \not \rightarrow v_x$. By definition of concurrent, we get vx ∥ vy.

If there is a fork vx ⋔ vy, then vx and vy cannot both be roots on honest nodes.

Here, we show a proof by contradiction. Any honest node cannot accept a fork so vx and vy cannot be roots on the same honest node. Now we prove a more general case. Suppose that both vx is a root of nx and vy is root of ny, where nx and ny are honest nodes. Since vx is a root, it reached events created by more than 2/3 of member nodes. Similarly, vy is a root, it reached events created by more than 2/3 of member nodes. Thus, there must be an overlap of more than n/3 members of those events in both sets. Since we assume less than n/3 members are not honest, so there must be at least one honest member in the overlap set. Let nm be such an honest member. Because nm is honest, nm does not allow the fork.

Conclusion

We further optimize the OPERA chain and Main-chain for faster consensus. By using Lamport timestamps and domination relation, the topological ordering of event blocks in OPERA chain and Main chain is more intuitive and reliable in distributed system.

We have presented a formal semantics for Lachesis protocol in Section [se:lca]. Our formal proof of pBFT for our Lachesis protocol is given in Section [se:proof]. Our work is the first that studies such concurrent common knowledge sematics and dominator relationships in DAG-based protocols.

Appendix

Preliminaries

The history of a Lachesis protocol can be represented by a directed acyclic graph G = (V, E), where V is a set of vertices and E is a set of edges. Each vertex in a row (node) represents an event. Time flows left-to-right of the graph, so left vertices represent earlier events in history. A path p in G is a sequence of vertices (v1, v2, …, vk) by following the edges in E. Let vc be a vertex in G. A vertex vp is the parent of vc if there is an edge from vp to vc. A vertex va is an ancestor of vc if there is a path from va to vc.

Each machine that participates in the Lachesis protocol is called a node.

Let n denote the total number of nodes.

Each node can create event blocks, send (receive) messages to (from) other nodes.

An event block is a vertex of the OPERA chain.

Suppose a node ni creates an event vc after an event vs in ni. Each event block has exactly k references. One of the references is self-reference, and the other k-1 references point to the top events of ni’s k-1 peer nodes.

A node ni has k peer nodes.

An event v is a top event of a node ni if there is no other event in ni referencing v.

An event vs is called “self-ref" of event vc, if the self-ref hash of vc points to the event vs. Denoted by vcsvs.

An event vr is called “ref" of event vc if the reference hash of vc points to the event vr. Denoted by vcrvr.

For simplicity, we can use ↪ to denote a reference relationship (either ↪r or ↪s).

An event block va is self-ancestor of an event block vc if there is a sequence of events such that vcsv1s…↪svmsva. Denoted by vcs**ava.

An event block va is an ancestor of an event block vc if there is a sequence of events such that vc ↪ v1 ↪ … ↪ vm ↪ va. Denoted by vcava.

For simplicity, we simply use vcavs to refer both ancestor and self-ancestor relationship, unless we need to distinguish the two cases.

OPERA chain is a DAG graph G = (V, E) consisting of V vertices and E edges. Each vertex vi ∈ V is an event block. An edge (vi, vj)∈E refers to a hashing reference from vi to vj; that is, vi ↪ vj.

Domination relation

Then we define the domination relation for event blocks. To begin with, we first introduce pseudo vertices, top and bot, of the DAG OPERA chain G.

A pseudo vertex, called top, is the parent of all top event blocks. Denoted by ⊤.

A pseudo vertex, called bottom, is the child of all leaf event blocks. Denoted by ⊥.

With the pseudo vertices, we have ⊥ happened before all event blocks. Also all event blocks happened before ⊤. That is, for all event vi, ⊥ → vi and vi → ⊤.

An event vd dominates an event vx if every path from ⊤ to vx must go through vd. Denoted by vd ≫ vx.

An event vd strictly dominates an event vx if vd ≫ vx and vd does not equal vx. Denoted by vdsvx.

A vertex vd is said “domfront” a vertex vx if vd dominates an immediate predecessor of vx, but vd does not strictly dominate vx. Denoted by vdfvx.

The dominance frontier of a vertex vd is the set of all nodes vx such that vdfvx. Denoted by D**F(vd).

From the above definitions of domfront and dominance frontier, the following holds. If vdfvx, then vx ∈ D**F(vd).

Here, we introduce a new idea that extends the concept domination.

For a vertex v in a DAG G, let G[v]=(Vv, Ev) denote an induced-subgraph of G such that Vv consists of all ancestors of v including v, and Ev is the induced edges of Vv in G.

For a set S of vertices, an event vd $\frac{2}{3}$-dominates S if there are more than 2/3 of vertices vx in S such that vd dominates vx. Recall that R1 is the set of all leaf vertices in G. The $\frac{2}{3}$-dom set D0 is the same as the set R1.The $\frac{2}{3}$-dom set Di is defined as follows:

A vertex vd belongs to a $\frac{2}{3}$-dom set within the graph G[vd], if vd $\frac{2}{3}$-dominates R1. The $\frac{2}{3}$-dom set Dk consists of all roots di such that diDi, ∀ i = 1..(k-1), and di $\frac{2}{3}$-dominates Di − 1.

The $\frac{2}{3}$-dom set Di is the same with the root set Ri, for all nodes.

Proof of Lachesis Consensus Algorithm

This section presents a proof of liveness and safety of our Lachesis protocols. We aim to show that our consensus is Byzantine fault tolerant with a presumption that more than two-thirds of participants are reliable nodes. We first provide some definitions, lemmas and theorems. Then we validate the Byzantine fault tolerance.

Proof of Byzantine Fault Tolerance for Lachesis Consensus Algorithm

An event block vx is said Happened-Immediate-Before an event block vy if vx is a (self-) ref of vy. Denoted by vx ↦ vy.

An event block vx is said Happened-Before an event block vy if vx is a (self-) ancestor of vy. Denoted by vx → vy.

The happens-before relation is the transitive closure of happens-immediately-before. Thus, an event vx happened before an event vy if one of the followings happens: (a) vysvx, (b) vyrvx, or (c) vyavx. We come up with the following proposition:

vx ↦ vy iff vy ↪ vx iff edge (vy, vx) ∈E of OPERA chain.

vx → vy iff vyavx.

Two event blocks vx and vy are said concurrent if neither of them happened before the other. Denoted by vx ∥ vy.

Given two vertices vx and vy both contained in two OPERA chains G1 and G2 on two nodes. We have the following: (1) vx → vy in G1 iff vx → vy in G2; (2) vx ∥ vy in G1 iff vx ∥ vy in G2.

Below is some main definitions in Lachesis protocol.

The first created event block of a node is called a leaf event block.

[def:root] The leaf event block of a node is a root. When an event block v can reach more than 2n/3 of the roots in the previous frames, v becomes a root.

The set of all first event blocks (leaf events) of all nodes form the first root set R1 (|R1| = n). The root set Rk consists of all roots ri such that riRi, ∀ i = 1..(k-1) and ri can reach more than 2n/3 other roots in the current frame, i = 1..(k-1).

Frame fi is a natural number that separates Root sets.

The root set at frame fi is denoted by Ri.

[dfn:conchains] OPERA chains G1 and G2 are consistent iff for any event v contained in both chains, G1[v]=G2[v]. Denoted by G1 ∼ G2.

When two consistent chains contain the same event v, both chains contain the same set of ancestors for v, with the same reference and self-ref edges between those ancestors:

[thm:conchains] All nodes have consistent OPERA chains.

If two nodes have OPERA chains containing event v, then they have the same k hashes contained within v. A node will not accept an event during a sync unless that node already has k references for that event, so both OPERA chains must contain k references for v. The cryptographic hashes are assumed to be secure, therefore the references must be the same. By induction, all ancestors of v must be the same. Therefore, the two OPERA chains are consistent.

If a node nx creates an event block v, then the creator of v, denoted by c**r(v), is nx.

The pair of events (vx, vy) is a fork if vx and vy have the same creator, but neither is a self-ancestor of the other. Denoted by vx ⋔ vy.

For example, let vz be an event in node n1 and two child events vx and vy of vz. if vxsvz, vysvz, $v_x \not {\hookrightarrow^{s}}v_y$, $v_y \not {\hookrightarrow^{s}}v_z$, then (vx, vy) is a fork. The fork relation is symmetric; that is vx ⋔ vy iff vy ⋔ vx.

vx ⋔ vy iff c**r(vx)=c**r(vy) and vx ∥ vy.

By definition, (vx, vy) is a fork if c**r(vx)=c**r(vy), $v_x \not {\hookrightarrow^{a}}v_y$ and $v_y \not {\hookrightarrow^{a}}v_x$. Using Happened-Before, the second part means $v_x \not \rightarrow v_y$ and $v_y \not \rightarrow v_x$. By definition of concurrent, we get vx ∥ vy.

(fork detection). If there is a fork vx ⋔ vy, then vx and vy cannot both be roots on honest nodes.

Here, we show a proof by contradiction. Any honest node cannot accept a fork so vx and vy cannot be roots on the same honest node. Now we prove a more general case. Suppose that both vx is a root of nx and vy is root of ny, where nx and ny are honest nodes. Since vx is a root, it reached events created by more than 2/3 of member nodes. Similary, vy is a root, it reached events created by more than 2/3 of member nodes. Thus, there must be an overlap of more than n/3 members of those events in both sets. Since we assume less than n/3 members are not honest, so there must be at least one honest member in the overlap set. Let nm be such an honest member. Because nm is honest, nm does not allow the fork. This contradicts the assumption. Thus, the lemma is proved.

Each node ni has an OPERA chain Gi. We define a consistent chain from a sequence of OPERA chain Gi.

A global consistent chain GC is a chain if GC ∼ Gi for all Gi.

We denote G ⊑ G′ to stand for G is a subgraph of G′.

Gi (GC ⊑ Gi).

v ∈ GCGi (GC[v]⊑Gi[v]).

(∀vc ∈ GC) (∀vp ∈ Gi) ((vp → vc)⇒vp ∈ GC).

Now we state the following important propositions.

Two chains G1 and G2 are root consistent, if for every v contained in both chains, and v is a root of j-th frame in G1, then v is a root of j-th frame in G2.

If G1 ∼ G2, then G1 and G2 are root consistent.

By consistent chains, if G1 ∼ G2 and v belongs to both chains, then G1[v] = G2[v]. We can prove the proposition by induction. For j = 0, the first root set is the same in both G1 and G2. Hence, it holds for j = 0. Suppose that the proposition holds for every j from 0 to k. We prove that it also holds for j= k + 1. Suppose that v is a root of frame fk + 1 in G1. Then there exists a set S reaching 2/3 of members in G1 of frame fk such that ∀u ∈ S (u → v). As G1 ∼ G2, and v in G2, then ∀u ∈ S (u ∈ G2). Since the proposition holds for j=k, As u is a root of frame fk in G1, u is a root of frame fk in G2. Hence, the set S of 2/3 members u happens before v in G2. So v belongs to fk + 1 in G2. The proposition is proved.

From the above proposition, one can deduce the following:

GC is root consistent with Gi for all nodes.

Thus, all nodes have the same consistent root sets, which are the root sets in GC. Frame numbers are consistent for all nodes.

For any top event v in both OPERA chains G1 and G2, and G1 ∼ G2, then the flag tables of v are consistent iff they are the same in both chains.

From the above lemmas, the root sets of G1 and G2 are consistent. If v contained in G1, and v is a root of j-th frame in G1, then v is a root of j-th frame in Gi. Since G1 ∼ G2, G1[v]=G2[v]. The reference event blocks of v are the same in both chains. Thus the flag tables of v of both chains are the same.

Thus, all nodes have consistent flag tables.

A root rk in the frame fa + 3 can nominate a root ra as Clotho if more than 2n/3 roots in the frame fa + 1 Happened-Before ra and rk Happened-Before the roots in the frame fa + 1.

[lem:root] For any root set R, all nodes nominate same root into Clotho.

Based on Theorem [thm:same], each node nominates a root into Clotho via the flag table. If all nodes have an OPERA chain with same shape, the values in flag table should be equal to each other in OPERA chain. Thus, all nodes nominate the same root into Clotho since the OPERA chain of all nodes has same shape.

[lem:resel] In the Reselection algorithm, for any Clotho, a root in OPERA chain selects the same consensus time candidate.

Based on Theorem [thm:same], if all nodes have an OPERA chain with the same partial shape, a root in OPERA chain selects the same consensus time candidate by the Reselection algorithm.

[lem:fork] If the pair of event blocks (x,y) is fork and a root has Happened-before the fork, this fork is detected in the Clotho selection process.

We show a proof by contradiction. Assume that no node can detect the fork in the Clotho selection process.

Assume that there is a root ri that becomes Clotho in fi, which was selected as Clotho by n/3 of the roots in fi + 1. More than 2n/3 roots in fi + 1 should have happened-before by a root in fi + 2. If a pair (vx, vy) is fork, There are two cases: (1) assume that ri is one of vx and vy, (2) assume that ri can reach both vx and vy.

Our proof for both two cases is as follows.

Let k denote that the number of roots in fi + 1 that can reach ri in fi (∴ n/3 < k).

In order to select root in fi + 2, the root in fi + 2 should reach more than 2n/3 of roots in fi + 1 by Definition [def:root]. At the moment, assume that l is the number of roots in fi + 1 that can be reached by the root in fi + 2 (∴ 2n/3 < l).

At this time, n < k + l (∵ n/3 + 2n/3 < k + l), there are (n - k + l) roots in frame fi + 1 that should reach ri in fi and all roots in fi + 2 should reach at least n – k + l of roots in fi + 1. It means that all roots in fi + 2 know the existence of ri. Therefore, the node that generated all the roots of fi + 2 detect the fork in the Clotho selection of ri, which contradicts the assumption.

It can be covered two cases. If ri is part of the fork, we can detect in fi + 2. If there is fork (vx, vy) that can be reached by ri, it also can be detected in fi + 2 since we can detect the fork in the Clotho selection of ri and it indicates that all event blocks that can be reached by ri are detected by the roots in fi + 2.

For a root v happened-before a fork in OPERA chain, v must see the fork before becoming Clotho.

Suppose that a node creates two event blocks (vx, vy), which forms a fork. To create two Clothos that can reach both events, the event blocks should reach by more than 2n/3 nodes. Therefore, the OPERA chain can structurally detect the fork before roots become Clotho.

[thm:same] All nodes grows up into same consistent OPERA chain GC, which contains no fork.

Suppose that there are two event blocks vx and vy contained in both G1 and G2, and their path between vx and vy in G1 is not equal to that in G2. We can consider that the path difference between the nodes is a kind of fork attack. Based on Lemma [lem:fork], if an attacker forks an event block, each chain of Gi and G2 can detect and remove the fork before the Clotho is generated. Thus, any two nodes have consistent OPERA chain.

If the consensus time of Clotho is validated, the Clotho become an Atropos.

[thm:ct] Lachesis consensus algorithm guarantees to reach agreement for the consensus time.

For any root set R in the frame fi, time consensus algorithm checks whether more than 2n/3 roots in the frame fi − 1 selects the same value. However, each node selects one of the values collected from the root set in the previous frame by the time consensus algorithm and Reselection process. Based on the Reselection process, the time consensus algorithm can reach agreement. However, there is a possibility that consensus time candidate does not reach agreement . To solve this problem, time consensus algorithm includes minimal selection frame per next h frame. In minimal value selection algorithm, each root selects minimum value among values collected from previous root set. Thus, the consensus time reaches consensus by time consensus algorithm.

[thm:bft] If the number of reliable nodes is more than 2n/3, event blocks created by reliable nodes must be assigned to consensus order.

In OPERA chain, since reliable nodes try to create event blocks by communicating with every other nodes continuously, reliable nodes will share the event block x with each other. If a root y in the frame fi Happened-Before event block x and more than n/3 roots in the frame fi + 1 Happened-Before the root y, the root y will be nominated as Clotho and Atropos. Thus, event block x and root y will be assigned consensus time t.

For an event block, assigning consensus time means that the validated event block is shared by more than 2n/3 nodes. Therefore, malicious node cannot try to attack after the event blocks are assigned consensus time. When the event block x has consensus time t, it cannot occur to discover new event blocks with earlier consensus time than t. There are two conditions to be assigned consensus time earlier than t for new event blocks. First, a root r in the frame fi should be able to share new event blocks. Second, the more than 2n/3 roots in the frame fi + 1 should be able to share r. Even if the first condition is satisfied by malicious nodes (e.g., parasite chain), the second condition cannot be satisfied since at least 2n/3 roots in the frame fi + 1 are already created and cannot be changed. Therefore, after an event block is validated, new event blocks should not be participate earlier consensus time to OPERA chain.

Semantics of Lachesis protocol

This section gives the formal semantics of Lachesis consensus protocol. We use CCK model of an asynchronous system as the base of the semantics of our Lachesis protocol. Events are ordered based on Lamport’s happens-before relation. In particular, we use Lamport’s theory to describe global states of an asynchronous system.

We present notations and concepts, which are important for Lachesis protocol. In several places, we adapt the notations and concepts of CCK paper to suit our Lachesis protocol.

An asynchronous system consists of the following:

A process pi represents a machine or a node. The process identifier of pi is i. A set P = {1,...,n} denotes the set of process identifiers.

A process i can send messages to process j if there is a channel (i,j). Let C ⊆ {(i,j) s.t. i, j ∈ P} denote the set of channels.

A local state of a process i is denoted by sji.

A local state consists of a sequence of event blocks sji = v0i, v1i, …, vji.

In a DAG-based protocol, each vji event block is valid only the reference blocks exist exist before it. From a local state sji, one can reconstruct a unique DAG. That is, the mapping from a local state sji into a DAG is injective or one-to-one. Thus, for Lachesis, we can simply denote the j-th local state of a process i by the OPERA chain gji (often we simply use Gi to denote the current local state of a process i).

An action is a function from one local state to another local state.

Generally speaking, an action can be either: a sen**d(m) action where m is a message, a receive(m) action, and an internal action. A message m is a triple ⟨i, j, B⟩ where i ∈ P is the sender of the message, j ∈ P is the message recipient, and B is the body of the message. Let M denote the set of messages. In Lachesis protocol, B consists of the content of an event block v. Semantics-wise, in Lachesis, there are two actions that can change a process’s local state: creating a new event and receiving an event from another process.

An event is a tuple ⟨s, α, s′⟩ consisting of a state, an action, and a state.

Sometimes, the event can be represented by the end state s′. The j-th event in history hi of process i is ⟨sj − 1i, α, sji⟩, denoted by vji.

A local history hi of process i is a (possibly infinite) sequence of alternating local states — beginning with a distinguished initial state. A set Hi of possible local histories for each process i in P.

The state of a process can be obtained from its initial state and the sequence of actions or events that have occurred up to the current state. In Lachesis protocol, we use append-only sematics. The local history may be equivalently described as either of the following: hi = s0i, α1i, α2i, α3ihi = s0i, v1i, v2i, v3ihi = s0i, s1i, s2i, s3i, …

In Lachesis, a local history is equivalently expressed as: hi = g0i, g1i, g2i, g3i, … where gji is the j-th local OPERA chain (local state) of the process i.

Each asynchronous run is a vector of local histories. Denoted by σ = ⟨h1, h2, h3, ...hN⟩.

Let Σ denote the set of asynchronous runs.

We can now use Lamport’s theory to talk about global states of an asynchronous system. A global state of run σ is an n-vector of prefixes of local histories of σ, one prefix per process. The happens-before relation can be used to define a consistent global state, often termed a consistent cut, as follows.

A consistent cut of a run σ is any global state such that if vxi → vyj and vyj is in the global state, then vxi is also in the global state. Denoted by c(σ).

By Theorem [thm:conchains], all nodes have consistent local OPERA chains. The concept of consistent cut formalizes such a global state of a run. A consistent cut consists of all consistent OPERA chains. A received event block exists in the global state implies the existence of the original event block. Note that a consistent cut is simply a vector of local states; we will use the notation c(σ)[i] to indicate the local state of i in cut c of run σ.

The formal semantics of an asynchronous system is given via the satisfaction relation ⊢. Intuitively c(σ)⊢ϕ, “c(σ) satisfies ϕ,” if fact ϕ is true in cut c of run σ. We assume that we are given a function π that assigns a truth value to each primitive proposition p. The truth of a primitive proposition p in c(σ) is determined by π and c. This defines c(σ)⊢p.

Two cuts c(σ) and c**′(σ′) are equivalent with respect to i if: c(σ)∼ic**′(σ′) ⇔ c(σ)[i]=c**′**(σ′)[i]

We introduce two families of modal operators, denoted by Ki and Pi, respectively. Each family indexed by process identifiers. Given a fact ϕ, the modal operators are defined as follows:

Ki(ϕ) represents the statement “ϕ is true in all possible consistent global states that include i’s local state”. c(σ)⊢Ki(ϕ)⇔∀c****′(σ′)(c**′(σ′)∼ic(σ)  ⇒  c**′(σ′) ⊢ ϕ)

Pi(ϕ) represents the statement “there is some consistent global state in this run that includes i’s local state, in which ϕ is true.” c(σ)⊢Pi(ϕ)⇔∃c****′(σ)(c**′(σ)∼ic(σ)  ∧  c**′(σ)⊢ϕ)

The next modal operator is written MC and stands for “majority concurrently knows.” This is adapted from the “everyone concurrently knows” in CCK paper . The definition of MC(ϕ) is as follows.

MC(ϕ)=defi ∈ SKiPi(ϕ), where S ⊆ P and |S|>2n/3.

In the presence of one-third of faulty nodes, the original operator “everyone concurrently knows” is sometimes not feasible. Our modal operator MC(ϕ) fits precisely the semantics for BFT systems, in which unreliable processes may exist.

The last modal operator is concurrent common knowledge (CCK), denoted by CC.

CC(ϕ) is defined as a fixed point of MC(ϕ ∧ X)

CCK defines a state of process knowledge that implies that all processes are in that same state of knowledge, with respect to ϕ, along some cut of the run. In other words, we want a state of knowledge X satisfying: X = MC(ϕ ∧ X). CC will be defined semantically as the weakest such fixed point, namely as the greatest fixed-point of MC(ϕ ∧ X). It therefore satisfies: CC(ϕ)⇔MC(ϕ ∧ CC(ϕ))

Thus, Pi(ϕ) states that there is some cut in the same asynchronous run σ including i’s local state, such that ϕ is true in that cut.

Note that ϕ implies Pi(ϕ). But it is not the case, in general, that Pi(ϕ) implies ϕ or even that MC(ϕ) implies ϕ. The truth of MC(ϕ) is determined with respect to some cut c(σ). A process cannot distinguish which cut, of the perhaps many cuts that are in the run and consistent with its local state, satisfies ϕ; it can only know the existence of such a cut.

Fact ϕ is valid in system Σ, denoted by Σ ⊢ ϕ, if ϕ is true in all cuts of all runs of Σ. Σ ⊢ ϕ ⇔ (∀σ ∈ Σ)(∀c)(c(a)⊢ϕ)

Fact ϕ is valid, denoted ⊢ϕ, if ϕ is valid in all systems, i.e. (∀Σ)(Σ ⊢ ϕ).

A fact ϕ is local to process i in system Σ if Σ ⊢ (ϕ ⇒ Kiϕ)

If ϕ is local to process i in system Σ, then Σ ⊢ (Pi(ϕ)⇒ϕ).

If fact ϕ is local to 2/3 of the processes in a system Σ, then Σ ⊢ (MC(ϕ)⇒ϕ) and furthermore Σ ⊢ (CC(ϕ)⇒ϕ).

A fact ϕ is attained in run σ if ∃c(σ)(c(σ)⊢ϕ).

Often, we refer to “knowing” a fact ϕ in a state rather than in a consistent cut, since knowledge is dependent only on the local state of a process. Formally, i knows ϕ in state s is shorthand for ∀c(σ)(c(σ)[i]=s ⇒ c(σ)⊢ϕ)

For example, if a process in Lachesis protocol knows a fork exists (i.e., ϕ is the exsistenc of fork) in its local state s (i.e., gji), then a consistent cut contains the state s will know the existence of that fork.

Process i learns ϕ in state sji of run σ if i knows ϕ in sji and, for all previous states ski in run σ, k < j, i does not know ϕ.

The following theorem says that if CC(ϕ is attained in a run then all processes i learn PiCC(ϕ) along a single consistent cut.

If CC(ϕ) is attained in a run σ, then the set of states in which all processes learn PiCC(ϕ) forms a consistent cut in σ.

We have presented a formal semantics of Lachesis protocol based on the concepts and notations of concurrent common knowledge . For a proof of the above theorems and lemmas in this Section, we can use similar proofs as described in the original CCK paper.

With the formal semantics of Lachesis, the theorems and lemmas described in Section [se:proof] can be expressed in term of CCK. For example, one can study a fact ϕ (or some primitive proposition p) in the following forms: ‘is there any existence of fork?’. One can make Lachesis-specific questions like ’is event block v a root?’, ’is v a clotho?’, or ’is v a atropos?’. This is a remarkable result, since we are the first that define such a formal semantics for DAG-based protocol.

Reference