Support in-library Multi-Raft #131

akiradeveloper · 2021-01-03T09:06:36Z

Multi-Raft is a data-sharding in Raft where the key space is split into N groups and each group has individual Raft log. This technique is used in KV-store like TiKV.

The current lol 0.6.2 can achieve multi-raft by use socket port as the group ID however this isn't efficient in the following reasons:

Wasting heartbeat: A node can be a leader for group A and B. In this case, we can only send heartbeats in only one group. In other words, the two groups can share the liveness of nodes.
State machines can't share the same data like configuration.

Also, keeping connections to 1000 nodes in cluster may let the system be unstable.

Since the implementation will be very hard and there is no demand for this feature at the moment the priority isn't high but at least worth to be noted.

akiradeveloper · 2021-02-01T08:01:25Z

Here is my initial design sketch.

We define MultiRaftAdapter so requests to each Raft group are directed properly. For the receiver side this looks ok to me but question is on the sender side.

The text in red is a example of request to send. Here the problem is RaftCore knows it's group id. Ideally, each RaftCore can be ignorant of the group id: the knowledge about the group id is managed outside e.g. in the MultiRaftAdapter.

akiradeveloper · 2021-02-01T10:30:57Z

One another possible motivation is that this extension will benefit for performance of a normal application (not only sharded like kvs). The application may have two consensus in the cluster in which the two are independent. For example, one log for distributed sequencer and one for application log; they are logically independent. If we have multi-raft distinguished by group_id, we have use gid=1 for the former and gid=2 for the latter.

Now I am willing to do this. But it will be a big change that will take a long long time.

LegNeato · 2021-03-16T06:08:31Z

I would love to use this feature fwiw

akiradeveloper · 2021-04-14T02:53:02Z

With k8s we can deploy many Raft nodes in separate namespace. (Well, container technology is revolution in distributed computing) This means we don't need to distinguish nodes by their ports any more when they are located in the same physical server.

However, as they are independent Raft nodes they have own Tonic server (and Tokio runtime). Apparently, this consumes memory footprint for each instance. But more importantly, there will be created a hundreds of thread pools and OS scheduler needs to take care of thousands of threads, the overhead will not be negligible in my experience. In conclusion we can't put many (100~) nodes in the same server and the number of shards is restricted therefore.

With this feature in-library Multi-Raft, only one Raft node is put per server and shards coexist in the same Raft node where they are distinguished by 64bits IDs. As a result, we can make 10000~ shards to boost the performance.

akiradeveloper · 2021-04-16T02:42:58Z

This is a very simple example: There are 3 nodes and 3 shards each 2 replicas.

Imagine we split the key space into more shards (like 10000) each maintaining 2 replicas as well.

Other than the resource problem (memory, too many pooled threads), this incurs massive number of heartbeat. The number of heartbeat per interval is proportional to the number of Raft node so in this case the order of 10k. If the heartbeat interval is 100ms, there will be 100 heartbeats in 1 ms, 1 heartbeat in 10us which involves TCP send/recv and protobuf encode/decode. I think this will be totally useless.

My idea is separate the heartbeat layer from Raft node level to phy. node level.

akiradeveloper added the idea label Jan 3, 2021

akiradeveloper mentioned this issue Feb 27, 2021

Remove type parameter A from RaftCore #171

Closed

akiradeveloper mentioned this issue Mar 27, 2021

Change Id format #172

Closed

Repository owner locked and limited conversation to collaborators Sep 9, 2021

akiradeveloper closed this as completed Sep 9, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

This issue was moved to a discussion.

Support in-library Multi-Raft #131

Support in-library Multi-Raft #131

akiradeveloper commented Jan 3, 2021

akiradeveloper commented Feb 1, 2021

akiradeveloper commented Feb 1, 2021

LegNeato commented Mar 16, 2021

akiradeveloper commented Apr 14, 2021 •

edited

Loading

akiradeveloper commented Apr 16, 2021

This issue was moved to a discussion.

This issue was moved to a discussion.

Support in-library Multi-Raft #131

Support in-library Multi-Raft #131

Comments

akiradeveloper commented Jan 3, 2021

akiradeveloper commented Feb 1, 2021

akiradeveloper commented Feb 1, 2021

LegNeato commented Mar 16, 2021

akiradeveloper commented Apr 14, 2021 • edited Loading

akiradeveloper commented Apr 16, 2021

This issue was moved to a discussion.

akiradeveloper commented Apr 14, 2021 •

edited

Loading