This repository has been archived by the owner on May 10, 2022. It is now read-only.

support backup request on client side #84

Closed
levy5307 opened this issue Jan 6, 2020 · 5 comments
Labels
enhancement New feature or request

Comments

@levy5307
Contributor

levy5307 commented Jan 6, 2020

The proposal resides in apache/incubator-pegasus#251.

Background

The backup request feature mitigates the long tail of read latency that applications see when the service jitters. It is suitable for users with weak consistency requirements.

Design

  1. Add a configuration option so users can choose whether to enable backup request.
  2. Add a Map<rpc_address, ReplicaSession> allSessions member to ReplicaConfiguration to hold all sessions.
  3. In TableHandler::initTableConfiguration:
    1. If backup request is disabled, create a session only for the primary replica and put it into allSessions.
    2. If backup request is enabled, also create sessions for the secondaries and store all of them in allSessions.
  4. In TableHandler::call:
    1. For a write, keep the existing logic and send the request to the primary.
    2. For a read, send the request to every session in allSessions and accept only the fastest response, ignoring the rest (when backup request is disabled, allSessions holds only the primary session, so this degenerates to sending to the primary alone).
      1. Add an is_success flag (boolean), initially false, meaning no response has returned yet, and pass it to TableHandler::onRpcReply.
      2. If isEmpty(), the connection to this partition has not been established yet, so call tryQueryMeta and reconnect.
  5. In TableHandler::onRpcReply, handle the response:
    1. If is_success is true, an earlier response has already been processed, so ignore this one.
    2. If is_success is false, this response is the first to return; handle it with the existing logic, then set is_success to true.

Notes:

  • Access to is_success must be protected by a lock.
  • Add a field to client_operator marking whether the operation is a read or a write.
  • Steps 4 and 5 are transparent to whether backup request is enabled; no extra checks are needed there. Only the session creation in step 3 needs to branch on it.
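The first-response-wins rule of steps 4 and 5 can be sketched as follows. This is an illustrative sketch, not the actual client code: an AtomicBoolean plays the role of the locked is_success flag, and compareAndSet makes "only the first reply is processed" atomic without an explicit lock.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Minimal sketch of the first-response-wins logic from steps 4 and 5.
// One AtomicBoolean per client request replaces the locked is_success flag.
public class FirstResponseWins {
    private final AtomicBoolean isSuccess = new AtomicBoolean(false);

    // Called once per replica reply (the role of TableHandler::onRpcReply).
    // Returns true only for the first reply; later replies are ignored.
    public boolean onRpcReply(String response, Consumer<String> handler) {
        if (isSuccess.compareAndSet(false, true)) {
            handler.accept(response); // process only the fastest response
            return true;
        }
        return false; // a response was already handled; drop this one
    }
}
```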
@neverchanje changed the title from "backup request方案讨论" ("backup request design discussion") to "backup request on client side" Feb 15, 2020
@neverchanje changed the title from "backup request on client side" to "support backup request on client side" Feb 15, 2020
@neverchanje neverchanje added the enhancement New feature or request label Feb 15, 2020
@neverchanje

neverchanje commented Feb 15, 2020

> For a read, send the request to every session in allSessions and accept only the fastest response, ignoring the rest (when backup request is disabled, allSessions holds only the primary session, so this degenerates to sending to the primary alone).

> Add an is_success flag (boolean), initially false, meaning no response has returned yet, and pass it to TableHandler::onRpcReply.

> If isEmpty(), the connection to this partition has not been established yet, so call tryQueryMeta and reconnect.

Your first idea is to broadcast read requests to all the replicas in a group. Apparently, this is unacceptable: it triples the read throughput, and the higher load will certainly increase overall latency. Actually, the most difficult part of "backup request" is to keep the overhead as small as possible. Tripling or doubling the workload should not be an option.

One scheme ("hedged request") is to defer the secondary request for a short period, often the desired p999 latency, e.g. 15 ms. If the first request is not answered within that period, a second request is sent. Theoretically, this solution requires only about 0.1% additional load, which is cost-effective for our latency-sensitive users. The scheme is easy to implement, but one problem is that the user has to learn how to set the period appropriately. In BRPC's implementation of hedged requests it is an option called backup_request_ms. Maybe we can make it adaptive in the future.
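The hedged-request scheme can be sketched like this. The request callables and class names here are hypothetical stand-ins for the real ReplicaSession calls: send to the primary first, and only if no reply has arrived within backupRequestMs, send one backup request; the first reply completes the future and the later one is simply ignored, so the extra load stays confined to the tail.

```java
import java.util.concurrent.*;

// Sketch of a hedged read (illustrative names, not the real client API).
public class HedgedRead {
    // Daemon timer thread so the sketch does not keep the JVM alive.
    private static final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "hedge-timer");
                t.setDaemon(true);
                return t;
            });

    public static CompletableFuture<String> read(Callable<String> primary,
                                                 Callable<String> secondary,
                                                 long backupRequestMs) {
        CompletableFuture<String> result = new CompletableFuture<>();
        ExecutorService pool = ForkJoinPool.commonPool();
        pool.submit(() -> complete(result, primary));
        // Fire the backup request only if the primary is still outstanding.
        timer.schedule(() -> {
            if (!result.isDone()) {
                pool.submit(() -> complete(result, secondary));
            }
        }, backupRequestMs, TimeUnit.MILLISECONDS);
        return result;
    }

    private static void complete(CompletableFuture<String> f, Callable<String> call) {
        try {
            f.complete(call.call()); // only the first complete() takes effect
        } catch (Exception e) {
            f.completeExceptionally(e);
        }
    }
}
```

Setting backupRequestMs to the observed p999 is what bounds the extra load to roughly 0.1% of requests.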

For more readings, I copied here the paragraphs related to "hedged request" in "The tail at scale". Take a look.

@levy5307
Contributor Author

levy5307 commented Feb 18, 2020

There are two ways to implement backup request.

Hedged requests

A client first sends one request to the replica believed to be the most appropriate, but then falls back on sending a secondary request after the first request has been outstanding for more than the 95th-percentile (or 99th-percentile, etc.) expected latency. The client cancels remaining outstanding requests once the first result is received. This approach limits the additional load to approximately 5% (or 1%) while substantially shortening the latency tail.

This approach limits the benefits to only a small fraction of requests (the tail of the latency distribution).

Tied requests

The client sends the request to two different servers, each tagged with the identity of the other server ("tied"). When a request begins execution, it sends a cancellation message to its counterpart. The corresponding request, if still enqueued in the other server, can be aborted immediately or deprioritized substantially.

There is another variation in which the request is sent to one server and forwarded to replicas only if the initial server does not have it in its cache, using cross-server cancellations.

This approach extends the benefits beyond the tail to the median of the latency distribution, but it results in higher network load.
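As a toy model of the tied-request idea (all names hypothetical; this is not Pegasus code), two server-side queues hold the same request id, each tagged with its peer, and whichever server begins execution first sends a cancellation that removes the peer's copy:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy sketch of tied requests: the same request id is enqueued on two
// servers, and the one that starts executing first cancels the other copy.
public class TiedRequestServer {
    private final Map<Long, TiedRequestServer> queue = new ConcurrentHashMap<>();

    // Enqueue a request tagged with the identity of the other server.
    public void enqueue(long requestId, TiedRequestServer peer) {
        queue.put(requestId, peer);
    }

    // Begin execution: cancel the tied copy on the peer, if still enqueued.
    // Returns false if this copy was already cancelled by the peer.
    public boolean execute(long requestId) {
        TiedRequestServer peer = queue.remove(requestId);
        if (peer == null) {
            return false; // aborted by the peer's cancellation message
        }
        peer.cancel(requestId);
        return true;
    }

    private void cancel(long requestId) {
        queue.remove(requestId);
    }
}
```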

@neverchanje

neverchanje commented Feb 18, 2020

My first choice is "hedged request", because it is apparently simpler. We can leave optimization until after the initial version.

To dig deeper into the final design, several problems remain:

> then falls back on sending a secondary request after the first request has been outstanding for more than the 95th-percentile (or 99th-percentile, etc.) expected latency.

Since we have two secondaries, we can pick one at random (50:50) for the second request.

Another question is how to design the API for configuring the period to wait before sending the backup request (call it backup_request_ms).

One way I suggest is to add an argument to PegasusClient.openTable and pass backup_request_ms through it.
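The suggested API shape could look like the sketch below. All names (TableOptions, pickSecondary) are hypothetical, not the final client API: openTable would take a per-table option carrying backup_request_ms, and the backup target is chosen 50:50 between the two secondaries.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical per-table options object passed to PegasusClient.openTable.
public class TableOptions {
    private final long backupRequestMs; // <= 0 disables backup requests

    public TableOptions(long backupRequestMs) {
        this.backupRequestMs = backupRequestMs;
    }

    public boolean backupRequestEnabled() {
        return backupRequestMs > 0;
    }

    public long backupRequestMs() {
        return backupRequestMs;
    }

    // Choose one of the secondaries with equal probability (50:50 for two).
    public static int pickSecondary(int secondaryCount) {
        return ThreadLocalRandom.current().nextInt(secondaryCount);
    }
}
```

A call might then look like client.openTable("temp", new TableOptions(20)); again, these names are illustrative only.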

@levy5307
Contributor Author

levy5307 commented Mar 16, 2020

It is more effective to send to one secondary chosen at random than to all of the secondaries. According to the performance test, sending to both secondaries increases p95, because it adds significant load on the servers.

@levy5307
Contributor Author

levy5307 commented Mar 16, 2020

performance test

set/get operation:

| test case | enable backup request | read/write proportion | qps | read avg | read p95 | read p99 | read p999 | read p9999 | write avg | write p95 | write p99 | write p999 | write p9999 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 3-clients 15-threads | no | 1 : 3 | 7076 | 880.6512836149132 | 428.0 | 727.0 | 138495.0 | 988671.0 | 2495.0710801540517 | 6319.0 | 9023.0 | 36319.0 | 531455.0 |
| 3-clients 15-threads | yes, delay 138ms | 1 : 3 | 6987 | 1010.1412488662884 | 403.0 | 7747.0 | 138751.0 | 153599.0 | 2476.104380444753 | 6859.0 | 9119.0 | 13759.0 | 185855.0 |
| 3-clients 100-threads | no | 1 : 0 | 140607 | 707.98960978 | 1474.0 | 2731.0 | 5511.0 | 167551.0 | ---- | ---- | ---- | ---- | ---- |
| 3-clients 100-threads | yes, delay 5ms | 1 : 0 | 77429 | 1288.01461934 | 2935.0 | 3487.0 | 6323.0 | 71743.0 | ---- | ---- | ---- | ---- | ---- |
| 3-clients 30-threads | no | 30 : 1 | 87198 | 306.9600544730426 | 513.0 | 805.0 | 4863.0 | 28271.0 | 1369.4669874672938 | 2661.0 | 5795.0 | 22319.0 | 51359.0 |
| 3-clients 30-threads | yes, delay 5ms | 30 : 1 | 88541 | 298.22470022339127 | 493.0 | 711.0 | 4483.0 | 18479.0 | 1467.6130963728997 | 3263.0 | 6411.0 | 17439.0 | 50975.0 |

Multi-get/Batch-Set operation:

| test case | enable backup request | read/write proportion | qps | read avg | read p95 | read p99 | read p999 | read p9999 | write avg | write p95 | write p99 | write p999 | write p9999 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 3-clients 7-threads | no | 20 : 1 | 24113 | 200.37956913733476 | 277.0 | 410.0 | 2317.0 | 21647.0 | 2034.1923768463382 | 4283.0 | 6427.0 | 18271.0 | 62687.0 |
| 3-clients 7-threads | yes, delay 2ms | 20 : 1 | 23756 | 197.48540031650361 | 268.0 | 351.0 | 2173.0 | 5759.0 | 2187.199077764627 | 4531.0 | 6551.0 | 21551.0 | 63999.0 |
| 3-clients 15-threads | no | 20 : 1 | 30980 | 236.7482510418767 | 348.0 | 526.0 | 3535.0 | 25695.0 | 5361.380053671262 | 14087.0 | 20223.0 | 40639.0 | 90815.0 |
| 3-clients 15-threads | yes, delay 3ms | 20 : 1 | 30483 | 244.1182599024727 | 386.0 | 540.0 | 3105.0 | 13287.0 | 5377.992155339365 | 14119.0 | 19535.0 | 31311.0 | 103103.0 |
