raftstore: remove the local reader thread #4558
Conversation
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Force-pushed from 7580914 to 0dbf9e5
/run-integration-tests |
After this PR is merged, don't forget to update TiKV's Grafana dashboards in the tidb-ansible repo. |
Do we have any benchmark results? |
@ngaut I wrote a test log on Confluence: 2019-04 Remove local reader thread |
Could you post a brief result here? |
YCSB Stress Testing

TiKV Configuration (single TiKV instance and a single PD instance):
readpool.storage.high-concurrency: 24
readpool.storage.normal-concurrency: 24
readpool.storage.low-concurrency: 24
readpool.coprocessor.high-concurrency: 24
readpool.coprocessor.normal-concurrency: 24
readpool.coprocessor.low-concurrency: 24
server.grpc-concurrency: 12
server.stats-concurrency: 5

YCSB Configuration (using go-ycsb):
recordcount=10000000
operationcount=100000000
workload=core
readallfields=true
readproportion=1
updateproportion=0
scanproportion=0
insertproportion=0
requestdistribution=uniform

Two YCSB clients, script: ./bin/go-ycsb run tikv -P workload-point-get -p tikv.type="txn" -p threadcount=2048

Result: Both throughput and latency improved noticeably, and the frequency of context switches decreased. |
@ngaut OK, I have just posted it, see the previous comment. |
Thanks. |
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Force-pushed from 993242e to 686baf9
/run-all-tests |
@@ -370,14 +370,6 @@ impl Peer {
pub fn activate<T, C>(&self, ctx: &PollContext<T, C>) {
Please update the comment. Also, I prefer writing
let mut meta = ctx.store_meta.lock().unwrap();
meta.readers
.insert(self.region_id, ReadDelegate::from_peer(self));
here rather than in post_raft_ready_append
I guess separating them is better for now, because activate is also called in fsm/peer.rs, where the meta is already locked. Moving it there would introduce many more changes.
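A minimal sketch of the concern (hypothetical types and names, not the actual TiKV code): std::sync::Mutex is not reentrant, so taking the store_meta lock inside activate would hang whenever the caller, such as the handler in fsm/peer.rs, already holds it.

use std::sync::Mutex;

struct StoreMeta {
    readers: Vec<u64>, // simplified stand-in for the region_id -> ReadDelegate map
}

// If activate locked store_meta itself...
fn activate(store_meta: &Mutex<StoreMeta>, region_id: u64) {
    let mut meta = store_meta.lock().unwrap();
    meta.readers.push(region_id);
}

// ...then a caller that already holds the lock would deadlock here.
fn on_peer_ready(store_meta: &Mutex<StoreMeta>) {
    let _meta = store_meta.lock().unwrap(); // lock held by the caller
    activate(store_meta, 1); // second lock() on the same thread never returns
}

Keeping the registration in post_raft_ready_append avoids this double-lock path at the cost of splitting the logic across two places.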
…into thread-local-reader
Signed-off-by: qupeng <qupeng@pingcap.com>
PTAL @Connor1996 thanks |
LGTM
/run-all-tests |
LGTM
Rest LGTM
src/raftstore/store/worker/read.rs
Outdated
return;
} else {
// Remove delegate for updating it by next cmd execution.
self.delegates.borrow_mut().remove(&region_id);
Why not move this line to L339?
Some(delegate) => {
fail_point!("localreader_on_find_delegate");
delegate
match delegate.take() {
Why take it away? What about subsequent read requests for this region?
It's because of lifetimes: after a request is handled, the delegate is put back into the hashmap.
This may be a performance issue; please pay attention to it.
Ok.
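For reference, a minimal sketch of the take-and-put-back pattern discussed above (hypothetical types, not the PR's exact code): the delegate is moved out of the thread-local map so the RefCell borrow can end before the request is handled, then re-inserted for later reads of the same region.

use std::cell::RefCell;
use std::collections::HashMap;

struct ReadDelegate {
    region_id: u64,
}

impl ReadDelegate {
    fn handle_read(&self) {
        // Execute the read against a local snapshot here.
    }
}

struct LocalReader {
    // Option lets a delegate be taken out and put back without removing the key.
    delegates: RefCell<HashMap<u64, Option<ReadDelegate>>>,
}

impl LocalReader {
    fn read(&self, region_id: u64) {
        // Take the delegate out so the mutable borrow of the map ends
        // before the request is handled.
        let delegate = match self.delegates.borrow_mut().get_mut(&region_id) {
            Some(slot) => slot.take(),
            None => None,
        };
        if let Some(delegate) = delegate {
            delegate.handle_read();
            // Put the delegate back for subsequent reads of this region.
            self.delegates
                .borrow_mut()
                .insert(region_id, Some(delegate));
        }
    }
}

The put-back is an extra hash insert per request, which is the performance concern raised above.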
Signed-off-by: qupeng <qupeng@pingcap.com>
LGTM
/run-all-tests |
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
What have you changed? (mandatory)
In TiKV, all read requests are collected into batches and executed by a single thread called local-reader.
We found that this thread had become a bottleneck for read performance.
This PR removes the local-reader thread and moves the execution of read requests into the readpool, which improves read performance and reduces context switches.
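As a rough illustration of the direction (hypothetical names, not the actual implementation): reads no longer go through a shared queue to a dedicated thread; each readpool thread keeps its own reader state and serves local reads in place.

use std::cell::RefCell;

struct LocalReader;

impl LocalReader {
    fn read(&self, _region_id: u64) {
        // Check the read delegate / lease and read from the local snapshot.
    }
}

thread_local! {
    // One reader per readpool thread: no shared channel, no extra context switch.
    static LOCAL_READER: RefCell<LocalReader> = RefCell::new(LocalReader);
}

// Called directly on a readpool thread instead of forwarding to a local-reader thread.
fn handle_read_in_readpool(region_id: u64) {
    LOCAL_READER.with(|reader| reader.borrow().read(region_id));
}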
What is the type of the changes? (mandatory)
How has this PR been tested? (mandatory)
Unit tests, integration tests, and a Jepsen test (still running; no problems found so far)
Does this PR affect tidb-ansible update? (mandatory)
Yes, see pingcap/tidb-ansible#753.