-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
store,kv: snapshot doesn't cache the non-exists kv entries lead to poor 'insert ignore' performance #12872
Conversation
…or 'insert ignore' performance Snapshot should cache all the BatchGet() kv entries, including the non-exist ones. This commit fix a bug that non-exist kv entries are not cached, causing `insert ignore` to access TiKV everytime and the performance is poor.
PTAL @jackysp @crazycs520 |
Could you add benchamrk result here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for the code
I'm making a comparison... if those I create a table like this:
and fill the table with 100w rows:
and create another table, set
This result is under the TiDB binary of this commit. |
By the way, I mock a 1ms network latency (for both) because I'm testing on my laptop (no so obvious penalty for networking). I guess at least 20mins have passed away.
|
The result is here, use master branch @shenli
|
Codecov Report
@@ Coverage Diff @@
## master #12872 +/- ##
===============================================
+ Coverage 80.0693% 80.456% +0.3866%
===============================================
Files 465 465
Lines 107262 109333 +2071
===============================================
+ Hits 85884 87965 +2081
+ Misses 14934 14920 -14
- Partials 6444 6448 +4 |
/run-all-tests |
PTAL @crazycs520 @coocood |
@tiancaiamao
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we add a unit test case, e.g.trace fomat='row' insert ignore?
That's exactly the bug! |
The result is difficult to check. @jackysp |
It's cached, after this operation, |
3fcb401
to
e5c36eb
Compare
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
We could check if it send the kv request in the result set. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need a test case.
/run-all-tests
There are always SendRequest in the result set, before or after this PR! |
Any idea about how to write that test? |
64ea7eb
to
e85de23
Compare
PTAL @jackysp |
LGTM |
/run-all-tests |
cherry pick to release-3.0 failed |
cherry pick to release-3.1 failed |
…or 'insert ignore' performance (pingcap#12872)
…ect/tidb into feature-add-udf-support * 'feature-add-udf-support' of https://github.com/JustProject/tidb: (26 commits) *: fix bug that the kill command doesn't work when the killed session is waiting for the pessimistic lock (pingcap#12852) executor: fix the projection upon the indexLookUp in indexLookUpJoin can't get result. (pingcap#12889) planner, executor: support create view on union (pingcap#12595) planner/cascades: introduce TransformationID in cascades planner (pingcap#12879) executor: fix data race in test (pingcap#12910) executor: reuse chunk row for insert on duplicate update (pingcap#12847) ddl: speed up tests (pingcap#12888) executor: speed up test (pingcap#12896) expression: implement vectorized evaluation for `builtinSecondSig` (pingcap#12886) expression: implement vectorized evaluation for `builtinJSONObjectSig` (pingcap#12663) expression: speed up builtinRepeatSig by using MergeNulls (pingcap#12674) expression: speed up unit tests under the expression package (pingcap#12887) store,kv: snapshot doesn't cache the non-exists kv entries lead to poor 'insert ignore' performance (pingcap#12872) executor: fix data race in `GetDirtyTable()` (pingcap#12767) domain: increase TTL to reduce the occurrence of reporting min startTS errors (pingcap#12578) executor: split test for speed up (pingcap#12881) executor: fix inconsistent of grants privileges with MySQL when executing `grant all on ...` (pingcap#12330) expression: implement vectorized evaluation for `builtinJSONUnquoteSig` (pingcap#12841) tune grpc connection count between tidb and tikv (pingcap#12884) Makefile: change test parallel to 8 (pingcap#12885) ...
…or 'insert ignore' performance pingcap#12872
…or 'insert ignore' performance (pingcap#12872)
What problem does this PR solve?
The Snapshot should cache all the BatchGet() KV entries, including the non-exist ones.
This commit fixes a bug (start from #12108) that non-exist KV entries are not cached, causing
insert ignore
to access TiKV every time and the performance is poor.What is changed and how it works?
The confusing API convention leads to this bug. Some places use len(value) = 0 in map[key]value to represent non-exist, some places use
_, ok := map[key]value; !ok
to represent non-exist ... and in the end, we forget to cache the non-exist information.After this change, we can see that the
tikvSnapshot.get
doesn't call rpcSendRequest
anymore.See the tracing result to verify that:
Check List
Tests
This is not a bug related to correctness.
Side effects
Related changes
Release note