Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

executor: refactor union scan and dirty table #11702

Merged
merged 46 commits into from
Sep 12, 2019

Conversation

crazycs520
Copy link
Contributor

@crazycs520 crazycs520 commented Aug 9, 2019

What problem does this PR solve?

background

In #10673, TiDB union-scan use memory index reader in the transaction.

TiDB has optimized in the update statement: TiDB won't write index key/value unless the index key/value changed. This optimization affects the union-scan implementation as follows:

In TiDB transaction mem-buffer, the row data may be inconsistent with the untouched index when you execute the update. Then simply using mem-index-reader and mem-index-look-up-reader trigger a bug. To avoid this problem, I have to use the original buildAndSortAddedRows function to avoid the corner case and use getMissIndexRowsByHandle to get the missed row(s) when using mem-index-reader.

The purpose of this PR

  • Remove the buildAndSortAddedRows function.
  • Remove the getMissIndexRowsByHandle function.
  • Add mem-index-lookup-reader.
  • Remove the redundant logic in DirtyTable.

How to implement?

If the update statement with the untouched index will also write the untouched index key/value to mem-buffer, Then everything of union-scan will be easy, Right?

So, In this PR, the update statement with the untouched index will also write the untouched index key/value to mem-buffer, but ignore those key/value when committing the transaction.

How to ignore the untouched index key/value when committing the transaction?

When writing the untouched index key/value to mem-buffer, I mark the index value as uncommitted, then check the type of the key is the index key and the value is marked to be uncommitted when committing the transaction.

How to mark the untouched index value?

The original value of no-unique-index is 0, and the value of the unique index is the handle ID
after marked the value to uncommitted,
The value of no-unique-index is 1, and the value of the unique index is the handle ID + '1'.

Drawback

As you can see, in this way, TiDB needs to write untouched index key/value to mem-buffer, This will introduce the encode and memory overhead.

Benchmark select in transaction

1 TiDB, 1 PD, 1 TiKV, deploy in 1 machine.

prepare

create table t (a int,b int,c int, index idxa(a,b));
insert into 10000 rows. [0,10000).

Best Case for mem index lookup reader:

There is no update with untouched index, so we can safely use the mem-index-lookup reader in the union scan.

  • insert 1000 rows in the transaction.
begin
insert 1000 rows. [10000,11000);
select * from t use index (idxa)  where a = 1; // Bench test index-look-up-reader of Uion-scan.

Before This PR:
1.8 ms. // This still uses the union-scan.BuildAndSort function:
image
image

This PR:
0.8ms
image
image

  • insert 10000 rows in the transaction.
begin
insert 1000 rows. [10000,20000);
select * from t use index (idxa)  where a = 1; // Bench test index-look-up-reader of Uion-scan.

Before This PR:
12 ms.

This PR:
0.8ms

Benchmark for mem-index-reader

Since the getMissIndexRowsByHandle function was removed, so the mem-index-reader also has improvement.

begin
insert 1000 rows. [10000,20000);
select sum(a) from t use index (idx) where a>10000;" 

Before This PR:
0.6ms.

This PR:
0.52ms.

Drawback: Bench for update with untouched index.

update t set c=c+1; -- Update 10000 rows. and table t has 1 index was untouched.

Before This PR:
250ms.

image

This PR:
265ms.
image

What is changed and how it works?

Check List

Tests

  • Unit test

Code changes

  • Has exported function/method change

Side effects

Related changes

@codecov
Copy link

codecov bot commented Aug 9, 2019

Codecov Report

Merging #11702 into master will decrease coverage by 0.0444%.
The diff coverage is 88.372%.

@@               Coverage Diff                @@
##             master     #11702        +/-   ##
================================================
- Coverage   81.5397%   81.4953%   -0.0445%     
================================================
  Files           450        449         -1     
  Lines         96456      95992       -464     
================================================
- Hits          78650      78229       -421     
+ Misses        12222      12189        -33     
+ Partials       5584       5574        -10

@crazycs520
Copy link
Contributor Author

/run-all-tests

@crazycs520
Copy link
Contributor Author

/run-unit-test

@crazycs520
Copy link
Contributor Author

/run-all-tests

@crazycs520
Copy link
Contributor Author

/bench

@crazycs520 crazycs520 changed the title executor: simplify union scan logic, remove get miss rows and try to remove dirty table. executor: simplify union scan logic, remove get miss rows and remove dirty table. Aug 10, 2019
@crazycs520
Copy link
Contributor Author

/run-all-tests

@crazycs520
Copy link
Contributor Author

/bench

@winkyao
Copy link
Contributor

winkyao commented Aug 11, 2019

@crazycs520 Need a benchmark for this PR.

@crazycs520
Copy link
Contributor Author

/run-all-tests

store/tikv/2pc.go Outdated Show resolved Hide resolved
table/tables/index.go Outdated Show resolved Hide resolved
table/tables/index.go Outdated Show resolved Hide resolved
@crazycs520
Copy link
Contributor Author

/run-all-tests

@codecov
Copy link

codecov bot commented Sep 9, 2019

Codecov Report

Merging #11702 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #11702   +/-   ##
===========================================
  Coverage   81.3349%   81.3349%           
===========================================
  Files           453        453           
  Lines         97369      97369           
===========================================
  Hits          79195      79195           
  Misses        12504      12504           
  Partials       5670       5670

belowHandleIndex: us.belowHandleIndex,
}

return &memIndexLookUpReader{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not embed memIndexReader to memIndexLookUpReader?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PhysicalIndexLookUpReader also use indexPlan and tablePlan separately.
And I think it is more clear.

@coocood
Copy link
Member

coocood commented Sep 11, 2019

LGTM

@coocood coocood added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 11, 2019
@ngaut
Copy link
Member

ngaut commented Sep 11, 2019

/bench

@sre-bot
Copy link
Contributor

sre-bot commented Sep 11, 2019

@@                               Benchmark Diff                               @@
================================================================================
--- tidb: 213b783dc685a47f26dfc46a05730ed228afe6cc
+++ tidb: 0d10ddc2fcb5be76d2999138addfc70c8f93297c
tikv: 9b0e35d53adb819cfaad8f9299f4e132874f883d
pd: e678a1f5c022fed729fb79397fe02b6c9f54ff4a
================================================================================
test-1: < oltp_select_desc >
    * QPS : 11919.82 ± 0.3973% (std=38.61) delta: -2.49%
    * AvgMs : 64.43 ± 0.3983% (std=0.21) delta: 2.56%
    * PercentileMs99 : 124.19 ± 2.9116% (std=2.29) delta: 2.57%
            
test-2: < oltp_insert >
    * QPS : 21380.86 ± 0.2042% (std=27.34) delta: 0.85%
    * AvgMs : 11.96 ± 0.2089% (std=0.02) delta: -0.87%
    * PercentileMs99 : 42.61 ± 0.0000% (std=0.00) delta: 0.00%
            
test-3: < oltp_update_non_index >
    * QPS : 29453.83 ± 0.0125% (std=2.62) delta: -0.60%
    * AvgMs : 8.69 ± 0.0000% (std=0.00) delta: 0.63%
    * PercentileMs99 : 30.26 ± 0.0000% (std=0.00) delta: 0.00%
            
test-4: < oltp_read_write >
    * QPS : 37162.94 ± 0.6862% (std=158.49) delta: 0.34%
    * AvgMs : 138.32 ± 0.6854% (std=0.58) delta: -0.33%
    * PercentileMs99 : 257.95 ± 0.0000% (std=0.00) delta: 0.00%
            
test-5: < oltp_point_select >
    * QPS : 75474.33 ± 1.3469% (std=753.96) delta: -0.67%
    * AvgMs : 3.39 ± 1.2382% (std=0.03) delta: 0.77%
    * PercentileMs99 : 7.43 ± 0.0000% (std=0.00) delta: 0.00%
            
test-6: < oltp_update_index >
    * QPS : 17020.34 ± 0.6948% (std=71.58) delta: 0.56%
    * AvgMs : 15.04 ± 0.6783% (std=0.06) delta: -0.56%
    * PercentileMs99 : 48.34 ± 0.0000% (std=0.00) delta: 0.36%
            

https://perf.pingcap.com

Copy link
Member

@jackysp jackysp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jackysp jackysp added the status/can-merge Indicates a PR has been approved by a committer. label Sep 12, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Sep 12, 2019

Your auto merge job has been accepted, waiting for 12159

@jackysp jackysp added status/LGT3 The PR has already had 3 LGTM. and removed status/LGT2 Indicates that a PR has LGTM 2. labels Sep 12, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Sep 12, 2019

/run-all-tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/execution SIG execution sig/transaction SIG:Transaction status/can-merge Indicates a PR has been approved by a committer. status/LGT3 The PR has already had 3 LGTM.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants