
executor: load data/batch insert improvement reducing memory allocation/free #11284

Merged
merged 2 commits into pingcap:master
Jul 23, 2019

Conversation

cfzjywxk
Contributor

What problem does this PR solve?

  • Reduce the number of memory allocations/frees and copies during the batch insert/load data execution process, to improve performance
  • Refactor some code logic

Please help take a look at whether the row buffers are still used after a batch commits; this PR reuses those buffers after each batch transaction commits. So far, unit tests look good.

What is changed and how it works?

Do not allocate memory for every row in each batch; use one memory buffer for all batches.
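
A minimal, self-contained sketch of the pattern, assuming a loader that owns its row buffer (all names are illustrative stand-ins, not the actual TiDB identifiers; Datum stands in for types.Datum):

package main

import "fmt"

// Datum stands in for types.Datum from github.com/pingcap/tidb/types.
type Datum struct{ val interface{} }

// loader keeps one pre-allocated row buffer that is reused for every
// batch instead of calling make for each row of each batch.
type loader struct {
	rows [][]Datum
}

// ensureRows grows the buffer only when the batch limit exceeds its
// capacity; otherwise the existing backing arrays are reused as-is.
func (l *loader) ensureRows(limit, numCols int) {
	if cap(l.rows) < limit {
		l.rows = make([][]Datum, 0, limit)
		for i := 0; i < limit; i++ {
			l.rows = append(l.rows, make([]Datum, numCols))
		}
	}
}

func main() {
	l := &loader{}
	for batch := 0; batch < 3; batch++ {
		l.ensureRows(4, 2) // allocates on the first iteration only
		for i := range l.rows {
			l.rows[i][0] = Datum{val: batch} // fill in place, no new allocation
			l.rows[i][1] = Datum{val: i}
		}
		fmt.Printf("batch %d: filled %d reused rows\n", batch, len(l.rows))
	}
}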

Check List

Tests

  • Unit test

Code changes

  • Has exported function/method change
  • Has exported variable/fields change

Side effects

  • Possible performance regression
  • Increased code complexity

@jackysp
Member

jackysp commented Jul 17, 2019

Could you provide some memory usage test results or performance test results?

@jackysp
Member

jackysp commented Jul 17, 2019

/run-all-tests

@jackysp
Member

jackysp commented Jul 17, 2019

I think it should work; we can try it on the IDC server.

@lysu
Contributor

lysu commented Jul 17, 2019

> @jackysp @lysu any suggestions?
> eh.. not as expected..
> The old implementation would allocate a memory buffer for as many rows as the input has, and each row involved a copy between the filled row and the row buffer, but the results do not seem to differ much.
>
> Maybe Golang did a really good memory-caching job...

Go uses managed memory, so looking at RSS at the OS level may be useless. Maybe take a look at Go's memstats (https://blog.xiaoba.me/2017/09/02/how-to-play-golang-with-gdb-and-python.html), though IMHO that is also hard to observe. Maybe look at another metric, like import speed? :D
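
For reference, Go's own allocation counters can be read from inside the process with the standard runtime package (generic API usage, not code from this PR):

package main

import (
	"fmt"
	"runtime"
)

func main() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	// HeapAlloc is live heap memory; TotalAlloc is cumulative bytes
	// allocated and keeps growing even after frees, so it reflects
	// allocation pressure better than OS-level RSS does.
	fmt.Printf("HeapAlloc  = %d MiB\n", m.HeapAlloc/1024/1024)
	fmt.Printf("TotalAlloc = %d MiB\n", m.TotalAlloc/1024/1024)
	fmt.Printf("NumGC      = %d\n", m.NumGC)
}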

@cfzjywxk
Contributor Author

Tested a 5 million row (500w) data load on the IDC machine; speed and process memory use are almost the same (not very stable, sometimes even slower).

if uint64(cap(e.rows)) < limit {
	e.rows = make([][]types.Datum, 0, limit)
	for i := 0; uint64(i) < limit; i++ {
		e.rows = append(e.rows, make([]types.Datum, len(e.Table.Cols())))
	}
}
Contributor

Maybe we only need to allocate limit - cap(e.rows) new rows of types.Datum here, in case the limit can change dynamically.

Another optimization that may or may not help is to do make([]types.Datum, (limit - cap(e.rows)) * len(e.Table.Cols())) once and then cut slices from this big array to append.
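
A sketch of that second suggestion, using the same illustrative names as above: allocate one flat slab and cut per-row slices from it, so growing the buffer costs two allocations instead of one per row.

// growRows extends rows up to limit using a single flat slab.
// Datum stands in for types.Datum; none of these names are TiDB's.
func growRows(rows [][]Datum, limit, numCols int) [][]Datum {
	need := limit - cap(rows)
	if need <= 0 {
		return rows
	}
	slab := make([]Datum, need*numCols) // one allocation for all new rows
	for i := 0; i < need; i++ {
		// The three-index slice caps each row at numCols so an
		// accidental append to one row cannot spill into the next.
		rows = append(rows, slab[i*numCols:(i+1)*numCols:(i+1)*numCols])
	}
	return rows
}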

Contributor Author

It seems that trace does not support load data right now?

Contributor Author

@lysu Currently init will be called only once; making the variable settable will be handled in another PR.

@ngaut
Member

ngaut commented Jul 17, 2019

Please squash your commits before you file the PR.

@cfzjywxk
Contributor Author

@jackysp PTAL

@jackysp
Member

jackysp commented Jul 18, 2019

On IDC, we can use Ansible to start TiKV as the storage engine. You could take a look at the TiDB heap separately during the test. @lysu will help.
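
Inspecting the heap of a running Go process typically means pulling a pprof heap profile over HTTP. A generic sketch of exposing that endpoint (the address here is an arbitrary example, not TiDB's actual status port):

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers
)

func main() {
	// With this listener running, a heap profile can be fetched with:
	//   go tool pprof http://localhost:6060/debug/pprof/heap
	// which is the kind of tooling that produced the profile outputs below.
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}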

@cfzjywxk
Contributor Author

Some test results on the IDC TiDB cluster, loading 5 million (500w) rows with the sysbench table schema:


master branch

Git Commit Hash: fcc15b1256e4e33132a80e39b841a6e062c353a0

mysql> load data local infile '/data3/rwork/tests/data/500w.dat' into table 2_master;
Query OK, 5000000 rows affected (13 min 39.47 sec)
Records: 5000000 Deleted: 0 Skipped: 0 Warnings: 0

Type: alloc_space
Time: Jul 18, 2019 at 7:34pm (CST)
Showing nodes accounting for 101682.08MB, 96.52% of 105347.99MB total
Dropped 775 nodes (cum <= 526.74MB)
flat flat% sum% cum cum%
42212.87MB 40.07% 40.07% 42212.87MB 40.07% github.com/pingcap/tidb/executor.(*LoadDataInfo).getLine
21196.27MB 20.12% 60.19% 21196.27MB 20.12% github.com/pingcap/goleveldb/leveldb/memdb.New
10225.16MB 9.71% 69.90% 10225.16MB 9.71% github.com/pingcap/goleveldb/leveldb/memdb.(*DB).Put
3374.63MB 3.20% 73.10% 3374.63MB 3.20% github.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter).initKeysAndMutations.func1
2650.29MB 2.52% 75.62% 2650.29MB 2.52% github.com/pingcap/tidb/store/tikv.(*RegionCache).GroupKeysByRegion


dev branch (master commit position with dev commits)

Git Commit Hash: 7f6f89c80f39a5a6e0022fd79a9b9af85fadfff9

mysql> load data local infile '/data3/rwork/tests/data/500w.dat' into table p4348_2;
Query OK, 5000000 rows affected (10 min 42.53 sec)
Records: 5000000 Deleted: 0 Skipped: 0 Warnings: 0

 

Build ID: b6b3aac95012ab192d0a3e1110aebf241dd40a9f
Type: alloc_space
Time: Jul 18, 2019 at 7:52pm (CST)
Showing nodes accounting for 101016.81MB, 96.95% of 104191.50MB total

41964.72MB 40.28% 40.28% 41964.72MB 40.28% github.com/pingcap/tidb/executor.(*LoadDataInfo).getLine
21250.88MB 20.40% 60.67% 21250.88MB 20.40% github.com/pingcap/goleveldb/leveldb/memdb.New
10198.45MB 9.79% 70.46% 10198.45MB 9.79% github.com/pingcap/goleveldb/leveldb/memdb.(*DB).Put
3477.13MB 3.34% 73.80% 3477.13MB 3.34% github.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter).initKeysAndMutations.func1
2684.87MB 2.58% 76.37% 2685.37MB 2.58% github.com/pingcap/tidb/store/tikv.(*RegionCache).GroupKeysByRegion
2312.38MB 2.22% 78.59% 2312.38MB 2.22% github.com/pingcap/tidb/util/codec.reallocBytes
1815.13MB 1.74% 80.34% 1815.13MB 1.74% github.com/pingcap/tidb/executor.(*fieldWriter).outputField

@ngaut
Member

ngaut commented Jul 18, 2019

According to the memory profile output, there is not much difference between master and the current PR.
It's weird, since the loading time dropped by roughly 22% (13m39s to 10m43s).

@cfzjywxk
Contributor Author

cfzjywxk commented Jul 18, 2019

Thanks @lysu for another memory usage optimization suggestion for the getLine function.
Re-ran with the update; memory usage is better:


dev branch (master commit position with dev commits)

 Git Commit Hash: 7ad46a5702eb08288f7558d27ab478bc44d51bb7

mysql> load data local infile '/data3/rwork/tests/data/500w.dat' into table newt;
Query OK, 5000000 rows affected (2 min 45.06 sec)
Records: 5000000 Deleted: 0 Skipped: 0 Warnings: 0

 

start memory profile  

Showing nodes accounting for 99.66MB, 98.03% of 101.66MB total

 

end memory profile

Showing nodes accounting for 57741.90MB, 96.02% of 60136.50MB total
Dropped 553 nodes (cum <= 300.68MB)
flat flat% sum% cum cum%
20618.74MB 34.29% 34.29% 20618.74MB 34.29% github.com/pingcap/goleveldb/leveldb/memdb.New
9918.86MB 16.49% 50.78% 9918.86MB 16.49% github.com/pingcap/goleveldb/leveldb/memdb.(*DB).Put
3340.36MB 5.55% 56.34% 3340.36MB 5.55% github.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter).initKeysAndMutations.func1


 

master branch

Git Commit Hash: fcc15b1256e4e33132a80e39b841a6e062c353a0

 

mysql> load data local infile '/data3/rwork/tests/data/500w.dat' into table mastert;
Query OK, 5000000 rows affected (3 min 7.95 sec)
Records: 5000000 Deleted: 0 Skipped: 0 Warnings: 0

 

start memory profile  

Showing nodes accounting for 99.61MB, 100% of 99.61MB total

end memory profile

Showing nodes accounting for 100685.37MB, 97.31% of 103467.93MB total
Dropped 489 nodes (cum <= 517.34MB)
flat flat% sum% cum cum%
41421.47MB 40.03% 40.03% 41421.47MB 40.03% github.com/pingcap/tidb/executor.(*LoadDataInfo).getLine
20895.71MB 20.20% 60.23% 20895.71MB 20.20% github.com/pingcap/goleveldb/leveldb/memdb.New
10134.26MB 9.79% 70.02% 10134.26MB 9.79% github.com/pingcap/goleveldb/leveldb/memdb.(*DB).Put

 

@cfzjywxk
Contributor Author

@jackysp @lysu should we merge this first and leave other optimizations to other PRs, so that the diff does not get too big?

@@ -16,6 +16,7 @@ package executor
 import (
 	"context"
 	"fmt"
+	"github.com/pingcap/tidb/util/hack"
Member

Please move it to the third-party imports group.

@@ -681,6 +681,9 @@ func (b *executorBuilder) buildLoadData(v *plannercore.LoadData) Executor {
 		},
 	}
 
+	var defaultLoadDataBatchCnt uint64 = 20000 //TODO this will be changed to variable in another pr
Member

Comment should start with a whitespace, i.e. `// TODO` rather than `//TODO`.

@jackysp (Member) left a comment

rest LGTM

…ing copy times

fix getLine function causing mem copy problem
@cfzjywxk
Contributor Author

/run-all-tests


@jackysp (Member) left a comment

LGTM

@jackysp
Member

jackysp commented Jul 22, 2019

PTAL @lysu

@cfzjywxk
Contributor Author

/run-circleci-tests

@codecov

codecov bot commented Jul 22, 2019

Codecov Report

Merging #11284 into master will decrease coverage by 0.5427%.
The diff coverage is 92.5925%.

@@               Coverage Diff                @@
##             master     #11284        +/-   ##
================================================
- Coverage   81.8226%   81.2799%   -0.5428%     
================================================
  Files           424        423         -1     
  Lines         92131      90256      -1875     
================================================
- Hits          75384      73360      -2024     
- Misses        11480      11600       +120     
- Partials       5267       5296        +29

@codecov

codecov bot commented Jul 22, 2019

Codecov Report

Merging #11284 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #11284   +/-   ##
===========================================
  Coverage   81.6377%   81.6377%           
===========================================
  Files           423        423           
  Lines         91318      91318           
===========================================
  Hits          74550      74550           
  Misses        11467      11467           
  Partials       5301       5301

@jackysp (Member) left a comment

LGTM

@lysu (Contributor) left a comment

LGTM

@zz-jason merged commit 119d532 into pingcap:master on Jul 23, 2019
@zz-jason added the status/LGT2 label (indicates that a PR has 2 LGTMs) on Jul 23, 2019