executor: support fast analyze. #9973
Codecov Report
@@             Coverage Diff             @@
##             master     #9973   +/-   ##
===========================================
  Coverage          ?   77.9428%
===========================================
  Files             ?        405
  Lines             ?      82540
  Branches          ?          0
===========================================
  Hits              ?      64334
  Misses            ?      13445
  Partials          ?       4761
This PR is too large to review. I'll split it by:
Test by TPC-H factor=50. Table Normal analyze tasks 7min 54sec.
Wonderful result!
@winoros Currently, the row count treats the same row seen in different snapshots as different rows, so the total count for a column is not correct. This will be fixed in the next PR: build stats info for fast analyze.
keys := make([]kv.Key, 0, task.SampSize)
for i := 0; i < int(task.SampSize); i++ {
	randKey := rander.Int63n(maxRowID-minRowID) + minRowID
Since rand.Intn(0) will result in panic, do we need to check maxRowID-minRowID > 0 here?
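The guard asked about here could look roughly like the sketch below. It is an illustration only, not the code that was merged: `randRowID` and `seededRander` are hypothetical helpers, and returning an error for an empty range is an assumed policy (skipping the region would be another option). It relies on the documented behaviour that `rand.Int63n` panics when its argument is not positive.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// seededRander is a hypothetical helper that builds a deterministic source.
func seededRander(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// randRowID returns a pseudo-random row ID in [minRowID, maxRowID).
// It returns an error instead of panicking when the range is empty,
// because rand.Int63n panics if its argument is <= 0.
func randRowID(rander *rand.Rand, minRowID, maxRowID int64) (int64, error) {
	span := maxRowID - minRowID
	if span <= 0 {
		return 0, errors.New("empty row ID range")
	}
	return rander.Int63n(span) + minRowID, nil
}

func main() {
	rander := seededRander(1)

	id, err := randRowID(rander, 100, 200)
	fmt.Println(id, err)

	// An empty range is reported as an error rather than a panic.
	_, err = randRowID(rander, 5, 5)
	fmt.Println(err)
}
```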
if collectors[0].Samples[samplePos] == nil {
	collectors[0].Samples[samplePos] = &statistics.SampleItem{}
}
collectors[0].Samples[samplePos].Ordinal = int(samplePos)
Ordinal represents this item's relative order of physical position among the samples.
For example, suppose there are two regions with region1 < region2, and you sample item1 from region1 and item2, item3 from region2, so that item1.key < item2.key < item3.key.
Then item1.Ordinal < item2.Ordinal < item3.Ordinal must hold.
But in this implementation, item2.Ordinal < item3.Ordinal < item1.Ordinal can happen.
@eurekaka PTAL and please help confirm whether this problem exists.
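One way to satisfy the ordering condition described above is to assign Ordinal only after the samples have been sorted by key. The sketch below is hypothetical: `sampleItem` is a simplified stand-in for `statistics.SampleItem` with an assumed `Key` field, and it assumes that byte-wise key order matches physical position. It is not the fix adopted in this PR.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// sampleItem is a hypothetical, simplified stand-in for statistics.SampleItem.
type sampleItem struct {
	Key     []byte // assumed: the row key the sample was read from
	Ordinal int    // relative physical position among the samples
}

// assignOrdinals sorts the samples by key and numbers them in that order,
// so that a.Key < b.Key implies a.Ordinal < b.Ordinal.
func assignOrdinals(samples []*sampleItem) {
	sort.Slice(samples, func(i, j int) bool {
		return bytes.Compare(samples[i].Key, samples[j].Key) < 0
	})
	for i, s := range samples {
		s.Ordinal = i
	}
}

func main() {
	// Samples arrive in collection order, which need not be key order.
	samples := []*sampleItem{
		{Key: []byte("region2_k5")},
		{Key: []byte("region1_k3")},
		{Key: []byte("region2_k9")},
	}
	assignOrdinals(samples)
	for _, s := range samples {
		fmt.Printf("%s -> %d\n", s.Key, s.Ordinal)
	}
}
```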
for buildCnt := 0; buildCnt < 5; buildCnt++ {
	needRebuild, err := e.buildSampTask()
	if err != nil {
		return nil, nil, errors.Trace(err)
No need to call errors.Trace() anymore.
	go e.getSampRegionsRowCount(bo, &needRebuildForRoutine[i], &errs[i], &sampTasksForRoutine[i])
}

store, _ := e.ctx.GetStore().(tikv.Storage)
Should we return an error if the store is not tikv.Storage?
This check is already done in the planner.
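For readers unfamiliar with the pattern under discussion, the checked form of a Go type assertion looks roughly like this. Everything here is hypothetical: `storage`, `tikvStorage`, `memStore`, and `fakeTiKV` are toy stand-ins, not the real `kv` or `tikv` types. As the reply notes, the executor can skip the check when an earlier stage already guarantees the store type.

```go
package main

import (
	"errors"
	"fmt"
)

// storage and tikvStorage are hypothetical stand-ins for the real interfaces.
type storage interface {
	Name() string
}

type tikvStorage interface {
	storage
	RegionCount() int
}

// memStore implements only storage.
type memStore struct{}

func (memStore) Name() string { return "mem" }

// fakeTiKV implements tikvStorage.
type fakeTiKV struct{}

func (fakeTiKV) Name() string     { return "tikv" }
func (fakeTiKV) RegionCount() int { return 3 }

// asTiKV uses the two-value type assertion, so an unexpected store type
// produces an error instead of a nil value that is dereferenced later.
func asTiKV(s storage) (tikvStorage, error) {
	ts, ok := s.(tikvStorage)
	if !ok {
		return nil, errors.New("fast analyze requires a TiKV-like store, got " + s.Name())
	}
	return ts, nil
}

func main() {
	if ts, err := asTiKV(fakeTiKV{}); err == nil {
		fmt.Println("regions:", ts.RegionCount())
	}
	if _, err := asTiKV(memStore{}); err != nil {
		fmt.Println("error:", err)
	}
}
```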
	scanTasks []*tikv.KeyLocation
}

func (e *AnalyzeFastExec) getSampRegionsRowCount(bo *tikv.Backoffer, needRebuild *bool, err *error, sampTasks *[]*AnalyzeFastTask) {
Can we use another function signature? For example:
func (e *AnalyzeFastExec) getSampRegionsRowCount(bo *tikv.Backoffer) (needRebuild bool, sampTasks []*AnalyzeFastTask, err error) {
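A value-returning signature like the one suggested can still run in a goroutine if the caller wraps the call in a closure and stores each result in its own slot. The sketch below is a hypothetical illustration: `task`, `result`, `countRegion`, and `runWorkers` are invented names, and the worker body is a placeholder rather than real region scanning.

```go
package main

import (
	"fmt"
	"sync"
)

// task is a hypothetical stand-in for AnalyzeFastTask.
type task struct {
	RegionID int
}

// result groups the three values that were previously written through pointers.
type result struct {
	needRebuild bool
	tasks       []*task
	err         error
}

// countRegion is a placeholder worker using the value-returning style.
func countRegion(worker int) (needRebuild bool, tasks []*task, err error) {
	if worker < 0 {
		return false, nil, fmt.Errorf("invalid worker %d", worker)
	}
	return worker%2 == 1, []*task{{RegionID: worker}}, nil
}

// runWorkers starts n goroutines. Each writes only to its own slice element,
// so no extra locking is needed; the WaitGroup orders the writes before the read.
func runWorkers(n int) []result {
	results := make([]result, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			needRebuild, tasks, err := countRegion(i)
			results[i] = result{needRebuild: needRebuild, tasks: tasks, err: err}
		}(i)
	}
	wg.Wait()
	return results
}

func main() {
	for i, r := range runWorkers(4) {
		fmt.Println(i, r.needRebuild, len(r.tasks), r.err)
	}
}
```

The closure keeps the worker function free of out-parameters, which makes it easier to call and test outside a goroutine.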
All the sub-PRs have been merged.
What problem does this PR solve?
Support fast analyze.
What is changed and how it works?
We sample random keys in each region instead of scanning all regions.
Check List
Tests
Code changes
Side effects
Related changes