planner, statistics: maintain histogram for inner join #8097
I'll add more tests to cover the case that the count of |
LGTM
Codecov Report
@@ Coverage Diff @@
## master #8097 +/- ##
==========================================
+ Coverage 67.16% 67.23% +0.06%
==========================================
Files 371 371
Lines 76393 76543 +150
==========================================
+ Hits 51311 51461 +150
- Misses 20494 20496 +2
+ Partials 4588 4586 -2
planner/core/stats.go
@@ -281,6 +281,9 @@ func (p *LogicalJoin) DeriveStats(childStats []*property.StatsInfo) (*property.S
leftKeys = append(leftKeys, eqCond.GetArgs()[0].(*expression.Column))
rightKeys = append(rightKeys, eqCond.GetArgs()[1].(*expression.Column))
}
if p.JoinType == InnerJoin && p.ctx.GetSessionVars().OptimizerSelectivityLevel >= 1 {
return p.deriveInnerJoinStatsWithHist(leftKeys, rightKeys)
Shall we pass the childStats parameter down to deriveInnerJoinStatsWithHist, so that this DeriveStats function can be used by the cascades planner as well?
Oh, the new commits use childStats. But the method still uses the schema of the join's children internally. It seems I need some way to not rely on that.
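To make the point about not relying on the children's schema concrete, here is a minimal sketch of a derivation that takes only the child stats plus key offsets. Everything in it is an assumption for illustration: `statsInfo` is a hypothetical stand-in for `property.StatsInfo`, the keys are passed as column positions rather than `*expression.Column`, the output columns are assumed to be the left child's followed by the right child's, and the row estimate is the textbook `left * right / max(NDV)` formula, not the histogram-based method this PR adds.

```go
package main

import (
	"fmt"
	"math"
)

// statsInfo is a hypothetical stand-in for property.StatsInfo.
type statsInfo struct {
	RowCount    float64
	Cardinality []float64 // NDV per output column, indexed by position
}

// deriveInnerJoinStats estimates inner join stats from the child stats alone.
// leftKeys[i] and rightKeys[i] are the positions of the i-th equal-condition
// columns inside the left and right child outputs, so no schema lookup is needed.
func deriveInnerJoinStats(childStats []*statsInfo, leftKeys, rightKeys []int) *statsInfo {
	left, right := childStats[0], childStats[1]

	// Use the largest key NDV across all key pairs, and never divide by less than 1.
	maxKeyNDV := 1.0
	for i := range leftKeys {
		pairNDV := math.Max(left.Cardinality[leftKeys[i]], right.Cardinality[rightKeys[i]])
		maxKeyNDV = math.Max(maxKeyNDV, pairNDV)
	}
	rows := left.RowCount * right.RowCount / maxKeyNDV

	// Assumed output layout: left columns first, then right columns.
	// A column cannot have more distinct values than there are rows.
	card := make([]float64, 0, len(left.Cardinality)+len(right.Cardinality))
	for _, ndv := range left.Cardinality {
		card = append(card, math.Min(ndv, rows))
	}
	for _, ndv := range right.Cardinality {
		card = append(card, math.Min(ndv, rows))
	}
	return &statsInfo{RowCount: rows, Cardinality: card}
}

func main() {
	left := &statsInfo{RowCount: 1000, Cardinality: []float64{100, 10}}
	right := &statsInfo{RowCount: 500, Cardinality: []float64{50}}
	out := deriveInnerJoinStats([]*statsInfo{left, right}, []int{0}, []int{0})
	fmt.Println(out.RowCount, out.Cardinality)
}
```

Because the function reads nothing from the plan node, any caller that can supply child stats and key positions could reuse it, which is the property the cascades planner would need.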
leftCol := &statistics.Column{Info: leftHist.Info, Histogram: *newHist}
rightCol := &statistics.Column{Info: rightHist.Info, Histogram: *newHist}
lIncreaseFactor := leftHist.GetIncreaseFactor(leftProfile.HistColl.Count)
// The factor is used to scale the NDV. When it's higher than one. NDV doesn't need to be changed.
Why don't we change NDV if this factor is larger than one?
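To frame the question, the two candidate behaviours can be put side by side. This is only a sketch under assumptions: `increaseFactor` is taken to be the ratio of the table's current row count to the row count the histogram was built on (with a neutral factor of 1 when the histogram is empty), and the helper names are hypothetical rather than taken from the PR.

```go
package main

import "fmt"

// increaseFactor is an assumed model of the factor in the diff: current row
// count divided by the row count the histogram was built on.
func increaseFactor(realtimeRows, histRows float64) float64 {
	if histRows == 0 {
		return 1 // assumption: an empty histogram gives a neutral factor
	}
	return realtimeRows / histRows
}

// scaleNDVCapped follows the rule in the code comment: shrink the NDV when the
// factor is below one, and leave it unchanged when the factor is one or more.
func scaleNDVCapped(ndv, factor float64) float64 {
	if factor >= 1 {
		return ndv
	}
	return ndv * factor
}

// scaleNDVLinear is the alternative raised in review: always scale the NDV.
func scaleNDVLinear(ndv, factor float64) float64 {
	return ndv * factor
}

func main() {
	grown := increaseFactor(2000, 1000)  // table doubled since the histogram was built
	shrunk := increaseFactor(500, 1000)  // table halved since the histogram was built
	fmt.Println(scaleNDVCapped(100, grown), scaleNDVLinear(100, grown))
	fmt.Println(scaleNDVCapped(100, shrunk), scaleNDVLinear(100, shrunk))
}
```

The two rules agree when the table has shrunk and differ only when it has grown, which is exactly the case the question is about.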
keyNdv = float64(newHist.NDV) * lIncreaseFactor * rIncreaseFactor
lPosNew := p.schema.ColumnIndex(leftKeys[i])
rPosNew := p.schema.ColumnIndex(rightKeys[i])
cardinality[lPosNew] = float64(newHist.NDV)
Why not multiply this cardinality by lIncreaseFactor, like keyNdv?
Closed for a while. Will reopen it in the future.
What problem does this PR solve?
Initial commit for maintaining histogram for inner join.
Using indexes to calculate the join's StatsInfo is not considered for now.
What is changed and how it works?
Following the description in #7605
Check List
Tests
Not enough. I'll add more, but maybe not in this PR.
Code changes
Side effects