-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
planner, statistics: build new histogram using range information #7921
Conversation
@winoros any update? |
Updated @lamxTyler |
Please take a look at |
For index, we need to encode ranges if we want to split them by bucket bound while we need to use non-encode ranges to call |
@lamxTyler PTAL |
@@ -24,7 +24,7 @@ type StatsInfo struct { | |||
RowCount float64 | |||
Cardinality []float64 | |||
|
|||
HistColl statistics.HistColl | |||
HistColl *statistics.HistColl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it may be nil now? We should check all the places that use it to avoid nil pointer reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this pr, it's safe. DataSource always holds a histogram.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -465,10 +466,17 @@ func (hg *Histogram) totalRowCount() float64 { | |||
return float64(hg.Buckets[hg.Len()-1].Count + hg.NullCount) | |||
} | |||
|
|||
func (hg *Histogram) notNullCount() float64 { | |||
if hg.Len() == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@winoros OK. Could you simplify totalRowCount()
in this PR?
@winoros Please merge master and resolve conflicts. |
@winoros CI failed, PTAL |
@zz-jason It's the thing i mentioned internally. Statistics' feedback has something wrong with UINT range. |
/run-all-tests |
@lamxTyler I changed the place where calling the SplitRange for |
Also, add some code when doing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What problem does this PR solve?
Not a problem. To enhance the
histColl
held byDataSoure
'sStatsInfo
if we will hold it.What is changed and how it works?
Cut the histograms' buckets by the given range information.
Currently only numeric column's bucket' bound will be changed during this process.
Other column and index will only remove the bucket that have no intersection with ranges.
Check List
Tests
Code changes
Side effects
Related changes
This change is