-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stats: fix row estimation for pseudo unique key #6199
Conversation
func PseudoTable(tableID int64) *Table { | ||
return &Table{ | ||
TableID: tableID, | ||
func PseudoTable(tblInfo *model.TableInfo) *Table { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we cache the pseudo table stats in TableInfo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
statistics/selectivity.go
Outdated
continue | ||
} | ||
switch fun.FuncName.L { | ||
case ast.EQ, ast.NullEQ, ast.In: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In
and EQ
has the same selectivity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getConstantColumnID
guarantees there exists only two parameters for the current expr, so in
is equivalent to eq
statistics/selectivity.go
Outdated
} | ||
} | ||
if unique { | ||
return 1.0 / float64(t.Count) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the expression is IN
with many unique values?
statistics/selectivity.go
Outdated
if len(e) != 2 { | ||
return false | ||
return -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should get ColumnID in expressions like c in (1, 2, 3)
?
/run-all-tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
if !ok { | ||
return PseudoTable(tblID) | ||
tbl = PseudoTable(tblInfo) | ||
h.UpdateTableStats([]*Table{tbl}, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we not mix pseudo table with non-pseudo ones by adding pseudoCache
in Handle
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the benefit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we don't need to load histograms for pseudo ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it won't load histograms for pseudo ones.
@@ -179,7 +182,10 @@ func (h *Handle) UpdateTableStats(tables []*Table, deletedIDs []int64) { | |||
func (h *Handle) LoadNeededHistograms() error { | |||
cols := histogramNeededColumns.allCols() | |||
for _, col := range cols { | |||
tbl := h.GetTableStats(col.tableID).copy() | |||
tbl, ok := h.statsCache.Load().(statsCache)[col.tableID] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not copy the tbl?
/run-integration-common-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
If there are equal conditions on the unique key, the row count should not be greater than 1.
PTAL @coocood @winoros