stats: fix row estimation for pseudo unique key #6199

Merged
merged 6 commits into from
Apr 2, 2018

Contributor

alivxxx commented Apr 2, 2018

If there are equality conditions on a unique key, the estimated row count should not be greater than 1.
PTAL @coocood @winoros
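
To illustrate the intent of this change, here is a minimal sketch of capping a pseudo estimate for an equality condition on a unique key. The helper name pseudoEqualRowCount and the divisor pseudoEqualRate are assumptions made for illustration; they are not taken from this PR's diff.

```go
package main

import (
	"fmt"
	"math"
)

// pseudoEqualRate is an assumed divisor used when no real statistics exist:
// an equality predicate is guessed to match tableRows / pseudoEqualRate rows.
const pseudoEqualRate = 1000.0

// pseudoEqualRowCount (hypothetical helper) estimates how many rows match an
// equality predicate. On a unique key the estimate is capped at one row.
func pseudoEqualRowCount(tableRows float64, onUniqueKey bool) float64 {
	if tableRows <= 0 {
		return 0
	}
	if onUniqueKey {
		// A unique key can match at most one row for a single value.
		return math.Min(1, tableRows)
	}
	return tableRows / pseudoEqualRate
}

func main() {
	fmt.Println(pseudoEqualRowCount(10000, false)) // non-unique column
	fmt.Println(pseudoEqualRowCount(10000, true))  // unique key
}
```

Without the cap, a pseudo table with 10000 rows would report 10 matching rows even though a unique key can match only one.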

-func PseudoTable(tableID int64) *Table {
-	return &Table{
-		TableID: tableID,
+func PseudoTable(tblInfo *model.TableInfo) *Table {
Member

Could we cache the pseudo table stats in TableInfo?

Contributor Author

Yes.

continue
}
switch fun.FuncName.L {
case ast.EQ, ast.NullEQ, ast.In:
Member

Do IN and EQ have the same selectivity?

Member

getConstantColumnID guarantees that the current expression has exactly two arguments, so IN is equivalent to EQ here.
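
A rough sketch of the argument-count check described in this reply follows. The types arg and the function constantColumnID are hypothetical stand-ins, not the real getConstantColumnID from the PR. The point is only that an IN list with a single value has the same shape as an equality comparison, while longer lists are rejected.

```go
package main

import "fmt"

// arg is a hypothetical, simplified expression argument: either a column
// reference (isColumn == true) or a constant.
type arg struct {
	isColumn bool
	columnID int64
}

// constantColumnID returns the column ID when the expression has exactly two
// arguments, one a column and the other a constant. It returns -1 otherwise,
// so "c IN (1, 2, 3)" (four arguments) is rejected while "c IN (1)" passes.
func constantColumnID(args []arg) int64 {
	if len(args) != 2 {
		return -1
	}
	if args[0].isColumn && !args[1].isColumn {
		return args[0].columnID
	}
	if args[1].isColumn && !args[0].isColumn {
		return args[1].columnID
	}
	return -1
}

func main() {
	col := arg{isColumn: true, columnID: 3}
	one := arg{}
	fmt.Println(constantColumnID([]arg{col, one}))           // c = 1 or c IN (1)
	fmt.Println(constantColumnID([]arg{col, one, one, one})) // c IN (1, 2, 3)
}
```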

}
}
if unique {
return 1.0 / float64(t.Count)
Member

What if the expression is IN with many unique values?
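
One possible way to handle the case raised here, shown only as a sketch: scale the selectivity by the number of IN-list values and cap it at 1. The function uniqueInSelectivity is hypothetical and assumes the listed values are distinct; nothing in this thread confirms that the merged code does this.

```go
package main

import (
	"fmt"
	"math"
)

// uniqueInSelectivity (hypothetical) estimates the selectivity of
// "uniqueCol IN (v1, ..., vn)". Each distinct value matches at most one row,
// so at most numValues rows out of tableRows can match.
func uniqueInSelectivity(numValues int, tableRows int64) float64 {
	if numValues <= 0 || tableRows <= 0 {
		return 0
	}
	sel := float64(numValues) / float64(tableRows)
	// The list may be longer than the table; selectivity never exceeds 1.
	return math.Min(sel, 1)
}

func main() {
	fmt.Println(uniqueInSelectivity(1, 20))  // same as a single equality
	fmt.Println(uniqueInSelectivity(5, 20))  // five candidate rows
	fmt.Println(uniqueInSelectivity(50, 20)) // capped at the whole table
}
```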

 if len(e) != 2 {
-	return false
+	return -1
Member

coocood commented Apr 2, 2018

Should we also get the ColumnID in expressions like c IN (1, 2, 3)?

Member

zz-jason commented Apr 2, 2018

/run-all-tests

Member

zz-jason left a comment

LGTM

zz-jason added the status/LGT1 label (indicates that a PR has LGTM 1) on Apr 2, 2018

 if !ok {
-	return PseudoTable(tblID)
+	tbl = PseudoTable(tblInfo)
+	h.UpdateTableStats([]*Table{tbl}, nil)
Member

Could we avoid mixing pseudo tables with non-pseudo ones by adding a pseudoCache to Handle?

Contributor Author

What's the benefit?

Member

So we don't need to load histograms for pseudo ones.

Contributor Author

But it won't load histograms for pseudo ones.

@@ -179,7 +182,10 @@ func (h *Handle) UpdateTableStats(tables []*Table, deletedIDs []int64) {
 func (h *Handle) LoadNeededHistograms() error {
 	cols := histogramNeededColumns.allCols()
 	for _, col := range cols {
-		tbl := h.GetTableStats(col.tableID).copy()
+		tbl, ok := h.statsCache.Load().(statsCache)[col.tableID]
Member

Why not copy the tbl?
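
For context on why the copy matters, here is a minimal copy-on-write sketch around an atomic.Value cache. All types and methods here (table, handle, loadHistogram, and so on) are simplified, hypothetical stand-ins rather than the real Handle or Table from this PR, and the excerpt above does not show whether the merged code copies the table further down.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// table is a simplified stand-in for per-table statistics.
type table struct {
	id      int64
	columns map[int64]string // column ID -> placeholder for a loaded histogram
}

// copy returns a copy with its own columns map, so writers never mutate
// a table that concurrent readers may still hold.
func (t *table) copy() *table {
	nt := &table{id: t.id, columns: make(map[int64]string, len(t.columns))}
	for k, v := range t.columns {
		nt.columns[k] = v
	}
	return nt
}

type statsCache map[int64]*table

// handle publishes an immutable statsCache snapshot through atomic.Value.
type handle struct{ cache atomic.Value }

func newHandle() *handle {
	h := &handle{}
	h.cache.Store(statsCache{}) // ensure Load never returns nil
	return h
}

// update publishes a new snapshot containing t. Concurrent writers would
// additionally need a mutex to avoid lost updates; that is omitted here.
func (h *handle) update(t *table) {
	old := h.cache.Load().(statsCache)
	next := make(statsCache, len(old)+1)
	for id, tbl := range old {
		next[id] = tbl
	}
	next[t.id] = t
	h.cache.Store(next)
}

// loadHistogram looks the table up directly in the cache, then copies it
// before modifying, so the previously published table stays unchanged.
func (h *handle) loadHistogram(tableID, colID int64, hist string) bool {
	tbl, ok := h.cache.Load().(statsCache)[tableID]
	if !ok {
		return false // unknown table: nothing to load
	}
	tbl = tbl.copy()
	tbl.columns[colID] = hist
	h.update(tbl)
	return true
}

func main() {
	h := newHandle()
	h.update(&table{id: 1, columns: map[int64]string{}})
	before := h.cache.Load().(statsCache)[1]
	h.loadHistogram(1, 7, "hist")
	after := h.cache.Load().(statsCache)[1]
	fmt.Println(len(before.columns), len(after.columns))
}
```

If loadHistogram skipped the copy, a reader still holding the old table pointer could observe the columns map changing underneath it.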

Contributor Author

alivxxx commented Apr 2, 2018

/run-integration-common-test

Member

coocood left a comment

LGTM

zz-jason merged commit 905eda7 into pingcap:master on Apr 2, 2018
zz-jason added the status/LGT2 label (indicates that a PR has LGTM 2) and removed the status/LGT1 label (indicates that a PR has LGTM 1) on Apr 2, 2018
alivxxx deleted the unique branch on April 3, 2018 02:35
4 participants