plan: remove other accessPaths if one is unique key and full matched.
#6925
plan/logical_plans.go
Outdated
@@ -332,7 +332,7 @@ type accessPath struct {
 	forced bool
 }

-func (ds *DataSource) deriveTablePathStats(path *accessPath) error {
+func (ds *DataSource) deriveTablePathStats(path *accessPath) (bool, error) {
Add comment for the newly added return value.
plan/logical_plans.go
Outdated
 }

-func (ds *DataSource) deriveIndexPathStats(path *accessPath) error {
+func (ds *DataSource) deriveIndexPathStats(path *accessPath) (bool, error) {
ditto
plan/stats.go
Outdated
 continue
 }
-err := ds.deriveIndexPathStats(path)
+uniqueAndOnlyPoint, err := ds.deriveIndexPathStats(path)
`pointQuery` is enough.
I think the name needs to convey that this key is a unique key.
plan/logical_plans.go
Outdated
 }
 path.countAfterAccess, err = ds.statisticTable.GetRowCountByIntColumnRanges(sc, pkCol.ID, path.ranges)
-return errors.Trace(err)
+noIntervalRange := true
Add comments here.
How about s/noIntervalRange/allPointRanges/? The rename would make the code logic clearer.
Agree
plan/logical_plans.go
Outdated
@@ -388,7 +395,7 @@ func (ds *DataSource) deriveIndexPathStats(path *accessPath) error {
 }
 path.countAfterIndex = math.Max(path.countAfterAccess*selectivity, ds.statsAfterSelect.count)
 }
-return nil
+return path.index.Unique && path.eqCondCount == len(path.index.Columns), nil
It's better to only check whether the ranges are all point ranges, just like `deriveTablePathStats`. We can check `path.index.Unique` in `deriveStats()`.
 }
 path.countAfterAccess, err = ds.statisticTable.GetRowCountByIntColumnRanges(sc, pkCol.ID, path.ranges)
-return errors.Trace(err)
+// Check whether the primary key is covered by point query.
+noIntervalRange := true
s/noIntervalRange/allPointRanges/ or s/noIntervalRange/onlyPoint/
`noIntervalRange` is better, since there may be a case where the range is empty.
Any example?
like a > 5 and a < 3...
ok
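To make the naming argument above concrete, here is a minimal sketch of the check. The `keyRange` type and its `isPoint` method are hypothetical, simplified stand-ins rather than TiDB's ranger types, and it assumes that a contradictory condition such as `a > 5 and a < 3` reduces to an empty range list.

```go
package main

import "fmt"

// keyRange is a hypothetical, simplified range over one integer column.
// It is not TiDB's actual ranger type.
type keyRange struct {
	low, high               int64
	lowExclude, highExclude bool
}

// isPoint reports whether the range covers exactly one value.
func (r keyRange) isPoint() bool {
	return r.low == r.high && !r.lowExclude && !r.highExclude
}

// noIntervalRange reports whether none of the ranges is an interval.
// For an empty slice the loop body never runs, so the result is true,
// which is why "no interval" describes it more accurately than "all points".
func noIntervalRange(ranges []keyRange) bool {
	for _, r := range ranges {
		if !r.isPoint() {
			return false
		}
	}
	return true
}

func main() {
	points := []keyRange{{low: 1, high: 1}, {low: 7, high: 7}}
	mixed := []keyRange{{low: 1, high: 1}, {low: 3, high: 9}}
	var empty []keyRange // assumed result of a contradictory condition

	fmt.Println(noIntervalRange(points)) // true
	fmt.Println(noIntervalRange(mixed))  // false
	fmt.Println(noIntervalRange(empty))  // true
}
```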
LGTM
@winoros |
@coocood It may be a range query (https://github.com/lysu/tidb/blob/12dbd3285495ed4606fe5bdd8763f3d0acaf73bb/util/ranger/types.go#L53), and that seems to be no problem?
LGTM
/run-all-tests |
When we have a unique key `a` whose values are mostly NULL, and the condition is `a IS NULL`, the index path may not be the best.
@coocood What is your suggestion? Add a constraint on the estimated row count? For example, if accessPath.count < 100, choose the unique key; otherwise fall back to the normal path selection.
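The row-count guard proposed here could look roughly like the following sketch. Everything in it is illustrative: `pathEstimate`, `canPruneOthers`, and `rowCountLimit` are hypothetical names, and 100 is only the figure floated in the comment, not a value taken from the code.

```go
package main

import "fmt"

// rowCountLimit is a hypothetical cutoff; 100 is only the number
// suggested in the discussion.
const rowCountLimit = 100

// pathEstimate is a simplified, hypothetical summary of one access path.
type pathEstimate struct {
	unique        bool    // the path uses a unique key
	noInterval    bool    // the path has no interval ranges
	estimatedRows float64 // row count estimated from statistics
}

// canPruneOthers allows dropping the other paths only when the unique-key
// path is also estimated to return few rows. This guards against cases
// such as `a IS NULL` on a unique key whose values are mostly NULL.
func canPruneOthers(p pathEstimate) bool {
	return p.unique && p.noInterval && p.estimatedRows < rowCountLimit
}

func main() {
	selective := pathEstimate{unique: true, noInterval: true, estimatedRows: 1}
	mostlyNull := pathEstimate{unique: true, noInterval: true, estimatedRows: 50000}

	fmt.Println(canPruneOthers(selective))  // true
	fmt.Println(canPruneOthers(mostlyNull)) // false
}
```

The trade-off is that such a guard depends on the statistics again, which is the dependency the optimization tries to reduce.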
@shenli |
@coocood PTAL |
What have you changed? (mandatory)
It's a conservative optimization.
If a unique key is fully matched by a point query, we simply remove the other access paths.
What is the type of the changes (mandatory)?
How has this PR been tested (mandatory)?
An explain test was added.
Add a few positive/negative examples (optional)
When the statistics are out of date, we may choose the wrong index. This optimization avoids some of those bad cases.
It is also conservative enough that the decision it makes is no worse than the alternatives.
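As a rough illustration of the rule described above, the sketch below keeps only a unique-key path when every index column is matched by an equality condition. The `candidatePath` type and `pruneByUniquePoint` function are hypothetical simplifications; the real `accessPath` and `deriveStats` logic in the PR carry more state and may differ.

```go
package main

import "fmt"

// candidatePath is a hypothetical, reduced view of an access path.
type candidatePath struct {
	name        string
	unique      bool // the index is a unique key
	indexCols   int  // number of columns in the index
	eqCondCount int  // number of leading columns matched by equality
}

// fullyMatchedUnique mirrors the condition shown in the diff: the index is
// unique and every one of its columns has an equality condition.
func (p candidatePath) fullyMatchedUnique() bool {
	return p.unique && p.eqCondCount == p.indexCols
}

// pruneByUniquePoint returns only the first fully matched unique path if
// one exists; otherwise it returns all paths unchanged so that the normal
// cost-based choice still applies.
func pruneByUniquePoint(paths []candidatePath) []candidatePath {
	for _, p := range paths {
		if p.fullyMatchedUnique() {
			return []candidatePath{p}
		}
	}
	return paths
}

func main() {
	paths := []candidatePath{
		{name: "idx_b", unique: false, indexCols: 1, eqCondCount: 1},
		{name: "uk_a_c", unique: true, indexCols: 2, eqCondCount: 2},
	}
	for _, p := range pruneByUniquePoint(paths) {
		fmt.Println(p.name) // uk_a_c
	}
}
```

Because a fully matched unique key returns at most one row per point, keeping only that path does not rely on the row-count estimates being fresh.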