Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plan: remove other accessPaths if one is unique key and full matched. #6925

Merged
merged 13 commits into from
Jun 29, 2018

Conversation

winoros
Copy link
Member

@winoros winoros commented Jun 28, 2018

What have you changed? (mandatory)

It's a conservative optimization.
If one unique key is full matched by point query. We'll simply remove other choices.

What are the type of the changes (mandatory)?

  • Improvement (non-breaking change which is an improvement to an existing feature)

How has this PR been tested (mandatory)?

explain test added.

Add a few positive/negative examples (optional)

When the statistics is out of date. We may choose a wrong index. This optimization can avoid some bad situation.
And it's conservative enough that we can make sure the decision we made by this optimization is no worse than others.

@@ -332,7 +332,7 @@ type accessPath struct {
forced bool
}

func (ds *DataSource) deriveTablePathStats(path *accessPath) error {
func (ds *DataSource) deriveTablePathStats(path *accessPath) (bool, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment for the newly added return value.

}

func (ds *DataSource) deriveIndexPathStats(path *accessPath) error {
func (ds *DataSource) deriveIndexPathStats(path *accessPath) (bool, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

plan/stats.go Outdated
continue
}
err := ds.deriveIndexPathStats(path)
uniqueAndOnlyPoint, err := ds.deriveIndexPathStats(path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pointQuery is enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to tell that this key is unique key.

}
path.countAfterAccess, err = ds.statisticTable.GetRowCountByIntColumnRanges(sc, pkCol.ID, path.ranges)
return errors.Trace(err)
noIntervalRange := true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comments here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about s/noIntervalRange/allPointRanges/? This variable renaming can make code logic more clear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

@shenli shenli added sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement. labels Jun 28, 2018
@@ -388,7 +395,7 @@ func (ds *DataSource) deriveIndexPathStats(path *accessPath) error {
}
path.countAfterIndex = math.Max(path.countAfterAccess*selectivity, ds.statsAfterSelect.count)
}
return nil
return path.index.Unique && path.eqCondCount == len(path.index.Columns), nil
Copy link
Member

@zz-jason zz-jason Jun 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to only check whether the ranges are all point ranges, just like deriveTablePathStats. We can check path.index.Unique in deriveStats()

@zz-jason zz-jason added the priority/release-blocker This issue blocks a release. Please solve it ASAP. label Jun 28, 2018
@winoros
Copy link
Member Author

winoros commented Jun 28, 2018

PTAL @shenli @zz-jason @coocood

}
path.countAfterAccess, err = ds.statisticTable.GetRowCountByIntColumnRanges(sc, pkCol.ID, path.ranges)
return errors.Trace(err)
// Check whether the primary key is covered by point query.
noIntervalRange := true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/noIntervalRange/allPointRanges/ or s/noIntervalRange/onlyPoint/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noIntervalRange is better. Since there may be the case that the range is empty.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any example?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like a > 5 and a < 3...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@coocood
Copy link
Member

coocood commented Jun 28, 2018

@winoros
What if the range is [<nil>, <nil>] or [1 <nil>, 1 <nil>]?

@lysu
Copy link
Contributor

lysu commented Jun 28, 2018

Copy link
Contributor

@lysu lysu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zz-jason
Copy link
Member

/run-all-tests

zz-jason
zz-jason previously approved these changes Jun 28, 2018
@zz-jason zz-jason added the status/LGT2 Indicates that a PR has LGTM 2. label Jun 28, 2018
Copy link
Member

@coocood coocood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we have a unique key a, and most of its value is NULL, the condition is a is NULL, the index path may not be the best.

@shenli
Copy link
Member

shenli commented Jun 29, 2018

@coocood What's is your suggestion? Add a constraint about the estimated row count? Such as if the accessPath.count < 100, then choose the unique key, else fall to the normal way.

@coocood
Copy link
Member

coocood commented Jun 29, 2018

@shenli
No, we can just exclude the case when the point range contains NULL.

@shenli
Copy link
Member

shenli commented Jun 29, 2018

@coocood PTAL

@coocood coocood merged commit 6a931e5 into pingcap:master Jun 29, 2018
@winoros winoros deleted the cut-off-paths branch June 29, 2018 07:00
winoros added a commit to winoros/tidb that referenced this pull request Jul 3, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority/release-blocker This issue blocks a release. Please solve it ASAP. sig/planner SIG: Planner status/LGT2 Indicates that a PR has LGTM 2. type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants