-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: maintaining histograms in plan. #7605
Conversation
@CaitinChen PTAL, thanks! |
## Background | ||
|
||
Currently, TiDB only uses statistics when deciding which physical scan method a table should use. And TiDB only stores simple statistics in the plan structure. But when deciding the join order and considering some other optimization rules, we need more detailed statistics. | ||
So we need to maintain the statistics in the plan structure to get sufficient statistics information to do optimizations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we need to maintain the statistics in the plan structure to get sufficient statistics information to do optimizations.
|
||
For `Sort`, we can just copy children's `statsInfo` without doing any change. | ||
|
||
For `Limit`, we can just copy children's `statsInfo` or ignore the histogram information. As you know, its execution logic is based on randomization. Hard to maintain the statistics information after it. But we may use the information before it to do some estimation in some scenarios. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Limit
, we can just copy children's statsInfo
or ignore the histogram information. As you know, its execution logic is based on randomization. It is hard to maintain the statistics information after Limit
. But we may use the information before Limit
to do some estimation in some scenarios.
?
|
||
For `Join`, there’re joins as follows: | ||
|
||
- Inner join: use histograms to do the row count estimation with the join key condition. Since it won’t have one side filter, we only need to consider the composite filters after considering the join key. We can simply multiply `selectionFactor` if there are other composite filters in our first version of implementation. Since `Selectivity` cannot calculate selectivity of expression that containing multiple column. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-> ... selectivity of an expression that contains multiple columns.
or
-> ... selectivity of an expression containing multiple columns.
|
||
- One side outer join: It depends on the join keys’ NDV. And we can just use histograms to estimate it if there’re non-join-key filters. | ||
|
||
- Semi join: It’s something similar to inner join. But no data expanding occurs. When we maintain the range information. We can get a nearly accurate answer of its row count. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... When we maintain the range information, we can get a nearly accurate answer of its row count
or
... But no data expanding occurs when we maintain the range information. We can get a nearly accurate answer of its row count
??
|
||
For `Selection`, just use it to calculate the selectivity. | ||
|
||
For `DataSource`, if it’s a non-partitioned table, we just maintain the map. If it’s a partitioned table, we now only store the statistics of each partition So we need to merge them. We’ll need a cache or something else to ensure that we won’t merge them each time we need it, which will consume tooooo much time and memory space. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For DataSource
, if it’s a non-partitioned table, we just maintain the map. If it’s a partitioned table, we now only store the statistics of each partition. So we need to merge them. We need a cache or something else to ensure that we won’t merge them each time we need it, which will consume tooooo much time and memory space.
|
||
### What is the impact of not doing this? | ||
|
||
Many cases reported by our customer already prove that we need more accurate statistics to choose a better join order and a proper join algorithm. Only maintaining a number about row count and a slice about ndv is not enough for making that decision. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ndv -> NDV
Note: The terms in an article should be consistent.
|
||
## Implementation | ||
|
||
First maintain the histogram in `DataSource`. In this step, there will be some changes in the `statistics` package to make it work. It may take a little long time to do this. [PR#7385](https://github.com/pingcap/tidb/pull/7385) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, maintain the histogram in DataSource
. In this step, there will be some changes in the statistics
package to make it work. It may take a little long time to do this. PR#7385
|
||
For `Projection`, change the reflection of the map we maintained. | ||
|
||
For `Aggregation`, use the histogram to estimate the ndv of group-by items. If one index cannot cover the group-by item, we’ll multiply the ndv of each group-by column. If the output of `Aggregation` includes group-by columns, we’ll maintain the histogram of them for future use. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ndv -> NDV
"ndv" or "NDV"? Please make your terms consistent.
|
||
I’ve looked into Spark. They did nearly the same thing with what I said. They only maintain the max and min values, no `ranges` information. And they don’t have index, so they only maintain the column’s max/min value which make problem far more much easier. | ||
|
||
As for Orca and Calcite, I haven’t discovered where they maintain this information. But there’s something about statistics in Orca’s paper. According to the paper, I think they construct new histogram during planning and cache it for not building to often. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... I think they construct a new histogram during planning and cache it to avoid building too many times.
or
... I think they construct a new histogram during planning and cache it to avoid repeated building.
|
||
And the `expectedCount` we used in physical plan is something same with `Limit`. So the row count modification during physical plan won’t be affected. | ||
|
||
After we switch to the cascade like planner. The rule that needs cost to make decision is still a small set of all. And the existence of `Group` can also help us. If we lazily construct the `statsInfo`, this may not be the bottleneck. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After we switch to the cascade-like planner. The rule that needs cost
to make a decision is still a small set of all.
@CaitinChen I've addressed these comments. PTAL thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest LGTM
## Background | ||
|
||
Currently, TiDB only uses statistics when deciding which physical scan method a table should use. And TiDB only stores simple statistics in the plan structure. But when deciding the join order and considering some other optimization rules, we need more detailed statistics. | ||
So we need to maintain the statistics in the plan structure to get sufficient statistics information to do optimizations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can click "View" in the upper right corner. Then you can see that these two paragraphs are combined together.
Next time, when you want to write another paragraph, please break a line~ Then this article can be displayed normally.
|
||
For `Aggregation`, we only need to cut off the things which are not in ranges when doing estimation. There is no need to update the ranges information. | ||
|
||
For `TopN`, we now have the alibity to maintain histograms of the order-by items. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alibity -> ability
|
||
I’ve looked into Spark. They did nearly the same thing with what I said. They only maintain the max and min values, rather than the `ranges` information. And they don’t have the index, so they only maintain the column’s max/min value which make problem much easier to solve. | ||
|
||
As for Orca and Calcite, I haven’t discovered where they maintain this information. But there’s something about statistics in Orca’s paper. According to the paper, I think they construct new histogram during planning and cache it to avoid building too many times. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... I think they construct a new histogram during planning...
or
... I think they construct new histograms during planning...
|
||
And the `expectedCount` we used in physical plan is something same with `Limit`. So the row count modification during physical plan won’t be affected. | ||
|
||
After we switch to the cascade-like planner. The rule that needs cost to make decision is still a small set of all. And the existence of `Group` can also help us. If we lazily construct the `statsInfo`, this may not be the bottleneck. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After we switch to the cascade-like planner, the rule that needs cost to make a decision...
@@ -0,0 +1,115 @@ | |||
# Proposal: Maintain statistics in `Plan` | |||
|
|||
- Author: Yiding CUI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add the link of your GitHub profile page after your name~
@CaitinChen PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
||
The new `statsInfo` of `plan` should be something like the following structure: | ||
|
||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
```go
|
||
We maintain the histogram in `Projection`, `Selection`, `Join`, `Aggregation`, `Sort`, `Limit` and `DataSource` operators. | ||
|
||
For `Sort`, we can just copy children's `statsInfo` without doing any change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about use a separate section to describe how to maintain statistics for each operators, like:
### `Sort`
### `Limit`
### `Project`
### ...
|
||
For `Join`, there’re joins as follows: | ||
|
||
- Inner join: use histograms to do the row count estimation with the join key condition. Since it won’t have one side filter, we only need to consider the composite filters after considering the join key. We can simply multiply `selectionFactor` if there are other composite filters in our first version of implementation. Since `Selectivity` cannot calculate selectivity of an expression containing multiple columns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/`Selectivity`/`Selectivity()`/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how the statistics of Join
is derived from the statistics of its children?
|
||
For `Projection`, change the reflection of the map we maintained. | ||
|
||
For `Aggregation`, use the histogram to estimate the NDV of group-by items. If one index cannot cover the group-by item, we’ll multiply the NDV of each group-by column. If the output of `Aggregation` includes group-by columns, we’ll maintain the histogram of them for future use. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how to maintain the statistics info for the Aggregate
operator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may not need to do anything. Just use the child is okay.
I'll update this soon. |
8ad555a
to
539c59a
Compare
539c59a
to
bdd26fa
Compare
updated. |
|
||
We maintain the histogram in `Projection`, `Selection`, `Join`, `Aggregation`, `Sort`, `Limit` and `DataSource` operators. | ||
|
||
#### `Sort` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should describe the following things for each operator:
- How to maintain and set the the content of the
statsInfo
struct of the operator based on thestatsInfo
of it's child operator?- how to calculate the
ndv
slice? - how to calculate the
histColl
slice? - how to calculate the
rangesOfXXX
slice? - how to calculate the
max/min Values
map?
- how to calculate the
- How to estimate the output row count for the operator based on the
statsInfo
of it's child operator?
|
||
Where <img alt="$joinKeySelectivity = \frac{1}{NDV(t1.key)}*\frac{1}{NDV(t2.key)}*ndvAfterJoin$" src="svgs/291c9eb6e8db885402c716ffc3e17a65.png?invert_in_darkmode" align="middle" width="466.6166208pt" height="27.7756545pt"/>. | ||
|
||
The `ndvAfterJoin` can be <img alt="$min(NDV(t1.key), NDV(t2.key))$" src="svgs/30df1c648fa9fe43985776847c8dbe60.png?invert_in_darkmode" align="middle" width="248.4423216pt" height="24.657534pt"/> or a more detailed one if we can caculate it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
caculate
-> calculate
##### One side outer join | ||
It's almost the same as inner join's behavior. But we need to consider two more thing: | ||
|
||
- The unmatched row will be filled as `NULL`. This should be calculated in the new histogram. The null count can be caculated when we estimate the matched count bucket by bucket. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
caculated
-> calculated
It's almost the same as inner join's behavior. But we need to consider two more thing: | ||
|
||
- The unmatched row will be filled as `NULL`. This should be calculated in the new histogram. The null count can be caculated when we estimate the matched count bucket by bucket. | ||
- There will be one side filters of the outer table. If the filter is about join key and can be converted to range information, it's can be easily involved when we do the caculation bucket by bucket. Otherwise it's a little hard to deal with it. Don't consider this case currently. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
caculation
-> calculation
Same with semi join. | ||
|
||
#### `Aggregate` | ||
Just read the NDV information from the `statsInfo` to dicide the row count after aggregate. If there's index can fully match the group-by items. We just use its NDV. Otherwise we multiply the ndv of each column(or index that can match part of the group-by item). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dicide
-> decide
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
. We just use
-> , we just use
?
We can just copy children's `statsInfo` without doing any change. Since the data distribution is not changed. | ||
|
||
#### `Limit` | ||
Currently we won't maintain hitogram information for it. But it can be considered in the future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hitogram
-> histogram
<img alt="Step 2" src="./histogram-3.png" width="150pt"/> | ||
</div> | ||
|
||
The calculation inside the bucket can be calculated as this formula <img alt="$selecivity=joinKeySelectivity*RowCount(t1)*RowCount(t2)$" src="svgs/35fa60f709be6b9ab8aa9036bd5e7f7f.png?invert_in_darkmode" align="middle" width="476.19356895pt" height="24.657534pt"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
selecivity
-> selectivity
.
#### `Aggregate` | ||
Just read the NDV information from the `statsInfo` to dicide the row count after aggregate. If there's index can fully match the group-by items. We just use its NDV. Otherwise we multiply the ndv of each column(or index that can match part of the group-by item). | ||
|
||
If some of the group-by items are also in the select field. We will create new histograms modify the `totalCnt` of each bucket(set it the same with `NDV`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
. We will
-> , we will
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modify the
-> by modifying the
?
If some of the group-by items are also in the select field. We will create new histograms modify the `totalCnt` of each bucket(set it the same with `NDV`). | ||
|
||
#### `Sort` | ||
We can just copy children's `statsInfo` without doing any change. Since the data distribution is not changed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
. Since
- , since
?
|
||
This struct will be maintained when we call `deriveStats`. | ||
|
||
Currently we don't change the histogram itself during planning. Because it will consume a lot of time and memory space. I’ll try to maintain ranges slice or the max/min value to improve the accuracy of row count estimation instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we change the histogram in this proposal?
Actually, we only use the |
@lamxTyler Yes, it can be removed |
LGTM |
@lamxTyler @eurekaka PTAL |
@winoros They are still some comments not addressed. |
@lamxTyler So now it can be reviewed. |
Seems the file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
bugfix fixed pingcap#7518 expression: MySQL compatible current_user function (pingcap#7801) plan: propagate constant over outer join (pingcap#7794) - extract `outerCol = const` from join conditions and filter conditions, substitute `outerCol` in join conditions with `const`; - extract `outerCol = innerCol` from join conditions, derive new join conditions based on this column equal condition and `outerCol` related expressions in join conditions and filter conditions; util/timeutil: fix data race caused by forgetting set stats lease to 0 (pingcap#7901) stats: handle ddl event for partition table (pingcap#7903) plan: implement Operand and Pattern of cascades planner. (pingcap#7910) planner: not convert to TableDual if empty range is derived from deferred constants (pingcap#7808) plan: move projEliminate behind aggEliminate (pingcap#7909) admin: fix admin check table bug of byte compare (pingcap#7887) * admin: remove reflect deepEqual stats: fix panic caused by empty histogram (pingcap#7912) plan: fix panic caused by empty schema of LogicalTableDual (pingcap#7906) * fix drop view if exist error (pingcap#7833) executor: refine `explain analyze` (pingcap#7888) executor: add an variable to compatible with MySQL insert for OGG (pingcap#7863) expression: maintain `DeferredExpr` in aggressive constant folding. (pingcap#7915) stats: fix histogram boundaries overflow error (pingcap#7883) ddl:support the definition of `null` change to `not null` using `alter table` (pingcap#7771) * ddl:support the definition of null change to not null using alter table ddl: add check when create table with foreign key. (pingcap#7885) * ddl: add check when create table with foreign key planner: eliminate if null on non null column (pingcap#7924) executor: fix a bug in point get (pingcap#7934) planner, executor: refine ColumnPrune for LogicalUnionAll (pingcap#7930) executor: fix panic when limit is too large (pingcap#7936) ddl: add TiDB version to metrics (pingcap#7902) stats: limit the length of sample values (pingcap#7931) vendor: update tipb (pingcap#7893) planner: support the Group and GroupExpr for the cascades planner (pingcap#7917) store/tikv: log more information when other err occurs (pingcap#7948) types: fix date time parse (pingcap#7933) ddl: just print error message when ddl job is normal to calcel, to eliminate noisy log (pingcap#7875) stats: update delta info for partition table (pingcap#7947) explaintest: add explain test for partition pruning (pingcap#7505) util: move disjoint set to util package (pingcap#7950) util: add PreAlloc4Row and Insert for Chunk and List (pingcap#7916) executor: add the slow log for commit (pingcap#7951) expression: add builtin json_keys (pingcap#7776) privilege: add USAGE in `show grants` for mysql compatibility (pingcap#7955) ddl: fix invailid ddl job panic (pingcap#7940) *: move ast.NewValueExpr to standalone parser_driver package (pingcap#7952) Make the ast package get rid of the dependency of types.Datum server: allow cors http request (pingcap#7939) *: move `Statement` and `RecordSet` from ast to sqlexec package (pingcap#7970) pr suggestion update executor/aggfuncs: split unit tests to corresponding file (pingcap#7993) store/tikv: fix typo (pingcap#7990) executor, planner: clone proj schema for different children in buildProj4Union (pingcap#7999) executor: let information_schema be the first database in ShowDatabases (pingcap#7938) stats: use local feedback for partition table (pingcap#7963) executor: add unit test for aggfuncs (pingcap#7966) server: add log for binary execute statement (pingcap#7987) admin: refine admin check decoder (pingcap#7862) executor: improve wide table insert & update performance (pingcap#7935) ddl: fix reassigned partition id in `truncate table` does not take effect (pingcap#7919) fix reassigned partition id in truncate table does not take effect add changelog for 2.1.0 rc4 (pingcap#8020) *: make parser package dependency as small as possible (pingcap#7989) parser: support `:=` in the `set` syntax (pingcap#8018) According to MySQL document, `set` use the = assignment operator, but the := assignment operator is also permitted stats: garbage collect stats for partition table (pingcap#7962) docs: add the proposal for the column pool (pingcap#7988) expression: refine built-in func truncate to support uint arg (pingcap#8000) stats: support show stats for partition table (pingcap#8023) stats: update error rate for partition table (pingcap#8022) stats: fix estimation for out of range point queries (pingcap#8015) *: move parser to a separate repository (pingcap#8036) executor: fix wrong result when index join on union scan. (pingcap#8031) Do not modify Plan of dataReaderBuilder directly, because it would impact next batch of outer rows, as well as other concurrent inner workers. Instead, build a local child builder to store the child plan. planner: fix a panic of a cached prepared statement with IndexScan (pingcap#8017) *: fix the issue of executing DDL after executing SQL failure in txn (pingcap#8044) * ddl, executor: fix the issue of executing DDL after executing SQL failure in txn add unit test remove debug info add like evaluator case sensitive test ddl, domain: make schema correct after canceling jobs (pingcap#7997) unit test fix code format proposal: maintaining histograms in plan. (pingcap#7605) support _tidb_rowid for table scan range (pingcap#8047) var rename fix
What problem does this PR solve?
Proposal about maintaining histogram information in plan's operators.
Check List
Tests
Side effects