planner: put all cardinality estimation code into a single separate package #46358
Labels
epic/cardinality-estimation
the optimizer cardinality estimation
sig/planner
SIG: Planner
type/enhancement
The issue or PR belongs to an enhancement.
This was referenced Aug 24, 2023
Enhancement
Currently, the CE (cardinality estimation) code is coupled with other modules, and you can find it in:

1. `HistColl.Selectivity`, `HistColl.GetRowCountByXXX`, `HistColl.crossValidationSelectivity`, etc.
2. `fullJoinRowCountHelper.estimate`, `DataSource.getOriginalPhysicalTableScan`, etc.
3. `baseLogicalPlan.recursiveDeriveStats`, `LogicalPlan.DeriveStats`, etc.

This makes the module hard to maintain and evolve.
An ideal architecture is shown below, where the boundaries between the `CE` module and other modules are clear:

![image](https://private-user-images.githubusercontent.com/7499936/262632047-d23f1c9d-99f5-48eb-85df-1c6a3cce986f.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1ODcwNDUsIm5iZiI6MTczOTU4Njc0NSwicGF0aCI6Ii83NDk5OTM2LzI2MjYzMjA0Ny1kMjNmMWM5ZC05OWY1LTQ4ZWItODVkZi0xYzZhM2NjZTk4NmYucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTVUMDIzMjI1WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9YWZlNGRkMzE5YjUyMzgxYWM1MTJjOTBhMjk3YzU3Njg1ODg2ZTVmMGEyZDY0OTI0ZWJjZTRhZWU0Y2U0YzVhNSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.STqw1OjHrv8hMuI3YBJLVlsYL9r5vPy9sqX-hNGuKUA)

1. The statistics data structures: `Hist`, `TopN`, etc.
2. On top of `Stats`, the `CE` module provides some high-level interfaces to estimate cardinality for `CNF`, `Join`, `Agg`, etc. All estimation strategies should be put in this separate package, e.g.:

2.1. How to estimate for Join (fullJoinRowCountHelper.estimate)?
2.2. How to handle ModifyCnt?
2.3. How to handle out-of-range estimation (outOfRangeEQSelectivity)?
2.4. How to prioritize index statistics and column statistics?
2.5. How to use multiple sorts of statistics to make the estimation result more accurate (crossValidationSelectivity)?
2.6. ...
We decided to refactor (reorganize) the related code following the above design.
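
To make the intended boundary more concrete, here is a minimal Go sketch of what such a split could look like. It is only an illustration under stated assumptions, not the actual TiDB design: `StatsProvider`, `tableStats`, `Estimator`, `EqualSelectivity`, `CNFRowCount`, `JoinRowCount` and `defaultSelectivity` are all hypothetical names, and the formulas are the usual textbook ones (`1/NDV` for equality, column independence for CNF, `|L| * |R| / max(NDV)` for an equi-join), not the strategies TiDB actually implements.

```go
package main

import "fmt"

// StatsProvider is a hypothetical, data-only view of the Stats layer.
// It exposes statistics but contains no estimation strategy.
type StatsProvider interface {
	RowCount() float64
	NDV(column string) float64 // number of distinct values; 0 means unknown
}

// tableStats is a toy in-memory StatsProvider used only for this sketch.
type tableStats struct {
	rows float64
	ndv  map[string]float64
}

func (t tableStats) RowCount() float64         { return t.rows }
func (t tableStats) NDV(column string) float64 { return t.ndv[column] }

// defaultSelectivity is an assumed fallback when no statistics exist.
const defaultSelectivity = 0.1

// Estimator is the hypothetical entry point of the separate CE package.
// Every estimation strategy lives behind it; callers never touch raw stats.
type Estimator struct {
	stats StatsProvider
}

// EqualSelectivity estimates `column = const` as 1/NDV (textbook assumption).
func (e Estimator) EqualSelectivity(column string) float64 {
	ndv := e.stats.NDV(column)
	if ndv <= 0 {
		return defaultSelectivity
	}
	return 1 / ndv
}

// CNFRowCount estimates a conjunction of equality predicates on the given
// columns, assuming the columns are independent.
func (e Estimator) CNFRowCount(columns ...string) float64 {
	sel := 1.0
	for _, c := range columns {
		sel *= e.EqualSelectivity(c)
	}
	return e.stats.RowCount() * sel
}

// JoinRowCount estimates an equi-join as |L| * |R| / max(NDV_L, NDV_R).
// With no NDV information it falls back to the cross-product size.
func JoinRowCount(leftRows, rightRows, leftNDV, rightNDV float64) float64 {
	maxNDV := leftNDV
	if rightNDV > maxNDV {
		maxNDV = rightNDV
	}
	if maxNDV <= 0 {
		return leftRows * rightRows
	}
	return leftRows * rightRows / maxNDV
}

func main() {
	est := Estimator{stats: tableStats{
		rows: 1000,
		ndv:  map[string]float64{"a": 10, "b": 5},
	}}
	fmt.Println("CNF(a = ?, b = ?):", est.CNFRowCount("a", "b"))
	fmt.Println("Join:", JoinRowCount(1000, 200, 10, 5))
}
```

The point of the sketch is the dependency direction: the planner would call only the estimator, and the estimator would read statistics only through a narrow interface, so questions such as 2.1 to 2.5 above can be answered in one place.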