Populate a nulls fraction when NDV is known in TableScanStatsRule #15132

Merged
merged 1 commit into from
Nov 23, 2022

Conversation

raunaqmorarka
Member

@raunaqmorarka raunaqmorarka commented Nov 21, 2022

Description

If a connector provides the number of distinct values (NDV) for a column but not
its nulls fraction (e.g. Delta Lake for columns beyond the first
"delta.dataSkippingNumIndexedCols" columns, or MySQL), populate a guessed nulls
fraction of 10% so that the CBO can still produce estimates instead of failing to
estimate anything for lack of a nulls fraction.
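
For illustration, the fallback described above could be sketched roughly as follows. This is a minimal standalone sketch, not the code in this PR: the class `NullsFractionFallback`, the method `effectiveNullsFraction`, and the use of `OptionalDouble` to model a missing statistic are all hypothetical names chosen for the example. Only the 10% guess value comes from the description.

```java
import java.util.OptionalDouble;

// Hypothetical helper, not part of Trino: models "fill in a guessed nulls
// fraction when NDV is known but the nulls fraction is missing".
public final class NullsFractionFallback {
    // Guess value taken from the PR description (10%).
    static final double GUESSED_NULLS_FRACTION = 0.1;

    private NullsFractionFallback() {}

    static OptionalDouble effectiveNullsFraction(
            OptionalDouble distinctValuesCount,
            OptionalDouble nullsFraction) {
        // A nulls fraction reported by the connector always wins.
        if (nullsFraction.isPresent()) {
            return nullsFraction;
        }
        // NDV known, nulls fraction missing: fall back to the guess.
        if (distinctValuesCount.isPresent()) {
            return OptionalDouble.of(GUESSED_NULLS_FRACTION);
        }
        // Neither statistic is known: leave the nulls fraction unknown.
        return OptionalDouble.empty();
    }

    public static void main(String[] args) {
        // NDV only (the case this PR targets).
        System.out.println(effectiveNullsFraction(
                OptionalDouble.of(1000), OptionalDouble.empty()));
        // Both statistics present: the reported value is kept.
        System.out.println(effectiveNullsFraction(
                OptionalDouble.of(1000), OptionalDouble.of(0.25)));
        // No statistics at all: still unknown.
        System.out.println(effectiveNullsFraction(
                OptionalDouble.empty(), OptionalDouble.empty()));
    }
}
```

The guess is applied only when NDV is present, so columns with no statistics at all stay unknown rather than receiving a made-up value.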

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Delta
* Improve CBO estimates when the nulls fraction statistic is not available for some columns. ({issue}`15132`)

Member

@sopel39 sopel39 left a comment


lgtm % comments

@raunaqmorarka raunaqmorarka force-pushed the delta-nulls branch 3 times, most recently from 681dd91 to 3866153 Compare November 23, 2022 14:25
@JunhyungSong
Member

Minor comment: it would be better to spell out the full term rather than use an abbreviation (acronym) such as NDV.

3 participants