-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for LIMIT
push down during query planning to reduce number of files scanned
#1495
Conversation
d4c13ad
to
72cbf9b
Compare
core/src/main/scala/org/apache/spark/sql/delta/stats/DataSkippingReader.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/sql/delta/stats/DataSkippingReader.scala
Outdated
Show resolved
Hide resolved
respond to PR comments
72cbf9b
to
53d4f1a
Compare
|
||
override def close(): Unit = { | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra line
Please make the title better. LimitPushdown into what? |
LIMIT
push down during query planning to reduce number of files scanned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm
@scottsand-db Is OFFSET also affected by this change? It seems plausible for optimization to occur, as files without the required data could be skipped. For example, if there were two files with 1000 rows each, and an attempt was made to read them with OFFSET 1001, the first file could be easily skipped using the metadata information. |
Good question. However, using |
@zsxwing What about z-ordered tables? I would assume that's a fairly common optimisation |
I'm wondering about a possible enhancement on top of this limit pushdown support. If the query specifies an |
Description
This PR adds support for limit pushdown, where we will "push down" any
LIMIT
s during query planning so that we scan the minimum number of files necessary.How was this patch tested?
New test suite.
Does this PR introduce any user-facing changes?
No.