Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug(query): behavior not like MySQL when set LIMIT 0 #16399

Closed
1 of 2 tasks
scbizu opened this issue Sep 5, 2024 · 6 comments · Fixed by #16403
Closed
1 of 2 tasks

bug(query): behavior not like MySQL when set LIMIT 0 #16399

scbizu opened this issue Sep 5, 2024 · 6 comments · Fixed by #16403
Assignees
Labels
A-query Area: databend query C-bug Category: something isn't working C-mysql Category: MySQL

Comments

@scbizu
Copy link

scbizu commented Sep 5, 2024

Search before asking

  • I had searched in the issues and found no similar issues.

Version

v1.2.629-nightly

What's Wrong?

As MySQL documented:

LIMIT 0 quickly returns an empty set. This can be useful for checking the validity of a query. It can also be employed to obtain the types of the result columns within applications that use a MySQL API that makes result set metadata available.

In databend , when set LIMIT 0 (or LIMIT 0,0) , the EXPLAIN (I paste it in Slack) said databend still performs full table data scan instead of only extracting the metadata . Some of client-drivers and SQL frameworks use this feature to check SQL validation and do extract the metadata to the ORM-based structure . With too much data scanning , the performance will step down in this case.

How to Reproduce?

SELECT ... FROM  my_table WHERE 1=1 LIMIT 0,0;

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
@scbizu scbizu added the C-bug Category: something isn't working label Sep 5, 2024
@dosubot dosubot bot added A-query Area: databend query C-mysql Category: MySQL labels Sep 5, 2024
Copy link

dosubot bot commented Sep 5, 2024

Hey @scbizu! I'm here to help you with any bugs, questions, or contributions you have for databend. Let's work together to solve this issue!

I found a similar closed issue: Support Output MySQL format, which was last updated on May 30, 2024. This issue was referenced by another issue: tracing issues about OutputFormat & result compare of tests, also last updated on May 30, 2024. The suggested solution was to refactor DFQueryResultWriter into output MySQL format like Tsv/Csv/Parquet [1].

Regarding the specific behavior of Databend when using LIMIT 0, Databend performs a full table data scan instead of only extracting metadata. This is because its query execution plan includes a TableScan operation that reads rows from the table, even with a LIMIT clause. This behavior is different from MySQL, which can optimize such queries to only extract metadata without scanning the table data [2][3][4].

Additionally, the source code for the Limit physical plan in Databend shows that the Limit plan is built on top of the input plan, which includes the TableScan operation. The Limit plan does not inherently skip the data scan but rather applies the limit after the data has been read. This design choice ensures that the query execution adheres to the specified limits but does not avoid the initial data scan [4].

This behavior might impact performance, especially for client-drivers and SQL frameworks that use this feature to check SQL validation and extract metadata.

To continue talking to Dosu, mention @dosu.

@TCeason
Copy link
Collaborator

TCeason commented Sep 5, 2024

Maybe you need ignore result?

SELECT number FROM numbers(2) IGNORE_RESULT;

https://docs.databend.com/sql/sql-commands/query-syntax/query-select#ignore_result

@BohuTANG
Copy link
Member

BohuTANG commented Sep 5, 2024

This is a optimizer issue? cc @xudong963

@BohuTANG
Copy link
Member

BohuTANG commented Sep 5, 2024

Maybe you need ignore result?

SELECT number FROM numbers(2) IGNORE_RESULT;

https://docs.databend.com/sql/sql-commands/query-syntax/query-select#ignore_result

No, this still scan full tables but ignore the result.

@sundy-li
Copy link
Member

sundy-li commented Sep 5, 2024

Join is blocking operator, so limit return early will not be reached.

limit 0 should be optimized it's input to be empty result scan.

@Dousir9
Copy link
Member

Dousir9 commented Sep 5, 2024

Join is blocking operator, so limit return early will not be reached.

limit 0 should be optimized it's input to be empty result scan.

Yes.

@xudong963 xudong963 self-assigned this Sep 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-query Area: databend query C-bug Category: something isn't working C-mysql Category: MySQL
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants