Implement SQLancer (an end-to-end SQL fuzz testing library) #11030
Comments
Thank you @2010YOUY01. Sounds like a great idea to me -- I have created a datafusion_contrib repo for this work in case you would like to put it there: https://github.com/datafusion-contrib/datafusion-sqllancer
This is the first interesting bug found: #11248
Nice!
The initial implementation is done (with ~10 bugs found 👀). There is a lot of work that can be done to find more bugs; any contributions are welcome!
This is really nice work @2010YOUY01 -- thank you so much.
Filed #11430 to note this in the docs. Also posted on Twitter: https://twitter.com/andrewlamb1111/status/1811725290801963475 Thanks again @2010YOUY01
PARTITION BY caused panic in 'tokio-runtime-worker' (SQLancer) #12057
I would like to help with this; I'm big into tests.
@rluvaton Thank you! I'm still interested in this project (though I haven’t been working on it for a few months 😅) and I'm happy to help with any contributions. I think we can start with a few easier tasks:
I still have some local changes that haven’t been pushed to https://github.com/datafusion-contrib/datafusion-sqlancer. I will update it and ensure it works with the latest version of DataFusion next week.
This would be great. We currently have some tests running only on commits to main that we could potentially extend: https://github.com/apache/datafusion/blob/main/.github/workflows/extended.yml
Is your feature request related to a problem or challenge?
I noticed that SQLancer, an awesome SQL fuzzing framework, can be implemented for DataFusion. It has been able to detect many bugs even in PostgreSQL and SQLite.
Update:
Implementation is now at datafusion-sqlancer
Supported SQL Features
JOINs, ORDER BY, WHERE, HAVING clause
target_partition, prefer_hash_join, etc.
Supported Test Oracles
Note: most oracles only apply to a subset of the available query types. For advanced SQL features like window functions, we can only generate random queries and report crashes.
More context on the test oracles below is available at https://github.com/sqlancer/sqlancer/tree/main
How SQLancer works in short
It uses JDBC to do SQL-level testing.
SQLancer has 5 logic check oracles. One of them works like this: a consistency check generates Q1 (very likely to be optimized by predicate pushdown) and Q2 (hard to optimize) and compares their results, so the test focuses on the correctness of the optimizer. There are 5 such test oracles available to be implemented, and these carefully designed checks make this testing framework really powerful.
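The Q1/Q2 consistency check described above can be sketched as follows. This is a minimal illustration in the style of the NoREC oracle, using Python's built-in sqlite3 module as a stand-in engine (an assumption made so the example runs anywhere; it is not DataFusion and not SQLancer's actual implementation). The helper name norec_check, the table t0, and the predicate are hypothetical.

```python
import sqlite3


def norec_check(conn, table, predicate):
    """Run one predicate in an optimizable and a hard-to-optimize form.

    Returns (optimized_count, unoptimized_count); a mismatch hints at a
    logic bug in the engine. Hypothetical helper, for illustration only.
    """
    cur = conn.cursor()

    # Q1: predicate sits in WHERE, so the engine is free to optimize it
    # (for example via predicate pushdown or index use).
    q1 = f"SELECT COUNT(*) FROM {table} WHERE {predicate}"
    optimized = cur.execute(q1).fetchone()[0]

    # Q2: predicate is evaluated as a projected expression for every row,
    # which leaves far less room for optimization. NULL counts as not true.
    q2 = f"SELECT CASE WHEN ({predicate}) THEN 1 ELSE 0 END FROM {table}"
    unoptimized = sum(row[0] for row in cur.execute(q2))

    return optimized, unoptimized


# Tiny hand-written table; a real fuzzer would generate schema, rows,
# and predicates at random.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t0 (c0 INTEGER, c1 TEXT)")
conn.executemany(
    "INSERT INTO t0 VALUES (?, ?)",
    [(1, "a"), (2, None), (None, "c"), (4, "d")],
)

optimized, unoptimized = norec_check(conn, "t0", "c0 > 1 AND c1 IS NOT NULL")
if optimized != unoptimized:
    print(f"possible logic bug: Q1={optimized}, Q2={unoptimized}")
```

The two queries are semantically equivalent, so any difference between the counts points at the engine rather than at the test. Rows where the predicate evaluates to NULL are deliberately included, since three-valued logic is a common source of such mismatches.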
Describe the solution you'd like
I plan to implement SQLancer on DataFusion (starting with a specific test oracle, NoREC, which requires less engineering effort).
For now, a minimal subset of SQL features is implemented. It hasn't detected any logical bugs yet; only 2 bad-input bugs in some scalar functions have shown up.
(Will share the code once it is cleaned up)
If there are any features (SQL clauses / data types / specific functions) you would like to see tested further, I can implement them first :)
Describe alternatives you've considered
SQLsmith looks like another popular choice, but I haven't looked into it carefully yet. If it only generates random SQL to test whether the system will crash, then SQLancer should be a more comprehensive tool.
Additional context
SQLancer's page has several papers and YouTube talk recordings available.