-
Notifications
You must be signed in to change notification settings - Fork 609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(sql): add option to enable/disable select merging #9065
Conversation
I'm not sure why we should keep select merging around at all. Does it have any non-aesthetic effect or benefit? |
It reduces the query size. Also the user facing compiled query is much more readable with fusing enabled. |
We don't have any evidence that either of these things is worth the bugs we've already had to fix with merging selects, and are likely to continue having to fix if we keep it around. We do have a bunch of recent examples of correctness bugs with it enabled though and IMO correctness is more important than query size and readability of generated SQL. |
Why not wait for issues to come in for the others? If we never have to add these optimizations, then that's code we don't have to maintain/support/test, etc. |
I feel pretty strongly about this given all the time and effort we have spent to avoid these bugs from the previous implementation. It seems like if we keep merging around we're back to having to whack-a-mole incorrect optimizations yet again. Personally I don't want to spend time on that anymore. I think we're in a really great place with our core abstractions otherwise, and the work you've done on it has really moved the whole project forward. |
I'm not saying we should never explore this problem, only that we should avoid enabling optimizations--right before then next release--that we have seen are at high risk for correctness issues when we can simply disable them. |
This PR is about providing an option to enable/disable the select merges. I would prefer to discuss the default value in the relevant issue to not scatter the discussion. |
Closing in favor of #9070 |
I took the weekend to think about some of the issues here and did a bit of research on the SQL standard specification of predicate short-circuiting.
For example, an engine can evaluate a fallible expression first, even if a subsequent expression with short circuiting would avoid the fallible expression evaluation failing. In practice, DuckDB merges a fallible cast with an infallible operation in the same way producing the exact same plan regardless of how the SQL is specified:
This means that even if Ibis decided to generate subqueries that map one-to-one to SQL selects it provides absolutely zero guarantee that generating the SQL in that way produces specific behavior. IMO this means that Ibis should be free to merge predicates, assuming our implementation of that process is correct. Now, I'm not trying to gloss over the fact that we have had a number of issues with select merging, and so I think we should provide a way to disable it if it can potentially help unblock users. I think the opt-in behavior would be kind of pointless, since people will never opt-in without lots of documentation and if we're going to go that route we should disable it. On the other hand, if we think this is long-term a thing we want enabled, then we definitely need feedback on it from users. The only way to get that feedback AFAICT is to enable this feature. That does mean we have to accept that the current implementation may not be 100% finished and will elicit bug reports. The opt-out behavior IMO seems like an acceptable balance here: people can disable select merging on an as-needed basis, and we can continuously evaluate whether keeping it around is causing overhead that isn't worth it. Disabling select merging it turns out is also not without issue: SQLite fails due to a query we generate being too large for the parser stack, and MS SQL loses the ability to specify order by in many cases as well as +1 on merging this PR and cutting a release after #8917 is merged. |
Add a global option to enable or disable merging select statements where possible. This can serve as an escape route if the optimization fails. It is currently enabled by default, we may change to make it opt-in.
xref #9064 and #9058
Closes #9058.