Express Infix Clauses not in Select List #1597
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fixes (first issue) #1583, (second issue) #1598, (third issue) #1564
Also fixes #1580
Problem
First Issue
As explained in #1583, select elements that are infix clauses cannot be simply removed from a query when the are not being selected (at least if they are not pure). For example:
This typically occurs where multiple map clauses are combined into a single one. Since clearly this cannot be done for the aforementioned infix clauses, we need to express the chained map clauses as sub-super queries which Quill already does, however, when deciding which select-values should be in the sub-query Quill excludes the values of elements that are not in the outer select list. This is perfectly acceptable in normal cases e.g:
However not in situations where the are infix clauses being used as explained above.
For this reason, we need to find which infixes of every sub-query have not been expressed in the
expandSelect
(now refactored intoExpandSelect
) method and then put them back into the sub-query. This is tricky because these infixes could be inside of tuples or case classes whose sibling elements have been selected so blindly including allSelectValue
objects with infixes will cause duplicate fields to be selected. For this reason, we need to recursively traverse allSelectValue
objects containing case classes and tuples and extract the infix values inside.During the traversal, we need to keep track of which element inside the respective case class or tuple we are in and check if that element has already been expressed as a
SelectValue
because it matched someProperty
of the output. Since arbitrary things can be selected inside of arbitrary paths inside case-classes and tuples, we need to keep track of not only the order of elements in the initial selection (of the sub-query) but also which element of the respective tuple (or sub-tuple, or sub-sub tuple) the infix is in. For this reason, we created theOrderedSelect
object which keeps aList[Int]
that represents the "path" of an element inside a list of select values. For example, say you have a query like this:Once the
SqlQuery(ast)
has been created, it looks something like this:The "orders" of the elements are the following:
Order = 1 -
Property(Ident("p"), "id")
Order = 2 -
Tuple(List(Property(Ident("p"), "name"), Infix(List("RANK() OVER(ORDER BY ", ")"), List(Property(Ident("p"), "age")), false)
Order = 2,1 -
Property(Ident("p"), "name")
Order = 2,2 - Infix(List("RANK() OVER(ORDER BY ", ")")
Now, since we have already selected element 2,2 (i.e.
Property(Ident("p"), "name")
) we cannot just select element 2 (i.e. the entire tuple since that would cause thename
property to be selected twice (i.e.SELECT ... FROM (SELECT p.id, p.name, p.name, RANK() ...)
). For this reason, we need to search down into element 2 into 2,2 and then pull out the infix. Then we need to put this infix into the correct place in the resulting query.Second Issue
A related issue #1583 is where tuples are mapped to ad-hoc case classes and then are part of a nested select. This typically breaks because when in certain situations, multiple nested clauses exist in the AST (i.e.
Nested(Nested(q))
and sub-query fields cannot be attached to the corresponding elements because of how the nested query expansion works.This occurs:
This is due to the fact that there are multiple
Nested
clauses in the AST:This leads to an AST that looks like this:
Notice that we need to lookup the properties
_1
and_2
from thex
variable in the middle? This will obviously fail becausex
does not contain these values. The problem here is that the extra nested clause produces and incorrect expression between the_1
and_2
keys and thea
andb
select values to which they refer. Collapsing theNested(Nested(q))
inside ofSqlQuery
solves this problem.Third Issue
The third issue involves using an embedded entity inside of a query using
distinct
. In addition to potentially having a double nesting issue (Second Issue), this kind of query fails in theValidateSqlQuery
step because it's elements are not properly expanded. This occurs because root-level tuples in map-clauses and root-level ad-hoc case classes are not treated equivalently. Take for instance the following nested query that maps to a tuple:The SqlQuery gets expanded to the following:
Then take the following nested query that maps to an ad-hoc case class:
The SqlQuery gets expanded to the following:
Notice that in the former, the tuple has been flattened to an array of SelectValue elements as opposed to the latter has not.
The difference in behavior also has an impact on
ExpandNestedQueries
. Notice for instance that tuple indices are used to de-reference the Nth element of a given select:The reason why Quill behaves differently for Tuples and Ad-Hoc case classes is due to the treason that tuples are used as both a row-coproduct type i.e. most notably from applicative joins, as well as an element type from a standard map method. Due to having this double-meaning, Quill automatically expands element-type types inside of SqlQuery so that they behave the same was as coproduct-type tuples. Now when using Ad-Hoc case classes as coproduct-types (i.e. with the use of
Embedded
), some additional effort needs to be taken in order to expand them properly prior to verification inVerifySqlQuery
. This allowsVerifySqlQuery
to properly exclude sub-element identities (i.e. identities inside ofSelectValue(CaseClass(...))
elements).Potential Issues in Future
Are there situations where
SELECT ... FROM (SELECT C1, .... RANK() (...))
queries will change their results by the mere exclusion of a column thatC1
that is excluded from the outer select. If this is the case, there should exist a mode that dissallowsExpandSelect
from excluding ANY columns from sub-queries. This is fairly straightfoward to do now since we are basically doing this for infixes, whereas instead of just infixes, we would do it for all kinds of columns.Trace
Due to the complex nature of
ExpandNestedQueries
and the AST transformations in general, I have decided to add some tracing code that can give the user more insight into what is going on with these operations. This has been instrumented as a Interpolator so as to clearly distinguish itself from the surrounding code. Since these things are considered a "side effect" and are to be avoided in functional code (at least by some schools of thought), I have introduced various methods such asandReturn
andandContinune
to the trace Interpolator that should allow the user to keep code in the functional style, at least in some places.Conclusion
Since map-chained infixes are the most typical cause of nested queries, and distincts are a close-second, all three of these issues are closely related and require the same set of functionality in
ExpandNestedQueries
andVerifySqlQuery
in order to function properly. Therefore I have chosen to bundle them into a single PR.Checklist
README.md
if applicable[WIP]
to the pull request title if it's work in progresssbt scalariformFormat test:scalariformFormat
to make sure that the source files are formatted@getquill/maintainers