Quat Enhancements to Support Needed Spark Use Cases #2010
Merged
Spark has a use case that allows Quats on abstract types even for static queries. Here is a pattern in which I have discovered this possibility:
Now attempt the following query:
Firstly, this will not be expanded correctly, since the ApplyMap phase incorrectly removes the Map element around the FlatJoin of this query. Once that minor issue is fixed, however, there is a larger one: the outer nested field of the `c` member 'child' only knows about the `id` field.

This is problematic since the Child object needs both a name and an id field for Spark to be able to encode the type back (notice how the actual return type is `Quoted[Query[SomeChild]]`).

The above behavior happens because when the quats of this query are introspected, they infer the quat from the `SomeChild` type as opposed to the `Child` type. This can be surmised based on the Quat of the query being created:

In order to fix this issue, we need to identify which child quats are abstract and which are not. The `Quat.Product.Type.Concrete` and `Quat.Product.Type.Abstract` fields have been created for this purpose. Using this logic, the AST has been changed to this (i.e. CCA represents an abstract case class).

Furthermore, we add a check in the SimpleNestedExpansion to not expand identifiers whose type is abstract, even if they are the only field in a selection.
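To make the abstract/concrete distinction concrete, here is a minimal sketch of the idea. It does not use Quill's real classes: `QuatSketch`, `ProductType`, `canExpand`, and the two example values are hypothetical stand-ins, and the real check in SimpleNestedExpansion may well look different.

```scala
// Simplified stand-ins for illustration only; these are NOT Quill's real classes.
object QuatSketch {
  // Marks whether a product quat came from a concrete or an abstract type.
  sealed trait ProductType
  object ProductType {
    case object Concrete extends ProductType
    case object Abstract extends ProductType
  }

  sealed trait Quat
  object Quat {
    case object Value extends Quat
    final case class Product(fields: List[(String, Quat)], tpe: ProductType) extends Quat
  }

  // Hypothetical helper: may an identifier with this quat be expanded
  // field by field? Abstract products may carry fields the quat does not
  // know about, so they are left unexpanded.
  def canExpand(quat: Quat): Boolean = quat match {
    case Quat.Product(_, ProductType.Abstract) => false
    case Quat.Product(_, ProductType.Concrete) => true
    case _                                     => false // leaf values have no fields
  }

  // An abstract quat that only knows about `id`, and a concrete one that
  // knows about both fields.
  val abstractChild: Quat =
    Quat.Product(List("id" -> Quat.Value), ProductType.Abstract)
  val concreteChild: Quat =
    Quat.Product(List("name" -> Quat.Value, "id" -> Quat.Value), ProductType.Concrete)
}
```

In this sketch, `QuatSketch.canExpand(QuatSketch.abstractChild)` is `false` even though the quat has a single known field, which mirrors the rule that abstract identifiers are not expanded even when they are the only field in a selection.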
Also, since these kinds of idents (i.e. with an abstract Quat.Product) now need to be expanded with a star both in a struct (e.g. `Ident(a)` => `struct(a.*)`) and as plain star selects if they are on the top level of a FlattenSqlQuery (e.g. `Ident(a)` => `a.*`), modifications need to be made to the sqlQueryTokenizer in SparkDialect.

Finally, it is important to note that since quats can now be abstract, it is perfectly possible to select a property from a quat that does not actually exist. For this reason, we have changed the Property AST element to not throw an exception when a field is looked up that does not exist on the quat inside. Instead it returns Quat.Unknown. This type of Quat is also useful to identify situations where `inferQuat` has received a field it does not know anything about.
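The two behaviors described above (star expansion of abstract idents and a non-throwing field lookup) can be sketched roughly as follows. This is an illustrative model only: `QuatLookupSketch`, `lookup`, `tokenizeIdent`, and the `isAbstract` flag are hypothetical names, not the actual Property AST element or the SparkDialect tokenizer.

```scala
// Simplified stand-ins for illustration only; these are NOT Quill's real classes.
object QuatLookupSketch {
  sealed trait Quat {
    // Look up a field. A field that is not known to the quat yields
    // Quat.Unknown instead of throwing an exception.
    def lookup(field: String): Quat = this match {
      case Quat.Product(fields, _) => fields.getOrElse(field, Quat.Unknown)
      case _                       => Quat.Unknown // assumption made for this sketch
    }
  }
  object Quat {
    case object Value extends Quat
    case object Unknown extends Quat
    final case class Product(fields: Map[String, Quat], isAbstract: Boolean) extends Quat
  }

  // Hypothetical tokenizer helper: an ident with an abstract product quat
  // becomes `a.*` at the top level of a select and `struct(a.*)` when nested.
  // Every other ident is emitted by name.
  def tokenizeIdent(name: String, quat: Quat, topLevel: Boolean): String =
    quat match {
      case Quat.Product(_, true) =>
        if (topLevel) s"$name.*" else s"struct($name.*)"
      case _ => name
    }

  // An abstract quat that only knows about `id`.
  val child: Quat = Quat.Product(Map("id" -> Quat.Value), isAbstract = true)
}
```

Here `QuatLookupSketch.child.lookup("name")` returns `Quat.Unknown` rather than failing, and `tokenizeIdent("a", child, topLevel = false)` produces `struct(a.*)`.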