Spark sql support #941
Conversation
Commits 1da8512 to 63fae7d
@getquill/maintainers this is ready for review
great work!
```scala
def probe(statement: String): Try[_] = Success(Unit)
```
```scala
val idiom = SparkDialect
val naming = Literal
```
Does it mean that using different naming strategies is not possible?
Yes, given that the new module uses spark encoders and they don't support naming strategies, only `Literal` can be used.
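To make the `Literal`-only constraint concrete, here is a self-contained sketch (the `NamingStrategy`, `Literal`, and `SnakeCase` objects below are simplified stand-ins, not Quill's actual traits): a naming strategy rewrites a field name in code into a column name in SQL, and only an identity mapping keeps column names aligned with the schema that Spark's case-class encoders derive.

```scala
object NamingSketch {
  // A naming strategy maps a field name in code to a column name in SQL.
  trait NamingStrategy { def column(name: String): String }

  // Literal keeps names untouched: "firstName" stays "firstName",
  // which matches the schema Spark derives from the case class.
  object Literal extends NamingStrategy {
    def column(name: String): String = name
  }

  // SnakeCase would emit "first_name", which no longer matches the
  // encoder-derived schema, so queries over a Dataset[Person] would break.
  object SnakeCase extends NamingStrategy {
    def column(name: String): String =
      name.replaceAll("([a-z])([A-Z])", "$1_$2").toLowerCase
  }

  def main(args: Array[String]): Unit = {
    println(Literal.column("firstName"))   // firstName
    println(SnakeCase.column("firstName")) // first_name
  }
}
```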
```diff
- val ((an, iAn, on), ont) = dealias(a, iA, o)((_, _, _))
- val ((bn, iBn, onn), _) = ont.dealias(b, iB, on)((_, _, _))
+ val ((an, iAn, on), _) = dealias(a, iA, o)((_, _, _))
+ val ((bn, iBn, onn), _) = dealias(b, iB, on)((_, _, _))
```
Could you give more context on these changes? I was debugging #939 and found that the root issue comes from this class. (This comment does not relate to the spark module, so we could continue this discussion somewhere else.)
This was actually another bug. It was using the same alias for the case where the same entity is on both sides of the join (`qr1.join(qr1)`). Each branch of the join should be treated as independent, so it doesn't make sense to use the transformer of the first one.
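A minimal illustration of the aliasing issue (a simplified model, not Quill's actual dealiasing transformer): if the second branch reused the first branch's state, both sides of `qr1.join(qr1)` could end up with the same alias, whereas dealiasing each branch independently yields distinct aliases.

```scala
object SelfJoinAliasSketch {
  final case class Table(name: String, alias: String)

  // Assign a fresh alias (x1, x2, ...) per join branch instead of
  // reusing the state produced by the first branch.
  private var counter = 0
  def freshAlias(t: Table): Table = {
    counter += 1
    t.copy(alias = s"x$counter")
  }

  def main(args: Array[String]): Unit = {
    val qr1 = Table("qr1", "")
    // qr1.join(qr1): each side is dealiased independently.
    val left  = freshAlias(qr1)
    val right = freshAlias(qr1)
    // The generated SQL can now tell the two occurrences apart:
    println(s"SELECT * FROM ${left.name} ${left.alias} JOIN ${right.name} ${right.alias}")
    // SELECT * FROM qr1 x1 JOIN qr1 x2
  }
}
```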
```scala
  org: User
)

object GithubExample extends App {
```
I think that we should create a SPARK.md that shows all the advantages of Quill over raw Spark, and add a link to this file in the README, since imo the existing example in the README does not show "Tired of the #scala #spark untyped madness?" :)
Good point. I'm actually planning to write a blog post based on this example. I think I'll wait for a release with the new module first, though.
```scala
q match {
  case q: Entity => Some(q)
  case q: Infix  => Some(q)
  case _         => None
```
How could this be related to other Quill modules? I suppose these changes are required to generate queries like `SELECT x1.age _1 FROM (?) x1 WHERE x1.name = ?`. However, does it impact the other modules somehow? e.g. does it add new opportunities?
I'd say this is a bug fix. Users should be able to use an infix in an entity position.
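For context, "an infix in an entity position" means using a raw SQL fragment where a table/entity would normally appear. A hedged sketch using Quill's documented `infix` interpolator (the mirror-context setup and the table name `people_table` are illustrative assumptions, not part of this PR):

```scala
import io.getquill._

object InfixEntitySketch {
  val ctx = new SqlMirrorContext(MirrorSqlDialect, Literal)
  import ctx._

  final case class Person(name: String, age: Int)

  // An infix fragment standing in for an entity. Before this fix,
  // flattening such a query could fail because only Entity was
  // accepted in this position; Infix now also qualifies.
  val people = quote {
    infix"select * from people_table".as[Query[Person]]
  }

  val q = quote { people.filter(p => p.age > 18) }
}
```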
LGTM
Hey man, this is really cool, glad to see more people taking type safety in Spark seriously! I was wondering if you've seen the Frameless project? We started with providing a type safe query for Spark SQL and have since started supporting other functionality as well. Great work!
@adelbertc Thanks for the feedback! :) It's indeed great to see type-safe solutions like Quill and Frameless for spark sql. I know Frameless, but I'm not a big fan of the `Dataset` API in general.
Fixes #93

Problem

Quill could be used as a more user-friendly way to use Spark's SQL engine. The `Dataset` API is unintuitive and has many untyped methods.

Solution

Create a new module that allows users to define queries using Quill's DSL to run on top of `Dataset`s and `RDD`s.

Notes

- `sbt-doge`: it didn't really solve the issue. I reverted the SBT 1.0 upgrade and used a hack using a system property.
- Not all features of `SqlContext` are supported by Spark. That's why the context extends directly from `Context`.

Checklist

- Update `README.md` if applicable
- Add `[WIP]` to the pull request title if it's work in progress
- Run `sbt scalariformFormat test:scalariformFormat` to make sure that the source files are formatted

@getquill/maintainers
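The solution described above amounts to a Quill context that compiles DSL queries down to Spark SQL over `Dataset`s. A hedged sketch of the intended usage (names such as `QuillSparkContext`, `liftQuery`, and `run` reflect my reading of the module; treat them as assumptions rather than the final API):

```scala
import org.apache.spark.sql.{ Dataset, SparkSession }
import io.getquill.QuillSparkContext._

final case class Person(name: String, age: Int)

object SparkQuillSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")
      .appName("quill-spark-sketch")
      .getOrCreate()
    implicit val sqlContext = spark.sqlContext
    import sqlContext.implicits._

    val people: Dataset[Person] =
      Seq(Person("Ana", 31), Person("Bob", 17)).toDS()

    // The Quill DSL is statically typed and compiles to Spark SQL,
    // instead of going through the untyped Dataset/Column operations.
    val adults: Dataset[Person] = run {
      liftQuery(people).filter(p => p.age > 18)
    }
    adults.show()
  }
}
```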