-
Notifications
You must be signed in to change notification settings - Fork 169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: Enable Comet shuffle in Spark SQL tests #210
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #210 +/- ##
============================================
- Coverage 33.41% 33.32% -0.09%
- Complexity 768 769 +1
============================================
Files 107 107
Lines 36329 37037 +708
Branches 7935 8106 +171
============================================
+ Hits 12138 12342 +204
- Misses 21643 22098 +455
- Partials 2548 2597 +49 ☔ View full report in Codecov by Sentry. |
b58a3e8
to
7ec1500
Compare
3662dab
to
77216a5
Compare
8d4d8a4
to
11acec2
Compare
11acec2
to
04c4aed
Compare
cc @viirya @advancedxy this PR is ready now |
@@ -1238,6 +1392,9 @@ index ed2e309fa07..3767d4e7ca4 100644 | |||
+ conf | |||
+ .set("spark.comet.exec.enabled", "true") | |||
+ .set("spark.comet.exec.all.enabled", "true") | |||
+ .set("spark.shuffle.manager", | |||
+ "org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager") | |||
+ .set("spark.comet.exec.shuffle.enabled", "true") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about comet columnar shuffle? I think we are prioritizing that as the first class shuffle support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is still pending on the work to make it as default. Once it becomes default, all the tests will automatically switch to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, that makes sense.
Other changes lgtm. Only about whether we should enable columnar shuffle by default or not. |
|
||
- test("SPARK-35886: PromotePrecision should be subexpr replaced") { | ||
+ test("SPARK-35886: PromotePrecision should be subexpr replaced", | ||
+ IgnoreComet("TODO: fix Comet for this test")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be forgotten easily. Can we create ticket for that?
|
||
test("SPARK-38204: flatMapGroupsWithState should require StatefulOpClusteredDistribution " + | ||
- "from children - without initial state") { | ||
+ "from children - without initial state", IgnoreComet("TODO: fix Comet for this test")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this.
test("SPARK-38204: flatMapGroupsWithState should require ClusteredDistribution " + | ||
- "from children if the query starts from checkpoint in 3.2.x - without initial state") { | ||
+ "from children if the query starts from checkpoint in 3.2.x - without initial state", | ||
+ IgnoreComet("TODO: fix Comet for this test")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
test("SPARK-38204: flatMapGroupsWithState should require ClusteredDistribution " + | ||
- "from children if the query starts from checkpoint in prior to 3.2") { | ||
+ "from children if the query starts from checkpoint in prior to 3.2", | ||
+ IgnoreComet("TODO: fix Comet for this test")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto.
|
||
case class Result(key: Long, count: Int) | ||
|
||
+// TODO: fix Comet to enable this suite |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, all this suite fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. I suggested that we can create tickets for the failed tests.
I created #231 to track these failed tests. |
Merged. Thanks. |
* test: Enable Comet shuffle in Spark SQL tests * disable some tests * disable another test * update * update
* test: Enable Comet shuffle in Spark SQL tests * disable some tests * disable another test * update * update
Which issue does this PR close?
Closes #195.
Rationale for this change
Currently the Spark SQL tests do not enable Comet shuffle yet, which causes it to miss quite a lot test coverage. This PR enables shuffle for them. It uses the default shuffle mechanism for now (Comet native shuffle).
I have to disable a few tests that are failing probably due to some bugs in Comet itself. Once this is merged, I'll create separate Github issues to track them.
What changes are included in this PR?
Enable Comet shuffle for Spark SQL tests.
How are these changes tested?