
[SPARK-49569][BUILD][FOLLOWUP] Exclude spark-connect-shims from sql/core module #48403

Closed · wants to merge 3 commits

Conversation

@LuciferYang (Contributor) commented Oct 9, 2024

What changes were proposed in this pull request?

This PR excludes spark-connect-shims from the sql/core module to further fix the Maven daily test.
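Concretely, the fix amounts to adding an exclusion on the dependency that drags the shims onto the test classpath. A sketch of what such an exclusion looks like in sql/core/pom.xml (the exact dependency entry and property names here are illustrative assumptions, not the literal diff):

```xml
<!-- Sketch: keep spark-connect-shims off sql/core's test classpath by
     excluding it from the spark-sql-api test-jar that pulls it in -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql-api_${scala.binary.version}</artifactId>
  <version>${project.version}</version>
  <type>test-jar</type>
  <scope>test</scope>
  <exclusions>
    <exclusion>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-connect-shims_${scala.binary.version}</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```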

Why are the changes needed?

To fix the Maven daily test:

After #48399, although the Maven build was successful in my local environment, the Maven daily test pipeline still failed to build:

Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala:121: value makeRDD is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:75: value id is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.columnar.CachedBatch]
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:82: value env is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:88: value env is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:185: value parallelize is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:481: value cleaner is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:500: value parallelize is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:940: value addSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:943: value listenerBus is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:947: value removeSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1667: value listenerBus is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1668: value addSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1673: value partitions is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1674: value listenerBus is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1682: value partitions is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1683: value listenerBus is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1687: value removeSparkListener is not a member of org.apache.spark.SparkContext
Error: ] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:1708: value partitions is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]
...

After checking with the mvn dependency:tree command, I found that sql/core transitively introduces org.apache.spark:spark-connect-shims_2.13:jar:4.0.0-SNAPSHOT:test through org.apache.spark:spark-sql-api_2.13:test-jar:tests:4.0.0-SNAPSHOT:test.

[INFO] ------------------< org.apache.spark:spark-sql_2.13 >-------------------
[INFO] Building Spark Project SQL 4.0.0-SNAPSHOT                        [18/42]
[INFO]   from sql/core/pom.xml
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- dependency:3.6.1:tree (default-cli) @ spark-sql_2.13 ---
[INFO] org.apache.spark:spark-sql_2.13:jar:4.0.0-SNAPSHOT
...
[INFO] +- org.apache.spark:spark-catalyst_2.13:test-jar:tests:4.0.0-SNAPSHOT:test
[INFO] +- org.apache.spark:spark-sql-api_2.13:test-jar:tests:4.0.0-SNAPSHOT:test
[INFO] |  +- org.scala-lang.modules:scala-parser-combinators_2.13:jar:2.4.0:compile
[INFO] |  +- org.apache.spark:spark-connect-shims_2.13:jar:4.0.0-SNAPSHOT:test
[INFO] |  +- org.antlr:antlr4-runtime:jar:4.13.1:compile
[INFO] |  +- org.apache.arrow:arrow-vector:jar:17.0.0:compile
[INFO] |  |  +- org.apache.arrow:arrow-format:jar:17.0.0:compile
[INFO] |  |  +- org.apache.arrow:arrow-memory-core:jar:17.0.0:compile
[INFO] |  |  +- com.fasterxml.jackson.datatype:jackson-datatype-jsr310:jar:2.18.0:compile
[INFO] |  |  \- com.google.flatbuffers:flatbuffers-java:jar:24.3.25:compile
[INFO] |  \- org.apache.arrow:arrow-memory-netty:jar:17.0.0:compile
[INFO] |     \- org.apache.arrow:arrow-memory-netty-buffer-patch:jar:17.0.0:compile
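The tree above can be reproduced with an invocation along these lines (a sketch; the -pl and -Dincludes flags just narrow the output to the relevant module and artifact):

```shell
# Print sql/core's dependency tree, filtered to the shims artifact.
# Run from the Spark source root; build/mvn is Spark's bundled Maven wrapper.
build/mvn -pl sql/core dependency:tree \
  -Dincludes=org.apache.spark:spark-connect-shims_2.13
```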

This is unexpected.

Does this PR introduce any user-facing change?

No

How was this patch tested?

All Maven tests passed.

Was this patch authored or co-authored using generative AI tooling?

No

@@ -306,6 +306,12 @@ jobs:
name: unit-tests-log-${{ matrix.modules }}-${{ matrix.comment }}-${{ matrix.java }}-${{ matrix.hadoop }}-${{ matrix.hive }}
path: "**/target/*.log"

maven-test:
@LuciferYang (Contributor, Author) commented on the diff:

Let's make another attempt to confirm that the GA (GitHub Actions) build succeeds. Once it builds successfully, we will delete this piece of code.

@LuciferYang (Contributor, Author)

All Maven tests passed.

@github-actions github-actions bot removed the INFRA label Oct 10, 2024
@LuciferYang LuciferYang changed the title Test Maven Build [SPARK-49569][BUILD][FOLLOWUP] Exclude spark-connect-shims from sql/core module Oct 10, 2024
@LuciferYang (Contributor, Author)

cc @hvanhovell @HyukjinKwon @dongjoon-hyun FYI

@HyukjinKwon (Member)

I think it needs @hvanhovell's look

@LuciferYang (Contributor, Author)

> I think it needs @hvanhovell's look

OK

@LuciferYang (Contributor, Author)

Merged into master. Thanks @hvanhovell and @HyukjinKwon

@LuciferYang (Contributor, Author) commented Oct 11, 2024

After merging this PR, the Maven daily test has been restored.

I just thought of a question that needs confirmation: in which directory should spark-connect-shims.jar be located in the distribution? Currently, after running dev/make-distribution.sh --tgz, it ends up in the jars directory but not in the jars/connect-repl directory. Is this expected? @hvanhovell @HyukjinKwon

@HyukjinKwon (Member) commented Oct 11, 2024

@LuciferYang (Contributor, Author)

FWIW, I think it's gonna fix https://github.com/apache/spark/actions/runs/11259624487/job/31309026637 too

Not successful? It seems that there is still a class conflict caused by classpath issues.

@HyukjinKwon (Member)

actually yeah seems like all REPL are broken after Maven build. Taking a look.

@LuciferYang (Contributor, Author)

> actually yeah seems like all REPL are broken after Maven build. Taking a look.

We should move spark-connect-shims.jar from jars to jars/connect-repl.
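Such a relocation in the distribution layout could be sketched like this (the function name and the unpacked-distribution argument are illustrative assumptions; the real fix would land in dev/make-distribution.sh):

```shell
# Sketch: move spark-connect-shims from jars/ to jars/connect-repl/
# inside an unpacked Spark distribution, so it only reaches the
# Connect REPL classpath. move_shims is a hypothetical helper.
move_shims() {
  dist="$1"
  mkdir -p "$dist/jars/connect-repl"
  mv "$dist"/jars/spark-connect-shims_*.jar "$dist/jars/connect-repl/"
}
```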

@HyukjinKwon (Member)

@LuciferYang mind creating a PR when you find some time? 🙏

@LuciferYang (Contributor, Author)

> @LuciferYang mind creating a PR when you find some time? 🙏

OK ~
