
SPARK_HOME detection supports Spark 4 #6413

Closed · pan3793 wants to merge 1 commit into master from pan3793:spark4-home

Conversation

@pan3793 (Member) commented May 23, 2024

Description

When `SPARK_HOME` is not set explicitly, the Kyuubi server supports detecting it based on Scala versions, but those detection rules are not applicable to Spark 4.

This PR enhances the `SPARK_HOME` detection logic to support both Spark 3 and Spark 4.

This logic is mainly used for testing purposes; the change does not affect users who configure `SPARK_HOME` in `kyuubi-env.sh`.
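For orientation, here is a minimal sketch of what such directory-based detection can look like. The two regexes mirror the patterns added in this PR's diff (shown further down), but the scanning helper and directory layout are illustrative assumptions, not the actual Kyuubi implementation.

```scala
import java.io.File

// A minimal sketch, assuming Spark tarballs are unpacked under `parent`;
// this is NOT the actual Kyuubi code, only an illustration of the idea.
object SparkHomeDetectionSketch {
  // Spark 3 distributions built for Scala 2.13 carry a "-scala2.13" suffix.
  private val spark3HomeScala213 =
    """^spark-3\.\d+\.\d+-bin-hadoop\d(\.\d+)?+-scala2\.13$""".r
  // Spark 4 builds only against Scala 2.13, so the suffix is gone; an
  // optional tag such as "-preview1" is allowed by the `(-\w*)?` group.
  private val spark4HomeScala213 =
    """^spark-4\.\d+\.\d+(-\w*)?-bin-hadoop\d(\.\d+)?+$""".r

  // Return the first directory whose name looks like a Spark distribution.
  def detectSparkHome(parent: File): Option[File] =
    Option(parent.listFiles()).getOrElse(Array.empty[File])
      .filter(_.isDirectory)
      .find { dir =>
        spark3HomeScala213.findFirstIn(dir.getName).isDefined ||
          spark4HomeScala213.findFirstIn(dir.getName).isDefined
      }
}
```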

Types of changes

  • [ ] Bugfix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)

Test Plan

Related Unit Tests

  • `SparkProcessBuilderSuite`

Checklist 📝

  • [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

Be nice. Be informative.

@@ -403,36 +403,43 @@ class SparkProcessBuilderSuite extends KerberizedTestHelper with MockitoSugar {

```scala
"spark-core_2.13-3.5.0-abc-20230921.jar",
"spark-core_2.13-3.5.0-xyz-1.2.3.jar",
"spark-core_2.13-3.5.0.1.jar",
"spark-core_2.13.2-3.5.0.jar").foreach { f =>
```
@pan3793 (Member, Author) commented on this hunk:

this actually is a negative case.

"""^spark-3\.\d+\.\d+-bin-hadoop\d(\.\d+)?+-scala2\.13$""".r

final private[kyuubi] val SPARK4_HOME_REGEX_SCALA_213 =
"""^spark-4\.\d+\.\d+(-\w*)?-bin-hadoop\d(\.\d+)?+$""".r
@pan3793 (Member, Author) commented on this hunk:

`SPARK_HOME` detection should be conservative since it is only used for testing purposes, and users/developers can always set `SPARK_HOME` explicitly to override it. `SPARK_CORE_SCALA_VERSION_REGEX`, by contrast, is used to detect the Scala version, so it should account for vendor-provided Spark distributions too.
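To illustrate that point, here is a hedged sketch of Scala-version detection from a `spark-core` jar name. The regex below is a hypothetical stand-in, not the actual `SPARK_CORE_SCALA_VERSION_REGEX`; the vendor-suffixed jar name is taken from the test data in the hunk above.

```scala
// Hypothetical pattern, NOT the real SPARK_CORE_SCALA_VERSION_REGEX:
// capture the Scala binary version between "spark-core_" and the first "-".
val scalaVersionRegex = """^spark-core_(2\.1[23])-.*\.jar$""".r

// A vendor-suffixed jar name still yields the Scala binary version.
"spark-core_2.13-3.5.0-abc-20230921.jar" match {
  case scalaVersionRegex(v) => println(s"Scala binary version: $v") // 2.13
  case _                    => println("no Scala version detected")
}

// "2.13.2" looks like a full Scala version rather than a binary version,
// so this hypothetical pattern rejects it.
assert(scalaVersionRegex.findFirstIn("spark-core_2.13.2-3.5.0.jar").isEmpty)
```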

@codecov-commenter commented

Codecov Report

Attention: Patch coverage is 81.81818%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 58.34%. Comparing base (4285abc) to head (20e71fd).

| Files | Patch % | Lines |
| --- | --- | --- |
| ...ache/kyuubi/engine/spark/SparkProcessBuilder.scala | 81.81% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
```
@@             Coverage Diff              @@
##             master    #6413      +/-   ##
============================================
+ Coverage     58.32%   58.34%   +0.01%
  Complexity       24       24
============================================
  Files           656      656
  Lines         40263    40265       +2
  Branches       5499     5498       -1
============================================
+ Hits          23482    23491       +9
+ Misses        14276    14269       -7
  Partials       2505     2505
```


@bowenliang123 (Contributor) left a comment:

LGTM. Well done.

@pan3793 pan3793 self-assigned this May 23, 2024
@pan3793 pan3793 added this to the v1.9.1 milestone May 23, 2024
@pan3793 pan3793 closed this in b89c185 May 23, 2024
@pan3793 (Member, Author) commented May 23, 2024

Thanks, merged to master/1.9

pan3793 added a commit that referenced this pull request May 23, 2024
Closes #6413 from pan3793/spark4-home.

20e71fd [Cheng Pan] SPARK_HOME detection supports Spark 4

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
(cherry picked from commit b89c185)
Signed-off-by: Cheng Pan <chengpan@apache.org>
@pan3793 pan3793 deleted the spark4-home branch June 6, 2024 18:12