[CELEBORN-1413] Support Spark 4.0 #2813

Closed
wants to merge 5 commits into from

Contributor

@FMX FMX commented Oct 15, 2024

What changes were proposed in this pull request?

Support the Spark 4.0.0 preview release.

Why are the changes needed?

  1. Change Scala to 2.13.
  2. Introduce a columnar shuffle module for Spark 4.0.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Cluster test.

@FMX FMX changed the title from "[CELEBORN-1413] Support spark 4.0" to "[CELEBORN-1413] Support Spark 4.0" on Oct 15, 2024

github-actions bot commented Nov 5, 2024

This PR is stale because it has been open 20 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the stale label Nov 5, 2024
Contributor Author

FMX commented Nov 12, 2024

Will update this PR next week.

@github-actions github-actions bot removed the stale label Nov 13, 2024
@FMX FMX force-pushed the b1413 branch 7 times, most recently from 6e99713 to 044f892 on November 28, 2024 06:20
Contributor Author

FMX commented Nov 28, 2024

The dependency check CI failure is not related to this PR. I'll fix it in a separate PR.

Contributor Author

FMX commented Nov 28, 2024

The Spark-4.0 profile can only be compiled in a JDK 17+ environment.

@SteNicholas
Member

I have already tested the Spark 4.0 integration in our internal Spark 4.0 environment, and it worked well.
[Four screenshots attached]

Member

@SteNicholas SteNicholas left a comment

LGTM.

Contributor

@RexXiong RexXiong left a comment

Contributor

@mridulm mridulm left a comment

QQ: Why are we moving spark-3 to spark-3-4?
It is possible that the spark-4 implementation ends up evolving nontrivially over time, in a manner that conflicts with spark-3.

Do we want to simply keep spark-3 (which eventually goes away when support for spark-3.5 is dropped) and start with a new spark-4?

I understand the duplication could be nontrivial. I wonder whether there is a middle ground for 3.5 vs. 4 that minimizes it.

This would also minimize the impact on existing spark-3 users (from a dependency point of view).

Contributor

@mridulm mridulm left a comment

Took a quick pass; I will try to go over it in more detail later, during the holidays.
But please don't block on my review.

Contributor Author

FMX commented Dec 23, 2024

> QQ: Why are we moving spark-3 to spark-3-4? It is possible that the spark-4 implementation ends up evolving nontrivially over time, in a manner that conflicts with spark-3.
>
> Do we want to simply keep spark-3 (which eventually goes away when support for spark-3.5 is dropped) and start with a new spark-4?
>
> I understand the duplication could be nontrivial. I wonder whether there is a middle ground for 3.5 vs. 4 that minimizes it.
>
> This would also minimize the impact on existing spark-3 users (from a dependency point of view).

The main reason for adding a module shared between Spark 3.5 and Spark 4 is that the shuffle APIs are stable. The unstable APIs live in the columnar shuffle module, which is separate from the Spark 3.5 module.

@RexXiong RexXiong closed this in fde6365 Dec 24, 2024