Shuffle both sides at the same time for md.merge #3041

Merged
merged 3 commits into from
May 18, 2022

Conversation

hekaisheng
Contributor

What do these changes do?

Currently, we shuffle the left and right inputs separately, so the reducer data for the two sides can end up on different workers. This causes additional data transfer when the merge operand is executed. This PR shuffles both left and right in a single shuffle, so that the reducer data for both sides is located on the same worker and the subsequent execution needs no transfer.
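
To illustrate the idea, here is a minimal pure-Python sketch of a shared shuffle. It is not the Mars implementation: `shuffle_together`, `merge_reducer`, the row-as-dict layout, and the plain `hash`-based partitioning are all hypothetical names and assumptions chosen for illustration. The point is only that when both sides are partitioned into the same set of reducers, each reducer already holds every left and right row it needs to join.

```python
from collections import defaultdict


def shuffle_together(left_chunks, right_chunks, key, n_reducers):
    """Hash-partition rows from both sides into ONE shared set of reducers.

    Hypothetical helper: each reducer bucket keeps its left and right rows
    side by side, so a later join needs no data from any other bucket.
    """
    reducers = [{"left": [], "right": []} for _ in range(n_reducers)]
    for side, chunks in (("left", left_chunks), ("right", right_chunks)):
        for chunk in chunks:
            for row in chunk:
                # Same partition function for both sides: equal keys
                # always land in the same reducer index.
                idx = hash(row[key]) % n_reducers
                reducers[idx][side].append(row)
    return reducers


def merge_reducer(bucket, key):
    """Inner-join the left and right rows held by a single reducer."""
    right_by_key = defaultdict(list)
    for right_row in bucket["right"]:
        right_by_key[right_row[key]].append(right_row)
    joined = []
    for left_row in bucket["left"]:
        for right_row in right_by_key.get(left_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


# Toy input: two chunks per side, integer join key "k".
left_chunks = [[{"k": 1, "a": "x"}, {"k": 2, "a": "y"}], [{"k": 3, "a": "z"}]]
right_chunks = [[{"k": 2, "b": 20}], [{"k": 3, "b": 30}, {"k": 4, "b": 40}]]

buckets = shuffle_together(left_chunks, right_chunks, "k", n_reducers=2)
merged = [row for bucket in buckets for row in merge_reducer(bucket, "k")]
```

Each bucket is joined using only its own rows, which is what allows a scheduler to run the join on the worker that already holds that bucket. With two separate shuffles, the left and right halves of a bucket could be placed on different workers and one of them would have to be moved first.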

Related issue number

Fixes #xxxx

Check code requirements

  • tests added / passed (if needed)
  • Ensure all linting tests pass, see here for how to run them

@hekaisheng hekaisheng added type: enhancement request mod: dataframe to be backported labels May 17, 2022
@hekaisheng hekaisheng added this to the v0.10.0a1 milestone May 17, 2022
@hekaisheng
Contributor Author

Because this PR changes the tile logic of merge, it affects assign time, which makes the asv benchmark fail in CI.

@hekaisheng hekaisheng marked this pull request as ready for review May 18, 2022 02:02
@hekaisheng hekaisheng requested a review from a team as a code owner May 18, 2022 02:02
Collaborator

@qinxuye qinxuye left a comment

LGTM overall.

mars/dataframe/merge/merge.py (outdated, resolved)
Collaborator

@qinxuye qinxuye left a comment

LGTM

Member

@wjsi wjsi left a comment

LGTM

@wjsi wjsi merged commit 3b0130e into mars-project:master May 18, 2022
@qinxuye qinxuye deleted the enh/optimize-merge branch May 18, 2022 10:08
hekaisheng added a commit to hekaisheng/mars that referenced this pull request May 23, 2022
@wjsi wjsi added backported already and removed to be backported labels May 23, 2022
3 participants