Fix XGBoost when some workers do not have evals data #2861

Merged
merged 2 commits into from
Mar 24, 2022

Collaborator

@qinxuye qinxuye commented Mar 24, 2022

What do these changes do?

Currently, when evals is specified for xgb train and the dtrain data spans 3 workers while the evals data exists on only 2 of them, training becomes very slow and CPU usage stays low: at most 1 CPU can be used. This PR fixes that issue.
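
The data layout described above can be pictured with a small, self-contained sketch. It is only an illustration of the scenario, not the code changed in this PR: the `(worker, chunk_id)` pairs and the helpers `group_chunks_by_worker` and `workers_missing_evals` are hypothetical names made up for this example, and no Mars or XGBoost API is used.

```python
from collections import defaultdict


def group_chunks_by_worker(train_chunks, eval_chunks):
    """Group train and eval chunks by the worker that holds them.

    Both arguments are lists of (worker, chunk_id) pairs; this layout is an
    assumption made for the example, not how Mars stores chunk metadata.
    """
    plan = defaultdict(lambda: {"train": [], "evals": []})
    for worker, chunk in train_chunks:
        plan[worker]["train"].append(chunk)
    for worker, chunk in eval_chunks:
        plan[worker]["evals"].append(chunk)
    return dict(plan)


def workers_missing_evals(plan):
    """Return workers that hold training data but no evals data."""
    return sorted(
        worker
        for worker, parts in plan.items()
        if parts["train"] and not parts["evals"]
    )


# dtrain spans 3 workers, evals exists on only 2 of them.
train_chunks = [("worker-0", "t0"), ("worker-1", "t1"), ("worker-2", "t2")]
eval_chunks = [("worker-0", "e0"), ("worker-1", "e1")]

plan = group_chunks_by_worker(train_chunks, eval_chunks)
print(workers_missing_evals(plan))  # ['worker-2']
```

A check like this only identifies the workers affected by the layout in the description; how the PR makes training behave correctly on them is in the diff itself.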

In addition, this PR copies the latest tracker.py from the official xgboost repo.

Related issue number

Fixes #2860.

Check code requirements

  • tests added / passed (if needed)
  • Ensure all linting tests pass, see here for how to run them

@qinxuye qinxuye added the type: bug, to be backported, and mod: learn labels Mar 24, 2022
@qinxuye qinxuye added this to the v0.9.0rc2 milestone Mar 24, 2022
@qinxuye qinxuye requested review from hekaisheng and wjsi as code owners March 24, 2022 03:53
@qinxuye qinxuye changed the title Fix XGBoost when some workers do not have evals data Fix XGBoost when some workers do not have evals data Mar 24, 2022
Member

@wjsi wjsi left a comment

LGTM

Contributor

@hekaisheng hekaisheng left a comment

LGTM

@hekaisheng hekaisheng merged commit e12963d into mars-project:master Mar 24, 2022
@qinxuye qinxuye deleted the bugfix/xgb branch March 24, 2022 07:50
qinxuye pushed a commit to qinxuye/mars that referenced this pull request Mar 24, 2022
@qinxuye qinxuye added the backported already label and removed the to be backported label Mar 25, 2022
Development

Successfully merging this pull request may close these issues.

[BUG]xgb train exception in py 3.9.7