Refine MarsDMatrix & support more parameters for XGB classifier and regressor #2498

Merged: 5 commits merged on Oct 9, 2021

@qinxuye (Collaborator) commented on Oct 8, 2021

What do these changes do?

This PR does a few things:

  1. Refine `MarsDMatrix` to use the yield mechanism instead of calling `execute`.
  2. Support `base_margin` for `MarsDMatrix`.
  3. Process `evals` correctly for XGBoost `train`.
  4. Support more parameters, such as `base_margin` and `base_margin_eval_set`, in `fit` for the XGB classifier and regressor.
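
For readers unfamiliar with `base_margin`: in XGBoost it is a per-row starting score, expressed in raw margin space (log-odds for a logistic objective), that the boosted trees then add to. The sketch below only illustrates that idea in plain Python. It is not Mars or XGBoost code, and `predict_proba`, `tree_outputs`, and the leaf values are hypothetical names and numbers made up for the example. It assumes a binary logistic objective and that a margin of 0.0 stands in for the case where no base margin is supplied.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic link: map a raw margin (log-odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(tree_outputs, base_margin=0.0):
    """Hypothetical helper, not a library API.

    Adds the row's base margin to the sum of per-tree raw scores,
    then applies the logistic link.
    """
    raw_margin = base_margin + sum(tree_outputs)
    return sigmoid(raw_margin)


# Made-up raw scores that three trees assign to one row.
tree_outputs = [0.4, -0.1, 0.3]

p_default = predict_proba(tree_outputs)                   # no prior offset
p_shifted = predict_proba(tree_outputs, base_margin=1.0)  # positive prior offset
```

With identical tree outputs, the row that starts from a positive margin ends with a higher predicted probability, which is why a base margin has to be carried alongside each evaluation set as well as the training set.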

Related issue number

@qinxuye added the `type: enhancement`, `to be backported`, and `mod: learn` labels on Oct 8, 2021
@qinxuye added this to the v0.8.0b2 milestone on Oct 8, 2021
@wjsi (Member) left a comment:

LGTM

@hekaisheng (Contributor) left a comment:

LGTM

@hekaisheng hekaisheng merged commit e367ea3 into mars-project:master Oct 9, 2021
@qinxuye qinxuye deleted the enh/xgb-evals branch October 9, 2021 05:59
qinxuye pushed a commit to qinxuye/mars that referenced this pull request Oct 9, 2021
wjsi pushed a commit that referenced this pull request Oct 9, 2021
@qinxuye added the `backported already` label and removed the `to be backported` label on Oct 9, 2021
chaokunyang added a commit to chaokunyang/mars that referenced this pull request May 31, 2022
Merge branch merge_github_2524 of git@gitlab.alipay-inc.com:ray-project/mars.git into master
https://code.alipay.com/ray-project/mars/pull_requests/58?tab=diff

Signed-off-by: 捕牛 <hejialing.hjl@antgroup.com>


* [Ray] Support reconstructing worker (mars-project#2413)


* Make cmdline support third party modules (mars-project#2454)

Co-authored-by: hanguang <zhusiyuan.zsy@alibaba-inc.com>
* Support visualizing subtask graphs on Mars Web (mars-project#2426)


* Fix timeout error when waiting for a submitted task (mars-project#2457)


* Print the error message when error happens in `TaskProcessor` (mars-project#2458)


* Add nightly builds for docker images (mars-project#2456)


* Fix misuse of `name` parameter in DataFrame align (mars-project#2469)


* Fix hang when start sub pool fails (mars-project#2468)


* Refine and unify subtask detail APIs (mars-project#2465)


* Fix coverage for Azure pipeline (mars-project#2474)


* Split tileable information and subtask graph into two tabs (mars-project#2480)


* Support specified vineyard socket and skip the launching vineyardd process (mars-project#2481)


* Basic reschedule subtask (mars-project#2467)


* Compatible with scikit-learn 1.0 (mars-project#2486)

Co-authored-by: hekaisheng <kaisheng.hks@alibaba-inc.com>
* Fix wrong translation in cluster deployment. (mars-project#2489)


* Fix bug that failed to execute query when there are multiple arguments (mars-project#2490)


* Include tileable property in detail api (mars-project#2493)


* Fix version of statsmodels to pass CI (mars-project#2497)


* Implements `glm.LogisticRegression` (mars-project#2466)


* Implements bagging sampling (mars-project#2496)


* Refine MarsDMatrix & support more parameters for XGB classifier and regressor (mars-project#2498)


* Fix output of df.groupby(as_index=False).size() (mars-project#2507)


* Add preliminary implementations for ufunc methods (mars-project#2510)


* Add doc for reading csv in oss (mars-project#2514)


* [Ray] Fix serializing lambdas in web (mars-project#2512)


* Add `make_regression` support for learn module (mars-project#2515)


* Fix reduction result on empty series (mars-project#2520)


* Fix df.loc when df is empty (mars-project#2524)


* fix start subpool

* fix test_kill_and_wait_timeout

* fix autoscale timeout

* fix ray larger cluster fixture

* Update ci ray package to 1.2.2

* remove python3.6 3.8 3.9 ut and upgrade ray 3.7 image

* echo python path

* fix json decode error

* fix bundle release timeout

* fix remove placement group timeout

* fix no_restart

* fix ci

* fix autoscale
3 participants