[R] Add training metrics monitoring for xgboost() #10747

Merged: 3 commits, Nov 7, 2024

Conversation

david-cortes
Contributor

ref #9810

I just realized that the new xgboost() interface was not doing anything with the verbose parameter. This PR adds an option to use the same x/y training data for metrics monitoring, which gets activated when passing a non-silent verbosity level.

@@ -932,11 +935,16 @@ xgboost <- function(

fn_dm <- if (use_qdm) xgb.QuantileDMatrix else xgb.DMatrix
dm <- do.call(fn_dm, lst_args$dmatrix_args)
evals <- list()
Member

Hmm, I missed this the last time. How does one use validation with this X/y interface?

Contributor Author

There's no validation. This is just monitoring on the same training data.

I guess evaluation on a different dataset could be added later, but it would need to be more involved: for example, it would need to re-encode categorical columns in the evaluation data using the same categories as in 'x', check whether the classes in 'y' match, and so on.
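
To illustrate the re-encoding step described above, here is a minimal base-R sketch. It is not code from this PR: the helper name `align_eval_factors` and the toy data are hypothetical, and it assumes that categorical columns are stored as factors and that both data frames have the same columns.

```r
# Hypothetical helper (not part of xgboost): re-encode factor columns of
# `eval_x` so they use the same levels, in the same order, as in `x`.
# Assumes `eval_x` contains every column of `x`.
align_eval_factors <- function(x, eval_x) {
  for (col in names(x)) {
    if (is.factor(x[[col]])) {
      # Values never seen in `x` have no integer code there, so they become NA.
      eval_x[[col]] <- factor(as.character(eval_x[[col]]), levels = levels(x[[col]]))
    }
  }
  eval_x
}

x <- data.frame(color = factor(c("red", "green", "blue")), size = c(1, 2, 3))
eval_x <- data.frame(color = factor(c("blue", "purple")), size = c(4, 5))

eval_aligned <- align_eval_factors(x, eval_x)

# The factor levels (and therefore the integer codes) now agree with `x`.
levels(eval_aligned$color)
as.integer(eval_aligned$color)  # "purple" does not occur in `x`, so it is NA

# The class labels of the evaluation 'y' can be checked against 'y' as well.
y <- factor(c("a", "b", "a"))
eval_y <- factor(c("b", "c"))
unseen <- setdiff(levels(eval_y), levels(y))
if (length(unseen) > 0) {
  message("Evaluation labels not present in training 'y': ", toString(unseen))
}
```

How unseen categories or labels should be handled (NA, error, or something else) is a design decision this sketch does not settle.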

Member

Thank you for the explanation. It might be necessary though, considering validation is pretty much the only effective way of keeping ML training in check. I doubt that anyone who does ML training, aside from DL, goes without validation during training.

Contributor Author

There's xgb.cv though. There could be an xgboost.cv (or cv.xgboost) to do it better, or something like 'eval_data_fraction' (better yet if implemented in the core library), but I think those could be added later, after the CRAN release blockers are addressed.
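
As a rough sketch of what an 'eval_data_fraction'-style split could look like on the R side, the base-R snippet below holds out a random fraction of rows. The helper `split_eval_rows` is hypothetical, and no such argument is confirmed to exist in xgboost(); the built-in `mtcars` data is used only for illustration.

```r
# Hypothetical helper: pick a random fraction of row indices for evaluation.
# `eval_data_fraction` mirrors the name floated in the discussion above.
# Note that set.seed() changes the global RNG state.
split_eval_rows <- function(n_rows, eval_data_fraction, seed = 123L) {
  stopifnot(eval_data_fraction > 0, eval_data_fraction < 1)
  set.seed(seed)
  # Hold out at least one row.
  n_eval <- max(1L, as.integer(floor(n_rows * eval_data_fraction)))
  eval_idx <- sort(sample.int(n_rows, n_eval))
  list(train = setdiff(seq_len(n_rows), eval_idx), eval = eval_idx)
}

idx <- split_eval_rows(nrow(mtcars), eval_data_fraction = 0.25)

# Training part, which would play the role of 'x' and 'y'.
x_train <- mtcars[idx$train, -1]
y_train <- mtcars$mpg[idx$train]

# Held-out part, which would be used only for metric evaluation.
x_eval <- mtcars[idx$eval, -1]
y_eval <- mtcars$mpg[idx$eval]
```

A plain random split like this ignores stratification by class and any grouping in the data, which is part of why doing it in the core library might be preferable.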

@trivialfis trivialfis merged commit 8fa2dbb into dmlc:master Nov 7, 2024
25 of 30 checks passed