Inconsistent predictions #11120

Open
Hayakawa94 opened this issue Dec 19, 2024 · 2 comments

@Hayakawa94

Hi

I need some help with an R xgboost model. I have built a claim severity model using the reg:gamma objective. When assessing the predictions, I noticed that predict returns different values when iterationrange = c(1,1) is specified than when it is omitted. A summary of the ratio between the two sets of predictions is below:

(predict(xgb_model,
         newdata = as.matrix(gbm.data %>% select(xgb_model$feature_names)),
         iterationrange = c(1, 1)) /
   predict(xgb_model,
           newdata = as.matrix(gbm.data %>% select(xgb_model$feature_names)))) %>%
  summary

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.6827  0.9903  0.9990  0.9987  1.0072  1.4092

Which of the two predictions is correct, and which one is used when computing SHAP values?

Thanks in advance

@Hayakawa94
Author

It appears that calling predict without specifying iterationrange = c(1,1) includes the trees added during the early stopping rounds in the predictions. For example, if nrounds = 100, early_stopping_rounds = 10, and the best iteration was reached at 80 trees, the predict function would use 90 trees instead of 80. Could someone clarify whether this would affect the SHAP computations?
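
The arithmetic in this example can be sketched in base R. This is a toy loop, not XGBoost's implementation: the validation scores are synthetic and were chosen so that the best round is 80. It only shows why a model that stopped early can hold more trees (90) than its best iteration (80). Which of the two counts predict uses may depend on the installed version.

```r
# Toy sketch of early stopping; NOT xgboost's internal code.
# The validation scores are synthetic: they improve until round 80, then worsen.
nrounds <- 100
early_stopping_rounds <- 10
val_score <- abs(seq_len(nrounds) - 80)  # hypothetical metric, lower is better

best_score <- Inf
best_iteration <- 0
trees_built <- 0
for (i in seq_len(nrounds)) {
  trees_built <- i                       # one tree is added per round
  if (val_score[i] < best_score) {
    best_score <- val_score[i]           # new best round
    best_iteration <- i
  } else if (i - best_iteration >= early_stopping_rounds) {
    break                                # no improvement for 10 rounds in a row
  }
}

cat("best iteration:", best_iteration, "\n")
cat("trees in model:", trees_built, "\n")
```

With these synthetic scores the loop stops at round 90, so the model holds 10 trees beyond the best iteration. Predictions made with all 90 trees and with the first 80 will generally differ, which would be consistent with the ratios reported above.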

@david-cortes
Contributor

If you are using the latest development version of XGBoost, or if you installed it from GitHub, note that the interpretation of iterationrange = c(1,1) has changed, and the docs have been updated to reflect the new behavior:

#' @param iterationrange Sequence of rounds/iterations from the model to use for prediction, specified by passing
