Incompatibility between scikit-learn and xgboost #11093

Open
piotrjacak opened this issue Dec 12, 2024 · 5 comments

@piotrjacak

piotrjacak commented Dec 12, 2024

I have xgboost 2.1.3 and scikit-learn 1.6.0.
After running this code:
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

grid_search = GridSearchCV(XGBClassifier(objective='binary:logistic'), param_grid, scoring='accuracy', cv=5, verbose=1)
grid_search.fit(X_train, y_train)

I got the following error:


AttributeError Traceback (most recent call last)

Cell In[103], line 6
5 grid_search = GridSearchCV(XGBClassifier(objective='binary:logistic'), param_grid, scoring='accuracy', cv=5, verbose=1)

----> 6 grid_search.fit(X_train, y_train)

File ~/lab3/lib/python3.11/site-packages/sklearn/base.py:1389, in _fit_context.<locals>.decorator.<locals>.wrapper(estimator, *args, **kwargs)
1382 estimator._validate_params()
1384 with config_context(
1385 skip_parameter_validation=(
1386 prefer_skip_nested_validation or global_skip_validation
1387 )
1388 ):
-> 1389 return fit_method(estimator, *args, **kwargs)

File ~/lab3/lib/python3.11/site-packages/sklearn/model_selection/_search.py:932, in BaseSearchCV.fit(self, X, y, **params)
928 params = _check_method_params(X, params=params)
930 routed_params = self._get_routed_params_for_fit(params)
--> 932 cv_orig = check_cv(self.cv, y, classifier=is_classifier(estimator))
933 n_splits = cv_orig.get_n_splits(X, y, **routed_params.splitter.split)
935 base_estimator = clone(self.estimator)

File ~/lab3/lib/python3.11/site-packages/sklearn/base.py:1237, in is_classifier(estimator)
1230 warnings.warn(
1231 f"passing a class to {print(inspect.stack()[0][3])} is deprecated and "
1232 "will be removed in 1.8. Use an instance of the class instead.",
1233 FutureWarning,
1234 )
1235 return getattr(estimator, "_estimator_type", None) == "classifier"
-> 1237 return get_tags(estimator).estimator_type == "classifier"

File ~/lab3/lib/python3.11/site-packages/sklearn/utils/_tags.py:405, in get_tags(estimator)
403 for klass in reversed(type(estimator).mro()):
404 if "__sklearn_tags__" in vars(klass):
--> 405 sklearn_tags_provider[klass] = klass.__sklearn_tags__(estimator) # type: ignore[attr-defined]
406 class_order.append(klass)
407 elif "_more_tags" in vars(klass):

File ~/lab3/lib/python3.11/site-packages/sklearn/base.py:540, in ClassifierMixin.__sklearn_tags__(self)
539 def __sklearn_tags__(self):
--> 540 tags = super().__sklearn_tags__()
541 tags.estimator_type = "classifier"
542 tags.classifier_tags = ClassifierTags()

AttributeError: 'super' object has no attribute '__sklearn_tags__'
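
For readers wondering how a `super()` call can fail like this, here is a minimal pure-Python sketch of one plausible reading of the last two frames. It assumes, without confirmation from this thread, that the mixin ends up after the base estimator class in the failing class's method resolution order. All class and function names below are hypothetical stand-ins, not scikit-learn or xgboost code.

```python
class Base:
    """Hypothetical stand-in for a base class that defines the tags method."""

    def tags(self):
        return {"estimator_type": None}


class ClassifierLike:
    """Hypothetical stand-in for a mixin whose tags method defers to super()."""

    def tags(self):
        tags = super().tags()
        tags["estimator_type"] = "classifier"
        return tags


class MixinLast(Base, ClassifierLike):
    """Mixin on the right: only object follows ClassifierLike in the MRO."""


class MixinFirst(ClassifierLike, Base):
    """Mixin on the left: Base follows ClassifierLike in the MRO."""


def collect_tags(obj):
    """Call each class's own tags() in reversed MRO order, like the loop in the traceback."""
    collected = {}
    for klass in reversed(type(obj).mro()):
        if "tags" in vars(klass):
            collected[klass.__name__] = klass.tags(obj)
    return collected


# With the mixin on the right, super() inside ClassifierLike.tags resolves to
# object, which has no tags method, so an AttributeError is raised.
try:
    collect_tags(MixinLast())
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# With the mixin on the left, super() resolves to Base and the call succeeds.
tags_by_class = collect_tags(MixinFirst())
```

In this toy model, swapping the order of the two parent classes is enough to make the error disappear. Whether the same holds for the real classes is not stated in this thread.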

@ShootingStarD

I get the same error when I finish fitting a model and then try to print it in a Jupyter notebook, as well as when I try to load a model.

@ShootingStarD

@piotrjacak Try using scikit-learn version 1.5.0. Does that solve your issue?

@piotrjacak
Author

Thank you, it helped. While trying to figure this out, I found another solution: I wrapped XGBClassifier in a class that inherits from sklearn's BaseEstimator and ClassifierMixin, then passed an instance of that class to GridSearchCV. I used the following code:

from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Mixins are listed to the left of BaseEstimator, as scikit-learn's developer guide recommends.
class SklearnXGBClassifier(ClassifierMixin, BaseEstimator):
    def __init__(self, **kwargs):
        self.model = XGBClassifier(**kwargs)

    def fit(self, X, y, **kwargs):
        self.model.fit(X, y, **kwargs)
        return self

    def predict(self, X):
        return self.model.predict(X)

    def predict_proba(self, X):
        return self.model.predict_proba(X)

    def get_params(self, deep=True):
        return self.model.get_params(deep)

    def set_params(self, **params):
        self.model.set_params(**params)
        return self

xgb = SklearnXGBClassifier(objective='binary:logistic')
grid_search = GridSearchCV(xgb, param_grid, scoring='accuracy', cv=5, verbose=1)
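
The wrapper above lets the search tune the inner model because `get_params` and `set_params` are forwarded to it. The following is a rough pure-Python sketch of that delegation, runnable without xgboost or scikit-learn. `InnerModel`, `Wrapper`, and `clone_like` are hypothetical stand-ins, and `clone_like` only imitates the "rebuild from `get_params()`" idea in a simplified way; it is not scikit-learn's actual `clone`.

```python
class InnerModel:
    """Hypothetical stand-in for XGBClassifier: it only stores keyword arguments."""

    def __init__(self, **kwargs):
        self._params = dict(kwargs)

    def get_params(self, deep=True):
        return dict(self._params)

    def set_params(self, **params):
        self._params.update(params)
        return self


class Wrapper:
    """Same delegation pattern as the SklearnXGBClassifier wrapper above."""

    def __init__(self, **kwargs):
        self.model = InnerModel(**kwargs)

    def get_params(self, deep=True):
        # Report the inner model's parameters as the wrapper's own.
        return self.model.get_params(deep)

    def set_params(self, **params):
        # Forward parameter updates to the inner model.
        self.model.set_params(**params)
        return self


def clone_like(estimator):
    """Simplified imitation of cloning: build a fresh object from get_params()."""
    return type(estimator)(**estimator.get_params())


base = Wrapper(objective="binary:logistic")

# One grid-search candidate: copy the base estimator, then apply grid values.
candidate = clone_like(base).set_params(max_depth=3)
```

Because the constructor accepts arbitrary keyword arguments and `get_params` returns exactly those arguments, the copy starts with the same settings as the original, and setting a grid value on the copy leaves the original untouched.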

@PhilippBach

We experienced the same issue. I think it's related to changes in the scikit-learn 1.6.0 release; see scikit-learn/scikit-learn#30122 and the recent release notes: https://scikit-learn.org/stable/whats_new/v1.6.html#sklearn-base

See issue DoubleML/doubleml-for-py#278 for DoubleML.

@trivialfis
Member

The fix is in the master branch, but it will take some time for us to make a new release. In the meantime, please keep sklearn at 1.5 or use the nightly XGB build.
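
For anyone who wants to detect the affected setup before a long job starts, here is a small sketch of a version guard. It assumes, based only on the reports in this thread, that scikit-learn 1.6 and later triggers the error with a released xgboost and that 1.5 does not. The helper names and the `(1, 6)` threshold are illustrative assumptions, and the thread does not say which xgboost release will contain the fix, so the xgboost version is only reported, not checked.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def sklearn_needs_pin(sklearn_version, threshold=(1, 6)):
    """True if the scikit-learn version is at or above the assumed problem threshold."""
    return parse_version(sklearn_version)[:2] >= threshold


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    sk = installed_version("scikit-learn")
    xgb = installed_version("xgboost")
    print(f"scikit-learn: {sk}, xgboost: {xgb}")
    if sk is not None and sklearn_needs_pin(sk):
        # One way to follow the advice above: pip install "scikit-learn<1.6"
        print("scikit-learn is 1.6 or newer; consider pinning it below 1.6.")
```

The check compares only the major and minor components, so pre-release strings such as `1.6.0rc1` are treated like `1.6`.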
