-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[python-package] fix mypy errors about missing annotations and incompatible types #5672
[python-package] fix mypy errors about missing annotations and incompatible types #5672
Conversation
@microsoft-github-policy-service agree |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 Thanks so much for working on this @IdoKendo !
Could you please move the two basic.py
changes and the eval_at
tuple fixes to another PR? I totally support and understand those, and we could merge a PR with them immediately.
The other changes touch more complicated parts of the code and will take me a bit longer to review.
@jameslamb Sure thing, I moved those changes to #5674. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this!
I agree with the predict()
changes for the Dask interface. Can you please pull those out into a separate PR? I'll approve and merge that quickly.
Sorry to ask you to do that again, but please understand that we're not only interested in resolving these warnings, but also in understanding why mypy
is raising them. Since the root cause of some might be bugs or undesirable behavior in lightgbm
, not just mistakes in the type hints. For example... why does mypy
think there's a code path that results in callback._format_eval_result()
being called on a value that's None
? Should we just eliminate that code path? I'd like to investigate concerns like that before approving that change.
@jameslamb Sure thing, I moved those to #5678. |
Awesome! I'd appreciate more, smaller PRs for this, each focused on one part of the codebase. That'll allow us to merge in the changes that require less investigation quickly. |
@IdoKendo Are you interested in pursuing this? If so, please update this PR to latest |
Yeah for sure, updated! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! Please see my next round of suggestions.
python-package/lightgbm/dask.py
Outdated
@@ -1174,7 +1174,7 @@ def fit( # type: ignore[override] | |||
eval_init_score: Optional[List[_DaskCollection]] = None, | |||
eval_metric: Optional[_LGBM_ScikitEvalMetricType] = None, | |||
**kwargs: Any | |||
) -> "DaskLGBMClassifier": | |||
) -> "_DaskLGBMModel": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the right fix. I'd expect DaskLGBMClassifier.fit()
to return a DaskLGBMClassifier
(which, by the way, is a child of _DaskLGBMModel
).
But I understand why mypy
thinks that's not happening... self._lgb_dask_fit()
is inherited from parent class _DaskLGBMModel
.
Instead of changing the return type hints, can you please try changing the body of these fit()
methods so they're like this pseudo-code?
def fit(...) -> "DaskLGBMClassifier":
self._lgb_dask_fit(...)
return self
And see if:
- those
mypy
errors about the return type go away - the unit tests all still pass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed you've removed these Dask changes from this PR and they aren't included in the other mypy-related PR you opened (#5731).
Are you interested in trying them out? If not, I'll fix these issues related to lightgbm.dask
return types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're doing # type: ignore[override]
it seemed kind of redundant to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not. No worries, I'll attempt this particlar fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #5756.
To add more details on why the presence of # type: ignore[override]
doesn't make this concern redundant...that check and these errors about return types are about different things.
The [override]
check is saying "hey, you told me this was a child of {ParentClass}
, but it has a method with a different signature than {ParentClass}
, which means it can't be automatically used anywhere that {ParentClass}
is used".
The errors about return types are saying "hey you told me that this fit()
method returns an instance of {ChildClass}
but it's returning an instance of {ParentClass}
". That's a different thing, and important to fix for at least 2 reasons:
{ChildClass}
might define additional attributes that{ParentClass}
doesn't (e.g. extra configuration captured through the constructor)- People writing code that depends on
lightgbm
may wish to perform checks likeisinstance(model, lgb.DaskLGBMRegressor)
. Such checks would fail today if called on the result offit()
, since it's returning an internal classlgb._DaskLGBMModel
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! Especially for working through figuring out the signature of cmp_op
.
Please make the one modification I've requested in this newest review, and update the list of mypy
errors in the description.
python-package/lightgbm/callback.py
Outdated
@@ -16,7 +16,8 @@ | |||
_EvalResultDict = Dict[str, Dict[str, List[Any]]] | |||
_EvalResultTuple = Union[ | |||
List[_LGBM_BoosterEvalMethodResultType], | |||
List[Tuple[str, str, float, bool, float]] | |||
List[Tuple[str, str, float, bool, float]], | |||
None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't include None
in this type hint. That makes it difficult to use in other parts of the code where None
isn't a valid value.
python-package/lightgbm/callback.py
Outdated
_log_info(f"Evaluated only: {eval_name_splitted[-1]}") | ||
raise EarlyStopException(self.best_iter[i], self.best_score_list[i]) | ||
best_score_list = self.best_score_list[i] | ||
if best_score_list is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this new refactoring that you pushed (pulling out this item of self.best_score_list
and checking if it's None
) necessary? If the purpose of the PR is to address 4 warnings that are just "need a type annotation"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we remove the None hint from _EvalResultTuple
then this part introduces another error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please tell me what the new error is that this refactoring helps to avoid?
Changing logic requires more review effort than just-changing-type-hints types of changes. I'm happy to provide that, but I want to understand first what feedback from mypy
led to this proposed change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These errors:
python-package\lightgbm\callback.py:369: error: Item "None" of "Optional[Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]]" has no attribute "__iter__" (not iterable) [union-attr]
python-package\lightgbm\callback.py:373: error: Argument 2 to "EarlyStopException" has incompatible type "Optional[Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]]"; expected "Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]" [arg-type]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm I'm struggling to understand that.
I don't understand why mypy
thinks that the type at this point is
Optional[Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]]
I understand how best_score_list
could have a None
as a list element, but now how it could itself be None
.
Anyway, I'll try to review this when I can. Now that the logic is changing, it'll take longer to review. Sorry about that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need some time to investigate the proposed changes for best_score_list
, sorry.
But I 100% support the changes for best_score
, best_iter
, and cmp_op
. Could you please put those up in a separate PR? I'll approve and merge that quickly.
Sure, I think I'll do it the other way and just move the refactor to a different PR, keeping the types in this one. |
Ok that works, thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
This pull request has been automatically locked since there has not been any recent activity since it was closed. |
Contributes to #3867.
mypy
currently raises the following errors:These errors due to missing and mismatched annotations in the code. This PR proposes fixing these errors by aligning all the annotations in these areas of the code to share the same annotation.
Notes for Reviewers
This was tested by running mypy as documented in #3867.