Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python-package] fix mypy errors about missing annotations and incompatible types #5672

Merged
merged 14 commits into from
Mar 9, 2023
Merged

[python-package] fix mypy errors about missing annotations and incompatible types #5672

merged 14 commits into from
Mar 9, 2023

Conversation

IdoKendo
Copy link
Contributor

@IdoKendo IdoKendo commented Jan 11, 2023

Contributes to #3867.
mypy currently raises the following errors:

python-package\lightgbm\callback.py:257: error: Need type annotation for "best_score" (hint: "best_score: List[<type>] = ...")  [var-annotated]
python-package\lightgbm\callback.py:258: error: Need type annotation for "best_iter" (hint: "best_iter: List[<type>] = ...")  [var-annotated]
python-package\lightgbm\callback.py:259: error: Need type annotation for "best_score_list" (hint: "best_score_list: List[<type>] = ...")  [var-annotated]
python-package\lightgbm\callback.py:260: error: Need type annotation for "cmp_op" (hint: "cmp_op: List[<type>] = ...")  [var-annotated]

These errors due to missing and mismatched annotations in the code. This PR proposes fixing these errors by aligning all the annotations in these areas of the code to share the same annotation.

Notes for Reviewers

This was tested by running mypy as documented in #3867.

mypy \
    --exclude='python-package/compile/|python-package/build' \
    --ignore-missing-imports \
    python-package/

@IdoKendo
Copy link
Contributor Author

@microsoft-github-policy-service agree

Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 Thanks so much for working on this @IdoKendo !

Could you please move the two basic.py changes and the eval_at tuple fixes to another PR? I totally support and understand those, and we could merge a PR with them immediately.

The other changes touch more complicated parts of the code and will take me a bit longer to review.

@IdoKendo
Copy link
Contributor Author

@jameslamb Sure thing, I moved those changes to #5674.
I'll continue working on the rest of the errors as well. LMK regarding the changes left in this PR.

@IdoKendo IdoKendo requested review from jameslamb and removed request for jmoralez, shiyu1994 and StrikerRUS January 12, 2023 07:24
Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this!

I agree with the predict() changes for the Dask interface. Can you please pull those out into a separate PR? I'll approve and merge that quickly.

Sorry to ask you to do that again, but please understand that we're not only interested in resolving these warnings, but also in understanding why mypy is raising them. Since the root cause of some might be bugs or undesirable behavior in lightgbm, not just mistakes in the type hints. For example... why does mypy think there's a code path that results in callback._format_eval_result() being called on a value that's None? Should we just eliminate that code path? I'd like to investigate concerns like that before approving that change.

@IdoKendo
Copy link
Contributor Author

@jameslamb Sure thing, I moved those to #5678.
I fully understand the investigation before approving!
I have some more fixes already lined up, should I add them to this PR then? Or would it be more comfortable for you to have them separately from the get-go?

@IdoKendo IdoKendo requested a review from jameslamb January 17, 2023 08:41
@jameslamb
Copy link
Collaborator

I have some more fixes already lined up, should I add them to this PR then?

Awesome! I'd appreciate more, smaller PRs for this, each focused on one part of the codebase. That'll allow us to merge in the changes that require less investigation quickly.

@jameslamb
Copy link
Collaborator

@IdoKendo Are you interested in pursuing this?

If so, please update this PR to latest master and I'll provide a review.

@IdoKendo
Copy link
Contributor Author

@IdoKendo Are you interested in pursuing this?

If so, please update this PR to latest master and I'll provide a review.

Yeah for sure, updated!

Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! Please see my next round of suggestions.

python-package/lightgbm/dask.py Outdated Show resolved Hide resolved
@@ -1174,7 +1174,7 @@ def fit( # type: ignore[override]
eval_init_score: Optional[List[_DaskCollection]] = None,
eval_metric: Optional[_LGBM_ScikitEvalMetricType] = None,
**kwargs: Any
) -> "DaskLGBMClassifier":
) -> "_DaskLGBMModel":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right fix. I'd expect DaskLGBMClassifier.fit() to return a DaskLGBMClassifier (which, by the way, is a child of _DaskLGBMModel).

But I understand why mypy thinks that's not happening... self._lgb_dask_fit() is inherited from parent class _DaskLGBMModel.

Instead of changing the return type hints, can you please try changing the body of these fit() methods so they're like this pseudo-code?

def fit(...) -> "DaskLGBMClassifier":
    self._lgb_dask_fit(...)
    return self

And see if:

  • those mypy errors about the return type go away
  • the unit tests all still pass

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed you've removed these Dask changes from this PR and they aren't included in the other mypy-related PR you opened (#5731).

Are you interested in trying them out? If not, I'll fix these issues related to lightgbm.dask return types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're doing # type: ignore[override] it seemed kind of redundant to me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not. No worries, I'll attempt this particlar fix.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #5756.

To add more details on why the presence of # type: ignore[override] doesn't make this concern redundant...that check and these errors about return types are about different things.

The [override] check is saying "hey, you told me this was a child of {ParentClass}, but it has a method with a different signature than {ParentClass}, which means it can't be automatically used anywhere that {ParentClass} is used".

The errors about return types are saying "hey you told me that this fit() method returns an instance of {ChildClass} but it's returning an instance of {ParentClass}". That's a different thing, and important to fix for at least 2 reasons:

  1. {ChildClass} might define additional attributes that {ParentClass} doesn't (e.g. extra configuration captured through the constructor)
  2. People writing code that depends on lightgbm may wish to perform checks like isinstance(model, lgb.DaskLGBMRegressor). Such checks would fail today if called on the result of fit(), since it's returning an internal class lgb._DaskLGBMModel.

python-package/lightgbm/sklearn.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! Especially for working through figuring out the signature of cmp_op.

Please make the one modification I've requested in this newest review, and update the list of mypy errors in the description.

@@ -16,7 +16,8 @@
_EvalResultDict = Dict[str, Dict[str, List[Any]]]
_EvalResultTuple = Union[
List[_LGBM_BoosterEvalMethodResultType],
List[Tuple[str, str, float, bool, float]]
List[Tuple[str, str, float, bool, float]],
None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't include None in this type hint. That makes it difficult to use in other parts of the code where None isn't a valid value.

python-package/lightgbm/callback.py Outdated Show resolved Hide resolved
@IdoKendo IdoKendo requested a review from jameslamb March 1, 2023 12:20
_log_info(f"Evaluated only: {eval_name_splitted[-1]}")
raise EarlyStopException(self.best_iter[i], self.best_score_list[i])
best_score_list = self.best_score_list[i]
if best_score_list is not None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this new refactoring that you pushed (pulling out this item of self.best_score_list and checking if it's None) necessary? If the purpose of the PR is to address 4 warnings that are just "need a type annotation"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we remove the None hint from _EvalResultTuple then this part introduces another error.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please tell me what the new error is that this refactoring helps to avoid?

Changing logic requires more review effort than just-changing-type-hints types of changes. I'm happy to provide that, but I want to understand first what feedback from mypy led to this proposed change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These errors:

python-package\lightgbm\callback.py:369: error: Item "None" of "Optional[Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]]" has no attribute "__iter__" (not iterable)  [union-attr]
python-package\lightgbm\callback.py:373: error: Argument 2 to "EarlyStopException" has incompatible type "Optional[Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]]"; expected "Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]"  [arg-type]

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm I'm struggling to understand that.

I don't understand why mypy thinks that the type at this point is

Optional[Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]]

I understand how best_score_list could have a None as a list element, but now how it could itself be None.

Anyway, I'll try to review this when I can. Now that the logic is changing, it'll take longer to review. Sorry about that.

Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need some time to investigate the proposed changes for best_score_list, sorry.

But I 100% support the changes for best_score, best_iter, and cmp_op. Could you please put those up in a separate PR? I'll approve and merge that quickly.

@IdoKendo
Copy link
Contributor Author

IdoKendo commented Mar 9, 2023

I still need some time to investigate the proposed changes for best_score_list, sorry.

But I 100% support the changes for best_score, best_iter, and cmp_op. Could you please put those up in a separate PR? I'll approve and merge that quickly.

Sure, I think I'll do it the other way and just move the refactor to a different PR, keeping the types in this one.

@IdoKendo IdoKendo requested a review from jameslamb March 9, 2023 11:52
@jameslamb
Copy link
Collaborator

Ok that works, thank you!

Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

@jameslamb jameslamb merged commit 90a4510 into microsoft:master Mar 9, 2023
@IdoKendo IdoKendo deleted the python-package/fix-mypy-errors branch March 9, 2023 15:20
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues
including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants