Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python-package] fix type annotations for eval result tracking #5793

Merged
merged 5 commits into from
Mar 30, 2023

Conversation

jameslamb
Copy link
Collaborator

@jameslamb jameslamb commented Mar 19, 2023

Contributes to #3756.
Contributes to #3867.
Follow-up to this discussion with @IdoKendo : #5672 (comment)

Fixes the following mypy errors.

callback.py:341: error: Argument 1 to "_format_eval_result" has incompatible type "Union[Tuple[str, str, float, bool], Tuple[str, str, float, bool, float], Any]"; expected "Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]"  [arg-type]
callback.py:341: error: Item "None" of "Optional[Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]]" has no attribute "__iter__" (not iterable)  [union-attr]
callback.py:346: error: Argument 2 to "EarlyStopException" has incompatible type "Optional[Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]]"; expected "Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]"  [arg-type]
callback.py:367: error: Argument 1 to "_format_eval_result" has incompatible type "Union[Tuple[str, str, float, bool], Tuple[str, str, float, bool, float], Any]"; expected "Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]"  [arg-type]
callback.py:367: error: Item "None" of "Optional[Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]]" has no attribute "__iter__" (not iterable)  [union-attr]
callback.py:371: error: Argument 2 to "EarlyStopException" has incompatible type "Optional[Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]]"; expected "Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]"  [arg-type]

@@ -55,7 +59,7 @@ def _format_eval_result(value: _EvalResultTuple, show_stdv: bool) -> str:
return f"{value[0]}'s {value[1]}: {value[2]:g}"
elif len(value) == 5:
if show_stdv:
return f"{value[0]}'s {value[1]}: {value[2]:g} + {value[4]:g}"
return f"{value[0]}'s {value[1]}: {value[2]:g} + {value[4]:g}" # type: ignore[misc]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These eval result tuples all have the same first four attributes:

(eval_name, metric_name, eval_result, is_higher_better)

In lgb.cv(), that eval_result is the mean of the metric across all cross-validation folds. In that case, these tuples can optionally contain a fifth value with the standard deviation of that metric across all cross-validation folds.

return [('cv_agg', k, np.mean(v), metric_type[k], np.std(v)) for k, v in cvmap.items()]

mypy can't tell automatically that being inside this if len(value) == 5: block means that the tuple has 5 elements, and raises the following error on this line:

callback.py:62: error: Tuple index out of range [misc]

🤷🏻

Comment on lines +363 to +366
if first_time_updating_best_score_list:
self.best_score_list.append(env.evaluation_result_list)
else:
self.best_score_list[i] = env.evaluation_result_list
Copy link
Collaborator Author

@jameslamb jameslamb Mar 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first time _EarlyStoppingCallback is called, it runs _EarlyStoppingCallback._init().

def __call__(self, env: CallbackEnv) -> None:
if env.iteration == env.begin_iteration:
self._init(env)

That _init() methods ensures that that self.best_score_list is the same length as CallbackEnv.evaluation_result_list by padding it with Nones.

for eval_ret, delta in zip(env.evaluation_result_list, deltas):
self.best_iter.append(0)
self.best_score_list.append(None)

In the rest of that initial call, and in every subsequent call of the callback, the code iterates over CallbackEnv.evaluation_result_list and putt the corresponding item from it into self.best_score_list if:

  • it's the very first time the callback was called (self.best_score_list[i] is None)
  • the new evaluation result is better than the previous one by at least some margin delta, which defaults to 0.0 (self.cmp_op[i](score, self.best_score[i])

https://github.com/microsoft/LightGBM/blob/2fe2bf0675ef127a29f9bc49511fc0099cf3e140/python-package/lightgbm/callback.py#L355-LL358

So those None entries in self.best_score_list aren't expected to be there when the first call of _EarlyStoppingCallback completes. They're not visible to user code, and nothing else in lightgbm relies on them.

HOWEVER... their brief presence in that list tells mypy "well sometimes there can be None in this list". And honestly, it confused me too.

This PR proposes making self.best_score_list easier for both mypy and humans to understand by removing that None-padding, and instead changing the update logic to "if it's the first time this list is being updated, build it up by .append()-ing every element of env.evaluation_result_list to it".

@jameslamb jameslamb changed the title WIP: [python-package] fix type annotations for eval result tracking [python-package] fix type annotations for eval result tracking Mar 19, 2023
@jameslamb jameslamb marked this pull request as ready for review March 19, 2023 02:34
@jameslamb jameslamb requested a review from guolinke March 19, 2023 02:35
@jameslamb
Copy link
Collaborator Author

@jmoralez @guolinke @shiyu1994 do you any of you have time to help out on some reviews with this and the other mypy PRs I have up this week?

I'm sorry to @ you... I'm trying to do this in small, easy-to-review pieces but some of the type-hinting things are fairly interrelated so it gets hard when too many unmerged changes stack up.

@jameslamb
Copy link
Collaborator Author

Thanks so much for the reviews @guolinke !

@jameslamb jameslamb merged commit 5f79626 into master Mar 30, 2023
@jameslamb jameslamb deleted the ci/mypy-best-score-list branch March 30, 2023 13:13
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues
including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants