[python-package] fix type annotations for eval result tracking #5793

jameslamb · 2023-03-19T01:56:19Z

Contributes to #3756.
Contributes to #3867.
Follow-up to this discussion with @IdoKendo : #5672 (comment)

Fixes the following mypy errors.

callback.py:341: error: Argument 1 to "_format_eval_result" has incompatible type "Union[Tuple[str, str, float, bool], Tuple[str, str, float, bool, float], Any]"; expected "Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]"  [arg-type]
callback.py:341: error: Item "None" of "Optional[Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]]" has no attribute "__iter__" (not iterable)  [union-attr]
callback.py:346: error: Argument 2 to "EarlyStopException" has incompatible type "Optional[Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]]"; expected "Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]"  [arg-type]
callback.py:367: error: Argument 1 to "_format_eval_result" has incompatible type "Union[Tuple[str, str, float, bool], Tuple[str, str, float, bool, float], Any]"; expected "Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]"  [arg-type]
callback.py:367: error: Item "None" of "Optional[Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]]" has no attribute "__iter__" (not iterable)  [union-attr]
callback.py:371: error: Argument 2 to "EarlyStopException" has incompatible type "Optional[Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]]"; expected "Union[List[Tuple[str, str, float, bool]], List[Tuple[str, str, float, bool, float]]]"  [arg-type]

jameslamb · 2023-03-19T02:11:28Z

python-package/lightgbm/callback.py

@@ -55,7 +59,7 @@ def _format_eval_result(value: _EvalResultTuple, show_stdv: bool) -> str:
        return f"{value[0]}'s {value[1]}: {value[2]:g}"
    elif len(value) == 5:
        if show_stdv:
-            return f"{value[0]}'s {value[1]}: {value[2]:g} + {value[4]:g}"
+            return f"{value[0]}'s {value[1]}: {value[2]:g} + {value[4]:g}"  # type: ignore[misc]


These eval result tuples all have the same first four attributes:

(eval_name, metric_name, eval_result, is_higher_better)

In lgb.cv(), that eval_result is the mean of the metric across all cross-validation folds. In that case, these tuples can optionally contain a fifth value with the standard deviation of that metric across all cross-validation folds.

LightGBM/python-package/lightgbm/engine.py

Line 513 in 2fe2bf0

return [('cv_agg', k, np.mean(v), metric_type[k], np.std(v)) for k, v in cvmap.items()]
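To make the five-element shape concrete, here is a small sketch of that kind of aggregation. It is not the lightgbm implementation: the function name `aggregate_cv_results` and its arguments are made up for illustration, and it uses the standard library's `statistics` module in place of NumPy (`pstdev` is the population standard deviation, which matches the default behavior of `np.std`).

```python
from statistics import fmean, pstdev
from typing import Dict, List, Tuple

# Hypothetical alias for the 5-element tuples described above.
CVEvalResult = Tuple[str, str, float, bool, float]


def aggregate_cv_results(
    per_fold_scores: Dict[str, List[float]],
    is_higher_better: Dict[str, bool],
) -> List[CVEvalResult]:
    """Collapse per-fold metric values into (name, metric, mean, flag, stdv) tuples."""
    return [
        ("cv_agg", metric, fmean(scores), is_higher_better[metric], pstdev(scores))
        for metric, scores in per_fold_scores.items()
    ]


results = aggregate_cv_results({"l2": [1.0, 3.0]}, {"l2": False})
# Each entry has 5 elements: the usual 4, plus the standard deviation across folds.
print(results)
```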

mypy can't tell automatically that being inside this if len(value) == 5: block means that the tuple has 5 elements, and raises the following error on this line:

callback.py:62: error: Tuple index out of range [misc]

🤷🏻
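For comparison, one alternative to a `# type: ignore` comment is to narrow the union explicitly with `typing.cast()`. The sketch below is only an illustration of that technique, not what this PR does: the aliases `EvalResult4` and `EvalResult5` and the function `format_eval_result` are hypothetical names, modeled loosely on the diff above.

```python
from typing import Tuple, Union, cast

# Hypothetical aliases for the two tuple shapes discussed above.
EvalResult4 = Tuple[str, str, float, bool]
EvalResult5 = Tuple[str, str, float, bool, float]
EvalResult = Union[EvalResult4, EvalResult5]


def format_eval_result(value: EvalResult, show_stdv: bool = True) -> str:
    """Format a 4- or 5-element eval result tuple as a string."""
    if len(value) == 4:
        return f"{value[0]}'s {value[1]}: {value[2]:g}"
    if len(value) == 5:
        # cast() does nothing at runtime; it only tells the type checker
        # which member of the union is in hand inside this branch.
        value5 = cast(EvalResult5, value)
        if show_stdv:
            return f"{value5[0]}'s {value5[1]}: {value5[2]:g} + {value5[4]:g}"
        return f"{value5[0]}'s {value5[1]}: {value5[2]:g}"
    raise ValueError("Wrong metric value")


print(format_eval_result(("valid_0", "auc", 0.5, True)))
print(format_eval_result(("cv_agg", "l2", 0.25, False, 0.125)))
```

The trade-off is that `cast()` is unchecked: if the length test and the cast target ever drift apart, the type checker will not notice.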

jameslamb · 2023-03-19T02:26:48Z

python-package/lightgbm/callback.py

+                if first_time_updating_best_score_list:
+                    self.best_score_list.append(env.evaluation_result_list)
+                else:
+                    self.best_score_list[i] = env.evaluation_result_list


The first time _EarlyStoppingCallback is called, it runs _EarlyStoppingCallback._init().

LightGBM/python-package/lightgbm/callback.py

Lines 348 to 350 in 2fe2bf0

def __call__(self, env: CallbackEnv) -> None:
    if env.iteration == env.begin_iteration:
        self._init(env)

That _init() method ensures that self.best_score_list is the same length as CallbackEnv.evaluation_result_list by padding it with Nones.

LightGBM/python-package/lightgbm/callback.py

Lines 328 to 330 in 2fe2bf0

for eval_ret, delta in zip(env.evaluation_result_list, deltas):
    self.best_iter.append(0)
    self.best_score_list.append(None)

In the rest of that initial call, and in every subsequent call of the callback, the code iterates over CallbackEnv.evaluation_result_list and puts the corresponding item from it into self.best_score_list if either of the following is true:

it's the very first time the callback was called (self.best_score_list[i] is None)

the new evaluation result is better than the previous one by at least some margin delta, which defaults to 0.0 (self.cmp_op[i](score, self.best_score[i]))

https://github.com/microsoft/LightGBM/blob/2fe2bf0675ef127a29f9bc49511fc0099cf3e140/python-package/lightgbm/callback.py#L355-LL358

So those None entries in self.best_score_list aren't expected to be there when the first call of _EarlyStoppingCallback completes. They're not visible to user code, and nothing else in lightgbm relies on them.

HOWEVER... their brief presence in that list tells mypy "well sometimes there can be None in this list". And honestly, it confused me too.

This PR proposes making self.best_score_list easier for both mypy and humans to understand by removing that None-padding, and instead changing the update logic to "if it's the first time this list is being updated, build it up by .append()-ing every element of env.evaluation_result_list to it".
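As a rough illustration of that append-on-first-update idea, here is a stripped-down sketch. It is not the actual `_EarlyStoppingCallback`: the class `BestScoreTracker`, its `update()` method, and the way `first_time_updating_best_score_list` is computed are assumptions made for this example, and the `delta` margin is left out to keep it short.

```python
import operator
from typing import Callable, List, Tuple

# Hypothetical alias for (eval_name, metric_name, eval_result, is_higher_better).
EvalResultTuple = Tuple[str, str, float, bool]


class BestScoreTracker:
    """Simplified stand-in for the bookkeeping done by an early stopping callback."""

    def __init__(self) -> None:
        self.best_score: List[float] = []
        self.best_iter: List[int] = []
        # No Optional needed: the list is never padded with None.
        self.best_score_list: List[List[EvalResultTuple]] = []
        self.cmp_op: List[Callable[[float, float], bool]] = []

    def update(self, iteration: int, evaluation_result_list: List[EvalResultTuple]) -> None:
        # Assumption: "first time" is detected by the list still being empty.
        first_time_updating_best_score_list = self.best_score_list == []
        for i, (_, _, score, is_higher_better) in enumerate(evaluation_result_list):
            if first_time_updating_best_score_list:
                # Build every per-metric list up by appending.
                self.cmp_op.append(operator.gt if is_higher_better else operator.lt)
                self.best_score.append(score)
                self.best_iter.append(iteration)
                self.best_score_list.append(evaluation_result_list)
            elif self.cmp_op[i](score, self.best_score[i]):
                # Later calls overwrite an entry only when the score improved.
                self.best_score[i] = score
                self.best_iter[i] = iteration
                self.best_score_list[i] = evaluation_result_list


tracker = BestScoreTracker()
tracker.update(0, [("valid", "auc", 0.5, True), ("valid", "l2", 1.0, False)])
tracker.update(1, [("valid", "auc", 0.6, True), ("valid", "l2", 1.5, False)])
print(tracker.best_iter, tracker.best_score)
```

Because nothing ever inserts `None`, the annotation on `best_score_list` can drop `Optional`, and consumers that iterate over its entries need no `None` checks.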

jameslamb · 2023-03-30T01:31:58Z

@jmoralez @guolinke @shiyu1994 do any of you have time to help out on some reviews with this and the other mypy PRs I have up this week?

I'm sorry to @ you... I'm trying to do this in small, easy-to-review pieces but some of the type-hinting things are fairly interrelated so it gets hard when too many unmerged changes stack up.

jameslamb · 2023-03-30T02:52:06Z

Thanks so much for the reviews @guolinke !

github-actions · 2023-08-15T20:17:27Z

This pull request has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues
including a reference to this.

[python-package] fix type annotations for eval result tracking

f036194

jameslamb added the awaiting review and maintenance labels Mar 19, 2023

jameslamb added 2 commits March 18, 2023 21:32

remove unnecessary type ignore comment

42d080b

linting

3abe7eb

jameslamb changed the title ~~WIP: [python-package] fix type annotations for eval result tracking~~ [python-package] fix type annotations for eval result tracking Mar 19, 2023

jameslamb marked this pull request as ready for review March 19, 2023 02:34

jameslamb requested review from StrikerRUS, shiyu1994 and jmoralez as code owners March 19, 2023 02:34

jameslamb requested a review from guolinke March 19, 2023 02:35

Merge branch 'master' into ci/mypy-best-score-list

32609f7

merge master

ff6b182

guolinke approved these changes Mar 30, 2023

jameslamb merged commit 5f79626 into master Mar 30, 2023

jameslamb deleted the ci/mypy-best-score-list branch March 30, 2023 13:13

github-actions bot removed the awaiting review label Aug 15, 2023

github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023
