-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[python-package] fix type annotations for eval result tracking #5793
Conversation
@@ -55,7 +59,7 @@ def _format_eval_result(value: _EvalResultTuple, show_stdv: bool) -> str: | |||
return f"{value[0]}'s {value[1]}: {value[2]:g}" | |||
elif len(value) == 5: | |||
if show_stdv: | |||
return f"{value[0]}'s {value[1]}: {value[2]:g} + {value[4]:g}" | |||
return f"{value[0]}'s {value[1]}: {value[2]:g} + {value[4]:g}" # type: ignore[misc] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These eval result tuples all have the same first four attributes:
(eval_name, metric_name, eval_result, is_higher_better)
In lgb.cv()
, that eval_result
is the mean of the metric across all cross-validation folds. In that case, these tuples can optionally contain a fifth value with the standard deviation of that metric across all cross-validation folds.
LightGBM/python-package/lightgbm/engine.py
Line 513 in 2fe2bf0
return [('cv_agg', k, np.mean(v), metric_type[k], np.std(v)) for k, v in cvmap.items()] |
mypy
can't tell automatically that being inside this if len(value) == 5:
block means that the tuple has 5 elements, and raises the following error on this line:
callback.py:62: error: Tuple index out of range [misc]
🤷🏻
if first_time_updating_best_score_list: | ||
self.best_score_list.append(env.evaluation_result_list) | ||
else: | ||
self.best_score_list[i] = env.evaluation_result_list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first time _EarlyStoppingCallback
is called, it runs _EarlyStoppingCallback._init()
.
LightGBM/python-package/lightgbm/callback.py
Lines 348 to 350 in 2fe2bf0
def __call__(self, env: CallbackEnv) -> None: | |
if env.iteration == env.begin_iteration: | |
self._init(env) |
That _init()
methods ensures that that self.best_score_list
is the same length as CallbackEnv.evaluation_result_list
by padding it with None
s.
LightGBM/python-package/lightgbm/callback.py
Lines 328 to 330 in 2fe2bf0
for eval_ret, delta in zip(env.evaluation_result_list, deltas): | |
self.best_iter.append(0) | |
self.best_score_list.append(None) |
In the rest of that initial call, and in every subsequent call of the callback, the code iterates over CallbackEnv.evaluation_result_list
and putt the corresponding item from it into self.best_score_list
if:
- it's the very first time the callback was called (
self.best_score_list[i] is None
) - the new evaluation result is better than the previous one by at least some margin
delta
, which defaults to 0.0 (self.cmp_op[i](score, self.best_score[i])
So those None
entries in self.best_score_list
aren't expected to be there when the first call of _EarlyStoppingCallback
completes. They're not visible to user code, and nothing else in lightgbm
relies on them.
HOWEVER... their brief presence in that list tells mypy
"well sometimes there can be None
in this list". And honestly, it confused me too.
This PR proposes making self.best_score_list
easier for both mypy
and humans to understand by removing that None
-padding, and instead changing the update logic to "if it's the first time this list is being updated, build it up by .append()
-ing every element of env.evaluation_result_list
to it".
@jmoralez @guolinke @shiyu1994 do you any of you have time to help out on some reviews with this and the other I'm sorry to |
Thanks so much for the reviews @guolinke ! |
This pull request has been automatically locked since there has not been any recent activity since it was closed. |
Contributes to #3756.
Contributes to #3867.
Follow-up to this discussion with @IdoKendo : #5672 (comment)
Fixes the following
mypy
errors.