General stateful metrics fixes #9446
Conversation
This makes sure the metric is reset before evaluating the validation set.
It would be even better to have full-fledged stateful layer documentation, but I lack the knowledge and experience to explain that well.
docs/templates/metrics.md
Outdated
@@ -19,7 +19,7 @@ model.compile(loss='mean_squared_error',

 A metric function is similar to a [loss function](/losses), except that the results from evaluating a metric are not used when training the model.

-You can either pass the name of an existing metric, or pass a Theano/TensorFlow symbolic function (see [Custom metrics](#custom-metrics)).
+You can either pass the name of an existing metric, or pass a function or layer (see [Custom metrics](#custom-metrics)).
Consider minor reword: You can either pass the name of an existing metric, a metric function, or a layer (see...)
"pass a layer" is not sound advice in the general case.
For now I would not be comfortable covering usage of stateful metrics in the public docs. This is an API that is still experimental, and subject to changes in the future (e.g. this PR changes it). We'll document it when it is more mature.
Please revert the edits on this page. You can keep them around and we may include them at a future date.
Makes sense! Will do.
keras/utils/generic_utils.py
Outdated
@@ -337,7 +337,7 @@ def update(self, current, values=None):
                 self._values[k][0] += v * (current - self._seen_so_far)
                 self._values[k][1] += (current - self._seen_so_far)
             else:
-                self._values[k] = v
+                self._values[k] = [v, 1]
Why?
To distinguish a numerical value (averaging a single-item list just keeps the numerical value unchanged - I'll add a comment) from a non-numerical log value that should be copied verbatim to the output. This way, the stateful metric floats are formatted the same as other numerical metrics.
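For context, the formatting step this feeds into looks roughly like the following (a simplified sketch of the Progbar logic with an illustrative helper name, not the verbatim Keras source):

```python
import numpy as np

def _format_value(name, value):
    """Render one Progbar log entry.

    A [weighted_sum, count] pair is averaged and printed as a float
    (for stateful metrics count is 1, so the value passes through
    unchanged); any other object is printed verbatim.
    """
    if isinstance(value, list):
        avg = np.mean(value[0] / max(1, value[1]))
        if abs(avg) > 1e-3:
            return ' %s: %.4f' % (name, avg)
        return ' %s: %.4e' % (name, avg)
    return ' %s: %s' % (name, value)

print(_format_value('acc', [0.91 * 32, 32]))       # regular metric, averaged over 32 samples
print(_format_value('true_positives', [17.0, 1]))  # stateful metric, value kept as-is
```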
keras/engine/training.py
Outdated
@@ -2404,13 +2416,18 @@ def evaluate_generator(self, generator, steps=None,
             enqueuer.stop()

         if not isinstance(outs, list):
             assert not stateful_metric_indices
Why? Please add a comment.
For loss-only models, no stateful_metric_indices can exist, hence the assertion.
I have now instead refactored the code to be consistent with how all other Model methods handle loss-only models.
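The convention referred to here can be sketched as follows (an assumption about the shape of the refactor, not the exact merged code): a loss-only model returns a scalar from the test function, so the output is normalized to a list before any metric indexing happens.

```python
def _standardize_outs(outs):
    """Return per-batch outputs as a list: [loss, metric_1, ..., metric_n].

    Loss-only models return a single scalar, so wrap it; afterwards the
    averaging code never needs to special-case them, and any
    stateful_metric_indices can only refer to entries past the loss.
    """
    if not isinstance(outs, list):
        outs = [outs]
    return outs

print(_standardize_outs(0.42))          # loss-only model -> [0.42]
print(_standardize_outs([0.42, 0.91]))  # loss + one metric -> [0.42, 0.91]
```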
keras/engine/training.py
Outdated
            # [0] is the loss, the rest are metrics
            if i == 0 or (i - 1) not in stateful_metric_indices:
                averages.append(np.average([out[i] for out in all_outs],
                                           weights=batch_sizes))
Unclear what the underlying logic is... either implement in a simpler/clearer way (preferred), or comment clearly.
Oh - now I understand why other methods use Model.metrics_names rather than Model.metrics to determine the indices. Rewritten.
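The rewritten aggregation can be illustrated like this (a sketch with hypothetical names, indexing metrics_names so the loss needs no special-cased offset):

```python
import numpy as np

def average_outputs(metrics_names, stateful_metric_names,
                    outs_per_batch, batch_sizes):
    """Aggregate per-batch outputs into final values (illustrative sketch).

    metrics_names already lists the loss at index 0, so it can be indexed
    directly: ordinary outputs are weight-averaged over batches, while
    stateful metrics simply take their last (already accumulated) value.
    """
    averages = []
    for i, name in enumerate(metrics_names):
        if name not in stateful_metric_names:
            averages.append(np.average([out[i] for out in outs_per_batch],
                                       weights=batch_sizes))
        else:
            averages.append(outs_per_batch[-1][i])
    return averages

# loss and acc are averaged; true_positives keeps its final accumulated value
outs_per_batch = [[0.6, 0.8, 10.0], [0.4, 0.9, 25.0]]
print(average_outputs(['loss', 'acc', 'true_positives'], {'true_positives'},
                      outs_per_batch, batch_sizes=[32, 32]))
# -> [0.5, 0.85, 25.0]
```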
This is an ok change.
        name: String, name for the metric.
    """

    def __init__(self, name='true_positives', **kwargs):
        super(BinaryTruePositives, self).__init__(name=name, **kwargs)
        self.stateful = True
Would we have to specify this attribute for each metric?
That's the idea, indeed - as long as your metric behaves as a stateful layer.
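For reference, a stateful metric layer along these lines looks roughly like this (a sketch modeled on the BinaryTruePositives example used in the PR's tests; the exact backend calls and import path are assumptions):

```python
import keras.backend as K
from keras.layers import Layer

class BinaryTruePositives(Layer):
    """Stateful metric counting true positives over a whole evaluation run.

    # Arguments
        name: String, name for the metric.
    """

    def __init__(self, name='true_positives', **kwargs):
        super(BinaryTruePositives, self).__init__(name=name, **kwargs)
        self.stateful = True  # now required: marks the metric layer as stateful
        self.true_positives = K.variable(value=0, dtype='int32')

    def reset_states(self):
        # Called by Keras before evaluation so counts don't leak between runs.
        K.set_value(self.true_positives, 0)

    def __call__(self, y_true, y_pred):
        """Accumulate this batch's true positives into the running count."""
        y_true = K.cast(y_true, 'int32')
        y_pred = K.cast(K.round(y_pred), 'int32')
        correct_preds = K.cast(K.equal(y_pred, y_true), 'int32')
        true_pos = K.cast(K.sum(correct_preds * y_true), 'int32')
        current_true_pos = self.true_positives * 1
        self.add_update(K.update_add(self.true_positives, true_pos),
                        inputs=[y_true, y_pred])
        return current_true_pos + true_pos
```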
@@ -2331,6 +2334,15 @@ def evaluate_generator(self, generator, steps=None,
         """
         self._make_test_function()

         stateful_metric_indices = []
This reset is needed to make Stateful Metrics work for generators.
How do you feel about spinning this bug fix out into a separate PR? Should be a quick approval.
Some of the other changes, for instance m.stateful, will likely have some discussion. I really want this bug fix to make it into the next release :)
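The bug fix being discussed is small and self-contained; roughly (a simplified sketch of the approach using a hypothetical helper, not the merged code):

```python
def _reset_stateful_metrics(model):
    """Reset any stateful metric layers attached to the model.

    Without this, a metric layer entering evaluate_generator() still holds
    the counts accumulated during training (or a previous evaluation), so
    the generator-based result is wrong.
    """
    indices = []
    for i, m in enumerate(getattr(model, 'metrics', [])):
        if getattr(m, 'stateful', False):
            indices.append(i)
            m.reset_states()
    return indices  # positions among the metrics (loss excluded)
```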
Do you mind if I raise the evaluate_generator() reset separately to get it through?
Don't want to steal your thunder.
No worries about that! But I'm going to take better care of the MR from now on and update it in a couple of minutes, sorry about the delays!
Looks like you got it, I just wanted to make sure this PR wasn't going to get stuck for another couple of releases.
Use metrics_names rather than metrics + index juggling to skip the loss. Make loss-only output handling consistent with other Model methods. Rename all_outs -> outs_per_batch to avoid confusion, since all_outs has swapped dimensions in predict_generator().
I have updated the MR based on the review.
LGTM
LGTM, thanks
…ack-embeddings-from-layer-outputs

* upstream/master: (68 commits)
  fit/evaluate_generator supporting native tensors (keras-team#9816)
  keras-team#9642 Add kwarg and documentation for dilation_rate to SeparableConvs (keras-team#9844)
  Document that "same" is inconsistent across backends with strides!=1 (keras-team#9629)
  Improve tests by designating dtype of sample data (keras-team#9834)
  Add documentation for 'subset' and 'interpolation' arguments (ImageDataGenerator) (keras-team#9817)
  Revert default theme to readthedocs
  Various docs fixes.
  Fix conflict
  Add support for class methods documentation (keras-team#9751)
  Add missing verbose opt for evaluate_generator (keras-team#9811)
  Added `data_format` to flatten layer. (keras-team#9696)
  Allow saving models directly to binary stream (keras-team#9789)
  Fix ctc_batch_cost() error when batch_size = 1 (keras-team#9775)
  Fix keras-team#9802 (keras-team#9803)
  Fix error in ImageDataGenerator documentation (keras-team#9798)
  fix typo (keras-team#9792)
  keras-team#9733: Extend RemoteMonitor to send data as application/json (keras-team#9734)
  Fixed inconsistencies regarding ReduceLROnPlateau (keras-team#9723)
  Fix doc issue.
  General stateful metrics fixes (keras-team#9446)
  ...
* Require stateful metrics layers to be actually stateful
* Prevent stateful metrics to leak np.floats to the History object
* Progbar: Format stateful metrics values as floats alike other metrics
* test_stateful_metrics: Also test validation set evaluation
  This makes sure the metric is reset before evaluating valset.
* Add support for stateful metrics in fit_generator() and evaluate_generator()
* Document stateful metrics
  It would be even better to have full-fledged stateful layers documentation, but I lack the knowledge and experience to explain that well.
* evaluate_generator(): Do not leak np.float to History here either
* Revert stateful metrics documentation until the API stabilizes
* Progbar: Explain stateful metrics handling
* Model.evaluate_generator(): More consistent stateful metrics handling
  Use metrics_names, rather than metrics + index juggling to skip loss. Make loss-only output handling consistent with other Model methods. all_outs -> outs_per_batch to avoid confusion, all_outs has swapped dimensions in predict_generator().
keras/engine/training.py
Outdated
@@ -2427,7 +2427,7 @@ def evaluate_generator(self, generator, steps=None,
             averages.append(np.average([out[i] for out in all_outs],
                                        weights=batch_sizes))
         else:
-            averages.append(all_outs[-1][i])
+            averages.append(float(all_outs[-1][i]))
Why float (not np.float64)?
When the TensorBoard callback is used with stateful metrics, it raises:

  File "...\keras\callbacks.py", line 942, in on_epoch_end
    summary_value.simple_value = value.item()
  AttributeError: 'float' object has no attribute 'item'

np.float64 works correctly because it has an item() method.
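The difference is easy to reproduce outside of Keras (a minimal illustration, not the callback code itself):

```python
import numpy as np

value = np.float64(17.0)
print(value.item())            # 17.0 -- np.float64 has .item(), which the
                               # TensorBoard callback calls on each log value

plain = float(17.0)
print(hasattr(plain, 'item'))  # False -- plain.item() would raise
                               # AttributeError: 'float' object has no attribute 'item'
```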
This implements a variety of stateful metrics improvements I noted in #9253.
Require stateful metrics layers to be actually stateful: It seems a bit confusing to me that the Layer of a stateful metric doesn't need to have the stateful attribute set - the assumption being that all Layer metrics are stateful and behave statefully even without this attribute. Is that a good assumption to make for the future?
Prevent stateful metrics from leaking np.floats to the History object: When logging normal metrics, they are added to 0. in _*_loop(), which makes them Python floats. This doesn't happen with stateful metrics (they are assigned directly), so they end up as np.float32. This is annoying if you e.g. want to serialize the History object to JSON after training, which used to work fine before.
Progbar: Format stateful metrics values as floats like other metrics: In the progress bar (verbose=1), stateful metric values are formatted with %s rather than %.4f. This gets messy with many metrics, and sometimes one \b too many is printed (I didn't find out why) and the progress bar jumps a line upwards, overwriting earlier content.
Most importantly, fit_generator() and evaluate_generator() now support stateful metrics.
I wrote some documentation.
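Putting the pieces together, the intended end-to-end usage looks something like this (a sketch that assumes the BinaryTruePositives layer shown earlier in the thread; the model and generator shapes are illustrative):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([Dense(1, activation='sigmoid', input_shape=(4,))])
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['acc', BinaryTruePositives()])

def gen(batch_size=16):
    # Toy generator yielding random binary-classification batches.
    while True:
        x = np.random.random((batch_size, 4))
        y = (np.random.random((batch_size, 1)) > 0.5).astype('float32')
        yield x, y

# With this PR, the stateful metric is reset and accumulated correctly
# in the generator-based methods as well.
model.fit_generator(gen(), steps_per_epoch=8, epochs=1)
loss, acc, true_positives = model.evaluate_generator(gen(), steps=8)
```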