
Classification metrics overhaul: input formatting standardization (1/n) #4837

Merged 53 commits into master on Dec 7, 2020

Conversation

@tadejsv (Contributor) commented Nov 24, 2020

This PR is a spin-off from #4835. It should be merged before any other spin-offs, as it provides a base for all of them.

What does this PR do?

General (fundamental) changes

I have created a new _input_format_classification function (in metrics/classification/utils). The job of this function is to (a) validate and (b) transform the inputs into a common format. This common format is a binary label indicator array: either (N, C), or (N, C, X) (the latter only for multi-dimensional multi-class inputs).
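For concreteness, here is a minimal sketch of the common format (the tensor values are made up for illustration, and only the (N, C) case is shown):

import torch
from torch.nn.functional import one_hot

# A multi-class target given as class indices, shape (N,)
target = torch.tensor([2, 0, 1])

# The common format: a binary label indicator tensor of shape (N, C)
target_ohe = one_hot(target, num_classes=3)
# tensor([[0, 0, 1],
#         [1, 0, 0],
#         [0, 1, 0]])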

I believe that having such a "central" function is crucial, as it eliminates code duplication (which was present in PL metrics before) and lets metric developers focus on the metrics themselves rather than on standardizing and validating inputs.

The validation ensures that the inputs fall into one of the possible input type cases, and that their values are consistent with both the input type and any additional parameters set (e.g. that target contains no label higher than num_classes). The docstrings (and the new "Input types" section in the documentation) give all the details about how the standardization and validation are performed.
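To illustrate the kind of check this involves, here is a hypothetical snippet (not the actual validation code):

import torch

target = torch.tensor([0, 3, 1])
num_classes = 3

# A label equal to or above num_classes is inconsistent with the
# parameters the user set, so validation should reject it early
if int(target.max()) >= num_classes:
    raise ValueError(
        f"target contains label {int(target.max())}, "
        f"but num_classes was set to {num_classes}"
    )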

Here I'll list the parameters of this function (many of which are also present on some metrics) and why I decided to use them; a short sketch of how the parameters interact follows the list:

  • threshold: The probability threshold for binarizing binary and multi-label inputs.

  • num_classes: number of classes. Used either to determine the C dimension of the inputs or, if C is already implicitly given, to ensure consistency between the inputs and the number of classes the user specified when creating the metric (thus avoiding either having to check this manually in update for each metric, or raising an error when updating the state, which may not be very clear to the user).

  • top_k: for (multi-dimensional) multi-class inputs, if predictions are given as probabilities, selects the top k highest probabilities per sample. This generalizes the usual procedure, which corresponds to k=1. It will be used by the Accuracy metric in subsequent PRs.

  • is_multiclass: used for transforming binary or multi-label inputs to 2-class multi-class or 2-class multi-dimensional multi-class inputs, respectively, and vice versa.

    Why? This is similar to the multilabel argument that was (is?) present on some metrics. I believe is_multiclass is a better name, as it also deals with transforming to/from binary inputs. But why is it needed? There are cases where it is not clear what the inputs are: for example, say that both preds and target are of the form [0,1,0,1]. This actually appears to be multi-class (it could simply have happened that this batch contained only 0s and 1s), so an explicit instruction is needed to tell the metric that the inputs are in fact binary. On the other hand, sometimes we would like to treat binary inputs as two-class inputs - this is the case in the confusion matrix.

    I also experimented with using num_classes to determine this. Besides being a very confusing approach, requiring several paragraphs to explain clearly, it also does not resolve all ambiguities (is setting num_classes=1 with 2-class probability predictions a request to treat the data as binary, or an inconsistency of the inputs that should raise an error?). So I think is_multiclass is the best approach here.
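To make the interplay of these parameters concrete, here is a simplified sketch of the standardization logic for 1- and 2-dimensional inputs. This is not the actual implementation (format_inputs is a made-up name for illustration; the real function is _input_format_classification, which also validates shapes and values and handles multi-label and multi-dimensional multi-class inputs):

import torch
from torch.nn.functional import one_hot

def format_inputs(preds, target, threshold=0.5, num_classes=None,
                  top_k=1, is_multiclass=None):
    # Simplified sketch: return preds and target as binary label
    # indicator tensors. num_classes would be used for consistency
    # checks, which are omitted here.
    if preds.ndim == 1 and preds.is_floating_point():
        # Binary probabilities: binarize with `threshold`
        preds = (preds >= threshold).long()
    if preds.ndim == 1:
        if is_multiclass:
            # Treat binary inputs as 2-class multi-class: one-hot both
            preds = one_hot(preds, num_classes=2)
            target = one_hot(target, num_classes=2)
        else:
            # Keep binary inputs as a single (N, 1) column
            preds, target = preds.unsqueeze(1), target.unsqueeze(1)
    elif preds.is_floating_point():
        # Multi-class probabilities of shape (N, C): mark the top_k
        # highest-probability classes per sample (k=1 is the usual argmax)
        _, idx = preds.topk(top_k, dim=1)
        preds = torch.zeros_like(preds, dtype=torch.long).scatter_(1, idx, 1)
        target = one_hot(target, num_classes=preds.shape[1])
    return preds, target

# Binary probabilities kept binary: both outputs have shape (N, 1)
p, t = format_inputs(torch.tensor([0.2, 0.8]), torch.tensor([0, 1]))

# The same inputs treated as 2-class multi-class: shape (N, 2)
p, t = format_inputs(torch.tensor([0.2, 0.8]), torch.tensor([0, 1]),
                     is_multiclass=True)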

Documentation

Instead of metrics being organized into "Class Metrics" and "Functional Metrics", they are now organized by topics (Classification, Regression, ...), and within topics split into class and functional, if necessary. This makes it possible to add special topic-related sections - in this case I have added a section on what types of inputs are used for classification metrics - a section that metrics can link to, so as not to repeat the same thing 100 times, and to keep docstrings short and to the point.

The second half of the Input types section, with examples from the StatScores metric, will be added in that metric's PR.

@teddykoker (Contributor) commented:
Thanks for splitting off the PR! Reviewing now

@teddykoker (Contributor) left a comment:

Overall looks like good changes, just a few small things to fix.

(Resolved review threads on docs/source/metrics.rst, pytorch_lightning/metrics/classification/utils.py, and tests/metrics/classification/test_inputs.py)
codecov bot commented Nov 24, 2020

Codecov Report

Merging #4837 (4a71a56) into master (02152c1) will increase coverage by 0%.
The diff coverage is 99%.

@@           Coverage Diff           @@
##           master   #4837    +/-   ##
=======================================
  Coverage      93%     93%            
=======================================
  Files         129     130     +1     
  Lines        9397    9527   +130     
=======================================
+ Hits         8713    8843   +130     
  Misses        684     684            

@SkafteNicki (Member) left a comment:

overall very good :)

(Resolved review threads on pytorch_lightning/metrics/classification/utils.py and docs/source/metrics.rst)
@SkafteNicki added the "docs" (Documentation related), "feature" (Is an improvement or enhancement), and "Metrics" labels on Nov 25, 2020
@SkafteNicki added this to the 1.1 milestone on Nov 25, 2020
@s-rog linked an issue on Dec 1, 2020 that may be closed by this pull request
@tadejsv (Contributor, Author) commented Dec 4, 2020

Is there anything else that needs to be done before this PR can be merged?
I think I've addressed all the issues that have arisen so far, so we can bring the review process to an end - I'm very anxious to finally get to the good stuff ;)

@Borda @SkafteNicki @teddykoker

@SkafteNicki (Member) commented:

@tadejsv, thanks for the further description of the is_multiclass parameter. I am fine with merging this PR now, so please go ahead and solve the conflicts and I will approve it.

@Borda (Member) commented Dec 6, 2020

@tadejsv mind resolving the conflicts :] probably after #4549
@SkafteNicki @justusschock @teddykoker @ananyahjha93 mind doing the final review?

@Borda (Member) left a comment:

LGTM, it would just be nice to also have tests for all the helper functions that raise some kind of exception...

(Resolved review threads on pytorch_lightning/metrics/classification/utils.py)
@tadejsv (Contributor, Author) commented Dec 6, 2020

Alright, merge conflicts resolved, ready for final review. @SkafteNicki please double-check that the docs are OK (git diff is not useful there).

@SkafteNicki (Member) left a comment:

LGTM, docs look fine :]

@SkafteNicki added the "ready" label (PRs ready to be merged) on Dec 7, 2020
@awaelchli (Contributor) left a comment:

High-level review, looks good, nice docs

(Review thread on docs/source/metrics.rst)
@Borda merged commit fedc0d1 into Lightning-AI:master on Dec 7, 2020