
Make optional argmax for y_pred in Confusion Matrix, Precision, Recall, Accuracy #822

Open
vfdev-5 opened this issue Mar 2, 2020 · 9 comments

Comments

vfdev-5 (Collaborator) commented Mar 2, 2020

🚀 Feature

Today, the conditions on the input of ConfusionMatrix (and of Precision, Recall, Accuracy in the multiclass case) are the following:

    - `y_pred` must contain logits and have the shape (batch_size, num_categories, ...)
    - `y` must have the shape (batch_size, ...) and contain ground-truth class indices,
        with or without the background class. During the computation, the argmax of `y_pred` is taken to determine the predicted classes.

Taking the argmax of y_pred should become optional, so that the winning class can be determined by some other rule. Let's keep argmax as the default behaviour when y_pred is (N, C, ...), and not apply it when y_pred.shape == y.shape, i.e. (N, ...).
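
A minimal sketch of the proposed rule in plain PyTorch (the helper name select_predicted_indices is illustrative, not ignite code):

import torch

def select_predicted_indices(y_pred, y):
    # proposed behaviour: apply argmax only when y_pred carries a class dimension
    if y_pred.ndimension() == y.ndimension() + 1:
        # y_pred: (N, C, ...) scores/logits -> take the winning class per sample
        return torch.argmax(y_pred, dim=1)
    if y_pred.shape == y.shape:
        # y_pred: (N, ...) already holds class indices -> leave it untouched
        return y_pred
    raise ValueError("y_pred must be (N, C, ...) or have the same shape as y")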

vfdev-5 changed the title from "Make optional argmax for y_pred in Confusion Matrix" to "Make optional argmax for y_pred in Confusion Matrix, Precision, Recall, Accuracy" on Mar 17, 2020
sdesrozis (Contributor) commented Mar 21, 2020

So:

  1. argmax should be optional and the user should be able to provide their own rule.
  2. if y_pred.shape == y.shape, i.e. (N, ...), do not apply argmax.

Is that OK?

vfdev-5 (Collaborator, Author) commented Mar 21, 2020

argmax should be optional and the user should be able to provide their own rule.

If y_pred has a C dimension, i.e. (N, C, ...), while y is (N, ...), there is no way to compute the metric without taking the argmax. In this case we should apply argmax unconditionally, IMO.

if y_pred.shape == y.shape, i.e. (N, ...), do not apply argmax.

Yes. In that case, the user can perform the winning-class selection in output_transform, or anywhere else before the metric's update, as sketched below.
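
For illustration, a sketch of such an output_transform with a made-up selection rule (the rule and the 0.05 margin are purely hypothetical, and this assumes the metric accepts index-shaped y_pred as proposed in this issue):

import torch
from ignite.metrics import Accuracy

def pick_class(output):
    # output is assumed to be (logits, y) with logits of shape (N, C)
    logits, y = output
    top2 = logits.topk(2, dim=1)
    # hypothetical rule: prefer the runner-up class when the top-2 scores are nearly tied
    nearly_tied = (top2.values[:, 0] - top2.values[:, 1]) < 0.05
    y_pred = torch.where(nearly_tied, top2.indices[:, 1], top2.indices[:, 0])
    return y_pred, y  # y_pred is now (N, ...), so the metric would not apply argmax

acc = Accuracy(output_transform=pick_class)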

sdesrozis (Contributor) commented

OK, I'll do it soon.

sdesrozis (Contributor) commented

The following code from precision.py (the same applies to recall.py) cannot work when the shape is (N, ...):

elif self._type == "multiclass":
    num_classes = y_pred.size(1)
    if y.max() + 1 > num_classes:
        raise ValueError(
            "y_pred contains less classes than y. Number of predicted classes is {}"
            " and element in y has invalid class = {}.".format(num_classes, y.max().item() + 1)
        )
    y = to_onehot(y.view(-1), num_classes=num_classes)
    indices = torch.argmax(y_pred, dim=1).view(-1)
    y_pred = to_onehot(indices, num_classes=num_classes)
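
The blocker is that num_classes is read from y_pred.size(1), which does not exist when y_pred is (N, ...). One possible workaround, sketched only as a rough idea (the helper name is illustrative, and inferring the class count from a single batch may disagree with the true number of classes):

import torch
from ignite.utils import to_onehot

def onehot_from_indices(y_pred, y):
    # both y_pred and y hold class indices of shape (N, ...): infer the class count from the data
    num_classes = int(max(y.max().item(), y_pred.max().item())) + 1
    y_ohe = to_onehot(y.view(-1), num_classes=num_classes)
    y_pred_ohe = to_onehot(y_pred.view(-1), num_classes=num_classes)
    return y_pred_ohe, y_ohe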

In accuracy.py, the implementation is different (only argmax and an element-wise comparison, no one-hot encoding):

elif self._type == "multiclass":
    indices = torch.argmax(y_pred, dim=1)
    correct = torch.eq(indices, y).view(-1)

sdesrozis (Contributor) commented

Original code

if y.ndimension() + 1 == y_pred.ndimension():
    num_classes = y_pred.shape[1]
    if num_classes == 1:
        update_type = "binary"
        self._check_binary_multilabel_cases((y_pred, y))
    else:
        update_type = "multiclass"
elif y.ndimension() == y_pred.ndimension():
    self._check_binary_multilabel_cases((y_pred, y))
    if self._is_multilabel:
        update_type = "multilabel"
        num_classes = y_pred.shape[1]
    else:
        update_type = "binary"
        num_classes = 1

Modified code

        if y.ndimension() + 1 == y_pred.ndimension():
            # `y` is in the following shape of (batch_size, ...) and
            # `y_pred` is in the following shape of (batch_size, num_categories, ...)
            num_classes = y_pred.shape[1]
            if num_classes == 1:
                update_type = "binary"
                self._check_binary_multilabel_cases((y_pred, y))
            else:
                update_type = "multiclass"
        elif y.ndimension() == y_pred.ndimension():
            if self._is_multilabel:
                # `y` and `y_pred` are in the following shape of (batch_size, num_categories, ...)
                self._check_binary_multilabel_cases((y_pred, y))
                update_type = "multilabel"
                num_classes = y_pred.shape[1]
            else:
                # `y` and `y_pred` are in the following shape of (batch_size, ...)
                # binary type is used because it works in update (no argmax)
                # should we introduce a new type ?
                update_type = "binary"
                num_classes = None

The value of the attribute _num_classes is not really important; it is only used to ensure the consistency of y_pred across update calls, so None should be an acceptable value.
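
A rough sketch of how the consistency check could treat None as "unknown" (illustrative helper, not the actual ignite code):

def check_num_classes_consistency(stored, seen):
    # None means the class count is unknown, so it is compatible with anything
    if stored is None or seen is None:
        return stored if stored is not None else seen
    if stored != seen:
        raise ValueError("num_classes has changed from {} to {}".format(stored, seen))
    return stored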

@vfdev-5 thoughts?

sdesrozis (Contributor) commented

In the tests:

y_pred = torch.rand(10, 4)
y = torch.randint(0, 4, size=(10,)).long()
acc.update((y_pred, y))
np_y_pred = y_pred.numpy().argmax(axis=1).ravel()
np_y = y.numpy().ravel()
assert acc._type == "multiclass"
assert acc._num_classes == 4
assert isinstance(acc.compute(), float)
assert accuracy_score(np_y, np_y_pred) == pytest.approx(acc.compute())

acc.reset()
y_pred_argmax = torch.argmax(y_pred, dim=1)
acc.update((y_pred_argmax, y))
assert acc._type == "binary"
assert acc._num_classes == None
assert isinstance(acc.compute(), float)
assert accuracy_score(np_y, np_y_pred) == pytest.approx(acc.compute())

vfdev-5 (Collaborator, Author) commented Mar 22, 2020

If we make argmax optional, then in the branch elif y.ndimension() == y_pred.ndimension(): we can have the following situations:

  • binary case: y: (N, ...), y_pred: (N, ...), with y ** 2 == y and y_pred ** 2 == y_pred (as previously)
  • multilabel case: y: (N, C, ...), y_pred: (N, C, ...), with y ** 2 == y and y_pred ** 2 == y_pred (as previously)
  • multiclass case (new): y: (N, ...), y_pred: (N, ...) => yes, here we can set num_classes = None (see the sketch after this list)
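
The sketch referenced above, covering the three situations as a standalone helper rather than the actual metric code (illustrative only):

import torch

def infer_type_same_ndim(y_pred, y, is_multilabel):
    # decide the update type when y and y_pred have the same number of dimensions
    if is_multilabel:
        return "multilabel", y_pred.shape[1]                  # (N, C, ...), 0/1 valued
    if torch.equal(y, y ** 2) and torch.equal(y_pred, y_pred ** 2):
        return "binary", 1                                    # (N, ...), 0/1 valued
    return "multiclass", None                                 # (N, ...) class indices, no argmax

Note that the binary/multiclass decision here depends on the values seen in the batch, which is exactly the ambiguity raised two comments below.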

Concerning the tests:

y_pred_argmax = torch.argmax(y_pred, dim=1)
acc.update((y_pred_argmax, y))
assert acc._type == "binary"

the last assert looks strange to me if the data is originally multiclass.

sdesrozis (Contributor) commented

OK, so I use the predicate y ** 2 == y and y_pred ** 2 == y_pred to discriminate. Got it.

sdesrozis (Contributor) commented

Consider y: (N, ...) and y_pred: (N, ...). Across update calls, the type could switch from multiclass to binary and vice versa, since it depends on the values in the batch and on the check used to decide whether y and y_pred correspond to the binary type.

That means we can't reliably infer the type of the metric inside update. Right? Should it be information given by the user instead (we already have a multilabel flag, why not binary)?
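
A concrete instance of that ambiguity (illustrative): the same pair of tensors is both a valid binary input and a valid batch of 2-class indices, so the values alone cannot decide the type:

import torch

y = torch.tensor([0, 1, 1, 0])
y_pred = torch.tensor([0, 1, 0, 1])

# Both tensors pass the 0/1 check, so this batch would be typed "binary",
# even if the task is actually multiclass and the other classes simply do
# not appear in this particular batch.
print(torch.equal(y, y ** 2), torch.equal(y_pred, y_pred ** 2))  # True True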

@vfdev-5 thoughts?
