
Correctly using IoU and ConfusionMatrix #2753

Closed
remisphere opened this issue Jul 29, 2020 · 6 comments

@remisphere

remisphere commented Jul 29, 2020

❓ Questions and Help


What is your question?

Hello,
I've been trying to use the IoU and ConfusionMatrix metrics for semantic segmentation, but I can't wrap my head around their implementation in PL and their intended usage.
They seem to assume that every class is present in at least the prediction or the target [1, 2, 3] (they actually look for the max class index), which seems like a rather strange expectation to me.
With this assumption, their return size varies depending on which classes are missing from the batch (as noticed in #2724).
IoU has a num_classes argument, but it is only used to throw warnings when the above expectation is not met.
The docs give very basic examples that are not in the context of a training loop, and thus don't cover computing the metrics over several batches.

How then do I get the IoUs (or confusion matrix) over my whole dataset, since it's not possible to average them when they don't have the same shape?

What have you tried?

For IoU, using the default reduction='elementwise_mean' prevents crashing, but I then get the mean IoU over the classes, which is not what I want.
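
Roughly what I'm calling (a minimal sketch, assuming the functional iou from pytorch_lightning.metrics.functional; the exact signature may differ between versions):

import torch
from pytorch_lightning.metrics.functional import iou

pred = torch.tensor([[0, 1, 1]])    # predicted class indices; class 2 never appears
target = torch.tensor([[0, 1, 0]])  # ground-truth class indices

# mean IoU over the classes present -> a single scalar, not the per-class values I want
mean_iou = iou(pred, target, num_classes=3, reduction='elementwise_mean')

# per-class IoU -> the length follows the highest class index actually present
# (2 here, not num_classes=3), so shapes differ from batch to batch
per_class_iou = iou(pred, target, num_classes=3, reduction='none')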

What's your environment?

  • OS: Linux
  • Packaging: pip
  • Version: 0.9.0rc3
remisphere added the question (Further information is requested) label on Jul 29, 2020
@justusschock
Member

Hi @remisphere,
The idea here is to compute them on your whole dataset (e.g. in validation_epoch_end). Therefore you currently have to collect your results by returning them in validation_step as part of the returned dict.

basically it would be something like this:

def validation_step(self, batch, batch_idx):
    # collect predictions and targets so they can be aggregated later
    pred = self(batch[0])
    return {'pred': pred, 'target': batch[1]}

def validation_epoch_end(self, outputs):
    # concatenate everything returned by validation_step ...
    preds = torch.cat([tmp['pred'] for tmp in outputs])
    targets = torch.cat([tmp['target'] for tmp in outputs])
    # ... and compute the metric once, over the full validation set
    metric_val = my_metric(preds, targets)

We are currently working on a V2 for metrics, which will include some aggregation mechanisms, but I'm afraid it will still take some time for us to finish that.

@remisphere
Author

Thank you @justusschock, that makes sense.

I have indeed seen some examples where the full prediction is returned in the validation step, but I'm concerned it will eat up my memory if the validation dataset is a bit large, not to mention tracking the metric for the training dataset (unless it is stored to disk until the end of the epoch?).

I'm looking forward to the V2 then; happy to see PL flourishing!

P.S.:
I come from the PyTorch Ignite framework, where IoU is computed from the confusion matrix. The confusion matrix has a required num_classes argument that fixes its size, so each class keeps the same index in the confusion matrix / IoU vector regardless of which classes are present in the batch.
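
The gist of what I mean, sketched in plain PyTorch rather than Ignite's actual API (the class count is fixed up front, so the shapes never change between batches):

import torch

num_classes = 3  # fixed up front, independent of what appears in any batch
cm = torch.zeros(num_classes, num_classes, dtype=torch.long)

def update(cm, pred, target):
    # pred and target hold class indices and have the same shape
    idx = target.flatten() * num_classes + pred.flatten()
    cm += torch.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

# call update(cm, pred, target) for every batch, then read the IoU off the matrix:
intersection = cm.diag().float()
union = cm.sum(0).float() + cm.sum(1).float() - intersection
per_class_iou = intersection / union  # always shape (num_classes,); NaN where a class never occurred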

@justusschock
Member

@remisphere I'm aware that this may be a memory issue, but we don't store it to disk by default. You could do so manually and just pass the file paths to the epoch end to restore the data there. I'm familiar with the way Ignite handles it, but IMO it's not that intuitive, and it also doesn't change much, since you still need all the data to compute the confusion matrix.
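
Something along these lines, if memory becomes the bottleneck (just a sketch; the dump directory and my_metric are placeholders):

import os
import torch

def validation_step(self, batch, batch_idx):
    # write each batch to disk instead of keeping it all in memory
    pred = self(batch[0])
    os.makedirs('val_outputs', exist_ok=True)  # hypothetical dump directory
    path = os.path.join('val_outputs', f'batch_{batch_idx}.pt')
    torch.save({'pred': pred.cpu(), 'target': batch[1].cpu()}, path)
    return {'path': path}

def validation_epoch_end(self, outputs):
    # restore everything once, at the end of the epoch, then compute the metric
    batches = [torch.load(tmp['path']) for tmp in outputs]
    preds = torch.cat([b['pred'] for b in batches])
    targets = torch.cat([b['target'] for b in batches])
    metric_val = my_metric(preds, targets)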

@remisphere
Author

remisphere commented Jul 30, 2020

Ok, thanks for the advice!
To me it looks like Ignite doesn't store all the data (see here), but aggregates statistics instead, just like what you are planning to do.

@abrahambotros
Contributor

@remisphere #3097 fixes some of the issues you mentioned in the issue description above - you can now specify num_classes for the class version of IoU, similar to how you could already specify it for the functional version. If you're still computing IoU on just a single example (and not over the whole dataset, as recommended above, if you can), you can also specify the absent_score to use for any class not present in either the target or the prediction.
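
Rough usage (a sketch only; the import path for the class-based IoU and its defaults vary between releases, so check the docs for your version):

import torch
# exact import path for the class-based IoU depends on the release
from pytorch_lightning.metrics import IoU

pred = torch.tensor([[0, 1, 1]])
target = torch.tensor([[0, 1, 0]])

# class 2 never appears in pred or target; with num_classes given it is still
# accounted for, scored with absent_score instead of being silently dropped
iou_metric = IoU(num_classes=3, absent_score=0.0)
score = iou_metric(pred, target)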

Regarding the aggregation discussed above, it looks like a big metric-aggregation PR (#3321) just got merged, though I haven't fully checked it out to know whether it would make IoU feasible over large datasets / datasets with large images. Otherwise, some of the recommendations from @justusschock above might still be applicable, and are things I might look into trying as well.

@ToucheSir

I'm not sure what kind of aggregation @remisphere was looking for, but #3321 won't aggregate confusion matrix based metrics in the same way one would expect e.g. Keras to. Frankly, I'm not too clear on how one would interpret summing or averaging confusion matrices from different batches.

That said, the solution above still works and can be cleaned up a bit now that Train/EvalResult exist.

Lightning-AI locked and limited conversation to collaborators on Feb 4, 2021

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →
