Skip to content
This repository has been archived by the owner on Jul 2, 2021. It is now read-only.

Add ADE20K dataset #429

Merged
merged 17 commits into from
Oct 6, 2017
Merged

Add ADE20K dataset #429

merged 17 commits into from
Oct 6, 2017

Conversation

mitmul
Copy link
Member

@mitmul mitmul commented Sep 27, 2017

This PR adds a semantic segmentation dataset "ADE20K" from MIT Scene Parsing Benchmark http://sceneparsing.csail.mit.edu/

@mitmul mitmul added the feature label Sep 27, 2017
p = Pool(2)
data_root = download.get_dataset_directory(root)
urls = [trainval_url, test_url]
ret = [p.apply_async(utils.cached_download, args=(url,)) for url in urls]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cached_download shows an progress report. Is it safe to call multiple cached_download at the same time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. So, actually, it tries to show two progress reports at the same line, then the display of the progress report is broken. Should I do those download sequentially?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer sequential downloading. Is the downloading too slow?

self.dataset = ADE20KSemanticSegmentationDataset(split=self.split)

@attr.slow
def test_camvid_dataset(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_camvid_dataset?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved

automatically downloaded into
:obj:`$CHAINER_DATASET_ROOT/pfnet/chainercv/ade20k`.
split ({'train', 'val', 'test'}): Select from dataset splits used in
Cityscapes dataset.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cityscapes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I forgot to fix the name... Thanks for pointing out it!

When :obj:`split` is either :obj:`train` or :obj:`val`, it returns
a tuple consited of a color image and a label whose shapes are
(3, H, W) and (H, W), respectively, while :obj:`split` is
:obj:`test`, it returns only the color image. H and W are height
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If split == 'test', this class is not a semantic segmentation dataset. This behaviour is confusing because other semantic segmentation datasets always return pairs of image and label. This is against the definition of ***SemanticSegmentationDataset.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, what should I do? Are you suggesting that I should remove the test split from this class or I should treat this class as a dataset that is not a SemanticSegmentationDataset?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or making the test split as a separate one that is just an ImageDataset could be a possible choice. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Among those two solutions, I prefer later one, separating into ADE20KSegmentationDataset and ADE20K(Test)ImageDataset.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll try that option. Thanks for the advice!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Among those two solutions, I prefer later one, separating into ADE20KSegmentationDataset and ADE20K(Test)ImageDataset.

Sorry. ADE20KSegmentationDataset -> ADE20KSemanticSegmentationDataset

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@Hakuyume
Copy link
Member

Hakuyume commented Sep 27, 2017

Does ADE20K support other types of annotation? Can we add another ADE20K dataset (e.g. ADE20KInstanceSegmentationDataset)?

@mitmul
Copy link
Member Author

mitmul commented Sep 27, 2017

Does ADE20K support other types of annotation?

Yes, it does. It has instance-aware segmentation mask and parts segmentation mask.

# The values used here are copied from CSAILVision/sceneparsing:
# https://github.com/CSAILVision/sceneparsing

ade20k_label_names = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about changing this to a tuple?

'flag'
]

ade20k_label_colors = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

assert_is_semantic_segmentation_dataset(
self.dataset, len(ade20k_label_names), n_example=10)
else:
for img in self.dataset[:10]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about testing on randomly selected 10 indices.


def get_ade20k():
data_root = download.get_dataset_directory(root)
cache_fn = utils.cached_download(url)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cache_fn -> cache_path


def get_ade20k():
data_root = download.get_dataset_directory(root)
cache_fn = utils.cached_download(url)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

def get_ade20k():
data_root = download.get_dataset_directory(root)
cache_fn = utils.cached_download(url)
utils.extractall(cache_fn, data_root, os.path.splitext(url)[1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

def get_ade20k():
data_root = download.get_dataset_directory(root)
cache_fn = utils.cached_download(url)
utils.extractall(cache_fn, data_root, os.path.splitext(url)[1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@mitmul
Copy link
Member Author

mitmul commented Sep 28, 2017

@yuyu2172 Thanks for the comments, I fixed them

yuyu2172
yuyu2172 previously approved these changes Sep 28, 2017
assert_is_semantic_segmentation_dataset(
self.dataset, len(ade20k_label_names), n_example=10)
else:
idx = np.random.permutation(np.arange(len(self.dataset)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, one more thing.
Can you change idx to indices.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than that, LGTM.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Travis failed with a build issue...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed idx to indices

@yuyu2172 yuyu2172 dismissed their stale review September 28, 2017 03:19

Travis needs to be fixed.

@yuyu2172
Copy link
Member

yuyu2172 commented Sep 28, 2017

Can you merge master?
Travis should work now thanks to #431

@mitmul
Copy link
Member Author

mitmul commented Sep 28, 2017

Merged with master.

@yuyu2172
Copy link
Member

LGTM

@Hakuyume
What do you think?

{'split': 'val'},
{'split': 'test'}
)
class TestADE20KDataset(unittest.TestCase):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about separating into two classes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, will fix

from chainercv.datasets.ade20k.ade20k_semantic_segmentation_dataset import ADE20KSemanticSegmentationDataset # NOQA
from chainercv.datasets.ade20k.ade20k_test_image_dataset import ADE20KTestImageDataset # NOQA
from chainercv.datasets.ade20k.ade20k_utils import ade20k_label_colors # NOQA
from chainercv.datasets.ade20k.ade20k_utils import ade20k_label_names # NOQA
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to our naming convention, ade20k_label_colors should be ade20k_semantic_segmentation_label_colors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/chainer/chainercv/blob/master/chainercv/datasets/cityscapes/cityscapes_utils.py#L53

You think this is a problem also?
label_names should be changed as well because it can be different from the one used by Instance Segmentation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is also a problem. Thank you for pointing out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarize:

  • ade_label_colors --> ade_semantic_segmentation_label_colors
  • ade_label_names --> ade_semantic_segmentation_label_names
  • cityscapes_label_colors --> cityscapes_semantic_segmentation_label_colors
  • cityscapes_label_names --> cityscapes_semantic_segmentation_label_names

Objects for CamVid need not be changed because this dataset only contains semantic segmentation data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for summarizing.

Objects for CamVid need not be changed because this dataset only contains semantic segmentation data.

I agree with you.

@mitmul
Copy link
Member Author

mitmul commented Sep 29, 2017

I resolved all points in your reviews

@yuyu2172
Copy link
Member

Thank you.
Oh. Travis is failing.

@mitmul
Copy link
Member Author

mitmul commented Sep 30, 2017

Fixed the travis errors

@yuyu2172 yuyu2172 added this to the v0.7 milestone Oct 5, 2017
.. autoclass:: ADE20KSemanticSegmentationDataset

ADE20KTestImageDataset
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The length of ~.

url = 'http://data.csail.mit.edu/places/ADEchallenge/release_test.zip'


def get_ade20k():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about moving this function under ade20k_utils?

@mitmul
Copy link
Member Author

mitmul commented Oct 5, 2017

@Hakuyume Thanks. Reflected your comments.

Copy link
Member

@Hakuyume Hakuyume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Hakuyume Hakuyume merged commit fb2b5b5 into chainer:master Oct 6, 2017
@mitmul mitmul deleted the add-ade20k branch October 6, 2017 11:24
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants