This repository has been archived by the owner on Jul 2, 2021. It is now read-only.

Add return_bb option to CUBDatasets and add a test #399

Merged (12 commits) on Oct 5, 2017

yuyu2172
Member

No description provided.

@yuyu2172 yuyu2172 added the test label Aug 18, 2017
@yuyu2172 yuyu2172 added this to the v0.7 milestone Aug 18, 2017
def test_cub_label_dataset(self):
    assert_is_classification_dataset(
        self.dataset, len(cub_label_names), n_example=10)

Member

Is it possible to check the effect of crop_bbox?

Member Author

This is a totally different matter, but what do you think about changing this dataset to return the bbox instead of a cropped image?
This is more flexible because in some cases users want to crop an image by a padded bbox.

This can be done by replacing the crop_bbox option with return_bbox. This style of interface is similar to VOCDetectionDataset, which can also return extra data.

Member

Returning bbox looks good to me. As you said, it will be more flexible. Perhaps adding transforms.image.crop_with_bbox (or simply crop) would be helpful. With this function, we can write a cropped dataset easily:
TransformDataset(CUBLabelDataset(return_bbox=True), lambda in_data: (crop_with_bbox(in_data[0], in_data[2]), in_data[1])).
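
A minimal sketch of the cropping idea discussed here, using only NumPy. It assumes a CHW image and a bounding box given as (y_min, x_min, y_max, x_max); `crop_with_bb` and its `pad` argument are hypothetical names chosen for illustration, not an existing ChainerCV function.

```python
import numpy as np


def crop_with_bb(img, bb, pad=0):
    """Crop a CHW image to a bounding box, optionally padded.

    Hypothetical helper: `bb` is assumed to be (y_min, x_min, y_max, x_max).
    """
    _, height, width = img.shape
    y_min, x_min, y_max, x_max = (int(v) for v in bb)
    # Expand the box by `pad` pixels and clip it to the image bounds.
    y_min = max(y_min - pad, 0)
    x_min = max(x_min - pad, 0)
    y_max = min(y_max + pad, height)
    x_max = min(x_max + pad, width)
    return img[:, y_min:y_max, x_min:x_max]


# Toy example in the (img, label, bb) order used in the comment above.
img = np.zeros((3, 100, 120), dtype=np.float32)
label = 7
bb = np.array([10, 20, 60, 90], dtype=np.float32)

tight = crop_with_bb(img, bb)            # crop exactly at the box
padded = crop_with_bb(img, bb, pad=15)   # crop with a 15-pixel margin
```

Returning the box rather than a cropped image leaves this choice of margin to the user, which is the flexibility argued for above.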

Member

I think return_bbox is an inaccurate name because we defined bbox as a set of bounding boxes.

Member Author

I was thinking of returning a set of bounding boxes (shape=(1, 4)).

Member Author

Yes, it is redundant.

Member Author

But I think it is better to keep the data type consistent with other bboxes so that we can use the existing tools for bboxes.

Member

I think naming the option return_bb and returning an array of shape (4,) is better. Users can still use the bbox utilities via bb[np.newaxis] or bb[None].
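
A short sketch of the conversion mentioned here: a single box of shape (4,) becomes a set of boxes of shape (1, 4) by adding a leading axis. The coordinate values and their order are assumptions for illustration only.

```python
import numpy as np

# A single bounding box of shape (4,); the coordinate order is an assumption.
bb = np.array([10.0, 20.0, 60.0, 90.0], dtype=np.float32)

# Both spellings add a leading axis, giving a set that holds one box.
bbox_a = bb[np.newaxis]
bbox_b = bb[None]
```

`np.newaxis` is an alias for `None`, so the two expressions are equivalent.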

Member Author

OK.

Member Author

I have made this change.

@Hakuyume
Member

I noticed we have both LabelDataset (CUBLabelDataset) and ClassificationDataset (DirectoryParsingClassificationDataset). We should unify these names because their tasks are the same.

@yuyu2172
Member Author

You are right.
ClassificationDataset is good because label is overloaded too much.
Also, it is consistent with other names of datasets which contain task names.

On the other hand, LabelDataset is good because this can be used for tasks other than Classification.

@Hakuyume
Member

From my understanding, Classification in the class name indicates the main task for which the dataset was designed. Its name does not limit the usage of the dataset.

As you pointed out, we can use annotation type instead of task type. In this case, we should change class names as follows:

  • ClassificationDataset -> ImagewiseLabelDataset
  • DetectionDataset -> BBoxWithLabelDataset (?)
  • SemanticSegmentationDataset -> PixelwiseLabelDataset

@yuyu2172
Member Author

In my opinion, there are three options. I slightly prefer 2 or 3.

  1. Name by the main tasks for which the dataset was designed.
  2. Name the dataset by the most distinctive annotation.

In this case, I would also suggest the following names:

  • ClassificationDataset -> LabelDataset
  • DetectionDataset -> BboxDataset
  • SemanticSegmentationDataset -> SemanticSegmentationDataset.

One advantage with this naming convention is that it only uses notations that have already been used in ChainerCV.

  3. Treat LabelDataset as a special case. LabelDataset can be used in multiple tasks such as Classification and Image Retrieval. This is the primary reason why ClassificationDataset sounds wrong. On the other hand, there is a one-to-one correspondence between a task and a dataset type for Detection and SemanticSegmentation. There is no problem with assigning a task name in this case.

@Hakuyume
Member

> ClassificationDataset -> LabelDataset
> DetectionDataset -> BboxDataset
> SemanticSegmentationDataset -> SemanticSegmentationDataset.

SemanticSegmentationDataset looks inconsistent. This is the name of a task.

> On the other hand, there is a one-to-one correspondence between a task and a dataset type for Detection and SemanticSegmentation.

This is not true. For example, a detection dataset can be used for an object counting task.

@yuyu2172
Member Author

yuyu2172 commented Aug 20, 2017

> This is not true. For example, a detection dataset can be used for an object counting task.

I see. So it seems that option 3 is too arbitrary.

> SemanticSegmentationDataset looks inconsistent. This is the name of a task.

Although it is the name of the task, it is also the name of the output. This can be improved.

@Hakuyume
Member

> Although it is the name of the task, it is also the name of the output. This can be improved.

How about SemanticMask?

@yuyu2172
Member Author

I am not sure if that name is common in the field.

@Hakuyume
Member

Personally, I prefer option 1, "Name by the main tasks for which the dataset was designed." Is there anyone who thinks, "I cannot use this dataset for an image retrieval task because it is named Classification"?

@yuyu2172
Member Author

On top of the inherent inconsistency problem, the task name is longer (Label -> Classification).
I observe this phenomenon a lot.

  • Keypoint -> Pose Estimation
  • Caption -> Question Answering (Although in this case, we can abbreviate the task name to QA)
  • Scene Graph -> Scene Graph Generation

I took a quick look at COCO and Visual Genome, which are prominent datasets that cover multiple tasks.

> SemanticSegmentationDataset looks inconsistent. This is the name of a task.

I think a user would not get confused about the name of Bbox/Detection dataset just because there is SemanticSegmentationDataset.

@Hakuyume
Member

> I think a user would not get confused about the name of Bbox/Detection dataset just because there is SemanticSegmentationDataset.

What do you mean?

@yuyu2172
Member Author

It is totally fine to have SemanticSegmentationDataset alongside BboxDataset.
We can have a task and an annotation whose names are the same.

@Hakuyume
Member

> We can have a task and an annotation whose names are the same.

Yes, that is not the problem. However, Segmentation sounds like the act of separating a thing into pieces. Do we call the separated pieces a segmentation? Aren't they segments? This is the reason why SemanticSegmentationDataset looks strange to me.

@yuyu2172
Member Author

segmentation can have the same meaning as segment.

I googled the word, and it is used in two ways.
https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FAST

@Hakuyume
Member

> segmentation can have the same meaning as segment.

I see. So, SemanticSegmentationDataset is OK.
The names of datasets will be LabelDataset, BboxDataset and SemanticSegmentationDataset, right?
Perhaps, it is better to rename (pixel-wise) label to segmentation (seg/segm?). For example, vis_label -> vis_segmentation.

@yuyu2172
Member Author

> Perhaps, it is better to rename (pixel-wise) label to segmentation (seg/segm?)

Looking back at our internal discussion, label was preferred over segm because there is no similar name for score.
label/score pair is used throughout the library and it was suggested to be used for Semantic Segmentation as well.

However, vis_segmentation is much more informative than vis_label.
Since the situation has changed, I think segm is good.

By the way, taking instance_segmentation into consideration, I think vis_semantic_segmentation is better.

@Hakuyume
Member

> label/score pair is used throughout the library and it was suggested to be used for Semantic Segmentation as well.

I see.

> By the way, taking instance_segmentation into consideration, I think vis_semantic_segmentation is better.

I thought the functionality of vis_instance_segmentation would be very similar to that of vis_semantic_segmentation, so we could handle both tasks with one function. But vis_semantic_segmentation looks better now.

@Hakuyume
Member

Let me summarize.

  • datasets -> <Identity><Main-annotation-type>Dataset
  • dataset assertions -> assert_is_<main-annotation-type>_dataset
  • visualizations -> vis_<main-annotation-type>
  • model assertions -> assert_is_<task>_link

@yuyu2172
Member Author

I think it is OK to change label to segm.

@Hakuyume
Member

Hakuyume commented Aug 21, 2017

> I think it is OK to change label to segm.

  • LabelDataset -> img, label (per image)
  • BboxDataset -> img, bbox, label (per bounding box)
  • SemanticSegmentationDataset -> img, segm (per pixel)
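
As a rough illustration of the three return conventions listed above, the sketch below builds one toy example per dataset type with NumPy. The function names, shapes, and dtypes are assumptions made for this sketch and are not taken from ChainerCV.

```python
import numpy as np


def make_img():
    # A dummy CHW image; the size is arbitrary.
    return np.zeros((3, 32, 32), dtype=np.float32)


def label_example():
    # LabelDataset style: one label per image.
    return make_img(), np.int32(3)


def bbox_example():
    # BboxDataset style: R boxes of shape (R, 4) and one label per box.
    bbox = np.array([[2, 4, 20, 28], [5, 5, 10, 10]], dtype=np.float32)
    label = np.array([0, 2], dtype=np.int32)
    return make_img(), bbox, label


def semantic_segmentation_example():
    # SemanticSegmentationDataset style: one class id per pixel, shape (H, W).
    segm = np.zeros((32, 32), dtype=np.int32)
    return make_img(), segm


img, label = label_example()
_, bbox, bbox_label = bbox_example()
_, segm = semantic_segmentation_example()
```

The point of the sketch is only that the annotation granularity differs: per image, per bounding box, and per pixel.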

@yuyu2172
Member Author

yuyu2172 commented Aug 21, 2017

Adding to that, there are the following objects:

  • Evaluator: <task><metric>Evaluator (edited)
  • VisReport <task>VisReport
  • evaluations: eval_<metric>
  • transforms <operation>_<annotation-type> (e.g. resize_bbox)

@Hakuyume
Member

> VisReport <task>VisReport

Considering the consistency with vis_*, shouldn't it be <Main-annotation-type>VisReport? Do you intend to specify the type of target?

> transforms <operation>_<annotation-type> (e.g. resize_bbox)

If the <annotation-type> is image, we omit it, right? (e.g. random_crop_image -> random_crop)

@yuyu2172
Member Author

So there are three conventions:

  1. Annotation convention: (e.g. img is CHW)
  2. Dataset convention: the set of annotations returned by a dataset (e.g. img, label is returned by LabelDataset)
  3. Task convention: the input and output of a network (e.g. for the Classification task, the network takes img, label as input during training; during testing, it takes img and outputs score (or prob))

3 depends on 2 and 2 depends on 1.
For example, Detection task assumes the BboxDataset as a dataset.

> Considering the consistency with vis_*, shouldn't it be <Main-annotation-type>VisReport? Do you intend to specify the type of target?

Since VisReport assumes the input and output of a network, I think its behavior is selected by a task.

> If the <annotation-type> is image, we omit it, right? (e.g. random_crop_image -> random_crop)

Yes. This is for convenience.
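
To make the `<operation>_<annotation-type>` convention concrete, here is a small NumPy sketch with an image transform whose name omits the annotation type and a bbox transform whose name keeps it. Both functions are simplified stand-ins written for this illustration, and the (y_min, x_min, y_max, x_max) box layout is an assumption.

```python
import numpy as np


def flip(img, x_flip=False):
    """Horizontally flip a CHW image; the annotation type is omitted from the name."""
    return img[:, :, ::-1] if x_flip else img


def flip_bbox(bbox, size, x_flip=False):
    """Horizontally flip (R, 4) boxes for an image of `size` = (height, width)."""
    _, width = size
    bbox = bbox.copy()  # leave the caller's array untouched
    if x_flip:
        # After flipping, the old right edge becomes the new left edge.
        x_min = width - bbox[:, 3]
        x_max = width - bbox[:, 1]
        bbox[:, 1] = x_min
        bbox[:, 3] = x_max
    return bbox


img = np.arange(6, dtype=np.float32).reshape(1, 2, 3)
bbox = np.array([[2, 4, 20, 10]], dtype=np.float32)

flipped_img = flip(img, x_flip=True)
flipped_bbox = flip_bbox(bbox, (32, 32), x_flip=True)
```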

@yuyu2172
Member Author

@Hakuyume
Let's finish this.

@Hakuyume
Member

Do you mean this?

label_names (not imagewise_label_names)
LabelDataset instead of ImagewiseLabelDataset.

It looks OK. This can be better because it is shorter.

I mean both.

| dataset | values | additional info |
| --- | --- | --- |
| LabelDataset | img, label | label_names |
| BoundingboxDataset | img, bbox, label | boundingbox_label_names |
| SemanticSegmentationDataset | img, label | semantic_segmentation_label_names, semantic_segmentation_label_colors |

@yuyu2172
Member Author

I would prefer bbox_label_names over boundingbox_label_names.
bbox is used throughout the library.

Other than that, it looks ok.

@Hakuyume
Member

Hakuyume commented Sep 12, 2017

> I would prefer bbox_label_names over boundingbox_label_names.
> bbox is used throughout the library.

Sorry, it's my mistake.

| dataset | values | additional info | visualizer | related tasks |
| --- | --- | --- | --- | --- |
| LabelDataset | img, label | label_names | - | classification, image retrieval |
| BboxDataset | img, bbox, label | bbox_label_names | vis_bbox | object detection, object counting |
| SemanticSegmentationDataset | img, label | semantic_segmentation_label_names, semantic_segmentation_label_colors | vis_semantic_segmentation | semantic segmentation |

@Hakuyume
Member

> If we choose to use LabelDataset for datasets with imagewise annotations, DirectoryParsingLabelDataset would be a good name.

I agree with you.

@yuyu2172
Member Author

Sorry. I forgot to point out, but I prefer the dataset name for bbox to be BboxDataset.

I will start implementing these changes.

@Hakuyume
Member

> Sorry. I forgot to point out, but I prefer the dataset name for bbox to be BboxDataset.

Sorry, I fixed it. And I added two columns 'visualizer' and 'related tasks'.

@yuyu2172
Member Author

Please merge this after #405.

@yuyu2172 yuyu2172 changed the title from "Add a test for CUBLabelDataset" to "Add return_bb option to CUBDatasets and add a test" Sep 20, 2017
Member

@Hakuyume Hakuyume left a comment

Why don't you support mask in CUBLabelDataset?

@yuyu2172
Member Author

yuyu2172 commented Oct 5, 2017

Good idea.

@yuyu2172
Member Author

yuyu2172 commented Oct 5, 2017

@Hakuyume
I found that mask is not really an appropriate name for this data because its values can be anywhere in [0, 255].
I think prob_map is a better name for this.
Since this is a relatively big change, how about making another PR?

@Hakuyume
Member

Hakuyume commented Oct 5, 2017

> I found that mask is not really an appropriate name for this data because its values can be anywhere in [0, 255].
> I think prob_map is a better name for this.

Is it a probability value? If so, can we scale it to [0, 1)?

> Since this is a relatively big change, how about making another PR?

Yes, that is better.
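
For reference, one possible way to do the scaling asked about above, sketched with NumPy. It assumes the raw map is stored as uint8 in [0, 255]; whether the values are truly probabilities is not settled in this thread, and `prob_map` is just the name proposed above.

```python
import numpy as np

# Hypothetical raw map with values in [0, 255].
raw = np.array([[0, 128], [200, 255]], dtype=np.uint8)

# Dividing by 256 (not 255) keeps the maximum strictly below 1,
# so the result lies in the half-open interval [0, 1).
prob_map = raw.astype(np.float32) / 256.0
```

Dividing by 255 instead would map the largest value to exactly 1, giving the closed interval [0, 1].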

@yuyu2172
Member Author

yuyu2172 commented Oct 5, 2017

Yes.

Please review this first.

Member

@Hakuyume Hakuyume left a comment

LGTM except mask

@Hakuyume Hakuyume merged commit 82f8ef8 into chainer:master Oct 5, 2017