This repository has been archived by the owner on Jul 2, 2021. It is now read-only.

Add VGG16 #265

Merged
merged 156 commits into from
Aug 21, 2017
Member

@yuyu2172 yuyu2172 commented Jun 12, 2017

Merge after #271.

This PR adds VGG16Layers, whose API is consistent with the rest of the links in ChainerCV.

  • RGB images will work with the pretrained model. (TODO: A script to convert Caffe weights to an npz file will be added.)
  • The Caffe loader is separated from the model code, which keeps the model code shorter.
  • predict returns an iterable of numpy.array.
  • __init__ has an interface to select weight initializers (users can manually opt out of random initialization).
  • __call__ does not have a layers option. Also, the return value is not a dictionary but a chainer.Variable.
  • __init__ takes a feature option, which selects the feature that is returned by __call__.
  • Depending on the feature option, layers that are unnecessary are automatically deleted.
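
The last two points can be illustrated with a minimal sketch in plain Python. It is not the ChainerCV implementation: `SequentialExtractor`, its constructor arguments, and the toy layers are hypothetical stand-ins chosen only to show the idea of selecting one feature at construction time and dropping the layers that come after it.

```python
import collections


class SequentialExtractor:
    """Toy stand-in for a chain whose layers are applied in order."""

    def __init__(self, layers, feature):
        # layers: list of (name, callable) pairs in execution order.
        names = [name for name, _ in layers]
        if feature not in names:
            raise ValueError('unknown feature: {}'.format(feature))
        # Keep only the layers needed to compute the requested feature.
        # Everything after it is dropped (the "automatic deletion").
        last = names.index(feature)
        self.layers = collections.OrderedDict(layers[:last + 1])
        self.feature = feature

    def __call__(self, x):
        # Return the selected feature itself, not a dictionary of features.
        for func in self.layers.values():
            x = func(x)
        return x


layers = [
    ('conv', lambda x: x * 2),
    ('relu', lambda x: max(x, 0)),
    ('fc', lambda x: x + 1),
    ('prob', lambda x: x / 10),
]
extractor = SequentialExtractor(layers, feature='relu')
print(list(extractor.layers))  # ['conv', 'relu']
print(extractor(-3))           # 0
```

Dropping the unused layers at construction time means their parameters are never allocated, which matters when the chain only serves as a backbone for another network.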

These changes are made with the following concrete scenarios in mind.

  • The chain will be used as a feature extractor for networks used in tasks other than classification. For example, this chain can be used as a feature extractor for FasterRCNNVGG16 with a proper initialization.
  • Usage together with other ChainerCV functions, which assume RGB image order.
  • I am assuming that the chain is used to extract only one type of feature. Also, I am assuming that the kind of extracted feature is fixed after initializing the chain. (EDIT: This assumption is no longer true in our final design)
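
The RGB scenario depends on converting weights that were trained on BGR input. A minimal sketch of why reversing the input-channel axis of the first convolution is enough, using plain Python lists and a 1x1 convolution on a single pixel. The helper `conv1x1` and the toy weights are illustrative assumptions, not code from this PR.

```python
def conv1x1(weights, pixel):
    """Apply a 1x1 convolution to one pixel.

    weights[o][c] is the weight from input channel c to output channel o.
    """
    return [sum(w * v for w, v in zip(row, pixel)) for row in weights]


# Toy weights trained on BGR input: 2 output channels, 3 input channels.
w_bgr = [[1.0, 2.0, 3.0],
         [0.5, 0.0, -1.0]]

# Reverse the input-channel axis so the same filters accept RGB input.
w_rgb = [row[::-1] for row in w_bgr]

pixel_rgb = [10.0, 20.0, 30.0]
pixel_bgr = pixel_rgb[::-1]

# Original weights on a BGR pixel and flipped weights on an RGB pixel agree.
print(conv1x1(w_bgr, pixel_bgr) == conv1x1(w_rgb, pixel_rgb))  # True
```

Only the first convolution sees the image channels directly, so later layers need no change.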

@yuyu2172 yuyu2172 changed the title from "Add VGG16Layers" to "[WIP] Add VGG16Layers" Jun 12, 2017
@yuyu2172 yuyu2172 mentioned this pull request Jun 12, 2017
3 tasks
self.fc7 = L.Linear(4096, 4096, **kwargs)
self.fc8 = L.Linear(4096, 1000, **kwargs)

self.functions = collections.OrderedDict([

I do not recommend this style of network definition. It might cause problems when an instance of this class is copied.
(See chainer/chainer#2810 .)
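
A minimal sketch of one way such a copy problem can arise, assuming a copy method that duplicates child objects but leaves other attributes shared. `Child`, `Net`, and `Net.copy` are hypothetical stand-ins rather than Chainer classes, and whether this matches the exact case in the linked issue is an assumption.

```python
import collections
import copy


class Child:
    def __init__(self, scale):
        self.scale = scale

    def __call__(self, x):
        return x * self.scale


class Net:
    def __init__(self):
        self.fc = Child(2)
        # Callables captured once, at construction time.
        self.functions = collections.OrderedDict([('fc', self.fc)])

    def copy(self):
        # Hypothetical copy: shallow-copy the parent, then give the copy
        # its own child. The table of callables is not rebuilt.
        ret = copy.copy(self)
        ret.fc = copy.copy(self.fc)
        return ret


net = Net()
clone = net.copy()
clone.fc.scale = 10

# The clone's table still points at the original child.
print(clone.functions['fc'] is net.fc)  # True
print(clone.functions['fc'](1))          # 2, not 10
```

Looking up layers by name at call time, instead of storing the callables, avoids the stale references.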

Member Author

I didn't know that! Thank you!

Member Author

yuyu2172 commented Jun 14, 2017

I am trying to reproduce the evaluation score reported here.
The website reports that VGG16 achieves a 28.5% Top-1 error with a single center crop.

Currently, my implementation scores a 32% Top-1 error, and I am going to investigate why it is worse than the reported score.

EDIT:
I forgot to set chainer.config.train = False. After fixing that, the Top-1 error is 28.97%.

EDIT2:
It seems that the reported performance was calculated with weights trained in MatConvNet. The Caffe weights may be different from the MatConvNet weights.

NOTE:
From the paper, following their evaluation technique, the model is expected to score 27% for Top-1 Error and 8.8% for Top-5 Error (Table 3, Row D, S=Q=256).

NOTE:
With ten-crop evaluation, the model scored a 27.06% Top-1 error, which is reasonably close to the 27% reported in the paper.
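
For reference, the Top-1 and Top-5 errors discussed above can be computed as in the following sketch. The function `topk_error` and the toy scores are illustrative assumptions and are not taken from the evaluation script used for this PR.

```python
def topk_error(scores, labels, k=1):
    """Fraction of samples whose true label is not among the k top scores."""
    wrong = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in topk:
            wrong += 1
    return wrong / len(labels)


scores = [
    [0.1, 0.7, 0.2],   # predicted class 1
    [0.5, 0.3, 0.2],   # predicted class 0
    [0.2, 0.3, 0.5],   # predicted class 2
    [0.6, 0.3, 0.1],   # predicted class 0
]
labels = [1, 1, 2, 2]

print(topk_error(scores, labels, k=1))  # 0.5
print(topk_error(scores, labels, k=2))  # 0.25
```

With multi-crop evaluation, the per-crop scores of an image would typically be averaged first and the result passed in as a single row.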

Member Author

Merged master

@@ -74,7 +73,8 @@ class FasterRCNNVGG16(FasterRCNN):
'voc07': {
'n_fg_class': 20,
'url': 'https://github.com/yuyu2172/share-weights/releases/'
-'download/0.0.3/faster_rcnn_vgg16_voc07_2017_06_06.npz'
+'download/0.0.4/'
+'faster_rcnn_vgg16_voc07_trained_2017_08_06_trial_4.npz'
Member

Why do you use trial_4? Is this the best model?

Member Author

It is the model converted from faster_rcnn_vgg16_voc07_2017_06_06.npz.
It performs the same as the previously distributed model.

Member

Perhaps faster_rcnn_vgg16_voc07_2017_08_06.npz would be a better name. Users will wonder "what is trial_4?" as I did.

Member Author

OK. Thanks for your feedback.


class VGG16(SequentialFeatureExtractor):

"""VGG16 Network for classification and feature extraction.
Member

VGG16 -> VGG-16

"""VGG16 Network for classification and feature extraction.

This is a feature extraction model.
The network can choose to output features from set of all
Member

to output features -> features to output or output features?

if pretrained_model in self._models:
mean = self._models[pretrained_model]['mean']
else:
mean = _imagenet_mean
Member

How about adding a note about these behaviours? The default values of n_class and mean.

>>> prob = model(imgs)

>>> model.feature_names = 'conv5_3'
# This is feature conv5_3.
Member

How about # This is feature conv5_3 (after ReLU).?

Examples:

>>> model = VGG16()
# By default, VGG16.__call__ returns a probability score.
Member

How about a probability score (after Softmax)?


>>> model.feature_names = ['conv5_3', 'fc6']
>>> # These are features conv5_3 and fc6.
>>> feat5_3, feat6 = model(imgs)
Member

# These are features conv5_3 (after ReLU) and fc6 (before ReLU).


Feature Extraction
~~~~~~~~~~~~~~~~~~
Feature extraction models can be used to extract feature(s) given images.
Member

extract feature(s) from given images. ?


2. Extract the training data:
```bash
mkdir train && mv ILSVRC2012_img_train.tar train/ && cd train
```
Member

$ mkdir ...

@@ -0,0 +1,37 @@
import chainer
Member

Considering consistency with other examples, this file should be put under examples/vgg/.
And it is better to convert from .caffemodel.

Member Author

@yuyu2172 yuyu2172 Aug 20, 2017

  1. I think that it is OK to leave the conversion code under the current directory. This is because, unlike other examples, VGG does not have its own demo.py or train.py. I think it is better to keep examples/ with a small number of subdirectories.
  2. OK. I will write a script to convert directly from .caffemodel.

Member

Putting a model-specific utility under a task-specific directory seems to go against your intention. The conversion code for VGG-16 can be used for tasks other than classification.

> This is because unlike other examples, VGG does not have its own demo.py or train.py

Won't you add any training code for VGG or other classification networks?

Anyway, we should keep consistency with other examples. If you want to reduce the number of subdirectories, how about putting detection models under detection?

Member Author

> Putting a model-specific utility under a task-specific directory seems to go against your intention. The conversion code for VGG-16 can be used for tasks other than classification.

OK. Your suggestion makes the conversion code easier to find for users who are not interested in classification.

> If you want to reduce the number of subdirectories, how about putting detection models under detection?

The current layout of directories is easier to explore than nested directories.
I think it is still manageable.
If things start to get out of control, we can write a README under examples/.

@@ -0,0 +1,47 @@
import argparse
Member

You can make this code more general. Please check SSD's conversion code.

Member Author

What do you mean by more general?

Member

It could handle other VGG models (e.g. VGG-19).

Member Author

Is there a caffemodel for VGG-19?

# The pretrained weights are trained to accept BGR images.
# Convert weights so that they accept RGB images.
model.conv1_1.conv.W.data[:] = model.conv1_1.conv.W.data[:, ::-1]
model.conv1_2.conv.copyparams(caffemodel.conv1_2)
Member

These copyparams can be reduced.

Member Author

@yuyu2172 yuyu2172 Aug 21, 2017

How?

Member

We can save CaffeFunction directly. Why do you copy the parameters from CaffeFunction to VGG16?

Member Author

Because VGG16 uses Conv2DActiv, which is not used in the caffemodel.
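
One possible way to shorten per-layer copying when the target wraps each convolution is a loop over layer names. The sketch below uses hypothetical stand-in classes (`Layer`, `ConvActiv`) instead of Chainer links, and it is not necessarily the change proposed in the referenced pull request.

```python
import types


class Layer:
    """Stand-in for a layer that owns parameters."""

    def __init__(self, params=None):
        self.params = params

    def copyparams(self, other):
        self.params = list(other.params)


class ConvActiv:
    """Stand-in for a wrapper holding a conv layer plus an activation."""

    def __init__(self):
        self.conv = Layer()


conv_names = ['conv1_1', 'conv1_2', 'conv2_1']

# Source model: flat layers. Target model: wrapped layers.
source = types.SimpleNamespace()
target = types.SimpleNamespace()
for i, name in enumerate(conv_names):
    setattr(source, name, Layer([float(i)]))
    setattr(target, name, ConvActiv())

# One loop replaces a copyparams line per layer.
for name in conv_names:
    getattr(target, name).conv.copyparams(getattr(source, name))

print(target.conv1_2.conv.params)  # [1.0]
```

The loop only works when both models use the same layer names, which is the case for the snippet under review.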

Member

How about this yuyu2172#4 ?

Member Author

This PR is no-compat because the child links of FasterRCNN have changed.
It now uses Conv2DActiv.

Convert `*.caffemodel` to `*.npz`.

```
$ python caff2npz_vgg_16.py <source>.caffemodel <target>.npz
```
Member

Sorry, I forgot to change this line. Please update caff2npz_vgg_16.py -> caffe2npz.py.
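
For context, a command-line interface matching the usage line above could be declared as in this sketch. The argument names `caffemodel` and `output` and the sample file names are assumptions, and the conversion step itself is omitted.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(
        description='Convert a *.caffemodel file to *.npz.')
    # Positional arguments mirror "<source>.caffemodel <target>.npz".
    parser.add_argument('caffemodel', help='path to the source .caffemodel')
    parser.add_argument('output', help='path to the target .npz')
    return parser.parse_args(argv)


args = parse_args(['source.caffemodel', 'target.npz'])
print(args.caffemodel, args.output)  # source.caffemodel target.npz
```

Passing argv explicitly keeps the parser testable without touching sys.argv.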

@@ -6,7 +6,7 @@ For evaluation, please go to [`examples/classification`](https://github.com/chai
Convert `*.caffemodel` to `*.npz`.

```
-$ python caff2npz_vgg_16.py <source>.caffemodel <target>.npz
+$ python caff2npz.py <source>.caffemodel <target>.npz
```
Member

Typo. caff -> caffe.

Member

@Hakuyume Hakuyume left a comment

LGTM

@Hakuyume Hakuyume merged commit 01d3cb6 into chainer:master Aug 21, 2017
@yuyu2172 yuyu2172 added this to the v0.7 milestone Oct 6, 2017