Add dilation option to ResNet #866
Codecov Report
@@ Coverage Diff @@
## master #866 +/- ##
==========================================
+ Coverage 54.49% 54.58% +0.09%
==========================================
Files 36 36
Lines 3307 3318 +11
Branches 542 545 +3
==========================================
+ Hits 1802 1811 +9
- Misses 1372 1373 +1
- Partials 133 134 +1
I'm merging this, but let me know if anyone disagrees with something or has a better idea on how to improve this part.
Is this change backward compatible? For now I have been using the torch.nn.Module.apply method to directly change the parametrization of the convolution layers, replacing padding, strides and dilation.
@labor00 yes, this change is backwards-compatible. Old models and code still work the same as before, because we do not modify the strides / dilations if the parameters are the default (
The gluoncv repository is very well documented and has lots of tests. I believe that some of its models could be ported to PyTorch without much trouble.
@usdflt what models do you have in mind? Note that we are adding segmentation and detection models to torchvision soon. |
@fmassa I was thinking of models like PSPNet, YOLO-v*, and SSD. Are there any plans to merge the maskrcnn repo into torchvision?
@usdflt the plan is to include simplified versions of Faster R-CNN and Mask R-CNN in torchvision, together with FCN and DeepLabV3.
This PR adds support for replacing 2x2 strides in ResNet with dilation.
This functionality is useful for segmentation models, where high-resolution feature maps are desirable.
Instead of hard-coding the resolution of the feature map, we let the user specify explicitly in which blocks the 2x2 stride should be replaced by a dilated convolution.
Given that there are only 3 blocks with 2x2 strides (if we don't consider the initial convolution, which also performs downsampling), the user needs to specify a 3-element list of booleans.
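To make the effect of the 3-element list concrete, here is a minimal pure-Python sketch of the bookkeeping it implies. It is not the code from this PR: the function name plan_resnet_stages is hypothetical, and it assumes the standard ResNet stem (a stride-2 convolution followed by a stride-2 max pool, so the feature map is already downsampled by 4 before the first block) and that each replaced stride multiplies the running dilation by 2.

```python
from typing import List, Sequence, Tuple


def plan_resnet_stages(
    replace_stride_with_dilation: Sequence[bool] = (False, False, False),
) -> Tuple[List[Tuple[int, int]], int]:
    """Return ([(stride, dilation) for each of the 4 blocks], output_stride).

    Hypothetical helper for illustration only; it mirrors the idea described
    above, not necessarily the exact implementation in the PR.
    """
    if len(replace_stride_with_dilation) != 3:
        raise ValueError("expected a 3-element list of booleans")

    # Assumption: the stem downsamples by 4 (stride-2 conv + stride-2 max pool).
    output_stride = 4
    dilation = 1
    # The first block never downsamples, so it has no stride to replace.
    stages = [(1, 1)]

    for dilate in replace_stride_with_dilation:
        stride = 2
        if dilate:
            # Trade the stride for dilation: resolution is kept, and the
            # dilation accumulates so the receptive field keeps growing.
            dilation *= stride
            stride = 1
        output_stride *= stride
        stages.append((stride, dilation))

    return stages, output_stride


# Default arguments: the usual ResNet with an overall stride of 32.
default_stages, default_os = plan_resnet_stages()

# Dilate the last two blocks: overall stride of 8, as is common for segmentation.
seg_stages, seg_os = plan_resnet_stages([False, True, True])
```

With the default list the plan is identical to a plain ResNet, which is why existing models are unaffected; dilating the last two blocks keeps the feature map at 1/8 of the input resolution instead of 1/32.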
Suggestions for a better name (instead of replace_stride_with_dilation) are welcome.
cc @aadcock