[FEATURE] Why not use Conv2d directly in PatchMerging? #951
-
@WZMIAOMIAO This isn't an issue, so I'm moving it to discussions. Re your question: I assume you mean, why not use F.conv2d with a manually crafted kernel to achieve the same result? As I didn't implement that, it's a better question for the original authors (https://github.com/microsoft/Swin-Transformer). It's not high on my priority list to try, but if you want to compare, I'll update if your approach is better.
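For anyone who wants to try that comparison, here is a minimal sketch of what "F.conv2d with a manually crafted kernel" could look like for the gathering step alone. It assumes the slicing-and-concatenation pattern used in the official Swin PatchMerging (x0..x3 taken with strides of 2, then concatenated on the channel axis) and a channels-last (B, H, W, C) input. The function names `gather_by_slicing` and `gather_by_conv` are hypothetical, and this is not timm's code.

```python
import torch
import torch.nn.functional as F


def gather_by_slicing(x):
    """Gather each 2x2 neighbourhood by strided slicing. x: (B, H, W, C)."""
    x0 = x[:, 0::2, 0::2, :]  # top-left pixel of each 2x2 block
    x1 = x[:, 1::2, 0::2, :]  # bottom-left
    x2 = x[:, 0::2, 1::2, :]  # top-right
    x3 = x[:, 1::2, 1::2, :]  # bottom-right
    return torch.cat([x0, x1, x2, x3], dim=-1)  # (B, H/2, W/2, 4C)


def gather_by_conv(x):
    """Same gather, expressed as a stride-2 conv with a fixed one-hot kernel."""
    B, H, W, C = x.shape
    kernel = torch.zeros(4 * C, C, 2, 2, dtype=x.dtype, device=x.device)
    offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]  # (row, col), same order as x0..x3
    for k, (dy, dx) in enumerate(offsets):
        for c in range(C):
            # Output channel k*C + c copies input channel c at offset (dy, dx).
            kernel[k * C + c, c, dy, dx] = 1.0
    out = F.conv2d(x.permute(0, 3, 1, 2), kernel, stride=2)  # (B, 4C, H/2, W/2)
    return out.permute(0, 2, 3, 1)  # back to channels-last


x = torch.randn(2, 8, 8, 3)
print(torch.allclose(gather_by_slicing(x), gather_by_conv(x)))
```

The conv version does extra multiply-adds against zeros just to copy values, so whether it is faster than plain slicing would have to be measured rather than assumed.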
-
I think the following issue is related to this question. In my opinion, PatchMerge(in_channels, out_channels, downscaling_factor) == nn.Conv2d(in_channels, out_channels, kernel_size=downscaling_factor, stride=downscaling_factor, padding=0). By the way, I drew this conclusion from reviewing another repository:
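A quick way to check this kind of equivalence is to copy the weights of the Linear projection into a Conv2d and compare outputs. The sketch below assumes a downscaling factor of 2, a bias-free Linear applied directly to the concatenated 2x2 neighbourhood, and the x0..x3 ordering from the official Swin code. The helper names (`merge_linear`, `conv_from_linear`, `merge_conv`) are made up for illustration, and the weight permutation would differ if the patches are gathered in another order (for example with nn.Unfold).

```python
import torch
import torch.nn as nn


def merge_linear(x, linear):
    """Concatenate each 2x2 neighbourhood (no norm), then project. x: (B, H, W, C)."""
    x0 = x[:, 0::2, 0::2, :]  # offset (dy=0, dx=0)
    x1 = x[:, 1::2, 0::2, :]  # offset (dy=1, dx=0)
    x2 = x[:, 0::2, 1::2, :]  # offset (dy=0, dx=1)
    x3 = x[:, 1::2, 1::2, :]  # offset (dy=1, dx=1)
    return linear(torch.cat([x0, x1, x2, x3], dim=-1))  # (B, H/2, W/2, C_out)


def conv_from_linear(linear, in_channels):
    """Build a Conv2d(k=2, s=2) whose weights reproduce the given Linear."""
    c_out = linear.out_features
    conv = nn.Conv2d(in_channels, c_out, kernel_size=2, stride=2, padding=0, bias=False)
    with torch.no_grad():
        # Linear input index is (2*dx + dy) * C + c, so view as (out, dx, dy, c)
        # and permute to the conv layout (out, c, dy, dx).
        w = linear.weight.view(c_out, 2, 2, in_channels)
        conv.weight.copy_(w.permute(0, 3, 2, 1))
    return conv


def merge_conv(x, conv):
    """Apply the conv on a channels-last tensor and return channels-last."""
    return conv(x.permute(0, 3, 1, 2)).permute(0, 2, 3, 1)


torch.manual_seed(0)
linear = nn.Linear(4 * 3, 6, bias=False)
conv = conv_from_linear(linear, in_channels=3)
x = torch.randn(2, 8, 8, 3)
print(torch.allclose(merge_linear(x, linear), merge_conv(x, conv), atol=1e-5))
```

One caveat: as far as I can tell, the official Swin PatchMerging applies a LayerNorm over the 4C concatenated channels before the Linear reduction. A single Conv2d does not reproduce that normalization, so the equivalence above only holds for the concatenate-then-project part.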
-
First of all, thank you for your great work.
Is your feature request related to a problem? Please describe.
I'm studying your Swin Transformer code and have a question about PatchMerging: https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/swin_transformer.py#L310-L347. Why not use Conv2d(k=2, s=2) directly to merge 2x2 patches? Is a 2x2 convolution too inefficient, or is the current approach meant to make it easier to load the official weights? I look forward to your reply.