-
Hi, I'm attempting to use EfficientNet as the backbone network in SwinTransformer, replacing the default PatchEmbed layer:

```python
from timm import create_model
from timm.models.vision_transformer_hybrid import HybridEmbed

model = create_model('swin_base_patch4_window7_224', pretrained=True, num_classes=0, in_chans=3)
embedder = create_model('tf_efficientnet_b3_ns', pretrained=True, in_chans=3, features_only=True, out_indices=[2])
hybridembed = HybridEmbed(
    embedder,
    img_size=384,
    patch_size=1,
    feature_size=model.patch_embed.grid_size,
    in_chans=3,
    embed_dim=model.embed_dim,
)
model.patch_embed = hybridembed
```

However, SwinTransformer does not seem to support HybridEmbed, unlike ConViT:

```python
from torchinfo import summary

summary(model, input_size=(8, 3, 224, 224))
```

The code is from here. It seems that SwinTransformer used to support HybridEmbed as the patch embed, but now it doesn't.
-
@HuKaiLang try adding `output_fmt='NHWC'` to the HybridEmbed args. I believe I changed Swin to keep tensors in unflattened form so they'd be more useful for feature extraction.
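Putting the suggestion together with the snippet from the question, the construction would look roughly like the sketch below. This is a minimal sketch, assuming a timm version whose HybridEmbed accepts the `output_fmt` argument; the `img_size=224` here is only chosen to match the 224x224 input passed to `summary()` in the question.

```python
# Sketch of the suggested fix: same setup as the question, plus output_fmt='NHWC'.
# Assumes a timm version whose HybridEmbed accepts an output_fmt argument.
from timm import create_model
from timm.models.vision_transformer_hybrid import HybridEmbed

model = create_model('swin_base_patch4_window7_224', pretrained=True, num_classes=0, in_chans=3)
embedder = create_model('tf_efficientnet_b3_ns', pretrained=True, in_chans=3, features_only=True, out_indices=[2])

hybridembed = HybridEmbed(
    embedder,
    img_size=224,                              # match the 224x224 input used with summary()
    patch_size=1,
    feature_size=model.patch_embed.grid_size,  # keep Swin's original patch grid
    in_chans=3,
    embed_dim=model.embed_dim,
    output_fmt='NHWC',                         # hand Swin an unflattened (B, H, W, C) tensor
)
model.patch_embed = hybridembed
```

The point of `output_fmt='NHWC'` is that the embedder then emits a (B, H, W, C) feature map rather than a flattened (B, N, C) token sequence, which appears to be the form the current Swin stages in timm expect.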