ConvNext in_channels > 3 leads to RuntimeError #1869
-
Hi there! I'm trying to use ConvNeXt on multichannel audio spectrograms but am hitting an error:
Is this just an issue with the spectrogram shape, or am I missing something? For reference, running the same code but using …
Replies: 1 comment
-
I don't think this is related to the number of channels, but rather to the other dimensions (specifically, the small last dimension). To see why, note that the following should run with no problems:

```python
import torch
import timm

encoder = timm.create_model("convnextv2_base", in_chans=8, pretrained=False)
input = torch.ones((1, 8, 224, 224))
encoder.forward_features(input)
```

but if instead you use too low a value for the last dimension (such as 26), we run into an issue, i.e., the below will not work:

```python
small_input = torch.ones((1, 8, 224, 26))
encoder.forward_features(small_input)
```
The reason your code works with […]. You can also see this directly from the error message: the (2 x 2) kernel size referenced there is larger than the padded input size in the last dimension:
```
>>> RuntimeError: Calculated padded input size per channel: (14 x 1). Kernel size: (2 x 2). Kernel size can't be greater than actual input size
```
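To make the shape constraint concrete, here is a small sketch (an assumption about the architecture, not taken from the thread): ConvNeXt's stem downsamples each spatial dimension by 4, and each of the three later stages begins with a stride-2 downsampling conv, so a dimension shrinks by a total factor of 32. Tracing the sizes shows why a last dimension of 26 fails while 224 works:

```python
# Sketch, assuming ConvNeXt's downsampling schedule: a 4x4/stride-4 stem
# conv followed by three 2x2/stride-2 downsampling convs. The function
# `downsampled_sizes` is a hypothetical helper, not part of timm.
def downsampled_sizes(size):
    """Trace one spatial dimension through the four downsampling convs."""
    sizes = [size]
    for stride in (4, 2, 2, 2):
        size = size // stride  # floor division mirrors the conv output size
        sizes.append(size)
    return sizes

print(downsampled_sizes(224))  # [224, 56, 28, 14, 7] -- every stage sees size >= 2
print(downsampled_sizes(26))   # [26, 6, 3, 1, 0] -- the final 2x2 conv sees size 1
```

In the real model the last step does not produce 0: the 2x2 downsampling conv raises the RuntimeError above as soon as a dimension drops below the kernel size, which matches the `(14 x 1)` padded input in the traceback. Under this assumption, padding or resizing the spectrogram so its last dimension is at least 32 should avoid the error.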