Problem I met when using resize #4

Open
a-green-hand-jack opened this issue Jan 15, 2024 · 1 comment

@a-green-hand-jack

I had some problems using transforms.Resize((32, 32)): my network is designed for 3*32*32 images, so I tried transforms.Resize((32, 32)), but an error like this occurred:

TypeError: Unexpected type <class 'numpy.ndarray'>
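
For context, torchvision's Resize only accepts PIL Images or torch Tensors, while the images coming out of MLclf are numpy arrays. A minimal sketch that should reproduce the same error (the 84x84 shape here is just an example):

import numpy as np
from torchvision import transforms

# An image as an H x W x C numpy array, the way MLclf provides them
img = np.zeros((84, 84, 3), dtype=np.uint8)

resize = transforms.Resize((32, 32))
resize(img)  # TypeError: Resize does not accept numpy arrays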

To solve this problem, I tried to define a ResizeClass of my own:

import numpy as np
from PIL import Image
from torchvision import transforms


class ResizeCustom(transforms.Resize):
    def __init__(self, size, interpolation=Image.BILINEAR):
        super(ResizeCustom, self).__init__(size, interpolation)

    def __call__(self, img):
        # Convert numpy arrays to PIL Images before the parent Resize runs
        if isinstance(img, np.ndarray):
            img = Image.fromarray(img)

        return super(ResizeCustom, self).__call__(img)

However, the problem was not solved and an error like this appeared:

  File "d:\Slef_Learning\MY_Project\WuYang\TDA_new_dataset\nets\net_out_tda.py", line 123, in images_to_matrix_lists
    trainset, validation_dataset, test_dataset = MLclf.miniimagenet_clf_dataset(ratio_train=0.6, ratio_val=0.2, seed_value=None, shuffle=True, transform=self.train_transform, save_clf_data=True)
                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda\envs\PyTorchGpu\Lib\site-packages\MLclf\MLclf.py", line 319, in miniimagenet_clf_dataset
    data_feature_label_permutation_split = MLclf.miniimagenet_convert2classification(data_dir=data_dir, ratio_train=ratio_train, ratio_val=ratio_val, seed_value=seed_value, shuffle=shuffle, task_type='classical_or_meta', save_clf_data=save_clf_data, transform=transform)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda\envs\PyTorchGpu\Lib\site-packages\MLclf\MLclf.py", line 193, in miniimagenet_convert2classification
    data_feature_label['images'] = MLclf._feature_norm(data_feature_label['images'], transform=transform)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda\envs\PyTorchGpu\Lib\site-packages\MLclf\MLclf.py", line 295, in _feature_norm
    feature_output[i] = transform(feature_i)
    ~~~~~~~~~~~~~~^^^
RuntimeError: The expanded size of the tensor (84) must match the existing size (32) at non-singleton dimension 2.  Target sizes: [3, 84, 84].  Tensor sizes: [3, 32, 32]

It seems there is some problem with the image size, but I don't understand where it occurs. In particular, where does this [3, 84, 84] come from? Isn't the original image size 64*64?

I have also tried a few other things, but none of them had any effect. Is there a solution?

@a-green-hand-jack (Author)

Well, I found a way that might solve this problem.
First, when using Resize, apply ToTensor before it, that is:

train_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Resize((32, 32), antialias=True),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
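
This works because ToTensor accepts a numpy array directly and converts it to a C x H x W tensor, so Resize then runs on its tensor code path; antialias=True keeps the tensor resize close to the PIL behaviour (and, in recent torchvision versions, avoids the warning about the two backends differing).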

Then you may also need to modify the _feature_norm function in the source code.

I think the problem is this: feature_output is created based on the size of the original images. That is to say, its shape corresponds to the images before the transform, but the transformed images have a different shape, so the assignment fails.
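
I haven't checked MLclf's exact allocation, but judging from the traceback the failing pattern looks equivalent to this sketch (mini-ImageNet images are natively 84x84, which would also explain the [3, 84, 84] target size in the error, rather than 64*64):

import numpy as np
import torch
from torchvision import transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Resize((32, 32), antialias=True),
])

# Fake 84x84 RGB images stored the way MLclf keeps them: (batch, H, W, C)
feature = np.zeros((10, 84, 84, 3), dtype=np.float32)

# A buffer pre-allocated with the *original* spatial size...
feature_output = torch.zeros((feature.shape[0], 3, 84, 84))

# ...so writing a resized 3x32x32 tensor into a 3x84x84 slot fails:
feature_output[0] = transform(feature[0])
# RuntimeError: The expanded size of the tensor (84) must match the existing size (32)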

After making the following modification, you can use Resize normally.

@staticmethod
def _feature_norm(feature, transform=None):
    """
    This function transforms the dimension of feature from (batch_size, H, W, C) to (batch_size, C, H, W).
    :param feature: feature / mini-imagenet's images.
    :return: transformed feature.
    """
    if transform is None:
        # Convert a PIL image / numpy array (H*W*C, range [0, 255]) to a torch.Tensor (C*H*W, range [0.0, 1.0])
        transform = transforms.Compose([transforms.ToTensor()])
        print('The argument transform is None, so only tensor conversion and normalization between [0,1] is done!')

    feature_output = []
    for feature_i in feature:
        # Append to a list instead of writing into a pre-allocated buffer,
        # so the output shape follows the transform rather than the input images
        feature_output.append(transform(feature_i))

    return torch.stack(feature_output).numpy()
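
A quick sanity check of the patched function (a hypothetical example, assuming the train_transform defined above and numpy imported as np):

images = np.random.rand(4, 84, 84, 3).astype(np.float32)
out = MLclf._feature_norm(images, transform=train_transform)
print(out.shape)  # (4, 3, 32, 32): the output now follows the transform's size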

Of course, under normal circumstances we may not need Resize at all, but if you are like me and have to use it, this method may be helpful.
