Question about the bicubic mode of torch.nn.functional.interpolate used in build_mlp.py during preprocessing for the internlm-xcomposer2-vl model #338

Closed
lzcchl opened this issue Jun 26, 2024 · 1 comment

lzcchl commented Jun 26, 2024

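I found this by chance: the problem exists even with a fairly recent version of torch (I am on 2.1.2). After resizing, the image pixels are noticeably unsmooth. Could this degrade the ViT model's performance and, in turn, the overall conversation quality?
I found this in the torch issues: https://github.com/pytorch/vision/issues/2950. That issue is about torchvision.transforms.Resize, but it essentially still calls torch.nn.functional.interpolate.
My test results: in the torch.nn.functional.interpolate output below, the small white car in the upper right (and possibly other, less obvious places) has black dots, which is clearly not smooth.
Original image:
[image: dog]
PIL resize to half the width and height:
[image: pil]
torch.nn.functional.interpolate to half the width and height:
[image: th2]
torch.nn.functional.interpolate with antialias=True to half the width and height:
[image: th1]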
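
One possible contributor to artifacts like these can be worked out by hand. Without antialiasing, a bicubic resize to half size reads only four source pixels per output pixel (per axis), and two of the four weights are negative, so sharp edges can push the result outside the valid range. The sketch below is plain Python and one-dimensional; it assumes the cubic coefficient a = -0.75 and the half-pixel mapping of align_corners=False, and the helper names are made up for illustration. It is not taken from build_mlp.py or from the attached script.

```python
def cubic_weight(x: float, a: float = -0.75) -> float:
    """Cubic convolution kernel; a = -0.75 is assumed to match torch's bicubic mode."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0


def downsample_2x_no_antialias(row):
    """Halve a 1-D row using only the 4 nearest source pixels per output pixel."""
    out = []
    for i in range(len(row) // 2):
        center = 2 * i + 0.5          # source coordinate of output pixel i
        first = 2 * i - 1             # leftmost of the 4 taps
        acc = 0.0
        for k in range(first, first + 4):
            src = min(max(k, 0), len(row) - 1)  # clamp the index at the borders
            acc += cubic_weight(center - k) * row[src]
        out.append(acc)
    return out


# The four tap weights for a 2x downscale: two negative outer taps, two positive inner taps.
weights = [cubic_weight(d) for d in (1.5, 0.5, -0.5, -1.5)]

# A two-pixel bright stripe on a dark background, 8-bit intensity scale.
row = [0, 0, 0, 0, 0, 0, 255, 255, 0, 0, 0, 0]
out = downsample_2x_no_antialias(row)

print("weights:", weights)
print("output :", out)
```

The output contains values above 255 next to values below 0, so the result has to be clamped or otherwise handled before converting back to 8-bit. Whether this is the exact cause of the black dots in the images above is not established here.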

My test code is attached below. Change img_dir and it will run, so you can quickly verify the problem I describe.
pil_torch_rsz.py.txt
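
The attached script is not reproduced in this thread. As a rough stand-in, here is a minimal sketch of the same kind of comparison on a synthetic image, so it runs without any image file. It assumes a torch version that accepts the antialias argument of torch.nn.functional.interpolate (the 2.1.2 mentioned above does); the test image and the variable names are illustrative only.

```python
import torch
import torch.nn.functional as F

# Synthetic stand-in for a photo: a white 4x4 square on a black 16x16 background,
# stored as a float NCHW tensor with values in [0, 1].
img = torch.zeros(1, 1, 16, 16)
img[:, :, 6:10, 6:10] = 1.0

# Bicubic resize to half the width and height, without and with antialiasing.
plain = F.interpolate(img, size=(8, 8), mode="bicubic", align_corners=False)
smooth = F.interpolate(
    img, size=(8, 8), mode="bicubic", align_corners=False, antialias=True
)

# Compare the value ranges; anything outside [0, 1] is overshoot from the kernel.
for name, out in (("antialias=False", plain), ("antialias=True", smooth)):
    print(f"{name}: min={out.min().item():.4f} max={out.max().item():.4f}")

# One possible way to keep values in range before converting back to 8-bit.
clamped = plain.clamp(0.0, 1.0)
```

The clamp at the end is shown only as one option for keeping values in range; whether the model's preprocessing needs it is a separate question.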

LightDXY (Collaborator) commented Jul 5, 2024

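Hi, thanks for the comments. In 4KHD, the interpolated image is used only as the global image; the model relies on the later local images, which are not resized, to see fine details, so the effect on the results should be small.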

LightDXY closed this as completed Jul 5, 2024