[PyTorch] deformable_conv2 error when converting torch traced model to relay #8057
Comments
Looks like your model is dynamic? Argument 3 of deformable conv2d is the stride, which is expected to be a static constant expression. However, according to the error message, the stride of deformable conv2d in your model is a call node, which usually means the stride was calculated by another operator on the fly. It would help if you could locate and post the subgraph of the model that has this issue, so we can see how the stride was determined. Also cc @masahi @codeislife99 |
Does this mean things work if you trace and convert to Relay without going through serialization? Note that Torch erases all type information on serialization. This caused problems for quantized models in the past, see pytorch/pytorch#39690. Not sure if this is a related issue. |
Thanks @comaniac, I was able to extract the subgraph of the traced model after serializing and deserializing it.
And these are all the instances where modules call the "deform_conv2d" function:
Hope it helps |
@sacalo Can you send me a repro script and model? |
@masahi I have created a shared folder with the traced model and the scripts to reproduce the issue. Let me know if you need anything else. |
ok there is an API change in torchvision deform conv2d between 1.7 and 1.8, and we do not support 1.8. If you replace tvm/python/tvm/relay/frontend/pytorch.py Lines 2071 to 2073 in 720e7b1
with
the deform conv2d problem should be gone. But I hit a different error from this model. |
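As an illustration of how an API change can surface as "stride is a call node", here is a toy sketch in plain Python. It assumes that the newer operator adds one extra tensor argument (the mask mentioned later in this thread) ahead of the stride arguments. The `Const` and `Call` classes, the two layouts, and `read_strides` are hypothetical stand-ins, not TVM or torchvision code.

```python
from dataclasses import dataclass

@dataclass
class Const:
    """Stand-in for a compile-time constant argument (e.g. an int stride)."""
    value: int

@dataclass
class Call:
    """Stand-in for a value produced by another operator at run time."""
    op: str

# Hypothetical argument layouts; only the one-slot shift matters here.
OLD_LAYOUT = ["input", "weight", "offset", "bias", "stride_h", "stride_w"]
NEW_LAYOUT = ["input", "weight", "offset", "mask", "bias", "stride_h", "stride_w"]

def make_args(layout):
    """Build a toy argument list: strides are constants, tensors are calls."""
    return [Const(1) if name.startswith("stride") else Call(name) for name in layout]

def read_strides(args, stride_index):
    """Read (stride_h, stride_w) by fixed position, as a converter might."""
    strides = args[stride_index:stride_index + 2]
    for s in strides:
        if not isinstance(s, Const):
            raise TypeError(f"stride must be a static constant, got {s!r}")
    return tuple(s.value for s in strides)

old_index = OLD_LAYOUT.index("stride_h")
new_index = NEW_LAYOUT.index("stride_h")

strides_old = read_strides(make_args(OLD_LAYOUT), old_index)
strides_new = read_strides(make_args(NEW_LAYOUT), new_index)

# Reading the new layout with the old index picks up a tensor instead of a stride.
try:
    read_strides(make_args(NEW_LAYOUT), old_index)
    stale_index_failed = False
except TypeError:
    stale_index_failed = True
```

With the stale index, the slot that used to hold `stride_h` now holds a tensor-valued node, which matches the shape of the error discussed above. Shifting the indices by one is enough to read the strides again in this toy, although it does nothing about the extra mask input itself.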
thanks for that tip @masahi, I will start researching and trying to solve the problem from there |
It's ok, there is probably another bug in the frontend. I'll take a look. |
Hi, is there any progress on this? I have the same issue here. |
If you can check: I found that the deformable conv2d op in PyTorch and the deformable conv2d op in the Relay API give different results: https://discuss.tvm.apache.org/t/how-to-fix-the-difference-of-deformable-conv2d-op-between-relay-api-and-pytorch-api/10180. That is probably the reason. |
It is possible that Relay and PT produce different deform conv2d results. I had put up PR #7397, which fixed this for PT 1.6. It's possible they changed something in later versions. My guess is that if something was changed, it is how the outer points are handled during the bilinear interpolation step, because that is the only place where framework implementations differ as well. |
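To make the "outer points" remark concrete, here is a minimal pure-Python sketch of bilinear sampling under two border conventions. Both conventions and the `partial_border` flag are illustrative assumptions; this is not a claim about what PyTorch or TVM actually implement, only a demonstration that the choice changes the result for sample points just outside the feature map.

```python
import math

def bilinear(img, y, x, partial_border=True):
    """Bilinearly sample a 2-D list `img` at the fractional position (y, x).

    partial_border=True : neighbours outside the image count as 0, so a point
                          just outside still gets a partial contribution.
    partial_border=False: any point outside [0, H-1] x [0, W-1] returns 0.
    Both conventions are hypothetical, chosen only for illustration.
    """
    h, w = len(img), len(img[0])
    if not partial_border and not (0 <= y <= h - 1 and 0 <= x <= w - 1):
        return 0.0

    y0, x0 = math.floor(y), math.floor(x)
    ly, lx = y - y0, x - x0  # fractional parts used as interpolation weights

    def at(r, c):
        # Zero padding for neighbours that fall outside the image.
        return img[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    return ((1 - ly) * (1 - lx) * at(y0, x0)
            + (1 - ly) * lx * at(y0, x0 + 1)
            + ly * (1 - lx) * at(y0 + 1, x0)
            + ly * lx * at(y0 + 1, x0 + 1))

img = [[1.0, 2.0],
       [3.0, 4.0]]

inside_a = bilinear(img, 0.5, 0.5, partial_border=True)
inside_b = bilinear(img, 0.5, 0.5, partial_border=False)
border_a = bilinear(img, -0.5, 0.0, partial_border=True)
border_b = bilinear(img, -0.5, 0.0, partial_border=False)
```

Interior points agree under both conventions, while the point half a pixel above the image does not. A deformable convolution whose learned offsets push samples past the border would therefore give different outputs in two frameworks that disagree on this detail.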
Yeah, CI runs the PT deformable conv2d test with PT 1.7. It seems PT 1.8 changed something in their deformable conv2d, and the same test doesn't seem to work anymore. Even after I apply the fix in #8057 (comment) to work around the API change, there is a shape mismatch issue. |
OK, I see! PyTorch 1.8 added modulated deformable conv2d, so we probably need to add it too. I mean that deformable conv2d now has a mask parameter. |
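For readers unfamiliar with the modulated variant, the following is a minimal 1-D sketch of what a mask adds: each sampled value is scaled by a per-tap mask before it is weighted and summed. The helper names and the 1-D simplification are assumptions made for illustration; real implementations work on 2-D feature maps with bilinear sampling.

```python
import math

def sample_1d(x, pos):
    """Linearly interpolate the list `x` at fractional `pos`, zero outside."""
    i0 = math.floor(pos)
    frac = pos - i0

    def at(i):
        return x[i] if 0 <= i < len(x) else 0.0

    return (1 - frac) * at(i0) + frac * at(i0 + 1)

def deform_conv1d_at(x, weight, offsets, p, mask=None):
    """Compute one output value of a toy 1-D deformable convolution at position p.

    Tap k samples x at p + k + offsets[k]. If `mask` is given, the sampled
    value of tap k is multiplied by mask[k] (the "modulated" variant).
    """
    if mask is None:
        mask = [1.0] * len(weight)  # no modulation
    out = 0.0
    for k, w in enumerate(weight):
        out += w * mask[k] * sample_1d(x, p + k + offsets[k])
    return out

x = [1.0, 2.0, 3.0, 4.0]
weight = [1.0, 1.0, 1.0]
offsets = [0.0, 0.5, 0.0]

plain = deform_conv1d_at(x, weight, offsets, p=0)
all_ones = deform_conv1d_at(x, weight, offsets, p=0, mask=[1.0, 1.0, 1.0])
middle_off = deform_conv1d_at(x, weight, offsets, p=0, mask=[1.0, 0.0, 1.0])
```

A mask of all ones reproduces the unmodulated result, which suggests a converter that only supports the unmodulated op could still handle models whose mask is absent or trivially one, but would need real mask support otherwise.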
After converting a PyTorch model to TorchScript using the tracing method, I can successfully execute it and run inference. But when I try to convert the traced model to Relay following this code, it fails with the attached traceback. I can see that it has to do with "deformable_conv2d", but I'm not able to trace the cause any further.
using the scripted_model (works OK):
converting scripted_model to relay (fails with the error posted):
TRACEBACK ERROR:
cc @yelite