[AMP] Vulkan Support for Mixed Precision Pass #8295
Comments
cc @Lunderberg
I can confirm that the TF2 SSD MobileNet V2 model can be converted to fp16 and runs on Vulkan (AMD) and OpenCL (Intel Ice Lake), if I disable vectorization on fp16 at https://github.com/apache/tvm/blob/main/python/tvm/topi/cuda/injective.py#L54-L55 (cc @Lunderberg). But the output from fp16 is a bit off compared to fp32 (on both vk and ocl). Also, on other models I got type mismatch errors.
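For reference, a minimal sketch of the kind of guard this workaround amounts to: skipping vectorization when the output dtype is float16 in an injective-style TE schedule. The `schedule_elemwise` helper, the split factor, and the CPU-style lowering are illustrative, not the actual code at the linked lines.

```python
import tvm
from tvm import te

def schedule_elemwise(out, disable_fp16_vectorize=True):
    # Hypothetical helper mirroring an injective schedule: fuse all axes, then
    # split and vectorize the inner axis unless the output is float16.
    s = te.create_schedule(out.op)
    fused = s[out].fuse(*s[out].op.axis)
    if disable_fp16_vectorize and out.dtype == "float16":
        # Workaround: leave the loop scalar so codegen does not emit fp16 vectors.
        return s
    xo, xi = s[out].split(fused, factor=4)
    s[out].vectorize(xi)
    return s

A = te.placeholder((1024,), dtype="float16", name="A")
B = te.compute(A.shape, lambda i: A[i] + tvm.tir.const(1, "float16"), name="B")
s = schedule_elemwise(B)
print(tvm.lower(s, [A, B], simple_mode=True))
```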
I ran into a few issues with vectorization when I was running ResNet50 with float16. If you apply PR #8528, is it still necessary to disable the vectorization?
Regarding the numerical accuracy, I had a few maybe-similar issues when putting together the unittests in #8529. There's a decent number of schedules that perform poorly if the accumulator dtype is float16. I had a short discussion with @AndrewZhaoLuo last week on how best to implement float32 accumulation in the mixed precision pass, but haven't looked into it much yet.
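For context, a rough sketch of running the mixed precision pass on a tiny Relay module; the conv2d op and shapes are arbitrary. The comment about accumulation reflects how, as I understand it, the pass's per-op conversion attributes specify an accumulation dtype alongside the output dtype, which is where a float32 accumulator would be requested.

```python
import tvm
from tvm import relay

# Tiny float32 conv2d module to demonstrate the conversion; shapes are arbitrary.
x = relay.var("x", shape=(1, 3, 32, 32), dtype="float32")
w = relay.var("w", shape=(8, 3, 3, 3), dtype="float32")
y = relay.nn.conv2d(x, w, padding=(1, 1))
mod = tvm.IRModule.from_expr(relay.Function([x, w], y))

mod = relay.transform.InferType()(mod)
# Rewrite eligible ops to float16; ops whose conversion attribute specifies a
# float32 accumulation dtype keep a float32 accumulator despite fp16 inputs.
mod = relay.transform.ToMixedPrecision("float16")(mod)
print(mod)
```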
With #8528, I get this error:
Vulkan support for fp16 is fully functional, thanks @Lunderberg.
Resolve the issues and make the modifications needed to support Vulkan in the mixed precision pass from #8069.
Current initial issues, as described by @Lunderberg:
This issue is complete when the unit tests pass for the Vulkan target.
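As a rough sketch of the kind of check those unit tests would exercise (assuming a Vulkan-enabled TVM build and an available device), compile a converted module for the `vulkan` target and run it with the graph executor; the module, shapes, and random inputs below are illustrative.

```python
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Build the same tiny conv2d module as above and convert it to mixed precision.
x = relay.var("x", shape=(1, 3, 32, 32), dtype="float32")
w = relay.var("w", shape=(8, 3, 3, 3), dtype="float32")
mod = tvm.IRModule.from_expr(relay.Function([x, w], relay.nn.conv2d(x, w, padding=(1, 1))))
mod = relay.transform.InferType()(mod)
mod = relay.transform.ToMixedPrecision("float16")(mod)

# Compile for Vulkan and run once; requires TVM built with Vulkan and a usable device.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="vulkan")

m = graph_executor.GraphModule(lib["default"](tvm.vulkan(0)))
m.set_input("x", np.random.uniform(size=(1, 3, 32, 32)).astype("float32"))
m.set_input("w", np.random.uniform(size=(8, 3, 3, 3)).astype("float32"))
m.run()
print(m.get_output(0).numpy().shape)
```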