Port PadConstantForwardContiguous #4
This reverts commit 4871cfd.
I knew there are cases where the input and output dtypes differ, so I deleted my comment, but you had already changed the code. Please ignore my comments regarding dtype.
This reverts commit 98bba90.
I have rolled the changes back.
As for everything else, good work. You can mark your ticket as done once you take care of the perf result table.
It also looks like you need to narrow the condition in IsApplicable to guarantee better performance.
This PR ports the PadConstantForwardContiguous OpenCL kernel to MIOpen. Closes MV-379.
Checklist:
-V)
Performance comparison with PyTorch ROCm:
bfloat16
float32
float16
Average over all cases:
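For readers unfamiliar with the operation being ported, here is a minimal CPU reference sketch of constant padding on a contiguous, row-major tensor. It is not the MIOpen kernel from this PR. The function name pad_constant_forward, the per-dimension pad_before/pad_after convention, and the one-iteration-per-output-element structure (standing in for one work-item per element) are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference (CPU) constant padding for a contiguous, row-major tensor.
// Hypothetical helper for illustration only; not the MIOpen kernel.
//   in_dims[d]    : input extent along dimension d
//   pad_before[d] : padded elements placed before the data along d
//   pad_after[d]  : padded elements placed after the data along d
std::vector<float> pad_constant_forward(const std::vector<float>& input,
                                        const std::vector<std::size_t>& in_dims,
                                        const std::vector<std::size_t>& pad_before,
                                        const std::vector<std::size_t>& pad_after,
                                        float value)
{
    const std::size_t rank = in_dims.size();
    assert(pad_before.size() == rank && pad_after.size() == rank);

    // Output extents and total element counts.
    std::vector<std::size_t> out_dims(rank);
    std::size_t in_size = 1, out_size = 1;
    for(std::size_t d = 0; d < rank; ++d)
    {
        out_dims[d] = pad_before[d] + in_dims[d] + pad_after[d];
        in_size *= in_dims[d];
        out_size *= out_dims[d];
    }
    assert(input.size() == in_size);

    std::vector<float> output(out_size);
    // One loop iteration per output element (assumed analogue of one work-item).
    for(std::size_t gid = 0; gid < out_size; ++gid)
    {
        std::size_t rem       = gid;
        std::size_t in_idx    = 0;
        std::size_t in_stride = 1;
        bool inside           = true;
        // Decode the flat output index from the innermost dimension outward.
        for(std::size_t k = rank; k-- > 0;)
        {
            const std::size_t coord = rem % out_dims[k];
            rem /= out_dims[k];
            if(coord < pad_before[k] || coord >= pad_before[k] + in_dims[k])
            {
                inside = false; // this element lies in the padded border
                break;
            }
            in_idx += (coord - pad_before[k]) * in_stride;
            in_stride *= in_dims[k];
        }
        output[gid] = inside ? input[in_idx] : value;
    }
    return output;
}
```

Because both tensors are contiguous, a single flat index per element is enough to recover every coordinate. No stride arrays are needed.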