HIP kernels may launch with non-uniform block size for backward compatibility #2307
@JehandadKhan and @atamazov :
Thanks!
@atamazov @JehandadKhan need your opinion here:
Hence this is not directly effective since the actual compilation command looks like:
and if I move -fno-offload-uniform-block out of the quotes (i.e. from
I think we need a very precise version of HIP from which
Update: effective HIP_FLAT_VERSION 500723302
This would switch off a compiler optimization that was previously on, so we may see performance drops as a side effect. Therefore this new flag should be applied only to the kernels that actually use non-uniform grids. Note that old .kdb files may contain kernels built with the assumption that block sizes are uniform. AFAICS these kernels may work incorrectly when the grid is non-uniform (which is why the compiler team introduced the option). Therefore I think we need to regenerate the .kdb files after all the other fixes are done.
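To illustrate the "apply the flag only where needed" idea, here is a minimal sketch, not MIOpen's actual code. The helper name `AddNonUniformBlockFlag`, the `uses_non_uniform_grid` parameter and the constant name are hypothetical; the version threshold is simply the HIP_FLAT_VERSION value quoted in this thread, assumed here to be the first version where the flag matters.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Assumption: first HIP_FLAT_VERSION where the flag is relevant
// (value quoted in this thread, not verified against the compiler).
constexpr std::uint64_t kAssumedFirstVersion = 500723302ULL;

// Hypothetical helper: appends -fno-offload-uniform-block to the build
// options only for kernels that are launched with a non-uniform grid.
std::string AddNonUniformBlockFlag(std::string options,
                                   std::uint64_t hip_flat_version,
                                   bool uses_non_uniform_grid)
{
    const std::string flag = "-fno-offload-uniform-block";

    // Kernels with uniform grids keep the compiler optimization,
    // and older toolchains are left untouched.
    if(!uses_non_uniform_grid || hip_flat_version < kAssumedFirstVersion)
        return options;

    // Avoid adding the flag twice.
    if(options.find(flag) == std::string::npos)
    {
        if(!options.empty())
            options += ' ';
        options += flag;
    }

    assert(options.find(flag) != std::string::npos);
    return options;
}
```

With this shape, kernels that launch uniformly are compiled exactly as before, so any performance impact would be limited to the kernels that opt in.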
Very well done technically, but there is a danger of perf drop, see #2307 (comment). We can give it a try but some careful testing of performance is needed.
And this is a workaround indeed, because I do not believe we want to switch off some compiler optimizations permanently ;)
Co-authored-by: Artem Tamazov <artem.tamazov@gmail.com>
🌀 Performance testing results
Preconditions:
Tested modes:
Bottom line: No performance or correctness regressions. Detailed logs & csv files are available upon request.
Verdict: 🟢 GO!
Ping @atamazov and @JehandadKhan for review. This feature is deferred but we may need it anyway.
@junliume Reviewed and tested a while ago. Please go ahead.
I like this approach better: fix the actual issue instead of introducing more workarounds. Looking at the invoker code, we can easily determine which kernels rely on it and fix them. Furthermore, we could add a check in the kernel launch code to ensure that the grid size is symmetric.
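A launch-time check along these lines could look roughly like the sketch below. It assumes that "symmetric" means every global dimension is an exact multiple of the corresponding block (local) dimension. The names `Dims`, `IsUniformGrid` and `PadToUniformGrid` are hypothetical and not part of MIOpen.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Dims = std::array<std::size_t, 3>;

// True when every global dimension is an exact multiple of the
// corresponding local (block) dimension, i.e. all blocks are full.
bool IsUniformGrid(const Dims& global, const Dims& local)
{
    for(std::size_t i = 0; i < 3; ++i)
    {
        if(local[i] == 0)
            return false; // a zero block size is never a valid launch
        if(global[i] % local[i] != 0)
            return false; // the last block in this dimension would be partial
    }
    return true;
}

// Rounds each global dimension up to the next multiple of the local one.
// Note: a kernel launched with a padded grid needs its own bounds checks,
// because the extra work-items fall outside the original problem size.
Dims PadToUniformGrid(const Dims& global, const Dims& local)
{
    Dims padded = global;
    for(std::size_t i = 0; i < 3; ++i)
    {
        assert(local[i] != 0);
        padded[i] = (global[i] + local[i] - 1) / local[i] * local[i];
    }
    return padded;
}
```

Running `IsUniformGrid` on the global and local sizes of each solver would flag the kernels that currently depend on non-uniform launches, which are the ones to fix before the workaround can be removed.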
Perhaps we can merge this PR, review the solvers post-merge to ensure that all are symmetric, and then remove the workaround once that is done.
@JehandadKhan :
@junliume and @shurale-nkn I like that idea!
https://ontrack-internal.amd.com/browse/SWDEV-413293
https://reviews.llvm.org/D155213
Starting with ROCm 5.7, the compiler patch above makes the compiler assume uniform block sizes; for backward compatibility we therefore need to add the -fno-offload-uniform-block flag in MIOpen.
Otherwise we will observe the following error:
Error log:
MIOpen Error: /long_pathname_so_that_rpms_can_package_the_debug_info/data/driver/MLOpen/src/hipoc/hipoc_kernel.cpp:104: Failed to launch kernel: invalid argument