The above PRs have a potential side effect. These unconditionally switch OFF some optimizations in the compiler. As a result, we may not get the expected performance gains (or, worse, get a drop).
value_unknown: We don't know how much additional performance we can regain. AFAICS, almost nothing with ROCm 6.0. However, as the compiler develops, this may change in the future.
How to resolve the issue: The flags that allow non-uniform grids should be applied only to the kernels that actually use non-uniform grids. This work requires manual investigation of the solver's GetSolution() code.
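To make the investigation concrete, here is a minimal sketch of the check one might apply to a kernel's launch dimensions. `IsUniformGrid` is a hypothetical helper, not an existing MIOpen function, and it assumes the OpenCL convention where the global size counts work-items rather than blocks.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper (not part of MIOpen): returns true when every dimension
// of the global work size is a whole multiple of the local (block) size, i.e.
// the launch never produces a partial block.
// Assumption: sizes follow the OpenCL convention (global = total work-items).
inline bool IsUniformGrid(const std::array<std::size_t, 3>& global,
                          const std::array<std::size_t, 3>& local)
{
    for(std::size_t i = 0; i < 3; ++i)
    {
        // A zero local size is treated as invalid and therefore not uniform.
        if(local[i] == 0 || global[i] % local[i] != 0)
            return false;
    }
    return true;
}
```

A kernel for which this holds across all problem sizes the solver can produce is a candidate for the uniform-grid flag. A single call with one configuration is not enough to prove it, so the solver code still has to be reviewed by hand.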
⚠️ Old .kdb files may contain kernels built with the assumption that block sizes are non-uniform. Therefore, in order to get the expected performance gain, the binary kernel cache needs to be regenerated (after all other fixes are done).
Implementation tips
Look for WORKAROUND_SWDEV_413293 in the code
Unconditional enforcement of -fno-offload-uniform-block introduced in #2307 should be removed. The code that invokes the HIP compiler should parse the options and, if -foffload-uniform-block is found, just keep it. Otherwise, it should add -fno-offload-uniform-block.
HIP: Pass -foffload-uniform-block from the solvers that use only uniform grids.
OpenCL: Pass -cl-uniform-work-group-size from the solvers that use only uniform grids.
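The option handling suggested for the HIP path could look roughly like the sketch below. `ApplyUniformBlockPolicy` is a hypothetical name, and the whitespace-based tokenization is an assumption about how the option string is formed. Leaving an explicit -fno-offload-uniform-block untouched instead of appending a second copy is also an assumption, not something the tips above require.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not an existing MIOpen function): decides which
// uniform-block flag the HIP compiler invocation ends up with.
// - If the solver passed -foffload-uniform-block, the options are kept as-is.
// - Otherwise -fno-offload-uniform-block is appended as the safe default.
inline std::string ApplyUniformBlockPolicy(const std::string& options)
{
    const std::string uniform     = "-foffload-uniform-block";
    const std::string non_uniform = "-fno-offload-uniform-block";

    // Assumption: options are separated by whitespace and contain no quoting.
    std::istringstream iss(options);
    std::string token;
    while(iss >> token)
    {
        // Assumption: an explicit choice in either direction is respected,
        // so the flag is never duplicated.
        if(token == uniform || token == non_uniform)
            return options;
    }
    return options.empty() ? non_uniform : options + " " + non_uniform;
}
```

With this in place, a solver that launches only uniform grids opts in by adding -foffload-uniform-block to its own options, and every other solver keeps the conservative behavior without changes.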
@junliume [process] Setting urgency_unknown is not useless. For example, it is possible to quickly find the tickets that require clarification of urgency/importance, and revisit them, which I would recommend doing periodically.
Leftover of
[Attribution] @junliume @JehandadKhan