[MI200][NCHW][SWDEV-312112] The find-db NCHW records for ConvAsmImplicitGemmGTCDynamic*XdlopsNHWC must be updated #1349
Comments
#1324 increases the WS by up to 3 bytes, but the system find-db remains the same. The library may run into fatal errors from time to time. The problem may manifest in develop, in Staging tests, and in 5.0 Release Staging tests. The latter is potentially the most problematic.

How we can fix this

We do not need to actually run tuning sessions to update the find-db files. Adding 3 to the "required workspace size" field for ConvAsmImplicitGemmGTCDynamic*XdlopsNHWC solvers in NCHW records should be enough. However, it is better to accommodate #1327 in advance and increase the required WS size not by 3, but by (

Alternatives:
Let's decide which way we prefer (assuming that the goal is to have the fix ASAP).
Sure. But the immediate problem is that we have this issue in 5.0 staging.
AFAIK, we tune after the branch is cut, and will need to regenerate find-db then anyway?
Of course, but we do not want to receive bug reports before that, right?
...And currently we have this problem in the Staging branch. After promotion it will be replicated to the Master branch and will remain there (i.e. in all Mainline builds) until 5.0 tuning is done, backported to develop, promoted to Staging, tested, and finally promoted to Mainline.
@atamazov @JehandadKhan: which way is "faster", i.e. can be completed by next Tuesday? I will vote for "Patch existing find-db files using some script (@DrizztDoUrden)", since Chris is off and patching looks like a more "direct" way to ensure the problems are fixed. What do you think?
Either variant works for me.
I vote for @DrizztDoUrden to update it, since that sounds the safest to me as well.
@DrizztDoUrden Let's update the MI100/MI200 fdb as per #1349 (comment). I am available for a call at any time.
@shaojiewang FYI. Please share your opinion about this. Thanks.
@atamazov Thanks for the findings and updates. I think that patching the existing find-db is the better way. In the future, for the find-db part, would it be possible to remove the Workspace_size from the db and just let it be computed on the fly? Or else, is it possible to have some reserved space for WS?
@JehandadKhan FYI There are no records with
I mean, is it expected in
Not according to my data
@JehandadKhan Still unclear... ;) So there is nothing to fix in the MI100 System find-db files, right?
@atamazov That is correct, there are no records for gfx908 for the above-mentioned solvers
Perhaps I could have worded my earlier response better!
…mmGTCDynamic*XdlopsNHWC in find-db (leftover of 1324) (#1354)
Resolves #1349
Added 3 * 255 bytes to the workspace size of the following solvers in fdb where it was non-zero:
ConvAsmImplicitGemmGTCDynamicBwdXdlopsNHWC
ConvAsmImplicitGemmGTCDynamicFwdXdlopsNHWC
ConvAsmImplicitGemmGTCDynamicWrwXdlopsNHWC
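For reference, the kind of patching described in this commit could be sketched roughly as follows. This is only an illustration, not the script that was actually used. The record layout (`key=solver:time,workspace,...;solver:...`), the function name `patch_record`, and the sample key are assumptions made for the example, so the real find-db file format should be checked before adapting it.

```python
import fnmatch

# Solver name pattern and increment taken from the discussion above.
SOLVER_PATTERN = "ConvAsmImplicitGemmGTCDynamic*XdlopsNHWC"
WS_INCREMENT = 3 * 255  # bytes


def patch_record(line, pattern=SOLVER_PATTERN, increment=WS_INCREMENT, layout="NCHW"):
    """Return the record with non-zero workspace sizes of matching solvers increased.

    ASSUMED (hypothetical) record layout, not confirmed against real find-db files:
        <key>=<solver>:<time>,<workspace>,...;<solver>:<time>,<workspace>,...
    where <key> is a dash-separated problem description containing the layout.
    """
    text = line.rstrip("\n")
    key, sep, payload = text.partition("=")
    # Leave malformed lines and records for other layouts untouched.
    if not sep or layout not in key.split("-"):
        return text

    patched = []
    for entry in payload.split(";"):
        solver, colon, values = entry.partition(":")
        fields = values.split(",")
        if colon and len(fields) >= 2 and fnmatch.fnmatchcase(solver, pattern):
            workspace = int(fields[1])
            if workspace != 0:  # zero workspace stays zero
                fields[1] = str(workspace + increment)
        patched.append(solver + colon + ",".join(fields))
    return key + "=" + ";".join(patched)


if __name__ == "__main__":
    sample = ("k-NCHW-FP16-F="
              "ConvAsmImplicitGemmGTCDynamicFwdXdlopsNHWC:0.5,1024,x;"
              "ConvOclDirectFwd:0.7,0,y")
    print(patch_record(sample))
```

Records for other layouts, other solvers, and entries whose workspace size is zero pass through unchanged, which mirrors the "where it was non-zero" condition in the commit message.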
The find-db NCHW records for
must be refreshed.
Originally posted by @atamazov in #1324 (comment)