GPU support #4
Comments
Hi @adayton1, we do not currently have plans to add GPU support to Verdict, though there is a chance we could soon use Verdict on the GPU ourselves. Do you have specific changes you want to see?
I use the hex and quad functions, but it's trivial enough to port the whole library. I put up pull request #5.
Hi. Do you know if the Verdict pull request with GPU support, #5, will get merged in?
Yes, it will be merged; it was merged internally last week. We'll be pushing a new update of the Verdict source to this GitHub repo soon.
Thanks!
I've been testing Alan's implementation with ROCm on MI300A (rzadams) and found that the stack memory was getting exhausted for the hex_distortion (and quad_distortion) calls unless I specified 128K(!) of GPU stack memory. This was due to the arrays that are sized maxTotalNumberGaussPoints*maxNumberNodes and maxNumberNodes*maxNumberNodes. For our use, we only have linear elements, so setting maxTotalNumberGaussPoints and maxNumberNodes to 8 fixed the issue for us, but obviously that is not a general solution. I'm not sure what the preferred "correct" fix is: a compile-time switch, multiple implementations, or a templated API.
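For illustration, the "templated API" option mentioned above could look roughly like the sketch below. It is not Verdict code: `EXAMPLE_HOST_DEVICE`, `example_scratch_bytes`, and `example_fill_and_sum` are hypothetical names, and the scratch layout is only assumed to match the two array shapes described in the comment. The point is that the per-thread stack footprint becomes a compile-time property of the instantiation, so a code with only linear elements can instantiate with 8 and 8 while other callers pick larger bounds.

```cpp
#include <cstddef>

// Hypothetical macro: expands to the GPU specifiers only under nvcc/hipcc,
// so the same source also builds with a plain host compiler.
#if defined(__CUDACC__) || defined(__HIPCC__)
#define EXAMPLE_HOST_DEVICE __host__ __device__
#else
#define EXAMPLE_HOST_DEVICE
#endif

// Bytes of scratch needed for one array of MaxGaussPoints x MaxNodes doubles
// plus one array of MaxNodes x MaxNodes doubles (assumed layout).
template <int MaxGaussPoints, int MaxNodes>
constexpr std::size_t example_scratch_bytes()
{
  return sizeof(double) * (MaxGaussPoints * MaxNodes + MaxNodes * MaxNodes);
}

// Hypothetical kernel-side routine: the local array is sized by the template
// parameters, so its stack usage is fixed per instantiation.
template <int MaxGaussPoints, int MaxNodes>
EXAMPLE_HOST_DEVICE double example_fill_and_sum(int num_gauss_points, int num_nodes)
{
  double shape[MaxGaussPoints][MaxNodes];
  double sum = 0.0;
  for (int g = 0; g < num_gauss_points; ++g)
  {
    for (int n = 0; n < num_nodes; ++n)
    {
      shape[g][n] = 1.0 / num_nodes; // placeholder values, not real shape functions
      sum += shape[g][n];
    }
  }
  return sum;
}
```

With this shape, `example_fill_and_sum<8, 8>(8, 8)` uses a 512-byte local array, and `example_scratch_bytes<8, 8>()` reports the combined size of both assumed arrays. The trade-off against a compile-time switch is that every distinct size pair produces its own instantiation.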
Are there any plans for adding GPU support to Verdict? My GPU-accelerated code is hitting a big slowdown when I have to switch to host-only execution to call some Verdict functions. It looks to me as if porting Verdict would mostly involve adding __host__ __device__ specifiers to functions and switching from std:: math functions to the corresponding C versions.
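A minimal sketch of the porting pattern described above, assuming a hypothetical macro `EXAMPLE_HOST_DEVICE` and hypothetical helpers `example_length` and `example_quad_edge_ratio` (none of these are Verdict's actual API). The macro expands to the specifiers only when a CUDA or HIP compiler is in use, and the math calls come from the C header rather than the std:: namespace.

```cpp
#include <math.h> // C math functions, per the suggestion above

// Hypothetical macro: GPU specifiers under nvcc/hipcc, nothing otherwise.
#if defined(__CUDACC__) || defined(__HIPCC__)
#define EXAMPLE_HOST_DEVICE __host__ __device__
#else
#define EXAMPLE_HOST_DEVICE
#endif

// Hypothetical helper: Euclidean length of a 3-vector.
EXAMPLE_HOST_DEVICE inline double example_length(const double v[3])
{
  return sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
}

// Hypothetical metric: longest edge divided by shortest edge of a quad
// whose four corners are given in order around the element.
EXAMPLE_HOST_DEVICE inline double example_quad_edge_ratio(const double coords[4][3])
{
  double longest = 0.0;
  double shortest = 0.0;
  for (int i = 0; i < 4; ++i)
  {
    const int j = (i + 1) % 4; // next corner, wrapping around
    const double edge[3] = { coords[j][0] - coords[i][0],
                             coords[j][1] - coords[i][1],
                             coords[j][2] - coords[i][2] };
    const double len = example_length(edge);
    longest = (i == 0) ? len : fmax(longest, len);
    shortest = (i == 0) ? len : fmin(shortest, len);
  }
  // Degenerate quads (a zero-length edge) are not handled in this sketch.
  return longest / shortest;
}
```

Because the macro is empty for a host compiler, the same functions can be unit-tested on the CPU and called from a kernel when built with a GPU compiler.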