Kernel reuse #93
Comments
They are reused if your computation results in the same assembly code. You can see this by increasing the debug level a bit and checking whether there are "cache hit" messages. One thing to avoid is literal constants that change from iteration to iteration, since each new value leads to different PTX code being generated.
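To illustrate the point about literal constants, here is a toy sketch in Python. It is not Enoki's actual cache: `KernelCache`, `noise_literal`, and `noise_param` are hypothetical names, and Python's built-in `compile()` stands in for PTX generation. It only assumes that the cache is keyed on the generated source text, which is what "same assembly code" implies.

```python
import hashlib


class KernelCache:
    """Toy stand-in for a JIT kernel cache keyed on generated source text."""

    def __init__(self):
        self.kernels = {}
        self.hits = 0
        self.misses = 0

    def get(self, source):
        # Identical source text gives an identical key, so the kernel is reused.
        key = hashlib.sha256(source.encode()).hexdigest()
        if key in self.kernels:
            self.hits += 1
        else:
            self.misses += 1
            # The expensive step: a real JIT would compile PTX here.
            self.kernels[key] = compile(source, "<kernel>", "eval")
        return self.kernels[key]


def noise_literal(cache, xs, offset):
    # The offset is baked into the source as a literal constant,
    # so every new offset produces different code and a cache miss.
    source = f"[x + {offset!r} for x in xs]"
    return eval(cache.get(source), {"xs": xs})


def noise_param(cache, xs, offset):
    # The offset is passed as data, so the source never changes.
    source = "[x + offset for x in xs]"
    return eval(cache.get(source), {"xs": xs, "offset": offset})


literal_cache = KernelCache()
param_cache = KernelCache()
for frame in range(3):
    noise_literal(literal_cache, [1.0, 2.0], float(frame))
    noise_param(param_cache, [1.0, 2.0], float(frame))

print("literal:", literal_cache.misses, "misses,", literal_cache.hits, "hits")
print("param:  ", param_cache.misses, "misses,", param_cache.hits, "hits")
```

With three different offsets, the literal version compiles three times, while the parameter version compiles once and hits the cache twice. The analogous fix in real code would be to keep changing values in arrays or variables that are uploaded as data rather than written into the expression as constants.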
If you give a small example of your problematic code, it will be easier to give feedback btw.
I was wrong about not caching: cuda_eval() does take less time when called repeatedly for the same code+data.
and the top-level func:
Which makes me wonder... is there a way to change some parameters of that code without triggering what I assume is a recompile? I use it to generate Simple Noise fields, and it would be nice if I could generate the field with different offsets without triggering a costly operation. Also note: the manual says that 'cuda_eval' may return early (async), but my profiling says the bulk of the cycles are spent in there.
I see in jit.cu that the call to https://github.com/mitsuba-renderer/enoki/blob/master/src/cuda/jit.cu#L1372
It's unclear to me whether CUDA kernels can ever be reused.
It seems the CUDA code is compiled every time, even if I call the same code (with different data) every display frame.