It might be useful to have a Stream object that is easy to create and forces the null stream to wait for it to finish before continuing. This could be added to cudarc.
The latest cudarc added a CudaStream object that we can use for this:
let stream = self.dev.auto_joining_stream()?;
self.blas.set_stream(Some(&stream))?;
// call kernel
self.blas.set_stream(None)?;
self.dev.join_async(stream)?; // or you can just `drop(stream)`;
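To illustrate the "fork work, then join before continuing" pattern without a GPU, here is a minimal CPU-only sketch. It is an analogy, not cudarc's API: `JoiningStream`, `launch`, and `parallel_dot_products` are hypothetical names, worker threads stand in for side streams, and a dot product stands in for an sgemm call. The point it shows is the join-on-drop behavior mentioned above: dropping the handle blocks until the work has finished.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical CPU stand-in for a side stream (NOT a cudarc type).
/// Work runs on a worker thread; dropping the handle waits for it.
struct JoiningStream {
    worker: Option<JoinHandle<()>>,
}

impl JoiningStream {
    /// Start `work` on its own thread, like launching a kernel on a side stream.
    fn launch<F: FnOnce() + Send + 'static>(work: F) -> Self {
        JoiningStream {
            worker: Some(thread::spawn(work)),
        }
    }
}

impl Drop for JoiningStream {
    fn drop(&mut self) {
        // Block until the worker is done, mirroring a stream that is
        // joined back into the default stream when it is dropped.
        if let Some(handle) = self.worker.take() {
            let _ = handle.join();
        }
    }
}

/// Compute one dot product per "stream" and return the results only
/// after every stream has been joined.
fn parallel_dot_products(pairs: Vec<(Vec<f32>, Vec<f32>)>) -> Vec<f32> {
    let results = Arc::new(Mutex::new(vec![0.0f32; pairs.len()]));

    let streams: Vec<JoiningStream> = pairs
        .into_iter()
        .enumerate()
        .map(|(i, (a, b))| {
            let results = Arc::clone(&results);
            JoiningStream::launch(move || {
                let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
                results.lock().unwrap()[i] = dot;
            })
        })
        .collect();

    // Dropping the handles joins every worker before we read the results.
    drop(streams);

    let out = results.lock().unwrap().clone();
    out
}
```

Calling `parallel_dot_products` with a list of vector pairs returns one result per pair, in input order. On a real device the equivalent step would be pointing the cuBLAS handle at a side stream, issuing the sgemm, and joining that stream before the default stream continues.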
Use the `cudarc::driver::result::stream` API and `cublas::result::set_stream` to parallelize sgemm operations for conv2d and 4d batched matmul.