Support CUDA stream with stream memory pool #306
Added an implementation to support streams in all CUDA libraries: cuDNN, cuBLAS, cuRAND, cuSPARSE, cuSOLVER, and Thrust.
Because it is difficult to test whether CUDA is really using the stream, I added examples and verified them by running them with
Wrote about the incompatible changes in #306 (comment)
I am now thinking of adding an interface to
Jenkins fails
I am struggling with this failure and have no clue how to resolve it.
Hmm, it seems |
It seems |
I've debugged sphinx. sphinx has
So, another way to avoid this (one way is to use Python >= 3.4) is not to treat warnings as errors, with a change to the Makefile
This is a separate problem, but I noticed that the number of doctests increased as:
sphinx-build -W
sphinx-build
EDIT: hmm, looks
jenkins, test this please
LGTM!
Can you provide a simple example of how to use the NVIDIA Visual Profiler?
Fix #225
Support CUDA stream.
A stream can be specified with a `with` statement or the `use()` method.
Running `nvprof --print-gpu-trace python ~/test_cupy.py` shows that cupy kernels are executed in another stream.

To support CUDA stream, this PR changes the memory pool, too.
A memory pool is created for each stream separately so that parallel computations among streams do not touch memory used in another stream.
Note that cudaMalloc is always issued on the default stream.
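
To illustrate the per-stream pool idea described above, here is a minimal pure-Python sketch. It is a toy model, not the code in this PR: `ToyStreamPool`, `stream_id`, and `device_allocs` are hypothetical names, blocks are plain integers, and the `cudaMalloc` call is only simulated by a counter.

```python
from collections import defaultdict


class ToyStreamPool:
    """Toy model of a memory pool that keeps a separate free list per stream."""

    def __init__(self):
        # _free[stream_id][size] -> list of reusable block ids
        self._free = defaultdict(lambda: defaultdict(list))
        self._next_block = 0
        self.device_allocs = 0  # counts simulated cudaMalloc calls

    def malloc(self, size, stream_id):
        """Return a block for `stream_id`, reusing only that stream's freed blocks."""
        bucket = self._free[stream_id][size]
        if bucket:
            return bucket.pop()
        # Stand-in for cudaMalloc (hypothetical; no GPU is touched here).
        self.device_allocs += 1
        block = self._next_block
        self._next_block += 1
        return block

    def free(self, block, size, stream_id):
        """Return a block to the free list of the stream that used it."""
        self._free[stream_id][size].append(block)


pool = ToyStreamPool()
a = pool.malloc(1024, stream_id=1)
pool.free(a, 1024, stream_id=1)
b = pool.malloc(1024, stream_id=2)  # stream 2 does not reuse stream 1's block
c = pool.malloc(1024, stream_id=1)  # stream 1 gets its own freed block back
```

Keying the free list by stream means a block freed while work is still queued on one stream is never handed to a different stream, which is the property the description above asks for.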
Incompatible changes:
- `cupy.cuda.generator.RandomState`: the `setStream()` method was removed.
- `cupy.cuda.stream.Stream(null=True)` is prohibited, to ensure that the `cupy.cuda.stream.Stream.null` object is always used to specify the default stream.
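
The interface described in this PR (a stream selectable with `with` or `use()`, and a single `null` object for the default stream) can be modeled in plain Python as follows. This is only a sketch under assumptions: `ToyStream` and `current_stream` are hypothetical names, the choice of `ValueError` is an assumption, and nothing here calls CUDA or reflects how CuPy implements it.

```python
import threading

_state = threading.local()


class ToyStream:
    """Toy stand-in for a stream object; not CuPy's implementation."""

    null = None  # replaced below with the single default-stream object

    def __init__(self, name, null=False):
        if null:
            # Models the rule that the default stream cannot be constructed
            # directly; the exception type is an assumption.
            raise ValueError("use ToyStream.null for the default stream")
        self.name = name
        self._prev = []

    def use(self):
        """Make this stream current until another stream is used."""
        _state.current = self
        return self

    def __enter__(self):
        # Remember the previous current stream so __exit__ can restore it.
        self._prev.append(current_stream())
        _state.current = self
        return self

    def __exit__(self, *exc):
        _state.current = self._prev.pop()


def current_stream():
    """Return the current stream of this thread, defaulting to ToyStream.null."""
    return getattr(_state, "current", ToyStream.null)


ToyStream.null = ToyStream("null")

s = ToyStream("s")
with s:
    inside = current_stream()   # s is current inside the block
after_with = current_stream()   # previous stream restored on exit
s.use()
after_use = current_stream()    # s stays current after use()
ToyStream.null.use()            # switch back to the default stream
```

In this model the `with` form restores the previous stream on exit, while `use()` leaves the stream current until another one is selected. Because the default stream is only reachable through `ToyStream.null`, identity checks such as `stream is ToyStream.null` stay reliable.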