
GPU: Asynchronous data transfers and kernels - Heterogeneous computation #132

cguzman95 opened this issue Nov 5, 2019 · 5 comments

I will work on this in the branch for #129. This issue is to document all ideas for asynchronous GPU execution, allowing simultaneous GPU and CPU computation.

@cguzman95 cguzman95 added this to the GPU chemistry solver milestone Nov 5, 2019
@cguzman95 cguzman95 self-assigned this Nov 5, 2019

cguzman95 commented Nov 5, 2019

Adding a model_data_id variable to the ModelData struct. This variable will identify each ModelData object created, at least when the GPU flag is ON.
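
A minimal sketch of that change (the real ModelData struct in camp_common.h has many more members; only the proposed field is shown here):

typedef struct {
  /* ... existing ModelData members ... */
  int model_data_id;  /* unique id for each ModelData object (GPU builds) */
} ModelData;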

Reason: for an asynchronous memcpy between the GPU and CPU, it's necessary to define a stream. Example:

cudaStream_t stream[nStreams];
for (int i = 0; i < nStreams; ++i) {
  checkCuda( cudaStreamCreate(&stream[i]) );
  /* asynchronous copy issued on this stream */
  cudaMemcpyAsync(d_a, a, N, cudaMemcpyHostToDevice, stream[i]);
  cudaStreamDestroy(stream[i]);
}
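
For reference, a self-contained sketch of the full pattern this is building toward (chunked host-to-device copies overlapping with kernel launches across streams). This is not CAMP code: the kernel, array names, and sizes are placeholders, and the host buffer must be pinned (cudaMallocHost) for the copies to actually be asynchronous:

#include <cuda_runtime.h>

__global__ void dummy_kernel(double *x, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) x[i] *= 2.0;  /* placeholder work */
}

int main() {
  const int nStreams = 4, n = 1 << 20, chunk = n / nStreams;
  double *a, *d_a;
  cudaMallocHost((void **)&a, n * sizeof(double));  /* pinned host memory */
  cudaMalloc((void **)&d_a, n * sizeof(double));
  for (int i = 0; i < n; ++i) a[i] = 1.0;

  cudaStream_t stream[nStreams];
  for (int i = 0; i < nStreams; ++i) cudaStreamCreate(&stream[i]);

  for (int i = 0; i < nStreams; ++i) {
    int offset = i * chunk;
    /* copy in, compute, and copy out each chunk on its own stream,
       so transfers and kernels from different chunks can overlap */
    cudaMemcpyAsync(&d_a[offset], &a[offset], chunk * sizeof(double),
                    cudaMemcpyHostToDevice, stream[i]);
    dummy_kernel<<<(chunk + 255) / 256, 256, 0, stream[i]>>>(&d_a[offset], chunk);
    cudaMemcpyAsync(&a[offset], &d_a[offset], chunk * sizeof(double),
                    cudaMemcpyDeviceToHost, stream[i]);
  }

  /* CPU work could run here, concurrently with the GPU, before syncing */
  cudaDeviceSynchronize();

  for (int i = 0; i < nStreams; ++i) cudaStreamDestroy(stream[i]);
  cudaFree(d_a);
  cudaFreeHost(a);
  return 0;
}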

So, the stream needs to be created in new_solver and destroyed alongside the deallocation of the other structures. This means it needs to be declared at the start of the camp_gpu_solver.cu file (or declared inside ModelData, but for the moment it is global, since declaring a cudaStream_t in a .h file gives me an error).
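
In code, something like this (the function names are hypothetical stand-ins for the actual entry points in camp_gpu_solver.cu; this is the temporary global-variable version):

#include <cuda_runtime.h>

/* temporary "patch": one global stream for the whole file */
static cudaStream_t stream_gpu;

/* hypothetical hook called from new_solver */
void solver_new_gpu(void) {
  cudaStreamCreate(&stream_gpu);
}

/* hypothetical hook called when the solver structures are deallocated */
void solver_free_gpu(void) {
  cudaStreamDestroy(stream_gpu);
}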

But what happens when we have multiple ModelData objects in an execution (for example, the new unit tests, with a multi-cell and a one-cell solver)? Both will try to create streams with the same id, which means they will share the same stream.

This could seem like no problem, since in principle they won't call solve at the same time. But it's possible that in the future we (or the user) will want to divide multiple solvers among individual CPU threads (with MPI, for example). If all threads use the same GPU, the GPU execution will slow down, since all threads will try to use the same stream.

Not only that: if for some reason the user destroys a solver object, it will also destroy the global streams, so if another solver is still executing, it will crash.

In conclusion: a specific id for each ModelData object is necessary in order to assign a different stream to each one.
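
A rough sketch of that idea, with a fixed pool of streams indexed by the proposed model_data_id (the function names and the pool size are just illustrative):

#include <stddef.h>
#include <cuda_runtime.h>

#define MAX_MODEL_DATA 16  /* arbitrary cap for this sketch */
static cudaStream_t stream_gpu[MAX_MODEL_DATA];

/* each ModelData object gets its own stream, selected by its id */
void solver_new_gpu(int model_data_id) {
  cudaStreamCreate(&stream_gpu[model_data_id]);
}

void async_copy_to_gpu(void *d_dst, const void *src, size_t bytes, int model_data_id) {
  cudaMemcpyAsync(d_dst, src, bytes, cudaMemcpyHostToDevice,
                  stream_gpu[model_data_id]);
}

void solver_free_gpu(int model_data_id) {
  cudaStreamDestroy(stream_gpu[model_data_id]);
}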

mattldawson commented:

Hi @cguzman95 - I agree it's important to allow multiple instances of the CAMP core to run simultaneously. I've been talking to people at NCAR who are interested in possibly using CAMP once it's ready, and they will require multiple cores to run at the same time on different threads using OpenMP. So we have to make sure there are no global variables (I don't think there are currently) and no Fortran module variables (I think there are only constants right now). I think your original idea of including the cudaStream_t in ModelData is the best design; trying to internally manage externally generated instances of the CAMP core (and thus ModelData) using ids is going to get too complicated. What is the error you're getting when you try to include the cudaStream_t in ModelData?
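
A sketch of that alternative, with the stream stored directly in ModelData so each instance owns its own stream and no global state is needed (field and function names are illustrative, not the actual CAMP API):

#include <stddef.h>
#include <cuda_runtime.h>

typedef struct {
  /* ... existing ModelData members ... */
  cudaStream_t stream_gpu;  /* each CAMP core instance owns its own stream */
} ModelData;

void model_data_init_gpu(ModelData *md) { cudaStreamCreate(&md->stream_gpu); }
void model_data_free_gpu(ModelData *md) { cudaStreamDestroy(md->stream_gpu); }

/* async operations use the instance's own stream, so independent solvers
   (e.g. on different OpenMP threads) never share or destroy each other's streams */
void model_data_copy_async(ModelData *md, void *d_dst, const void *src, size_t bytes) {
  cudaMemcpyAsync(d_dst, src, bytes, cudaMemcpyHostToDevice, md->stream_gpu);
}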


cguzman95 commented Nov 5, 2019

Hi,

Yep, I agree with setting cudaStream_t in ModelData (I'm only using it as a global variable as a temporary "patch" to continue developing). Speaking of the error, just adding these two lines to any .h file:

#include <cuda.h>
cudaStream_t *stream_gpu;

raises the error:

/gpfs/scratch/bsc32/bsc32815/gpupartmc/partmc/src/camp_common.h:198:3: error: unknown type name ‘cudaStream_t’
  cudaStream_t *stream_gpu;

Not sure of the reason; maybe we are missing some configuration in the CMake?


mattldawson commented Nov 5, 2019

Do you also need #include <cuda_runtime.h>?

cguzman95 commented:

Yep, it's compiling fine now.
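
For reference, the header snippet from above now reads (assuming nothing else needed to change):

#include <cuda.h>
#include <cuda_runtime.h>  /* provides the cudaStream_t typedef to host-side code */
cudaStream_t *stream_gpu;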
