ssids option setting #11

Open
Poofee opened this issue Apr 6, 2017 · 2 comments

Poofee commented Apr 6, 2017

Can anyone tell me how to set proper values for the ssids options so that I can speed up my program using the GPU? I tried changing the parameters, but I didn't see much improvement.
I have been reading the reference documentation at http://www.numerical.rl.ac.uk/spral/doc/latest/C/, and I also tried reading the source code to find an explanation.
However, there seems to be little guidance on how to set the values of min_gpu_work and gpu_perf_coeff.
So, how do you use ssids? Do you see a clear improvement in computation time when setting different values of min_gpu_work and gpu_perf_coeff?

Collaborator

flipflapflop commented Apr 11, 2017

There are two options that control GPU usage: gpu_perf_coeff and min_gpu_work. gpu_perf_coeff is an architecture-dependent value that describes how much faster the GPU is than the CPU (including all the cores!). Obviously this quantity depends on the kernels as well as on the size of the workload being executed, but it is not necessary to have a precise value. By default this value is set to 1.0, but if the GPU is twice as fast as the CPU (on average) then it should be set to 2.0.
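As a rough illustration of the ratio described above, here is a minimal sketch in C. The helper names `estimate_gpu_perf_coeff` and `balanced_gpu_share` are hypothetical and not part of SPRAL, and the proportional split is only a toy model of what "partitioning by relative speed" could mean, not the algorithm SSIDS actually uses.

```c
#include <assert.h>

/* Hypothetical helper (not a SPRAL function): ratio of the time taken by
 * the CPU (all cores) to the time taken by the GPU on comparable work.
 * A GPU that is twice as fast yields 2.0. */
double estimate_gpu_perf_coeff(double cpu_seconds, double gpu_seconds)
{
    assert(cpu_seconds > 0.0 && gpu_seconds > 0.0);
    return cpu_seconds / gpu_seconds;
}

/* Toy model only (an assumption, not SSIDS's partitioning code): the share
 * of the total work the GPU would get if work were split in proportion to
 * speed, taking the CPU speed as 1 and the GPU speed as gpu_perf_coeff. */
double balanced_gpu_share(double gpu_perf_coeff)
{
    assert(gpu_perf_coeff > 0.0);
    return gpu_perf_coeff / (1.0 + gpu_perf_coeff);
}
```

For example, timings of 10 s on the CPU and 5 s on the GPU give a coefficient of 2.0; since a precise value is not required, a rough measurement of this kind should be enough.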

Using this gpu_perf_coeff, the workload is partitioned between the CPUs and the GPU during the analyse phase. However, a partition that is attributed to the GPU is run on the GPU only if the number of flops associated with that partition exceeds the value given by min_gpu_work. By default it is set to 5 GFlop. This means that if the workload is very small, which seems to be your case, none of the workload is ever going to be put on the GPU.
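The threshold rule above can be sketched as follows. This is a simplified model of the rule as described in this comment, not code taken from SSIDS; the function `partition_runs_on_gpu` and the macro `DEFAULT_MIN_GPU_WORK` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 5 GFlop, the default threshold mentioned above (hypothetical macro name). */
#define DEFAULT_MIN_GPU_WORK INT64_C(5000000000)

/* Hypothetical model of the rule: a partition assigned to the GPU is
 * actually run there only if its flop count exceeds min_gpu_work. */
bool partition_runs_on_gpu(int64_t partition_flops, int64_t min_gpu_work)
{
    assert(partition_flops >= 0 && min_gpu_work >= 0);
    return partition_flops > min_gpu_work;
}
```

Under this model, a partition of about 1 GFlop stays on the CPU with the default threshold, and it would only be eligible for the GPU if min_gpu_work were lowered below that figure.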

Author

Poofee commented Apr 12, 2017

How about the solve phase? And how can I estimate the workload in my problem AX=b? A is an n*n sparse matrix.
