GPU selection for jobs that use more than 1 gpu #4

yassersouri · 2019-07-25T14:20:28Z

For jobs that need more than one GPU, it would be beneficial if the GPUs were selected according to the topology of the GPU P2P connections.

This information can be obtained, for example, from nvidia-smi topo -m.

Currently, it seems that the GPUs are selected randomly.
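To make the request concrete, here is a minimal sketch of topology-aware selection. It is not the server's code. It assumes the matrix printed by nvidia-smi topo -m has already been parsed into a nested list of link labels (NV#, PIX, PXB, PHB, NODE, SYS). The function names, the numeric costs, and the example 8-GPU layout are hypothetical.

```python
from itertools import combinations

# Illustrative costs for the link labels that `nvidia-smi topo -m` prints.
# Lower means a closer connection; the exact numbers are an assumption.
LINK_COST = {"NV": 0, "PIX": 1, "PXB": 2, "PHB": 3, "NODE": 4, "SYS": 5}


def link_cost(label):
    # NVLink entries appear as NV1, NV2, ...; treat them all as the best class.
    if label.startswith("NV"):
        return LINK_COST["NV"]
    return LINK_COST[label]


def pick_gpus(free, k, topo):
    """Return the k free GPUs whose pairwise links have the lowest total cost.

    Returns None if fewer than k GPUs are free. Ties go to the
    lexicographically smallest group of indices.
    """
    if k > len(free):
        return None
    best = min(
        combinations(sorted(free), k),
        key=lambda group: sum(
            link_cost(topo[a][b]) for a, b in combinations(group, 2)
        ),
    )
    return list(best)


# Hypothetical 8-GPU machine: GPUs 0-3 and 4-7 form two well-connected groups.
example_topo = [
    ["X" if i == j else ("PIX" if i // 4 == j // 4 else "SYS") for j in range(8)]
    for i in range(8)
]

print(pick_gpus([0, 1, 4, 5, 6, 7], 4, example_topo))  # [4, 5, 6, 7]
```

The exhaustive search over combinations is only practical for the small GPU counts found on a single machine.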


yassersouri · 2019-07-25T15:07:52Z

I think an intermediate solution would be for the server to allocate GPUs in order, that is, as a run of consecutive indices.
For example, if GPUs 0, 1, 4, 5, 6, 7 are available and a job needs four, it is better if the server allocates 4, 5, 6, 7.

This is because P2P connections usually link GPUs with neighboring indices. For example, on the cvg32 machine in our lab, GPUs 0-3 and GPUs 4-7 each form a group with good P2P connections.
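A minimal sketch of this in-order heuristic, assuming the server has a list of free GPU indices. The function name and the fallback to the k lowest free indices are hypothetical and not taken from the server code.

```python
def pick_consecutive(free, k):
    """Return the first run of k consecutive free GPU indices.

    Falls back to the k lowest free indices when no such run exists
    (an assumed policy), and returns None if fewer than k GPUs are free.
    Assumes the indices in `free` are unique.
    """
    free = sorted(free)
    if k > len(free):
        return None
    for i in range(len(free) - k + 1):
        window = free[i:i + k]
        # With unique sorted indices, a window is consecutive exactly
        # when its last and first elements differ by k - 1.
        if window[-1] - window[0] == k - 1:
            return window
    return free[:k]


print(pick_consecutive([0, 1, 4, 5, 6, 7], 4))  # [4, 5, 6, 7]
```

A consecutive run can still straddle two groups (for example 2, 3, 4, 5 on a machine where 0-3 and 4-7 are the well-connected groups), so this is only an approximation of a real topology check.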

alexanderrichard · 2019-07-25T22:29:04Z

This should be easy to do. I don't have time this weekend but might be able to look into it next weekend.
