I think an intermediate solution would be for the server to allocate GPUs in index order.
For example, if GPUs 0, 1, 4, 5, 6, 7 are available and a job needs four, it is better for the server to allocate 4, 5, 6, 7.
This is because P2P connections usually link GPUs with adjacent indices. For example, on the cvg32 machine in our lab, GPUs 0-3 and GPUs 4-7 each form a group with good P2P connections.
For jobs that need more than one GPU, it would be beneficial if the selection took the topology of the GPU P2P connections into account.
This information can be seen, for example, from `nvidia-smi topo -m`. Currently it seems that the GPUs are selected randomly.
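
To make the idea concrete, here is a minimal Python sketch of the selection I have in mind. It is not the server's actual allocation code: the function name `pick_gpus` and the `groups` argument are hypothetical, and the groups are hard-coded from the cvg32 layout described above rather than parsed from `nvidia-smi topo -m`. The sketch prefers a single P2P group that can hold the whole job and only spans groups when no single group has enough free GPUs.

```python
from typing import List, Optional, Sequence


def pick_gpus(
    free: Sequence[int],
    n: int,
    groups: Sequence[Sequence[int]],
) -> Optional[List[int]]:
    """Pick n free GPUs, preferring GPUs that share one P2P group.

    `groups` is an assumed description of the topology, e.g.
    [[0, 1, 2, 3], [4, 5, 6, 7]] for a machine like cvg32.
    Returns None if fewer than n GPUs are free.
    """
    free_set = set(free)
    if n > len(free_set):
        return None

    # Free GPUs inside each P2P group, in index order.
    per_group = [sorted(g for g in group if g in free_set) for group in groups]

    # Best fit: use the group with the fewest free GPUs that still
    # holds the whole job, so larger groups stay intact for later jobs.
    fitting = [g for g in per_group if len(g) >= n]
    if fitting:
        return min(fitting, key=len)[:n]

    # Fallback: no single group is large enough, so span groups,
    # taking GPUs from the groups with the most free GPUs first.
    chosen: List[int] = []
    for g in sorted(per_group, key=len, reverse=True):
        chosen.extend(g)
    # Free GPUs that are not listed in any group come last.
    chosen.extend(sorted(free_set - set(chosen)))
    return sorted(chosen[:n])


if __name__ == "__main__":
    cvg32_groups = [[0, 1, 2, 3], [4, 5, 6, 7]]  # assumed layout
    print(pick_gpus([0, 1, 4, 5, 6, 7], 4, cvg32_groups))
```

With the example from above (GPUs 0, 1, 4, 5, 6, 7 free, four requested), this selects 4, 5, 6, 7. A two-GPU job in the same situation would get 0 and 1, which leaves the fully free group available for a later four-GPU job.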