Fix run_demo(demo_model_parallel, world_size) issue #2367
Conversation
Changing the value of `world_size` affects the device assignment. Please update the `dev` calculation in the model class to reflect the new `world_size` value.
In the function `demo_model_parallel`, `dev0` and `dev1` are computed in a way that assigns two distinct GPUs to each process. This is achieved by doubling the rank and applying a modulo operation with twice the `world_size`. Assuming 8 GPUs, `world_size` is set to 4, leading to the creation of 4 processes, each of which is allocated two distinct GPUs. For instance, the first process (process 0) is assigned GPUs 0 and 1, the second process (process 1) is assigned GPUs 2 and 3, and so forth.
@subramen I've updated the calculation in a simple way to take the prior division into account. Now `dev0` and `dev1` are calculated as `dev0 = (rank * 2) % (world_size * 2)` and `dev1 = (rank * 2 + 1) % (world_size * 2)`.
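Concretely, a minimal plain-Python sketch of that mapping (the 8-GPU / 4-process numbers are just the example from the comment above, not part of the tutorial code):

```python
# Device assignment with one process per GPU pair,
# assuming 8 GPUs, so world_size = n_gpus // 2 = 4.
world_size = 4

for rank in range(world_size):
    dev0 = (rank * 2) % (world_size * 2)
    dev1 = (rank * 2 + 1) % (world_size * 2)
    print(f"process {rank} -> GPUs {dev0} and {dev1}")

# process 0 -> GPUs 0 and 1
# process 1 -> GPUs 2 and 3
# process 2 -> GPUs 4 and 5
# process 3 -> GPUs 6 and 7
```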
@subramen going back on it, does the `% (world_size * 2)` make any difference here, given that `rank` is always less than `world_size`?
Yes, you don't actually need the `% (world_size * 2)`; plain `dev0 = rank * 2` and `dev1 = rank * 2 + 1` should work well now, assuming half as many processes as there are GPUs.
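A quick sanity check of that point (plain Python; the `n_gpus = 8` value is only the running example from this thread): since every rank satisfies `rank < world_size`, it follows that `rank * 2 + 1 < world_size * 2`, so the modulo never changes the result.

```python
# With world_size = n_gpus // 2, the modulo is a no-op:
# rank < world_size implies rank * 2 + 1 < world_size * 2.
n_gpus = 8                  # assumed GPU count for this example
world_size = n_gpus // 2

for rank in range(world_size):
    with_mod = ((rank * 2) % (world_size * 2), (rank * 2 + 1) % (world_size * 2))
    without_mod = (rank * 2, rank * 2 + 1)
    assert with_mod == without_mod, (rank, with_mod, without_mod)
print("modulo makes no difference for any rank")
```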
Fixes #1750
Description
Fixes the `run_demo(demo_model_parallel, world_size)` issue as described in #1750.
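For context, here is a condensed, self-contained sketch of the fixed flow. It follows the structure of the tutorial's `ToyMpModel` and `demo_model_parallel`, but the rendezvous settings, backend choice, and tensor shapes below are illustrative assumptions rather than the exact tutorial code:

```python
import os

import torch
import torch.distributed as dist
import torch.multiprocessing as mp
import torch.nn as nn
import torch.optim as optim
from torch.nn.parallel import DistributedDataParallel as DDP


def setup(rank, world_size):
    # Illustrative rendezvous settings; the tutorial has its own setup() helper.
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "12355"
    dist.init_process_group("gloo", rank=rank, world_size=world_size)


class ToyMpModel(nn.Module):
    # Two linear layers pipelined across two devices, as in the tutorial.
    def __init__(self, dev0, dev1):
        super().__init__()
        self.dev0, self.dev1 = dev0, dev1
        self.net1 = nn.Linear(10, 10).to(dev0)
        self.relu = nn.ReLU()
        self.net2 = nn.Linear(10, 5).to(dev1)

    def forward(self, x):
        x = self.relu(self.net1(x.to(self.dev0)))
        return self.net2(x.to(self.dev1))


def demo_model_parallel(rank, world_size):
    setup(rank, world_size)
    # Each process drives a pair of adjacent GPUs; no modulo is needed
    # once world_size is half the GPU count.
    dev0 = rank * 2
    dev1 = rank * 2 + 1
    ddp_mp_model = DDP(ToyMpModel(dev0, dev1))

    loss_fn = nn.MSELoss()
    optimizer = optim.SGD(ddp_mp_model.parameters(), lr=0.001)

    outputs = ddp_mp_model(torch.randn(20, 10))  # ends up on dev1
    labels = torch.randn(20, 5).to(dev1)
    optimizer.zero_grad()
    loss_fn(outputs, labels).backward()
    optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    n_gpus = torch.cuda.device_count()
    assert n_gpus >= 2, f"Requires at least 2 GPUs, got {n_gpus}"
    # The fix: spawn one process per GPU *pair*.
    world_size = n_gpus // 2
    mp.spawn(demo_model_parallel, args=(world_size,), nprocs=world_size, join=True)
```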
Checklist
cc @mrshenli @osalpekar @H-Huang @kwen2501