[ZeRO-3] Partitioned init with deepspeed.zero.Init() #1190

Merged: 3 commits into EleutherAI:main from lang/z3-init, Mar 19, 2024

Conversation

R0n12
Contributor

@R0n12 R0n12 commented Mar 18, 2024

fixes #1189

@R0n12 R0n12 marked this pull request as ready for review March 18, 2024 09:38
@R0n12 R0n12 requested a review from Quentin-Anthony as a code owner March 18, 2024 09:38
@Quentin-Anthony
Member

Thanks @R0n12! Can you quickly test if my cleanup changes still work for your case?

@R0n12
Contributor Author

R0n12 commented Mar 19, 2024

  • 13B over 16 A100
  • 6.7B over 16 A100

I think we are good!
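
For a rough sense of why partitioned initialization matters at the two sizes tested above, here is a back-of-the-envelope sketch. It is not taken from the PR: it assumes 2 bytes per parameter (fp16/bf16), nominal parameter counts of 13e9 and 6.7e9, and that all 16 GPUs form a single ZeRO-3 partition group with weights split evenly. It counts weights only and ignores gradients, optimizer states, and activations. The helper `param_gib_per_rank` is a hypothetical name used only for illustration.

```python
def param_gib_per_rank(n_params, n_ranks, bytes_per_param=2):
    """Weight memory in GiB held by each rank when parameters are split evenly.

    Assumptions (not from the PR): 2 bytes per parameter, even partitioning,
    weights only (no gradients, optimizer states, or activations).
    """
    return n_params * bytes_per_param / n_ranks / 2**30


# Nominal sizes for the two configurations mentioned in the thread.
for name, n_params in [("13B", 13e9), ("6.7B", 6.7e9)]:
    full = param_gib_per_rank(n_params, 1)      # every rank builds the whole model
    sharded = param_gib_per_rank(n_params, 16)  # weights partitioned over 16 ranks
    print(f"{name}: {full:.1f} GiB unpartitioned, {sharded:.2f} GiB per rank over 16 ranks")
```

Under these assumptions, constructing the model without partitioning would require every rank to hold the full set of weights at once, while partitioning at construction time keeps the per-rank share roughly 16 times smaller.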

@Quentin-Anthony Quentin-Anthony merged commit 7267a74 into EleutherAI:main Mar 19, 2024
2 of 5 checks passed
@R0n12 R0n12 deleted the lang/z3-init branch March 20, 2024 00:11
@R0n12 R0n12 restored the lang/z3-init branch March 20, 2024 03:53
Successfully merging this pull request may close these issues.

Large model instantiation using DeepSpeed.zero.Init under ZeRO-3
2 participants