
About setting the batch size to 3072 #34

Open
Hanlard opened this issue Jan 4, 2021 · 2 comments

Comments

@Hanlard

Hanlard commented Jan 4, 2021

The paper says very little about the batch size. I have looked at NVIDIA's Megatron code: on a single V100, a 1.3B-parameter model (with model-parallel degree 2) supports a maximum batch_size of 16 (the default is 8). Without gradient accumulation, the maximum batch size on 64 GPUs would therefore be 512. How did you reach 3072?

@zzy14
Contributor

zzy14 commented Jan 19, 2021

We used gradient accumulation: a batch_size of 12 with 8 gradient-accumulation steps.

@lulu51230

So, is it 2 GPUs for model parallelism and 32 GPUs for data parallelism?
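Putting the thread's numbers together: with gradient accumulation, gradients from several micro-batches are combined before a single weight update, so the effective batch size is micro-batch size × accumulation steps × data-parallel ranks. A minimal framework-free sketch, assuming (as the last question suggests, not confirmed by the maintainers) that 64 GPUs split into model-parallel degree 2 and 32 data-parallel ranks:

```python
# Gradient-accumulation sketch in plain Python (no framework).
# Gradients from several micro-batches are averaged before one update,
# so one optimizer step "sees" micro_batch * accum_steps samples per rank.

MICRO_BATCH = 12     # per-GPU batch size (from the reply above)
ACCUM_STEPS = 8      # gradient-accumulation steps (from the reply above)
DATA_PARALLEL = 32   # assumed: 64 GPUs / model-parallel degree 2

def accumulated_update(weight, micro_batch_grads, lr=0.1):
    """Apply one optimizer step using the mean of accumulated gradients."""
    g = sum(micro_batch_grads) / len(micro_batch_grads)
    return weight - lr * g

# Effective global batch size per optimizer step:
effective_batch = MICRO_BATCH * ACCUM_STEPS * DATA_PARALLEL
print(effective_batch)  # 3072
```

This matches the 3072 reported in the paper: each GPU only ever holds 12 samples in memory at once, so the per-card memory limit the question describes is never exceeded.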


3 participants