fp16 fixes #222

beiwang2003 · 2022-09-27T12:14:57Z

In AMP fp16 mode, overflow was observed in several places in the network. To address this, those regions are now computed in fp32. The added performance cost is not large enough to offset the benefits of mixed-precision training. We also explicitly disable the memory-efficient kernel (use_memory_efficient_kernel=false) in AMP fp16 mode.
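For readers unfamiliar with the pattern, here is a minimal sketch of what "converting a region back to fp32" can look like in PyTorch. It is an illustration only, not the code from this PR: the function name `attention_logits_fp32` and the tensor shapes are hypothetical. It locally disables autocast and upcasts the inputs, so a dot product that exceeds the fp16 maximum (65504) stays finite.

```python
import torch


def attention_logits_fp32(q: torch.Tensor, k: torch.Tensor) -> torch.Tensor:
    """Compute q @ k^T in fp32, even when called inside an autocast region.

    Hypothetical helper for illustration; not taken from the PR.
    """
    # Disable autocast locally so matmul is not cast back to fp16.
    with torch.autocast(device_type=q.device.type, enabled=False):
        return torch.matmul(q.float(), k.float().transpose(-1, -2))


# 64 * (40 * 40) = 102400, which is above the largest finite fp16 value.
q = torch.full((1, 64), 40.0, dtype=torch.float16)
k = torch.full((1, 64), 40.0, dtype=torch.float16)

logits = attention_logits_fp32(q, k)

# Casting the fp32 result down to fp16 shows the overflow the region avoids.
overflowed = logits.half()
```

Only the upcast region pays the fp32 cost; the rest of the forward pass can remain under autocast.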

It is worth mentioning the debugging tool developed at Hugging Face for detecting overflow issues: https://huggingface.co/docs/transformers/v4.14.1/en/debugging
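As a rough idea of how such detection can work, the sketch below registers forward hooks and reports the modules whose output contains inf or NaN. This is a simplified stand-in written for illustration, not the Hugging Face tool and not its API; `find_nonfinite_modules` and `ToHalf` are hypothetical names.

```python
from typing import List

import torch
from torch import nn


def find_nonfinite_modules(model: nn.Module, *inputs: torch.Tensor) -> List[str]:
    """Run one forward pass and return names of modules that output inf/NaN."""
    flagged: List[str] = []
    handles = []

    def make_hook(name: str):
        def hook(module, args, output):
            # Only plain tensor outputs are checked in this sketch.
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                flagged.append(name)
        return hook

    for name, module in model.named_modules():
        handles.append(module.register_forward_hook(make_hook(name)))
    try:
        with torch.no_grad():
            model(*inputs)
    finally:
        # Always remove the hooks so the model is left unchanged.
        for handle in handles:
            handle.remove()
    return flagged


class ToHalf(nn.Module):
    """Toy module that casts to fp16, which overflows for large inputs."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x.half()


model = nn.Sequential(nn.Identity(), ToHalf())
flagged = find_nonfinite_modules(model, torch.tensor([1.0e5]))
```

Here `flagged` contains `"1"` (the `ToHalf` submodule) and the empty string for the root module, whose output is the same overflowed tensor; the first flagged entry points at where the overflow originates.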

gahdritz · 2022-09-28T03:00:44Z

Thanks!

Bei Wang and others added 2 commits September 21, 2022 05:11

convert suspicious fp16 regions back to fp32

aef97f4

turn off use_memory_efficient_kernel off only for fp16 in primitives.py

4d5fa31

gahdritz merged commit 9082c25 into aqlaboratory:main Sep 28, 2022
