Updating batch norm after EMA or checkpoint averaging #403
-
Thank you for sharing this repo. Updating the BN statistics seems necessary, since the statistics of the final model, whether obtained by checkpoint averaging or EMA, differ from those of any individual model that went into the average. Should a forward pass over the training data be made to update these statistics, and if not, why not? Thanks in advance.
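For context, the re-estimation the question describes can be sketched in plain Python. The helper below is hypothetical (not from this repo); it mimics the two update rules BatchNorm can use for its running mean: a fixed-momentum EMA, or a cumulative equal-weight average when momentum is `None`, which is the behavior `torch.optim.swa_utils.update_bn` relies on when recomputing stats after weight averaging.

```python
def recompute_bn_stats(batch_means, momentum=None):
    """Re-estimate a BN running mean from scratch over the training data.

    batch_means: per-batch means observed during the extra forward passes.
    momentum=None -> cumulative moving average (equal weight per batch),
    matching how PyTorch's BatchNorm behaves with momentum=None; a float
    gives the usual EMA update instead.
    """
    running, n = 0.0, 0
    for mean in batch_means:
        n += 1
        m = 1.0 / n if momentum is None else momentum
        running = (1.0 - m) * running + m * mean
    return running


# Cumulative averaging recovers the overall mean of the batch means.
print(recompute_bn_stats([1.0, 2.0, 3.0]))  # -> 2.0
```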
-
@shairoz-deci There have been multiple questions about this in past issues and discussions, but in my experience, no, it is not necessary, and I feel it works better not to. Many of the training recipes for popular models in TensorFlow, such as EfficientNet, average the BN stats as well. It really just gives you a longer time constant for the stats, since they are already EMA'd with the momentum param. I don't see why it would cause them to deviate enough to be a problem.
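The "longer time constant" point can be illustrated with a small sketch (the names and decay values below are illustrative assumptions, not from the repo): BatchNorm's running mean is already an EMA of per-batch means, so applying weight-EMA to that buffer just stacks a second, slower EMA on top of it.

```python
def ema_update(running, value, momentum):
    """One EMA step, the same form BatchNorm uses for its running stats."""
    return (1.0 - momentum) * running + momentum * value


batch_means = [1.0, 2.0, 3.0, 4.0, 5.0]

bn_running = 0.0   # BN buffer maintained during training (momentum=0.1,
                   # the PyTorch default)
ema_of_bn = 0.0    # weight-EMA applied to that buffer too (decay 0.999,
                   # a common ModelEma-style choice, assumed here)
for m in batch_means:
    bn_running = ema_update(bn_running, m, momentum=0.1)
    ema_of_bn = ema_update(ema_of_bn, bn_running, momentum=0.001)

# Both track the same underlying statistic; the stacked EMA just lags
# with a longer effective time constant rather than producing an
# inconsistent value.
print(bn_running, ema_of_bn)
```

In other words, averaging the BN buffers along with the weights keeps them as a valid, if smoother, estimate of the activation statistics, which is why recomputing them is usually unnecessary.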