-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Further enhance Classification Reference #4444
Further enhance Classification Reference #4444
Conversation
The failing tests are unrelated. |
fb775f8
to
e51ddb0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, thanks @datumbox!
if args.distributed and args.sync_bn: | ||
model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model) | ||
|
||
criterion = nn.CrossEntropyLoss(label_smoothing=args.label_smoothing) | ||
|
||
opt_name = args.opt.lower() | ||
if opt_name == 'sgd': | ||
if opt_name.startswith("sgd"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change? Are there other options starting with sgd?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a nasty workaround to support nesterov. Instead of adding another flat args
parameter that is useful only when SGD is selected, the user can request for sgd_nesterov
.
pycls also schedules the LR only at the epoch level, so this is 100% consistent with the Fast and accurate paper :) |
@mannatsingh Thanks for the clarification and help on this. :) I mentioned it because we are considering switching to a per iter call, so I'm clarifying here I've not done this yet. Also as we discussed offline, though our code supports most of the ingredients(?) of the "Fast and accurate" recipe, there are differences. For transparency I list here those off the top of my head:
|
Summary: * Adding ExponentialLR and LinearLR * Fix arg type of --lr-warmup-decay * Adding support of Zero gamma BN and SGD with nesterov. * Fix --lr-warmup-decay for video_classification. * Update bn_reinit * Fix pre-existing bug on num_classes of model * Remove zero gamma. * Use fstrings. Reviewed By: datumbox Differential Revision: D31268051 fbshipit-source-id: 4e00d675ded2979f9a6fa7715086e8ca077c1805
Resolves #4281 and partially resolves #3995
Adding in a BC manner the features necessary to the classification reference so that we can reproduce the "Fast and Accurate Model Scaling" recipe and support some of the methods listed at "Bag of Tricks for Image Classification with Convolutional Neural Networks".
I'm not modifying the stepping scheme of the scheduler yet. We keep doing it on epoch level to minimize changes.
cc @pdollar @mannatsingh