Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further enhance Classification Reference #4444

Merged
merged 11 commits into from
Sep 21, 2021

Conversation

datumbox
Copy link
Contributor

@datumbox datumbox commented Sep 18, 2021

Resolves #4281 and partially resolves #3995

Adding in a BC manner the features necessary to the classification reference so that we can reproduce the "Fast and Accurate Model Scaling" recipe and support some of the methods listed at "Bag of Tricks for Image Classification with Convolutional Neural Networks".

I'm not modifying the stepping scheme of the scheduler yet. We keep doing it on epoch level to minimize changes.

cc @pdollar @mannatsingh

@datumbox
Copy link
Contributor Author

The failing tests are unrelated.

@datumbox datumbox force-pushed the references/classification_improvements branch from fb775f8 to e51ddb0 Compare September 19, 2021 23:42
Copy link
Contributor

@prabhat00155 prabhat00155 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks @datumbox!

if args.distributed and args.sync_bn:
model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)

criterion = nn.CrossEntropyLoss(label_smoothing=args.label_smoothing)

opt_name = args.opt.lower()
if opt_name == 'sgd':
if opt_name.startswith("sgd"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? Are there other options starting with sgd?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a nasty workaround to support nesterov. Instead of adding another flat args parameter that is useful only when SGD is selected, the user can request for sgd_nesterov.

references/classification/train.py Outdated Show resolved Hide resolved
@datumbox datumbox merged commit c712016 into pytorch:main Sep 21, 2021
@datumbox datumbox deleted the references/classification_improvements branch September 21, 2021 20:09
@mannatsingh
Copy link

mannatsingh commented Sep 21, 2021

I'm not modifying the stepping scheme of the scheduler yet. We keep doing it on epoch level to minimize changes.

pycls also schedules the LR only at the epoch level, so this is 100% consistent with the Fast and accurate paper :)

@datumbox
Copy link
Contributor Author

@mannatsingh Thanks for the clarification and help on this. :) I mentioned it because we are considering switching to a per iter call, so I'm clarifying here I've not done this yet.

Also as we discussed offline, though our code supports most of the ingredients(?) of the "Fast and accurate" recipe, there are differences. For transparency I list here those off the top of my head:

  1. No PCA noise
  2. Different scheme for EMA (per epoch vs per X steps, can be approximated with different params)
  3. Different configuration for Schedulers (can be made compatible with right params)
  4. No support for custom weight decay for BN layers

facebook-github-bot pushed a commit that referenced this pull request Sep 30, 2021
Summary:
* Adding ExponentialLR and LinearLR

* Fix arg type of --lr-warmup-decay

* Adding support of Zero gamma BN and SGD with nesterov.

* Fix --lr-warmup-decay for video_classification.

* Update bn_reinit

* Fix pre-existing bug on num_classes of model

* Remove zero gamma.

* Use fstrings.

Reviewed By: datumbox

Differential Revision: D31268051

fbshipit-source-id: 4e00d675ded2979f9a6fa7715086e8ca077c1805
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants