[ViT] Support fine-tuning with different image resolution #5025

Merged
merged 14 commits into from
Dec 9, 2021

Conversation

yiwen-song

@yiwen-song yiwen-song commented Dec 3, 2021

As discussed in #4594, we should be able to interpolate the position embeddings of a ViT checkpoint trained at one image resolution so that the model can be fine-tuned at a different resolution.
This PR adds support for that.
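To make the idea concrete, here is a minimal, dependency-free sketch of resizing a square grid of position embeddings. It is an illustration only, not the code in this PR: the function names `resize_pos_grid` and `interpolate_pos_embedding` are hypothetical, plain Python lists stand in for PyTorch tensors, and bilinear resampling with aligned corners is assumed purely for brevity (the commit list for this PR mentions that the interpolation mode is configurable).

```python
from typing import List

Grid = List[List[List[float]]]  # indexed as [row][col][hidden_dim]


def resize_pos_grid(grid: Grid, new_side: int) -> Grid:
    """Bilinearly resample a square grid of embedding vectors to new_side x new_side.

    Corners are kept aligned: the first and last rows/columns map onto themselves.
    """
    old_side = len(grid)
    dim = len(grid[0][0])
    # Maps an index in the new grid onto a (fractional) coordinate in the old grid.
    scale = (old_side - 1) / (new_side - 1) if new_side > 1 else 0.0
    out: Grid = []
    for i in range(new_side):
        y = i * scale
        y0 = min(int(y), old_side - 1)
        y1 = min(y0 + 1, old_side - 1)
        wy = y - y0
        row = []
        for j in range(new_side):
            x = j * scale
            x0 = min(int(x), old_side - 1)
            x1 = min(x0 + 1, old_side - 1)
            wx = x - x0
            # Weighted average of the four surrounding embedding vectors.
            vec = [
                (1 - wy) * ((1 - wx) * grid[y0][x0][k] + wx * grid[y0][x1][k])
                + wy * ((1 - wx) * grid[y1][x0][k] + wx * grid[y1][x1][k])
                for k in range(dim)
            ]
            row.append(vec)
        out.append(row)
    return out


def interpolate_pos_embedding(pos: List[List[float]], new_side: int) -> List[List[float]]:
    """Resize position embeddings laid out as [class token] + row-major patch grid."""
    class_tok, patches = pos[0], pos[1:]
    old_side = int(round(len(patches) ** 0.5))
    if old_side * old_side != len(patches):
        raise ValueError("patch embeddings do not form a square grid")
    grid = [patches[r * old_side:(r + 1) * old_side] for r in range(old_side)]
    resized = resize_pos_grid(grid, new_side)
    # The class token has no spatial position, so it is copied unchanged.
    return [class_tok] + [vec for row in resized for vec in row]
```

The key design point is that only the patch embeddings are treated as a 2D image and resampled; the class-token embedding is split off first and re-attached afterwards.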

References: ClassyVision Implementation

Experiments:

  • Launching Command:
PYTHONPATH=$PYTHONPATH:`pwd` python -u ~/workspace/scripts/run_with_submitit.py --timeout 3000 --ngpus 8 --nodes 4 --partition train --model vit_b_16 --batch-size 16 --epochs 8 --opt sgd --lr 0.01 --wd 0 --lr-scheduler cosineannealinglr --amp --mixup-alpha 0.2 --auto-augment ra --data-path /datasets01_ontap/imagenet_full_size/061417/ --clip-grad-norm 1 --cutmix-alpha 1.0 --resume /checkpoints/sallysyw/experiments/8022/model_299.pth --train-crop-size 384 --val-crop-size 384
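The reason a checkpoint cannot simply be loaded at a new crop size is that the number of position embeddings depends on the image resolution. The sketch below shows the arithmetic, assuming a patch size of 16 (as the model name `vit_b_16` suggests) and one extra class token; the helper name `vit_seq_length` is hypothetical, and 224 is used only as an example of an original training resolution, since the command above does not state it.

```python
def vit_seq_length(image_size: int, patch_size: int) -> int:
    """Number of tokens a ViT sees: one per patch plus one class token."""
    if image_size % patch_size != 0:
        raise ValueError("image size must be divisible by patch size")
    side = image_size // patch_size  # patches per row (and per column)
    return side * side + 1


# Example: a model trained on 224x224 crops versus fine-tuning on 384x384 crops.
print(vit_seq_length(224, 16))  # 14 * 14 + 1
print(vit_seq_length(384, 16))  # 24 * 24 + 1
```

Under these assumptions the position-embedding table grows from 197 to 577 rows, which is why the embeddings in the resumed checkpoint have to be interpolated before fine-tuning.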

cc @datumbox

@facebook-github-bot

facebook-github-bot commented Dec 3, 2021

💊 CI failures summary and remediations

As of commit 73093f0 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.


@datumbox datumbox left a comment

Thanks for the PR @sallysyw, let me know your thoughts.

torchvision/prototype/models/vision_transformer.py: two review threads (outdated, resolved)
@yiwen-song yiwen-song requested a review from fmassa December 7, 2021 23:09

@datumbox datumbox left a comment

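@sallysyw I pushed a change to your branch to fix the typing issues. The problem is that OrderedDict is not subscriptable at runtime here, so an annotation such as OrderedDict[...] fails when the function is defined. Putting the annotation in quotes does the trick.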
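For readers unfamiliar with this failure mode: on Python versions before 3.9, evaluating `collections.OrderedDict[str, ...]` raises `TypeError: 'type' object is not subscriptable`, and annotations are normally evaluated when the function is defined. A minimal sketch of the quoting workaround follows; the function `add_prefix` and its signature are hypothetical and are not taken from this PR.

```python
from collections import OrderedDict


def add_prefix(state: "OrderedDict[str, float]", prefix: str) -> "OrderedDict[str, float]":
    """Return a copy of state with prefix prepended to every key."""
    # The quoted annotations are stored as plain strings and are not evaluated
    # when the function is defined, so this also works on interpreters where
    # collections.OrderedDict cannot be subscripted.
    return OrderedDict((prefix + key, value) for key, value in state.items())


renamed = add_prefix(OrderedDict(weight=1.0), "encoder.")
print(list(renamed))
```

Static type checkers still understand the quoted form, so the type information is not lost.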

I'm approving to unblock your work, but it's important to follow up with another PR that adds some tests to cover the method.

torchvision/prototype/models/vision_transformer.py: one review thread (outdated, resolved)
@yiwen-song yiwen-song merged commit 1b14829 into pytorch:main Dec 9, 2021
@github-actions

github-actions bot commented Dec 9, 2021

Hey @sallysyw!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

@yiwen-song yiwen-song deleted the checkpoint branch December 9, 2021 23:10
@yiwen-song yiwen-song linked an issue Dec 10, 2021 that may be closed by this pull request
facebook-github-bot pushed a commit that referenced this pull request Dec 21, 2021
…5025)

Summary:
* add from_checkpoint method for vit

* remove useless change

* Making interpolate_embeddings a utility function

* remove logging

* fix type hint

* fix return type check

* add returns in docstring & unify type hint

* remove useless import

* fix issue: 'type' object is not subscriptable

* Fixing typing issues

* Making interpolation mode configurable

* formatting

Reviewed By: prabhat00155

Differential Revision: D33253466

fbshipit-source-id: 79bf6855f2dcee3c2fef6c05c243a0dc8dfee25e

Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Adding Vision Transformer to torchvision/models
4 participants