[ViT] Support fine-tuning with different image resolution #5025
Thanks for the PR @sallysyw, let me know your thoughts.
@sallysyw I pushed a change to your branch to fix the typing issues. The problem is that OrderedDict is not subscriptable in the type annotations. Putting the annotation in quotes does the trick.
I'm approving to unblock your work, but it's important to follow up with another PR that adds some tests to cover the method.
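For readers following along, here is a minimal sketch of the quoting fix described above. It is not the code from this PR: the function name rename_keys and its parameters are hypothetical, chosen only for illustration. The assumption is that the failure comes from subscripting collections.OrderedDict in an annotation on an interpreter where that class does not support subscripting (to my knowledge, Python versions before 3.9), which raises "'type' object is not subscriptable" when the function is defined. A quoted annotation is stored as a plain string and is not evaluated at definition time, so the same signature imports cleanly.

```python
from collections import OrderedDict


# Hypothetical helper, for illustration only (not part of torchvision).
# The annotations are quoted, so Python stores them as strings and never
# evaluates OrderedDict[str, int] when the function is defined.
def rename_keys(state: "OrderedDict[str, int]", prefix: str) -> "OrderedDict[str, int]":
    """Return a copy of `state` with `prefix` prepended to every key."""
    return OrderedDict((prefix + key, value) for key, value in state.items())


renamed = rename_keys(OrderedDict(a=1, b=2), "encoder.")
print(list(renamed.items()))
print(rename_keys.__annotations__["state"])
```

Static type checkers still read the quoted form as a normal generic annotation, so the type information is not lost.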
Hey @sallysyw! You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py
…5025)

Summary:
* add from_checkpoint method for vit
* remove useless change
* Making interpolate_embeddings a utility function
* remove logging
* fix type hint
* fix return type check
* ad retuurns in docsting & unify type hint
* remove useless import
* fix issue: 'type' object is not subscriptable
* Fixing typing issues
* Making interpolation mode configurable
* formatting

Reviewed By: prabhat00155

Differential Revision: D33253466

fbshipit-source-id: 79bf6855f2dcee3c2fef6c05c243a0dc8dfee25e

Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
As discussed in #4594, we should be able to interpolate embeddings from one resolution to a different one when training ViT models.
This PR adds support for it.
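To make the idea concrete, below is a dependency-free sketch of resizing a position embedding from one patch grid to another. It is an illustration under stated assumptions, not the implementation in this PR: the function name interpolate_pos_embedding is hypothetical, the embedding is held as plain Python lists rather than tensors, row 0 is assumed to be the class token, the patch grid is assumed to be square, and bilinear interpolation with aligned corners is swapped in for whatever configurable interpolation mode the actual utility uses.

```python
import math
from typing import List

Vector = List[float]


def interpolate_pos_embedding(pos_embedding: List[Vector], new_grid: int) -> List[Vector]:
    """Resize a (1 + g*g) x D position embedding to (1 + new_grid**2) x D.

    Row 0 is assumed to be the class-token embedding and is copied unchanged.
    The remaining rows are treated as a row-major g x g grid of patch
    embeddings and resampled bilinearly, channel by channel.
    """
    class_token, patches = pos_embedding[0], pos_embedding[1:]
    old_grid = math.isqrt(len(patches))
    if old_grid == 0 or old_grid * old_grid != len(patches):
        raise ValueError("patch embeddings must form a non-empty square grid")
    if new_grid < 1:
        raise ValueError("new_grid must be at least 1")

    def source_coord(i: int) -> float:
        # Map output index i to a fractional input index, endpoints aligned.
        if new_grid == 1:
            return 0.0
        return i * (old_grid - 1) / (new_grid - 1)

    resized = [list(class_token)]
    for row in range(new_grid):
        y = source_coord(row)
        y0 = int(y)
        y1 = min(y0 + 1, old_grid - 1)
        wy = y - y0
        for col in range(new_grid):
            x = source_coord(col)
            x0 = int(x)
            x1 = min(x0 + 1, old_grid - 1)
            wx = x - x0
            # The four neighbouring patch embeddings around (y, x).
            top_left = patches[y0 * old_grid + x0]
            top_right = patches[y0 * old_grid + x1]
            bottom_left = patches[y1 * old_grid + x0]
            bottom_right = patches[y1 * old_grid + x1]
            resized.append([
                (1 - wy) * ((1 - wx) * a + wx * b) + wy * ((1 - wx) * c + wx * d)
                for a, b, c, d in zip(top_left, top_right, bottom_left, bottom_right)
            ])
    return resized


# Toy example: a 2x2 patch grid with 1-dimensional embeddings, resized to 3x3.
toy = [[9.0], [0.0], [1.0], [2.0], [3.0]]
resized = interpolate_pos_embedding(toy, new_grid=3)
print(len(resized))  # 10 rows: 1 class token + 3*3 patches
```

Keeping the class token out of the resampling matters because it has no spatial position. Only the patch embeddings live on the image grid, so only they should change when the input resolution changes.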
References: ClassyVision Implementation
Experiments:
cc @datumbox