[FEATURE] DINOv2 pretrained vit weights #1779
Comments
Yeah, it's fairly easy to add, especially with the recent eva.py variant since it includes the SwiGLU; it just needs a slight remap, as most of the model they use is derived from timm at some point. Will do, but not urgent-level priority since it's a non-commercial license and all :)
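The "slight remap" above refers to renaming checkpoint parameter keys so the released DINOv2 weights line up with timm's module names. A minimal sketch of that kind of remap is below; the specific key prefixes (`mlp.w12`, `mlp.fc1`, etc.) are illustrative assumptions, not the mapping timm actually ships.

```python
def remap_dinov2_keys(state_dict):
    """Rename checkpoint keys from (assumed) DINOv2-style names to
    (assumed) timm-style names. The rules below are illustrative only."""
    # (source substring, target substring) rename rules -- hypothetical
    rules = [
        ("mlp.w12.", "mlp.fc1."),  # fused SwiGLU input projection
        ("mlp.w3.", "mlp.fc2."),   # SwiGLU output projection
    ]
    out = {}
    for key, value in state_dict.items():
        for src, dst in rules:
            key = key.replace(src, dst)
        out[key] = value
    return out
```

The remapped dict can then be passed to a model's `load_state_dict`; any keys left unmatched would show up via `strict=True` loading, which is a convenient way to discover missing rules.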
As I am also relying on DINOv2 for my work, I will implement it and submit a pull request within the next few days.
I have implemented the small, base, and large models, and they appear to be functioning correctly. However, I need to fix the SwiGLU before proceeding with the giant model. Specifically, the code below attempts to initialize a non-existent linear layer fc1a, which raises an error: pytorch-image-models/timm/layers/mlp.py, lines 133 to 134 at 7326470
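For reference, a minimal standalone sketch of the SwiGLU MLP being discussed is below. It uses two separate input projections (gate and value branches); the layer names `fc1_g`/`fc1_x` are assumptions for illustration and the block is not timm's actual implementation.

```python
import torch
import torch.nn as nn


class SwiGLU(nn.Module):
    """Minimal SwiGLU MLP sketch: SiLU-gated hidden layer followed by an
    output projection. Layer names are illustrative, not timm's."""

    def __init__(self, in_features, hidden_features, out_features=None):
        super().__init__()
        out_features = out_features or in_features
        self.fc1_g = nn.Linear(in_features, hidden_features)  # gate branch
        self.fc1_x = nn.Linear(in_features, hidden_features)  # value branch
        self.act = nn.SiLU()
        self.fc2 = nn.Linear(hidden_features, out_features)

    def forward(self, x):
        # silu(gate) * value, then project back to out_features
        return self.fc2(self.act(self.fc1_g(x)) * self.fc1_x(x))
```

The bug described above is the kind that only surfaces on instantiation, so a quick shape check on a dummy batch is enough to catch it.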
Additionally, the official repository supports image sizes other than 518 by dynamically resizing the positional embedding at inference time. I am uncertain about the best way to incorporate this feature into timm.
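The dynamic resizing mentioned above is typically done by reshaping the patch positional embeddings back onto their 2D grid and bicubically interpolating to the new grid size. A hedged sketch of that trick follows; the function and argument names are assumptions, not an API from either repository.

```python
import math
import torch
import torch.nn.functional as F


def resize_pos_embed(pos_embed, new_grid, num_prefix_tokens=1):
    """Resize a ViT positional embedding to a new patch grid.

    pos_embed: tensor of shape (1, num_prefix_tokens + H*W, C), where the
    first num_prefix_tokens entries (e.g. the class token) are not resized.
    new_grid: (new_h, new_w) target grid in patches.
    """
    prefix = pos_embed[:, :num_prefix_tokens]
    grid = pos_embed[:, num_prefix_tokens:]
    old_size = int(math.sqrt(grid.shape[1]))  # assumes a square source grid
    c = grid.shape[-1]
    # (1, N, C) -> (1, C, H, W) for spatial interpolation
    grid = grid.reshape(1, old_size, old_size, c).permute(0, 3, 1, 2)
    grid = F.interpolate(grid, size=new_grid, mode="bicubic", align_corners=False)
    # back to (1, new_h * new_w, C) and reattach the prefix tokens
    grid = grid.permute(0, 2, 3, 1).reshape(1, new_grid[0] * new_grid[1], c)
    return torch.cat([prefix, grid], dim=1)
```

With patch size 14, an image size of 518 gives a 37x37 grid, so e.g. a 336-pixel input would call this with `new_grid=(24, 24)`.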
Update: I fixed the SwiGLU bug and updated the giant model. |
added thanks to @leng-yue ... will update README shortly before next release |
Is your feature request related to a problem? Please describe.
DINOv2 has been open-sourced with pretrained ViT weights. Could they be added to timm?