Add standard vision models #1
I will be starting with AlexNet.
FYI, I had made an old PR in FluxML/Metalhead.jl#17 which added most of these models (there have been some changes in Flux since then, but it should still work with minor changes). We never merged the PR due to some CI test failures and a lack of trained weights, so it might be a helpful reference.
Agreed, we should not replicate work. The ResNet in Metalhead is also generic but only exposes the ResNet50 model. A PR that adds a kwarg and constructs these variants appropriately would get us most of the way there, and it would be easy to integrate with the rest of the CI.
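Concretely, the kwarg idea could look something like the sketch below, where `build_resnet` is a hypothetical stand-in for the generic constructor already in Metalhead, and the block counts are the standard configurations from the ResNet paper:

```julia
# Sketch only: `build_resnet` stands in for Metalhead's existing generic
# ResNet constructor. Block counts and block styles are the standard configs.
const RESNET_CONFIGS = Dict(
    18  => ([2, 2, 2, 2],  :basic),
    34  => ([3, 4, 6, 3],  :basic),
    50  => ([3, 4, 6, 3],  :bottleneck),
    101 => ([3, 4, 23, 3], :bottleneck),
    152 => ([3, 8, 36, 3], :bottleneck),
)

function resnet(; depth = 50)
    haskey(RESNET_CONFIGS, depth) || error("no standard ResNet with depth $depth")
    blocks, style = RESNET_CONFIGS[depth]
    return build_resnet(blocks, style)  # hypothetical generic builder
end
```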
Though can I suggest not wrapping the models in structs like Metalhead? This adds the complexity of accessing the "actual" model through m.layers, and I don't think the type wrapper is useful for dispatch (e.g. if someone wanted to dispatch on ::AlexNet, it would be trivial to wrap it in a type). We should still copy the source code from Metalhead to avoid rewriting from scratch.
I think we should have FluxModels.jl be really terse, minimal constructions of standard models to allow for reuse across codebases. I recall instances on Slack etc. where folks are told to copy-paste the model definition from Metalhead because they don't want all the extra functionality that Metalhead provides. So, I think there is value in having a standard package of just functions that construct standard models. Later down the line we can have a PR to Metalhead that uses FluxModels.jl to construct the inner m.layers.
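To make that concrete, here is a minimal sketch of what a terse constructor could look like. The layer sizes follow torchvision's AlexNet, dropout is omitted for brevity, the pooling layer assumes the adaptive pooling discussed later in this thread, and the wrapper struct at the end is purely illustrative:

```julia
using Flux

# Minimal sketch: a standard model is just a function returning a plain Chain.
# Layer sizes follow torchvision's AlexNet; dropout omitted for brevity.
alexnet(; nclasses = 1000) = Chain(
    Conv((11, 11), 3 => 64, relu; stride = 4, pad = 2),
    MaxPool((3, 3); stride = 2),
    Conv((5, 5), 64 => 192, relu; pad = 2),
    MaxPool((3, 3); stride = 2),
    Conv((3, 3), 192 => 384, relu; pad = 1),
    Conv((3, 3), 384 => 256, relu; pad = 1),
    Conv((3, 3), 256 => 256, relu; pad = 1),
    MaxPool((3, 3); stride = 2),
    AdaptiveMeanPool((6, 6)),
    Flux.flatten,
    Dense(256 * 6 * 6, 4096, relu),
    Dense(4096, 4096, relu),
    Dense(4096, nclasses),
)

# And if someone does want a type to dispatch on, wrapping it is trivial:
struct AlexNet{T}
    layers::T
end
(m::AlexNet)(x) = m.layers(x)
```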
I would suggest not adding too much complexity. Maybe Metalhead can be the "public" API and we can have smaller packages that implement one model (re UNet.jl). That way you have access to both. I would prefer to have instances of training these models be in the zoo. Can we actually first work on a list of available models in Flux? I don't think copying the source code without a meaningful separation of concerns would achieve much.
Yeah, I agree this is a good idea.
We can do that if that's preferred. We haven't started training any models anywhere yet; I think this would be good to discuss during the next call. I'm not suggesting creating a lot of packages. I would say there should be one package (e.g. FluxModels.jl, though it could be Metalhead.jl) that is the source of standard models. Based on the current design, I would not suggest Metalhead.jl as the public-facing package. At least I, as a Flux user, would prefer an alternative. Metalhead does a lot: datasets, data loading, the …
Agree totally with @darsnack. Further, I think there should be a general framework for incorporating and fine-tuning trained models as just regular layers, not simply for inference. That can be composed with domain-specific variants of data loaders and other components. Again, taking inspiration from fast.ai here.
If the issue is with having the models wrapped in a custom type, I can understand. FWIW, since moving to Zygote, we can basically treat the model as a layer already and should be able to train it; if there are errors in that, we should consider it a bug. I'm not against the core idea, I just want to avoid redoing the work, especially if just moving the data providers et al. outside would get us most of the way there.
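As a sanity check of that claim, a small sketch (the toy model here is hypothetical; the point is that gradients flow through a plain Chain like any layer, and restricting `params` to the head gives fine-tuning for free):

```julia
using Flux

# Hypothetical toy model; any Chain-based model behaves the same way.
model = Chain(
    Conv((3, 3), 3 => 16, relu; pad = 1),
    Flux.flatten,
    Dense(16 * 32 * 32, 10),
)

x = rand(Float32, 32, 32, 3, 4)            # dummy CIFAR-sized batch
y = Flux.onehotbatch(rand(0:9, 4), 0:9)    # dummy labels

loss(x, y) = Flux.logitcrossentropy(model(x), y)

# Fine-tune only the classifier head by restricting the tracked params,
# then take one gradient step:
ps = Flux.params(model[end])
gs = gradient(() -> loss(x, y), ps)
Flux.Optimise.update!(ADAM(), ps, gs)
```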
Yes, maybe this is related to FluxML/FluxML-Community-Call-Minutes#2 (comment). Metalhead could be the vision model package referenced in that comment; we'd have to extract the other functionality out of the package. Though I think @dnabanita7's implementations are good, since they use adaptive pooling to allow the model to work on multiple datasets.
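For illustration, the adaptive-pooling trick in miniature (layer sizes here are made up; the point is that the pool emits a fixed-size output regardless of input resolution):

```julia
using Flux

# The adaptive pool always emits a 1×1×128 feature map, so a single
# classifier head serves any input resolution.
m = Chain(
    Conv((3, 3), 3 => 64, relu; pad = 1),
    Conv((3, 3), 64 => 128, relu; pad = 1),
    AdaptiveMeanPool((1, 1)),
    Flux.flatten,
    Dense(128, 10),
)

size(m(rand(Float32, 32, 32, 3, 1)))    # (10, 1) for a CIFAR-sized input
size(m(rand(Float32, 224, 224, 3, 1)))  # (10, 1) for an ImageNet-sized input
```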
Shall I continue working on other models?
Yes, let's not complicate the fellowship work for now. You can keep to the same plan, committing to this repo.
Hello, I have forked your package to add my VGG implementations and to share my code with you. Docs and unit testing are missing, but a quick test on CIFAR10 seems to be OK (82% accuracy after 30 epochs for a VGG11 without batchnorm). https://github.com/arnold-dev/FluxModels.jl Arnold
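A rough sketch of the config-driven construction such an implementation might use (illustrative only, not the actual code in the fork; `:M` marks a 2×2 max-pool and the integers are output channel counts of 3×3 convolutions):

```julia
using Flux

# Illustrative sketch of a config-driven VGG11 builder, not the fork's code.
const VGG11_CFG = [64, :M, 128, :M, 256, 256, :M, 512, 512, :M, 512, 512, :M]

function vgg_features(cfg; inchannels = 3, batchnorm = false)
    layers, c = Any[], inchannels
    for v in cfg
        if v == :M
            push!(layers, MaxPool((2, 2)))
        else
            push!(layers, Conv((3, 3), c => v, batchnorm ? identity : relu; pad = 1))
            batchnorm && push!(layers, BatchNorm(v, relu))
            c = v
        end
    end
    return Chain(layers...)
end

# 32×32 CIFAR10 inputs shrink to a 1×1×512 feature map after the five
# pools, so the first Dense layer takes 512 features.
vgg11(; nclasses = 10) = Chain(
    vgg_features(VGG11_CFG),
    Flux.flatten,
    Dense(512, 4096, relu),
    Dense(4096, 4096, relu),
    Dense(4096, nclasses),
)
```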
Hi @arnold-dev, thanks for your work! If you make a PR to this repo, I'll be happy to merge it. There are some higher-level design issues that @dnabanita7 can address on top of your PR. Just so you know, we currently have an MLH student fellow (@dnabanita7) working on this issue. She has already made significant progress on the other models, and while I normally wouldn't ask something like this, we'd appreciate your hard work on other issues. I mention this not to discourage you from working on the ML/Flux ecosystem, but only to allow Nabanita to learn as much as she can from this experience.
Hi @darsnack, I'm happy to contribute, and this kind of package could be very helpful for my research. I will make a PR, but I'm not very comfortable with GitHub.
Yup, that’s the plan!
1239: add adaptive pool r=CarloLucibello a=dnabanita7

I have added `AdaptiveMaxPool` and `AdaptiveMeanPool` so that we can do a similar [PyTorch implementation](darsnack/FluxModels.jl#1 (comment)). cc @darsnack

### PR
- [x] Tests are added
- [x] Entry in NEWS.md
- [x] Documentation, if applicable
- [ ] Final review from `@MikeInnes` or `@dhairyagandhi96` (for API changes).

### Flux issue linking
[Flux#1224](#1224)

### MLH issue linking
[0.3.x-projects#26](https://github.com/MLH-Fellowship/0.3.x-projects/issues/26)

Co-authored-by: Nabanita Dash <dashnabanita@gmail.com>
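For anyone following along, the new layers take the desired *output* spatial size as their argument, mirroring PyTorch's AdaptiveMaxPool2d/AdaptiveAvgPool2d. A quick shape check:

```julia
using Flux

x = rand(Float32, 28, 28, 8, 2)      # W×H×C×N input
size(AdaptiveMaxPool((7, 7))(x))     # (7, 7, 8, 2)
size(AdaptiveMeanPool((1, 1))(x))    # (1, 1, 8, 2)
```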
@arnold-dev Sorry for hijacking, but did you have any luck replicating those ResNet9 DAWNBench experiments? I wanted to use it for another experiment, but I couldn't get the same performance despite quite some effort to cross-check implementations. I didn't need to match the absolute performance, so I gave up after about half a day of trying, but it would be interesting to know what I did wrong. I put the experiments in a notebook here. I might have a Chain version of the ResNet somewhere in the revision history as well, although it is probably less effort to recreate it from scratch than to find it.
Add standard vision models that replicate: https://pytorch.org/docs/stable/torchvision/models.html
We want Flux implementations of the architectures. These can be untrained models for now; we can open another issue for providing pre-trained weights. The models should be input-size agnostic so that they work with CIFAR or ImageNet. PyTorch does this with an adaptive 2D pooling layer (e.g. `AdaptiveAvgPool2d`) between the feature extractor and the fully connected layers. Supporting similar functionality might require a PR to Flux.jl.
List of models: