Train models from scratch #60
Comments
@zhiqwang Hi, the library looks good, so I was thinking of contributing to make it as flexible as possible by adding support for many other backbones, losses, and FPNs.
Hi @kartik4949, some of the modular design does require more careful consideration. We are eager for your help, and feel free to join us on Slack here.
Could I train the model with yolov5-rt on a custom dataset? Or do I need to train the model with yolov5 v4.0 and then convert the weights by
Thanks
Hi @stereomatchingkiss, both of these are feasible, but I recommend the second approach for now. Training with
FYI, I aim to release a version that supports training before 7th May. I guess it will not train as well as ultralytics, but it will be friendlier 😄
Hi @zhiqwang, thanks for your awesome repo! Do you have any news on the training release? I started from your codebase to implement training myself. It is working fine now, i.e. I can run training steps, but I am running into one issue. When I apply default_train_transforms in your data modules, it sometimes happens that no targets are left after the transform, probably because they lie outside the crop. Can you give me some hints on how best to deal with empty targets in box_head.py? Particularly in those functions:
Thanks a lot in advance!
Hi @Tomakko, thanks for the careful debugging information. I guess it is due to the poor implementation of the data augmentation. As you mentioned, the will filter most I think we should fix this augmentation to make sure there is at least one
My next plan is to learn from the data augmentation implementation in torchvision. They recently uploaded the augmentation methods they use for training the SSD models, and we can borrow some of their code here to make the augmentation acceptable. Your feedback is very important to me. Feel free to file new issues about the trainer here, and let's train a good model together. 🚀
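The idea of guaranteeing at least one surviving target after a random crop can be sketched as follows. This is only an illustration, not the yolort or torchvision implementation: the function names `clip_boxes_to_crop` and `random_crop_keep_targets`, the `(x1, y1, x2, y2)` box layout, and the `max_tries` fallback are all assumptions made for the example.

```python
import random


def clip_boxes_to_crop(boxes, crop):
    """Clip (x1, y1, x2, y2) boxes to a crop window and shift them into
    crop coordinates. Boxes with no area left inside the window are dropped."""
    cx1, cy1, cx2, cy2 = crop
    kept = []
    for x1, y1, x2, y2 in boxes:
        nx1, ny1 = max(x1, cx1), max(y1, cy1)
        nx2, ny2 = min(x2, cx2), min(y2, cy2)
        if nx2 > nx1 and ny2 > ny1:
            kept.append((nx1 - cx1, ny1 - cy1, nx2 - cx1, ny2 - cy1))
    return kept


def random_crop_keep_targets(boxes, image_size, crop_size, max_tries=20, rng=random):
    """Sample crop windows until at least one box survives.

    Falls back to the uncropped image (hypothetical policy) if every
    attempt leaves no boxes. Assumes crop_size fits inside image_size.
    """
    width, height = image_size
    crop_w, crop_h = crop_size
    for _ in range(max_tries):
        left = rng.randint(0, width - crop_w)
        top = rng.randint(0, height - crop_h)
        crop = (left, top, left + crop_w, top + crop_h)
        kept = clip_boxes_to_crop(boxes, crop)
        if kept:
            return crop, kept
    return (0, 0, width, height), list(boxes)


# Usage: one small box in a 100x100 image, cropped to 50x50.
crop, kept = random_crop_keep_targets(
    [(10, 10, 20, 20)], (100, 100), (50, 50), rng=random.Random(0)
)
```

With this policy the loss code never sees an empty target list for an image that had annotations before cropping. Images that are genuinely unannotated would still need separate handling.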
Thank you @zhiqwang! I currently need to build an embedded YOLO model in the short term and am therefore training with ultralytics, but afterwards I would be willing to contribute here. The training pipeline in ultralytics is just super cumbersome ;)
Hi @zhiqwang, thanks for the awesome work!
from yolort.models import yolov5s
model = yolov5s(pretrained=True, score_thresh=0.45, num_classes=5)
This piece of code throws the following error due to a dimension mismatch:
RuntimeError: Error(s) in loading state_dict for YOLO:
    size mismatch for head.head.0.weight: copying a param with shape torch.Size([255, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([30, 128, 1, 1]).
    size mismatch for head.head.0.bias: copying a param with shape torch.Size([255]) from checkpoint, the shape in current model is torch.Size([30]).
    size mismatch for head.head.1.weight: copying a param with shape torch.Size([255, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([30, 256, 1, 1]).
    size mismatch for head.head.1.bias: copying a param with shape torch.Size([255]) from checkpoint, the shape in current model is torch.Size([30]).
    size mismatch for head.head.2.weight: copying a param with shape torch.Size([255, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([30, 512, 1, 1]).
    size mismatch for head.head.2.bias: copying a param with shape torch.Size([255]) from checkpoint, the shape in current model is torch.Size([30]).
As we can see, only the weights and biases of head.head are mismatched, and I think that the formula to get that first dimension is
Is there any function or method that I'm not aware of that would allow us to match these dimensions? Something that would work like this (if integrated in the YOLO class):
def load_state_dict(self, state_dict, num_classes):
    weights_to_skip = [f"head.head.{i}.weight" for i in range(3)]
    bias_to_skip = [f"head.head.{i}.bias" for i in range(3)]
    # Keep only the first (num_classes + 5) * 3 output channels of each head tensor
    for weight in weights_to_skip + bias_to_skip:
        state_dict[weight] = state_dict[weight][:(num_classes + 5) * 3, ...]
    super().load_state_dict(state_dict)
Currently the only way I found to load a YOLO model that has a different number of classes is to use the
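The head sizes in the error message are consistent with the `(num_classes + 5) * 3` expression used in the snippet. As a small worked check, assuming 3 anchors per detection scale and 5 non-class values per anchor (4 box coordinates plus 1 objectness score), with `head_out_channels` being a hypothetical helper name:

```python
def head_out_channels(num_classes, num_anchors=3):
    # Each anchor predicts 4 box coordinates + 1 objectness score
    # + one score per class.
    return num_anchors * (num_classes + 5)


print(head_out_channels(80))  # 255, the COCO checkpoint
print(head_out_channels(5))   # 30, the 5-class model
```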
Hi @denguir, thanks for raising this question.
We don't currently offer a solution to this problem, but I guess you can load only the backbone parts to partially solve it. (I modified the snippet from https://discuss.pytorch.org/t/how-to-load-part-of-pre-trained-model/1113/3.)
from yolort.models import yolov5s
from yolort.utils import load_state_dict_from_url
model = yolov5s(pretrained=False, score_thresh=0.45, num_classes=5)
checkpoint_path = "/home/user/.cache/torch/hub/checkpoints/yolov5_darknet_pan_s_r60_coco-9f44bf3f.pt"
pretrained_dict = load_state_dict_from_url(checkpoint_path)
# 1. filter out unnecessary keys
pretrained_dict = {k: v for k, v in pretrained_dict.items() if "backbone" in k}
# 2. load the filtered state dict
model.model.load_state_dict(pretrained_dict, strict=False)
BTW, the training mechanism of yolort is still not well developed, and any kind of contribution is welcome here.
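A variation on the same idea is to keep every pretrained entry whose name and shape match the new model, rather than filtering on the "backbone" substring. The sketch below is not part of yolort: `filter_matching_shapes` is a hypothetical helper, the key `backbone.conv.weight` is made up for the demo, and the `SimpleNamespace` objects merely stand in for tensors so that the example runs without PyTorch.

```python
from types import SimpleNamespace


def filter_matching_shapes(pretrained, current):
    """Split pretrained entries into those that fit the current model
    (same name and same shape) and those that must be skipped."""
    kept, skipped = {}, []
    for name, tensor in pretrained.items():
        if name in current and tuple(tensor.shape) == tuple(current[name].shape):
            kept[name] = tensor
        else:
            skipped.append(name)
    return kept, skipped


# Stand-ins for tensors: only the .shape attribute is used here.
pretrained = {
    "backbone.conv.weight": SimpleNamespace(shape=(32, 3, 3, 3)),      # hypothetical key
    "head.head.0.weight": SimpleNamespace(shape=(255, 128, 1, 1)),
}
current = {
    "backbone.conv.weight": SimpleNamespace(shape=(32, 3, 3, 3)),
    "head.head.0.weight": SimpleNamespace(shape=(30, 128, 1, 1)),
}

kept, skipped = filter_matching_shapes(pretrained, current)
print(sorted(kept))  # ['backbone.conv.weight']
print(skipped)       # ['head.head.0.weight']
```

With PyTorch, `current` would be the new model's `state_dict()`, and the `kept` dictionary could then be passed to `load_state_dict(kept, strict=False)`. The skipped head layers keep their fresh initialization and have to be trained on the new classes. Whether this works unchanged with the yolort checkpoint layout has not been verified here.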
Thanks @zhiqwang, I will definitely explore the training process of yolort further and try to help there.
🚀 Feature
Support training models from scratch. This is a follow-up to #16.
Motivation
Test whether the trainer mechanism works.
Pitch
build_targets in SetCriterion
#143