YOLO-World-S fine-tuning on COCO cannot be reproduced, and validation mAP shows a downward trend #160

shupinghu · 2024-03-20T09:56:39Z

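[Steps to reproduce]
Hello, I downloaded yolo_world_s_clip_base_dual_vlpan_2e-3adamw_32xb16_100e_o365_goldg_train_pretrained-18bea4d2.pth and loaded it as the pre-trained weights. The config file was copied from configs/finetune_coco/yolo_world_l_dual_vlpan_2e-4_80e_8gpus_finetune_coco.py and modified: in _base_, I changed yolov8_l_syncbn_fast_8xb16-500e_coco.py to yolov8_s_syncbn_fast_8xb16-500e_coco.py, and then added "mixup_prob = 0.1" to yolov8_s_syncbn_fast_8xb16-500e_coco.py to resolve an mmengine error.

In addition, I trained on a single V100 only. Because I was worried that the batch size gap between 1 GPU and 8 GPUs would be too large, I changed the per-GPU batch size from 16 to 32.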
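To make the batch size gap concrete, here is a minimal sketch of the arithmetic. It assumes the reference setup implied by the config name (8 GPUs, 16 images per GPU, lr 2e-4). The linear scaling rule used here is a common heuristic, not something recommended in this thread, and both helper functions are hypothetical names introduced for illustration.

```python
def effective_batch_size(num_gpus: int, batch_size_per_gpu: int) -> int:
    """Total number of images contributing to one optimizer step."""
    return num_gpus * batch_size_per_gpu


def linearly_scaled_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Linear scaling heuristic (an assumption): keep lr / batch_size constant."""
    return base_lr * new_batch / base_batch


# Reference setup, read off the config name (8 GPUs, 16 images each, lr 2e-4).
reference_batch = effective_batch_size(num_gpus=8, batch_size_per_gpu=16)
# Single-V100 setup described in the report.
single_gpu_batch = effective_batch_size(num_gpus=1, batch_size_per_gpu=32)

scaled_lr = linearly_scaled_lr(2e-4, reference_batch, single_gpu_batch)
print(reference_batch, single_gpu_batch, scaled_lr)
```

Under these assumptions the single-GPU run still uses a total batch four times smaller than the reference, so the learning rate from the 8-GPU config may be too large for it.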

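[Observed behavior]
No other changes were made. After training started, the validation mAP began to decline from epoch 15 (from 0.414 / 0.576 at epoch 10 to 0.410 / 0.574 at epoch 15; training is currently at epoch 55, and the values have dropped to 0.377 / 0.541).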
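As a quick way to quantify the decline, the sketch below summarizes the first value of each reported pair. Only the three epochs quoted in the report are used, so "peak" means the best of those three, and `summarize_trend` is a hypothetical helper.

```python
# (epoch -> validation mAP) values quoted in the report (first number of each pair).
val_map = {10: 0.414, 15: 0.410, 55: 0.377}


def summarize_trend(history):
    """Return the best epoch, the last epoch, and the relative drop between them."""
    best_epoch = max(history, key=history.get)
    last_epoch = max(history)
    relative_drop = (history[best_epoch] - history[last_epoch]) / history[best_epoch]
    return best_epoch, last_epoch, relative_drop


best_epoch, last_epoch, relative_drop = summarize_trend(val_map)
print(f"peak at epoch {best_epoch}, {relative_drop:.1%} lower by epoch {last_epoch}")
```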

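[Question and analysis]
Is there something wrong with my reproduction? (I checked, and it seems that mixup is not enabled after switching directly to the S model. That may be the cause, but I don't think it should make the validation mAP trend downward.)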

wondervictor · 2024-03-20T10:00:50Z

Hi @shupinghu, we have run into the same problem: fine-tuning YOLO-World on COCO without mask-refine leads to performance degradation. We're looking into it. In the meantime, you can enable mask-refine=True for better results.

shupinghu · 2024-03-20T10:52:25Z

Does the validation mAP in your experiment gradually decrease during fine-tuning as well?

In my experiment, not only did the mAP fail to reach the values in the paper, but the bigger problem was that it kept getting worse during fine-tuning.

wondervictor · 2024-03-20T11:14:05Z

@shupinghu, w/ mask-refine, the fine-tuning results are normal and consistent with the results from the paper. However, removing mask-refine will produce abnormal results.

shupinghu · 2024-03-20T11:21:59Z

OK, I will try this config file and report back with the results. Does using "mask-refine" mean that the segmentation annotations are used to regenerate the bbox annotations?

wondervictor · 2024-03-20T12:57:44Z

mask-refine provides box refinements and supports copypaste during training.

taofuyu · 2024-03-21T01:51:17Z

I'm confused: in the YOLOv5RandomAffine transform, use_mask_refine is deprecated in your version of mmyolo, so it should not influence the result?
Also, custom datasets usually don't have segmentation annotations. Does that mean fine-tuning on a custom dataset will never yield good results?

wondervictor · 2024-03-21T02:58:27Z

@taofuyu

Compared to w/o mask-refine, w/ mask-refine contains another copypaste augmentation.
It should work well on datasets without segmentation annotations and we need to find out what's wrong under this setting.

taofuyu · 2024-03-21T03:08:02Z

@wondervictor
Thanks. For me, the problem is the decline in open-vocabulary ability after fine-tuning on a custom dataset.

wondervictor · 2024-03-21T03:23:51Z

@taofuyu I'll add it in TODO and fix it soon.

wondervictor · 2024-03-21T04:04:20Z

Hi @shupinghu and @taofuyu, I've uploaded the fine-tuned weights and logs for models with mask-refine=True in configs/finetune_coco.

shupinghu · 2024-03-22T03:36:50Z

Using "mask-refine" is OK.

wondervictor · 2024-03-22T03:47:11Z

Update: the performance is much worse without the CopyPaste augmentation.

wondervictor · 2024-03-22T06:22:21Z

[Failed Update] : using SGD, lr=1e-3, wd=0.0005 ~~seems good for fine-tuning~~.

optim_wrapper = dict(optimizer=dict(
    _delete_=True,
    type='SGD',
    lr=1e-3,
    momentum=0.937,
    nesterov=True,
    weight_decay=0.0005,
    batch_size_per_gpu=train_batch_size_per_gpu))

wondervictor · 2024-03-27T03:08:29Z

Hi all (@taofuyu, @shupinghu): happy to share a milestone.
I've tried a new setting with SGD and fewer augmentation epochs, and fine-tuning without mask-refine or copypaste now works.

reduce mosaic epochs, increase normal epochs

max_epochs = 40  # Maximum training epochs
close_mosaic_epochs = 30
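As a rough illustration of what these two values imply, the sketch below assumes the usual MMYOLO convention that the training pipeline switches to the non-mosaic one at epoch `max_epochs - close_mosaic_epochs`. The helper `mosaic_switch_epoch` is a hypothetical name, not part of the repository.

```python
def mosaic_switch_epoch(max_epochs: int, close_mosaic_epochs: int) -> int:
    """Epoch at which mosaic is turned off (assumed MMYOLO-style convention)."""
    if not 0 <= close_mosaic_epochs <= max_epochs:
        raise ValueError("close_mosaic_epochs must lie between 0 and max_epochs")
    return max_epochs - close_mosaic_epochs


max_epochs = 40
close_mosaic_epochs = 30
switch_epoch = mosaic_switch_epoch(max_epochs, close_mosaic_epochs)
print(f"mosaic for the first {switch_epoch} epochs, "
      f"plain pipeline for the last {close_mosaic_epochs}")
```

With the values above, only a quarter of the schedule would use mosaic under this assumption, which matches the stated goal of fewer augmentation epochs.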

use SGD optimizer, add weight decay for BN and bias.

optim_wrapper = dict(
    optimizer=dict(_delete_=True,
                   type='SGD',
                   lr=1e-3,
                   momentum=0.937,
                   nesterov=True,
                   weight_decay=weight_decay,
                   batch_size_per_gpu=train_batch_size_per_gpu),
    paramwise_cfg=dict(custom_keys={'logit_scale': dict(weight_decay=0.0)}),
    constructor='YOLOWv5OptimizerConstructor')
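The `custom_keys` entry above exempts `logit_scale` from weight decay. The sketch below only illustrates the general idea of a per-parameter override keyed by a name substring; it is not the logic of `YOLOWv5OptimizerConstructor`. The parameter names are made up, and the base value 0.0005 is borrowed from the earlier SGD attempt because the value of `weight_decay` is not shown here.

```python
def resolve_weight_decay(param_name, base_weight_decay, custom_keys):
    """Return the weight decay for one parameter.

    The longest key that occurs as a substring of the parameter name wins;
    if no key matches, the base value is used.
    """
    for key in sorted(custom_keys, key=len, reverse=True):
        if key in param_name:
            return custom_keys[key].get("weight_decay", base_weight_decay)
    return base_weight_decay


custom_keys = {"logit_scale": dict(weight_decay=0.0)}
# Hypothetical parameter names, for illustration only.
names = ["backbone.stem.conv.weight", "bbox_head.head_module.logit_scale"]
decays = {name: resolve_weight_decay(name, 0.0005, custom_keys) for name in names}
print(decays)
```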

Under this setting, YOLO-World-Large without mask-refine achieves 52.8 AP on COCO (better than YOLOv8), improving on the former incorrect baseline (48.6). BTW, fine-tuning with mask-refine now achieves 53.9 AP.

This is a milestone but not the terminus and we are still working on it for a better fine-tuning setting!

Those updates will be pushed in a day.

wondervictor · 2024-03-27T13:22:30Z

Hi all (@taofuyu, @shupinghu), we have done a preliminary investigation of the errors in pre-training without mask-refine and fixed this issue. With mask-refine, YOLO-World performs significantly better than the paper version. Without mask-refine, YOLO-World still obtains competitive performance, e.g., YOLO-World-L obtains 52.8 AP on COCO.

You can find more details in configs/finetune_coco, especially for the version without mask-refine.

JiayuanWang-JW · 2024-03-27T17:26:20Z

Hi @wondervictor, I ran into the same issue when fine-tuning on my own dataset: the mAP50 decreases after 15 epochs. Do you have any idea why? I have tried the two fine-tuning config files yolo_world_l_dual_vlpan_2e-4_80e_8gpus_finetune_coco.py and yolo_world_v2_l_vlpan_bn_2e-4_80e_8gpus_finetune_coco_womixup.py (I think you deleted the latter in the current version). Both decrease after 15 epochs, although performance increases again in the last 10 epochs. I only changed img_scale to (1280, 960) and max_epochs to 100; the other parameters are the same as in your configs.

Unfortunately, my dataset does not include masks, so I cannot use mask-refine.

wondervictor · 2024-03-28T02:37:21Z

Hi @JiayuanWang-JW, could you try out the latest config for your custom data? I've preliminarily fixed the above issues. The new config does not require mask-refine and obtains steady improvement. Hope for your feedback :)

JiayuanWang-JW · 2024-03-29T13:16:48Z

Thanks for your rapid response. I have finished the experiment on my own dataset, and it is much better than with the previous config. The current result:

[image]

The previous result:

[image]

If I continue fine-tuning for more epochs, which parameters besides close_mosaic_epochs do I need to change (such as base_lr, weight_decay, etc.)? I want to try 100 epochs. The best mAP50 is 0.607, which I think is not enough. To be honest, some classical detection methods score much higher than this, with far fewer parameters than YOLO-World-L. Do you have any ideas on how to further improve the performance?

wondervictor · 2024-04-16T03:56:46Z

Hi @JiayuanWang-JW, is there any update?
I'm sorry for not getting back to you sooner. Indeed, there are several ways to improve the fine-tuning performance:
(1) replace the pre-trained model with a better one;
(2) increase the input resolution to 800 or higher (1280);
(3) increase the training epochs with a larger learning rate, and increase the mosaic epochs.

JiayuanWang-JW · 2024-04-16T04:47:22Z

Hi @wondervictor, thanks for your reply. No, I didn't get better YOLO-World results on my dataset.

Actually, I had already used the (1280, 960) size and different learning rate strategies (such as Cosine Annealing with different T_max values) for fine-tuning on the detection task. I tried 100 epochs and different numbers of mosaic epochs. However, the best AP50 only reached 60.7%, after which the performance decreases. I selected just two examples to illustrate this.

YOLOv8 achieved 75.1% on my dataset. That is a large gap.

Anyway, I will continue to explore this and post updates here if I find any useful information.
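When the metric peaks and then declines, one option is to keep the best checkpoint rather than the last one. The fragment below is a sketch in MMEngine config style using `CheckpointHook`'s `save_best` option. The metric key `coco/bbox_mAP_50` is an assumption and must match whatever the evaluator in a given setup actually reports.

```python
# Config fragment in MMEngine style (plain Python dicts).
default_hooks = dict(
    checkpoint=dict(
        type="CheckpointHook",
        interval=1,                    # save a checkpoint every epoch
        save_best="coco/bbox_mAP_50",  # assumed metric key, check the evaluator output
        rule="greater",                # a higher metric value is better
        max_keep_ckpts=3,              # keep only the most recent regular checkpoints
    )
)
```

This does not address the decline itself; it only ensures the peak checkpoint is retained for later comparison.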

NoomiHu · 2025-02-06T06:50:56Z

May I ask where you downloaded these weights?

wondervictor mentioned this issue Mar 21, 2024 in Roadmap of YOLO-World #109 (open, 16 tasks)

wondervictor added the "bug" and "Working on it now!" labels Mar 21, 2024

wondervictor pinned this issue Mar 21, 2024

wondervictor unpinned this issue May 23, 2024
