Bump version to v3.2.0 #11029

Merged
merged 3 commits into from
Oct 12, 2023
53 changes: 46 additions & 7 deletions README.md
@@ -103,6 +103,52 @@ Apart from MMDetection, we also released [MMEngine](https://github.com/open-mmla

### Highlight

**v3.2.0** was released on 12/10/2023:

**1. Detection Transformer SOTA Model Collection**
(1) Added support for four newer and stronger SOTA Transformer models: [DDQ](configs/ddq/README.md), [CO-DETR](projects/CO-DETR/README.md), [AlignDETR](projects/AlignDETR/README.md), and [H-DINO](projects/HDINO/README.md).
(2) Based on CO-DETR, MMDet released a model that reaches 64.1 mAP on COCO.
(3) Algorithms such as DINO now support `AMP/Checkpoint/FrozenBN`, which can effectively reduce memory usage.
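
As a rough illustration of how two of these switches are usually turned on in an MMEngine-style config, the sketch below enables mixed precision through `AmpOptimWrapper` and backbone gradient checkpointing through `with_cp`. The optimizer values are placeholders rather than the settings of the released models, and the fragment assumes it is merged into an existing DINO config through `_base_`. FrozenBN is left out because its config key is not shown in this change.

```python
# Hypothetical override fragment; meant to be merged into a base config.
# The optimizer values below are illustrative assumptions, not tuned settings.

# AMP: replace the default optimizer wrapper with the mixed-precision one.
optim_wrapper = dict(
    type='AmpOptimWrapper',
    loss_scale='dynamic',  # let the loss scaler adapt during training
    optimizer=dict(type='AdamW', lr=1e-4, weight_decay=1e-4),
)

# Checkpoint: recompute backbone activations in the backward pass,
# trading extra compute for lower peak memory.
model = dict(backbone=dict(with_cp=True))
```

Both settings trade iteration time for memory, so it is worth measuring them separately before combining them.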

**2. [Comprehensive Performance Comparison between CNN and Transformer](projects/RF100-Benchmark/README.md)**
RF100 is a collection of 100 real-world datasets spanning 7 domains. It can be used to assess how Transformer models such as DINO and CNN-based algorithms differ in performance across scenarios and data volumes. Users can use this benchmark to quickly evaluate the robustness of their algorithms in various scenarios.

<div align=center>
<img src="https://github.com/open-mmlab/mmdetection/assets/17425982/86420903-36a8-410d-9251-4304b9704f7d"/>
</div>

**3. Support for fine-tuning [GLIP](configs/glip/README.md) and [Grounding DINO](configs/grounding_dino/README.md)**
MMDet is the only algorithm library that supports fine-tuning Grounding DINO. As the table below shows, its fine-tuned Grounding DINO-T is 0.9 mAP higher than the official version, and GLIP also outperforms the official version.
We also provide a detailed walkthrough for training and evaluating Grounding DINO on custom datasets. Everyone is welcome to give it a try.

| Model | Backbone | Style | COCO mAP | Official COCO mAP |
| :----------------: | :------: | :-------: | :--------: | :---------------: |
| Grounding DINO-T | Swin-T | Zero-shot | 48.5 | 48.4 |
| Grounding DINO-T | Swin-T | Finetune | 58.1(+0.9) | 57.2 |
| Grounding DINO-B | Swin-B | Zero-shot | 56.9 | 56.7 |
| Grounding DINO-B | Swin-B | Finetune | 59.7 | |
| Grounding DINO-R50 | R50 | Scratch | 48.9(+0.8) | 48.1 |

**4. Support for the open-vocabulary detection algorithm [Detic](projects/Detic_new/README.md) and multi-dataset joint training.**
**5. Training detection models using [FSDP and DeepSpeed](projects/example_largemodel/README.md).**

| ID | AMP | GC of Backbone | GC of Encoder | FSDP | Peak Mem (GB) | Iter Time (s) |
| :-: | :-: | :------------: | :-----------: | :--: | :-----------: | :-----------: |
| 1 | | | | | 49 (A100) | 0.9 |
| 2 | √ | | | | 39 (A100) | 1.2 |
| 3 | | √ | | | 33 (A100) | 1.1 |
| 4 | √ | √ | | | 25 (A100) | 1.3 |
| 5 | | √ | √ | | 18 | 2.2 |
| 6 | √ | √ | √ | | 13 | 1.6 |
| 7 | | √ | √ | √ | 14 | 2.9 |
| 8 | √ | √ | √ | √ | 8.5 | 2.4 |
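
To make the memory/speed trade-off in this table easier to read, here is a small sketch that expresses each row relative to row 1. The numbers are copied from the table. The helper name `relative_to_baseline` is made up for this example, and because only rows 1 to 4 are marked as A100, the comparison assumes all rows were measured on comparable hardware.

```python
# Rows copied from the table above: (ID, peak memory in GB, iteration time in s).
rows = [
    (1, 49.0, 0.9),
    (2, 39.0, 1.2),
    (3, 33.0, 1.1),
    (4, 25.0, 1.3),
    (5, 18.0, 2.2),
    (6, 13.0, 1.6),
    (7, 14.0, 2.9),
    (8, 8.5, 2.4),
]


def relative_to_baseline(rows):
    """Return {ID: (memory saving in %, slowdown factor)} relative to the first row."""
    _, base_mem, base_time = rows[0]
    return {
        row_id: (round(100 * (1 - mem / base_mem), 1), round(t / base_time, 2))
        for row_id, mem, t in rows
    }


summary = relative_to_baseline(rows)
for row_id, (saving, slowdown) in summary.items():
    print(f"ID {row_id}: {saving:5.1f}% less peak memory, {slowdown:.2f}x iteration time")
```

For row 8 (AMP, both checkpointing options and FSDP) this gives about 82.7% less peak memory at roughly 2.67x the iteration time of row 1.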

**6. Support for the [V3Det](configs/v3det/README.md) dataset, a large-scale detection dataset with over 13,000 categories.**

<div align=center>
<img width=960 src="https://github.com/open-mmlab/mmdetection/assets/17425982/9c216387-02be-46e6-b0f2-b856f80f6d84"/>
</div>

We are excited to announce our latest work on real-time object recognition tasks, **RTMDet**, a family of fully convolutional single-stage detectors. RTMDet not only achieves the best parameter-accuracy trade-off on object detection from tiny to extra-large model sizes but also obtains new state-of-the-art performance on instance segmentation and rotated object detection tasks. Details can be found in the [technical report](https://arxiv.org/abs/2212.07784). Pre-trained models are [here](configs/rtmdet).

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/rtmdet-an-empirical-study-of-designing-real/real-time-instance-segmentation-on-mscoco)](https://paperswithcode.com/sota/real-time-instance-segmentation-on-mscoco?p=rtmdet-an-empirical-study-of-designing-real)
@@ -119,13 +165,6 @@ We are excited to announce our latest work on real-time object recognition tasks
<img src="https://user-images.githubusercontent.com/12907710/208044554-1e8de6b5-48d8-44e4-a7b5-75076c7ebb71.png"/>
</div>

**v3.1.0** was released on 30/6/2023:

- Supports tracking algorithms, including the multi-object tracking (MOT) algorithms SORT, DeepSORT, StrongSORT, OCSORT, ByteTrack, and QDTrack, and the video instance segmentation (VIS) algorithms MaskTrackRCNN and Mask2Former-VIS.
- Supports [ViTDet](projects/ViTDet)
- Supports inference and evaluation of multimodal algorithms [GLIP](configs/glip) and [XDecoder](projects/XDecoder), and also supports datasets such as COCO semantic segmentation, COCO Caption, ADE20k general segmentation, and RefCOCO. GLIP fine-tuning will be supported in the future.
- Provides a [gradio demo](https://github.com/open-mmlab/mmdetection/blob/dev-3.x/projects/gradio_demo/README.md) for image type tasks of MMDetection, making it easy for users to experience.

## Installation

Please refer to [Installation](https://mmdetection.readthedocs.io/en/latest/get_started.html) for installation instructions.
54 changes: 47 additions & 7 deletions README_zh-CN.md
@@ -102,6 +102,53 @@ MMDetection is an open-source object detection toolbox based on PyTorch. It is [Ope

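### Highlight

**v3.2.0** was released on 2023.10.12:

**1. Detection Transformer SOTA Model Collection**
(1) Added support for four newer and stronger SOTA Transformer models: [DDQ](configs/ddq/README.md), [CO-DETR](projects/CO-DETR/README.md), [AlignDETR](projects/AlignDETR/README.md), and [H-DINO](projects/HDINO/README.md)
(2) Based on CO-DETR, MMDet released a model that reaches 64.1 mAP on COCO
(3) Algorithms such as DINO support AMP/Checkpoint/FrozenBN, which can effectively reduce GPU memory usage

**2. [Comprehensive Performance Comparison between CNN and Transformer](projects/RF100-Benchmark/README_zh-CN.md)**
RF100 is a collection of 100 datasets gathered from real-world scenarios, spanning 7 domains. It can be used to verify how Transformer models such as DINO and CNN-based algorithms differ in performance across scenarios and data volumes. Users can use this benchmark to quickly verify the robustness of their own algorithms in different scenarios.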

<div align=center>
<img src="https://github.com/open-mmlab/mmdetection/assets/17425982/86420903-36a8-410d-9251-4304b9704f7d"/>
</div>

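**3. Support for fine-tuning [GLIP](configs/glip/README.md) and [Grounding DINO](configs/grounding_dino/README.md), the only library that supports Grounding DINO fine-tuning**
MMDet is the only algorithm library that supports fine-tuning Grounding DINO, and its results are about 1 point higher than the official ones; GLIP also outperforms the official version.
We also provide a detailed walkthrough for training and evaluating Grounding DINO on custom datasets. Everyone is welcome to try it.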

| Model | Backbone | Style | COCO mAP | Official COCO mAP |
| :----------------: | :------: | :-------: | :--------: | :---------------: |
| Grounding DINO-T | Swin-T | Zero-shot | 48.5 | 48.4 |
| Grounding DINO-T | Swin-T | Finetune | 58.1(+0.9) | 57.2 |
| Grounding DINO-B | Swin-B | Zero-shot | 56.9 | 56.7 |
| Grounding DINO-B | Swin-B | Finetune | 59.7 | |
| Grounding DINO-R50 | R50 | Scratch | 48.9(+0.8) | 48.1 |

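**4. Support for the open-vocabulary detection algorithm [Detic](projects/Detic_new/README.md), with the option of multi-dataset joint training**

**5. Easily train detection models with [FSDP and DeepSpeed](projects/example_largemodel/README_zh-CN.md)**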

| ID | AMP | GC of Backbone | GC of Encoder | FSDP | Peak Mem (GB) | Iter Time (s) |
| :-: | :-: | :------------: | :-----------: | :--: | :-----------: | :-----------: |
| 1 | | | | | 49 (A100) | 0.9 |
| 2 | √ | | | | 39 (A100) | 1.2 |
| 3 | | √ | | | 33 (A100) | 1.1 |
| 4 | √ | √ | | | 25 (A100) | 1.3 |
| 5 | | √ | √ | | 18 | 2.2 |
| 6 | √ | √ | √ | | 13 | 1.6 |
| 7 | | √ | √ | √ | 14 | 2.9 |
| 8 | √ | √ | √ | √ | 8.5 | 2.4 |

**6. Support for [V3Det](configs/v3det/README.md), a very-large-vocabulary detection dataset with over 13,000 categories**

<div align=center>
<img width=960 src="https://github.com/open-mmlab/mmdetection/assets/17425982/9c216387-02be-46e6-b0f2-b856f80f6d84"/>
</div>

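We are excited to introduce our latest work on real-time object recognition tasks, RTMDet, a family of fully convolutional single-stage detectors. RTMDet not only achieves the best parameter-accuracy trade-off for object detection from tiny to extra-large model sizes, but also obtains state-of-the-art results on real-time instance segmentation and rotated object detection. More details are in the [technical report](https://arxiv.org/abs/2212.07784). Pre-trained models can be found [here](configs/rtmdet).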

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/rtmdet-an-empirical-study-of-designing-real/real-time-instance-segmentation-on-mscoco)](https://paperswithcode.com/sota/real-time-instance-segmentation-on-mscoco?p=rtmdet-an-empirical-study-of-designing-real)
@@ -118,13 +165,6 @@ MMDetection is an open-source object detection toolbox based on PyTorch. It is [Ope
<img src="https://user-images.githubusercontent.com/12907710/208044554-1e8de6b5-48d8-44e4-a7b5-75076c7ebb71.png"/>
</div>

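**v3.1.0** was released on 2023.6.30:

- Supports tracking algorithms, including the multi-object tracking (MOT) algorithms SORT, DeepSORT, StrongSORT, OCSORT, ByteTrack, and QDTrack, and the video instance segmentation (VIS) algorithms MaskTrackRCNN and Mask2Former-VIS.
- Supports [ViTDet](projects/ViTDet)
- Supports inference and evaluation of the multimodal open-detection algorithms [GLIP](configs/glip) and [XDecoder](projects/XDecoder), along with datasets such as COCO semantic segmentation, COCO Caption, ADE20k general segmentation, and RefCOCO. GLIP fine-tuning will be supported later
- Provides a [gradio demo](https://github.com/open-mmlab/mmdetection/blob/dev-3.x/projects/gradio_demo/README.md) covering MMDetection's image tasks, so users can try it out quickly

## Installation

Please refer to the [quick start guide](https://mmdetection.readthedocs.io/zh_CN/latest/get_started.html) for installation instructions.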
2 changes: 1 addition & 1 deletion docker/serve/Dockerfile
@@ -4,7 +4,7 @@ ARG CUDNN="8"
FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel

ARG MMCV="2.0.0rc4"
-ARG MMDET="3.1.0"
+ARG MMDET="3.2.0"

ENV PYTHONUNBUFFERED TRUE

2 changes: 1 addition & 1 deletion docker/serve_cn/Dockerfile
@@ -4,7 +4,7 @@ ARG CUDNN="8"
FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel

ARG MMCV="2.0.0rc4"
-ARG MMDET="3.1.0"
+ARG MMDET="3.2.0"

ENV PYTHONUNBUFFERED TRUE

68 changes: 68 additions & 0 deletions docs/en/notes/changelog.md
@@ -1,5 +1,73 @@
# Changelog of v3.x

## v3.2.0 (12/10/2023)

### Highlights

**(1) Detection Transformer SOTA Model Collection**

- Added support for four newer and stronger SOTA Transformer models: DDQ, CO-DETR, AlignDETR, and H-DINO.
- Based on CO-DETR, MMDet released a model that reaches 64.1 mAP on COCO.
- Algorithms such as DINO support AMP/Checkpoint/FrozenBN, which can effectively reduce memory usage.

**(2) Comprehensive Performance Comparison between CNN and Transformer**

RF100 is a collection of 100 real-world datasets spanning 7 domains. It can be used to assess how Transformer models such as DINO and CNN-based algorithms differ in performance across scenarios and data volumes. Users can use this benchmark to quickly evaluate the robustness of their algorithms in various scenarios.

**(3) Support for GLIP and Grounding DINO fine-tuning, the only algorithm library that supports Grounding DINO fine-tuning**

MMDet is the only algorithm library that supports fine-tuning Grounding DINO. Its fine-tuned results are about one point higher than the official version, and GLIP also outperforms the official version.
We also provide a detailed walkthrough for training and evaluating Grounding DINO on custom datasets. Everyone is welcome to give it a try.

**(4) Support for the open-vocabulary detection algorithm Detic and multi-dataset joint training.**

**(5) Training detection models using FSDP and DeepSpeed.**

**(6) Support for the V3Det dataset, a large-scale detection dataset with over 13,000 categories.**

### New Features

- Support CO-DETR/DDQ/AlignDETR/H-DINO
- Support GLIP and Grounding DINO fine-tuning
- Support Detic and Multi-Datasets training (#10926)
- Support V3Det and benchmark (#10938)
- Support Roboflow 100 Benchmark (#10915)
- Add custom dataset of grounding dino (#11012)
- Release RTMDet-X p6 (#10993)
- Support AMP of DINO (#10827)
- Support FrozenBN (#10845)
- Add new configuration files for the `QDTrack/DETR/RTMDet/MaskRCNN/DINO/DeformableDETR/MaskFormer` algorithms
- Add a new script for Weighted Boxes Fusion (WBF) (#10808)
- Add `large_image_demo` (#10719)
- Support downloading datasets from OpenXLab (#10799)
- Update to support torch2onnx for DETR series models (#10910)
- Translate English documents into Chinese (#10744, #10756, #10805, #10848)

### Bug Fixes

- Fix name error in DETR metafile.yml (#10595)
- Fix device of the tensors in `set_nms` (#10574)
- Remove some unicode chars from `en/` docs (#10648)
- Fix dataset download with the mim script (#10727)
- Fix export to torchserve (#10694)
- Fix typo in `mask-rcnn_r50_fpn_1x-wandb_coco` (#10757)
- Fix `eval_recalls` error in `voc_metric` (#10770)
- Fix torch version comparison (#10934)
- Fix incorrect access to the train pipeline of ConcatDataset in `analyze_results.py` (#11004)

### Improvements

- Update `useful_tools.md` (#10587)
- Update Instance segmentation Tutorial (#10711)
- Update `train.py` to be compatible with the new config (#11025)
- Support `torch2onnx` for maskformer series (#10782)

### Contributors

A total of 36 developers contributed to this release.

Thanks to @YQisme, @nskostas, @max-unfinity, @evdcush, @Xiangxu-0103, @ZhaoCake, @RangeKing, @captainIT, @ODAncona, @aaronzs, @zeyuanyin, @gotjd709, @Musiyuan, @YanxingLiu, @RunningLeon, @ytzfhqs, @zhangzhidaSunny, @yeungkong, @crazysteeaam, @timerring, @okotaku, @apatsekin, @Morty-Xu, @Markson-Young, @ZhaoQiiii, @Kuro96, @PhoenixZ810, @yhcao6, @myownskyW7, @jiongjiongli, @Johnson-Wang, @ryylcc, @guyleaf, @agpeshal, @SimonGuoNjust, @hhaAndroid

## v3.1.0 (30/6/2023)

### Highlights
3 changes: 2 additions & 1 deletion docs/en/notes/faq.md
@@ -46,7 +46,8 @@ Compatible MMDetection, MMEngine, and MMCV versions are shown as below. Please c

| MMDetection version | MMCV version | MMEngine version |
| :-----------------: | :---------------------: | :----------------------: |
-| main | mmcv>=2.0.0, \<2.1.0 | mmengine>=0.7.1, \<1.0.0 |
+| main | mmcv>=2.0.0, \<2.2.0 | mmengine>=0.7.1, \<1.0.0 |
+| 3.2.0 | mmcv>=2.0.0, \<2.2.0 | mmengine>=0.7.1, \<1.0.0 |
| 3.1.0 | mmcv>=2.0.0, \<2.1.0 | mmengine>=0.7.1, \<1.0.0 |
| 3.0.0 | mmcv>=2.0.0, \<2.1.0 | mmengine>=0.7.1, \<1.0.0 |
| 3.0.0rc6 | mmcv>=2.0.0rc4, \<2.1.0 | mmengine>=0.6.0, \<1.0.0 |
3 changes: 2 additions & 1 deletion docs/zh_cn/notes/faq.md
@@ -46,7 +46,8 @@ export DYNAMO_CACHE_SIZE_LIMIT = 4

| MMDetection version | MMCV version | MMEngine version |
| :--------------: | :---------------------: | :----------------------: |
-| main | mmcv>=2.0.0, \<2.1.0 | mmengine>=0.7.1, \<1.0.0 |
+| main | mmcv>=2.0.0, \<2.2.0 | mmengine>=0.7.1, \<1.0.0 |
+| 3.2.0 | mmcv>=2.0.0, \<2.2.0 | mmengine>=0.7.1, \<1.0.0 |
| 3.1.0 | mmcv>=2.0.0, \<2.1.0 | mmengine>=0.7.1, \<1.0.0 |
| 3.0.0 | mmcv>=2.0.0, \<2.1.0 | mmengine>=0.7.1, \<1.0.0 |
| 3.0.0rc6 | mmcv>=2.0.0rc4, \<2.1.0 | mmengine>=0.6.0, \<1.0.0 |
2 changes: 1 addition & 1 deletion mmdet/__init__.py
@@ -6,7 +6,7 @@
from .version import __version__, version_info

mmcv_minimum_version = '2.0.0rc4'
-mmcv_maximum_version = '3.0.0'
+mmcv_maximum_version = '2.2.0'
mmcv_version = digit_version(mmcv.__version__)

mmengine_minimum_version = '0.7.1'
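
The bounds in this hunk only matter once they are compared against the installed version. The sketch below shows one way such a half-open range check could work. `parse_version` and `in_supported_range` are hypothetical helpers written for this illustration, a simplified stand-in for `digit_version` that only understands `X.Y.Z` and `X.Y.ZrcN` strings. Treating the maximum as exclusive is an assumption, chosen to match the `<2.2.0` specifier in `requirements/mminstall.txt`.

```python
import re


def parse_version(version: str) -> tuple:
    """Turn 'X.Y.Z' or 'X.Y.ZrcN' into a tuple that sorts release candidates before finals."""
    match = re.fullmatch(r'(\d+)\.(\d+)\.(\d+)(?:rc(\d+))?', version)
    if match is None:
        raise ValueError(f'unsupported version string: {version!r}')
    major, minor, patch, rc = match.groups()
    # (0, N) for rcN sorts before (1, 0) for the final release.
    stage = (0, int(rc)) if rc is not None else (1, 0)
    return (int(major), int(minor), int(patch)) + stage


def in_supported_range(installed: str, minimum: str, maximum: str) -> bool:
    """Half-open check: minimum <= installed < maximum."""
    return parse_version(minimum) <= parse_version(installed) < parse_version(maximum)


mmcv_minimum_version = '2.0.0rc4'
mmcv_maximum_version = '2.2.0'

for candidate in ('2.0.0rc3', '2.0.0rc4', '2.1.0', '2.2.0'):
    ok = in_supported_range(candidate, mmcv_minimum_version, mmcv_maximum_version)
    print(candidate, 'supported' if ok else 'not supported')
```

With the new upper bound, mmcv 2.1.0 falls inside the range while 2.2.0 itself does not.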
2 changes: 1 addition & 1 deletion requirements/mminstall.txt
@@ -1,2 +1,2 @@
-mmcv>=2.0.0rc4,<3.0.0
+mmcv>=2.0.0rc4,<2.2.0
mmengine>=0.7.1,<1.0.0
2 changes: 1 addition & 1 deletion requirements/readthedocs.txt
@@ -1,4 +1,4 @@
-mmcv>=2.0.0rc4,<2.1.0
+mmcv>=2.0.0rc4,<2.2.0
mmengine>=0.7.1,<1.0.0
scipy
torch