Transformer-based method with Deformable Convolution and Central Difference Convolution for Object Detection
This study aims to enhance the detection capability of the real-time detection transformer (RT-DETR) by incorporating two adapter modules: a deformable convolutional network (DeformConv) adapter and a central difference convolution (CDC) adapter. These modules are integrated into the backbone and Transformer network of RT-DETR to improve the model's ability to accurately locate and classify objects.
To evaluate the effectiveness of the proposed adapter modules, comprehensive experiments were conducted on two benchmark datasets: the NEU-DET steel plate crack dataset and the COCO dataset. On NEU-DET, compared to RT-DETR, the DeformConv adapter achieved a significant 1% improvement in mean average precision (mAP) for medium-sized defects and a 0.1% improvement in average recall (AR) for large-sized defects, highlighting its ability to capture defects with complex shapes. On COCO, the CDC adapter yielded a 0.1% mAP gain and a 0.5% AR improvement for medium-sized objects, demonstrating its effectiveness at extracting fine-grained details and distinguishing objects from the background. In summary, both the DeformConv and CDC adapter modules can enhance the object detection capability of RT-DETR in different application scenarios: DeformConv effectively captures object shape variations for complex-shaped object detection, while CDC distinguishes target objects from the background in situations where objects are obscured or the background is cluttered.
Install
pip install -r requirements.txt
Adapter
- To use central difference convolution at s5, replace `Adapter` with `CDCadapter`.
- To use deformable convolution at s5, replace `Adapter` with `Deformadapter`.
- To use the original RT-DETR, keep `Adapter` unchanged.
- Modify the config accordingly: `Adapter`, `CDCadapter`, `Deformadapter`.
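As a rough illustration of what the CDC adapter computes, the sketch below shows a minimal central difference convolution block in PyTorch. This is a hypothetical re-implementation for clarity, not the repository's actual module: the class name `CDCAdapter`, the residual connection, and the default `theta` mixing factor are assumptions; the central-difference term follows the common trick of subtracting a 1x1 convolution whose kernel is the spatial sum of the 3x3 weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CDCAdapter(nn.Module):
    """Sketch of a central difference convolution adapter (hypothetical).

    Mixes a vanilla 3x3 convolution with a central-difference term
    weighted by `theta`, then adds a residual connection.
    """

    def __init__(self, channels: int, theta: float = 0.7):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3,
                              padding=1, bias=False)
        self.theta = theta

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.conv(x)
        if self.theta > 0:
            # Summing each 3x3 kernel spatially gives a 1x1 kernel;
            # convolving with it subtracts the center-weighted response,
            # which yields the central-difference behavior.
            kernel_sum = self.conv.weight.sum(dim=(2, 3), keepdim=True)
            out = out - self.theta * F.conv2d(x, kernel_sum)
        return x + out  # residual adapter connection
```

With `theta=0` the block reduces to a plain residual convolution, which is why `theta` can be tuned per deployment to trade gradient-like (edge-sensitive) features against standard intensity features.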
Data
- Download and extract COCO 2017 train and val images.
path/to/coco/
annotations/ # annotation json files
train2017/ # train images
val2017/ # val images
- Modify the config: `img_folder`, `ann_file`
Training & Evaluation
- Training on a Single GPU:
# training on single-gpu
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/rtdetr/rtdetr_r50vd_6x_coco.yml
- Training on Multiple GPUs:
# train on multi-gpu
export CUDA_VISIBLE_DEVICES=0,1,2,3
torchrun --nproc_per_node=4 tools/train.py -c configs/rtdetr/rtdetr_r50vd_6x_coco.yml
- Evaluation on Multiple GPUs:
# val on multi-gpu
export CUDA_VISIBLE_DEVICES=0,1,2,3
torchrun --nproc_per_node=4 tools/train.py -c configs/rtdetr/rtdetr_r50vd_6x_coco.yml -r path/to/checkpoint --test-only
Export
python tools/export_onnx.py -c configs/rtdetr/rtdetr_r18vd_6x_coco.yml -r path/to/checkpoint --check