Jiajin Tang*, Ge Zheng*, Jingyi Yu, and Sibei Yang (* denotes equal contribution). This repository is the official implementation of our ICCV 2023 paper CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection.
In this paper, we focus on the challenging yet underexplored problem of task driven object detection, which is practical in the real world. To bridge the gap between abstract task requirements and the objects in an image, we propose to explicitly extract visual affordance knowledge for the task and to detect objects whose visual attributes are consistent with that knowledge. Furthermore, our CoTDet uses the visual affordance knowledge to condition the decoder in localizing and recognizing suitable objects.
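For intuition only, here is a minimal PyTorch sketch of the general idea of conditioning DETR-style decoder queries on affordance knowledge embeddings. It is an illustrative assumption, not the actual CoTDet implementation, and the names (`KnowledgeConditionedDecoderLayer`, `knowledge_embeds`) are hypothetical.

```python
# Illustrative sketch only -- NOT the actual CoTDet code. It shows the general
# idea of conditioning object queries on task-specific affordance knowledge.
import torch
import torch.nn as nn

class KnowledgeConditionedDecoderLayer(nn.Module):  # hypothetical name
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.knowledge_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.image_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim))

    def forward(self, queries, knowledge_embeds, image_feats):
        # 1) condition object queries on the task's visual affordance knowledge
        attended, _ = self.knowledge_attn(queries, knowledge_embeds, knowledge_embeds)
        queries = queries + attended
        # 2) attend to image features to localize and recognize objects whose
        #    visual attributes are consistent with that knowledge
        attended, _ = self.image_attn(queries, image_feats, image_feats)
        queries = queries + attended
        return queries + self.ffn(queries)

# toy shapes: batch 2, 100 queries, 10 knowledge items, 1024 image tokens, dim 256
layer = KnowledgeConditionedDecoderLayer()
out = layer(torch.randn(2, 100, 256), torch.randn(2, 10, 256), torch.randn(2, 1024, 256))
print(out.shape)  # torch.Size([2, 100, 256])
```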
1. Prepare the code and the environment
Clone our repository, then create a Python environment and activate it via the following commands.
# clone repository
git clone https://github.com/Toneyaya/CoTDet.git
# create conda environment
cd CoTDet
conda env create -f environment.yaml
conda activate cotdet
# install detectron2
python -m pip install -e detectron2
# install MultiScaleDeformableAttention
cd cotdet/modeling/pixel_encoder/ops
sh make.sh
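After the build finishes, a quick sanity check like the snippet below (an optional helper, not part of the repository) can confirm that detectron2 and the compiled deformable-attention op are importable. It assumes the op installs as a package named `MultiScaleDeformableAttention`, as in Deformable-DETR-style builds.

```python
# Optional sanity check (not part of the repository): verify the environment.
import torch
import detectron2

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("detectron2:", detectron2.__version__)

try:
    # The compiled CUDA op is assumed to install under this package name,
    # following Deformable-DETR-style builds.
    import MultiScaleDeformableAttention  # noqa: F401
    print("MultiScaleDeformableAttention op: OK")
except ImportError as err:
    print("Deformable attention op not found; re-run make.sh:", err)
```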
2. Download the images and place them (both train and test) in the directory:
CoTDet
├── datasets
│   ├── coco-tasks
│   │   ├── annotations
│   │   ├── images
│   │   │   ├── 1.jpg
│   │   │   ├── 2.jpg
│   │   │   ├── 3.jpg
│   │   │   ├── ...
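Before training, a small script like the one below (a hypothetical helper, not included in the repository; the `datasets/coco-tasks` path follows the layout above) can verify that the images and annotations are where the code expects them.

```python
# Hypothetical helper (not part of the repository): check the dataset layout.
from pathlib import Path

root = Path("datasets/coco-tasks")
images = sorted((root / "images").glob("*.jpg"))
annotations = root / "annotations"

assert annotations.is_dir(), f"missing annotation directory: {annotations}"
assert images, f"no .jpg images found under {root / 'images'}"
print(f"found {len(images)} images and annotation directory {annotations}")
```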
3. Download our pre-trained weights, which were pre-trained on a subset of the COCO dataset with all images duplicated in COCO-Tasks removed. Then put the pre-trained weight path here at line 3 of the config, and start training with the following command.
OPENBLAS_NUM_THREADS=1 python train_net.py --num-gpus 8 --config-file configs/COCOTASK_R101.yaml
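If you prefer not to edit the config file, detectron2-style launchers also accept key-value overrides appended to the command (as the evaluation command below does for MODEL.WEIGHTS). Assuming the pre-trained weight corresponds to the config's MODEL.WEIGHTS entry, the override would look like the following, with a placeholder checkpoint path:
OPENBLAS_NUM_THREADS=1 python train_net.py --num-gpus 8 --config-file configs/COCOTASK_R101.yaml MODEL.WEIGHTS /path/to/pretrained_weights.pth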
You can download our model here and pass its path for evaluation with the command below. Of course, you can also evaluate your own training results in the same way.
OPENBLAS_NUM_THREADS=1 python train_net.py --num-gpus 8 --config-file configs/COCOTASK_R101.yaml --eval-only MODEL.WEIGHTS ckpt_path
Object detection results on the COCO-Tasks dataset. * indicates the evaluation results of the released weights.
Method | Task1 | Task2 | Task3 | Task4 | Task5 | Task6 | Task7 | Task8 | Task9 | Task10 | Task11 | Task12 | Task13 | Task14 | Avg |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
GGNN | 36.6 | 29.8 | 40.5 | 37.6 | 41.0 | 17.2 | 43.6 | 17.9 | 21.0 | 40.6 | 22.3 | 28.4 | 39.1 | 40.7 | 32.6 |
TOIST | 45.8 | 40.0 | 49.4 | 49.6 | 53.4 | 26.9 | 58.3 | 22.6 | 32.5 | 50.0 | 35.5 | 43.7 | 52.8 | 56.2 | 44.1 |
CoTDet | 58.9 | 55.0 | 51.2 | 68.5 | 60.5 | 47.7 | 76.9 | 40.7 | 47.4 | 66.5 | 41.9 | 48.3 | 61.7 | 71.4 | 56.9 |
CoTDet* | 62.962 | 54.534 | 53.020 | 62.426 | 66.486 | 49.404 | 74.877 | 46.025 | 50.449 | 66.916 | 51.278 | 52.880 | 70.419 | 75.049 | 59.766 |
Instance segmentation results on the COCO-Tasks dataset. * indicates the evaluation results of the released weights.
Method | Task1 | Task2 | Task3 | Task4 | Task5 | Task6 | Task7 | Task8 | Task9 | Task10 | Task11 | Task12 | Task13 | Task14 | Avg |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
GGNN | 31.8 | 28.6 | 45.4 | 33.7 | 46.8 | 16.6 | 37.8 | 15.1 | 15.0 | 49.9 | 24.9 | 18.9 | 49.8 | 39.7 | 32.4 |
TOIST | 40.8 | 36.5 | 48.9 | 37.8 | 43.4 | 22.1 | 44.4 | 20.3 | 26.9 | 48.1 | 31.8 | 34.8 | 51.5 | 46.3 | 38.8 |
CoTDet | 55.0 | 51.6 | 51.2 | 57.7 | 60.1 | 43.1 | 65.9 | 40.4 | 45.4 | 64.8 | 40.4 | 48.7 | 61.7 | 64.4 | 53.6 |
CoTDet* | 57.773 | 51.467 | 53.094 | 52.431 | 66.205 | 45.676 | 64.104 | 44.021 | 46.401 | 66.465 | 49.655 | 49.380 | 71.157 | 66.027 | 55.990 |
If you find our work helpful for your research, please consider citing the following BibTeX entry.
@InProceedings{Tang_2023_ICCV,
author = {Tang, Jiajin and Zheng, Ge and Yu, Jingyi and Yang, Sibei},
title = {CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2023},
pages = {3068-3078}
}
Many thanks to these excellent open-source projects: