This is the code repository for our paper "Attribute-aware Identity-hard Triplet Loss for Video-based Person Re-identification" (https://arxiv.org/pdf/2006.07597.pdf). If you find it helpful for your research, please cite it:
```
@article{chen2020attribute,
  title={Attribute-aware Identity-hard Triplet Loss for Video-based Person Re-identification},
  author={Chen, Zhiyuan and Li, Annan and Jiang, Shilu and Wang, Yunhong},
  journal={arXiv preprint arXiv:2006.07597},
  year={2020}
}
```
This repository contains the first project to introduce pedestrian attribute information into video-based Re-ID. We address the problem with a new metric learning method called Attribute-aware Identity-hard Triplet Loss (AITL), which reduces the intra-class variation among positive samples by taking attribute distance into account. To build a complete video-based person Re-ID model, we also propose a multi-task framework with an Attribute-driven Spatio-Temporal Attention (ASTA) mechanism.
- The batch-hard triplet loss frequently used in video-based person Re-ID suffers from the Distance Variance among Different Positives (DVDP) problem.
- We propose the Attribute-aware Identity-hard Triplet Loss (AITL) to solve DVDP (see the sketch below).
- We introduce the spatio-temporal attention learned in the attribute recognition process into the Re-ID process.
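To make the idea concrete, here is a minimal PyTorch sketch of attribute-aware positive mining inside a batch-hard triplet loss. This is an illustrative reading, not the exact loss implemented in this repository: the helper name, the Euclidean attribute distance, and the mining rule are all assumptions; see the paper for the real formulation.

```python
import torch
import torch.nn.functional as F

def attribute_aware_hard_triplet(feats, attrs, labels, margin=0.3):
    """Illustrative batch-hard triplet loss with attribute-driven positive mining.

    feats:  (B, D) clip features
    attrs:  (B, A) attribute vectors (predicted or ground-truth)
    labels: (B,)   identity labels; assumes a PK sampler, i.e. every
            identity occurs at least twice in the batch
    """
    feat_dist = torch.cdist(feats, feats)  # (B, B) pairwise feature distances
    attr_dist = torch.cdist(attrs, attrs)  # (B, B) pairwise attribute distances

    eye = torch.eye(len(labels), dtype=torch.bool, device=feats.device)
    same_id = labels.unsqueeze(0) == labels.unsqueeze(1)
    pos_mask = same_id & ~eye   # same identity, different sample
    neg_mask = ~same_id         # different identity

    # Mine the hard positive by ATTRIBUTE distance: among samples of the
    # same identity, pick the one whose attributes differ most, and
    # penalise its feature distance, shrinking intra-class variation.
    hard_pos = attr_dist.masked_fill(~pos_mask, float('-inf')).argmax(dim=1)
    d_ap = feat_dist.gather(1, hard_pos.unsqueeze(1)).squeeze(1)

    # Negative mining keeps the standard batch-hard rule in feature space.
    d_an = feat_dist.masked_fill(~neg_mask, float('inf')).min(dim=1).values

    return F.relu(d_ap - d_an + margin).mean()
```

The only change from vanilla batch-hard is the positive-mining line: positives are ranked by attribute discrepancy, so same-identity pairs that differ strongly in pose or accessories are pulled together hardest, which is exactly the intra-class variation behind DVDP.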
It is mainly forked from video-person-reid and reid-strong-baseline. Since I suffer from severe poverty, I introduced nvidia-apex to train the model in FP16 (the usual pattern is sketched after the requirements list), so the training code can be run directly on a single RTX 2070s, which is very friendly to proletarians like me. If you own a 32GB V100 or 2 * GTX 1080Ti cards, you can simply skip the apex operations, run the code on a single card, and increase the batch size to 64; then you can get higher performance :).
Requirements:
- pytorch >= 0.4.1 and < 1.5.0 (apex is not friendly to pytorch 1.5.0 in my experience)
- torchvision >= 0.2.1
- tqdm
- [nvidia-apex](https://github.com/NVIDIA/apex); please follow its detailed install instructions
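For reference, the apex mixed-precision pattern looks like the sketch below. This is the generic apex `amp` API rather than a copy of this repo's training loop; the linear model and random data are stand-ins:

```python
import torch
from apex import amp

model = torch.nn.Linear(128, 10).cuda()                  # stand-in for the real network
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Wrap model and optimizer once before the loop; "O1" is the usual
# mixed-precision mode, which is what makes a single 2070s sufficient.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

for step in range(10):                                   # stand-in for the data loader
    x = torch.randn(32, 128).cuda()
    loss = model(x).mean()
    optimizer.zero_grad()
    # Scale the loss so FP16 gradients do not underflow.
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    optimizer.step()
```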
Experiments are conducted on MARS, as it is the largest dataset available to date for video-based person re-ID. Please follow deep-person-reid to prepare the data. The instructions are copied here:
- Create a directory named `mars/`.
- Download the dataset to `mars/` from http://www.liangzheng.com.cn/Project/project_mars.html.
- Extract `bbox_train.zip` and `bbox_test.zip`.
- Download the split information from https://github.com/liangzheng06/MARS-evaluation/tree/master/info and put `info/` in `data/mars` (we want to follow the standard split in [8]).
- Download `mars_attributes.csv` from http://irip.buaa.edu.cn/mars_duke_attributes/index.html and put the file in `data/mars`. The data structure would look like:

```
mars/
    bbox_test/
    bbox_train/
    info/
    mars_attributes.csv
```

- Change the global variables `_C.DATASETS.ROOT_DIR` to `/path2mars/mars` and `_C.DATASETS.NAME` to `mars` in config or configs.
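With the yacs-style configuration inherited from reid-strong-baseline, that last step amounts to editing two lines. The snippet below is a sketch; the exact location of the defaults file in this repository may differ:

```python
# In the yacs-style defaults file (path is an assumption; edit the
# actual config file shipped with this repository):
_C.DATASETS.NAME = 'mars'
_C.DATASETS.ROOT_DIR = '/path2mars/mars'  # directory containing bbox_train/ etc.
```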
- Create a directory named `duke/` under `data/`.
- Download the dataset to `data/duke/` from http://vision.cs.duke.edu/DukeMTMC/data/misc/DukeMTMC-VideoReID.zip.
- Extract `DukeMTMC-VideoReID.zip`.
- Download `duke_attributes.csv` from http://irip.buaa.edu.cn/mars_duke_attributes/index.html and put the file in `data/duke`. The data structure would look like:

```
duke/
    train/
    gallery/
    query/
    duke_attributes.csv
```

- Change the global variables `_C.DATASETS.ROOT_DIR` to `/path2duke/duke` and `_C.DATASETS.NAME` to `duke` in config or configs.
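Before a long training run, it can be worth sanity-checking both layouts. Below is a small hypothetical helper, not part of this repository:

```python
import os

def check_layout(root, expected):
    """Hypothetical helper: verify that the prepared dataset directory
    contains the entries listed in the steps above."""
    missing = [e for e in expected if not os.path.exists(os.path.join(root, e))]
    if missing:
        raise FileNotFoundError(f"{root} is missing {missing}")

check_layout('/path2mars/mars', ['bbox_train', 'bbox_test', 'info', 'mars_attributes.csv'])
check_layout('/path2duke/duke', ['train', 'gallery', 'query', 'duke_attributes.csv'])
```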
To train the model, please run

```
python main_baseline.py
```

Please modify the settings directly in the config files.
The above performance is achieved with 2 * GTX 1080Ti and a train batch size of 64. (Once I was a middle-class deep-network finetuner, back when I was in school.)

Best performance on a lower-end device (MARS, 1 * RTX 2070s, train batch size 32): mAP 82.5%, Rank-1 86.5%. (Now I'm a proletarian. We must fight for the truth!)

More experimental results can be found in the paper.