This is the PaddlePaddle implementation of Deep High-Resolution Representation Learning for Human Pose Estimation.
In this work, we are interested in the human pose estimation problem with a focus on learning reliable high-resolution representations. Most existing methods recover high-resolution representations from low-resolution representations produced by a high-to-low resolution network. Instead, our proposed network maintains high-resolution representations through the whole process. We start from a high-resolution subnetwork as the first stage, gradually add high-to-low resolution subnetworks one by one to form more stages, and connect the multi-resolution subnetworks in parallel. We conduct repeated multi-scale fusions such that each of the high-to-low resolution representations receives information from other parallel representations over and over, leading to rich high-resolution representations. As a result, the predicted keypoint heatmap is potentially more accurate and spatially more precise. We empirically demonstrate the effectiveness of our network through superior pose estimation results on two benchmark datasets: the COCO keypoint detection dataset and the MPII Human Pose dataset.
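To make the fusion idea concrete, here is a minimal sketch of one exchange unit between a high- and a low-resolution branch, assuming PaddlePaddle's `paddle.nn` API. The class and variable names are illustrative only, not the repository's actual modules:

```python
import paddle
import paddle.nn as nn
import paddle.nn.functional as F

class FuseTwoBranches(nn.Layer):
    """Illustrative exchange unit between two parallel resolution branches."""
    def __init__(self, c_high=32, c_low=64):
        super().__init__()
        # high -> low: strided 3x3 conv halves the spatial size
        self.down = nn.Conv2D(c_high, c_low, 3, stride=2, padding=1)
        # low -> high: 1x1 conv aligns channels; upsampling follows in forward()
        self.up = nn.Conv2D(c_low, c_high, 1)

    def forward(self, x_high, x_low):
        # each branch sums its own features with the other branch's,
        # resampled to the matching resolution
        y_high = x_high + F.interpolate(self.up(x_low), scale_factor=2, mode='nearest')
        y_low = x_low + self.down(x_high)
        return y_high, y_low

x_high = paddle.randn([1, 32, 64, 48])  # 1/4-resolution branch
x_low = paddle.randn([1, 64, 32, 24])   # 1/8-resolution branch
y_high, y_low = FuseTwoBranches()(x_high, x_low)
```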
- PaddlePaddle 2.2
- OS 64 bit
- Python 3 (3.5.1+/3.6/3.7/3.8/3.9), 64 bit
- pip/pip3 (9.0.1+), 64 bit
- CUDA >= 10.1
- cuDNN >= 7.6
# CUDA10.1
python -m pip install paddlepaddle-gpu==2.2.0.post101 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
- For quick installation with other CUDA versions or environments, please refer to the PaddlePaddle Quick Installation document
- For other installation methods such as conda or compiling from source, please refer to the installation document
Please make sure that PaddlePaddle is installed successfully and that its version is not lower than the required version. Use the following commands to verify.
# check
>>> import paddle
>>> paddle.utils.run_check()
# confirm the installed paddle version
python -c "import paddle; print(paddle.__version__)"
Note
- If you want to run PaddleDetection on multiple GPUs, please install NCCL first.
pip install -r requirements.txt
mkdir output
mkdir log
Your directory tree should look like this:
${POSE_ROOT}
├── config
├── dataset
├── figures
├── lib
├── log
├── output
├── tools
├── README.md
└── requirements.txt
- The COCO dataset is downloaded automatically through the code. The dataset is large and takes a long time to download.

# automatically download the coco dataset by executing the script
python dataset/download_coco.py

After execution, the coco dataset files are organized as follows:
cd dataset
tree
├── annotations
│   ├── instances_train2017.json
│   ├── instances_val2017.json
│   │   ...
├── train2017
│   ├── 000000000009.jpg
│   ├── 000000580008.jpg
│   │   ...
├── val2017
│   ├── 000000000139.jpg
│   ├── 000000000285.jpg
│   │   ...
│   ...
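As a quick sanity check (our own snippet, not part of the repo; it assumes pycocotools is installed and the directory layout above), you can load one annotation file and count the images:

```python
from pycocotools.coco import COCO

# path follows the tree shown above
coco = COCO('dataset/annotations/instances_val2017.json')
img_ids = coco.getImgIds()
print(f'val2017 images: {len(img_ids)}')          # expect 5000 for COCO val2017
print(coco.loadImgs(img_ids[0])[0]['file_name'])  # e.g. 000000000139.jpg
```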
- If the COCO dataset has already been downloaded, the files can be organized according to the above directory structure instead, for example by symlinking your existing copy (see the sketch below).
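A hypothetical helper for that case (adapt `COCO_ROOT` to your machine; this is our own snippet, not a repository script):

```python
import os

COCO_ROOT = '/path/to/your/coco'  # placeholder: your existing COCO download
for name in ('annotations', 'train2017', 'val2017'):
    src = os.path.join(COCO_ROOT, name)
    dst = os.path.join('dataset', name)
    if not os.path.exists(dst):
        os.symlink(src, dst)  # link instead of copying ~20 GB of images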
We provide scripts for training, evaluation, and inference with various features according to different configurations.
# training on single-GPU
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/hrnet_w32_256x192.yml
# training on multi-GPU
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/hrnet_w32_256x192.yml
# GPU evaluation
export CUDA_VISIBLE_DEVICES=0
python tools/eval.py -c configs/hrnet_w32_256x192.yml -o weights=https://paddledet.bj.bcebos.com/models/keypoint/hrnet_w32_256x192.pdparams
# Inference
python tools/infer.py -c configs/hrnet_w32_256x192.yml --infer_img=dataset/test_image/hrnet_demo.jpg -o weights=https://paddledet.bj.bcebos.com/models/keypoint/hrnet_w32_256x192.pdparams
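The model outputs one heatmap per keypoint; for reference, a minimal NumPy sketch of the usual argmax decoding is shown below (the repository's post-processing may add details such as sub-pixel refinement, so treat this as illustrative):

```python
import numpy as np

def decode_heatmaps(heatmaps):
    """heatmaps: (num_joints, H, W) -> (x, y) coords and confidence scores."""
    num_joints, h, w = heatmaps.shape
    flat = heatmaps.reshape(num_joints, -1)
    idx = flat.argmax(axis=1)                 # peak location per joint
    scores = flat.max(axis=1)                 # peak value as confidence
    coords = np.stack([idx % w, idx // w], axis=1).astype(np.float32)
    return coords, scores

# 17 COCO keypoints; heatmaps are 1/4 of the 256x192 input, i.e. 64x48
hm = np.random.rand(17, 64, 48)
coords, scores = decode_heatmaps(hm)
```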
# training with distillation
python tools/train.py -c configs/lite_hrnet_30_256x192_coco.yml --distill_config=./configs/hrnet_w32_256x192_teacher.yml
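Conceptually, distillation adds a term that makes the LiteHRNet student mimic the HRNet-w32 teacher's heatmaps. The actual loss form and weighting live in the config files; the sketch below is an assumed MSE-based variant, with `alpha` as a placeholder weight:

```python
import paddle.nn.functional as F

def distill_loss(student_hm, teacher_hm, target_hm, alpha=0.5):
    task = F.mse_loss(student_hm, target_hm)            # usual ground-truth heatmap loss
    soft = F.mse_loss(student_hm, teacher_hm.detach())  # mimic the frozen teacher
    return task + alpha * soft                          # alpha is an assumed placeholder
```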
# training with PACT quantization on single-GPU
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/lite_hrnet_30_256x192_coco_pact.yml
# training with PACT quantization on multi-GPU
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/lite_hrnet_30_256x192_coco_pact.yml
# GPU evaluation with PACT quantization
export CUDA_VISIBLE_DEVICES=0
python tools/eval.py -c configs/lite_hrnet_30_256x192_coco_pact.yml -o weights=https://paddledet.bj.bcebos.com/models/keypoint/lite_hrnet_30_256x192_coco_pact.pdparams
# Inference with PACT quantization
python tools/infer.py -c configs/lite_hrnet_30_256x192_coco_pact.yml --infer_img=dataset/test_image/hrnet_demo.jpg -o weights=https://paddledet.bj.bcebos.com/models/keypoint/lite_hrnet_30_256x192_coco_pact.pdparams
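For intuition only: PACT (Choi et al., 2018) clips activations to a learnable bound `alpha` before uniform k-bit quantization. The quantization itself is driven by the `*_pact.yml` configs (presumably via PaddleSlim), so this NumPy sketch is not the trainer's code:

```python
import numpy as np

def pact_quantize(x, alpha=6.0, k=8):
    y = np.clip(x, 0.0, alpha)          # PACT clipping to the learnable bound
    scale = (2 ** k - 1) / alpha
    return np.round(y * scale) / scale  # uniform k-bit quantization

x = np.array([-1.0, 0.5, 3.2, 9.0])
print(pact_quantize(x))                 # all outputs fall in [0, alpha]
```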
COCO Dataset
| Model | Input Size | AP (COCO val) | Model Download | Config File |
| --- | --- | --- | --- | --- |
| HRNet-w32 | 256x192 | 76.9 | hrnet_w32_256x192.pdparams | config |
| LiteHRNet-30 | 256x192 | 69.4 | lite_hrnet_30_256x192_coco.pdparams | config |
| LiteHRNet-30-PACT | 256x192 | 68.9 | lite_hrnet_30_256x192_coco_pact.pdparams | config |
| LiteHRNet-30-PACT | 256x192 | 69.9 | lite_hrnet_30_256x192_coco.pdparams | config |
@inproceedings{sun2019deep,
title={Deep High-Resolution Representation Learning for Human Pose Estimation},
author={Ke Sun and Bin Xiao and Dong Liu and Jingdong Wang},
booktitle={CVPR},
year={2019}
}