forked from matterport/Mask_RCNN
1 parent dc37cc5, commit b127f63
Showing 1 changed file with 37 additions and 1 deletion.
@@ -1 +1,37 @@
Mask RCNN for Human Pose Estimation
-----------------------------------

The original code is from "https://github.com/matterport/Mask_RCNN" and runs on Python 3, Keras, and TensorFlow. The code reproduces the work of "https://arxiv.org/abs/1703.06870" for human pose estimation.
This project aims to address [issue#2][1].
When I started it, I referred to another project by [@RodrigoGantier][2].
## However, RodrigoGantier's project has the following problems:
* Its code has few comments and still uses the original names from [@Matterport][3]'s project, which makes the project hard to understand.
* When I trained this model, I found it hard to converge, as described in [issue#3][4].

## Requirements
* Python 3.5+
* TensorFlow 1.4+
* Keras 2.0.8+
* Jupyter Notebook
* Numpy, skimage, scipy, Pillow, cython, h5py
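
A quick way to see whether an environment covers the list above is a small check script. The following is only a sketch and not part of this repository: the function names are hypothetical, the import names (`skimage`, `PIL`, `Cython`) are my assumption for the packages listed, and it only checks that the modules can be found, not that TensorFlow or Keras meet the minimum versions.

```python
import importlib.util
import sys

# Assumed import names for the packages listed above. The import name can
# differ from the package name (skimage -> scikit-image, PIL -> Pillow).
REQUIRED_MODULES = [
    "tensorflow", "keras", "numpy", "skimage", "scipy", "PIL", "Cython", "h5py",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found (nothing is imported)."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(version_info=sys.version_info, minimum=(3, 5)):
    """True when the interpreter meets the Python 3.5+ requirement."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```

Run it once before opening the notebooks; anything reported as missing has to be installed first.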
## Getting Started
* [inference_humanpose.ipynb][5] shows how to predict human keypoints using my trained model. It randomly chooses an image from the validation set. You can download pre-trained COCO weights for human pose estimation (mask_rcnn_coco_humanpose.h5) from the releases page.
* [train_humanpose.ipynb][6] shows how to train the model step by step. You can also use "python train_humanpose.py" to start training.
* [inspect_humanpose.ipynb][7] visualizes the proposal target keypoints to check their validity. It also outputs some inner layers to help debug the model.

## Discussion
* I convert the joint coordinates into an integer label ([0, 56*56)), and use `tf.nn.sparse_softmax_cross_entropy_with_logits` as the loss function. This follows the original [Detectron code][8] and is the key reason why my loss converges quickly.
* If you still want to use the keypoint mask as output, you'd better adopt the modified loss function proposed by [@QtSignalProcessing][9] in [issue#2][10]. After crop and resize, a keypoint mask may contain more than one value of 1, and this makes the original softmax cross-entropy loss hard to converge.
* Although the loss converges quickly, the prediction results aren't as good as those in the original paper, especially for the left and right shoulders, left and right knees, etc. I'm confused by this, so I'm releasing the code; any contribution or suggestion to this repository is welcome.
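
To make the first point concrete, here is a minimal sketch of the label encoding. It is not the code from this repository: the function names are hypothetical, the row-major flattening (`y * 56 + x`) is my assumption, and the loss is a NumPy stand-in that mirrors what `tf.nn.sparse_softmax_cross_entropy_with_logits` computes per example. Because each keypoint becomes exactly one integer class, the target can never contain more than one positive position, unlike a resized mask.

```python
import numpy as np

MASK_SIZE = 56  # heatmap side length, taken from the [0, 56*56) range above


def encode_keypoint(x, y, size=MASK_SIZE):
    """Map integer heatmap coordinates (x, y) to a class label in [0, size*size)."""
    if not (0 <= x < size and 0 <= y < size):
        raise ValueError("keypoint lies outside the heatmap")
    return y * size + x  # row-major flattening (an assumption)


def decode_keypoint(label, size=MASK_SIZE):
    """Inverse of encode_keypoint: recover (x, y) from the class label."""
    y, x = divmod(label, size)
    return x, y


def sparse_softmax_cross_entropy(logits, labels):
    """NumPy stand-in for the per-example sparse softmax cross-entropy.

    logits: float array of shape [num_keypoints, size*size]
    labels: int array of shape [num_keypoints]
    """
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.int64)
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the single true class for each keypoint.
    return -log_probs[np.arange(len(labels)), labels]
```

For example, `encode_keypoint(10, 3)` gives `3 * 56 + 10`, and `decode_keypoint` maps that label back to `(10, 3)`. With all-zero logits, every one of the 56*56 classes is equally likely, so the loss for any label equals `log(56*56)`.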

[1]: https://github.com/matterport/Mask_RCNN/issues/2
[2]: https://github.com/RodrigoGantier/Mask_R_CNN_Keypoints
[3]: https://github.com/matterport/Mask_RCNN
[4]: https://github.com/RodrigoGantier/Mask_R_CNN_Keypoints/issues/3
[5]: https://github.com/Superlee506/Mask_RCNN/blob/master/inference_humanpose.ipynb
[6]: https://github.com/Superlee506/Mask_RCNN/blob/master/train_human_pose.ipynb
[7]: https://github.com/Superlee506/Mask_RCNN/blob/master/inspect_humanpose.ipynb
[8]: https://github.com/facebookresearch/Detectron/blob/master/lib/utils/keypoints.py
[9]: https://github.com/QtSignalProcessing
[10]: https://github.com/matterport/Mask_RCNN/issues/2