by Yueming Jin, Yonghao Long, Cheng Chen, Zixu Zhao, Qi Dou, Pheng-Ann Heng.
- The PyTorch implementation of our paper 'Temporal Memory Relation Network for Workflow Recognition from Surgical Video', accepted at IEEE Transactions on Medical Imaging (TMI).
- We use the Cholec80 and M2CAI 2016 Challenge datasets.
- Training and test data split:
Cholec80: the first 40 videos for training and the remaining 40 videos for testing, following the original EndoNet paper.
M2CAI: 27 videos for training and 14 videos for testing, following the challenge evaluation protocol.
- Data Preprocessing:
- Use FFmpeg to convert the videos to frames;
- Downsample from 25 fps to 1 fps (or directly set the conversion rate to 1 fps in the previous step);
- Cut the black margin present in each frame using the function ``change_size()`` in ``video2frame_cutmargin.py``;
Note: You can also use ``video2frame_cutmargin.py`` directly for the first and third steps; you will obtain the cropped frames at the original fps.
- Resize the original frames to a resolution of 250 * 250.
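The frame extraction and downsampling steps above can be sketched as follows. This is a minimal sketch, not the repository's script: the helper names ``build_ffmpeg_cmd`` and ``frames_to_keep`` and the output file pattern are assumptions made for illustration. It builds an FFmpeg command that extracts frames at a chosen rate, and shows the equivalent index selection for frames that were already extracted at 25 fps.

```python
import os


def build_ffmpeg_cmd(video_path, out_dir, fps=1):
    """Return an FFmpeg argument list that writes frames of video_path at `fps`.

    The output naming pattern (%08d.jpg) is an assumption, not the repo's convention.
    """
    out_pattern = os.path.join(out_dir, "%08d.jpg")
    # "-vf fps=N" asks FFmpeg's fps filter to emit N frames per second.
    return ["ffmpeg", "-i", video_path, "-vf", "fps={}".format(fps), out_pattern]


def frames_to_keep(num_frames, src_fps=25, dst_fps=1):
    """Indices to keep when downsampling frames already extracted at src_fps."""
    step = src_fps // dst_fps
    return list(range(0, num_frames, step))
```

The command list can be passed to ``subprocess.run(cmd, check=True)`` once FFmpeg is installed and the output folder exists.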
- The data folder is arranged as follows:
(root folder)
├── data
| ├── cholec80
| | ├── cutMargin
| | | ├── 1
| | | ├── 2
| | | ├── 3
| | | ├── ......
| | | ├── 80
| | ├── phase_annotations
| | | ├── video01-phase.txt
| | | ├── ......
| | | ├── video80-phase.txt
├── code
| ├── ......
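As an illustration of how the Cholec80 split maps onto this layout, the sketch below lists the frame folder and annotation file for each video. The function name ``cholec80_split`` and the returned dictionary keys are assumptions for illustration; the repository's own ``get_paths_labels.py`` remains the reference for how paths and labels are actually generated.

```python
import os


def cholec80_split(root=os.path.join("data", "cholec80")):
    """Map the 40/40 train/test split onto the folder layout shown above."""

    def entry(video_id):
        return {
            # Frame folders are named by plain video number: 1, 2, ..., 80.
            "frames": os.path.join(root, "cutMargin", str(video_id)),
            # Annotation files use zero-padded numbers: video01-phase.txt, ...
            "labels": os.path.join(
                root, "phase_annotations", "video{:02d}-phase.txt".format(video_id)
            ),
        }

    train = [entry(i) for i in range(1, 41)]   # videos 1-40
    test = [entry(i) for i in range(41, 81)]   # videos 41-80
    return train, test
```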
- Check dependencies:
- pytorch 1.0+
- opencv-python
- numpy
- sklearn
- Clone this repo
$ git clone https://github.com/YuemingJin/TMRNet
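A quick way to confirm that the listed dependencies are importable is sketched below. The helper name ``missing_modules`` is an assumption for illustration; the module names are the standard import names of the listed packages (``torch`` for pytorch, ``cv2`` for opencv-python). The sketch only checks that each module can be found and does not verify the PyTorch version.

```python
import importlib.util

# Import names for the packages listed above.
REQUIRED_MODULES = ["torch", "cv2", "numpy", "sklearn"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```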
- Training model for building memory bank
- Switch folder
$ cd "./code/Training memory bank model/"
- Run
$ python get_paths_labels.py
to generate the files needed for the training
- Run
$ python train_singlenet_phase_1fc.py
to start the training
- Training TMRNet
- Switch folder
$ cd "./code/Training TMRNet/"
- Put the well-trained model obtained from training the memory bank model above into the folder
./LFB/FBmodel/
- Run
$ python get_paths_labels.py
to generate the files needed for the training
- Set the arg 'model_path' in
train_*.py
to ./LFB/FBmodel/{your_model_name}.pth
- Run
$ python train_*.py
to start the training
Note: The first time you run a train_*.py file, set the arg 'load_LFB' to False to generate the memory bank.
We provide three configurations of train_*.py:
1. train_only_non-local_pretrained.py: captures only the long-range temporal pattern (ResNet);
2. train_non-local_mutiConv_resnet.py: captures the long-range multi-scale temporal pattern (ResNet);
3. train_non-local_mutiConv_resnest.py: captures the long-range multi-scale temporal pattern (ResNeSt), achieving the best results.
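The 'load_LFB' note describes a generate-once, reuse-later pattern. A minimal sketch of that pattern is given below, assuming the memory bank can be serialized with ``pickle``. The function name ``get_memory_bank``, the file path argument, and the ``build_fn`` callback are hypothetical and do not mirror the repository's code.

```python
import pickle


def get_memory_bank(path, build_fn, load_lfb):
    """Load a cached memory bank, or build it and cache it.

    load_lfb=False mirrors the first run: features are computed and saved.
    load_lfb=True mirrors later runs: the saved file is reused.
    """
    if load_lfb:
        with open(path, "rb") as f:
            return pickle.load(f)
    # Hypothetical callback standing in for feature extraction over all frames.
    bank = build_fn()
    with open(path, "wb") as f:
        pickle.dump(bank, f)
    return bank
```

With this pattern, the expensive feature extraction runs once, and subsequent training runs only read the saved file.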
Our trained models can be downloaded from Dropbox.
- Switch folder
$ cd ./code/eval/python/
- Run
$ python get_paths_labels.py
to generate the files needed for the testing
- Specify the feature bank path, model path and test file path in ./test_*.py
- Run ./test_*.py to generate the results.
- Run ./export_phase_copy.py to export the results as txt files.
We use the evaluation protocol of the M2CAI challenge to evaluate our method.
- Switch folder
$ cd ./code/eval/result/matlab-eval/
- Run the Matlab files ./Main_*.m to evaluate and print the results.
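The Matlab scripts are the reference for reported numbers. For a quick sanity check in Python, frame-wise metrics can be sketched as below. This is a simplification under stated assumptions: the function name ``phase_metrics`` is hypothetical, phases are assumed to be integer labels, and the sketch computes plain accuracy, precision, recall and Jaccard index without reproducing any boundary handling or per-video averaging that the Matlab protocol may apply.

```python
def phase_metrics(y_true, y_pred, num_phases):
    """Frame-wise accuracy plus per-phase precision, recall and Jaccard index."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    per_phase = {}
    for k in range(num_phases):
        tp = sum(t == k and p == k for t, p in pairs)
        fp = sum(t != k and p == k for t, p in pairs)
        fn = sum(t == k and p != k for t, p in pairs)
        # A metric is undefined (NaN) when its denominator is zero.
        per_phase[k] = {
            "precision": tp / (tp + fp) if tp + fp else float("nan"),
            "recall": tp / (tp + fn) if tp + fn else float("nan"),
            "jaccard": tp / (tp + fp + fn) if tp + fp + fn else float("nan"),
        }
    return accuracy, per_phase
```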
If this repository is useful for your research, please cite:
@ARTICLE{9389566,
author={Jin, Yueming and Long, Yonghao and Chen, Cheng and Zhao, Zixu and Dou, Qi and Heng, Pheng-Ann},
journal={IEEE Transactions on Medical Imaging},
title={Temporal Memory Relation Network for Workflow Recognition From Surgical Video},
year={2021},
volume={40},
number={7},
pages={1911-1923},
doi={10.1109/TMI.2021.3069471}
}
For further questions about the code or paper, please contact 'ymjin5341@gmail.com'.