Trans-SVNet: Accurate Phase Recognition from Surgical Videos via Hybrid Embedding Aggregation Transformer
For data pre-processing, refer to https://github.com/YuemingJin/TMRNet.
- run train_embedding.py to train ResNet50
- run generate_LFB.py to generate spatial embeddings
- run tecno.py to train TCN
- run trans_SV.py to train Transformer
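
The four steps above are meant to be run in the listed order. A minimal sketch of a driver that does this is shown below. It assumes the scripts are launched from the repository root without extra command-line arguments, which may not match your setup; `STEPS`, `build_commands` and `run_pipeline` are hypothetical names, not part of this repository.

```python
import subprocess
import sys

# Script names and their purpose, in the order listed in this README.
STEPS = [
    ("train_embedding.py", "train ResNet50"),
    ("generate_LFB.py", "generate spatial embeddings"),
    ("tecno.py", "train TCN"),
    ("trans_SV.py", "train Transformer"),
]


def build_commands(python=sys.executable):
    """Return one command per step (assumption: no extra arguments needed)."""
    return [[python, script] for script, _ in STEPS]


def run_pipeline(dry_run=True):
    """Print each step and, unless dry_run is set, execute it."""
    for (script, purpose), cmd in zip(STEPS, build_commands()):
        print(f"{script}: {purpose}")
        if not dry_run:
            # check=True stops the pipeline as soon as one step fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Call `run_pipeline(dry_run=False)` to actually launch the scripts; stopping on the first failure avoids running a later step without the output of an earlier one.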
Note: although the TCN is trained on whole videos, the prediction at each time step uses no future information. Refer to the TeCNO paper for details:
https://arxiv.org/abs/2003.10751
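
To illustrate the note above, here is a minimal pure-Python sketch of a causal (optionally dilated) 1-D convolution, the kind of operation that lets a temporal network see a whole sequence while each output still depends only on current and past inputs. This is an illustration and not the code in `tecno.py`; `causal_conv1d` and the example weights are hypothetical.

```python
def causal_conv1d(x, weights, dilation=1):
    """Causal 1-D convolution with left zero-padding.

    out[t] is a weighted sum of x[t], x[t - dilation], x[t - 2*dilation], ...
    so it never reads an input later than time t.
    """
    k = len(weights)
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - (k - 1 - j) * dilation  # the last weight aligns with x[t]
            if idx >= 0:  # positions before the start count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
weights = [0.5, 0.25, 0.25]

full = causal_conv1d(signal, weights, dilation=2)

# Changing only the last two (future) samples must leave the first four outputs intact.
altered = signal[:4] + [100.0, -100.0]
partial = causal_conv1d(altered, weights, dilation=2)
assert full[:4] == partial[:4]
```

The final assertion is the property the note describes: processing the full sequence at once gives the same output at time t as processing only the frames up to t.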
We used additional Matlab code to compute the reported results from the saved phase predictions for each time step. The evaluation code is available at https://github.com/YuemingJin/TMRNet.
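
For a quick sanity check of saved predictions before running the Matlab evaluation, a frame-wise metric sketch in Python is given below. It is not a port of the Matlab code and its numbers may differ from that code; `phase_metrics` is a hypothetical helper, and it assumes predictions and ground truth are equal-length lists of integer phase labels.

```python
def phase_metrics(pred, gt, num_phases):
    """Frame-wise accuracy plus per-phase precision, recall and Jaccard."""
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and of equal length")

    accuracy = sum(p == g for p, g in zip(pred, gt)) / len(gt)

    per_phase = {}
    for phase in range(num_phases):
        tp = sum(p == phase and g == phase for p, g in zip(pred, gt))
        fp = sum(p == phase and g != phase for p, g in zip(pred, gt))
        fn = sum(p != phase and g == phase for p, g in zip(pred, gt))
        if tp + fp + fn == 0:
            continue  # phase neither predicted nor present: skip it
        per_phase[phase] = {
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
            "jaccard": tp / (tp + fp + fn),
        }
    return accuracy, per_phase


# Toy example with three phases over eight time steps.
gt = [0, 0, 0, 1, 1, 2, 2, 2]
pred = [0, 0, 1, 1, 1, 2, 2, 1]
accuracy, per_phase = phase_metrics(pred, gt, num_phases=3)
print(accuracy)
print(per_phase)
```

Phases that appear in neither the predictions nor the ground truth are skipped, so a video that lacks a phase does not pull the per-phase averages toward zero.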