ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
Updated Oct 23, 2023 - Python
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection (ECCV 2022)
AnnoTheia is a data annotation toolkit that identifies when a person speaks in a scene and transcribes their speech, also offering flexibility to replace modules for different languages.
Accepted by TMM 2022