As part of the shared task objectives, we assessed the impact of isometric translations
on a downstream task: automatic dubbing (AD).
To generate automatically dubbed videos, we leverage the AD architecture
from Federico et al. (2020). To project the speech-pause structure from the source to the target, we use the latest prosodic alignment model from Virkar et al. (2022).
The table below lists each participating team and the corresponding video title.
Team/System | Video Title |
---|---|
StrongBaseline | *-StrongBaseline-dubbed.mp4 |
WeakBaseline | *-WeakBaseline-dubbed.mp4 |
Huawei Translation Services Center | *-HW-TSC-(Constrained/Unconstrained)-dubbed.mp4 |
AppTek | *-Apptek-Constrained-dubbed.mp4 |
Amazon Prime Video | *-APV-Unconstrained-dubbed.mp4 |
Navrachana University | *-NUV-Constrained-dubbed.mp4 |
*Baseline denotes systems trained as baselines. Compare the dubbed videos in German (De), French (Fr), and Spanish (Es) against the English source.
@article{federico2020speech,
title={From speech-to-speech translation to automatic dubbing},
author={Federico, Marcello and Enyedi, Robert and Barra-Chicote, Roberto and Giri, Ritwik and Isik, Umut and Krishnaswamy, Arvindh and Sawaf, Hassan},
journal={arXiv preprint arXiv:2001.06785},
year={2020}
}
@article{virkar2022onoffscreenpa,
title={Prosodic alignment for off-screen automatic dubbing},
author={Virkar, Yogesh and Federico, Marcello and Enyedi, Robert and Barra-Chicote, Roberto},
journal={arXiv preprint arXiv:2204.02530},
year={2022}
}