Isometric Spoken Language Translation: Use cases

As part of the shared task objectives, we assessed the impact of isometric translation on a downstream task: automatic dubbing (AD). To generate the automatically dubbed videos, we use the AD architecture from Federico et al. To project the speech-pause structure from the source onto the target, we use the latest prosodic alignment model from Virkar et al.
The table below lists each participating team and the corresponding video title.

Team/System	Video Title
StrongBaseline	*-StrongBaseline-dubbed.mp4
WeakBaseline	*-WeakBaseline-dubbed.mp4
Huawei Translation Services Center	*-HW-TSC-(Constrained/Unconstrained)-dubbed.mp4
AppTek	*-Apptek-Constrained-dubbed.mp4
Amazon Prime Video	*-APV-Unconstrained-dubbed.mp4
Navrachana University	*-NUV-Constrained-dubbed.mp4

*Baseline denotes systems trained as baselines. Compare the dubbed videos in De/Fr/Es against the English source.
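To compare systems side by side, it can help to group the dubbed videos by system using the file name patterns in the table. The following is a minimal sketch, not part of the shared task tooling. It assumes the leading `*` in each pattern is an arbitrary clip identifier and that the system tags appear exactly as written in the table; the function name `group_by_system` and the example file names are hypothetical.

```python
import re
from collections import defaultdict

# System tags copied from the table above. The part of the file name
# before the tag (the "*" in the table) is assumed to be a clip identifier.
SYSTEM_TAGS = [
    "StrongBaseline",
    "WeakBaseline",
    "HW-TSC-Constrained",
    "HW-TSC-Unconstrained",
    "Apptek-Constrained",
    "APV-Unconstrained",
    "NUV-Constrained",
]

_PATTERN = re.compile(
    r"^(?P<clip>.+)-(?P<system>"
    + "|".join(re.escape(tag) for tag in SYSTEM_TAGS)
    + r")-dubbed\.mp4$"
)


def group_by_system(filenames):
    """Map each system tag to the dubbed video file names that match it."""
    groups = defaultdict(list)
    for name in filenames:
        match = _PATTERN.match(name)
        if match:  # silently skip files that do not follow the pattern
            groups[match.group("system")].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    example = [
        "clip01-de-StrongBaseline-dubbed.mp4",
        "clip01-de-HW-TSC-Unconstrained-dubbed.mp4",
        "clip01-fr-StrongBaseline-dubbed.mp4",
    ]
    for system, files in group_by_system(example).items():
        print(system, files)
```

Files that do not match any pattern in the table are ignored rather than reported, which keeps the sketch short; a stricter script might raise an error instead.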

References

@article{federico2020speech,
  title={From speech-to-speech translation to automatic dubbing},
  author={Federico, Marcello and Enyedi, Robert and Barra-Chicote, Roberto and Giri, Ritwik and Isik, Umut and Krishnaswamy, Arvindh and Sawaf, Hassan},
  journal={arXiv preprint arXiv:2001.06785},
  year={2020}
}

@article{virkar2022onoffscreenpa,
  title={Prosodic alignment for off-screen automatic dubbing},
  author={Virkar, Yogesh and Federico, Marcello and Enyedi, Robert and Barra-Chicote, Roberto},
  journal={arXiv preprint arXiv:2204.02530},
  year={2022}
}
