Open-source models used:
- Whisper Large v2 (Speech-to-text)
- M2M100 (Translation)
- Coqui XTTS v2 (Voice clone)
- SadTalker (Lip sync)
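
The four models above can be chained roughly as follows. This is a minimal sketch, not this project's actual code: the names `Stages`, `dub_video`, `whisper_transcribe`, `m2m100_translate` and `xtts_synthesize` are hypothetical, the `facebook/m2m100_418M` checkpoint is an assumption (the list does not say which M2M100 size is used), and the lip-sync step is left as a caller-supplied function because SadTalker is normally run through its own inference script.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stages:
    """The four pipeline steps as plain callables, so each can be swapped or stubbed."""
    transcribe: Callable[[str], tuple[str, str]]     # video -> (text, source language code)
    translate: Callable[[str, str, str], str]        # (text, src_lang, tgt_lang) -> text
    synthesize: Callable[[str, str, str, str], str]  # (text, tgt_lang, reference wav, out wav) -> out wav
    lip_sync: Callable[[str, str, str], str]         # (video, wav, out video) -> out video


def dub_video(video: str, tgt_lang: str, reference_wav: str, out_video: str,
              stages: Stages, dubbed_wav: str = "dubbed.wav") -> str:
    """Run speech-to-text, translation, voice cloning and lip sync in order."""
    text, src_lang = stages.transcribe(video)
    translated = stages.translate(text, src_lang, tgt_lang)
    wav = stages.synthesize(translated, tgt_lang, reference_wav, dubbed_wav)
    return stages.lip_sync(video, wav, out_video)


def whisper_transcribe(video: str) -> tuple[str, str]:
    import whisper  # openai-whisper; it reads the audio track through ffmpeg
    result = whisper.load_model("large-v2").transcribe(video)
    return result["text"].strip(), result["language"]


def m2m100_translate(text: str, src_lang: str, tgt_lang: str) -> str:
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
    name = "facebook/m2m100_418M"  # assumption: smallest public checkpoint
    tokenizer = M2M100Tokenizer.from_pretrained(name)
    model = M2M100ForConditionalGeneration.from_pretrained(name)
    tokenizer.src_lang = src_lang
    # Long transcripts would need to be split into sentences before this call.
    encoded = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **encoded, forced_bos_token_id=tokenizer.get_lang_id(tgt_lang)
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]


def xtts_synthesize(text: str, tgt_lang: str, reference_wav: str, out_wav: str) -> str:
    from TTS.api import TTS  # Coqui TTS
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    # reference_wav is a short clip of the original speaker, used for voice cloning.
    tts.tts_to_file(text=text, speaker_wav=reference_wav,
                    language=tgt_lang, file_path=out_wav)
    return out_wav
```

A call would then look like `dub_video("original.mp4", "ru", "speaker.wav", "output.mp4", Stages(whisper_transcribe, m2m100_translate, xtts_synthesize, my_lip_sync))`, where `speaker.wav` and `my_lip_sync` are placeholders the caller provides. Language codes are not guaranteed to match across the three libraries, so a small mapping between them may be needed.
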

Input video (English) | Output (Russian) |
---|---|
original.mp4 | output.mp4 |

Input video (English) | Output (Spanish) |
---|---|
bbc_news.mp4 | output.5.mp4 |

Input video (Chinese) | Output (English) |
---|---|
xi.mp4 | output.mp4 |