LessonAble is a pipelined methodology that leverages Deep Fakes to generate MOOC (Massive Open Online Course) visual content directly from a lesson transcript. The proposed pipeline consists of three main components: audio generation, video generation, and lip-syncing.
This code is part of the paper: Leveraging Deep Fakes in MOOC Content Creation
📑 Original Paper | 📑 Thesis | 🌀 Output Example |
---|---|---|
Paper | Thesis | Example |
All results from this open-source code should be used for research, academic, or personal purposes only.
Prerequisites vary according to the models chosen for each component.
- Python 3.6
- NVIDIA GPU + CUDA cuDNN
- ffmpeg: `sudo apt-get install ffmpeg`
- Install the necessary packages using `pip install -r requirements.txt`.
- Check the prerequisites listed in the repositories of the chosen models.
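Before installing the model-specific dependencies, it can help to confirm that the basic prerequisites are in place. The following is a minimal sketch, not part of the LessonAble code base: the function name `check_prerequisites` is hypothetical, and it assumes that Python 3.6 is a minimum version rather than an exact requirement. It does not check for the GPU, CUDA, or cuDNN.

```python
import shutil
import sys


def check_prerequisites(required_python=(3, 6), tools=("ffmpeg",)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Compare only (major, minor); treating 3.6 as a minimum is an assumption.
    if sys.version_info[:2] < required_python:
        problems.append(
            "Python %d.%d or newer is required, found %d.%d"
            % (required_python + tuple(sys.version_info[:2]))
        )
    # shutil.which returns None when the executable is not on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append("'%s' was not found on PATH" % tool)
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```

Running the script prints one line per missing prerequisite and nothing when both checks pass.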
git clone --recursive https://github.com/priamus-lab/LessonAble
cd LessonAble
git submodule update --init --recursive
Component | Model | Description | Link to the model |
---|---|---|---|
Audio | ITAcotron 2 | Italian fine-tuned model with this dataset | Model |
Audio | Tacotron 2 | English fine-tuned model with this dataset | Model |
Audio | Tacotron 2 | English fine-tuned model of Barack Obama with this dataset | Model |
The data required to generate MOOC content is:
- At least 15 minutes of audio of the lecturer. Follow the LessonAble Speech Dataset Generator to generate an excellent dataset to train the Text to Speech model.
- A profile photo of the lecturer for every video expression.
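The 15-minute minimum can be verified before training. The sketch below assumes the lecturer's recordings are uncompressed PCM `.wav` files stored directly in one directory; that layout, and the helper names, are illustrative assumptions rather than the format produced by the LessonAble Speech Dataset Generator. It uses only the standard-library `wave` module.

```python
import wave
from pathlib import Path

MIN_SECONDS = 15 * 60  # the 15-minute minimum stated above


def wav_duration_seconds(path):
    """Duration of a single PCM WAV file, computed as frames / sample rate."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / float(wav_file.getframerate())


def total_duration_seconds(dataset_dir):
    """Sum the durations of all .wav files directly inside dataset_dir."""
    wav_paths = sorted(Path(dataset_dir).glob("*.wav"))
    return sum(wav_duration_seconds(p) for p in wav_paths)


def has_enough_audio(dataset_dir, min_seconds=MIN_SECONDS):
    """True when the dataset reaches the minimum total duration."""
    return total_duration_seconds(dataset_dir) >= min_seconds
```

For example, `has_enough_audio("path/to/wavs")` returns `False` for a dataset shorter than 15 minutes; the directory path here is a placeholder.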
Once you have generated the lecturer's voice dataset, you're ready to train the Text to Speech model. Check the README of the Text to Speech models.
Once you are satisfied with the trained audio model, configure the lesson_generation_config.json file. Then generate the lesson by calling:
from lesson_generation.video.generate_video import generate_video
from lesson_generation.audio.generate_audio import generate_audio
from lesson_generation.lipsyncing.Wav2Lip.lipsync import lipsync
from common.config_loader import load_config

# Adjust this path to point to your own configuration file
config = load_config('/home/Ciro/Desktop/LessonAble/lesson_generation/config.json')

def generate(config):
    # 1. Audio generation
    generate_audio(config)
    # 2. Video generation
    generate_video(config)
    # 3. Lip-syncing: combines the audio and video from the previous steps
    lipsync(config)

generate(config)
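Because each stage depends on the output of the previous one, it may be useful to stop at the first failure and report which stage failed. The following is a minimal sketch of such a wrapper; `run_stages` is a hypothetical helper, not part of LessonAble, and it only assumes that every stage is a callable taking the configuration as its single argument, as in the snippet above.

```python
import time


def run_stages(config, stages):
    """Run (name, function) pairs in order, passing config to each.

    Returns a list of (name, elapsed_seconds). A failing stage raises
    RuntimeError naming the stage, so later stages never run on missing inputs.
    """
    timings = []
    for name, stage in stages:
        start = time.perf_counter()
        try:
            stage(config)
        except Exception as error:
            raise RuntimeError("stage '%s' failed: %s" % (name, error)) from error
        timings.append((name, time.perf_counter() - start))
    return timings
```

With the imports from the snippet above, it could be called as `run_stages(config, [("audio", generate_audio), ("video", generate_video), ("lipsync", lipsync)])`.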
If you use our code, please cite our work with the following BibTeX entry:
@inproceedings{sannino2022lessonable,
title={LessonAble: Leveraging Deep Fakes in MOOC Content Creation},
author={Sannino, Ciro and Gravina, Michela and Marrone, Stefano and Fiameni, Giuseppe and Sansone, Carlo},
booktitle={International Conference on Image Analysis and Processing},
year={2022},
organization={Springer}
}