
LessonAble

LessonAble is a pipelined methodology that leverages the concept of Deep Fakes to generate MOOC (Massive Open Online Course) visual content directly from a lesson transcript. The proposed pipeline consists of three main components: audio generation, video generation, and lip-syncing.

This code accompanies the paper: LessonAble: Leveraging Deep Fakes in MOOC Content Creation

📑 Original Paper 📑 Thesis 🌀 Output Example

Pipelined structure

Figure: the complete synthesis pipeline.

Disclaimer

All results produced with this open-source code should be used for research, academic, or personal purposes only.

Prerequisites

Prerequisites vary according to the models chosen for each component.

  • Python 3.6
  • NVIDIA GPU + CUDA cuDNN
  • ffmpeg: sudo apt-get install ffmpeg
  • Install necessary packages using pip install -r requirements.txt.
  • Check the chosen models repository prerequisites.

Checkout instructions

git clone --recursive https://github.com/priamus-lab/LessonAble
cd LessonAble
git submodule update --init --recursive

Getting the produced weights

Component | Model | Description | Link
Audio | ITAcotron 2 | Italian model fine-tuned on this dataset | Model
Audio | Tacotron 2 | English model fine-tuned on this dataset | Model
Audio | Tacotron 2 | English model of Barack Obama fine-tuned on this dataset | Model
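Once you have the link to a pretrained checkpoint, you can fetch it with a small helper. Note that `download_checkpoint` is a hypothetical convenience function sketched here, not part of the repository; substitute the actual model URL from the table above.

```python
import urllib.request
from pathlib import Path

def download_checkpoint(url, dest):
    """Download a model checkpoint to dest, skipping if it already exists.

    Hypothetical helper (not part of LessonAble): any checkpoint URL and
    destination layout is up to you.
    """
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        urllib.request.urlretrieve(url, dest)
    return dest

# Example (replace the URL with the actual "Model" link from the table):
# download_checkpoint("https://example.org/itacotron2.pt", "checkpoints/itacotron2.pt")
```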

Data Collection

The data required to generate MOOC content is:

  • At least 15 minutes of audio of the lecturer. Follow the LessonAble Speech Dataset Generator to build a suitable dataset for training the Text-to-Speech model.
  • A profile photo of the lecturer for every video expression.

Generated dataset with the LessonAble Speech Dataset Generator

Audio Training

Once you have generated the lecturer's voice dataset, you are ready to train the Text-to-Speech model. Check the README of the Text-to-Speech models.

Synthesis

Once you are satisfied with the generated audio model, configure the lesson_generation_config.json file. Then call:

from lesson_generation.video.generate_video import generate_video
from lesson_generation.audio.generate_audio import generate_audio
from lesson_generation.lipsyncing.Wav2Lip.lipsync import lipsync
from common.config_loader import load_config

# Adjust this path to point at the configuration file in your local clone
config = load_config('/home/Ciro/Desktop/LessonAble/lesson_generation/config.json')

def generate(config):
    # 1. Synthesize the lesson audio from the transcript
    generate_audio(config)
    # 2. Generate the lecturer video
    generate_video(config)
    # 3. Lip-sync the generated video to the generated audio
    lipsync(config)
    
generate(config)
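The exact schema of the configuration file is defined by the repository; the sketch below only illustrates the kind of fields such a file might contain. Every key name here is an assumption for illustration, not the actual schema — refer to the lesson_generation_config.json shipped with the repository for the real field names.

```json
{
  "transcript_path": "lessons/lesson01.txt",
  "tts_checkpoint": "checkpoints/itacotron2.pt",
  "face_image_dir": "assets/lecturer_photos/",
  "output_dir": "output/lesson01/"
}
```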

Cite work

If you use our code, please cite our work with the following BibTeX entry:

@inproceedings{sannino2022lessonable,
  title={LessonAble: Leveraging Deep Fakes in MOOC Content Creation},
  author={Sannino, Ciro and Gravina, Michela and Marrone, Stefano and Fiameni, Giuseppe and Sansone, Carlo},
  booktitle={International Conference on Image Analysis and Processing},
  year={2022},
  organization={Springer}
}
