
text-to-anime

Convert text and audio to facial expressions

Setup

```shell
python3.10 -m venv venv
echo $(pwd) > venv/lib/python3.10/site-packages/module.pth
. venv/bin/activate
pip install -r requirements.txt -r requirements-dev.txt
```
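The `module.pth` line works because Python's site machinery reads `.pth` files in `site-packages` and appends each listed directory to `sys.path`, which makes the repo root importable from inside the venv. A minimal sketch of the mechanism, using temporary stand-in directories:

```python
import os
import site
import sys
import tempfile

# Stand-ins for the venv's site-packages and the project root.
site_packages = tempfile.mkdtemp()
project_root = tempfile.mkdtemp()

# Mirror the setup step:
# echo $(pwd) > venv/lib/python3.10/site-packages/module.pth
with open(os.path.join(site_packages, "module.pth"), "w") as f:
    f.write(project_root + "\n")

# site.addsitedir processes .pth files the same way interpreter
# startup does, adding each existing directory to sys.path.
site.addsitedir(site_packages)
print(project_root in sys.path)  # True
```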

Web App

  • Save the trained model to `artefact/best-lips.pt`
  • On Apple M1, comment out all references to librosa

```shell
streamlit run app.py
```

Data preparation

  1. Download the LRS3 dataset to the `lrs3_v0.4` directory
  2. Download TED talks from YouTube to `video/{id}.mp4`, where `id` is the `v` query parameter
  3. Extract annotated frames from the downloaded videos to the `noisy` directory:

```shell
python lrs3_v0.4/preprocess.py
```

  4. Detect facial landmarks using OpenFace 2.0:

```shell
docker-compose -f landmark/docker-compose.yml up
```

  5. Copy high-confidence detections to the `clean` directory:

```shell
python lrs3_v0.4/postprocess.py
```
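The final step keeps only frames that OpenFace tracked reliably. OpenFace 2.0 writes per-frame `confidence` and `success` columns in its output CSVs; a hedged sketch of the kind of filter `postprocess.py` likely applies (the threshold value and sample data are illustrative assumptions):

```python
import csv
import io

# Synthetic stand-in for an OpenFace 2.0 output CSV (real files also
# carry hundreds of landmark columns; only the fields used here are shown).
openface_csv = """frame,timestamp,confidence,success
1,0.00,0.98,1
2,0.04,0.95,1
3,0.08,0.42,0
4,0.12,0.91,1
"""

def high_confidence_rows(text, threshold=0.9):
    """Keep rows where tracking succeeded and confidence clears the threshold."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        row for row in reader
        if int(row["success"]) == 1 and float(row["confidence"]) >= threshold
    ]

rows = high_confidence_rows(openface_csv)
print([r["frame"] for r in rows])  # ['1', '2', '4']
```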

Training

The `clean` directory contains sample data that has already been preprocessed. You can use it to reproduce our model.

```shell
python train.py
```
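The Web App section expects the trained weights at `artefact/best-lips.pt`, so `train.py` presumably tracks the best checkpoint across epochs. The bookkeeping can be sketched framework-free (the loss values and helper are illustrative, not the repo's actual code):

```python
import os
import tempfile

def track_best(losses, artefact_dir):
    """Record the checkpoint with the lowest validation loss."""
    best_loss = float("inf")
    best_path = os.path.join(artefact_dir, "best-lips.pt")
    for epoch, val_loss in enumerate(losses):
        if val_loss < best_loss:
            best_loss = val_loss
            # A real trainer would call torch.save(model.state_dict(), best_path);
            # here we just write a marker to show the control flow.
            with open(best_path, "w") as f:
                f.write(f"epoch={epoch} loss={val_loss}\n")
    return best_loss, best_path

artefact = tempfile.mkdtemp()
loss, path = track_best([0.9, 0.5, 0.7, 0.3], artefact)
print(loss)  # 0.3
```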

Inference

Call the trained model with text input:

```shell
python score.py --text "Hello World!"
```

The video will be saved to `output/line_0.mp4`.
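The `line_0.mp4` name suggests one rendered clip per input line, indexed from zero. A hedged sketch of that line-to-filename mapping (the helper is hypothetical, not part of `score.py`):

```python
import os

def output_paths(text, out_dir="output"):
    """Map each non-empty input line to its rendered video path."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    return [os.path.join(out_dir, f"line_{i}.mp4") for i in range(len(lines))]

print(output_paths("Hello World!"))  # ['output/line_0.mp4']
```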
