# Convert text and audio to facial expressions
```sh
python3.10 -m venv venv
echo $(pwd) > venv/lib/python3.10/site-packages/module.pth
. venv/bin/activate
pip install -r requirements.txt -r requirements-dev.txt
```
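The `echo $(pwd) > venv/.../module.pth` line makes the project root importable from inside the virtualenv without installing it. This relies on standard `site` machinery, sketched below with throwaway temp directories standing in for `site-packages` and the project root:

```python
import site
import sys
import tempfile
from pathlib import Path

# A .pth file in a site directory lists extra paths, one per line;
# site.addsitedir() reads it and appends each existing path to sys.path.
# This mirrors what `echo $(pwd) > venv/.../module.pth` arranges to
# happen automatically at interpreter startup.
site_dir = Path(tempfile.mkdtemp())      # stands in for site-packages
project_root = Path(tempfile.mkdtemp())  # stands in for $(pwd)

(site_dir / "module.pth").write_text(str(project_root) + "\n")
site.addsitedir(str(site_dir))
```

After this call, `str(project_root)` appears on `sys.path`, so modules in the project root resolve with a plain `import`.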
- Save trained model to `artefact/best-lips.pt`
- [M1] Comment out all references to `librosa`
```sh
streamlit run app.py
```
- Download the LRS3 dataset to the `lrs3_v0.4` directory
- Download TED talks from YouTube to `video/{id}.mp4`, where `id` is the URL's `v` query parameter:
- https://www.youtube.com/watch?v=0C5UQbWzwg8
- https://www.youtube.com/watch?v=0FQXicAGy5U
- https://www.youtube.com/watch?v=0FkuRwU8HFc
- https://www.youtube.com/watch?v=0GL5r3HVAZ0
- https://www.youtube.com/watch?v=0JGarsZE1rk
- https://www.youtube.com/watch?v=0LxPAY9yis8
- https://www.youtube.com/watch?v=0akiEFwtkyA
- https://www.youtube.com/watch?v=0bop3D7SdDM
- https://www.youtube.com/watch?v=0d6iSvF1UmA
- https://www.youtube.com/watch?v=0hzSUUdTDUA
- https://www.youtube.com/watch?v=0iTehgSOZ8A
- https://www.youtube.com/watch?v=1BHOflzxPjI
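The `{id}` in `video/{id}.mp4` is taken directly from each URL's `v` query parameter. A minimal standard-library sketch of extracting it (the actual download step, e.g. via a tool like `yt-dlp`, is not shown here):

```python
from urllib.parse import parse_qs, urlparse

def video_id(url: str) -> str:
    """Return the YouTube `v` query parameter, used as the filename id."""
    return parse_qs(urlparse(url).query)["v"][0]

urls = [
    "https://www.youtube.com/watch?v=0C5UQbWzwg8",
    "https://www.youtube.com/watch?v=0FQXicAGy5U",
]
# Target paths matching the video/{id}.mp4 convention above.
targets = [f"video/{video_id(u)}.mp4" for u in urls]
```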
- Extract annotated frames from the downloaded videos to the `noisy` directory:

```sh
python lrs3_v0.4/preprocess.py
```
- Detect facial landmarks using OpenFace 2.0:

```sh
docker-compose -f landmark/docker-compose.yml up
```
- Copy high-confidence detections to the `clean` directory:

```sh
python lrs3_v0.4/postprocess.py
```
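The filtering step in `postprocess.py` can be sketched as follows. OpenFace 2.0 writes one CSV per video with a per-frame `confidence` column; the 0.9 threshold and the exact file layout here are assumptions, not the repository's actual values:

```python
import csv
import io

def high_confidence_frames(csv_text: str, threshold: float = 0.9) -> list[int]:
    """Return frame numbers whose OpenFace detection confidence meets the threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = []
    for row in reader:
        # Some OpenFace versions pad CSV headers/values with spaces,
        # so strip whitespace defensively before parsing.
        row = {k.strip(): v.strip() for k, v in row.items()}
        if float(row["confidence"]) >= threshold:
            kept.append(int(row["frame"]))
    return kept

sample = "frame, confidence\n1, 0.98\n2, 0.42\n3, 0.93\n"
```

Frames that pass the threshold would then be copied from `noisy` into `clean`.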
The `clean` directory contains sample data that has already been preprocessed; you may use it to reproduce our model:

```sh
python train.py
```
Run the trained model on text input:

```sh
python score.py --text "Hello World!"
```

The video will be saved as `output/line_0.mp4`.
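Assuming `score.py` writes one clip per non-empty input line, indexed from zero (the helper below is hypothetical, not part of the repository), the output paths follow this pattern:

```python
from pathlib import Path

def output_paths(text: str, out_dir: str = "output") -> list[Path]:
    """Hypothetical sketch: one output clip per non-empty input line, named line_{i}.mp4."""
    lines = [line for line in text.splitlines() if line.strip()]
    return [Path(out_dir) / f"line_{i}.mp4" for i, _ in enumerate(lines)]
```

So `--text "Hello World!"` yields a single clip, `output/line_0.mp4`, matching the path noted above.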