LORIS

This is the official implementation of "Long-Term Rhythmic Video Soundtracker", ICML 2023.

Jiashuo Yu, Yaohui Wang, Xinyuan Chen, Xiao Sun, and Yu Qiao.

OpenGVLab, Shanghai Artificial Intelligence Laboratory

Introduction

We present Long-Term Rhythmic Video Soundtracker (LORIS), a novel framework that synthesizes long-term conditional waveforms in sync with visual cues. The framework is built on a latent conditional diffusion probabilistic model that performs waveform synthesis. A series of context-aware conditioning encoders takes temporal information into account for long-term generation. We also extend the model's applicability from dance to multiple sports scenarios, such as floor exercise and figure skating. For comprehensive evaluation, we establish a benchmark for rhythmic video soundtracks that includes a pre-processed dataset, improved evaluation metrics, and robust generative baselines.

How to Start

pip install -r requirements.txt

Training

bash scripts/loris_{subset}_s{length}.sh

Inference

bash scripts/infer_{subset}_s{length}.sh
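
The script names above are templates with two placeholders, {subset} and {length}. The following is a minimal sketch of how they expand into concrete paths. The helper name script_path and the example subset identifier "dance" are assumptions made for illustration, not names confirmed by this repository; check the scripts/ directory for the files that actually exist.

```python
# Templates copied verbatim from the Training and Inference sections.
TEMPLATES = {
    "train": "scripts/loris_{subset}_s{length}.sh",
    "infer": "scripts/infer_{subset}_s{length}.sh",
}


def script_path(mode: str, subset: str, length: int) -> str:
    """Fill the {subset} and {length} placeholders for the given mode.

    This is a hypothetical helper, not part of the LORIS codebase.
    """
    if mode not in TEMPLATES:
        raise ValueError(f"mode must be one of {sorted(TEMPLATES)}, got {mode!r}")
    return TEMPLATES[mode].format(subset=subset, length=length)


if __name__ == "__main__":
    # "dance" is an assumed subset identifier; 25 is the clip length in seconds.
    print("bash", script_path("train", "dance", 25))
    print("bash", script_path("infer", "dance", 25))
```

The printed commands can then be run from the repository root, provided the corresponding script file is present.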

Dataset

The dataset is available on Hugging Face.

from datasets import load_dataset
dataset = load_dataset("OpenGVLab/LORIS")

Model Zoo

We provide the pre-trained checkpoints and the backbone audio diffusion model as follows:

Audio-diffusion-pytorch-v0.0.43,
Dance 25 seconds,
Figure Skating 25 seconds,
Floor Exercise 25 seconds,
Floor Exercise 50 seconds

These checkpoints may be used for research purposes only.

Citation

@inproceedings{Yu2023Long,
title={Long-Term Rhythmic Video Soundtracker},
author={Yu, Jiashuo and Wang, Yaohui and Chen, Xinyuan and Sun, Xiao and Qiao, Yu},
booktitle={International Conference on Machine Learning (ICML)},
year={2023}
}

Acknowledgement

We would like to thank the authors of previous related projects for generously sharing their code and insights: audio-diffusion-pytorch, CDCD, D2M-GAN, VQ-Diffusion, and JukeBox.
