Audio Gen

Survey

Audio Generation

  • Fugatto is a framework for audio synthesis and transformation given text instructions and optional audio inputs.

    · (fugatto.github)

  • Tell What You Hear From What You See -- Video to Audio Generation Through Text, arXiv, 2411.05679, arxiv, pdf, citation: -1

    Xiulong Liu, Kun Su, Eli Shlizerman

  • Audiobox TTA-RAG: Improving Zero-Shot and Few-Shot Text-To-Audio with Retrieval-Augmented Generation, arXiv, 2411.05141, arxiv, pdf, citation: -1

    Mu Yang, Bowen Shi, Matthew Le, ..., Wei-Ning Hsu, Andros Tjandra

  • FlashAudio: Rectified Flows for Fast and High-Fidelity Text-to-Audio Generation, arXiv, 2410.12266, arxiv, pdf, citation: -1

    Huadai Liu, Jialei Wang, Rongjie Huang, ..., Wei Xue, Zhou Zhao

  • Movie Gen: A Cast of Media Foundation Models, arXiv, 2410.13720, arxiv, pdf, citation: -1

    Adam Polyak, Amit Zohar, Andrew Brown, ..., Vladan Petrovic, Yuming Du · (ai.meta)

Speech Generation

Conversion

  • CoDiff-VC: A Codec-Assisted Diffusion Model for Zero-shot Voice Conversion, arXiv, 2411.18918, arxiv, pdf, citation: -1

    Yuke Li, Xinfa Zhu, Hanzhao Li, ..., Zhifei Li, Lei Xie

Audio Editing

Datasets

Toolkits

Products

Misc