Ego4DSounds

Ego4DSounds is a subset of Ego4D, a large-scale egocentric video dataset. Its clips exhibit high action-audio correspondence, making it a high-quality dataset for action-to-sound generation.

Explore the dataset

Action2Sound

Dataset introduced in "Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos".

Action2Sound is an ambient-aware approach that disentangles action sounds from ambient sounds. This enables successful generation after training on diverse in-the-wild data, as well as controllable conditioning on the ambient sound level.

[Action2Sound overview figure]

Explore the project

Contents

This repository contains scripts for processing the Ego4DSounds dataset. It includes functionality for loading video and audio data and extracting clips using metadata.

  • extract_ego4d_clips.py: Extracts clips from the Ego4D dataset
  • dataset.py: Defines the Ego4DSounds dataset class for loading and processing video and audio clips
  • Metadata files: train_clips_1.2m.csv, test_clips_11k.csv, ego4d.json
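Clip extraction of this kind is typically done by seeking into the full-length Ego4D video with ffmpeg. The sketch below is illustrative only: the function name and flag choices are assumptions, not the actual interface of extract_ego4d_clips.py.

```python
import shlex

def build_ffmpeg_cmd(video_path, clip_start, clip_end, out_path):
    """Build an ffmpeg command that cuts [clip_start, clip_end) out of a
    full-length video and re-encodes it as a short clip.

    Hypothetical helper: the real extract_ego4d_clips.py may use
    different flags or a library wrapper around ffmpeg.
    """
    duration = clip_end - clip_start
    cmd = [
        "ffmpeg", "-y",
        "-ss", f"{clip_start:.3f}",  # seek to clip start (seconds)
        "-i", video_path,            # source video, located via video_uid
        "-t", f"{duration:.3f}",     # clip length
        "-c:v", "libx264",           # re-encode for frame-accurate cuts
        "-c:a", "aac",               # keep the audio track
        out_path,
    ]
    return cmd

cmd = build_ffmpeg_cmd("videos/abc123.mp4", 12.5, 15.5, "clips/abc123_0.mp4")
print(shlex.join(cmd))
```

Placing `-ss` before `-i` makes ffmpeg seek on the input, which is much faster on hour-long Ego4D videos than decoding from the start.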

Each row in the CSV files has the following columns:

video_uid, video_dur, narration_source, narration_ind, narration_time, clip_start, clip_end, clip_text, tag_verb, tag_noun, positive, clip_file, speech, background_music, traffic_noise, wind_noise
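The metadata can be read with the standard csv module; each row locates a clip (`clip_file`) inside a source video (`video_uid`, `clip_start`, `clip_end`) along with its narration text. The row below is a synthetic example in this schema, not a row from the real train/test CSVs:

```python
import csv
import io

# Synthetic example row following the metadata schema above
# (values are illustrative only).
sample = (
    "video_uid,video_dur,narration_source,narration_ind,narration_time,"
    "clip_start,clip_end,clip_text,tag_verb,tag_noun,positive,clip_file,"
    "speech,background_music,traffic_noise,wind_noise\n"
    "abc123,3600.0,narration_pass_1,0,12.7,12.5,15.5,"
    "#C C chops the onion,chop,onion,1,abc123_0.mp4,0,0,0,0\n"
)

rows = list(csv.DictReader(io.StringIO(sample)))
for row in rows:
    # Clip boundaries are stored in seconds relative to the source video.
    duration = float(row["clip_end"]) - float(row["clip_start"])
    print(row["clip_file"], "|", row["clip_text"], f"| {duration:.1f}s")
```

The boolean-style columns at the end (`speech`, `background_music`, `traffic_noise`, `wind_noise`) flag ambient-sound conditions per clip, which is what Action2Sound's ambient-aware conditioning relies on.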

BibTeX

@article{chen2024action2sound,
  title = {Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos},
  author = {Changan Chen and Puyuan Peng and Ami Baid and Sherry Xue and Wei-Ning Hsu and David Harwath and Kristen Grauman},
  year = {2024},
  journal = {arXiv},
}
