Reproducibility kit for "BAF: An Audio Fingerprinting Dataset for Broadcast Monitoring" by Guillem Cortès, Álex Ciurana, Emilio Molina, Marius Miron, Owen Meyers, Joren Six and Xavier Serra.

guillemcortes/baf-dataset

This is the official repository for "BAF: An Audio Fingerprinting Dataset For Broadcast Monitoring", published at ISMIR 2022.

Dataset

The Broadcast Audio Fingerprinting (BAF) dataset is an open, annotated dataset, available upon request, for the task of music monitoring in broadcast. As references, it contains 2,000 tracks from Epidemic Sound's private catalogue, amounting to 74 hours of audio. As queries, it contains over 57 hours of TV broadcast audio from 23 countries and 203 channels, distributed as 3,425 one-minute audio excerpts.
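As a quick sanity check, the figures quoted above can be related to each other with a few lines of arithmetic. This is only a sketch: the constant names are illustrative, and the derived averages assume the stated totals are exact, whereas the 74-hour figure is probably rounded.

```python
# Figures quoted in the dataset description.
N_REFERENCES = 2000      # reference tracks
REFERENCE_HOURS = 74     # total reference audio (likely a rounded value)
N_QUERIES = 3425         # one-minute query excerpts
N_CHANNELS = 203         # TV channels the queries were taken from

# 3,425 one-minute excerpts expressed in hours.
query_hours = N_QUERIES / 60

# Average reference track length in minutes, assuming exactly 74 hours.
mean_reference_minutes = REFERENCE_HOURS * 60 / N_REFERENCES

# Average number of query excerpts per channel.
mean_queries_per_channel = N_QUERIES / N_CHANNELS

print(f"query audio: {query_hours:.2f} h")
print(f"mean reference length: {mean_reference_minutes:.2f} min")
print(f"mean queries per channel: {mean_queries_per_channel:.1f}")
```

The query total comes out slightly above 57 hours, which matches the "over 57 hours" stated in the description.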

It has been annotated by six annotators in total, and each query has been cross-annotated by three of them. The resulting inter-annotator agreement percentages are high, which validates the annotation methodology and supports the reliability of the annotations.
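To illustrate what an agreement percentage between annotators can look like, here is a minimal sketch of pairwise percent agreement. It is not the procedure used by the authors: the function name, the annotator names, and the per-excerpt binary labels are all hypothetical and chosen only to keep the example small.

```python
from itertools import combinations


def pairwise_agreement(annotations):
    """Return percent agreement for every pair of annotators.

    `annotations` maps an annotator name to a list of labels, one per
    item, with all lists covering the same items in the same order.
    """
    scores = {}
    for first, second in combinations(sorted(annotations), 2):
        labels_a = annotations[first]
        labels_b = annotations[second]
        if not labels_a or len(labels_a) != len(labels_b):
            raise ValueError("label lists must be non-empty and equally long")
        # Count the items on which both annotators gave the same label.
        matches = sum(x == y for x, y in zip(labels_a, labels_b))
        scores[(first, second)] = 100.0 * matches / len(labels_a)
    return scores


# Hypothetical toy labels: 1 = music present in the excerpt, 0 = no music.
toy = {
    "annotator_a": [1, 1, 0, 0],
    "annotator_b": [1, 1, 0, 1],
    "annotator_c": [1, 0, 0, 0],
}
for pair, score in pairwise_agreement(toy).items():
    print(pair, f"{score:.0f}%")
```

Plain percent agreement does not correct for agreement expected by chance; it is used here only because the description above speaks of agreement percentages.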

Downloading the data

The dataset is available for conducting non-commercial research related to audio analysis. It shall not be used for music generation or music synthesis. It is available upon request on Zenodo alongside an extended description of the dataset contents, motivation, license, ownership of the data, and the dataset datasheet.

Algorithms

Configuration files are located at baf-dataset/configs.

Code

baf-dataset/
├── compute_statistics.py --> Script to generate metrics
├── configs --> Parameter configurations used
│   ├── audfprint.cfg
│   ├── …
│   └── panako.cfg
└── peakfp --> Fingerprinting baseline
    ├── README.md
    ├── constants.py
    ├── …
    └── utils.py

Installation

The authors recommend the use of virtual environments.

Requirements:

  • Python 3.6+
  • Create virtual environment and install requirements
git clone https://github.com/guillemcortes/baf-dataset.git
cd baf-dataset
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Usage

BAF has a dedicated dataloader in mirdata that can help when working with the dataset. See the mirdata documentation for details.
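A minimal sketch of how the loader might be used, assuming `mirdata` is installed, that the loader is registered under the name `baf`, and that the audio obtained from Zenodo has been unpacked into a local folder. `load_baf` and `describe` are illustrative helpers, not part of mirdata or of this repository.

```python
def load_baf(data_home):
    """Initialize the BAF loader, or return None if mirdata is missing.

    `data_home` is the local folder holding the data requested from
    Zenodo. The dataset name "baf" is an assumption about how the
    loader is registered in mirdata.
    """
    try:
        import mirdata
    except ImportError:
        return None
    return mirdata.initialize("baf", data_home=data_home)


def describe(dataset):
    """Summarize a mirdata-style dataset object via its track ids."""
    track_ids = list(dataset.track_ids)
    return {"n_tracks": len(track_ids), "first_ids": track_ids[:3]}
```

With the data in place, `describe(load_baf("/path/to/baf"))` would report how many tracks the index lists (the path is a placeholder). Calling `dataset.validate()` on the returned object checks the local files against the index.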

License

  • The code in this repository is licensed under Apache 2.0
  • Dataset license is detailed in Zenodo

Citation

Please cite the following publication when using the dataset:

Guillem Cortès, Alex Ciurana, Emilio Molina, Marius Miron, Owen Meyers, Joren Six, & Xavier Serra. (2022). BAF: An audio fingerprinting dataset for broadcast monitoring. Proceedings of the 23rd International Society for Music Information Retrieval Conference, pp. 908–916. 4-8 December 2022, Bengaluru, India.

Bibtex version:

@inproceedings{cortes2022BAF,
  author       = {Guillem Cortès and
                  Alex Ciurana and
                  Emilio Molina and
                  Marius Miron and
                  Owen Meyers and
                  Joren Six and
                  Xavier Serra},
  title        = {{BAF: An audio fingerprinting dataset for broadcast monitoring}},
  booktitle    = {{Proceedings of the 23rd International Society for Music Information Retrieval Conference}},
  year         = 2022,
  pages        = {908-916},
  publisher    = {ISMIR},
  address      = {Bengaluru, India},
  month        = dec,
  venue        = {Bengaluru, India},
  doi          = {10.5281/zenodo.7316812},
  url          = {https://doi.org/10.5281/zenodo.7372162}
}

Acknowledgements

This research is part of NextCore – New generation of music monitoring technology (RTC2019-007248-7), funded by the Spanish Ministerio de Ciencia e Innovación and the Agencia Estatal de Investigación. It has also received support from the Industrial Doctorates plan of the Secretaria d’universitats i Recerca, Departament d’Empresa i Coneixement de la Generalitat de Catalunya, grant agreement No. DI46-2020.


Attribution

Document icon created by iconmas - Flaticon

Database icon created by Bharat Icons - Flaticon

Youtube Logo from freepnglogos.com
