Add loader for candombe_beat_downbeat #553
Closed
Changes from 11 commits
19 commits
3da7521 added required files for candombe_beat_downbeat dataset (jimearruti)
b8ea311 fixed empty new line at the end of file (jimearruti)
a095b53 changed dataloader name from candombe_beat_downbeat to candombe (jimearruti)
9919bdb added documentation for load_beats function in candombe dataloader (jimearruti)
30e1b38 reformatted with black (jimearruti)
55a5d22 included candombe dataset information in table.rst and mirdata.rst (jimearruti)
597e08b Merge branch 'master' into master (harshpalan)
ad1dfdf formatted candombe.py with black (jimearruti)
5d58b97 Merge branch 'master' into master
f5c0207 changed candombe.py according to black formatting (jimearruti)
7bbd307 Merge branch 'master' into master (jimearruti)
326a1f6 Merge branch 'master' into master (genisplaja)
28a45e2 updated candombe Track class docstring (jimearruti)
209c53f Merge branch 'master' of github.com:jimearruti/mirdata_candombe_dataset (jimearruti)
222edea Merge branch 'master' into master (jimearruti)
f1f8cc3 made slight docstring changes (jimearruti)
802f1c9 Merge branch 'master' into master (jimearruti)
de29328 Merge branch 'master' of github.com:jimearruti/mirdata_candombe_dataset (jimearruti)
16d62ad slight docstring changes (jimearruti)
@@ -0,0 +1,177 @@

```python
"""Candombe Dataset Loader

.. admonition:: Dataset Info
    :class: dropdown

    This is a dataset of Candombe recordings with annotated beats and downbeats, totaling over 2 hours of audio.
    It comprises 35 complete performances by renowned players, in groups of three to five drums.
    Recording sessions were conducted in studio, in the context of musicological research over the past two decades.
    A total of 26 tambor players took part, belonging to different generations and representing all the important traditional Candombe styles.
    The audio files are stereo with a sampling rate of 44.1 kHz and 16-bit precision.
    The location of beats and downbeats was annotated by an expert, adding to more than 4700 downbeats.

    The audio is provided as .flac files and the annotations as .csv files.
    The values in the first column of the csv file are the time instants of the beats.
    The numbers in the second column indicate both the bar number and the beat number within the bar.
    For instance, 1.1, 1.2, 1.3 and 1.4 are the four beats of the first bar. Hence, each label ending with .1 indicates a downbeat.
    Another set of annotations is provided as .beats files in which the bar numbers are removed.

"""
import csv
from typing import BinaryIO, Optional, TextIO, Tuple

import librosa
import numpy as np

from mirdata import download_utils, jams_utils, core, annotations, io

BIBTEX = """
@inproceedings{Nunes2015,
author = {Leonardo Nunes and Martín Rocamora and Luis Jure and Luiz W. P. Biscainho},
title = {{Beat and Downbeat Tracking Based on Rhythmic Patterns Applied to the Uruguayan Candombe Drumming}},
booktitle = {Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR 2015)},
month = {Oct.},
address = {Málaga, Spain},
pages = {264--270},
year = {2015}
}
"""

INDEXES = {
    "default": "1.0",
    "test": "1.0",
    "1.0": core.Index(filename="candombe_index_1.0.json"),
}


REMOTES = {
    "annotations": download_utils.RemoteFileMetadata(
        filename="candombe_annotations.zip",
        url="https://zenodo.org/record/6533068/files/candombe_annotations.zip",
        checksum="f78aff60aa413cb4960c0c77cc31c243",
        destination_dir=None,
    ),
    "audio": download_utils.RemoteFileMetadata(
        filename="candombe_audio.zip",
        url="https://zenodo.org/record/6533068/files/candombe_audio.zip",
        checksum="ccd7f437024807b1a52c0818aa0b7f06",
        destination_dir=None,
    ),
}

LICENSE_INFO = """
Creative Commons Attribution 4.0 International
"""


class Track(core.Track):
    """candombe track class
    # -- YOU CAN AUTOMATICALLY GENERATE THIS DOCSTRING BY CALLING THE SCRIPT:
    # -- `scripts/print_track_docstring.py my_dataset`
    # -- note that you'll first need to have a test track (see "Adding tests to your dataset" below)

    Args:
        track_id (str): track id of the track

    Attributes:
        audio_path (str): path to audio file
        annotation_path (str): path to annotation file
        # -- Add any of the dataset specific attributes here

    Cached Properties:
        annotation (EventData): a description of this annotation

    """

    def __init__(self, track_id, data_home, dataset_name, index, metadata):
        super().__init__(track_id, data_home, dataset_name, index, metadata)

        self.audio_path = self.get_path("audio")
        self.beats_path = self.get_path("beats")

    @core.cached_property
    def beats(self) -> Optional[annotations.BeatData]:
        """The track's beats

        Returns:
            BeatData: loaded beat data

        """
        return load_beats(self.beats_path)

    @property
    def audio(self) -> Optional[Tuple[np.ndarray, float]]:
        """The track's audio

        Returns:
            * np.ndarray - audio signal
            * float - sample rate

        """
        return load_audio(self.audio_path)

    def to_jams(self):
        """Jams: the track's data in jams format"""
        return jams_utils.jams_converter(
            audio_path=self.audio_path, beat_data=[(self.beats, None)], metadata=None
        )


@io.coerce_to_bytes_io
def load_audio(fhandle: BinaryIO) -> Tuple[np.ndarray, float]:
    """Load a candombe audio file.

    Args:
        fhandle (str or file-like): path or file-like object pointing to an audio file

    Returns:
        * np.ndarray - the audio signal
        * float - The sample rate of the audio file

    """
    return librosa.load(fhandle, sr=None, mono=True)


@io.coerce_to_string_io
def load_beats(fhandle: TextIO) -> annotations.BeatData:
    """Load a candombe beats file.

    Args:
        fhandle (str or file-like): path or file-like object pointing to a beats annotation file

    Returns:
        * BeatData: loaded beat data

    """
    reader = csv.reader(fhandle, delimiter=",")
    times = []
    beats = []
    for line in reader:
        # First column: beat time in seconds.
        times.append(float(line[0]))
        # Second column: "bar.beat" label; keep the beat number within the bar.
        beats.append(int(line[1].split(".")[1]))

    beat_data = annotations.BeatData(
        times=np.array(times),
        time_unit="s",
        positions=np.array(beats),
        position_unit="bar_index",
    )
    return beat_data


@core.docstring_inherit(core.Dataset)
class Dataset(core.Dataset):
    """The candombe dataset"""

    def __init__(self, data_home=None, version="default"):
        super().__init__(
            data_home,
            version,
            name="candombe",
            track_class=Track,
            bibtex=BIBTEX,
            indexes=INDEXES,
            remotes=REMOTES,
            license_info=LICENSE_INFO,
        )
```
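The annotation format described in the loader's module docstring (beat time in the first column, a `bar.beat` label in the second) can be tried out without mirdata or the dataset itself. The following is a minimal sketch using only the standard library; the helper name `parse_beat_rows` and the sample rows are hypothetical and are not taken from the dataset or from this pull request.

```python
import csv
import io


def parse_beat_rows(text):
    """Parse "time,bar.beat" rows into beat times and within-bar beat numbers."""
    times, positions = [], []
    for row in csv.reader(io.StringIO(text), delimiter=","):
        if not row:
            continue  # skip blank lines
        times.append(float(row[0]))
        # A label such as "3.1" means bar 3, beat 1; keep only the beat number.
        positions.append(int(row[1].split(".")[1]))
    return times, positions


# Hypothetical sample rows in the documented format (not real annotations).
sample = "0.50,1.1\n1.00,1.2\n1.50,1.3\n2.00,1.4\n2.50,2.1\n"
times, positions = parse_beat_rows(sample)

# Labels ending in ".1" mark downbeats, so a downbeat is any beat at position 1.
downbeat_times = [t for t, p in zip(times, positions) if p == 1]
```

This mirrors the parsing step in `load_beats`, with the downbeat selection added purely for illustration.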
You would need to update the docstring here with the attributes and cached properties of your Track class: include the audio and beats paths as attributes, and beats as a cached property.
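One way the requested docstring might look is sketched below. The attribute and property names (`audio_path`, `beats_path`, `beats`) come from the `Track.__init__` and `beats` definitions in the diff; the description text for `beats_path` and `beats` is an assumption, and the class here is a bare stand-in that does not inherit from `core.Track`, so it can be inspected without mirdata installed.

```python
class Track:
    """candombe track class

    Args:
        track_id (str): track id of the track

    Attributes:
        audio_path (str): path to audio file
        beats_path (str): path to beats file

    Cached Properties:
        beats (BeatData): beat annotations for the track

    """

    # Stand-in only: the real class subclasses mirdata's core.Track and sets
    # audio_path and beats_path in __init__ via self.get_path(...).


# Print the docstring to check that the template placeholders are gone.
print(Track.__doc__)
```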