This repository contains the corresponding code for the 2nd place submission to the first Freesound general-purpose audio tagging challenge carried out as Task 2 within the DCASE challenge 2018.
For a detailed description of the entire audio tagging system, please visit the corresponding GitHub page. In this README I only provide the technical instructions for setting up the project.
Before we can start working with the code, we first need to set up a few things:
Note: This package requires Python 2.7!
For a list of the required Python packages, see requirements.txt, or install them all at once using pip:
pip install -r requirements.txt
Alternatively, create a conda environment from environment.yaml:
conda env create -f environment.yaml
conda activate dcase18
To install the project in develop mode run
python setup.py develop --user
in the root folder of the package.
This is what I recommend, especially if you want to try out new ideas.
Then download the challenge data and organize it in the following folder structure:
<DATA_ROOT>
- audio_train
- audio_test
- train.csv
- test_post_competition.csv
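Before training, it can help to confirm that the folder structure above is in place. The following is a minimal sketch, not part of this repository: the helper name find_missing_entries and the constant EXPECTED_ENTRIES are hypothetical, and the check only tests that each entry exists, not that its contents are complete.

```python
import os

# Entries taken from the folder structure listed above.
EXPECTED_ENTRIES = [
    "audio_train",
    "audio_test",
    "train.csv",
    "test_post_competition.csv",
]


def find_missing_entries(data_root):
    """Return the expected entries that do not exist under data_root.

    Hypothetical helper for illustration; it is not part of the package.
    """
    return [name for name in EXPECTED_ENTRIES
            if not os.path.exists(os.path.join(data_root, name))]
```

Calling find_missing_entries with your own <DATA_ROOT> should return an empty list once the download is organized as described.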
In config/settings.py you have to set the following two paths:
DATA_ROOT = "/home/matthias/shared/datasets/dcase2018_task2_release"
EXP_ROOT = "/home/matthias/experiments/dcase_task2/"
DATA_ROOT is the <DATA_ROOT> path from above.
EXP_ROOT is where the model parameters and logs will be stored.
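This README does not say whether the experiment folder is created automatically, so it may be safest to create EXP_ROOT up front. A minimal sketch, assuming a hypothetical helper ensure_dir that is not part of this package (written so that it also runs under Python 2.7, where os.makedirs has no exist_ok argument):

```python
import os


def ensure_dir(path):
    """Create path (including parent folders) if it is missing; return path.

    Hypothetical helper for illustration; it is not part of the package.
    """
    if not os.path.isdir(path):
        os.makedirs(path)
    return path
```

For example, ensure_dir(EXP_ROOT) could be run once with the value you set in config/settings.py.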
Once this is all set up, you can switch to the detailed writeup on the GitHub page.
In order to run audio_tagger.py, we had to install pyaudio and portaudio in our Anaconda environment (Ubuntu 18.04):
conda install nwani::portaudio nwani::pyaudio
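To check that the installation worked, you can test whether pyaudio can be imported in the active environment. This is only a sketch: the helper is_importable is hypothetical and not part of this package, and a successful import does not guarantee that an audio device is available.

```python
from __future__ import print_function

import importlib


def is_importable(module_name):
    """Return True if module_name can be imported in this environment.

    Hypothetical helper for illustration; it is not part of the package.
    """
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


print("pyaudio importable:", is_importable("pyaudio"))
```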