Quaternion Neural Networks project for 3D Sound Source
python3 -m pip install virtualenv
virtualenv venv
virtualenv -p /usr/bin/python3 venv
Run the following command every time before you run the code. It activates the virtual environment, in which you install all the packages used in the project:
source venv/bin/activate
The virtual environment must be active before you can run the code.
Install packages (only the first time)
python3 -m pip install -r ./Code/requirements.txt
Download the dataset. This script downloads the TAU-Dataset and unzips the folders.
python3 download_dataset.py
The following script prepares the folders as the program expects to receive them. It renames all the entries of the dataset according to the development set and evaluation set, and divides them according to the number of overlapping sound events (ov1 and ov2). It creates two main folders with the same content (TAU Dataset and TAU Dataset Seld), which are used later to extract the features for the two different nets.
python3 prepare_dataset.py
You should now have two folders inside the Dataset folder: TAU Dataset and TAU Dataset Seld. Each of them contains two folders, one for the wav files and one for the labels.
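A quick way to confirm this layout before extracting features is a small check like the sketch below. It assumes the dataset lives in a folder named `Dataset` relative to where you run it; the helper name `check_layout` is hypothetical and not part of the project. Since the names of the wav and label subfolders are not stated here, it only lists whatever subfolders it finds.

```python
from pathlib import Path

# Folder names taken from the text above; the "Dataset" root is an
# assumption about where the preparation script wrote its output.
EXPECTED = ("TAU Dataset", "TAU Dataset Seld")


def check_layout(root="Dataset"):
    """Return {folder name: sorted list of its subfolder names}.

    Raises FileNotFoundError if one of the expected folders is missing.
    """
    layout = {}
    for name in EXPECTED:
        folder = Path(root) / name
        if not folder.is_dir():
            raise FileNotFoundError(f"missing folder: {folder}")
        # Collect only directories, ignoring any loose files.
        layout[name] = sorted(p.name for p in folder.iterdir() if p.is_dir())
    return layout
```

Calling `check_layout()` from the repository root should return two entries, each listing two subfolders, if the preparation step finished correctly.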
cd Code
python3 batch_feature_extraction.py
python3 batch_feature_extraction_seld.py
The first script extracts the features for the quaternion net, the second for the traditional net. Both write the extracted features to the correct folder.
Once you have extracted the features, you can train the network:
python3 ./seld.py <name> --author <author> --params <params_num>
- name : the name used to save the model and other info. If the name already exists, training resumes from the saved epoch with the saved info (metrics, etc.).
- author : the name of the person who runs the training
- params_num : a number that selects the configuration to use (999 for a quick test, 10 for ov1, 20 for ov2). Other configurations can be implemented (TO DO).
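For readers unfamiliar with this style of command line, the sketch below shows how one positional argument and two options of this shape could be parsed with `argparse`. It is an illustration only: `build_parser` is a hypothetical name, and the real `seld.py` may define its arguments differently.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the documented command line;
    # the actual script may parse its arguments another way.
    parser = argparse.ArgumentParser(description="Train the network")
    parser.add_argument("name", help="name used to save the model and resume training")
    parser.add_argument("--author", help="person who runs the training")
    parser.add_argument("--params", type=int, help="configuration number, e.g. 999, 10 or 20")
    return parser


# Equivalent to: python3 ./seld.py quick_test --author alice --params 999
args = build_parser().parse_args(["quick_test", "--author", "alice", "--params", "999"])
print(args.name, args.author, args.params)  # prints: quick_test alice 999
```

The example values `quick_test` and `alice` are placeholders; only the configuration number 999 comes from the list above.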
python3 ./seld_cnn.py <name> --author <author> --params <params_num>
- params_num : a number that selects the configuration to use (11 for ov1, 21 for ov2). Other configurations can be implemented (TO DO).
- the other parameters are the same as above
To visualize the plots we use TensorBoard. Run one of the following commands:
tensorboard --logdir ./logs
tensorboard --logdir ./Code/logs
logdir : the path depends on your current directory (./logs from inside Code, ./Code/logs from the folder above it)
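If you are unsure which of the two paths applies, a small helper can pick the one that exists. This is a minimal sketch, not part of the project; `find_logdir` is a hypothetical name, and the two candidate paths are simply the ones from the commands above.

```python
from pathlib import Path


def find_logdir(cwd="."):
    """Return the first existing candidate log directory, or None."""
    # Candidates taken from the two tensorboard commands above.
    for candidate in ("logs", "Code/logs"):
        path = Path(cwd) / candidate
        if path.is_dir():
            return path
    return None
```

The returned path can then be passed to `tensorboard --logdir`.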
This loads the TensorBoard extension in Colab:
%load_ext tensorboard
This opens the TensorBoard window. In Colab, run it before you start the training so that you can follow the progress in real time:
%tensorboard --logdir ./logs