Splits audio files into chunks and uploads them to Zooniverse. Audio chunks and their metadata are saved at a location specified by the user, allowing them to be associated later with the results obtained from Zooniverse.
The output metadata are stored as a dataframe with the same format as this example file.
```bash
git clone https://github.com/LAAC-LSCP/Zooniverse2.git
cd Zooniverse2
pip install -r requirements.txt
```
```
python zooniverse.py extract-chunks [-h] --destination DESTINATION
                                    --sample-size SAMPLE_SIZE
                                    [--annotation-set ANNOTATION_SET]
                                    [--target-speaker-type {CHI,OCH,FEM,MAL}]
                                    [--batch-size BATCH_SIZE]
                                    [--threads THREADS]
                                    path
```
If it does not exist, DESTINATION is created. Audio chunks are saved as wav and mp3 files in `DESTINATION/chunks`. Metadata is stored in a file named `DESTINATION/chunks.csv`.
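After extraction, DESTINATION should therefore contain something like the layout below (the chunk file names are purely illustrative; the actual names are generated by the tool):

```
DESTINATION/
├── chunks.csv
└── chunks/
    ├── <chunk_1>.wav
    ├── <chunk_1>.mp3
    └── ...
```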
| argument | description | default value |
|---|---|---|
| path | path to the dataset | |
| destination | where to write the output metadata and files. Metadata will be saved to `$destination/chunks.csv` and audio chunks to `$destination/chunks`. | |
| sample-size | how many vocalization events to sample per recording | |
| batch-size | how many chunks per batch | 1000 |
| annotation-set | which annotation set to use for sampling | vtc |
| target-speaker-type | speaker type to get chunks from | CHI |
| threads | how many threads to perform the conversion on; uses all CPUs if <= 0 | 0 |
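For instance, the command below (the paths and values are placeholders to adapt to your setup) would sample 10 CHI vocalization events per recording from the vtc annotation set and write the chunks and metadata under `/path/to/destination`:

```bash
python zooniverse.py extract-chunks \
    --destination /path/to/destination \
    --sample-size 10 \
    --annotation-set vtc \
    --target-speaker-type CHI \
    --batch-size 1000 \
    --threads 4 \
    /path/to/dataset
```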
```
python zooniverse.py upload-chunks [-h] --destination DESTINATION
                                   --zooniverse-login ZOONIVERSE_LOGIN
                                   --zooniverse-pwd ZOONIVERSE_PWD
                                   --project-slug PROJECT_SLUG
                                   --subject-set SUBJECT_SET
                                   [--batches BATCHES]
```
Uploads as many batches of audio chunks as specified to Zooniverse, and updates `chunks.csv` accordingly.
| argument | description | default value |
|---|---|---|
| destination | where to find the output metadata and files | |
| project-slug | Zooniverse project slug (e.g.: lucasgautheron/my-new-project) | |
| subject-set | prefix for the subject set | |
| zooniverse-login | Zooniverse login | |
| zooniverse-pwd | Zooniverse password | |
| batches | how many batches to upload. It is recommended to upload fewer than 10,000 chunks per day, i.e. at most 10 batches of the default size of 1,000. If set to 0, all remaining batches are uploaded. | 0 |
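For example, the following command (the login, password, subject-set prefix and destination path are placeholders) would upload the first two batches of chunks found in `/path/to/destination`:

```bash
python zooniverse.py upload-chunks \
    --destination /path/to/destination \
    --zooniverse-login my-login \
    --zooniverse-pwd my-password \
    --project-slug lucasgautheron/my-new-project \
    --subject-set my-subject-set \
    --batches 2
```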