Pydiogment
Pydiogment aims to simplify audio augmentation. It generates multiple audio files from a single mono input file. The library can produce sped-up, slowed-down, and tone-shifted versions of the input, among other variations.
Installation
Dependencies
Pydiogment requires:
- Python (>= 3.5)
- NumPy (>= 1.17.2)
  pip install numpy
- SciPy (>= 1.3.1)
  pip install scipy
- FFmpeg
  sudo apt install ffmpeg
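Before installing, it can help to confirm that the dependencies listed above are actually reachable from the current environment. The following is a minimal sketch using only the standard library; the helper name `check_dependencies` is hypothetical and not part of Pydiogment, and it only checks for presence, not for the minimum versions listed above.

```python
import importlib.util
import shutil
import sys


def check_dependencies(modules=("numpy", "scipy"), tools=("ffmpeg",)):
    """Return a dict mapping each dependency name to True if it was found.

    Hypothetical helper for illustration; it does not verify version numbers
    of the modules or tools.
    """
    status = {"python>=3.5": sys.version_info >= (3, 5)}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        status[name] = importlib.util.find_spec(name) is not None
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        status[tool] = shutil.which(tool) is not None
    return status


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print("{:<12} {}".format(name, "ok" if found else "MISSING"))
```

Any entry reported as missing can be installed with the corresponding command from the list above.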
Installation
If you already have a working installation of NumPy and SciPy, you can install Pydiogment using pip:
pip install pydiogment
To update an existing version of Pydiogment, use:
pip install -U pydiogment
How to use
- Amplitude related augmentation
  - Apply a fade in and fade out effect
    from pydiogment.auga import fade_in_and_out
    test_file = "path/test.wav"
    fade_in_and_out(test_file)
  - Apply gain to file
    from pydiogment.auga import apply_gain
    test_file = "path/test.wav"
    apply_gain(test_file, -100)
    apply_gain(test_file, -50)
  - Add Random Gaussian Noise based on SNR to file
    from pydiogment.auga import add_noise
    test_file = "path/test.wav"
    add_noise(test_file, 10)
- Frequency related augmentation
  - Change file tone
    from pydiogment.augf import change_tone
    test_file = "path/test.wav"
    change_tone(test_file, 0.9)
    change_tone(test_file, 1.1)
- Time related augmentation
  - Slow down / speed up file
    from pydiogment.augt import slowdown, speed
    test_file = "path/test.wav"
    slowdown(test_file, 0.8)
    speed(test_file, 1.2)
  - Apply random cropping to the file
    from pydiogment.augt import random_cropping
    test_file = "path/test.wav"
    random_cropping(test_file, 1)
  - Shift data along the time axis in a given direction
    from pydiogment.augt import shift_time
    test_file = "path/test.wav"
    shift_time(test_file, 1, "right")
    shift_time(test_file, 1, "left")
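The amplitude examples above pass a gain value and an SNR value without stating their units. As a rough illustration of the underlying arithmetic, here is a minimal NumPy sketch that assumes both values are in decibels. This is an assumption, not something this README confirms, and the sketch is not Pydiogment's own implementation. The helper names `db_to_linear` and `noise_for_snr` and the sine test signal are hypothetical.

```python
import numpy as np


def db_to_linear(gain_db):
    """Convert an amplitude gain in decibels to a linear scale factor."""
    return 10.0 ** (gain_db / 20.0)


def noise_for_snr(signal, snr_db, rng=None):
    """Return white Gaussian noise whose power yields the requested SNR (dB)."""
    rng = np.random.default_rng() if rng is None else rng
    signal_power = np.mean(np.asarray(signal, dtype=float) ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)
    # => P_noise = P_signal / 10 ** (SNR_dB / 10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return rng.normal(0.0, np.sqrt(noise_power), size=np.shape(signal))


# Hypothetical test signal: one second of a 440 Hz sine sampled at 8 kHz.
t = np.arange(8000) / 8000.0
signal = np.sin(2 * np.pi * 440.0 * t)

# A gain of -6 dB scales the amplitude by roughly one half.
quieter = signal * db_to_linear(-6.0)

# Add noise so that the signal-to-noise ratio is approximately 10 dB.
noisy = signal + noise_for_snr(signal, 10.0, rng=np.random.default_rng(0))
```

Under these assumptions, a lower SNR value means more noise, and a more negative gain value means a quieter output.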
Documentation
Thorough documentation of the library is available at pydiogment.readthedocs.io.
Contributing
Contributions are welcome and encouraged. To learn more about how to contribute to Pydiogment, please refer to the Contributing guidelines.
Acknowledgment and credits
- The test file used in the pytests is OSR_us_000_0060_8k.wav from the Open Speech Repository.