This application uses state-of-the-art source separation models to remove vocals from audio files. UVR's core developers trained all of the models provided in this package (except for the Demucs v3 and v4 4-stem models).
conda create -n uvr5 python=3.12
# tk is included in conda python
conda activate uvr5
conda install ffmpeg
conda install pytorch -c pytorch
pip install onnx2pytorch onnx
pip install -r requirements.txt
# building wheel for samplerate may require a proxy
python UVR.py
conda create -n uvr5 python=3.12
# tk is included in conda python
conda activate uvr5
conda install gcc gxx cmake ninja glib gobject-introspection ffmpeg
conda install pytorch torchvision torchaudio pytorch-cuda=12.4 -c pytorch -c nvidia
pip install onnxruntime-gpu onnx2pytorch onnx --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/
pip install pygobject
pip install -r requirements.txt
# building wheel for samplerate may require a proxy
python UVR.py
- Nvidia GTX 1060 6GB is the minimum requirement for GPU conversions.
- Nvidia GPUs with at least 8GBs of V-RAM are recommended.
- AMD Radeon GPU supported is limited at this time.
- There is currently a working branch for AMD GPU users here
- This application is only compatible with 64-bit platforms.
- This application relies on the Rubber Band library for the Time-Stretch and Pitch-Shift options.
- This application relies on FFmpeg to process non-wav audio files.
- The application will automatically remember your settings when closed.
- Conversion times will significantly depend on your hardware.
- These models are computationally intensive.
- Model load times are faster.
- Importing/exporting audio files is faster.
- If FFmpeg is not installed, the application will throw an error if the user attempts to convert a non-WAV file.
- Memory allocation errors can usually be resolved by lowering the "Segment" or "Window" sizes.
There's a known issue on MacOS Sonoma where left-clicks aren't registering correctly within the app. This was impacting all applications built with Tkinter on Sonoma and has since been resolved. Please download the latest version via the following link if you are still experiencing issues - link
This issue was being tracked here.
Please be as detailed as possible when posting a new issue.
If possible, click the "Settings Button" to the left of the "Start Processing" button and click the "Error Log" button for detailed error information that can be provided to us.
The Ultimate Vocal Remover GUI code is MIT-licensed.
- Please Note: For all third-party application developers who wish to use our models, please honor the MIT license by providing credit to UVR and its developers.
- ZFTurbo - Created & trained the weights for the new MDX23C models.
- DilanBoskan - Your contributions at the start of this project were essential to the success of UVR. Thank you!
- Bas Curtiz - Designed the official UVR logo, icon, banner, and splash screen.
- tsurumeso - Developed the original VR Architecture code.
- Kuielab & Woosung Choi - Developed the original MDX-Net AI code.
- Adefossez & Demucs - Developed the original Demucs AI code.
- KimberleyJSN - Advised and aided the implementation of the training scripts for MDX-Net and Demucs. Thank you!
- Hv - Helped implement chunks into the MDX-Net AI code. Thank you!
- For anyone interested in the ongoing development of Ultimate Vocal Remover GUI, please send us a pull request, and we will review it.
- This project is 100% open-source and free for anyone to use and modify as they wish.
- We only maintain the development and support for the Ultimate Vocal Remover GUI and the models provided.
- [1] Takahashi et al., "Multi-scale Multi-band DenseNets for Audio Source Separation", https://arxiv.org/pdf/1706.09588.pdf