
Requirements #48

Open
mateusztobiasz opened this issue Nov 22, 2024 · 2 comments

Comments

@mateusztobiasz

Hello!

I am a beginner and wanted to experiment with your fine-tuning script, so I decided to run it on Google Colab. Just to test it out, I prepared train, test, and eval datasets that each contained only a single audio clip with a caption. When I ran the script, it consumed about 15 GB of GPU VRAM and the process was killed. Is this normal behaviour? If so, do you know how much VRAM I would need to fine-tune this model on a more reasonable dataset (about 1000 rows)?

Thank you in advance for your reply!

@mateusztobiasz
Author

I would also like to add that I haven't changed any hyperparameters and used the audioldm_train/config/2023_08_23_reproduce_audioldm/audioldm_original_medium.yaml config. Also, here is a screenshot of the moment the process was killed:

[Screenshot: terminal output at the moment the process was killed]

@ahmetbekcan

Hi, I'm running into the same problem; a T4 GPU is not sufficient. I don't know if it helps, but I decreased the precision to "medium" and reduced the batch size to 1, and it still runs out of memory. Were you able to find a solution?
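For what it's worth, the ~15 GB figure is roughly consistent with a back-of-the-envelope estimate of full fine-tuning memory. This is only a sketch: the ~600M parameter count below is an illustrative assumption, not the actual size of the medium AudioLDM checkpoint, and the estimate ignores activations, the frozen VAE/text encoder, and CUDA overhead, which together account for the remaining gigabytes.

```python
# Rough estimate of fine-tuning VRAM for the trainable weights alone.
# Assumption (hypothetical): ~600M trainable parameters, fp32, Adam-style
# optimizer with two state tensors per parameter. Activations and frozen
# components are deliberately ignored here.

def training_memory_gb(n_params, bytes_per_value=4, optimizer_states=2):
    """Memory for weights + gradients + optimizer states, in GiB."""
    # one copy for the weight, one for its gradient, plus optimizer states
    values_per_param = 1 + 1 + optimizer_states
    return n_params * values_per_param * bytes_per_value / 1024**3

print(round(training_memory_gb(600e6), 1))  # ~8.9 GiB before activations
```

Since weights, gradients, and optimizer states alone can approach 9 GiB under these assumptions, a 15–16 GB card like the T4 leaves little headroom for activations even at batch size 1. Note that `torch.set_float32_matmul_precision("medium")` mainly trades matmul accuracy for speed and does not meaningfully reduce memory; mixed-precision training (fp16/bf16) or gradient checkpointing would be the more relevant knobs, if the training config supports them.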
