What's inside the dataset file? I don't have that much space on my hard disk #41

Open
xalteropsx opened this issue Jun 11, 2024 · 7 comments

@xalteropsx

xalteropsx commented Jun 11, 2024

What is the benefit of training a model as a beginner, since you already provide a trained model? I just don't understand why people redo it.

@ogbanugot

Honestly, I really wish you could add this. Or could you help with a smaller sample?

@ogbanugot

ogbanugot commented Jul 3, 2024

For anyone out there, you can use this Python script to structure your dataset in the same way as AudioCaps. You'll need to create a CSV of your data with headers ['audio', 'caption'], where 'audio' is the full path to your audio WAV file and 'caption' is a string. Create a row for each audio WAV file.
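For reference, a minimal sketch of building such a CSV with pandas (the paths and captions below are hypothetical placeholders, not part of the original script):

import pandas as pd

# Hypothetical example rows -- replace with your own wav paths and captions
rows = [
    {'audio': '/data/wavs/dog_bark_001.wav', 'caption': 'A dog barking twice in the distance'},
    {'audio': '/data/wavs/rain_window_002.wav', 'caption': 'Heavy rain hitting a window pane'},
]
pd.DataFrame(rows).to_csv('audioldm_dataset.csv', index=False)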

import os
import json
import pandas as pd
import shutil
from tqdm import tqdm

# Load the CSV file
csv_path = 'audioldm_dataset.csv' #### Change to your own CSV file!
data = pd.read_csv(csv_path)

# Define paths
root_dir = './AudioLDM-training-finetuning/data'
audioset_dir = os.path.join(root_dir, 'dataset/audioset')
metadata_dir = os.path.join(root_dir, 'dataset/metadata')
datafiles_dir = os.path.join(metadata_dir, 'datafiles')
testset_subset_dir = os.path.join(metadata_dir, 'testset_subset')
valset_subset_dir = os.path.join(metadata_dir, 'valset_subset')

# Create directories if they don't exist
os.makedirs(audioset_dir, exist_ok=True)
os.makedirs(datafiles_dir, exist_ok=True)
os.makedirs(testset_subset_dir, exist_ok=True)
os.makedirs(valset_subset_dir, exist_ok=True)

# Copy audio files to the audioset directory
for audio_file in tqdm(data['audio']):
    file_name = os.path.basename(audio_file)
    new_path = os.path.join(audioset_dir, file_name)
    os.makedirs(os.path.dirname(new_path), exist_ok=True)
    try:
        shutil.copy(audio_file, new_path)
    except Exception as e:
        print(f"Error copying {audio_file}: {e}")

# Create metadata JSON files
train_data = []
test_data = []
val_data = []

for i, row in data.iterrows():
    datapoint = {
        'wav': os.path.basename(row['audio']),
        'caption': row['caption']
    }
    # You can define your own condition to split between train, test, and val
    if i % 5 == 0:  # Example condition for test
        test_data.append(datapoint)
    elif i % 5 == 1:  # Example condition for validation
        val_data.append(datapoint)
    else:
        train_data.append(datapoint)

# Save the train metadata
train_metadata = {'data': train_data}
with open(os.path.join(datafiles_dir, 'audiocaps_train_label.json'), 'w') as f:
    json.dump(train_metadata, f, indent=4)

# Save the test metadata
test_metadata = {'data': test_data}
with open(os.path.join(testset_subset_dir, 'audiocaps_test_nonrepeat_subset_0.json'), 'w') as f:
    json.dump(test_metadata, f, indent=4)

# Save the validation metadata
val_metadata = {'data': val_data}
with open(os.path.join(valset_subset_dir, 'audiocaps_val_label.json'), 'w') as f:
    json.dump(val_metadata, f, indent=4)

# Save the dataset root metadata
dataset_root_metadata = {
    'audiocaps': 'data/dataset/audioset',
    'metadata': {
        'path': {
            'audiocaps': {
                'train': 'data/dataset/metadata/datafiles/audiocaps_train_label.json',
                'test': 'data/dataset/metadata/testset_subset/audiocaps_test_nonrepeat_subset_0.json',
                'val': 'data/dataset/metadata/valset_subset/audiocaps_val_label.json'
            }
        }
    }
}
with open(os.path.join(metadata_dir, 'dataset_root.json'), 'w') as f:
    json.dump(dataset_root_metadata, f, indent=4)

print("Dataset structured successfully!")

@xalteropsx
Author

Thanks!

@mateusztobiasz

mateusztobiasz commented Nov 21, 2024

Is this everything I need to do to fine-tune AudioLDM on my dataset? In the provided zip file there are many more files, and the .csv files have these headers: [audiocap_id,youtube_id,start_time,caption]
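(If you are starting from that original AudioCaps-style CSV, here is a hedged sketch of mapping it to the ['audio', 'caption'] layout the script above expects; the wav naming convention below is an assumption about how you stored the downloaded clips:)

import pandas as pd

# Headers: audiocap_id, youtube_id, start_time, caption
ac = pd.read_csv('audiocaps_train.csv')

# Assumed naming convention for the clips downloaded from YouTube
ac['audio'] = ac.apply(
    lambda r: f"/data/audiocaps/{r['youtube_id']}_{int(r['start_time'])}.wav", axis=1)

ac[['audio', 'caption']].to_csv('audioldm_dataset.csv', index=False)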

@EXPSTUDIOmo

Wondering the same as @mateusztobiasz.
The AudioCaps example is much richer and more complicated, but I guess we don't need all of that?
Thanks for the provided script anyway!

@mateusztobiasz

@EXPSTUDIOmo I ran the above script to structure my own dataset and put it in the right dirs. Then I ran this command: python3 audioldm_train/train/latent_diffusion.py -c audioldm_train/config/2023_08_23_reproduce_audioldm/audioldm_original_medium.yaml --reload_from_ckpt data/checkpoints/audioldm-m-full.ckpt
It didn't fail, so it seems this script could be enough. I'm not 100% sure, though, because the process was killed when it got to backpropagation (it consumed 15 GB of VRAM, so I'm going to try the smaller model or adjust the config file). Regarding this, I created #48 because I haven't found any info about hardware requirements.
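(As an aside, a quick way to check how much VRAM your GPU has before launching training, assuming PyTorch is installed:)

import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB total VRAM")
else:
    print("No CUDA device visible")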

@mateusztobiasz

Any updates on the topic we are discussing?
