What's inside the dataset file? I don't have that much space on my hard disk #41

Open
xalteropsx opened this issue Jun 11, 2024 · 7 comments

@xalteropsx

xalteropsx commented Jun 11, 2024

What is the benefit of training a model as a beginner, since you already provide a trained model? I just don't understand why people redo it.

@ogbanugot

Honestly, I really wish you could add this. Or could you help with a smaller sample?

@ogbanugot

ogbanugot commented Jul 3, 2024

For anyone out there, you can use this Python script to structure your dataset in the same way as AudioCaps. You'll need to create a CSV of your data with headers ['audio', 'caption'], where 'audio' is the full path to your audio WAV file and 'caption' is a string. Create a row for each audio WAV file.
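For reference, a minimal sketch of building such a CSV with pandas (the paths and captions below are hypothetical placeholders, not part of the original script):

import pandas as pd

# Hypothetical example rows -- replace with your own wav paths and captions
rows = [
    {'audio': '/data/wavs/dog_bark_001.wav', 'caption': 'A dog barking twice in the distance'},
    {'audio': '/data/wavs/rain_window_002.wav', 'caption': 'Heavy rain hitting a window pane'},
]
pd.DataFrame(rows).to_csv('audioldm_dataset.csv', index=False)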

import os
import json
import pandas as pd
import shutil
from tqdm import tqdm

# Load the CSV file
csv_path = 'audioldm_dataset.csv' #### Change to your own CSV file!
data = pd.read_csv(csv_path)

# Define paths
root_dir = './AudioLDM-training-finetuning/data'
audioset_dir = os.path.join(root_dir, 'dataset/audioset')
metadata_dir = os.path.join(root_dir, 'dataset/metadata')
datafiles_dir = os.path.join(metadata_dir, 'datafiles')
testset_subset_dir = os.path.join(metadata_dir, 'testset_subset')
valset_subset_dir = os.path.join(metadata_dir, 'valset_subset')

# Create directories if they don't exist
os.makedirs(audioset_dir, exist_ok=True)
os.makedirs(datafiles_dir, exist_ok=True)
os.makedirs(testset_subset_dir, exist_ok=True)
os.makedirs(valset_subset_dir, exist_ok=True)

# Copy audio files to the audioset directory
for audio_file in tqdm(data['audio']):
    file_name = os.path.basename(audio_file)
    new_path = os.path.join(audioset_dir, file_name)
    os.makedirs(os.path.dirname(new_path), exist_ok=True)
    try:
        shutil.copy(audio_file, new_path)
    except Exception as e:
        print(f"Error copying {audio_file}: {e}")

# Create metadata JSON files
train_data = []
test_data = []
val_data = []

for i, row in data.iterrows():
    datapoint = {
        'wav': os.path.basename(row['audio']),
        'caption': row['caption']
    }
    # You can define your own condition to split between train, test, and val
    if i % 5 == 0:  # Example condition for test
        test_data.append(datapoint)
    elif i % 5 == 1:  # Example condition for validation
        val_data.append(datapoint)
    else:
        train_data.append(datapoint)

# Save the train metadata
train_metadata = {'data': train_data}
with open(os.path.join(datafiles_dir, 'audiocaps_train_label.json'), 'w') as f:
    json.dump(train_metadata, f, indent=4)

# Save the test metadata
test_metadata = {'data': test_data}
with open(os.path.join(testset_subset_dir, 'audiocaps_test_nonrepeat_subset_0.json'), 'w') as f:
    json.dump(test_metadata, f, indent=4)

# Save the validation metadata
val_metadata = {'data': val_data}
with open(os.path.join(valset_subset_dir, 'audiocaps_val_label.json'), 'w') as f:
    json.dump(val_metadata, f, indent=4)

# Save the dataset root metadata
dataset_root_metadata = {
    'audiocaps': 'data/dataset/audioset',
    'metadata': {
        'path': {
            'audiocaps': {
                'train': 'data/dataset/metadata/datafiles/audiocaps_train_label.json',
                'test': 'data/dataset/metadata/testset_subset/audiocaps_test_nonrepeat_subset_0.json',
                'val': 'data/dataset/metadata/valset_subset/audiocaps_val_label.json'
            }
        }
    }
}
with open(os.path.join(metadata_dir, 'dataset_root.json'), 'w') as f:
    json.dump(dataset_root_metadata, f, indent=4)

print("Dataset structured successfully!")

@xalteropsx
Author

Thanks!

@mateusztobiasz

mateusztobiasz commented Nov 21, 2024

Is this everything I need to do to fine-tune AudioLDM on my dataset? In the provided zip file there are many more files, and the .csv files have these headers: [audiocap_id,youtube_id,start_time,caption]
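(If you are starting from that original AudioCaps-style CSV, here is a hedged sketch of mapping it to the ['audio', 'caption'] layout the script above expects; the wav naming convention below is an assumption about how you stored the downloaded clips:)

import pandas as pd

# Headers: audiocap_id, youtube_id, start_time, caption
ac = pd.read_csv('audiocaps_train.csv')

# Assumed naming convention for the clips downloaded from YouTube
ac['audio'] = ac.apply(
    lambda r: f"/data/audiocaps/{r['youtube_id']}_{int(r['start_time'])}.wav", axis=1)

ac[['audio', 'caption']].to_csv('audioldm_dataset.csv', index=False)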

@EXPSTUDIOmo

Wondering the same as @mateusztobiasz.
The AudioCaps example is much richer and more complicated, but I guess we don't need all of that?
Thanks for the provided script anyway!

@mateusztobiasz

@EXPSTUDIOmo I ran the above script to structure my own dataset and put it in the right dirs. Then I ran this command: python3 audioldm_train/train/latent_diffusion.py -c audioldm_train/config/2023_08_23_reproduce_audioldm/audioldm_original_medium.yaml --reload_from_ckpt data/checkpoints/audioldm-m-full.ckpt
It didn't fail, so it seems this script could be enough. I'm not 100% sure, though, because the process was killed when it got to backpropagation (it consumed 15 GB of VRAM, so I'm going to try the smaller model or adjust the config file). Regarding this, I created #48 because I haven't found any info about hardware requirements.
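(As an aside, a quick way to check how much VRAM your GPU has before launching training, assuming PyTorch is installed:)

import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB total VRAM")
else:
    print("No CUDA device visible")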

@mateusztobiasz

Any updates on the topic we are discussing?
