-
This method is very resource intensive. The reference max_len of 800 needed an 80 GB card. The minimum acceptable max_len of 200 gets you down to somewhere between 13 GB and 16 GB. You could try the DagsHub repo with batch size 1 and sharding, but the model may not function.
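For a smaller card, the knobs that matter most are max_len and batch_size. A minimal sketch of the relevant config fields (illustrative values based on the numbers above, not a tested recipe):

batch_size: 1 # smallest possible batch
max_len: 200 # maximum number of frames; the 200 minimum lands around 13-16 GB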
-
I have an $800 graphics card. Why can't I train? Also, some of my training losses aren't kicking in: I'm only getting mel loss. I have a basic configuration pointing to the correct paths and configs for the models used (ASR, pitch extractor, BERT). Has anyone hit a similar issue and can diagnose what the problem is?
config.yaml:
save_freq: 2
log_interval: 10
device: "cuda"
epochs_1st: 200 # number of epochs for first stage training (pre-training)
epochs_2nd: 100 # number of epochs for second stage training (joint training)
batch_size: 2
max_len: 400 # maximum number of frames
pretrained_model: ""
second_stage_load_pretrained: true # set to true if the pre-trained model is for 2nd stage
load_only_params: false # set to true if do not want to load epoch numbers and optimizer parameters
.......
model_params:
  multispeaker: false
  dim_in: 16
  hidden_dim: 512 # 512
  max_conv_dim: 512
  n_layer: 3
  n_mels: 80
  n_token: 178 # number of phoneme tokens
  max_dur: 50 # maximum duration of a single phoneme
  style_dim: 128 # style vector size
  dropout: 0.2
launch command:
accelerate launch --mixed_precision=fp16 train_first.py --config_path ./Configs/config.yml
console:
............
Epoch [1/200], Step [840/853], Mel Loss: 0.51713, Gen Loss: 0.00000, Disc Loss: 0.00000, Mono Loss: 0.00000, S2S Loss: 0.00000, SLM Loss: 0.00000
Time elasped: 252.5265998840332
.............
Then training just freezes - not enough memory. Womp womp.
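If the freeze really is the card running out of memory, logging peak VRAM per step makes it visible. A minimal diagnostic sketch using standard PyTorch CUDA utilities (the report_vram helper and where to call it are assumptions, not part of train_first.py):

import torch

def report_vram(tag):
    # Assumed helper, not part of the repo: print peak and free GPU memory so you
    # can see whether batch_size: 2 with max_len: 400 exceeds the card's capacity.
    if not torch.cuda.is_available():
        print(f"[{tag}] CUDA not available")
        return
    free_b, total_b = torch.cuda.mem_get_info()
    peak_gb = torch.cuda.max_memory_allocated() / 1e9
    print(f"[{tag}] peak allocated: {peak_gb:.2f} GB | free: {free_b / 1e9:.2f} GB of {total_b / 1e9:.2f} GB")

Calling torch.cuda.reset_peak_memory_stats() at the start of an epoch and report_vram("step") at each logged step will show whether the peak climbs toward the card's total as longer batches come through; if it does, lowering max_len or batch_size is the usual fix.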