parse_output error with Blizzard2013 data #104

Open
jinhonglu opened this issue Jul 17, 2021 · 0 comments

Hi, I am trying to train Mellotron on the Blizzard2013 dataset. I aligned the audio with an alignment tool, and each resulting audio clip is about 15-25 s long.

However, training fails with the following error in parse_output:

Traceback (most recent call last):
  File "train.py", line 286, in <module>
    args.warm_start, args.n_gpus, args.rank, args.group_name, hparams)
  File "train.py", line 210, in train
    y_pred = model(x)
  File "Desktop/py3_env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "Desktop/py3_env/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "Desktop/py3_env/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "Desktop/py3_env/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "Desktop/py3_env/lib/python3.6/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 1 on device 5.
Original Traceback (most recent call last):
  File "Desktop/py3_env/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "Desktop/py3_env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "Desktop/PDAEmotion/mellotron/model.py", line 632, in forward
    output_lengths)
  File "Desktop/PDAEmotion/mellotron/model.py", line 603, in parse_output
    outputs[0].data.masked_fill_(mask, 0.0)
RuntimeError: The expanded size of the tensor (891) must match the existing size (349) at non-singleton dimension 2.  Target sizes: [16, 80, 891].  Tensor sizes: [16, 80, 349]

From reading the paper, I understand that the original implementation uses audio clips shorter than 10 s. Is this error caused by the length of the audio in my dataset, or by something else?

How should I fix this?

Also, I changed some of the code to support multiple GPUs with DataParallel:

import torch
from numpy import finfo
from torch.nn import DataParallel

def load_model(hparams):
    # Primary device; DataParallel expects the model on device_ids[0].
    device = torch.device('cuda:4')
    model = Tacotron2(hparams).to(device)
    if hparams.fp16_run:
        model.decoder.attention_layer.score_mask_value = finfo('float16').min

    if torch.cuda.device_count() > 1:
        model = DataParallel(model, device_ids=[4, 5])
    return model
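
One possible explanation, which I have not confirmed: the shapes in the error look like what DataParallel would produce rather than a limit on audio length. DataParallel splits each batch along dimension 0, so each replica only sees its own slice of output_lengths, while the mel tensors stay padded to the longest item in the whole batch. If the padding mask is sized from the largest length the replica receives, a replica that does not hold the longest item builds a mask that is too narrow. The framework-free sketch below reproduces only that shape arithmetic. The helper names (`chunk`, `length_mask`), the `max_len` parameter, and the four-item batch are hypothetical; only 891 and 349 come from the error message.

```python
def chunk(seq, n):
    """Split seq into n contiguous, near-equal chunks (like a dim-0 scatter)."""
    size = -(-len(seq) // n)  # ceiling division
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def length_mask(lengths, max_len=None):
    """Boolean mask, True for valid frames.

    The width defaults to max(lengths); pass max_len to force the padded width.
    """
    width = max(lengths) if max_len is None else max_len
    return [[t < n for t in range(width)] for n in lengths]


# Hypothetical mel lengths for a 4-item batch; 891 and 349 are from the error.
mel_lengths = [891, 640, 349, 300]
padded_width = max(mel_lengths)  # every mel in the batch is padded to 891 frames

replicas = chunk(mel_lengths, 2)  # one slice of lengths per GPU

# Mask width when each replica uses only its own lengths.
local_widths = [len(length_mask(r)[0]) for r in replicas]
# Mask width when the padded width is passed in explicitly.
fixed_widths = [len(length_mask(r, max_len=padded_width)[0]) for r in replicas]

print(local_widths)  # [891, 349]
print(fixed_widths)  # [891, 891]
```

If this is what is happening, sizing the mask from the padded mel width (for example, the size of the last dimension of the output tensor) instead of from the local maximum length should make the shapes agree on every replica. Training on a single GPU, or with the distributed path that train.py already exposes through n_gpus and rank, would be a quick way to check whether DataParallel is involved at all.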

Thank you.
