
--cuda option with strange results #356

Closed
fmobrj opened this issue Jan 3, 2019 · 13 comments

Hello. Thanks for the great work.

When I try to train without using my GPU, i.e. without the --cuda option set, training seems to proceed normally, with the loss slowly dropping, as I would expect. As expected, nvidia-smi shows no GPU activity.

$ python train.py --train-manifest data/libri_train_manifest.csv --val-manifest data/libri_val_manifest.csv

Training results without the --cuda option:

[screenshot: training output without --cuda, loss decreasing]

When I use the --cuda option, nvidia-smi shows GPU usage and training advances much faster, but the results are strange: the loss starts at 0 and stays there for the entire training run:

[screenshot: training output with --cuda, loss stuck at 0]

It seems to be some kind of problem with the input tensor. Any hints?

Best regards,
Fabio.

fmobrj commented Jan 6, 2019

Hello yuxwang1102, I found this binding implementation for baidu warp-ctc: https://github.com/jpuigcerver/pytorch-baidu-ctc.

I just replaced the warp-ctc import with this one, and now I get losses similar to the CPU run when using the --cuda option.

I installed the new package, then replaced

from warpctc_pytorch import CTCLoss

with

from torch_baidu_ctc import ctc_loss, CTCLoss

fmobrj commented Jan 7, 2019

You are welcome, my friend.

>>> import torch
>>> torch.__version__
'1.0.0'
>>> torch.version.cuda
'9.0.176'

gcc version 6.5.0 20181026 (Ubuntu 6.5.0-2ubuntu1~16.04)

miguelvr commented Jan 7, 2019

Hey guys, AFAIK both repos work with PyTorch 0.4.x and weren't really tested with PyTorch 1.0.

SeanNaren (Owner) commented:

@miguelvr do you think that is the issue? Sorry for the silence; I'll hopefully get time to address this ASAP.

fmobrj commented Jan 7, 2019

Thanks, Miguel!

For some reason, I couldn't make it work with --cuda, whether on torch 0.4.x or 1.0.0, using the warp-ctc binding from the README instructions. After trying everything (updating the gcc version, multiple torch versions, even different Python versions), I found the alternative repo for the CTC bindings, tried it with 1.0.0, and it is working for me.

I suspect it has something to do with the CUDA driver.

Hope yuxwang1102 can figure out an environment that works for him.

miguelvr commented Jan 7, 2019

@SeanNaren no clue, but it would be worth a try. Also, @fmobrj seems to be using Windows, so there's that.

fmobrj commented Jan 7, 2019

@miguelvr I am using Ubuntu 18.04.1.

miguelvr commented Jan 7, 2019

oh sorry, my bad.

fmobrj commented Jan 7, 2019

No problem.

fmobrj commented Jan 7, 2019

> @miguelvr do you think that is the issue? Sorry for the silence; I'll hopefully get time to address this ASAP.

Thank you very much, @SeanNaren.

SeanNaren (Owner) commented:

Could you change sound.shape[1] == 1: to sound.shape[0] == 1: at: https://github.com/SeanNaren/deepspeech.pytorch/blob/master/data/data_loader.py#L26 and tell me if it works?
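
To illustrate why that index matters (a toy example, not code from the repo, assuming newer torchaudio returns audio shaped channels x samples):

# Illustration only: a mono clip loaded as (channels, samples) is (1, N), not (N, 1).
import torch

mono = torch.zeros(1, 16000)    # stand-in for a mono waveform from torchaudio, shape (channels, samples)
print(mono.shape[1] == 1)       # False -> the old sound.shape[1] check no longer detects mono audio
print(mono.shape[0] == 1)       # True  -> the suggested sound.shape[0] check does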

SeanNaren (Owner) commented:

Well, that's a mess... I'll investigate further to figure out what happened. It looks like the dimensions coming out of torchaudio have been swapped, so we'll just have to transpose before doing any other transformations. Will push a fix once I verify it works!
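
Something along these lines, just as a sketch of the idea (not the final fix; it paraphrases the surrounding loader code from memory and assumes the rest of the pipeline expects samples along dimension 0):

# Hypothetical sketch: flip torchaudio's (channels, samples) output back to
# (samples, channels) before the existing shape checks run.
import torchaudio

def load_audio(path):
    sound, _ = torchaudio.load(path)    # assumed shape: (channels, samples)
    sound = sound.numpy().T             # transpose to (samples, channels)
    if len(sound.shape) > 1:
        if sound.shape[1] == 1:
            sound = sound.squeeze()     # mono: drop the channel axis
        else:
            sound = sound.mean(axis=1)  # multi-channel: average the channels
    return sound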

SeanNaren (Owner) commented:

Fixed on the master branch!
