--cuda option with strange results #356
Comments
Hello yuxwang1102, I found this binding implementation for Baidu warp-ctc: https://github.com/jpuigcerver/pytorch-baidu-ctc. I just replaced the warp-ctc import with this one, and now I get losses similar to the CPU run when using the --cuda option. I installed the new package, then replaced "from warpctc_pytorch import CTCLoss" with "from torch_baidu_ctc import ctc_loss, CTCLoss".
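For anyone trying the same swap, the call pattern looked roughly like this for me; it is only a sketch with made-up dummy tensors, and the constructor defaults and argument order are my reading of the torch_baidu_ctc README, so double-check against it:

```python
import torch
from torch_baidu_ctc import CTCLoss  # replaces: from warpctc_pytorch import CTCLoss

criterion = CTCLoss()  # used as a drop-in for the warpctc_pytorch criterion

# Dummy batch just to illustrate the expected layout:
acts = torch.randn(50, 4, 29, requires_grad=True)       # (seq_len, batch, num_classes)
targets = torch.randint(1, 29, (20,), dtype=torch.int)  # concatenated labels, blank = 0
act_lens = torch.tensor([50, 50, 50, 50], dtype=torch.int)  # per-sample input lengths
target_lens = torch.tensor([5, 5, 5, 5], dtype=torch.int)   # per-sample label lengths

loss = criterion(acts, targets, act_lens, target_lens)
loss.backward()
print(loss.item())
```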
You are welcome, my friend.
gcc version 6.5.0 20181026 (Ubuntu 6.5.0-2ubuntu1~16.04)
Hey guys, AFAIK both repos work with PyTorch 0.4.X and weren't really tested with PyTorch 1.0.
@miguelvr Do you think that is the issue? Sorry for the silence, I'll hopefully get time to address this ASAP.
Thanks, Miguel! For some reason, I couldn't make it work with --cuda, be it torch 0.4.X or 1.0.0, using the warp-ctc binding from the README instructions. After trying everything (updated gcc version, multiple torch versions, even Python versions), I found the alternative repo for the CTC bindings, tried it with 1.0.0, and it is working for me. I suspect it has something to do with the CUDA driver. Hope yux can figure out an environment that works for him.
@SeanNaren No clue, but it would be worth a try. Also, @fmobrj seems to be using Windows, so there's that.
@miguelvr I am using Ubuntu 18.04.1.
Oh sorry, my bad.
No problem.
Thank you very much, @SeanNaren.
Could you change …
Well, that's a mess... I'll investigate further into whatever happened; it seems like the dimensions have swapped from …
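(For context, Baidu warp-ctc expects activations laid out as (seq_len, batch, num_classes). A quick shape check before the loss call, sketched below with a hypothetical check_ctc_layout helper under that assumption, makes a silently transposed tensor easy to spot.)

```python
import torch

def check_ctc_layout(acts: torch.Tensor, batch_size: int) -> torch.Tensor:
    """Hypothetical helper: ensure activations are (seq_len, batch, num_classes)
    before handing them to a warp-ctc style CTCLoss."""
    assert acts.dim() == 3, "expected 3-D activations"
    if acts.size(0) == batch_size and acts.size(1) != batch_size:
        # Looks batch-first (batch, seq_len, classes): move time to the front.
        acts = acts.transpose(0, 1).contiguous()
    assert acts.size(1) == batch_size, "batch dimension is not where warp-ctc expects it"
    return acts
```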
Fixed on the master branch! |
Hello. Thanks for the great work.
When I try to train without using my GPU, i.e. without the --cuda option set, the training seems to go on normally, with the loss slowly dropping, as I would expect. As expected, nvidia-smi shows no GPU activity.
$ python train.py --train-manifest data/libri_train_manifest.csv --val-manifest data/libri_val_manifest.csv
Training results without the --cuda option:
When I use the --cuda option, nvidia-smi shows GPU usage and training advances much faster, but the results are strange. The loss starts at 0 and stays there for the entire run:
It seems to be some kind of problem with the input tensor. Any hints?
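A minimal way to compare the two cases, as a sketch only, assuming the warpctc_pytorch call convention of (seq_len, batch, num_classes) activations with CPU int tensors for labels and lengths:

```python
import torch
from warpctc_pytorch import CTCLoss

criterion = CTCLoss()

# Same random batch evaluated on CPU and on GPU; with a healthy binding the
# two losses should be close instead of the GPU one collapsing to 0.
acts = torch.randn(50, 4, 29)                            # (seq_len, batch, num_classes)
targets = torch.randint(1, 29, (20,), dtype=torch.int)   # concatenated labels, blank = 0
act_lens = torch.tensor([50, 50, 50, 50], dtype=torch.int)
target_lens = torch.tensor([5, 5, 5, 5], dtype=torch.int)

cpu_loss = criterion(acts, targets, act_lens, target_lens)
gpu_loss = criterion(acts.cuda(), targets, act_lens, target_lens)
print(cpu_loss.item(), gpu_loss.item())
```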
Best regards,
Fabio.