Add gradient filter for tdnn_lstm_ctc #565

Open
wants to merge 7 commits into base: master

Conversation

yaozengwei
Collaborator

@yaozengwei yaozengwei commented Sep 5, 2022

This PR adds the gradient filter for tdnn_lstm_ctc recipe. You could see #564 for details.
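The filter itself is described in #564; as a rough sketch of the general idea (the function name and the median-based reference norm here are illustrative assumptions, not the PR's actual implementation), a per-utterance gradient-norm filter zeroes out the gradient contribution of utterances whose norm is far above the rest of the batch:

```python
import math

def filter_batch_grads(grads, threshold):
    """Sketch of a gradient filter (illustrative, not the PR's code).

    grads: list of per-utterance gradient vectors (lists of floats).
    Any utterance whose gradient norm exceeds `threshold` times the
    batch median norm has its gradient zeroed, so a few pathological
    utterances cannot dominate the parameter update.
    """
    norms = [math.sqrt(sum(g * g for g in vec)) for vec in grads]
    ref = sorted(norms)[len(norms) // 2]  # median norm (upper-median for even n)
    return [vec if n <= threshold * ref else [0.0] * len(vec)
            for vec, n in zip(grads, norms)]
```

With a high threshold such as the grad_norm_threshold=100 used below, only extreme outliers are filtered, so training dynamics should be nearly unchanged unless a batch produces a truly anomalous gradient.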

@danpovey
Collaborator

danpovey commented Sep 6, 2022

@huangruizhe you can see whether this resolves your problem.

@huangruizhe
Contributor

huangruizhe commented Sep 9, 2022

Hi, I've tested the tdnn_lstm_ctc2 recipe with grad_norm_threshold=100, but the model behaves much like it did before the gradient filter was added: it diverges when the learning rate is 1e-3 (the recipe default), and only starts to converge when lr=1e-4.

Here is the tensorboard:

  1. Running this recipe directly (with grad_norm_threshold=100): tdnn_lstm_ctc2/train.py
    tensorboard

  2. Running the above configuration, and shuffling the whole librispeech train cuts.
    tensorboard

  3. The recipe before adding the gradient filter, and shuffling the whole librispeech train cuts: tdnn_lstm_ctc/train.py
    tensorboard

@danpovey
Collaborator

danpovey commented Sep 9, 2022

It will be hard to diagnose what's really going on here without looking at the diagnostics files (obtained by restarting from intermediate epochs and adding the flag --print-diagnostics=True; it should take about 5 minutes).

@yaozengwei
Collaborator Author

This recipe does not support using flag --print-diagnostics=True.

@danpovey
Collaborator

danpovey commented Sep 9, 2022

Ruizhe can figure out how to add the code from other recipes, and make a PR.
