Multi-condition multi-database English LVCSR recipe #870
Comments
I have started working on this issue.
@vijayaditya I am going to work on @guoguo12's branch #771 and put my stuff in local/multicondition. For the RIRs, I can let the user choose whether they want real RIRs or simulated RIRs.
@tomkocse: Let me know if you need any help getting the multi_en stuff set up!
@tomkocse cloning @guoguo12's branch is dangerous, as he will be rebasing his branch to ensure that his commits are not interleaved with other commits. This will enable him to do a squash merge when he wants to merge his branch. I would recommend the following: please continue working on other aspects of this issue as long as you can. Once @guoguo12's branch is relatively stable we will merge his recipe into master, as it will not break anything else. You could then start working on this project. Does that sound fine to you? @tomkocse In the meantime, could you please complete #716 and #552? I will try to complete the reviews of these PRs, or at least ask someone else to do it if I am unable to.
@tomkocse Also remember that we decided to just download all the simulated RIRs rather than preparing them in the recipe, so you need not write any scripts for the RIR preparation.
I am supposed to place the reverberation stuff in a new directory, local/multicondition, so I think the chance of conflicts with @guoguo12's existing files will be low.
@tomkocse Sorry for the delay w.r.t. the RIR and noise lists. Could you please create these lists to have a combination of all the different Kaldi non-table and table IO types
@vijayaditya The RIRs and noises are wave files, so yes, different IO types, such as the location of a wave file or a pipeline that creates the wave file, should be supported, but I don't understand why "an index into an ark file" is needed.
@tomkocse I thought support for ark files would be desirable, as we might have a lot of very short noise files which can be better stored in ark format. However, this is not immediately necessary, as the current noise database we have access to (NOISEX) has a comparatively small number of noise files. Could you please generate the files yourself? I am a bit swamped and @vimalmanohar might be busy too. Once you run this test I will check this PR.
OK
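For readers following the exchange above: the three input forms mentioned (a wave file location, a pipeline that creates the wave file, and an index into an ark file) correspond to the kinds of "rxfilename" that Kaldi accepts. The following is only a simplified sketch of how an entry in an RIR or noise list might be classified. It is not code from the recipe, the function name `classify_rxfilename` and the example paths are hypothetical, and Kaldi's own classification logic handles more corner cases.

```python
import re


def classify_rxfilename(rx: str) -> str:
    """Roughly classify a Kaldi-style rxfilename (simplified, illustrative only)."""
    # An empty string or "-" conventionally means standard input.
    if rx in ("", "-"):
        return "stdin"
    # Leading/trailing whitespace, or a leading "|", is not a readable input.
    if rx != rx.strip() or rx.startswith("|"):
        return "invalid"
    # A command ending in "|" is run and its output is read.
    if rx.endswith("|"):
        return "pipe"
    # "file:offset" points at a byte offset inside an archive (ark) file.
    if re.search(r":\d+$", rx):
        return "ark_offset"
    # Anything else is treated as a plain file path.
    return "file"


# Hypothetical list entries, one per input form discussed in the thread.
examples = [
    "/data/rir/room1.wav",
    "sox /data/noise/babble.wav -r 8000 -t wav - |",
    "/data/noise/noises.ark:1234",
]
for entry in examples:
    print(f"{classify_rxfilename(entry):10s} {entry}")
```

A list that mixes all three forms, as in `examples`, is one way to exercise every input path of a script that consumes the RIR and noise lists.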
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed by a bot strictly because of inactivity. This does not mean that we think that this issue is not important! If you believe it has been closed hastily, add a comment to the issue and mention @kkm000, and I'll gladly reopen it.
This is an extension to the Multi-database English LVCSR recipe being tracked in #699.
Previously we found that the ASpIRE models performed better than the fisher_english models when used off-the-shelf on new test sets. This is due to the data augmentation used in the ASpIRE recipe.
As the multi-database English LVCSR recipe is shaping up, I think it would be better to extend this recipe to have multi-condition training. This recipe would reside in the same directory as multi_en recipe (#699). It would involve the creation of a new subdirectory local/multicondition which will house the scripts to download the data necessary for simulating reverberation and noise conditions (see #552) and nnet3 recipes (xent, xent+sMBR and chain) for acoustic model training.
This issue has been created to track the progress of this recipe.
The models trained using this recipe could most probably be our best off-the-shelf models, so the person involved in this project might learn about a lot of interesting research problems when these models are used by the community-at-large.
It would involve