Dropout schedule in nnet3 training scripts #1247

Closed
danpovey opened this issue Dec 4, 2016 · 28 comments

Comments

@danpovey
Contributor

danpovey commented Dec 4, 2016

Recently, @GaofengCheng has been doing some interesting experiments with dropout and BLSTMs, and getting nice improvements. He was using a dropout schedule in which you start with zero dropout, ramp up to 0.2, and then go back to zero at the very end.

I have been thinking about the best and most flexible way to support general dropout schedules in the training scripts. @vimalmanohar, since you are now the 'owner' of the python training scripts, it would be best if you take this on.

Here is my proposal.

Firstly, the --set-dropout-proportion (or whatever it is) option to nnet3*-copy is (or should be) deprecated. The way I want to do this is by adding an option to the '--edits-config' file. See ReadEditConfig() in nnet-utils.h. The option should have the following documentation in the comment there:

     set-dropout-proportion [name=<name-pattern>] proportion=<dropout-proportion>
        Sets the dropout rates for any components of type DropoutComponent whose
        names match the given <name-pattern> (e.g. lstm*).  <name-pattern> defaults to "*".
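As a rough illustration of the name-pattern semantics described above, here is a minimal Python sketch that selects component names with a shell-style wildcard. It assumes that `fnmatch`-style globbing is a fair stand-in for whatever matching the C++ code uses; the function `set_dropout_proportion` and the dict of component names are hypothetical and exist only for this example.

```python
from fnmatch import fnmatchcase


def set_dropout_proportion(components, proportion, name="*"):
    """Return a copy of `components` (component name -> dropout proportion)
    with the proportion replaced for every name matching the glob `name`."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be in [0, 1]")
    return {comp: (proportion if fnmatchcase(comp, name) else old)
            for comp, old in components.items()}


# Hypothetical component names, for illustration only.
components = {"lstm1.dropout": 0.0, "lstm2.dropout": 0.0, "tdnn1.dropout": 0.0}
updated = set_dropout_proportion(components, 0.2, name="lstm*")
print(updated)
```

Leaving out `name` falls back to "*", which matches every component, mirroring the default stated in the proposed documentation.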

The documentation for the python-training-script option would read something like the following:

    parser.add_argument("--trainer.dropout-schedule", type=str,
                        dest='dropout_schedule', default='',
                        help="""Use this to specify the dropout schedule.  You specify
        a piecewise linear function on the domain [0,1], where 0 is the start
        and 1 is the end of training; the function-argument (x) rises linearly with
        the amount of data you have seen, not iteration number (this improves
        invariance to num-jobs-{initial-final}).  E.g. '0,0.2,0' means 0 at the
        start; 0.2 after seeing half the data; and 0 at the end.  You may
        specify the x-value of selected points, e.g. '0,0.2@0.25,0' means
        that the 0.2 dropout-proportion is reached a quarter of the way through the
        data.   The start/end x-values are at x=0/x=1, and other unspecified x-values
        are interpolated between known x-values.  You may specify different rules
        for different component-name patterns using 'pattern1=func1 pattern2=func2',
        e.g. 'relu*=0,0.1,0 lstm*=0,0.2,0'.  More general patterns should precede
        less general ones, as they are applied sequentially.""")
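To make the schedule syntax in the help text concrete, here is a minimal sketch of how such a string could be parsed and evaluated. It is not the code that ended up in the training scripts; the function names `parse_schedule`, `dropout_at` and `parse_dropout_option` are hypothetical, and the choices to space unspecified x-values evenly between known neighbours and to reject single-value schedules are assumptions.

```python
def parse_schedule(spec):
    """Parse e.g. '0,0.2@0.25,0' into a list of (x, dropout) points."""
    values, xs = [], []
    for token in spec.split(","):
        value, _, x = token.strip().partition("@")
        values.append(float(value))
        xs.append(float(x) if x else None)
    if len(values) < 2:
        raise ValueError("need at least a start and an end value: " + spec)
    # The start and end points sit at x=0 and x=1 unless given explicitly.
    if xs[0] is None:
        xs[0] = 0.0
    if xs[-1] is None:
        xs[-1] = 1.0
    # Fill each run of unspecified x-values by spacing it evenly between
    # the known x-values on either side (an assumption of this sketch).
    i = 1
    while i < len(xs):
        if xs[i] is None:
            j = i
            while xs[j] is None:
                j += 1
            left, right, gap = xs[i - 1], xs[j], j - (i - 1)
            for k in range(i, j):
                xs[k] = left + (right - left) * (k - (i - 1)) / gap
            i = j
        else:
            i += 1
    if any(a >= b for a, b in zip(xs, xs[1:])):
        raise ValueError("x-values must be strictly increasing: " + spec)
    return list(zip(xs, values))


def dropout_at(points, data_fraction):
    """Evaluate the piecewise linear function at a data fraction in [0, 1]."""
    if data_fraction <= points[0][0]:
        return points[0][1]
    if data_fraction >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= data_fraction <= x1:
            return y0 + (y1 - y0) * (data_fraction - x0) / (x1 - x0)


def parse_dropout_option(option):
    """Parse 'pattern1=func1 pattern2=func2' into (pattern, points) rules;
    a bare function with no pattern applies to '*'."""
    rules = []
    for part in option.split():
        pattern, sep, func = part.rpartition("=")
        rules.append((pattern if sep else "*", parse_schedule(func)))
    return rules


points = parse_schedule("0,0.2@0.25,0")
for fraction in (0.0, 0.25, 0.5, 1.0):
    print(fraction, dropout_at(points, fraction))
```

The rules are kept as an ordered list rather than a dict so that they can be applied sequentially, as the help text requires.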

I suggest turning this into a command-line option to nnet3-copy or nnet3-am-copy that looks like the following, to avoid having to create lots of little config files:

--edits-config='echo "set-dropout-proportion name=lstm* proportion=0.113"; echo "set-dropout-proportion name=tdnn* proportion=0.575"|'

The double-quotes are just a bit of paranoia, to avoid bash globbing in case a file like 'name=lstmX' exists, but of course this does avoid some directory I/O.
I'd be OK with placing the parsing of the option in the inner part of the python code even if this means it's done multiple times, if this helps keep the code structure clean; I don't think the time taken is significant in the overall scheme of things.
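For illustration, the option string shown above could be assembled from already-computed proportions along these lines. This is only a sketch: the helper name `edits_config_option` is hypothetical, and it assumes the patterns contain no quote characters.

```python
def edits_config_option(proportions):
    """Build an --edits-config option that pipes echo commands to the binary.

    `proportions` is an ordered list of (name-pattern, dropout-proportion)
    pairs; the order is preserved because the edits are applied sequentially.
    """
    commands = [
        'echo "set-dropout-proportion name={0} proportion={1}"'.format(pattern, prop)
        for pattern, prop in proportions
    ]
    # The trailing '|' marks the string as a command whose output is read.
    return "--edits-config='" + "; ".join(commands) + "|'"


option = edits_config_option([("lstm*", 0.113), ("tdnn*", 0.575)])
print(option)
```

Each `set-dropout-proportion` line is wrapped in double quotes inside the single-quoted option, matching the quoting used in the example above.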

@GaofengCheng
Contributor

@danpovey @vimalmanohar I think it would be interesting to add this capability to the existing dropout:
supporting different schedules, e.g. [0, 0.2, 0] and [0, 1.0, 0], within one training run.
We could then control the monotonicity (rising or falling) of the schedule for specific dropout components.

@GaofengCheng
Contributor

@danpovey could you give me some guidance on how to set a random matrix by row in Kaldi? I saw a CuMatrix function, ApplyHeaviside. If you tell me the name of the function you would use to implement this, I can do it myself.

@danpovey
Contributor Author

danpovey commented Dec 7, 2016 via email

@danpovey
Contributor Author

[note: to some extent this is a response to discussions that have been happening by email or on @vimalmanohar's repo.]

The basic situation is that @GaofengCheng has been doing a lot of experiments investigating how to do dropout in BLSTMs and different dropout schedules, and is getting some really nice improvements (around 1% absolute); and I believe his best current setup is based on just putting conventional dropout after the rp_t component (the component that combines the 'r' and 'p' matrices in projected LSTMs). [BTW, @GaofengCheng, you might want to try putting it just on the 'r' or 'p' parts if you haven't tried that already; that may require a bit of messing around with dim-range components. It's possible to split things apart using dim-range nodes, and then append them back together using Append.]

I have been thinking about the best next-steps to take with regards to this dropout-schedule stuff, and getting it merged to master in the nicest way.
I think @vimalmanohar should be in charge of this since he is kind of taking the lead on the nnet3 python-script maintenance and development. What I'm thinking is we could just use what we've learned from @GaofengCheng's experiments but (if Vimal feels it is best) modify the python code from a clean start if that is more conducive to getting things done fast. [Also, I think @GaofengCheng was using the pre-xconfig scripts, which we shouldn't be messing with at this point.]
What I'm thinking, @vimalmanohar, is that we can give the various LSTM xconfig classes a string-valued config called 'dropout', defaulting to None, which you would set to 'rp' to do dropout as Gaofeng is currently recommending (i.e. on the output of the 'rp' component). We need to make sure this works in the new, 'fast' LSTM component as well as the old one. The use of a string-valued config will mean this is extensible to any new setup that Gaofeng comes up with.
Since we were not seeing great results for the 'whole-frame' dropout, let's not consider merging any of that just yet; we'll merge it to master if it turns out to give a benefit in some setup.

@vijayaditya, you may want to chime in if you disagree with this plan.

@GaofengCheng
Contributor

@danpovey I added the dropout on the input of 'rp', i.e. before the LSTM projection, but I can try it on the output of rp right now and see the effect (this may be better than on the input of rp, because the dropout will then act directly on the LSTM gates). @vimalmanohar, as for where to place the dropout, you can refer to lstm.py in vimalmanohar#8.

@danpovey
Contributor Author

danpovey commented Dec 23, 2016 via email

@GaofengCheng
Contributor

@danpovey yes, the input of the LSTM dropout is m_t.

@vimalmanohar
Contributor

vimalmanohar commented Dec 23, 2016 via email

@13265170340

How do I add a dropout module to the TDNN script?

@13265170340

https://github.com/kaldi-asr/kaldi/blob/master/egs/swbd/s5c/local/chain/tuning/run_tdnn_7q.sh
https://github.com/kaldi-asr/kaldi/blob/master/egs/swbd/s5c/local/chain/tuning/run_tdnn_7p.sh
https://github.com/kaldi-asr/kaldi/blob/master/egs/swbd/s5c/local/chain/tuning/run_tdnn_7o.sh

local/nnet3/run_tdnn3.sh: creating neural net configs
tree-info exp/tri5a_sp_ali/tree
steps/nnet3/xconfig_to_configs.py --xconfig-file exp/nnet3/tdnn_sp_2/configs/network.xconfig --config-dir exp/nnet3/tdnn_sp_2/configs/
ERROR:root:***Exception caught while parsing the following xconfig line:
*** relu-batchnorm-dropout-layer name=tdnn1 l2-regularize=0.004 dropout-proportion=0.0 dropout-per-dim=true dropout-per-dim- continuous=true dim=850

Traceback (most recent call last):
  File "steps/nnet3/xconfig_to_configs.py", line 333, in <module>
    main()
  File "steps/nnet3/xconfig_to_configs.py", line 323, in main
    all_layers = xparser.read_xconfig_file(args.xconfig_file, existing_layers)
  File "steps/libs/nnet3/xconfig/parser.py", line 189, in read_xconfig_file
    this_layer = xconfig_line_to_object(line, existing_layers)
  File "steps/libs/nnet3/xconfig/parser.py", line 96, in xconfig_line_to_object
    return config_to_layer[first_token](first_token, key_to_value, prev_layers)
  File "steps/libs/nnet3/xconfig/basic_layers.py", line 706, in __init__
    XconfigLayerBase.__init__(self, first_token, key_to_value, prev_names)
  File "steps/libs/nnet3/xconfig/basic_layers.py", line 68, in __init__
    self.set_configs(key_to_value, all_layers)
  File "steps/libs/nnet3/xconfig/basic_layers.py", line 97, in set_configs
    "" .format(key, value, self.layer_type, configs))
RuntimeError: Configuration value continuous=true was not expected in layer of type relu-batchnorm-dropout-layer; allowed configs with their defaults: self-repair-scale->1e-05 l2-regularize->"" add-log-stddev->False ng-linear-options->"" bias-stddev->"" bottleneck-dim->-1 dropout-per-dim->False dim->-1 max-change->0.75 ng-affine-options->"" learning-rate-factor->"" dropout-per-dim-continuous->False input->"[-1]" dropout-proportion->0.5 target-rms->1.0

I get this xconfig error when adding a new layer to the TDNN model.
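The error message names `continuous=true` as the unexpected value, while the list of allowed configs includes `dropout-per-dim-continuous`. This suggests that the space in `dropout-per-dim- continuous=true` split one option into two tokens. A minimal sketch of that effect follows, assuming a simplified whitespace tokenizer rather than the actual xconfig parser; `find_unexpected_tokens` and the `ALLOWED` subset are hypothetical.

```python
# Hypothetical subset of accepted keys, taken from the error message above
# plus 'name'; the real layer accepts more options.
ALLOWED = {"name", "dim", "l2-regularize", "dropout-proportion",
           "dropout-per-dim", "dropout-per-dim-continuous"}


def find_unexpected_tokens(xconfig_line, allowed=ALLOWED):
    """Split an xconfig line on whitespace and return every token that is
    not a key=value pair with an allowed key."""
    tokens = xconfig_line.split()[1:]  # the first token is the layer type
    unexpected = []
    for token in tokens:
        key, sep, _ = token.partition("=")
        if not sep or key not in allowed:
            unexpected.append(token)
    return unexpected


bad_line = ("relu-batchnorm-dropout-layer name=tdnn1 l2-regularize=0.004 "
            "dropout-proportion=0.0 dropout-per-dim=true "
            "dropout-per-dim- continuous=true dim=850")
good_line = bad_line.replace("dropout-per-dim- continuous", "dropout-per-dim-continuous")

print(find_unexpected_tokens(bad_line))
print(find_unexpected_tokens(good_line))
```

Removing the stray space so that the option reads `dropout-per-dim-continuous=true` leaves no unexpected tokens in this sketch.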

@danpovey
Contributor Author

danpovey commented Nov 1, 2018 via email

@13265170340

13265170340 commented Nov 2, 2018

steps/nnet3/decode.sh --nj 40 --cmd run.pl --online-ivector-dir exp/nnet3/ivectors_dev exp/tri5a/graph data/dev_hires exp/nnet3/tdnn_sp_2/decode_dev
steps/nnet3/decode.sh: feature type is raw
bash: line 1: 46146 Segmentation fault      (core dumped) ( nnet3-latgen-faster --online-ivectors=scp:exp/nnet3/ivectors_dev/ivector_online.scp --online-ivector-period=10 --frames-per-chunk=50 --extra-left-context=0 --extra-right-context=0 --extra-left-context-initial=-1 --extra-right-context-final=-1 --minimize=false --max-active=7000 --min-active=200 --beam=15.0 --lattice-beam=8.0 --acoustic-scale=0.1 --allow-partial=true --word-symbol-table=exp/tri5a/graph/words.txt exp/nnet3/tdnn_sp_2/final.mdl exp/tri5a/graph/HCLG.fst "ark,s,cs:apply-cmvn --norm-means=false --norm-vars=false --utt2spk=ark:data/dev_hires/split40/34/utt2spk scp:data/dev_hires/split40/34/cmvn.scp scp:data/dev_hires/split40/34/feats.scp ark:- |" "ark:|gzip -c >exp/nnet3/tdnn_sp_2/decode_dev/lat.34.gz" ) 2>> exp/nnet3/tdnn_sp_2/decode_dev/log/decode.34.log >> exp/nnet3/tdnn_sp_2/decode_dev/log/decode.34.log

LOG (nnet3-latgen-faster[5.5.88~3-8e30f]:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 13 orphan nodes.
LOG (nnet3-latgen-faster[5.5.88~3-8e30f]:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 20 orphan components.
LOG (nnet3-latgen-faster[5.5.88~3-8e30f]:Collapse():nnet-utils.cc:1378) Added 7 components, removed 20
apply-cmvn --norm-means=false --norm-vars=false --utt2spk=ark:data/dev_hires/split40/1/utt2spk scp:data/dev_hires/split40/1/cmvn.scp scp:data/dev_hires/split40/1/feats.scp ark:- 

Thank you, the previous problem has been solved. However, there is now a problem with decoding.

@danpovey
Contributor Author

danpovey commented Nov 2, 2018

I suggest to cd to src/, do "make depend -j 10" and "make -j 10" to minimize the chance of compilation errors, and try again. If that doesn't work, get it in gdb and show me a stack trace: gdb --args (program) (args), then "r", then "bt" when it crashes. E.g.

gdb --args nnet3-latgen-faster --online-ivectors=scp:exp/n.....
(gdb) r
...
(gdb) bt

@13265170340

yuyin@yuyin-Super-Server:/kaldi-trunk1/egs/aishell/s5$ gdb --args nnet3-latgen-faster --online-ivectors=scp:exp/nnet3/ivectors_dev/ivector_online.scp --online-ivector-period=10 --frames-per-chunk=50 --extra-left-context=0 --extra-right-context=0 --extra-left-context-initial=-1 --extra-right-context-final=-1 --minimize=false --max-active=7000 --min-active=200 --beam=15.0 --lattice-beam=8.0 --acoustic-scale=0.1 --allow-partial=true --word-symbol-table=exp/tri5a/graph/words.txt exp/nnet3/tdnn_sp_2/final.mdl exp/tri5a/graph/HCLG.fst "ark,s,cs:apply-cmvn --norm-means=false --norm-vars=false --utt2spk=ark:data/dev_hires/split40/17/utt2spk scp:data/dev_hires/split40/17/cmvn.scp scp:data/dev_hires/split40/17/feats.scp ark:- |" "ark:|gzip -c >exp/nnet3/tdnn_sp_2/decode_dev/lat.17.gz"
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
nnet3-latgen-faster: No such file or directory.
(gdb)

That did not solve the problem, and I do not know how to use gdb.

@jtrmal
Contributor

jtrmal commented Nov 2, 2018 via email

@13265170340

yuyin@yuyin-Super-Server:~/kaldi-trunk1$ g++ -g -o nnet3-latgen-faster nnet3-latgen-faster.cc
g++: error: nnet3-latgen-faster.cc: No such file or directory
g++: fatal error: no input files

Does GDB support shell scripts?

when in gdb, type 'run' and when/if it crashes, type 'bt' and paste the output of that command -- that is what dan is looking for. y.

@jtrmal
Contributor

jtrmal commented Nov 2, 2018

I think you are confusing g++ and gdb.

@13265170340

I know, Dan, but I do not know how to use gdb.

@13265170340

yuyin@yuyin-Super-Server:/kaldi-trunk1/src/nnet3bin$ gdb nnet3-latgen-faster
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from nnet3-latgen-faster...done.

(gdb) r --online-ivectors=scp:exp/nnet3/ivectors_dev/ivector_online.scp --online-ivector-period=10 --frames-per-chunk=50 --extra-left-context=0 --extra-right-context=0 --extra-left-context-initial=-1 --extra-right-context-final=-1 --minimize=false --max-active=7000 --min-active=200 --beam=15.0 --lattice-beam=8.0 --acoustic-scale=0.1 --allow-partial=true --word-symbol-table=exp/tri5a/graph/words.txt exp/nnet3/tdnn_sp_2/final.mdl exp/tri5a/graph/HCLG.fst "ark,s,cs:apply-cmvn --norm-means=false --norm-vars=false --utt2spk=ark:data/dev_hires/split40/17/utt2spk scp:data/dev_hires/split40/17/cmvn.scp scp:data/dev_hires/split40/17/feats.scp ark:- |" "ark:|gzip -c >exp/nnet3/tdnn_sp_2/decode_dev/lat.17.gz"
Starting program: /home/yuyin/kaldi-trunk1/src/nnet3bin/nnet3-latgen-faster --online-ivectors=scp:exp/nnet3/ivectors_dev/ivector_online.scp --online-ivector-period=10 --frames-per-chunk=50 --extra-left-context=0 --extra-right-context=0 --extra-left-context-initial=-1 --extra-right-context-final=-1 --minimize=false --max-active=7000 --min-active=200 --beam=15.0 --lattice-beam=8.0 --acoustic-scale=0.1 --allow-partial=true --word-symbol-table=exp/tri5a/graph/words.txt exp/nnet3/tdnn_sp_2/final.mdl exp/tri5a/graph/HCLG.fst "ark,s,cs:apply-cmvn --norm-means=false --norm-vars=false --utt2spk=ark:data/dev_hires/split40/17/utt2spk scp:data/dev_hires/split40/17/cmvn.scp scp:data/dev_hires/split40/17/feats.scp ark:- |" "ark:|gzip -c >exp/nnet3/tdnn_sp_2/decode_dev/lat.17.gz"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
/home/yuyin/kaldi-trunk1/src/nnet3bin/nnet3-latgen-faster --online-ivectors=scp:exp/nnet3/ivectors_dev/ivector_online.scp --online-ivector-period=10 --frames-per-chunk=50 --extra-left-context=0 --extra-right-context=0 --extra-left-context-initial=-1 --extra-right-context-final=-1 --minimize=false --max-active=7000 --min-active=200 --beam=15.0 --lattice-beam=8.0 --acoustic-scale=0.1 --allow-partial=true --word-symbol-table=exp/tri5a/graph/words.txt exp/nnet3/tdnn_sp_2/final.mdl exp/tri5a/graph/HCLG.fst 'ark,s,cs:apply-cmvn --norm-means=false --norm-vars=false --utt2spk=ark:data/dev_hires/split40/17/utt2spk scp:data/dev_hires/split40/17/cmvn.scp scp:data/dev_hires/split40/17/feats.scp ark:- |' 'ark:|gzip -c >exp/nnet3/tdnn_sp_2/decode_dev/lat.17.gz'
ERROR (nnet3-latgen-faster[5.5.88~3-8e30f]:Input():kaldi-io.cc:756) Error opening input stream exp/nnet3/tdnn_sp_2/final.mdl

[ Stack-Trace: ]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::FatalMessageLogger::~FatalMessageLogger()
kaldi::Input::Input(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool*)
main
__libc_start_main
_start

ERROR (nnet3-latgen-faster[5.5.88~3-8e30f]:Input():kaldi-io.cc:756) Error opening input stream exp/nnet3/tdnn_sp_2/final.mdl

[ Stack-Trace: ]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::MessageLogger::~MessageLogger()
kaldi::FatalMessageLogger::~FatalMessageLogger()
kaldi::Input::Input(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool*)
main
__libc_start_main
_start

@13265170340

Is the training file final.mdl wrong?

@jtrmal
Contributor

jtrmal commented Nov 2, 2018

you are running it from a different directory, probably
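The paths in the command (for example exp/nnet3/tdnn_sp_2/final.mdl) are relative, so they resolve against the current working directory; when the program is started from src/nnet3bin instead of the recipe directory, they point at files that do not exist there. A minimal sketch of a check for this, with the helper name `missing_inputs` and the temporary directory layout being hypothetical:

```python
import os
import tempfile


def missing_inputs(paths, workdir=None):
    """Return the relative paths that do not exist when resolved against
    `workdir` (defaults to the current working directory)."""
    base = workdir if workdir is not None else os.getcwd()
    return [p for p in paths if not os.path.exists(os.path.join(base, p))]


# Simulate a recipe directory containing exp/final.mdl.
with tempfile.TemporaryDirectory() as recipe_dir:
    os.makedirs(os.path.join(recipe_dir, "exp"))
    open(os.path.join(recipe_dir, "exp", "final.mdl"), "w").close()
    from_recipe_dir = missing_inputs(["exp/final.mdl"], workdir=recipe_dir)
    from_other_dir = missing_inputs(["exp/final.mdl"],
                                    workdir=os.path.join(recipe_dir, "exp"))

print(from_recipe_dir)
print(from_other_dir)
```

The same relative path is found from the recipe directory but reported missing from any other directory, which matches the "Error opening input stream" message in the gdb session.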

@danpovey
Contributor Author

danpovey commented Nov 2, 2018

Please get someone local to help you. We are busy and we don't have time to deal with people who don't know basic things like how to use a debugger, and there must be people in your lab who know this stuff.

@13265170340

Thank you. The problem has been solved; it was caused by the previous model not having been updated.

@13265170340

I would like to ask which papers describe the dropout algorithm used in Kaldi.

@danpovey
Contributor Author

danpovey commented Nov 9, 2018 via email

@13265170340

There are different forms available. If you are asking about the one used in the TDNN-F scripts, which is continuous and shared across time, look at my publications page, it may possibly be described in the paper on factorized TDNNs with Gaofeng Cheng as a co-author. There is also more conventional dropout. Dan

Yes, about TDNN.
