WIP: Multi-database English LVCSR recipe #771
Conversation
Two minor suggestions. Whenever you copy a script, it would be very convenient to provide the source path in a header comment, noting the original location, the commit, and that minor modifications were made; see the sketch below. This is convenient because people familiar with the original scripts can skip the unnecessary parts.
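For example, a header along these lines (the path and commit hash are the ones from the suggestion; the exact wording is illustrative):

```bash
# This script was copied from egs/swbd/s5c/local/format_acronyms_dict.py
# (commit d8b196951c1cf3437b3fa6cd76edbbc0542b3db9).
# Minor modifications were made.
```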
Thanks for the feedback!
Since some of the data prep scripts use relative paths, I'd rather not do this. Is the issue that it's difficult to pick out the non-database-specific scripts (like …)?
I will do this!
@guoguo12 Has anyone tried to train this multi-database setup yet? Will training on a set this large scale linearly, i.e. can I expect to train for about 35x the duration of the Tedlium 120-hour set? Thanks!
Approximately linearly. If you're talking about neural net training with …
@guoguo12 Just from my experience with g2p, both for training and applying, it's safer to use --encoding utf-8; it will avoid some strange behavior.
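For instance, with Sequitur G2P (assuming that is the g2p tool in question; file names here are illustrative), the flag would be passed at both stages:

```bash
# Train a G2P model on the seed lexicon, holding out 5% for development:
g2p.py --encoding utf-8 --train lexicon.txt --devel 5% --write-model g2p_model
# Apply the model to an OOV word list to generate pronunciations:
g2p.py --encoding utf-8 --model g2p_model --apply oov_words.txt > oov_lexicon.txt
```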
@guoguo12 When you think the recipe is ready for an intermediate review, please let us know so that we can go through it more carefully.
@vince62s: Thanks! Is this true even if the lexicon is pure ASCII?
@vijayaditya: We actually just finished the final HMM-GMM training step. I've squashed and pushed all of my work to guoguo12:multi-recipe. On the CLSP Grid, my work is at … Here's the speaker-independent WER for eval2000 after the final HMM-GMM step:
This is a bit better than the 32.2% WER achieved by fisher_swbd at roughly the same step (link). I'm still waiting on the speaker-adapted decode to finish; I'll post an update when it's done.
Important: I adapted existing Kaldi conventions to work with the multi-database situation. You can read about the design decisions I made in the README. Also, here's a concise chart outlining the exact training recipe; I'll probably add it to the README once it's finalized.
After refining the HMM-GMM training steps (as needed), the next step is nnet3/chain. I would be copying …
@guoguo12 I would recommend evaluating the recipe on all the test sets of interest from the beginning, as it will help prevent tuning the recipe to one test set. You could evaluate the systems on the Tedlium test sets, the Librispeech test sets, RT-03, and Hub5'00.
@guoguo12 I see that the next stage you plan to execute is nnet3/chain. I would recommend first building a TDNN acoustic model trained with the cross-entropy and sMBR criteria. This is our most stable recipe, and it usually helps us figure out whether the nnet recipes are working fine. Other nnet3 recipes don't work out of the box for new databases; we previously had problems with BLSTM acoustic models and TDNN+chain acoustic models in some recipes (e.g. Tedlium or AMI).
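In other recipes this stage is typically driven by a pair of scripts; a hypothetical invocation for this setup (script names assumed by analogy with recipes like fisher_swbd, not yet part of this PR) might be:

```bash
# Cross-entropy TDNN training on top of the final GMM alignments:
local/nnet3/run_tdnn.sh --stage 0
# sMBR sequence training on top of the cross-entropy model:
local/nnet3/run_tdnn_discriminative.sh --stage 0
```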
In egs/multi_en/s5/conf/ami_ihm/mfcc.conf:
@@ -0,0 +1,4 @@
+--use-energy=false
+--sample-frequency=16000
The feature configuration file should ideally be the same for all the databases, unless you are doing some very specific pre-processing. Are you planning to add such pre-processing? If not, I would strongly recommend maintaining just one conf/mfcc.conf file.
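For example, a single shared conf/mfcc.conf might contain nothing more than the following (the 8 kHz rate anticipates the downsampling discussion below; purely illustrative):

```
--use-energy=false
--sample-frequency=8000
```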
WSJ, AMI, ICSI, and Tedlium are 16 kHz, but Fisher and Switchboard are 8 kHz. Since most applications/research are done on the 8 kHz part, I'd suggest we limit the filterbanks to 8 kHz rather than upsample this data to 16 kHz.
On second thought, would it be better to use one mfcc.conf and downsample the 16 kHz data?
Korbinian, I think it's better to use a sox command to downsample the data in the wav file, rather than messing with the mel-banks' high frequency, which could be harder for users to get right (e.g. it also needs to be done in mfcc_hires.conf). Or (even better), downsample the files beforehand and dump them to disk, which might cause less I/O later on.
By messing with the mel high frequency you'd get a different energy than if you downsampled the signal.
Also, downsampling beforehand would probably be more efficient if there are multiple passes of MFCC extraction (which there are).
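A sketch of both options (file names are illustrative, and sox is assumed to be available):

```bash
# Option 1: downsample on the fly via a pipe in wav.scp, one entry per line:
#   utt001 sox /path/to/utt001.wav -t wav -r 8000 - |
# Option 2 (preferred above): downsample once beforehand and dump to disk:
sox utt001_16k.wav -r 8000 utt001_8k.wav
```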
@danpovey: So you'd recommend switching to downsampling and retraining the HMM-GMM models from scratch before starting TDNN training?
I think it might make people's lives easier later on, so yes.
Once your GMM-HMM systems are well tuned, please let us know and we can go through a second round of reviews. It would also enable @tomkocse to start working on his multi-condition recipe.
Yep, will do. I'm currently redoing tri5 (the last GMM-HMM step). Here are the results from tri4, across multiple test sets (as requested):
For fun (and at @sikoried's suggestion), I also decoded the Librispeech test set using a Librispeech LM (small, tg):
These are perhaps slightly worse than expected (11.2%, seen here).
@xiaohui-zhang You might have some insights in the case of the Librispeech test set, based on your pronunciation-dictionary experiments.
Allen, can you remind us how you got the lexicon and word list for this setup?
It's CMUDict, with all remaining OOVs across all training corpora added using g2p.
@guoguo12 I would also be curious to see what the Tedlium test set gives in this setup, when you get a chance.
OK. We could definitely do a bit better using Samuel's method that he's been working on.
@vince62s, here are the Tedlium test set results for tri4:
I decoded these using the standard trigram LM for this recipe, which is trained on Fisher/SWBD.
Allen, could you please get in the habit of putting baselines in these result postings?
Sure. The comparable result from the Tedlium recipe is 20.3% WER (link), so this is worse. I would predict that the LM is mostly to blame.
OK. It would probably make sense eventually to build graphs with the …
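Graph building in Kaldi is done per-LM, so a matched comparison would look something like this (the lang-directory name is hypothetical):

```bash
# Build a decoding graph for tri4 with a Tedlium-matched LM directory:
utils/mkgraph.sh data/lang_tedlium_test exp/tri4 exp/tri4/graph_tedlium
```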
In local/remove_dup_utts.sh:
@@ -0,0 +1,56 @@
+#!/bin/bash
This script, local/remove_dup_utts.sh, should be deleted, as it now lives in utils/data/.
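For reference, callers would switch to the shared version, invoked the same way (arguments shown as in the swbd recipe; the interface is assumed unchanged):

```bash
# Keep at most 300 copies of any repeated utterance text,
# writing the filtered data directory to data/train_nodup:
utils/data/remove_dup_utts.sh 300 data/train data/train_nodup
```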
@naxingyu Would you be able to take this recipe to the nnet3 stage? Most of the GPUs on our cluster will be busy for the next two weeks, and it would be good to see how the results look sooner than that.
Vijay, it may not be obvious to him what changes you refer to here.
Vijay, Dan: sorry, this slipped my radar. Which changes?
I'll get it done later today!
@vijayaditya I pushed the changes @danpovey requested. Good to go now? I appreciate you taking over the nnet2/3 experiments; they're quite time-consuming to run, and you guys have a better handle on when they fit on the cluster...
@vijayaditya, can we have someone run this on our grid before I merge it? Or is it already on our grid somewhere?
IIRC Guoguo ran these experiments on our cluster. I will try to find the location. --Vijay
I had run it on the grid, and posted the location a few posts up, and … --Allen
I can't find where you posted the location. |
On clsp:
Should these be commented out in the run.sh?
And should the 'exit 0' be in the middle of the run.sh?
Sorry, these were leftovers from the last run after I had made some adjustments.
I think it's OK now, but we need to figure out how to squash it at least to some extent; it might be a bit more complicated since it's a multi-author PR.
@guoguo12 As the owner of this fork/repo, can you squash the commits as indicated by Dan?
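One way to do this locally, for what it's worth (a sketch; assumes kaldi-asr/kaldi is configured as the fork's "upstream" remote):

```bash
# Interactively squash the PR branch's commits, then force-push the fork:
git fetch upstream
git rebase -i upstream/master   # mark related commits as "squash"/"fixup"
git push --force origin multi-recipe
```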
@danpovey: If you enable GitHub's squash-merge feature, you should be able to squash it on merge.
I think it's even enabled :)
If I do it that way, I doubt that the authorship info will be correct.
Squashed to three commits (original recipe, revision with Tedlium 2, proofreading).
Thanks! Merging.
See #699.