
WIP: Multi-database English LVCSR recipe #771

Merged 3 commits into kaldi-asr:master from guoguo12:multi-recipe on Sep 5, 2016
Conversation

@guoguo12 (Contributor) commented May 9, 2016:

See #699.

@vijayaditya (Contributor):

Two minor suggestions:

1. It might be convenient to have subdirectories in local/ for the data prep scripts corresponding to each database (e.g. local/ami/, local/fisher/, local/swbd/, local/tedlium/).

2. Whenever you copy a script, it would be very convenient to provide the source path,

e.g.

# This script was copied from egs/swbd/s5c/local/format_acronyms_dict.py (commit d8b196951c1cf3437b3fa6cd76edbbc0542b3db9)
# Minor modifications were made.

This is convenient as people familiar with the original scripts can skip unnecessary parts.

@guoguo12 (Contributor, Author):

Thanks for the feedback!

> It might be convenient to have subdirectories in local/ for the data prep scripts corresponding to each database (e.g. local/ami/, local/fisher/, local/swbd/, local/tedlium/).

Since some of the data prep scripts use relative paths, I'd rather not do this. Is the issue that it's difficult to pick out the non-database-specific scripts (like normalize_transcript.py)? If so, I could prefix those (e.g. multi_en_normalize_transcript.py) to make it clear that they're general scripts.

> Whenever you copy a script, it would be very convenient to provide the source path.

I will do this!

@skoocda commented May 30, 2016:

@guoguo12 Has anyone tried to train on this multi-database setup yet?

Will training on a set this large scale linearly? I.e., can I expect training to take about 35x the duration of the 120-hour Tedlium set?

Thanks!

@danpovey (Contributor):

Approximately linearly. If you're talking about neural net training with nnet2 or nnet3, though, we typically use more GPUs when using more data, and maybe fewer epochs, so the training time for any given run will rarely be more than a day or two. But this is assuming you have a grid with multiple GPUs (e.g. up to 16).

Dan


@vince62s (Contributor):

@guoguo12 Just from my experience with g2p, both for training and applying, it's safer to use --encoding utf-8; it avoids some strange behavior.
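
For reference, a minimal sketch of what that looks like with Sequitur's g2p.py (the file and model names here are illustrative, not from this recipe):

  # Train a G2P model from a seed lexicon, forcing UTF-8 I/O:
  g2p.py --encoding utf-8 --train lexicon.txt --devel 5% --write-model model-1

  # Apply it to a word list, again forcing UTF-8:
  g2p.py --encoding utf-8 --model model-1 --apply words.txt > g2p_lexicon.txt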

@vijayaditya (Contributor):

@guoguo12 When you think the recipe is ready for an intermediate review, please let us know so that we can go through it more carefully.

@guoguo12 (Contributor, Author):

@vince62s: Thanks! Is that true even if the lexicon is pure ASCII?

@vijayaditya: We actually just finished the final HMM-GMM training step. I've squashed and pushed all of my work to guoguo12:multi-recipe. On the CLSP Grid, my work is at /export/a15/allenguo/kaldi/egs/multi_en/s5.

Here's the speaker-independent WER for eval2000 after the final HMM-GMM step:

%WER 31.9 | 4459 42989 | 72.4 21.3 6.3 4.3 31.9 68.6 | exp/multi_a/tri5/decode_tg_eval2000.si/score_12_1.0/test.ctm.filt.sys

This is a bit better than the 32.2% WER achieved by fisher_swbd at roughly the same step (link). I'm still waiting on the speaker-adapted decode to finish; I'll post an update when it's done.

Important: I adapted existing Kaldi conventions to work with the multi-database situation. You can read about the design decisions I made in the README.

Also, here's a concise chart outlining the exact training recipe:

[chart image omitted]

I'll probably add it to the README once it's finalized.

After refining the HMM-GMM training steps (as needed), the next step is nnet3/chain. I would copy egs/fisher_swbd/s5/local/chain/run_blstm_6h.sh, assuming that's still the best one.

@vijayaditya (Contributor):

@guoguo12 I would recommend evaluating the recipe on all the test sets of interest from the beginning, as it will help prevent tuning the recipe to one test set. You could evaluate the systems on the Tedlium test sets, the Librispeech test sets, rt03, and Hub5'00.

@vijayaditya (Contributor):

@guoguo12 I see that the next stage you plan to execute is nnet3/chain. I would recommend first building a TDNN acoustic model trained with the cross-entropy and sMBR criteria. This is our most stable recipe and usually helps us figure out whether the nnet recipes are working fine; other nnet3 recipes don't work out of the box for new databases. We previously had problems with BLSTM acoustic models and TDNN+chain acoustic models in some recipes (e.g. Tedlium or AMI).
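
As a rough sketch, that sequence would look something like the following (the script names are borrowed from other Kaldi egs directories and would still need to be ported to multi_en, so treat the paths as assumptions):

  # Cross-entropy TDNN on top of the final GMM-HMM alignments
  # (hypothetical multi_en analogue of other recipes' local/nnet3/run_tdnn.sh):
  local/nnet3/run_tdnn.sh

  # Then sequence-discriminative training with the sMBR criterion:
  steps/nnet3/train_discriminative.sh --criterion smbr ... \
    exp/multi_a/nnet3/tdnn_degs exp/multi_a/nnet3/tdnn_smbr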

Review thread on egs/multi_en/s5/conf/ami_ihm/mfcc.conf:

@@ -0,0 +1,4 @@
--use-energy=false
--sample-frequency=16000
A reviewer (Contributor) commented:

The feature configuration file should ideally be the same for all the databases, unless you are doing some very specific pre-processing. Are you planning to add such pre-processing? If not, I would strongly recommend maintaining just one conf/mfcc.conf file.
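
For instance, a single shared conf/mfcc.conf might contain just the following (the 8 kHz rate anticipates the downsampling discussion below and is an assumption, not something settled at this point in the thread):

  --use-energy=false
  --sample-frequency=8000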

@sikoried (Contributor) commented Jun 27, 2016:

WSJ, AMI, ICSI, and Tedlium are 16 kHz, but Fisher and Switchboard are 8 kHz. Since most applications/research focus on the 8 kHz part, I'd suggest we limit the filterbanks to 8 kHz rather than upsample that data to 16 kHz.

On second thought, would it be better to use one mfcc.conf and downsample the 16 kHz data?

@danpovey (Contributor):

Korbinian, I think it's better to use a sox command to downsample the data in the wav file, rather than messing with the mel-bank high frequency, which could be harder for users to get right (e.g. it also needs to be done in mfcc_hires.conf). Or (even better), downsample the files beforehand and dump them to disk, which might cause less I/O later on. By messing with the mel high frequency you'd get a different energy than if you downsampled the signal.

Also, downsampling beforehand would probably be more efficient if there are multiple passes of MFCC extraction (which there are).
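
Both variants are one-liners with sox; a sketch, with illustrative utterance IDs and paths:

  # On the fly, as a wav.scp entry (the trailing '|' tells Kaldi to read from the pipe):
  utt001 sox /path/to/ami/utt001.wav -t wav -r 8000 - |

  # Or beforehand, dumping 8 kHz copies to disk before feature extraction:
  sox /path/to/ami/utt001.wav -r 8000 /path/to/ami_8k/utt001.wav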


@guoguo12 (Contributor, Author):

@danpovey: So you'd recommend switching to downsampling and retraining the HMM-GMM models from scratch before starting TDNN training?

@danpovey (Contributor):

I think it might make people's lives easier later on, so yes.


@vijayaditya (Contributor):

Once your GMM-HMM systems are well tuned, please let us know and we can go through a second round of reviews. It would also enable @tomkocse to start working on his multi-condition recipe.

@guoguo12 (Contributor, Author) commented Jul 9, 2016:

Yep, will do. I'm currently redoing tri5 (the last GMM-HMM step). Here are the results from tri4, across multiple test sets (as requested):

%WER 25.6 | 4459 42989 | 77.8 16.6 5.6 3.4 25.6 64.2 | exp/multi_a/tri4/decode_tg_eval2000/score_13_0.0/test.ctm.filt.sys
%WER 32.3 | 4459 42989 | 72.0 21.7 6.3 4.3 32.3 68.2 | exp/multi_a/tri4/decode_tg_eval2000.si/score_12_0.0/test.ctm.filt.sys
%WER 22.17 [ 11657 / 52576, 1525 ins, 1157 del, 8975 sub ] exp/multi_a/tri4/decode_tg_librispeech/wer_10_1.0
%WER 27.34 [ 14375 / 52576, 1945 ins, 1401 del, 11029 sub ] exp/multi_a/tri4/decode_tg_librispeech.si/wer_10_1.0
%WER 25.2 | 8420 76157 | 77.7 16.3 6.0 2.9 25.2 57.3 | exp/multi_a/tri4/decode_tg_rt03/score_14_1.0/test.ctm.filt.sys
%WER 32.6 | 8420 76157 | 71.1 21.5 7.3 3.8 32.6 63.1 | exp/multi_a/tri4/decode_tg_rt03.si/score_13_0.5/test.ctm.filt.sys

For fun (and at @sikoried's suggestion), I also decoded the Librispeech test set using a Librispeech LM (small, tg):

%WER 13.71 [ 7210 / 52576, 862 ins, 780 del, 5568 sub ] exp/multi_a/tri4/decode_libri_tg_librispeech/wer_16_0.0
%WER 16.88 [ 8875 / 52576, 1104 ins, 937 del, 6834 sub ] exp/multi_a/tri4/decode_libri_tg_librispeech.si/wer_13_0.0

These are perhaps slightly worse than expected (11.2%, seen here).

@vijayaditya (Contributor):

@xiaohui-zhang You might have some insights into the Librispeech test set results, based on your pronunciation dictionary experiments.

@danpovey (Contributor) commented Jul 9, 2016:

Allen, can you remind us how you got the lexicon and word list for this setup?

Dan


@guoguo12 (Contributor, Author) commented Jul 9, 2016:

It's CMUDict, with all remaining OOVs across all training corpora synthesized using Sequitur G2P.
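
In outline, that pipeline looks something like this (a sketch; the file names and the G2P model path are assumptions, not the recipe's actual ones):

  # Words in the training transcripts that CMUDict doesn't cover:
  awk '{print $1}' cmudict.dict | sort -u > known_words.txt
  cut -d' ' -f2- data/train/text | tr ' ' '\n' | sort -u > all_words.txt
  comm -23 all_words.txt known_words.txt > oov_words.txt

  # Synthesize pronunciations for the OOVs with Sequitur G2P:
  g2p.py --encoding utf-8 --model g2p-model-4 --apply oov_words.txt > oov_lexicon.txt

  # Final lexicon = CMUDict entries plus G2P entries:
  cat cmudict_lexicon.txt oov_lexicon.txt | sort -u > lexicon.txt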

@vince62s (Contributor) commented Jul 9, 2016:

@guoguo12 I would also be curious to see what the Tedlium test set gives in this setup, when you get a chance.

@danpovey (Contributor) commented Jul 9, 2016:

OK. We could definitely do a bit better using Samuel's method that he's working on, but that can wait till later.

Dan


@guoguo12 (Contributor, Author):

@vince62s, here are the Tedlium test set results for tri4:

%WER 25.8 | 1155 27512 | 78.7 17.5 3.7 4.5 25.8 92.6 | exp/multi_a/tri4/decode_tg_tedlium/score_12_1.0/test.ctm.filt.sys
%WER 34.3 | 1155 27512 | 72.0 23.8 4.1 6.4 34.3 96.1 | exp/multi_a/tri4/decode_tg_tedlium.si/score_10_0.0/test.ctm.filt.sys

I decoded these using the standard trigram LM for this recipe, which is trained on Fisher/SWBD.

@danpovey (Contributor):

Allen, could you please get in the habit of putting baselines in these kinds of posts with results, together with some text saying what you conclude from them? E.g. is it better than the baseline? Worse? Do you have any theory why? Any comment of this nature would make the results interpretable by others.


@guoguo12 (Contributor, Author):

Sure. The comparable result from the Tedlium recipe is 20.3% WER:

%WER 20.3 | 1155 27512 | 82.7 13.4 3.9 3.0 20.3 90.0 | -0.063 | exp/tri3/decode_test/score_14_0.5/ctm.filt.filt.sys

So this is worse. I would predict that the LM is mostly to blame.

@danpovey (Contributor):

OK. It would probably make sense eventually to build graphs with the 'native' LMs for each of the source databases, so we can disentangle this.

Dan


Review thread on egs/multi_en/s5/local/remove_dup_utts.sh:

@@ -0,0 +1,56 @@
#!/bin/bash
A reviewer (Contributor) commented:

This script, local/remove_dup_utts.sh, should be deleted, as it now lives in utils/data/.
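
The recipe would then call the shared version instead; a sketch of the usage, with the 300-repetition cap used in other egs (the directory names here are illustrative):

  utils/data/remove_dup_utts.sh 300 data/train data/train_nodup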

@vijayaditya (Contributor):

@naxingyu Would you be able to take this recipe to the nnet3 stage? Most of the GPUs on our cluster will be busy for the next two weeks, and it would be good to see how the results look sooner than that.

@vijayaditya (Contributor):

@sikoried Would you be able to make the two changes suggested by @danpovey? We can merge the recipe in its current state and start working on the nnet3 systems at a more relaxed pace.

@danpovey (Contributor):

Vijay, it may not be obvious to him which changes you're referring to here.


@sikoried (Contributor):

Vijay, Dan: sorry, this slipped past my radar. Which changes?


@vijayaditya (Contributor):

I was referring to 1 and 2.

@sikoried (Contributor) commented Sep 1, 2016:

I'll get it done later today!

@sikoried (Contributor) commented Sep 2, 2016:

@vijayaditya I pushed the changes @danpovey requested. Good to go now? I'd appreciate it if you took over the nnet2/3 experiments; they're quite time-consuming to run, and you have a better handle on when they fit on the cluster...

@danpovey (Contributor) commented Sep 3, 2016:

@vijayaditya, can we have someone run this on our grid before I merge it? Or is it already on our grid somewhere?

@vijayaditya (Contributor):

IIRC Guoguo ran these experiments on our cluster. I will try to find the location in the mail chain.

--Vijay


@sikoried (Contributor) commented Sep 4, 2016:

I had run it on the grid and posted the location a few posts up, and Allen had also run everything on the CLSP grid.


@danpovey (Contributor) commented Sep 4, 2016:

I can't find where you posted the location.

@sikoried (Contributor) commented Sep 5, 2016:

On the CLSP grid: /export/a14/kriedhammer/git/kaldi-guoguo12/egs/multi_en/s5. Does that work?

@danpovey (Contributor) commented Sep 5, 2016:

Should these lines be commented out in run.sh?

  #local/make_partitions.sh --multi $multi --stage 5 || exit 1;
  #steps/align_fmllr.sh --cmd "$train_cmd" --nj 60 \
  #  data/$multi/tri3_ali data/lang \
  #  exp/$multi/tri3 exp/$multi/tri3_ali || exit 1;


@danpovey (Contributor) commented Sep 5, 2016:

And should the 'exit 0' be in the middle of run.sh?


@sikoried (Contributor) commented Sep 5, 2016:

Sorry, these were leftovers from the last run after I had made some adjustments.

@danpovey (Contributor) commented Sep 5, 2016:

I think it's OK now, but we need to figure out how to squash it, at least to some extent; that might be a bit more complicated since it's a multi-author PR. I would really rather not clutter up the git log with that many commits. If it were just a handful, it would be better.

@sikoried (Contributor) commented Sep 5, 2016:

@guoguo12 As the owner of this fork/repo, can you squash the commits as indicated by Dan?

@guoguo12 (Contributor, Author) commented Sep 5, 2016:

@danpovey: If you enable GitHub's squash merge feature, you should be able to squash it on merge.

@jtrmal (Contributor) commented Sep 5, 2016:

I think it's even enabled :)
y.


@danpovey (Contributor) commented Sep 5, 2016:

If I do it that way, I doubt that the authorship info will be correct. I'd rather have 'git blame' give the correct userid. You will also show up in the stats, which is nice.


@guoguo12 (Contributor, Author) commented Sep 5, 2016:

Squashed to three commits (original recipe, revision with Tedlium 2, proofreading).

@danpovey merged commit e9852e6 into kaldi-asr:master on Sep 5, 2016
@danpovey (Contributor) commented Sep 5, 2016:

Thanks! Merging.
