
Add OCR/Handwriting Recognition examples #1984

Merged
merged 19 commits into kaldi-asr:master on Jan 4, 2018

Conversation

Contributor

@hhadian hhadian commented Oct 30, 2017

No description provided.

signal(SIGPIPE, SIG_DFL)

parser = argparse.ArgumentParser(
description="""Generates and saves the feature vectors""")
Contributor

It would be good if you added some description of the types of augmentation you are doing in this script.

frame_subsampling_factor=4
alignment_subsampling_factor=1
# training chunk-options
chunk_width=340,300,200,100
Contributor

Is there any reason for this chunk width? Should it be a multiple of 32?

Contributor Author

We can change this to something less unusual like 300,200,100.
Again, the reason for choosing this is that the average number of frames per phone/letter is almost 2 times larger for OCR.

lat_dir=exp/chain${nnet3_affix}/${gmm}_${train_set}_lats
dir=exp/chain${nnet3_affix}/cnn${affix}
train_data_dir=data/${train_set}
lores_train_data_dir=$train_data_dir # for the start, use the same data for gmm and chain
Contributor

Why did you define lores_train_data_dir? Isn't it the same as train_data_dir?

Contributor Author

This was modified from an ASR recipe and we didn't remove this
variable so we could optionally experiment with different resolutions for
the gmm and chain systems. Anyway, the gmm system does not give good results
so I guess we can remove this and focus on the chain setup only.

# chain options
train_stage=-10
xent_regularize=0.1
frame_subsampling_factor=4
Contributor

Does 4 work better than 3 in HWR?

Contributor Author

It has not been tested yet. The reason for choosing
a larger factor was that the average number of frames per
word in OCR (when the line images are scaled to have a height of 40) is almost 2 times that of ASR.

Contributor

The best word error rate with frame subsampling factor (FSF) 4 is slightly better than with 3 or 5: WER with FSF=3 or 5 is close to 14.80%, while with FSF=4 it is close to 14.50%.

FSF   Best WER(%)
3     14.81
4     14.51
5     14.80

import sys
import numpy as np
from scipy import misc
parser = argparse.ArgumentParser(description="""uses phones to convert unk to word""")
Contributor

It would be good to add an extended description for this script and its arguments; this script can be used in other applications.

@danpovey
Contributor

danpovey commented Nov 15, 2017 via email

@hhadian
Contributor Author

hhadian commented Nov 15, 2017

> Actually I prefer it when the chunk widths are not too regularly spaced...
> then we get more combinations so we can more closely approximate the
> lengths of longer utterances. That's why I sometimes use slightly
> random-seeming numbers.

Oh OK. I won't change it then.

Contributor

@danpovey danpovey left a comment


just realized I had some pending comments.

# create a backup directory to store text, utt2spk and image.scp file
mkdir -p $data_dir/train/backup
mv $data_dir/train/text $data_dir/train/utt2spk $data_dir/train/images.scp $data_dir/train/backup/
local/augment_and_make_feature_vect.py $data_dir/train --scale-size 40 --vertical-shift 10 | \
Contributor

I don't like "vect" for vector; it should be "vec". But the name is too long anyway -- call it "augment_and_make_features.py".

ark:- ark,scp:$data_dir/test/data/images.ark,$data_dir/test/feats.scp || exit 1
steps/compute_cmvn_stats.sh $data_dir/test || exit 1;

if [ $augment = true ]; then
Contributor

just do
if $augment; then
(true and false are valid statements that have return codes).
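To illustrate the suggested idiom (a minimal sketch; the variable value and echo are placeholders):

augment=true   # or false, e.g. set from the command line via utils/parse_options.sh

# 'true' and 'false' are ordinary commands with exit codes 0 and 1,
# so the variable itself can be used as the condition:
if $augment; then
  echo "applying data augmentation"
fi
# equivalent to, but shorter than: if [ $augment = true ]; then ... fi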

@@ -0,0 +1,85 @@
#!/usr/bin/env python
Contributor

call this make_features.py.

Contributor

For newly created python scripts I prefer python3. This will reduce our headaches when python2 is no longer supported.


cat $arpa | utils/find_arpa_oovs.pl $lang/words.txt > $tmpdir/oovs.txt

cat $arpa | \
Contributor

It looks to me like you might have copied an older example here. Right now I believe most of this is done by a single command line involving arpa2fst; look for a more recent example to copy. I think the oovs.txt is no longer needed either -- it's all done via options to arpa2fst now.
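For reference, the newer single-command style looks roughly like this (a sketch reusing $arpa and $lang from the snippet above; the output path is only an example):

cat $arpa | \
  arpa2fst --disambig-symbol=#0 --read-symbol-table=$lang/words.txt - $lang/G.fst

This is essentially what utils/format_lm.sh does internally.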

Contributor

I think a preferable way would be to just call utils/format_lm.sh (I noticed this in the previous recipe).

Contributor

Also, it seems you are oscillating between IRSTLM and pocolm? Is that necessary? Can't you just use one toolkit?

Contributor Author

Yes, this is very old. I'll fix it.
Re LM toolkits, I'm not sure which toolkit to use.
Here (i.e. UW3) we just need a simple LM trained on the training text.
But in IAM, we have 3 LM sources and pocolm might be more suited.

Contributor

I think I'm OK with using 2 different LM toolkits since it's 2 different recipes. The only benefit in having a standard is if it's globally enforced, and that isn't practical for various reasons (there just isn't a clear leader).

#!/usr/bin/env python

# (Author: Chun Chieh Chang)

Contributor

A comment saying what this script does would be nice. E.g. what files does it produce and what will their contents look like? Doing this via some kind of doc-string or usage message is fine too, maybe preferable.

local/iam_train_lm.sh
cp -R $data_dir/lang -T $data_dir/lang_test_corpus
gunzip -k -f data/local/local_lm/data/arpa/3gram_big.arpa.gz
local/prepare_lm.sh data/local/local_lm/data/arpa/3gram_big.arpa $data_dir/lang_test_corpus || exit 1;
Contributor

This is wasteful of disk space: gunzipping the ARPA file and not deleting it afterwards (why gzip it in the first place if you are going to keep the unzipped one?).
I don't really like the way this is structured, with the prepare_lm.sh script that can either use an existing LM or prepare one. I'd rather have one script to build the LM and one to build the graph; but as I commented in the script, there is a much simpler way to build the graph than what you are doing -- it's just a single invocation of arpa2fst now, I think.

local/run_unk_model.sh
fi

num_gauss=10000
Contributor

Please don't have these variables, just hardcode them in the script invocations.
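As a hypothetical illustration of the request (the script name and directory arguments are made up for the example, not taken from this recipe; the values 500 and 10000 are the ones defined in the snippets above):

# instead of defining num_leaves=500 and num_gauss=10000 at the top and writing
#   steps/train_deltas.sh ... $num_leaves $num_gauss ...
# pass the literal values at the call site:
steps/train_deltas.sh --cmd "$cmd" 500 10000 data/train data/lang exp/mono_ali exp/tri2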


stage=0
nj=30
data_download=data
Contributor

I think it would be better and clearer to specify something other than data/ for the download location, e.g. at least specify a subdirectory, because it makes it unclear to the user to what extent you can really change this.
Make sure that if the data is already downloaded, the contents of that directory $data_download can be used without any modification, so that a user can use another user's downloaded directory. Otherwise it would be impossible for multiple users to share the same data.
And please make clear what location on the CLSP grid can be used for this, so that if anyone runs it here, they don't have to re-download the data.
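A hedged sketch of a download stage that another user's directory can satisfy unchanged (the URL, subdirectory name and marker file are placeholders):

uw3_url=http://www.example.com/uw3.tgz   # placeholder, not the real URL
if [ ! -f $data_download/uw3/.complete ]; then
  mkdir -p $data_download/uw3
  wget -P $data_download/uw3 $uw3_url
  tar -xzf $data_download/uw3/uw3.tgz -C $data_download/uw3
  touch $data_download/uw3/.complete
fi
# later stages only ever read from $data_download/uw3, so pointing
# $data_download at an already-populated copy (e.g. on the CLSP grid) just works.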

stage=0
nj=30
data_download=data
data_dir=data
Contributor

It's nonstandard to have 'data' and 'exp' be variables in the scripts. Even though I can see that it might be useful, I'd rather have this script look like all the other scripts and not use these variables.

chunk_width=340,300,200,100
num_leaves=500
# we don't need extra left/right context for TDNN systems.
chunk_left_context=32
Contributor

You shouldn't need this left and right context for CNN systems either as long as you set them up right.
Did you try removing 'required-time-offsets=0' from the 'common' options?
Then the network would require all that context, and no extra context would be required; the needed number of
frames would be added in egs creation.
(However, in this case you should probably be careful to ensure that the first and last frame of each text line
are just the background color, as you'd get the first or last frame repeated and it would otherwise produce a strange image).
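A hedged sketch of the change being suggested (the offsets, filter counts and layer name are illustrative, not the recipe's actual network config):

# current style: the conv layers do not require their full context
# (required-time-offsets=0), so chunk_left_context/chunk_right_context
# must supply the extra frames explicitly:
common="required-time-offsets=0 height-offsets=-2,-1,0,1,2 num-filters-out=36"

# suggested style: drop required-time-offsets=0 so each layer requires its
# context; the needed frames are then added automatically during egs creation
# and chunk_left_context/chunk_right_context can stay at 0:
common="height-offsets=-2,-1,0,1,2 num-filters-out=36"
conv-relu-batchnorm-layer name=cnn1 height-in=40 height-out=40 time-offsets=-3,-2,-1,0,1,2,3 $common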

@@ -0,0 +1,154 @@
#!/bin/bash
Contributor

this should be a symlink to steps/score_kaldi.sh, not a copy
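For example, assuming the file under review is local/score.sh and the command is run from the recipe's top-level directory (where steps/ is already a symlink):

rm local/score.sh
ln -s ../steps/score_kaldi.sh local/score.sh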

Contributor Author

It's because it calls local/unk_arc_post_to_transcription.py somewhere
in the scoring pipeline. However, looking at the script,
it seems to me that it can be done
through hyp_filtering_cmd. @aarora8, could you please make it
a symlink and do the "unk scoring" using hyp_filtering_cmd?

Contributor

OK, it is also converting upper case to lower case. But in the email response, Paul from RWTH Aachen mentioned that they are not converting upper case to lower case. Should I leave the conversion out of this change as well?

@jtrmal
Contributor

jtrmal commented Nov 23, 2017

BTW, a lot of the files do not have authors and copyrights -- please add those.

# conf/queue.conf in http://kaldi-asr.org/doc/queue.html for more information,
# or search for the string 'default_config' in utils/queue.pl or utils/slurm.pl.

export cmd="queue.pl"
Contributor

cmd is definitely weird and non-standard. If you have any use-case for it, please name it more self-descriptively.

Contributor

Actually I think I am OK with just using "$cmd". The distinction between train_cmd and decode_cmd has become less necessary now that we have a common interface for those tools -- we mostly keep them around just out of inertia.

Contributor

you should probably remove either $cmd, or $train_cmd and $decode_cmd.
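For reference, the usual cmd.sh convention in other egs directories is just the two variables below (the option values are site-specific examples), so either keep these two or keep a single $cmd, but not both:

export train_cmd="queue.pl --mem 2G"
export decode_cmd="queue.pl --mem 4G"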

@@ -0,0 +1,62 @@
#!/bin/bash
Contributor

I think this file should be replaced by utils/format_lm.sh (or at least most of the things from this file should be replaced by that one)
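For reference, a typical utils/format_lm.sh call looks roughly like this (the lexicon path and output directory are assumptions for the example; the ARPA path is the one used earlier in this recipe):

utils/format_lm.sh data/lang data/local/local_lm/data/arpa/3gram_big.arpa.gz \
  data/local/dict/lexicon.txt data/lang_test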

cmd=run.pl
stage=0
decode_mbr=false
stats=true
Contributor

Perhaps you should add stats to scoring_opts as well.
Also, if you don't want to do it the way suggested in kaldi_score_cer.sh (I assume because you want to modify the default lmwts, even though I'm not sure it's worth the effort), please make sure the stage parameter is processed correctly.

Contributor Author

@ChunChiehChang, could you please change this script
to follow the instructions mentioned in the header of steps/scoring/score_kaldi_cer.sh?

@jtrmal
Contributor

jtrmal commented Nov 23, 2017 via email

@danpovey
Contributor

danpovey commented Nov 23, 2017 via email

@jtrmal
Contributor

jtrmal commented Nov 23, 2017 via email



@@ -0,0 +1,105 @@
#!/bin/bash
Contributor Author

Should we move this script to steps/nnet3?

Contributor

See how it differs from steps/nnet3/align_lats.sh.

Contributor Author

Actually, they are basically the same; however, steps/nnet3/align_lats.sh was not there at the time.
@aarora8, could you please update the chainali recipes to use steps/nnet3/align_lats.sh instead and remove this script?
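For reference, a hedged sketch of the replacement call (the directory names are illustrative, not the recipe's actual paths):

steps/nnet3/align_lats.sh --nj $nj --cmd "$cmd" \
  data/train data/lang exp/chain/cnn_1a exp/chain/cnn_1a_lats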

Contributor

done, made the change for the pull request.

@jtrmal
Contributor

jtrmal commented Nov 24, 2017 via email

utt_word_dict = dict()
utt_phone_dict = dict()# stores utteranceID and phoneID
unk_word_dict = dict()
count=0
Contributor

looks crowded. consider adding an empty line here

out_fh = open(args.out_ark,'wb')

phone_dict = dict()# stores phoneID and phone mapping
phone_data_vect = phone_fh.read().strip().split("\n")
Contributor

looks crowded. consider adding an empty line here

key_val = key_val.split(" ")
phone_dict[key_val[1]] = key_val[0]
word_dict = dict()
word_data_vect = word_fh.read().strip().split("\n")
Contributor

looks crowded. consider adding an empty line here

parser.add_argument('unk', type=str, default='-', help='location of unk file')
parser.add_argument('--input-ark', type=str, default='-', help='where to read the input data')
parser.add_argument('--out-ark', type=str, default='-', help='where to write the output data')
args = parser.parse_args()
Contributor

looks crowded. consider adding an empty line here

utt_word_dict[uttID] = dict()
utt_phone_dict[uttID] = dict()
utt_word_dict[uttID][count] = word
utt_phone_dict[uttID][count] = phones
Contributor

looks crowded. consider adding an empty line here

for phone_val in phone_val_vect:
phone_2_word.append(phone_val.split('_')[0])
phone_2_word = ''.join(phone_2_word)
utt_word_dict[uttID][count] = phone_2_word
Contributor

@xiaohui-zhang xiaohui-zhang Nov 27, 2017


This part is the core of this script and definitely needs more explanation, e.g. "Since in OCR the lexicon is purely graphemic, we can just concatenate the phones from the most probable phone sequence given by the unk-model to produce the predicted word," because usually we need a P2G model in order to map a phone sequence to a predicted word.

Contributor

Also please add some explanation in the header, e.g. change "then it will replace the <unk> with the word predicted by <unk> model" to "then it will replace the <unk> with the word predicted by the <unk> model by concatenating phones decoded from the unk-model."

Contributor

Done, thanks.

@danpovey
Contributor

danpovey commented Jan 4, 2018

I haven't merged this because it's still marked WIP. Let me know when you think it's ready.

@hhadian hhadian changed the title [WIP] Add OCR/Handwriting Recognition examples Add OCR/Handwriting Recognition examples Jan 4, 2018
@danpovey danpovey merged commit 8292e4c into kaldi-asr:master Jan 4, 2018
danpovey pushed a commit to danpovey/kaldi that referenced this pull request Jan 5, 2018
* OCR: Add IAM corpus with unk decoding support (#3)

* Add a new English OCR database 'UW3'

* Some minor fixes re IAM corpus

* Fix an issue in IAM chain recipes + add a new recipe (#6)

* Some fixes based on the pull request review

* Various fixes + cleaning on IAM

* Fix LM estimation and add extended dictionary + other minor fixes

* Add README for IAM

* Add output filter for scoring

* Fix a bug RE switch to pyhton3

* Add updated results + minor fixes

* Remove unk decoding -- gives almost no gain

* Add UW3 OCR database

* Fix cmd.sh in IAM + fix usages of train/decode_cmd in chain recipes

* Various minor fixes on UW3

* Rename iam/s5 to iam/v1

* Add README file for UW3

* Various cosmetic fixes on UW3 scripts

* Minor fixes in IAM
eginhard pushed a commit to eginhard/kaldi that referenced this pull request Jan 11, 2018
mahsa7823 pushed a commit to mahsa7823/kaldi that referenced this pull request Feb 28, 2018
Skaiste pushed a commit to Skaiste/idlak that referenced this pull request Sep 26, 2018