
Commit

trunk: minor fix to dictionary preparation script for Fisher English
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@4708 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
danpovey committed Dec 20, 2014
1 parent 8797aac commit 8e7793f
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion egs/fisher_english/s5/local/fisher_prepare_dict.sh
@@ -42,7 +42,8 @@ cat $dir/silence_phones.txt| awk '{printf("%s ", $1);} END{printf "\n";}' > $dir

grep -v ';;;' $dir/cmudict/cmudict.0.7a | tr '[A-Z]' '[a-z]' | \
perl -ane 'if(!m:^;;;:){ s:(\S+)\(\d+\) :$1 :; s:  : :; print; }' | \
-sed s/[0-9]//g | sort | uniq > $dir/lexicon1_raw_nosil.txt || exit 1;
+perl -ane '@A = split(" ", $_); for ($n = 1; $n<@A;$n++) { $A[$n] =~ s/[0-9]//g; } print join(" ", @A) . "\n";' | \
+sort | uniq > $dir/lexicon1_raw_nosil.txt || exit 1;

# Add prons for laughter, noise, oov
for w in `grep -v sil $dir/silence_phones.txt`; do
@@ -92,6 +93,7 @@ cat $dir/lexicon3_expand.txt \


cp $dir/lexicon4_extra.txt $dir/lexicon.txt
+rm $dir/lexiconp.txt 2>/dev/null; # can confuse later script if this exists.

awk '{print $1}' $dir/lexicon.txt | \
perl -e '($word_counts)=@ARGV;