<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<!-- Copyright 1999-2002 Carnegie Mellon University. Portions Copyright 2002 Sun Microsystems, Inc. Portions Copyright 2002 Mitsubishi Electric Research Laboratories. All Rights Reserved. Use is subject to license terms. See the file "license.terms" for information on usage and redistribution of this file, and for a DISCLAIMER OF ALL WARRANTIES. -->
<title>Sphinx-4 - A speech recognizer written entirely in the
Java(TM) programming language</title>
<style type="text/css">
/*<![CDATA[*/
pre { font-size: medium; background: #f0f8ff; padding: 2mm; border-style: ridge ; color: teal}
code {font-size: medium; color: teal}
s4keyword { color: red; font-weight: bold }
/*]]>*/
</style>
<meta name="generator" content=
"HTML Tidy for Linux/x86 (vers 14 June 2007), see www.w3.org" />
</head>
<body style="background-color: white;">
<div style="text-align: center;">
<table bgcolor="#99CBFF" border="0" width="100%">
<tbody>
<tr>
<td align="center" width="100%">
<h1><i>Sphinx-4</i><br />
<span style="font-size: larger;">A speech recognizer
written entirely in the Java<sup>TM</sup> programming
language</span></h1>
</td>
</tr>
</tbody>
</table>
</div>
<table border="0" width="100%">
<tbody>
<tr>
<td bgcolor="#F0F8FF" valign="top" width="20%">
<br />
<b><span>Sphinx-4 Links</span></b>
<p>SourceForge</p>
<ul>
<li><a href=
"http://sourceforge.net/projects/cmusphinx">Project
Page</a></li>
<li><a href=
"http://sourceforge.net/forum/?group_id=1904">Forums</a></li>
<li><a href=
"http://sourceforge.net/projects/cmusphinx/files/sphinx4">
Download</a></li>
<li><a href=
"https://sourceforge.net/scm/?type=svn&amp;group_id=1904">SVN
Repository</a></li>
</ul>
<p><a href=
"http://cmusphinx.sourceforge.net">CMU Sphinx Website</a></p>
<p><a href=
"http://cmusphinx.sourceforge.net/sphinx4/javadoc/index.html">
Sphinx-4 Javadocs</a></p>
<p><a href=
"http://cmusphinx.sourceforge.net/wiki/">Sphinx-4
Wiki</a></p>
<hr />
<b>ZipCity</b> - A
demonstration of Sphinx-4 using Java Web Start
technology. <a href=
"src/apps/edu/cmu/sphinx/demo/zipcity/README.html">
Read more</a> about the ZipCity demo, or <a href=
"http://cmusphinx.sourceforge.net/sphinx4/src/apps/edu/cmu/sphinx/demo/zipcity/zipcity.jnlp">
Try it</a>.
<p><a href=
"http://cmusphinx.sourceforge.net/sphinx4/src/apps/edu/cmu/sphinx/demo/zipcity/zipcity.jnlp">
<img src="doc-files/zipcity.gif" /></a></p>
<hr />
<div style="text-align: center;">
<a href=
"http://www.sourceforge.net"><img src=
"http://sourceforge.net/sflogo.php?group_id=1904&amp;type=1"
alt="SourceForge Logo" border="0" height="31" width=
"88" /></a><br />
Hosted by SourceForge.net
</div><br />
<div style="text-align: center;">
<a href="http://www.jetbrains.com/idea/">
<img src=
"http://www.jetbrains.com/idea/opensource/img/banners/idea88x31_blue.gif"
alt="The best Java IDE" border="0" /></a> <br />Developed
with IntelliJ
<br />
Profiled with <a href=
"http://www.ej-technologies.com/products/jprofiler/overview.html">
JProfiler</a>
</div>
</td>
<td width="5%"><br /></td>
<td valign="top">
<br />
<h3>General Information</h3>
<ul>
<li><a href="#what_is_sphinx4">Introduction</a></li>
<li><a href="#capabilities">Capabilities</a></li>
<li><a href="#speed_and_accuracy">Performance</a></li>
</ul>
<h3>Installation</h3>
<ul>
<li><a href="#download_and_install">Required
Software</a></li>
<li><a href="#source">Downloading Sphinx-4</a></li>
<li><a href="#how_build">Building Sphinx-4</a></li>
<li><a href="#create_javadocs">Create
Javadocs</a></li>
<li><a href="#setupide">How to set up your IDE (Eclipse,
NetBeans, IntelliJ IDEA)</a></li>
<li><a href="#demos">Running the demonstration
programs</a></li>
</ul>
<h3>Sphinx-4 in Detail</h3>
<ul>
<li><a href="#whitepaper">Sphinx-4 Whitepaper</a></li>
<li><a href="doc/Sphinx4-faq.html">FAQ: Frequently
asked questions about Sphinx-4 (with answers)</a></li>
<li><a href="#sphinx_properties">Understanding Sphinx-4
Configuration Management</a></li>
<li><a href="#sphinx_instrumentation">Understanding
Sphinx-4 Instrumentation</a></li>
<li><a href=
"javadoc/edu/cmu/sphinx/frontend/doc-files/FrontEndFAQ.html">
Front End</a></li>
<li style="list-style: none; display: inline">
<ul>
<li><a href=
"javadoc/edu/cmu/sphinx/frontend/doc-files/FrontEndConfiguration.html">
Configuration</a></li>
<li><a href=
"javadoc/edu/cmu/sphinx/frontend/doc-files/FrontEndFAQ.html#create_cepstra">
Creating spectrum/cepstrum</a></li>
<li><a href=
"javadoc/edu/cmu/sphinx/frontend/doc-files/FrontEndFAQ.html#decode_cepstra">
Decoding cepstra</a></li>
<li><a href=
"javadoc/edu/cmu/sphinx/frontend/doc-files/FrontEndFAQ.html#enable_endpointer">
Enabling the endpointer</a></li>
</ul>
</li>
<li><a href="#batch_tests">Running the Regression
Tests</a></li>
<li>
<a href="#setup_test">Setting up a Regression
Test</a>
<ul>
<li><a href="#batch_files">Batch Files</a></li>
<li><a href="#input_files">Input Audio/Cepstral
Files</a></li>
<li><a href="#an4_walkthrough">Example: Setting up
AN4 tests</a></li>
</ul>
</li>
<li><a href="#acoustic_models">Creating and using
Acoustic Model Package</a></li>
<li><a href="#language_models">Creating Language
Models</a></li>
<li><a href="#bnf_grammars">BNF Style Grammars</a></li>
<li><a href="#architecture_and_api1">Architecture and
API</a></li>
<li><a href="doc/ProgrammersGuide.html">Programmer's
Guide</a></li>
</ul>
</td>
</tr>
</tbody>
</table>
<hr />
<h2>General Information about Sphinx-4</h2>
<ul>
<li>
<a name="what_is_sphinx4" id=
"what_is_sphinx4"><b>Introduction</b></a>
<p>Sphinx-4 is a state-of-the-art speech recognition system
written entirely in the Java<sup>TM</sup> programming language.
It was created through a collaboration between the Sphinx
group at Carnegie Mellon University, Sun Microsystems
Laboratories, Mitsubishi Electric Research Labs (MERL), and
Hewlett Packard (HP), with contributions from the University
of California at Santa Cruz (UCSC) and the Massachusetts
Institute of Technology (MIT).</p>
<p>Sphinx-4 started out as a port of Sphinx-3 to the Java
programming language, but evolved into a recognizer designed
to be much more flexible than Sphinx-3, thus becoming an
excellent platform for speech research.<br />
</p>
</li>
<li>
<a name="capabilities" id=
"capabilities"><b>Capabilities</b></a>
<p>Live mode and batch mode speech recognizers, capable of
recognizing discrete and continuous speech.</p>
<p>Generalized pluggable <a href=
"./javadoc/edu/cmu/sphinx/frontend/package-summary.html"><b>front
end</b></a> architecture. Includes pluggable implementations
of <a href=
"./javadoc/edu/cmu/sphinx/frontend/filter/Preemphasizer.html">
preemphasis</a>, <a href=
"./javadoc/edu/cmu/sphinx/frontend/window/RaisedCosineWindower.html">
Hamming window</a>, <a href=
"./javadoc/edu/cmu/sphinx/frontend/transform/DiscreteFourierTransform.html">
FFT</a>, <a href=
"./javadoc/edu/cmu/sphinx/frontend/frequencywarp/MelFrequencyFilterBank.html">
Mel frequency filter bank</a>, <a href=
"./javadoc/edu/cmu/sphinx/frontend/transform/DiscreteCosineTransform.html">
discrete cosine transform</a>, <a href=
"./javadoc/edu/cmu/sphinx/frontend/feature/BatchCMN.html">cepstral
mean normalization</a>, and <a href=
"./javadoc/edu/cmu/sphinx/frontend/feature/DeltasFeatureExtractor.html">
feature extraction</a> of cepstra, delta cepstra, and double
delta cepstra features.</p>
<p>Generalized pluggable <b>language model</b> architecture.
Includes pluggable language model support for <a href=
"./javadoc/edu/cmu/sphinx/linguist/language/ngram/SimpleNGramModel.html">
ASCII</a> and <a href=
"./javadoc/edu/cmu/sphinx/linguist/language/ngram/large/LargeTrigramModel.html">
binary</a> versions of unigram, bigram, trigram, <a href=
"./javadoc/edu/cmu/sphinx/jsgf/JSGFGrammar.html">Java Speech
API Grammar Format (JSGF)</a>, and <a href=
"./javadoc/edu/cmu/sphinx/linguist/language/grammar/FSTGrammar.html">
ARPA-format FST grammars</a>.</p>
<p>Generalized <a href=
"./javadoc/edu/cmu/sphinx/linguist/acoustic/package-summary.html">
<b>acoustic model</b></a> architecture. Includes pluggable
support for <a href=
"./javadoc/edu/cmu/sphinx/linguist/acoustic/tiedstate/Sphinx3Loader.html">
Sphinx-3 acoustic models</a>.</p>
<p>Generalized <a href=
"./javadoc/edu/cmu/sphinx/decoder/search/package-summary.html">
<b>search management</b></a>. Includes pluggable support for
<a href=
"./javadoc/edu/cmu/sphinx/decoder/search/SimpleBreadthFirstSearchManager.html">
breadth first</a> and <a href=
"./javadoc/edu/cmu/sphinx/decoder/search/WordPruningBreadthFirstSearchManager.html">
word pruning</a> searches.</p>
<p>Utilities for post-processing recognition results,
including <a href=
"./javadoc/edu/cmu/sphinx/result/ConfidenceScorer.html">obtaining
confidence scores</a>, <a href=
"./javadoc/edu/cmu/sphinx/result/Lattice.html">generating
lattices</a> and <a href=
"./javadoc/edu/cmu/sphinx/tools/tags/package-summary.html">embedding
ECMAScript into JSGF tags</a>.</p>
<p>Standalone tools. Includes tools for <a href=
"./javadoc/edu/cmu/sphinx/tools/audio/package-summary.html">displaying
waveforms and spectrograms</a> and <a href=
"./javadoc/edu/cmu/sphinx/tools/feature/package-summary.html">
generating features from audio</a>.</p>
<p>(<b>NOTE:</b> The links in this section point to local
files created by javadoc. If they are broken, please follow
the instructions on <a href="#create_javadocs">Create
Javadocs</a> to create these links.)<br />
</p>
</li>
<li>
<a name="speed_and_accuracy" id=
"speed_and_accuracy"><b>Performance</b></a>
<p>Sphinx-4 is a very flexible system capable of performing
many different types of recognition tasks. As such, it is
difficult to characterize its performance with just a few
simple numbers such as speed and accuracy. Instead, we
regularly run regression tests on Sphinx-4 to determine how
it performs on a variety of tasks. These tasks are listed
below, followed by a table of their latest results (each task
is progressively more difficult than the previous
one):</p>
<ul>
<li>Isolated Digits (TI46): Runs Sphinx-4 with pre-recorded
test data to gather performance metrics for recognizing
just one word at a time. The vocabulary is merely the
spoken digits from 0 through 9, with a single utterance
containing just one digit.<br />
<i>(TI46 refers to the "NIST CD-ROM Version of the Texas
Instruments-developed 46-Word Speaker-Dependent Isolated
Word Speech Database".)</i></li>
<li>Connected Digits (TIDIGITS): Extends the Isolated
Digits test to recognize more than one word at a time
(i.e., continuous speech). The vocabulary is merely the
spoken digits from 0 through 9, with a single utterance
containing a sequence of digits.<br />
<i>(TIDIGITS refers to the "NIST CD-ROM Version of the
Texas Instruments-developed Studio Quality
Speaker-Independent Connected-Digit Corpus".)</i></li>
<li>Small Vocabulary (AN4): Extends the vocabulary to
approximately 100 words, with input data that includes
both spoken words and words spelled out letter by
letter.</li>
<li>Medium Vocabulary (RM1): Extends the vocabulary to
approximately 1,000 words.</li>
<li>Medium Vocabulary (WSJ5K): Extends the vocabulary to
approximately 5,000 words.</li>
<li>Medium Vocabulary (WSJ20K): Extends the vocabulary to
approximately 20,000 words.</li>
<li>Large Vocabulary (HUB4): Extends the vocabulary to
approximately 64,000 words.</li>
</ul>
</li>
<li style="list-style: none; display: inline">
<p>The following table compares the performance of Sphinx 3.3
with Sphinx-4.</p>
<table border="1" cellpadding="1" cellspacing="0">
<tbody>
<tr>
<th bgcolor="#E0E8FF"><strong>Test</strong></th>
<th bgcolor="#E0E8FF"><strong>S3.3 WER</strong></th>
<th bgcolor="#E0E8FF"><strong>S4 WER</strong></th>
<th bgcolor="#E0E8FF"><strong>S3.3 RT</strong></th>
<th bgcolor="#E0E8FF"><strong>S4 RT(1)</strong></th>
<th bgcolor="#E0E8FF"><strong>S4 RT(2)</strong></th>
<th bgcolor="#E0E8FF"><strong>Vocabulary
Size</strong></th>
<th bgcolor="#E0E8FF"><strong>Language
Model</strong></th>
</tr>
<tr>
<th bgcolor="#F0F8FF"><strong>TI46</strong></th>
<td align="right">1.217</td>
<td align="right">0.168</td>
<td align="right">0.14</td>
<td align="right">0.03</td>
<td align="right">0.02</td>
<td align="right">11</td>
<td>isolated digits recognition</td>
</tr>
<tr>
<th bgcolor="#F0F8FF"><strong>TIDIGITS</strong></th>
<td align="right">0.661</td>
<td align="right">0.549</td>
<td align="right">0.16</td>
<td align="right">0.07</td>
<td align="right">0.05</td>
<td align="right">11</td>
<td>continuous digits</td>
</tr>
<tr>
<th bgcolor="#F0F8FF"><strong>AN4</strong></th>
<td align="right">1.300</td>
<td align="right">1.192</td>
<td align="right">0.38</td>
<td align="right">0.25</td>
<td align="right">0.20</td>
<td align="right">79</td>
<td>trigram</td>
</tr>
<tr>
<th bgcolor="#F0F8FF"><strong>RM1</strong></th>
<td align="right">2.746</td>
<td align="right">2.88</td>
<td align="right">0.50</td>
<td align="right">0.50</td>
<td align="right">0.41</td>
<td align="right">1,000</td>
<td>trigram</td>
</tr>
<tr>
<th bgcolor="#F0F8FF"><strong>WSJ5K</strong></th>
<td align="right">7.323</td>
<td align="right">6.97</td>
<td align="right">1.36</td>
<td align="right">1.22</td>
<td align="right">0.96</td>
<td align="right">5,000</td>
<td>trigram</td>
</tr>
<tr>
<th bgcolor="#F0F8FF"><strong>HUB4</strong></th>
<td align="right">18.845</td>
<td align="right">18.756</td>
<td align="right">3.06</td>
<td align="right">~4.4</td>
<td align="right">3.95</td>
<td align="right">60,000</td>
<td>trigram</td>
</tr>
</tbody>
</table>
<p>Note that performance work on the HUB4 test is not
complete.</p>
<p>Key:</p>
<ul>
<li><strong>WER</strong> - Word error rate (%) (lower is
better)</li>
<li><strong>RT</strong> - Real Time - Ratio of processing
time to audio time - (lower is better)</li>
<li><strong>S3.3 RT</strong> - Results for a single or dual
CPU configuration</li>
<li><strong>S4 RT(1)</strong> - Results on a single-CPU
configuration</li>
<li><strong>S4 RT(2)</strong> - Results for a dual-CPU
configuration</li>
</ul>
<p>This data was collected on a dual-CPU UltraSPARC(R)-III
running at 1015 MHz with 2 GB of memory.</p>
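<p>As a rough illustration of how the two metrics in the key are
computed, the sketch below works through one made-up utterance. All
of the numbers are hypothetical and are not taken from the
regression tests. The WER formula used here is the conventional one
(substitutions plus deletions plus insertions, divided by the number
of reference words), which this page does not spell out, so treat it
as an assumption.</p>
<pre>
# Hypothetical measurements for one utterance (not real test data).
audio_seconds=10       # length of the recording
cpu_seconds=5          # time the recognizer spent decoding it
ref_words=20           # words in the reference transcript
subs=1; dels=1; ins=0  # errors counted against the reference

# RT: ratio of processing time to audio time (lower is better).
rt=$(awk -v p="$cpu_seconds" -v a="$audio_seconds" 'BEGIN { printf "%.2f", p / a }')

# WER (%): (substitutions + deletions + insertions) / reference words * 100.
wer=$(awk -v s="$subs" -v d="$dels" -v i="$ins" -v n="$ref_words" 'BEGIN { printf "%.1f", (s + d + i) / n * 100 }')

echo "RT=$rt WER=$wer%"
</pre>
<p>With these made-up numbers the recognizer runs at half of real
time (RT 0.50) and gets 2 of 20 reference words wrong (WER
10.0%).</p>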
</li>
</ul>
<hr />
<h2>Installation</h2>
<ul>
<li>
<a name="download_and_install" id=
"download_and_install"><b>Required Software</b></a>
<p>Sphinx-4 has been built and tested on the Solaris
<sup>TM</sup>
Operating Environment, Mac OS X, Linux and Win32 operating
systems. Running, building, and testing Sphinx-4 requires
additional software. Before you start, you will need the
following software available on your machine.</p>
<ul>
<li><b>Java SE 6 Development Kit</b> or better. Go to
<a href="http://java.sun.com">java.sun.com</a>, and select
"J2SE" from popular downloads. At the time of writing, the
latest release version is JDK 6 Update 14, which is the one
we recommend.</li>
</ul><br />
<ul>
<li><b>Ant 1.6.0</b> or better, available at <a href=
"http://ant.apache.org">ant.apache.org</a>. The site has a
manual with instructions on how to download, install, and
use ant. You will only need ant if you wish to build
Sphinx-4 from the source distribution.</li>
<li><b>Subversion (svn)</b>, but only if you want to
interact directly with the svn tree (which we recommend).
The canonical place to get it is <a href=
"http://subversion.tigris.org/">subversion.tigris.org</a>.
If you are using Windows, your best choice is to install
<a href="http://cygwin.com">cygwin</a>, which will give you
a Linux-like environment in a command prompt window. Make
sure to choose "svn" when you install cygwin.</li>
</ul>
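<p>A quick way to see whether these tools are already on your
<code>PATH</code> is sketched below. It assumes a POSIX-style shell
(for example bash, or the cygwin shell on Windows); the loop and its
messages are illustrative only and are not part of Sphinx-4.</p>
<pre>
# Report which of the required tools can be found on the PATH.
# "command -v" prints the location of a tool when it is found.
missing=""
for tool in java ant svn; do
    if command -v "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: MISSING"
        missing="$missing $tool"
    fi
done
</pre>
<p>For each tool that is found, <code>java -version</code>,
<code>ant -version</code> and <code>svn --version</code> print the
installed version, which you can compare against the minimums listed
above.</p>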
</li>
<li>
<a name="source" id="source"><b>Downloading
Sphinx-4</b></a><br />
<ul>
<li>
<b>Instructions for retrieving code from a release
package.</b>
<p>Sphinx-4 has two packages available for <a href=
"http://sourceforge.net/projects/cmusphinx/files/sphinx4">
download</a>:</p>
<ul>
<li><b>sphinx4-{version}-bin.zip</b>: provides the jar
files, documentation, and demos</li>
<li><b>sphinx4-{version}-src.zip</b>: provides the
sources, documentation, demos, unit tests and
regression tests.</li>
</ul>
<p>See <a href="doc/Sphinx4-faq.html#which_dist">this FAQ
question</a> to help determine whether you should get the
binary or the source distribution.</p>
<p>After you have downloaded the distribution, unpack the
ZIP files using the <code>jar</code> command, which is in
the <code>bin</code> directory of your Java
installation:</p>
<pre>
jar xvf sphinx4-{version}-bin.zip
jar xvf sphinx4-{version}-src.zip
</pre>
<p>For both downloads, a directory called
"sphinx4-{version}" will be created.</p>
<p>The RM1 acoustic model and the HUB4
acoustic and language models are also available for download at
the same location on SourceForge. Download them only if
you want to run the regression tests for RM1 and
HUB4.</p>
</li>
<li>
<a name="svn" id="svn"><b>Instructions for retrieving
code from the svn repository</b></a>
<p>If you want to be able to get the latest updates from
the svn repository, you should retrieve the code from the
repository on SourceForge. The Sphinx-4 code is located
at <a href=
"http://sourceforge.net/projects/cmusphinx">sourceforge.net</a>
as open source. Please follow the instructions below to
retrieve it.</p>
<ul>
<li>Get the code from sourceforge.net:
<pre>
% svn co https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/sphinx4
</pre>
</li>
</ul>
</li>
</ul>
</li>
<li>
<a name="how_build" id="how_build"><b>Building
Sphinx-4</b></a>
<p>Since the sphinx4-{version}-bin.zip distribution does not
contain the source code, you must download
sphinx4-{version}-src.zip, or retrieve the code from
SourceForge using svn, in order to build from the
sources. The software required for building Sphinx-4 is
listed in the <a href="#download_and_install">Required
Software</a> section.</p>
<p><b>Run ant</b></p>
<p>To build Sphinx-4, at the command prompt change to the
directory where you installed Sphinx-4 (usually, a simple "cd
sphinx4" will do). Then set the required environment variables:
<code>JAVA_HOME</code> to the location of the JDK, <code>ANT_HOME</code>
to the location of ant, and <code>PATH</code> to include
the <code>bin</code> subfolders of both the JDK and ant.
For example:</p>
<pre>
export JAVA_HOME=/usr/local/jdk1.6.0_14
export ANT_HOME=/usr/local/apache-ant-1.8.0
export PATH=$JAVA_HOME/bin:$ANT_HOME/bin:$PATH
</pre>
<p>Then type the following:</p>
<pre>
ant
</pre>
<p>This executes the <a href="http://ant.apache.org/">Apache
Ant</a> command to build the Sphinx-4 classes under the
<code>bld</code> directory, the jar files under the
<code>lib</code> directory, and the demo jar files under the
<code>bin</code> directory.</p>
<p>To delete all the output from the build and start
fresh:</p>
<pre>
ant clean
</pre>
</li>
<li>
<a name="create_javadocs" id="create_javadocs"><b>Create
Javadocs</b></a>
<p>The javadocs have already been built if you downloaded the
sphinx4-{version}-bin.zip. In order to build the javadocs
yourself, you must download the sphinx4-{version}-src.zip
distribution instead. To build the javadocs, go to the top
level directory ("sphinx4-{version}"), and type:</p>
<pre>
ant javadoc
</pre>
<p>This will build javadocs from public classes, displaying
only the public methods and fields. In general, this is all
the information you will need. If you need more details, such
as private or protected classes, you can generate the
corresponding javadoc by doing, for example:</p>
<pre>
ant -Daccess=private javadoc
</pre>
</li>
<li>
<a name="setupide" id="setupide"><b>How to set up your IDE
(Eclipse, NetBeans, IntelliJ IDEA)</b></a>
<p>The setup is straightforward:<br /></p>
<ol>
<li>Add all <span style=
"font-weight: bold;">subfolders</span>(!) of the
<code>src</code> directory as source folders to your
project.</li>
</ol>To perform common tasks (such as deploying
sphinx4.jar, the models, or the demo jars) directly from
within your IDE, you might also want to add the bundled
<code>build.xml</code> as a project ant file. In most cases
this can be done by right-clicking
<code>build.xml</code> in the file navigator pane of your IDE
and selecting "Add as project ant file". To debug the
demo applications, you also need to add the
<code>src/apps</code> folder and the acoustic model jars
(which can be deployed to the
<code>lib</code> directory with a simple
<code>ant all</code>) to your classpath.<br />
</li>
</ul>
<hr />
<h2><a name="demos" id="demos">Demos</a></h2>
<p>Sphinx-4 contains a number of demo programs. If you downloaded
the binary distribution (sphinx4-{version}-bin.zip), the JAR
files of the demos are already built, so you can just run them
directly. However, if you downloaded the source distribution
(sphinx4-{version}-src.zip or via svn), you need to build the
demos. Click on the links below for instructions on how to build
and run the demos.</p>
<h3>Simple demos to get started with Sphinx-4</h3>
<ul>
<li><a href=
"src/apps/edu/cmu/sphinx/demo/helloworld/README.html">Hello
World Demo</a>: a command line application that recognizes
simple phrases.</li>
<li><a href=
"src/apps/edu/cmu/sphinx/demo/hellongram/README.html">Hello
N-Gram Demo</a>: a command line application using an N-gram
language model for speech recognition.</li>
</ul>
<h3>Demos for audio file transcription</h3>
<ul>
<li><a href=
"src/apps/edu/cmu/sphinx/demo/transcriber/README.html">Transcriber
Demo</a>: a simple demo program showing how to transcribe a
continuous audio file that has multiple utterances separated by
silences.</li>
<li><a href=
"src/apps/edu/cmu/sphinx/demo/confidence/README.html">Confidence
Demo</a>: a simple demo program showing how to obtain
confidence scores for recognition results.</li>
<li><a href=
"src/apps/edu/cmu/sphinx/demo/lattice/README.html">Lattice
Demo</a>: a simple demo program showing how to extract lattices
from recognition results.</li>
<li><a href=
"src/apps/edu/cmu/sphinx/demo/classbased/README.html">Class-Based
Language model Demo</a>: a simple demo of the class based
language model.</li>
<li><a href=
"src/apps/edu/cmu/sphinx/demo/aligner/README.html">Aligner
Demo</a>: aligns an audio file with its transcription and obtains
the times of the words, which can be useful for closed captioning.</li>
</ul>
<h3>Dialog demos for writing advanced dialog systems</h3>
<ul>
<li><a href=
"src/apps/edu/cmu/sphinx/demo/dialog/README.html">Dialog
Demo</a>: a demo program showing how a program can swap between
multiple JSGF and dictation grammars.</li>
</ul>
<p>There is also a <a href="tests/live/README.html">live-mode
test program</a>, which is available only in the source
distribution (sphinx4-{version}-src.zip) and not in the binary
distribution (sphinx4-{version}-bin.zip), so this link works
only if you downloaded the source distribution.</p>
<p>The <a href=
"javadoc/edu/cmu/sphinx/tools/audio/doc-files/HowToRunAudioTool.html">
AudioTool</a> is a visual tool that records and displays the
waveform and spectrogram of an audio signal. It is available in
both the binary and source releases.</p>
<hr />
<h2>Sphinx-4 in Detail</h2>
<ul>
<li><a name="whitepaper" id="whitepaper"><b>Sphinx-4
Whitepaper</b></a> <a href=
"doc/Sphinx4Whitepaper.pdf">Sphinx-4: A Flexible Open Source
Framework for Speech Recognition</a> describes the framework
and implementation of Sphinx-4 from a speech-technologist's
perspective. Please read this if you'd like to extend
Sphinx-4.</li>
<li><a name="sphinx_properties" id="sphinx_properties"><b>FAQ:
Frequently asked questions about Sphinx-4</b></a> The document
<a href="doc/Sphinx4-faq.html">Frequently asked questions about
Sphinx-4</a> contains answers to a number of frequently asked
questions about Sphinx-4.</li>
<li>
<a name="sphinx_properties" id=
"sphinx_properties"><b>Understanding Sphinx-4 Configuration
Management</b></a>
<p>
The document <a href=
"javadoc/edu/cmu/sphinx/util/props/doc-files/ConfigurationManagement.html">
Sphinx-4 Configuration Management</a> describes, in detail,
how to configure a Sphinx-4 system.<br />
</p>
</li>
<li>
<a name="sphinx_instrumentation" id=
"sphinx_instrumentation"><b>Understanding Sphinx-4
Instrumentation</b></a>
<p>The document <a href=
"javadoc/edu/cmu/sphinx/instrumentation/doc-files/Instrumentation.html">
Sphinx-4 Instrumentation</a> describes, in detail, how to use
the instrumentation facilities of the Sphinx-4 system.<br />
</p>
</li>
<li>
<a name="batch_tests" id="batch_tests"><b>Running the
Regression Tests</b></a>
<p>Sphinx-4 contains a number of regression tests using
common speech databases. Again, you have to download the
source distribution or check out the source tree using svn
in order to get the regression tests directory. The
regression tests we have are:</p>
<ul>
<li><a href="#isolated_digits_test">Isolated Digits -
TI46</a></li>
<li><a href="#connected_digits_test">Connected Digits -
TIDIGITS</a></li>
<li><a href="#small_vocab_test">Small Vocabulary -
AN4</a></li>
<li><a href="#medium_vocab_test">Medium Vocabulary -
RM1</a></li>
<li><a href="#medium_vocab_test_wsj">Medium Vocabulary -
WSJ</a></li>
<li><a href="#large_vocab_test">Large Vocabulary -
HUB4</a></li>
</ul>
<p>Before you run any of the tests, make sure that you have
built Sphinx-4 already. To do so, go to the top level and
type:</p>
<pre>
ant
</pre>
<p>You also need to make sure you have the appropriate
acoustic model(s) installed. More details below.</p>
<p>The Sphinx-4 regression tests have different directories
for the different tasks. The directory
sphinx4/tests/performance contains directories named ti46,
tidigits, an4, rm1, hub4, and some other tests. Each of these
directories contains a build.xml with targets specific to the
particular task. The build.xml allows you to run a number of
different tests. Type:</p>
<pre>
ant -projecthelp
</pre>
to list a help text with the possible targets.
<p><a name="isolated_digits_test" id=
"isolated_digits_test"><br />
<b>Isolated Digits - TI46</b></a></p>
<p>The TIDIGITS models are already included as part of the
distribution. Therefore, you do not need to download them
separately. You must have the TI46 test data, available from
the <a href=
"http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC93S9">
LDC TI46</a> website.</p>
<p>You need to edit the batch file called
<code>ti46.batch</code>, located in the
<code>tests/performance/ti46</code> directory. You will need
to change it so that it matches where you stored the TI46
test files. Refer to the section <a href="#batch_files">Batch
Files</a> for details about the format of batch files.</p>
<p>To run the tests:</p>
<pre>
% cd sphinx4/tests/performance/ti46
% ant -projecthelp    # to see a list of possible targets
% ant ti46_wordlist
</pre>
<p><a name="connected_digits_test" id=
"connected_digits_test"><br />
<b>Connected Digits - TIDIGITS</b></a></p>
<p>The TIDIGITS models are already included as part of the
distribution. Therefore, you do not need to download them
separately.</p>
<p>You must have the TIDIGITS test data, available from the
<a href=
"http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC93S10">
LDC TIDIGITS</a> website.</p>
<p>You need to edit the batch file called
<code>tidigits.batch</code>, located in the
<code>tests/performance/tidigits</code> directory. You will
need to change it such that it matches where you stored the
TIDIGITS test files. Refer to the section <a href=
"#batch_files">Batch Files</a> for details about the format of
batch files.</p>
<p>To run the tests:</p>
<pre>
% cd sphinx4/tests/performance/tidigits
% ant -projecthelp    # to see a list of possible targets
% ant tidigits_flat_unigram
</pre>
<p><a name="small_vocab_test" id="small_vocab_test"><br />
<b>Small Vocabulary - AN4</b></a></p>
<p>The Wall Street Journal (WSJ) models are already included
as part of the distribution. Therefore, you do not need to
download them separately.</p>
<p>Download the big endian raw audio format of the <a href=
"http://www.speech.cs.cmu.edu/databases/an4/">AN4
Database</a>. Unpack it in a directory of your choice:</p>
<pre>
% gunzip an4_raw.bigendian.tar.gz
% tar -xvf an4_raw.bigendian.tar
</pre>
<p>Then update the following batch files (located in the
<code>tests/performance/an4</code> directory), so that they
match up with where you unpacked the AN4 data. You probably
just need to replace all instances of the string
<code>"/lab/speech/sphinx4/data"</code> inside these batch
files. Please refer to the <a href="#batch_files">Batch
Files</a> section for details about batch files:</p>
<p><code>an4_full.batch<br />
an4_spelling.batch<br />
an4_words.batch</code></p>
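<p>One possible way to make that replacement in all three files at
once is sketched below. The helper name <code>relocate_batch</code>
and the directory <code>/home/me/data</code> are hypothetical, and
the sketch assumes a <code>sed</code> that supports in-place editing
with a backup suffix (GNU and BSD sed do) and paths that do not
contain the <code>|</code> character or other characters that are
special to <code>sed</code>.</p>
<pre>
# relocate_batch OLD NEW FILE...
# Replace every occurrence of OLD with NEW in the given files,
# leaving a FILE.bak backup next to each one.
relocate_batch() {
    old=$1; new=$2; shift 2
    sed -i.bak "s|$old|$new|g" "$@"
}

# Example call, run from sphinx4/tests/performance/an4
# (/home/me/data is a placeholder for your own data directory):
#   relocate_batch /lab/speech/sphinx4/data /home/me/data \
#       an4_full.batch an4_spelling.batch an4_words.batch
</pre>
<p>Keeping the <code>.bak</code> copies makes it easy to compare
the edited batch files against the originals before running the
tests.</p>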
<p>After you have updated the batch files, you can run the
tests as follows:</p>
<pre>
% cd sphinx4/tests/performance/an4
% ant -projecthelp    # to see a list of possible targets
% ant an4_words_unigram
</pre>