From 08024e9019050937e813ee59ab5fe697e5894023 Mon Sep 17 00:00:00 2001 From: kkm Date: Mon, 3 Aug 2015 20:35:40 -0700 Subject: [PATCH 1/3] Documentation changes: added Git tutorial, removed Subversion tutorial and updated multiple references from Subversion to Git and from SourceForge to Kaldi's own web site or GitHub as appropriate. --- src/doc/about.dox | 45 +++--- src/doc/dependencies.dox | 14 +- src/doc/install.dox | 52 +++---- src/doc/legal.dox | 30 ++-- src/doc/other.dox | 17 +-- src/doc/tutorial.dox | 2 +- src/doc/tutorial_code.dox | 37 ++--- src/doc/tutorial_git.dox | 259 +++++++++++++++++++++++++++++++++++ src/doc/tutorial_looking.dox | 30 ++-- src/doc/tutorial_prereqs.dox | 6 +- src/doc/tutorial_setup.dox | 18 +-- src/doc/tutorial_svn.dox | 70 ---------- 12 files changed, 375 insertions(+), 205 deletions(-) create mode 100644 src/doc/tutorial_git.dox delete mode 100644 src/doc/tutorial_svn.dox diff --git a/src/doc/about.dox b/src/doc/about.dox index 08c46bf93d9..dbb06791d15 100644 --- a/src/doc/about.dox +++ b/src/doc/about.dox @@ -33,12 +33,12 @@ @section about_name The name Kaldi According to legend, Kaldi was the Ethiopian goatherder who discovered the - coffee plant. + coffee plant. @section about_compare Kaldi's versus other toolkits Kaldi is similar in aims and scope to HTK. The goal is to have modern and - flexible code, written in C++, that is easy to modify and extend. + flexible code, written in C++, that is easy to modify and extend. Important features include: - Code-level integration with Finite State Transducers (FSTs) - We compile against the OpenFst toolkit (using it as a library). @@ -49,21 +49,21 @@ - As far as possible, we provide our algorithms in the most generic form possible. For instance, our decoders are templated on an object that provides a score indexed by a (frame, fst-input-symbol) - tuple. This means the decoder could work from any suitable source - of scores, such as a neural net. + tuple. 
This means the decoder could work from any suitable source + of scores, such as a neural net. - Open license - The code is licensed under Apache 2.0, which is one of the least restrictive licenses available. - Complete recipes - - Our goal is to make available complete recipes for building + - Our goal is to make available complete recipes for building speech recognition systems, that work from widely available databases such as those provided by the Linguistic Data - Consortium (LDC). + Consortium (LDC). - The goal of releasing complete recipes is an important aspect of Kaldi. - Since the code is publicly available under a license that permits + The goal of releasing complete recipes is an important aspect of Kaldi. + Since the code is publicly available under a license that permits modifications and re-release, we would like to encourage people to release - their code, along with their script directories, in a similar format to + their code, along with their script directories, in a similar format to Kaldi's own example script. We have tried to make Kaldi's documentation as complete as possible given time @@ -75,7 +75,7 @@ to an expert. In the future we hope to make it somewhat more accessible, bearing in mind that our intended audience is speech recognition researchers or researchers-in-training. In general, Kaldi is not a speech recognition - toolkit "for dummies." It will allow you to do many kinds of operations that + toolkit "for dummies." It will allow you to do many kinds of operations that don't make sense. @section about_flavor The flavor of Kaldi @@ -88,39 +88,39 @@ - We emphasize generic algorithms and universal recipes - By "generic algorithms" we mean things like linear - transforms, rather than those that are specific to speech + transforms, rather than those that are specific to speech in some way. But we don't intend to be too dogmatic about this, if more specific algorithms are useful. 
- - We would like recipes that can be run on any data-set, rather than + - We would like recipes that can be run on any data-set, rather than those that have to be customized. - We prefer provably correct algorithms - The recipes have been designed in such a way that in principle they - should never fail in a catastophic way. There has been an effort to avoid recipes and + should never fail in a catastophic way. There has been an effort to avoid recipes and algorithms that could possibly fail, even if they don't fail in the "normal case" (one example: FST weight-pushing, which normally helps but can crash or make things much worse in certain cases). - Kaldi code is thoroughly tested. - - The goal is for all or nearly all the code to have corresponding - test routines. + - The goal is for all or nearly all the code to have corresponding + test routines. - We try to keep the simple cases simple. - There is a danger when building a large speech toolkit that the code can become a forest of rarely used alternatives. We are trying to avoid this by structuring the toolkit in the following way. Each command-line program generally works for a limited set of cases (e.g. a decoder might just work for GMMs). Thus, when you add a new type of model, you create - a new command-line decoder (that calls the same underlying templated code). + a new command-line decoder (that calls the same underlying templated code). - Kaldi code is easy to understand. - Even though the Kaldi toolkit as a whole may get very large, we aim for each individual part of it to be understandable without too much effort. We will accept some code duplication if it improves the understandability of individual pieces. - Kaldi code is easy to reuse and refactor. - - We aim for the toolkit to as loosely coupled as possible. + - We aim for the toolkit to as loosely coupled as possible. In general this means that any given header should need to \#include as few other header files as possible. 
The matrix library, in particular, only depends on code in one other subdirectory so it can be used independently of almost all the rest of Kaldi. - + @section about_status Status of the project Currently, we have code and scripts for most standard techniques, including all standard @@ -134,12 +134,9 @@ Note: after an early phase in which we intended to use version numbers for major releases of Kaldi ("v1" and so on), we realized that these type of releases do not mesh well with the natural style of development, which is very - continuous. Currently we maintain two major versions of Kaldi: the "trunk" - version, and the "complete" version (which maintains some little-used features - that were deleted from trunk). We also maintain various sandboxes for feature - development; these are merged back into trunk when the feature is complete. - For most purposes, the "trunk" is the version you should use, and you should - frequently do "svn up" to keep it up to date; see \ref install for more details. + continuous. Currently we maintain only the "master" development branch, and + this is the version you should use. Also, + frequently do "git pull" to keep it up to date; see \ref install for more details. See \ref roadmap for details of features we are currently working on. diff --git a/src/doc/dependencies.dox b/src/doc/dependencies.dox index 85c36a66626..bff6983e0d6 100644 --- a/src/doc/dependencies.dox +++ b/src/doc/dependencies.dox @@ -33,7 +33,7 @@ grid will have NVidia GPUs which you can use for neural net training, and you can reserve these on the queue by adding some extra option to qsub. See \ref queue for more information. - + We have started a separate project called Kluster that shows you how to create such a cluster on Amazon's EC2; MIT's project page on Sourceforge contains - a number of useful resources, but after the recent extended outage we are migrating away from - Sourceforge. 
kaldi-asr.org/ is now the top-level - location you should go to; see in particular information about help forums and email - lists at kaldi-asr.org/forums.html. + in the sub-directory \c egs/). + + + Kaldi's project page contains + a number of useful resources; see in particular information about help forums and email + lists at kaldi-asr.org/forums.html. diff --git a/src/doc/tutorial.dox b/src/doc/tutorial.dox index 85cc331e09d..ea94ee93e50 100644 --- a/src/doc/tutorial.dox +++ b/src/doc/tutorial.dox @@ -22,7 +22,7 @@ - \subpage tutorial_prereqs "Prerequisites" - \subpage tutorial_setup "Getting started" (15 minutes) - - \subpage tutorial_svn "Version control with Subversion" (5 minutes) + - \subpage tutorial_git "Version control with Git" (5 minutes) - \subpage tutorial_looking "Overview of the distribution" (25 minutes) - \subpage tutorial_running "Running the example scripts" (40 minutes) - \subpage tutorial_code "Reading and modifying the code" (30 minutes) diff --git a/src/doc/tutorial_code.dox b/src/doc/tutorial_code.dox index d71e0b741fa..d53e89db78d 100644 --- a/src/doc/tutorial_code.dox +++ b/src/doc/tutorial_code.dox @@ -36,7 +36,7 @@ Go to the top-level directory (we called it kaldi-1) and then into - src/. + src/. First look at the file base/kaldi-common.h (don't follow the links within this document; view it from the shell or from an editor). This \#includes a number of things from the base/ directory that are used by almost every Kaldi program. You @@ -56,7 +56,7 @@ \section tutorial_code_matrix Matrix library (and modifying and debugging code) - + Now look at the file matrix/matrix-lib.h. See what files it includes. This provides an overview of the kinds of things that are in the matrix library. 
This library is basically a C++ wrapper for BLAS and LAPACK, if that means anything to you (if not, @@ -69,7 +69,7 @@ These types of commends, and block comments that begin with /**, are interpreted by the Doxygen software that automatically generates documentation. It also generates the page you are reading right now (the source for this type of documentation - is in src/doc/). + is in src/doc/). At this point we would like you to modify the code and compile it. We will be adding a test function to the file matrix/matrix-lib-test.cc. As mentioned @@ -92,13 +92,13 @@ void UnitTestAddVec() { InitRand(&v); InitRand(&w); Vector w2(w); // w2 is a copy of w. - Real f = RandGauss(); + Real f = RandGauss(); w.AddVec(f, v); // w <-- w + f v for (int32 i = 0; i < dim; i++) { Real a = w(i), b = f * w2(i) + v(i); AssertEqual(a, b); // will crash if not equal to within // a tolerance. - } + } } \endverbatim Add this code to the file matrix-lib-test.cc, just above the function @@ -109,7 +109,7 @@ MatrixUnitTest(). Then, inside MatrixUnitTest(), add the line: It doesn't matter where in the function you add this. Then type "make test". There should be an error (a semicolon that should be a comma); fix it and try again. -Now type "./matrix-lib-test". This should crash with an assertion failure, +Now type "./matrix-lib-test". This should crash with an assertion failure, because there was another mistake in the unit-test code. Next we will debug it. Type \verbatim @@ -130,7 +130,7 @@ values of a and b ("p" is short for "print"). Your screen should look someting $5 = -0.931363404 (gdb) p b $6 = -0.270584524 -(gdb) +(gdb) \endverbatim The exact values are, of course, random, and may be different for you. Since the numbers are considerably different, it's clear that it's not just a question @@ -145,7 +145,7 @@ $8 = 0.281656802 $9 = -0.931363404 (gdb) p w2.data_[0] $10 = -1.07592916 -(gdb) +(gdb) \endverbatim This may help you work out that the expression for "b" is wrong. 
Fix it in the code, recompile, and run again (you can just type "r" in the gdb prompt to rerun). It should now run OK. Force gdb to break into the @@ -169,6 +169,7 @@ If you need to debug a program that takes command-line arguments, you can do it \endverbatim or you can invoke gdb without arguments and then type "r arg1 arg2..." at the prompt. +\todo This paragraph is full of lies! When you are done, and it compiles, type \verbatim svn diff @@ -176,7 +177,7 @@ svn diff to see what changes you made. If you are contributing to the Kaldi project and you are planning to commit code in the near future, you may want to revert the changes you made so you don't accidentally commit them. The following -commands will save the file you modified in case you need it later, and will revert to +commands will save the file you modified in case you need it later, and will revert to the original version: \verbatim cp matrix-lib-test.cc matrix-lib-test.cc.tmp @@ -190,12 +191,12 @@ svn commit --username=your_sourceforge_username -m "Added a unit-test in matrix/ \section tutorial_code_acoustic Acoustic modeling code -Next look at gmm/diag-gmm.h (this class stores a Gaussian Mixture Model). +Next look at gmm/diag-gmm.h (this class stores a Gaussian Mixture Model). The class DiagGmm may look a bit confusing as it has many different accessor functions. Search for "private" and look at the class member variables (they always end with an underscore, as per the Kaldi style). This should make it clear how we store the GMM. -This is just a single GMM, not a whole collection of GMMs. +This is just a single GMM, not a whole collection of GMMs. Look at gmm/am-diag-gmm.h; this class stores a collection of GMMs. Notice that it does not inherit from anything. Search for "private" and you can see the member variables (there @@ -211,7 +212,7 @@ keeping the rest of the system the same. We'll come to this other stuff later. Next look at feat/feature-mfcc.h. Focus on the MfccOptions struct. 
The struct members give you some idea what kind of options are supported -in MFCC feature extraction. +in MFCC feature extraction. Notice that some struct members are options structs themselves. Look at the Register function. This is standard in Kaldi options classes. Then look at featbin/compute-mfcc-feats.cc (this is a command-line @@ -219,8 +220,8 @@ program) and search for Register. You can see where the Register function of the options struct is called. To see a complete list of the options supported for MFCC feature extraction, execute the program featbin/compute-mfcc-feats with no arguments. -Recall that you saw some of these options being registered in -the MfccOptions class, and others being registered in +Recall that you saw some of these options being registered in +the MfccOptions class, and others being registered in featbin/compute-mfcc-feats.cc. The way to specify options is --option=value. Type \verbatim @@ -264,11 +265,11 @@ adding statistics together and evaluating some kind of objective function (e.g. a likelihood). In the normal recipe, it actually points to a class that contains sufficient statistics for estimating a diagonal Gaussian p.d.f.. -Do +Do \verbatim less exp/tri1/log/acc_tree.log \endverbatim -There won't be much information in this file, but you can see the command +There won't be much information in this file, but you can see the command line. This program accumulates the single-Gaussian statistics for each HMM-state (actually, pdf-class) of each seen triphone context. The --ci-phones options is so that it knows to avoid accumulating separate @@ -285,7 +286,7 @@ This program does the decision-tree clustering; it reads in the statistics that were output by. It is basically a wrapper for the BuildTree function discussed above. The questions that it asks in the decision-tree clustering are automatically generated, as you can see in the script steps/train_tri1.sh (look for the programs cluster-phones -and compile-questions). 
+and compile-questions). @@ -296,7 +297,7 @@ topologies for a number of phones. In general each phone can have a different topology. The topology includes "default" transitions, used for initialization. Look at the example topology in the extended comment at the top of the header. There is a tag (note: as with HTK text formats, -this file looks vaguely XML-like, but it is not really XML). +this file looks vaguely XML-like, but it is not really XML). The is always the same as the HMM-state () here; in general, it doesn't have to be. This is a mechanism to enforce tying of distributions between distinct HMM states; it's possibly useful if you want to diff --git a/src/doc/tutorial_git.dox b/src/doc/tutorial_git.dox new file mode 100644 index 00000000000..63676df86c1 --- /dev/null +++ b/src/doc/tutorial_git.dox @@ -0,0 +1,259 @@ +// doc/tutorial_git.dox + +// Copyright 2015 Smart Action Company LLC + +// See ../../COPYING for clarification regarding multiple authors +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at + +// http://www.apache.org/licenses/LICENSE-2.0 + +// THIS CODE IS PROVIDED *AS IS* BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION ANY IMPLIED +// WARRANTIES OR CONDITIONS OF TITLE, FITNESS FOR A PARTICULAR PURPOSE, +// MERCHANTABLITY OR NON-INFRINGEMENT. +// See the Apache 2 License for the specific language governing permissions and +// limitations under the License. + +/** + \page tutorial_git Kaldi Tutorial: Version control with Git (5 minutes) + + \ref tutorial "Up: Kaldi tutorial"
+ \ref tutorial_setup "Previous: Getting started"
+ \ref tutorial_looking "Next: Overview of the distribution"
+ +Git is a distributed version control system. This means that, unlike +Subversion, there are multiple copies of the repository, and the changes are +transferred between these copies explicitly and in several different ways, but most +of the time one's work is backed by a single copy of the repository. Because of +this multiplicity of copies, there are multiple possible \em workflows that you +may want to follow. Here's one we think best suits you if you just want to +compile and use Kaldi at first, but then at some point optionally decide +to \em contribute your work back to the project. + +\section tutorial_git_git_setup First-time Git setup + +If you have never used Git before, + +perform some minimal configuration first. At the very least, set up your +name and e-mail address: + +\verbatim +$ git config --global user.name "John Doe" +$ git config --global user.email johndoe@example.com +\endverbatim + +Also, set short aliases for the Git commands you type most often. + +\verbatim +$ git config --global alias.co checkout +$ git config --global alias.br branch +$ git config --global alias.st status +\endverbatim + +Another very useful utility is git-prompt.sh, +a bash prompt extension for Git (if you do not have it, +search the internet for how to install it on your system). +When installed, it provides a shell function \c __git_ps1 that, +when added to the prompt, +expands into the current branch name and pending commit markers, +so you do not forget where you are. +You may modify your \c PS1 shell variable so that it includes literally +$(__git_ps1 "[%s]"). 
+I have this in my \c ~/.bashrc: + +\code{.sh} +PS1='\[\033[00;32m\]\u@\h\[\033[0m\]:\[\033[00;33m\]\w\[\033[01;36m\]$(__git_ps1 "[%s]")\[\033[01;33m\]\$\[\033[00m\] ' +export GIT_PS1_SHOWDIRTYSTATE=true GIT_PS1_SHOWSTASHSTATE=true +# fake __git_ps1 when git-prompt.sh is not installed +if [ "$(type -t __git_ps1)" == "" ]; then + function __git_ps1() { :; } +fi +\endcode + +\section tutorial_git_workflow The User Workflow + +Set up your repository and the working directory with this command: + +\verbatim +kkm@yupana:~$ git clone https://github.com/kaldi-asr/kaldi.git --branch master --single-branch --origin golden +Cloning into 'kaldi'... +remote: Counting objects: 51770, done. +remote: Compressing objects: 100% (8/8), done. +remote: Total 51770 (delta 2), reused 0 (delta 0), pack-reused 51762 +Receiving objects: 100% (51770/51770), 67.72 MiB | 6.52 MiB/s, done. +Resolving deltas: 100% (41117/41117), done. +Checking connectivity... done. +kkm@yupana:~$ cd kaldi/ +kkm@yupana:~/kaldi[master]$ +\endverbatim + +Now you are ready to configure and compile Kaldi and work with it. +Once in a while you will want the latest changes in your local branch. +This is akin to what you usually did with svn update. + +But first, please, let's agree on one thing: +you do not commit any files on the master branch. +We'll get to that below. +So far, you are only using the code. +It will be hard to untangle if you do not follow the rule, +and Git makes branching so amazingly easy +that you always want to do your work on a branch. + +\verbatim +kkm@yupana:~/kaldi[master *]$ git pull golden +remote: Counting objects: 148, done. +remote: Compressing objects: 100% (55/55), done. +remote: Total 148 (delta 111), reused 130 (delta 93), pack-reused 0 +Receiving objects: 100% (148/148), 18.39 KiB | 0 bytes/s, done. +Resolving deltas: 100% (111/111), completed with 63 local objects. 
+From https://github.com/kaldi-asr/kaldi + 658e1b4..827a5d6 master -> golden/master +\endverbatim + +The command you use is git pull, +and \c golden is the alias we used to designate the main replica of the Kaldi +repository before. + +\section tutorial_git_contributor From User To Contributor + +At some point you decided to change Kaldi code, +be it scripts or source. Maybe you made a simple bug fix. +Maybe you are contributing a whole recipe. In any case, +you always do your work on a branch. +Even if you have uncommitted changes, Git handles that. +For example, you just realized that the \c fisher_english recipe does not +actually make use of \c hubscr.pl for scoring, but checks it exists and +fails. +You quickly fixed that in your work tree and want to share this change +with the project. + +\subsection tutorial_git_branch Work locally on a branch + +\verbatim +kkm@yupana:~/kaldi[master *]$ git fetch golden +kkm@yupana:~/kaldi[master *]$ git co golden/master -b fishfix --no-track +M fisher_english/s5/local/score.sh +Branch fishfix set up to track remote branch master from golden. +Switched to a new branch 'fishfix' +kkm@yupana:~/kaldi[fishfix *]$ +\endverbatim + +So what we did here: we first \em fetched the current changes from the golden +repository to your machine. +This did not update your master +(in fact, you cannot pull if you have local worktree changes), +but did update the remote reference \c golden/master. +In the second command, we forked off a branch in your local repository, +called \c fishfix. +Was it more logical to branch off \c master? Not at all! +First, this is one operation more. You do not *need* to update the master, so +why would you? Second, we agreed (remember?) that master will have no changes, +and you had some. Third, and believe me, this happens, you might have committed +something to your master by mistake, and you do not want to bring this feral +change into your new branch. 
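The fetch-then-branch step can be tried safely away from a real Kaldi checkout. The following is only a sketch under stated assumptions: the scratch repositories \c upstream and \c mine and the files \c file.txt and \c score.sh are hypothetical stand-ins, and a local directory plays the role of the GitHub repository. It illustrates that git fetch leaves the local \c master where it was, updates \c golden/master, and that an uncommitted edit travels with you onto the new branch.

```shell
#!/bin/sh
set -e
# Hypothetical scratch area; "upstream" stands in for the main Kaldi repository.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME="Jane Doe" GIT_AUTHOR_EMAIL="jane@example.com"
export GIT_COMMITTER_NAME="Jane Doe" GIT_COMMITTER_EMAIL="jane@example.com"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master  # force the branch name "master"
echo one > upstream/file.txt
echo "echo scoring" > upstream/score.sh
git -C upstream add file.txt score.sh
git -C upstream commit -q -m "first upstream commit"

# Clone with the remote named "golden", as in the tutorial.
git clone -q --origin golden upstream mine

# Upstream moves on after the clone.
echo two >> upstream/file.txt
git -C upstream commit -q -am "second upstream commit"
upstream_tip=$(git -C upstream rev-parse master)

cd mine
echo "# local fix" >> score.sh           # uncommitted change in the work tree
before=$(git rev-parse master)
git fetch -q golden                      # updates golden/master only
after=$(git rev-parse master)
remote_tip=$(git rev-parse golden/master)

# Fork the feature branch from the freshly fetched remote reference.
git checkout -q --no-track -b fishfix golden/master
branch_tip=$(git rev-parse HEAD)
dirty=$(git diff --name-only)            # files still modified in the work tree

echo "local master unchanged by fetch: $([ "$before" = "$after" ] && echo yes || echo no)"
echo "fishfix starts at upstream tip:  $([ "$branch_tip" = "$upstream_tip" ] && echo yes || echo no)"
echo "uncommitted files carried over:  $dirty"
```

Branching from \c golden/master rather than from the local \c master is the design choice here: the new branch starts from what upstream has now, regardless of the state of your own \c master.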
+ +Now you examine your changes, and, since they are good, you commit them: + +\code{.diff} +kkm@yupana:~/kaldi[fishfix *]$ git diff +diff --git a/egs/fisher_english/s5/local/score.sh b/egs/fisher_english/s5/local/score.sh +index 60e4706..552fada 100755 +--- a/egs/fisher_english/s5/local/score.sh ++++ b/egs/fisher_english/s5/local/score.sh +@@ -27,10 +27,6 @@ dir=$3 + + model=$dir/../final.mdl # assume model one level up from decoding dir. + +-hubscr=$KALDI_ROOT/tools/sctk/bin/hubscr.pl +-[ ! -f $hubscr ] && echo "Cannot find scoring program at $hubscr" && exit 1; +-hubdir=`dirname $hubscr` +- + for f in $data/text $lang/words.txt $dir/lat.1.gz; do + [ ! -f $f ] && echo "$0: expecting file $f to exist" && exit 1; + done +kkm@yupana:~/kaldi[fishfix *]$ git commit -am 'fisher_english scoring does not really need hubscr.pl from sctk.' +[fishfix d7d76fe] fisher_english scoring does not really need hubscr.pl from sctk. + 1 file changed, 4 deletions(-) +kkm@yupana:~/kaldi[fishfix]$ +\endcode + +Note that the \c -a switch to git commit makes it commit all modified +files (we had only one changed, so why not?). If you want to separate file +modifications into multiple features to submit separately, git add +specific files followed by git commit without the \c -a switch, and +then start another branch off the same point as the first one for the next fix: +git co golden/master -b another-fix --no-track, where you could add and +commit other changed files. With Git, it is not uncommon to have a dozen +branches going. Remember that it is extremely easy to combine multiple feature +branches into one, but splitting one large changeset into many smaller features +involves more work. + +Now you need to create a pull request to the maintainers of Kaldi, so that they +can pull the change from your repository. For that, your repository needs +to be available online to them. And for that, you need a GitHub account.
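The one-fix-per-branch approach described in this subsection (git add on specific files, then git commit without \c -a, then a second branch off the same starting point) can be sketched as follows. This is an illustration under assumptions, not part of the Kaldi workflow itself: the repositories \c upstream and \c mine, the files \c a.sh and \c b.sh, and the branch names \c fix-a and \c fix-b are all hypothetical.

```shell
#!/bin/sh
set -e
# Hypothetical scratch area; "upstream" stands in for the main Kaldi repository.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME="Jane Doe" GIT_AUTHOR_EMAIL="jane@example.com"
export GIT_COMMITTER_NAME="Jane Doe" GIT_COMMITTER_EMAIL="jane@example.com"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master  # force the branch name "master"
echo a > upstream/a.sh
echo b > upstream/b.sh
git -C upstream add a.sh b.sh
git -C upstream commit -q -m "initial commit"

git clone -q --origin golden upstream mine
cd mine

# Two unrelated fixes sit uncommitted in the same work tree.
echo "fix a" >> a.sh
echo "fix b" >> b.sh

# First feature branch: stage and commit only a.sh (no -a switch).
git checkout -q --no-track -b fix-a golden/master
git add a.sh
git commit -q -m "Fix a.sh"

# Second feature branch from the same starting point; the still-uncommitted
# edit to b.sh is carried along, while a.sh reverts to the upstream version.
git checkout -q --no-track -b fix-b golden/master
git add b.sh
git commit -q -m "Fix b.sh"

# Each branch now differs from golden/master in exactly one file.
files_a=$(git diff --name-only golden/master fix-a)
files_b=$(git diff --name-only golden/master fix-b)
echo "fix-a touches: $files_a"
echo "fix-b touches: $files_b"
```

Keeping each fix on its own branch means each can later become its own pull request, which is easier than splitting one large changeset afterwards.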
+ +\subsection tutorial_git_github_setup One-time GitHub setup + +\li Go to main Kaldi repository +page and click on the Fork button. If you do not have an account, GitHub +will lead you through necessary steps. +\li Generate and +register an SSH key with GitHub so that GitHub can identify you. Everyone +can read everything on GitHub, but only you can write to your forked repository! + +\subsection pull_request Creating a pull request + +Make sure your fork is registered under the name \c origin (the alias is +arbitrary, this is what we'll use here). If not, add it. The URL is listed on +your repository page under "SSH clone URL", and looks like +git@github.com:YOUR_USER_NAME/kaldi.git. + +\verbatim +kkm@yupana:~/kaldi[fishfix]$ git remote -v +golden https://github.com/kaldi-asr/kaldi.git (fetch) +golden https://github.com/kaldi-asr/kaldi.git (push) +kkm@yupana:~/kaldi[fishfix]$ git remote add origin git@github.com:kkm000/kaldi.git +kkm@yupana:~/kaldi[fishfix]$ git remote -v +golden https://github.com/kaldi-asr/kaldi.git (fetch) +golden https://github.com/kaldi-asr/kaldi.git (push) +origin git@github.com:kkm000/kaldi.git (fetch) +origin git@github.com:kkm000/kaldi.git (push) +\endverbatim + +Now push the branch into your fork of Kaldi: + +\verbatim +kkm@yupana:~/kaldi[fishfix]$ git push origin HEAD -u +Counting objects: 632, done. +Delta compression using up to 12 threads. +Compressing objects: 100% (153/153), done. +Writing objects: 100% (415/415), 94.45 KiB | 0 bytes/s, done. +Total 415 (delta 324), reused 326 (delta 262) +To git@github.com:kkm000/kaldi.git + * [new branch] HEAD -> fishfix +Branch fishfix set up to track remote branch fishfix from origin. +\endverbatim + +\c HEAD in git push tells Git "create branch in the remote repo with +the same name as the current branch", and \c -u remembers the connection between +your local branch \c fishfix and \c origin/fishfix in your repository. + +Now go to your repository page and +create a +pull request. 
+Examine your changes, +and submit the request if everything looks good. The maintainers will receive +the request and either accept it or comment on it. +Follow the comments, commit fixes on your branch, push to \c origin again, and +GitHub will automatically update the pull request web page. +Then reply, e.g., "Done" under the comments that you received, so that they know +you followed up on their comments. + + \ref tutorial "Up: Kaldi tutorial"
+ \ref tutorial_setup "Previous: Getting started"
+ \ref tutorial_looking "Next: Overview of the distribution"
+

+*/ diff --git a/src/doc/tutorial_looking.dox b/src/doc/tutorial_looking.dox index ef7c7512d54..6d525df93e9 100644 --- a/src/doc/tutorial_looking.dox +++ b/src/doc/tutorial_looking.dox @@ -21,12 +21,12 @@ \page tutorial_looking Kaldi tutorial: Overview of the distribution (20 minutes) \ref tutorial "Up: Kaldi tutorial"
- \ref tutorial_svn "Previous: Version control with Subversion"
+ \ref tutorial_git "Previous: Version control with Git"
\ref tutorial_running "Next: Running the example scripts"
Before we jump into the example scripts, let us take a few minutes to look at what else is included in the Kaldi distribution. Go to the kaldi-1 directory and list it. - There are a few files and subdirectories. + There are a few files and subdirectories. The important subdirectories are "tools/", "src/", and "egs/" which we will look at in the next section. We will give an overview of "tools/" and "src/". @@ -53,7 +53,7 @@ of an abstract FST type. You can see that there are a lot of templates involved. If templates are not your thing, you will probably have trouble understanding this code. - Change directory to bin/, or add it to your path. + Change directory to bin/, or add it to your path. We will be executing some simple example instructions from here. @@ -63,7 +63,7 @@ # arc format: src dest ilabel olabel [weight] # final state format: state [weight] # lines may occur in any order except initial state must be first line -# unspecified weights default to 0.0 (for the library-default Weight type) +# unspecified weights default to 0.0 (for the library-default Weight type) cat >text.fst < - \ref tutorial_svn "Previous: Version control with Subversion"
+ \ref tutorial_git "Previous: Version control with Git"
\ref tutorial_running "Next: Running the example scripts"

*/ diff --git a/src/doc/tutorial_prereqs.dox b/src/doc/tutorial_prereqs.dox index 0a298c1f7db..91c3251970f 100644 --- a/src/doc/tutorial_prereqs.dox +++ b/src/doc/tutorial_prereqs.dox @@ -43,11 +43,11 @@ Management (RM) CDs from the Linguistic Data Consortium (LDC), in the original form as distributed by the LDC. That is, we assume this data is sitting on your system somewhere. We obtained this as catalog number LDC93S3A. It is - also available in two separate pieces. Be careful because there was previously + also available in two separate pieces. Be careful because there was previously a different distribution of the RM data with a different layout. The system requirements are fairly basic. We assume that you have tools - including wget, svn, awk, perl and so on, or that you know how to install them. + including wget, git, awk, perl and so on, or that you know how to install them. The most difficult part of the installation process relates to the math library ATLAS; if this is not already installed as a library on your system you will have to compile it, and this requires that CPU throttling be turned off, which @@ -61,7 +61,7 @@ to try to keep to the posted schedule, if necessary by skipping steps and avoiding following links to more information that we provide in the text. This will help ensure that you get a balanced overview. You can always review the material in more - detail later on. If this tutorial is to be given in a classroom setting, it is + detail later on. If this tutorial is to be given in a classroom setting, it is important that someone run through the tutorial on the relevant system beforehand in order to verify that all the prerequisites are installed. diff --git a/src/doc/tutorial_setup.dox b/src/doc/tutorial_setup.dox index 46b3a91ca2d..11d97a945f9 100644 --- a/src/doc/tutorial_setup.dox +++ b/src/doc/tutorial_setup.dox @@ -22,7 +22,7 @@ \ref tutorial "Up: Kaldi tutorial"
\ref tutorial_prereqs "Previous: Prerequisites"
- \ref tutorial_svn "Next: Version control with Subversion"
+ \ref tutorial_git "Next: Version control with Git"
The first step is to download and install Kaldi. We will be using version 1 of the toolkit, so that this tutorial does not get out of date. However, be aware @@ -32,28 +32,28 @@ "s3" scripts mentioned in this tutorial. But be aware that if you do that some aspects of the tutorial may be out of date. - Assuming Subversion (svn) is installed, to get the latest code you can type + Assuming Git is installed, to get the latest code you can type \verbatim - svn co svn://svn.code.sf.net/p/kaldi/code/trunk kaldi-trunk + git clone https://github.com/kaldi-asr/kaldi.git kaldi-trunk --origin golden \endverbatim - Then cd to kaldi-trunk. Look at the INSTALL file and follow the instructions + Then cd to kaldi-trunk. Look at the INSTALL file and follow the instructions (it points you to two subdirectories). Look carefully at the output of the installation scripts, as they try to guide you what to do. Some installation - errors are non-fatal, and the installation scripts will tell you so (i.e. there + errors are non-fatal, and the installation scripts will tell you so (i.e. there are some things it installs which are nice to have but are not really needed). The "best-case" scenario is that you do: \verbatim cd kaldi-trunk/tools/; make; cd ../src; ./configure; make \endverbatim and everything will just work; however, if this does not happen there are - fallback plans (e.g. you may have to install some package on your machine, or run - install_atlas.sh in tools/, or run some steps in tools/INSTALL manually, + fallback plans (e.g. you may have to install some package on your machine, or run + install_atlas.sh in tools/, or run some steps in tools/INSTALL manually, or provide options to the configure script in src/). If there are problems, there may be some information in \ref build_setup that will help you; otherwise, - feel free to contact the maintainers (\ref other) and we will be happy to help. + feel free to contact the maintainers (\ref other) and we will be happy to help. 
\ref tutorial "Up: Kaldi tutorial"
\ref tutorial_prereqs "Previous: Prerequisites"
- \ref tutorial_svn "Next: Version control with Subversion"
+ \ref tutorial_git "Next: Version control with Git"

*/ diff --git a/src/doc/tutorial_svn.dox b/src/doc/tutorial_svn.dox deleted file mode 100644 index 1cdeafe7a16..00000000000 --- a/src/doc/tutorial_svn.dox +++ /dev/null @@ -1,70 +0,0 @@ -// doc/tutorial_svn.dox - -// Copyright 2009-2011 Microsoft Corporation - -// See ../../COPYING for clarification regarding multiple authors -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at - -// http://www.apache.org/licenses/LICENSE-2.0 - -// THIS CODE IS PROVIDED *AS IS* BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION ANY IMPLIED -// WARRANTIES OR CONDITIONS OF TITLE, FITNESS FOR A PARTICULAR PURPOSE, -// MERCHANTABLITY OR NON-INFRINGEMENT. -// See the Apache 2 License for the specific language governing permissions and -// limitations under the License. - -/** - \page tutorial_svn Kaldi Tutorial: Version control with Subversion (5 minutes) - - \ref tutorial "Up: Kaldi tutorial"
- \ref tutorial_setup "Previous: Getting started"
- \ref tutorial_looking "Next: Overview of the distribution"
- - In case you are unfamiliar with the Subversion (svn) version control system, we - give a brief overview of some commands that might be useful to you. Subversion commands - always look like: "svn [command] [arguments]"; you can do "svn help" to see what - commands are available, or "svn help " for help on a specific command. - In kaldi-1 or any subdirectory, type - \verbatim - svn up - \endverbatim - (this is short for "svn update"). If we have committed changes to the repository - in the several minutes since you installed Kaldi, you should see output like - the following: -\verbatim -kaldi-1: svn update -U src/lat/Makefile -U src/nnetbin/nnet-forward.cc -Updated to revision 191. -\endverbatim - More likely, it will just say something like "At revision 191." - To see if you have made any changes to anything, type -\verbatim - svn status -\endverbatim - This will - list files that you changed or that have been added. Files that have been added - to the directories but are not under version control because you have not used the - "svn add" command, will appear with the descriptor '?' (you will see all the - binaries that were compiled). Next, edit a version-controlled file (for example, - src/Makefile; add a comment or something), and type -\verbatim - svn diff -\endverbatim -This should show how your version differs from the copy that you downlaoded. -If you are going to be - contributing to the Kaldi project (and we do welcome new contributors), - then you should become familiar with other commands such - as "svn add", "svn commit" and so on. For this, there are tutorials available - online. - - - \ref tutorial "Up: Kaldi tutorial"
- \ref tutorial_setup "Previous: Getting started"
- \ref tutorial_looking "Next: Overview of the distribution"
-

-*/ From fbe35a969a9f4ef87c6d66b3176f9174cb93ec74 Mon Sep 17 00:00:00 2001 From: kkm Date: Mon, 3 Aug 2015 22:23:03 -0700 Subject: [PATCH 2/3] Addressing @danpovey comments: svn is still a prerequisite; svn mirror is not maintained; version control paragraph in tutorial_code.dox rewritten. --- src/doc/install.dox | 6 +----- src/doc/tutorial_code.dox | 25 +++++++++---------------- src/doc/tutorial_prereqs.dox | 2 +- 3 files changed, 11 insertions(+), 22 deletions(-) diff --git a/src/doc/install.dox b/src/doc/install.dox index 0e606804b0c..1e40fafdeca 100644 --- a/src/doc/install.dox +++ b/src/doc/install.dox @@ -24,11 +24,7 @@ \section install_download Dowloading Kaldi - We have now transitioned to - GitHub for all future development. We still intend to maintain a - read-only Subversion mirror of the GitHub parent, located at SourceForge and mirrored - by us. - + We have now transitioned to GitHub for all future development. You first need to install Git. The most current version of Kaldi, possibly including unfinished and experimental features, can be downloaded by typing into a shell: diff --git a/src/doc/tutorial_code.dox b/src/doc/tutorial_code.dox index d53e89db78d..0505a25d4a8 100644 --- a/src/doc/tutorial_code.dox +++ b/src/doc/tutorial_code.dox @@ -169,25 +169,18 @@ If you need to debug a program that takes command-line arguments, you can do it \endverbatim or you can invoke gdb without arguments and then type "r arg1 arg2..." at the prompt. -\todo This paragraph is full of lies! When you are done, and it compiles, type \verbatim -svn diff -\endverbatim -to see what changes you made. If you are contributing to the Kaldi project and you -are planning to commit code in the near future, you -may want to revert the changes you made so you don't accidentally commit them. 
The following -commands will save the file you modified in case you need it later, and will revert to -the original version: -\verbatim - cp matrix-lib-test.cc matrix-lib-test.cc.tmp - svn revert matrix-lib-test.cc -\endverbatim -If you actually wanted to commit the changes, and you had an account on Sourceforge, you -would have to ask us to add you to the Kaldi project, and you would type something like -\verbatim -svn commit --username=your_sourceforge_username -m "Added a unit-test in matrix/ directory." + git diff \endverbatim +to see what changes you made. If you are contributing to the Kaldi project and +planning to send us code in the near future, you +may want to commit them to a branch as described in the \ref tutorial_git "Git tutorial", so that you +can generate a clean +GitHub pull request later. We recommend that you familiarize +yourself with Git branches even if you are not contributing your changes +outright; Git is a powerful tool to maintain your local code changes as well as +those you may contribute. \section tutorial_code_acoustic Acoustic modeling code diff --git a/src/doc/tutorial_prereqs.dox b/src/doc/tutorial_prereqs.dox index 91c3251970f..82079a281b9 100644 --- a/src/doc/tutorial_prereqs.dox +++ b/src/doc/tutorial_prereqs.dox @@ -47,7 +47,7 @@ a different distribution of the RM data with a different layout. The system requirements are fairly basic. We assume that you have tools - including wget, git, awk, perl and so on, or that you know how to install them. + including wget, git, svn, awk, perl and so on, or that you know how to install them.
The most difficult part of the installation process relates to the math library ATLAS; if this is not already installed as a library on your system you will have to compile it, and this requires that CPU throttling be turned off, which From 3f3dc6ba16222ae96c1e692de79aef408ea28400 Mon Sep 17 00:00:00 2001 From: kkm Date: Tue, 4 Aug 2015 08:36:40 -0700 Subject: [PATCH 3/3] Windows build is slightly more maintained than not. :-) --- src/doc/install.dox | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/doc/install.dox b/src/doc/install.dox index 1e40fafdeca..0ffb2b1220f 100644 --- a/src/doc/install.dox +++ b/src/doc/install.dox @@ -47,8 +47,7 @@ \section install_install Installing Kaldi The top-level installation instructions are in the file \c INSTALL. - For Windows, there are separate instructions (unfortunately, not actively - maintained and woefully out of date) in \c windows/INSTALL. + For Windows, there are separate instructions in \c windows/INSTALL. See also \ref build_setup which explains how the build process works internally.