yn00
The program yn00 implements the method of Yang and Nielsen (2000) for estimating synonymous and nonsynonymous substitution rates between two sequences. We recommend using the ML method in CODEML (runmode = -2 and CodonFreq = 2 in the control file) as much as possible, even for pairwise sequence comparison.
Below, you can find an example of a control file to run yn00, normally named yn00.ctl:
seqfile = abglobin.nuc * path to input sequence file
outfile = yn * path to main output file
verbose = 0 * 1: detailed output (list sequences), 0: concise output
icode = 0 * 0:universal code; 1:mammalian mt; 2-10:see below or check the PAML documentation
weighting = 0 * weighting pathways between codons (0/1)?
commonf3x4 = 0 * use one set of codon freqs for all pairs (0/1)?
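When many alignments need to be analysed, a control file like the one above can be generated from a script. The following is a minimal sketch: the helper name write_yn00_ctl and the DEFAULT_OPTIONS dictionary are hypothetical and not part of PAML, while the option names and values are copied from the example above.

```python
from pathlib import Path

# Option names and values copied from the example control file above.
DEFAULT_OPTIONS = {
    "seqfile": "abglobin.nuc",
    "outfile": "yn",
    "verbose": 0,
    "icode": 0,
    "weighting": 0,
    "commonf3x4": 0,
}


def write_yn00_ctl(path, **overrides):
    """Write a yn00 control file (hypothetical helper, not part of PAML)."""
    unknown = set(overrides) - set(DEFAULT_OPTIONS)
    if unknown:
        raise ValueError(f"unknown yn00 option(s): {sorted(unknown)}")
    options = {**DEFAULT_OPTIONS, **overrides}
    # One "name = value" pair per line, as in the example control file.
    lines = [f"{name:>12} = {value}" for name, value in options.items()]
    text = "\n".join(lines) + "\n"
    Path(path).write_text(text)
    return text


if __name__ == "__main__":
    print(write_yn00_ctl("yn00.ctl", verbose=1))
```

The program is then typically started from the folder that contains the control file, for example with yn00 yn00.ctl, assuming the yn00 executable is available on your PATH.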
In this example, the path to the input sequence file (seqfile) and the path to the main output file (outfile) have been specified. Note that, if the control file is saved in the same folder as the input files or where the output files are to be saved, you can just type the names of those files (i.e., there is no need for absolute/relative paths; file names should contain no spaces or special characters). In addition, sites (codons) involving alignment gaps or ambiguity nucleotides in any sequence are removed from all sequences. As for other programs, the variable verbose is used to decide how much information is printed in the output file, and the variable icode specifies the genetic code (see below for more details).
The variable weighting decides whether equal weighting or unequal weighting will be used when counting differences between codons. The two approaches give different results for divergent sequences, and unequal weighting is much slower computationally. The transition/transversion rate ratio κ is estimated for all sequences in the data file and used in subsequent pairwise comparisons. The variable commonf3x4 specifies whether codon frequencies (based on the "F3x4 model" in CODEML) should be estimated for each pair or for all sequences in the data.
Besides the main result file, the program also generates three distance matrices saved in the following files: 2YN.dS for synonymous rates, 2YN.dN for nonsynonymous rates, and 2YN.t for the combined codon rate. The files are in the lower-diagonal format and can be read directly by some distance programs, such as NEIGHBOR in Felsenstein's PHYLIP package.
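For downstream analyses outside PHYLIP, the distance files can also be read with a short script. The sketch below assumes a PHYLIP-style lower-diagonal layout: a first line with the number of sequences, then one line per sequence holding its name followed by the distances to all preceding sequences. The function name parse_lower_triangular is hypothetical, and the values in SAMPLE are made up for illustration only; check the layout against your own 2YN.dS, 2YN.dN, or 2YN.t file before relying on it.

```python
def parse_lower_triangular(text):
    """Parse a lower-diagonal distance matrix into (names, square matrix).

    Assumed layout (not verified against every PAML version):
    line 1 holds the number of sequences; row i then holds a sequence
    name followed by i distances to the preceding sequences.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    n = int(lines[0].split()[0])
    names = []
    dist = [[0.0] * n for _ in range(n)]
    for i, line in enumerate(lines[1:n + 1]):
        fields = line.split()
        if len(fields) < i + 1:
            raise ValueError(f"row {i + 1} has too few fields: {line!r}")
        # The last i fields are distances; everything before is the name.
        split_at = len(fields) - i
        names.append(" ".join(fields[:split_at]))
        for j, value in enumerate(fields[split_at:]):
            dist[i][j] = dist[j][i] = float(value)
    if len(names) != n:
        raise ValueError(f"expected {n} rows, found {len(names)}")
    return names, dist


# Made-up values, for illustration only.
SAMPLE = """\
   3
seqA
seqB   0.10
seqC   0.20  0.30
"""

if __name__ == "__main__":
    names, dist = parse_lower_triangular(SAMPLE)
    for name, row in zip(names, dist):
        print(name, row)
```

Returning a full symmetric matrix rather than only the lower triangle keeps lookups such as dist[i][j] independent of the order of the two sequences.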
The genetic codes implemented in PAML and enabled via variable icode are the following:
- 0: universal
- 1: mammalian mt.
- 2: yeast mt.
- 3: mold mt.
- 4: invertebrate mt.
- 5: ciliate nuclear
- 6: echinoderm mt.
- 7: euplotid mt.
- 8: alternative yeast nu.
- 9: ascidian mt.
- 10: blepharisma nu.
Note: These codes correspond to transl_table 1 to 11 of GenBank.
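When control files are generated or checked by a script, the list of icode values can be kept as a small lookup table so that an invalid value is caught before the program is run. This is only a sketch: the names GENETIC_CODES and describe_icode are hypothetical, and the labels are copied from the list above.

```python
# icode values and labels copied from the list above.
GENETIC_CODES = {
    0: "universal",
    1: "mammalian mt.",
    2: "yeast mt.",
    3: "mold mt.",
    4: "invertebrate mt.",
    5: "ciliate nuclear",
    6: "echinoderm mt.",
    7: "euplotid mt.",
    8: "alternative yeast nu.",
    9: "ascidian mt.",
    10: "blepharisma nu.",
}


def describe_icode(icode):
    """Return the label for an icode value (hypothetical helper)."""
    try:
        return GENETIC_CODES[icode]
    except KeyError:
        valid = f"{min(GENETIC_CODES)}-{max(GENETIC_CODES)}"
        raise ValueError(f"icode must be in {valid}, got {icode!r}") from None


if __name__ == "__main__":
    print(describe_icode(1))
```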
© Copyright 1993-2023 by Ziheng Yang
The software package is provided "as is" without warranty of any kind. In no event shall the author or their employer be held responsible for any damage resulting from the use of this software, including but not limited to the frustration that you may experience in using the package. The program package, including source codes, example data sets, executables, and this documentation is maintained by Ziheng Yang and distributed under the GNU GPL v3.
Ziheng Yang
Department of Genetics, Evolution, and Environment
University College London
Gower Street
WC1E 6BT, London, United Kingdom