<tool id="readdepth2seqdepth" name="Convert informative read depth to sequencing depth" version="1.0.0">
<description>for flank-based mapping of microsatellites</description>
<command interpreter="python2.7">sequencingdepthconversion_G.py $repeatlength $flanksize $readlength $infodepth $probprediction > $output </command>
<inputs>
<param name="repeatlength" type="integer" value="10" label="Repeat length (bp)" />
<param name="flanksize" type="integer" value="20" label="Required flank bases on each side in mapping" />
<param name="readlength" type="integer" value="100" label="Read length (all reads are treated as single-end reads)" />
<param name="infodepth" type="integer" value="5" label="Required informative read depth" />
<param name="probprediction" type="float" value="0.9" label="Proportion of the genome that must reach the required read depth" />
</inputs>
<outputs>
<data format="txt" name="output" />
</outputs>
<tests>
<!-- Test data with valid values -->
<test>
<param name="repeatlength" value="10"/>
<param name="flanksize" value="20" />
<param name="readlength" value="100" />
<param name="infodepth" value="5" />
<param name="probprediction" value="0.9" />
<output name="output" file="readdepth2seqdepth.out"/>
</test>
</tests>
<help>
.. class:: infomark

**What it does**

This tool converts a user-specified informative read depth into the sequencing depth required when STRs are mapped with the STR-FM pipeline.

The locus-specific sequencing depth (yrequired) is the sequencing depth at which an STR locus is expected to reach the specified informative read depth, assuming reads map uniformly. It is calculated as follows: ::

    yrequired = ( X * L ) / (L - (2F+r-1))

where X = informative read depth, L = read length, F = the number of flanking bases required on each side, and r = the expected repeat length of the STR of interest.
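
As an illustration of the locus-specific formula, here is a minimal Python 3 sketch. It is not the tool's own script; the function name ``locus_specific_depth`` and the error check are assumptions made for this example. With the tool's default values (X = 5, L = 100, F = 20, r = 10) it gives a depth of roughly 9.8.

```python
def locus_specific_depth(info_depth, read_length, flank_size, repeat_length):
    """Return yrequired = X * L / (L - (2F + r - 1)).

    Hypothetical helper for illustration only; not part of STR-FM.
    """
    # Number of read start positions at which a read spans the repeat
    # together with the required flank on both sides.
    usable_starts = read_length - (2 * flank_size + repeat_length - 1)
    if not usable_starts > 0:
        raise ValueError("read is too short to span the repeat plus both flanks")
    return info_depth * read_length / usable_starts


# Tool defaults: X = 5, L = 100, F = 20, r = 10
print(locus_specific_depth(5, 100, 20, 10))
```

The denominator shrinks as the repeat or the required flanks get longer, so longer STRs need a higher sequencing depth to reach the same informative read depth.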

The genome-wide sequencing depth is the sequencing depth at which a given proportion of the genome (e.g. 90 or 95 percent) reaches the locus-specific sequencing depth. It is found by numerical search for the smallest lambda such that: ::

    1 - [P(Y=0) + P(Y=1) + … + P(Y=yrequired-1)] >= 0.90 (or other proportion specified by the user)

where P(Y=y) = (lambda^y * e^(-lambda)) / y!, y = a specific level of sequencing depth, and lambda = the genome-wide sequencing depth.
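
The numerical search can be sketched in Python 3 as follows. This is an illustrative sketch, not the tool's own script. It assumes the intended condition is that at least the requested proportion of loci reach yrequired, that is P(Y >= yrequired) >= proportion under a Poisson model. The function names, the grid step of 0.01, and the rounding of yrequired up to a whole number of reads are all assumptions made for this example.

```python
import math


def poisson_cdf(k, lam):
    """Return P(Y = 0) + ... + P(Y = k) for Y ~ Poisson(lam)."""
    if 0 > k:
        return 0.0
    term = math.exp(-lam)  # P(Y = 0)
    total = term
    for y in range(1, k + 1):
        term *= lam / y  # P(Y = y) computed from P(Y = y - 1)
        total += term
    return total


def genome_wide_depth(y_required, proportion, step=0.01):
    """Smallest lambda on a grid of width `step` with P(Y >= y_required) >= proportion."""
    if not 1 > proportion:
        raise ValueError("proportion must be below 1")
    # Assumption: round yrequired up to a whole number of reads.
    k = math.ceil(y_required) - 1
    n = 0
    # The upper tail probability grows with lambda, so a linear scan terminates.
    while proportion > 1.0 - poisson_cdf(k, n * step):
        n += 1
    return n * step


# Tool defaults: yrequired from X = 5, L = 100, F = 20, r = 10; proportion = 0.9
y_required = 5 * 100 / (100 - (2 * 20 + 10 - 1))
print(genome_wide_depth(y_required, 0.9))
```

A linear scan keeps the sketch close to the description above; a bisection search would return the same answer to within the grid width in fewer steps.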

Please refer to the Methods section of the paper cited below for further details.

**Citation**

When you use this tool, please cite **Fungtammasan A, Ananda G, Hile SE, Su MS, Sun C, Harris R, Medvedev P, Eckert K, Makova KD. 2015. Accurate Typing of Short Tandem Repeats from Genome-wide Sequencing Data and its Applications, Genome Research**
</help>
</tool>