forked from ErasmusMC-Bioinformatics/shm_csr
<html>
<head>
<meta http-equiv=Content-Type content="text/html; charset=UTF-8">
<meta name=Generator content="Microsoft Word 14 (filtered)">
<style>
<!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin-top:0in;
margin-right:0in;
margin-bottom:10.0pt;
margin-left:0in;
line-height:115%;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
.MsoChpDefault
{font-family:"Calibri","sans-serif";}
.MsoPapDefault
{margin-bottom:10.0pt;
line-height:115%;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
-->
</style>
</head>
<body lang=EN-US>
<div class=WordSection1>
<p class=MsoNormalCxSpFirst style='margin-bottom:0in;margin-bottom:.0001pt;
text-align:justify;line-height:normal'><span lang=EN-GB style='font-size:12.0pt;
font-family:"Times New Roman","serif"'>This table shows the order of the
filtering steps and the number and percentage of sequences remaining after each
step. </span></p>
<p class=MsoNormalCxSpMiddle style='margin-bottom:0in;margin-bottom:.0001pt;
text-align:justify;line-height:normal'><u><span lang=EN-GB style='font-size:
12.0pt;font-family:"Times New Roman","serif"'>Input:</span></u><span
lang=EN-GB style='font-size:12.0pt;font-family:"Times New Roman","serif"'> The
number of sequences in the original IMGT file. This is always 100% of the
sequences.</span></p>
<p class=MsoNormalCxSpMiddle style='margin-bottom:0in;margin-bottom:.0001pt;
text-align:justify;line-height:normal'><u><span lang=EN-GB style='font-size:
12.0pt;font-family:"Times New Roman","serif"'>After "no results" filter: </span></u><span
lang=EN-GB style='font-size:12.0pt;font-family:"Times New Roman","serif"'>IMGT
classifies each sequence as "productive", "unproductive", "unknown", or "no
results". Here, the number and percentage of sequences that are not classified
as "no results" are reported.</span></p>
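<p class=MsoNormalCxSpMiddle style='margin-bottom:0in;margin-bottom:.0001pt;
text-align:justify;line-height:normal'><span lang=EN-GB style='font-size:12.0pt;
font-family:"Times New Roman","serif"'>As a rough illustration of how this count
could be derived (a sketch, not the pipeline's actual code), the Python snippet
below assumes that each sequence carries one IMGT functionality label in a plain
list and that the percentage is taken relative to the input count. The function
name and the sample labels are hypothetical.</span></p>
<pre>
```python
def count_after_no_results_filter(functionality_labels):
    """Return (count, percentage of input) of sequences not labelled 'no results'."""
    total = len(functionality_labels)
    kept = sum(1 for label in functionality_labels if label != "no results")
    # Assumption: percentages are expressed relative to the input count.
    percentage = round(100.0 * kept / total, 1) if total else 0.0
    return kept, percentage


# Hypothetical labels for five input sequences.
labels = ["productive", "unproductive", "no results", "unknown", "productive"]
kept, percentage = count_after_no_results_filter(labels)
print(kept, percentage)
```
</pre>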
<p class=MsoNormalCxSpMiddle style='margin-bottom:0in;margin-bottom:.0001pt;
text-align:justify;line-height:normal'><u><span lang=EN-GB style='font-size:
12.0pt;font-family:"Times New Roman","serif"'>After functionality filter:</span></u><span
lang=EN-GB style='font-size:12.0pt;font-family:"Times New Roman","serif"'> The
number and percentage of sequences that have passed the functionality filter. The
filtering performed depends on the settings of the functionality filter.
Details on the functionality filter <a name="OLE_LINK12"></a><a
name="OLE_LINK11"></a><a name="OLE_LINK10">can be found on the start page of
the SHM&amp;CSR pipeline</a>.</span></p>
<p class=MsoNormalCxSpMiddle style='text-align:justify'><u><span lang=EN-GB
style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>After
removal sequences that are missing a gene region:</span></u><span lang=EN-GB
style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>
In this step, all sequences that are missing a gene region (FR1, CDR1, FR2,
CDR2, FR3) that should be present are removed from the analysis. The gene
regions that should be present depend on the setting of the "sequence
starts at" filter. <a name="OLE_LINK9"></a><a name="OLE_LINK8">The number and
percentage of sequences that pass this filter step are reported.</a> </span></p>
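<p class=MsoNormalCxSpMiddle style='text-align:justify'><span lang=EN-GB
style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>The
idea behind this step can be sketched in Python as follows. The sketch assumes
that every region from the chosen start region up to FR3 must be present, and
that a missing region is stored as an empty string. The actual options of the
"sequence starts at" filter and the pipeline's own data layout may differ; all
names and sequences below are hypothetical.</span></p>
<pre>
```python
# Gene regions in 5' to 3' order, as listed in the text above.
REGION_ORDER = ["FR1", "CDR1", "FR2", "CDR2", "FR3"]


def required_regions(starts_at):
    """Regions that must be present for a given start setting (assumption)."""
    return REGION_ORDER[REGION_ORDER.index(starts_at):]


def has_all_regions(record, starts_at):
    """A region counts as missing when its value is absent or empty."""
    return all(record.get(region) for region in required_regions(starts_at))


# Hypothetical records: region name mapped to its nucleotide sequence.
records = [
    {"FR1": "gag", "CDR1": "ggc", "FR2": "tgg", "CDR2": "att", "FR3": "cga"},
    {"FR1": "", "CDR1": "ggc", "FR2": "tgg", "CDR2": "att", "FR3": "cga"},
    {"FR1": "", "CDR1": "", "FR2": "tgg", "CDR2": "att", "FR3": "cga"},
]
kept_from_fr1 = [r for r in records if has_all_regions(r, "FR1")]
kept_from_cdr1 = [r for r in records if has_all_regions(r, "CDR1")]
print(len(kept_from_fr1), len(kept_from_cdr1))
```
</pre>
<p class=MsoNormalCxSpMiddle style='text-align:justify'><span lang=EN-GB
style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>In
this sketch, a later start region requires fewer regions, so more sequences
pass the step.</span></p>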
<p class=MsoNormalCxSpMiddle style='text-align:justify'><u><span lang=EN-GB
style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>After
N filter:</span></u><span lang=EN-GB style='font-size:12.0pt;line-height:115%;
font-family:"Times New Roman","serif"'> In this step, all sequences that contain
an ambiguous base (n) in the analysed region or the CDR3 are removed from the
analysis. The analysed region is determined by the setting of the "sequence
starts at" filter. The number and percentage of sequences that pass this filter
step are reported.</span></p>
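<p class=MsoNormalCxSpMiddle style='text-align:justify'><span lang=EN-GB
style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>A
minimal Python sketch of such a check is shown below. It assumes that each
sequence is available as a mapping from region name to nucleotide string, that
the analysed regions are passed in explicitly, and that an upper-case N is
treated the same as a lower-case n. These are illustrative assumptions, not a
description of the pipeline's implementation.</span></p>
<pre>
```python
def passes_n_filter(record, analysed_regions):
    """Return True when no analysed region, nor the CDR3, contains an n."""
    regions = list(analysed_regions) + ["CDR3"]
    # Assumption: the ambiguous base is matched case-insensitively.
    return not any("n" in record.get(region, "").lower() for region in regions)


# Hypothetical setting: the analysed region starts at CDR1.
analysed = ["CDR1", "FR2", "CDR2", "FR3"]
records = [
    # n only in FR1, which lies outside the analysed region: kept.
    {"FR1": "gan", "CDR1": "ggc", "FR2": "tgg", "CDR2": "att", "FR3": "cga", "CDR3": "gcg"},
    # N in the CDR3: removed.
    {"FR1": "gag", "CDR1": "ggc", "FR2": "tgg", "CDR2": "att", "FR3": "cga", "CDR3": "gNg"},
    # n in CDR1, inside the analysed region: removed.
    {"FR1": "gag", "CDR1": "gnc", "FR2": "tgg", "CDR2": "att", "FR3": "cga", "CDR3": "gcg"},
]
passed = [passes_n_filter(record, analysed) for record in records]
print(passed)
```
</pre>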
<p class=MsoNormalCxSpMiddle style='text-align:justify'><u><span lang=EN-GB
style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>After
filter unique sequences</span></u><span lang=EN-GB style='font-size:12.0pt;
line-height:115%;font-family:"Times New Roman","serif"'>: The number and
percentage of sequences that pass the "filter unique sequences" filter. Details
on this filter </span><span lang=EN-GB style='font-size:12.0pt;line-height:
115%;font-family:"Times New Roman","serif"'>can be found on the start page of
the SHM&amp;CSR pipeline.</span></p>
<p class=MsoNormalCxSpMiddle style='text-align:justify'><u><span lang=EN-GB
style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>After
remove duplicate based on filter:</span></u><span lang=EN-GB style='font-size:
12.0pt;line-height:115%;font-family:"Times New Roman","serif"'> The number and
percentage of sequences that pass the "remove duplicate based on" filter. Details
on this filter can be found on the start page of the
SHM&amp;CSR pipeline.</span></p>
<p class=MsoNormalCxSpMiddle style='text-align:justify'><a name="OLE_LINK17"></a><a
name="OLE_LINK16"><u><span lang=EN-GB style='font-size:12.0pt;line-height:115%;
font-family:"Times New Roman","serif"'>Number of matched sequences:</span></u></a><span
lang=EN-GB style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>
The number and percentage of sequences that passed all the filters described
above and have a (sub)class assigned.</span></p>
<p class=MsoNormalCxSpMiddle style='text-align:justify'><u><span lang=EN-GB
style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>Number
of unmatched sequences</span></u><span lang=EN-GB style='font-size:12.0pt;
line-height:115%;font-family:"Times New Roman","serif"'>: The number and percentage
of sequences that passed all the filters described above and do not have a
(sub)class assigned.</span></p>
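<p class=MsoNormalCxSpMiddle style='text-align:justify'><span lang=EN-GB
style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>The
split into matched and unmatched sequences could be tallied as in the Python
sketch below. It assumes that the sequences that passed all filters are held in
a mapping from sequence ID to assigned (sub)class, that an unassigned sequence
is stored as None or an empty string, and that percentages are relative to the
input count. The IDs, (sub)class values and function name are hypothetical.</span></p>
<pre>
```python
def split_by_subclass(assignments, input_count):
    """Count sequences with and without an assigned (sub)class."""
    # Assumption: None or an empty string means no (sub)class was assigned.
    matched = sum(1 for subclass in assignments.values() if subclass)
    unmatched = len(assignments) - matched

    def as_percentage(count):
        # Assumption: percentages are expressed relative to the input count.
        return round(100.0 * count / input_count, 1) if input_count else 0.0

    return {
        "matched": (matched, as_percentage(matched)),
        "unmatched": (unmatched, as_percentage(unmatched)),
    }


# Hypothetical example: 4 of 10 input sequences passed all filters.
assignments = {"seq1": "IGA1", "seq2": "IGG2", "seq3": None, "seq4": ""}
summary = split_by_subclass(assignments, input_count=10)
print(summary)
```
</pre>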
<p class=MsoNormal><span lang=EN-GB> </span></p>
</div>
</body>
</html>