---
layout: default
---
<style type="text/css">
.selectedHeader {
font-weight: 500;
}
.selectedDiv {
border: 5px solid red;
}
</style>
<h1>Frequently asked questions</h1>
<br>
<h5>FAQs about lineages are available <a href="https://www.pango.network/">here</a>. The following FAQs relate to the pangolin software tool and the cov-lineages.org website.</h5>
<br>
<h3>Where does the data come from?</h3>
<div class="contrib">
The data used to inform pangolin assignments are the lineages hosted at <a href="https://github.com/cov-lineages/pango-designation">pango-designation</a>, matched to the latest GISAID data. In line with the data agreement, we do not share GISAID data, and appropriate permissions have been given.
</div>
<h3>How often is the website updated?</h3>
<div class="contrib">
The website is updated on a daily basis with the latest lineage assignments using all full genome sequences on GISAID.
</div>
<h3>What support statistics are there?</h3>
<div class="contrib">
<h5>pangolin 3.0</h5>
Full details about support statistics output in pangolin 3.0 can be found in the <a href="resources/pangolin/output.html">pangolin documentation</a>.
<h5>pangolin 2.0</h5>
Recall and supporting statistics, hosted <a href="https://github.com/cov-lineages/pangoLEARN/blob/master/pangoLEARN/data/lineagerecalls.txt">here</a>, were generated using the same procedure as above to train a model using 75% of the data, while the remaining 25% was used as testing data. Smaller lineages may have lower recall rates because of the very small sample sizes in the training and test sets.
<h5>pangolin 1.0</h5>
Of 9,843 GISAID sequences assigned lineages by hand (taking sequence, phylogeny and metadata into account), pangolin accurately assigns the lineage of 97.85% of those sequences. Of the sequences that were not recalled correctly, 74.5% had UFbootstrap and aLRT values of 0. We are continuing to work on improving this recall rate, but recommend interpreting the pangolin output cautiously, with due attention to the UFbootstrap and aLRT values. <br><br>
Because SARS-CoV-2 evolves relatively slowly for an RNA virus and there is still not a huge amount of diversity, missing or ambiguous data at key residues may lead to incorrect placement within the guide tree. We have a filter in place that by default will not call a lineage for any sequence with more than 50% N-content, but this can be made more conservative with the command line option `--max-ambig`.
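<br><br>
As a rough illustration of the N-content filter described above, the sketch below computes the fraction of N bases in a sequence and compares it against a cutoff. The function names <code>nContent</code> and <code>passesAmbiguityFilter</code> are hypothetical and are not part of pangolin; the default cutoff of 0.5 simply mirrors the 50% figure quoted above, and pangolin's own implementation may count ambiguity differently.

```javascript
// Hypothetical helper (not part of pangolin): fraction of bases that are "N".
function nContent(sequence) {
  if (sequence.length === 0) {
    return 0; // avoid dividing by zero for an empty sequence
  }
  const nCount = (sequence.toUpperCase().match(/N/g) || []).length;
  return nCount / sequence.length;
}

// Hypothetical helper: true if the sequence is at or below the cutoff.
// maxAmbig = 0.5 is an assumption taken from the 50% default quoted above.
function passesAmbiguityFilter(sequence, maxAmbig = 0.5) {
  return nContent(sequence) <= maxAmbig;
}

// Usage: a lower cutoff makes the filter more conservative.
const example = "ACGTNNNN";
console.log(nContent(example));
console.log(passesAmbiguityFilter(example));
console.log(passesAmbiguityFilter(example, 0.3));
```

Lowering the cutoff rejects more sequences, which corresponds to the "more conservative" setting mentioned for `--max-ambig`.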
</div>
<h3>Why might a lineage assignment change?</h3>
<div class="contrib">
A lineage assignment is a "best guess" at what the lineage of an unknown sequence may be based on available data. We report <a href="https://github.com/cov-lineages/pango-designation">pango-designation</a> version numbers with the assignment and this indicates what data the inference engine bases the assignment on. <br><br>
This assignment comes with a certain amount of noise. Our most recent estimates give an average recall of 95.8% for designated lineages, with some lineages having better recall and precision values than others. The accuracy of assignment may vary depending on a number of factors, including the number of sequences in that lineage (i.e. quantity of data), the amount of ambiguity in those sequences (i.e. quality of data) and how unique the defining SNPs are for that lineage (e.g. E484K may be associated with a number of different lineages). <br><br>
The assignment may change as new designations are made and new releases of the model are tagged. It may be that your sequence gets included in a new lineage designation that didn't exist when you first ran your sequence through pangolin. The full list of designated sequences can be found at <a href="https://github.com/cov-lineages/pango-designation" style="color:#7351A3">github.com/cov-lineages/pango-designation</a>.
</div>
<h3>I think my sequence has been incorrectly assigned, what can I do?</h3>
<div class="contrib">
The vast majority of the time, an incorrect assignment is down to missing data. A given sequence may be lacking an informative SNP and this can lead to incorrect placement in the pangoLEARN decision tree, or incorrect placement in the UShER protobuf file if running in `--usher` mode (available from pangolin 3.0 onwards). <br><br>
If you have a complete genome sequence and suspect there is something wrong with the assignment model, note that the model is informed by the data in <a href="https://github.com/cov-lineages/pango-designation">pango-designation</a>. The lineages are all designated by hand based on evidence in the phylogenetic tree. If you believe a sequence should be designated as something other than what it is being assigned, you can follow the instructions at <a href="https://www.pango.network/how-does-the-system-work/how-to-suggest-a-new-lineage/">pango.network</a> to submit a lineage update or correction. This involves posting an issue to <a href="https://github.com/cov-lineages/pango-designation">pango-designation</a>, where the <a href="https://www.pango.network/committees/committee-structure/">Pango Network Team</a> will respond to and address queries.
</div>
<h3>I've found a bug in the pangolin software, what can I do?</h3>
<div class="contrib">
Please post an issue to the <a href="https://github.com/cov-lineages/pangolin">pangolin GitHub repository</a>, where we will answer any queries and try to fix any bugs you may have found!
</div>
<script type="text/javascript">
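// Total page height, used below to decide whether to offset the scroll target.
var limit = Math.max(document.body.scrollHeight, document.body.offsetHeight,
    document.documentElement.clientHeight, document.documentElement.scrollHeight, document.documentElement.offsetHeight);
// Read the zero-based question index from the "q" URL parameter (e.g. ?q=2).
const queryString = window.location.search;
const urlParams = new URLSearchParams(queryString);
var selectedQ = urlParams.get("q");
var num = parseInt(selectedQ, 10);
// "content_wrapper" is not defined in this file; it is assumed to come from the page layout.
var wrapper = document.getElementById("content_wrapper");
if (!Number.isNaN(num) && wrapper) {
    var count = 0;
    for (var el of wrapper.children) {
        // Highlight the heading of the selected question.
        if (el.tagName === "H3" && count === num) {
            el.classList.add("selectedHeader");
            el.scrollIntoView();
        }
        // Each answer is a div whose first class is "contrib"; count them to find the selected one.
        if (el.classList && el.classList[0] === "contrib") {
            if (count === num) {
                el.classList.add("selectedDiv");
                // Scroll to 175px above the answer so that it is not flush with the top of the window.
                var scrollDiv = el.offsetTop;
                if (scrollDiv < limit) {
                    scrollDiv = scrollDiv - 175;
                }
                window.scrollTo({ top: scrollDiv, behavior: 'smooth' });
            }
            count = count + 1;
        }
    }
}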
</script>