----------------------------------------------------------------------------
zCall: A Rare Variant Caller for Array-Based Genotyping
Author: Jackie Goldstein - jigold@broadinstitute.org
Contact: Benjamin Neale - bneale@broadinstitute.org
May 8th, 2012
***The Illumina provided Code was provided as-is and with no warranty as to performance and no warranty against it infringing any other party's intellectual property rights.
---------------------------------------------------------------------------
Updates:
May 8th, 2012:
1. Fixed scripts to account for sites where <= 2 points were assigned to both homozygote clusters.
2. Set the default AutoCall PED file (Version 1) to output A/B alleles rather than A,T,G,C alleles so that the output of all three versions is consistent. This also avoids differences in the reference strand used when comparing calls across sites. To revert to outputting A,T,G,C alleles, look at zCall.py and uncomment/comment the specified lines.
3. Updated input GenomeStudio report format (Version 2 & 3) to include the chromosome and position in order to avoid using the probe manifest file. The input should now resemble the following:
Name<tab>Chr<tab>Position<tab>Sample1.GType<tab>Sample1.X<tab>Sample1.Y
rs000001<tab>1<tab>900001<tab>NC<tab>0.0000<tab>0.0000
This affects the input parameters to convertReportToTPED.py and zCall.py
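
As an illustration of the report layout described above, here is a minimal sketch of a reader for it. This is not the code used in convertReportToTPED.py or zCall.py. The function name parse_report is hypothetical, and the sketch assumes Python 3 and that every sample contributes exactly three columns (GType, X, Y) in that order.

```python
import csv
import io


def parse_report(handle):
    """Yield (name, chrom, position, calls) for each SNP row.

    calls maps each sample ID to a (gtype, x, y) tuple.
    Hypothetical helper, not part of the zCall scripts.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    if header[:3] != ["Name", "Chr", "Position"]:
        raise ValueError("expected Name, Chr, Position as the first three columns")
    sample_cols = header[3:]
    if len(sample_cols) % 3 != 0:
        raise ValueError("expected GType, X and Y columns for every sample")
    # Strip only the final suffix (.GType) so sample IDs containing
    # periods are preserved.
    samples = [col.rsplit(".", 1)[0] for col in sample_cols[0::3]]
    for row in reader:
        name, chrom, position = row[0], row[1], int(row[2])
        calls = {}
        for i, sample in enumerate(samples):
            gtype, x, y = row[3 + 3 * i: 6 + 3 * i]
            calls[sample] = (gtype, float(x), float(y))
        yield name, chrom, position, calls


# The two example lines from the format description above.
example = (
    "Name\tChr\tPosition\tSample1.GType\tSample1.X\tSample1.Y\n"
    "rs000001\t1\t900001\tNC\t0.0000\t0.0000\n"
)
records = list(parse_report(io.StringIO(example)))
```

Each record carries the chromosome and position directly, which is why the probe manifest file is no longer needed.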
May 21st, 2012:
1. Added a minimum signal intensity threshold filter for No Calls in the findThresholds.py script, set with the -I flag. The default is 0.2. To re-call all sites, use -I 0.0.
June 21st, 2012:
1. There is an error in the README for Versions 2 and 3. The example usage for the calibrateZ.py script should read:
python calibrateZ.py -R my.report -T my.thresholds -E my.egt > my.concordance.stats
October 3rd, 2012:
1. Updated the website so it's easier to find files.
2. Updated the main README with the reference to the zCall publication in Bioinformatics. Also added some clarifying points based on responses from users: it is not necessary to do site QC before using zCall (sample QC is necessary), Version 3 is the preferred version to use, and the example input files are not meant for testing (they won't work).
3. Added a new zip file to each folder with minor updates to README files. Also fixed a small bug in the calibrateZ.py script for Version 3 and the sampleConcordance.py script for Version 1. (The bug caused the script to get killed when there was an "NA" in the thresholds file.) You can either download the latest zip file bundle for each version (VersionX.3) or just download the scripts with the bug fixes from the additionalScripts/ folder.
January 10th, 2014:
The following applies to zCall_Version3.4_GenomeStudio.tgz:
1. Fixed bugs where the ID name isn't output correctly if it contains periods or if the SNP name contains "snp".
2. Made the scripts more efficient.
3. Made the scripts compatible with a GenomeStudio report that has been gzipped (ends with .gz).
4. Added a script qcReport.py that filters out samples with a call rate less than a given threshold.
example usage: python qcReport.py -R my.report -C 0.99 > my.qc.report
-R is the original GenomeStudio report
-C is the minimum call rate (0-1)
output is a GenomeStudio report with samples dropped if their call rate is below the threshold
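
The gzip handling and call-rate filter described in this update can be sketched roughly as follows. This is not the code in qcReport.py. The names open_report and filter_by_call_rate are hypothetical, and the sketch assumes Python 3, the Name/Chr/Position report layout with three columns per sample, and that a sample's call rate is the fraction of sites whose GType is not "NC".

```python
import gzip
import os
import tempfile


def open_report(path):
    """Open a report in text mode, gzipped or not (hypothetical helper)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def filter_by_call_rate(lines, min_call_rate):
    """Return report lines with low-call-rate samples removed.

    Assumption: call rate = fraction of sites whose GType is not "NC".
    Samples with a call rate below min_call_rate are dropped.
    """
    rows = [line.rstrip("\r\n").split("\t") for line in lines if line.strip()]
    header, body = rows[0], rows[1:]
    n_samples = (len(header) - 3) // 3
    keep = []
    for i in range(n_samples):
        gtype_col = 3 + 3 * i
        called = sum(1 for row in body if row[gtype_col] != "NC")
        rate = called / len(body) if body else 0.0
        if rate >= min_call_rate:
            keep.append(i)
    # Always keep Name, Chr, Position, plus the three columns of each kept sample.
    cols = [0, 1, 2] + [3 + 3 * i + k for i in keep for k in range(3)]
    return ["\t".join(row[c] for c in cols) for row in rows]


# Made-up two-sample report: S1 is called at both sites, S2 at one of two.
example = [
    "Name\tChr\tPosition\tS1.GType\tS1.X\tS1.Y\tS2.GType\tS2.X\tS2.Y",
    "rs1\t1\t100\tAA\t1.0\t0.1\tNC\t0.0\t0.0",
    "rs2\t1\t200\tAB\t0.5\t0.5\tBB\t0.1\t1.0",
]

# Round trip through a gzipped file to exercise the .gz branch.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "my.report.gz")
    with gzip.open(path, "wt") as out:
        out.write("\n".join(example) + "\n")
    with open_report(path) as handle:
        filtered = filter_by_call_rate(handle, 0.99)
```

In this made-up example, S2 has a call rate of 0.5 and is dropped at a threshold of 0.99, while S1 is kept. The sketch holds the whole report in memory for simplicity; a large report would be better handled in two streaming passes.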