CHANGES LOG
-----------
- Added .github/dependabot.yml file to auto-update GitHub actions
0.40.0
- change permissions of STAR working directory to allow deletion by standard pipeline tools
- add Dockerfile to create container with appropriate tools for basic analyses
- Updated image labels
0.39.0
- use "samtools index" instead of "cram_index" to create cram indexes
- for bam and cram index creation, use "samtools index -o <index_file>" to explicitly indicate the index file
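The explicit index naming described under 0.39.0 can be sketched as below. This is an illustrative Python helper, not code from the pipeline templates: the function name index_command and the <file>.bai / <file>.crai naming are assumptions; only the "samtools index -o <index_file>" form comes from this log.

```python
import os
from typing import List

# Assumed naming convention: append the index suffix to the full file name.
INDEX_SUFFIX = {".bam": ".bai", ".cram": ".crai"}


def index_command(aln_path: str) -> List[str]:
    """Build an argv list for `samtools index -o <index_file> <aln_file>`."""
    ext = os.path.splitext(aln_path)[1]
    if ext not in INDEX_SUFFIX:
        raise ValueError(f"expected a .bam or .cram file, got: {aln_path}")
    index_file = aln_path + INDEX_SUFFIX[ext]
    # -o names the index file explicitly instead of relying on the default
    return ["samtools", "index", "-o", index_file, aln_path]
```

The returned list could be passed to subprocess.run; for example, index_command("run.cram") names the index run.cram.crai.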
0.38.0
- add additional autosome specific coverage threshold to final_output_prep.json
0.37.2
- change bambi i2b command in stage1 analysis to use new boolean version of --nocall-quality flag
0.37.1
- amend markdup command chain for duplexseq (nanoseq) analyses
- updated to use samtools merge instead of bammerge for library merging
0.37.0
- add i2b_nocall_qual_switch to stage1 analysis to enable "--nocall-quality=2" to be set for bambi i2b
0.36.0
- add hs_bwa_executable parameter to allow different bwa executables for target and nchs alignments (e.g. bwa-mem2 and bwa0_6)
- add optminpixeldif flag (optical distance for duplicates) to bamstreamingmarkduplicates command (biobambam markdup method)
0.35.0
- replace scramble with samtools in analysis pipeline templates
0.34.0
- add calmd to biobambam, samtools and duplexseq markdup methods
0.33.1
- update CI and tests
0.33.0
- extend list of preserved auxtags in realignment bamreset (for Duplex-Seq)
0.32.4
- add supplementary files used in dehumanising CLIMB uploads for ENA (Heron)
- add maxreadlen flag to bamstreamingmarkduplicates in markdup_biobambam with default of 500
0.32.3
- force markdup_method:none (and primer_clip_method:no_clip) for untagged phiX and non-consented human split
0.32.2
- change prefix for SamHaplotag log files to include a trailing '.' (simplifies npg_irods archival)
0.32.1
- add missing edge to top-level human split template
0.32.0
- add haplotag processing to stage2 analysis
- move C2A analysis from bam to cram output stream (to avoid possible malformed MD tag values)
- remove intermediate files from duplexseq processing
- flatten edges arrays (burst sub-arrays) as part of vtfp processing
0.31.0
- add C2A analysis to stage2 of standard pipelines
0.30.0
- add optional flags to bwa mem command for Hi-C library stage 2 analyses
- bam files (and md5, index files) produced using samtools and md5sum instead of bamrecompress
0.29.0
- renamed markdup_botseq.json to markdup_duplexseq.json
- add markdup_botseq.json
0.28.2
- move samtools calmd to correct position downstream of coord sort, and make calmd -Q flag (quiet) the default (alignment with markdup_method:none)
0.28.1
- amend samtools command to "ampliconclip" (previously "amplicon-clip")
0.28.0
- a new template for cases when marking duplicates is not required, with
  an option to clip primers, which relies on a new feature in samtools
0.27.0
- check for divide by 0 when calc samtools subsample value
- drop file check, explicitly assume fastq.gz files are gzipped when calc #reads for salmon
- make the code flow more robust in cases of insufficient reads for salmon
- unconditionally remove auxtags before adapter clipping when realignment_switch is 1
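The divide-by-zero check noted under 0.27.0 can be illustrated with a short sketch. This is a hypothetical Python helper, not the pipeline's own code: the name subsample_arg, the seed default and the rounding to 8 decimal places are assumptions. The SEED.FRAC form of the "samtools view -s" argument and the 10K read target are taken from the 0.22.0 entry in this log.

```python
from typing import Optional


def subsample_arg(total_reads: int, target_reads: int = 10000, seed: int = 1) -> Optional[str]:
    """Return a SEED.FRAC value for `samtools view -s`, or None to skip subsampling."""
    if total_reads <= 0:
        return None  # guard: no reads, so the division below must not run
    frac = round(target_reads / total_reads, 8)
    if frac >= 1.0:
        return None  # input already at or below the target size: keep every read
    # Integer part is the seed, fractional part is the fraction of reads to keep.
    frac_digits = f"{frac:.8f}".split(".")[1]
    return f"{seed}.{frac_digits}"
```

For example, subsample_arg(40000) gives "1.25000000" (seed 1, keep a quarter of the reads), while subsample_arg(0) returns None rather than raising ZeroDivisionError.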
0.26.0
- add parameters file for top-up merge
- functional equivalence: enable selection of markdup method - biobambam (default), samtools or picard
0.25.0
- add target autosome stats file generation to final_output_prep
- library merge:
  cram_write_option (default: use_tears, option use_local)
  updated templates to v2.0 format
- add options to standard stage2 templates to handle realignment
release 0.24.2
- stage 1 and seq alignment templates: remove fastqcheck file generation
- stage 1 template: remove generation of a fastq file for index read
release 0.24.1
- correct count of minimum reads to run Salmon
release 0.24.0
- allow format selection for stage1 outputs/stage2 inputs (default: cram)
- stage1 adapter detection now done in parallel with phix alignment
- add samtools stats qc check to stage1 analysis
- stage2 analysis templates amended to allow merging of multiple inputs; application of spatial_filter moved from stage1
- RNA alignments handle single-end runs
- new hisat2 alignment method
- use compressed fastq input for star alignment
- salmon quantification
  check that there are enough reads in the fastq file inputs
  check if quant file is present before copying it
  connect zip node to quant_genes node to ensure zip waits for salmon
- post_alignment_realignment.json: new port specification format
release 0.23.0
- target stats (samtools) addition to final output prep
release 0.22.1
- add default ("ifnull":"salmon") value for quant_method parameter to fix star/tophat2 tag#0 problem
release 0.22.0
- new template for minimap2 alignment, with optional post-alignment filtering of secondary and supplementary
  alignments, and minimap2_F_value parameter to allow non-default insert size to be specified (-F flag)
- subsampling of 10K reads (fastq produced by samtools view -s SEED.FRAC) and production of fastqcheck files
  added to stage1 and stage2 analyses (nchs, single-/paired-end)
- add lane-level samtools stats to stage1
- spatial filter application moved to stage2 from stage1
- amend spatial_filter input/output file names for easier identification
- empty edge entries recognised as noop - filter in viv.pl, ignore in visualisation, do not apply VTFILE
  prefixes when there are no to and from attributes
- added -F option to minimap2
release 0.21.0
- salmon_alignment: copy quant.genes.sf to archive directory
- star_alignment: assign appropriate suffixes to bamtofastq output files
- merge: flag change for use with tears 1.2.4
release 0.20.1
- star_alignment
  correct version number (to 2.0)
  do queryname sort after alignment
- upgrade realignment template to version 2
- viv.pl: loosen overly strict checks when using output file nodes with subtype dummy
release 0.20.0
- phase 1 of new format templates (port spec)
- replacement of AlignmentFilter and spatial filter with bambi commands
- now handles XA/Y-split with no target alignment
release 0.19.3
- fixes for nonconsented human split with no target alignment
  change intermediate OUTFILE node to RAFILE
  remove unneeded bammarkduplicates and parameters only used by sort and duplicate marking
release 0.19.2
- fix for single-end alignments (bwa aln): specify use_STD[IN|OUT] attributes for EXEC nodes in bwa_aln_se_alignment
release 0.19.1
- fix split_by_chromosome template (ysplit)
release 0.19.0
- add new templates for STAR alignment and Salmon
- Small change in final_output_prep to allow bamsort_cmd to have configurable executable and scramble to have optional embed reference param.
- add select directive
- splice/prune operations can be specified in param_vals file
- new version2 format of port specification in templates
- correct failure to report errors lower than one VTFILE level
release 0.18.6
- Y-split fixes
  When processing VTFILE nodes, make sure that parameters are reevaluated locally instead of naively inheriting values
  from the enclosing template (otherwise local changes in values of components of a parameter value may be ignored).
  use bambi for split by chromosome (remove java)
  10-vtfp-vtfile.t - add tests for VTFILE functionality
release 0.18.5
- Correct logic in vtfp.pl for allowing edges without id attributes
- Reduce verbosity in vtfp.pl
- Fix handling of splice/prune in vtfp.pl where source/destination is STDIN/STDOUT
- Use fopid in place of other substitutions
- use FindBin in vtfp.pl and viv.pl
release 0.18.4
- removal of -n 1 flag with seqchksum_merge.pl command in merge_aligned template to maintain current behaviour
- logging changes
  correct unneeded warnings about changing use_STDIN/use_STDOUT values for INFILE/OUTFILE/RAFILE node
  recognise JSON::PP::Boolean type (remove warning from log)
  allow edges to have no id values (warnings about undefined values in log files should no longer appear)
- cope with differing line return behaviour of versions of rev utility (test fix)
release 0.18.3
- bwa mem flags default changes and additions
  don't use -T by default
  [new flag] use -K 100000000 (input bases per batch, independent of thread count; large fixed value to make alignment runs replicable)
  [new flag] use -Y to soft clip supplementary alignments instead of hard clipping
- add ability to splice and prune nodes from the final graph
- final_output_prep changes to support optional targeted stats files.
- use threaded bamsormadup in place of bamsort for coord sort
- add parameters to allow selection of java or bambi (default) implementations of i2b and decode
release 0.18.2
- propagate any 'AH' fields provided by the aligner when providing full SQ headers (bwa0.7.15)
- fix bamindexdecoder.json template: add pack option to bamindexdecoder_java_cmd array to handle undefined parameters; change default implementation from "samtools decode" to java BamIndexDecoder
release 0.18.1
- corrected typo (add_item instead of additem) which can cause compilation failure
release 0.18.0
- stage1 analysis : update to bcl_phix_deplex_wtsi_stage1_template; addition of new subgraphs
- added bcftools_genotype_call template for library merge gtcheck qc test
- added facility to allow splicing in of tee(pot) nodes at node outputs at viv.pl runtime
- remove bamcheck and add bam_stats
- added in realignment_wtsi_stage2_humansplit_template which was used to split existing target files from iRODS. Now renamed to realignment_wtsi_humansplit_template.
- changes to merge_final_output_prep to remove extra stats files and allow use of reference for existing stats files
- adding in realignment_wtsi_humansplit_notargetalign_template
- updated bamindexdecoder template (stage1 analysis) to use samtools decode by default (java version still an option)
release 0.17.3
- update Build.PL and tests to have correct and full set of requirements
release 0.17.2
- turn off compression of AlignmentFilter output
- changes to merge_final_output_prep to use tears to stream data into iRODS, generate extra stats,
  use ref in stats file generation and minor tidy up.
release 0.17.1
- -n 1 flag added to seqchksum_merge.pl command in merge_aligned template to allow different tags in column 1
- scramble compression
  up to 7 for final output cram files
  down to 0 for internal bam streams
- port naming conventions (IN/OUT pre- and postfixes) adopted in templates and enforced in viv.pl
- vtfp.pl
  improved error reporting
  refactoring to ensure more consistent/intuitive evaluation of parameter values
  allow specification of "local" parameter substitution (within a specific vtnode)
  added --param_vals --export_param_vals flags
  subst directive attributes ifnull and required added
  remove dead code, review relevance of comments, general tidying
  more tests
- add tests for viv.pl
- add (secondary stage) extra split template - remove human and another genome
release 0.17
- human split with no target align (secondary stage) template introduced
- initial work for stage one (bcl, adapter, phiX and spatial filter to split processing) using templates
- bammarkduplicates reintroduced for unaligned file because downstream qc processing relies on presence of markdups_metrics file
- library cram merging: merge_aligned.json and merge_final_output_prep.json
- remove potential deadlock by using non-blocking open of STDIN
release 0.16.4
- add extra branch to teepot command to stream seqchksum output downstream (instead of using a file as an internal node)
release 0.16.3
- add tee after seqchksum (bam) output to split output to file and cmp node, to remove deadlock
- remove -s flag from all cmp commands (improve diagnostics)
- decrease teepot timeout for bmd_multiway node back to 300
release 0.16.2
- remove superfluous "-" argument to bamseqchksum command
- increase teepot timeout for bmd_multiway node to 50000 (from 500)
release 0.16.1
- add tempdir parameter and verbose (-v) flag for teepot
release 0.16
- human split: new alignment_wtsi_stage2_humansplit_template.json, seqchksum_hs.json; addition of subst_params to alignment_common
- seqchksum comparisons: merge bamseqchksum files for outputs for comparison with initial bam file in seqchksum.json
- added comparison of cram and bam seqchksum within final_output_prep
- scramble reference optional in final_output_prep template (reference name passed as a parameter instead of via subgraph_io)
- realignment templates fixes/amendments: default value for common subst_params file; default to cram input
release 0.15
- fix construction of alternate hash command to construct sha512primesums512 seqchksum file
release 0.14
- correct prefix value given to calibration_pu -p flag (used in output file naming - it contained an unwanted ".bam")
release 0.13
- create cram index files
release 0.12
- viv.pl
  read/write to/from stdin/stdout
  exec failure of a node's command is now fatal (bug fix)
- vtfp.pl
  updated to use new subst_params format in templates
  multiple -keys/-values pairs on the command-line now produce an array on substitution into the template
  when substitution of nested parameters is done, array elements which are themselves arrays are flattened
  (their elements are spliced into the position of the original array); net result is that top-level
  parameter substitutions result in either strings or arrays of strings
- templates
  new subst_params format for production templates
  changes to final_output_prep template to add flexibility to this phase of analysis (e.g. y chrom. split)
  output of seqchksum files for hash type sha512primesums512 added
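The array flattening described for vtfp.pl in release 0.12 can be sketched roughly as follows. This is a Python illustration of the idea only; vtfp.pl itself is a Perl script, and the function name flatten, the recursive treatment of deeper nesting and the example parameter values are assumptions rather than details from this log.

```python
from typing import List, Union


def flatten(value: Union[str, list]) -> Union[str, List[str]]:
    """Return a string unchanged, or a list with nested lists spliced in place."""
    if isinstance(value, str):
        return value
    flat: List[str] = []
    for item in value:
        if isinstance(item, list):
            # splice the nested array's elements into the parent's position
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat
```

With this sketch, flatten(["bwa", ["mem", ["-t", "4"]], "ref.fa"]) yields ["bwa", "mem", "-t", "4", "ref.fa"], so a substituted value ends up as either a string or a flat list of strings.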
release 0.09
- install action should not remove lib directory at target