CHANGES
=======
* fixed bug
* [pre-commit] Fixes from hooks
* Fixed a bug in 06\_align.smk: the BAM file is now sorted if it is not already sorted
* Fixed another location where a PI with two last names was causing problems
* Fixed a bug with a PI having two last names, in a different location
* Typo fixed
* Fixed a bug with groupdir if PI has 2 last names
v2.1.2
------
* [pre-commit] Fixes from hooks
* Added other reference genomes
* Added GRCm39 and GRCh37
* New function get\_organism\_name to fix the organism name from parkour
2.1.1
-----
* [pre-commit] Fixes from hooks
* Fixed typo
* sending email when genome is not found
* fix typo, together with @gerikson
* fixes together with @gerikson
* IMPORTANT: adjust config-parkour-url before running this version. API endpoints are hardcoded
* pre-commit hooks CI autoupdate
v2.1.0
------
* allowing existing target directory in final transfer
* [pre-commit] Fixes from hooks
* avoiding yes/no and true/false for flags because they cause python/bash eval problems
* creating the transfer/proj/Analysis dir if needed
* fix for bash variable context
* black fmt. 2 spaces for comment
* fixing error in variable comparison for Bash command
* error parsing SampleSheet.csv, fixing file reading
* precommit config has snakefmt now
* pre-commit ready
* avoid syntax warning
* in-line snakemake param interpolation now goes into bash to use syntactic sugar for .lower (,,) or .upper (^^)
* hotfix after change on parkour
* smallfix
* ah, this was missing
* clean dead code
* drop deprecated guppy basecaller
* guppy final deprecation
* clean dead code
* drop deprecated guppy basecaller
* drop deprecated guppy basecaller
* small adjustments (#162)
* Upcoming version 1.1.0 (#149)
* adding flowcell/pod5 as another pod dir
* script to clean up backups
* skip SampleSheet in dont\_touch
* passing default process values for each flowcell
* waiting for GPU only when basecalling
* renaming subset files
* renaming subset files
* using seqkit to subset fastq
* using pycoQC conda env
* using seqkit to subset reads
* conda env for pycoQC
* conda env for seqkit
* separating subset fastq for porechop
* debugging errors in porechop
* setting temp dir as /tmp for porechop
* fix for analysis name
* fix for analysis name
* debug error in porechop
* script to generate conda envs
* using individual conda envs
* renaming env for each bioinfo tool
* pysam for pycoQC install
* pysam for pycoQC install
* pycoQC for pip install
* pycoQC for pip install
* rename pycoQC for pip install
* using fastQC after Falco fails
* updating to v 1.21
* using pip to install pycoQC (v2.5.2)
* rolling back pycoQC version
* updating pycoQC
* missing pysam dep
* missing dep for pycoqc
* error space
* error space
* removing env yamls
* error in conda env name
* adding conda env yaml
* removing full path to modules
* removing old tools configs, missing kraken db
* using conda envs
* removing bioinfo tools as they will be in their own conda envs
* creating modular conda envs for bioinformatic tools
* keeping subsampled files for debug
* multithreading for pod5
* updating pod5 tooling
* missing comma for porechop subset params
* missing comma for bam params
* missing comma for bam params
* extending old\_outputDirs
* adding new setting options: basecall, align, modbase, and extending old\_outputDirs
* adding option to target a specific flowcell
* changes for parameters and some suggestions from PR#149
* subsetting fastq for porechop (as it is slow)
* using batch\_size from config
* adding new parameters
* freezing package versions
* using conda for installing programs and deps
* setting undefined model from json metadata
* bug fix for reporting the species in termination email
* adding html parsing for metadata
* config basecall model when needed
* adding html parsing for metadata
* adding logical steps for basecall, align and modbed
* missing comma
* separating BAM merge
* separating BAM merge
* code cleaning
* adding changes after review
* removed unused functions
* clean up and adding back transfer step
* adding optional align step
* swapping BAM to Fastq for dorado align
* debug for bam export
* updating python and snakemake
* exporting bam in prepare\_bam
* exporting bam in prepare\_bam
* exporting bam in prepare\_bam
* rules for merging
* conditional basecall and modbed
* conditional align
* conditional align
* simple software version capture, adding dorado
* align skipping
* align skipping
* conditional flag
* final bam name
* changing modbam2bed to modkit
* bam expansion rule
* bam expansion rule
* adding clean-up and fixing baseout
* activating BAM merging step
* turn-off final data transfer for debugging
* turn-off final data transfer for debugging
* code clean up, removing parkour creds
* code clean up, removing parkour creds
* adding step to merge BAM files
* code clean up
* code clean-up
* code cleaning, adding model and software capture
* code cleaning and formatting
* 10G check on dorado rule as well
* GPU check > 10GB, not 20, to accommodate V100s
v2.0.4
------
* add cautionary sleep to communication.py
* fix ship\_qcreports
v2.0.3
------
* 07\_modbed: silence modbam2bed and send stderr to log file
* 06\_align: make bai explicit target to avoid warnings with older index files
* fastqc: make memory requirements a config parameter
v2.0.2
------
* legacy fix: allow flexible replacement in samplesheet of NaN or NoIndex with no\_bc
v2.0.1
------
* improved readability of gpu\_available()
* add version to metadata.yml and fix wrong reporting of median length in QC report
v2.0.0
------
* address PR comments
* add empty postfix '' to parkour query - to account for legacy naming of flowcells
* wipe config['info\_dict'] after process - avoid erroneous information transfer (e.g. bc\_kits) between flowcells
* add helper scripts for downsampling flowcell folders and pod5 (for testing)
* update communication and basic inference of dorado model
* update old rules/ to work with new config (for now)
* define new set of rules and output structure
v1.2.0
------
* disable final multiqc for now
* add GPU test before basecalling
* WIP: add final multiqc rule
* add option to call modifications to guppy
* add conda env to config['paths']
* make parkour\_query() and send\_email() more robust
* correct reporting of transfer
* fix df -BG
* fix filter\_flowcell bug
* update QC metrics and storage report
* Manke (#105)
* 3\_qc.smk: swallow kraken output (#104)
* Manke (#103)
* uncomment parkour-query hack
* refactor find\_new\_flowcell and permit arbitrary folder as cli. wgetDir and baseDir unused
* porechop: add flexible choice for -abi flag
* updated porechop rule and included multiqc
* add porechop for adapter detection
v1.1.1
------
* fix bug in parkour\_query and recursive merging of pod5
v1.1.0
------
* updated installation instructions
* Separated data preparation from basecalling. Defined new rule: 0\_prepare.smk
* fixed calculation of footprint and ratio
* removed project\_id from message. config[data] not yet defined
* moved pycoqc to pip - for version 2.5.2
* added try/except to "getfast5foot()" function
* fixed pod5 merging and footprint
* improved 'prepare\_pod5' rule and changed the getfast5foot() function to output NA if fast5 dir doesn't exist
* replaced rule 'fast5\_to\_pod5' with rule 'prepare\_pod5' to accommodate the input change (from fast5 to multiple pod5)
* modified rule fast5\_to\_pod5
* test\_merge\_pod5
* add project ID to e-mail body
* added memory allocation of 4 GB to fastqc
* omit pod5 copy until further notice
* define dirs bugfix
* env typo
* work with 2 output directories (seq\_data volumes) + implement ignores
* ignore update template
* include common primers/adapters/bcs
* change seqfac subfolders
* cutadapt dep
* --dir instead of -d as newer fqc can't handle it
* genmodels finalize
* scps into the samba world
* attempt to fix prom - sup models
* imports & syntax
* include samba shipping over ssh
* extra escapes + minor changes to samplesheet missing checks
* re-use index
* parkour queries incorp flowcell re-usage
* yaml wo for user
* more escapes
* init meta config for end user
* more hardcoded escapes until protocol is finalised. trigger for ext data + offload = html
* escape to dna-mouse for no parkour entry
* escape to dna-mouse for no parkour entry
* fix env yaml
* dev rebase
* fix env conflict
* include offload & wget dirs but keep them optional
* syntax, missing colon
* strip vers in env, force python, juggle channels
* attempt to cap versions + purge fastqc wrapper
* pod conversion arg change
* fix env yaml
* fix env yaml
* include offload & wget dirs but keep them optional
* dev rebase
* fix env conflict
* include offload & wget dirs but keep them optional
* syntax, missing colon
* Wd (#56)
* strip vers in env, force python, juggle channels
* Wd (#55)
* attempt to cap versions + purge fastqc wrapper
* Wd (#51)
* pod conversion arg change
* some workarounds for older flow cells. Rename rules for a pretty dag. (#50)
* some workarounds for older flow cells. Rename rules for a pretty dag
* simplify triggermail + smkopts
* proper naming BC samples
* initiate cmd log with basecall cmds to include for end user
* exist checks 4 fs issues ?
* merge develop
* split pass fail
* hardcode flowcell escape (parkour screwup), no-bc field bugfix, comm enhancements
* strip authors
* trigger email
* grab barcoding / barcoding kit information from guppy line
* strip authors
* Wd (#35) (#37)
* successful debug - runs on test case
* some refactoring and add fastqc
* Wd (#35)
* A to Z for barcoded samples as well
* A to Z for non-barcoded, DNA mouse sample
* - executable / configs - genmodel - pod5 implementation - smk API
* add longreads header to subj
* typo print
* add env in readme
* purge tab indent
* rich in env
* force check folders first, only afterwards sampleSheets
* add requests/snakemake to env
* pysam dep fix env
* mamba env file w/ reqs wo/ guppy
* added bc\_kit for flowcells with barcode
* Update README.md
* Update README.md
* added the env yaml file
* updated log files content
* added misc
* added more emails, small bug fix, added new flags for the end of each step
* updated the config template, started adding sendEmail
* bug fix
v1.0.0
------
* updated readme
* add requirements and config template
* rm config
* barcode changes
* added sleep and parkour query
* removed old files
* Update README.md
* fixed mapping and qc for external users
* added rules to do the basecalling and renaming of the files
* fixed the dependency
* all the rules were added, there is something wrong with the dag though. It always builds all the rules from scratch
* updated qc for bam files and paths in the config file
* there is a bug in running the dag: it always starts from the beginning, needs to be fixed
* all fast5 are now used
* mapping
* config was updated
* changing the pipeline to snakemake has been done up to mapping
* config
* new files are added. Basecalling, renaming, qc and data transfer are done. mapping is left
* basecalling, renaming, qc and data transfer are done. mapping is left
* fixed mapping for dna protocol
* added config to gitignore
* cosmetic changes
* fixed config
* added gitignore
* added bam qc
* cosmetic change
* fixed a small bug in reading compatible kits names
* added pycoQC for the qc
* updated mapping and config file
* added contamination report on bam level
* more fixes in rna mapping output and started adding the contamination report
* Fixed rna mapping
* update the pipeline until after qc for samples with barcodes
* updated the code until after fastqc step for reads with barcode
* uncommented a sys.exit
* Update README.md
* renamed pipeline.py to nanopore\_pipeline.py
* updated the pipeline on a case with barcodes
* minor typo
* added module load anaconda3 to the usage in readme
* uncommented an else to throw an error if the path already exists
* updated paths
* and more readme updates!
* update readme
* more readme updates
* updated readme
* updated minimap2 to 2.17
* changed the mapping options for dna and rna
* Initial commit
* first commit for the nanopore\_data\_transfer pipeline