Releases: gustaveroussy/bigr_long-reads_bulk
2.1.1
Bug fixes:
- major bug in dorado 0.8.0: "bug causing dna_r10.4.1_e8.2_400bps_sup@v5.0.0_5mC_5hmC@v2 to call CpG contexts only and dna_r10.4.1_e8.2_400bps_sup@v5.0.0_5mCG_5hmCG@v2 to call all contexts" (see https://github.com/nanoporetech/dorado/blob/release-v0.8/CHANGELOG.md#081-3-oct-2024).
- bug fix in SNV filtering: the filter expression was malformed.
- remove ambiguity in wildcards between condition and paired samples for DMR
2.1.0
Input:
- Addition of UBAM (unaligned BAM) as alternative pipeline entrypoint. The available pipeline input files are now POD5, BAM and UBAM.
Steps modification:
- FASTQ quality control is now performed on the unaligned BAM file right after basecalling and before filtering (instead of on the aligned BAM).
- All VCF files produced during SNV annotation with SnpEff/SnpSift and CNV calling with Spectre are now bgzipped and indexed. This speeds up the following steps when compressed VCF files are accepted as input.
- By default, VCF files from the SNV variant caller are first filtered to keep only the variants annotated as "PASS" before proceeding to the annotation step with SnpEff. A "keep_only_pass" value was added to the config file to enable or disable this step when needed.
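The "PASS" filtering described above can be illustrated with a minimal Python sketch. This is not the pipeline's actual implementation (which may rely on a dedicated VCF tool); the function name `filter_pass_records` is hypothetical, and only the `keep_only_pass` switch comes from the release notes.

```python
def filter_pass_records(vcf_lines, keep_only_pass=True):
    """Yield VCF lines, dropping records whose FILTER column is not PASS.

    Hypothetical helper for illustration only; header lines are always kept.
    """
    for line in vcf_lines:
        # Header lines start with '#'; keep everything when filtering is off.
        if line.startswith("#") or not keep_only_pass:
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # In a VCF record, the 7th column (index 6) is FILTER.
        if len(fields) > 6 and fields[6] == "PASS":
            yield line


example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\n",
    "chr1\t200\t.\tC\tT\t5\tLowQual\t.\n",
]
kept = list(filter_pass_records(example))
```

With `keep_only_pass=False`, the sketch returns every line unchanged, which mirrors disabling the step in the config file.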
Miscellaneous:
- Addition of various verifications before running the pipeline (additional checks of the design and config files, etc.)
- Reads filtering parameters were added to the config file as the "reads_filtering" value (default values: min read length = 1000 bp, min base quality score = 10).
- The presence of methylation content is now checked in UBAM and BAM files given as pipeline input if the user chooses to perform any methylation analysis in the config file.
- SnpSift annotation rules were modified to make it easier to use various databases.
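As an illustration of the reads filtering defaults listed above, here is a minimal Python sketch. It assumes the quality threshold is compared against the arithmetic mean of per-base Phred scores, which may differ from how the pipeline's filtering tool computes read quality; the function name `passes_read_filter` and its parameters are hypothetical.

```python
def passes_read_filter(sequence, qualities, min_length=1000, min_quality=10):
    """Return True if a read meets the length and quality thresholds.

    Assumption: quality is the arithmetic mean of per-base Phred scores.
    """
    if len(sequence) < min_length:
        return False
    if not qualities:
        return False
    mean_quality = sum(qualities) / len(qualities)
    return mean_quality >= min_quality


# A 1000 bp read with uniform Phred 10 sits exactly on both default thresholds.
borderline_ok = passes_read_filter("A" * 1000, [10] * 1000)
too_short = passes_read_filter("A" * 999, [30] * 999)
low_quality = passes_read_filter("A" * 1500, [9] * 1500)
```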
Bug fixes:
- MultiQC did not work correctly when 'basecalling_mode = basic' since it was searching for methylation output by default: a conditional input for MultiQC was added, depending on whether methylation analysis is performed or not.
- Clair3 was not able to retrieve the input BAM index file since it was marked as temp() in the previous rule (reconcat_split_bam) and it was not given as an input to clair3 rule. It is now corrected.
- CuteSV sniffles2plot outputs were the same as the ones generated with Sniffles2 plot: the wrong input path was corrected.
- Changed resources for modkit_pileup_uncomb rule as it returned some out of memory issues (even though it was labelled as a "Time Limit" error in slurm logs).
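The conditional MultiQC input mentioned in the fixes above could look roughly like the following Python sketch, written in the style of a Snakemake input function. The directory names are hypothetical, and the sketch assumes that any `basecalling_mode` other than `basic` implies methylation analysis; the real pipeline may decide this differently.

```python
def multiqc_inputs(config):
    """Build the list of MultiQC inputs from the pipeline config.

    Hypothetical paths; methylation QC is only requested when the
    basecalling mode is not 'basic' (an assumption for this sketch).
    """
    inputs = ["qc/fastq", "qc/bam"]
    if config.get("basecalling_mode") != "basic":
        inputs.append("qc/methylation")
    return inputs


basic_inputs = multiqc_inputs({"basecalling_mode": "basic"})
methylation_inputs = multiqc_inputs({"basecalling_mode": "methylation"})
```

Building the input list from the config keeps MultiQC from waiting on methylation files that are never produced in basic mode.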
2.0.1
Analysis steps added:
- Preprocessing: Basecalling and Methylation calling
- Alignment
- Quality control (fastq, bam, methylation)
- Differentially Methylated Regions (DMR) analysis between all samples and/or between two conditions
- Long Copy Number Variations (CNV) identification
- Phasing
Input:
- can be pod5 or aligned bam
- the format of the design file has changed
- a file listing genes of interest can be provided for maftools plots.
Reproducibility:
- Singularity images are pushed on zenodo instead of github.
Miscellaneous:
- Supplementary germline SV (cuteSV) and SNV (PEPPER-Margin-DeepVariant) variant callers were added for future consensus analysis.