[Feat] Add UMI Handling to the pipeline #164

CKComputomics · 2022-06-21T09:33:29Z

Adds the option to use UMIs directly in the pipeline. This can be activated by setting --with_umi.
For the extraction step, the nf-core subworkflow has been imported. The deduplication step had to be implemented in a new subworkflow. It uses the existing bowtie modules to map the reads to a reference genome and deduplicates based on this mapping. The deduplicated reads are merged with the unmapped reads into one file. This behavior can be deactivated by setting --umi_merge_unmapped false.
Using UMIs can result in FASTQ files with very few reads. Too few reads can cause mirtop to fail. This needs to be considered when using this feature.
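As a rough illustration of the deduplication idea described above, here is a minimal Python sketch. It assumes a hypothetical in-memory `Read` record and uses exact matching on (contig, position, UMI); the actual subworkflow runs bowtie, samtools and UMI-tools on files, and UMI-tools' own grouping is more tolerant of sequencing errors than this. The `merge_unmapped` argument only mimics the intent of --umi_merge_unmapped.

```python
from collections import namedtuple

# Hypothetical, simplified read record for illustration only.
# contig and pos are None when the read did not map to the reference.
Read = namedtuple("Read", ["name", "umi", "contig", "pos"])


def dedup_reads(reads, merge_unmapped=True):
    """Keep the first read per (contig, position, UMI).

    Unmapped reads cannot be deduplicated by position, so they are either
    appended unchanged (merge_unmapped=True) or dropped.
    """
    seen = set()
    kept = []
    unmapped = []
    for read in reads:
        if read.contig is None:
            unmapped.append(read)
            continue
        key = (read.contig, read.pos, read.umi)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return (kept + unmapped) if merge_unmapped else kept
```

With two reads sharing the same position and UMI, only the first is kept; a read with a different UMI at the same position survives, which is the point of using UMIs rather than position alone.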

PR checklist

Reverting changes to a non-linted version and added the umitools modules.

Added the umitools workflow and integrated it into the smrnaseq workflow

Add additional documentation to use UMI tools as part of the pipeline. Most of the documentation has been copied from nf-core/rnaseq.

The bam2fq module is necessary to convert the deduplicated BAM files back into FASTQ format to be fed into the existing pipeline.

Added the umitools extract modules.config lines from nf-core/rnaseq to this pipeline.

Added configurations for umi deduplication.

Initial commit of the UMI dedup subworkflow. The workflow combines existing modules of the pipeline and nf-core modules to deduplicate the reads by mapping them to the species genome and converting them back to FASTQ after deduplication.

Includes the optional umitools deduplication step after the read QC.

Added additional configuration to change the output file name of samtools sort.

Added the documentation detailing the output files of the UMI-tools deduplication step.

After deduplication the reads that remained unaligned to the provided reference genome are merged with the set of deduplicated reads to enable the use of the full spectrum of reads, independent of potential reference bias. This behaviour can be deactivated by setting --umi_merge_unmapped false

Information on the new --umi_merge_unmapped command was added to both the CHANGELOG and the output markdown document.

CKComputomics · 2022-06-21T09:35:38Z

@apeltzer there is still a problem with nf-core lint and prettier. The email_template.html file passes either the linting or prettier, but the changes made for one cause the other to fail. Could you have a look at what's going on there?

apeltzer · 2022-06-21T09:37:04Z

Can you resolve the conflicts with dev before that?
@CKComputomics

github-actions · 2022-06-21T10:02:03Z

`nf-core lint` overall result: Failed ❌

Posted for pipeline commit fcc3ef0

+| ✅ 157 tests passed       |+
!| ❗   1 tests had warnings |!
-| ❌   6 tests failed       |-

❌ Test failures:

files_unchanged - .github/CONTRIBUTING.md does not match the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md does not match the template
files_unchanged - .github/workflows/linting.yml does not match the template
files_unchanged - lib/NfcoreTemplate.groovy does not match the template
multiqc_config - 'assets/multiqc_config.yml' does not contain a matching 'report_comment'.
The expected comment is:
This report has been generated by the <a href="https://github.com/nf-core/smrnaseq/tree/dev" target="_blank">nf-core/smrnaseq</a> analysis pipeline. For information about how to interpret these results, please see the <a href="https://nf-co.re/smrnaseq/dev/docs/output" target="_blank">documentation</a>.
The current comment is:
This report has been generated by the <a href="https://github.com/nf-core/smrnaseq/releases/tag/dev" target="_blank">nf-core/smrnaseq</a> analysis pipeline. For information about how to interpret these results, please see the <a href="https://nf-co.re/smrnaseq/dev/docs/output" target="_blank">documentation</a>.
modules_structure - modules directory structure is outdated. Should be 'modules/nf-core/TOOL/SUBTOOL'

❗ Test warnings:

pipeline_todos - TODO string in WorkflowSmrnaseq.groovy: Optionally add in-text citation tools to this list.

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-smrnaseq_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/images/nf-core-smrnaseq_logo_light.png
files_exist - File found: docs/images/nf-core-smrnaseq_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: lib/nfcore_external_java_deps.jar
files_exist - File found: lib/NfcoreTemplate.groovy
files_exist - File found: lib/Utils.groovy
files_exist - File found: lib/WorkflowMain.groovy
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: conf/igenomes.config
files_exist - File found: .github/workflows/awstest.yml
files_exist - File found: .github/workflows/awsfulltest.yml
files_exist - File found: lib/WorkflowSmrnaseq.groovy
files_exist - File found: modules.json
files_exist - File found: pyproject.toml
files_exist - File not found check: Singularity
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: pipeline_template.yml
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: docs/images/nf-core-smrnaseq_logo.png
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: .travis.yml
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: params.validationShowHiddenParams
nextflow_config - Config variable found: params.validationSchemaIgnoreParams
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config variable (correctly) not found: params.enable_conda
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config manifest.version ends in dev: 2.3dev
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
nextflow_config - nextflow.config contains configuration profile test
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - CODE_OF_CONDUCT.md matches the template
files_unchanged - LICENSE matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - assets/email_template.html matches the template
files_unchanged - assets/email_template.txt matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - assets/nf-core-smrnaseq_logo_light.png matches the template
files_unchanged - docs/images/nf-core-smrnaseq_logo_light.png matches the template
files_unchanged - docs/images/nf-core-smrnaseq_logo_dark.png matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - lib/nfcore_external_java_deps.jar matches the template
files_unchanged - .gitignore matches the template
files_unchanged - .prettierignore matches the template
files_unchanged - pyproject.toml matches the template
actions_ci - '.github/workflows/ci.yml' is triggered on expected events
actions_ci - '.github/workflows/ci.yml' checks minimum NF version
actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
readme - README Nextflow minimum version badge matched config. Badge: 23.04.0, Config: 23.04.0
readme - README Zenodo placeholder was replaced with DOI.
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (169 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_lint - Input mimetype lint passed: 'text/csv'
schema_params - Schema matched params returned from nextflow config
system_exit - No System.exit calls found
actions_schema_validation - Workflow validation passed: clean-up.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: ci.yml
actions_schema_validation - Workflow validation passed: awsfulltest.yml
actions_schema_validation - Workflow validation passed: release-announcments.yml
actions_schema_validation - Workflow validation passed: awstest.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
multiqc_config - 'assets/multiqc_config.yml' contains report_section_order
multiqc_config - 'assets/multiqc_config.yml' contains export_plots
multiqc_config - 'assets/multiqc_config.yml' contains report_comment
multiqc_config - 'assets/multiqc_config.yml' follows the ordering scheme of the minimally required plugins.
multiqc_config - 'assets/multiqc_config.yml' contains 'export_plots: true'.

Run details

nf-core/tools version 2.11.1
Run at 2024-01-11 12:54:06

CKComputomics · 2022-06-21T11:23:16Z

> Can you resolve the conflicts with dev before that?
> @CKComputomics

Done! Fixed the prettier vs. linting issue as well.

lpantano · 2022-07-19T11:40:34Z

Hi,

Has somebody run this on a real dataset, at least without UMIs, to make sure you get the same results? I don't quite follow whether the trimming will be exactly the same. Where is the params.protocol variable synced with the parameters for nf-core/trimgalore now? Before, it was in local/trimgalore. I can try to run this on some of our samples next week to check that we get the same results if we don't use the UMI option.

sean-at-tessera · 2023-03-31T15:13:15Z

@apeltzer I'm currently working with miRNA-seq data using UMIs and would love for this feature to get merged into dev -- is there anything I can use to vet this functionality on the datasets I'm using?

chaochungkuo · 2023-04-02T07:07:12Z

Hi, I think this feature is really needed and useful. I just want to reactivate this thread again. I will run a test in the coming week.

apeltzer · 2023-04-03T08:49:16Z

You can run using the -r CKComputomics:umitools branch, @sean-at-tessera . @chaochungkuo if you want to test things, please let me know if things work fine. We also would have to resolve the merge conflicts and finally get this merged into the pipeline.

chaochungkuo · 2023-04-03T14:26:10Z

Hi @apeltzer Thanks. But I cannot run it as you suggested.

nextflow run nf-core/smrnaseq -r CKComputomics:umitools -profile docker --with_umi \
     --umitools_extract_method regex --umitools_bc_pattern '.+AACTGTAGGCACCATCAAT{s<=2}(?P<umi_1>.{12})(?P<discard_2>.*)' \
     --input samplesheet.csv --outdir results --mirtrace_species hsa --mirtrace_protocol qiaseq \
     --three_prime_adapter AACTGTAGGCACCATCAAT --protocol qiaseq \
     --genome GRCh38 \
     --mirna_gtf /data/genomes/hg38/miRNA/hsa.gff3 \
     --mature /data/genomes/spikein/QIASeq_miRNAseq_SpikeIn/mature_with_qiaseq_spikein.fa \
     --hairpin /data/genomes/spikein/QIASeq_miRNAseq_SpikeIn/hairpin_with_qiaseq_spikein.fa

And the error message I got:

N E X T F L O W  ~  version 23.04.0
Pulling nf-core/smrnaseq ...
WARN: Cannot read project manifest -- Cause: Remote resource not found: https://api.github.com/repos/nf-core/smrnaseq/contents/nextflow.config?ref=CKComputomics:umitools
Remote resource not found: https://api.github.com/repos/nf-core/smrnaseq/contents/main.nf?ref=CKComputomics:umitools

Any idea? We used QIAseq™ miRNA Library QC Spike-Ins.

sean-at-tessera · 2023-04-03T15:16:26Z

Should it be the below?

nextflow run CKComputomics/smrnaseq -r umitools
...

chaochungkuo · 2023-04-03T16:06:32Z

Thanks, @sean-at-tessera it works now. I will report after it is done.

apeltzer · 2023-04-04T06:44:32Z

Yes sorry, mistakenly thought the branch is already here in smrnaseq.

sean-at-tessera · 2023-04-04T14:43:49Z

@chaochungkuo I also tried to look at resolving the merge conflicts, but I think it would take your insight to do quickly. It looks like enough has changed since you implemented umitools that some things need to be renamed.

sean-at-tessera · 2023-04-04T17:14:24Z

@chaochungkuo can you document what --bc-pattern needs to be? Should that be pulled from the existing parameter list? I'm working with single-end data, which you may not have tested yet.

chaochungkuo · 2023-04-05T07:39:18Z

Hi @sean-at-tessera

I am not the one who implements this. It was done by @CKComputomics.

I also have single-end reads with QIAseq™ miRNA Library QC Spike-Ins. These parameters are specific for this kit:

--umitools_extract_method regex --umitools_bc_pattern '.+AACTGTAGGCACCATCAAT{s<=2}(?P<umi_1>.{12})(?P<discard_2>.*)' \

I got some results, but it seems the run didn't go through to the end. The umitools processes take too much memory and time.

The error messages are:

[aa/a4040b] NOTE: Process `NFCORE_SMRNASEQ:SMRNASEQ:DEDUPLICATE_UMIS:UMITOOLS_DEDUP (Patientin_P009)` terminated with an error exit status (137) -- Execution is retried (1)
[35/7645bc] NOTE: Process `NFCORE_SMRNASEQ:SMRNASEQ:DEDUPLICATE_UMIS:UMITOOLS_DEDUP (Kontrolle_K023)` terminated with an error exit status (137) -- Execution is retried (1)
...

I am not sure what the root of this issue is. However, I will increase the limit and run it again. Any advice is appreciated.

CKComputomics · 2023-04-05T12:01:37Z

> @chaochungkuo can you document what --bc-pattern needs to be? Should that be pulled from the existing parameter list? I'm working with single-end data, which you may not have tested yet.

Hi, the --bc-pattern error seems to originate from UMI-tools directly. The corresponding parameter in the Nextflow run would be --umitools_bc_pattern.
I have not used UMI-tools or this pipeline in a while and would refer you to the original UMI-tools documentation. It should be the UMI barcode pattern to use; e.g., 'NNNNNN' indicates that the first 6 nucleotides of the read are from the UMI.
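To make the 'NNNNNN' example concrete, here is a minimal Python sketch of what such a pattern means for a single read. It is not the UMI-tools implementation: the function name `extract_umi` is hypothetical, only patterns made entirely of `N` are handled, and appending the UMI to the read name with `_` is an assumption based on the `umi_sep : _` setting shown in the dedup log later in this thread.

```python
def extract_umi(read_name, sequence, quality, bc_pattern="NNNNNN", umi_sep="_"):
    """Move the leading UMI bases (the 'N' positions) from the read into its name.

    Returns the new read name plus the trimmed sequence and quality strings.
    """
    if set(bc_pattern) != {"N"}:
        raise ValueError("this sketch only handles patterns made solely of 'N'")
    n = len(bc_pattern)
    if len(sequence) < n:
        raise ValueError("read is shorter than the barcode pattern")
    umi = sequence[:n]
    return f"{read_name}{umi_sep}{umi}", sequence[n:], quality[n:]
```

For example, `extract_umi("read1", "ACGTACGGTTCA", "IIIIIIIIIIII")` returns `("read1_ACGTAC", "GGTTCA", "IIIIII")`: the first six bases become the UMI and are removed from the read.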

I hope this helps you to solve the issue. If not I will do some digging and see if I can figure out what is going wrong.

@chaochungkuo using more memory sounds like a reasonable option. I have only ever tested the pipeline with small datasets and thus have no idea how this scales on full sets.

chaochungkuo · 2023-04-05T12:45:12Z

When I run umitools directly outside of nfcore/smrnaseq, I use the following command:

umi_tools extract --stdin=$1 --stdout=${TRIMMED_FASTQ} --extract-method=regex --bc-pattern='.+AACTGTAGGCACCATCAAT{s<=2}(?P<umi_1>.{12})(?P<discard_2>.*)'

It works fine and I get the trimmed FASTQs I want.
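For readers unfamiliar with the regex extraction method, the following Python sketch shows which parts of a read the named groups in the pattern above capture. It is a simplification under stated assumptions: the `{s<=2}` fuzzy-matching term is not supported by the standard-library `re` module and is omitted, so only exact adapter matches are handled, and the helper name `split_read` is hypothetical. How UMI-tools rewrites the read afterwards is not modelled here.

```python
import re

# Adapter and group names are taken from the pattern quoted above. The fuzzy
# "{s<=2}" term is dropped because the standard-library re module lacks it.
ADAPTER = "AACTGTAGGCACCATCAAT"
PATTERN = re.compile(r".+" + ADAPTER + r"(?P<umi_1>.{12})(?P<discard_2>.*)")


def split_read(sequence):
    """Return the pieces captured by the simplified pattern, or None if no match."""
    match = PATTERN.match(sequence)
    if match is None:
        return None
    return {
        # Everything up to the UMI: the insert followed by the adapter.
        "before_umi": sequence[: match.start("umi_1")],
        # The 12 bases directly after the adapter.
        "umi": match.group("umi_1"),
        # Whatever follows the UMI.
        "discard": match.group("discard_2"),
    }
```

A read that does not contain the adapter followed by at least 12 bases yields `None`, which may be one reason some reads are left without an extracted UMI.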

When I pass these parameters into this branch now, I modified them as:

nextflow run nf-core/smrnaseq -r CKComputomics:umitools -profile docker --with_umi \
     --umitools_extract_method regex --umitools_bc_pattern '.+AACTGTAGGCACCATCAAT{s<=2}(?P<umi_1>.{12})(?P<discard_2>.*)' \
     --input samplesheet.csv --outdir results --mirtrace_species hsa --mirtrace_protocol qiaseq \
     --three_prime_adapter AACTGTAGGCACCATCAAT --protocol qiaseq

I thought the name of the parameter was changed from --bc-pattern to --umitools_bc_pattern.

sean-at-tessera · 2023-04-05T16:01:10Z

@CKComputomics I have the same problem as @chaochungkuo; specifying --umitools_bc_pattern still yields ValueError: Must supply --bc-pattern for single-end from umitools in process NFCORE_SMRNASEQ:SMRNASEQ:FASTQC_UMITOOLS_TRIMGALORE:UMITOOLS_EXTRACT.

Looking at the file for the extract command, I'm guessing bc_params needs to appear in $args, but then, shouldn't the parameter just be --bc_params?

chaochungkuo · 2023-04-05T21:13:27Z

@sean-at-tessera
I just checked the log file of umi_tools extract and I see that the --umitools_bc_pattern parameter I gave to Nextflow is properly passed to umi_tools as --bc-pattern:

# UMI-tools version: 1.1.2
# output generated by extract -I Kontrolle_K013_S32_R1_001.fastq.gz -S Kontrolle_K013.umi_extract.fastq.gz --extract-method=regex --bc-pattern=.+AACTGTAGGCACCATCAAT{s<=2}(?P<umi_1>.{12})(?P<discard_2>.*)

However, I still get this error message:

Command exit status:
  137

No idea yet...

chaochungkuo · 2023-04-06T07:52:07Z

Here is the exact error I received. Could someone help me to diagnose? Thanks.

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_SMRNASEQ:SMRNASEQ:DEDUPLICATE_UMIS:UMITOOLS_DEDUP":
      umitools: $(umi_tools --version 2>&1 | sed 's/^.*UMI-tools version://; s/ *$//')
  END_VERSIONS

Command exit status:
  137

Command output:
  # chrom                                   : None
  # compresslevel                           : 6
  # detection_method                        : None
  # filter_umi                              : None
  # gene_tag                                : None
  # gene_transcript_map                     : None
  # get_umi_method                          : read_id
  # ignore_tlen                             : False
  # ignore_umi                              : False
  # in_sam                                  : False
  # log2stderr                              : False
  # loglevel                                : 1
  # mapping_quality                         : 0
  # method                                  : directional
  # no_sort_output                          : False
  # out_sam                                 : False
  # output_unmapped                         : False
  # paired                                  : False
  # per_cell                                : False
  # per_contig                              : False
  # per_gene                                : False
  # random_seed                             : None
  # read_length                             : False
  # short_help                              : None
  # skip_regex                              : ^(__|Unassigned)
  # soft_clip_threshold                     : 4
  # spliced                                 : False
  # stats                                   : Patientin_P030.umi_dedup.sorted
  # stderr                                  : <_io.TextIOWrapper name='<stderr>' mode='w' encoding='utf-8'>
  # stdin                                   : <_io.TextIOWrapper name='Patientin_P030.sorted.bam' mode='r' encoding='UTF-8'>
  # stdlog                                  : <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
  # stdout                                  : <_io.TextIOWrapper name='Patientin_P030.umi_dedup.sorted.bam' mode='w' encoding='UTF-8'>
  # subset                                  : None
  # threshold                               : 1
  # timeit_file                             : None
  # timeit_header                           : None
  # timeit_name                             : all
  # tmpdir                                  : None
  # umi_sep                                 : _
  # umi_tag                                 : RX
  # umi_tag_delim                           : None
  # umi_tag_split                           : None
  # umi_whitelist                           : None
  # umi_whitelist_paired                    : None
  # unmapped_reads                          : discard
  # unpaired_reads                          : use
  # whole_contig                            : False
  2023-04-06 00:35:28,960 INFO command: dedup -I Patientin_P030.sorted.bam -S Patientin_P030.umi_dedup.sorted.bam --output-stats Patientin_P030.umi_dedup.sorted
  2023-04-06 00:35:52,840 INFO total_umis 13631184
  2023-04-06 00:35:52,840 INFO #umis 664445

sean-at-tessera · 2023-04-06T14:45:37Z

@chaochungkuo exit status 137 generally indicates a memory error. Could you allocate more memory and try again?
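The reasoning behind reading 137 as a memory problem can be sketched in a few lines of Python. By shell convention, an exit status above 128 means the process was terminated by signal number `status - 128`; 137 therefore corresponds to signal 9 (SIGKILL), which is what the Linux out-of-memory killer and container memory limits typically send. The helper name `describe_exit_status` is hypothetical, and the signal name lookup assumes a POSIX system.

```python
import signal


def describe_exit_status(status):
    """Explain a shell-style exit status.

    Values above 128 conventionally mean "terminated by signal (status - 128)".
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    if status == 0:
        return "exited normally"
    return f"exited with error code {status}"
```

On Linux, `describe_exit_status(137)` returns `"terminated by SIGKILL"`. SIGKILL alone does not prove an out-of-memory condition, but combined with the large UMI counts in the log above it is the most likely explanation.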

sean-at-tessera · 2023-04-07T13:51:12Z

I was almost able to run the CKComputomics:umitools branch on my data. umitools did require up to ~10 GB of memory at one point. This might be a similar upper bound for your testing, @chaochungkuo. One of the key issues in getting it to run was passing --umitools_extract_method regex, which I'd neglected to do before and which was causing an error.

The pipeline only crashed on one stage in the edgeR step. This is because this PR doesn't include the fix incorporated in this pull request.

@CKComputomics , could you please resolve the merge conflicts with dev? I can run another test then and evaluate the resulting outputs.

Dtdavidgit · 2023-07-11T09:17:32Z

Hi guys, just want to activate this thread again, wondering when the UMI handling feature will be added to the repo?
Thanks

apeltzer · 2024-01-11T12:38:09Z

Ok, will give this a go now that more people have requested it. My hope was that someone would be quicker at this, but that seems not to be the case ;-)

apeltzer · 2024-01-11T12:38:23Z

I will pull in upstream changes, then try to resolve conflicts and merge it

Christian Kubica and others added 23 commits May 10, 2022 14:35

REVERT CHANGES

d783ef9

Reverting changes to a non-linted version and added the umitools modules.

INCLUDE UMITOOLS WORKFLOW

1043932

Added the umitools workflow and integrated it into the smrnaseq workflow

ADD DOCUMENTATION

27fd482

Add additional documentation to use UMI tools as part of the pipeline. Most of the documentation has been copied from nf-core/rnaseq.

ADD SAMTOOLS BAM2FQ MODULE

ee673b0

The bam2fq module is necessary to convert the deduplicated BAM files back into FASTQ format to be fed into the existing pipeline.

ADD UMITOOLS EXTRACT ARGS

0bc65e4

Added the umitools extract modules.config lines from nf-core/rnaseq to this pipeline.

UPDATE MODULES.CONFIG

8d14f90

Added configurations for umi deduplication.

INCLUDE UMITOOLS DEDUP WORKFLOW

23f96d8

Initial commit of the UMI dedup subworkflow. The workflow combines existing modules of the pipeline and nf-core modules to deduplicate the reads by mapping them to the species genome and converting them back to FASTQ after deduplication.

INCLUDE UMITOOLS DEDUP

944d277

Includes the optional umitools deduplication step after the read QC.

ADD SAMTOOLS SORT CONFIG

ddb3dba

Added additional configuration to change the output file name of samtools sort.

FIX TYPO

b2ef66a

ADD DEDUP DOCUMENTATION

29ec7da

Added the documentation detailing the output files of the UMI-tools deduplication step.

ADD DEDUP STEP

afa1ad7

ADD UMITOOLS VERSION

c72ac5b

ADD MISSING OPTION

f9ca542

ADD NEWLINE

b974717

CLEAN CODE

4610be1

ADD DOCUMENTATION

67b2cac

ADD UMI_MERGE_UNMAPPED COMMAND

23fc985

FINALIZE DOCUMENTATION

be241ea

Information on the new --umi_merge_unmapped command was added to both the CHANGELOG and the output markdown document.

UPDATE MAIL TEMPLATE

8b433f1

CHANGE DAG OUTPUT TO HTML

0e732ed

PLEASE PRETTIER

8f426b5

Merge branch 'dev' into umitools

8e132fb

CKComputomics added 2 commits June 21, 2022 13:06

FIX MERGE ERROR

039843f

MAKE PRETTIER HAPPY

53c097c

apeltzer modified the milestones: Release 2.1.0, Release 2.2.0 Oct 11, 2022

apeltzer modified the milestones: Release 2.2.0, Release 2.1.1 Patch Feb 17, 2023

apeltzer removed this from the Release 2.2.0 milestone Sep 1, 2023

apeltzer changed the base branch from dev to umi-handling January 11, 2024 12:39

Merge branch 'umi-handling' into umitools

fcc3ef0

apeltzer merged commit 069beb1 into nf-core:umi-handling Jan 11, 2024
2 of 8 checks passed

apeltzer mentioned this pull request Jan 24, 2024

[RELEASE PR for 2.3.0 release] #305

Merged

10 tasks
