Add UMI reads processing capability #145
Looks amazing.
We just need the test data to do some CI.
Can you update the CHANGELOG as well?
I made a couple of suggestions; if you accept them, you can batch commit them.
Co-Authored-By: Maxime Garcia <maxime.garcia@scilifelab.se>
agree with suggestions, reviewed and made more explicit
Hi, any updates on adding UMI support to variant calling? Is it working? Otherwise I will build a new pipeline.
Hi @chelauk, I'm not sure what's holding up the pull request at this stage. I tested everything at the March hackathon using the test data here.
Hi @chelauk @nibscles |
@chelauk @nibscles |
nf-core/sarek pull request
Many thanks for contributing to nf-core/sarek!
Please fill in the appropriate checklist below (delete whatever is not relevant).
These are the most common things requested on pull requests (PRs).
PR checklist
This pull request adds code to process reads containing UMIs. Unique Molecular Indices (UMIs) are particularly important for somatic workflows that aim to detect variants at very low allele fractions (e.g. MRD, liquid biopsy). The workflow uses the fgbio tools, which provide a robust method for identifying UMI groups and create a consensus read within each group. See blog and ref.
The approach keeps the rest of the workflow compatible: the result of the UMI process is an unmapped BAM (uBAM), which can then be fed into MappingReads and, downstream, into HaplotypeCaller and, more importantly, Mutect2 or Strelka.
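To illustrate the idea of building a consensus read within each UMI group, here is a toy Python sketch. It is not fgbio's algorithm: it assumes exact UMI matching and a simple per-base majority vote, whereas fgbio also handles UMI sequencing errors, mapping positions, and base qualities. The function names and the example reads are hypothetical.

```python
from collections import Counter, defaultdict


def group_by_umi(reads):
    """Group (umi, sequence) pairs by exact UMI match (toy assumption)."""
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)
    return dict(groups)


def consensus(seqs, no_call="N"):
    """Per-base majority vote over the shared prefix; ties become no_call."""
    length = min(len(s) for s in seqs)
    bases = []
    for i in range(length):
        top = Counter(s[i] for s in seqs).most_common(2)
        if len(top) > 1 and top[0][1] == top[1][1]:
            bases.append(no_call)  # no clear majority at this position
        else:
            bases.append(top[0][0])
    return "".join(bases)


# Hypothetical reads: three copies of one molecule (one with an error)
# and a single read from a second molecule.
reads = [
    ("AACG", "ACGTAC"),
    ("AACG", "ACGTAC"),
    ("AACG", "ACTTAC"),
    ("GGTA", "ACGAAC"),
]

consensus_reads = {
    umi: consensus(seqs) for umi, seqs in group_by_umi(reads).items()
}
print(consensus_reads)
```

In this sketch the isolated error in the third `AACG` read is outvoted by the other two copies, which is the property that makes UMI consensus useful for low allele-fraction variant calling.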
Tests are a work in progress: datasets have been identified for 2 different UMI types (QIAseq and Illumina TSO), but I cannot run them to completion on a laptop.
As indicated above, the reads will be uploaded to the sarek branch of the nf-core/test-datasets repo.
- The code has passed lints (`nf-core lint .`).
- Documentation in `docs` has been updated.
- `CHANGELOG.md` has not been updated yet.
- `README.md` has not been updated yet (not sure if this is relevant).