fix errors in miRDeep2 analysis when reads map to unplaced contigs in $refgenome #100

Merged (2 commits) on Aug 5, 2021

Daniel-VM
Contributor

This pull request attempts to fix a potential corner case in STEP 7.2 miRDeep2:

Some miRNA reads can map to unplaced contigs (e.g. ">chr11_KI270721v1_random") in the reference genome
($refgenome). In that situation, removing "_" from the sequence IDs of $refgenome leads to a mismatch with
the chromosome IDs listed in the *.arf file ($reads_vs_refdb). Example:

    genome_nowhitespace.fa: >chr11KI270721v1random ($refgenome after removing underscore)
    $reads_vs_refdb: chr11_KI270721v1_random

As a result, mirdeep2.pl cannot find the reference ID of the mapped read ($reads_vs_refdb) in the edited reference genome, which causes the following error:

    Command error:
    #Starting miRDeep2
    [...]

    The mapped reference id chr11_KI270721v1_random from file *_reads_vs_refdb.arf is
        not an id of the genome file genome_nowhitespace.fa
         [...]

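To confirm this kind of mismatch before running miRDeep2, one option is to list the reference IDs in the *.arf file that have no matching header in the genome FASTA. The snippet below is only a sketch: the file names and contents are made up for illustration, and it assumes the *.arf file is tab-separated with the reference ID in column 6, as described in this PR.

```shell
set -eu
workdir=$(mktemp -d)

# Hypothetical inputs: a genome whose headers already lost their underscores,
# and an ARF file that still carries the original contig name in column 6.
printf '>chr1\nACGTACGT\n>chr11KI270721v1random\nACGTACGT\n' > "$workdir/genome_nowhitespace.fa"
printf 'seq_1_x5\t8\t1\t8\tacgtacgt\tchr1\t8\t1\t8\tacgtacgt\t+\t0\tmmmmmmmm\n' > "$workdir/reads_vs_refdb.arf"
printf 'seq_2_x3\t8\t1\t8\tacgtacgt\tchr11_KI270721v1_random\t8\t1\t8\tacgtacgt\t+\t0\tmmmmmmmm\n' >> "$workdir/reads_vs_refdb.arf"

# First file: collect FASTA IDs (header text up to the first whitespace).
# Second file: print every column-6 ID that is not among the FASTA IDs.
awk 'BEGIN {FS = "\t"}
     NR == FNR { if (/^>/) { id = substr($0, 2); sub(/[ \t].*/, "", id); ids[id] = 1 } next }
     !($6 in ids) { print $6 }' \
    "$workdir/genome_nowhitespace.fa" "$workdir/reads_vs_refdb.arf" \
    | sort -u > "$workdir/missing_ids.txt"

cat "$workdir/missing_ids.txt"
```

An empty `missing_ids.txt` would mean every mapped reference ID is present in the genome file; any line in it names an ID that miRDeep2 would reject.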
PROPOSAL:

  1. Avoid removing "_" with awk in STEP 7.2 miRDeep2.
    In my case, this change resolves the error above.

  2. If "_" is removed from $refgenome, apply the same change to the $6 field (chromosome ID column) of the *.arf file so that the chromosome IDs still correspond.
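The two options can be sketched roughly as follows. This is not the pipeline's actual command: the file names, the sample records, and the exact awk programs are assumptions made for illustration. The only details taken from this PR are that "_" is removed with awk and that the chromosome ID sits in column 6 of the *.arf file.

```shell
set -eu
workdir=$(mktemp -d)

# Hypothetical inputs: one unplaced contig and one read mapped to it.
printf '>chr11_KI270721v1_random some description\nACGTACGT\n' > "$workdir/genome.fa"
printf 'seq_1_x5\t8\t1\t8\tacgtacgt\tchr11_KI270721v1_random\t8\t1\t8\tacgtacgt\t+\t0\tmmmmmmmm\n' > "$workdir/reads_vs_refdb.arf"

# Option 1: drop only the description after the first whitespace and keep "_",
# so the FASTA IDs stay identical to the IDs in the ARF file.
awk '/^>/ {print $1; next} {print}' "$workdir/genome.fa" > "$workdir/genome_nowhitespace.fa"

# Option 2: remove "_" from the FASTA headers AND from ARF column 6,
# so both files are edited in the same way.
awk '/^>/ {gsub(/_/, "", $1); print $1; next} {print}' "$workdir/genome.fa" > "$workdir/genome_nounderscore.fa"
awk 'BEGIN {FS = OFS = "\t"} {gsub(/_/, "", $6); print}' "$workdir/reads_vs_refdb.arf" > "$workdir/reads_vs_refdb_nounderscore.arf"

head -n 1 "$workdir/genome_nowhitespace.fa" "$workdir/genome_nounderscore.fa"
cut -f 1,6 "$workdir/reads_vs_refdb_nounderscore.arf"
```

In option 2 only column 6 is touched, so underscores in the read IDs in column 1 are left alone.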

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • Make sure your code lints (nf-core lint .).
  • Ensure the test suite passes (nextflow run . -profile test,conda).

@Daniel-VM Daniel-VM changed the title fix errors in miRDeep2 analysis when reads maps to unplaced contigs in $refgenome fix errors in miRDeep2 analysis when reads map to unplaced contigs in $refgenome Jul 27, 2021
@ewels ewels merged commit 4b7fbcf into nf-core:dev Aug 5, 2021