fix errors in miRDeep2 analysis when reads map to unplaced contigs in $refgenome #100
This pull request fixes a corner case in STEP 7.2 miRDeep2:
Some miRNA reads can map to unplaced contigs (e.g. ">chr11_KI270721v1_random") in the reference genome
($refgenome). In that situation, removing "_" from the sequence IDs of $refgenome makes them differ from
the chromosome IDs listed in the *.arf file ($reads_vs_refdb). Example:
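A minimal Python sketch of the mismatch, assuming the underscore removal behaves like a plain string substitution on the FASTA sequence IDs. The helper name `strip_underscores` and the sample .arf line are hypothetical and only illustrate the idea; the contig name is the one quoted above.

```python
def strip_underscores(seq_id: str) -> str:
    """Mimic the "_" removal applied to the reference sequence IDs (assumed behaviour)."""
    return seq_id.replace("_", "")


# Contig name taken from the description above.
fasta_id = "chr11_KI270721v1_random"
edited_fasta_id = strip_underscores(fasta_id)

# Hypothetical tab-separated .arf line; field 6 holds the chromosome ID.
arf_line = "\t".join([
    "seq_0_x10", "22", "1", "22", "tgaggtagtaggttgtatagtt",
    "chr11_KI270721v1_random", "22", "1001", "1022",
    "tgaggtagtaggttgtatagtt", "+", "0", "mmmmmmmmmmmmmmmmmmmmmm",
])
arf_chrom = arf_line.split("\t")[5]

# The .arf file still carries the original ID, so the two no longer agree.
ids_match = arf_chrom == edited_fasta_id

print(edited_fasta_id)  # chr11KI270721v1random
print(arf_chrom)        # chr11_KI270721v1_random
print(ids_match)        # False
```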
As a result, miRDeep2.pl cannot find the mapped reads ($reads_vs_refdb) in the edited reference genome, which causes the following error:
""
Command error:
#Starting miRDeep2
[...]
""
PROPOSAL:
Avoid removing "_" with awk in STEP 7.2 miRDeep2. In my case, this modification resolves the error above.
Alternatively, if "_" is removed from $refgenome, apply the same modification to field $6 (the chromosome ID column) of the *.arf file so that the chromosome IDs still correspond.
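The second option could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the .arf file is tab-separated with the chromosome ID in field 6, and the function name `strip_underscores_in_arf` and the sample line are hypothetical, not part of the pipeline.

```python
def strip_underscores_in_arf(line: str, chrom_field: int = 6) -> str:
    """Remove "_" from the chromosome ID field of one tab-separated .arf line.

    Only the chosen field is edited; read IDs and other fields are left alone.
    """
    fields = line.rstrip("\n").split("\t")
    idx = chrom_field - 1  # convert the 1-based field number to a list index
    if idx >= len(fields):
        # Line is shorter than expected: return it unchanged.
        return "\t".join(fields)
    fields[idx] = fields[idx].replace("_", "")
    return "\t".join(fields)


# Hypothetical .arf line using the contig name from the description above.
sample = "\t".join([
    "seq_0_x10", "22", "1", "22", "tgaggtagtaggttgtatagtt",
    "chr11_KI270721v1_random", "22", "1001", "1022",
    "tgaggtagtaggttgtatagtt", "+", "0", "mmmmmmmmmmmmmmmmmmmmmm",
])
edited = strip_underscores_in_arf(sample).split("\t")

print(edited[0])  # seq_0_x10 (read ID is untouched)
print(edited[5])  # chr11KI270721v1random (matches the edited reference ID)
```

Restricting the edit to field 6 matters because read IDs in field 1 may also contain "_" and should stay as they are.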
PR checklist
Lint the code (nf-core lint .).
Run the test suite (nextflow run . -profile test,conda).