When running dorado-0.8.2-linux-x64 on barcoded bacterial native DNA sequence data from a MinION run, I am getting [debug] messages like the following: Invalid trim interval for read id b1879b87-0723-4d45-95f9-6a9f7064580f: 34-14. Trimming will be skipped.
The basecalling of this run is still ongoing but I guess this affects only a minority of reads (ca. 300 reads thus far).
I would have two main questions:
1. Could you please clarify what exactly the issue is and what it means "downstream"? Does it mean that no trimming happens at all, i.e., that adapters and barcodes (no primers in this case) would remain in the reads? Put differently, how should the resulting basecalled reads be further processed? I am running dorado demux afterwards, followed by hybracter, which includes Porechop_ABI. This will be a hybrid assembly, also involving Illumina data.
2. Is this information also available in case a user has not added the -v option?
This issue occurs when the trim intervals for the adapter and the barcode overlap in such a way that the entire read would be removed. For example, the adapter trimming determines that it should retain region 34-50, but the barcode trimming determines that it should retain 8-14. As there is no sensible way to perform trimming in this instance, we skip it entirely.
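To make the numbers in the debug message concrete, here is a minimal Python sketch of how combining two "keep" intervals can yield an invalid result such as 34-14. It is not dorado's actual implementation; the function names and the half-open interval convention are assumptions made for illustration only.

```python
from typing import Tuple

# Hypothetical representation: (start, end) of the region to KEEP,
# treated as half-open [start, end) for slicing purposes.
Interval = Tuple[int, int]


def combine_trim_intervals(adapter_keep: Interval, barcode_keep: Interval) -> Interval:
    """Intersect the two keep-regions: latest start, earliest end."""
    start = max(adapter_keep[0], barcode_keep[0])
    end = min(adapter_keep[1], barcode_keep[1])
    return (start, end)


def is_valid(interval: Interval) -> bool:
    """An interval is usable only if it spans at least one base."""
    return interval[0] < interval[1]


def trim_read(seq: str, adapter_keep: Interval, barcode_keep: Interval) -> str:
    """Trim to the combined interval, or return the read untrimmed if it is invalid."""
    start, end = combine_trim_intervals(adapter_keep, barcode_keep)
    if not is_valid((start, end)):
        # Mirrors the behaviour described above: trimming is skipped entirely.
        return seq
    return seq[start:end]


# The example from this thread: adapter keeps 34-50, barcode keeps 8-14.
print(combine_trim_intervals((34, 50), (8, 14)))
```

With the intervals from the example, the combined start (34) lies after the combined end (14), so the read is returned untrimmed.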
Reads that suffer from this issue tend to be very short, and can usually be excluded based on read length. As they will be untrimmed, you could also align the adapter sequence against them and discard reads with a high alignment score.
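As a rough sketch of the length-based exclusion, the following Python snippet filters four-line FASTQ records by a minimum length. It assumes the reads have been exported to plain FASTQ (for example with samtools fastq); the function names and the threshold value are illustrative assumptions, not recommendations from the dorado developers.

```python
import io
from typing import Iterable, Iterator, TextIO, Tuple

FastqRecord = Tuple[str, str, str]  # (header, sequence, quality)


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a strict four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        seq = handle.readline().rstrip("\n")
        handle.readline()  # separator line starting with '+'
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def filter_by_length(records: Iterable[FastqRecord], min_length: int) -> Iterator[FastqRecord]:
    """Keep only reads whose sequence has at least min_length bases."""
    for header, seq, qual in records:
        if len(seq) >= min_length:
            yield header, seq, qual


# Usage with an in-memory example; min_length=8 is an arbitrary demo value.
demo = "@r1\nACGT\n+\nIIII\n@r2\nACGTACGTAC\n+\nIIIIIIIIII\n"
kept = list(filter_by_length(read_fastq(io.StringIO(demo)), min_length=8))
```

For real data, a sensible cutoff depends on the library; inspecting the length distribution of the affected read ids first would be a reasonable way to choose it.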
If you want to try to retain these reads there are two other options available:
1. Run dorado trim on your output to remove any remaining adapters. This will, however, leave any untrimmed barcodes in place.
2. Rebasecall your data without barcoding but with --no-trim, and then demultiplex and trim barcodes in a second step with dorado demux.
This will skip explicit adapter trimming so there will be no conflict between the intervals, and adapters will be removed anyway by the barcode trimming during demux. If you also want to remove adapters from any unclassified reads, use step 1 on the unclassified.bam file.
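The second option (rebasecalling with --no-trim, then demultiplexing) could be scripted roughly as below. This is a sketch under assumptions: the helper names and the file and directory names are hypothetical, and the exact dorado arguments should be checked against `dorado basecaller --help` and `dorado demux --help` for your version. The model (sup) and kit (SQK-NBD114-24) are taken from the command reported in this issue.

```python
import subprocess
from typing import List


def basecall_command(model: str, pod5_dir: str) -> List[str]:
    """Basecall without a kit name and without trimming."""
    return ["dorado", "basecaller", model, pod5_dir, "--no-trim"]


def demux_command(kit: str, bam: str, out_dir: str) -> List[str]:
    """Classify and trim barcodes in a separate step."""
    return ["dorado", "demux", "--kit-name", kit, "--output-dir", out_dir, bam]


def run_two_step(model: str, pod5_dir: str, kit: str, bam: str, out_dir: str) -> None:
    """Run both steps; dorado basecaller writes its output to stdout."""
    with open(bam, "wb") as out:
        subprocess.run(basecall_command(model, pod5_dir), stdout=out, check=True)
    subprocess.run(demux_command(kit, bam, out_dir), check=True)


# Example invocation (not executed here; file names are placeholders):
# run_two_step("sup", "/PATH/TO/pod5", "SQK-NBD114-24", "calls.bam", "demuxed")
```

Other basecaller options from the original run, such as --modified-bases, would need to be appended to the first command if they are still wanted.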
Issue Report
Run environment:
Dorado version: 0.8.2
Dorado command arguments: "basecaller" "sup" "--device" "cuda:all" "--kit-name" "SQK-NBD114-24" "-v" "/PATH/TO/pod5" "--modified-bases" "4mC_5mC"
Thank you very much!
Best wishes and stay safe,
Cedric