sambamba-markdup: Read reference ID is out of range #224
Comments
Hi, It sounds like a bug in
|
I'll email you a link if that is ok... |
Thanks, I downloaded the files and can reproduce the issue. |
Appears to have the same cause as #214 (buggy version of lz4 library), finished just fine with v0.6.2. |
Thanks. I'm busily doing workarounds for other broken tools, and was just assuming this was the same problem. I'll be sure to update before rerunning markdup. |
I reran it several times, and occasionally it still fails :( Reopening. |
Apparently there is a deep limitation in The workaround I mentioned above is to split that contig (it is a chromosome, so I can split it into arms with no problem). However, that doesn't really solve the problem for others or for the future. If this really is the issue, everything is moving to |
That's unlikely to be the cause; markdup doesn't do any index queries. In the worst-case scenario it will write a broken .BAI index. Might be related to #189 |
Ugh... Deadlocks. That is above my pay grade I'm afraid. Just tell me if there's anything particular you'd like me to look out for. I feel a bit guilty not digging into the code myself to help, but I'm not going to learn D at the moment. |
I also encountered this with v0.6.3
Some bam files work fine, but when indexing the marked bam file, an error occurred: But they all work fine with Picard MarkDuplicates. |
What's the status on this issue? We (Genetics department at UMCG) are heavily dependent on sambamba. Upgrading to the latest version (0.6.3) did help for some samples/runs, but not all. We are also using bwa mem and merging with sambamba. |
Hi @RoanKanninga, could you check if compiling from source fixes the issue? I also experience it with the release binaries, but can't reproduce it in the development environment. It may be that outdated LLVM on CentOS leads to bugs like this. |
I still have this problem with v0.6.5 binary. |
If #189 is suspected then it'd be good to have @RoanKanninga's kernel version(s). |
#219 includes a new 0.6.5 binary of sambamba with debug info. May be worth trying that since it was built with a recent ldc and llvm 3.7. |
I just got this with version |
I also just got this in version |
I also have this issue when using version |
@charlottewright The problem I was having was due to BAI indexing limitations and a non-human genome with a really big chromosome. If you have a chrom/contig > (2^29)-1 in length, that's your problem. Using a CSI index will likely fix it. If you can't do that, split the offending chrom/contig into two. |
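For anyone who wants to check their own reference against that limit, here is a minimal Python sketch. It assumes you have the SAM header as text (for example from `samtools view -H`) and compares each @SQ length against the (2^29)-1 figure quoted above. The function name and the scaffold names are made up for illustration; only the 552137040 length comes from the original report.

```python
# Limit quoted in the comment above: BAI cannot handle contigs longer than 2^29 - 1 bp.
BAI_MAX_CONTIG_LENGTH = 2**29 - 1


def contigs_exceeding_bai_limit(sam_header_text):
    """Return (name, length) for every @SQ line whose LN exceeds the BAI limit."""
    too_long = []
    for line in sam_header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # @SQ fields are tab-separated TAG:VALUE pairs, e.g. SN:chr1 and LN:248956422.
        fields = dict(
            field.split(":", 1) for field in line.split("\t")[1:] if ":" in field
        )
        length = int(fields["LN"])
        if length > BAI_MAX_CONTIG_LENGTH:
            too_long.append((fields["SN"], length))
    return too_long


# Hypothetical header: the names are invented, the second length is from the report.
example_header = "\n".join([
    "@HD\tVN:1.5\tSO:coordinate",
    "@SQ\tSN:scaffold_1\tLN:1000000",
    "@SQ\tSN:scaffold_2\tLN:552137040",
])
print(contigs_exceeding_bai_limit(example_header))
```

Any contig this reports would need a CSI index or splitting, as described above. It says nothing about whether markdup itself is affected.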
sambamba markdup is dying (sig 11 I think) with:
sambamba-markdup: Read reference ID is out of range
This only happens for some (one at the moment) bam file. Others work fine.
v0.6.1
command:
sambamba_v0.6.1 markdup -t 8 merged.bam merged_markdup.bam
complete output:
The merge step was also done with sambamba. Mapping was done with bwa mem.
It fails fairly quickly, but the input file it is failing on is fairly huge (5.1G). I can make it available to someone for testing (it isn't human data), but don't want to just post it.
The problem may have something to do with the reference (total size 1.3G). The 2nd scaffold (out of 3919) in the reference is 552137040 bp long, and some picard and GATK tools have been choking on it. 552137040 * 8 is > 2^32, unlike any of the human chroms... so maybe something there.
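A quick Python sanity check of the arithmetic above, as a sketch only. The factor of 8 is just the guess from this report, not something taken from the sambamba source. The human chr1 length is an assumption (GRCh38), added for comparison.

```python
SCAFFOLD_LENGTH = 552_137_040    # 2nd scaffold in this reference (from the report)
HUMAN_CHR1_LENGTH = 248_956_422  # assumption: GRCh38 chr1, for comparison only


def exceeds_uint32(length_bp, multiplier=8):
    """True if length_bp * multiplier no longer fits in an unsigned 32-bit integer."""
    return length_bp * multiplier > 2**32


print(exceeds_uint32(SCAFFOLD_LENGTH))    # the big scaffold
print(exceeds_uint32(HUMAN_CHR1_LENGTH))  # largest human chromosome
print(SCAFFOLD_LENGTH > 2**29 - 1)        # also past the 2^29 - 1 coordinate range
```

The scaffold is the only one of the two that crosses 2^32 when multiplied by 8, and it is also longer than 2^29 - 1 bp, so either limit could plausibly be involved.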