Problem with CRAM parsing #280
Thanks @ernfrid. Can you run the most recent sambamba with debug information, following https://github.com/lomereiter/sambamba#troubleshooting, and see if that fails? And do you have an example CRAM file we can use to replicate the issue?
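For anyone else hitting this, a minimal sketch of collecting a backtrace under gdb; the debug build target and binary path here are assumptions, so follow the troubleshooting section linked above for the project's actual steps:

```sh
# Build sambamba with debug symbols (target name assumed; see the
# troubleshooting link for the real instructions).
make debug

# Run the failing command under gdb and capture a backtrace.
gdb --args ./build/sambamba view -C -f bam file.cram -l 0
(gdb) run    # reproduce the crash
(gdb) bt     # print the backtrace to paste into this issue
```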
We should absolutely upgrade htslib. The good news is that we can apparently use (almost) vanilla htslib now; the core issue was having bitfields in …
Sorry for the long delay. My CRAM file was human data and not open access, but I downloaded a CRAM from 1000 Genomes and observed the same error.
My first attempt at building sambamba with debug information turned on failed, so I don't have additional information at this time.
Thanks @ernfrid, I can reproduce the issue and will look into it over the weekend.
Hi everyone, any news here? For me sambamba v0.6.6 raises the same error with a CRAM v2.1 file that sambamba itself produced, converting BAM to CRAM and then trying to read the CRAM back as SAM.
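The exact commands were lost in formatting; a plausible reconstruction of the round trip, assuming sambamba view's `-f`, `-T`, `-o` and `-C` options and placeholder file names:

```sh
# BAM to CRAM (file names are placeholders; -T supplies the reference
# FASTA that CRAM encoding needs).
sambamba view -f cram -T ref.fa -o sample.cram sample.bam

# CRAM to SAM (-C marks the input as CRAM; SAM is the default output).
sambamba view -C sample.cram > sample.sam
```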
Btw, sambamba dumped debug output on stderr in my terminal; is that of any help? The input file is "CRAM version 2.1 compressed sequence data" according to htsfile (htslib) 1.3, and again, this CRAM file was produced by sambamba. The same error is raised when reading CRAM 3.0 format produced by …
Sambamba 0.6.7:

```
$ module load htslib/1.6 sambamba/0.6.7
$ htsfile --version
htsfile (htslib) 1.6
Copyright (C) 2017 Genome Research Ltd.
$ sambamba --version
sambamba 0.6.7
This version was built with:
LDC 1.1.1
using DMD v2.071.2
using LLVM 3.8.1
bootstrapped with LDC - the LLVM D compiler (0.17.4)
$ file sample.cram
sample.cram: data
$ htsfile sample.cram
sample.cram: CRAM version 3.0 compressed sequence data
$ samtools view -h sample.cram | tee >(file - >&2) | htsfile -
-: SAM version 1.3 sequence text
/dev/stdin: ASCII text, with very long lines
$ samtools view -u sample.cram | tee >(file - >&2) | htsfile -
-: BAM version 1 compressed sequence data
/dev/stdin: gzip compressed data, extra field
$ sambamba view sample.cram | file -
sambamba-view: Error reading BGZF block starting from offset 0: wrong BGZF magic
/dev/stdin: no read permission
$ sambamba view -C sample.cram | file -
Init cram_fd* #1
Init cram_fd* #2
Init _Anonymous_25* #1
cram_read_slice (1/1)
Init cram_slice* #1
Init _Anonymous_25* #2
...
cram_read_slice (1/1)
Init cram_slice* #35
Init _Anonymous_5* #2
/dev/stdin: ASCII text, with very long lines
$ sambamba view -C sample.cram 2>/dev/null | file -
/dev/stdin: ASCII text, with very long lines
$ sambamba markdup sample.cram /dev/null
sambamba-markdup: Error reading BGZF block starting from offset 0: wrong BGZF magic
$ sambamba depth base sample.cram
REF POS COV A C G T DEL REFSKIP SAMPLE
sambamba-depth: Error reading BGZF block starting from offset 0: wrong BGZF magic
$ samtools view -u sample.cram | sambamba depth base /dev/stdin
REF POS COV A C G T DEL REFSKIP SAMPLE
sambamba-depth: All files must be indexed
```

All in all pretty useless... :-( Too bad, because I really like sambamba's enhanced features! Is there any way I can help resolve this?
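One possible workaround until CRAM input works again: convert to BAM with samtools first, index it, and run sambamba on the BAM. A sketch, with `sample.cram` as a placeholder name:

```sh
# samtools reads the CRAM fine, so use it for the conversion
samtools view -b -o sample.bam sample.cram
# index the BAM so tools like `sambamba depth` accept it
sambamba index sample.bam
# run the sambamba subcommand on the BAM instead of the CRAM
sambamba depth base sample.bam
```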
We need to look into this, that is why the issue is open ;). The first step would be to upgrade htslib and the build tools to the latest versions. I'll ping you here when I get round to it, hopefully early spring.
CRAM support will be dropped.
We first noticed this on version 0.6.4, but it appears to exist in 0.6.5 as well.
Running a command like:
```
./sambamba_v0.6.5 view -C -f bam file.cram -l 0
```
Eventually results in an error like:
```
*** glibc detected *** ./sambamba_v0.6.5: corrupted double-linked list: 0x00000000013f1430 ***
```
If instead we output to SAM, I get a slightly different error:
```
sambamba-view: Failure in cram_decode_slice
```
samtools is able to parse these CRAMs without issue, so I don't believe the problem is with the files themselves.
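A quick way to double-check that the files are sound on the samtools side (the file name is a placeholder):

```sh
# quickcheck verifies the header and EOF block; -v reports any failures
samtools quickcheck -v file.cram && echo "file.cram looks ok"
# counting records forces a full decode pass and completes without error
samtools view -c file.cram
```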