sambamba base output missing chunks of data #225

Closed
parlar opened this issue Jun 17, 2016 · 6 comments

Comments

@parlar

parlar commented Jun 17, 2016

Hi!

The output from sambamba depth base 0.6.2 (and earlier versions) is missing chunks of data. I used the -z option. The BAM was already sorted, but to be sure I sorted it again using sambamba sort.

Here is an example. Notice the gap between pos 775 and 1860.

chrM 768 2 0 2 0 0 0 0 16-3584
chrM 769 2 2 0 0 0 0 0 16-3584
chrM 770 2 0 0 2 0 0 0 16-3584
chrM 771 1 0 1 0 0 0 0 16-3584
chrM 772 1 1 0 0 0 0 0 16-3584
chrM 773 1 1 0 0 0 0 0 16-3584
chrM 774 1 0 0 0 1 0 0 16-3584
chrM 775 1 0 0 1 0 0 0 16-3584
chrM 1860 15 15 0 0 0 0 0 16-3584
chrM 1861 15 15 0 0 0 0 0 16-3584
chrM 1862 15 0 0 0 15 0 0 16-3584
chrM 1863 15 0 0 0 15 0 0 16-3584
chrM 1864 15 15 0 0 0 0 0 16-3584
chrM 1865 30 30 0 0 0 0 0 16-3584
chrM 1866 30 0 30 0 0 0 0 16-3584

For comparison, I ran bedtools genomecov, which correctly reports depth for all positions.
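A gap like the one in the excerpt can be located mechanically instead of by eye. The following is a minimal sketch, assuming (from the excerpt only) that the first two whitespace-separated columns are the reference name and the position; the function name `find_gaps` is hypothetical and not part of sambamba.

```python
def find_gaps(lines):
    """Yield (ref, last_pos, next_pos) wherever two consecutive rows on the
    same reference are not at adjacent positions."""
    prev_ref, prev_pos = None, None
    for line in lines:
        fields = line.split()
        # Skip blank lines and any header row (assumed: position is column 2).
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        ref, pos = fields[0], int(fields[1])
        if ref == prev_ref and pos != prev_pos + 1:
            yield ref, prev_pos, pos
        prev_ref, prev_pos = ref, pos


# Rows copied from the excerpt above.
sample = """\
chrM 774 1 0 0 0 1 0 0 16-3584
chrM 775 1 0 0 1 0 0 0 16-3584
chrM 1860 15 15 0 0 0 0 0 16-3584
chrM 1861 15 15 0 0 0 0 0 16-3584
"""

gaps = list(find_gaps(sample.splitlines()))
print(gaps)  # [('chrM', 775, 1860)]
```

The sketch only compares neighbouring rows, so it does not detect positions missing at the very start or end of a reference.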

@lomereiter
Contributor

Hi,

Does it help if you add the -c 0 option? It defaults to 1 (sorry, I know the documentation for depth is terrible).

@parlar
Author

parlar commented Jun 17, 2016

Nope. Same thing (below).

I ran it like this:

sambamba62 depth base -c 0 -z -t 8 -F "" 16-3584-ready.sorted.bam >tmp.txt

I have also tried other filter options.

chrM 769 2 2 0 0 0 0 0 16-3584
chrM 770 2 0 0 2 0 0 0 16-3584
chrM 771 1 0 1 0 0 0 0 16-3584
chrM 772 1 1 0 0 0 0 0 16-3584
chrM 773 1 1 0 0 0 0 0 16-3584
chrM 774 1 0 0 0 1 0 0 16-3584
chrM 775 1 0 0 1 0 0 0 16-3584
chrM 1860 15 15 0 0 0 0 0 16-3584
chrM 1861 15 15 0 0 0 0 0 16-3584
chrM 1862 15 0 0 0 15 0 0 16-3584
chrM 1863 15 0 0 0 15 0 0 16-3584
chrM 1864 15 15 0 0 0 0 0 16-3584
chrM 1865 30 30 0 0 0 0 0 16-3584

@parlar
Author

parlar commented Jun 17, 2016

I just emailed you a slice of a bam file that might be useful for tracking down the problem.

cheers

lomereiter pushed a commit that referenced this issue Jun 19, 2016
@lomereiter
Contributor

Should be fixed now.
Just in case, the test file is the one you sent, but with randomly generated read bases.

lomereiter pushed a commit that referenced this issue Jun 19, 2016
@parlar
Author

parlar commented Jun 20, 2016

That fixed it! I notice, however, that the depths obtained using bedtools genomecov and sambamba 0.6.2 (the new 0.6.2) sometimes differ slightly. I don't know which one is correct.

Will you make a 0.6.3 release? This seems to me like quite an important bug fix.

@lomereiter
Contributor

IMO the best way to check is to find differences in low-coverage regions and check reads at a particular base manually.
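One way to find candidate positions for that manual check is to diff the two per-base tables and keep only low-coverage disagreements. This is a rough sketch, assuming each tool's output has been reduced to whitespace-separated rows whose first three columns are reference, position and depth; the names `load_depths`, `low_coverage_diffs` and `max_depth` are hypothetical, and the second table below contains made-up values purely for illustration.

```python
def load_depths(lines):
    """Map (ref, pos) -> depth, reading the first three columns of each row."""
    depths = {}
    for line in lines:
        fields = line.split()
        # Skip blank lines and header rows (assumed: position is column 2).
        if len(fields) < 3 or not fields[1].isdigit():
            continue
        depths[(fields[0], int(fields[1]))] = int(fields[2])
    return depths


def low_coverage_diffs(a, b, max_depth=5):
    """List (ref, pos, depth_a, depth_b) where the two tables disagree and
    both depths are at most max_depth. Missing positions count as depth 0."""
    diffs = []
    for key in sorted(a.keys() | b.keys()):
        da, db = a.get(key, 0), b.get(key, 0)
        if da != db and max(da, db) <= max_depth:
            diffs.append((key[0], key[1], da, db))
    return diffs


# First table: rows in the format shown earlier in the thread.
table_a = load_depths([
    "chrM 771 1 0 1 0 0 0 0 16-3584",
    "chrM 772 1 1 0 0 0 0 0 16-3584",
    "chrM 1865 30 30 0 0 0 0 0 16-3584",
])
# Second table: invented values, NOT real output from any tool.
table_b = load_depths([
    "chrM 771 1",
    "chrM 772 2",
    "chrM 1865 29",
])

diffs = low_coverage_diffs(table_a, table_b)
print(diffs)  # [('chrM', 772, 1, 2)]
```

Restricting the comparison to low depths keeps the list short enough that the reads covering each reported position can be inspected one by one.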

pjotrp pushed a commit to pjotrp/sambamba that referenced this issue Dec 13, 2016