std::bad_alloc in ataqv1.1.1 #10

Closed
naumenko-sa opened this issue Jan 2, 2020 · 19 comments

Comments

@naumenko-sa

naumenko-sa commented Jan 2, 2020

Hello!

Thanks for the useful tool!
We are using ataqv in bcbio (https://github.com/bcbio/bcbio-nextgen)

In a project with many samples, one sample failed at the ataqv step with std::bad_alloc.
I updated ataqv to the latest 1.1.1 (built it manually).
It allowed me to pass this sample, but it failed another one.

Call:

ataqv \
--peak-file S72941_2007_Naive-NF_peaks.narrowPeak \
--name S72941_2007_Naive \
--metrics-file S72941_2007_Naive.ataqv.json.gz \
--tss-file TSS.bed \
--autosomal-reference-file autosomal.txt \
--ignore-read-groups --mitochondrial-reference-name MT None \
S72941_2007_Naive-sort-shifted_chrom-noMito.unique-noduplicates-NF.bam

Output:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

Could you please help to resolve?
I'm attaching the peak file.
S72941_2007_Naive-NF_peaks.narrowPeak.gz
I can also share a bam file if you need it.

Thanks!
Sergey

@porchard
Contributor

porchard commented Jan 3, 2020

Hi Sergey,

Glad you all have found it useful. That command looks good to me so it's not immediately clear what the issue is. If you could send the bam file along with the TSS + autosomal reference files you've passed in the above command I'll see if I can identify the problem.

Also -- based on the chromosome names in the peak file above, I'm guessing this is human data? If so, you shouldn't need to pass an autosomal reference file as long as you indicate it's human and the autosomes are either 1 - 22 or chr1 - chr22 (which, based on the peak file you've shared, they are). You will still need to indicate the mitochondrial reference name (as you've correctly done above), since the built-in human mitochondrial reference name is 'chrM'.

@naumenko-sa
Author

Hi @porchard !

Thanks for the quick response!

Here are the files to reproduce the issue:
https://drive.google.com/open?id=1cXSf6cmZ3x_zQC81EItTKncQyc76o5Yx

This is human data aligned to grch37.
When I removed autosomal reference file and called as you suggested

#!/bin/bash

~/bcbio/tools/bin/ataqv \
--peak-file S72941_2007_Naive-NF_peaks.narrowPeak \
--name S72941_2007_Naive \
--metrics-file S72941_2007_Naive.ataqv.json.gz \
--tss-file TSS.bed \
--ignore-read-groups --mitochondrial-reference-name MT \
human \
S72941_2007_Naive-sort-shifted_chrom-noMito.unique-noduplicates-NF.bam

it works fine, without throwing bad_alloc, with both ataqv 1.0.0 and ataqv 1.1.1.

Sergey

@yoonsquared

yoonsquared commented Apr 21, 2020

Hello,

Just adding to this as we have gotten the same problem with the hg38-bwa-based atac-seq run.

Here is the error file I have extracted.
lian_atac_GRCh38_5422214.txt
It doesn't throw an error when I take out the autosomal file and replace none -> human like the example above.

Do you need extra bamfiles and tss files for the error?

Best,
Joon

P.S. I'm sorry, should I add this as a new post? It seems this is appropriate as I have solved my problem at the moment with the same solution you have provided.

roryk added a commit to bcbio/bcbio-nextgen that referenced this issue Apr 22, 2020
This has an issue ParkerLab/ataqv#10 so we
can skip it when we don't need it.

Thanks to @yoonsquared for testing the fix.
@naumenko-sa
Author

Hi @porchard !

Sorry for bugging you again, we are hitting bad_alloc this time with ataqv1.1.1 and mm10 data, no autosomal reference.

ataqv \
--peak-file YapKO_E15_Bla_Rep2-full_peaks.narrowPeak \
--name YapKO_E15_Bla_Rep2 \
--metrics-file YapKO_E15_Bla_Rep2.ataqv.json.gz \
--tss-file TSS.bed  \
--ignore-read-groups \
--mitochondrial-reference-name chrM \
--tss-extension 1000 mouse YapKO_E15_Bla_Rep2-sort-shifted.bam

I tried to submit it with 50G and 100G of RAM.

Could you please help us debug?

Sergey

@porchard
Contributor

porchard commented Jul 7, 2020

Hi Sergey (and Yoon),

Sure, happy to have a look at this. Are you allowed to share the files used in that last command?

I'll also set about a fix for the issue noted in the previous comments in this thread.

@porchard
Contributor

porchard commented Jul 7, 2020

Also -- can you share an autosomal reference file you were using? Looks like that Google Drive link above is dead.

@yoonsquared

yoonsquared commented Jul 7, 2020 via email

@porchard
Contributor

porchard commented Jul 7, 2020

Hi Joon,

Google Drive/Dropbox/etc would be fine, if that works for you?

Regarding the autosomal reference file: I was referring to the earlier comments in this thread. Sorry for the confusion. If you happen to have any of those lying around, I'd be curious to see them as well -- just wondering if the formatting for those might have been incorrect or something.

@yoonsquared

Hi @porchard,
https://drive.google.com/drive/folders/1dmC6rRdHwA82K855F48ORamyQ_dxlFBN?usp=sharing

These are the input files. I can't get the autosomal reference until tomorrow, but let me know if these files work for you.
Thanks!

Best,
Joon

@porchard
Contributor

porchard commented Jul 8, 2020

Hi Joon,

Thanks! Running your command, I do trigger a bad_alloc error. I think the issue may be some empty tags in the BAM file header:

@CO     TY:checksum     ST:     PA:all  HA:crc32prod    CO:0    BS:1    NS:1    SQ:1    ST:BC,FI,QT,RT,TC:1  
@CO     TY:checksum     ST:     PA:pass HA:crc32prod    CO:0    BS:1    NS:1    SQ:1    ST:BC,FI,QT,RT,TC:1

I think the empty ST tags are the problem. Can you remove those two lines from the header, re-run ataqv, and let me know if it runs successfully? If it does, I'll presume that's the issue and push a fix to handle the empty tags.
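For anyone who wants to try this workaround, here is a minimal sketch of one way to drop those lines. It assumes samtools is on the PATH and that the header fields are whitespace-separated as shown above; the function names and the `cleaned_header.sam` file name are placeholders made up for illustration, not part of ataqv.

```shell
#!/bin/bash

# Read a SAM header on stdin and drop every @CO line that carries an
# ST tag with no value (a field that is exactly "ST:").
# All other header lines are passed through unchanged.
strip_empty_st_comments() {
    awk '
        /^@CO/ {
            for (i = 2; i <= NF; i++) if ($i == "ST:") next
        }
        { print }
    '
}

# Hypothetical helper: write a copy of a BAM file with the cleaned header.
# Assumes samtools is installed; cleaned_header.sam is a scratch file.
reheader_without_empty_st() {
    local in_bam="$1" out_bam="$2"
    samtools view -H "$in_bam" | strip_empty_st_comments > cleaned_header.sam
    samtools reheader cleaned_header.sam "$in_bam" > "$out_bam"
}
```

Something like `reheader_without_empty_st input.bam fixed.bam` would then produce a BAM to pass to ataqv. Only the header changes; the alignments are untouched.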

@yoonsquared

yoonsquared commented Jul 8, 2020

Hi @porchard,

I'll give it a go after removing them and will report whether the run goes through. For the same study, my run also stopped with the same error for a different sample aligned with bowtie. I will check whether it has the same issue with the empty ST tags.

Thanks!

Update: I saw the same empty ST: in the bowtie run of another sample. It gave a bad_alloc also.

@yoonsquared

Hi @porchard,

The run went through for that particular sample, but my run threw another bad_alloc in another sample with the same problem.

Can you go ahead and push the fix? I'll try going through all the bam files to check all the samples with the same problem.
Thanks!

Best,
Joon
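A check along these lines might help when going through many BAM files. It is only a sketch: it assumes samtools is on the PATH, and the function names are made up for illustration.

```shell
#!/bin/bash

# Exit 0 if the SAM header on stdin contains an @CO line with a
# valueless ST tag (a field that is exactly "ST:"), otherwise exit 1.
has_empty_st_comment() {
    awk '
        /^@CO/ { for (i = 2; i <= NF; i++) if ($i == "ST:") found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Hypothetical helper: print the name of each BAM whose header is affected.
# Assumes samtools is installed.
list_affected_bams() {
    local bam
    for bam in "$@"; do
        if samtools view -H "$bam" | has_empty_st_comment; then
            printf '%s\n' "$bam"
        fi
    done
}
```

Running `list_affected_bams *.bam` in a sample directory would list the files to look at first.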

@porchard
Contributor

porchard commented Jul 8, 2020

Hi Joon,

Ok, give commit 1a2e8c5 a shot. Let me know how it goes.

@yoonsquared

yoonsquared commented Jul 8, 2020

Thanks @porchard,

It will take a bit to update the system and re-run. I will keep you posted.

Best,
Joon.

@naumenko-sa
Author

Thanks, @porchard!

The fix works.

I've compiled it manually and patched our bcbio installation with
ataqv:

#!/bin/bash

export LD_LIBRARY_PATH=/n/app/htslib/1.10.2/lib:/n/app/boost/1.62.0/lib:/n/app/gcc/6.2.0/lib64:/n/app/gcc/6.2.0/lib:$LD_LIBRARY_PATH
/n/app/bcbio/dev/anaconda/bin/ataqv_fix_2020-07-10 "$@"

@yoonsquared is running the whole bcbio run to double-check the fix.

Do you plan to release a new version?
Otherwise, users installing ataqv via bioconda (including bcbio users) will continue to hit similar bugs.

Sergey

@porchard
Contributor

Hi @naumenko-sa and @yoonsquared,

Excellent, thanks for letting me know. Yes, I'll see about getting a new release up.

@yoonsquared

Thanks for the prompt replies and solution!
The run finished fine.

All the best,
Joon

@porchard
Contributor

Hi @naumenko-sa and @yoonsquared,

I've published a new release - thanks again for your help.

@naumenko-sa
Author

Thanks! It is already in bioconda:
https://github.com/bioconda/bioconda-recipes/blob/master/recipes/ataqv/meta.yaml
so we are all set!
Until the next time!
S.
