segfaults with ATAC-seq data #78

Closed
NMaziak opened this issue Feb 23, 2022 · 8 comments
NMaziak commented Feb 23, 2022

Hello,

I'm trying to use chromap on ATAC-seq data but I keep getting segfaults. I'm using version 0.1.3 (downloaded as chromap-0.1.3_x64-linux.tar.bz2) on CentOS 7 with datasets in SRP234892 (https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA593838&o=acc_s%3Aa), runs SRR10597272-9.

System details:

NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

I also tried on other datasets and ran into the same issue. One of my colleagues has run this version of chromap on our system on a different dataset and that worked fine.
The error I get with chromap-0.1.3_x64-linux.tar.bz2 is as follows:

Will try to remove adapters on 3'.
Will remove PCR duplicates after mapping.
Will remove PCR duplicates at cell level.
Won't allocate multi-mappings after mapping.
Only output unique mappings after mapping.
Only output mappings of which barcodes are in whitelist.
Perform Tn5 shift.
Output mappings in SAM format.
Reference file: /home/research/vaquerizas/store/genomes/insects/Dmel/6.07/fasta/dmel-all-chromosome-r6.07.fasta
Index file: /home/research/vaquerizas/store/genomes/insects/Dmel/ChromapIndexes/6.07/index
1th read 1 file: fastq/SRR10597272_1.fastq.gz
1th read 2 file: fastq/SRR10597272_2.fastq.gz
Output file: aligned/SRR10597272_chromap.sam
Loaded all sequences successfully in 3.66s, number of sequences: 1870, number of bases: 143726002.
Kmer size: 17, window size: 7.
Lookup table size: 28722375, occurrence table size: 8529222.
Loaded index successfully in 4.69s.
Mapped 500000 read pairs in 26.49s.
/bin/bash: line 1: 190805 Segmentation fault (core dumped) chromap --preset atac -x /home/research/vaquerizas/store/genomes/insects/Dmel/ChromapIndexes/6.07/index -r /home/research/vaquerizas/store/genomes/insects/Dmel/6.07/fasta/dmel-all-chromosome-r6.07.fasta -1 fastq/SRR10597272_1.fastq.gz -2 fastq/SRR10597272_2.fastq.gz --SAM -o aligned/SRR10597272_chromap.sam

Tried again with chromap-0.1.3-asan_x64-linux.tar.bz2 and got:

[nmaziak1@node-head1a fastq]$ chromap-asan --trim-adapters -l 2000 --remove-pcr-duplicates-at-cell-level --Tn5-shift -x /home/research/vaquerizas/store/genomes/insects/Dmel/ChromapIndexes/6.07/index -r /home/research/vaquerizas/store/genomes/insects/Dmel/6.07/fasta/dmel-all-chromosome-r6.07.fasta -1 SRR10597272_1.fastq.gz -2 SRR10597272_2.fastq.gz -o try.bed
Start to map reads.
Parameters: error threshold: 8, min-num-seeds: 2, max-seed-frequency: 500,1000, max-num-best-mappings: 1, max-insert-size: 2000, MAPQ-threshold: 30, min-read-length: 30, bc-error-threshold: 1, bc-probability-threshold: 0.90
Number of threads: 1
Analyze bulk data.
Will try to remove adapters on 3'.
Won't remove PCR duplicates after mapping.
Will remove PCR duplicates at cell level.
Won't allocate multi-mappings after mapping.
Only output unique mappings after mapping.
Only output mappings of which barcodes are in whitelist.
Perform Tn5 shift.
Output mappings in BED/BEDPE format.
Reference file: /home/research/vaquerizas/store/genomes/insects/Dmel/6.07/fasta/dmel-all-chromosome-r6.07.fasta
Index file: /home/research/vaquerizas/store/genomes/insects/Dmel/ChromapIndexes/6.07/index
1th read 1 file: SRR10597272_1.fastq.gz
1th read 2 file: SRR10597272_2.fastq.gz
Output file: try.bed
Loaded all sequences successfully in 1.93s, number of sequences: 1870, number of bases: 143726002.
Kmer size: 17, window size: 7.
Lookup table size: 28722375, occurrence table size: 8529222.
Loaded index successfully in 3.06s.
Mapped 500000 read pairs in 47.95s.
AddressSanitizer:DEADLYSIGNAL
=================================================================
==64100==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000030 (pc 0x00000077487e bp 0x0000fffffff6 sp 0x7fff184bac00 T0)
==64100==The signal is caused by a READ memory access.
==64100==Hint: address points to the zero page.
  #0 0x77487e in chromap::Chromap<chromap::PairedEndMappingWithoutBarcode>::VerifyCandidatesOnOneDirection(chromap::Direction, chromap::SequenceBatch const&, unsigned int, chromap::SequenceBatch const&, std::vector<chromap::Candidate, std::allocator<chromap::Candidate> > const&, std::vector<std::pair<int, unsigned long>, std::allocator<std::pair<int, unsigned long> > >*, std::vector<int, std::allocator<int> >*, int*, int*, int*, int*) (/home/nmaziak1/software/chromap-0.1.3-asan_x64-linux/chromap-asan+0x77487e)
  #1 0x7ef445 in chromap::Chromap<chromap::PairedEndMappingWithoutBarcode>::VerifyCandidates(chromap::SequenceBatch const&, unsigned int, chromap::SequenceBatch const&, std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > > const&, std::vector<chromap::Candidate, std::allocator<chromap::Candidate> > const&, std::vector<chromap::Candidate, std::allocator<chromap::Candidate> > const&, std::vector<std::pair<int, unsigned long>, std::allocator<std::pair<int, unsigned long> > >*, std::vector<int, std::allocator<int> >*, std::vector<std::pair<int, unsigned long>, std::allocator<std::pair<int, unsigned long> > >*, std::vector<int, std::allocator<int> >*, int*, int*, int*, int*) (/home/nmaziak1/software/chromap-0.1.3-asan_x64-linux/chromap-asan+0x7ef445)
  #2 0x8271da in chromap::Chromap<chromap::PairedEndMappingWithoutBarcode>::MapPairedEndReads() [clone ._omp_fn.2] (/home/nmaziak1/software/chromap-0.1.3-asan_x64-linux/chromap-asan+0x8271da)
  #3 0x9fc635 in GOMP_taskloop (/home/nmaziak1/software/chromap-0.1.3-asan_x64-linux/chromap-asan+0x9fc635)
  #4 0x6b75e9 in chromap::Chromap<chromap::PairedEndMappingWithoutBarcode>::MapPairedEndReads() [clone ._omp_fn.0] (/home/nmaziak1/software/chromap-0.1.3-asan_x64-linux/chromap-asan+0x6b75e9)
  #5 0x9f68f1 in GOMP_parallel (/home/nmaziak1/software/chromap-0.1.3-asan_x64-linux/chromap-asan+0x9f68f1)
  #6 0x8aab30 in chromap::Chromap<chromap::PairedEndMappingWithoutBarcode>::MapPairedEndReads() (/home/nmaziak1/software/chromap-0.1.3-asan_x64-linux/chromap-asan+0x8aab30)
  #7 0x5ebc16 in chromap::ChromapDriver::ParseArgsAndRun(int, char**) (/home/nmaziak1/software/chromap-0.1.3-asan_x64-linux/chromap-asan+0x5ebc16)
  #8 0x41bdbf in main (/home/nmaziak1/software/chromap-0.1.3-asan_x64-linux/chromap-asan+0x41bdbf)
  #9 0x7f10f35bc554 in __libc_start_main (/lib64/libc.so.6+0x22554)
  #10 0x41ea1e (/home/nmaziak1/software/chromap-0.1.3-asan_x64-linux/chromap-asan+0x41ea1e)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/home/nmaziak1/software/chromap-0.1.3-asan_x64-linux/chromap-asan+0x77487e) in chromap::Chromap<chromap::PairedEndMappingWithoutBarcode>::VerifyCandidatesOnOneDirection(chromap::Direction, chromap::SequenceBatch const&, unsigned int, chromap::SequenceBatch const&, std::vector<chromap::Candidate, std::allocator<chromap::Candidate> > const&, std::vector<std::pair<int, unsigned long>, std::allocator<std::pair<int, unsigned long> > >*, std::vector<int, std::allocator<int> >*, int*, int*, int*, int*)
==64100==ABORTING

As far as I can tell, the indexing has been fine. Also of note: without the --SAM or --trim-adapters option it runs a little longer before crashing with the same error. Lastly, I accidentally fed it the same read pair once and it actually went through it just fine (of course telling me nothing paired). The --low-mem option doesn’t seem to change anything.

I have had general issues with ATAC-seq datasets from Drosophila (but no issues with ChIP-seq, RNA-seq, or GRO-seq). I'm not sure why, but just for reference, the command I run to get them is: fastq-dump --gzip --split-3 --readids -B --skip-technical --clip SRR10597272

Thank you!
Noura

@haowenz
Owner

haowenz commented Feb 23, 2022

Thanks for trying out the tool. Did you try the latest version v0.2.0 of Chromap? Though mapping results should be the same for BED output among the different versions, several bugs got fixed after v0.1.3, which may include this one.

@haowenz
Owner

haowenz commented Feb 23, 2022

Btw, to remove PCR duplicates, you have to also use --remove-pcr-duplicates in your command.
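
For illustration, the second command from the report with that flag added might look like the sketch below. The paths are shortened placeholders rather than the reporter's real paths, and `RUN_CHROMAP` is a hypothetical guard variable for this sketch, not a Chromap option; every Chromap flag shown is taken from the commands and logs above.

```shell
#!/bin/sh
# Sketch only: placeholder paths, shortened from the ones in the report.
index="index"
ref="dmel-all-chromosome-r6.07.fasta"
read1="SRR10597272_1.fastq.gz"
read2="SRR10597272_2.fastq.gz"

# Same options as the second command in the report,
# plus --remove-pcr-duplicates.
set -- --trim-adapters -l 2000 \
  --remove-pcr-duplicates --remove-pcr-duplicates-at-cell-level \
  --Tn5-shift \
  -x "$index" -r "$ref" \
  -1 "$read1" -2 "$read2" \
  -o try.bed

# Print the assembled command for review.
cmd="chromap $*"
echo "$cmd"

# Run it only when explicitly requested (RUN_CHROMAP is hypothetical).
if [ "${RUN_CHROMAP:-0}" = "1" ]; then
  chromap "$@"
fi
```

Printing the command first makes it easy to confirm that both duplicate-removal flags are present before starting a long mapping run.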

@NMaziak
Author

NMaziak commented Feb 23, 2022

Hello, thanks for the reply! I can't use v0.2.0 as of yet because of the same issue @liz-is had here [https://github.com//issues/4#issuecomment-975966558 ]. And thanks for catching this for me, I was just trying to use it outside of the presets which gave me the same issue.

@haowenz
Owner

haowenz commented Feb 23, 2022

I just attached a prebuilt binary to the release. And actually I downloaded the reads and I was able to reproduce the error, though I used GCF_000001215.4_Release_6_plus_ISO1_MT_genomic.fna as reference. I also located the bug. Once I fix it, I will upload a prebuilt binary here and let you know.

haowenz self-assigned this Feb 23, 2022
haowenz added the bug (Something isn't working) label Feb 23, 2022
@NMaziak
Author

NMaziak commented Feb 23, 2022

Thank you so much!

@haowenz
Owner

haowenz commented Feb 23, 2022

You can now use the no-cache version of Chromap at branch no-cache. And you can find the prebuilt binary at https://github.com/haowenz/chromap/releases/download/v0.2.0/chromap-0.2.0-no-cache_x64-linux.tar.bz2. It should work without segfault on this dataset.
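
A minimal sketch of fetching and unpacking that archive, assuming `curl` and a `tar` with bzip2 support are available. `DOWNLOAD` is a hypothetical guard variable for this sketch, and the layout inside the archive is not stated in this thread, so the contents are listed before extraction.

```shell
#!/bin/sh
# Sketch: fetch and unpack the no-cache prebuilt binary linked above.
url="https://github.com/haowenz/chromap/releases/download/v0.2.0/chromap-0.2.0-no-cache_x64-linux.tar.bz2"
archive="${url##*/}"   # file name = last path component of the URL

echo "archive: $archive"

# Only touch the network when explicitly requested (DOWNLOAD is hypothetical).
if [ "${DOWNLOAD:-0}" = "1" ]; then
  curl -L -o "$archive" "$url"   # -L follows the release download redirect
  tar -tjf "$archive"            # list the contents first
  tar -xjf "$archive"            # unpack into the current directory
fi
```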

@NMaziak
Author

NMaziak commented Feb 23, 2022

It seems to have worked, many thanks for all the help

@haowenz
Owner

haowenz commented Mar 31, 2022

v0.2.1 should also work on this. You may give it a try if you want.

haowenz closed this as completed Mar 31, 2022