DeepSV, an approach based on deep learning for calling long deletions from sequence reads.DeepSV is based on a novel method of visualizing sequence reads. The visualization is designed to capture multiple sources of information in the data that are relevant to long deletions. DeepSV also implements techniques for working with noisy training data. DeepSV trains a model from the visualized sequence reads and calls deletions based on this model. We demonstrate that DeepSV outperforms existing methods in terms of accuracy and efficiency of deletion calling on the data from the 1000 Genomes Project. Our work shows that deep learning can potentially lead to effective calling of different types of genetic variations that are complex than SNPs.
- python 3.6, Jupyter Notebook, numpy, scipy, pandas, Matplotlib
- Cuda 8.0, Cudnn
- TensorFlow
- Digits
- Pysam
bash Anaconda3-4.3.1-Linux-x86_64.sh
- pip install jupyter
- Configure
jupyter notebook --generate-config
Create a ciphertext password: from notebook.auth import passwd - Modify the default configuration file
c.NotebookApp.ip='*'
c.NotebookApp.password = u'sha:...'
c.NotebookApp.open_browser = False
c.NotebookApp.port =8888
c.NotebookApp.notebook_dir = u’/home/...’
Installation tutorial can be downloaded from the official website
- pip install tensorflow-gpu
cd ~
git clone https://github.com/NVIDIA/DIGITS.git digits
cd digits
sudo apt-get install graphviz gunicorn
for req in $(cat requirements.txt); do sudo pip install $req; done
pip install -r ~/digits/requirements.txt
./digits-devserver
- pip install pysam
BAM file & VCF file
First provide the bam files and vcf files for program
Run Generate_Deletion_Image.py and Generate_Non_Deletion_Image.py in the custom path
- python Generate_Deletion_Image.py --del_length
- python Generate_Non_Deletion_Image.py --del_length
Generate the path of all pictures for training the network
- python my_file_travel.py
Send all the generated pictures to the network training
- Using the CNN architecture in CNN_Source.py
Generating whole genome pictures
- python Whole_genome_Image.py
- python extract_breakpoint.py
- python generate_final_vcf.py