lmylynn/EM-MUL


EM-MUL

  EM-MUL is a tool for resolving ambiguously aligned bisulfite-treated reads using the information already available.

To run this program, you need samtools, perl, bedtools, and g++ installed. The scoring step is invoked with python3.6, so a Python 3 interpreter is needed as well.
The tool takes four inputs:

  • -r is the reference genome to align against.
  • -u is the unique reads.
  • -m is the multireads, which align ambiguously to multiple locations in the reference genome.
  • -o is the overlap file: the unique reads that overlap with multireads.
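
To show how the four inputs fit together on a command line, here is a minimal sketch of a parser that accepts the same flags. It is not the parser used inside EM-MUL; the function name `build_parser` and the attribute names (`reference`, `unique`, `multi`, `overlap`) are hypothetical and chosen only for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the four documented flags (not EM-MUL's own code)."""
    parser = argparse.ArgumentParser(description="Sketch of the EM-MUL inputs")
    parser.add_argument("-r", dest="reference", required=True,
                        help="reference genome")
    parser.add_argument("-u", dest="unique", required=True,
                        help="unique reads (SAM)")
    parser.add_argument("-m", dest="multi", required=True,
                        help="multireads (SAM)")
    parser.add_argument("-o", dest="overlap", required=True,
                        help="unique reads that overlap with multireads")
    return parser


# Parse the same arguments that appear in the scoring command of this README.
args = build_parser().parse_args([
    "-r", "hg38",
    "-u", "unique_reads_nodup.sam",
    "-m", "multireads.sam",
    "-o", "overlapfile.txt",
])
print(args.reference, args.unique, args.multi, args.overlap)
```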

The unique reads and the multireads are obtained by aligning the original BS reads with Bismark. The overlap file is derived from the unique reads and the multireads; the processing flow follows BAM_ABS, and the commands are:

  • Convert all_unique_reads.sam to all_unique_reads.bam.

    • samtools view -bS all_unique_reads.sam > all_unique_reads.bam
  • Run Convert_to_bed_unite.pl with the --ambiguous option to convert the ambiguous read file to BED format.

    • perl Convert_to_bed_unite.pl --ambiguous ambiguous_file.sam
  • Run samtools to extract the unique reads that overlap the ambiguous regions, in SAM format.

    • samtools view -L ambiguous_file.bed all_unique_reads.bam -q 20 > unique_reads.sam
  • Remove duplicates from the unique reads.

    • sort -n -r -k3,3 -k4,4 -k5,5 unique_reads.sam | uniq -u > unique_reads_nodup.sam
  • Convert unique read file to bed format with --unique option.

    • perl Convert_to_bed_unite.pl --unique unique_reads_nodup.sam
  • Get the overlapped unique reads with bedtools. Run the following command in the bedtools folder to produce the overlap file that EM-MUL uses.

    • ./intersectBed -a ambiguous_file.bed -b unique_reads_nodup.bed -wb -wa > overlapfile.txt
  • Score the multireads using EM-MUL.

    • python3.6 new_score_all_and_coverage_human -r hg38 -u unique_reads_nodup.sam -m multireads.sam -o overlapfile.txt
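
To make two of the steps above more concrete, the following is a minimal Python sketch of what they compute, under simplifying assumptions. `drop_repeated` mimics the effect of `uniq -u` on sorted input: a line that occurs more than once is dropped entirely rather than collapsed to one copy. `intersect` mimics the pairing that `intersectBed -wa -wb` reports, treating BED intervals as half-open. Both function names and all toy records are hypothetical; this is not code from EM-MUL or bedtools.

```python
from collections import Counter


def drop_repeated(lines):
    """Keep only lines that occur exactly once, like `sort | uniq -u`.

    Output order follows the input here, not the sort order.
    """
    counts = Counter(lines)
    return [line for line in lines if counts[line] == 1]


def intersect(a_records, b_records):
    """Pair every A interval with every overlapping B interval.

    Records are (chrom, start, end, name) with half-open coordinates,
    as in BED. Two intervals overlap if they share at least one base.
    """
    pairs = []
    for a in a_records:
        for b in b_records:
            same_chrom = a[0] == b[0]
            if same_chrom and a[1] < b[2] and b[1] < a[2]:
                pairs.append((a, b))
    return pairs


# Toy data, invented for illustration only.
sam_lines = [
    "readA\t0\tchr1\t100",
    "readB\t0\tchr1\t150",
    "readA\t0\tchr1\t100",  # exact repeat of the first line
]
unique_lines = drop_repeated(sam_lines)

ambiguous = [("chr1", 100, 150, "multi1"), ("chr2", 10, 60, "multi2")]
unique = [
    ("chr1", 140, 190, "uniq1"),  # overlaps multi1
    ("chr1", 150, 200, "uniq2"),  # only touches multi1's end, no overlap
    ("chr2", 0, 10, "uniq3"),     # only touches multi2's start, no overlap
]
overlaps = intersect(ambiguous, unique)

# One output row per overlapping pair: the A record followed by the B record.
for a, b in overlaps:
    print("\t".join(map(str, a + b)))
```

The quadratic scan in `intersect` is only suitable for toy inputs; for real data, bedtools remains the appropriate tool.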
