
vezzi/qaTools

 
 


A couple of useful qa tools for sequencing data.

I. Setup:

Run:

    make SAMTOOLS=/PATH/TO/SAMTOOLS/SOURCE VERSION=SAMTOOLS_VERSION

If you don't have samtools, download it from http://samtools.sourceforge.net and build it with make. Any version should work, but only 1.3 has been verified.

II. Tools:

  1. qaCompute
    Computes normal and span coverage from a bam/sam file. Also counts unmapped and sub-par quality reads.

    Parameters:

    m - Compute median coverage for each contig/chromosome. Makes the run a bit slower. Off by default.

    q [INT] - Quality threshold. Any read with a mapping quality under INT will be ignored when computing the coverage.

     NOTE: bwa outputs mapping quality 0 for reads that map with
     equal quality in multiple places. If you want to include these
     reads, set q to 0.
    

    d - Print coverage histogram over each individual contig/chromosome. These details will be printed in file .detail

    p [INT] - Print coverage profile to bed file, averaged over given window size.

    i - Silent run. Will not print running info to stdout.

    s [INT] - Compute span coverage. (Use for mate pair libs.) Instead of actual read coverage, this option treats the entire span of the insert as a read, if the insert size is lower than INT. For an accurate estimation of span coverage, I recommend setting an insert size limit INT around 3*std_dev of your lib's insert size distribution.

    c [INT] - Maximum X coverage to consider in histogram.

    h [STR] - Use different header. Because mappers sometimes break the headers or simply don't output them, this is provided as a non-kosher way around it. Use with care!

    For more info on the parameters try ./qaCompute
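
To make the effect of q, s and p more concrete, here is a minimal Python sketch of the idea. It is not qaCompute's code: the read tuples, the function names and the choice to fall back to plain read coverage when the insert is at or above the limit are all assumptions made for illustration.

```python
def per_base_coverage(reads, contig_len, min_mapq=0, max_span=None):
    """Toy per-base depth for one contig.

    reads: iterable of hypothetical (start, read_len, mapq, insert_size)
    tuples with 0-based start positions. This layout is an assumption,
    not the record format qaCompute uses internally.
    """
    depth = [0] * contig_len
    for start, read_len, mapq, insert_size in reads:
        if mapq < min_mapq:
            continue  # below the quality threshold, as with q [INT]
        if max_span is not None and 0 < insert_size < max_span:
            length = insert_size  # count the whole insert span, as with s [INT]
        else:
            length = read_len     # assumption: otherwise count only the read
        for pos in range(start, min(start + length, contig_len)):
            depth[pos] += 1
    return depth


def window_means(depth, window):
    """Average depth over consecutive windows, in the spirit of p [INT]."""
    means = []
    for i in range(0, len(depth), window):
        chunk = depth[i:i + window]
        means.append(sum(chunk) / len(chunk))
    return means


reads = [
    (0, 4, 60, 8),   # well-mapped read, short insert
    (2, 4, 0, 8),    # mapping quality 0 (e.g. a multi-mapped bwa read)
    (4, 4, 30, 50),  # insert larger than the span limit used below
]

read_cov = per_base_coverage(reads, 12, min_mapq=1)
span_cov = per_base_coverage(reads, 12, min_mapq=1, max_span=10)
print(read_cov)
print(span_cov)
print(window_means(span_cov, 4))
```

With min_mapq=1 the second read is dropped; with min_mapq=0 it would be counted, which corresponds to the bwa note above.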

  2. removeUnmapped
    Removes unmapped and sub-par quality reads from a bam/sam file.

    For more info on the parameters try ./removeUnmapped

  3. computeInsertSizeHistogram
    Computes the insert size distribution from a bam/sam file.

    For more info on the parameters try ./computeInsertSizeHistogram
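
The insert size distribution is also what the standard deviation mentioned for the s option of qaCompute would be estimated from. A rough Python sketch of that step, assuming the insert sizes have already been extracted into a plain list (the helper names are hypothetical, and keeping only positive values so that each pair is counted once is an assumption, not necessarily what computeInsertSizeHistogram does):

```python
from collections import Counter
from statistics import mean, pstdev


def insert_size_histogram(insert_sizes):
    """Count how often each positive insert size occurs.

    In SAM records the template length is negative for the rightmost
    mate and 0 when unavailable, so only positive values are kept here.
    """
    return Counter(size for size in insert_sizes if size > 0)


def summarize(histogram):
    """Return (mean, standard deviation) of the histogram's values."""
    values = list(histogram.elements())
    return mean(values), pstdev(values)


# Made-up insert sizes for a mate pair library, for illustration only.
sizes = [2900, 3000, 3000, 3100, -3000, 0]
hist = insert_size_histogram(sizes)
avg, sd = summarize(hist)
print(sorted(hist.items()))
print(avg, sd)
```

How the mean and standard deviation are turned into a value for s is left to the recommendation given under qaCompute.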
