
PretextMap

Paired REad TEXTure Mapper. Converts SAM or pairs formatted read pairs into genome contact maps. See https://github.com/4dn-dcic/pairix/blob/master/pairs_format_specification.md for pairs format specification.
Pairs format supported by version 0.04 or later only.

PretextMap is a commandline tool for converting aligned read pairs in either the SAM/BAM/CRAM or pairs format into genomic contact maps (https://github.com/aidenlab/juicer/wiki/Pre, https://higlass.io/).

Data is read from stdin over a Unix pipe, eliminating the need for any intermediate files. Alignments can be read directly from an aligner (bwa mem ... | PretextMap), from a SAM file (PretextMap < file.sam), from a BAM/CRAM file using samtools (samtools view -h file.bam | PretextMap) or from a pairs file (PretextMap < file.pairs). PretextMap can also be inserted into the middle of an existing pipeline using tee or a similar pipe-chaining technique.
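As a rough illustration of the tee approach, the sketch below splits a SAM stream so that one copy feeds a contact-map step while the original stream carries on down the pipeline. `branch_to_map` is a hypothetical helper written for this example, not part of PretextMap; it uses a named pipe so that it should also work in plain POSIX sh.

```shell
# branch_to_map CMD [ARGS...]
# Hypothetical helper: copies stdin to stdout unchanged and feeds
# a second copy of the stream to CMD through a named pipe.
branch_to_map() {
    fifo_dir=$(mktemp -d) || return 1
    fifo="$fifo_dir/sam.fifo"
    mkfifo "$fifo" || return 1
    # Run CMD in the background, reading from the named pipe.
    # Its stdout goes to stderr so it cannot mix into the SAM stream.
    "$@" < "$fifo" >&2 &
    # tee passes the stream through and writes the second copy to the pipe.
    tee "$fifo"
    # Wait for CMD to finish before cleaning up.
    wait
    rm -rf "$fifo_dir"
}

# Usage (assumes bwa, samtools and PretextMap are installed):
#   bwa mem ... | branch_to_map PretextMap -o map.pretext | samtools sort -o sorted.bam -
```

In bash, process substitution (`tee >(PretextMap -o map.pretext >&2)`) is a more compact way to get the same effect.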

PretextMap comes with no imposed pipeline for processing data. Process your alignments however you want before feeding to PretextMap.

Bioconda

All commandline Pretext tools for Unix (Linux and Mac) are available on bioconda.

The full suite of Pretext tools can be installed with

> conda install pretext-suite

Or, just PretextMap can be installed with

> conda install pretextmap

Usage

Pipe SAM or pairs formatted read pairs to PretextMap e.g.:
samtools view -h file.bam | PretextMap
zcat file.pairs.gz | PretextMap

Important: A SAM header with contig info must be present for SAM format (-h option for samtools).

Or pipe directly from an aligner e.g. bwa mem ... | PretextMap
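Because the SAM header requirement is easy to miss, a small pre-flight check can help. The sketch below tests whether a SAM stream starts with a header containing at least one `@SQ` (contig) line; `has_contig_header` is a hypothetical helper for this example and not something PretextMap provides.

```shell
# Hypothetical helper: succeeds if the SAM text on stdin contains
# at least one @SQ header line before the first alignment record.
has_contig_header() {
    awk '
        /^@SQ\t/ { found = 1; exit }   # contig line seen, stop reading
        !/^@/    { exit }              # first non-header line, stop reading
        END      { exit !found }       # exit status 0 only if @SQ was seen
    '
}

# Usage (assumes samtools and PretextMap are installed; -H prints the header only):
#   samtools view -H file.bam | has_contig_header \
#       && samtools view -h file.bam | PretextMap -o map.pretext
```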

Options

  • -o specifies an output file (required)
  • --sortby sorts contigs by length, name or nosort (default: length)
  • --sortorder ascend or descend (default: descend, no effect if sortby = nosort)
  • --mapq sets a minimum mapping quality filter (default: 10)

example:

> samtools view -h file.bam | PretextMap -o map.pretext --sortby length --sortorder descend --mapq 10

New options, version 0.1:

  • --filterInclude: a comma separated list of sequence names, only these sequences will be included
  • --filterExclude: a comma separated list of sequence names, these sequences will be excluded

example:

> samtools view -h file.bam seq_1 seq_2 | PretextMap -o map.pretext --filterInclude "seq_1, seq_2"

Filtering will increase the map resolution, since you're mapping less sequence into a fixed number of bins.
Note: also filtering with samtools view as in the above example (... seq_1 seq_2) is not necessary, but is recommended purely for speed (provided your BAM file is sorted and indexed).
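The resolution gain from filtering can be estimated with simple arithmetic: base pairs per bin is the total mapped sequence length divided by the number of bins. The sketch below uses a hypothetical bin count of 1000 and made-up sequence lengths purely for illustration; PretextMap's actual bin count is not stated here.

```shell
# bp_per_bin TOTAL_BP N_BINS
# Prints the number of base pairs covered by one bin, rounded up.
bp_per_bin() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

N_BINS=1000   # hypothetical value, not PretextMap's real bin count

# Whole assembly, 1 Gbp in total: prints 1000000
bp_per_bin 1000000000 "$N_BINS"

# Only two sequences totalling 50 Mbp: prints 50000
bp_per_bin 50000000 "$N_BINS"
```

Under these assumed numbers, each bin covers 20 times less sequence after filtering, so the map shows correspondingly finer detail.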

New option, version 0.1.9:

  • --highRes: high resolution output, only supported by PretextView >=0.2.5

Map Format

Contact maps are saved in a compressed texture format (hence the name). Maps can be read by PretextView (https://github.com/wtsi-hpag/PretextView). Expect pretext map files to take around 30 to 50 MB of disk space each.

Requirements, running

3 GB of RAM and 2 CPU cores

Third-Party acknowledgements

PretextMap uses the following third-party libraries:

Installation

Requires:

  • clang >= 11.0.0
  • meson >= 0.57.1
git submodule update --init --recursive
env CXX=clang meson setup --buildtype=release --unity on --prefix=<installation prefix> builddir
cd builddir
meson compile
meson test
meson install