Chromatin Interaction Analysis with Paired-End Tag (ChIA-PET) sequencing is a genome-wide, high-throughput technology that detects chromatin interactions associated with a specific protein of interest (Fullwood et al., 2009). ChIA-PET Tool (Li et al., 2010) is a computational package for processing the next-generation sequencing data generated by ChIA-PET wet-lab experiments. The pipeline consists of seven steps:
- linker filtering
- mapping the paired-end reads to a reference genome
- purifying the mapped reads
- dividing the reads into different categories
- peak calling
- interaction calling
- visualizing the results
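As an illustration of the step that divides reads into categories, the sketch below labels paired-end tags (PETs) in a BEDPE-like table by genomic span. It is a simplified picture under stated assumptions, not the logic in LGL.jar: the function name `classify_pets`, the fixed 3000 bp cutoff, and the category labels are all hypothetical, and the package's own script (cutoff_hist_binsize_10bp.r) assesses the boundary from the data rather than fixing it.

```shell
#!/bin/sh
# Hypothetical cutoff (bp) between self-ligation and inter-ligation PETs.
CUTOFF=3000

# Read BEDPE-like records (chrom1 start1 end1 chrom2 start2 end2 ...)
# on stdin and print the two anchors plus a category label.
classify_pets() {
    awk -v cutoff="$1" 'BEGIN { OFS = "\t" }
    {
        if ($1 != $4) {
            label = "inter-ligation (inter-chromosomal)"
        } else {
            # Genomic span between the outermost coordinates of the two tags.
            lo = ($2 + 0 < $5 + 0) ? $2 : $5
            hi = ($3 + 0 > $6 + 0) ? $3 : $6
            if (hi - lo < cutoff)
                label = "self-ligation"
            else
                label = "inter-ligation (intra-chromosomal)"
        }
        print $1, $2, $4, $5, label
    }'
}

# Three made-up records: a short span, a long span, and two chromosomes.
printf 'chr1\t1000\t1020\tchr1\t1500\t1520\nchr1\t1000\t1020\tchr1\t90000\t90020\nchr1\t1000\t1020\tchr5\t2000\t2020\n' |
    classify_pets "$CUTOFF"
```

The sketch ignores tag orientation and mapping quality and is meant only to make the categories concrete.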
ChIA-PET Tool was originally published in the journal Genome Biology in 2010. Since then, the package and modified versions of it have been used in many research projects published in high-profile journals. The modifications include revised linker-filtering scripts, support for state-of-the-art mapping tools (such as BWA and Bowtie), summary statistics for the data, and evaluation of data quality. This updated package demonstrates how to apply the latest ChIA-PET Tool to publicly available ChIA-PET data and explains the results in detail to make ChIA-PET Tool easier to use.
The current ChIA-PET Tool is a command-line program that is run from a terminal. It is written mainly in Java. Shell scripts glue the different steps of ChIA-PET Tool into a single pipeline, and R scripts calculate p-values and generate figures. The package can be downloaded from https://github.com/GuoliangLi-HZAU/ChIA-PET_Tool/archive/master.zip. It includes the ChIA-PET Tool source code in Java, a precompiled JAR file, shell scripts, R scripts and some example files.
program
LGL.jar // the kernel program, written in Java
LGLsrc
LGL // source code in Java
path.txt // file needed to compile the source code
hypergeometric.r // assesses the statistical significance of interactions with a hypergeometric model in R
pois.r // assesses the statistical significance of peaks with a Poisson distribution model in R
cutoff_hist_binsize_10bp.r // assesses the boundary between self-ligation and inter-ligation PETs
peakHeader.txt
hg19.chromSize.txt // file containing the length of each chromosome
mm10.chromSize.txt
ChIA-PET_Tool_Report // an empty template for generating the report
Rscript_and_genome_data // R scripts for generating the report
ChIA-PET_Tool_Report.r
Plotting_functions.R
hg19_cytoBandIdeo.txt
mm10_cytoBandIdeo.txt
linker_set_1.with-barcode-info.txt // linker file 1
linker_set_2.with-barcode-info.txt // linker file 2
MCF7.input.information.txt // input files and core parameters of linker filtering
run.MCF7.sh // shell scripts gluing the whole steps
deletion.sh // deletes some temporary files and moves the files for the visualization report to a new folder
If you need to modify the source code, use the following commands to rebuild the JAR file. First change your working directory to ChIA-PET_Tool-master/program/LGL/src/.
mkdir ../classes
javac -d ../classes @path.txt
cd ../classes/
jar -cvf LGL.jar LGL
rm ../../LGL.jar
cp LGL.jar ../../
ChIA-PET Tool is a pipeline based primarily on Java (http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html). It also depends on the following software.
BWA (http://bio-bwa.sourceforge.net/) is used to map ChIA-PET sequencing reads to a reference genome. BWA can be replaced by other mapping tools, such as Bowtie (http://sourceforge.net/projects/bowtie-bio/files/bowtie). The corresponding mapping tools in the scripts and the genome index should be modified for this purpose.
SAMtools (http://samtools.sourceforge.net/) is used to convert the alignment output from SAM format to BAM format.
Bedtools (https://bedtools.googlecode.com/files/BEDTools.v2.17.0.tar.gz) is required to convert the files from BAM format to bedpe format.
The R environment (http://www.r-project.org/) is used to compute the p-values, and the R packages xtable (http://cran.r-project.org/web/packages/xtable/index.html) and RCircos (http://cran.r-project.org/web/packages/RCircos/index.html) are used to generate the graphs for visualization.
Install each software package according to its own instructions, and test that each one runs properly.
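A quick way to confirm that the dependencies are reachable is to look them up on the PATH. The sketch below is only a convenience check and is not part of the package: the function name `check_tools` is hypothetical, and the executable names (java, bwa, samtools, bedtools, Rscript) are assumed to match a default installation of each tool.

```shell
#!/bin/sh
# Report which of the given executables can be found on the PATH.
# Returns 0 if all are found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Executable names are assumptions; adjust them to your installation.
check_tools java bwa samtools bedtools Rscript ||
    echo "Install the missing tools before running the pipeline."
```

Finding an executable does not prove it works, so it is still worth running each tool once (for example, printing its version or help text) before starting a long job.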
To run ChIA-PET Tool, the genome sequence, chromosome sizes, and cytoband data of the genome of interest are required. If BWA is used for mapping, the genome index must be built with BWA in advance.
In our test, the human hg19 reference genome (ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz), chromosome sizes (ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.chrom.sizes) and cytoband data (http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/cytoBandIdeo.txt.gz) were all downloaded from UCSC. The mouse data were obtained from UCSC in the same way.
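The preparation of the hg19 reference can be sketched as follows. This is an illustration under assumptions rather than a script shipped with the package: the output directory `genome/hg19`, the combined file name `hg19.fa`, and the helper names `run` and `prepare_hg19` are hypothetical, and concatenating every `chr*.fa` file (including any unplaced contigs in the archive) is a choice you may want to change. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to actually run them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

GENOME_DIR=${GENOME_DIR:-genome/hg19}   # hypothetical output directory
BIGZIPS=ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips
CYTOBAND=http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/cytoBandIdeo.txt.gz

prepare_hg19() {
    run mkdir -p "$GENOME_DIR"
    # Genome sequence and chromosome sizes (URLs from the text above).
    run wget -P "$GENOME_DIR" "$BIGZIPS/chromFa.tar.gz" "$BIGZIPS/hg19.chrom.sizes"
    # Cytoband data.
    run wget -P "$GENOME_DIR" "$CYTOBAND"
    # Unpack the per-chromosome FASTA files.
    run tar -xzf "$GENOME_DIR/chromFa.tar.gz" -C "$GENOME_DIR"
    # Concatenate them into one reference file (hypothetical name hg19.fa).
    run sh -c "cat '$GENOME_DIR'/chr*.fa > '$GENOME_DIR/hg19.fa'"
    # Build the BWA index; bwtsw is the algorithm intended for large genomes.
    run bwa index -a bwtsw "$GENOME_DIR/hg19.fa"
}

prepare_hg19
```

Building the index for a human genome takes considerable time and disk space, so it is usually done once and the index path is then reused in the pipeline script.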
In our test, we used published ChIA-PET data associated with RNA polymerase II (RNAPII) from the human breast cancer cell line MCF7 and the leukemia cell line K562 (Li et al., 2012), which can be downloaded from GEO under accession number GSE33664.
ChIA-PET Tool is an easy-to-use pipeline. After you set up all the required tools, data and parameters, you can run it with a single command:
sh run.MCF7.sh
Before you run the pipeline, you need to modify the variables in the shell script, especially the paths to the required tools. The parameters and their meanings are described in user_manual.pdf. ChIA-PET Tool produces several output files. The formats of the result files and the interpretation of the results are described in output_illustration.pdf, and the run information and summary statistics are shown in the ChIA-PET_Tool_Report generated by ChIA-PET Tool.
- Guoliang Li, Xiaoan Ruan, Raymond K. Auerbach, Michael Snyder, Yijun Ruan, et al. Extensive Promoter-Centered Chromatin Interactions Provide a Topological Basis for Transcription Regulation. Cell 148(1), 84-98 (2012) (Data Sets)
- Li G, Fullwood MJ, Xu H et al. ChIA-PET tool for comprehensive chromatin interaction analysis with paired-end tag sequencing. Genome Biology 11(2):R22 (2010)
- Fullwood, M. J. et al. An oestrogen-receptor-alpha-bound human chromatin interactome. Nature 462, 58-64 (2009)
If you have any problems or suggestions, please email Dr. Guoliang Li (guoliang.li@mail.hzau.edu.cn).