plot H2A gencode ex
Andy Pohl edited this page Aug 6, 2014 · 2 revisions
A very common practice is to plot the aggregate bigWig signal over many loci, all centered on (or otherwise anchored at) a common feature such as the transcription start site (TSS). To examine the binding of histone H2A around TSSs genome-wide, we first collect the protein-coding genes from GENCODE and make a BED file:
$ wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_20/gencode.v20.annotation.gtf.gz
$ zcat gencode.v20.annotation.gtf.gz \
| awk 'BEGIN{OFS="\t"}{if($3=="gene" && $20=="\"protein_coding\";"){print $1, $4-1, $5, $18, "0", $7}}' \
| sed 's/\"//g;s/;//' \
| sort -k1,1 -k2,2n > gencode_pc.bed
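To see what the awk step relies on, here is a small sketch that runs the same field selection on a single made-up GTF line. It assumes the attribute order that the field numbers above imply (gene_id, transcript_id, gene_type, gene_status, gene_name, transcript_type), so that whitespace splitting puts the gene name in field 18 and the type being tested in field 20; other annotation releases may order attributes differently, in which case the field numbers would need adjusting. The function name gtf_to_bed and the toy record are hypothetical and only for illustration.

```shell
# Hypothetical wrapper around the same awk/sed steps, reading GTF on stdin.
gtf_to_bed() {
  awk 'BEGIN{OFS="\t"}
       $3=="gene" && $20=="\"protein_coding\";" {
         # GTF is 1-based inclusive; BED is 0-based half-open, hence $4-1
         print $1, $4-1, $5, $18, "0", $7
       }' \
  | sed 's/"//g;s/;//'   # strip the quotes and the trailing semicolon from the name
}

# Made-up record (not real annotation) in the assumed attribute order
printf 'chr1\tHAVANA\tgene\t101\t500\t.\t+\t.\tgene_id "G1"; transcript_id "G1"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "TOYGENE"; transcript_type "protein_coding";\n' \
  | gtf_to_bed
# prints (tab-separated): chr1  100  500  TOYGENE  0  +
```

On the real file this would be used as zcat gencode.v20.annotation.gtf.gz | gtf_to_bed | sort -k1,1 -k2,2n > gencode_pc.bed, which should be equivalent to the pipeline above.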
Then we'll download the bigWig signal file (from ENCODE H1 hESC cells):
$ paraFetch 30 10 http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHistone/wgEncodeBroadHistoneH1hescH4k20me1StdSig.bigWig esc-H4k20me1.bw
Then we'll run bwtool aggregate to average the H2A signal upstream and downstream of the TSS of each of these GENCODE genes: