fastp is a tool for trimming reads. Install fastp with the following command.
$ conda install -c bioconda fastp
$ conda install -c bioconda bowtie2
Obtain a pre-built index, the database used for mapping. Go to the Bowtie 2 page (http://bowtie-bio.sourceforge.net/bowtie2/index.shtml) and, from the "Indexes" section on the right, download the pre-built index that matches your target species.
The species used in this section is mouse, so copy the link address for M. musculus (mm10) (ftp://ftp.ccb.jhu.edu/pub/data/bowtie2_indexes/mm10.zip), then download and unpack the mouse genome (mm10) index files with the following commands.
$ mkdir ~/bowtie2_index # Create a directory for the pre-built index
$ cd ~/bowtie2_index # Move into the directory for the pre-built index
$ wget ftp://ftp.ccb.jhu.edu/pub/data/bowtie2_indexes/mm10.zip
$ unzip mm10.zip
Similarly, download the pre-built index for the human reference genome sequence (hg38) with the following commands, then unpack the downloaded archive.
$ cd ~/bowtie2_index
$ wget ftp://ftp.ncbi.nlm.nih.gov/genomes/archive/old_genbank/Eukaryotes/vertebrates_mammals/Homo_sapiens/GRCh38/seqs_for_alignment_pipelines/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.bowtie_index.tar.gz
$ tar xvzf GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.bowtie_index.tar.gz
Note that pre-built indexes for the genomes of many species are also distributed on the Illumina iGenomes page (https://support.illumina.com/sequencing/sequencing_software/igenome.html).
The current MACS2 supports only Python 2 by default. However, support for Python 2 ends in January 2020 (Reference 2). With future maintainability in mind, this section describes how to install MACS2 under Python 3. Note that, as of February 2019, installing with pip or conda works correctly only under Python 2.
First, create an environment in which to install MACS2.
(Revised 2021/11/04)
The Python 3 version of MACS2 can now be installed with the pip command.
Install the Python 3 version of MACS2 with the following command.
$ pip install MACS2
Confirm that MACS2 was installed with the following command.
$ macs2 --help # OK if the help message is printed
$ brew install samtools
$ conda install -c bioconda homer
Make the human genome (hg38) available to HOMER with the following command.
$ perl /anaconda3/share/homer-*/configureHomer.pl -install hg38
Make the mouse genome (mm10) available to HOMER with the following command.
$ perl /anaconda3/share/homer-*/configureHomer.pl -install mm10
$ conda install -c bioconda deeptools
$ deeptools --version
$ brew install r
$ brew install rstudio --cask
$ brew install igv
$ brew install bedtools
First, start RStudio with the following command.
$ open -a RStudio
Next, install ChIPpeakAnno with the following R commands. If "Update all/some/none? [a/s/n]:" is displayed, type "a" and press Enter.
> if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
> BiocManager::install("ChIPpeakAnno")
- GENCODE https://www.gencodegenes.org
- Human: https://www.gencodegenes.org/human/
- Mouse: https://www.gencodegenes.org/mouse/
Open the URL above and copy the GTF download link from the row whose Content is "Comprehensive gene annotation" and whose Regions is "CHR".
Download the GENCODE mouse reference gene models with the following commands.
$ mkdir ~/gencode
$ cd ~/gencode
$ wget ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M20/gencode.vM20.annotation.gtf.gz # Change this URL to match the download link you copied
$ gzip -d gencode.vM20.annotation.gtf.gz
$ ls gencode.vM20.annotation.gtf # Confirm that gencode.vM20.annotation.gtf was created
Create a directory for the ChIP-seq analysis with the following commands, and inside it create a directory for storing the FASTQ files.
$ mkdir ~/chipseq # Create the directory for the ChIP-seq analysis
$ cd ~/chipseq # Move into the directory for the ChIP-seq analysis
$ mkdir fastq # Create the directory that will hold the FASTQ files
Download the FASTQ files of ChIP-seq experiments performed in mouse. Here, the FASTQ files are downloaded from ENA (European Nucleotide Archive), which is operated by EMBL-EBI.
1) BRD4 ChIP-seq in mouse AT-3 cells (with IFN-γ added). Download SRR5208824.fastq.gz.
$ cd ~/chipseq/fastq
$ wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR520/004/SRR5208824/SRR5208824.fastq.gz
2) IRF1 ChIP-seq in mouse AT-3 cells (with IFN-γ added). Download SRR5208828.fastq.gz.
$ cd ~/chipseq/fastq
$ wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR520/008/SRR5208828/SRR5208828.fastq.gz
3) Input DNA in mouse AT-3 cells. The input DNA experiment serves as the negative control for the ChIP-seq experiments. The control sample is used to remove non-specific peaks when MACS2 calls peaks. Download SRR5208838.fastq.gz.
$ cd ~/chipseq/fastq
$ wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR520/008/SRR5208838/SRR5208838.fastq.gz
Copy SRR5208824.fastq.gz and name the copy BRD4_ChIP_IFNy.R1.fastq.gz.
$ cd ~/chipseq/fastq
$ cp SRR5208824.fastq.gz BRD4_ChIP_IFNy.R1.fastq.gz
Do the same for the other two files.
$ cp SRR5208828.fastq.gz IRF1_ChIP_IFNy.R1.fastq.gz
$ cp SRR5208838.fastq.gz Input_DNA.R1.fastq.gz
First, create a directory for the FastQC results.
$ cd ~/chipseq
$ mkdir fastqc
Run FastQC with the following commands.
$ cd ~/chipseq
$ fastqc -o fastqc fastq/BRD4_ChIP_IFNy.R1.fastq.gz
Similarly, run FastQC on the other FASTQ files.
$ cd ~/chipseq
$ fastqc -o fastqc fastq/IRF1_ChIP_IFNy.R1.fastq.gz
$ fastqc -o fastqc fastq/Input_DNA.R1.fastq.gz
Confirm that the following files have been created in the fastqc directory.
BRD4_ChIP_IFNy.R1_fastqc.html
BRD4_ChIP_IFNy.R1_fastqc.zip
IRF1_ChIP_IFNy.R1_fastqc.html
IRF1_ChIP_IFNy.R1_fastqc.zip
Input_DNA.R1_fastqc.html
Input_DNA.R1_fastqc.zip
First, create a directory for the fastp results.
$ cd ~/chipseq
$ mkdir fastp # Create a directory named fastp
Run fastp on BRD4_ChIP_IFNy.R1.fastq.gz with the following commands. The trimmed reads are written to BRD4_ChIP_IFNy.R1.trim.fastq.gz.
$ cd ~/chipseq
$ fastp -i fastq/BRD4_ChIP_IFNy.R1.fastq.gz -o fastp/BRD4_ChIP_IFNy.R1.trim.fastq.gz --html fastp/BRD4_ChIP_IFNy.fastp.html
Similarly, run fastp on IRF1_ChIP_IFNy.R1.fastq.gz and Input_DNA.R1.fastq.gz.
$ cd ~/chipseq
$ fastp -i fastq/IRF1_ChIP_IFNy.R1.fastq.gz -o fastp/IRF1_ChIP_IFNy.R1.trim.fastq.gz --html fastp/IRF1_ChIP_IFNy.fastp.html
$ fastp -i fastq/Input_DNA.R1.fastq.gz -o fastp/Input_DNA.R1.trim.fastq.gz --html fastp/Input_DNA.fastp.html
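The three fastp invocations above differ only in the sample name, so they can be generated from a loop. The following is a minimal bash sketch that only prints the commands (a dry run); the function name print_fastp_commands is hypothetical and introduced purely for illustration.

```shell
#!/usr/bin/env bash
# Sample names follow the FASTQ files renamed earlier in this tutorial.
samples=(BRD4_ChIP_IFNy IRF1_ChIP_IFNy Input_DNA)

# Dry run: print one fastp command per sample instead of executing it.
print_fastp_commands() {
  local s
  for s in "${samples[@]}"; do
    echo "fastp -i fastq/${s}.R1.fastq.gz -o fastp/${s}.R1.trim.fastq.gz --html fastp/${s}.fastp.html"
  done
}

print_fastp_commands
```

After checking the printed commands, they could be executed from ~/chipseq by piping the output to bash (print_fastp_commands | bash).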
Confirm that the following files have been created in the fastp directory.
BRD4_ChIP_IFNy.fastp.html
BRD4_ChIP_IFNy.R1.trim.fastq.gz
IRF1_ChIP_IFNy.fastp.html
IRF1_ChIP_IFNy.R1.trim.fastq.gz
Input_DNA.fastp.html
Input_DNA.R1.trim.fastq.gz
The fastp report can be viewed with the following commands.
$ cd ~/chipseq
$ open fastp/BRD4_ChIP_IFNy.fastp.html
Tips: You may need to remove the 3'-terminal base from every read in a FASTQ file
Illumina DNA sequencers are known to have lower base-calling accuracy in the final cycle (the last base). For this reason, depending on the policy of the experimenter running the sequencer, the sequencing center, or the contract sequencing company, one extra cycle is sometimes added to the target read length, and the 3'-terminal base is removed when the FASTQ files are preprocessed. For example, to obtain 100-base reads, the sequencer is run for 101 cycles to produce a FASTQ file of 101-base reads, and the 3'-terminal base is removed during data analysis to produce a FASTQ file of 100-base reads.
In such cases, the last base must be removed with a read-trimming tool such as fastp. When all reads have the same length, the fastp option --trim_tail1=1 removes one base from the 3' end of every read (for paired-end reads, use --trim_tail1=1 --trim_tail2=1). If the FASTQ file has already been trimmed, the option --max_len1 N (N is a positive integer) trims any Read 1 longer than N bases from its 3' end until it is N bases long (for paired-end reads, use --max_len1 N --max_len2 N).
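To make the effect of removing one base from the 3' end concrete, here is a small sketch that does the same thing to a made-up FASTQ record with awk. It is only an illustration of the operation, not how fastp is implemented; the function name trim_last_base and the example record are assumptions.

```shell
#!/usr/bin/env bash
# Remove the last character from the sequence line (line 2) and the
# quality line (line 4) of every 4-line FASTQ record read from stdin.
trim_last_base() {
  awk 'NR % 4 == 2 || NR % 4 == 0 { print substr($0, 1, length($0) - 1); next }
       { print }'
}

# A made-up 5-base read whose last base call (N, quality "#") is poor.
printf '@read1\nACGTN\n+\nIIII#\n' | trim_last_base
```

The sequence and quality strings are shortened together, so the record stays valid FASTQ.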
Mapping reads to the reference genome sequence with Bowtie 2 is computationally heavy and can take from several hours to half a day. Distributing the computation across multiple cores speeds it up, so first check how many cores the Mac has with the following command. In the example below, "Total Number of Cores: 2" is displayed. Recent Macs with Intel Core series processors use Hyper-Threading Technology, in which each core runs two threads, so four threads can run at the same time.
$ system_profiler SPHardwareDataType | grep Cores
Total Number of Cores: 2
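As an alternative way of obtaining a thread count for scripts, the sketch below queries the number of online logical CPUs with getconf. It assumes that getconf _NPROCESSORS_ONLN is available on the system (it is on macOS and common Linux distributions); the function name detect_threads is hypothetical.

```shell
#!/usr/bin/env bash
# Report the number of logical CPUs (hardware threads) currently online.
detect_threads() {
  getconf _NPROCESSORS_ONLN
}

threads="$(detect_threads)"
echo "Logical CPUs available: ${threads}"
```

The value counts logical CPUs, so on a 2-core machine with Hyper-Threading it would be 4; this tutorial passes the physical core count (2) to -p.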
Next, map the reads to the reference genome sequence with Bowtie 2. First, create a directory for the Bowtie 2 results.
$ cd ~/chipseq
$ mkdir bowtie2
Map the reads in BRD4_ChIP_IFNy.R1.trim.fastq.gz to the mouse reference genome (mm10) with Bowtie 2 using the following commands.
$ cd ~/chipseq
$ bowtie2 -p 2 -x ~/bowtie2_index/mm10 \
-U fastp/BRD4_ChIP_IFNy.R1.trim.fastq.gz > bowtie2/BRD4_ChIP_IFNy.trim.sam
-p: the number of cores (threads) to use.
-x: the prefix of the pre-built index of the reference genome sequence.
-U: the FASTQ file.
Note that, by default, Bowtie 2 reports only the single best-scoring alignment for each read. If several alignments tie for the best score, it picks one of them at random.
When the computation finishes, a summary like the one below reports what fraction of the reads was mapped. In general, an extremely low overall alignment rate suggests that something went wrong, for example that the experiment did not work or that the wrong reference genome sequence was chosen.
19709457 reads; of these:
19709457 (100.00%) were unpaired; of these:
1163507 (5.90%) aligned 0 times
13206029 (67.00%) aligned exactly 1 time
5339921 (27.09%) aligned >1 times
94.10% overall alignment rate
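The overall alignment rate is the fraction of reads that aligned at least once. As a check, the sketch below recomputes it from the counts in the summary above; the function name alignment_rate is hypothetical, and the parsing assumes the single-end summary layout shown here.

```shell
#!/usr/bin/env bash
# Bowtie 2 summary copied from the run above (single-end reads).
summary='19709457 reads; of these:
  19709457 (100.00%) were unpaired; of these:
    1163507 (5.90%) aligned 0 times
    13206029 (67.00%) aligned exactly 1 time
    5339921 (27.09%) aligned >1 times
94.10% overall alignment rate'

# Overall rate = (reads aligned once + reads aligned more than once) / total.
alignment_rate() {
  LC_ALL=C awk '
    /reads; of these:/       { total = $1 }
    /aligned exactly 1 time/ { once  = $1 }
    /aligned >1 times/       { multi = $1 }
    END { printf "%.2f\n", 100 * (once + multi) / total }
  '
}

printf '%s\n' "$summary" | alignment_rate
```

The printed value agrees with the 94.10% that Bowtie 2 reports on the last line of its summary.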
Confirm that BRD4_ChIP_IFNy.trim.sam has been written to the bowtie2 directory. This is a file in SAM format.
Similarly, run Bowtie 2 on the other FASTQ files.
$ cd ~/chipseq
$ bowtie2 -p 2 -x ~/bowtie2_index/mm10 \
-U fastp/IRF1_ChIP_IFNy.R1.trim.fastq.gz > bowtie2/IRF1_ChIP_IFNy.trim.sam
$ bowtie2 -p 2 -x ~/bowtie2_index/mm10 \
-U fastp/Input_DNA.R1.trim.fastq.gz > bowtie2/Input_DNA.trim.sam
Confirm that the following files have been created.
bowtie2/IRF1_ChIP_IFNy.trim.sam
bowtie2/Input_DNA.trim.sam
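Rather than working with SAM files directly, it is more convenient for downstream analysis to convert them to BAM files, which are a compressed binary form of SAM. Here, samtools is used to convert the SAM files produced by Bowtie 2 to BAM files. The following commands (1) convert SAM to BAM, (2) extract the uniquely mapped reads from the SAM file (that is, discard reads that map to multiple genomic regions), and (3) sort the BAM file.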
$ cd ~/chipseq
$ samtools view -bhS -F 0x4 -q 42 bowtie2/BRD4_ChIP_IFNy.trim.sam | samtools sort -T bowtie2/BRD4_ChIP_IFNy.trim - > bowtie2/BRD4_ChIP_IFNy.trim.uniq.bam
-F 0x4 removes the reads that were not mapped, and -q 42 keeps only the uniquely mapped reads (alignments with a mapping quality of at least 42).
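The two filters can be illustrated on plain-text SAM records without samtools. The sketch below applies the same two conditions to made-up, truncated records (only the first five SAM fields); the function name keep_unique_mapped and the records are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Keep records whose FLAG lacks bit 0x4 (read unmapped) and whose MAPQ is >= 42.
# Fields follow SAM order: QNAME, FLAG, RNAME, POS, MAPQ, then the rest.
keep_unique_mapped() {
  local qname flag rname pos mapq rest
  while IFS=$'\t' read -r qname flag rname pos mapq rest; do
    if (( (flag & 4) == 0 && mapq >= 42 )); then
      echo "$qname"
    fi
  done
}

# r1: mapped with MAPQ 42 (kept); r2: unmapped (dropped); r3: MAPQ 1 (dropped).
printf 'r1\t0\tchr1\t100\t42\nr2\t4\t*\t0\t0\nr3\t16\tchr1\t200\t1\n' | keep_unique_mapped
```

Only r1 passes both conditions, which mirrors what -F 0x4 -q 42 selects.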
Confirm that the following file has been created.
bowtie2/BRD4_ChIP_IFNy.trim.uniq.bam
Similarly, convert the remaining SAM files to BAM.
$ cd ~/chipseq
$ samtools view -bhS -F 0x4 -q 42 bowtie2/IRF1_ChIP_IFNy.trim.sam | samtools sort -T bowtie2/IRF1_ChIP_IFNy.trim - > bowtie2/IRF1_ChIP_IFNy.trim.uniq.bam
$ samtools view -bhS -F 0x4 -q 42 bowtie2/Input_DNA.trim.sam | samtools sort -T bowtie2/Input_DNA.trim - > bowtie2/Input_DNA.trim.uniq.bam
Reading a BAM file often requires a BAM index (extension .bam.bai). Create the BAM indexes with the following commands.
$ cd ~/chipseq
$ samtools index bowtie2/BRD4_ChIP_IFNy.trim.uniq.bam
$ samtools index bowtie2/IRF1_ChIP_IFNy.trim.uniq.bam
$ samtools index bowtie2/Input_DNA.trim.uniq.bam
Confirm that the following files have been created.
bowtie2/BRD4_ChIP_IFNy.trim.uniq.bam.bai
bowtie2/IRF1_ChIP_IFNy.trim.uniq.bam.bai
bowtie2/Input_DNA.trim.uniq.bam.bai
$ cd ~/chipseq
$ mkdir macs2 # Create a directory for the MACS2 output (optional)
Apply MACS2 to bowtie2/BRD4_ChIP_IFNy.trim.uniq.bam and bowtie2/IRF1_ChIP_IFNy.trim.uniq.bam with the following commands to call peaks. Here, bowtie2/Input_DNA.trim.uniq.bam is specified with "-c" as the negative control of the ChIP-seq experiment.
$ cd ~/chipseq
$ macs2 callpeak -t bowtie2/BRD4_ChIP_IFNy.trim.uniq.bam \
-c bowtie2/Input_DNA.trim.uniq.bam -f BAM -g mm -n BRD4_ChIP_IFNy --outdir macs2 -B -q 0.01
$ macs2 callpeak -t bowtie2/IRF1_ChIP_IFNy.trim.uniq.bam \
-c bowtie2/Input_DNA.trim.uniq.bam -f BAM -g mm -n IRF1_ChIP_IFNy --outdir macs2 -B -q 0.01
-g: the species (effective genome size). Use mm for mouse and hs for human.
-q: the threshold on the "adjusted p-value", also called the q-value, used when reporting peaks. The default q-value threshold is 0.05. The q-value is used instead of the p-value because of multiple testing. MACS2 performs a statistical hypothesis test of how confident each peak is, once per candidate peak (multiple testing). In that situation, even if every null hypothesis is true, the probability that at least one of them is rejected grows with the number of tests, so the risk of false positives increases. Multiple-testing correction is therefore required, and the threshold is set on the q-value (the p-value after multiple-testing correction), that is, on the FDR (false discovery rate), rather than on the raw p-value. MACS2 computes the FDR with the BH method (Benjamini-Hochberg method). See Reference 1 for details on multiple-testing correction.
-B: write bedGraph files (the *.bdg files listed below).
-c: the control data. Omit this option when no control data is used, for example because no ChIP-seq control experiment was performed.
Confirm that the following files have been created.
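To illustrate how a Benjamini-Hochberg adjustment turns p-values into q-values, here is a small sketch of the textbook procedure using sort and awk. It is not MACS2's internal code; the function name bh_adjust and the five p-values are made up for illustration.

```shell
#!/usr/bin/env bash
# Textbook Benjamini-Hochberg adjustment.
# Input: one p-value per line. Output: "p<TAB>q", sorted by ascending p.
bh_adjust() {
  LC_ALL=C sort -g | LC_ALL=C awk '
    { p[NR] = $1 }
    END {
      m = NR
      running_min = 1
      # Walk from the largest p-value down, enforcing monotone q-values.
      for (i = m; i >= 1; i--) {
        q = p[i] * m / i
        if (q < running_min) running_min = q
        adj[i] = running_min
      }
      for (i = 1; i <= m; i++) printf "%s\t%.4f\n", p[i], adj[i]
    }'
}

printf '0.04\n0.001\n0.5\n0.01\n0.03\n' | bh_adjust
```

With a threshold of q < 0.01, only the smallest p-value (0.001, q = 0.0050) would pass in this toy example, even though four of the five raw p-values are below 0.05.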
BRD4_ChIP_IFNy_model.r # Result of the ChIP DNA fragment length estimation
BRD4_ChIP_IFNy_peaks.narrowPeak # narrowPeak file describing the peak regions
BRD4_ChIP_IFNy_peaks.xls # Detailed peak information (tab-delimited text that can be opened in Excel)
BRD4_ChIP_IFNy_summits.bed # BED file giving the summit position within each peak region
BRD4_ChIP_IFNy_treat_pileup.bdg # bedGraph of the ChIP (treatment) pileup signal
BRD4_ChIP_IFNy_control_lambda.bdg # bedGraph of the local lambda estimated from the control
*_peaks.narrowPeak is a file in BED6+4 format. In both *_peaks.narrowPeak and *_summits.bed, each detected peak is described on one line.
For example, the head command below displays the first 10 lines of the file.
$ head macs2/BRD4_ChIP_IFNy_peaks.narrowPeak
chr1 4807514 4808176 BRD4_ChIP_IFNy_peak_1 117 . 8.68079 15.22217 11.75610 520
chr1 4857437 4857680 BRD4_ChIP_IFNy_peak_2 35 . 4.76915 6.42381 3.56659 44
chr1 4857758 4858397 BRD4_ChIP_IFNy_peak_3 216 . 12.20127 25.56099 21.63464 266
chr1 5018884 5019146 BRD4_ChIP_IFNy_peak_4 48 . 5.54587 7.84017 4.84834 123
chr1 5019310 5019671 BRD4_ChIP_IFNy_peak_5 60 . 6.07428 9.09495 6.02557 165
chr1 5022794 5023366 BRD4_ChIP_IFNy_peak_6 117 . 8.68079 15.22217 11.75610 450
chr1 5082960 5083202 BRD4_ChIP_IFNy_peak_7 80 . 5.95644 11.26737 8.04385 189
chr1 6214644 6215164 BRD4_ChIP_IFNy_peak_8 95 . 7.50730 12.85034 9.50346 161
chr1 7088391 7088703 BRD4_ChIP_IFNy_peak_9 73 . 6.72497 10.58313 7.39270 149
chr1 9747880 9748331 BRD4_ChIP_IFNy_peak_10 78 . 6.70945 11.01250 7.80631 189
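The columns of *_peaks.narrowPeak are as follows:
- Column 1: chromosome name
- Column 2: 5' end (start) of the peak
- Column 3: 3' end (end) of the peak
- Column 4: peak name
- Column 5: -10*log10(qvalue) of the peak, converted to an integer
- Column 6: strand of the peak (ChIP-seq peaks carry no strand information, so "." is shown)
- Column 7: fold change relative to the background
- Column 8: -log10(pvalue)
- Column 9: -log10(qvalue)
- Column 10: position of the peak summit relative to the 5' end of the peak
*_summits.bed gives the position of the summit of each detected peak. Columns 1 to 4 have the same meaning as in *_peaks.narrowPeak, except that the start and end describe the single-base summit position; in the output below, column 5 holds the same value as column 9 (-log10(qvalue)) of *_peaks.narrowPeak.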
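Because column 10 is an offset from the peak start, the absolute summit position is column 2 plus column 10. The sketch below derives a summit record from the first narrowPeak line shown above; the function name narrowpeak_to_summit is hypothetical.

```shell
#!/usr/bin/env bash
# Convert narrowPeak records (tab-separated) into 1-bp summit records:
# chrom, summit start (= start + offset), summit end, peak name, -log10(qvalue).
narrowpeak_to_summit() {
  awk 'BEGIN { FS = OFS = "\t" }
       { s = $2 + $10; print $1, s, s + 1, $4, $9 }'
}

# First record of macs2/BRD4_ChIP_IFNy_peaks.narrowPeak as listed above.
printf 'chr1\t4807514\t4808176\tBRD4_ChIP_IFNy_peak_1\t117\t.\t8.68079\t15.22217\t11.75610\t520\n' \
  | narrowpeak_to_summit
```

The result (chr1, 4808034, 4808035) agrees with the first line of BRD4_ChIP_IFNy_summits.bed in this section.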
$ head macs2/BRD4_ChIP_IFNy_summits.bed
chr1 4808034 4808035 BRD4_ChIP_IFNy_peak_1 11.75610
chr1 4857481 4857482 BRD4_ChIP_IFNy_peak_2 3.56659
chr1 4858024 4858025 BRD4_ChIP_IFNy_peak_3 21.63464
chr1 5019007 5019008 BRD4_ChIP_IFNy_peak_4 4.84834
chr1 5019475 5019476 BRD4_ChIP_IFNy_peak_5 6.02557
chr1 5023244 5023245 BRD4_ChIP_IFNy_peak_6 11.75610
chr1 5083149 5083150 BRD4_ChIP_IFNy_peak_7 8.04385
chr1 6214805 6214806 BRD4_ChIP_IFNy_peak_8 9.50346
chr1 7088540 7088541 BRD4_ChIP_IFNy_peak_9 7.39270
chr1 9748069 9748070 BRD4_ChIP_IFNy_peak_10 7.80631
Next, count how many peaks were detected. Each peak occupies one line, so counting the lines of the file with wc -l gives the number of detected peaks.
$ cd ~/chipseq
$ wc -l macs2/*_peaks.narrowPeak
9348 macs2/BRD4_ChIP_IFNy_peaks.narrowPeak
907 macs2/BRD4_ChIP_IFNy_peaks.overlapped_with_IRF1_ChIP_IFNy_peaks.narrowPeak
3866 macs2/IRF1_ChIP_IFNy_peaks.narrowPeak
Next, examine how much the BRD4 peaks and the IRF1 peaks overlap. bedtools is used to examine the overlap between the two peak sets. The following commands extract, from the peaks given with -a (BRD4), those that overlap the peaks given with -b (IRF1).
$ cd ~/chipseq
$ bedtools intersect -u -a macs2/BRD4_ChIP_IFNy_peaks.narrowPeak -b macs2/IRF1_ChIP_IFNy_peaks.narrowPeak > macs2/BRD4_ChIP_IFNy_peaks.overlapped_with_IRF1_ChIP_IFNy_peaks.narrowPeak
Similarly, extract the IRF1 peaks that overlap BRD4 peaks.
$ cd ~/chipseq
$ bedtools intersect -u -a macs2/IRF1_ChIP_IFNy_peaks.narrowPeak -b macs2/BRD4_ChIP_IFNy_peaks.narrowPeak > macs2/IRF1_ChIP_IFNy_peaks.overlapped_with_BRD4_ChIP_IFNy_peaks.narrowPeak
The following commands extract, from the peaks given with -a, those that do not overlap any peak given with -b.
$ cd ~/chipseq
$ bedtools intersect -v -a macs2/BRD4_ChIP_IFNy_peaks.narrowPeak -b macs2/IRF1_ChIP_IFNy_peaks.narrowPeak > macs2/BRD4_ChIP_IFNy_peaks.not_overlapped_with_IRF1_ChIP_IFNy_peaks.narrowPeak
$ bedtools intersect -v -a macs2/IRF1_ChIP_IFNy_peaks.narrowPeak -b macs2/BRD4_ChIP_IFNy_peaks.narrowPeak > macs2/IRF1_ChIP_IFNy_peaks.not_overlapped_with_BRD4_ChIP_IFNy_peaks.narrowPeak
Count the lines (the number of peaks) in each file with the following commands.
$ cd ~/chipseq
$ wc -l macs2/*overlapped*.narrowPeak
8441 macs2/BRD4_ChIP_IFNy_peaks.not_overlapped_with_IRF1_ChIP_IFNy_peaks.narrowPeak
907 macs2/BRD4_ChIP_IFNy_peaks.overlapped_with_IRF1_ChIP_IFNy_peaks.narrowPeak
2957 macs2/IRF1_ChIP_IFNy_peaks.not_overlapped_with_BRD4_ChIP_IFNy_peaks.narrowPeak
909 macs2/IRF1_ChIP_IFNy_peaks.overlapped_with_BRD4_ChIP_IFNy_peaks.narrowPeak
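As a consistency check, the overlapping and non-overlapping counts for each factor should add up to its total number of peaks. The sketch below recomputes the totals and the overlap percentages from the counts listed above; the function name overlap_summary is hypothetical.

```shell
#!/usr/bin/env bash
# Arguments: label, number of overlapping peaks, number of non-overlapping peaks.
# Prints the total and the percentage of peaks that overlap the other set.
overlap_summary() {
  LC_ALL=C awk -v label="$1" -v ov="$2" -v nov="$3" 'BEGIN {
    total = ov + nov
    printf "%s\ttotal=%d\toverlap=%.2f%%\n", label, total, 100 * ov / total
  }'
}

overlap_summary BRD4 907 8441
overlap_summary IRF1 909 2957
```

The totals (9348 and 3866) match the peak counts reported for the two narrowPeak files. The two overlap counts (907 and 909) differ slightly because one peak in one set can overlap more than one peak in the other.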
$ cd ~/chipseq
$ mkdir deeptools # Create a directory for the deepTools output
$ cd ~/chipseq
$ bamCoverage -b bowtie2/BRD4_ChIP_IFNy.trim.uniq.bam -o deeptools/BRD4_ChIP_IFNy.trim.uniq.bw -of bigwig --normalizeUsing CPM
$ cd ~/chipseq
$ bamCoverage -b bowtie2/IRF1_ChIP_IFNy.trim.uniq.bam -o deeptools/IRF1_ChIP_IFNy.trim.uniq.bw -of bigwig --normalizeUsing CPM
$ bamCoverage -b bowtie2/Input_DNA.trim.uniq.bam -o deeptools/Input_DNA.trim.uniq.bw -of bigwig --normalizeUsing CPM
Confirm that the following files have been created.
deeptools/BRD4_ChIP_IFNy.trim.uniq.bw
deeptools/IRF1_ChIP_IFNy.trim.uniq.bw
deeptools/Input_DNA.trim.uniq.bw
Start IGV with the following command.
$ igv
First, create a directory for the HOMER results.
$ cd ~/chipseq
$ mkdir homer
Run HOMER on macs2/BRD4_ChIP_IFNy_summits.bed with the following commands.
$ cd ~/chipseq
$ mkdir homer/BRD4_ChIP_IFNy
$ findMotifsGenome.pl macs2/BRD4_ChIP_IFNy_summits.bed mm10 homer/BRD4_ChIP_IFNy -size 200 -mask
mm10 specifies the mouse genome; for human, use hg38 or similar. Two reports, homerResults.html and knownResults.html, are written under homer/. homerResults.html records the results of the de novo motif search, and knownResults.html records the results of scanning for known motifs.
Similarly, run HOMER on macs2/IRF1_ChIP_IFNy_summits.bed.
$ cd ~/chipseq
$ mkdir homer/IRF1_ChIP_IFNy
$ findMotifsGenome.pl macs2/IRF1_ChIP_IFNy_summits.bed mm10 homer/IRF1_ChIP_IFNy -size 200 -mask
Files like the following are written to each of homer/BRD4_ChIP_IFNy and homer/IRF1_ChIP_IFNy. Of these, "homerResults.html" and "knownResults.html" summarize the results of the de novo motif search and of the known-motif scan, respectively.
homerMotifs.all.motifs
homerMotifs.motifs10
homerMotifs.motifs12
homerMotifs.motifs8
homerResults
homerResults.html
knownResults
knownResults.html
knownResults.txt
motifFindingParameters.txt
seq.autonorm.tsv
The following commands create a matrix file (a file in a deepTools-specific format), using the GTF file of mouse gene models (~/gencode/gencode.vM20.annotation.gtf) as the regions and the bigWig file of the BRD4 ChIP-seq (deeptools/BRD4_ChIP_IFNy.trim.uniq.bw) as the score file. The "scale-regions" mode of the computeMatrix command is used.
$ cd ~/chipseq
$ computeMatrix scale-regions \
--regionsFileName ~/gencode/gencode.vM20.annotation.gtf \
--scoreFileName deeptools/BRD4_ChIP_IFNy.trim.uniq.bw \
--outFileName deeptools/BRD4_ChIP_IFNy.trim.uniq.matrix_gencode_vM20_gene.txt.gz \
--upstream 1000 --downstream 1000 \
--skipZeros
Create the metagene plot with the following commands.
$ cd ~/chipseq
$ plotProfile -m deeptools/BRD4_ChIP_IFNy.trim.uniq.matrix_gencode_vM20_gene.txt.gz \
-out deeptools/metagene_BRD4_ChIP_IFNy_gencode_vM20_gene.pdf \
--plotTitle "GENCODE vM20 genes"
The following commands create a heatmap of the ChIP-seq read signal over the set of gene regions.
$ cd ~/chipseq
$ plotHeatmap -m deeptools/BRD4_ChIP_IFNy.trim.uniq.matrix_gencode_vM20_gene.txt.gz \
-out deeptools/heatmap_BRD4_ChIP_IFNy_gencode_vM20_gene.pdf \
--plotTitle "GENCODE vM20 genes"
Similarly, create the metagene plot and the heatmap for the IRF1 ChIP-seq data with the following commands.
$ cd ~/chipseq
$ computeMatrix scale-regions \
--regionsFileName ~/gencode/gencode.vM20.annotation.gtf \
--scoreFileName deeptools/IRF1_ChIP_IFNy.trim.uniq.bw \
--outFileName deeptools/IRF1_ChIP_IFNy.trim.uniq.matrix_gencode_vM20_gene.txt.gz \
--upstream 1000 --downstream 1000 \
--skipZeros
$ plotProfile -m deeptools/IRF1_ChIP_IFNy.trim.uniq.matrix_gencode_vM20_gene.txt.gz \
-out deeptools/metagene_IRF1_ChIP_IFNy_gencode_vM20_gene.pdf \
--plotTitle "GENCODE vM20 genes"
$ plotHeatmap -m deeptools/IRF1_ChIP_IFNy.trim.uniq.matrix_gencode_vM20_gene.txt.gz \
-out deeptools/heatmap_IRF1_ChIP_IFNy_gencode_vM20_gene.pdf \
--plotTitle "GENCODE vM20 genes"
Next, prepare to draw an aggregation plot showing how the BRD4 ChIP-seq reads (deeptools/BRD4_ChIP_IFNy.trim.uniq.bw) are distributed genome-wide when centered on the IRF1 ChIP-seq peaks (macs2/IRF1_ChIP_IFNy_summits.bed). The "reference-point" mode of the computeMatrix command is used.
$ cd ~/chipseq
$ computeMatrix reference-point \
--regionsFileName macs2/IRF1_ChIP_IFNy_summits.bed \
--scoreFileName deeptools/BRD4_ChIP_IFNy.trim.uniq.bw \
--referencePoint center \
--upstream 1000 \
--downstream 1000 \
--outFileName deeptools/BRD4_ChIP_IFNy.trim.IRF1_ChIP_IFNy_summits.matrix.txt.gz \
--skipZeros
Next, create the aggregation plot with the following command.
$ plotProfile -m deeptools/BRD4_ChIP_IFNy.trim.IRF1_ChIP_IFNy_summits.matrix.txt.gz \
-out deeptools/aggregation_BRD4_ChIP_IFNy.trim.IRF1_ChIP_IFNy_summits.pdf \
--regionsLabel "IRF1_ChIP_IFNy Peaks"
The result looks like the figure below. The figure shows that the reads are concentrated within a few hundred bases on either side of the peaks.
$ plotHeatmap -m deeptools/BRD4_ChIP_IFNy.trim.IRF1_ChIP_IFNy_summits.matrix.txt.gz \
-out deeptools/heatmap_BRD4_ChIP_IFNy.trim.IRF1_ChIP_IFNy_summits.pdf \
--samplesLabel "BRD4_ChIP_IFNy" \
--regionsLabel "IRF1_ChIP_IFNy Peaks"
The result looks like the figure below. The figure shows that, for a subset of the IRF1 peaks, BRD4 reads are concentrated in the surrounding region.
Furthermore, specifying several bigWig files with --scoreFileName, as shown below, visualizes the read distributions of multiple ChIP-seq datasets at the same time. Specifying the number of clusters with --kmeans in plotHeatmap clusters the genomic regions with the k-means algorithm before displaying them.
$ computeMatrix scale-regions \
--regionsFileName ~/gencode/gencode.vM20.annotation.gtf \
--scoreFileName deeptools/BRD4_ChIP_IFNy.trim.uniq.bw \
deeptools/IRF1_ChIP_IFNy.trim.uniq.bw \
--outFileName deeptools/chipseq_matrix_gencode_vM20_gene.txt.gz \
--upstream 1000 --downstream 1000 \
--skipZeros
$ plotHeatmap -m deeptools/chipseq_matrix_gencode_vM20_gene.txt.gz \
-out deeptools/heatmap_BRD4_ChIP_IFNy_gencode_vM20_gene.k3.pdf \
--kmeans 3 \
--plotTitle "GENCODE vM20 genes"
First, process the peak files with the following commands so that GREAT accepts them. Specifically, extract only columns 1 through 6 to obtain BED format.
$ cd ~/chipseq
$ cut -f 1,2,3,4,5,6 macs2/BRD4_ChIP_IFNy_peaks.narrowPeak > macs2/BRD4_ChIP_IFNy_peaks.narrowPeak.bed
$ cut -f 1,2,3,4,5,6 macs2/IRF1_ChIP_IFNy_peaks.narrowPeak > macs2/IRF1_ChIP_IFNy_peaks.narrowPeak.bed
Now use the head command to look at the first 10 lines of the BED files and check the result of the previous commands.
$ cd ~/chipseq
$ head macs2/*.narrowPeak.bed
==> macs2/BRD4_ChIP_IFNy_peaks.narrowPeak.bed <==
chr1 4807514 4808176 BRD4_ChIP_IFNy_peak_1 117 .
chr1 4857437 4857680 BRD4_ChIP_IFNy_peak_2 35 .
chr1 4857758 4858397 BRD4_ChIP_IFNy_peak_3 216 .
chr1 5018884 5019146 BRD4_ChIP_IFNy_peak_4 48 .
chr1 5019310 5019671 BRD4_ChIP_IFNy_peak_5 60 .
chr1 5022794 5023366 BRD4_ChIP_IFNy_peak_6 117 .
chr1 5082960 5083202 BRD4_ChIP_IFNy_peak_7 80 .
chr1 6214644 6215164 BRD4_ChIP_IFNy_peak_8 95 .
chr1 7088391 7088703 BRD4_ChIP_IFNy_peak_9 73 .
chr1 9747880 9748331 BRD4_ChIP_IFNy_peak_10 78 .
==> macs2/IRF1_ChIP_IFNy_peaks.narrowPeak.bed <==
chr1 3405415 3405606 IRF1_ChIP_IFNy_peak_1 95 .
chr1 3408231 3408677 IRF1_ChIP_IFNy_peak_2 586 .
chr1 6406537 6406748 IRF1_ChIP_IFNy_peak_3 143 .
chr1 6717606 6717857 IRF1_ChIP_IFNy_peak_4 207 .
chr1 7139854 7140166 IRF1_ChIP_IFNy_peak_5 25 .
chr1 7660520 7660678 IRF1_ChIP_IFNy_peak_6 107 .
chr1 9129013 9129176 IRF1_ChIP_IFNy_peak_7 95 .
chr1 9703961 9704130 IRF1_ChIP_IFNy_peak_8 90 .
chr1 9943944 9944140 IRF1_ChIP_IFNy_peak_9 35 .
chr1 10220210 10220391 IRF1_ChIP_IFNy_peak_10 88 .
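Before uploading, it can be worth checking that every record really has six columns and a start smaller than its end. The sketch below shows the column extraction and such a check on a single narrowPeak record taken from this section; the function names to_bed6 and check_bed6 are hypothetical, and the check covers only these two conditions, not everything GREAT may require.

```shell
#!/usr/bin/env bash
# Keep the first six tab-separated columns (BED6).
to_bed6() {
  cut -f 1-6
}

# Exit with status 0 only if every line has 6 columns and start < end.
check_bed6() {
  awk -F '\t' 'NF != 6 || $2 >= $3 { bad++ } END { exit (bad > 0) }'
}

record='chr1\t4807514\t4808176\tBRD4_ChIP_IFNy_peak_1\t117\t.\t8.68079\t15.22217\t11.75610\t520\n'
printf "$record" | to_bed6
printf "$record" | to_bed6 | check_bed6 && echo "BED6 looks consistent"
```

On the real files, the same check could be run as check_bed6 < macs2/IRF1_ChIP_IFNy_peaks.narrowPeak.bed.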
- GREAT website: http://great.stanford.edu
First, choose the genome version under "Species Assembly". Here, select "Mouse: NCBI build 38 (UCSC mm10, Dec/2011)". Next, for the BED file under "Test regions", click the [Choose File] button and select the BED file you created (IRF1_ChIP_IFNy_peaks.narrowPeak.bed).
Next, press the [Show settings] button under Association rule settings to reveal the items shown in the figure below. GREAT assigns genes to the regions in the supplied BED file (peak regions in this case), and the assignment rule can be chosen here. This time, select [Basal plus extension] and press the [Submit] button. When the peak regions of interest include enhancers, [Two nearest genes] is a good choice.
Under Associated genomic regions in [Job Description], the link [View all genomic region-gene associations] shows which peak was assigned to which gene. Saving this makes it easy to check later which genes were assigned.
As of February 2019, only hg19 is available for the human genome. Therefore, to use GREAT with results analyzed against another version of the human reference genome, such as hg38, the genomic coordinates must first be converted to hg19 with a tool such as liftOver (https://genome.ucsc.edu/cgi-bin/hgLiftOver).
First, start RStudio with the following commands.
$ cd ~/chipseq
$ open -a RStudio
R packages are installed as needed in the steps below. If "Update all/some/none? [a/s/n]:" is displayed, type "a" and press Enter to continue.
The following examines how much the set of BRD4 ChIP-seq peaks and the set of IRF1 ChIP-seq peaks overlap.
> library(ChIPpeakAnno)
> gr1 <- toGRanges("~/chipseq/macs2/BRD4_ChIP_IFNy_peaks.narrowPeak", format="narrowPeak", header=FALSE)
> gr2 <- toGRanges("~/chipseq/macs2/IRF1_ChIP_IFNy_peaks.narrowPeak", format="narrowPeak", header=FALSE)
> ol <- findOverlapsOfPeaks(gr1, gr2) # Find the overlaps between the two peak sets
> makeVennDiagram(ol, NameOfPeaks=c("BRD4", "IRF1")) # Visualize the overlap as a Venn diagram
Next, assign each peak to the gene with the nearest transcription start site.
> BiocManager::install("TxDb.Mmusculus.UCSC.mm10.ensGene") # Download the gene model package for the mouse genome mm10
> library(TxDb.Mmusculus.UCSC.mm10.ensGene) # Load the gene model package for the mouse genome mm10
> annoData <- toGRanges(TxDb.Mmusculus.UCSC.mm10.ensGene)
> seqlevelsStyle(gr1) <- seqlevelsStyle(annoData) # Make the chromosome naming style consistent
> anno1 <- annotatePeakInBatch(gr1, AnnotationData=annoData) # Assign each peak to the nearest transcription start site (TSS)
> pie1(table(anno1$insideFeature), main="BRD4") # Show as a pie chart where the peaks lie relative to genes
The figure above shows that many peaks overlap a transcription start site (overlapStart).
> seqlevelsStyle(gr2) <- seqlevelsStyle(annoData) # Make the chromosome naming style consistent
> anno2 <- annotatePeakInBatch(gr2, AnnotationData=annoData) # Assign each peak to the nearest transcription start site (TSS)
> pie1(table(anno2$insideFeature), main="IRF1") # Show as a pie chart where the peaks lie relative to genes
The figure above shows that many IRF1 peaks lie upstream of a transcription start site.
The following commands examine what kinds of regions the overlapping BRD4 and IRF1 ChIP-seq peaks fall in.
> overlaps <- ol$peaklist[["gr1///gr2"]]
> aCR <- assignChromosomeRegion(overlaps, nucleotideLevel=FALSE,
precedence=c("Promoters", "immediateDownstream",
"fiveUTRs", "threeUTRs",
"Exons", "Introns"),
TxDb=TxDb.Mmusculus.UCSC.mm10.ensGene)
> pie1(aCR$percentage, main="BRD4 & IRF1")
The annotation above contains gene IDs but not gene names. Having gene names attached is often more convenient, so add the gene name corresponding to each gene ID.
> BiocManager::install("EnsDb.Mmusculus.v79")
> library(EnsDb.Mmusculus.v79)
> anno1$feature[is.na(anno1$feature)] <- "." # Replace NA with a period to avoid errors
> anno1$geneName <- mapIds(EnsDb.Mmusculus.v79, keys=anno1$feature, column = "GENENAME", keytype="GENEID")
> anno1[1:2]
GRanges object with 2 ranges and 14 metadata columns:
seqnames ranges strand |
<Rle> <IRanges> <Rle> |
BRD4_ChIP_IFNy_peak_1.ENSMUSG00000025903 chr1 4807514-4808176 * |
BRD4_ChIP_IFNy_peak_2.ENSMUSG00000033813 chr1 4857437-4857680 * |
score signalValue pValue
<integer> <numeric> <numeric>
BRD4_ChIP_IFNy_peak_1.ENSMUSG00000025903 117 8.68079 15.22217
BRD4_ChIP_IFNy_peak_2.ENSMUSG00000033813 35 4.76915 6.42381
qValue peak
<numeric> <character>
BRD4_ChIP_IFNy_peak_1.ENSMUSG00000025903 11.7561 BRD4_ChIP_IFNy_peak_1
BRD4_ChIP_IFNy_peak_2.ENSMUSG00000033813 3.56659 BRD4_ChIP_IFNy_peak_2
feature start_position
<character> <integer>
BRD4_ChIP_IFNy_peak_1.ENSMUSG00000025903 ENSMUSG00000025903 4807788
BRD4_ChIP_IFNy_peak_2.ENSMUSG00000033813 ENSMUSG00000033813 4857814
end_position feature_strand
<integer> <character>
BRD4_ChIP_IFNy_peak_1.ENSMUSG00000025903 4886770 +
BRD4_ChIP_IFNy_peak_2.ENSMUSG00000033813 4897909 +
insideFeature distancetoFeature
<factor> <numeric>
BRD4_ChIP_IFNy_peak_1.ENSMUSG00000025903 overlapStart -274
BRD4_ChIP_IFNy_peak_2.ENSMUSG00000033813 upstream -377
shortestDistance
<integer>
BRD4_ChIP_IFNy_peak_1.ENSMUSG00000025903 274
BRD4_ChIP_IFNy_peak_2.ENSMUSG00000033813 134
fromOverlappingOrNearest geneName
<character> <character>
BRD4_ChIP_IFNy_peak_1.ENSMUSG00000025903 NearestLocation Lypla1
BRD4_ChIP_IFNy_peak_2.ENSMUSG00000033813 NearestLocation Tcea1
-------
seqinfo: 23 sequences from an unspecified genome; no seqlengths
Save the result as a tab-delimited file with the following commands.
if(!dir.exists("~/chipseq/ChIPpeakAnno")) dir.create("~/chipseq/ChIPpeakAnno")
df_anno1 <- as.data.frame(anno1)
write.table(df_anno1, "~/chipseq/ChIPpeakAnno/BRD4_ChIP_IFNy_peaks.annot.txt", sep="\t", quote=F)