---
title: "Big-LD"
author: "Sunah Kim (sunny03@snu.ac.kr)"
output: rmarkdown::github_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r, echo = FALSE}
knitr::opts_chunk$set(
fig.path = "README_figs/README-"
)
```
# Big-LD
Big-LD is a block partition method based on interval graph modeling of LD bins, which are clusters of SNPs in strong pairwise LD that are not necessarily physically consecutive.
Detailed information about Big-LD can be found in our paper published in [Bioinformatics](https://academic.oup.com/bioinformatics/article/doi/10.1093/bioinformatics/btx609/4282661/A-new-haplotype-block-detection-method-for-dense).
## Installation
```{r, eval=FALSE}
library("devtools")
devtools::install_github("sunnyeesl/BigLD")
```
```{r, message=FALSE}
library(BigLD)
```
## Data
You need additive genotype data (each SNP genotype is coded as the number of minor alleles) and SNP information data.
The package includes sample genotype data (`geno`) and SNP information data (`SNPinfo`).
If you have installed the BigLD package, load the sample data as follows.
```{r data}
data(geno)
data(SNPinfo)
```
Alternatively, you can download the sample data from `/inst/extdata`.
The sample data include 1000 SNPs and 286 individuals.
```{r}
geno[1:10, 1:7]
head(SNPinfo)
```
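If your genotypes are stored as allele pairs rather than counts, the additive coding described above can be sketched in base R roughly as follows. The toy data, the helper `to_additive`, and the column names `chrN`, `rsID`, and `bp` in the SNP information table are assumptions made for illustration; check `head(SNPinfo)` for the layout the package actually expects.

```r
# Toy allele-pair genotypes: rows are individuals, columns are SNPs.
# All values are made up for illustration.
raw <- matrix(
  c("A/A", "A/G", "G/G", "A/A",
    "C/C", "C/T", "C/C", "C/C",
    "T/T", "T/T", "G/T", "G/G"),
  nrow = 4,
  dimnames = list(paste0("ind", 1:4), paste0("snp", 1:3))
)

# Hypothetical helper: count copies of the minor allele per individual.
# Assumes a biallelic, polymorphic SNP with no missing calls.
to_additive <- function(calls) {
  pairs <- strsplit(calls, "/", fixed = TRUE)
  counts <- table(unlist(pairs))
  minor <- names(counts)[which.min(counts)]
  vapply(pairs, function(a) sum(a == minor), integer(1))
}

# Individuals x SNPs matrix of minor allele counts (0, 1, or 2).
toy_geno <- apply(raw, 2, to_additive)
rownames(toy_geno) <- rownames(raw)

# SNP information table; column names and positions are assumptions.
toy_info <- data.frame(
  chrN = 22,
  rsID = colnames(toy_geno),
  bp   = c(16050000, 16050500, 16051200),
  stringsAsFactors = FALSE
)
```

The rows of `toy_info` are in the same order as the columns of `toy_geno`, which is the natural way to keep genotype and SNP information aligned.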
## CLQD
`CLQD` partitions the SNPs into subgroups such that each subgroup contains highly correlated SNPs.
There are two CLQ methods: the original CLQ (`CLQmode = 'Maximal'`) and CLQD (`CLQmode = 'Density'`).
```{r CLQD}
CLQres = CLQD(geno, SNPinfo, CLQmode = 'Density')
head(CLQres, n = 20)
```
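To make "highly correlated SNPs" concrete, the sketch below computes pairwise r² from a small additive genotype matrix and lists the SNP pairs that exceed a threshold. It is not the CLQD algorithm, only an illustration of the pairwise LD measure that such a grouping builds on; the toy data and the 0.5 threshold are assumptions chosen for the example.

```r
# Two made-up genotype patterns over 30 individuals.
base1 <- rep(c(0, 1, 2, 1, 0, 0), times = 5)
base2 <- rep(c(2, 0, 0, 1, 1, 0), times = 5)

# s2 and s4 are near-copies of s1 and s3 with one genotype changed each.
s2 <- base1; s2[1]  <- 1
s4 <- base2; s4[30] <- 1
toy <- cbind(s1 = base1, s2 = s2, s3 = base2, s4 = s4)

# Pairwise squared Pearson correlation between SNP columns.
r2 <- cor(toy)^2

# Keep each unordered pair once (upper triangle) if r^2 passes the threshold.
threshold <- 0.5  # illustrative value, not taken from the package
strong <- r2 >= threshold & upper.tri(r2)
idx <- which(strong, arr.ind = TRUE)

strong_pairs <- data.frame(
  snp_a = rownames(r2)[idx[, "row"]],
  snp_b = colnames(r2)[idx[, "col"]],
  r2    = r2[idx],
  stringsAsFactors = FALSE
)
strong_pairs
```

In this toy example only the pairs (`s1`, `s2`) and (`s3`, `s4`) pass the threshold, so the four SNPs fall into two groups.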
## Big_LD
`Big_LD` returns the estimated LD block regions for the given data.
```{r Big_LD}
BigLDres = Big_LD(geno, SNPinfo)
BigLDres
```
If you want to apply the heuristic procedure, add the option `checkLargest = TRUE`.
```{r Big_LDheuristic, eval=FALSE}
Big_LD(geno, SNPinfo, MAFcut = 0.05, checkLargest = TRUE, appendrare = TRUE)
```
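The `MAFcut = 0.05` argument in the call above refers to a minor allele frequency (MAF) cutoff. As a rough sketch of what such a cutoff means for additively coded data (not the package's internal code), the MAF of each SNP can be derived from its column mean. The toy matrix, the variable names, and the use of `>=` for the comparison are assumptions for illustration.

```r
# Toy additive genotypes: 20 individuals x 3 SNPs (made-up values).
toy <- cbind(
  common  = rep(c(0, 1, 2, 1), times = 5),
  rare    = c(1, rep(0, 19)),
  flipped = rep(c(2, 2, 2, 1), times = 5)
)

# Frequency of the counted allele: mean genotype divided by 2
# (each individual carries two alleles).
freq <- colMeans(toy, na.rm = TRUE) / 2

# Fold to the minor allele so the value never exceeds 0.5.
maf <- pmin(freq, 1 - freq)

# Keep SNPs at or above the cutoff; whether the package treats the
# boundary as inclusive is not stated here, so `>=` is an assumption.
maf_cut <- 0.05
keep <- maf >= maf_cut
filtered <- toy[, keep, drop = FALSE]

round(maf, 3)
colnames(filtered)
```

Here `rare` has a MAF of 0.025 and is dropped, while `flipped` is kept because its frequency of 0.875 folds to a MAF of 0.125.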
## LDblockHeatmap
`LDblockHeatmap` visualizes the LD block boundaries detected by `Big_LD`.
You can pass in results already obtained from `Big_LD` (`LDblockResult = BigLDres`).
If you do not supply Big-LD results, `LDblockHeatmap` first executes `Big_LD` to obtain an LD block estimate.
```{r LDheatmap1, results='hide'}
LDblockHeatmap(geno, SNPinfo, 22, LDblockResult= BigLDres)
```
You can show the locations of specific SNPs (`showSNPs = SNPinfo[c(100, 200), ]` shows the 100th and 200th SNPs),
or set a threshold on LD block size for showing SNP information (`showLDsize = 50`).
To save the LD heatmap as a tif file, add options such as `savefile = TRUE, filename = "LDheatmap2.tif"`.
```{r LDheatmap2, eval=FALSE}
LDblockHeatmap(geno, SNPinfo, 22, showSNPs = SNPinfo[c(100, 200), ], showLDsize = 50, savefile = TRUE, filename = "LDheatmap2.tif")
```
```{r LDheatmap3, results='hide', echo=FALSE}
LDblockres = LDblockHeatmap(geno, SNPinfo, 22, showSNPs = SNPinfo[c(100, 200), ], showLDsize = 50, savefile = TRUE, filename = "LDheatmap2.tif")
```
If you have any suggestions or questions, please contact us (sunny03@snu.ac.kr).