An R package for Standardized Summary, Annotation, Comparison, and Visualization of CNV, CNVR and ROH
Main functions and outputs from HandyCNV
This package was originally designed for the post-analysis of CNV results inferred from PennCNV and CNVPartition (GenomeStudio). It has since been expanded to accept input files in standard formats for a wider range of applications. Our motivation is to provide a standard, reproducible and time-saving pipeline for the post-analysis of CNVs and ROHs detected from SNP genotyping data for the majority of diploid species. The functions provided in this package fall into five categories: Conversion, Summary, Annotation, Comparison and Visualization. The most useful features are: integrating summarized results, generating lists of CNVRs, annotating the results with known gene positions, plotting CNVR distribution maps, and producing customised visualisations of CNVs and ROHs with gene and other related information on one plot. The package also supports a range of customisations, including the colour and size of high-resolution figures and the output folder, which avoids conflicts between the results of different runs. Running through all the functions detailed in the vignette can help to identify and explore the most interesting genomic regions more easily.
For detailed examples, please visit our GitHub Pages site: https://jh-zhou.github.io/HandyCNV/
First, to run this package, make sure that R (version >= 3.5.2) is installed on your computer (R download link: https://www.r-project.org/). Once R is installed, the 'HandyCNV' package can be installed from its GitHub repository by running the following script. If you have rarely used R, installing 'HandyCNV' for the first time may take longer.
install.packages("remotes") # Run this line if you haven't installed the 'remotes' package before
remotes::install_github(repo = "JH-Zhou/HandyCNV@v1.1.7")
If the first method does not work for some reason, we can manually download the 'Source code (zip)' from the latest release tag here: Download Source Code
Then install the source code from the local path with the following code:
install.packages("remotes") # Run this line if you haven't installed the 'remotes' package before
remotes::install_local(path = "C:/Users/HandyCNV-1.1.7.zip") # Replace 'C:/Users/' with the local path where you downloaded the source code
Then, we need to load the 'HandyCNV' package in order to run the following examples. This can be done using the library function as shown below.
library(HandyCNV)
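If loading fails, a quick check of the environment can help narrow down the cause. The following is a minimal sketch in base R, not part of HandyCNV itself; the variable names `min_r_version`, `r_ok` and `pkg_ok` are only illustrative. It compares the running R version with the minimum stated above and tests whether the package can be found.

```r
# Minimum R version stated in this README.
min_r_version <- "3.5.2"

# TRUE when the running R session meets the stated minimum version.
r_ok <- getRversion() >= min_r_version

# TRUE when 'HandyCNV' can be found in the current library paths.
pkg_ok <- requireNamespace("HandyCNV", quietly = TRUE)

message("R version OK: ", r_ok)
message("HandyCNV installed: ", pkg_ok)

# Report the installed version only when the package is available.
if (pkg_ok) {
  message("HandyCNV version: ", as.character(utils::packageVersion("HandyCNV")))
}
```

If `pkg_ok` is FALSE, repeating one of the two installation methods above is the first thing to try.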
Click the following links to browse the example outputs.
1. How do we prepare the standard CNV input file for HandyCNV and get a quick summary?
3. What types of summary plots are available?
4. How do we generate CNVRs (CNV Regions) from CNV results?
5. How do we annotate genes for CNVRs and CNVs?
6. How do we plot a CNVR distribution map?
7. How can we plot all high frequency CNVRs at once?
8. Can we compare CNVs between different result sets?
10. How do we find the consensus set of genes common to multiple CNV result sets?
13. How can we make a plot to show the source of the CNVs?
14. How can we plot just the genes in a specific region, to save as a separate figure?
15. How can we find regions with high frequencies of runs of homozygosity?
17. How do we get the haplotypes for an ROH region?
18. How do we convert coordinates for CNV, CNVR, ROH, or any other intervals?
If you have any special requirements for this package, please feel free to submit your requests via this link: Submit Requirements. We are happy to add new features to meet your needs.
If you find any errors while using this package, please tell us via this link: Bug Report. We will fix them as soon as possible.
If this tool is useful for your academic research, please cite our publication: Browse publication
Citation: Zhou J, Liu L, Lopdell TJ, Garrick DJ and Shi Y (2021). HandyCNV: Standardized Summary, Annotation, Comparison and Visualization of CNV, CNVR, and ROH. Front. Genet. 12:731355. doi: 10.3389/fgene.2021.731355
- New feature to visualize the haplotypes generated by get_haplotype()
haplo_visual(haplotype = haplotype_letter, xlab_text = "BMP7 Gene ")
- Unify the format of some inputs and outputs
- Add autosomal boundary data for other species to the 'cnvr_plot' function, which is used to plot the CNVR map. It now supports hg38 and hg19 for human, UMD3.1 and ARS-UCD1.2 for cattle, Oar_v4.0 for sheep, Sscrofa11.1 for pig, galGal6 for chicken, EquCab3.0 for horse and UMICH_Zoey_3.1 for dog.
- Add conditions to automatically control the number of x-axis labels shown in the 'Number of CNVs Detected per Individual' figure.
- Update GitHub Pages with vignettes
Minor modifications, such as unifying input file formats and correcting spelling errors.
- Most functions now support reading variable objects as input, not only files;
- Most functions now support returning the main output as an object to the R environment for further operations;
- New function 'get_samples' to extract sample IDs by searching for a gene of interest in the CNV annotation list.
- The 'call_cnvr' function now supports generating CNVRs from a CNV list that contains chromosomes without any CNVs;
- Add links to the Horse_quCab2.0 and sheep 'oviAri3' reference genomes in the 'get_refgene' function;
- Set up a standard table so that the comparison plot in the 'compare_cnvr' function can still be drawn when a group is empty;
- Add '-' as the separator between the two recoded haplotypes in the 'get_haplotype' function.
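To illustrate the idea behind searching an annotation list for a gene of interest, here is a minimal base R sketch. It is not the 'get_samples' implementation and does not use its arguments: the table `cnv_annotation`, its columns `sample_id` and `gene`, and the helper `find_samples_by_gene` are all hypothetical, and the rows are toy data.

```r
# Hypothetical CNV annotation list: one row per CNV-gene overlap (toy data).
cnv_annotation <- data.frame(
  sample_id = c("S1", "S2", "S2", "S3", "S4"),
  gene      = c("BMP7", "BMP7", "GENE_X", "GENE_Y", "BMP7"),
  stringsAsFactors = FALSE
)

# Return the unique sample IDs whose CNVs overlap the gene of interest.
find_samples_by_gene <- function(annotation, gene_name) {
  hits <- annotation[annotation$gene == gene_name, , drop = FALSE]
  unique(hits$sample_id)
}

samples_bmp7 <- find_samples_by_gene(cnv_annotation, "BMP7")
print(samples_bmp7)
```

In the toy table, samples S1, S2 and S4 carry a CNV annotated with BMP7, so those three IDs are returned.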
- New function to plot SNP density from SNP genotyping map.
plot_snp_density(map = "convert_map/target_plink.map",
                 max_chr = 24, # optional
                 top_density = 60, # optional
                 low_density = 20, # optional
                 color_top = "red", # optional
                 color_low = "blue", # optional
                 color_mid = "black", # optional
                 legend_position = c(0.9, 0.1), # optional
                 x_label = "Physical position\n物理位置", # optional
                 y_label = "SNPs/Mb\n每1Mb区间的SNP数", # optional
                 ncol_1 = 5)
# Save the plot with 'ggsave'
#ggsave(filename = "snp_density.png", width = 26, height = 18, units = "cm", dpi = 350)
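The quantity on the y axis, SNPs per Mb, can be sketched in base R as a count of SNPs per 1 Mb window. This is only an illustration of the calculation under simple assumptions, not the code used inside 'plot_snp_density'; the data frame `snp_map` follows the four-column PLINK .map layout, but its column names and values are made up.

```r
# Toy map in the four-column PLINK .map layout (chromosome, SNP ID,
# genetic distance, base-pair position); the values are made up.
snp_map <- data.frame(
  chr      = c(1, 1, 1, 1, 2, 2),
  snp_id   = paste0("snp", 1:6),
  cm       = 0,
  position = c(150000, 900000, 1200000, 2500000, 400000, 450000),
  stringsAsFactors = FALSE
)

window_size <- 1e6

# Index of the 1 Mb window each SNP falls in (window 0 covers 0 to <1 Mb).
snp_map$window <- floor(snp_map$position / window_size)

# Count SNPs per chromosome and window.
density <- aggregate(snp_id ~ chr + window, data = snp_map, FUN = length)
names(density)[names(density) == "snp_id"] <- "n_snp"

# Sort by chromosome, then by window, for easier reading.
density <- density[order(density$chr, density$window), ]
rownames(density) <- NULL
print(density)
```

Windows that contain no SNPs do not appear in `density`; a complete grid of windows would have to be added separately if zero counts are needed.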
- Revise the CNV status distribution plot in the 'cnv_summary_plot' function so that the boxplot and line are still drawn for chromosomes that have no CNVs.
Corrected the version number.
- Previously, the CNV list could only be loaded from a local directory through a path; it can now also be read from the working environment, based on the type of the input;
- Support returning the CNVR list to the working environment.
- Support returning the clean CNV list to the working environment.
- Support loading the CNV list from the working environment.
- Add a new function to generate the unique and mutual CNVRs by uniting the overlapped CNVRs between two results. This has two purposes: to better understand the overlapping CNVRs, and to mark the common regions on the CNVR distribution map in the 'cnvr_plot' function;
- Add the 'overlap_cnvr' argument to mark overlapped regions on the CNVR distribution map;
- Add the 'label_prop' argument to show the proportion of CNVR length to the total length of the corresponding chromosome on the CNVR map;
- Add the 'chr_col' argument to customize the color of chromosomes;
- Add the 'overlap_col' argument to customize the color of overlapped CNVRs;
- Reduce the margin of the final CNVR distribution map.
# Demo code:
cnvr_plot(cnvr = "./cnvr_combine_part_penn_umd/cnvr.txt", assembly = "UMD",
          sample_size = 388, common_cnv_threshold = 0.05,
          overlap_cnvr = "./compare_cnvr_Penn_UMD_Vs_Part_UMD/common_cnvr.txt",
          gain_col = "deeppink1", loss_col = "deepskyblue3", mixed_col = "springgreen3",
          folder = "./cnvr_combine_part_penn_umd/cnvr_map_common")
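The notion of unique and mutual CNVRs mentioned above can be sketched in base R as follows. This is an assumption-laden illustration of the idea, not the algorithm used by HandyCNV: the helpers `has_overlap` and `unite_intervals` and the tables `cnvr_a` and `cnvr_b` are hypothetical, the coordinates are toy values, and the sketch handles a single chromosome only.

```r
# Toy CNVR lists from two hypothetical result sets (same chromosome).
cnvr_a <- data.frame(chr = 1, start = c(100, 500, 900), end = c(200, 650, 950))
cnvr_b <- data.frame(chr = 1, start = c(150, 600, 2000), end = c(300, 700, 2100))

# Flag each region in 'x' that overlaps at least one region in 'y'.
has_overlap <- function(x, y) {
  vapply(seq_len(nrow(x)), function(i) {
    any(y$chr == x$chr[i] & y$start <= x$end[i] & y$end >= x$start[i])
  }, logical(1))
}

# Merge overlapping intervals on one chromosome into united regions.
unite_intervals <- function(df) {
  df <- df[order(df$start), ]
  # A new group starts when a region begins after every earlier region ends.
  group <- cumsum(c(TRUE, df$start[-1] > cummax(df$end)[-nrow(df)]))
  data.frame(
    chr   = df$chr[1],
    start = as.numeric(tapply(df$start, group, min)),
    end   = as.numeric(tapply(df$end, group, max))
  )
}

in_a <- has_overlap(cnvr_a, cnvr_b)
in_b <- has_overlap(cnvr_b, cnvr_a)

# Mutual CNVRs: overlapped regions from both sets, united into single regions.
mutual <- unite_intervals(rbind(cnvr_a[in_a, ], cnvr_b[in_b, ]))

# Unique CNVRs: regions with no overlap in the other set.
unique_a <- cnvr_a[!in_a, ]
unique_b <- cnvr_b[!in_b, ]

print(mutual)
print(unique_a)
print(unique_b)
```

With the toy data, the two overlapping pairs are united into the mutual regions 100 to 300 and 500 to 700, while 900 to 950 stays unique to the first set and 2000 to 2100 to the second.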
- Add the 'color_label' argument to display the color of genes that passed the common threshold in the heatmap;