An R package for Standardized Summary, Annotation, Comparison, and Visualization of CNV, CNVR and ROH

Welcome to HandyCNV

Main functions and outputs from HandyCNV

Introduction

This package was originally designed for the post-analysis of CNV results inferred from PennCNV and CNVPartition (GenomeStudio). It has since been expanded to accept input files in standard formats for a wider range of applications. Our motivation is to provide a standard, reproducible and time-saving pipeline for the post-analysis of CNVs and ROHs detected from SNP genotyping data for the majority of diploid species. The functions provided in this package fall into five categories: Conversion, Summary, Annotation, Comparison and Visualization. The most useful features are: integrating summarized results, generating lists of CNVRs, annotating the results with known gene positions, plotting CNVR distribution maps, and producing customised visualisations of CNVs and ROHs with gene and other related information on one plot. The package also supports a range of customisations, including the colour and size of high-resolution figures and the output folder, which avoids conflicts between the results of different runs. Working through the functions detailed in the vignette makes it easier to identify and explore the most interesting genomic regions.

Vignettes and Manual

For detailed examples, please visit our GitHub Pages site: https://jh-zhou.github.io/HandyCNV/

Installation and Prerequisites

To run this package, first make sure that R (version >= 3.5.2) is installed on your computer (R download link: https://www.r-project.org/). Once R is installed, the 'HandyCNV' package can be installed from its GitHub repository by running the following script. If you rarely use R, installing 'HandyCNV' for the first time may take longer.

1. Method one: install directly from the GitHub repository

install.packages("remotes") # Run this line if you have not installed the 'remotes' package before
remotes::install_github(repo = "JH-Zhou/HandyCNV@v1.1.7")

2. Method two, install manually

If the first method does not work for some reason, you can manually download the 'Source code (zip)' from the latest release tag here: Download Source Code

Then install the source code from the local path with the following code:

install.packages("remotes") # Run this line if you have not installed the 'remotes' package before
remotes::install_local(path = "C:/Users/HandyCNV-1.1.7.zip") # Replace 'C:/Users/' with the local path where you downloaded the source code

Then, we need to load the 'HandyCNV' package in order to run the following examples. This can be done using the library function as shown below.

library(HandyCNV)
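The prerequisites and install steps above can be combined into a guard that installs only what is missing. The following is a minimal sketch, not part of HandyCNV: the helper names `r_version_ok` and `install_handycnv_if_needed` are hypothetical, while the minimum R version and the repository string are the ones given in this README.

```r
# Minimum R version stated in this README.
min_r_version <- "3.5.2"

# TRUE when the running R session meets the minimum version.
r_version_ok <- function(min_version = min_r_version) {
  getRversion() >= package_version(min_version)
}

# Hypothetical helper (not part of HandyCNV): install only what is missing.
install_handycnv_if_needed <- function(repo = "JH-Zhou/HandyCNV@v1.1.7") {
  if (!r_version_ok()) {
    stop("HandyCNV needs R >= ", min_r_version, "; found ", getRversion())
  }
  if (!requireNamespace("remotes", quietly = TRUE)) {
    install.packages("remotes")
  }
  if (!requireNamespace("HandyCNV", quietly = TRUE)) {
    remotes::install_github(repo = repo)
  }
  invisible(TRUE)
}

# install_handycnv_if_needed()   # uncomment to run; needs internet access
# library(HandyCNV)
```

The installation call is left commented out so that sourcing the sketch has no side effects.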

What issues can this package solve?

Click the following links to browse the example output.

1. How do we prepare the standard CNV input file for HandyCNV and get a quick summary?

2. How do we visualise CNVs?

3. What types of summary plots are available?

4. How do we generate CNVRs (CNV Regions) from CNV results?

5. How do we annotate genes for CNVRs and CNVs?

6. How do we plot a CNVR distribution map?

7. How can we plot all high frequency CNVRs at once?

8. Can we compare CNVs between different result sets?

9. What about CNVRs?

10. How do we find the consensus set of genes common to multiple CNV result sets?

11. How can we create the map file used in HandyCNV, to allow comparing CNVs between different reference genomes?

12. How can we integrate the CNVs and annotated genes with additional information, such as Log R ratio, B Allele Frequency, call rate, heterozygosity, missing value rate and Linkage Disequilibrium, and plot it as one figure?

13. How can we make a plot to show the source of the CNVs?

14. How can we plot just the genes in a specific region, to save as a separate figure?

15. How can we find regions with high frequencies of runs of homozygosity?

16. How do we visualize ROHs?

17. How do we get the haplotypes for an ROH region?

18. How do we convert coordinates for CNV, CNVR, ROH, or any other intervals?
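As background for question 4, a CNV region (CNVR) is commonly formed by merging CNVs that overlap on the same chromosome. The following base R sketch illustrates that general idea on toy data. It is not HandyCNV's `call_cnvr` implementation, and the column names and the helper `merge_intervals` are assumptions made for illustration.

```r
# Toy CNV list; the column names are illustrative, not HandyCNV's input format.
cnv <- data.frame(
  chr   = c(1, 1, 1, 2),
  start = c(100, 150, 400, 50),
  end   = c(200, 300, 500, 80)
)

# Merge overlapping intervals on one chromosome into regions.
merge_intervals <- function(d) {
  d <- d[order(d$start, d$end), ]
  # A new region starts when a CNV begins after every earlier CNV has ended.
  prev_max_end <- c(-Inf, head(cummax(d$end), -1))
  region <- cumsum(d$start > prev_max_end)
  data.frame(
    chr   = d$chr[1],
    start = as.numeric(tapply(d$start, region, min)),
    end   = as.numeric(tapply(d$end, region, max)),
    n_cnv = as.numeric(tapply(d$start, region, length))
  )
}

# Apply the merge per chromosome and stack the results.
cnvr <- do.call(rbind, lapply(split(cnv, cnv$chr), merge_intervals))
rownames(cnvr) <- NULL
print(cnvr)
```

On this toy input the first two CNVs on chromosome 1 overlap and collapse into one region spanning 100 to 300, while the other two CNVs each form their own region.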

Feature request

If you have any special requirements for this package, please feel free to submit them via this link: Submit Requirements. We are happy to add new features to meet your needs.

Bug report

If you find any errors while using this package, please tell us via this link: Bug Report. We will fix them as soon as possible.

Citation

If this tool is useful for your academic research, please cite our publication: Browse publication

Citation: Zhou J, Liu L, Lopdell TJ, Garrick DJ and Shi Y (2021). HandyCNV: Standardized Summary, Annotation, Comparison and Visualization of CNV, CNVR, and ROH. Front. Genet. 12:731355. doi: 10.3389/fgene.2021.731355

Current release: HandyCNV v1.1.7 Release Date: 2021/09/24

What's new

  1. New feature to visualize the haplotype generated by get_haplotype()
haplo_visual(haplotype = haplotype_letter, xlab_text = "BMP7 Gene ")

Fig. Haplotype of BMP7 Gene

  2. Unified the format of some inputs and outputs

Previous release: HandyCNV v1.1.6 Release Date: 2021/09/01

What's new

  1. Added autosomal boundary data for more species to the 'cnvr_plot' function, which uses them to plot the CNVR map. It now supports hg38 and hg19 for human, UMD3.1 and ARS-UCD1.2 for cattle, Oar_v4.0 for sheep, Sscrofa11.1 for pig, galGal6 for chicken, EquCab3.0 for horse and UMICH_Zoey_3.1 for dog.
  2. Added conditions to automatically control the number of x-axis labels shown in the figure 'The Number of CNVs Detected per Individual'.
  3. Updated the GitHub Pages site with vignettes.

Previous release: HandyCNV v1.1.5 Release Date: 2021/08/29

What's new

Minor modifications, such as unifying input file formats and correcting spelling errors.

Previous release: HandyCNV v1.1.4 Release Date: 2021/07/23

What's new

Major improvements:

  1. Most functions now accept an object in the R environment as input, in addition to a file;
  2. Most functions now return their main output as an object to the R environment for further operations;
  3. New function 'get_samples' to extract sample IDs by searching for a gene of interest in the CNV annotation list.

Minor changes:

  1. The 'call_cnvr' function now supports generating CNVRs from a CNV list that contains chromosomes without CNV information;
  2. Added links to the Horse_quCab2.0 reference genome and the sheep 'oviAri3' reference genome to the 'get_refgene' function;
  3. Set up a standard table so that the comparison plot in the 'compare_cnvr' function can be drawn when a group is empty;
  4. Added '-' as the separator between the two recoded haplotypes in the 'get_haplotype' function.

Previous release: HandyCNV v1.1.3 Release Date: 2021/05/26

What's new

  1. New function to plot SNP density from a SNP genotyping map.
plot_snp_density(map = "convert_map/target_plink.map", 
                 max_chr = 24, #optional
                 top_density = 60, #optional 
                 low_density = 20, #optional
                 color_top = "red", #optional
                 color_low = "blue", #optional
                 color_mid = "black", #optional
                 legend_position = c(0.9, 0.1), #optional 
                 x_label = "Physical position\n物理位置", #optional
                 y_label = "SNPs/Mb\n每1Mb区间的SNP数",#optional
                 ncol_1 = 5) 
#save the plot by 'ggsave'
#ggsave(filename = "snp_density.png", width = 26, height = 18, units = "cm", dpi = 350)

Fig. Demo SNP density distribution

  2. Revised the CNV status distribution plot in the 'cnv_summary_plot' function so that the boxplot and line are still drawn for chromosomes that have no CNVs.
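The y-axis label in the SNP density demo above is SNPs/Mb. As a rough illustration of that quantity, the following base R sketch counts SNPs per 1 Mb window from a toy table laid out in the PLINK .map column order (chromosome, SNP ID, genetic distance, base-pair position). It is only an assumption about how such counts can be derived, not the code inside `plot_snp_density`, and all data values are made up.

```r
# Toy map in PLINK .map column order: chromosome, SNP ID, genetic distance, bp position.
map <- data.frame(
  chr = c(1, 1, 1, 1, 2, 2),
  snp = paste0("snp", 1:6),
  cm  = 0,
  pos = c(100000, 900000, 1200000, 2500000, 300000, 400000)
)

# Assign each SNP to a 1 Mb window (window 0 covers positions below 1,000,000).
map$window_mb <- map$pos %/% 1e6

# Count SNPs per chromosome and window; windows without SNPs are not listed.
density <- aggregate(pos ~ chr + window_mb, data = map, FUN = length)
names(density)[names(density) == "pos"] <- "snps_per_mb"
density <- density[order(density$chr, density$window_mb), ]
print(density)
```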

Previous release: HandyCNV v1.1.2 Release Date: 2021/04/18

What's new

Corrected the version number.

Released Version: HandyCNV v1.1.1 Release Date: 2021/04/14

What's new

1. Update call_cnvr.R …

  1. Previously the CNV list could only be loaded from a local directory through a path; the function now also reads data from the working environment by checking the type of the input;
  2. Supports returning the CNVR list to the working environment.
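The type check described in item 1 can be pictured as a small dispatcher that accepts either a file path or a data frame. This is a minimal sketch under stated assumptions, not the code in call_cnvr.R: the helper `read_cnv_input`, the tab-separated format and the column names are all hypothetical.

```r
# Hypothetical helper, not HandyCNV's code: accept a file path or a data frame.
read_cnv_input <- function(x) {
  if (is.data.frame(x)) {
    return(x)  # already loaded in the working environment
  }
  if (is.character(x) && length(x) == 1 && file.exists(x)) {
    return(read.table(x, header = TRUE, sep = "\t", stringsAsFactors = FALSE))
  }
  stop("Input must be a data frame or a path to an existing file.")
}

# Usage with a temporary tab-separated file and with an in-memory object.
cnv <- data.frame(Sample_ID = c("A", "B"), Chr = c(1, 2))
tmp <- tempfile(fileext = ".txt")
write.table(cnv, tmp, sep = "\t", quote = FALSE, row.names = FALSE)

from_path   <- read_cnv_input(tmp)
from_object <- read_cnv_input(cnv)
```

Both calls yield a data frame with the same columns, so downstream steps do not need to know where the data came from.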

2. Update cnv_clean.R …

  1. Supports returning the clean CNV list to the working environment.

3. Update cnv_visualising.R …

  1. Supports loading the CNV list from the working environment.

4. Update compare_cnvr.R …

  1. Added a new function that generates the unique and mutual CNVRs by uniting the overlapping CNVRs between two results. This serves two purposes: to better understand the overlapping CNVRs, and to mark the common regions on the CNVR distribution map in the 'cnvr_plot' function;

5. Update 'cnvr_plot' …

  1. Added the 'overlap_cnvr' argument to mark overlapping regions on the CNVR distribution map;
  2. Added the 'label_prop' argument to show, on the CNVR map, the proportion of each chromosome's total length that is covered by CNVRs;
  3. Added the 'chr_col' argument to customize the color of chromosomes;
  4. Added the 'overlap_col' argument to customize the color of overlapping CNVRs;
  5. Reduced the margin of the final CNVR distribution map.
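The proportion that 'label_prop' displays (item 2) is CNVR length relative to chromosome length. The base R sketch below shows that arithmetic on toy data. It is an illustration only, not the code inside `cnvr_plot`: the coordinates and chromosome lengths are made up, the column names are assumptions, and the CNVRs are assumed not to overlap each other.

```r
# Toy CNVR list and chromosome lengths; all values are made up for illustration.
cnvr <- data.frame(
  chr   = c(1, 1, 2),
  start = c(1000001, 5000001, 2000001),
  end   = c(2000000, 5500000, 3000000)
)
chr_length <- data.frame(chr = c(1, 2), length = c(10e6, 5e6))

# Total CNVR length per chromosome (inclusive coordinates).
cnvr$len <- cnvr$end - cnvr$start + 1
total <- aggregate(len ~ chr, data = cnvr, FUN = sum)

# Proportion of each chromosome covered by CNVRs, as a percentage.
prop <- merge(total, chr_length, by = "chr")
prop$percent <- round(100 * prop$len / prop$length, 2)
print(prop)
```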

New feature demo

# Demo code:
cnvr_plot(cnvr = "./cnvr_combine_part_penn_umd/cnvr.txt", assembly = "UMD", 
          sample_size =  388, common_cnv_threshold = 0.05, 
          overlap_cnvr = "./compare_cnvr_Penn_UMD_Vs_Part_UMD/common_cnvr.txt", 
          gain_col = "deeppink1", loss_col = "deepskyblue3", mixed_col = "springgreen3", 
          folder = "./cnvr_combine_part_penn_umd/cnvr_map_common")

Fig. 2 CNVR Map

6. Update compare_gene.R …

  1. Added the 'color_label' argument to set the label color of genes that passed the common threshold in the heatmap.
