This R package distributes the RecView ShinyApp, which provides a user-friendly GUI for viewing and locating recombination positions on chromosomes using pedigree data.
devtools::install_github("HKyleZhang/RecView")
Function | Description |
---|---|
make_012gt() | Formats the genotype file for RecView. |
make_012gt_from_vcf() | Formats the genotype file from a VCF file for RecView. |
run_RecView_App() | Invokes RecView. RecView provides options to save the result figures and tables to your current working directory. |
File | Description |
---|---|
Genotype file | This file can be generated with make_012gt() (or make_012gt_from_vcf()). |
Scaffold file | A .csv file giving the order and orientation of the reference genome scaffolds. It must have the following columns (names are case sensitive): scaffold, size, CHR, order, orientation. Note: with a chromosome-level assembly, the scaffold and CHR columns can hold identical values, but both columns must still be present. |
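
As a rough illustration, a scaffold file with the required columns could be written from R as shown here. The scaffold names, sizes, chromosome labels, the file name `scaffold_example.csv`, and the `+`/`-` orientation encoding are all placeholders assumed for this sketch; check the package documentation for the values RecView actually expects.

```r
# Hypothetical scaffold table; every value below is a placeholder.
scaffold_tbl <- data.frame(
  scaffold    = c("scaffold_12", "scaffold_3", "scaffold_7"),
  size        = c(1500000, 820000, 2300000),  # scaffold lengths in bp
  CHR         = c("chr1", "chr1", "chr2"),    # chromosome each scaffold belongs to
  order       = c(1, 2, 1),                   # rank of the scaffold within its chromosome
  orientation = c("+", "-", "+"),             # assumed encoding, not confirmed by the source
  stringsAsFactors = FALSE
)

# Column names are case sensitive, so verify them before writing.
required_cols <- c("scaffold", "size", "CHR", "order", "orientation")
stopifnot(identical(names(scaffold_tbl), required_cols))

# Written to a temporary folder here; in practice, write it to your working directory.
out_file <- file.path(tempdir(), "scaffold_example.csv")
write.csv(scaffold_tbl, out_file, row.names = FALSE)
```

For a chromosome-level assembly, the same sketch applies with the `scaffold` and `CHR` columns holding identical values.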
- Choose offspring(s): Choose the offspring for the analysis. Multiple selection is supported.
- Choose chromosome(s): Choose the chromosome(s) for the analysis. Multiple selection is supported.
- Locate recombination positions? Check 'Yes' to locate recombination positions with either of two algorithms (see below).
- Algorithms (optional):
  - PD: The Proportional Difference algorithm takes a window size (the number of informative SNPs in each flanking window), a step value (k) giving the number of SNPs between calculated positions, and a threshold that triggers denser calculations (at every SNP) to detect local maxima.
  - CCS: The Cumulative Continuity Score algorithm (i) calculates a CCS for each position along the chromosome, and (ii) finds putative recombination positions by locating regions where long, continuously increasing slopes of CCSs from one grandparent-of-origin are replaced by long, continuously increasing slopes of CCSs from the other grandparent.
- Radius value (PD optional): the number of informative SNPs around the examined position used to calculate the proportion of informative SNPs from specific grandparents.
- Step value (PD optional): the step size for moving along the chromosome. Larger values reduce the number of positions examined and so speed up the analysis.
- Finer step value (PD optional): the step size for moving along the chromosome once the absolute difference in the proportion of grandparent-of-origin rises above the threshold. Larger values reduce the number of positions examined and so speed up the analysis.
- Threshold (PD optional): the value that triggers the finer step and is later used to filter the local maxima for effectively true recombination.
- Threshold (CCS optional): the minimum CCS for a recombination to be considered effectively true. Larger values are more stringent and capture crossovers, whereas smaller values capture both crossovers and non-crossovers. However, small values can also capture recombination artefacts caused by wrongly called genotypes.
- Saving options (optional):
  - GoO Inference: saves the grandparent-of-origin inferences for the selected offspring(s) as csv file(s), separately for each selected chromosome, in the current working directory.
  - Plots: saves the result figures for the selected offspring(s) as pdf file(s), separately for each selected chromosome, in the current working directory.
  - Locations: when "Locate recombination positions?" is set to 'Yes', saves the table of putative recombination locations in the selected offspring(s) as csv file(s), separately for each selected chromosome, in the current working directory.
- Run analysis button: starts the analysis.
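
To make the two algorithm descriptions more concrete, here is a minimal sketch of the underlying ideas on toy data. It is not RecView's implementation: the function names `pd_at`, `pd_scan` and `ccs_scores`, the `"A"`/`"B"` labels for the two grandparents, and the simplification to a single coarse step (no threshold-triggered finer step) are all assumptions made for illustration.

```r
# goo: grandparent-of-origin call ("A" or "B") at each informative SNP,
# in chromosome order. Toy input; illustrative only, not RecView's code.

# Proportional difference at SNP i: absolute difference between the share
# of "A" calls in the left flank and in the right flank (radius SNPs each).
pd_at <- function(goo, i, radius) {
  left  <- goo[(i - radius):(i - 1)]
  right <- goo[(i + 1):(i + radius)]
  abs(mean(left == "A") - mean(right == "A"))
}

# Evaluate the difference every `step` SNPs, skipping positions that
# lack a full flank on either side.
pd_scan <- function(goo, radius, step) {
  stopifnot(length(goo) >= 2 * radius + 1)
  pos <- seq(radius + 1, length(goo) - radius, by = step)
  data.frame(
    pos = pos,
    pd  = vapply(pos, function(i) pd_at(goo, i, radius), numeric(1))
  )
}

# Continuity score: running count of consecutive SNPs with the same origin.
# The count restarts at 1 whenever the origin switches.
ccs_scores <- function(goo) {
  sequence(rle(goo)$lengths)
}

# Toy chromosome: 30 SNPs from grandparent A followed by 30 from B.
goo  <- c(rep("A", 30), rep("B", 30))
scan <- pd_scan(goo, radius = 10, step = 5)
peak <- scan$pos[which.max(scan$pd)]  # scanned position with the largest difference
ccs  <- ccs_scores(goo)               # climbs within each run, restarts at the switch
```

In this toy case the largest difference falls at the scanned position closest to the A-to-B switch, and the continuity score climbs steadily within each run before restarting, which is the pattern the CCS description refers to as one long increasing slope being replaced by another.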
For large VCF files, Workflow A is recommended.
Workflow A:
- Use the --extract-FORMAT-info GT option in VCFtools to extract genotypes into a single file.
- Use make_012gt() to format the genotype file.
- Prepare the scaffold file.
- In RStudio, navigate to the working directory where the genotype file and scaffold file are stored.
- In RStudio, start the RecView ShinyApp with run_RecView_App(); continue with the settings and run the analysis.
Workflow B:
- Use make_012gt_from_vcf() to format the genotype file directly from the VCF file.
- Prepare the scaffold file.
- In RStudio, navigate to the working directory where the genotype file and scaffold file are stored.
- In RStudio, start the RecView ShinyApp with run_RecView_App(); continue with the settings and run the analysis.
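
The last two steps of either workflow could look roughly like the sketch below. The file names `genotype_012.csv` and `scaffold.csv` are hypothetical placeholders, and the pre-launch check is an addition for illustration only; the only package call used is run_RecView_App() with no arguments, as listed in the function table.

```r
# Hypothetical file names; replace them with your own genotype and scaffold files.
work_dir    <- getwd()  # or the folder that holds your input files
input_files <- c("genotype_012.csv", "scaffold.csv")

# Report any input file that is not present in the working directory.
missing_files <- input_files[!file.exists(file.path(work_dir, input_files))]
if (length(missing_files) > 0) {
  message("Missing in ", work_dir, ": ", paste(missing_files, collapse = ", "))
}

# Launch the app only in an interactive session, with all inputs present
# and RecView installed.
if (interactive() && length(missing_files) == 0 &&
    requireNamespace("RecView", quietly = TRUE)) {
  setwd(work_dir)
  RecView::run_RecView_App()
}
```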
Zhang, H., Hansson, B. RecView: an interactive R application for locating recombination positions using pedigree data. BMC Genomics 24, 712 (2023). https://doi.org/10.1186/s12864-023-09807-2.
- Enable inferring grandparent-of-origin when genotypes of some individuals are missing at all or some sites.
- Enable preview when multiple offspring and chromosomes are selected for analysis.
- Show number of informative sites in GoO figure.
- Reduce RAM usage by changing the way of loading input files.
- Reduce running time for the PD algorithm.
- version 1.0.0
devtools::install_github("HKyleZhang/RecView@v1.0.0")