A Python tool for detecting and analyzing UniParental Disomy (UPD) events in Next Generation Sequencing (NGS) trio data.
UniParental Disomy (UPD) occurs when an individual inherits both copies of a chromosome from a single parent and none from the other. UPD manifests in two forms:
- Heterodisomy: Inheritance of non-identical chromosomes from one parent
- Isodisomy: Inheritance of a duplicate chromosome from one parent
UPDOG analyzes VCF files containing trio data (child and both parents) to identify and visualize potential UPD events through:
- Calculating chromosome segment metrics
- Detecting statistically significant Mendelian errors
- Generating visualizations and detailed reports
pip install pyupdog
Basic command:
UPDog.py --vcf <input.vcf.gz> --proband_id <sample_id> --ped <family.ped> --output <output_prefix>
Full options:
UPDog.py \
--vcf <input.vcf.gz> # Input VCF (must be bgzipped and tabixed)
--proband_id <sample_id> # Sample ID of the proband in VCF
--ped <family.ped> # PED file describing family relationships
--output <output_prefix> # Output name prefix
--min_dp <int> # Minimum genotype depth (default: 20)
--min_gq <int> # Minimum genotype quality (default: 20)
--min_qual <int> # Minimum QUAL value (default: 90)
--block_size <int> # Block size for UPD calculation (default: 1000000)
--min_variants_per_block <int> # Minimum variants per block (default: 100)
--p_value <float> # Max P-value for significance (default: 0.001)
--min_blocks <int> # Minimum contiguous blocks for calls (default: 5)
--min_proportion <float> # Minimum UPD variant proportion (default: 0.01)
--chromosome <str> # Restrict to single chromosome (testing only)
--prop_plot <bool> # Plot variant proportions aka BAF plot (default: False)
--wes <bool> # BAF plot not downsampled - better for WES data (default: False)
- Segmentation: Examines each chromosome in fixed-size segments (default 1MB)
- Metric Calculation: Computes proportion of Mendelian errors per segment
- Baseline Determination: Calculates expected Mendelian error rate across all chromosomes
- Statistical Analysis: Identifies segments with significant increases in Mendelian errors
- Merging: Combines adjacent significant segments
- Filtering: Applies quality filters to generate final UPD calls
*_UPD_calls.csv
: Tab-separated file containing significant UPD events*_raw_data.csv
: Raw metrics for all analyzed segments*_chr*_UPD.png
: UPD metric plots per chromosome*_chr*_baf.png
: B-allele frequency plots (if --prop_plot enabled)