Usage: get_gens_dfs.py
Kathleen Keough edited this page May 22, 2018
get_gens_dfs.py generates a table (tsv file) listing all variants in a defined interval for a specified individual, based on an input VCF file. It reformats the genotypes from the VCF so that they are easier to process later when designing sgRNAs.
get_gens_dfs.py <vcf_file> <locus> <out> [-fv] [--bed] [--chrom]
python3 get_gens_dfs.py \
INPUT.vcf.gz \
1:11980181-12013515 \
OUT_GENS

python3 get_gens_dfs.py \
INPUT.vcf.gz \
loci.bed \
OUT_multi_loci_gens \
--bed
where the loci.bed file is formatted like so:
1 11976269 12018380 MFN2
7 76298036 76308038 HSPB1
11 61940001 61963675 BEST1
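
A BED file like the one above can also be written programmatically. The following is a minimal sketch, not part of AlleleAnalyzer: the helper name `write_bed` is hypothetical, and the tab delimiter and absence of a header row are assumptions based on the usual BED convention and the example shown here.

```python
import csv

# Loci copied from the example above: (chromosome, start, stop, ID).
loci = [
    ("1", 11976269, 12018380, "MFN2"),
    ("7", 76298036, 76308038, "HSPB1"),
    ("11", 61940001, 61963675, "BEST1"),
]


def write_bed(path, rows):
    """Write (chrom, start, stop, id) rows as tab-delimited lines.

    Assumption: no header row and tab separators, as in standard BED files.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for chrom, start, stop, name in rows:
            # Reject intervals that are empty or reversed.
            if start >= stop:
                raise ValueError(f"start must be smaller than stop for {name}")
            writer.writerow([chrom, start, stop, name])


write_bed("loci.bed", loci)
```

The resulting `loci.bed` can then be passed in place of the locus together with `--bed`.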
Arguments: | Details |
---|---|
vcf_file | BCF/VCF file with genotypes. Files should be gzipped (using bcftools or bgzip) and include an index (using bcftools or tabix). |
locus | Locus from which to pull variants, in the format chromosome:start-stop, or a BED file if --bed is specified. |
out | Name of the output file, including the directory in which to save it. The output is an .h5 file. Do not include the extension. |
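
To illustrate the `chromosome:start-stop` locus format, here is a minimal sketch of how such a string could be validated before it is passed to the script. The helper `parse_locus` and its regular expression are hypothetical and are not taken from get_gens_dfs.py, which may parse the locus differently.

```python
import re

# Assumed shape of a locus string: <chromosome>:<start>-<stop>
LOCUS_PATTERN = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<stop>\d+)$")


def parse_locus(locus):
    """Split a locus string into (chromosome, start, stop)."""
    match = LOCUS_PATTERN.match(locus)
    if match is None:
        raise ValueError(f"expected chromosome:start-stop, got {locus!r}")
    start, stop = int(match["start"]), int(match["stop"])
    # A reversed interval is almost certainly a typo.
    if start > stop:
        raise ValueError(f"start {start} is greater than stop {stop}")
    return match["chrom"], start, stop


print(parse_locus("1:11980181-12013515"))  # ('1', 11980181, 12013515)
```
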
Options: | Details |
---|---|
-f | If this option is specified, homozygous variants are kept in the output file. |
-v | Verbose mode. |
--bed | Indicates that a BED file is being used in place of a locus. BED files are expected to include the CHROM, START, STOP, and ID columns. |
--chrom | Run on entire chromosome. |
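
When the script is driven from another Python program, the arguments and options above can be assembled into a command list. This is a sketch only: `build_command` and its keyword names are hypothetical, and it assumes get_gens_dfs.py is in the working directory and is run with `python3`, as in the examples earlier on this page.

```python
import shlex


def build_command(vcf_file, locus, out, keep_homozygous=False,
                  verbose=False, bed=False, chrom=False):
    """Assemble the argument list following the usage line on this page."""
    cmd = ["python3", "get_gens_dfs.py", vcf_file, locus, out]
    if keep_homozygous:
        cmd.append("-f")       # keep homozygous variants
    if verbose:
        cmd.append("-v")       # verbose mode
    if bed:
        cmd.append("--bed")    # the locus argument is a BED file
    if chrom:
        cmd.append("--chrom")  # run on the entire chromosome
    return cmd


cmd = build_command("INPUT.vcf.gz", "loci.bed", "OUT_multi_loci_gens", bed=True)
print(shlex.join(cmd))
# python3 get_gens_dfs.py INPUT.vcf.gz loci.bed OUT_multi_loci_gens --bed
```

The list can be handed to `subprocess.run(cmd, check=True)` to execute it; passing a list rather than a single string avoids shell quoting problems with file names.
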
AlleleAnalyzer. Keough et al. 2019, Genome Biology.