Site Frequency Spectrum
This method calculates a site frequency spectrum using ANGSD. Please see ANGSD's tutorial page.
To run this method, use the following command:

angsd-wrapper SFS Site_Frequency_Spectrum_Config

where Site_Frequency_Spectrum_Config is the full path to the configuration file for the site frequency spectrum. All inputs should be specified in Site_Frequency_Spectrum_Config.
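Because the command expects a full path, it can help to check the argument before launching a run. The following is a minimal sketch, not part of angsd-wrapper: `check_config` is a hypothetical helper name, and the path in the usage comment is a placeholder.

```shell
# Hypothetical pre-flight check; angsd-wrapper does not provide this function.
# Succeeds only if the argument is an absolute path to a readable file.
check_config() {
    case "$1" in
        /*) ;;                                           # absolute path: keep checking
        *)  echo "Not a full path: $1" >&2; return 1 ;;  # relative path: reject
    esac
    if [ ! -r "$1" ]; then
        echo "Cannot read: $1" >&2
        return 1
    fi
}

# Example usage (placeholder path):
# check_config /path/to/Site_Frequency_Spectrum_Config \
#     && angsd-wrapper SFS /path/to/Site_Frequency_Spectrum_Config
```

The check runs before ANGSD starts, so a mistyped or relative path fails immediately rather than partway through a job.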
This method uses variables from Common_Config; those that are used are listed below:
Variable | Function |
---|---|
SAMPLE_LIST (GROUP_SAMPLES on dev) | A list of samples to be used in calculations |
SAMPLE_INBREEDING (GROUP_INBREEDING on dev) | A list of inbreeding coefficients, where each line corresponds to a line in SAMPLE_LIST (GROUP_SAMPLES on dev) |
ANC_SEQ | Path to ancestral sequence |
REF_SEQ | Path to reference sequence |
PROJECT | Name given to all outputs in ANGSD-wrapper |
SCRATCH | Place to store files; the full path is SCRATCH/PROJECT/SFS |
REGIONS | Limit the scope of ANGSD-wrapper to certain regions |
UNIQUE_ONLY | Use uniquely mapped reads only |
MIN_BASEQUAL | Minimum base quality score |
BAQ | Adjust Q scores around indels |
MIN_IND | Minimum number of individuals needed to use this site |
GT_LIKELIHOOD | Estimate genotype likelihoods |
MIN_MAPQ | Minimum mapping quality |
N_CORES | Number of cores to use; please do not set above the limits of your system |
DO_MAJORMINOR | Estimate major/minor alleles |
DO_GENO | Perform genotype calling |
DO_MAF | Calculate per-site frequencies |
DO_POST | Calculate the posterior probability using per-site frequencies |
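Since each line of the inbreeding file must correspond to a line of the sample list, a quick line-count comparison can catch a mismatch early. This is a sketch under assumptions: `same_line_count` is a hypothetical helper, and it assumes both files are plain text with one entry per line.

```shell
# Hypothetical helper; not part of angsd-wrapper.
# $1 = sample list file, $2 = inbreeding coefficients file
# Succeeds when both files contain the same number of lines.
same_line_count() {
    # grep -c '' counts every line, including a final line with no newline
    n_samples=$(grep -c '' "$1")
    n_inbreeding=$(grep -c '' "$2")
    [ "$n_samples" -eq "$n_inbreeding" ]
}

# Example usage (placeholder file names):
# same_line_count samples.txt inbreeding.txt || echo "Line counts differ" >&2
```

Equal line counts do not prove the two files are in the same order, so this check only catches missing or extra entries.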
This method has no method-specific variables.
The parameters for this method can be tweaked as necessary; they have been set to work well in the general case:
Parameter | Function |
---|---|
DO_SAF | Creates a site frequency spectrum |
OVERRIDE | If true, will recalculate files that already exist |
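The behavior described for OVERRIDE can be pictured as a small decision rule: a file is computed when it is missing, or when OVERRIDE is true. The sketch below only illustrates that rule and is not taken from the wrapper's source; `needs_run` is a hypothetical name.

```shell
# Hypothetical illustration of the OVERRIDE rule; not the wrapper's own code.
# $1 = output file, $2 = value of OVERRIDE
# Succeeds when the file should be (re)computed.
needs_run() {
    # Recompute if OVERRIDE is "true", or if the output does not exist yet
    [ "$2" = "true" ] || [ ! -e "$1" ]
}

# Example usage (placeholder file name):
# if needs_run out.saf.gz "$OVERRIDE"; then echo "computing"; else echo "skipping"; fi
```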
Naming Scheme | Contents |
---|---|
PROJECT_DerivedSFS.graph.me | Final site frequency spectrum |
PROJECT_SFSOut.arg | Details of arguments |
PROJECT_SFSOut.geno.gz | Genotype calls |
PROJECT_SFSOut.mafs.gz | Minor allele frequencies |
PROJECT_SFSOut.saf.gz | Intermediate site frequency spectrum |
PROJECT_SFSOut.saf.idx | Index of intermediate site frequency spectrum |
PROJECT_SFSOut.saf.pos.gz | Position data of the saf file |
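Combining the SCRATCH/PROJECT/SFS output location with the naming scheme gives the expected location of the final spectrum. This is a sketch assuming the output files are written directly into that directory; `final_sfs_path` is a hypothetical helper, and `/scratch` and `Demo` are placeholder values.

```shell
# Hypothetical helper; assumes outputs are written directly to SCRATCH/PROJECT/SFS.
# $1 = SCRATCH, $2 = PROJECT
final_sfs_path() {
    printf '%s/%s/SFS/%s_DerivedSFS.graph.me\n' "$1" "$2" "$2"
}

# Example with placeholder values:
final_sfs_path /scratch Demo
# prints /scratch/Demo/SFS/Demo_DerivedSFS.graph.me
```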
PROJECT_DerivedSFS.graph.me can be visualized with the Shiny graphing interface. A web browser with a graphical user interface is required.
Newer versions of ANGSD support estimating the SFS for less well-developed genomes (for example, those with no ancestral sequence available) by using the reference sequence to approximate the folded SFS, following this methodology. To use this within the wrapper, leave the ANC_SEQ variable blank in the config file and assign the other variables as usual.
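The choice described above can be sketched as a simple fallback: with ANC_SEQ set, the ancestral sequence is used; with ANC_SEQ blank, the reference sequence stands in and the spectrum is treated as folded. This only illustrates the decision and is not the wrapper's actual code; `choose_sfs_mode` and the paths are hypothetical.

```shell
# Hypothetical illustration of the blank-ANC_SEQ fallback; not the wrapper's own code.
# $1 = ANC_SEQ (may be empty), $2 = REF_SEQ
# Prints the SFS mode followed by the sequence that would be used.
choose_sfs_mode() {
    if [ -n "$1" ]; then
        printf 'unfolded %s\n' "$1"   # ancestral sequence available
    else
        printf 'folded %s\n' "$2"     # fall back to the reference sequence
    fi
}

# Example with placeholder paths:
choose_sfs_mode "" /path/to/reference.fa
# prints folded /path/to/reference.fa
```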