Allele Association Analysis
Methods for association analysis between HLA alleles and diseases.
--file input0.txt [Mandatory]
--assoc [Mandatory]
--digit 4 [Default]
--test fisher [Default]
--model allelic [Default]
--freq 0 [Default]
--adjust FDR [Default]
--out output.txt [Default]
--print [Optional]
--perm N [Optional]
--seed S [Optional]
--exclude EXCLUDE.txt [Optional]
--covar COVAR.txt [Optional, for logistic and linear regression only]
--covarname COVARNAME [Optional, for logistic and linear regression only]
--digit sets the resolution of the association test: two, four, or six digits. When two digits are used, alleles such as A*02:01 and A*02:06 are combined as A*02. The default value is 4.
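The effect of the digit setting can be illustrated with a minimal sketch. The helper name `truncate_allele` is hypothetical and is not part of the tool; it only assumes allele names of the form GENE*field:field:... as shown above.

```python
def truncate_allele(allele, digits=4):
    """Reduce an HLA allele name to 2-, 4-, or 6-digit resolution.

    Assumes a well-formed name such as 'A*02:01' (gene, '*', colon-separated
    two-digit fields). Hypothetical helper for illustration only.
    """
    if digits not in (2, 4, 6):
        raise ValueError("digits must be 2, 4, or 6")
    gene, _, fields = allele.partition("*")
    kept = fields.split(":")[: digits // 2]  # each field contributes two digits
    return gene + "*" + ":".join(kept)


# At two-digit resolution, A*02:01 and A*02:06 fall into the same group.
print(truncate_allele("A*02:01", 2))
print(truncate_allele("A*02:06", 2))
print(truncate_allele("A*01:01:02", 4))
```

An allele that already has fewer fields than requested is returned unchanged.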
--test selects one of the following tests:
chisq Pearson chi-squared test (for disease traits, 2 x 2 contingency table)
fisher Fisher's exact test (for disease traits, 2 x 2 contingency table)
logistic Logistic regression (for disease traits)
linear Linear regression (for quantitative traits)
raw Pearson chi-squared test (for disease traits, 2 x m contingency table)
score Score test proposed by Galta et al. (2005) (for disease traits)
delta Population frequency difference between cases and controls (for disease traits, Fisher's exact test)
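To make the 2 x 2 case concrete, here is a minimal pure-Python sketch of a two-sided Fisher's exact test. It is a textbook formulation (sum the probabilities of all tables that are no more likely than the observed one), not the tool's own code, and the function name `fisher_exact_two_sided` is an assumption.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test for a 2 x 2 table [[a, b], [c, d]].

    Illustrative sketch only; the tool's implementation may differ.
    """
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Rows: cases, controls. Columns: test allele, other alleles.
print(fisher_exact_two_sided([[3, 1], [1, 3]]))
```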
When linear or logistic regression is used, each genotype is coded by the number of copies of the test allele. For example, if A*01:01 is the test allele, then A*01:01 A*01:01 is coded as 2, A*01:01 A*01:02 is coded as 1, and A*01:02 A*01:03 is coded as 0.
The default value is fisher.
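The regression coding described above amounts to counting copies of the test allele. A minimal sketch, with the hypothetical helper name `allele_dosage`:

```python
def allele_dosage(genotype, test_allele):
    """Count how many of the two alleles in a genotype match the test allele.

    Hypothetical helper for illustration; returns 0, 1, or 2.
    """
    return sum(1 for allele in genotype if allele == test_allele)


test_allele = "A*01:01"
for genotype in [("A*01:01", "A*01:01"),
                 ("A*01:01", "A*01:02"),
                 ("A*01:02", "A*01:03")]:
    print(genotype, allele_dosage(genotype, test_allele))
```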
When the Pearson chi-squared test or Fisher's exact test is used, one of three genetic models can be specified with --model:
allelic compares one allele against all other alleles grouped together
dom compares individuals who carry the allele against individuals who do not carry it
rec compares individuals who are homozygous for the allele against all other individuals
The default value is allelic.
Note: --model only takes effect when --test chisq or --test fisher is specified.
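The three models differ in what is counted in the 2 x 2 table. The sketch below is one reading of the descriptions above, not the tool's code: the allelic model counts chromosomes, while dom and rec count individuals. The function name `contingency_table` and the example data are assumptions.

```python
def contingency_table(genotypes, is_case, allele, model="allelic"):
    """Build [[case_with, case_without], [control_with, control_without]].

    Illustrative sketch: 'allelic' counts allele copies, 'dom' counts
    carriers, 'rec' counts homozygotes.
    """
    table = [[0, 0], [0, 0]]
    for (a1, a2), case in zip(genotypes, is_case):
        row = 0 if case else 1
        copies = (a1 == allele) + (a2 == allele)
        if model == "allelic":
            table[row][0] += copies
            table[row][1] += 2 - copies
        elif model == "dom":
            table[row][0 if copies >= 1 else 1] += 1
        elif model == "rec":
            table[row][0 if copies == 2 else 1] += 1
        else:
            raise ValueError("model must be 'allelic', 'dom', or 'rec'")
    return table


# Hypothetical data: two cases followed by two controls.
genotypes = [("A*01:01", "A*01:01"), ("A*01:01", "A*02:01"),
             ("A*02:01", "A*03:01"), ("A*01:01", "A*03:01")]
is_case = [True, True, False, False]
for model in ("allelic", "dom", "rec"):
    print(model, contingency_table(genotypes, is_case, "A*01:01", model))
```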
--freq takes a value between 0 and 1. Only alleles or allele groups with a frequency higher than this threshold are included in the association analysis. The default value is 0. When --perm is specified, setting --freq to a value above 0 is recommended to reduce permutation time.
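A frequency filter of this kind might look like the sketch below. It assumes the frequency is computed over all chromosomes pooled across individuals; the source does not say whether the tool checks cases and controls separately, so treat that as an assumption. The name `alleles_above_threshold` is hypothetical.

```python
from collections import Counter


def alleles_above_threshold(genotypes, threshold=0.0):
    """Return the alleles whose pooled frequency is strictly above threshold.

    Assumption: frequency = copies of the allele / total number of alleles
    across all individuals. The tool may compute it differently.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele for allele, n in counts.items() if n / total > threshold}


genotypes = [("A*01:01", "A*01:01"), ("A*01:01", "A*02:01"),
             ("A*02:01", "A*03:01"), ("A*01:01", "A*03:01")]
print(sorted(alleles_above_threshold(genotypes, 0.0)))
print(sorted(alleles_above_threshold(genotypes, 0.25)))
```

Because the comparison is strict, an allele whose frequency equals the threshold is dropped.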
--adjust selects the multiple-testing correction (the default value is FDR):
Bonferroni Bonferroni single-step adjusted p-values
Holm Holm (1979) step-down adjusted p-values
FDR Benjamini & Hochberg (1995) step-up FDR control
FDR_BY Benjamini & Yekutieli (2001) step-up FDR control
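For readers unfamiliar with these corrections, the sketch below gives textbook pure-Python versions of the Bonferroni, Holm, and Benjamini-Hochberg adjustments. The function names are assumptions and the tool's own implementation may differ in detail.

```python
def bonferroni(pvals):
    """Single-step adjustment: multiply each p-value by the number of tests."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Step-down adjustment: walk from the smallest p-value upward."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, (m - rank) * pvals[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


def benjamini_hochberg(pvals):
    """Step-up FDR adjustment: walk from the largest p-value downward."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, i in enumerate(order):
        rank = m - k  # 1-based rank of this p-value in ascending order
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(pvals))
print(holm(pvals))
print(benjamini_hochberg(pvals))
```

The Benjamini-Yekutieli variant uses the same step-up procedure with an extra factor of 1 + 1/2 + ... + 1/m applied to each p-value.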
--out sets the name of the output file. The default value is output.txt.
Specifying --print prints all results to the screen; the results are still written to the output file.
--perm sets the number of permutations to perform.
For each permutation run, a simulated dataset is constructed from the original dataset by randomizing the assignment of phenotype status among individuals. The same individuals are used, maintaining the same LD structure and the original case/control ratio.
Only simulated datasets that have the same common alleles in cases and controls as the original dataset are used, so setting --freq to a value greater than zero can speed up the permutation.
--seed sets the random seed for permutation: a number used to initialize the random number generator. By default, the current system time is used.
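The label-shuffling idea and the role of the seed can be sketched as follows. This is an illustration under stated assumptions, not the tool's procedure: the statistic, the function names, and the (hits + 1) / (N + 1) empirical p-value are all assumptions, and the filter on common alleles described above is omitted.

```python
import random


def permute_labels(is_case, rng):
    """Return a shuffled copy of the phenotype labels.

    Genotypes stay with their individuals; only labels move, so the
    case/control ratio is preserved.
    """
    shuffled = list(is_case)
    rng.shuffle(shuffled)
    return shuffled


def carrier_difference(genotypes, is_case, allele="A*01:01"):
    """Hypothetical statistic: |carrier rate in cases - rate in controls|."""
    def rate(flag):
        group = [g for g, c in zip(genotypes, is_case) if c == flag]
        return sum(allele in g for g in group) / len(group)
    return abs(rate(True) - rate(False))


def permutation_p_value(genotypes, is_case, n_perm, seed=None):
    """Empirical p-value from n_perm label permutations (assumed estimator)."""
    rng = random.Random(seed)  # a fixed seed makes the run reproducible
    observed = carrier_difference(genotypes, is_case)
    hits = sum(
        carrier_difference(genotypes, permute_labels(is_case, rng)) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)


genotypes = [("A*01:01", "A*01:01"), ("A*01:01", "A*02:01"),
             ("A*02:01", "A*03:01"), ("A*03:01", "A*03:01")]
is_case = [True, True, False, False]
print(permutation_p_value(genotypes, is_case, n_perm=200, seed=1))
```

Running the function twice with the same seed gives the same p-value; omitting the seed gives a different shuffle sequence on each run.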
--exclude names a file listing the alleles to be excluded, one allele per line. For example:
A*01:01:02
C*01:03
One or more covariates can be included in linear and logistic regression with --covar.
The covariates file is whitespace-delimited (spaces or tabs). The first row is a header. Each subsequent row describes one individual: the first column is the individual ID (IID), and each remaining column holds the measure of one trait.
For example, here are two individuals with three traits:
IID age sex bmi
0001 28 1 20.70
0002 23 0 16.29
Note: Trait names must not contain whitespace.
Note: --covar only takes effect when --test linear or --test logistic is specified.
Note: The individuals in the covariates file do not have to be in the same order as in the genotype input file, and the two files do not have to contain the same number of individuals. Only individuals present in both files are included in the analysis.
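The file layout and the matching rule in the last note can be sketched as follows. The parser below is a minimal illustration, not the tool's reader; `parse_covariates` and the genotype ID list are hypothetical, and all trait values are assumed to be numeric.

```python
def parse_covariates(text):
    """Parse a whitespace-delimited covariates table into {IID: {trait: value}}.

    Assumes the first row is a header starting with IID and that every
    trait value is numeric. Illustrative sketch only.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header, body = rows[0], rows[1:]
    trait_names = header[1:]
    return {row[0]: dict(zip(trait_names, map(float, row[1:]))) for row in body}


COVAR_TEXT = """IID age sex bmi
0001 28 1 20.70
0002 23 0 16.29
"""

covariates = parse_covariates(COVAR_TEXT)

# Hypothetical IDs from the genotype file, in a different order and
# including an individual with no covariates.
genotype_ids = ["0003", "0002"]
common = [iid for iid in genotype_ids if iid in covariates]
print(common)
print(covariates["0002"])
```

IDs are kept as strings so that leading zeros such as in 0001 are not lost.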
##1.12 Covariates name (--covarname)
To select a particular subset of covariates, use the --covarname covarnames option.
covarnames is a string of trait names (taken from the header row of the covariates file) joined by commas (,).
For example,
--covar cov.txt # use all covariates in cov.txt
--covar cov.txt --covarname bmi # only use 'bmi'
--covar cov.txt --covarname age,bmi # use both 'age' and 'bmi'
--covar cov.txt --covarname age,sex,bmi # use all three covariates
Note: If the --covarname covarnames option is not specified, all covariates in cov.txt are used.
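The selection rule can be sketched in a few lines. The helper `select_covariates` is hypothetical, and raising an error for a name that is not in the header is an assumption; the source does not say how the tool handles unknown names.

```python
def select_covariates(available, covarname=None):
    """Return the covariate names to use.

    available: trait names from the covariates file header.
    covarname: comma-separated names, or None to use every covariate.
    Rejecting unknown names is an assumption made for this sketch.
    """
    if covarname is None:
        return list(available)
    requested = covarname.split(",")
    unknown = [name for name in requested if name not in available]
    if unknown:
        raise ValueError("unknown covariate(s): " + ", ".join(unknown))
    return requested


header = ["age", "sex", "bmi"]
print(select_covariates(header))             # no --covarname: use all
print(select_covariates(header, "bmi"))      # --covarname bmi
print(select_covariates(header, "age,bmi"))  # --covarname age,bmi
```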