Structural analysis
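The structural analysis of StructMAn calculates a set of structural features for individual residues that are part of a protein structure. The calculations can be divided into three parts: relative solvent accessibility, shortest distances to other molecules, and residue interaction networks.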
The solvent-accessible area of a residue is an important measure for distinguishing the functional roles of residues. The vast majority of solvent-accessible area calculations are performed by xssp; we use SphereCon only for protein structures with missing atom coordinates. We divide the solvent-accessible area by the total surface area of the residue to obtain the relative solvent-accessible area (RSA). We further compute separate RSA values for the side-chain atoms and the main-chain atoms. Based on the RSA values, we distinguish three types of structural location:
- Surface (RSA >= 0.16)
- Buried (0.16 > RSA >= 0.05)
- Core (RSA < 0.05)
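The thresholds above can be sketched as a small classifier. This is a minimal illustration, not StructMAn's actual code: the function names `relative_solvent_accessibility` and `classify_location` are hypothetical, and the accessible and total areas are assumed to be given in the same unit.

```python
def relative_solvent_accessibility(accessible_area, total_area):
    """Return the fraction of a residue's total surface area that is solvent accessible."""
    if total_area <= 0:
        raise ValueError("total_area must be positive")
    return accessible_area / total_area


def classify_location(rsa):
    """Map an RSA value to one of the three structural locations listed above."""
    if rsa >= 0.16:
        return "Surface"
    if rsa >= 0.05:
        return "Buried"
    return "Core"
```

A residue with 50 units of accessible area out of 200 has an RSA of 0.25 and would be classified as Surface.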
The shortest distance between a residue and another molecule is the smallest distance between any atom of the residue and any atom of the molecule.
For each residue, we calculate the shortest distance to every other molecule in the structural data and store the shortest distance per molecule type. We distinguish between:
- Protein chains
- DNA chains
- RNA chains
- Peptides
- Low molecular-weight ligands
- Metals
- Non-metal ions
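The distance definition can be sketched as a brute-force search over atom pairs. This is an illustrative sketch only: the function names, the `(molecule_type, atom_coordinates)` input layout, and the type labels are assumptions, and a real implementation would likely use a spatial index instead of comparing all pairs.

```python
import math


def shortest_distance(residue_atoms, molecule_atoms):
    """Smallest Euclidean distance between any residue atom and any molecule atom."""
    return min(math.dist(a, b) for a in residue_atoms for b in molecule_atoms)


def shortest_distance_per_type(residue_atoms, molecules):
    """Keep the shortest distance for each molecule type.

    `molecules` is an iterable of (molecule_type, atom_coordinates) pairs,
    a hypothetical layout chosen for this example.
    """
    best = {}
    for molecule_type, atoms in molecules:
        distance = shortest_distance(residue_atoms, atoms)
        if molecule_type not in best or distance < best[molecule_type]:
            best[molecule_type] = distance
    return best
```

Coordinates are plain `(x, y, z)` tuples here; `math.dist` requires Python 3.8 or later.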
Residue interaction networks (RINs) are graph representations of protein structures. We use RINerator to generate RIN data structures.
Similar to the distance calculations, we detect interactions between the analyzed residue and all other molecules in the RIN, and distinguish between the same types of interaction partners. Further, we look at interactions between the analyzed residue and other residues of the same chain. Here, we distinguish between residues by their distance in the amino acid sequence of the corresponding protein:
- Neighbor (sequence distance 1, the two neighboring amino acids)
- Short (sequence distance < 6)
- Long (sequence distance >= 6)
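The sequence-distance classes can be sketched as follows. The function name is hypothetical, and the sketch assumes that the Short class covers sequence distances 2 to 5, since distance 1 is already assigned to Neighbor; the text does not state this explicitly.

```python
def classify_sequence_distance(position_a, position_b):
    """Classify an intra-chain interaction by the sequence distance of its two residues."""
    distance = abs(position_a - position_b)
    if distance == 0:
        raise ValueError("a residue cannot interact with itself")
    if distance == 1:
        return "Neighbor"
    if distance < 6:
        # Assumption: Short means distances 2..5, excluding direct neighbors.
        return "Short"
    return "Long"
```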
For all interaction types detected in a RIN, we store:
- Interaction degree (total number of individual interactions corresponding to one interaction type)
- Interaction score (total probe score for all corresponding interactions)
- H-bond score
- Overlap score (a negative score penalizing clashes of van der Waals spheres)
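As an illustration of the first two stored values, the degree and score can be accumulated per interaction type. This is a sketch under assumptions: the input is taken to be a flat list of `(interaction_type, probe_score)` pairs, which is a hypothetical layout and not the RINerator output format, and the H-bond and overlap scores are left out.

```python
from collections import defaultdict


def summarize_interactions(interactions):
    """Accumulate interaction degree and total probe score per interaction type.

    `interactions` is an iterable of (interaction_type, probe_score) pairs,
    a hypothetical layout chosen for this example.
    """
    summary = defaultdict(lambda: {"degree": 0, "score": 0.0})
    for interaction_type, probe_score in interactions:
        entry = summary[interaction_type]
        entry["degree"] += 1          # one more individual interaction of this type
        entry["score"] += probe_score  # running total of probe scores
    return dict(summary)
```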
Graph centrality features
We calculate twelve different types of centrality scores to measure how strongly a residue is connected within the protein structure. The twelve types are all possible combinations of three types of normalization and four types of graph construction. The three normalizations are:
- None, take the absolute centrality value
- Min-Max normalization
- Zero-One normalization
The four types of graph construction are:
- Protein chain only
- Protein chain only, and subtracting the overlap clash scores
- Whole complex
- Whole complex, and subtracting the overlap clash scores
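The twelve score types arise as the Cartesian product of the two lists above, which can be sketched as follows. The string labels and the `centrality_variants` function are hypothetical names for this example; the sketch only enumerates the combinations and does not compute any centrality values.

```python
from itertools import product

# Labels are illustrative; they mirror the two lists above.
NORMALIZATIONS = ("none", "min_max", "zero_one")

# Each graph construction: (scope, whether overlap clash scores are subtracted).
GRAPH_CONSTRUCTIONS = (
    ("chain", False),
    ("chain", True),
    ("complex", False),
    ("complex", True),
)


def centrality_variants():
    """Enumerate every (normalization, scope, subtract_overlap) combination."""
    return [
        (normalization, scope, subtract_overlap)
        for normalization, (scope, subtract_overlap)
        in product(NORMALIZATIONS, GRAPH_CONSTRUCTIONS)
    ]
```

Three normalizations times four graph constructions give the twelve variants.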