A.D. 2023
Warning
βσ version contains residues A, G and K only
Caution
Issues
- Output is a linear string, hence, some bond angles are not correct
- Does not contain hydrogen records (use PyMOL, WebMO or OpenMM to add hydrogens)
- Atom Index jumps by +1 after branched chain
- CONECT record stops short of last few residues after attaching branched sequence
- Some bond lengths are not ideal, so your visualizer (VMD, PyMOL, etc.) might add some random bonds here and there
- Isopeptide bond between Lys and branched C-term. is too short
- messy code pls dont hate me
Code for a Python function to generate a Protein Data Bank (PDB) file with the added option of ubiquitylation and side-chain modification.
This generates a linear PDB file from a single-letter amino acid code (FASTA) sequence, e.g. the input AAA generates a PDB file that contains three alanines.
The residues are generated from N->C termini, along with the proper PDB CONECT record.
The user merely enters their desired sequence and ends with a * to signify the C-terminus. Following the above example, the input AAA* generates three alanines with connection records and a proper C-terminus.
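To make the input format concrete, here is a minimal sketch of how a linear input such as AAA* could be read and how a CONECT line is laid out. It is not the code from this repo: the names `THREE_LETTER`, `parse_linear` and `conect_line` are made up for illustration, and the sketch only knows the residues A, G and K.

```python
# Hypothetical helpers for illustration only; the real script may be organised differently.
THREE_LETTER = {"A": "ALA", "G": "GLY", "K": "LYS"}  # only A, G and K are covered


def parse_linear(seq: str) -> list[str]:
    """Turn e.g. 'AAA*' into ['ALA', 'ALA', 'ALA'], reading N -> C."""
    if not seq.endswith("*"):
        raise ValueError("sequence must end with '*' to mark the C-terminus")
    residues = []
    for letter in seq[:-1].upper():
        if letter not in THREE_LETTER:
            raise ValueError(f"unsupported residue: {letter!r}")
        residues.append(THREE_LETTER[letter])
    return residues


def conect_line(serial: int, bonded: list[int]) -> str:
    """Format one CONECT record: the atom serial followed by up to four
    bonded atom serials, each right-justified in a 5-character column."""
    if not 1 <= len(bonded) <= 4:
        raise ValueError("a CONECT record holds one to four bonded atoms")
    return "CONECT" + f"{serial:5d}" + "".join(f"{b:5d}" for b in bonded)


print(parse_linear("AAA*"))    # ['ALA', 'ALA', 'ALA']
print(conect_line(2, [1, 3]))  # CONECT    2    1    3
```

The atom serials passed to `conect_line` are placeholders; in a real file they have to match the serial numbers written in the ATOM records.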
This version allows the user to modify a respective amino acid. The '^' symbol specifies where the branched sequence starts, and the '*' specifies where the branched sequence ends.
Since ubiquitylation is common, the sequence of the branched chain will start in reverse. So, the sequence AAK^AAG*AA* consists of the main chain AAKAA, with the lysine having the sequence GAA attached to its side chain via an isopeptide bond.
If reverse is set to false, then the branched sequence will connect to the respective residue via its N-terminus.
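The branch syntax can be sketched as a small parser. This is an illustration under assumptions, not the repo's implementation: the function name `parse_branched` and its return shape are hypothetical, the `reverse` flag is modelled on the description above, and nested branches are not handled.

```python
def parse_branched(seq: str, reverse: bool = True):
    """Split e.g. 'AAK^AAG*AA*' into a main chain and its branches.

    Returns (main_chain, branches). Each branch is a tuple of
    (index of the modified residue in main_chain,
     branch residues ordered outward from that residue).
    """
    main = []
    branches = []
    i = 0
    while i < len(seq):
        ch = seq[i]
        if ch == "^":
            if not main:
                raise ValueError("'^' must follow the residue it modifies")
            end = seq.find("*", i + 1)
            if end == -1:
                raise ValueError("branch opened with '^' is never closed with '*'")
            branch = seq[i + 1:end]
            # Assumption: reverse=True attaches the branch by its C-terminus
            # (as in ubiquitylation), so the last letter typed sits next to the
            # modified residue; reverse=False attaches it by its N-terminus.
            branches.append((len(main) - 1, branch[::-1] if reverse else branch))
            i = end + 1
        elif ch == "*":
            if i != len(seq) - 1:
                raise ValueError("unexpected '*' before the end of the sequence")
            return "".join(main), branches
        else:
            main.append(ch)
            i += 1
    raise ValueError("sequence must end with '*' to mark the C-terminus")


print(parse_branched("AAK^AAG*AA*"))                 # ('AAKAA', [(2, 'GAA')])
print(parse_branched("AAK^AAG*AA*", reverse=False))  # ('AAKAA', [(2, 'AAG')])
```

Keeping the branch as an index plus an outward-ordered string means the same loop that builds the main chain could be reused to place the branch atoms, starting from the lysine side chain.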
This code was written by me (Peter Swanson) and me alone. I did not copy from another source or use any stupid "A.I." program. It took me, like, a few days and stuff.
I have no affiliations to any company/institution. But I am affiliated with my grandma, who is very lovely.
Glory to God. ICXC NIKA