The Rossmann Toolbox provides two deep learning models for predicting the cofactor specificity of Rossmann enzymes based on either the sequence or the structure of the beta-alpha-beta cofactor binding motif.
Create a conda environment:
conda create --name rtb python=3.7
conda activate rtb
Install pip in the environment:
conda install pip
Install from PyPI:
pip install rossmann-toolbox
Alternatively, to get the most recent changes, install directly from the repository:
pip install git+https://github.com/labstructbioinf/rossmann-toolbox.git
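To confirm that the package landed in the active environment, a quick check like the one below may help. It is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of rossmann-toolbox, and the module name `rossmann_toolbox` is taken from the import used in the examples.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# The module name uses an underscore, unlike the PyPI package name.
print("rossmann_toolbox importable:", is_installed("rossmann_toolbox"))
```

If this prints `False`, the most likely cause is that the `rtb` environment is not activated.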
The input is a full-length sequence. The algorithm first detects Rossmann cores (i.e. the β-α-β motifs that interact with the cofactor) in the sequence and then evaluates their cofactor specificity:
import matplotlib.pyplot as plt
from rossmann_toolbox import RossmannToolbox
rtb = RossmannToolbox(use_gpu=True)
# Example 1
# The b-a-b core is predicted in the full-length sequence
data = {'3m6i_A': 'MASSASKTNIGVFTNPQHDLWISEASPSLESVQKGEELKEGEVTVAVRSTGICGSDVHFWKHGCIGPMIVECDHVLGHESAGEVIAVHPSVKSIKVGDRVAIEPQVICNACEPCLTGRYNGCERVDFLSTPPVPGLLRRYVNHPAVWCHKIGNMSYENGAMLEPLSVALAGLQRAGVRLGDPVLICGAGPIGLITMLCAKAAGACPLVITDIDEGRLKFAKEICPEVVTHKVERLSAEESAKKIVESFGGIEPAVALECTGVESSIAAAIWAVKFGGKVFVIGVGKNEIQIPFMRASVREVDLQFQYRYCNTWPRAIRLVENGLVDLTRLVTHRFPLEDALKAFETASDPKTGAIKVQIQSLE'}
preds = rtb.predict(data, mode='seq', core_detect_mode='dl', importance=False)
# Example 2
# The b-a-b cores are provided by the user (WT vs mutant)
data = {'seq_wt': 'AGVRLGDPVLICGAGPIGLITMLCAKAAGACPLVITDIDEGR',   # WT, binds NAD
        'seq_mut': 'AGVRLGDPVLICGAGPIGLITMLCAKAAGACPLVITSRDEGR'}  # D211S, I212R mutant, binds NADP
preds, imps = rtb.predict(data, mode='core', importance=True)
# Example 3
# Which residues contributed most to the prediction of WT as NAD-binding?
seq_len = len(data['seq_wt'])
plt.errorbar(list(range(1, seq_len+1)),
imps['seq_wt']['NAD'][0], yerr=imps['seq_wt']['NAD'][1], ecolor='grey')
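The importance scores plotted in Example 3 can also be inspected numerically. The sketch below assumes, as the plotting call suggests, that `imps[seq_id][cofactor]` holds a pair of per-residue sequences (mean importance first, its spread second). The helper `top_residues` and the toy values are hypothetical and not part of the rossmann-toolbox API.

```python
def top_residues(sequence, means, errors, n=5):
    """Return the n residues with the highest mean importance.

    Each item is (position, residue, mean, error); positions are 1-based
    and relative to the start of the core sequence.
    """
    if not (len(sequence) == len(means) == len(errors)):
        raise ValueError("sequence, means and errors must have equal length")
    rows = [
        (pos, aa, float(m), float(e))
        for pos, (aa, m, e) in enumerate(zip(sequence, means, errors), start=1)
    ]
    # Sort by mean importance, highest first
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:n]


# Toy data standing in for imps['seq_wt']['NAD'] (values are made up)
toy_seq = "ITDIDE"
toy_means = [0.1, 0.05, 0.9, 0.7, 0.2, 0.0]
toy_errors = [0.01, 0.01, 0.05, 0.04, 0.02, 0.0]

for pos, aa, mean, err in top_residues(toy_seq, toy_means, toy_errors, n=2):
    print(f"{aa}{pos}: {mean:.2f} +/- {err:.2f}")
```

With real predictions, the call would presumably look like `top_residues(data['seq_wt'], imps['seq_wt']['NAD'][0], imps['seq_wt']['NAD'][1])`, provided both entries are one-dimensional and as long as the core.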
Structure-based predictions are not currently available. We are working on a new version that will provide predictions and also support the design of specificity-shifting mutations.
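Until then, candidate mutations can be compared with the sequence-based predictor, as in the WT vs mutant example above. When doing so, it can be handy to list the substitutions that separate two cores. The following is a minimal sketch; `list_substitutions` is a hypothetical helper, not part of the rossmann-toolbox API, and it only handles equal-length sequences (no insertions or deletions).

```python
def list_substitutions(wild_type, mutant):
    """List point substitutions between two equal-length sequences.

    Positions are 1-based and relative to the start of the given sequences.
    """
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have the same length (substitutions only)")
    return [
        f"{wt}{pos}{mt}"
        for pos, (wt, mt) in enumerate(zip(wild_type, mutant), start=1)
        if wt != mt
    ]


seq_wt = 'AGVRLGDPVLICGAGPIGLITMLCAKAAGACPLVITDIDEGR'
seq_mut = 'AGVRLGDPVLICGAGPIGLITMLCAKAAGACPLVITSRDEGR'
print(list_substitutions(seq_wt, seq_mut))  # ['D37S', 'I38R']
```

Note that the positions are counted from the start of the core, so they differ from the full-length numbering used in the example comment (D211S, I212R).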
The structure-based predictor includes an EGAT layer, a graph attention layer that supports edge features. The layer is available in DGL; see the DGL documentation for details. For a detailed description of the EGAT layer and its usage, please refer to the supplementary materials of the Rossmann Toolbox paper.
If you find the rossmann-toolbox useful, please cite the paper:
Rossmann-toolbox: a deep learning-based protocol for the prediction and design of cofactor specificity in Rossmann fold proteins
Kamil Kamiński, Jan Ludwiczak, Maciej Jasiński, Adriana Bukala, Rafal Madaj, Krzysztof Szczepaniak, Stanisław Dunin-Horkawicz
Briefings in Bioinformatics, Volume 23, Issue 1, January 2022, bbab371
If you have any questions, problems or suggestions, please contact us.
This work was supported by the First TEAM program of the Foundation for Polish Science co-financed by the European Union under the European Regional Development Fund.