Target-specific de novo drug design using a Transformer neural network.
Folders:
/data The folder contains the datasets used for training and evaluating the model. To obtain these datasets, we downloaded the full version of BindingDB and ran BindingDB_dataset_preparation.ipynb (see /scripts). The datasets contain only human proteins. The vocabulary file was used to encode the data.
/data_4_organisms The folder contains extended datasets with proteins from human, bovine, rat, and mouse, along with the corresponding vocabulary file.
/scripts The folder contains IPython notebooks for raw data preparation, model training and decoding, model evaluation, and results visualization.
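
For readers unfamiliar with vocabulary-based encoding, the following is a minimal sketch of how a vocabulary file can turn a sequence into integer IDs. It is not the code from the notebooks: the one-token-per-line vocabulary layout, the character-level tokenization, the `<pad>` and `<unk>` tokens, and the function names `load_vocab` and `encode` are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def load_vocab(lines: Iterable[str]) -> Dict[str, int]:
    """Map each non-empty line to its position (assumed one token per line)."""
    tokens = (line.strip() for line in lines if line.strip())
    return {token: index for index, token in enumerate(tokens)}


def encode(text: str, vocab: Dict[str, int], unk: str = "<unk>") -> List[int]:
    """Encode a string character by character; unknown characters map to `unk`."""
    return [vocab.get(ch, vocab[unk]) for ch in text]


# Hypothetical vocabulary contents; in practice these lines would be read
# from a vocabulary file, e.g. `open(path, encoding="utf-8")`.
vocab_lines = ["<pad>", "<unk>", "C", "O", "N", "(", ")", "="]
vocab = load_vocab(vocab_lines)

# Acetic acid as a SMILES string.
ids = encode("CC(=O)O", vocab)
print(ids)
```

A character-level scheme is used here only because it is the simplest to show; the actual datasets may use a different tokenization (for example, subword units), in which case only the lookup step would carry over.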