SimonKitSangChu/MLSB2021

MLSB2021

This project requires ProteinGNN to parse PDB files into a PyG-compatible format. Please follow the installation instructions in that repository.

To build the datasets, place all AlphaFold2 structures under data/alphafold2/your_dataset, FASTA files under data/fasta, and experiment CSV files under data/csv, then run

python build_dataset.py --embedding esm --radii 6 --dataset your_dataset --n_processes N_PROCESSES
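
Before running the build, it may help to confirm that the three input directories described above exist and are not empty. The following is a minimal sketch, not part of this repository: the helper name `check_layout` is hypothetical, and it assumes only the directory names stated above (it does not check file extensions or contents).

```python
from pathlib import Path


def check_layout(root, dataset):
    """Return a list of problems found in the expected input layout.

    Hypothetical helper (not part of this repository). It checks only
    that each directory named in the README exists and contains at
    least one entry.
    """
    root = Path(root)
    expected = {
        "structures": root / "alphafold2" / dataset,
        "fasta": root / "fasta",
        "csv": root / "csv",
    }
    problems = []
    for label, path in expected.items():
        if not path.is_dir():
            problems.append(f"missing directory for {label}: {path}")
        elif not any(path.iterdir()):
            problems.append(f"empty directory for {label}: {path}")
    return problems
```

Calling `check_layout("data", "your_dataset")` from the repository root returns an empty list when all three directories are populated, and one message per problem otherwise.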

To train the sequence-only and geometric models and visualize their performance, run

bash batch_train.sh
python compare_supervised.py --rootdir esm-6

To further compare with unsupervised predictions, place PSSM files under data/pssm and run

python ESM.py --stage preprocess --esm_install_path ESM_INSTALL_DIR
bash benchmark_esm.sh
python ESM.py --stage postprocess
python compare_unsupervised.py
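
The four commands above are listed in the order they are meant to run. As a convenience, they could be chained so that a failing step stops the sequence. The sketch below is an assumption-laden wrapper, not something shipped with this repository: the function names `unsupervised_steps` and `run_steps` are hypothetical, and it assumes the scripts are invoked from the repository root with the current Python interpreter.

```python
import subprocess
import sys


def unsupervised_steps(esm_install_dir):
    """Build the argument lists for the four commands listed in the README."""
    return [
        [sys.executable, "ESM.py", "--stage", "preprocess",
         "--esm_install_path", esm_install_dir],
        ["bash", "benchmark_esm.sh"],
        [sys.executable, "ESM.py", "--stage", "postprocess"],
        [sys.executable, "compare_unsupervised.py"],
    ]


def run_steps(steps, runner=subprocess.run):
    """Run each step in order; check=True raises on the first failure."""
    for argv in steps:
        runner(argv, check=True)
```

For example, `run_steps(unsupervised_steps("ESM_INSTALL_DIR"))` runs the steps in sequence and raises `subprocess.CalledProcessError` as soon as one of them exits with a non-zero status.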

For embedding locality analysis, run

python locality.py
