molecular-property-prediction

Benchmark datasets

Classification tasks

| Dataset | # Molecules | # Tasks |
| --- | --- | --- |
| BBBP | 2039 | 1 |
| Tox21 | 7831 | 12 |
| ClinTox | 1478 | 2 |
| HIV | 41127 | 1 |
| BACE | 1513 | 1 |
| SIDER | 1427 | 27 |
| MUV | 93087 | 17 |
| ToxCast | 8575 | 617 |

Regression tasks

| Dataset | # Molecules | # Tasks |
| --- | --- | --- |
| ESOL | 1128 | 1 |
| FreeSolv | 642 | 1 |
| Lipophilicity | 4200 | 1 |
| QM7 | 6830 | 1 |
| QM8 | 21786 | 12 |
| QM9 | 133885 | 8 |
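
As a convenience, the benchmark statistics listed here can be kept as plain data and queried, for example to see which datasets are multi-task and so need a model head with more than one output. The following is only a minimal sketch: the numbers are copied from the two tables, while the names `CLASSIFICATION`, `REGRESSION`, `multitask`, and `total_molecules` are hypothetical and not part of any listed library.

```python
# Benchmark statistics copied from the tables: name -> (molecules, tasks).
CLASSIFICATION = {
    "BBBP": (2039, 1), "Tox21": (7831, 12), "ClinTox": (1478, 2),
    "HIV": (41127, 1), "BACE": (1513, 1), "SIDER": (1427, 27),
    "MUV": (93087, 17), "ToxCast": (8575, 617),
}
REGRESSION = {
    "ESOL": (1128, 1), "FreeSolv": (642, 1), "Lipophilicity": (4200, 1),
    "QM7": (6830, 1), "QM8": (21786, 12), "QM9": (133885, 8),
}


def multitask(datasets):
    """Return names of datasets with more than one prediction task."""
    return [name for name, (_, tasks) in datasets.items() if tasks > 1]


def total_molecules(datasets):
    """Sum the molecule counts over all datasets in the group."""
    return sum(molecules for molecules, _ in datasets.values())


if __name__ == "__main__":
    print("Multi-task classification sets:", multitask(CLASSIFICATION))
    print("Multi-task regression sets:", multitask(REGRESSION))
    print("Total classification molecules:", total_molecules(CLASSIFICATION))
    print("Total regression molecules:", total_molecules(REGRESSION))
```

The totals are simple sums over the listed rows and count each dataset's molecules separately, so a molecule that appears in more than one dataset is counted more than once.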

1. Datasets

[OGB-LSC] OGB-LSC: A Large-Scale Challenge for Machine Learning on Graphs (arXiv, 2021) [Paper] [Website] [PCQM4Mv2]

[GEOM] GEOM, energy-annotated molecular conformations for property prediction and molecular generation (Scientific Data, 2022) [Paper]

[MoleculeNet] MoleculeNet: a benchmark for molecular machine learning (Chemical Science, 2018) [Paper]

2. Reviews

Applications of deep learning in molecule generation and molecular property prediction (Accounts of Chemical Research, 2020) [Paper]

Deep learning methods for molecular representation and property prediction (Drug Discovery Today, 2022) [Paper]

3. Single-view learning

3.1 Traditional fingerprint-based

[TF_Robust] Massively Multitask Networks for Drug Discovery (arXiv, 2015) [Paper]

3.2 SMILES-based

[X-MOL] X-MOL: large-scale pre-training for molecular understanding and diverse molecular analysis (Science Bulletin, 2022) [Paper] [Code]

[ChemBERTa] ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction (NeurIPS 2020 workshop) [Paper] [Code]

[AGBT] Algebraic graph-assisted bidirectional transformers for molecular property prediction (Nature Communications, 2021) [Paper] [Code]

[SMILES Transformer] SMILES Transformer: Pre-trained Molecular Fingerprint for Low Data Drug Discovery (arXiv, 2019) [Paper] [Code]

3.3 Graph-based

[MolCLR] Molecular Contrastive Learning of Representations via Graph Neural Networks (Nature Machine Intelligence, 2022) [Paper] [Code]

[GEM] Geometry-enhanced molecular representation learning for property prediction (Nature Machine Intelligence, 2022) [Paper] [Code]

[3D Infomax] 3D Infomax improves GNNs for Molecular Property Prediction (ICML 2022) [Paper] [Code]

[GraphMVP] Pre-training Molecular Graph Representation with 3D Geometry (ICLR 2022) [Paper] [Code]

[Graphormer] Do Transformers Really Perform Bad for Graph Representation? (NeurIPS 2021) [Paper] [Code]

[MGSSL] Motif-based Graph Self-Supervised Learning for Molecular Property Prediction (NeurIPS 2021) [Paper] [Code]

[MG-BERT] MG-BERT: leveraging unsupervised atomic representation learning for molecular property prediction (Briefings in Bioinformatics, 2021) [Paper] [Code]

[GROVER] Self-Supervised Graph Transformer on Large-Scale Molecular Data (NeurIPS 2020) [Paper] [Code]

[ATMOL] Attention-wise masked graph contrastive learning for predicting molecular property (Briefings in Bioinformatics, 2022) [Paper] [Code] [Chinese blog]

3.4 Image-based

[ImageMol] Accurate prediction of molecular properties and drug targets using a self-supervised image representation learning framework (Nature Machine Intelligence, 2022) [Paper] [Code] [Chinese blog]

4. Multi-view learning

4.1 Joint training

[DMP] Dual-view Molecule Pre-training (arXiv, 2021) [Paper] [Code]

[MM-Deacon] Multilingual Molecular Representation Learning via Contrastive Pre-training (ACL 2022) [Paper] [Chinese blog]

4.2 Unified training

Unified 2D and 3D Pre-Training of Molecular Representations (KDD 2022) [Paper] [Code] [Chinese blog]