DeepTox: Toxicity Prediction using Deep Learning #72

agitter · 2016-08-09T17:07:35Z

http://doi.org/10.3389/fenvs.2015.00080

Related to virtual screening #45.

agitter · 2016-08-25T15:03:19Z

Biology

Tox21 Data Challenge: 12,000 chemicals screened with 12 assays
Assays are for toxicity, which is related to other virtual screening tasks but different from predicting the effects for a single protein target (AtomNet: A Deep Convolutional Neural Network for Bioactivity Prediction in Structure-based Drug Discovery #56)
Challenge is to predict effects of 647 test compounds; challenge organizers split compounds into training, leaderboard, and test sets
Good overview of similarity-based and feature-based algorithms for predicting chemical effects
Compound can be active, inactive, or inconclusive/untested for each assay; many more inactives than actives
Input: chemical compounds in SDF format (graphs of atoms and their bonds), which are transformed into chemical descriptors (as in Multi-task Neural Networks for QSAR Predictions #57) and molecular fingerprints (as in Massively Multitask Networks for Drug Discovery #55); ultimately uses more features than related work
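To make the three-way label state and the class imbalance concrete, here is a minimal sketch of one way such labels could be stored and summarized. The assay names and label values are invented for illustration (they are not Tox21 data), and the `None` encoding for inconclusive/untested entries is an assumption, not the format used by the challenge or the paper.

```python
from typing import Dict, List, Optional

# Hypothetical toy labels (not Tox21 data): 1 = active, 0 = inactive,
# None = inconclusive or untested for that assay.
labels: Dict[str, List[Optional[int]]] = {
    "assay_A": [0, 0, 1, None, 0, 0],
    "assay_B": [None, 0, 0, 0, 1, 1],
}


def summarize(column: List[Optional[int]]) -> dict:
    """Count actives, inactives, and missing labels for one assay."""
    tested = [y for y in column if y is not None]
    actives = sum(tested)
    return {
        "active": actives,
        "inactive": len(tested) - actives,
        "missing": len(column) - len(tested),
        # Fraction of actives among tested compounds only.
        "active_fraction": actives / len(tested) if tested else float("nan"),
    }


summary = {assay: summarize(col) for assay, col in labels.items()}
for assay, stats in summary.items():
    print(assay, stats)
```

Keeping missing labels explicit, rather than coding them as inactive, is what later allows a training objective to skip them.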

Computational aspects

Cites graph-based kernels for molecular graphs dating back to 2003 (relevant to Convolutional Networks on Graphs for Learning Molecular Fingerprints #52 and Molecular Graph Convolutions: Moving Beyond Fingerprints #53)
Acknowledge some of the related deep learning methods but do not discuss them or compare to them
Won the grand challenge and several sub-challenges, but not the top performer for every individual assay
Score with AUC despite class imbalance
Use multi-task learning; the number of tasks is comparable to Multi-task Neural Networks for QSAR Predictions #57 but smaller than in Massively Multitask Networks for Drug Discovery #55
Claim deep learning in general performs well with a large dataset, related input features, and multi-task setting. Tox21 satisfies these conditions.
Indicator variable in objective function to ignore untested compounds
ReLU hidden units with multi-task sigmoid output layer
1024, 2048, 4096, 8192, or 16384 hidden units per layer and up to 4 hidden layers
Optimize hyperparameters separately for each task even though they have a multi-task network
Supplement the challenge training data with similar compounds and assays from PubChem, ChEMBL, etc.
Also use SVM, random forest, and elastic nets both as competing methods and in an ensemble with the neural network
Cluster-based cross validation to help prevent indirect test fold leakage when compounds are very similar
Multi-task network outperforms single-task for 10 of 12 assays, but Massively Multitask Networks for Drug Discovery #55 has a better assessment of the impact of the number of tasks in this domain
Adding chemical descriptor inputs doesn't substantially boost performance relative to fingerprint features alone
For interpretation, associate hidden units with toxicophores using U-test and correlation
They claim that the neural network is better than competing methods in most cases, but either SVM or random forest is better than or at least competitive with the neural network in terms of AUC for many of the assays
Provide code and a cleaned dataset
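The indicator variable that ignores untested compounds, combined with a sigmoid output per task, can be sketched roughly as below. This is an illustrative pure-Python version, not the authors' code: the function names, the `None` encoding for untested entries, and the choice to average over labeled entries are all assumptions.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def masked_multitask_loss(logits, labels):
    """Mean binary cross-entropy over labeled (compound, task) pairs.

    logits: one row per compound, one raw output score per task.
    labels: same shape; 1 = active, 0 = inactive, None = untested.
    Averaging over labeled entries is an assumption of this sketch.
    """
    total, n_labeled = 0.0, 0
    for logit_row, label_row in zip(logits, labels):
        for z, y in zip(logit_row, label_row):
            mask = 0 if y is None else 1  # indicator variable
            if mask == 0:
                continue  # untested entries contribute nothing
            p = min(max(sigmoid(z), 1e-12), 1.0 - 1e-12)  # avoid log(0)
            total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
            n_labeled += 1
    return total / n_labeled if n_labeled else 0.0


# Two compounds, two tasks; the second task is untested for both.
toy_logits = [[0.0, 5.0], [0.0, -5.0]]
toy_labels = [[1, None], [0, None]]
loss = masked_multitask_loss(toy_logits, toy_labels)
print(loss)
```

Because masked entries are skipped, changing the network output for an untested (compound, task) pair leaves the loss unchanged, so no gradient would flow from those entries.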

Why include it in the review

Winning a challenge can help bolster claims that this is the state of the art in the domain
This is not the authors' interpretation, but in my opinion the AUCs may suggest that the neural network was more of an incremental than a 'transformative' improvement for many assays
Includes some discussion of interpreting hidden units in the virtual screening domain
The study is generally well-executed, even if many of the computational ideas had already appeared in other virtual screening papers. This work could be mentioned but may not be a major focal point.
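Since the comparison between methods here rests on per-assay AUCs, it may help to recall what the metric measures: the probability that a randomly chosen active is scored above a randomly chosen inactive, which is the Mann-Whitney U statistic divided by the number of active/inactive pairs. The sketch below computes it directly from that definition; the scores and labels are invented for illustration, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (active, inactive) pairs ranked correctly.

    Ties between an active and an inactive count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for five compounds (two actives, three inactives).
toy_scores = [0.9, 0.8, 0.3, 0.2, 0.1]
toy_targets = [1, 0, 0, 1, 0]
auc = roc_auc(toy_scores, toy_targets)
print(auc)
```

Because the metric depends only on how actives rank relative to inactives, it does not change with the active/inactive ratio itself, although with few actives each estimate is based on few pairs and small AUC differences between methods are harder to interpret.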

Other notes

They cite http://arxiv.org/abs/1503.01445 which has a slightly different title and author order but appears to be the preprint version of this paper

hmf0103 · 2018-08-26T05:47:50Z

@agitter Hi, do you know where to find deeptox source code?

agitter · 2018-08-26T11:01:28Z

@hmf0103 in my original notes above I linked to the code at https://github.com/bioinf-jku/binet. It may not be the exact same code used in this paper, but you could ask the authors in the Issues there.

vinay-hebb · 2022-08-31T14:24:14Z

@agitter

"Provide code and a cleaned dataset"

I can't find any mention of the code in their paper. How did you infer that they used binet?

agitter · 2022-08-31T17:09:39Z

@vinay-hebb I no longer remember how I originally associated that repo with the paper. It may have been through the authors' DeepTox website. In bioinf-jku/binet#6 the author also confirmed:

"DeepTox was done using binet, as well as other libraries (e.g. rdkit and scikit-learn)."

agitter mentioned this issue Aug 10, 2016

Massively Multitask Networks for Drug Discovery #55

Closed

agitter self-assigned this Aug 21, 2016

agitter added the treat label Nov 5, 2016

agitter mentioned this issue Jan 3, 2017

Drug discovery and high-throughput screening sub-section outline #174

Merged

26 tasks

agitter mentioned this issue Apr 16, 2017

Ligand-based virtual screening #313

Merged

dhimmel added a commit to dhimmel/deep-review that referenced this issue Nov 3, 2017

Examples: add bitcoin-whitepaper & rephetio-manuscript (greenelab#72)

2a87697

cgreene closed this as completed Mar 12, 2018
