An interpretability framework for ML models trained on protein sequences

chenxcynthia/decode-seq

Decoding Neural Networks: Novel Computational Methods to Discover Anti-Tumor B-Cell Receptor Binding Motifs

Cynthia Chen

Research project conducted from 2018 to 2020 under the mentorship of Sherlock Hu, Collin Tokheim, and Shirley Liu.

Full Research Paper.

AACR Abstract (first-author).

Molecular Cell paper (co-author).

Abstract: Deep learning models have been successfully employed for various challenging biological tasks; however, the complexity and depth of neural networks render them black boxes. To address this problem and reveal the important features learned by deep learning models, we developed a novel computational pipeline for decoding neural network models trained on protein sequence data. Our pipeline consists of several stages: generating random input sequences, running the model to rank sequences, clustering top sequences to characterize motifs, and visualizing motif clusters with sequence logos. Using our pipeline, we deciphered the binding motifs learned by a deep learning model trained on a pan-cancer dataset containing more than 30 million B cell receptor (BCR) protein sequences from 5,000 patients. We discovered 65 BCR binding motifs among 13 cancer types and validated the robustness of the motifs through extensive correlation analyses. Our study is the first to reveal and validate anti-tumor BCR binding motifs that target specific tumor antigens, a discovery that is critical to the future synthesis of new antibody drugs for cancer treatments. Furthermore, we demonstrated the versatility of our computational pipeline by using it to decode a second deep learning model, showing that our methods are applicable to a variety of neural networks.

Decoding Pipeline: [Figure: Pipeline]
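
The four stages named in the abstract (generate random sequences, rank them with the model, cluster the top sequences, summarize each cluster for a sequence logo) can be illustrated with a minimal, standard-library sketch. Everything in it is an assumption for illustration and is not taken from this repository: `toy_score` is a hypothetical stand-in for a trained model, the greedy Hamming-distance clustering is a simple substitute for whatever clustering method the project uses, and the sequence length, sample size, and thresholds are arbitrary.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def generate_random_sequences(n, length, rng):
    """Stage 1: sample n random amino-acid sequences of a fixed length."""
    return ["".join(rng.choice(AMINO_ACIDS) for _ in range(length)) for _ in range(n)]


def toy_score(seq):
    """Hypothetical stand-in for a trained model (assumption, not the real model):
    counts matches to the made-up motif 'WYG' at positions 2-4."""
    return sum(1.0 for a, b in zip(seq[2:5], "WYG") if a == b)


def rank_sequences(seqs, score_fn, top_k):
    """Stage 2: score every sequence and keep the top_k highest-scoring ones."""
    return sorted(seqs, key=score_fn, reverse=True)[:top_k]


def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def cluster_sequences(seqs, max_dist):
    """Stage 3 (illustrative greedy clustering): a sequence joins the first
    cluster whose first member is within max_dist mismatches, otherwise it
    starts a new cluster."""
    clusters = []
    for s in seqs:
        for c in clusters:
            if hamming(s, c[0]) <= max_dist:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters


def position_frequencies(cluster):
    """Stage 4 input: per-position residue frequencies for one cluster."""
    n = len(cluster)
    return [{aa: k / n for aa, k in Counter(col).items()} for col in zip(*cluster)]


def information_content(freqs):
    """Height of one logo column in bits: log2(20) minus the Shannon entropy."""
    entropy = -sum(p * math.log2(p) for p in freqs.values())
    return math.log2(len(AMINO_ACIDS)) - entropy


if __name__ == "__main__":
    rng = random.Random(0)
    seqs = generate_random_sequences(5000, 8, rng)
    top = rank_sequences(seqs, toy_score, top_k=50)
    largest = max(cluster_sequences(top, max_dist=5), key=len)
    for pos, col in enumerate(position_frequencies(largest)):
        print(pos, round(information_content(col), 2))
```

In this sketch the per-position frequencies and information content are the numbers a sequence logo would draw; rendering the logo itself would need a plotting tool, which is left out here. Positions that the scoring function rewards should show higher information content than the unconstrained ones.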

B-Cell Receptor Motif Results: [Figure: Motifs]
