tadorfer/nlprot

Papers & Blog Posts on NLP for proteins

ProtTrans: Towards Cracking the Language of Life's Code Through Self-Supervised Deep Learning and High Performance Computing
Ahmed Elnaggar, Michael Heinzinger, Christian Dallago, Ghalia Rihawi, Yu Wang, Llion Jones, Tom Gibbs, Tamas Feher, Christoph Angerer, Martin Steinegger, Debsindhu Bhowmik, Burkhard Rost
arXiv, July 2020 | Paper

BERTology Meets Biology: Interpreting Attention in Protein Language Models
Jesse Vig, Ali Madani, Lav R. Varshney, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani
arXiv, July 2020 | Paper | Blog

PEDL: extracting protein–protein associations using deep language models and distant supervision
Leon Weber, Kirsten Thobe, Oscar Arturo Migueles Lozano, Jana Wolf, Ulf Leser
Bioinformatics, July 2020 | Paper

Signal Peptides Generated by Attention-Based Neural Networks
Zachary Wu, Kevin K. Yang, Michael J. Liszka, Alycia Lee, Alina Batzilla, David Wernick, David P. Weiner, Frances H. Arnold
ACS Synthetic Biology, July 2020 | Paper

USMPep: universal sequence models for major histocompatibility complex binding affinity prediction
Johanna Vielhaben, Markus Wenzel, Wojciech Samek, Nils Strodthoff
BMC Bioinformatics, July 2020 | Paper

Transforming the Language of Life: Transformer Neural Networks for Protein Prediction Tasks
Ananthan Nambiar, Simon Liu, Mark Hopkins, Maeve Heflin, Sergei Maslov, Anna Ritz
bioRxiv, June 2020 | Paper

ProGen: Language Modeling for Protein Generation
Ali Madani, Bryan McCann, Nikhil Naik, Nitish Shirish Keskar, Namrata Anand, Raphael R. Eguchi, Po-Ssu Huang, Richard Socher
bioRxiv, March 2020 | Paper | Blog

UDSMProt: universal deep sequence models for protein classification
Nils Strodthoff, Patrick Wagner, Markus Wenzel, Wojciech Samek
Bioinformatics, January 2020 | Paper

Modeling aspects of the language of life through transfer-learning protein sequences
Michael Heinzinger, Ahmed Elnaggar, Yu Wang, Christian Dallago, Dmitrii Nechaev, Florian Matthes, Burkhard Rost
BMC Bioinformatics, December 2019 | Paper

Generative models for graph-based protein design
John Ingraham, Vikas K. Garg, Regina Barzilay, Tommi Jaakkola
NeurIPS, December 2019 | Paper

Accurate Protein Structure Prediction by Embeddings and Deep Learning Representations
Iddo Drori, Darshan Thaker, Arjun Srivatsa, Daniel Jeong, Yueqi Wang, Linyong Nan, Fan Wu, Dimitri Leggas, Jinhao Lei, Weiyi Lu, Weilong Fu, Yuan Gao, Sashank Karri, Anand Kannan, Antonio Moretti, Mohammed AlQuraishi, Chen Keasar, Itsik Pe'er
arXiv, November 2019 | Paper

Athena: Automated Tuning of k-mer based Genomic Error Correction Algorithms using Language Models
Mustafa Abdallah, Ashraf Mahgoub, Hany Ahmed, Somali Chaterji
Scientific Reports, November 2019 | Paper

Unified rational protein engineering with sequence-only deep representation learning
Ethan C. Alley, Grigory Khimulya, Surojit Biswas, Mohammed AlQuraishi, George M. Church
Nature Methods, October 2019 | Paper

Evaluating Protein Transfer Learning with TAPE
Roshan Rao, Nicholas Bhattacharya, Neil Thomas, Yan Duan, Xi Chen, John Canny, Pieter Abbeel, Yun S. Song
bioRxiv, June 2019 | Paper | Blog

Biological Structure and Function Emerge from Scaling Unsupervised Learning to 250 Million Protein Sequences
Alexander Rives, Siddharth Goyal, Joshua Meier, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, Rob Fergus
bioRxiv, May 2019 | Paper

A High Efficient Biological Language Model for Predicting Protein–Protein Interactions
Yanbin Wang, Zhu-Hong You, Shan Yang, Xiao Li, Tong-Hai Jiang, Xi Zhou
Cells, February 2019 | Paper

Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations
Robin Winter, Floriane Montanari, Frank Noé, Djork-Arné Clevert
Chemical Science, November 2018 | Paper

Natural language processing in text mining for structural modeling of protein complexes
Varsha D. Badal, Petras J. Kundrotas, Ilya A. Vakser
BMC Bioinformatics, March 2018 | Paper

Identifying the missing proteins in human proteome by biological language model
Qiwen Dong, Kai Wang, Xuan Liu
BMC Systems Biology, December 2016 | Paper

Continuous Distributed Representation of Biological Sequences for Deep Proteomics and Genomics
Ehsaneddin Asgari, Mohammad R. K. Mofrad
PLOS ONE, November 2015 | Paper

Survey of Natural Language Processing Techniques in Bioinformatics
Zhiqiang Zeng, Hua Shi, Yun Wu, Zhiling Hong
Computational and Mathematical Methods in Medicine, October 2015 | Paper
