SARS-CoV-2 is a contagious virus known to infect not only humans but also other mammalian species. Studies over the last two years have revealed that certain species are distinctly immune to the virus, which can be attributed to differences across hosts in ACE2, the protein the virus targets. This study applies machine learning methods to classify hosts as susceptible or immune to SARS-CoV-2 based on their ACE2 sequences.
Machine learning has faced criticism within the field of biology for its uninterpretable logic. Biologists and medical professionals must be able to trust the tools they use, and this is not always possible with machine learning. This study investigates the importance of explainable machine learning within bioinformatics by comparing three models with varying degrees of biological considerations.
This study validated the hypothesis that biology-driven machine learning applications outperform pure machine learning models and produced additional findings on how mutations in the ACE2 protein can affect susceptibility to SARS-CoV-2.
- Designed under the supervision of H. Jabbari