This repository contains the code and Jupyter Notebooks accompanying my blog post on Integrative analysis of single-cell multi-omics data using deep learning. Single-cell RNA sequencing (scRNA-seq) has revolutionized the profiling of various cell types, such as immune cells, with single-cell resolution using next-generation sequencing.
Exciting technologies like Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-seq) have extended scRNA-seq by simultaneously measuring multiple molecular modalities, including proteome and transcriptome from the same cell. By utilizing antibodies conjugated to oligonucleotides, CITE-seq generates sequencing-based readouts for both surface protein expression and gene expression.
Because gene and protein expression provide distinct and complementary information about a cell, CITE-seq offers a unique opportunity to integrate transcriptomic and proteomic data. This integration gives a considerably higher-resolution view of individual cell biology than either modality alone. To take advantage of this, the tutorial in this repository demonstrates an integrative analysis of CITE-seq data using autoencoders, an unsupervised deep learning method.
- 🧬 Single-cell technologies offer considerable promise in dissecting the heterogeneity among individual cells and are being utilized in biomedical studies at an astounding pace.
- 💡 CITE-seq simultaneously measures gene expression and surface protein at a single-cell level.
- 💻 Integrative analysis of CITE-seq data from two modalities using autoencoders.
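To make the idea of integrating two modalities with an autoencoder concrete, here is a minimal sketch in PyTorch. It is not the implementation used in the notebooks: the class name `JointAutoencoder`, the layer sizes, and the random stand-in data are all assumptions chosen for illustration. The sketch concatenates RNA and protein features, compresses them into a shared latent vector, and reconstructs both modalities from it.

```python
import torch
from torch import nn


class JointAutoencoder(nn.Module):
    """Hypothetical sketch: encode RNA and protein features into one latent space.

    Layer sizes and names are illustrative, not taken from this repository.
    """

    def __init__(self, n_rna: int, n_protein: int, n_hidden: int = 64, n_latent: int = 16):
        super().__init__()
        self.n_rna = n_rna
        n_in = n_rna + n_protein
        self.encoder = nn.Sequential(
            nn.Linear(n_in, n_hidden), nn.ReLU(), nn.Linear(n_hidden, n_latent)
        )
        self.decoder = nn.Sequential(
            nn.Linear(n_latent, n_hidden), nn.ReLU(), nn.Linear(n_hidden, n_in)
        )

    def forward(self, rna, protein):
        # Concatenate the two modalities along the feature axis.
        x = torch.cat([rna, protein], dim=1)
        z = self.encoder(x)
        recon = self.decoder(z)
        # Split the reconstruction back into RNA and protein parts.
        return z, recon[:, : self.n_rna], recon[:, self.n_rna :]


torch.manual_seed(0)
# Random stand-ins for normalized data: 32 cells, 200 genes, 25 surface proteins.
rna = torch.randn(32, 200)
protein = torch.randn(32, 25)

model = JointAutoencoder(n_rna=200, n_protein=25)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

losses = []
for _ in range(20):
    optimizer.zero_grad()
    z, rna_hat, protein_hat = model(rna, protein)
    # Sum of per-modality reconstruction errors.
    loss = loss_fn(rna_hat, rna) + loss_fn(protein_hat, protein)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(z.shape)  # one 16-dimensional latent vector per cell
```

The latent matrix `z` is what a downstream step such as UMAP would take as input. Summing the two reconstruction losses with equal weight is a simplification; in practice the modalities may need different weights or loss functions.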
To run the Jupyter Notebooks, please follow the steps below:
- Clone the repo:
git clone https://github.com/naity/citeseq_autoencoder.git
- [Optional] Run the R script for data preprocessing:
Rscript preprocessing.R
The following Python packages need to be installed to run the notebooks. Please use the command below for installation.
pip install pandas numpy scikit-learn torch pytorch-lightning tqdm umap-learn plotly
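After installation, a quick check can confirm that each package is importable. The snippet below is a small sketch and not part of the repository; the helper name `missing_packages` is hypothetical. Note that several pip package names differ from their import names (`scikit-learn` imports as `sklearn`, `pytorch-lightning` as `pytorch_lightning`, and `umap-learn` as `umap`).

```python
from importlib.util import find_spec

# Map each pip package name from the install command to its import name.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "torch": "torch",
    "pytorch-lightning": "pytorch_lightning",
    "tqdm": "tqdm",
    "umap-learn": "umap",
    "plotly": "plotly",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [pip_name for pip_name, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages; try: pip install " + " ".join(missing))
else:
    print("All required packages are importable.")
```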
To run the preprocessing script, you will need to install the following R packages using the commands below:
install.packages("tidyverse")
install.packages('Seurat')
devtools::install_github('satijalab/seurat-data')
Explore the application of autoencoders for CITE-seq data through the following Jupyter Notebooks. These notebooks not only implement and train autoencoders but also provide visualizations of the results.
- autoencoder_citeseq.ipynb: This notebook uses vanilla PyTorch for implementing and training autoencoders tailored for CITE-seq data.
- autoencoder_citeseq_saturn.ipynb: This notebook introduces the usage of PyTorch Lightning for an updated implementation with improved model training and evaluation. Additionally, it provides more detailed background and technical explanations.
- learn: If you're interested in delving into the detailed implementations of datasets, models, and training, please explore this Python module.
- Deep learning for single-cell analysis.pptx: Slides for my "Decode Life Workshop Deep Learning Lecture"
- 08/04/2022: Tutorial on Saturn Cloud (video)
- 06/30/2021: Decode Life Workshop deep learning lecture (video)
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE.txt for more information.