This repository contains supporting code to facilitate reproducible analysis. For details, see the Genome Biology publication. If you find bugs, please create a GitHub issue.
Please do not use this code for your own analyses! It is not updated. Better implementations are available in the following two R packages.
GLM-PCA (dimension reduction for generalized linear model likelihoods) is now available as a standalone R package. This method is highlighted in the paper as being suitable for single cell RNA-Seq data.
The scry R package contains functions for feature selection using deviance, computation of null residuals, and interfaces to apply these methods and GLM-PCA to Bioconductor objects such as SingleCellExperiment and SummarizedExperiment.
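As a rough illustration of how the two packages might be used together, here is a minimal sketch on a toy count matrix. It assumes that glmpca (CRAN) and scry (Bioconductor) are installed, and it is guarded so that it is skipped quietly if they are not. The matrix `Y` and the choice of 50 genes and 2 latent dimensions are made up for the example; consult each package's documentation for the authoritative interface.

```r
# Sketch only: assumes the glmpca and scry packages are installed.
set.seed(202)

# Toy genes x cells count matrix (hypothetical data, not from the paper).
Y <- matrix(rpois(100 * 40, lambda = 2), nrow = 100,
            dimnames = list(paste0("gene", 1:100), paste0("cell", 1:40)))

if (requireNamespace("scry", quietly = TRUE)) {
  # Per-gene deviance; larger values suggest more informative genes.
  dev <- scry::devianceFeatureSelection(Y, fam = "binomial")
  Y_top <- Y[order(dev, decreasing = TRUE)[1:50], ]
  # Null-model deviance residuals, e.g. as input to ordinary PCA.
  res <- scry::nullResiduals(Y_top, fam = "binomial", type = "deviance")
}

if (requireNamespace("glmpca", quietly = TRUE)) {
  # GLM-PCA with a Poisson likelihood and 2 latent dimensions.
  fit <- glmpca::glmpca(Y, L = 2, fam = "poi")
  print(head(fit$factors))
}
```

For Bioconductor containers such as SingleCellExperiment, scry provides methods that operate on the object directly rather than on a bare matrix.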
Will Townes, Stephanie Hicks, Martin Aryee, and Rafa Irizarry
Implementations of dimension reduction algorithms
- existing.R - wrapper functions for PCA, t-SNE, ZINB-WaVE, etc.
- glmpca.R - placeholder file that just loads the glmpca package.
Analysis of various real scRNA-Seq datasets. The R Markdown files can be used to reproduce the figures in the manuscript.
Systematic assessment of clustering performance of a variety of normalization, feature selection, and dimension reduction algorithms using ground-truth datasets.
Downloadable table of results from assessments
Utility functions. Please consider using the updated versions of these functions via the scry R package.
- clustering.R - wrappers for Seurat clustering, model-based clustering, and k-means
- functions.R - Poisson and binomial deviance and residual functions, plus a function for loading 10x read counts from molecule information files.
- functions_genefilter.R - convenience functions for gene filtering (feature selection) based on highly variable genes, highly expressed genes, and deviance.
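To make the idea of deviance-based feature selection concrete, the following base R sketch computes a per-gene Poisson deviance against a null model in which each gene has constant relative abundance across cells. It is an illustration written for this README, not the code in functions.R or scry; the function name `poisson_deviance` and the simulated data are hypothetical.

```r
# Per-gene Poisson deviance under a null model of constant relative abundance.
# counts: genes x cells matrix of non-negative counts.
poisson_deviance <- function(counts) {
  n <- colSums(counts)             # total counts per cell
  p <- rowSums(counts) / sum(n)    # null relative abundance per gene
  mu <- outer(p, n)                # expected counts under the null
  # y * log(y / mu), using the convention 0 * log(0) = 0
  ylogy <- ifelse(counts > 0, counts * log(counts / mu), 0)
  2 * rowSums(ylogy - (counts - mu))
}

# Simulated example: 20 roughly constant genes and one gene that is
# high in the first half of the cells and low in the second half.
set.seed(1)
n_cells <- 50
flat <- matrix(rpois(20 * n_cells, lambda = 5), nrow = 20)
variable <- c(rpois(n_cells / 2, lambda = 20), rpois(n_cells / 2, lambda = 1))
counts <- rbind(flat, variable)
rownames(counts) <- c(paste0("flat", 1:20), "variable")

dev <- poisson_deviance(counts)
# Rank genes by deviance; the most informative gene comes first.
top <- names(sort(dev, decreasing = TRUE))[1]
```

Genes whose counts depart most from the constant-abundance null receive the largest deviance, so keeping the top-ranked genes acts as a feature selection step.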