Outputs of each Analytical Step in the scRNAbox pipeline
Introduction
Each Analytical Step of the scRNAbox pipeline produces its own outputs. These are deposited into a Step-specific folder which contains three subfolders:
step1
├── figs1
├── info1
└── objs1
- The figs/ folder contains figures;
- The info/ folder contains text files and tables;
- The objs/ folder contains intermediate Seurat RDS objects.
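The folder layout above can be sketched as a small path helper. This is only an illustration: it assumes the subfolder names carry the step number as a suffix for every step, as in the `step1` example, and the function name `step_dirs` and the `my_project` working directory are hypothetical, not part of scRNAbox.

```python
from pathlib import Path


def step_dirs(workdir, step):
    """Return the expected output subfolders for one Analytical Step.

    Assumes the naming shown for step1 (figs1, info1, objs1) generalizes
    to other steps; this helper is hypothetical and not part of scRNAbox.
    """
    base = Path(workdir) / f"step{step}"
    return {
        "figs": base / f"figs{step}",  # figures
        "info": base / f"info{step}",  # text files and tables
        "objs": base / f"objs{step}",  # intermediate Seurat RDS objects
    }


# Example: print the three expected subfolders for Step 1.
for kind, path in step_dirs("my_project", 1).items():
    print(kind, path)
```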
Note: If users re-run an Analytical Step, the outputs from the previous run will automatically be overwritten. To keep the outputs from a previous run, copy them to a separate directory before re-running. The one exception is annotation in Step 7: users can re-run the Annotate step as many times as they wish, and each iteration adds a new metadata column to the existing Seurat object.
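One way to protect earlier results is to copy a Step folder to a timestamped backup before re-running. The sketch below uses only the Python standard library. The function name `backup_step`, the `step<N>` folder naming, and the backup location are assumptions made for illustration; scRNAbox itself does not provide this helper.

```python
import shutil
import time
from pathlib import Path


def backup_step(workdir, step, backup_root):
    """Copy workdir/step<N> to backup_root/step<N>_<timestamp> and return the copy's path."""
    src = Path(workdir) / f"step{step}"  # assumed Step folder naming
    if not src.is_dir():
        raise FileNotFoundError(f"No outputs found at {src}")
    stamp = time.strftime("%Y%m%d_%H%M%S")
    dest = Path(backup_root) / f"step{step}_{stamp}"
    # copytree raises an error if dest already exists, so an earlier backup is never overwritten.
    shutil.copytree(src, dest)
    return dest
```

For example, `backup_step("my_project", 3, "my_project_backups")` would copy the Step 3 outputs before that step is run again.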
Standard scRNAseq Analysis Track
Step 1: FASTQ to gene expression matrix
All outputs of the CellRanger count pipeline are produced. For more information on these outputs, please visit the CellRanger documentation.
Step 2: Create Seurat object and remove ambient RNA
Output type | Name | Description |
---|---|---|
Figure | vioplot_sample_name.pdf | Sample-specific violin plot showing the distribution of cells according to QC metrics |
Figure | zoomed_in_vioplot_sample_name.pdf | Sample-specific violin plot showing the distribution of cells according to QC metrics. Only the range from the minimum value to the mean is shown. |
Figure | cell_cycle_dim_plot_sample_name.pdf | Sample-specific principal component analysis of cell-cycle genes, colour-coded by the cell cycle score of each cell. |
Info | sample_name_ambient_rna_summary.rds | Sample-specific summary of ambient RNA estimation by SoupX |
Info | sample_name_RNA.txt | Sample-specific sparse matrix of RNA assay |
Info | estimated_ambient_RNA_sample_name.txt | Sample-specific ambient RNA estimation. |
Info | MetaData_sample_name.txt | Sample-specific dataframe showing the Seurat object metadata |
Info | meta_info_sample_name.txt | Sample-specific text file showing the column names of the Seurat object metadata |
Info | summary_sample_name.txt | Sample-specific text file showing the summary of QC metrics (Minimum, 1st Quartile, Median, Mean, 3rd Quartile, Maximum) |
Info | sessionInfo.txt | Session information for the R session |
Data object | sample_name.rds | Sample-specific intermediate Seurat RDS object |
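After Step 2 finishes, the table above can serve as a checklist. The following sketch lists which expected files are missing for one sample. It assumes that `sample_name` in the table is replaced literally by the sample's name and that the subfolders are called figs2, info2, and objs2. Both are assumptions, and `missing_step2_outputs` is a hypothetical helper rather than an scRNAbox function.

```python
from pathlib import Path

# File-name templates copied from the Step 2 table; "{s}" stands for sample_name.
STEP2_OUTPUTS = {
    "figs": [
        "vioplot_{s}.pdf",
        "zoomed_in_vioplot_{s}.pdf",
        "cell_cycle_dim_plot_{s}.pdf",
    ],
    "info": [
        "{s}_ambient_rna_summary.rds",
        "{s}_RNA.txt",
        "estimated_ambient_RNA_{s}.txt",
        "MetaData_{s}.txt",
        "meta_info_{s}.txt",
        "summary_{s}.txt",
        "sessionInfo.txt",
    ],
    "objs": ["{s}.rds"],
}


def missing_step2_outputs(step_dir, sample):
    """List expected Step 2 files for one sample that are absent under step_dir."""
    missing = []
    for kind, templates in STEP2_OUTPUTS.items():
        folder = Path(step_dir) / f"{kind}2"  # assumed subfolder naming
        for template in templates:
            path = folder / template.format(s=sample)
            if not path.is_file():
                missing.append(path)
    return missing
```

An empty list from `missing_step2_outputs("my_project/step2", "sample1")` would suggest that every file in the table was written for that sample.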
Step 3: Quality control and filtering
Output type | Name | Description |
---|---|---|
Figure | dimplot_pca_sample_name.pdf | Sample-specific PCA showing the first two PCs |
Figure | elbow_sample_name.pdf | Elbow plot to visualize the percentage of variance explained by each PC |
Figure | filtered_QC_vioplot_sample_name.pdf | Sample-specific violin plot showing the distribution of cells according to QC metrics after filtering |
Figure | VariableFeaturePlot_sample_name.pdf | Sample-specific figure showing the most variably expressed genes |
Info | sample_name_RNA.txt | Sample-specific sparse matrix of RNA assay |
Info | MetaData_sample_name.txt | Sample-specific dataframe showing the Seurat object metadata |
Info | meta_info_sample_name.txt | Sample-specific text file showing the column names of the Seurat object metadata |
Info | most_variable_genes_sample_name.txt | Sample-specific text file listing the most variably expressed genes |
Info | summary_sample_name.txt | Sample-specific text file showing the summary of QC metrics (Minimum, 1st Quartile, Median, Mean, 3rd Quartile, Maximum) |
Info | sessionInfo.txt | Session information for the R session |
Data object | sample_name.rds | Sample-specific intermediate Seurat RDS object |
Step 4: Doublet removal
Output type | Name | Description |
---|---|---|
Figure | sample_nameDF.classifications.pdf | Sample-specific UMAP plot showing droplet classifications (singlet or doublet) |
Figure | sample_doublet_summary.pdf | Sample-specific violin plot showing pANN value across singlet and doublet assignments; sample-specific bar plot showing the number of singlets and doublets. |
Info | n_predicted_doublets_sample_name.txt | Sample-specific text file showing the number of identified doublets. |
Info | sample_name_RNA.txt | Sample-specific sparse matrix of RNA assay |
Info | MetaData_sample_name.txt | Sample-specific dataframe showing the Seurat object metadata |
Info | meta_info_sample_name.txt | Sample-specific text file showing the column names of the Seurat object metadata |
Info | sessionInfo.txt | Session information for the R session |
Data object | sample_name.rds | Sample-specific intermediate Seurat RDS object |
Step 5: Integration and linear dimensional reduction
Output type | Name | Description |
---|---|---|
Figure | DimPlot_pca.pdf | PCA showing the first two PCs, colour-coded by sample |
Figure | DimPlot_umap.pdf | UMAP, colour-coded by sample |
Figure | elbow.pdf | Elbow plot to visualize the percentage of variance explained by each PC |
Figure | Jackstraw_plot.pdf | Jackstraw plot to visualize the distribution of p-values for each PC |
Info | seu_int_RNA.txt | Sparse matrix of integrated assay |
Info | seu_int_MetaData.txt | Dataframe showing the Seurat object metadata |
Info | meta_info_seu_step5.csv | Text file showing the column names of the Seurat object metadata |
Info | sessionInfo.txt | Session information for the R session |
Data object | seu_step5.rds | Integrated intermediate Seurat RDS object |
Step 6: Clustering
Output type | Name | Description |
---|---|---|
Figure | clustree_int.pdf | Clustree plot showing the stability across the user-defined clustering resolutions |
Figure | integrated_snn_res.pdf | UMAP at the user-defined clustering resolution |
Figure | ARI.pdf | Mean and standard deviation of the Adjusted Rand Index (ARI) between clustering pairs at a user-defined resolution |
Info | clustering_ARI.xlsx | Excel file showing the mean and standard deviation of the ARI between clustering pairs at a user-defined resolution |
Info | seu_RNA.txt | Sparse matrix of integrated assay |
Info | seu_MetaData.txt | Dataframe showing the Seurat object metadata |
Info | meta_info.csv | Text file showing the column names of the Seurat object metadata |
Info | sessionInfo.txt | Session information for the R session |
Data object | seu_step6.rds | Intermediate Seurat RDS object |
Step 7: Cluster annotation
Cluster annotation method | Output type | Name | Description |
---|---|---|---|
General | Figure | umap.pdf | UMAP plot of integrated assay at the user-defined clustering resolution used for cluster annotation |
General | Figure | umap_splitted.pdf | UMAP plot of integrated assay at the user-defined clustering resolution used for cluster annotation, split by sample |
Method 1: Cluster marker GSEA | Figure | heatmap.pdf | Heatmap showing the expression of the top marker genes across cells, stratified by cluster |
Method 1: Cluster marker GSEA | Figure | plotenrich.pdf | Barplot showing the 20 most enriched terms for a particular cluster and cell type library |
Method 2: Module score | Figure | module_score_gene_set.pdf | UMAP plot showing the module score across cells for user-defined gene sets |
Method 3: Reference-based annotations | Figure | UMAP_transferred_labels.pdf | UMAP plots showing the cluster annotations from the reference Seurat object projected onto the query Seurat object |
Visualize features | Figure | select_feature_dot_plot.pdf | Dotplot showing the expression of user-defined features at the cluster level |
Visualize features | Figure | select_feature_violin_plot.pdf | Violin plot showing the expression of user-defined features at the cluster level |
Visualize features | Figure | select_feature_feature_plot.pdf | UMAP plots showing the expression of user-defined features at the cell level |
Annotate | Figure | clustering_name_cluster_annotation.pdf | UMAP plot of the integrated assay showing the cluster annotation |
Annotate | Figure | clustering_name_split_cluster_annotation.pdf | UMAP plot of the integrated assay showing the cluster annotation, split by sample |
General | Info | meta_info_seu_step7.txt | Text file showing the column names of the Seurat object metadata |
General | Info | sessionInfo_marker.txt | Session information for the R session |
Method 1: Cluster marker GSEA | Info | cluster_just_genes.xlsx | Excel file showing the marker genes for each cluster |
Method 1: Cluster marker GSEA | Info | cluster_whole.xlsx | Excel file showing the marker genes and corresponding summary statistics for each cluster |
Method 1: Cluster marker GSEA | Info | ClusterMarkers.csv | csv file showing the marker genes and corresponding summary statistics for each cluster |
Method 1: Cluster marker GSEA | Info | top_sel.csv | csv file showing the top n marker genes for each cluster, where n is defined by the user in the execution parameters |
Method 1: Cluster marker GSEA | Info | Er.genes.csv | Enrichment terms and the corresponding statistics for a particular cluster and cell type library |
Method 1: Cluster marker GSEA | Data object | ClusterMarkers.rds | RDS object containing the marker genes for each cluster |
Method 2: Module score | Info | geneset_by_cluster.csv | Mean module score across clusters for each user-defined gene set |
Method 3: Reference-based annotations | Info | reference_predictions_summary.xlsx | Number of cells from each cluster assigned a particular annotation based on the reference |
General | Data object | seu_step7.rds | Intermediate Seurat RDS object |
Step 8: Differential gene expression contrasts
DGE contrast | Output type | Name | Description |
---|---|---|---|
Pseudo-bulk | Figure | contrast_name.pdf | Volcano plot showing differentially expressed genes |
Sample-sample contrasts | Figure | contrast_name_volcano_plot.pdf | Volcano plot showing differentially expressed genes |
Sample-cell contrasts | Figure | contrast_name_volcano_plot.pdf | Volcano plot showing differentially expressed genes |
Sample-sample contrasts | Info | contrast_name_DEG.csv | Differentially expressed genes identified for the user-defined contrast |
Sample-cell contrasts | Info | contrast_name_DEG.csv | Differentially expressed genes identified for the user-defined contrast |
Pseudo-bulk | Info | Aggregated_expression_summary.csv | Aggregated counts across user-defined sample groups |
Pseudo-bulk | Info | PseudoBulk_DGEsummarytable.csv | Number of differentially expressed genes in the positive and negative direction for each user-defined contrast |
General | Info | seu_RNA.txt | Sparse matrix of integrated assay |
General | Info | seu_MetaData.txt | Dataframe showing the Seurat object metadata |
General | Info | meta_info.csv | Text file showing the column names of the Seurat object metadata |
General | Info | sessionInfo.txt | Session information for the R session |
General | Data object | seu_step8.rds | Intermediate Seurat RDS object |
Cell Hashtag scRNAseq Analysis Track
Step 1: FASTQ to gene expression matrix
All outputs of the CellRanger count pipeline are produced. For more information on these outputs, please visit the CellRanger documentation.
Step 2: Create Seurat object and remove ambient RNA
Output type | Name | Description |
---|---|---|
Figure | vioplot_run_name.pdf | Run-specific violin plot showing the distribution of cells according to QC metrics |
Figure | zoomed_in_vioplot_run_name.pdf | Run-specific violin plot showing the distribution of cells according to QC metrics. Only the range from the minimum value to the mean is shown. |
Figure | cell_cycle_dim_plot_run_name.pdf | Run-specific principal component analysis of cell-cycle genes, colour-coded by the cell cycle score of each cell. |
Info | run_name_ambient_rna_summary.rds | Run-specific summary of ambient RNA estimation by SoupX |
Info | run_name_RNA.txt | Run-specific sparse matrix of RNA assay |
Info | estimated_ambient_RNA_run_name.txt | Run-specific ambient RNA estimation. |
Info | MetaData_run_name.txt | Run-specific dataframe showing the Seurat object metadata |
Info | meta_info_run_name.txt | Run-specific text file showing the column names of the Seurat object metadata |
Info | summary_run_name.txt | Run-specific text file showing the summary of QC metrics (Minimum, 1st Quartile, Median, Mean, 3rd Quartile, Maximum) |
Info | sessionInfo.txt | Session information for the R session |
Data object | run_name.rds | Run-specific intermediate Seurat RDS object |
Step 3: Quality control and filtering
Output type | Name | Description |
---|---|---|
Figure | dimplot_pca_run_name.pdf | Run-specific PCA showing the first two PCs |
Figure | elbow_run_name.pdf | Elbow plot to visualize the percentage of variance explained by each PC |
Figure | filtered_QC_vioplot_run_name.pdf | Run-specific violin plot showing the distribution of cells according to QC metrics after filtering |
Figure | VariableFeaturePlot_run_name.pdf | Run-specific figure showing the most variably expressed genes |
Info | run_name_RNA.txt | Run-specific sparse matrix of RNA assay |
Info | MetaData_run_name.txt | Run-specific dataframe showing the Seurat object metadata |
Info | meta_info_run_name.txt | Run-specific text file showing the column names of the Seurat object metadata |
Info | most_variable_genes_run_name.txt | Run-specific text file listing the most variably expressed genes |
Info | summary_run_name.txt | Run-specific text file showing the summary of QC metrics (Minimum, 1st Quartile, Median, Mean, 3rd Quartile, Maximum) |
Info | sessionInfo.txt | Session information for the R session |
Data object | run_name.rds | Run-specific intermediate Seurat RDS object |
Step 4: Demultiplexing and doublet removal
Output type | Name | Description |
---|---|---|
Figure | run_name_DotPlot_HTO_MSD.pdf | Run-specific dot plot showing the enrichment of barcode-labels across cell assignments |
Figure | run_name_Heatmap_HTO_MSD.pdf | Run-specific heatmap showing the enrichment of barcode-labels across cell assignments |
Figure | run_name_Ridgeplot_HTO_MSD.pdf | Run-specific ridge plot showing the enrichment of barcode-labels across cell assignments |
Figure | run_name_HTO_dimplot_pca_.pdf | Run-specific PCA of antibody assay |
Figure | run_name_HTO_dimplot_umap_.pdf | Run-specific UMAP of antibody assay |
Figure | run_name_nCounts_RNA_MSD.pdf | Run-specific violin plot showing the number of unique transcripts across cell assignments |
Info | run_name.rds_old_antibody_label_MULTIseqDemuxHTOcounts.csv | Run-specific list of sample-specific barcode labels used in the experiment |
Info | run_name_MULTIseqDemuxHTOcounts.csv | Run-specific number of cells assigned to each sample |
Info | run_namefiltered_MULTIseqDemuxHTOcounts.csv | Run-specific number of cells assigned to each sample after removal of doublet and negative droplets |
Info | run_name_meta_info_.txt | Run-specific text file showing the column names of the Seurat object metadata |
Info | run_name_MetaData.txt | Run-specific dataframe showing the Seurat object metadata |
Info | run_name_RNA.txt | Run-specific sparse matrix of RNA assay |
Info | sessionInfo.txt | Session information for the R session |
Data object | run_name.rds | Run-specific intermediate Seurat RDS object |
Step 5: Integration and linear dimensional reduction
Output type | Name | Description |
---|---|---|
Figure | DimPlot_pca.pdf | PCA showing the first two PCs, colour-coded by run |
Figure | DimPlot_umap.pdf | UMAP, colour-coded by run |
Figure | elbow.pdf | Elbow plot to visualize the percentage of variance explained by each PC |
Figure | Jackstraw_plot.pdf | Jackstraw plot to visualize the distribution of p-values for each PC |
Info | seu_int_RNA.txt | Sparse matrix of integrated assay |
Info | seu_int_MetaData.txt | Dataframe showing the Seurat object metadata |
Info | meta_info_seu_step5.csv | Text file showing the column names of the Seurat object metadata |
Info | sessionInfo.txt | Session information for the R session |
Data object | seu_step5.rds | Integrated intermediate Seurat RDS object |
Step 6: Clustering
Output type | Name | Description |
---|---|---|
Figure | clustree_int.pdf | Clustree plot showing the stability across the user-defined clustering resolutions |
Figure | integrated_snn_res.pdf | UMAP at the user-defined clustering resolution |
Figure | ARI.pdf | Mean and standard deviation of the Adjusted Rand Index (ARI) between clustering pairs at a user-defined resolution |
Info | clustering_ARI.xlsx | Excel file showing the mean and standard deviation of the ARI between clustering pairs at a user-defined resolution |
Info | seu_RNA.txt | Sparse matrix of integrated assay |
Info | seu_MetaData.txt | Dataframe showing the Seurat object metadata |
Info | meta_info.csv | Text file showing the column names of the Seurat object metadata |
Info | sessionInfo.txt | Session information for the R session |
Data object | seu_step6.rds | Intermediate Seurat RDS object |
Step 7: Cluster annotation
Cluster annotation method | Output type | Name | Description |
---|---|---|---|
General | Figure | umap.pdf | UMAP plot of integrated assay at the user-defined clustering resolution used for cluster annotation |
General | Figure | umap_splitted.pdf | UMAP plot of integrated assay at the user-defined clustering resolution used for cluster annotation, split by run |
Method 1: Cluster marker GSEA | Figure | heatmap.pdf | Heatmap showing the expression of the top marker genes across cells, stratified by cluster |
Method 1: Cluster marker GSEA | Figure | plotenrich.pdf | Barplot showing the 20 most enriched terms for a particular cluster and cell type library |
Method 2: Module score | Figure | module_score_gene_set.pdf | UMAP plot showing the module score across cells for user-defined gene sets |
Method 3: Reference-based annotations | Figure | UMAP_transferred_labels.pdf | UMAP plots showing the cluster annotations from the reference Seurat object projected onto the query Seurat object |
Visualize features | Figure | select_feature_dot_plot.pdf | Dotplot showing the expression of user-defined features at the cluster level |
Visualize features | Figure | select_feature_violin_plot.pdf | Violin plot showing the expression of user-defined features at the cluster level |
Visualize features | Figure | select_feature_feature_plot.pdf | UMAP plots showing the expression of user-defined features at the cell level |
Annotate | Figure | clustering_name_cluster_annotation.pdf | UMAP plot of the integrated assay showing the cluster annotation |
Annotate | Figure | clustering_name_split_cluster_annotation.pdf | UMAP plot of the integrated assay showing the cluster annotation, split by run |
General | Info | meta_info_seu_step7.txt | Text file showing the column names of the Seurat object metadata |
General | Info | sessionInfo_marker.txt | Session information for the R session |
Method 1: Cluster marker GSEA | Info | cluster_just_genes.xlsx | Excel file showing the marker genes for each cluster |
Method 1: Cluster marker GSEA | Info | cluster_whole.xlsx | Excel file showing the marker genes and corresponding summary statistics for each cluster |
Method 1: Cluster marker GSEA | Info | ClusterMarkers.csv | csv file showing the marker genes and corresponding summary statistics for each cluster |
Method 1: Cluster marker GSEA | Info | top_sel.csv | csv file showing the top n marker genes for each cluster, where n is defined by the user in the execution parameters |
Method 1: Cluster marker GSEA | Info | Er.genes.csv | Enrichment terms and the corresponding statistics for a particular cluster and cell type library |
Method 1: Cluster marker GSEA | Data object | ClusterMarkers.rds | RDS object containing the marker genes for each cluster |
Method 2: Module score | Info | geneset_by_cluster.csv | Mean module score across clusters for each user-defined gene set |
Method 3: Reference-based annotations | Info | reference_predictions_summary.xlsx | Number of cells from each cluster assigned a particular annotation based on the reference |
General | Data object | seu_step7.rds | Intermediate Seurat RDS object |
Step 8: Differential gene expression contrasts
DGE contrast | Output type | Name | Description |
---|---|---|---|
Pseudo-bulk | Figure | contrast_name.pdf | Volcano plot showing differentially expressed genes |
Sample-sample contrasts | Figure | contrast_name_volcano_plot.pdf | Volcano plot showing differentially expressed genes |
Sample-cell contrasts | Figure | contrast_name_volcano_plot.pdf | Volcano plot showing differentially expressed genes |
Sample-sample contrasts | Info | contrast_name_DEG.csv | Differentially expressed genes identified for the user-defined contrast |
Sample-cell contrasts | Info | contrast_name_DEG.csv | Differentially expressed genes identified for the user-defined contrast |
Pseudo-bulk | Info | Aggregated_expression_summary.csv | Aggregated counts across user-defined run groups |
Pseudo-bulk | Info | PseudoBulk_DGEsummarytable.csv | Number of differentially expressed genes in the positive and negative direction for each user-defined contrast |
General | Info | seu_RNA.txt | Sparse matrix of integrated assay |
General | Info | seu_MetaData.txt | Dataframe showing the Seurat object metadata |
General | Info | meta_info.csv | Text file showing the column names of the Seurat object metadata |
General | Info | sessionInfo.txt | Session information for the R session |
General | Data object | seu_step8.rds | Intermediate Seurat RDS object |