Paul Hoffman edited this page Jul 30, 2018 · 10 revisions

The Assay class is where expression data is stored.


Slot Function
counts Stores unnormalized data such as raw counts or TPMs
data Normalized data matrix
scale.data Scaled data matrix
key A character string to facilitate looking up features from a specific Assay
var.features A vector of features identified as variable
meta.features Feature-level meta data

Object Information

Summary information about Assay objects can be retrieved with standard R functions. Object shape/dimensions can be found using the dim, ncol, and nrow functions; cell and feature names can be found using the colnames and rownames functions, respectively, or together using the dimnames function.

# The following examples use the RNA assay from the PBMC 3k dataset
> rna
Assay data with 13714 features for 2638 cells
Top 10 variable features:
 PPBP, DOK3, NFE2L2, ARVCF, YPEL2, UBE2D4, FAM210B, CTB-113I20.2, GBGT1,
# nrow and ncol provide the number of features and cells, respectively
# dim provides both nrow and ncol at the same time
> dim(x = rna)
[1] 13714  2638
# In addition to rownames and colnames, one can use dimnames
# which provides a two-length list with both rownames and colnames
> head(x = rownames(x = rna))
[1] "AL627309.1"    "AP006222.2"    "RP11-206L10.2" "RP11-206L10.9"
[5] "LINC00115"     "NOC2L"
> head(x = colnames(x = rna))
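The comments above mention dimnames, but the transcript does not show it in use. As a minimal sketch, assuming the Seurat package is installed, the following builds a small hypothetical Assay and queries its shape and names. The `toy` object, its gene and cell names, and its counts are made up for illustration and are not part of the PBMC 3k dataset.

```r
library(Seurat)

# Hypothetical toy counts: 4 features (rows) x 3 cells (columns)
counts <- matrix(
  data = c(1, 0, 3, 0, 2, 0, 0, 5, 4, 1, 0, 2),
  nrow = 4,
  dimnames = list(paste0('gene', 1:4), paste0('cell', 1:3))
)
toy <- CreateAssayObject(counts = counts)

# dim gives features then cells; nrow and ncol give each on its own
dim(x = toy)
nrow(x = toy)
ncol(x = toy)

# dimnames returns a list of length two: feature names, then cell names
toy.names <- dimnames(x = toy)
toy.names[[1]]
toy.names[[2]]
```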

Data Access

Data in an Assay object can be accessed in several ways. Expression data is accessed with the GetAssayData function. Expression data from the data slot can also be pulled with the single [ extract operator. Expression data can be added to the counts, data, or scale.data slots with SetAssayData. New data must have the same cells in the same order as the current expression data. Data added to counts or data must also have the same features as the current expression data.

# Slicing data using the single [ extract operator can take
# numeric slices or vectors of row/column names
> rna[1:3, 1:3]
3 x 3 sparse Matrix of class "dgCMatrix"
AL627309.1                 .              .              .
AP006222.2                 .              .              .
RP11-206L10.2              .              .              .
# GetAssayData allows pulling from a specific slot rather than just data
> GetAssayData(object = rna, slot = 'scale.data')[1:3, 1:3]
AL627309.1       -0.06547546    -0.10052277    -0.05804007
AP006222.2       -0.02690776    -0.02820169    -0.04508318
RP11-206L10.2    -0.03596234    -0.17689415    -0.09997719
# SetAssayData example...
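The SetAssayData placeholder above could be filled in along the following lines. This is a sketch under stated assumptions, not the original author's example: it assumes the Seurat and Matrix packages are installed and uses the `slot` argument as described on this page (newer Seurat releases may prefer a different argument name). The `toy` object, its names, and the log1p transform are hypothetical stand-ins chosen only to illustrate the call.

```r
library(Seurat)
library(Matrix)

# Hypothetical toy counts: 4 features (rows) x 3 cells (columns)
counts <- matrix(
  data = c(1, 0, 3, 0, 2, 0, 0, 5, 4, 1, 0, 2),
  nrow = 4,
  dimnames = list(paste0('gene', 1:4), paste0('cell', 1:3))
)
toy <- CreateAssayObject(counts = counts)

# Replace the data slot; new.data keeps the same features and cells,
# in the same order, as the current expression data
new.data <- log1p(GetAssayData(object = toy, slot = 'counts'))
toy <- SetAssayData(object = toy, slot = 'data', new.data = new.data)

# Fill scale.data with a dense, row-centered and row-scaled matrix
# (an illustrative stand-in, not Seurat's own scaling routine)
dense <- as.matrix(new.data)
scaled <- (dense - rowMeans(dense)) / apply(dense, 1, sd)
toy <- SetAssayData(object = toy, slot = 'scale.data', new.data = scaled)

GetAssayData(object = toy, slot = 'scale.data')[1:2, 1:2]
```

The cells in `new.data` match the existing cells exactly, which is the constraint described in the paragraph above.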

Feature-level meta data can be accessed with the double [[ extract operator. Feature-level meta data can be added with the double [[ extract operator as well. The HVFInfo function serves as a specialized version of the double [[ extract operator, pulling certain columns from the meta data.

# Feature-level meta data is stored as a data frame
# Standard data frame functions work on the meta data data frame
> colnames(x = rna[[]])
[1] "mean"              "dispersion"        "dispersion.scaled"
# HVFInfo pulls mean, dispersion, and dispersion scaled
# Useful for viewing the results of FindVariableFeatures
> head(x = HVFInfo(object = rna))
                     mean dispersion dispersion.scaled
AL627309.1    0.013555659   1.432845        -0.6236875
AP006222.2    0.004695980   1.458631        -0.5728009
RP11-206L10.2 0.005672517   1.325459        -0.8356099
RP11-206L10.9 0.002644177   0.859264        -1.7556304
LINC00115     0.027437275   1.457477        -0.5750770
NOC2L         0.376037723   1.876440        -0.4162432
# One can pull multiple values from the data frame at any time
> head(x = rna[[c('mean', 'dispersion')]])
                     mean dispersion
AL627309.1    0.013555659   1.432845
AP006222.2    0.004695980   1.458631
RP11-206L10.2 0.005672517   1.325459
RP11-206L10.9 0.002644177   0.859264
LINC00115     0.027437275   1.457477
NOC2L         0.376037723   1.876440
# Passing `drop = TRUE` will turn the meta data into a named vector
# with each entry being named for the feature it corresponds to
> head(x = rna[['mean', drop = TRUE]])
   AL627309.1    AP006222.2 RP11-206L10.2 RP11-206L10.9     LINC00115
  0.013555659   0.004695980   0.005672517   0.002644177   0.027437275
# Add meta data example
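One way the "add meta data" placeholder above might look is sketched below, assuming the Seurat package is installed. The `toy` object and the `total.counts` column name are hypothetical and chosen only for illustration. The value is a vector named by feature, so each entry is matched to its row of the feature-level meta data.

```r
library(Seurat)

# Hypothetical toy counts: 4 features (rows) x 3 cells (columns)
counts <- matrix(
  data = c(1, 0, 3, 0, 2, 0, 0, 5, 4, 1, 0, 2),
  nrow = 4,
  dimnames = list(paste0('gene', 1:4), paste0('cell', 1:3))
)
toy <- CreateAssayObject(counts = counts)

# One value per feature, named by feature
totals <- rowSums(x = counts)

# Add the values as a new feature-level meta data column
toy[['total.counts']] <- totals

# The new column now appears in the meta data data frame
colnames(x = toy[[]])
toy[['total.counts', drop = TRUE]]
```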

The vector of variable features can be pulled with the VariableFeatures function. VariableFeatures can also set the vector of variable features.

# VariableFeatures both accesses and sets the vector of variable features
> head(x = VariableFeatures(object = rna))
[1] "PPBP"   "DOK3"   "NFE2L2" "ARVCF"  "YPEL2"  "UBE2D4"
# Set variable features example
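The "set variable features" placeholder above could be sketched as follows, assuming the Seurat package is installed. The `toy` object and the chosen features are hypothetical. The features assigned must already exist in the Assay.

```r
library(Seurat)

# Hypothetical toy counts: 4 features (rows) x 3 cells (columns)
counts <- matrix(
  data = c(1, 0, 3, 0, 2, 0, 0, 5, 4, 1, 0, 2),
  nrow = 4,
  dimnames = list(paste0('gene', 1:4), paste0('cell', 1:3))
)
toy <- CreateAssayObject(counts = counts)

# Set the vector of variable features by assignment
VariableFeatures(object = toy) <- c('gene4', 'gene1')

# Read the vector back with the same function
VariableFeatures(object = toy)
```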

The key can be accessed and set with the Key function.

# Key both accesses and sets the key slot for an Assay object
> Key(object = rna)
> Key(object = rna) <- 'myRNA_'
> Key(object = rna)
# Pull a feature from the RNA assay on the Seurat level
> head(x = FetchData(object = pbmc, vars.fetch = 'rna_MS4A1'))


Methods for the Assay class can be found with the following:

utils::methods(class = 'Assay')