Save Bioconductor objects to file

Overview

The alabaster framework implements methods to save a variety of R/Bioconductor objects to on-disk representations. This is a more robust and portable alternative to the typical approach of saving objects in RDS files.

By separating the on-disk representation from the in-memory object structure, we can more easily adapt to changes in S4 class definitions. This improves robustness to R environment updates, especially when updateObject() is not correctly configured.
By using standard file formats like HDF5 and JSON, we ensure that Bioconductor objects can be easily read from other languages like Python and JavaScript. This improves interoperability between application ecosystems.
By breaking up complex Bioconductor objects into their components, we enable modular reads and writes to the backing store. We can easily read or update part of an object without having to consider the other parts.

The alabaster.base package defines the base generics to read and write the file structures along with the associated metadata. Implementations of these methods for various Bioconductor classes can be found in the other alabaster packages like alabaster.se and alabaster.bumpy.

Quick start

First, we'll install the alabaster.base package. This package is available from Bioconductor, so we can use the standard Bioconductor installation process:

# install.packages("BiocManager")
BiocManager::install("alabaster.base")

The simplest example involves saving a DataFrame inside a staging directory. Let's mock up an object:

library(S4Vectors)
df <- DataFrame(X=1:10, Y=letters[1:10])
df
## DataFrame with 10 rows and 2 columns
##            X           Y
##    <integer> <character>
## 1          1           a
## 2          2           b
## 3          3           c
## 4          4           d
## 5          5           e
## 6          6           f
## 7          7           g
## 8          8           h
## 9          9           i
## 10        10           j

Then we can save it to the staging directory:

tmp <- tempfile()
library(alabaster.base)
saveObject(df, tmp)

We can copy the directory to another location, over a network, etc., and then easily load it back into a new R session:

readObject(tmp)
## DataFrame with 10 rows and 2 columns
##            X           Y
##    <integer> <character>
## 1          1           a
## 2          2           b
## 3          3           c
## 4          4           d
## 5          5           e
## 6          6           f
## 7          7           g
## 8          8           h
## 9          9           i
## 10        10           j
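
As a quick sanity check, the round trip can also be verified programmatically. The sketch below assumes that S4Vectors and alabaster.base are installed; the variable names (`staging`, `roundtrip`) are only illustrative, and the exact files written to the staging directory may differ between alabaster.base versions.

```r
library(S4Vectors)
library(alabaster.base)

# Mock up the same DataFrame as above.
df <- DataFrame(X=1:10, Y=letters[1:10])

# Save it to a fresh staging directory.
staging <- tempfile()
saveObject(df, staging)

# Peek at the files that make up the on-disk representation.
list.files(staging, recursive=TRUE)

# Load it back and compare against the original.
roundtrip <- readObject(staging)
stopifnot(
    nrow(roundtrip) == nrow(df),
    identical(colnames(roundtrip), colnames(df)),
    all(roundtrip$X == df$X),
    all(roundtrip$Y == df$Y)
)
```

If any of the checks fail, stopifnot() raises an error; otherwise the script finishes silently.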

Check out the user's guide for more details.

Supported classes

The saving/reading process can be applied to a range of data structures, provided the appropriate alabaster package is installed.

Package	Object types
alabaster.base	`list`, `factor`, `DataFrame`, `List`
alabaster.matrix	`matrix`, `Matrix` objects, `DelayedArray`
alabaster.ranges	`GRanges`, `GRangesList` and related objects
alabaster.se	`SummarizedExperiment`, `RangedSummarizedExperiment`
alabaster.sce	`SingleCellExperiment`
alabaster.mae	`MultiAssayExperiment`
alabaster.string	`XStringSet`
alabaster.spatial	`SpatialExperiment`
alabaster.bumpy	`BumpyMatrix` objects
alabaster.vcf	`VCF` objects
alabaster.files	Common bioinformatics files, e.g., FASTQ, BAM

All packages are available from Bioconductor and can be installed with the usual BiocManager::install() process. Alternatively, to install all packages in one go, users can install the alabaster umbrella package.
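
For example, installing everything in one go might look like the snippet below. This is a sketch that assumes the umbrella package is published on Bioconductor under the name alabaster and pulls in the individual packages listed above.

```r
# Install BiocManager first if it is not already available.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

# Install the umbrella package (name assumed to be "alabaster").
BiocManager::install("alabaster")
```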

Extensions and applications

Developers can extend this framework to support more R/Bioconductor classes by creating their own alabaster package. Check out the extension section for more details.

Developers can also customize this framework for specific applications, most typically to add bespoke metadata in the staging directory. The metadata can then be indexed by database systems like SQLite and MongoDB to provide search capabilities. Check out the applications section for more details.
