update README
Yunuuuu committed Jan 26, 2024
1 parent b6198ae commit 060e708
Showing 3 changed files with 42 additions and 24 deletions.
26 changes: 18 additions & 8 deletions README.html
Original file line number Diff line number Diff line change
@@ -769,12 +769,22 @@ <h1 id="bpcells-backend-for-delayedarray-objects">BPCells backend for DelayedArr
<h2 id="matrix-storage-format">Matrix Storage Format</h2>
<p>BPCells provide three format:</p>
<ol>
-<li>Directory of files * <code>writeBPCellsDirArray</code></li>
-<li>Hdf5 file</li>
-<li>in memory * <code>writeBPCellsMemArray</code></li>
+<li>Directory of files
+<ul>
+<li>read: <code>readBPCellsDirMatrix</code></li>
+<li>write: <code>writeBPCellsDirArray</code></li>
+</ul></li>
+<li>Hdf5 file
+<ul>
+<li>read: <code>readBPCellsHDF5Matrix</code></li>
+<li>write: <code>writeBPCellsHDF5Array</code></li>
+</ul></li>
+<li>in memory
+<ul>
+<li>write: <code>writeBPCellsMemArray</code></li>
+</ul></li>
</ol>
-<p>&quot;*&quot; means the format has been implemented in <code>BPCellsArray</code> package, followed by the function to implement this format.</p>
-<p>Matrices can be stored in a directory on disk, in memory, or in an HDF5 file. Saving in a directory on disk is a good default for local analysis, as it provides the best I/O performance and lowest memory usage. The HDF5 format allows saving within existing hdf5 files to group data together, and the in memory format provides the fastest performance in the event memory usage is unimportant.</p>
+<p>Matrices can be stored in a directory on disk, in memory, or in an HDF5 file. Saving in a directory on disk is a good default for local analysis, as it provides the best I/O performance and lowest memory usage. The HDF5 format allows saving within existing hdf5 files to group data together, and the in memory format provides the fastest performance in the event memory usage is unimportant. So when using <code>as(object, &quot;BPCellsArray&quot;)</code> or <code>as(object, &quot;BPCellsMatrix&quot;)</code>, the default behavior will be <code>as(object, &quot;BPCellsDirMatrix&quot;)</code>.</p>
<p>Details see: <a href="https://bnprks.github.io/BPCells/articles/web-only/bitpacking-format.html">https://bnprks.github.io/BPCells/articles/web-only/bitpacking-format.html</a></p>
<div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true"></a><span class="kw">library</span>(BPCellsArray)</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true"></a><span class="kw">library</span>(SingleCellExperiment)</span>
@@ -876,7 +886,7 @@ <h2 id="matrix-storage-format">Matrix Storage Format</h2>
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true"></a><span class="co">#&gt; Storage order: column major</span></span>
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true"></a><span class="co">#&gt; </span></span>
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true"></a><span class="co">#&gt; Queued Operations:</span></span>
-<span id="cb5-11"><a href="#cb5-11" aria-hidden="true"></a><span class="co">#&gt; 1. Load compressed matrix from directory /tmp/RtmpRHG9Yx/BPCells2e32ea6b10f2ac</span></span></code></pre></div>
+<span id="cb5-11"><a href="#cb5-11" aria-hidden="true"></a><span class="co">#&gt; 1. Load compressed matrix from directory /tmp/RtmpG0nbIy/BPCells336a81150062bf</span></span></code></pre></div>
<p>If you do delayed operations with this assay, the class may be changed, that’s because all of BPCells operations are lazy, no real work is performed on the matrix until the result needs to be returned as an R object or written to disk. You can coerce it into a dense matrix or <code>dgCMatrix</code> to get a actual R object.</p>
<div class="sourceCode" id="cb6"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true"></a><span class="kw">assay</span>(sce, <span class="st">&quot;counts&quot;</span>)[<span class="dv">1</span><span class="op">:</span><span class="dv">10</span>, <span class="dv">1</span><span class="op">:</span><span class="dv">10</span>]</span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true"></a><span class="co">#&gt; 10 x 10 DelayedMatrix object with class BPCellsMatrix</span></span>
@@ -888,7 +898,7 @@ <h2 id="matrix-storage-format">Matrix Storage Format</h2>
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true"></a><span class="co">#&gt; Storage order: column major</span></span>
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true"></a><span class="co">#&gt; </span></span>
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true"></a><span class="co">#&gt; Queued Operations:</span></span>
-<span id="cb6-11"><a href="#cb6-11" aria-hidden="true"></a><span class="co">#&gt; 1. Load compressed matrix from directory /tmp/RtmpRHG9Yx/BPCells2e32ea6b10f2ac</span></span>
+<span id="cb6-11"><a href="#cb6-11" aria-hidden="true"></a><span class="co">#&gt; 1. Load compressed matrix from directory /tmp/RtmpG0nbIy/BPCells336a81150062bf</span></span>
<span id="cb6-12"><a href="#cb6-12" aria-hidden="true"></a><span class="co">#&gt; 2. Select rows: 1, 2 ... 10 and cols: 1, 2 ... 10</span></span>
<span id="cb6-13"><a href="#cb6-13" aria-hidden="true"></a><span class="kw">as.matrix</span>(<span class="kw">assay</span>(sce, <span class="st">&quot;counts&quot;</span>)[<span class="dv">1</span><span class="op">:</span><span class="dv">10</span>, <span class="dv">1</span><span class="op">:</span><span class="dv">10</span>])</span>
<span id="cb6-14"><a href="#cb6-14" aria-hidden="true"></a><span class="co">#&gt; Cell_001 Cell_002 Cell_003 Cell_004 Cell_005 Cell_006 Cell_007</span></span>
@@ -942,7 +952,7 @@ <h2 id="matrix-storage-format">Matrix Storage Format</h2>
<span id="cb8-9"><a href="#cb8-9" aria-hidden="true"></a><span class="co">#&gt; Storage order: column major</span></span>
<span id="cb8-10"><a href="#cb8-10" aria-hidden="true"></a><span class="co">#&gt; </span></span>
<span id="cb8-11"><a href="#cb8-11" aria-hidden="true"></a><span class="co">#&gt; Queued Operations:</span></span>
-<span id="cb8-12"><a href="#cb8-12" aria-hidden="true"></a><span class="co">#&gt; 1. Load compressed matrix from directory /tmp/RtmpRHG9Yx/BPCells2e32ea6b10f2ac</span></span>
+<span id="cb8-12"><a href="#cb8-12" aria-hidden="true"></a><span class="co">#&gt; 1. Load compressed matrix from directory /tmp/RtmpG0nbIy/BPCells336a81150062bf</span></span>
<span id="cb8-13"><a href="#cb8-13" aria-hidden="true"></a><span class="co">#&gt; 2. Scale columns by 0.984, 1.05 ... 1</span></span>
<span id="cb8-14"><a href="#cb8-14" aria-hidden="true"></a><span class="co">#&gt; 3. Transform log1p</span></span>
<span id="cb8-15"><a href="#cb8-15" aria-hidden="true"></a><span class="co">#&gt; 4. Scale by 1.44</span></span></code></pre></div>
22 changes: 13 additions & 9 deletions README.md
@@ -55,19 +55,23 @@ Other non-lazied operations:

BPCells provide three format:

-1. Directory of files \* `writeBPCellsDirArray`
+1. Directory of files
+- read: `readBPCellsDirMatrix`
+- write: `writeBPCellsDirArray`
2. Hdf5 file
-3. in memory \* `writeBPCellsMemArray`
-
-"\*" means the format has been implemented in `BPCellsArray` package,
-followed by the function to implement this format.
+- read: `readBPCellsHDF5Matrix`
+- write: `writeBPCellsHDF5Array`
+3. in memory
+- write: `writeBPCellsMemArray`

Matrices can be stored in a directory on disk, in memory, or in an HDF5
file. Saving in a directory on disk is a good default for local
analysis, as it provides the best I/O performance and lowest memory
usage. The HDF5 format allows saving within existing hdf5 files to group
data together, and the in memory format provides the fastest performance
-in the event memory usage is unimportant.
+in the event memory usage is unimportant. So when using `as(object,
+"BPCellsArray")` or `as(object, "BPCellsMatrix")`, the default behavior
+will be `as(object, "BPCellsDirMatrix")`.

Details see:
<https://bnprks.github.io/BPCells/articles/web-only/bitpacking-format.html>
@@ -190,7 +194,7 @@ assay(sce, "counts")
#> Storage order: column major
#>
#> Queued Operations:
-#> 1. Load compressed matrix from directory /tmp/RtmpRHG9Yx/BPCells2e32ea6b10f2ac
+#> 1. Load compressed matrix from directory /tmp/RtmpG0nbIy/BPCells336a81150062bf
```

If you do delayed operations with this assay, the class may be changed,
@@ -210,7 +214,7 @@ assay(sce, "counts")[1:10, 1:10]
#> Storage order: column major
#>
#> Queued Operations:
-#> 1. Load compressed matrix from directory /tmp/RtmpRHG9Yx/BPCells2e32ea6b10f2ac
+#> 1. Load compressed matrix from directory /tmp/RtmpG0nbIy/BPCells336a81150062bf
#> 2. Select rows: 1, 2 ... 10 and cols: 1, 2 ... 10
as.matrix(assay(sce, "counts")[1:10, 1:10])
#> Cell_001 Cell_002 Cell_003 Cell_004 Cell_005 Cell_006 Cell_007
@@ -275,7 +279,7 @@ assay(sce, "logcounts")
#> Storage order: column major
#>
#> Queued Operations:
-#> 1. Load compressed matrix from directory /tmp/RtmpRHG9Yx/BPCells2e32ea6b10f2ac
+#> 1. Load compressed matrix from directory /tmp/RtmpG0nbIy/BPCells336a81150062bf
#> 2. Scale columns by 0.984, 1.05 ... 1
#> 3. Transform log1p
#> 4. Scale by 1.44
18 changes: 11 additions & 7 deletions vignettes/BPCellsArray.Rmd
@@ -61,19 +61,23 @@ Other non-lazied operations:
## Matrix Storage Format
BPCells provide three format:

-1. Directory of files * `writeBPCellsDirArray`
-2. Hdf5 file
-3. in memory * `writeBPCellsMemArray`
-
-"*" means the format has been implemented in `BPCellsArray` package, followed by
-the function to implement this format.
+1. Directory of files
+- read: `readBPCellsDirMatrix`
+- write: `writeBPCellsDirArray`
+2. Hdf5 file
+- read: `readBPCellsHDF5Matrix`
+- write: `writeBPCellsHDF5Array`
+3. in memory
+- write: `writeBPCellsMemArray`

Matrices can be stored in a directory on disk, in memory, or in an HDF5 file.
Saving in a directory on disk is a good default for local analysis, as it
provides the best I/O performance and lowest memory usage. The HDF5 format
allows saving within existing hdf5 files to group data together, and the in
memory format provides the fastest performance in the event memory usage is
-unimportant.
+unimportant. So when using `as(object, "BPCellsArray")` or `as(object,
+"BPCellsMatrix")`, the default behavior will be `as(object,
+"BPCellsDirMatrix")`.

Details see: <https://bnprks.github.io/BPCells/articles/web-only/bitpacking-format.html>
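
The directory-backed round trip and the default coercion described above can be sketched roughly as follows. This is a minimal illustration, not part of the commit: the `path` argument name, the use of a `dgCMatrix` as input, and the assumption that `writeBPCellsDirArray` returns the disk-backed matrix are guesses about the `BPCellsArray` API, and the input values are made up.

```r
library(BPCellsArray)
library(Matrix)

# Illustrative input: a small sparse count matrix with made-up values.
# rsparsematrix() returns a column-compressed sparse matrix (dgCMatrix) by default.
set.seed(1)
counts <- Matrix::rsparsematrix(
    nrow = 50, ncol = 20, density = 0.2,
    rand.x = function(n) as.numeric(rpois(n, 5) + 1)
)

# Write the matrix into a directory on disk.
# ASSUMPTION: the target directory is passed via an argument named `path`.
path <- tempfile("BPCells")
dir_mat <- writeBPCellsDirArray(counts, path = path)

# Read the same directory back as a disk-backed matrix.
from_disk <- readBPCellsDirMatrix(path)

# Per the text, coercing to "BPCellsMatrix" is expected to behave like
# coercing to "BPCellsDirMatrix", i.e. the data lands in a directory on disk.
default_mat <- as(counts, "BPCellsMatrix")

# Operations stay lazy until the result is materialised as an ordinary R object.
dense <- as.matrix(from_disk)
```

The directory format is used here because the text recommends it as the default for local analysis; the HDF5 and in-memory variants would presumably follow the same pattern with `writeBPCellsHDF5Array` and `writeBPCellsMemArray`.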
