[SparseArraysBase] Update README (#1595)
lkdvos authored Nov 18, 2024
1 parent 34a826b commit f1f179b
Showing 1 changed file with 75 additions and 18 deletions.
93 changes: 75 additions & 18 deletions NDTensors/src/lib/SparseArraysBase/README.md
@@ -1,26 +1,83 @@
# SparseArraysBase

Defines a generic interface for sparse arrays in Julia.
SparseArraysBase is a package that aims to expand on the sparse array functionality that is currently available in Julia's standard library.
While SparseArrays.jl is centered mostly around `SparseMatrixCSC` and the SuiteSparse library, here we wish to broaden the scope a bit, and consider generic sparse arrays.
Abstractly, the mental model is a storage object that holds the stored values, together with a bijection between the array indices of those stored values and the indices of the storage.
For now, we focus on providing efficient implementations of Dictionary of Keys (DOK) type sparse storage formats, but may expand upon this in the future.
As a result, for typical linear algebra routines, we still expect `SparseMatrixCSC` to be the object of choice.

The design consists of roughly three components:
- `AbstractSparseArray` interface functions
- Overloaded Julia base methods
- `SparseArrayDOK` struct that implements this interface

## AbstractSparseArray

The first part consists of typical functions that are useful in the context of sparse arrays.
The minimal interface, which enables the usage of the rest of this package, consists of the following functions:

| Signature | Description | Default |
|-----------|-------------|---------|
| `sparse_storage(a::AbstractArray)` | Returns the storage object of the sparse array | `a` |
| `storage_index_to_index(a::AbstractArray, I)` | Converts a storage index to an array index | `I` |
| `index_to_storage_index(a::AbstractArray, I)` | Converts an array index to a storage index | `I` |
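
As an illustration of the minimal interface, the sketch below defines local stand-ins for the three functions (with the defaults from the table) and implements them for a hypothetical `ToySparseVector`. The type, its field names, and the convention of returning `nothing` for unstored positions are assumptions made for this example, not part of the package.

```julia
# Local stand-ins for the three interface functions, using the table's defaults.
# These are defined here for illustration; they are not the package's own code.
sparse_storage(a::AbstractArray) = a
storage_index_to_index(a::AbstractArray, I) = I
index_to_storage_index(a::AbstractArray, I) = I

# Hypothetical sparse vector: `values[k]` is the entry at position `positions[k]`.
struct ToySparseVector{T} <: AbstractVector{T}
    positions::Vector{Int}
    values::Vector{T}
    len::Int
end

Base.size(a::ToySparseVector) = (a.len,)

# Implement the minimal interface for the toy type.
sparse_storage(a::ToySparseVector) = a.values
storage_index_to_index(a::ToySparseVector, k) = a.positions[k]
# Returns `nothing` for unstored positions (a convention chosen for this sketch).
index_to_storage_index(a::ToySparseVector, i) = findfirst(==(i), a.positions)

# Element access goes through the interface: stored value if present, zero otherwise.
function Base.getindex(a::ToySparseVector{T}, i::Int) where {T}
    @boundscheck checkbounds(a, i)
    k = index_to_storage_index(a, i)
    return isnothing(k) ? zero(T) : sparse_storage(a)[k]
end

a = ToySparseVector([2, 5], [10.0, 20.0], 6)
```

Here `a[2]` reads from storage slot 1, while `a[3]` has no storage slot and falls back to `zero(Float64)`.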

Using these primitives, several convenience functions are defined to facilitate the writing of sparse array algorithms.

| Signature | Description | Default |
|-----------|-------------|---------|
| `storage_indices(a)` | Returns the indices of the storage | `eachindex(sparse_storage(a))` |
| `stored_indices(a)` | Returns the indices of the stored values | `Iterators.map(Base.Fix1(storage_index_to_index, a), storage_indices(a))` |
| `stored_length(a)` | Returns the number of stored values | `length(storage_indices(a))` |
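
The defaults in this table can be written out directly. The sketch below redefines the functions locally (so it does not depend on the package) and applies them to a dense `Matrix`, which under the default primitives is treated as storing every entry.

```julia
# Local stand-ins for the primitives, using the defaults from the first table.
sparse_storage(a::AbstractArray) = a
storage_index_to_index(a::AbstractArray, I) = I

# Convenience functions, written as in the "Default" column.
storage_indices(a) = eachindex(sparse_storage(a))
stored_indices(a) = Iterators.map(Base.Fix1(storage_index_to_index, a), storage_indices(a))
stored_length(a) = length(storage_indices(a))

# With the default primitives, a dense array "stores" all of its entries,
# including the ones that happen to be zero.
d = [1.0 0.0; 0.0 2.0]
n = stored_length(d)
inds = collect(stored_indices(d))
```

A sparse type only needs to specialize the primitives; `stored_indices` and `stored_length` then follow without further definitions.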

<!-- TODO: `getindex!`, `increaseindex!`, `sparse_map`, expose "zero" functionality? -->

Note that the design allows sparse arrays to be defined without subtyping `AbstractSparseArray`.
To achieve this, each function `f` is defined in terms of `sparse_f`, rather than directly overloading `f`.
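
The pattern can be sketched as follows. The generic `sparse_getindex` here is a simplified stand-in that assumes the array has a `stored` dictionary field, and `ToyDOKMatrix` is a hypothetical type; neither is taken from the package.

```julia
# Simplified stand-in for a generic `sparse_` implementation. It assumes the
# array exposes a `stored` dictionary field (an assumption of this sketch).
function sparse_getindex(a::AbstractArray{T,N}, I::Vararg{Int,N}) where {T,N}
    return get(a.stored, CartesianIndex(I), zero(T))
end

# A plain `AbstractMatrix`, not a subtype of any sparse supertype.
struct ToyDOKMatrix{T} <: AbstractMatrix{T}
    stored::Dict{CartesianIndex{2},T}
    dims::Tuple{Int,Int}
end

Base.size(a::ToyDOKMatrix) = a.dims

# Opting in: forward the `Base` function to its `sparse_` counterpart.
Base.getindex(a::ToyDOKMatrix, I::Vararg{Int,2}) = sparse_getindex(a, I...)

m = ToyDOKMatrix(Dict(CartesianIndex(1, 2) => 3.0), (2, 2))
```

Because the sparse logic lives in `sparse_getindex`, any array type can reuse it with a one-line forwarding method, regardless of its supertype.
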
<!--
TODO:
In order to opt-in to the sparse array functionality, one needs to dispatch the functions through `sparse_f` instead of `f`.
For convenience, you can automatically dispatch all functions through `sparse_f` by using the following macro:
The minimal interface is:
```julia
nonzeros(a::AbstractArray) = ...
nonzero_index_to_index(a::AbstractArray, Inz) = ...
index_to_nonzero_index(a::AbstractArray{<:Any,N}, I::CartesianIndex{N}) where {N} = ...
Broadcast.BroadcastStyle(arraytype::Type{<:AbstractArray}) = SparseArraysBase.SparseArrayStyle{ndims(arraytype)}()
```
Once these are defined, along with Julia AbstractArray interface functions like
`Base.size` and `Base.similar`, functions like the following will take advantage of sparsity:
```julia
SparseArraysBase.nonzero_length # SparseArrays.nnz
SparseArraysBase.sparse_getindex
SparseArraysBase.sparse_setindex!
SparseArraysBase.sparse_map!
SparseArraysBase.sparse_copy!
SparseArraysBase.sparse_copyto!
SparseArraysBase.sparse_permutedims!
@abstractsparsearray MySparseArrayType
```
which can be used to define the corresponding `Base` functions.
-->

## Overloaded Julia base methods

The second part consists of overloading Julia base methods to work with sparse arrays.
In particular, specialised implementations exist for the following functions:

- `sparse_similar`
- `sparse_reduce`
- `sparse_map`
- `sparse_map!`
- `sparse_all`
- `sparse_any`
- `sparse_isequal`
- `sparse_fill!`
- `sparse_zero`, `sparse_zero!`, `sparse_iszero`
- `sparse_one`, `sparse_one!`, `sparse_isone`
- `sparse_reshape`, `sparse_reshape!`
- `sparse_cat`, `sparse_cat!`
- `sparse_copy!`, `sparse_copyto!`
- `sparse_permutedims`, `sparse_permutedims!`
- `sparse_mul!`, `sparse_dot`
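
To give a sense of what a specialised implementation can gain, the sketch below works directly on DOK-style storage and visits only stored values. The `toy_` functions are illustrative stand-ins and make no claim about how the package implements `sparse_iszero` or `sparse_dot`.

```julia
# Hypothetical DOK-style storage for two long vectors: only nonzero entries are kept.
x = Dict(3 => 2.0, 500 => -1.5)
y = Dict(3 => 4.0, 7 => 1.0)

# An all-zero check that looks only at stored values, independent of the array length.
toy_sparse_iszero(stored::Dict{Int,Float64}) = all(iszero, values(stored))

# Dot product that iterates over the smaller set of stored keys and looks up
# matching keys in the other; unstored entries contribute nothing.
function toy_sparse_dot(u::Dict{Int,Float64}, v::Dict{Int,Float64})
    small, large = length(u) <= length(v) ? (u, v) : (v, u)
    acc = 0.0
    for (k, val) in small
        acc += val * get(large, k, 0.0)
    end
    return acc
end

result = toy_sparse_dot(x, y)
```

Both functions cost time proportional to the number of stored entries rather than to the full array length.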

## SparseArrayDOK

Finally, the `SparseArrayDOK` struct is provided as a concrete implementation of the `AbstractSparseArray` interface.
It is a dictionary of keys (DOK) type sparse array, which stores the values in a `Dictionaries.jl` dictionary, and maps the indices to the keys of the dictionary.
This model is particularly useful for sparse arrays with a small number of non-zero elements, or for arrays that are constructed incrementally, as it offers fast random access and insertion.
The drawback is that sequential iteration is slower than for other sparse array types, leading to slower linear algebra operations.
For the purposes of `SparseArraysBase`, this struct will serve as the canonical example of a sparse array, and will be returned by default when new sparse arrays are created.
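
A minimal sketch of the DOK idea is given below. It uses `Base.Dict` as a stand-in for a `Dictionaries.jl` dictionary, and `ToyDOK` with its constructor is a hypothetical type written for this example, not the actual `SparseArrayDOK` definition.

```julia
# Minimal DOK sketch: a dictionary from Cartesian indices to stored values.
# `Base.Dict` stands in for a Dictionaries.jl dictionary here.
struct ToyDOK{T,N} <: AbstractArray{T,N}
    stored::Dict{CartesianIndex{N},T}
    dims::NTuple{N,Int}
end

# Convenience constructor for an array with no stored entries.
function ToyDOK{T}(dims::Vararg{Int,N}) where {T,N}
    return ToyDOK{T,N}(Dict{CartesianIndex{N},T}(), dims)
end

Base.size(a::ToyDOK) = a.dims

# Random access is a single dictionary lookup.
function Base.getindex(a::ToyDOK{T,N}, I::Vararg{Int,N}) where {T,N}
    return get(a.stored, CartesianIndex(I), zero(T))
end

# Insertion is a single dictionary update, so incremental construction is cheap.
function Base.setindex!(a::ToyDOK{T,N}, v, I::Vararg{Int,N}) where {T,N}
    a.stored[CartesianIndex(I)] = v
    return a
end

a = ToyDOK{Float64}(3, 3)
a[1, 2] = 4.0
a[3, 3] = 1.0
```

Iterating over `a.stored` visits entries in no particular order, which is one reason sequential algorithms tend to favor formats such as CSC.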

One particular feature of `SparseArrayDOK` is that it can be used in cases where the non-stored entries have to be constructed in a non-trivial way.
Typically, sparse arrays use `zero(eltype(a))` to construct the non-stored entries, but this is not always sufficient.
A concrete example is found in `BlockSparseArrays.jl`, where initialization of the non-stored entries requires the construction of a block of zeros of appropriate size.
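
The block case can be sketched as follows. `ToyBlockDOK` and its `getunstored` field are hypothetical names chosen for this example; they illustrate the idea of a user-supplied constructor for non-stored entries and are not the API of `SparseArrayDOK` or `BlockSparseArrays.jl`.

```julia
# Hypothetical DOK whose unstored entries are produced by a user-supplied
# function `getunstored(i, j)` rather than by `zero(eltype(a))`.
struct ToyBlockDOK{F} <: AbstractMatrix{Matrix{Float64}}
    stored::Dict{CartesianIndex{2},Matrix{Float64}}
    dims::Tuple{Int,Int}
    getunstored::F
end

Base.size(a::ToyBlockDOK) = a.dims

# Return the stored block if present; otherwise build the unstored block on demand.
function Base.getindex(a::ToyBlockDOK, i::Int, j::Int)
    return get(() -> a.getunstored(i, j), a.stored, CartesianIndex(i, j))
end

# Block (i, j) has size rowsizes[i] × colsizes[j]. A zero derived only from the
# element type `Matrix{Float64}` carries no size information.
rowsizes, colsizes = [2, 3], [1, 4]
blocks = ToyBlockDOK(
    Dict(CartesianIndex(1, 1) => ones(2, 1)),
    (2, 2),
    (i, j) -> zeros(rowsizes[i], colsizes[j]),
)
```

Indexing `blocks[2, 2]` then yields a 3×4 block of zeros even though nothing is stored at that position.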

<!-- TODO: update TODOs -->

## TODO
Still need to implement `Base` functions: