-
Notifications
You must be signed in to change notification settings - Fork 125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] [BlockSparseArrays] (Truncated) SVD for Abstract(Block)(Sparse)Array #1572
base: main
Are you sure you want to change the base?
Conversation
NDTensors/src/lib/BlockSparseArrays/src/blocksparsearray/blocksparsearray.jl
Outdated
Show resolved
Hide resolved
NDTensors/src/lib/BlockSparseArrays/src/blocksparsearray/blocksparsearray.jl
Outdated
Show resolved
Hide resolved
NDTensors/src/lib/BlockSparseArrays/src/BlockArraysExtensions/BlockArraysExtensions.jl
Outdated
Show resolved
Hide resolved
NDTensors/src/lib/BlockSparseArrays/src/BlockArraysExtensions/BlockArraysExtensions.jl
Outdated
Show resolved
Hide resolved
NDTensors/src/lib/BlockSparseArrays/src/BlockArraysExtensions/BlockArraysExtensions.jl
Outdated
Show resolved
Hide resolved
NDTensors/src/lib/BlockSparseArrays/src/blocksparsearray/blockdiagonalarray.jl
Outdated
Show resolved
Hide resolved
Attempt to find a permutation of blocks that makes `A` blockdiagonal. If unsuccesful, | ||
returns nothing, otherwise returns both the blockdiagonal `B` as well as the permutation `I, J`. | ||
""" | ||
function try_make_blockdiagonal(A::AbstractBlockSparseMatrix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider changing to try_to_blockdiagonal
as discussed above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, does this handle rectangular block matrices, i.e. when the block pattern itself is rectangular?
NDTensors/src/lib/BlockSparseArrays/src/blocksparsearray/blockdiagonalarray.jl
Outdated
Show resolved
Hide resolved
|
||
# Cast to block-diagonal implementation if permuted-blockdiagonal | ||
function find_permute_blockdiagonal(A) | ||
@show keys = map(Base.Fix2(getproperty, :n), collect(block_stored_indices(A))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the canonical way to convert Block(1, 1)
to (1, 1)
, as an alternative to using field access:
Int.(Tuple(Block(1, 1)))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though we could consider keeping the Block
wrapper, i.e. Tuple(Block(1, 1))
, and then the permutations are block permutations. Then below, instead of blocks(A)[tuple.(invperm(I), J)]
, it could be written as a slicing operation on the block sparse matrix A[...]
. I slightly prefer that approach. In that case I would call the function something like try_to_blockdiagonal_blockperm
to make it clear it is outputting a block permutation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played around with this, but within the current constraints I don't think I can make that work. I really need a vector of blocks in this case, in other words linear slicing into the block sparse matrix. A[vec_of_blockinds]
just cannot work, for the reasons we discussed previously, because the slicing wants to return a BlockArray
, which would then be a vector of matrices that all have different sizes.
My way around this is to perform linear index slicing into the blocks
, which I think is fair to assume to return a vector of blocks (not wrapped into a BlockArray
). I agree that it would be nicer to not have to do this, but I would also argue that it is not outside the allowed operations of the interface. blocks
should return an AbstractArray
containing the different blocks, and I should be allowed to index into that to obtain different blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I'm not quite following, let's discuss this.
In this second iteration, I implemented most of the comments as you suggested. There are two main problems that hold this back currently:
I'll handle these first separately, and then continue working here. I'll also split off the |
Co-authored-by: Matt Fishman <mtfishman@users.noreply.github.com>
Looks interesting and relatively simple (in a good way I mean). My main question about the truncation system would be this: is the design flexible enough to be able to use a "relative" cutoff, meaning that one would first compute the sum of squares of singular values and then measure discard only those singular values defined so that their sum of squares divided by the total sum of squares is less than the cutoff? I would guess this is trivial to add in the dense case – is it also easy to add for the block sparse case? |
Let me link the original inspiration: https://github.com/Jutho/TensorKit.jl/blob/master/src/tensors/truncation.jl I think the main strengths of the design are that it is incredibly modular, and most of the pieces are designed to make it easy to hook into. For example, for truncating a fraction of the total weight, something like this could work: struct TruncFraction{T} <: TruncationScheme
fraction::T
end
function _truncate(usv::SVD, trunc::TruncUntil)
total_weight = norm(usv.S)^2
current_weight = zero(total_weight)
local n # make n available outside loop
for n in reverse(1:length(usv.S))
current_weight += usv.S[n]
if current_weight > trunc.factor * total_weight
break
end
end
keep = 1:n
return SVD(usv.U[:, keep], usv.S[keep], usv.Vt[keep, :])
end Making use of Julia's multiple dispatch, it's relatively straightforward to specialize I think here there is no real benefit of t trying to come up with code that works for both dense and blockdiagonal at the same time, the implementations are fundamentally different, and abstracting this even further might hurt more than it does good, but that's of course my opinion. That's also more or less why the design is the way it is, to make it very easy to add custom implementations, rather than attempting to provide implementations that work for everything. |
Thanks for the explanation and detailed example. I definitely agree with you that it would be better to split up the dense and block-sparse truncation implementations. |
Should be moved to BlockSparseArrays.jl once it is configured properly. |
This is a first draft of functionality to extend singular value decompositions to blockarrays. It features quite a few components that probably have to be discussed further.
svd(!)
The core functionality is centered around the typical
LinearAlgebra.svd
.For generic block-arrays, the idea would be that the rows of
U
, and the columns ofVt
have the same structure as the rows and columns ofA
, such thatU * Diagonal(S) * Vt = A
. For the columns ofU
and the rows ofVt
however, there is no structure that can be inferred, such that this would generically be a single block.Additionally, implementation-wise it is probably the most efficient to first cast to a single dense array, then do the SVD, and then restore the blocked structure. For that reason, the return type of the matrices is also
BlockedArray
instead ofBlockArray
. This is implemented and works:If we want to keep that behavior of the blockstructure, the
LinearAlgebra.SVD
struct is however insufficient, as this requires the types ofU
anVt
to be the same. ForBlockArray
objects, the type of axes is captured in the type parameters, such that if the rows and colums ofA
are not described by axes of the same type, this fails.SVD
structIn order to circumvent this problem, I copied most of the
LinearAlgebra.svd
files, and created a newSVD
struct that does not have this restriction, but otherwise functions identically. With this, I have new functionssvd(!)
that by default fall back to theLinearAlgebra
implementation, but rewrap them in the new struct.Some questions here:
LinearAlgebra.svd!
instead of keeping them separate instead, only returning the custom SVD struct whenever required?SVD
struct in order to avoid confusion with the LinearAlgebra-defined one?BlockSparse (Diagonal) SVD
Expanding this to
BlockSparseArrays
, I first added an implementation forBlockDiagonal
, which is just aBlockSparseMatrix
with aDiagonal{T,<:AbstractMatrix{T}}
storage type.In this case, the SVD happens block-by-block, and as a result we have extra information about the columns of
U
and the rows ofVt
, as they should retain this blocked structure.Here, it might be beneficial to have
S
be either aBlockSparseVector
or aBlockVector
, depending on whether or not thefull
SVD is asked (not implemented yet)BlockSparse SVD
Then, to also implement the generic BlockSparse case, I selected two possible implementations. Either the input is "permuted block-diagonal", as can arise in the context of tensors with nonzero flux, or it is a generic block sparse matrix.
There is a runtime check for this, (based on ITensor/BlockSparseArrays.jl#3) which permutes the array to become
BlockDiagonal
if possible, and otherwise falls back to a dense implementation.Here, the questions mostly concern whether we want to have this runtime check. I would guess that this is something that can be statically inferred from the type of the axes, ie if the sparsity is due to symmetries, it has to be blockdiagonal in some form, while if not, it is presumably quite rare to have this specific structure. We could consider a trait to enable or disable this implementation.
Truncated SVD
Finally, in order to also include the truncation, I added an interface roughly based on the TensorKit implementation, which features a
truncate(F::SVD)
to truncate a given decomposition, as well as atsvd(A)
which simply combines the decomposition and truncation step. This second form could be useful in the future for algorithms that cannot decouple these two, such as Krylov methods.The interface supports truncation by a number of different methods, based around keeping the
n
first values after sorting according to some criteria, as well as discarding values via a filtering mechanism. There are convenient constructors for the cases that are probably used most, but the system allows quite a lot of freedom. Additionally, because of the specialization onTruncationScheme
, it is easily extensible in the future.In order to make this work with
BlockSparseArray
s, I am missing the functionality to haveU[:,[1,2,4]]
slice the matrix but retain as much of the block structure as possible:As you see, much to discuss. I attached my preliminary test file, and added the code in places that I felt are relevant, but obviously I am completely open to suggestions, reorganisations, etc.