Semantics of map! and broadcast! over sparse arrays #19372

Sacha0 · 2016-11-21T02:38:30Z

#19239 concluded with generic map/broadcast over sparse arrays not storing numerical zeros in the result. But whether map!/broadcast! should do the same, or instead store some numerical zeros dependent on / reflecting the inputs' structures, wasn't settled. Hence this issue.

Originally (c. #19239) I thought map!/broadcast! should retain the inputs' structure. But working on #19371 changed my thinking: map!/broadcast! should either (1) store the same result as map/broadcast or (2) retain solely the output argument's structure, perhaps throwing an error where broadcast(f, inputargs...) does not fit within the output argument's structure. Primary reason:

What "retaining the inputs' structure" means is not well defined: Should broadcast!(f, C, A, B) retain A's structure, B's structure, the union of A's and B's structures, the union of A's, B's, and C's structures, or the intersection of C's structure with the union of A's and B's structures? What if either of A or B has a singleton dimension or is scalar?

Only the output argument seems logically privileged, hence proposal (2) above. (2) has a few possible variations: When broadcast(f, inargs...) contains a nonzero value outside the equivalent broadcast!'s output argument's structure, broadcast! could either (2a) error, (2b) ignore that nonzero, or (2c) grow the output argument's structure.

Some thoughts on proposals (1) and (2):

(1) has the advantages of ease of use (it always works) and strict consistency with map/broadcast.

(2a) behaves like broadcast!(f, S, args...) where S is a structured (e.g. Diagonal, Tridiagonal, etc) matrix: strictly non-allocating, but not as flexible / easy to use as (1).

(2b) is interesting: On the one hand, it requires iteration only over the parts of the input arguments' structure that intersect the output argument's structure, potentially enabling greater efficiency where the result's structure is known/predictable. In other words, this approach provides a form of escape hatch for exploiting known detailed zero-preserving behavior of the map/broadcast function (ref. discussion in #18590), and that escape hatch could be extended to map!/broadcast! for structured matrices. On the other hand, map/broadcast and map!/broadcast! could produce inconsistent results.

(2c) seems like a half measure with little if any advantage over either (1) or (2a).

Thoughts? Thanks and best!

The text was updated successfully, but these errors were encountered:

martinholters · 2016-11-21T09:50:39Z

I think silently ignoring a nonzero result is dangerous, especially in generic code. If f(x) = broadcast!(cos, similar(x), x) errors when called with a sparse x, that might be ok (unavoidable even, if x is so huge that its densified version doesn't fit into memory), but silently leaving zeros as zeros would be a nasty surprise.

kmsquire · 2016-11-21T17:10:50Z

Since the semantic of broadcasting over only nonzero/structural values is different than broadcasting over all elements, why not define a new set of methods for doing this, e.g., mapnz, broadcastnz?

stevengj · 2016-11-21T21:41:09Z

My first inclination would be to use the output argument's nonzero structure, and throw an error if this is not possible (i.e. if f(input) is nonzero at locations not stored in the output array).

I don't see the point of using broadcast! here if you don't want to pre-allocate the nonzero storage in the output. If you want to resize the output to fit f(input), then you might as well use broadcast.

Sacha0 · 2016-11-21T23:20:33Z

Cheers, seems we can easily eliminate (2b) and (2c).

My first inclination would be to use the output argument's nonzero structure, and throw an error if this is not possible (i.e. if f(input) is nonzero at locations not stored in the output array).

My thinking as well. A slight generalization of (2a): Rather than require that the result's nonzero pattern fit within the output argument's existing structure (pattern), we could require the weaker condition that the result fit within the output argument's existing storage. Then, instead of needing to know the result's nonzero pattern, you need only know how much space it requires. Regarding implementation, this approach should also be less complex and more efficient.

I don't see the point of using broadcast! here if you don't want to pre-allocate the nonzero storage in the output. If you want to resize the output to fit f(input), then you might as well use broadcast.

Agreed. The only counterargument I see is generic programming. And on top of the variation of (2a) described above, the implementation and performance costs of allowing reallocation in broadcast! when storage runs out mid-stream may be negligible. If that bears out, providing the additional flexibility / generality / fault tolerance seems appealing?

Proposal for action: If the variation of (2a) described above (i.e. without reallocation) seems agreeable, I'll sketch an implementation of map! to get a better sense of the implementation considerations. Then revisit the reallocation question if necessary?

Thanks and best!

tkelman · 2016-11-21T23:44:26Z

I'm more in favor of 1. The number of nonzero entries for this kind of operation might be a bit complicated to determine ahead of time.

Sacha0 · 2016-11-21T23:48:57Z

I'm more in favor of 1. The number of nonzero entries for this kind of operation might be a bit complicated to determine ahead of time.

Is that to say you would also be on board with the variation of (2a) described in #19372 (comment) with reallocation (which is basically (1) but only allocates if necessary)? Thanks!

…ith new (JuliaLang#19372) semantics / less pessimistic allocation behavior and optimize a few inefficiently handled common cases. Also provide a broadcast/broadcast! implementation capable of (relatively) efficiently handling an arbitrary number of (input) sparse matrices. Integrate with the map/map! implementation, and touch up a few rough edges of the map/map! implementation. Test.

…ith new (#19372) semantics / less pessimistic allocation behavior and optimize a few inefficiently handled common cases. Also provide a broadcast/broadcast! implementation capable of (relatively) efficiently handling an arbitrary number of (input) sparse matrices. Integrate with the map/map! implementation, and touch up a few rough edges of the map/map! implementation. Test. (#19518)

ChrisRackauckas · 2018-01-26T12:01:53Z

Since the semantic of broadcasting over only nonzero/structural values is different than broadcasting over all elements, why not define a new set of methods for doing this, e.g., mapnz, broadcastnz?

Can we at least have a generic way to loop over nonzeros of a matrix as part of Julia 1.0? This is crucial for writing any kind of sparse matrix support in differentiation tooling for example. Right now I am not sure of a better way to handle it other than to handle every matrix type differently, which seems like far too much work when indexing something like a Diagonal at just the diagonal values should do just fine.

Even with sparse matrices the only way I could find out how to do this without allocating was a little bit weird:

julia> idxs = SparseArrays.nonzeroinds(vec(A))
973-element Array{Int64,1}:
    1
   10
   19
   21
   27
   36
   64
   65
   70
   71
    ⋮
 9920
 9931
 9935
 9940
 9941
 9942
 9953
 9977
 9978
 9996

julia> ind2sub(size(A),idxs[3])
(19, 1)

I think the first step here is to have tooling that allows for writing an efficient sparse map/broadcast, while the actual convenience of a sparse map/broadcast can come later.

Sacha0 · 2018-01-26T18:28:30Z

Apart from the discussion/pointers in #25760, section (1) of stdlib/SparseArrays/src/higherorderfns.jl begins work towards such an interface. Regrettably, further work along such lines is very much post-1.0.

Discussing unified iteration over sparse and/or structured objects is mostly orthogonal to this issue's purpose. The focal questions of this issue having largely been resolved and the decisions implemented, this issue should be closed. Best!

Sacha0 added sparse Sparse arrays broadcast Applying a function over a collection needs decision A decision on this change is needed design Design of APIs or of the language itself labels Nov 21, 2016

Sacha0 mentioned this issue Nov 28, 2016

map/map! for sparse matrices #19438

Merged

This was referenced Dec 7, 2016

generic broadcast/broadcast! for sparse matrices, with #19372 semantics #19518

Merged

Classes of unary operations in broadcast over sparse arrays #18309

Closed

oscardssmith mentioned this issue Dec 13, 2016

Taking Sparse Arrays Seriously #19573

Closed

Sacha0 mentioned this issue Jan 10, 2017

Fix ambiguity of broadcast!(identity, ::SparseVector, ::SparseVector) #19895

Merged

ChrisRackauckas mentioned this issue Jan 26, 2018

Sparsity support JuliaDiff/FiniteDiff.jl#29

Closed

Sacha0 closed this as completed Jan 26, 2018

Sacha0 mentioned this issue Jan 1, 2019

similar(::SparseMatrixCSC, dims) returns sparse matrix with empty space (#26560) #30435

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Semantics of map! and broadcast! over sparse arrays #19372

Semantics of map! and broadcast! over sparse arrays #19372

Sacha0 commented Nov 21, 2016

martinholters commented Nov 21, 2016

kmsquire commented Nov 21, 2016 •

edited by StefanKarpinski

Loading

stevengj commented Nov 21, 2016

Sacha0 commented Nov 21, 2016

tkelman commented Nov 21, 2016

Sacha0 commented Nov 21, 2016

ChrisRackauckas commented Jan 26, 2018

Sacha0 commented Jan 26, 2018

Semantics of map! and broadcast! over sparse arrays #19372

Semantics of map! and broadcast! over sparse arrays #19372

Comments

Sacha0 commented Nov 21, 2016

martinholters commented Nov 21, 2016

kmsquire commented Nov 21, 2016 • edited by StefanKarpinski Loading

stevengj commented Nov 21, 2016

Sacha0 commented Nov 21, 2016

tkelman commented Nov 21, 2016

Sacha0 commented Nov 21, 2016

ChrisRackauckas commented Jan 26, 2018

Sacha0 commented Jan 26, 2018

kmsquire commented Nov 21, 2016 •

edited by StefanKarpinski

Loading