-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
allow explicit stored zeros in SparseMatrixCSC #5538
Conversation
CC: @ViralBShah @lindahua |
This looks good to me. However, I am not completely sure the need of renaming |
I'm worried about silently making user code do the wrong thing. |
I have been thinking about this. Given that our On the name |
This all seems like a high price in complexity to pay. But if you're really sure you want to put up with all of this, I'll go back to my corner. |
The only extra complexity here is in the naming conventions, it doesn't affect the implementation of algorithms at all.
|
I find the cognitive load goes up surprisingly quickly as you split single concepts into two. Any one case of it seems small, but they tend to pile up. For example see the recent suggestions to add |
An alternative is to just to redefine |
This shifts the extra cognitive load to the minority use cases and has the extra benefit of likely not breaking existing code. |
@mlubin Do you expect actually returning sparse matrices with zero elements in them to the user for general use, or are these for internal use in libraries only? |
I have to note that We can also have another function called |
I actually find |
+1 for |
Good idea, I've updated the PR. My deprecation doesn't seem to work correctly: julia> x = [1,2,3];
julia> nnz(x)
WARNING: nnz has been renamed to countnz and is no longer computed in constant time for sparse matrices. Instead, use nfilled() for the number of elements in a sparse matrix.
in nnz at deprecated.jl:391
WARNING: nnz has been renamed to countnz and is no longer computed in constant time for sparse matrices. Instead, use nfilled() for the number of elements in a sparse matrix.
in nnz at deprecated.jl:391
in nnz at deprecated.jl:392
fatal: error thrown and no exception handler available.
<?::Segmentation fault (core dumped) |
@ViralBShah I don't expect that Base will return sparse matrices with explicit zeros, but user code should be able to handle it. Going forward, we might want to return these matrices in particular cases, like via a keyword option to |
The deprecated function |
Perhaps I should just toss in here that I've always thought that having sparse matrices with a non-zero default would be a very useful thing to have. You can implement them quite simply by storing the default and the delta from the default and noting, e.g. that (sparse1 + default1) + (sparse2 + default2) = (sparse1 + sparse2) + (default1 + default2)
(sparse1 + default1) * (sparse2 + default2) = (sparse1*sparse2 + default1*sparse2 + default2*sparse2) + (default1*default2) and so on. The nice thing about sparse matrices with non-zero defaults is that they're closed under basic algebraic operations with themselves and scalars. |
That could be useful in some abstract settings, not sure if it's worth making everyone who uses sparse matrices deal with that extra complexity though. |
@ViralBShah any more issues? |
I just thought it was a useful forcing function when thinking about the API – what functions make sense regardless of whether the default value is zero or something else? |
This seems orthogonal to the pull request. Supposing we supported non-zero defaults, storing explicit default values in the sparse matrix shouldn't affect the implementation of basic linear algebra operations. |
One use for non-zero defaults is for Yates correction in count data, for |
Fair enough. Is there a name that captures |
I usually refer to explicitly stored entries of a sparse matrix as "structurally non-zero", whether or not their value is actually zero. I believe this is how they are referred to in Tim Davis' "Direct Methods for Sparse Linear Systems". So, something like |
That makes perfect sense for sparse matrices, but doesn't read very well for counting non-zeros in a plain vector, which |
I misread that -- |
How about |
@ViralBShah: I think the meaning of What about this? countne(a, v) # the number of values not equal to v
countnz(a) = countne(a, 0) # applies to both sparse matrix & dense array
nstored(a) # the number of explicitly stored elements
# (I feel stored sounds more accurate than filled)
# for a variant of sparse matrix with default values, we may
# count the non-default values in the following way
countne(a, default(a)) |
It also seems strange to export |
To clarify, @kmsquire, did you want |
To me, the best approach is to not worry about the fancy sparse matrices with non-zero default values, until there's real need of this arises in practice. Each time I see a sparse matrix being talked about, people always think it as a matrix where a dominant portion of the elements are zeros. If we focus on the standard notion of sparse matrices, then |
I'd tend to agree with this. |
I think on non-default sparse matrices if/when they exist is fine. |
I also agree with @lindahua's last comment. |
I agree about just focussing on the common case of sparse matrices. On stored zeros, we also need to think about |
|
Ok, this is good enough to merge then. Could you rebase and merge? |
Done. |
I'll pull the trigger for you. |
allow explicit stored zeros in SparseMatrixCSC
I'm late to the party, but matlab compatibility is a nice feature to have, and i don't see what we gain by deprecating |
Because no one suggested it until now. ;-) |
If we didn't deprecate
|
Following JuliaLang/LinearAlgebra.jl#60, this PR introduces
nfilled
to get the number of elements in a sparse matrix and updates the documentation. Also I've renamednnz
tonumnz
to reflect the fact that it shouldn't be used as frequently. Maybe we could even put in a special deprecation warning letting users know aboutnfilled
.