Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Introduce AbstractSparseMatrixCSC into Type Hierarchy #24973

Closed
wants to merge 7 commits into from

Conversation

JaredCrean2
Copy link
Contributor

@JaredCrean2 JaredCrean2 commented Dec 7, 2017

This PR introduces the minimal set of changes needed to let users define their own sparse matrix CSC types and retain access to the SuiteSparse libraries.

Specifically, it introduces an AbstractSparseMatrixCSC type and modifies the SuiteSparse wrappers to accept it (rather than only SparseMatrixCSC). This allows users to implement new sparse matrix types with their own indexing behavior (ref #15579, #22733, #5424, particularly #9906, edit: added issue 9906).

I have updated the tests to make sure a sample TestCSC <: AbstractSparseMatrixCSC works, and they all pass. For some reason, the ambiguous tests now cause julia to segfault, and I'm not sure how to debug this.

To reproduce, run

  cd ./test
  ../julia ./runtests.jl ambiguous

in the repo root directory. The backtrace generated by the segfault is here and running julia-debug under gdb produced this.

@ararslan ararslan added the sparse Sparse arrays label Dec 7, 2017
@StefanKarpinski
Copy link
Member

StefanKarpinski commented Dec 7, 2017

This seems like an oxymoronic type – it's an abstraction with a very specific implementation...

@JaredCrean2
Copy link
Contributor Author

JaredCrean2 commented Dec 8, 2017

Depends on what you mean by implementation. All AbstractSparseMatrixCSCs need to have some common state, but the state doesn't define a single implementation of a compressed sparse column matrix. There are a number of possible implementations that have different behaviors regarding storing/purging of zeros, mutability of the sparsity pattern, and method of index look ups (some use a log(n) search, others a hash table, others still can do clever tricks when several columns have similar structure).

@ViralBShah
Copy link
Member

But isn't AbstractSparseMatrix sufficient to capture all that and extend? Do we need another abstract type in between? I can totally imagine that people's sparse matrix use cases are really specific. Maybe this will all be much easier when we excise sparse matrices from Base.

@JaredCrean2
Copy link
Contributor Author

AbstractSparseMatrix is not sufficient because some libraries use CSR rather than CSC.

@StefanKarpinski
Copy link
Member

I guess my question is why do you need to dispatch on whether the representation is CSC versus CSR? I'm not saying you don't have to, I'd just like to know why.

@pkofod
Copy link
Contributor

pkofod commented Dec 8, 2017

AbstractSparseMatrix is not sufficient because some libraries use CSR rather than CSC.

Potential subtypes of AbstractSparseMatrixCSC must have something in common since they need a shared parent, right? You want the abstract type so you don't have to rewrite duplicate methods for all instances (that's one use case at least).

However, they will also differ, because if they didn't, a single SparseMatrixCSC type would be enough. The question is then: what is it that AbstractSparseMatrixCSC subtypes share that they won't simply share with all subtypes of AbstractSparseMatrix?

@JaredCrean2
Copy link
Contributor Author

why do you need to dispatch on whether the representation is CSC versus CSR?

If there two libraries that can do LU factorzations, for example, one or CSC and the other for CSR, having abstract types would let them extend lufact for the types they operate on. The benefit is a linear algebra interface the can support different matrix representations.

what is it that AbstractSparseMatrixCSC subtypes share that they won't simply share with all subtypes of AbstractSparseMatrix?

The colptr and rowval fields used by SparseMatrixCSCwould be rowptr and colval for the (currently hypothetical) SparseMatrixCSR. Any algorithm that operates on the underlying data (not using set/getindex) would need to to access the right fields (and for speed the algorithm should iterate over the rows or column depending on the representation).

@ViralBShah
Copy link
Member

As you rightly pointed out these details differ even in different CSC implementations. In any case, I think we should hold this until we can move sparse out of base into stdlib, which will have to wait for a bunch of other things that are in flight.

@StefanKarpinski
Copy link
Member

We're also going to reserve the right to insert supertypes into the type hierarchy and explicitly warn people not to depend on the precise supertype of a type but code to subtype relationships instead, so this could be done at any point in the 1.x timeframe. There will be a "writing future-proof code" guide spelling out dos and don'ts.

@JaredCrean2
Copy link
Contributor Author

Is there a particular benefit to waiting? I was really hoping to get this into 0.7 (its one of the last things to do before upgrading my codebase from 0.4). Most of the changes are to stdlib.

@ViralBShah
Copy link
Member

ViralBShah commented Dec 8, 2017

I still don't understand the rationale. For the different kinds of CSC implementations, how does having AbstractSparseMatrixCSC benefit? Are you thinking that these different implementations might be able to reuse some of the generic stuff in the current implementation? Much of the existing code is not sharable, if you change some of the underlying assumptions around stored zeros. Again, I doubt it is generic enough for hash tables in the data structure.

On CSC vs. CSR, yes, I agree it is good to be able to distinguish and having CSR would be valuable too - but I'm not sure that points in the direction of AbstractSparseMatrixCSC.

Just trying to understand what in your codebase needs this. Can you share relevant bits of code if open source?

@ViralBShah
Copy link
Member

As I have suggested, we should build a new sparse implementation in an external package, which is flexible and abstract as necessary.

@ViralBShah ViralBShah closed this Dec 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sparse Sparse arrays
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants