parallel LU factorization memory leak #317
Comments
What do you mean by "memory leak"? Did you get an error? Did the code not do what you expected? |
I can confirm this. Sparse factorizations cannot usually be moved across workers, since the pointer is set to zero during the serializing process. However, in contrast to the Cholesky, the sparse LU recomputes the factorization instead of failing when it is detected that the pointer is null. I'm still not sure why this causes a leak. |
Is that difference intentional? Seems like we should stick to one or the other. |
Yes, and I think there should be a warning in the case of recomputing. I forgot to mention that the leak also happens when I define the LU factorization everywhere like this:
I wasn't aware that even then the LU factorization is moved across workers (and that it's recomputed on the workers).
|
I don't think so. UMFPACK and CHOLMOD have a slightly different design, but our wrappers are also written by different people. The present memory management model in the CHOLMOD wrappers is one I introduced out of necessity, but I've almost not contributed to the UMFPACK wrappers. It might be possible to recompute a The usual tradeoffs between generality and hidden slowness also apply here. It might be possible to avoid the error when moving sparse factorizations, but it will be extremely inefficient to move and recompute the factorization on the new workers. Users probably want to use ...but back to this issue. I think I have an idea about what is causing the memory leak and will get back with an update on that. |
Maybe erroring instead of recomputing in the LU case would be a better option (and more consistent with Cholesky) than quietly re-factorizing. |
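To make the two policies discussed here concrete, the following is a minimal sketch built on a hypothetical `NativeHandle` type. None of these names (`NativeHandle`, `require_valid`, `rebuild_with_warning!`) exist in SuiteSparse or its Julia wrappers, and the real wrappers may well be structured differently.

```julia
# Hypothetical stand-in for a wrapper around a native factorization object.
mutable struct NativeHandle
    ptr::Ptr{Cvoid}
end

# Strict policy (the behaviour this thread attributes to the Cholesky wrapper):
# refuse to continue when the native pointer did not survive serialization.
function require_valid(h::NativeHandle)
    if h.ptr == C_NULL
        throw(ArgumentError("native pointer is null; recompute the factorization on this process"))
    end
    return h
end

# Lenient policy with a warning, as suggested earlier in the thread:
# rebuild the native object, but make the hidden cost visible to the user.
# `rebuild!` is a caller-supplied function (an assumption of this sketch).
function rebuild_with_warning!(rebuild!, h::NativeHandle)
    if h.ptr == C_NULL
        @warn "native pointer is null; recomputing the factorization"
        rebuild!(h)
    end
    return h
end
```

The strict variant makes an accidental move across workers fail loudly, while the lenient variant keeps existing code running at the price of a possibly expensive recomputation.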
@andreasnoack Was this resolved? |
No, this is still an issue. The problem is that we don't register a finalizer when deserializing the factorization, so the memory allocated when the factorization is recomputed is never released. The issue can therefore be reproduced with just:
julia> using SparseArrays, SuiteSparse, LinearAlgebra, Serialization
julia> b = IOBuffer();
julia> F = lu(sparse(1.0I, 10000, 10000));
julia> foreach(1:1000) do i
           seekstart(b)
           serialize(b, F)
           seekstart(b)
           SuiteSparse.UMFPACK.umfpack_symbolic!(deserialize(b))
       end;
I think there are three possible solutions
I suspect we should start with 2 and then open a separate issue to track the development of 3. |
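The missing-finalizer mechanism described above can be illustrated without UMFPACK. The sketch below uses a hypothetical `NativeBuffer` type that owns memory obtained from `Libc.malloc`; the type, the `LIVE` counter, `release!`, and `ensure!` are all assumptions made for illustration and are not the actual UMFPACK wrapper code. It assumes the usual pattern of adding `Serialization.serialize` and `Serialization.deserialize` methods for a custom type, and shows the finalizer being registered on the deserialization path as well as in the constructor.

```julia
using Serialization

# Hypothetical wrapper that owns `n` bytes of natively allocated memory.
mutable struct NativeBuffer
    n::Int
    ptr::Ptr{Cvoid}
end

# Number of live native allocations, tracked only for this demonstration.
const LIVE = Ref(0)

# Free the native memory once; harmless on an already-released object.
function release!(b::NativeBuffer)
    if b.ptr != C_NULL
        Libc.free(b.ptr)
        b.ptr = C_NULL
        LIVE[] -= 1
    end
    return nothing
end

# (Re)allocate the native memory if the pointer is null.
function ensure!(b::NativeBuffer)
    if b.ptr == C_NULL
        b.ptr = Libc.malloc(b.n)
        LIVE[] += 1
    end
    return b
end

# Normal construction path: allocate and register the finalizer.
function NativeBuffer(n::Integer)
    b = ensure!(NativeBuffer(n, C_NULL))
    finalizer(release!, b)
    return b
end

# Only the plain data is written; the pointer is meaningless in another process.
function Serialization.serialize(s::Serialization.AbstractSerializer, b::NativeBuffer)
    Serialization.serialize_type(s, typeof(b))
    serialize(s, b.n)
end

# Deserialization bypasses the constructor above, so the finalizer has to be
# registered here too. Without this call, memory allocated later by `ensure!`
# would never be freed.
function Serialization.deserialize(s::Serialization.AbstractSerializer, ::Type{NativeBuffer})
    n = deserialize(s)
    b = NativeBuffer(n, C_NULL)
    finalizer(release!, b)
    return b
end

io = IOBuffer()
original = NativeBuffer(64)
serialize(io, original)
seekstart(io)
restored = ensure!(deserialize(io))  # native memory is recomputed on first use
```

Calling `finalize(restored)` runs `release!` immediately, which is a convenient way to check that the recomputed allocation is actually reclaimed.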
) Fixes #15450 (cherry picked from commit 356ceee)
I encountered a memory leak when I tried to solve LU-factorizations in parallel: