stop using finalizers for resource management? #11207
Comments
Hmmm... I was just going to start using finalizers (but still had some questions about them to investigate). |
Go has defer: after f, err := os.Open(file), a defer f.Close() pushes f.Close() onto a stack of function calls that get evaluated when the surrounding function returns, see http://blog.golang.org/defer-panic-and-recover -Jacob
|
If one is rethinking this, the machinations of CUDArt to manage GPU memory in a GC-compatible way are probably amusing fodder for thought. The arrival of NOTE: 2nd link updated to correct target. |
See also #1037 It would be great to get rid of finalizers entirely, but that's probably not realistic. For starters, I would still allow finalizers but not use them to close files and such. @ScottPJones you can definitely use finalizers to call your C release code. Finalizers can be associated with a type by adding them to all instances in the constructor :) |
@JeffBezanson it is very useful to have a mechanism that frees limited resources (like file descriptors) as soon as reasonably possible. As you say, finalizers will eventually get around to it, but that doesn't prevent exhaustion in the meantime. One question: are finalizers always run, no matter how the program exits, so it's always possible to be sure no resource remains locked? |
@elextr yes - that sort of exhaustion has been a big issue with the sort of code that I write, where it has to stay running with minimal downtime for years... |
@ScottPJones then it's probably best to do your resource management explicitly yourself; certainly don't rely on anything in the semantics of any language unless it is specified and guaranteed. Specifically, the semantics of the Julia GC, and hence of finalizers, are not guaranteed: the GC happens to have recently changed to a generational one in 0.4, it is not generational in 0.3, and it may change again in 0.4/0.5 when threading lands (for example). All you can know about a finalizer is that the object it relates to is no longer in use when the finalizer is run, but my reading of this suggests that it may not be run for bitstypes, hence my question above. |
Another use case is the interaction of the Java and Julia GCs in JavaCall. Objects retrieved from Java into Julia need to be explicitly dereferenced in Java when they are no longer used within Julia. This is achieved via finalizers, which works fine, except that the Java VM can be under greater memory pressure than the Julia VM. In that case, the JVM can run out of memory before Julia decides that the GC needs to be run. |
@ScottPJones, I hear you. In several places like HDF5 and CUDArt, the key was to write code like
which guarantees that |
@elextr I should have been clearer... I'm not planning on relying on the finalizers at all.
myObj = DA.PackedData(1000) # creates a packed data buffer with initial size at least 1000 bytes.
push!(myObj, "Encode a string")
push!(myObj, 5.2332) # encode an IEEE binary floating point number
save!(myDBMS, myObj) # write packed record out as a row
release(myObj) # Release the underlying C buffer object, 0 out the pointer in the Julia myObj object...
What happened many times, in the REPL, is I accidentally set myObj to something else before calling release... so I lost memory each time...
@timholy That's good to know, but is that sort of syntax only for files? (sorry, my newbieness with Julia is showing again!) |
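A minimal sketch of the pattern discussed above: an explicit release that zeroes the pointer, plus a finalizer attached in the constructor purely as a safety net for the "rebound the variable in the REPL" case. The names PackedBuffer and release! are hypothetical (they are not the DA.PackedData API), Libc.malloc stands in for the real C allocation, and the finalizer(f, obj) argument order is the one used by current Julia (older releases took the object first).

```julia
# Hypothetical wrapper around a C-allocated buffer.
mutable struct PackedBuffer
    ptr::Ptr{Cvoid}
    function PackedBuffer(nbytes::Integer)
        obj = new(Libc.malloc(nbytes))   # stand-in for the real C allocator
        finalizer(release!, obj)         # safety net only; timing is up to the GC
        return obj
    end
end

# Explicit release. Safe to call more than once, because the pointer
# is set to C_NULL after the first call.
function release!(obj::PackedBuffer)
    if obj.ptr != C_NULL
        Libc.free(obj.ptr)
        obj.ptr = C_NULL
    end
    return nothing
end

buf = PackedBuffer(1000)
release!(buf)   # prompt, deterministic release
release!(buf)   # no-op: the pointer is already C_NULL
```

If buf is rebound before release! is called, the finalizer should eventually free the memory, but only whenever the GC gets around to it, so the explicit call remains the primary mechanism.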
@ScottPJones, it's a standard julia convention, see http://docs.julialang.org/en/latest/manual/functions/#do-block-syntax-for-function-arguments. You have to write a version of your function that takes another function as the first argument (see, e.g., the methods defined for |
@ScottPJones No, the do syntax is not restricted to files, see http://julia.readthedocs.org/en/latest/manual/functions/#do-block-syntax-for-function-arguments I think this is the standard way to do it in Julia and, as @timholy said, it is used in various places in Julia land. We also use it in some places in Gtk.jl. Where finalizers are important is when the object goes out of scope. In Gtk.jl, for instance, we have a situation where they are really needed. |
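For illustration, the do-block convention described in the two comments above might look like this for a non-file resource. Buffer, release! and with_buffer are hypothetical names made up for the sketch; the only real convention is that the function accepts another function as its first argument, which is what makes the do syntax work.

```julia
# Hypothetical resource type standing in for something that needs cleanup.
mutable struct Buffer
    data::Vector{UInt8}
    released::Bool
end

Buffer(n::Integer) = Buffer(Vector{UInt8}(undef, n), false)

# Explicit cleanup for the hypothetical resource.
function release!(b::Buffer)
    b.released = true
    empty!(b.data)
    return nothing
end

# Function-first method: this is what enables `with_buffer(n) do b ... end`.
function with_buffer(f::Function, n::Integer)
    b = Buffer(n)
    try
        return f(b)
    finally
        release!(b)   # runs on normal return and when f throws
    end
end

# The do block becomes the first argument `f`.
result = with_buffer(16) do b
    length(b.data)
end
```

The release happens when the block exits, independent of the GC, which is the property the thread is after. It only helps when the resource's lifetime fits inside the block.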
oh Tim is faster, sorry. |
It definitely takes a while, no apologies needed. |
@JeffBezanson: What is the actual proposal of this issue? Isn't the do syntax already consistently used for files? I think finalizers are useful when the scope is not local. |
We probably can't remove finalizers altogether because then we would be leaking resources. I think this issue is more about conventions on "good practice for resource management", since the biggest problem (besides performance) is that the gc is very lazy: it only works under pressure, that is, memory pressure. It has no way to know e.g. how many file descriptors are open in the program, so if your handle object is small, the gc will be completely fine keeping it around for a long time while you exhaust your open fd limit. I don't have any good ideas about this, by the way... |
I found finalizers unreliable. When interfacing with C code, I would really prefer something like Go |
@tknopp good question. My proposal would be
The last item sounds drastic, but as it is finalizers might not be invoked for a very long time, and unpredictably. You could still use finalizers as an escape hatch. If you're not sure how to handle releasing some object, you can just call |
Ok. Is there some issue what |
I'll just add another small issue about using finalizers with IO objects which I very recently discovered: on Windows, trying to call |
@JeffBezanson how do you propose to handle objects whose lifetime exceeds the scope of the |
If an object lifetime exceeds the local scope, you can't use |
Another idea is to have some types opt into reference counting and finalize them when their counts get to zero. It's not entirely clear to me how to make a mix of refcounted and non-refcounted objects work, however. |
watch out, may be flayed by mentioning reference counting :-) |
the problem with mixed refcount is that a refcounted object can still be kept alive by a non refcounted one (worst case : the object keeping it alive is in oldgen). Then you don't get the "immediate finalization" property. |
To alleviate the late finalization problem we could also teach the gc about other kinds of resources so that they can be taken into account in the collection heuristics. So e.g. you could register a "file descriptor" or "GPU memory" resource, and then explicitly say: I allocated X of this, and running this finalizer will get me Y of this back. Painful to implement though. And it can only make gc overhead worse (by collecting more often). |
Yes, in such a scheme every reference that could transitively reach anything refcounted would need to maintain a refcount. That includes most abstract slots, and slots in data structures that can refer to refcounted objects. But that still excludes most things we care about the performance of. |
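To make the opt-in refcounting idea concrete, here is a purely manual sketch. It is not a language feature and not what was proposed for the runtime: RefCounted, retain! and release! are hypothetical names, and every holder has to call them by hand, which is exactly the bookkeeping the comments above say the compiler would need to do for every slot that can reach a refcounted object.

```julia
# Hypothetical manual reference-counting wrapper.
mutable struct RefCounted{T}
    value::T
    count::Int
    on_zero::Function   # cleanup to run when the count reaches zero
end

# A new wrapper starts with one owner.
RefCounted(value, on_zero) = RefCounted(value, 1, on_zero)

# Register an additional owner.
retain!(r::RefCounted) = (r.count += 1; r)

# Drop one owner; run the cleanup immediately when the last one is gone.
function release!(r::RefCounted)
    r.count > 0 || error("released more often than retained")
    r.count -= 1
    r.count == 0 && r.on_zero(r.value)
    return r.count
end

closed = String[]
r = RefCounted("fd", v -> push!(closed, v))
retain!(r)     # second owner
release!(r)    # first owner done; cleanup not yet run
release!(r)    # last owner done; cleanup runs here
```

A plain (non-refcounted) object that holds r without calling retain! defeats the scheme, which is the mixed-refcount problem raised above.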
Ah, ok, thanks... I misunderstood. Yes, it should be sufficient to have finalisers associated with types. Currently, every object gets the same finaliser function. Of course, the type parameters and fields will need to be available to the finaliser. |
I wonder if the mmap and WeakKeyDict cases call for something like
finalize(a) do f
    # code to finalize `a` which is a type not declared with a `finalize` method
end
This wouldn't actually do the finalizing, just "move" Not sure how feasible "moving an object to the finalization pool of objects" would be though.... |
Is #10960 then an artifact of the new gc? That could explain memory leaks with shared and distributed arrays. An ability to explicitly "free" remote objects will be quite useful, especially in cases where people are using distributed arrays across multiple hosts specifically to leverage every bit of memory available. |
@quinnj Carrying the discussion from #11280 over here, as requested...
That's precisely what I said I'd done, I have a pointer to something that needs to be finalized, so I simply set it to zero (C_NULL) in |
Just noticed this when there are multiple finalizers defined for an object.
Found it a little odd that all the finalizers are not executed together at the first |
@amitmurthy This is somewhat related to the (sub-)issue I noticed in #11814 (comment). My guess is that running too many finalizers at the same time would cause too long a pause, but @carnaval should know for sure. |
FWIW, the issue above in #11207 (comment) is solved by #13995 . |
Cool. And regarding the topic of this issue - It is not just about files, I don't think we have a choice but to use finalizers for remote references. We can document that users can manually call |
Isn't this issue essentially a duplicate of #7721? |
Finalizers are inefficient and unpredictable. And with the new GC, it might take much longer to get around to freeing an object, therefore tying up its resources longer. Ideally releasing external resources should not be tied to how memory management works.
We are already not far from this with the `open(f) do` construct. I think that and/or `with` should be used. Perhaps there could be some other mechanism for registering files to close eventually.
Discussed this with @carnaval.
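As a small illustration of the existing construct mentioned above, here is the do-block form of open, written in current Julia syntax and using a temporary file so the sketch is self-contained. The handle Ref exists only to show that the stream is closed once the block exits.

```julia
path, io = mktemp()   # temporary file for the example
close(io)

open(path, "w") do f
    write(f, "hello")
end                   # f is closed here, even if the block throws

handle = Ref{IOStream}()
contents = open(path) do f
    handle[] = f      # keep a reference to inspect after the block
    read(f, String)
end

still_open = isopen(handle[])   # false: closed on block exit, not by the GC
rm(path)
```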