Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UndefRefError with Dicts and finalizers #14445

Closed
amitmurthy opened this issue Dec 19, 2015 · 7 comments
Closed

UndefRefError with Dicts and finalizers #14445

amitmurthy opened this issue Dec 19, 2015 · 7 comments
Labels
bug Indicates an unexpected problem or unintended behavior

Comments

@amitmurthy
Copy link
Contributor

  | | |_| | | | (_| |  |  Version 0.4.2 (2015-12-06 21:47 UTC)
 _/ |\__'_|_|_|\__'_|  |  Official http://julialang.org/ release
|__/                   |  x86_64-apple-darwin13.4.0

julia> const d = Dict()
Dict{Any,Any} with 0 entries

julia> type Foo
         a::Int
         b::Int
       end

julia> type Bar
         a::Int
         b::Int
       end

julia> function del_entry(id)
         global d
         delete!(d, id)
         nothing
       end
del_entry (generic function with 1 method)

julia> for n in 1:10^8
           f = Foo(n,n)
           finalizer(f, x->del_entry((x.a, x.b)))
           d[(n,n)] = Bar(n,n)
           yield()
       end
ERROR: UndefRefError: access to undefined reference
 in ht_keyindex2 at dict.jl:602
 in setindex! at dict.jl:643
 [inlined code] from none:4
 in anonymous at no file:0

I think this is the cause of #14295 as also the intermittent errors seen during the 0.4.2 release.
cc: @yuyichao , @tkelman

@tkelman
Copy link
Contributor

tkelman commented Dec 19, 2015

Is it okay on master? Worth bisecting, on any branch it occurs.

@amitmurthy
Copy link
Contributor Author

No, same error on master too.

@yuyichao
Copy link
Contributor

The problem seems to be that the finalizer makes the Dict dirty during a GC in the middle of some Dict internal function and confuses it. Disabling the GC around d[(n, n)] = Bar(n, n) works around the problem in this simple case. We could probably add more check for h.dirty although it is also very hard to predict when will allocation happen.... (for example, it can happen when calling another helper function for non-leaftype key or value because that method haven't been compiled yet). Also ref #12157

@amitmurthy amitmurthy added the bug Indicates an unexpected problem or unintended behavior label Dec 20, 2015
@amitmurthy
Copy link
Contributor Author

A workaround is changing delete!(d, id) to @async delete!(d, id). I'll submit a PR for this use of finalizers in the parallel code while this issue is being addressed.

@pearcemc
Copy link

Hi I was wondering if it is likely that the following error is caused by this same problem:

ERROR: UndefRefError: access to undefined reference
 [inlined code] from ./operators.jl:313
 in ht_keyindex at ./dict.jl:568
 in delete! at dict.jl:767
 in remotecall_wait at multi.jl:747
 [inlined code] from multi.jl:368
 in remotecall_wait at multi.jl:751
 in anonymous at task.jl:443

It's happening on a remote so I'm having a whale of a time trying to capture the source of the problem and find exactly what is failing. Thanks!

I'm on:

julia> versioninfo()
Julia Version 0.5.0-dev+749
Commit 83eac1e* (2015-10-13 16:00 UTC)

@amitmurthy
Copy link
Contributor Author

Considering the workaround was merged on Dec 22 and your version is from Oct 13th, it probably is.

@amitmurthy
Copy link
Contributor Author

This does not seem to be an issue anymore. Will revert #14456 once #15923 is fixed.

@vtjnash vtjnash reopened this May 4, 2016
vtjnash added a commit that referenced this issue May 4, 2016
we may want to start indicating which APIs are safe to call from finalizers
which generally will require disabling finalizers in most of the
functions of that API

to facilitate that, expose control from userspace of the finalizer-disable
mechanism already used by codegen (modeled similarly to gc-disable)

to make both easier to use, automatically restore them when an error is thrown

fix #14445
vtjnash added a commit that referenced this issue Jul 26, 2016
also use this `client_refs.lock` to protect other data-structures from
being interrupted by finalizers, in the multi.jl logic

we may want to start indicating which mutable data-structures are safe
to call from finalizers, since generally that isn't possible

to make a finalizer API gc-safe, that code should observe the standard
thread-safe restrictions (there's no guarantee of which thread it'll run on),

plus, if the data-structures uses locks for synchronization,
use the `islocked` pattern (demonstrated herein) in the `finalizer`
to re-schedule the finalizer when the mutable data-structure is not
available for mutation.
this ensures that the lock cannot be acquired recursively,
and furthermore, this pattern will continue to work if finalizers
get moved to their own separate thread.

close #14445
fix #16550
reverts workaround #14456 (shouldn't break #14295, due to new locks)
should fix #16091 (with #17619)
vtjnash added a commit that referenced this issue Jul 26, 2016
also use this `client_refs.lock` to protect other data-structures from
being interrupted by finalizers, in the multi.jl logic

we may want to start indicating which mutable data-structures are safe
to call from finalizers, since generally that isn't possible

to make a finalizer API gc-safe, that code should observe the standard
thread-safe restrictions (there's no guarantee of which thread it'll run on),

plus, if the data-structures uses locks for synchronization,
use the `islocked` pattern (demonstrated herein) in the `finalizer`
to re-schedule the finalizer when the mutable data-structure is not
available for mutation.
this ensures that the lock cannot be acquired recursively,
and furthermore, this pattern will continue to work if finalizers
get moved to their own separate thread.

close #14445
fix #16550
reverts workaround #14456 (shouldn't break #14295, due to new locks)
should fix #16091 (with #17619)
vtjnash added a commit that referenced this issue Aug 4, 2016
also use this `client_refs.lock` to protect other data-structures from
being interrupted by finalizers, in the multi.jl logic

we may want to start indicating which mutable data-structures are safe
to call from finalizers, since generally that isn't possible

to make a finalizer API gc-safe, that code should observe the standard
thread-safe restrictions (there's no guarantee of which thread it'll run on),

plus, if the data-structures uses locks for synchronization,
use the `islocked` pattern (demonstrated herein) in the `finalizer`
to re-schedule the finalizer when the mutable data-structure is not
available for mutation.
this ensures that the lock cannot be acquired recursively,
and furthermore, this pattern will continue to work if finalizers
get moved to their own separate thread.

close #14445
fix #16550
reverts workaround #14456 (shouldn't break #14295, due to new locks)
should fix #16091 (with #17619)
@vtjnash vtjnash closed this as completed in cd8be65 Aug 8, 2016
tkelman pushed a commit that referenced this issue Aug 11, 2016
also use this `client_refs.lock` to protect other data-structures from
being interrupted by finalizers, in the multi.jl logic

we may want to start indicating which mutable data-structures are safe
to call from finalizers, since generally that isn't possible

to make a finalizer API gc-safe, that code should observe the standard
thread-safe restrictions (there's no guarantee of which thread it'll run on),

plus, if the data-structures uses locks for synchronization,
use the `islocked` pattern (demonstrated herein) in the `finalizer`
to re-schedule the finalizer when the mutable data-structure is not
available for mutation.
this ensures that the lock cannot be acquired recursively,
and furthermore, this pattern will continue to work if finalizers
get moved to their own separate thread.

close #14445
fix #16550
reverts workaround #14456 (shouldn't break #14295, due to new locks)
should fix #16091 (with #17619)

(cherry picked from commit cd8be65)
ref #16204
mfasi pushed a commit to mfasi/julia that referenced this issue Sep 5, 2016
also use this `client_refs.lock` to protect other data-structures from
being interrupted by finalizers, in the multi.jl logic

we may want to start indicating which mutable data-structures are safe
to call from finalizers, since generally that isn't possible

to make a finalizer API gc-safe, that code should observe the standard
thread-safe restrictions (there's no guarantee of which thread it'll run on),

plus, if the data-structures uses locks for synchronization,
use the `islocked` pattern (demonstrated herein) in the `finalizer`
to re-schedule the finalizer when the mutable data-structure is not
available for mutation.
this ensures that the lock cannot be acquired recursively,
and furthermore, this pattern will continue to work if finalizers
get moved to their own separate thread.

close JuliaLang#14445
fix JuliaLang#16550
reverts workaround JuliaLang#14456 (shouldn't break JuliaLang#14295, due to new locks)
should fix JuliaLang#16091 (with JuliaLang#17619)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

No branches or pull requests

5 participants