WIP: World-age partition bindings
This implements world-age partitioning of bindings as proposed in #40399.
In effect, much like methods, the global view of bindings now depends on
your currently executing world. This means that `const` bindings can now
have different values in different worlds. In principle it also means
that regular global variables could have different values in different
worlds, but there is currently no case where the system does this.

The reasons for this change are manifold:

1. The primary motivation is to permit Revise to redefine structs.
   This has been a feature request since the very beginning of Revise
   (timholy/Revise.jl#18) and there have been
   numerous attempts over the past 7 years to address this, as well as
   countless duplicate feature requests. A past attempt to implement the
   necessary Julia support in #22721 failed because the consequences and
   semantics of re-defining bindings were not sufficiently worked out.
   One way to think of this implementation (at least with respect to types)
   is that it provides a well-grounded implementation of #22721.

2. A secondary motivation is to make `const`-redefinition no longer UB
   (although `const` redefinition will still have a significant performance
   penalty, so it is not recommended). See e.g. the full discussion in #54099.

3. Not currently implemented, but this mechanism can be used to re-compile code
   where bindings are introduced after the first compile, which is a common
   performance trap for new users (#53958).

4. Not currently implemented, but this mechanism can be used to clarify the semantics
   of bindings import and resolution to address issues like #14055.

In this PR:
 - `Binding` gets `min_world`/`max_world` fields like `CodeInstance`; successive bindings for the same name are chained into a linked list
 - Various lookup functions walk this linked list using the current task's world age as the key
 - Inference accumulates world bounds as it would for methods
 - Upon binding replacement, we walk all methods in the system, invalidating those whose
   uninferred IR references the replaced GlobalRef
 - One primary complication is that our IR definition permits `const` globals in value position,
   but if binding replacement is permitted, the validity of this may change after the fact.
   To address this, there is a helper in `Core.Compiler` that gets invoked in the type inference
   world and will rewrite the method source to be legal in all worlds.
 - A new `@world` macro can be used to access bindings from old world ages. This is used in printing
   for old objects.
 - The `const`-override behavior was changed to only be permitted at toplevel. The warning about
   it being UB was removed.
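
To make the lookup rule in the list above concrete, here is a minimal toy model of a world-partitioned binding list. It is only a sketch: `ToyBinding` and `toy_lookup` are hypothetical names invented for illustration, not the runtime's `Core.Binding` or its C lookup routine, and the list is ordered oldest-first as in this PR.

```julia
# Toy model only: `ToyBinding` and `toy_lookup` are hypothetical names,
# not the runtime's `Core.Binding` or its lookup implementation.
struct ToyBinding
    value::Any
    min_world::UInt
    max_world::UInt
    next::Union{ToyBinding, Nothing}   # next partition for the same name
end

# Walk the list and return the partition whose world range contains `world`.
function toy_lookup(head::Union{ToyBinding, Nothing}, world::UInt)
    b = head
    while b !== nothing
        if b.min_world <= world <= b.max_world
            return b
        end
        b = b.next
    end
    return nothing   # no binding existed in that world
end

# Two definitions of the same name: the old one covers worlds 5:9,
# its replacement covers world 10 onward.
newer = ToyBinding(:new_definition, UInt(10), typemax(UInt), nothing)
older = ToyBinding(:old_definition, UInt(5), UInt(9), newer)

toy_lookup(older, UInt(7)).value    # :old_definition
toy_lookup(older, UInt(12)).value   # :new_definition
toy_lookup(older, UInt(3))          # nothing
```

The point of the model is that the same name can resolve to different values depending on the world used as the key, which is also what inference has to record as a validity range.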

Of particular note, this PR does not include any mechanism for invalidating methods whose signatures
were created using an old Binding (or types whose fields were the result of a binding evaluation).
There was some discussion among the compiler team of whether such a mechanism should exist in base,
but the consensus was that it should not. In particular, although uncommon, a pattern like:
```
f() = Any
g(::f()) = 1
f() = Int
```
does not redefine `g`. Thus, to fully address the Revise issue, additional code will be required in
Revise to track the dependency of various signatures and struct definitions on bindings.
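
The signature behavior described here can be checked directly with standard reflection. This is a small sketch that only relies on `methods` and `only` from Base; the assertions describe existing Julia behavior, not anything introduced by this PR.

```julia
f() = Any
g(::f()) = 1      # the signature is computed now, so this defines g(::Any)
f() = Int         # replaces f's method; g's signature is not recomputed

m = only(methods(g))
m.sig == Tuple{typeof(g), Any}   # true: still the signature from the old f()
g("not an Int")                  # 1
```

A tool like Revise would have to remember that `g`'s signature was computed by calling `f()` in order to know that `g` needs to be re-evaluated.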

```
julia> struct Foo
               a::Int
       end

julia> g() = Foo(1)
g (generic function with 1 method)

julia> g()
Foo(1)

julia> f(::Foo) = 1
f (generic function with 1 method)

julia> fold = Foo(1)
Foo(1)

julia> struct Foo
               a::Int
               b::Int
       end

julia> g()
ERROR: MethodError: no method matching Foo(::Int64)
The type `Foo` exists, but no method is defined for this combination of argument types when trying to construct it.

Closest candidates are:
  Foo(::Int64, ::Int64)
   @ Main REPL[6]:2
  Foo(::Any, ::Any)
   @ Main REPL[6]:2

Stacktrace:
 [1] g()
   @ Main ./REPL[2]:1
 [2] top-level scope
   @ REPL[7]:1

julia> f(::Foo) = 2
f (generic function with 2 methods)

julia> methods(f)
 [1] f(::Foo)
     @ REPL[8]:1
 [2] f(::@world(Foo, 0:26898))
     @ REPL[4]:1

julia> fold
@world(Foo, 0:26898)(1)
```
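
For comparison, the existing world-age mechanism for methods already behaves this way. The sketch below uses only `Base.get_world_counter` and `Base.invoke_in_world`, both of which exist independently of this PR; the names `h` and `old_world` are just for illustration.

```julia
h() = 1
old_world = Base.get_world_counter()   # a world in which h() returns 1
h() = 2                                # overwriting the method starts a new world

h()                                    # 2, resolved in the latest world
Base.invoke_in_world(old_world, h)     # 1, resolved in the captured world
```

The binding partitioning in this PR extends the same idea from method tables to global bindings, so that a name such as `Foo` can resolve differently in `old_world` than in the latest world.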

On my machine, the validation required upon binding replacement for the full system image takes about 200ms.
With CedarSim loaded (I tried OmniPackage, but it's not working on master), this increases about 5x. That's
a fair bit of compute, but not the end of the world. Still, Revise may have to batch its validation. There
may also be opportunities for performance improvement by operating on the compressed representation directly.
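
The cost comes from decompressing and scanning the lowered source of every method. A rough sketch of that kind of scan, using only `Base.uncompressed_ir`, is shown below. The helpers `mentions` and `method_mentions` are hypothetical names, and the sketch only looks at plain statements and `Expr` arguments, so it is far less thorough than the real pass (which also rewrites the IR).

```julia
# Hypothetical helper: does `x` (a statement or one of its arguments)
# mention the binding `gr`?
function mentions(x, gr::GlobalRef)
    x isa GlobalRef && return x.mod === gr.mod && x.name === gr.name
    x isa Expr && return any(a -> mentions(a, gr), x.args)
    return false
end

# Hypothetical helper: decompress a method's lowered (uninferred) source
# and scan every statement.
function method_mentions(m::Method, gr::GlobalRef)
    src = Base.uncompressed_ir(m)
    return any(stmt -> mentions(stmt, gr), src.code)
end

const LIMIT = 10
uses_limit(x) = x < LIMIT
no_limit(x) = x + 1

gr = GlobalRef(@__MODULE__, :LIMIT)
method_mentions(only(methods(uses_limit)), gr)   # true
method_mentions(only(methods(no_limit)), gr)     # false
```

Running such a scan once per replaced binding over every loaded method is what makes batching attractive for Revise.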

- [ ] Do we want to change the resolution time of bindings to (semantically) resolve them immediately?
- [ ] Do we want to introduce guard bindings when inference assumes the absence of a binding?

- [ ] Precompile re-validation
- [ ] Various cleanups in the accessors
- [ ] Invert the order of the binding linked list to make the most recent one always the head of the list
- [ ] CodeInstances need forward edges for GlobalRefs not part of the uninferred code
- [ ] Generated function support
Keno committed Jun 9, 2024
1 parent 77c28ab commit 06e6939
Showing 25 changed files with 527 additions and 59 deletions.
3 changes: 3 additions & 0 deletions base/Base.jl
@@ -550,6 +550,9 @@ for m in methods(include)
delete_method(m)
end

# Arm binding invalidation mechanism
const invalidate_code_for_globalref! = Core.Compiler.invalidate_code_for_globalref!

# This method is here only to be overwritten during the test suite to test
# various sysimg related invalidation scenarios.
a_method_to_overwrite_in_test() = inferencebarrier(1)
2 changes: 0 additions & 2 deletions base/boot.jl
@@ -541,8 +541,6 @@ GenericMemoryRef(mem::GenericMemory) = memoryref(mem)
GenericMemoryRef(mem::GenericMemory, i::Integer) = memoryref(mem, i)
GenericMemoryRef(mem::GenericMemoryRef, i::Integer) = memoryref(mem, i)

const Memory{T} = GenericMemory{:not_atomic, T, CPU}
const MemoryRef{T} = GenericMemoryRef{:not_atomic, T, CPU}
const AtomicMemory{T} = GenericMemory{:atomic, T, CPU}
const AtomicMemoryRef{T} = GenericMemoryRef{:atomic, T, CPU}

36 changes: 32 additions & 4 deletions base/compiler/abstractinterpretation.jl
@@ -2826,6 +2826,7 @@ end
isdefined_globalref(g::GlobalRef) = !iszero(ccall(:jl_globalref_boundp, Cint, (Any,), g))
isdefinedconst_globalref(g::GlobalRef) = isconst(g) && isdefined_globalref(g)

# TODO: This should verify that there is only one binding for this globalref
function abstract_eval_globalref_type(g::GlobalRef)
if isdefinedconst_globalref(g)
return Const(ccall(:jl_get_globalref_value, Any, (Any,), g))
@@ -2834,10 +2835,37 @@ function abstract_eval_globalref_type(g::GlobalRef)
ty === nothing && return Any
return ty
end
abstract_eval_global(M::Module, s::Symbol) = abstract_eval_globalref_type(GlobalRef(M, s))

function abstract_eval_binding_type(b::Core.Binding)
if isdefined(b, :owner)
b = b.owner
end
if isconst(b) && isdefined(b, :value)
return Const(b.value)
end
isdefined(b, :ty) || return Any
ty = b.ty
ty === nothing && return Any
return ty
end
function abstract_eval_global(M::Module, s::Symbol)
# TODO: This needs to add a new globalref to globalref edges list
return abstract_eval_globalref_type(GlobalRef(M, s))
end

function lookup_binding(world::UInt, g::GlobalRef)
ccall(:jl_lookup_module_binding, Any, (Any, Any, UInt), g.mod, g.name, world)::Union{Core.Binding, Nothing}
end

function abstract_eval_globalref(interp::AbstractInterpreter, g::GlobalRef, sv::AbsIntState)
rt = abstract_eval_globalref_type(g)
binding = lookup_binding(get_inference_world(interp), g)
if binding === nothing
# TODO: We could allocate a guard entry here, but that would require
# going through a binding replacement if the binding ends up being used.
return RTEffects(Any, UndefVarError, Effects(EFFECTS_TOTAL; consistent=ALWAYS_FALSE, nothrow=false, inaccessiblememonly=ALWAYS_FALSE))
end
update_valid_age!(sv, WorldRange(binding.min_world, binding.max_world))
rt = abstract_eval_binding_type(binding)
consistent = inaccessiblememonly = ALWAYS_FALSE
nothrow = false
if isa(rt, Const)
@@ -2848,12 +2876,12 @@ function abstract_eval_globalref(interp::AbstractInterpreter, g::GlobalRef, sv::
end
elseif InferenceParams(interp).assume_bindings_static
consistent = inaccessiblememonly = ALWAYS_TRUE
if isdefined_globalref(g)
if isdefined(binding, :value)
nothrow = true
else
rt = Union{}
end
elseif isdefinedconst_globalref(g)
elseif isdefined(binding, :value) && isconst(binding)
nothrow = true
end
return RTEffects(rt, nothrow ? Union{} : UndefVarError, Effects(EFFECTS_TOTAL; consistent, nothrow, inaccessiblememonly))
2 changes: 2 additions & 0 deletions base/compiler/compiler.jl
@@ -222,5 +222,7 @@ ccall(:jl_set_typeinf_func, Cvoid, (Any,), typeinf_ext_toplevel)
include("compiler/parsing.jl")
Core._setparser!(fl_parse)

include("compiler/invalidation.jl")

end # baremodule Compiler
))
161 changes: 161 additions & 0 deletions base/compiler/invalidation.jl
@@ -0,0 +1,161 @@
# GlobalRef/binding reflection
# TODO: This should potentially go in reflection.jl, but `@atomic` is not available
# there.
struct GlobalRefIterator
mod::Module
end
globalrefs(mod::Module) = GlobalRefIterator(mod)

function iterate(gri::GlobalRefIterator, i = 1)
m = gri.mod
table = ccall(:jl_module_get_bindings, Ref{SimpleVector}, (Any,), m)
i > length(table) && return nothing
b = table[i]
b === nothing && return iterate(gri, i+1)
return ((b::Core.Binding).globalref, i+1)
end

const TYPE_TYPE_MT = Type.body.name.mt
const NONFUNCTION_MT = MethodTable.name.mt
function foreach_module_mtable(visit, m::Module)
for gb in globalrefs(m)
binding = gb.binding
if isconst(binding)
isdefined(binding, :value) || continue
v = @atomic binding.value
uw = unwrap_unionall(v)
name = gb.name
if isa(uw, DataType)
tn = uw.name
if tn.module === m && tn.name === name && tn.wrapper === v && isdefined(tn, :mt)
# this is the original/primary binding for the type (name/wrapper)
mt = tn.mt
if mt !== nothing && mt !== TYPE_TYPE_MT && mt !== NONFUNCTION_MT
@assert mt.module === m
visit(mt) || return false
end
end
elseif isa(v, Module) && v !== m && parentmodule(v) === m && _nameof(v) === name
# this is the original/primary binding for the submodule
foreach_module_mtable(visit, v) || return false
elseif isa(v, MethodTable) && v.module === m && v.name === name
# this is probably an external method table here, so let's
# assume so as there is no way to precisely distinguish them
visit(v) || return false
end
end
end
return true
end

function foreach_reachable_mtable(visit)
visit(TYPE_TYPE_MT) || return
visit(NONFUNCTION_MT) || return
if isdefined(Core.Main, :Base)
for mod in Core.Main.Base.loaded_modules_array()
foreach_module_mtable(visit, mod)
end
else
foreach_module_mtable(visit, Core)
foreach_module_mtable(visit, Core.Main)
end
end

function invalidate_code_for_globalref!(gr::GlobalRef, src::CodeInfo)
found_any = false
labelchangemap = nothing
stmts = src.code
function get_labelchangemap()
if labelchangemap === nothing
labelchangemap = fill(0, length(stmts))
end
labelchangemap
end
isgr(g::GlobalRef) = gr.mod == g.mod && gr.name === g.name
isgr(g) = false
for i = 1:length(stmts)
stmt = stmts[i]
if isgr(stmt)
found_any = true
continue
end
found_arg = false
ngrs = 0
for ur in userefs(stmt)
arg = ur[]
# If any of the GlobalRefs in this stmt match the one that
# we care about, we need to move out all GlobalRefs to preserve
# effect order, in case we later invalidate a different GR
if isa(arg, GlobalRef)
ngrs += 1
if isgr(arg)
@assert !isa(stmt, PhiNode)
found_arg = found_any = true
break
end
end
end
if found_arg
get_labelchangemap()[i] += ngrs
end
end
next_empty_idx = 1
if labelchangemap !== nothing
cumsum_ssamap!(labelchangemap)
new_stmts = Vector(undef, length(stmts)+labelchangemap[end])
new_ssaflags = Vector{UInt32}(undef, length(new_stmts))
new_debuginfo = DebugInfoStream(nothing, src.debuginfo, length(new_stmts))
new_debuginfo.def = src.debuginfo.def
for i = 1:length(stmts)
stmt = stmts[i]
urs = userefs(stmt)
new_stmt_idx = i+labelchangemap[i]
for ur in urs
arg = ur[]
if isa(arg, SSAValue)
ur[] = SSAValue(arg.id + labelchangemap[arg.id])
elseif next_empty_idx != new_stmt_idx && isa(arg, GlobalRef)
new_debuginfo.codelocs[3next_empty_idx - 2] = i
new_stmts[next_empty_idx] = arg
new_ssaflags[next_empty_idx] = UInt32(0)
ur[] = SSAValue(next_empty_idx)
next_empty_idx += 1
end
end
@assert new_stmt_idx == next_empty_idx
new_stmts[new_stmt_idx] = urs[]
new_debuginfo.codelocs[3new_stmt_idx - 2] = i
new_ssaflags[new_stmt_idx] = src.ssaflags[i]
next_empty_idx = new_stmt_idx+1
end
src.code = new_stmts
src.ssavaluetypes = length(new_stmts)
src.ssaflags = new_ssaflags
src.debuginfo = Core.DebugInfo(new_debuginfo, length(new_stmts))
end
return found_any
end

function invalidate_code_for_globalref!(gr::GlobalRef, new_max_world::UInt)
valid_in_valuepos = false
foreach_reachable_mtable() do mt::MethodTable
for method in MethodList(mt)
if isdefined(method, :source)
src = _uncompressed_ir(method)
old_stmts = src.code
if invalidate_code_for_globalref!(gr, src)
if src.code !== old_stmts
method.debuginfo = src.debuginfo
method.source = src
method.source = ccall(:jl_compress_ir, Ref{String}, (Any, Ptr{Cvoid}), method, C_NULL)
end

for mi in specializations(method)
ccall(:jl_invalidate_method_instance, Cvoid, (Any, UInt), mi, new_max_world)
end
end
end
end
return true
end
end
44 changes: 44 additions & 0 deletions base/essentials.jl
@@ -1069,6 +1069,50 @@ function invoke_in_world(world::UInt, @nospecialize(f), @nospecialize args...; k
return Core._call_in_world(world, Core.kwcall, kwargs, f, args...)
end

"""
@world(sym, world)
Resolve the binding `sym` in world `world`. See [`invoke_in_world`](@ref) for running
arbitrary code in fixed worlds. `world` may be a `UnitRange`, in which case the macro
will error unless the binding is valid and has the same value across the entire world
range.
The `@world` macro is primarily used in the printing of bindings that are no longer available
in the current world.
## Example
```
julia> struct Foo; a::Int; end
Foo
julia> fold = Foo(1)
julia> Int(Base.get_world_counter())
26866
julia> struct Foo; a::Int; b::Int end
Foo
julia> fold
@world(Foo, 26866)(1)
```
!!! compat "Julia 1.12"
This functionality requires at least Julia 1.12.
"""
macro world(sym, world)
if isa(sym, Symbol)
return :($(_resolve_in_world)($world, $(QuoteNode(GlobalRef(__module__, sym)))))
elseif isa(sym, GlobalRef)
return :($(_resolve_in_world)($world, $(QuoteNode(sym))))
else
error("`@world` requires a symbol or GlobalRef")
end
end

_resolve_in_world(world::Integer, gr::GlobalRef) =
invoke_in_world(UInt(world), Core.getglobal, gr.mod, gr.name)

inferencebarrier(@nospecialize(x)) = compilerbarrier(:type, x)

"""
1 change: 1 addition & 0 deletions base/exports.jl
@@ -810,6 +810,7 @@ export
@invoke,
invokelatest,
@invokelatest,
@world,

# loading source files
__precompile__,
13 changes: 13 additions & 0 deletions base/range.jl
@@ -1680,3 +1680,16 @@ function show(io::IO, r::LogRange{T}) where {T}
show(io, length(r))
print(io, ')')
end

# Implementation detail of @world
# The rest of this is defined in essentials.jl, but UnitRange is not available
function _resolve_in_world(world::UnitRange, gr::GlobalRef)
# Validate that this binding's reference covers the entire world range
bnd = ccall(:jl_lookup_module_binding, Any, (Any, Any, UInt), gr.mod, gr.name, first(world))::Union{Core.Binding, Nothing}
if bnd !== nothing
if bnd.max_world < last(world)
error("Binding does not cover the full world range")
end
end
_resolve_in_world(last(world), gr)
end
15 changes: 15 additions & 0 deletions base/reflection.jl
@@ -344,6 +344,9 @@ function isconst(g::GlobalRef)
return ccall(:jl_globalref_is_const, Cint, (Any,), g) != 0
end

isconst(b::Core.Binding) =
ccall(:jl_binding_is_const, Cint, (Any,), b) != 0

"""
isconst(t::DataType, s::Union{Int,Symbol}) -> Bool
@@ -2595,6 +2598,18 @@ function delete_method(m::Method)
ccall(:jl_method_table_disable, Cvoid, (Any, Any), get_methodtable(m), m)
end

"""
delete_binding(mod::Module, sym::Symbol)
Force the binding `mod.sym` to be undefined again, allowing it to be redefined.
Note that this operation is very expensive, requiring a full scan of all code in the system,
as well as potential recompilation of any methods that (may) have used binding
information.
"""
function delete_binding(mod::Module, sym::Symbol)
ccall(:jl_disable_binding, Cvoid, (Any,), GlobalRef(mod, sym))
end

function get_methodtable(m::Method)
mt = ccall(:jl_method_get_table, Any, (Any,), m)
if mt === nothing
21 changes: 21 additions & 0 deletions base/show.jl
@@ -1040,6 +1040,24 @@ function is_global_function(tn::Core.TypeName, globname::Union{Symbol,Nothing})
return false
end

function check_world_bounded(tn)
bnd = ccall(:jl_get_module_binding, Any, (Any, Any, Cint, UInt), tn.module, tn.name, false, 1)::Core.Binding
if bnd !== nothing
while true
if isdefined(bnd, :owner) && isdefined(bnd, :value)
if bnd.value <: tn.wrapper
max_world = @atomic bnd.max_world
max_world == typemax(UInt) && return nothing
return Int(bnd.min_world):Int(max_world)
end
end
isdefined(bnd, :next) || break
bnd = @atomic bnd.next
end
end
return nothing
end

function show_type_name(io::IO, tn::Core.TypeName)
if tn === UnionAll.name
# by coincidence, `typeof(Type)` is a valid representation of the UnionAll type.
@@ -1068,7 +1086,10 @@ function show_type_name(io::IO, tn::Core.TypeName)
end
end
end
world = check_world_bounded(tn)
world !== nothing && print(io, "@world(")
show_sym(io, sym)
world !== nothing && print(io, ", ", world, ")")
quo && print(io, ")")
globfunc && print(io, ")")
nothing