WIP: Make mutating immutables easier #21912

Keno · 2017-05-16T18:22:25Z

I like immutables, I really do, they're simple to work with, fast, don't cause allocations. Really the only thing that bothers me about them is that they're well, immutable, making them a bit of pain to work with, esp. when wanting to construct one incrementally. So here's an attempt to remedy that.
Example usage:

julia> struct Foo
       a::Int
       end

julia> x = Foo(1)
Foo(1)

julia> x@a = 2
Foo(2)

julia> x
Foo(2)

The ways this works is that under the hood, it creates a new immutable object with the specified field modified and then assigns it back to the appropriate place. Syntax wise, everything to the left of the @ is what's being assigned to, everything to the right of the @ is what is to be modified. E.g.

struct Foo4; d::Int; end
struct Foo3; c::Foo4; end
mutable struct Foo2; b::Foo3; end
struct Foo1; a::Foo2; end
x = Foo1(Foo2(Foo3(Foo4(1))))
x.a.b@c.d = 2
julia> x
Foo1(Foo2(Foo3(Foo4(2))))

Internally, everything to the right of the @ gets lowered to setindex (no bang) and setfield (also no bang), which are overridable by the user. The intent is to disallow setfield for immutables that have a non-default inner constructor, allowing the user to provide their own which checks any required invariants. That part isn't implemented here yet, however.

Lastly, LLVM isn't currently too happy about the IR this generates, so I'm working on making that happen to make sure this actually performs ok. I think from the julia side, this is pretty much the extent of it though. Pretty much untested at the moment. I want to get the LLVM side of things done first. With that in mind, feel free to check out this branch and see if you like it.

One motivating example here is of course efficient, generic fixed size arrays, so here's an example of that:

mutable struct MArray{Size, T, D, L} <: AbstractArray{T, D}
    data::NTuple{L, T}
end

Base.size(::MArray{Size}) where {Size} = tuple(Size.parameters...)
@inline to_linear_index(::Type{Tuple{}}, idxs, stride) = 1
@inline to_linear_index(::Type, idxs::Tuple{}, stride) = 1
@inline to_linear_index(::Type{Tuple{A}}, idxs, stride) where {A} = idxs[1]
@inline to_linear_index(::Type{Tuple{A,B}}, idxs, stride) where {A, B} = A*(idxs[1]-1) + idxs[2]
@inline to_linear_index(T::Type{<:Tuple{A, Vararg{Any}}}, idxs, stride) where A =
    (stride * (idxs[1]-1)) + to_linear_index(Base.tuple_type_tail(T), idxs[2:end], stride * A)
function Base.setindex!(a::MArray{Size}, v, idxs...) where Size
    @inbounds a.data@[to_linear_index(Size, idxs, 1)] = v
    v
end
Base.getindex(a::MArray{Size}, idxs...) where {Size} = a.data[to_linear_index(Size, idxs, 1)]

a = MArray{Tuple{2, 2}, Int, 2, 4}((1,2,3,4))
b = [1 2; 3 4]
c = [4 5; 6 7]


# Look ma, no allocations
julia> @benchmark A_mul_B!(a, b, c)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     86.193 ns (0.00% GC)
  median time:      86.993 ns (0.00% GC)
  mean time:        87.978 ns (0.00% GC)
  maximum time:     3.282 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     961

Fixes #11902
Supersedes #12113

JeffBezanson · 2017-05-16T18:39:18Z

Wonderful! This approach has my full approval.

JeffBezanson · 2017-05-16T18:49:51Z

test/parse.jl

@@ -1187,3 +1187,14 @@ module Test21607
        x
    end === 1.0
 end
+
+# Basic parsing for in place assign
+@test parse("a@b = 1") == :(a@b = 1)


This kind of test isn't really effective; for example it would pass if a@b = 1 parsed as 42.

Primarily I wanted to make sure it doesn't throw, because I'm not sure the parsing is final yet. I also find it useful, because that way I can use parse.jl to test JuliaParser.

yuyichao · 2017-05-16T19:42:18Z

Another approach I was thinking about that is related is a more general representation of memory location, i.e. Ref. The motivation is to support better atomics. The C/C++ model of expressing this doesn't work for us very well since we are missing the notion of pointer or reference which are the input type of atomic intrinsics in C/C++. If we want to bind atomic operations to certain types, we'll have to add additional indirection (what we do now) or make them a special case in the system which carries their location information (I don't think this is doable). That's why I think we should define atomic operations on Refs which is essentially our representation of memory location. For example, given struct B x::Int; y::Int end a::Vector{B} one would do atomic operation on the a[1].x element (a Int) by atomic_op!(reffield(refindex(a, 1), :x)).

It seems that most of the problems here can be solved with a similar lowering so a@[1].x = v can be lowerd to unsafe_store!(reffield(refindex(a, 1), :x), v). This won't cover the case of mutating a immutable variable by returning a new one though that can probably be distinguished from parser level.

I'm still looking for alternative proposal from atomic ops but haven't found any working ones yet. If we are going to have better atomic ops support (I think we should) it'll be nice if we can unify the API.

Keno · 2017-05-16T19:48:17Z

I had originally proposed that as an implementation strategy, but people didn't like it because the lack of access protection for immutables that create constraints in their inner constructors.

yuyichao · 2017-05-16T19:51:05Z

I feel like with reinterpret(::Array, ...) it shouldn't be that big of a concern........ Just for this particular problem the approach here looks fine but I really can't come up with a better approach for atomics...

Keno · 2017-05-16T19:57:41Z

Well, for what it's worth, my original proposal was to have a
immutable RefField{Base, Derived}; base::Base; offset::Int; end
type that would get created appropriately, so that's still a possibility. People just seemed really opposed to being able to bypass constructor invariants.

JeffBezanson · 2017-05-16T19:59:50Z

I feel like with reinterpret(::Array, ...) it shouldn't be that big of a concern

One sketchy feature doesn't make it ok to add more sketchy features. The uses of reinterpret are also quite limited, and the function could probably be restricted much more than it is now. This mutation feature, on the other hand, is a much more general feature that is likely to be used heavily.

yuyichao · 2017-05-16T20:00:03Z

I was thinking of struct RefField{Eltype} a; offset::Int end so that Int field from different source can have the same type. (and to reduce the number of types in general) but I guess that detail can be decided later.

yuyichao · 2017-05-16T20:06:10Z

One sketchy feature doesn't make it ok to add more sketchy features.

Sure. I don't feel like either is very sketchy though. There are invariants that the compilier can use and the ones that users can use. The constructor invariant is something only the user can use so it wouldn't create any undefined behavior if broken. Also, if one decides to mutate a field, he is already touching internal representation (unless it is documented to be public) so it isn't anything different from what we do for mutable struct either, i.e. we document that fields are in general private and mutating them can break other code but if it's needed for debugging or testing then you can do it without having the compiler stop you from doing what should be possible to do.

Keno · 2017-05-16T20:07:04Z

One possibility is to use this lowering for the LHS of an assign, but use the RefField lowering for this syntax every where else. I feel like a binary externally constrained/externally unconstrained would be ok. E.g.:

struct MyConstrained
    counter::Int
    launch_nukes::Bool
    MyConstrained() = MyConstrained(counter, false)
    function gepfield(x::MyConstrained, field)
        field == :x && return Core.gepfield(x, field)
        error("nice try")
    end
end

yuyichao · 2017-05-16T20:08:33Z

Having a way to disallow ref (gep) of a certain field would certainly be possible. It will also stop that field from any useful mutation though, which is why I didn't bring that up....

Keno · 2017-05-16T20:11:32Z

Well, seems like you could allow writing an inner method like so:

struct MyConstrained
    counter::Int
    launch_nukes::Bool
    MyConstrained() = MyConstrained(counter, false)
    function gepfield(x::MyConstrained, field)
        field == :x && return Core.gepfield(x, field)
        error("nice try")
    end
    function launch!(x::RefField{MyConstrained}, code)
        authorized(code) || error("This incident will be reported")
        Core.gepfield(x, :launch_nukes)[] = true
    end
end

yuyichao · 2017-05-16T20:21:51Z

For this one, yes. It'll be quite complicated for atomics though. Maybe just pass in the value during gep and let the caller do the storing?

yuyichao · 2017-05-16T20:26:50Z

Or at least separate the checking from the actual memory op.

Keno · 2017-05-16T20:27:13Z

Well, I just figured either you treat a field as a general atomic in which case it's either all or nothing (and can be done via gepfield) or you have to do the memory modification yourself (by defining an inner method).

yuyichao · 2017-05-16T20:36:13Z

So a type with checked invariance has to define all memory ops on the field it needs checked atomics? This would also include the case where the atomic is done only on a subfield?

Keno · 2017-05-16T20:43:48Z

Well, if they subfield itself has constraints, then yes, I don't really see how else you can do it. If not, you can define a gepfield for that field.

yuyichao · 2017-05-16T20:43:51Z

So what about this. If a type allows external mutation of a field, it'll be all or nothing. If it want to check, it needs to provide it's own interface to do that. It can do that by defining a function that accepts a correct Ref type. So if we have a struct A x::Int; y::Int end and it want to allow normal store to y with checked value to ensure y > x. It'll define

function mutate_y(ref, y)
    obj = ref.obj
    @assert y > obj.x
    unsafe_store!(ref, y)
end

And the user will call it with mutaate_y(container@[i].y, new_y) where the non-assigning container@[i].y is a place holder syntax for constructing the ref field object.

yuyichao · 2017-05-16T20:45:48Z

Or even

function mutate_y(ref_to_obj, y)
    obj = ref.obj
    @assert y > obj.x
    unsafe_store!(Core.fieldref_for_fieldref(ref, :y), y)
end

and called with mutate_y(container@[i])

Keno · 2017-05-16T20:47:05Z

Right, I think the second is ok. Your fieldref_for_fieldref is what I called Core.gepfield, which may be distinct from Base.gepfield called by the syntax (and may also only be available from within type scope like new, etc.

yuyichao · 2017-05-16T20:48:36Z

is what I called Core.gepfield

It's slightly different since Core.gepfield would take a (mutable) object and construct a gep on the object. Core.gepfield_for_ref would take a ref to an (immutable) object and construct a gep on the object.

yuyichao · 2017-05-16T20:49:19Z

Of course we can make it dispatch but I don't want to block constructing a field ref on a ref object (RefValue) at this low level.

Keno · 2017-05-16T20:49:46Z

If you look at my launch_nukes example above, I had implicitly assumed that gepfield allows this by dispatch. In fact allowing it is the only way to properly implement the syntax.

Keno · 2017-05-16T20:50:25Z

But yeah, I do realize that my gepfield implementation example should have probably taken a RefField.

yuyichao · 2017-05-16T20:50:31Z

Ok, yeah, that's right.

yuyichao · 2017-05-16T20:51:35Z

I wonder if gepfield should take a (immutable) type explicitly like llvm does to avoid confusion.....................

Keno · 2017-05-16T20:53:24Z

So only gepfield(::Type{T}, RefField{T}, fld) would be allowed?

yuyichao · 2017-05-16T20:54:25Z

gepfield(::Type{T}, ::Union{Ref{T},T}, fld)

stevengj · 2017-07-07T16:15:11Z

See also #17115. If I have an array a of an immutable type, does a[i]@b = c overwrite the entire a[i] element or just the b field?

yuyichao · 2017-07-07T16:26:56Z

Only b. The two sides of the assignment determines the exact element that is being assigned to.

stevengj · 2017-12-05T17:03:50Z

Now that we have constant propagation in inference, can this be revisited?

stevengj · 2018-02-24T13:47:17Z

Bump?

In #25908 it was noted that reinterpreting structures with paddings exposes undef LLVM values to user code. This is problematic, because an LLVM undef value is quite dangerous (it can have a different value at every use, e.g. for `a::Bool` undef, we can have `a || !a == true`. There are proposal in LLVM to create values that are merely arbitrary (but the same at every use), but that capability does not currently exist in LLVM. As such, we should try hard to prevent `undef` showing up in a user-visible way. There are several ways to fix this: 1. Wait until LLVM comes up with a safer `undef` and have the value merely be arbitrary, but not dangerous. 2. Always guarantee that padding bytes will be 0. 3. For contiguous-memory arrays, guarantee that we end up with the underlying bytes from that array. However, for now, I think don't think we should make a choice here. Issues like #21912, may play into the consideration, and I think we should be able to reserve making a choice until that point. So what this PR does is only allow reinterprets when they would not expose padding. This should hopefully cover the most common use cases of reinterpret: - Reinterpreting a vector or matrix of values to StaticVectors of the same element type. These should generally always have compatiable padding (if not, reinterpret was likely the wrong API to use). - Reinterpreting from a Vector{UInt8} to a vector of structs (that may have padding). This PR allows this for reading (but not for writing). Both cases are generally better served by the IO APIs, but hopefully this should still allow the common cases. Fixes #25908

sairus7 · 2019-02-15T01:42:39Z

There are several discussions on that, may be this helps:
https://discourse.julialang.org/t/make-immutable-mutable-again-or-make-arrays-of-structures-stack-allocated/20800

mauro3 · 2019-02-15T10:16:49Z

Just to mention the package which implements this (with macros): https://github.com/jw3126/Setfield.jl with macros.

EricForgy · 2019-06-07T13:20:21Z

Just to mention the package which implements this (with macros): https://github.com/jw3126/Setfield.jl with macros.

Any thoughts on the lens approach? It seems pretty slick to me 😊

mohamed82008 · 2020-08-13T14:23:13Z

Can this be part of 1.6?

vtjnash · 2021-04-08T22:57:57Z

We should still work on this (#11902). It just needs some design work and agreement, and probably needs to be rewritten since this is stale.

AhmedSalih3d · 2023-01-30T21:33:14Z

Out of curiosity, what happened to this proposal?

Did some package replace it or was it deemed not worth it any more?

mauro3 · 2023-01-31T16:58:01Z

There is https://github.com/JuliaObjects/Accessors.jl

tkelman added needs docs Documentation for this change is required needs news A NEWS entry is required for this change labels May 16, 2017

JeffBezanson reviewed May 16, 2017

View reviewed changes

Keno mentioned this pull request Jul 27, 2017

WIP: add support for working with immutables (#11902) #12113

Closed

andyferris mentioned this pull request Oct 9, 2017

Julep: Generalize indexing with and by Associatives #24019

Open

vtjnash mentioned this pull request Dec 5, 2017

allow overloading of a.b field access syntax #1974

Closed

nalimilan mentioned this pull request Jan 2, 2018

Supertype of AbstractDataFrame and DataFrameRow JuliaData/DataFrames.jl#1337

Closed

andyferris mentioned this pull request Jun 17, 2018

&<var> syntax to declare a scalar to broadcast with #27563

Open

Keno mentioned this pull request Jun 30, 2018

Don't allow reinterprets that would expose padding #27877

Merged

tkf mentioned this pull request Jan 12, 2019

RFC/Julep: Introduce getproperty on Array for built-in data tables. #30646

Closed

Keno mentioned this pull request Mar 28, 2019

make &x sugar for RefValue(x) #27608

Open

2 tasks

Keno mentioned this pull request Apr 5, 2019

Julep/Very WIP - Heap allocated immutable arrays and compiler support #31630

Closed

colinxs mentioned this pull request Oct 14, 2019

WIP: Implement mutating versions of @set and @lens (#71) jw3126/Setfield.jl#92

Closed

rapus95 mentioned this pull request Feb 18, 2020

Make everything mutable for more convenient in-place updates? JuliaCI/PkgTemplates.jl#141

Closed

vtjnash closed this Apr 8, 2021

vtjnash deleted the kf/mutatingimmutables branch April 8, 2021 22:58

WIP: Make mutating immutables easier #21912

WIP: Make mutating immutables easier #21912

Conversation

Keno commented May 16, 2017 • edited Loading

JeffBezanson commented May 16, 2017

JeffBezanson May 16, 2017

Choose a reason for hiding this comment

Keno May 16, 2017

Choose a reason for hiding this comment

yuyichao commented May 16, 2017

Keno commented May 16, 2017

yuyichao commented May 16, 2017 • edited Loading

Keno commented May 16, 2017 • edited Loading

JeffBezanson commented May 16, 2017

yuyichao commented May 16, 2017

yuyichao commented May 16, 2017 • edited Loading

Keno commented May 16, 2017 • edited Loading

yuyichao commented May 16, 2017

Keno commented May 16, 2017 • edited Loading

yuyichao commented May 16, 2017

yuyichao commented May 16, 2017

Keno commented May 16, 2017

yuyichao commented May 16, 2017

Keno commented May 16, 2017

yuyichao commented May 16, 2017

yuyichao commented May 16, 2017

Keno commented May 16, 2017

yuyichao commented May 16, 2017

yuyichao commented May 16, 2017

Keno commented May 16, 2017

Keno commented May 16, 2017

yuyichao commented May 16, 2017

yuyichao commented May 16, 2017 • edited Loading

Keno commented May 16, 2017

yuyichao commented May 16, 2017

stevengj commented Jul 7, 2017

yuyichao commented Jul 7, 2017

stevengj commented Dec 5, 2017

stevengj commented Feb 24, 2018

sairus7 commented Feb 15, 2019

mauro3 commented Feb 15, 2019

EricForgy commented Jun 7, 2019

mohamed82008 commented Aug 13, 2020

vtjnash commented Apr 8, 2021

AhmedSalih3d commented Jan 30, 2023

mauro3 commented Jan 31, 2023

Keno commented May 16, 2017 •

edited

Loading

yuyichao commented May 16, 2017 •

edited

Loading

Keno commented May 16, 2017 •

edited

Loading

yuyichao commented May 16, 2017 •

edited

Loading

Keno commented May 16, 2017 •

edited

Loading

Keno commented May 16, 2017 •

edited

Loading

yuyichao commented May 16, 2017 •

edited

Loading