WIP: Make mutating immutables easier #21912
Wonderful! This approach has my full approval.
test/parse.jl

    @@ -1187,3 +1187,14 @@ module Test21607
    x
    end === 1.0
    end

    # Basic parsing for in place assign
    @test parse("a@b = 1") == :(a@b = 1)

This kind of test isn't really effective; for example, it would pass if `a@b = 1` parsed as `42`.
Primarily I wanted to make sure it doesn't throw, because I'm not sure the parsing is final yet. I also find it useful, because that way I can use parse.jl to test JuliaParser.
Another related approach I was thinking about is a more general representation of a memory location, i.e. It seems that most of the problems here can be solved with a similar lowering, so I'm still looking for an alternative proposal from atomic ops but haven't found any working ones yet. If we are going to have better atomic ops support (I think we should), it would be nice if we could unify the API.
I had originally proposed that as an implementation strategy, but people didn't like it because of the lack of access protection for immutables that create constraints in their inner constructors.
I feel like with
Well, for what it's worth, my original proposal was to have a
One sketchy feature doesn't make it ok to add more sketchy features. The uses of
I was thinking of
Sure. I don't feel like either is very sketchy though. There are invariants that the compiler can use and ones that users can use. The constructor invariant is something only the user can use, so it wouldn't create any undefined behavior if broken. Also, if one decides to mutate a field, they are already touching the internal representation (unless it is documented to be public), so it isn't any different from what we do for mutable structs either, i.e. we document that fields are in general private and that mutating them can break other code, but if it's needed for debugging or testing then you can do it without having the compiler stop you from doing what should be possible to do.
One possibility is to use this lowering for the LHS of an assign, but use the

    struct MyConstrained
        counter::Int
        launch_nukes::Bool
        MyConstrained() = MyConstrained(counter, false)
        function gepfield(x::MyConstrained, field)
            field == :x && return Core.gepfield(x, field)
            error("nice try")
        end
    end

Having a way to disallow ref (gep) of a certain field would certainly be possible. It will also stop that field from any useful mutation though, which is why I didn't bring that up....
Well, seems like you could allow writing an inner method like so:

    struct MyConstrained
        counter::Int
        launch_nukes::Bool
        MyConstrained() = MyConstrained(counter, false)
        function gepfield(x::MyConstrained, field)
            field == :x && return Core.gepfield(x, field)
            error("nice try")
        end
        function launch!(x::RefField{MyConstrained}, code)
            authorized(code) || error("This incident will be reported")
            Core.gepfield(x, :launch_nukes)[] = true
        end
    end

For this one, yes. It'll be quite complicated for atomics though. Maybe just pass in the value during
Or at least separate the checking from the actual memory op.
Well, I just figured either you treat a field as a general atomic, in which case it's either all or nothing (and can be done via gepfield), or you have to do the memory modification yourself (by defining an inner method).
So a type with checked invariants has to define all memory ops on the field it needs checked atomics for? This would also include the case where the atomic is done only on a subfield?
Well, if the subfield itself has constraints, then yes, I don't really see how else you can do it. If not, you can define a
So what about this. If a type allows external mutation of a field, it'll be all or nothing. If it wants to check, it needs to provide its own interface to do that. It can do that by defining a function that accepts a correct

    function mutate_y(ref, y)
        obj = ref.obj
        @assert y > obj.x
        unsafe_store!(ref, y)
    end

And the user will call it with
Or even
and called with
Right, I think the second is ok. Your
It's slightly different since
Of course we can make it dispatch but I don't want to block constructing a field ref on a ref object (
If you look at my launch_nukes example above, I had implicitly assumed that
But yeah, I do realize that my gepfield implementation example should have probably taken a RefField.
Ok, yeah, that's right.
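The checked-mutation idea from the comments above (a type-provided function that validates the new value and only then stores it) can be approximated in current Julia. This is a minimal sketch under stated assumptions, not the design discussed in the thread: `Ordered` and `mutate_y!` are hypothetical names, and `Base.RefValue` stands in for the proposed `RefField`, which does not exist.

```julia
# Hypothetical type with an invariant (y > x) enforced by its inner constructor.
struct Ordered
    x::Int
    y::Int
    function Ordered(x, y)
        y > x || throw(ArgumentError("invariant violated: need y > x"))
        return new(x, y)
    end
end

# Checked update through a reference: validate first, then store the rebuilt value.
# Assumption: `Base.RefValue` plays the role of the `RefField` discussed above.
function mutate_y!(ref::Base.RefValue{Ordered}, y)
    obj = ref[]
    ref[] = Ordered(obj.x, y)   # the constructor throws before the store if the check fails
    return ref
end

r = Ref(Ordered(1, 5))
mutate_y!(r, 9)                 # accepted, since 9 > 1
```

Because the store only happens after the constructor succeeds, a rejected update leaves the referenced value unchanged.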
I wonder if
So only
See also #17115. If I have an array
Only
Now that we have constant propagation in inference, can this be revisited?
Bump?
In #25908 it was noted that reinterpreting structures with paddings exposes undef LLVM values to user code. This is problematic, because an LLVM undef value is quite dangerous (it can have a different value at every use, e.g. for `a::Bool` undef, we can have `a || !a == true`). There are proposals in LLVM to create values that are merely arbitrary (but the same at every use), but that capability does not currently exist in LLVM. As such, we should try hard to prevent `undef` showing up in a user-visible way. There are several ways to fix this:

1. Wait until LLVM comes up with a safer `undef` and have the value merely be arbitrary, but not dangerous.
2. Always guarantee that padding bytes will be 0.
3. For contiguous-memory arrays, guarantee that we end up with the underlying bytes from that array.

However, for now, I don't think we should make a choice here. Issues like #21912 may play into the consideration, and I think we should be able to reserve making a choice until that point. So what this PR does is only allow reinterprets when they would not expose padding. This should hopefully cover the most common use cases of reinterpret:

- Reinterpreting a vector or matrix of values to StaticVectors of the same element type. These should generally always have compatible padding (if not, reinterpret was likely the wrong API to use).
- Reinterpreting from a Vector{UInt8} to a vector of structs (that may have padding). This PR allows this for reading (but not for writing).

Both cases are generally better served by the IO APIs, but hopefully this should still allow the common cases.

Fixes #25908
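To make the padding concern concrete, here is a minimal sketch of how padding shows up in a struct's layout. It assumes a typical platform where `UInt32` is 4-byte aligned; `Padded`, `Packed`, and `padding_bytes` are hypothetical names for illustration and are not part of Julia or of the referenced change.

```julia
# Hypothetical example types; the layout follows the platform's C ABI.
struct Padded
    a::UInt8     # 1 byte, followed by alignment padding before `b`
    b::UInt32    # 4 bytes
end

struct Packed
    a::UInt32
    b::UInt32
end

# Bytes of `T` that are not covered by its immediate fields.
# Assumption: only top-level padding is counted; nested structs are treated as opaque.
function padding_bytes(::Type{T}) where {T}
    isbitstype(T) || throw(ArgumentError("$T is not an isbits type"))
    fieldcount(T) == 0 && return 0   # primitive types and empty structs
    return sizeof(T) - sum(sizeof, fieldtypes(T))
end

padding_bytes(Padded)   # positive on common platforms
padding_bytes(Packed)   # 0, every byte belongs to a field
```

A type like `Padded` is the kind whose reinterpretation could expose the unspecified bytes described above, while `Packed` has no such bytes to expose.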
There are several discussions on that, maybe this helps:
Just to mention the package which implements this (with macros): https://github.com/jw3126/Setfield.jl
Any thoughts on the lens approach? It seems pretty slick to me 😊
Can this be part of 1.6?
We should still work on this (#11902). It just needs some design work and agreement, and probably needs to be rewritten since this is stale.
Out of curiosity, what happened to this proposal? Did some package replace it or was it deemed not worth it any more?
I like immutables, I really do, they're simple to work with, fast, don't cause allocations. Really the only thing that bothers me about them is that they're, well, immutable, making them a bit of a pain to work with, esp. when wanting to construct one incrementally. So here's an attempt to remedy that.
Example usage:
The way this works is that under the hood, it creates a new immutable object with the specified field modified and then assigns it back to the appropriate place. Syntax-wise, everything to the left of the `@` is what's being assigned to, everything to the right of the `@` is what is to be modified. E.g.

Internally, everything to the right of the `@` gets lowered to `setindex` (no bang) and `setfield` (also no bang), which are overridable by the user. The intent is to disallow `setfield` for immutables that have a non-default inner constructor, allowing the user to provide their own which checks any required invariants. That part isn't implemented here yet, however.

Lastly, LLVM isn't currently too happy about the IR this generates, so I'm working on making that happen to make sure this actually performs ok. I think from the julia side, this is pretty much the extent of it though. Pretty much untested at the moment. I want to get the LLVM side of things done first. With that in mind, feel free to check out this branch and see if you like it.
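The rebuild-and-reassign behaviour described above can be emulated in current Julia without any new syntax. The following is a minimal sketch under stated assumptions: `Point` and `with_field` are hypothetical names standing in for the proposed non-mutating `setfield`, and this is not the PR's implementation or lowering.

```julia
# Hypothetical stand-in for the proposed non-mutating `setfield`; not the PR's code.
struct Point
    x::Int
    y::Int
end

# Return a copy of `obj` whose field `name` is replaced by `value`.
function with_field(obj::T, name::Symbol, value) where {T}
    name in fieldnames(T) || throw(ArgumentError("type $T has no field $name"))
    args = ntuple(i -> fieldname(T, i) === name ? value : getfield(obj, i), fieldcount(T))
    return T(args...)   # goes through the constructor, so any checks in it still run
end

p = Point(1, 2)
q = p                        # a second binding to the original value
p = with_field(p, :x, 10)    # build the modified copy, then rebind `p` to it
```

Only the binding `p` changes here; `q` still refers to the original value, which is the difference between this approach and true in-place mutation.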
One motivating example here is of course efficient, generic fixed size arrays, so here's an example of that:
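As an illustration of what such a fixed-size array could look like (a minimal sketch, not the PR's own example), the type below wraps a tuple and provides a non-mutating update. `FixedVec` and `with_index` are hypothetical names; `with_index` plays the role of the `setindex` (no bang) that the proposal lowers to.

```julia
# Hypothetical fixed-size vector backed by a tuple; names are illustrative only.
struct FixedVec{N,T}
    data::NTuple{N,T}
end

Base.getindex(v::FixedVec, i::Int) = v.data[i]

# Non-mutating counterpart of `setindex!`: returns a new vector with slot `i` replaced.
function with_index(v::FixedVec{N,T}, x, i::Int) where {N,T}
    1 <= i <= N || throw(BoundsError(v, i))
    return FixedVec{N,T}(ntuple(j -> j == i ? convert(T, x) : v.data[j], Val(N)))
end

v = FixedVec((1, 2, 3))
w = with_index(v, 10, 2)   # `v` is untouched; `w` holds the updated copy
```

Since the length `N` is part of the type, the rebuilt tuple can stay fully typed, which is what makes this pattern attractive for generic code.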
Fixes #11902
Supersedes #12113