proposal: spec: read-only types #22876
Comments
I understand the desire for this kind of thing, but I am not particularly fond of this kind of proposal. This approach seems very similar to the You've identified the problems well: this does not provide immutability, and it does not avoid data races. I would like to see a workable proposal for immutability, and I would love to see one that avoids data races. This is not those proposals. Using In general this is an area where experience reports can help guide Go 2 development. Does this proposal help with real problems that Go programmers have encountered? |
I'm continuing to think about those things, but I wanted to get this proposal out there for two reasons. One, I think any proposal for immutability will have this as a subset. The second reason I wanted to share this is that it serves as a counterexample to anyone who thinks adding read-only types to Go is just a matter of adding a keyword.
Yes. At the recent Google-internal Go conference, @thockin specifically asked for |
What do you think of a builtin |
Out of curiosity, how does that work? And how does it detect modification of a nested value? |
I don't know exactly how it works, which is why I haven't written a proposal for it. One conceivable implementation would be to For a nested value, you use |
I'm not following the distinction between values and variables in this proposal. Why is the modification of the value stored in
var a ro int
a = 1 // Modifying an ro int via a variable.
var b *ro int = &a
*b = 1 // Modifying an ro int via a pointer.
A nitpick: Pointers-to-constants are entirely orthogonal to the rest of this proposal and (IMO) distract from the meat of it. Go already has syntax for constructing non-zero pointers to compound types; providing a similar facility for non-compound types does not require the addition of read-only values to the language. e.g., #19966.
@jba Could you accomplish the same thing with overriding type operations? Is it important that the read-only property be at the type level? For example, string (basically a read-only []byte) could be defined as something like this: type string []byte
func (s string) []=(index int) byte {
panic("not supported")
} This doesn't require any changes to the type system, and seems to be backward-compatible at first glance. |
@neild
var a int
a = 1 // Modifying an ro int via a variable.
var b *int = &a
*b = 1 // Modifying an ro int via a pointer.
and of course both of those assignments are equally legal. The assignment in
var c ro *int = &a
*c = 1
would not be, but I'm trying to avoid proposing both a type modifier and what C would call "storage class", out of hygiene. (See Ian's blog post that he linked to above for a criticism of how C
@willfaught I don't think operator overloading is a good fit for Go. One of the nice things about the language is that every bit of syntax has a fixed meaning. |
It seems identical to how methods and embedding work. Like the selector |
@jba Your proposal says:
The existence of You also say:
I can't square this with it being legal to modify the value of var c ro *int = &a The variable |
@willfaught Operator overloading is a very different idea that should be discussed in a separate proposal, not this one. |
That's a bug in my proposal. I chose a poor example. Replace
The it in your last sentence refers to the value of read-only type, the The situation is analogous to
The value is immutable, but the variable binding is not. |
It is possible that I have misunderstood the spec, but this is not consistent with my understanding of variable assignment. |
I guess I'm using the word "binding" wrong. I was thinking variables are bound to their values, and you're saying identifiers are bound to variables, which have values. Anyway, you can change variable-value associations, but some values cannot be modified. |
Why not reuse the already existing My point here is that Other point, say I have a read-only type for ints. Is such a type declarable (as in If yes, do I declare an instance of such a type using the
type T = ro int
var x T = 55
// or
const y T = 98
Moreover, wouldn't it be enough to allow constant pointers? Other point, what about compound types? Say, using these declarations
type S struct {
Exported int
}
type RoS = ro S
Does this snippet compile? If not, what errors are thrown? If yes, what is the expected behavior? Does it panic? If yes, how does the runtime detect this?
func main() {
ros := &RoS{Exported: 55}
p := &ros.Exported
*p = 98
}
What about this one?
func main() {
ros := &RoS{Exported: 55}
p := (*int)(unsafe.Pointer(&ros.Exported))
*p = 98
} |
I just want to point to two proposals, one for immutable slices and one for pointerized structs that I think in combination amounts to a simpler set of language changes than what is proposed here. Please take a look!
Check out the
Pointerized structs
Here's a concrete example. Here is one way to control write-access to structs. Copying is trivial, you can just do
type Foo struct {
value interface{}
}
func (f *Foo) SetValue(interface{}) {...}
func (f Foo) GetValue() interface{} {...} The other way is to protect the struct with a mutex: type Foo struct {
mtx sync.RWMutex
value interface{}
}
func (f *Foo) SetValue(interface{}) {...} // Lock/Unlock
func (f *Foo) GetValue() interface{} {...} // RLock/RUnlock Here's a full pointerized struct version: type foo struct* {
Value interface{}
}
func (f foo) GetValue() interface{} {...}
type Foo struct {
mtx sync.RWMutex
foo
}
func (f *Foo) SetValue(interface{}) {...} // Lock/Unlock
func (f *Foo) GetValue() interface{} {...} // RLock/RUnlock
f = Foo{...}
f.SetValue(...) // ok, f is addressable
g := f
g.SetValue(...) // ok, g is addressable
func id(f Foo) Foo { return f } // returns a non-addressable copy
id(g).SetValue(...) // compile-error, not addressable.
id(g).GetValue(...) // calls foo.GetValue, mtx not needed Q: So why readonly slices? It seems natural to create a "view" into an arbitrary-length array of objects without copying. For one, it's a required performance optimization. Second, there's no way to mark any items of a slice to be externally immutable, as can be done with private struct fields. For these reasons, readonly slices appear to be natural and necessary (for lack of any alternative). |
I think this design would lead to significant complexity in practice, similar to C++
One cheap alternative is to introduce a documentary type annotation, which just document the value should not be changed. There is no enforcement, but it offers a design contract between caller and callee. Go doesn't provide in-process security anyway, a bad library can do arbitrary damage. I am not sure whether we need to guard it at language level. |
No,
It fails to compile.
Of course, all bets are off with |
I don't see how
Can I assign to elements of
I don't understand this. What is immutable? Certainly not
|
I think it's a little less complex, but yes, I basically agree.
It doesn't look like a constant, it looks like a readonly value.
|
No, you can't.
I meant, you pass by value (e.g. copy) to prevent others from writing to it. Immutable is an overloaded word... I was using it to refer to pass-by-copy semantics. f := &Foo{value: "somevalue"}
f.SetValue("othervalue") // `f` is a pointer
g := *f
g.SetValue("another") // can't, g is a readonly copy. The use-cases for But I also acknowledge that Golang1 isn't perfectly suited for this kind of usage, because it forces you to write verbose and type-unsafe syntax to get the behavior you want... Here's an example with an (immutable) tree-like structure: type Node interface {
AssertIsNode()
}
type node struct {
Left Node
Right Node
}
func (_ node) AssertIsNode() {}
// Using the struct is cumbersome, but overall this has the behavior we want.
// Interfaces are pointer-like in how its represented in memory,
// copying is quick and efficient.
var tree Node = ...
maliciousCode(tree) // cannot mutate my copy
// But using this as a struct is cumbersome and type-unsafe.
var leftValue = tree.(node).Left.(node).Value Maybe one way to make this easier is to declare a struct to be "pointerized"... type Node struct* {
Left Node
Right Node
}
var n Node = nil // not the same as a zero value.
n.Left = ... // runtime error
n = Node{}
n.Left = ...
n.Right = ...
var n2 = n
n2.Left = ... // This won't affect `n`.
n2.Left.Left = ... // compile-time error, n2.Left is not addressable.
n.Left = n // circular references are OK. Please check out #23162 @jba Please check out the update to the last comment: #22876 (comment) |
Has anyone listed all the existing proposals for declaring a block of data to be stored in read-only memory? |
Focusing on Missing Immutability:
While correct, I think this statement misses that The immutability could be summarized as: To achieve this, the compiler must effectively do escape analysis on An example: func foo1() []int {
return append([]int{1}, 2)
}
// x is ro, and can also be considered immutable
x := ro []int(foo1())
func bar2([]int) {}
func foo2() []int {
out := append([]int{1}, 2)
bar2(out)
return out
}
// y is ro but can't be considered immutable, the variable ro-escapes in bar2
// (the example is so simple that a smart compiler could actually identify that it doesn't escape, but I am not making that case here given that Go prioritizes compile time)
y := ro []int(foo2())
func bar3(ro []int) {}
func foo3() []int {
out := append([]int{1}, 2)
bar3(out)
return out
}
// z is ro and considered immutable, the variable escapes but as a ro, and that's OK
z := ro []int(foo3())
Note this means that immutable
func foo() ro []int {
out := []int{1} // not immutable, not even ro
out = append(out, 1) // not immutable or ro (and in fact is being modified!)
return out // ro and immutable, despite having been non-immutable (and non-ro) before. Has NOT been deep-copied
} In this sense immutable The benefits of this are exactly as defined in this proposal:
It is only that the gatekeeper to these performance gains is the compiler, not the developer - at least not explicitly. Write good Concluding Quoting from the proposal:
Replace programmer with compiler, that's all I'm trying to say.
It is the exact type of tail, with the
But those performance gains come from how programmers write code. Say I'm writing a cache that accepts |
Yes, the programmers still have a part to play in writing both safe and performant code - more so than if there was
No more than you may have to do in current (non- If you want to ensure the //go:immutable
but I think generally that may introduce many issues as the compiler makes no promises that logically immutable |
It would be nice to have read only maps as asked in slack how to do the following: const database_config := map[string]string{ |
@Ookma-Kyi Constant map variables are an aspect of immutability and are covered by #6386. This proposal is less about immutability than it is about ensuring that functions don't change certain values. |
Interesting idea, but please don't repeat the bodge of having As have others, decades of wrestling with C's In that sense, Having already worked on large codebases with plenty of |
I think a read only value is more along the lines of C++ constexpr. `const`
is more or less a const alias to a value, but the value can be changed via
other aliases. A `constexpr` is kinda immutable value ensured by the
compiler and the runtime.
Given the simplicity focus of Go, I am not sure if this feature fits well
with Go.
|
Interesting idea. Here are some thoughts:
|
@mihaigalos The language you are proposing seems to have very little intersection with Go1 as it already exists. Practically no existing Go program would work. |
Hi @tv42. Is this not the correct thread to discuss breaking changes? |
@mihaigalos useful background on breaking changes and Go2: https://go.googlesource.com/proposal/+/master/design/28221-go2-transitions.md |
having a |
@amery, it would be a breaking change in some cases:
This breaks if
is changed to
|
Even without new features you can't assume you can change F's signature and expect everyone using it to remain happy. To me, the key to a breaking change is that old code stops working with the new release of the compiler. This is not the case here.
I propose adding read-only types to Go. Read-only types have two related benefits:
An additional minor benefit is the ability to take the address of constants.
This proposal makes significant changes to the language, so it is intended for Go 2.
All new syntax in this proposal is provisional and subject to bikeshedding.
Basics
All types have one of two permissions: read-only or read-write. Permission is a property of types, but I sometimes write "read-only value" to mean a value of read-only type.
A type preceded by `ro` is a read-only type. The identifier `ro` is pronounced row. It is a keyword. There is no notation for the read-write permission; any type not marked with `ro` is read-write.
The `ro` modifier can be applied to slices, arrays, maps, pointers, structs, channels and interfaces. It cannot be applied to any other type, including a read-only type: `ro ro T` is illegal.
It is a compile-time error to `append`,
A value of read-only type is not necessarily immutable, because it may be referenced through another type that is not read-only.
Examples:
The compiler guarantees that the bytes of `data` will not be altered by `transmit`.
This proposal is concerned exclusively with avoiding modifications to values, not variables. Thus it allows assignment to variables of read-only type.
One could imagine a companion proposal that also used `ro`, but to restrict assignment:
I don't pursue that idea here.
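The caveat above, that a read-only value is not necessarily immutable, can be approximated in today's Go. The sketch below is only an analogy: `roInts` and `view` are hypothetical names standing in for `ro []int` and the proposed conversion, and the wrapper is enforced by convention, not by the compiler as `ro` would be.

```go
package main

import "fmt"

// roInts is a hypothetical stand-in for `ro []int`: it exposes reads only.
type roInts struct{ s []int }

func (r roInts) Len() int     { return len(r.s) }
func (r roInts) At(i int) int { return r.s[i] }

// view wraps s without copying, so the view and s share one backing array.
func view(s []int) roInts { return roInts{s} }

func main() {
	data := []int{1, 2, 3}
	r := view(data)

	before := r.At(0)
	data[0] = 99 // a write through the original read-write reference

	// The view offers no way to write, yet the value it observes changed.
	fmt.Println(before, r.At(0))
}
```

The holder of `r` cannot modify the slice, but the holder of `data` still can, which is the distinction between read-only and immutable.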
Conversions
There is an automatic conversion from `T` to `ro T`. For instance, an actual parameter of type `[]int` can be passed to a formal parameter of type `ro []int`. This conversion operates at any level: a `[][]int` can be converted to a `[]ro []int`, for example.
There is an automatic conversion from `string` to `ro []byte`. It does not apply to nested occurrences: there is no conversion from `[][]string` to `[]ro []byte`, for example.
(Rationale: `ro` does not change the representation of a type, so there is no cost to adding `ro` to any type, at any depth. A constant-time change in representation is required to convert from `string` to `ro []byte` because the latter is one word larger. Applying this change to every element of a slice, array or map would require a complete copy.)
Transitivity
Permissions are transitive: a component retrieved from a read-only value is treated as read-only.
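To see why transitivity matters, consider what happens in today's Go when only the outer layer of a value is protected. In this sketch the protection is a shallow copy, which is our own stand-in for a hypothetical non-transitive permission; `shallowCopy` is an illustrative name, not part of the proposal.

```go
package main

import "fmt"

// shallowCopy returns a new slice holding the same pointers as src.
// Replacing an element of the result does not affect src, but the
// pointed-to ints are still shared.
func shallowCopy(src []*int) []*int {
	dst := make([]*int, len(src))
	copy(dst, src)
	return dst
}

func main() {
	x := 1
	orig := []*int{&x}

	c := shallowCopy(orig)
	*c[0] = 42 // writes through the shared pointer

	fmt.Println(x, *orig[0])
}
```

Protecting only the slice elements leaves the pointees writable, which is the gap that a transitive permission closes.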
For example, consider `var a ro []*int`. It is not only illegal to assign to `a[i]`; it is also illegal to assign to `*a[i]`.
Transitivity increases safety, and it can also simplify reasoning about read-only types. For example, what is the difference between `ro *int` and `*ro int`? With transitivity, the first is equivalent to `ro *ro int`, so the difference is just the permission of the full type.
The Address Operator
If `v` has type `ro T`, then `&v` has type `*ro T`.
If `v` has type `T`, then `ro &v` has type `ro *T`. This bit of syntax simplifies constructing read-only pointers to struct literals, like `ro &S{a: 1, b: 2}`.
Taking the address of constants is permitted, including constant literals. If `c` is a constant of type `T`, then `&c` is of type `ro *T` and is equivalent to
Read-Only Interfaces
Any method of an interface may be preceded by `ro`. This indicates that the receiver of the method must have read-only type.
If `I` is an interface type, then `ro I` is effectively the sub-interface that contains just the read-only methods of `I`. If type `T` implements `I`, then type `ro T` implements `ro I`.
Read-only interfaces can prevent code duplication that might otherwise result from the combination of read-only types and interfaces. Consider the following code from the `sort` package:
We would like to allow `IntsAreSorted` to accept a read-only slice, since it does not change its argument. But we cannot cast `ro []int` to `IntSlice`, because the `Swap` method modifies its receiver. It seems we must copy code somewhere.
The solution is to mark the first two methods of the interface as read-only:
Now we can write `IsSorted` in terms of the read-only sub-interface:
and call it on a read-only slice:
Permission Genericity
One of the problems with read-only types is that they lead to duplicate functions. For example, consider this trivial function, ignoring its obvious problem with zero-length slices:
We cannot call `tail1` on values of type `ro []int`, but we can take advantage of the automatic conversion to write
Thanks to the conversion from read-write to read-only types, `tail2` can be passed an `[]int`. But it loses type information, because the return type is always `ro []int`. So the first of these calls is legal but the second is not:
If we had to write two variants of every function like this, the benefits of read-only types would be outweighed by the pain they cause.
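The duplication can be felt in today's Go by emulating a read-only slice with a wrapper type. This is only an illustration under that assumption: `readOnlyInts` and `tailRO` are hypothetical names, and the wrapper is not the proposed `ro []int`.

```go
package main

import "fmt"

// readOnlyInts is a hypothetical stand-in for `ro []int`: reads only.
type readOnlyInts struct{ s []int }

func (r readOnlyInts) Len() int     { return len(r.s) }
func (r readOnlyInts) At(i int) int { return r.s[i] }

// tail drops the first element of a read-write slice.
func tail(x []int) []int { return x[1:] }

// tailRO repeats the same logic only because the argument and result
// types differ in permission.
func tailRO(x readOnlyInts) readOnlyInts { return readOnlyInts{x.s[1:]} }

func main() {
	rw := tail([]int{1, 2, 3})
	rw[0] = 20 // allowed: the result keeps its read-write type

	ro := tailRO(readOnlyInts{[]int{1, 2, 3}})
	fmt.Println(rw, ro.Len(), ro.At(0))
}
```

Both functions have the same body; only the permission on the type differs.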
To deal with this problem, most programming languages rely on overloading. If Go had overloading, we would name both of the above functions `tail`, and the compiler would choose which to call based on the argument type. But we do not want to add overloading to Go.
Instead, we can add generics to Go, but just for permissions. Hence permission genericity.
Any type inside a function, including a return type, may be preceded by `ro?` instead of `ro`. If `ro?` appears in a function, it must appear in the function's argument list.
A function with an `ro?` argument `a` must type-check in two ways:
`a` has type `ro T` and `ro?` is treated as `ro`.
`a` has type `T` and `ro?` is treated as absent.
In calls to a function with a return type `ro? T`, the effective return type is `T` if the `ro?` argument `a` is a read-write type, and `ro T` if `a` is a read-only type.
Here is `tail` using this feature:
`tail` type-checks because:
With `x` declared as `ro []int`, the slice expression can be assigned to the effective return type `ro []int`.
With `x` declared as `[]int`, the slice expression can be assigned to the effective return type `[]int`.
This call succeeds because the effective return type of `tail` is `ro []int` when the argument is `ro []int`:
This call also succeeds, because `tail` returns `[]int` when its argument is `[]int`:
Multiple, independent permissions can be expressed by using `ro?`, `ro??`, etc. (If the only feasible type-checking algorithm is exponential, implementations may restrict the number of distinct `ro?...` forms in the same function to a reasonable maximum, like ten.)
In an interface declaration, `ro?` may be used before the method name to refer to the receiver.
There are no automatic conversions from function signatures using `ro?` to signatures that do not use `ro?`. Such conversions can be written explicitly. Examples:
Permission genericity can be implemented completely within the compiler. It requires no run-time support. A function annotated with `ro?` requires only a single implementation.
Strengths of This Proposal
Fewer Bugs
The use of `ro` should reduce the number of bugs where memory is inadvertently modified. There will be fewer race conditions where two goroutines modify the same memory. One goroutine can still modify the memory that another goroutine reads, so not all race conditions will be eliminated.
Less Copying
Returning a reference to a value's unexported state can safely be done without copying the state, as shown in Example 2 above.
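For contrast, here is the defensive copy that today's Go needs when a method exposes unexported state. This is a minimal sketch with illustrative names (`registry`, `Names`) that do not come from the proposal; under the proposal, the method could presumably return the internal slice directly with a read-only type.

```go
package main

import "fmt"

// registry keeps its names in an unexported slice.
type registry struct{ names []string }

// Names returns a copy so that callers cannot modify the registry's
// internal slice through the result.
func (r *registry) Names() []string {
	out := make([]string, len(r.names))
	copy(out, r.names)
	return out
}

func main() {
	r := &registry{names: []string{"a", "b"}}

	got := r.Names()
	got[0] = "changed" // modifies only the caller's copy

	fmt.Println(r.names[0], got[0])
}
```

The copy costs an allocation and a pass over the data on every call, which is the overhead the proposal aims to avoid.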
Many functions take `[]byte` arguments. Passing a string to such a function requires a copy. If the argument can be changed to `ro []byte`, the copy won't be necessary.
Clearer Documentation
Function documentation often states conditions that promise that the function doesn't modify its argument, or that extract a promise from the caller not to modify a return value. If `ro` arguments and return types are used, those conditions are enforced by the compiler, so they can be deleted from the documentation. Furthermore, readers know that in a well-designed function, a non-`ro` argument will be written along at least one code path.
Better Static Analysis Tools
Read-only annotations will make it easier for some tools to do their job. For example, consider a tool that checks whether a piece of memory is modified by a goroutine after it sends it on a channel, which may indicate a race condition. Of course if the value is itself read-only, there is nothing to do. But even if it isn't, the tool can do its job by checking for writes locally, and also observing that the value is passed to other functions only via read-only arguments. Without `ro` annotations, the check would be difficult (requiring examining the code of functions not in the current package) or impossible (if the call was through an interface).
Less Duplication in the Standard Library
Many functions in the standard library can be removed, or implemented as wrappers over other functions. Many of these involve the `string` and `[]byte` types.
If the `io.Writer.Write` method's argument becomes read-only, then `io.WriteString` is no longer necessary.
Functions in the `strings` package that do not return strings can be eliminated if the corresponding `bytes` method uses `ro`. For example, `strings.Index(string, string) int` can be eliminated in favor of (or can trivially wrap) `bytes.Index(ro []byte, ro []byte) int`. This amounts to 18 functions (including `Replacer.WriteString`). Also, the `strings.Reader` type can be eliminated.
Functions that return `string` cannot be eliminated, but they can be implemented as wrappers around the corresponding `bytes` function. For example, `bytes.ToLower` would have the signature `func ToLower(s ro? []byte) ro? []byte`, and the `strings` version could look like
The conversion to `string` involves a copy, but `ToLower` already contains a conversion from `[]byte` to `string`, so there is no change in efficiency.
Not all `strings` functions can wrap a `bytes` function with no loss of efficiency. For instance, `strings.TrimSpace` currently does not copy, but wrapping it around `bytes.TrimSpace` would require a conversion from `[]byte` to `string`.
Adding `ro` to the language without some sort of permission genericity would result in additional duplication in the `bytes` package, since functions that returned a `[]byte` would need a corresponding function returning `ro []byte`. Permission genericity avoids this additional duplication, as described above.
Pointers to Literals
Sometimes it's useful to distinguish the absence of a value from the zero value. For example, in the original Google protobuf implementation (still used widely within Google), a primitive-typed field of a message may contain its default value, or may be absent.
The best translation of this feature into Go is to use pointers, so that, for example, an integer protobuf field maps to the Go type `*int`. That works well except for initialization: without pointers to literals, one must write
or use a helper function.
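The helper-function workaround mentioned above might look like the following in today's Go. This is a sketch with hypothetical names (`intPtr`, `message`, `Count`); it is not the protobuf library's API.

```go
package main

import "fmt"

// intPtr is a hypothetical helper that returns a pointer to a fresh
// copy of v, working around the lack of pointers to literals.
func intPtr(v int) *int { return &v }

// message models an optional integer field: nil means "absent",
// while a non-nil pointer to 0 means "present with the zero value".
type message struct {
	Count *int
}

func main() {
	absent := message{}
	zero := message{Count: intPtr(0)}

	fmt.Println(absent.Count == nil, zero.Count != nil, *zero.Count)
}
```

Each call allocates a distinct variable, so the pointer returned by the helper remains writable, unlike the proposed `ro *int` result of `&3`.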
In Go as it currently stands, an expression like `&3` cannot be permitted because assignment through the resulting pointer would be problematic. But if we stipulate that `&3` has type `ro *int`, then assignment is impossible and the problem goes away.
Weaknesses of This Proposal
Loss of Generality
Having both `T` and `ro T` in the language reduces the opportunities for writing general code. For example, an interface method with a `[]int` parameter cannot be satisfied by a concrete method that takes `ro []int`. A function variable of type `func() ro []int` cannot be assigned a function of type `func() []int`. Supporting these cases would start Go down the road of covariance/contravariance, which would be another large change to the language.
Problems Going from `string` to `ro []byte`
When we change an argument from `string` to `ro []byte`, we may eliminate copying at the call site, but it can reappear elsewhere because the guarantee is weaker: the argument is no longer immutable, so it is subject to change by code outside the function. For example, `os.Open` returns an error that contains the filename. If the filename were not immutable, it would have to be copied into the error message. Data structures like caches that need to remember their methods' arguments would also have to copy.
Also, replacing `string` with `ro []byte` would mean that implementers could no longer compare via operators, range over Unicode runes, or use values as map keys.
Subsumed by Generics
Permission genericity could be subsumed by a suitably general design for generics. No such design for Go exists today. All known constraints on generic types use interfaces to express that satisfying types must provide all the interface's methods. The only other form of constraint is syntactic: for instance, one can write `[]T`, where `T` is a generic type variable, enforcing that only slice types can match. What is needed is a constraint of the form "`T` is either `[]S` or `ro []S`", that is, permission genericity. A generics proposal that included permissions would probably drop the syntax of this proposal and use identifiers for permissions, e.g.
Missing Immutability
This proposal lacks a permission for immutability. Such a permission has obvious charms: immutable values are goroutine-safe, and conversion between strings and immutable byte slices would work in both directions.
The problem is how to construct immutable values. Literals of immutable type would only get one so far. For example, how could a program construct an immutable slice of the first N primes, where N is a parameter? The two easy answers (deep copying, or letting the programmer assert immutability) are both unpalatable. Other solutions exist, but they would require additional features on top of this proposal. Simply adding an `im` keyword would not be enough.
Does Not Prevent Data Races
A value cannot be modified through a read-only reference, but there may be other references to it that can be modified concurrently. So this proposal prevents some but not all data races. Modern languages like Rust, Pony and Midori have shown that it is possible to eliminate all data races at compile time. But the cost in complexity is high, and the value unclear—there would still be many opportunities for race conditions. If Go wanted to explore this route, I would argue that the current proposal is a good starting point.
References
Brad Fitzpatrick's read-only slice proposal
Russ Cox's evaluation of the proposal. This document identifies the problem with the `sort` package discussed above, and raises the problem of loss of generality as well as the issues that arise in moving from `string` to `ro []byte`.
Discussion on golang-dev