multiple expression return values, error type redesign, introduction of copyable property of types #83
Comments
This may be more than just an optimization problem. Consider:
pub struct List(T: type) {
items: []T,
len: usize,
prealloc_items: [STATIC_SIZE]T,
pub fn init() -> List(T) {
var l: Self = undefined;
l.items = l.prealloc_items[0...];
l.len = 0;
return l;
}
}
fn basic_list_test() {
var list = List(i32).init();
defer list.deinit();
}
Here, we assign a pointer to
Note: I think this is the 2nd time I accidentally did this in the zig standard library, which caused a runtime memory corruption error, which I had to troubleshoot. This is the kind of code we don't want people to write by accident. If we wanted this to work we would need to make this "return value is on caller's stack" thing more explicit, which I'm not sure we want to do. But it would look something like this:
fn make_foo(x: i32, y: i32) -> (foo: &Foo) {
foo.x = x;
foo.y = y;
}
But now this function assumes that there is memory available to write to. If you called the function with no assignment, we'd need a hidden stack allocation (a concept we already have) to provide the memory:
fn f() {
make_foo(1, 2);
}
Another point is that if we had this different return value syntax, it implies that we should support tuples which is another can of worms. Maybe we should open it. Let's say we went with this though, and we wanted the instance on the heap:
fn f() {
const list = alloc_one(List(i32), 1);
*list = List(i32).init();
}
Kind of awkward. Compare to:
fn heap() {
const list = alloc_one(List(i32), 1);
list.init();
}
fn stack() {
var list: List(i32) = undefined;
list.init();
}
The latter seems more uniform. Maybe? Now I'm not so sure. But the point of this comment is that this optimization needs to be explicitly part of the syntax rather than something that sometimes works and is hidden when it doesn't work. Another option that I am considering is that we could make by-value struct parameters and return values a compile error. The user would be forced by the compiler to do the undefined/init method, which notably is always available even if we want the other method to be idiomatic. Which means that "only one way to do it" gently suggests disallowing structs as by-value params and return types. |
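The hazard described in this comment can be sketched in C, the language the thread itself later uses for comparison. This is only an illustration under assumptions: the value of STATIC_SIZE and the helper list_points_at_self are made up here and do not come from the thread. The sketch shows that a struct holding a pointer into its own inline storage stays consistent only when it is initialized in place; any by-value copy carries a pointer into the old object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed capacity; the thread never gives a value for STATIC_SIZE. */
#define STATIC_SIZE 4

typedef struct {
    int *items;                       /* storage currently in use */
    size_t len;
    int prealloc_items[STATIC_SIZE];  /* inline storage inside the struct */
} List;

/* In-place init: items points into the object the caller owns. */
void list_init(List *l) {
    assert(l != NULL);
    l->items = l->prealloc_items;
    l->len = 0;
}

/* Hypothetical helper: does items still point at this object's own storage? */
bool list_points_at_self(const List *l) {
    return l->items == l->prealloc_items;
}
```

After list_init(&a), a plain copy such as List b = a; leaves b.items pointing into a.prealloc_items rather than b.prealloc_items. Returning the struct by value from an init function performs the same kind of copy, except that the original object is gone by the time the caller looks at the pointer.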
I'd parse this as a null dereference. |
Well, there's no null dereferencing there. Just calling a function on an uninitialized stack-allocated struct (which probably passes the struct as a pointer, but it won't be null). |
I am aware there's no null dereferencing. What I'm trying to say is that
|
Yeah, |
Oh no, this is just me being lazy, the initializer list would be forced to put all fields. |
So it does already support the initializer list construct. |
Equivalent C (ignoring the generic stuff):
List list;
list_init(&list);
Translated to zig:
var list: List = undefined;
list.init();
This looks like a reasonable translation to me. If it looks bad it's because we're playing with fire by using uninitialized memory. I would argue that your alarm is well deserved. Ideally we would construct the language design so that using
As you mentioned on IRC, something that will help this a bunch is named return value. Then it could look like:
const list = List.init();
Let's look at the proposed initializer syntax:
Status quo zig says you should write it this way instead:
var list: List(i32) = undefined; // you can also put `zeroes` instead of `undefined`
list.len = 0;
If you want a compile error if a field is added/removed to the struct, then use the initializer list which requires populating all fields (and you can specify some of the fields as
But status quo I think correctly represents the dangers of undefined values. As for |
|
Here's a proposal for named return values: Functions can have one of two different return styles:
In both cases, any (non-void) return value must be accepted into variables rather than ignored. This is a change from status quo. The special
The semantics of unnamed return values are unchanged. Every exit point from a function with a non-void unnamed return value must provide a value to return. Analogously, all named return values must be fully initialized at every exit point from a function. Static analysis for a function with named return values assumes the initial values of the return variables are
Examples:
fn init1() -> (result: Foo) {
result = Foo{
.field1 = value1,
.field2 = value2,
};
}
fn init2() -> (result: Foo) {
result.field1 = value1;
result.field2 = value2;
}
fn init_equivalent(result: &Foo) {
result.field1 = value1;
result.field2 = value2;
}
fn init4() -> (result: Foo) {
result.field1 = value1;
// ERROR: field2 is not initialized
}
fn init5() -> (result: Foo) {
return Foo{ // ERROR: attempt to return unnamed return value in function with named return values
.field1 = value1,
.field2 = value2,
};
}
fn init6() -> Foo { // ERROR: struct Foo cannot be used as an unnamed return value
}
fn div(numerator: i32, denominator: i32) -> (quotient: i32, remainder: i32) {
quotient = numerator / denominator;
remainder = numerator % denominator;
}
fn main() {
var foo = init1(); // type is inferred.
init_equivalent(&foo); // this is status quo
const foo2 = init1(); // const works too, which you can't do with status quo.
_ = init1(); // _ can be any type
init1(); // ERROR: cannot ignore return value
var x: i32;
var y: i32;
x, y = div(3,1); // this is not general-purpose tuples. this is just named return values.
var x2, var y2 = div(3,2);
const x2, const y2 = div(3,2);
var x3, y = div(3,2);
x, var y3 = div(3,2);
x, _ = div(3,2);
_, y = div(3,2);
_, _ = div(3,2);
var broken = div(3,2); // ERROR: wrong number of return values
div(3,2); // ERROR: cannot ignore return values
}
In practice, here's a case where the error for ignoring return values will matter:
fn add_to_set(set: &Set, x: u32) -> bool {
// return true if x was actually added, and false if x was already in the set.
}
fn main() {
var set: Set = something();
add_to_set(&set, 1); // ERROR: cannot ignore return value
add_to_set(&set, 2); // ERROR: cannot ignore return value
add_to_set(&set, 3); // ERROR: cannot ignore return value
_ = add_to_set(&set, 1);
_ = add_to_set(&set, 2);
_ = add_to_set(&set, 3);
}
Even though it's "annoying" to have to type those |
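For comparison, C compilers already offer an opt-in version of the "cannot ignore return value" rule. The sketch below is an assumption-laden illustration, not part of the proposal: the Set type is a made-up 64-bit bitset, MUST_USE is a hypothetical macro name, and the attribute it wraps is a GCC/Clang extension that produces a warning rather than a hard error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* GCC and Clang accept this attribute; other compilers get an empty macro. */
#if defined(__GNUC__)
#define MUST_USE __attribute__((warn_unused_result))
#else
#define MUST_USE
#endif

/* Hypothetical tiny set: one bit per value in 0..63. */
typedef struct {
    uint64_t bits;
} Set;

/* Returns true if x was actually added, false if x was already in the set.
   Values of 64 or more do not fit and are reported as not added. */
MUST_USE bool add_to_set(Set *set, uint32_t x) {
    assert(set != 0);
    if (x >= 64) {
        return false;
    }
    uint64_t mask = (uint64_t)1 << x;
    bool added = (set->bits & mask) == 0;
    set->bits |= mask;
    return added;
}
```

With this declaration, a bare call such as add_to_set(&set, 1); draws an unused-result warning on GCC and Clang, while a call whose result is stored or tested compiles quietly. The proposal in this comment would make that check mandatory and universal instead of per-function and opt-in.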
I like the proposal. How would it account for error union return types? |
Proposal:
error DivByZero;
fn div(numerator: i32, denominator: i32) -> %(quotient: i32, remainder: i32) {
if (denominator == 0) return error.DivByZero;
quotient = numerator / denominator;
remainder = numerator % denominator;
}
fn main() -> %void {
var x2, var y2 = div(3,2) %% ([]type{i32, i32}){0, 0};
const x3, const y3 = %return div(3, 2);
}
Maybe multiple named return types are in fact just tuples? |
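One way to picture the "just tuples" reading is a plain C struct that carries the error and both results together. This is a rough sketch under assumptions: DivError, DivResult, div_checked, and div_or_default are names invented for illustration, and div_or_default only approximates what the %% fallback in the proposal does.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { DIV_OK = 0, DIV_BY_ZERO } DivError;

/* Error plus both results in one value, roughly %(quotient, remainder). */
typedef struct {
    DivError err;
    int32_t quotient;   /* meaningful only when err == DIV_OK */
    int32_t remainder;  /* meaningful only when err == DIV_OK */
} DivResult;

/* Note: INT32_MIN / -1 overflows and is not handled in this sketch. */
DivResult div_checked(int32_t numerator, int32_t denominator) {
    DivResult r = { DIV_BY_ZERO, 0, 0 };
    if (denominator == 0) {
        return r;
    }
    r.err = DIV_OK;
    r.quotient = numerator / denominator;
    r.remainder = numerator % denominator;
    return r;
}

/* Rough analogue of the %% fallback: substitute defaults on error. */
DivResult div_or_default(int32_t numerator, int32_t denominator,
                         int32_t default_quotient, int32_t default_remainder) {
    DivResult r = div_checked(numerator, denominator);
    if (r.err != DIV_OK) {
        r.err = DIV_OK;
        r.quotient = default_quotient;
        r.remainder = default_remainder;
    }
    return r;
}
```

The struct makes the trade-off visible: the caller gets one value back and has to check err before trusting the other two fields, which is the discipline the proposed syntax would enforce at compile time.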
New proposal from #212
New error function syntax, multiple return syntax, and
error DivByZero;
fn div(numerator: i32, denominator: i32)
%-> (quotient: i32, remainder: i32)
{
if (denominator == 0) return error.DivByZero;
quotient = numerator / denominator;
remainder = numerator % denominator;
}
fn foo(c: i32, d: i32, condition: bool) {
try (const num, const den = div(3, 2)) {
// do something with num and den
} else |err| {
// do something with err
}
const a, const b = if (condition) {
c, d
} else {
d, c
};
}
Example of copyable concept:
struct Vec2 {
x: f32,
y: f32,
}
struct Vec3 {
{@setIsCopyable(this, true);}
x: f32,
y: f32,
z: f32,
}
// error: non-copyable type passed by value
fn thisIsBroken(v: Vec2) {}
// error: non-copyable types require named return values
fn thisIsAlsoBroken() -> Vec2 {}
fn thisIsOk(v: Vec3) {}
fn alsoOk() -> Vec3 {}
fn alsoOkWithVec2() -> (result: Vec2) {} |
Here's a test I'm deleting. Reminding myself to port it to the new error syntax when that's done.
|
Inspired by #250, here's a use case where this proposal gets awkward. This use case does some very fancy first-class function stuff, which most languages are ill-equipped to handle:
fn workerThreadMain() {
// we're not in the UI thread here.
try (const width, const height = runInGuiThread(delegate)) {
setDimensions(width, height);
}
}
fn delegate() %-> (i32, i32) {
// now we're in the UI thread
if (comboBox.selectedIndex() == 1) {
return widthSpinner.value(), heightSpinner.value();
} else {
return error.NotApplicable;
}
}
// this is the overloading you might find in Java or C#
fn runInGuiThread(f: fn() -> var) -> f.resultType { }
fn runInGuiThread(f: fn() -> (var, var)) -> (f.resultTypes[0], f.resultTypes[1]) { }
fn runInGuiThread(f: fn() %-> var) %-> f.resultType { }
fn runInGuiThread(f: fn() %-> (var, var)) %-> (f.resultTypes[0], f.resultTypes[1]) { }
// maybe something like this would be better:
fn runInGuiThread(f: fn() -> ...) -> ...f.resultTypes { }
// or maybe this:
fn runInGuiThread(f: fn() %-> ...) %-> ...f.resultTypes { }
I haven't figured out how a generic function runner like that would be implemented, but #229 would probably help. |
These examples are starting to get outside the range of syntax complexity I'm comfortable having. If it gets too weird I'd rather stick with single return values. |
Did we ever consider this for multiple return values?
fn div(numerator: i32, denominator: i32) -> struct {quotient: i32, remainder: i32} {
return this.ReturnType {
.quotient = numerator / denominator,
.remainder = numerator % denominator,
};
}
To account for named return values:
fn div(numerator: i32, denominator: i32) -> (result: struct {quotient: i32, remainder: i32}) {
result.quotient = numerator / denominator;
result.remainder = numerator % denominator;
}
Now with an error:
error DivByZero;
fn div(numerator: i32, denominator: i32) -> %struct {quotient: i32, remainder: i32} {
if (denominator == 0) return error.DivByZero;
return this.ReturnType {
.quotient = numerator / denominator,
.remainder = numerator % denominator,
};
}
Named return value with an error:
fn div(numerator: i32, denominator: i32) -> (result: %struct {quotient: i32, remainder: i32}) {
if (denominator == 0) {
result = error.DivByZero;
return;
}
result = @typeOf(result).ChildType {
.quotient = numerator / denominator,
.remainder = numerator % denominator,
};
}
That last one is pretty awkward. I'm not satisfied with it. As a reminder, one of the main driving use cases of this issue is the pattern of struct initialization. Instead of (status quo):
var list: List(i32) = undefined;
list.init();
We want instead:
var list = List(i32).init();
It's more than just aesthetics; we're trying to reduce the difference between compile-time code and run-time code, and the former cannot be used to initialize a global variable, while the latter can. So we're trying to make the common pattern work for both. |
Another idea for that last one. We introduce another syntax for error unions and nullable types. The syntax lets you set the value to non-null/non-error, and a pointer to the payload (presumably to write through it). The contents of memory the pointer points to when you use this operation are undefined. I'm also going to throw in there, that you can still
fn div(numerator: i32, denominator: i32) -> (result: %struct {quotient: i32, remainder: i32}) {
if (denominator == 0) return error.DivByZero;
const payload = &%result; // result becomes non-error now, with undefined value payload
payload.quotient = numerator / denominator;
payload.remainder = numerator % denominator;
}
Similarly this would introduce the
This syntax seems to make sense because it's something you might want to do with a nullable type or an error union type anyway. Downside is that it introduces another sigil, which seems to be a common complaint from people newly exposed to zig. Also, even if we don't have multiple function return values, I think it still makes sense that you have to explicitly throw away void, and expressions can result in 0 or 1 values, instead of expressions using void to result in nothing. |
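The "mark it non-error and hand back a pointer to the payload" operation from this comment can be approximated in C with an explicit tagged struct. This is a sketch under assumptions: DivPayload, DivErrUnion, activate_payload, and div_into are hypothetical names, and the result is passed as an explicit out-pointer because C has no named return values.

```c
#include <assert.h>
#include <stdint.h>

enum { ERR_NONE = 0, ERR_DIV_BY_ZERO = 1 };

typedef struct {
    int32_t quotient;
    int32_t remainder;
} DivPayload;

/* Hand-rolled error union: an error code next to the payload. */
typedef struct {
    int32_t err;        /* ERR_NONE means the payload is the active part */
    DivPayload payload;
} DivErrUnion;

/* Analogue of the proposed &%result: flip the union to non-error and return
   a pointer to the payload, whose contents are unspecified until written. */
DivPayload *activate_payload(DivErrUnion *result) {
    assert(result != 0);
    result->err = ERR_NONE;
    return &result->payload;
}

/* Note: INT32_MIN / -1 overflows and is not handled in this sketch. */
void div_into(int32_t numerator, int32_t denominator, DivErrUnion *result) {
    if (denominator == 0) {
        result->err = ERR_DIV_BY_ZERO;
        return;
    }
    DivPayload *payload = activate_payload(result);
    payload->quotient = numerator / denominator;
    payload->remainder = numerator % denominator;
}
```

The fields are written directly into the caller's storage, with no temporary struct built and copied, which is the property the named-return-value form is after.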
This proposal is too big and gnarly. Some of the issues it brought up are solved, some are no longer valid, and the rest are split up into other issues. |
(see below comment for up to date details of this issue)
When make_foo is generated, we should notice that f is always returned, which means instead of allocating stack space for f, we use the secret first argument pointer value and directly put the values there.
Also, in the code generated for function f, we notice the simple assignment, and instead of doing a memcpy, simply allocate the stack variable for foo, and then call make_foo passing the stack variable address as the secret first parameter.
Once these two optimizations are implemented, idiomatic zig code for initializing structs can be an assignment.
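The two optimizations can be pictured by writing the lowered form by hand in C. This is an illustrative sketch, not the compiler's actual output: Foo's fields are taken from the make_foo example in the comments, make_foo_lowered is a hypothetical name, and whether a real compiler passes a hidden result pointer depends on the target ABI and the struct's size.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int32_t x;
    int32_t y;
} Foo;

/* Source-level form: build a local f and return it by value. */
Foo make_foo(int32_t x, int32_t y) {
    Foo f;
    f.x = x;
    f.y = y;
    return f;
}

/* Hand-written lowering: the "secret first argument" made explicit. The
   callee writes straight into the caller's variable, so there is no local
   copy of f and no memcpy on return. */
void make_foo_lowered(Foo *ret, int32_t x, int32_t y) {
    assert(ret != 0);
    ret->x = x;
    ret->y = y;
}
```

A caller that writes Foo foo = make_foo(1, 2); would, after both optimizations, behave like Foo foo; make_foo_lowered(&foo, 1, 2);, with foo's own stack slot serving as the destination.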