functions inside functions (closures) #229
Comments
Would you generate this as a separate function in the IR or as a block? |
Good question. Probably a separate function - that seems easier to do. Do you have any thoughts on the matter? |
If the functions can access variables in the parent function's scope, then it would need to be implemented as a separate block. Function call overhead would also be removed (if there is any). |
Consider that an inner function could call itself recursively. If it were implemented as a function, then the compiler would create a struct for all the local variables in the outer function, and then pass that struct address to the inner function as a hidden parameter, and the inner function has access to the outer function's local variables. If it were implemented as a block, the inner function's local variables would need to be a struct, and when you call the inner function, we allocate the stack space in the outer function's stack for the local variables of the inner function, and push the return address. Both of these strategies would work fine, and I believe the function call overhead we would avoid by doing the block thing is exactly the same overhead we introduce by allocating space for the inner function's variables and pushing the return address. I think the only reason to prefer the function strategy over the block strategy is that it might play more nicely with LLVM's optimizer. But that's just a hunch, and to be sure, an empirical test would be best. |
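The "function strategy" described above can be sketched by hand in today's Zig. This is only an illustration of the idea, not what the compiler does: `OuterFrame`, `inner` and `outer` are hypothetical names, and the struct pointer is written out explicitly where the proposal would pass it as a hidden parameter.

```zig
const std = @import("std");

// Hypothetical desugaring: the outer function's locals that the inner
// function needs are gathered into one struct.
const OuterFrame = struct {
    count: u32,
    limit: u32,
};

// The "inner function", lifted to the top level. `frame` stands in for
// the hidden parameter that would point at the outer function's locals.
fn inner(frame: *OuterFrame, depth: u32) void {
    if (depth >= frame.limit) return;
    frame.count += 1; // writes to a local of the outer function
    inner(frame, depth + 1); // recursion needs nothing special
}

fn outer(limit: u32) u32 {
    // The outer function's locals live in its own stack frame.
    var frame = OuterFrame{ .count = 0, .limit = limit };
    inner(&frame, 0);
    return frame.count;
}

pub fn main() void {
    std.debug.print("{d}\n", .{outer(5)});
}
```

Because `frame` never outlives the call to `outer`, no heap allocation is involved in this sketch.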
I think it's important to put the word "closures" in the title of this issue, since this isn't just about functions with locally scoped names, but functions that have implicitly managed references to bound variables. |
Proposal: convenient syntax for anonymous functions, or function literals. Useful and very common for callbacks and/or higher order functions.
Returning functions should also be possible:
Lambda functions can also capture and refer to variables from the scope of the outer function. While non-capturing lambda functions can be thought of as simple function pointers to global procedures, capturing also requires a pointer to the outer environment. We can think of this couple However, unlike D 2.0, and like D 1.0, I vote against the possibility to return delegates from functions:
Implementing this behavior safely seems extremely difficult (impossible?) and even provably safe cases require the continuation to be saved somewhere in the heap. Having a core language feature depending on an allocator sounds like a bad idea, and introducing a potential source of unsafety even more so. I argue that by only allowing |
There are several topics here to discuss:
Each one of the above bullets could be its own issue to discuss. |
I've been doing some thinking on this topic, so I thought I'd post some thoughts here for consideration. Currently, the compiler lets you do this:
Which I've used to create a sort of comptime closure in the past. Of course if I remove the 'comptime' restriction it crashes when called (even if called from within the outer fn). Compiler should probably stop me from doing that. It'd be nice to skip the struct part, but personally I don't see much of a need for non-comptime closures or lambdas, but here's a way I think they could work: Step 1: Anonymous functions, similar to how structs work:
Step 2: Introduce a new kind of type: closure. A closure would internally be a struct with variables and a function pointer, but can be called like a function. When defining the closure, you specify which currently scoped variables get copied into the struct, they are given the same name. Some inference can make it less painful to write.
This is functionally equivalent to:
You can allocate a closure on the heap (or wherever) by specifying an allocator:
|
@tgschultz
|
Arg! Played around a little more with C++ and my example is invalid, as all lambdas in C++ have different types (I don't know why. Why don't two lambdas with the same closure have the same type?). The size problem is still there though:
|
@Hejsil:
MyClosure is now basically the same as this struct:
later, it is initialized:
I thought it'd make sense to use the payload operator here, since the type of the variable(s) the closure encloses is already defined and we're really just assigning a name for use in the code block. When |
@tgschultz Alright, so I would assume that all functions taking functions would need to be generic, yes?
This also means that it's quite hard to store an array of closures. I guess one could store pointers to them, and when called, the functions would know internally how to interpret their closure.
|
I'd say yes. If it takes And yeah, you wouldn't be able to have an array of closures of different types, as it'd be equivalent to having an array of different struct types. So you'd have to use a union or pointers. |
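The union option mentioned above could look roughly like this. It is a minimal sketch under the assumption that each closure is an ordinary struct with a `call` method; `AddN`, `MulN`, `Closure` and `applyAll` are hypothetical names made up for the illustration.

```zig
const std = @import("std");

// Two "closures" with different captured state, written as plain structs.
const AddN = struct {
    n: i32,
    fn call(self: AddN, x: i32) i32 {
        return x + self.n;
    }
};

const MulN = struct {
    n: i32,
    fn call(self: MulN, x: i32) i32 {
        return x * self.n;
    }
};

// A tagged union gives both closure types one common type, so they
// can be stored side by side in an array or slice.
const Closure = union(enum) {
    add: AddN,
    mul: MulN,

    fn call(self: Closure, x: i32) i32 {
        return switch (self) {
            .add => |c| c.call(x),
            .mul => |c| c.call(x),
        };
    }
};

// Feeds `start` through every closure in order.
fn applyAll(closures: []const Closure, start: i32) i32 {
    var acc = start;
    for (closures) |c| acc = c.call(acc);
    return acc;
}

pub fn main() void {
    const pipeline = [_]Closure{
        .{ .add = .{ .n = 3 } },
        .{ .mul = .{ .n = 4 } },
    };
    std.debug.print("{d}\n", .{applyAll(&pipeline, 1)});
}
```

The cost of this approach is that the set of closure types has to be known up front; the pointer-based alternative avoids that at the price of an indirect call.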
I am trying to understand the proposed @tgschultz if you could pass closures where functions were expected, you could do a lot of really interesting things. It is an open question whether that violates the Zen of Zig too much. Closures inherently hide things. It can enable some really clean idioms. |
The closure type is not a Where I use a naked Perhaps I have combined too many ideas here (there's also anonymous functions and initializing functions using a type alias), and I made a few mistakes in the example code, leading to some of the confusion. The thing about passing a closure to something expecting a function is that a closure is actually a struct in this proposal. I suppose you can have the compiler transparently handle it via ducktyping, but I think that makes it difficult to reason about what's going on. And I can't think of a way it could work with extern, but maybe there is one. If I'm honest, I'm not fond of the closure part of this proposal. I spent a lot of time thinking about how it could work and fit in with what I understand to be Zig's philosophy, and I just don't think the concept works well in that context. If you make it more obvious what's going on, you end up with my "functionally equivalent" code in the example, if you make it more transparent you make it difficult to reason about and potentially need to introduce hidden allocations. I think allowing anonymous functions makes sense for consistency with how pretty much everything else in Zig works, and it would allow comptime closures without having to wrap it in an empty struct, and I think that's good enough, personally. Maybe we should split this issue off into |
@Hejsil, I don't know about C++, but I think that in Rust, the implicit Closure type actually generates a new struct type with its own name. So, even if the "method" signature is the same as another Closure, it is a different named type. |
Hi, someone new to Zig chiming in, with some ideas: Coming from C, all I really would want from Zig are anonymous functions (not closures / no environment capturing), ideally as function-ptr expressions, e.g.:

const double_i32 = fn(x: i32) i32 { return x * 2; };
comptime {
    assert(@typeOf(double_i32) == fn(i32)i32);
}

Anonymous function-ptr expressions would, I think, be a great addition; they seem to harmonize with the zen and keep the code just as readable as if named functions were used. Possible use-case example:
Coming from C++ / Rust, I would also want comptime function parameters.

fn doNTimes(comptime n: usize, comptime T: type, comptime func: fn(x: T) void, arg: T) void {
    comptime var i: usize = 0;
    inline while (i < n) : (i += 1) {
        func(arg);
    }
}

// from doNTimes(3, i32, foo, 123)
fn doNTimes(arg: i32) void {
    foo(arg);
    foo(arg);
    foo(arg);
}
However, this opens up a whole can of questions:
|
@Lisoph aside from anonymous functions and traits (which has its own issue), what you're asking about already exists. Notably, comptime evaluation does everything you're describing and more (although we may remove variadic functions entirely, but that's a separate issue). Someday we'll have better docs, examples, blog posts, video talks, etc. GitHub issues are pretty inefficient for teaching about the language. |
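As an illustration of the point that comptime function parameters already work, here is a small sketch of the `doNTimes` idea written against current Zig syntax. The global `total` and the helper `addToTotal` are assumptions added only so the effect of the calls is observable.

```zig
const std = @import("std");

// Hypothetical observable state, only here so the calls have an effect.
var total: i32 = 0;

fn addToTotal(x: i32) void {
    total += x;
}

// `func` is known at compile time, so each call site is specialized
// and the loop is unrolled by `inline while`.
fn doNTimes(comptime n: usize, comptime T: type, comptime func: fn (T) void, arg: T) void {
    comptime var i: usize = 0;
    inline while (i < n) : (i += 1) {
        func(arg);
    }
}

pub fn main() void {
    doNTimes(3, i32, addToTotal, 123);
    std.debug.print("{d}\n", .{total});
}
```

Since `func` is a comptime parameter, a separate copy of `doNTimes` is generated for each distinct function passed in, which is the code-size trade-off that comes with this style.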
I'm not completely sure how stack closures are implemented, but I don't think they need to be different sizes from one another. I think that a pointer to a stack frame (of the lexically containing function) and a pointer to a function should be sufficient.

fn main() void {
    // ...
    var a: f64 = 12.0;
    iter.map(x -> x * a);
}

becomes (and I'm making up some builtins that probably shouldn't exist):

fn _anonymous(frame: @FramePtr(main), x: f64) f64 {
    return x * @frameLookup(frame, "a");
}
fn main() void {
    // ...
    var a: f64 = 12.0;
    iter.map(Callback(f64, f64) {
        .context = @framePtr(),
        .function = _anonymous,
    });
}

I'd much prefer that the usage of closures be tailored to only the most common uses; no syntax to specify by-value capturing, and definitely not heap allocation. In order to accommodate less-common usage, the

// Custom callback structs
const CountClicksCtx = struct {
    nclicks: u32,
};
fn countClicks(ctx: *CountClicksCtx, event: *Event) void {
    if (event.kind != Event.Kind.Click) return;
    ctx.nclicks += 1;
}
fn main() !void {
    // ...
    // create() returns a pointer to a single item; alloc() would return a slice
    const clickCounter = try arena.create(CountClicksCtx);
    clickCounter.nclicks = 0;
    window.addListener(Callback(*Event, void) {
        .context = clickCounter,
        .function = countClicks,
    });
}

// Non-closure syntax sugar
fn double(x: f64) f64 {
    return x * 2;
}
fn main() void {
    // ...
    iter.map(double);
    // becomes:
    iter.map(x -> double(x));
}

I could be overlooking something, but I think |
Throwing in my thoughts (and apologies if I'm repeating anything already discussed):
Edit: After looking through @tgschultz's earlier post, I guess |
I hacked together userland closures: https://github.com/fengb/zig-closure No idea if this is actually usable in practice... |
I also hacked together userland closures, although it looks like with quite a different goal from fengb. https://gist.github.com/adrusi/c32e8133914fab5800d0eea7e3d6be15

tl;dr

test "closures" {
    var x: i32 = 60;
    const foo = closure(struct {
        x: *i32,
        pub fn call(self: *const @This(), y: i32) i32 {
            self.x.* += y;
            return 420;
        }
    } { .x = &x });
    assert(foo.call(.{ @as(i32, 9) }) == 420);
    assert(x == 69);
} |
After discussing with @andrewrk and @thejoshwolfe, we've decided that function literals (as accepted in #1717) should not be able to implicitly capture any state. It would be too easy to accidentally close over a pointer to stack memory and clobber your stack. It would also cause a problem where closures are often half-comptime (function pointer known) and half-runtime (runtime closed-over data), which the type system doesn't have a good way of handling.

Additionally, closures tend to encourage a highly functional programming style. While there are some merits to this style, it often obscures where memory is stored or what the associated lifetime is. The explicit capture syntax in C++ sidesteps this problem to some degree, but only works well because of C++'s use of constructors and destructors. A Zig equivalent would need to be more explicit, probably in ways that are dependent on the specific use case. It also forces the optimizer to choose between two terrible options: duplicate functions which accept closures (causing potentially large code bloat), or pass runtime function pointers (damaging CPU performance). Because of these drawbacks, we feel that the language should not take on additional complexity to support this style.

However, note that #1717 alongside Zig's support for runtime function pointers and anonymous tuples make it possible to implement type safe abstractions that suffer less from these drawbacks.

For those reasons, we are closing this issue. The workaround for this use case is to use a stateless lambda alongside a state object, and pass the state object to the lambda when it is invoked. |
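The workaround described in the closing comment might look like the following. This is a minimal sketch, not an official pattern: `Scale`, `scaleBy` and `mapInPlace` are hypothetical names, and the function is passed as a comptime parameter here, which is one of the two options (specialization rather than a runtime function pointer) mentioned above.

```zig
const std = @import("std");

// The state object: data a closure would otherwise capture implicitly.
const Scale = struct {
    factor: i32,
};

// The stateless "lambda": everything it needs arrives as a parameter.
fn scaleBy(state: *const Scale, x: i32) i32 {
    return x * state.factor;
}

// Hypothetical higher-order helper that accepts the state object and
// the function separately and hands the state back on every call.
fn mapInPlace(
    comptime State: type,
    state: *const State,
    comptime func: fn (*const State, i32) i32,
    items: []i32,
) void {
    for (items) |*item| {
        item.* = func(state, item.*);
    }
}

pub fn main() void {
    var values = [_]i32{ 1, 2, 3 };
    const state = Scale{ .factor = 10 };
    mapInPlace(Scale, &state, scaleBy, &values);
    std.debug.print("{any}\n", .{values});
}
```

The caller owns `state`, so its storage and lifetime stay visible at the call site, which is the property the decision above is trying to preserve.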
To revisit this a bit, I noted in issue #6965 that some parts of this (not general closures!) do solve the problem there. The general idea: Allow a limited form of nested function.
These constraints are somewhat onerous, but GCC manages this just fine, so there is an existence proof. I have not done investigation with Godbolt or other to see what GCC does for optimization. The key problem here is that without the restrictions on assignment etc. a nested function could escape to be used outside of its defining call tree. The prohibition on taking the address and on assignment aim to solve that. I have not thought through all the edge cases to see if it is still a problem with these restrictions. |
I'm glad this was the decision because this would have been really bad news for anyone using zig on low cost/power embedded devices. Some projects completely disallow any heap allocations. |
I need anonymous functions and don't want them to be wrapped in a struct. Closures are very complicated and can be omitted. I support Zig keeping it simple and close to the implementation logic of C. For those who really need closures, you can turn all the variables you need to capture into explicit parameters, so only anonymous functions are needed.

const f = struct {
    fn f() void {}
}.f;
// I want
const f = fn() void {}; |
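For reference, the struct-wrapping idiom combined with the "captures become explicit parameters" suggestion can be written in current Zig roughly as follows. `sumScaled` and the local `scale` are hypothetical names chosen for this sketch.

```zig
const std = @import("std");

fn sumScaled(values: []const i32, factor: i32) i32 {
    // A function local to sumScaled, wrapped in an anonymous struct.
    // It cannot see `factor`, so the value is passed in explicitly as `k`.
    const scale = struct {
        fn call(x: i32, k: i32) i32 {
            return x * k;
        }
    }.call;

    var total: i32 = 0;
    for (values) |v| total += scale(v, factor);
    return total;
}

pub fn main() void {
    const data = [_]i32{ 1, 2, 3 };
    std.debug.print("{d}\n", .{sumScaled(&data, 2)});
}
```

The struct wrapper is pure syntax overhead here, which is the point of the request: the inner function captures nothing and needs no allocation.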
It makes sense for there to be a function inside a function, as long as it doesn't require allocating any memory, and the function is only callable while the outer function is still running.