Notation for lazy map #19198
Comments
Ref. discussion around #16285 (comment) (generalizing compact broadcast syntax to compact higher-order function syntax, relevant given the behavior mentioned above seems like |
That's the problem with special syntax; there's never enough of it. I think the dots are pretty obscure in this case. What might be nice, and has been proposed before, is |
(One annoyance is the precedence: shouldn't |
In #16285 we discussed making |
I did not know about #16285. Anyway, I think I prefer a lazy map to a materializing map. IMHO, I'd prefer that
If I wanted to materialize it I could On the other hand, I cannot "unmaterialize" |
It is an interesting question whether one could make a lazy broadcast that is as efficient as a fusing materializing broadcast. That is, suppose For example, suppose that you cascade a sequence of these operations: |
Yep --- with laziness, performance variance tends to go up. |
Pardon my ignorance, but is it possible to "fuse" the generators (as is done in the materializing broadcast)? That is, is it possible to fuse
into
instead of
Can this be done automatically by the compiler? Then the efficiency issues of the lazy |
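To illustrate the question about fusing generators, here is a minimal sketch. The functions f and g and the range v are placeholders, and the two forms shown are an assumption about what the comment contrasts, since its original code did not survive. Both forms are lazy and yield the same values; the difference is only whether one generator wraps another.

```julia
# Placeholder functions; not taken from the thread.
f(x) = x + 1
g(x) = 2x
v = 1:3

# A generator wrapping another generator: two layers of iteration state.
nested = (g(y) for y in (f(x) for x in v))

# The hand-fused form: one generator applying g(f(x)) per element.
fused = (g(f(x)) for x in v)

# Neither has computed anything yet; collect forces evaluation.
nested_values = collect(nested)
fused_values = collect(fused)
```

Whether the compiler reduces the nested form to the fused one depends on inlining, so this sketch makes no performance claim.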
Of course; if |
My point is that with lazy It's a matter of taste then: Either you prefer short syntax for materializing broadcast, and use long syntax for lazy map (generators) (This is the way it is now). Or you prefer slightly larger syntax for materializing broadcast ( |
I had been playing with the idea of explicit (higher-order?) function composition (#17184, https://github.com/FugroRoames/CoordinateTransformations.jl) and I find that creating compositions and then (possibly later) applying them to functions is an incredibly powerful way of doing "lazy" evaluation. The idea is that This leads to a few interesting ways of doing things - you can (a) construct the entire chain of functions until the point you want the "full" output relatively easily and then call that, or (b) wrap these up in some kind of "lazy" broadcasting composition (maybe with another operator), or (c) perform static (or dynamic) analysis of the function chain to optimize it, in the same way e.g. SQL will optimize a query. I had started playing with this last idea but haven't had any time to commit to it - but in principle it would be a bit like Query.jl but without any macros. I suspect that (b) is very close to what you want - and the great thing is that you don't need any further support from the parser, the compiler or |
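As a rough illustration of composing first and applying later, the sketch below uses the composition operator ∘ from Base rather than anything from CoordinateTransformations.jl. The functions f and g are placeholders.

```julia
# Placeholder functions; not taken from the thread.
f(x) = x + 1
g(x) = 2x

# Build the composition up front. Nothing is evaluated at this point.
h = g ∘ f

v = [1, 2, 3]

# Apply the whole chain in one pass, with no intermediate array for f.(v).
result = map(h, v)
```

The composed function h is an ordinary value, so it can be passed around, extended with further ∘ calls, or handed to map, sum, or a generator whenever the output is actually needed.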
Now that we internally have a lazy broadcast object via #26891, it might be nice to have a syntax so that fused dot calls like Available (non-breaking) syntaxes seem to include One could also make a macro In #16285, the syntaxes |
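For readers unfamiliar with the lazy broadcast object mentioned here, the following sketch builds one by hand with Base.Broadcast.broadcasted, which is roughly what a fused dot call constructs before materializing. The functions f and g and the array x are placeholders, and consuming the object with sum assumes a Julia version in which Broadcasted objects are iterable.

```julia
using Base.Broadcast: broadcasted, instantiate, materialize

# Placeholder data and functions; not taken from the thread.
x = [1.0, 2.0, 3.0]
f(a) = a + 1
g(a) = 2a

# Roughly the lazy object behind g.(f.(x)), before any array is allocated.
bc = instantiate(broadcasted(g, broadcasted(f, x)))

# Consume it without materializing (assumes Broadcasted is iterable).
total = sum(bc)

# Materializing gives the same result as the ordinary fused dot call.
arr = materialize(bc)
```

This is the verbose spelling that a dedicated non-materializing syntax or macro would abbreviate.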
I would also love something like |
Perhaps semicolon syntax could be used as a sign that the expression is to be run for side effects alone: [edit: on second thought, I don't think this would be as simple as it seemed, since you could imagine saying |
A macro could be a simple solution, e.g. |
@tkoolen, that seems to be subsumed by a non-materializing broadcast syntax, since then you could just use For example, if we adopt the |
That'd be a bit of a pun since bc = (for g.(f.(x))) # or whatever
foreach(i->bc[i], CartesianIndices(axes(bc))) |
@mbauman, my thinking is that the unmaterialized (Right now, |
Ah, yes of course. I'm blinded by sitting too close to the code. That said, we do have a zero-argument julia> foreach(()->println("hello"))
hello It's just the degenerate 0-arg case in a vararg, calling |
I think having foreach(args -> f(args...), @lazy tuple.(x, y, z)) which, although not that pretty, at least puts the |
I guess maybe with the new tuple destructuring, foreach(@lazy tuple.(x, y, z)) do (a, b, c)
println(a + b * c)
end |
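The @lazy macro in the snippet above is hypothetical. A version that runs without it can be sketched by constructing the lazy broadcast explicitly, assuming a Julia version in which Broadcasted objects are iterable. The arrays and the results vector are placeholders.

```julia
using Base.Broadcast: broadcasted, instantiate

# Placeholder data; not taken from the thread.
x, y, z = [1, 2], [10, 20], [100, 200]
results = Int[]

# Stand-in for the hypothetical `@lazy tuple.(x, y, z)`.
lazy_tuples = instantiate(broadcasted(tuple, x, y, z))

# Each element is a tuple, destructured in the do-block argument.
foreach(lazy_tuples) do (a, b, c)
    push!(results, a + b * c)
end
```

No array of tuples is allocated here; each tuple is produced on demand as foreach iterates.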
There is nothing preventing us from implementing a single argument version of foreach for Broadcasted objects. Or a different function. That issue is far down on my priority list because it requires no language changes or breakage. The first question is how to get a non-materializing broadcast; I’m partial to the The other prerequisite is to make Broadcasted objects iterable. |
There are three different aspects of Broadcasted's "iterability." Internal implementation, exposed API, and supported arguments. My current plan is to:
While it feels like we should be able to refactor Broadcasted to work purely based upon iteration alone (and not indexing), I think it'd be really hard and possibly impossible for some combinations of iterable arguments. Doing so performantly is even worse. We'd have to very carefully structure the order of iteration and nested loops so we never need to backtrack up dimensions. Also it would change the semantics of broadcast. For example, with a pure iteration implementation, in |
Now that broadcasted objects are iterable, we should bikeshed a syntax for a non-materializing dot call. Just to start things off, maybe a non-binding straw poll of some non-breaking options (which eliminates
|
wouldn't |
Maybe the right way to do this is for |
Can we use a macro |
Thanks for pushing this forward! Regarding the syntax, I'm not sure I like |
Thanks for the comments!
I'm seeing
I think explicit is great but I don't think long means good. For example, lowered form of broadcasting expression is quite long but it's harder for humans to understand. I think Also, I think
I think this is a great idea because it can be applied to multiple broadcasting expressions in a block. At the same time, it's also nice to have a single-purpose macro like |
This is great, thanks for tackling it @tkf! On the name, I find |
Some ideas: |
FWIW, the three macros in that family |
Good point. It has to be a new macro if we were to do this. But people can experiment with the interface in libraries if we have good support for lazy broadcast. |
Another motivation for this notation is that it is useful for defining "broadcastable call overload": https://discourse.julialang.org/t/best-practice-for-broadcastable-callables/21000 |
The new syntax |
Another solution is to implement an |
We "just" have to decide the syntax (which is probably always the hardest part...). Unfortunately, it seems that there is no consensus in Triage #31553 (comment). I tried to push the discussion by implementing proof-of-concepts but I'm "out of bullets" ATM. Regarding the implementation of broadcasted reduction, there is already #31020. I believe specializing |
I would second that. The ArrayReducer hack was done before the big 1.0 broadcast changes, which solved my immediate problem so I agree that it would not add anything significant here, and in particular #31020 is much more powerful. Good luck anyway in finding a syntax for the lazy map, I would be eager to have it in Julia Base as well. |
I think we do want something here, but this is a pretty deep feature and I think it doesn't hurt to let the idea simmer in the collective consciousness for a while. I'm just writing this to make it clear that people aren't against this, nor are we stonewalling, this is just the kind of feature that requires some significant rumination so that you don't end up baking in something that is good but not the best. |
Gilad Bracha has developed a new language that generalizes array broadcasting into n-dimensional streams.
Interactive in-browser introduction | Talk video | GitHub I really like the idea of non-materializing broadcast fusion over infinite streams like |
The notation f.(v) effectively maps f over an array v. This operation is not lazy; it materializes the mapped array.
I think it could be useful to have an analogous notation for lazy maps. Maybe f..(v)? Instead of materializing the mapped array, this should return something equivalent to the generator (f(x) for x in v).
For example, this could be useful to do something like sum(f..(v)), which is more efficient than materializing the intermediate array in sum(f.(v)).
Of course right now sum takes an optional function argument, so one can write sum(f,v). But see the discussion here: #19146. If one decides to remove the method sum(f,v), I think the notation sum(f..(v)) could be a nice alternative.
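The three spellings that already exist can be compared in a short sketch. The function f and the range v are placeholders, and the proposed lazy notation is not used because it is not existing syntax.

```julia
# Placeholder function and data; not taken from the thread.
f(x) = x^2
v = 1:1_000

# Materializing broadcast: allocates a temporary array, then sums it.
eager = sum(f.(v))

# Generator: lazy, no intermediate array.
lazy = sum(f(x) for x in v)

# The existing method that takes the function as its first argument.
direct = sum(f, v)
```

All three produce the same value. The generator form is what the proposed notation would abbreviate.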