RFC: deprecate for loop vars that overwrite outer vars (#22314) #22659

Merged: 1 commit merged into master on Aug 23, 2017

Conversation

JeffBezanson
Member

Connoisseurs of elaborate deprecations will appreciate this.

This scoping change basically has two cases. The first, simpler and more common case, is where the loop variable overwrites an existing variable in a way that doesn't affect program behavior:

i = g()
x = f(i)   # use i
for i = 1:n
    ...
end
return    # i not used again

The second, more breaking, case is where the last value of the loop variable is actually used:

local i
for i = 1:n
    ... # probably break at some point
end
y = f(i)   # use last value

So I implemented two deprecation modes: one warns for all cases, the other tries to identify only the second case. Warning in all cases was pretty annoying for Base, though good for identifying potentially confusing variable name collisions. So the limited mode is more useful, but very approximate, as getting it right would require full def-use analysis. It does seem to catch almost all cases that occur in practice though. The commit here has limited mode enabled. I have not looked at the test suite yet.
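
A minimal sketch of the two cases, assuming the post-deprecation semantics that ship in Julia 1.x (the loop variable is always a fresh local). The function names are hypothetical and exist only for illustration.

```julia
# Case 1: the loop variable merely shares a name with an earlier local.
function name_collision(n)
    i = 10              # outer local
    x = i + 1           # the outer i is used before the loop
    for i = 1:n         # Julia 1.x: a new i, unrelated to the outer one
    end
    return (x, i)       # the outer i is unchanged
end

# Case 2: code that relied on the last loop value no longer sees it.
function relies_on_last_value(n)
    local i
    for i = 1:n         # Julia 1.x: does not assign the outer i
        i == 2 && break
    end
    return @isdefined(i)
end

name_collision(3)        # (11, 10)
relies_on_last_value(5)  # false
```

Only the second function changes behavior under the new rule, which is why the limited warning mode targets it.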

@JeffBezanson added the deprecation and compiler:lowering labels Jul 2, 2017
test/read.jl Outdated
@@ -176,7 +176,8 @@ for (name, f) in l
old_text = text
cleanup()

-for text in [
+for text_ in [
+    text = text_
Contributor


this doesn't look right...

@ararslan
Member

ararslan commented Jul 3, 2017

I'm a bit confused by this. Capturing the value in the last iteration is a pretty common task. How would one have to write that after this deprecation?

@JeffBezanson
Member Author

You would have to explicitly assign a separate variable to the loop variable, e.g.

local lastval
for i = iter
    if done
        lastval = i; break
    end
end
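
A runnable sketch of this workaround, assuming Julia 1.x. The function name `first_match` and the `pred` argument are hypothetical stand-ins for the `done` condition above.

```julia
# Copy the loop variable into a separate local before leaving the loop.
function first_match(pred, iter)
    local lastval
    for i in iter
        if pred(i)
            lastval = i   # i itself is not visible after the loop
            break
        end
    end
    # lastval stays undefined if no element matched (or iter was empty)
    return @isdefined(lastval) ? lastval : nothing
end

first_match(iseven, [1, 3, 4, 5])  # 4
first_match(iseven, [1, 3])        # nothing
```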

@ararslan
Member

ararslan commented Jul 3, 2017

Thanks for the explanation. That seems fine to me, if a bit not-entirely-obvious. I suppose it provides a clearer scoping rule for this case. What happens at the end of the deprecation period? Does reusing a variable name in this way then become an error?

It appears @vtjnash isn't a fan, based on his use of reactions. Jameson: What are your concerns with this approach?

@JeffBezanson
Member Author

The idea is to make `for` behave like `let`, i.e. always introduce a new variable. That will be the behavior after the deprecation.

@JeffBezanson
Member Author

Ref #22314

@JeffBezanson force-pushed the jb/loopvars branch 2 times, most recently from 147fae2 to bfa4ef6, July 10, 2017 16:32
@JeffBezanson
Member Author

Update: the syntax `for outer i = ...` has been proposed to get the existing behavior. This seems like a good idea, since there are clearly plenty of cases where this is useful, and adding `outer` is much easier than introducing an extra variable. Suggestions on spelling etc. are welcome. My main question is whether this commits us to a more general `outer` scoping declaration.
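
A short sketch of how the proposed spelling reads in practice, assuming the `for outer` form as it exists in Julia 1.x. The function name is hypothetical.

```julia
# `outer` makes the loop reuse the existing local i instead of creating a new one.
function index_of_first_negative(xs)
    i = 0                          # value kept if xs is empty
    for outer i = 1:length(xs)
        xs[i] < 0 && break         # i keeps the index where we stopped
    end
    return i
end

index_of_first_negative([3, 1, -4, 1])  # 3
index_of_first_negative(Int[])          # 0
```

Note that `outer` needs an existing local variable to refer to, so the sketch wraps the loop in a function rather than running it at top level.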

@ararslan
Member

ararslan commented Aug 1, 2017

> Suggestions on spelling etc. are welcome

+1 for `for outer i = ...`, spelled as shown 🙂

> My main question is whether this commits us to a more general `outer` scoping declaration.

I don't think so. There aren't many places where declaring something as outer is useful (or a generally good idea) when global wouldn't get the job done, with the notable exception of this particular case. So, at least in my opinion, limiting outer to for declarations is perfectly reasonable; it just becomes part of the for syntax.

@martinholters
Member

The more obvious thing would be not to change `for i = ...` and let `for local i = ...` make `i`, well, local to the loop. But I just don't think many people would bother adding that `local` to their code...

@StefanKarpinski
Member

StefanKarpinski commented Aug 1, 2017

I think you've hit the nail on the head in your own comment, @martinholters – requiring `for local` to localize the loop variables is just the wrong default, and people will do the easy thing instead of the right thing. @JeffBezanson, can you think of some scoping situation where `outer` would be needed? I'm still a little surprised that it hasn't really been necessary so far.

@JeffBezanson
Member Author

I think we'd only need it if we changed the default scoping behavior. Of course it's possible that even now somebody might want "access the outer x even though I have a local x", but that's a really weird thing to do. Fortunately I think the meaning of outer in for outer i = ... would be compatible with whatever else we might do here.

@StefanKarpinski
Member

I wasn't worried about incompatibility, but rather about staking out the `outer x` syntax before 1.0 in case we need to add it. I don't think accessing an outer `x` because there's a local `x` shadowing it is a reasonable need – as I said before, just change the name of the inner `x` in that case. So it seems like we may not need `outer x` in general, just `for outer x`.

@StefanKarpinski
Member

Relevant prior discussions of nonlocal / outer keyword: #5331, #9955, #10559, #16727.

@mauro3
Contributor

mauro3 commented Aug 2, 2017

Related is also #19324.

Introducing `outer` just for use in for-loop headers seems like too much of a micro-optimization. But I am still in favor of ditching the hard/soft scope distinction for functions and requiring `outer` to assign to an outer variable.

@rfourquet
Member

I like what was proposed here:

last_i = for i in iter # last_i could also be spelled i
    # ...
end

as an alternative to `for outer i in iter ... end`.

@JeffBezanson
Member Author

> I am still in favor of ditching hard/soft scope distinction for functions and requiring outer to assign to an outer variable.

This isn't what the hard/soft scope distinction means. If you ignore global scope and only look inside functions, all scope constructs behave the same and use the same default of inheriting parent scope variables.
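
To illustrate the point about function bodies, here is a small sketch, assuming Julia 1.x: inside a function, `for`, `let`, and `while` blocks all assign to an existing enclosing local by default. The function name is hypothetical.

```julia
function scopes_inherit()
    total = 0
    for k = 1:3
        total += k       # updates the enclosing total (1 + 2 + 3)
    end
    let
        total += 10      # a let block also sees and updates it
    end
    n = 0
    while n < 2
        n += 1
        total += 100     # and so does a while body
    end
    return total
end

scopes_inherit()  # 216
```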

@JeffBezanson
Member Author

> last_i = for i in iter

I have to say, this is pretty appealing. No new word required, and much purer (in that you only need to return a value, and not repeatedly update a variable).

@StefanKarpinski
Member

I guess the only downside – if it even is one – is that for loops return their last iterator value/tuple. I'm wondering if that would be annoying in interactive usage or if it might even be useful? I guess the only way to find out is to try it.

@mauro3
Contributor

mauro3 commented Aug 8, 2017

What is returned when the iterator is empty?

@StefanKarpinski
Member

> What is returned when the iterator is empty?

Ah, yes, that's a problem. I suppose it would have to be a default value like nothing, or we could have an else clause on for loops as proposed in #22891. This is a slightly different notion, however: to be useful for supplying the for loop's value, the else clause would have to execute when there are zero iterations, which is the opposite of the interpretation Python gives for/else.

@StefanKarpinski
Member

In the case where the iterator can be empty, instead of writing something like this:

i = 0
for outer i = 1:n
    # do something
end

you would have to do this instead:

i = for i = 1:n
    # do something
end
i == nothing && (i = 0)

which is pretty awkward.
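
For comparison, a runnable sketch of the first form, assuming the `for outer` syntax as it exists in Julia 1.x. The function name is hypothetical. The pre-assigned default simply survives when the range is empty.

```julia
function last_index(n)
    i = 0                # default used when the loop runs zero times
    for outer i = 1:n
        # do something
    end
    return i
end

last_index(0)  # 0
last_index(4)  # 4
```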

@JeffBezanson
Member Author

It's a bit ad-hoc, but I'd actually prefer to only provide this when the for loop is the right side of an assignment. Then if the loop is empty, the assignment doesn't happen and you might get an undefined var error. Now that we can check @isdefined, this isn't so bad.

@rfourquet
Member

Not that I think it's an especially good idea, but one possibility is to have `for` return a Nullable (null for empty loops).

@ararslan
Member

ararslan commented Aug 8, 2017

That would put us in the same situation as Stefan mentioned above, except instead of

i == nothing && (i = 0)

you have

i = get(i, 0)

Marginally less awkward but still awkward.

@JeffBezanson
Member Author

I know, it's a bit weird. But there are other cases where an apparent assignment doesn't happen, like

try
    x = error()
catch
end
x

or

x = p(i) ? 0 : break

Anyway, I'm ok with for outer as well.
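
A runnable sketch of the first of these cases, assuming Julia 1.x. The function name is hypothetical, and `local x` is added so that `x` belongs to the function scope rather than to the `try` block.

```julia
function skipped_by_throw()
    local x
    try
        x = error("never returns")  # the right-hand side throws before binding x
    catch
    end
    return @isdefined(x)
end

skipped_by_throw()  # false
```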

@StefanKarpinski
Member

StefanKarpinski commented Aug 10, 2017

The cases where an apparent assignment doesn't happen can all be explained by the RHS expression being evaluated first and the value of the RHS being bound to a local name as a separate step: if the expression never returns, the binding step doesn't happen. In this case, you would have an apparent assignment from a RHS expression that does evaluate normally and return, but depending on how it evaluates, the assignment might not happen. We don't have anything else like that in the language and I think it would be radically more confusing and unexpected.

The `for outer` thing isn't amazing, but at least it's pretty clear what it means – if the loop variable isn't defined before the loop and the loop executes zero times, I don't think anyone would be surprised that it's still undefined after the loop.
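
A small sketch of that situation, assuming the `for outer` semantics in Julia 1.x. The function name is hypothetical. The variable is declared but not assigned before the loop, so whether it ends up defined depends on whether the loop ran.

```julia
function defined_after_loop(n)
    local i              # declared, but has no value yet
    for outer i = 1:n
    end
    return @isdefined(i)
end

defined_after_loop(0)  # false: zero iterations, i was never assigned
defined_after_loop(2)  # true
```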

@JeffBezanson
Member Author

Implemented `for outer` and updated the manual. This should be ready to go very soon now.

provide the old behavior via `for outer i = ...`
@ararslan
Member

Can you elaborate on the places where `outer` is recognized? For example, will `for outer i = x, j = y` declare both `i` and `j` to be in the outer scope of `i`, in their respective outer scopes, or will `i` be outer and `j` be local? Is `for outer i = x, outer j = y` valid syntax, and if so what does it mean?

@rfourquet
Member

This is a great change. I was recently bitten by an outer loop variable being altered by an inner loop variable of the same name (inserted by a splicing macro, so not easy to spot). I don't love the name `outer`, but I guess we'll get used to it. What about `nonlocal` instead? I'm also glad that @tkluck came up with his PR at the right time to offer a more sound (I think) meaning for a return value from `for` than what was considered here, thanks!

@JeffBezanson
Member Author

`outer` is part of the `i in itr` iteration specification, so in `for outer i = x, j = y` only `i` is outer, and `for outer i = x, outer j = y` is valid and makes both variables outer.

@ararslan
Member

> and `for outer i = x, outer j = y` is valid and makes both variables outer.

Outer in the parent scope or outer to their respective scopes? If the former, it's no longer equivalent to

for outer i = x
    for outer j = y
        ...
    end
end

since in that case j would be encased in the i loop scope but not in the outer scope of i. Right?

@JeffBezanson
Member Author

The short form is exactly equivalent to the form with nested `for`s, except for the behavior of `break`.

@ararslan
Member

Okay, so the `outer` annotation on `j` in `for outer i = x, outer j = y` isn't actually useful then, right? Since there's no body that's within the `i` loop but not within the `j` loop.

@JeffBezanson
Member Author

The outer j can come from any enclosing scope.
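
A sketch of the per-specification behavior described above, assuming the `for outer` syntax as it exists in Julia 1.x. Both function names are hypothetical. In each case the outer variables come from the enclosing function scope.

```julia
# Only i is marked outer; j is a fresh loop-local variable.
function only_i_outer(n, m)
    i = 0; j = 0
    for outer i = 1:n, j = 1:m
    end
    return (i, j)
end

# Both are marked outer; both enclosing locals are updated.
function both_outer(n, m)
    i = 0; j = 0
    for outer i = 1:n, outer j = 1:m
    end
    return (i, j)
end

only_i_outer(2, 3)  # (2, 0)
both_outer(2, 3)    # (2, 3)
```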

@ararslan
Member

Ah, okay. Thanks for the explanation, much appreciated. Would it be worth adding some of that to the documentation?

@JeffBezanson
Member Author

Sure it would, I was just kind of hoping people wouldn't use this feature :)

@JeffBezanson merged commit edd8278 into master Aug 23, 2017
@JeffBezanson deleted the jb/loopvars branch August 23, 2017 23:31
martinholters added a commit to HSU-ANT/ACME.jl that referenced this pull request Aug 29, 2017
@rfourquet
Member

Sorry if I'm missing something obvious, but the following errors out:

julia> i = 0; for (outer i, j) in [(1, 2)]; end
ERROR: syntax: missing comma or ) in argument list

It would be convenient to support this; shall I open an issue?
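
Until something like that is supported, one possible workaround is the explicit-assignment pattern mentioned earlier in the thread. This is a sketch assuming Julia 1.x; the function name and the temporary `a` are hypothetical.

```julia
function last_first_element(pairs)
    i = 0
    for (a, j) in pairs
        i = a            # copy the destructured value into the enclosing i
    end
    return i
end

last_first_element([(1, 2), (3, 4)])  # 3
```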

@JeffBezanson
Member Author

#23511
I was reeeeally hoping to avoid feature creep here.

@rfourquet
Member

Oh sorry to not have found this. The "break-with-value-and-else" for-loop PR would also be satisfying for my case.

tkluck added a commit to tkluck/julia that referenced this pull request Feb 1, 2018
Labels
compiler:lowering Syntax lowering (compiler front end, 2nd stage) deprecation This change introduces or involves a deprecation

7 participants