Allow loops to return values other than () #961

Closed
pnkfelix opened this issue Mar 10, 2015 · 176 comments
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.

Comments

@pnkfelix
Member

Extend for, loop, and while loops to allow them to return values other than ():

  • add an optional else clause that is evaluated if the loop ended without using break;
  • add an optional expression parameter to break expressions to break out of a loop with a value.

Proposed in #352

Some discussion of future-proofing is available in #955

@JelteF

JelteF commented Jan 23, 2016

I'm just starting with Rust, but why would loops not just return their last statement, like everything else seems to do (functions, if, match)?

break statements could be treated like early returns and accept an expression parameter. The same could be done for continue statements, its argument would only be used when the loop exits afterwards.

The else clause is not even needed when implemented like this.

@ticki
Contributor

ticki commented Jan 24, 2016

@JelteF the problem is that a loop without a return or break statement will never return; for that reason you cannot end your loop on an expression.

@withoutboats
Contributor

@ticki for and while loops do return without a break, and could evaluate to the value of the final expression in their block on the last iteration.

@ticki
Contributor

ticki commented Jan 25, 2016

@withoutboats Yeah, that's right. for and while could definitely return a value! But loop cannot, without being able to specify a return value to the break statement.

@KalitaAlexey

You want to make loops return the value of their last expression? You want to make it implicit, without any break?
What if I want to break out of the while and return a value? Are you suggesting that I write code like this?

let mut done = false;
while !done {
    let value = get_value();
    done = value.is_valid();
    value
}

I think this is ugly.
We need an else block because the loop may end before it starts, when its exit condition already holds at the start.

@pnkfelix
Member Author

Update: never mind the note below (which is preserved just so the responses that follow continue to make sense).

  • The reason that the note is irrelevant is that we currently have rules in the type checker ensuring that if a loop body ends with an expression, that expression must have type ().
  • (This, I think, will ensure that no existing code will introduce temporaries that need to outlive the loop body itself, unless I am wrong about how the temporary r-value extents are assigned in such a case...)

@JelteF another reason to not just use the last expression in the loop-form's body as its return value is that it would be a breaking change: loop-form bodies today are allowed to end with an expression that is evaluated and then discarded.

Returning the last expression implicitly would change the dynamic extent of the returned value, which in turn would change the lifetime associated with it and its r-value temporaries. And that change would probably inject borrowck errors into existing stable code.

(The alternative of solely using break and else for the return value would be, I think, entirely backwards compatible...)

@JelteF

JelteF commented Jan 25, 2016

@ticki You might have misunderstood. I only said we did not need the else block but the break and continue statements are obviously needed for early loop exits.

@KalitaAlexey I did suggest code like that, but the break could still be used. It is a very good point that you are making though. I had not thought about the case that the loop would never be evaluated. It seems you are right that the else block is needed in cases where the loop body is never executed, so there is no way to return from it.

@pnkfelix I'm not sure what the breaking change is, since the check for the type of the return value could simply be skipped in cases where it is not saved in a value.

@nagisa
Member

nagisa commented Jan 25, 2016

add an optional else clause that is evaluated if the loop ended without using break;

Please, no. Python has this, and every time I encountered it I had to jump into REPL and see when this else thing would get evaluated: on or in absence of break.

I’m not that opposed to being able to “return” something from the loop with a break, but the necessity of adding an else-ish thing makes this a no-brainer minus one million to me.

@golddranks

I think that an else block is strictly more expressive than returning the final expression, and it is the only thing that makes sense in the presence of value-returning breaks. If the value of the final expression is going to be returned, why should the evaluation of the last run of the loop be any different from any other run? And are the final expressions (given that there are no side effects) evaluated for nothing on all the other runs? Optimizing them away then becomes a burden on the compiler.

If the value-returning break isn't hit, then there needs to be an alternative path that returns a value of the same type. It doesn't have to be named "else", but I think that's a sensible name.

@erikjohnston

@nagisa

Please, no. Python has this, and every time I encountered it I had to jump into REPL and see when this else thing would get evaluated: on or in absence of break.

This bites me every time too, mainly because, while useful, it's a rarely used feature. Maybe a better/more accurate name would help? (Nothing immediately springs to mind though.)

Or perhaps something slightly different, like having a default expression instead? e.g.:

for x in iterator {
  if foo(x) {
    break "yes!";
  }
} default {
  "awww :("
}

Where the default expression is evaluated either if iterator is empty or foo(x) is false for all x in iterator.

@JelteF

JelteF commented Jan 25, 2016

@erikjohnston @nagisa

I agree that the else is always confusing when seeing it. I do think it will be less of a problem when break returns a value, which it doesn't do in Python. But the case still exists where else would be used as in Python: when the value is not saved in anything and the break might be empty.

I think another name would indeed be good. Something that comes to my mind would simply be nobreak. It's short and describes quite clearly what it is for.

PS. I retract my initial proposal about using the last statement instead of the else block, because of the good arguments against it.

@glaebhoerl
Contributor

FWIW I think the best way to move forward on this, incrementally, would be to start by only allowing break EXPR inside of loops, and to not touch any of the other looping constructs for now. That sidesteps all the other tricky design questions we've been spinning in circles around.

@JelteF

JelteF commented Jan 25, 2016

@glaebhoerl
I doubt that's a good way to go about it. It will only encourage people to "hack" a for or a while loop inside a loop loop. I've not heard any argument against using the else statement except for its name.

@arielb1
Contributor

arielb1 commented Jan 25, 2016

This can kind-of be already done as

{
    let _RET;
    for x in iter {
        if pred(x) {
            _RET = x;
            break;
        }
    }
    _RET
}
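
As written, the snippet above leaves _RET unassigned when the loop never breaks, which the compiler's definite-initialization check would be expected to reject. A minimal sketch of a compilable variant, assuming the "no match" case is represented with Option (the helper name first_match is hypothetical, not from the thread):

```rust
/// Returns the first item for which `pred` holds, or `None` if the loop
/// runs to completion without breaking.
fn first_match<I, P>(iter: I, pred: P) -> Option<I::Item>
where
    I: IntoIterator,
    P: Fn(&I::Item) -> bool,
{
    // `None` plays the role of the value for the "never broke" path.
    let mut ret = None;
    for x in iter {
        if pred(&x) {
            ret = Some(x);
            break;
        }
    }
    ret
}

fn main() {
    println!("{:?}", first_match(1..4, |x| *x == 2));
    println!("{:?}", first_match(1..4, |x| *x == 9));
}
```

The mutable variable and the explicit assignment before break are exactly the ceremony that a value-carrying break on for would remove.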

@Ericson2314
Contributor

Yet again, @glaebhoerl says exactly what I was going to say :). Somebody want to make an RFC for this, I'd be willing to help?

@JelteF heh, that's kinda the point! Once people see how nice this is, there will be more motivation to actually reach a consensus on break-with-value for other types of loops (and maybe even normal blocks!).

@JelteF

JelteF commented Jan 25, 2016

@arielb1 Of course it can be done, but the point is that this:

let a = for x in 1..4 {
    if x == 2 {
        break x
    }
} nobreak {
    0
}

looks much cleaner than this:

let a = {
    let mut _ret = 0;
    for x in 1..4 {
        if x == 2 {
            _ret = x;
            break;
        }
    }
    _ret
};

@Ericson2314 It seems that if the only consensus that needs to be reached is the naming, it could be solved rather quickly. It would be weird to hurry an incomplete proposal, if all that needs to be done is pick a name for a statement.
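
For comparison, the nobreak shape can be approximated in later Rust versions without a mutable variable by breaking out of a labeled block with a value (stable since Rust 1.65). This is only a sketch; the function name first_equal_to and the label 'search are illustrative assumptions:

```rust
/// Emulates `for ... { break x } nobreak { 0 }` with a labeled block.
fn first_equal_to(target: i32) -> i32 {
    let a = 'search: {
        for x in 1..4 {
            if x == target {
                // Leaves the whole labeled block with the value `x`.
                break 'search x;
            }
        }
        // Reached only if the loop finished without breaking:
        // this is the "nobreak" value.
        0
    };
    a
}

fn main() {
    println!("{}", first_equal_to(2));
    println!("{}", first_equal_to(7));
}
```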

@Ericson2314
Contributor

@JelteF Well I'll grant you that originally there were more ideas, but because #955 did not happen else { } is the last one that makes sense. On the other hand, there are a few more small details than just the keyword. E.g. should this work?

<some loop> {
    ...
    break; // as opposed to break (); 
    ...
} else {
    my_fun() // returns ()
};

@nagisa Anyone playing around will notice that the type checker will require break .. and else { .. } to have the same type. IMO that will help make clear the behavior, no manuals needed.

@JelteF

JelteF commented Jan 25, 2016

@Ericson2314 I don't see a reason why that should not work. In Python it is not an expression and it still has a use, namely handling the edge case when the loop does not break. A simple example can be found here: https://shahriar.svbtle.com/pythons-else-clause-in-loops#but-why
Copy pasted:

for x in data:
    if meets_condition(x):
        break
else:
    # raise error or do additional processing 

vs

condition_is_met = False
for x in data:
    if meets_condition(x):
        condition_is_met = True

if not condition_is_met:
    # raise error or do additional processing

As for your comment @nagisa. In this case it might not be directly clear what the else does, which is why I think another name would still be clearer.

@nagisa
Member

nagisa commented Jan 25, 2016

I passionately hate the idea itself of having an else keyword associated with a loop in any way, @Ericson2314. It simply makes no sense and I highly doubt one can prove otherwise. Thinking about it, it might actually make some sense if the else block was executed when 0 iterations of the loop are executed, but that’s overall a useless construct.

I don’t want to see any of that weirdness in Rust just because Python has it. One might argue for a new keyword, but that ain’t happening either, because of backwards compatibility.

EDIT: All looping constructs have trivial desugarings into a plain old loop, @glaebhoerl, so there’s no necessity to do any of the “only allow x in y” dance, I think.

@glaebhoerl
Contributor

@nagisa Sure, but the desugarings of while and for into loops contain breaks, ones which don't return a value (said differently: return ()) -- so if you want to break with a value elsewhere in the loop you have a type mismatch. This is precisely what the else clause would be for: in effect it's doing nothing else but providing the value argument to the implicit break embedded in the while/for constructs.
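
The point about the implicit break can be illustrated with a hand-written approximation of the for desugaring. This is a simplified sketch, not the compiler's exact lowering; the function name find_first_even and the default parameter are hypothetical:

```rust
/// Hand-desugared `for x in values { if x % 2 == 0 { break x } } else { default }`.
fn find_first_even(values: &[i32], default: i32) -> i32 {
    let mut iter = values.iter();
    loop {
        match iter.next() {
            Some(&x) => {
                if x % 2 == 0 {
                    // The user-written break, carrying a value.
                    break x;
                }
            }
            // The break that a `for` loop performs implicitly when the
            // iterator is exhausted. It must carry a value of the same
            // type, which is what the proposed `else` clause would supply.
            None => break default,
        }
    }
}

fn main() {
    println!("{}", find_first_even(&[1, 3, 4, 5], -1));
    println!("{}", find_first_even(&[1, 3], -1));
}
```

Without a value on the None arm, the two breaks would have types i32 and (), which is the mismatch described above.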

@JelteF

JelteF commented Jan 25, 2016

@nagisa
If new keywords are a problem, maybe something like !break could be used, which I guess is currently invalid syntax.

@Ericson2314
Contributor

@nagisa Personally, it reminds me of the base case of a fold, and thus actually feels quite elegant.

@golddranks

@nagisa If all looping constructs desugar into loop, we can use a value-returning break without problems with loop, but with for, the types don't unify because there is more than one way to return from the loop: either break, or just looping till the end, which currently produces (). That's why we need some kind of "default" return value for the case where a break isn't hit. Is it just the keyword else you are detesting, or the concept of having a default return value in itself?

I just came to think of another possibility for for loop: the for loop could return an Option<T>. This way, we could write

let result = for i in haystack_iter {
    if i == needle {
        break "Found!";
    }
}.unwrap_or("Not found :(");

This is nice in the sense that it doesn't need any new keywords or reusing old keywords in a surprising way.
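
For reference, the same Option-then-unwrap_or shape is already expressible with iterator adaptors in stable Rust. A minimal sketch, where the function name search and the slice-based signature are assumptions made for illustration:

```rust
/// Mirrors the `for ... { break "Found!" }.unwrap_or("Not found :(")` idea
/// using `Iterator::find`, which yields `Some(item)` on the first match.
fn search(haystack: &[i32], needle: i32) -> &'static str {
    haystack
        .iter()
        .find(|&&i| i == needle) // Option<&i32>: Some on the first match
        .map(|_| "Found!") // the value the hypothetical `break` would carry
        .unwrap_or("Not found :(") // the value for the "never broke" path
}

fn main() {
    println!("{}", search(&[1, 2, 3], 2));
    println!("{}", search(&[1, 2, 3], 9));
}
```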

@KalitaAlexey

@golddranks But that's confusing. It is like functional programming but ugly.

@golddranks

Another note: sometimes I've written a loop that is expected to set some outer state inside the loop. But because setting the state may be expensive (as it was in that particular case), you might want to avoid setting a "default" state before running the loop.

But this results in the fact that the control flow analysis can't be sure if the state is set in the end, since it's possible that the loop runs 0 times. I have to make a boolean flag to check, and even then, if the analysis isn't super smart, it won't be sure. Having a default/nobreak (whatever the name is going to be) code block would help the flow analysis in these kinds of situations. EDIT: of course for that to be of any help, there should be a piece of information available whether the loop terminated without running even once, or if it terminated because it iterated until the end.

@withoutboats
Contributor

Per #1767 I'm closing this issue. We now support loops to evaluate to non-() values, but we've decided that none of the solutions to making for and while evaluate to other values have small enough downsides to implement. else confuses users, !break is a very surprising syntax, and making them evaluate to Option<T> is a breaking change.

We're open to revisiting this some day if conditions change a lot. For example, possibly now that break value is on stable, we'll find out we are frequently transforming for loops into loops to access this feature. Maybe if we are close to finalizing a generator/coroutine proposal, the calculus on this will change.
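
A minimal sketch of the feature that did land: break with a value inside a plain loop, which makes the loop expression evaluate to that value. The function name first_power_of_two_above is an assumption chosen for illustration:

```rust
/// Doubles `n` until it exceeds `limit`, then yields it from the loop.
fn first_power_of_two_above(limit: u32) -> u32 {
    let mut n = 1;
    // The whole `loop` is an expression; its value is whatever `break` carries.
    loop {
        if n > limit {
            break n;
        }
        n *= 2;
    }
}

fn main() {
    println!("{}", first_power_of_two_above(100));
}
```

The same break n inside a for or while body is not accepted, which is the gap the rest of this thread discusses.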

@orent

orent commented Apr 20, 2017

In case this ever gets revisited, how about combining @glaebhoerl's idea of moving the break into the block and using the 'final' keyword as proposed by @canndrew:

... } final { break value }

I would find the meaning obvious enough reading this code even if I was not familiar with the feature (something I can't say about python's for/else).

@petrochenkov petrochenkov added T-lang Relevant to the language team, which will review and decide on the RFC. and removed postponed RFCs that have been postponed and may be revisited at a later time. labels Feb 24, 2018
@exprosic

exprosic commented Feb 8, 2020

In case this ever gets revisited, how about combining @glaebhoerl's idea of moving the break into the block and using the 'final' keyword as proposed by @canndrew:

... } final { break value }

I would find the meaning obvious enough reading this code even if I was not familiar with the feature (something I can't say about python's for/else).

I would suggest then instead of final, since in all currently popular languages where it exists, final(ly) means the exact opposite of getting executed only when not being break-ed before, which is getting executed whatsoever. then would avoid the sort of naming tragedy like return in the Haskell community.

then also avoids the semantic confusion brought by else, since it naturally has a sequential meaning (I eat, then I walk) in parallel with its role in the conditional combination (if/then). In places where it joins two blocks ({ ... } then { ... }) instead of a boolean and a block (x<y then { ... }), the sequential semantics prevails intuitively.

@xkr47

xkr47 commented Jun 14, 2020

For people not having enough time to read through the whole set of comments here, summary:

  1. "workaround" exists: Allow loops to return values other than () #961 (comment)
  2. Issue of for/while returning Option<T> instead is code like: Allow loops to return values other than () #961 (comment)
  3. Issue of else clause being confusing Allow loops to return values other than () #961 (comment)
  4. Lang team meeting decision of too many downsides in all approaches so far (in forked issue): Allow for and while loop to return value (without discussion of keyword names) #1767 (comment)

I wonder if the backward-compatibility issues are no longer such an issue now that we have the "edition" feature and the associated cargo fix --edition to update old code to behave correctly even in the newer edition.

growingspaghetti referenced this issue in growingspaghetti/project-euler Jul 21, 2020
@haltman-at

I'll suggest coda as a name, maybe?

@hazer-hazer

hazer-hazer commented Jul 7, 2021

@arielb1

let a = for x in 1..4 {
    if x == 2 {
        break x
    }
} nobreak {
    0
}

Hello from a Rust newcomer in 2021 who is interested in compiler development. I like this one, and I think it could be done without introducing a new keyword, being parsed with one token of lookahead, using ! and break like so:

for (...) {}
// if current is `!` and next is `break` then parse noReturn clause

We've got this:

let a = for x in 1..4 {
    if x == 2 {
        break x
    }
} !break {
    0
}

This won't break the parser as far as I can see, since ! has higher precedence than break, so it should be impossible to encounter a case where someone has written let a = !break.

UPD: Sorry, didn't see that this is already proposed 😝

@porky11

porky11 commented Jul 19, 2021

"finally" and "then" both sound like it's always executed.
"else" is fine. Python uses it. It means, "if some valid result is found in the iterator, return it, else return the else case"

@kevincox

Else does work well in a lot of cases but can be a little confusing in others. I have definitely needed to explain it to people before in python. As you said it works well for for x in thing { if x.is_good() { return x } } else { return y } but that is even slightly less obvious if you are using for as an expression rather than using return. That being said in the expression case the symmetry to if else is very nice.

Overall even if not "perfect" I think it is the best option suggested so far by a fair margin.

@SimonSapin
Contributor

What do people really mean when they say that for/else is "confusing"?

It is definitely unfamiliar to a lot of programmers and one can’t very easily guess what it does, but I don’t believe it is actively misleading either. It’s not like it exists elsewhere with a different meaning, or one could easily guess incorrectly. When someone encounters it for the first time they will not know what it means, and can easily look it up. In a search engine the first few results for "for else python" all do a decent job at explaining it.

@kevincox

kevincox commented Jul 20, 2021 via email

@golddranks

golddranks commented Jul 20, 2021

The "If the loop runs at all, else" interpretation is interesting also in the sense that the loop body can have side effects. In case some of those side effects are things that matter from the viewpoint of control flow, such as "does a variable get initialized", the else body becomes a language-level enabler for initializing variables in the case where the loop body didn't initialize them. It's a pattern that the compiler ought to be able to reason about, unlike the impossible task of reasoning about the behaviour of arbitrary iterators.

So, I'd say there are two interesting "else" cases with loops:

  1. to be able to evaluate to a value, the loop must break, but if it doesn't break, we need a "break-or-else" body to handle that.
  2. to be able to cause a side-effect, the loop must run at least once. If it doesn't run even once, we need a "run-or-else" body to handle that.

The names "break-or-else" and "run-or-else" are just for clarification; I don't have any concrete syntax proposals, but I think that if there were a clear, understandable syntax for both, both would be valuable additions to an expressive language like Rust.
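
The difference between the two cases can be made concrete with a small sketch in today's Rust that tracks both conditions by hand. The Outcome enum and the classify function are hypothetical names used only for illustration:

```rust
/// The three ways a search loop can end.
#[derive(Debug, PartialEq)]
enum Outcome {
    Broke(i32), // the loop exited early with a value
    Exhausted,  // the loop ran at least once but never broke ("break-or-else")
    NeverRan,   // the loop body did not run even once ("run-or-else")
}

fn classify(values: &[i32], target: i32) -> Outcome {
    // Manual flag: the compiler cannot otherwise know whether the body ran.
    let mut ran = false;
    for &v in values {
        ran = true;
        if v == target {
            return Outcome::Broke(v);
        }
    }
    if ran {
        Outcome::Exhausted
    } else {
        Outcome::NeverRan
    }
}

fn main() {
    println!("{:?}", classify(&[1, 2, 3], 2));
    println!("{:?}", classify(&[1, 2, 3], 9));
    println!("{:?}", classify(&[], 9));
}
```

A "break-or-else" body would cover both Exhausted and NeverRan, while a "run-or-else" body would cover only NeverRan.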

@andersk
Contributor

andersk commented Jul 20, 2021

The “if the loop runs at all, else” interpretation is not possible, because this would not result in a value:

let a = for x in 1..4 {
    if false {
        break x;
    }
} else {
    0
};

@golddranks

golddranks commented Jul 20, 2021

@andersk that's why I'm saying that a for loop with a "run-or-else" block but no "break-or-else" block should behave like the current for and not evaluate to anything (other than ()).

To elaborate further, I'm not arguing that for loops should have else blocks, or, if they did, what their semantics should be; I'm arguing that there are two interesting cases with for loops that warrant running a conditional body of code. Actually, I think that's an argument that loops shouldn't have else blocks at all, since it's not clear which sensible semantics they would have.

A possible syntax just occurred to me: nobreak blocks and norun blocks. But that's bikeshedding.

@haltman-at

If you're willing to add new keywords, there's any number of possibilities that are clearer than else (such as your suggestion of nobreak). You'll notice that there's been a bunch of discussion of possible syntax above, such as !break (which has the advantage of not requiring a new keyword but the disadvantage of potentially being confusing); I'll repeat my suggestion from earlier of coda. :) I think the big question is whether that's something people are willing to do.

This idea of norun seems to be unrelated (seemingly suggested by the word "else" rather than the actual semantics being discussed) and I'm not sure that belongs here, or even makes much sense.

@golddranks

golddranks commented Jul 20, 2021

seemingly suggested by the word "else" rather than the actual semantics being discussed

Indeed, by itself it's off-topic, but the existence of another sensible semantics for else is an argument against using else as a keyword, so it deserves a mention.

@dhardy
Contributor

dhardy commented Jul 20, 2021

to be able to cause a side-effect, the loop must run at least once. If it doesn't run even once, we need a "run-or-else" body to handle that.

I assume you're talking about variable assignments/bindings:

  • Assigning to a let binding requires the assignment to happen exactly once, hence requires a nobreak case exactly like for returning a value
  • Assigning to a let mut binding which might be assigned to zero, one or several times is a corner-case not worth worrying about in my opinion. Better initialise with a default value or use Option or revise the code (e.g. use a function call)

@golddranks

@dhardy I've been following this thread for years, and I just noticed that I'm running in loops myself (the same take, earlier: #961 (comment)). I just want to point out that I find your argument about variable assignment actually convincing, so thank you; I wish I could remember what I originally felt the need for. The abstract argument is clear: to be able to guarantee a side effect at least once. The concrete need, not so much anymore. I still prefer nobreak, introduced by @JelteF, over else, and I'm glad to see they started a new issue about reserving that keyword.

@scottmcm
Member

The “if the loop runs at all, else” interpretation is not possible, because this would not result in a value

It could work with letting the body of the loop be non-unit. For example,

let last =
    for x in whatever {
        Some(x)
    } else {
        None
    };
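
The value this hypothetical syntax would compute can already be obtained in stable Rust; the sketch below spells the loop out by hand and compares it with Iterator::last. The function name last_item is an assumption for illustration:

```rust
/// Hand-written equivalent of `for x in items { Some(x) } else { None }`
/// under the "value of the final iteration, else fallback" reading.
fn last_item<I: IntoIterator>(items: I) -> Option<I::Item> {
    // `None` is the `else` value: it survives only if the body never runs.
    let mut last = None;
    for x in items {
        last = Some(x);
    }
    last
}

fn main() {
    println!("{:?}", last_item(vec![1, 2, 3]));
    // The standard library offers the same result directly.
    println!("{:?}", vec![1, 2, 3].into_iter().last());
}
```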

@hazer-hazer

hazer-hazer commented Jul 21, 2021

@andersk

The “if the loop runs at all, else” interpretation is not possible, because this would not result in a value:

But it is possible to check a for loop whose result is used for a kind of exhaustiveness.
In your case:

let a = for x in 1..4 {
    if false {
        break x;
    }
} else {
    0
};

Here's a CFG branch that does not point to any target value (here it is code after if false {...}).
The compiler has options like:

  1. Disallow this at all
  2. Object that you need to cover else case yourself inside the for body (like if false {} else {})

The first option simply makes introducing the else clause pathetic 😐. The second one is pretty complex and makes the else clause pointless, as the user needs to cover all cases in the for body; it also requires a const value check in the CFG.

I am upset by the fact that such a nice feature has no good way to be implemented.

Also, if the value of for is immediately used, the interpretation with Option is possible (I'm not sure if it was already proposed):

let a = for x in 1..4 {
    
} else {
    0
};

Becomes

let a = loop {
    // Iterator lowering stuff without running
    // This is the first run of iterator next
   match iterator.next() {
       Some(T) => // Continue iterating keeping in mind that one iteration already done
       None => // Run `else` clause 
   }
};

By doing so, all for-loops whose return values are used return Option<T>

@phaux

phaux commented Jul 21, 2021

(slightly offtopic but important to consider)

In the future when generators are stabilized it would make sense to allow iterating over yielded values with the for loop and the whole loop could evaluate to the return type of the generator. So if you have gen: Generator<Yield=i32, Return=Result<(), Error>> you could do:

for i in gen {
  // do something with i
}?; // handle error

@fstirlitz

@phaux I went over this. (Not just once, but on Internals before as well.)
