diff --git a/text/2394-async_await.md b/text/2394-async_await.md new file mode 100644 index 00000000000..a25b9f6ee10 --- /dev/null +++ b/text/2394-async_await.md @@ -0,0 +1,654 @@ +- Feature Name: async_await +- Start Date: 2018-03-30 +- RFC PR: https://github.com/rust-lang/rfcs/pull/2394 +- Rust Issue: https://github.com/rust-lang/rust/issues/50547 + +# Summary +[summary]: #summary + +Add async & await syntaxes to make it more ergonomic to write code manipulating +futures. + +This has a companion RFC to add a small futures API to libstd and libcore. + +# Motivation +[motivation]: #motivation + +High performance network services frequently use asynchronous IO, rather than +blocking IO, because it can be easier to get optimal performance when handling +many concurrent connections. Rust has seen some adoption in the network +services space, and we wish to continue to enable those users - and to enable +adoption by other users - by making it more ergonomic to write asynchronous +network services in Rust. + +The development of asynchronous IO in Rust has gone through multiple phases. +Prior to 1.0, we experimented with having a green-threading runtime built into +the language. However, this proved too opinionated - because it impacted every +program written in Rust - and it was removed shortly before 1.0. After 1.0, +asynchronous IO initially focused around the mio library, which provided a +cross-platform abstraction over the async IO primitives of Linux, Mac OS, and +Windows. In mid-2016, the introduction of the futures crate had a major impact +by providing a convenient, shared abstraction for asynchronous operations. The +tokio library provided a mio-based event loop that could execute code +implemented using the futures interfaces. + +After gaining experience & user feedback with the futures-based ecosystem, we +discovered certain ergonomics challenges. 
Using state which needs to be shared +across await points was extremely unergonomic - requiring either Arcs or join +chaining - and while combinators were often more ergonomic than manually +writing a future, they still often led to messy sets of nested and chained +callbacks. + +Fortunately, the Future abstraction is well suited to use with a syntactic +sugar which has become common in many languages with async IO - the async and +await keywords. In brief, an asynchronous function returns a future, rather +than evaluating immediately when it is called. Inside the function, other +futures can be awaited using an await expression, which causes them to yield +control while the future is being polled. From a user's perspective, they can +use async/await as if it were synchronous code, and only need to annotate their +functions and calls. + +Async/await & futures can be a powerful abstraction for asynchronicity and +concurrency in general, and likely have applications outside of the asynchronous +IO space. The use cases we have experience with today are generally tied to async +IO, but by introducing first class syntax and libstd support we believe more +use cases for async & await that are not tied directly to asynchronous IO will +also flourish. + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +## Async functions + +Functions can be annotated with the `async` keyword, making them "async +functions": + +```rust +async fn function(argument: &str) -> usize { + // ... +} +``` + +Async functions work differently from normal functions. When an async function +is called, it does not enter the body immediately. Instead, it evaluates to an +anonymous type which implements `Future`. As that future is polled, the +function is evaluated up to the next `await` or return point inside of it (see +the await syntax section next). 
+ +An async function is a kind of delayed computation - nothing in the body of the +function actually runs until you begin polling the future returned by the +function. For example: + +```rust +async fn print_async() { + println!("Hello from print_async") +} + +fn main() { + let future = print_async(); + println!("Hello from main"); + futures::block_on(future); +} +``` + +This will print `"Hello from main"` before printing `"Hello from print_async"`. + +An `async fn foo(args..) -> T` is a function of the type +`fn(args..) -> impl Future`. The return type is an anonymous type +generated by the compiler. + +### `async ||` closures + +In addition to functions, async can also be applied to closures. Like an async +function, an async closure has a return type of `impl Future`, rather +than `T`. When you call that closure, it returns a future immediately without +evaluating any of the body (just like an async function). + +```rust +fn main() { + let closure = async || { + println!("Hello from async closure."); + }; + println!("Hello from main"); + let future = closure(); + println!("Hello from main again"); + futures::block_on(future); +} +``` + +This will print both "Hello from main" statements before printing "Hello from +async closure." + +`async` closures can be annotated with `move` to capture ownership of the +variables they close over. + +## `async` blocks + +You can create a future directly as an expression using an `async` block: + +```rust +let my_future = async { + println!("Hello from an async block"); +}; +``` + +This form is almost equivalent to an immediately-invoked `async` closure. +That is: + +```rust +async { /* body */ } + +// is equivalent to + +(async || { /* body */ })() +``` + +except that control-flow constructs like `return`, `break` and `continue` are +not allowed within `body` (unless they appear within a fresh control-flow +context like a closure or a loop). 
How the `?`-operator and early returns +should work inside async blocks has not yet been established (see unresolved +questions). + +As with `async` closures, `async` blocks can be annotated with `move` to capture +ownership of the variables they close over. + +## The `await!` compiler built-in + +A builtin called `await!` is added to the compiler. `await!` can be used to +"pause" the computation of the future, yielding control back to the caller. +`await!` takes any expression which implements `IntoFuture`, and evaluates to a +value of the item type that that future has. + +```rust +// future: impl Future +let n = await!(future); +``` + +The expansion of await repeatedly calls `poll` on the future it receives, +yielding control of the function when it returns `Poll::Pending` and +eventually evaluating to the item value when it returns `Poll::Ready`. + +`await!` can only be used inside of an async function, closure, or block. +Using it outside of that context is an error. + +(`await!` is a compiler built-in to leave space for deciding its exact syntax +later. See more information in the unresolved questions section.) + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +## Keywords + +Both `async` and `await` become keywords, gated on the 2018 edition. + +## Return type of `async` functions, closures, and blocks + +The return type of an async function is a unique anonymous type generated by +the compiler, similar to the type of a closure. You can think of this type as +being like an enum, with one variant for every "yield point" of the function - +the beginning of it, the await expressions, and every return. Each variant +stores the state that is needed to be stored to resume control from that yield +point. + +When the function is called, this anonymous type is returned in its initial +state, which contains all of the arguments to this function. 
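The "enum with one variant per yield point" description above can be sketched by hand. This is only an illustration under several assumptions: `AddOne`, `YieldOnce`, `NoopWaker` and `block_on` are hypothetical names, the code uses the `Future::Output` and `std::task` API of current stable `std` rather than the `Item` naming used in this RFC, and the hand-written type is `Unpin` (it holds no internal references), unlike the compiler-generated type proposed here.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical leaf future: pending on its first poll, ready on its second.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            // Request another poll before handing control back.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Hand-written stand-in for an async function with one await point.
enum AddOne {
    /// Initial state: stores the argument.
    Start { x: u32 },
    /// Suspended at the await point: stores live state and the awaited future.
    Waiting { x: u32, inner: YieldOnce },
    /// The function has returned.
    Done,
}

/// Calling the "async function" only builds the initial state.
fn add_one(x: u32) -> AddOne {
    AddOne::Start { x }
}

impl Future for AddOne {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        // Allowed because this sketch is `Unpin`.
        let this = self.get_mut();
        loop {
            match *this {
                AddOne::Start { x } => {
                    *this = AddOne::Waiting { x, inner: YieldOnce { yielded: false } };
                }
                AddOne::Waiting { x, ref mut inner } => match Pin::new(inner).poll(cx) {
                    Poll::Pending => return Poll::Pending,
                    Poll::Ready(()) => {
                        *this = AddOne::Done;
                        return Poll::Ready(x + 1);
                    }
                },
                AddOne::Done => panic!("polled after completion"),
            }
        }
    }
}

/// Waker that does nothing; enough for a busy-polling executor.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, for demonstration only.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn main() {
    let future = add_one(41);
    // Nothing has run yet: the value is still in its initial state.
    assert!(matches!(future, AddOne::Start { x: 41 }));
    println!("{}", block_on(future));
}
```

The `Done` arm mirrors the rule that polling after `Ready` panics; a real executor would park the thread instead of spinning as `block_on` does here.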
+ +### Trait bounds + +The anonymous return type implements `Future`, with the return type as its +`Item`. Polling it advances the state of the function, returning `Pending` +when it hits an `await` point, and `Ready` with the item when it hits a +`return` point. Any attempt to poll it after it has already returned `Ready` +once will panic. + +The anonymous return type has a negative impl for the `Unpin` trait - that is +`impl !Unpin`. This is because the future could have internal references which +means it needs to never be moved. + +## Lifetime capture in the anonymous future + +All of the input lifetimes to this function are captured in the future returned +by the async function, because it stores all of the arguments to the function +in its initial state (and possibly later states). That is, given a function +like this: + +```rust +async fn foo(arg: &str) -> usize { ... } +``` + +It has an equivalent type signature to this: + +```rust +fn foo<'a>(arg: &'a str) -> impl Future + 'a { ... } +``` + +This is different from the default for `impl Trait`, which does not capture the +lifetime. This is a big part of why the return type is `T` instead of `impl +Future`. + +### "Initialization" pattern + +One pattern that sometimes occurs is that a future has an "initialization" step +which should be performed during its construction. This is useful when dealing +with data conversion and temporary borrows. Because the async function does not +begin evaluating until you poll it, and it captures the lifetimes of its +arguments, this pattern cannot be expressed directly with an `async fn`. 
+ +One option is to write a function that returns `impl Future` and ends in an +`async` block, so that the code before the block is evaluated immediately: + +```rust +// only arg1's lifetime is captured in the returned future +fn foo<'a>(arg1: &'a str, arg2: &str) -> impl Future + 'a { + // do some initialization using arg2 + + // async block whose body runs only once the future is polled + async move { + // asynchronous portion of the function + } +} +``` + +## The expansion of await + +The `await!` builtin expands roughly to this: + +```rust +let mut future = IntoFuture::into_future($expression); +let mut pin = unsafe { Pin::new_unchecked(&mut future) }; +loop { + match Future::poll(Pin::borrow(&mut pin), &mut ctx) { + Poll::Ready(item) => break item, + Poll::Pending => yield, + } +} +``` + +This is not a literal expansion, because the `yield` concept cannot be +expressed in the surface syntax within `async` functions. This is why `await!` +is a compiler builtin instead of an actual macro. + +## The order of `async` and `move` + +Async closures and blocks can be annotated with `move` to capture ownership of +the variables they close over. The order of the keywords is fixed to +`async move`. Permitting only one ordering avoids confusion about whether it is +significant for the meaning. + +```rust +async move { + // body +} +``` + +# Drawbacks +[drawbacks]: #drawbacks + +Adding async & await syntax to Rust is a major change to the language - easily +one of the most significant additions since 1.0. Though we have started with +the smallest beachhead of features, in the long term the set of features it +implies will grow as well (see the unresolved questions section). Such a +significant addition must not be taken lightly, and should be made only with +strong motivation. + +We believe that an ergonomic asynchronous IO solution is essential to Rust's +success as a language for writing high performance network services, one of our +goals for 2018. 
Async & await syntax based on the Future trait is the most +expedient & low risk path to achieving that goal in the near future. + +This RFC, along with its companion lib RFC, makes a much firmer commitment to +futures & async/await than we have made previously as a project. If we decide to +reverse course after stabilizing these features, it will be quite costly. +Adding an alternative mechanism for asynchronous programming would be more +costly because this exists. However, given our experience with futures, we are +confident that this is the correct path forward. + +There are drawbacks to several of the smaller decisions we have made as well. +There is a trade off between using the "inner" return type and the "outer" +return type, for example. We could have a different evaluation model for async +functions in which they are evaluated immediately up to the first await point. +The decisions we made on each of these questions are justified in the +appropriate section of the RFC. + +# Rationale and alternatives +[alternatives]: #alternatives + +This section contains alternative design decisions which this RFC rejects (as +opposed to those it merely postpones). + +## The return type (`T` instead of `impl Future`) + +The return type of an asynchronous function is a sort of complicated question. +There are two different perspectives on the return type of an async fn: the +"interior" return type - the type that you return with the `return` keyword, +and the "exterior" return type - the type that the function returns when you +call it. + +Most statically typed languages with async fns display the "outer" return type +in the function signature. This RFC proposes instead to display the "inner" +return type in the function signature. This has both advantages and +disadvantages. + +### The lifetime elision problem + +As alluded to previously, the returned future captures all input lifetimes. By +default, `impl Trait` does not capture any lifetimes. 
To accurately reflect the +outer return type, it would become necessary to eliminate lifetime elision: + +```rust +async fn foo<'ret, 'a: 'ret, 'b: 'ret>(x: &'a i32, y: &'b i32) -> impl Future + 'ret { + *x + *y +} +``` + +This would be very unergonomic and make async both much less pleasant to use +and much less easy to learn. This issue weighs heavily in the decision to +prefer returning the interior type. + +We could have it return `impl Future` but have lifetime capture work +differently for the return type of `async fn` than other functions; this seems +worse than showing the interior type. + +### Polymorphic return (a non-factor for us) + +According to the C# developers, one of the major factors in returning `Task` +(their "outer type") was that they wanted to have async functions which could +return types other than `Task`. We do not have a compelling use case for this: + +1. In the 0.2 branch of futures, there is a distinction between `Future` and + `StableFuture`. However, this distinction is artificial and only because + object-safe custom self-types are not available on stable yet. +2. The current `#[async]` macro has a `(boxed)` variant. We would prefer to + have async functions always be unboxed and only box them explicitly at the + call site. The motivation for the attribute variant was to support async + methods in object-safe traits. This is a special case of supporting `impl + Trait` in object-safe traits (probably by boxing the return type in the + object case), a feature we want separately from async fn. +3. It has been proposed that we support `async fn` which return streams. + However, this would mean that the semantics of the internal function would + differ significantly between those which return futures and streams. As + discussed in the unresolved questions section, a solution based on generators + and async generators seems more promising. 
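Point 2 above, keeping async functions unboxed and boxing only at the call site, can be sketched as follows. This assumes the `async fn` syntax and `Future::Output` as found in current stable Rust; `double`, `square`, `BoxFuture`, `boxed`, `run_all`, `NoopWaker` and `block_on` are hypothetical names introduced for the illustration.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Two async functions with distinct anonymous (unboxed) return types.
async fn double(x: u32) -> u32 {
    x * 2
}

async fn square(x: u32) -> u32 {
    x * x
}

/// Hypothetical alias for a boxed, type-erased future.
type BoxFuture<T> = Pin<Box<dyn Future<Output = T>>>;

/// Boxing happens here, at the call site, not in the function definitions.
fn boxed<F>(future: F) -> BoxFuture<F::Output>
where
    F: Future + 'static,
{
    Box::pin(future)
}

/// Waker that does nothing; enough for a busy-polling executor.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, for demonstration only.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Runs each boxed future to completion, one after another.
fn run_all(jobs: Vec<BoxFuture<u32>>) -> Vec<u32> {
    jobs.into_iter().map(|job| block_on(job)).collect()
}

fn main() {
    // The two futures have different types, so they are boxed to share a Vec.
    let jobs = vec![boxed(double(3)), boxed(square(3))];
    println!("{:?}", run_all(jobs));
}
```

Callers that never need type erasure pay no allocation cost, since `double` and `square` themselves stay unboxed.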
+ +For these reasons, we don't think there's a strong argument from polymorphism +to return the outer type. + +### Learnability / documentation trade off + +There are arguments from learnability in favor of both the outer and inner +return type. One of the most compelling arguments in favor of the outer return +type is documentation: when you read automatically generated API docs, you will +definitely see what you get as the caller. In contrast, it can be easier to +understand how to write an async function using the inner return type, because +of the correspondence between the return type and the type of the expressions +you `return`. + +Rustdoc can handle async functions using the inner return type in a couple of +ways to make them easier to understand. At minimum we should make sure to +include the `async` annotation in the documentation, so that users who +understand async notation know that the function will return a future. We can +also perform other transformations, possibly optionally, to display the outer +signature of the function. Exactly how to handle API documentation for async +functions is left as an unresolved question. + +## Built-in syntax instead of using macros in generators + +Another alternative is to focus on stabilizing procedural macros and +generators, rather than introducing built-in syntax for async functions. An +async function can be modeled as a generator which yields `()`. + +In the long run, we believe we will want dedicated syntax for async functions, +because it is more ergonomic & the use case is compelling and significant +enough to justify it (similar to - for example - having built-in for loops and +if statements rather than having macros which compile to loops and match +statements). Given that, the only question is whether we could reach +stabilization sooner by using generators for the time being than by +introducing async functions now. 
+ +It seems unlikely that using macros which expand to generators will result in a +faster stabilization. Generators can express a wider range of possibilities, +and have a wider range of open questions - both syntactic and semantic. This +does not even address the open questions of stabilizing more procedural macros. +For this reason, we believe it is more expedient to stabilize the minimal +built-in async/await functionality than to attempt to stabilize generators and +proc macros. + +## `async` based on generators alone + +Another alternative design would be to have async functions *be* the syntax for +creating generators. In this design, we would write a generator like this: + +```rust +async fn foo(arg: Arg) -> Return yield Yield +``` + +Both return and yield would be optional, defaulting to `()`. An async fn that +yields `()` would implement `Future`, using a blanket impl. An async fn that +returns `()` would implement `Iterator`. + +The problem with this approach is that it does not ergonomically handle +`Stream`s, which need to yield `Poll<Option<T>>`. It's unclear how `await` +inside of an async fn yielding something other than `()` (which would include +streams) would work. For this reason, the "matrix" approach in which we have +independent syntax for generator functions, async functions, and async +generator functions, seems like a more promising approach. + +## "Hot async functions" + +As proposed by this RFC, all async functions return immediately, without +evaluating their bodies at all. As discussed above, this is not convenient for +use cases in which you have an immediate "initialization" step - those use +cases need to use a terminal async block, for example. + +An alternative would be to have async functions immediately evaluate up until +their first `await`, preserving their state until then. 
The implementation of +this would be quite complicated - they would need to have an additional yield +point within the `await`, prior to polling the future being awaited, +conditional on whether or not the await is the first await in the body of the +future. + +A fundamental difference between Rust's futures and those from other languages +is that Rust's futures do not do anything unless polled. The whole system is +built around this: for example, cancellation is dropping the future for +precisely this reason. In contrast, in other languages, calling an async fn +spins up a future that starts executing immediately. This difference carries +over to `async fn` and `async` blocks as well, where it's vital that the +resulting future be *actively polled* to make progress. Allowing for partial, +eager execution is likely to lead to significant confusion and bugs. + +This is also complicated from a user perspective - when a portion of the body +is evaluated depends on whether or not it appears before all `await` +statements (which could possibly be macro generated). The use of a terminal +async block provides a clearer mechanism for distinguishing between the +immediately evaluated and asynchronously evaluated portions of a future with an +initialization step. + +## Using async/await instead of alternative asynchronicity systems + +A final - and extreme - alternative would be to abandon futures and async/await +as the mechanism for asynchronicity in Rust and to adopt a different paradigm. +Among those suggested are a generalized effects system, monads & do notation, +green-threading, and stack-full coroutines. + +While it is hypothetically plausible that some generalization beyond +async/await could be supported by Rust, there has not been enough research in +this area to support it in the near-term. 
Given our goals for 2018 - which emphasize +shipping - async/await syntax (a concept available widely in many languages +which interacts well with our existing async IO libraries) is the most logical +thing to implement at this stage in Rust's evolution. + +## Async blocks vs async closures + +As noted in the main text, `async` blocks and `async` closures are closely +related, and are roughly inter-expressible: + +```rust +// almost equivalent +async { ... } +(async || { ... })() + +// almost equivalent +async |..| { ... } +|..| async { ... } +``` + +We could consider having only one of the two constructs. However: + +- There's a strong reason to have `async ||` for consistency with `async fn`; + such closures are often useful for higher-order constructs like constructing a + service. + +- There's a strong reason to have `async` blocks: The initialization pattern + mentioned in the RFC text, and the fact that it provides a more + direct/primitive way of constructing futures. + +The RFC proposes to include both constructs up front, since it seems inevitable +that we will want both of them, but we can always reconsider this question +before stabilization. + +# Prior art +[prior-art]: #prior-art + +There is a lot of precedent from other languages for async/await syntax as a +way of handling asynchronous operations - notable examples include C#, +JavaScript, and Python. + +There are three paradigms for asynchronous programming which are dominant +today: + +- Async and await notation. +- An implicit concurrent runtime, often called "green-threading," such as + communicating sequential processes (e.g. Go) or an actor model (e.g. Erlang). +- Monadic transformations on lazily evaluated code, such as do notation (e.g. + Haskell). + +Async/await is the most compelling model for Rust because it interacts +favorably with ownership and borrowing (unlike systems based on monads) and it +enables us to have an entirely library-based asynchronicity model (unlike +green-threading). 
+ +One way in which our handling of async/await differs from most other statically +typed languages (such as C#) is that we have chosen to show the "inner" return +type, rather than the outer return type. As discussed in the alternatives +section, Rust's specific context (lifetime elision, the lack of a need for +return type polymorphism here) makes this deviation well-motivated. + +# Unresolved questions +[unresolved]: #unresolved-questions + +This section contains design extensions which have been postponed & not +included in this initial RFC. + +## Final syntax for the `await` expression + +Though this RFC proposes that `await` be a built-in macro, we'd prefer that +some day it be a normal control flow construct. The unresolved question about +this is how to handle its precedence & whether or not to require delimiters of +some kind. + +In particular, `await` has an interesting interaction with `?`. It is very +common to have a future which will evaluate to a `Result`, which the user will +then want to apply `?` to. This implies that await should have a tighter +precedence than `?`, so that the pattern will work how users wish it to. +However, because it introduces a space, it doesn't look like this is the +precedence you would get: + +``` +await future? +``` + +There are a couple of possible solutions: + +1. Require delimiters of some kind, maybe braces or parens or either, so that + it will look more like how you expect - `await { future }?` - this is rather + noisy. +2. Define the precedence as the obvious, if inconvenient precedence, requiring + users to write `(await future)?` - this seems very surprising for users. +3. Define the precedence as the convenient, if less obvious, precedence, so + that `await future?` means `(await future)?` - this seems as surprising as + the other precedence. +4. Introduce a special syntax to handle the multiple applications, such as + `await? future` - this seems very unusual in its own way. 
+ +This is left as an unresolved question to find another solution or decide which +of these is least bad. + +## `for await` and processing streams + +Another extension left out of the RFC for now is the ability to process streams +using a for loop. One could imagine a construct like `for await`, which takes +an `IntoStream` instead of an `IntoIterator`: + +```rust +for await value in stream { + println!("{}", value); +} +``` + +This is left out of the initial RFC to avoid having to stabilize a definition +of `Stream` in the standard library (to keep the companion RFC to this one as +small as possible). + +## Generators and Streams + +In the future, we may also want to be able to define async functions that +evaluate to streams, rather than evaluating to futures. We propose to handle +this use case by way of generators. Generators can evaluate to a kind of +iterator, while async generators can evaluate to a kind of stream. + +For example (using syntax which could change): + +```rust +// Returns an iterator of i32 +fn foo(mut x: i32) yield i32 { + while x > 0 { + yield x; + x -= 2; + } +} + +// Returns a stream of i32 +async fn foo(io: &AsyncRead) yield i32 { + async for line in io.lines() { + yield line.unwrap().parse().unwrap(); + } +} +``` + +## Async functions which implement `Unpin` + +As proposed in this RFC, no async function implements `Unpin`, making +it unsafe to move them out of a `Pin`. This allows them to contain references +across yield points. + +We could also, with an annotation, typecheck an async function to confirm that it +does not contain any references across yield points, allowing it to implement +`Unpin`. The annotation to enable this is left unspecified for the time being. + +## `?`-operator and control-flow constructs in async blocks + +This RFC does not propose how the `?`-operator and control-flow constructs like +`return`, `break` and `continue` should work inside async blocks. 
+ +It was discussed that async blocks should act as a boundary for the +`?`-operator. This would make them suitable for fallible IO: + +```rust +let reader: AsyncRead = ...; +async { + let foo = await!(reader.read_to_end())?; + Ok(foo.parse().unwrap_or(0)) +}: impl Future<Output = io::Result<u32>> +``` + +It was also discussed whether to allow the use of `break` to return early from +an async block: + +```rust +async { + if true { break "foo" } +} +``` + +The use of the `break` keyword instead of `return` could be beneficial to +indicate that it applies to the async block and not its surrounding function. On +the other hand this would introduce a difference to closures and async closures +which make use of the `return` keyword.
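The `?`-boundary idea discussed above can be sketched as follows. It assumes the behaviour found in current stable Rust, where `?` inside an `async` block ends the block and becomes its output; this RFC itself leaves that question open. `sum_or_zero`, `NoopWaker` and `block_on` are hypothetical names, and the explicit `Ok::<u32, ParseIntError>` annotation is there only to pin down the block's error type.

```rust
use std::future::Future;
use std::num::ParseIntError;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for a busy-polling executor.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, for demonstration only.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Note the return type: plain `u32`, not `Result`. The `?` operators below
/// therefore cannot be returning from this function.
fn sum_or_zero(a: &str, b: &str) -> u32 {
    let attempt = async {
        // A parse failure ends the async block early with `Err(..)`.
        let a: u32 = a.parse()?;
        let b: u32 = b.parse()?;
        Ok::<u32, ParseIntError>(a + b)
    };
    // The error surfaces as the future's output and is handled here.
    block_on(attempt).unwrap_or(0)
}

fn main() {
    println!("{}", sum_or_zero("40", "2"));
    println!("{}", sum_or_zero("40", "not a number"));
}
```

Because the enclosing function returns `u32`, the example would not compile if `?` propagated to the function rather than to the block.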