From b9ecb4e045b5a9f8b63b742fee3c0003bca599ca Mon Sep 17 00:00:00 2001 From: Scott McMurray Date: Mon, 24 Apr 2017 18:30:07 -0700 Subject: [PATCH 1/4] Amend 1192 (RangeInclusive) to just be a two-field struct Rationale: 1. Having the variants make trying to use a RangeInclusive unergonomic. For example, `fn clamp(self, range: RangeInclusive)` is forced to deal with the `Empty` case. 2. The variants don't actually provide any offsetting safety or additional performance, since everything is pub so it's totally possible for a "`NonEmpty`" range to contain no elements. 3. This makes it more consistent with (half-open) `Range`. 4. The extra flag/variant is not actually needed in order to make it iterable, even using the existing Step trait. And supposing a more cut-down trait, generating `a, b` such that `!(a <= b)` is easier than other fundamental requirements of safe range iterability. --- text/1192-inclusive-ranges.md | 77 ++++++++++++++++++++++------------- 1 file changed, 49 insertions(+), 28 deletions(-) diff --git a/text/1192-inclusive-ranges.md b/text/1192-inclusive-ranges.md index a6f619d47a1..6b68ab87a92 100644 --- a/text/1192-inclusive-ranges.md +++ b/text/1192-inclusive-ranges.md @@ -27,13 +27,8 @@ more dots means more elements. ```rust pub enum RangeInclusive { - Empty { - at: T, - }, - NonEmpty { - start: T, - end: T, - } + pub start: T, + pub end: T, } pub struct RangeToInclusive { @@ -41,7 +36,8 @@ pub struct RangeToInclusive { } ``` -Writing `a...b` in an expression desugars to `std::ops::RangeInclusive::NonEmpty { start: a, end: b }`. Writing `...b` in an +Writing `a...b` in an expression desugars to +`std::ops::RangeInclusive { start: a, end: b }`. Writing `...b` in an expression desugars to `std::ops::RangeToInclusive { end: b }`. `RangeInclusive` implements the standard traits (`Clone`, `Debug` @@ -57,6 +53,10 @@ now would be `1. ..` i.e. a floating point number on the left, however, fortunately, it is actually tokenised like `1 ...`, and is hence an error with the current compiler. +This `struct` definition is maximally consistent with the existing `Range`. +`a..b` and `a...b` are the same size and have the same fields, just with +the expected difference in semantics. + # Drawbacks There's a mismatch between pattern-`...` and expression-`...`, in that @@ -66,10 +66,32 @@ semantically.) The `...` vs. `..` distinction is the exact inversion of Ruby's syntax. -Having an extra field in a language-level desugaring, catering to one -library use-case is a little non-"hygienic". It is especially strange -that the field isn't consistent across the different `...` -desugarings. +Not having a separate marker for `finished` or `empty` implies a requirement +on `T` that it's possible to provide values such that `b...a` is an empty +range. But a separate marker is a false invariant: whether a `finished` +field on the struct or a `Empty` variant of an enum, the range `10...0` still +desugars to a `RangeInclusive` with `finised: false` or of the `NonEmpty` +variant. And the fields are public, so even fixing the desugar cannot +guarantee the invariant. As a result, all code using a `RangeInclusive` +must still check whether a "`NonEmpty`" or "un`finished`" is actually finished. +The "can produce an empty range" requirement is not a hardship. It's trivial +for anything that can be stepped forward and backward, as all things which are +iterable in `std` are today. But ther are other possibilities as well. The +proof-of-concept implementation for this change is done using the `replace_one` +and `replace_zero` methods of the (existing but unstable) `Step` trait, as +`1...0` is of course an empty range. Something weirder, like walking along a +DAG, could use the fact that `PartialOrd` is sufficient, and produce a range +similar in character to `NaN...NaN`, which is empty as `(NaN <= NaN) == false`. +The exact details of what is required to make a range iterable is outside the +scope of this RFC, and will be decided in the [`step_by` issue][step_by]. + +Note that iterable ranges today have a much stronger bound than just +steppability: they require a `b is-reachable-from a` predicate (as `a <= b`). +Providing that efficiently for a DAG walk, or even a simpler forward list +walk, is a substantially harder thing to do that providing a pair `(x, y)` +such that `!(x <= y)`. + +[step_by]: https://github.com/rust-lang/rust/issues/27741 # Alternatives @@ -83,28 +105,25 @@ This RFC proposes single-ended syntax with only an end, `...b`, but not with only a start (`a...`) or unconstrained `...`. This balance could be reevaluated for usefulness and conflicts with other proposed syntax. -The `Empty` variant could be omitted, leaving two options: - - `RangeInclusive` could be a struct including a `finished` field. + This makes it easier for the struct to always be iterable, as the extra + field is set once the ends match. But having the extra field in a + language-level desugaring, catering to one library use-case is a little + non-"hygienic". It is especially strange that the field isn't consistent + across the different `...` desugarings. +- `RangeInclusive` could be an enum with `Empty` and `NonEmpty` variants. + This is cleaner than the `finished` field, but makes all uses of the + type substantially more complex. For example, the clamp RFC would + naturally use a `RangeInclusive` parameter, but then the + unreliable-`Empty` vs `NonEmpty` distinction provides no value. It does + prevent looking at `start` after iteration has completed, but that is + of questionable value when `Range` allows it without issue, and disallowing + looking at `start` while allowing looking at `end` feels inconsistent. - `a...b` only implements `IntoIterator`, not `Iterator`, by converting to a different type that does have the field. However, this means that `a.. .b` behaves differently to `a..b`, so `(a...b).map(|x| ...)` doesn't work (the `..` version of that is used reasonably often, in the author's experience) -- `a...b` can implement `Iterator` for types that can be stepped - backwards: the only case that is problematic things cases like - `x...255u8` where the endpoint is the last value in the type's - range. A naive implementation that just steps `x` and compares - against the second value will never terminate: it will yield 254 - (final state: `255...255`), 255 (final state: `0...255`), 0 (final - state: `1...255`). I.e. it will wrap around because it has no way to - detect whether 255 has been yielded or not. However, implementations - of `Iterator` can detect cases like that, and, after yielding `255`, - backwards-step the second piece of state to `255...254`. - - This means that `a...b` can only implement `Iterator` for types that - can be stepped backwards, which isn't always guaranteed, e.g. types - might not have a unique predecessor (walking along a DAG). # Unresolved questions @@ -114,3 +133,5 @@ None so far. * In rust-lang/rfcs#1320, this RFC was amended to change the `RangeInclusive` type from a struct with a `finished` field to an enum. +* In rust-lang/rfcs#1980, this RFC was amended to change the `RangeInclusive` + type from an enum to a struct with just `start` and `end` fields. From 67cc2cb401f73c2721e564ad20e303dcc1abad0d Mon Sep 17 00:00:00 2001 From: Scott McMurray Date: Thu, 27 Apr 2017 00:48:04 -0700 Subject: [PATCH 2/4] Address comments: move more details to design, address post-iteration `end` --- text/1192-inclusive-ranges.md | 75 +++++++++++++++++++---------------- 1 file changed, 41 insertions(+), 34 deletions(-) diff --git a/text/1192-inclusive-ranges.md b/text/1192-inclusive-ranges.md index 6b68ab87a92..cc8de0460f0 100644 --- a/text/1192-inclusive-ranges.md +++ b/text/1192-inclusive-ranges.md @@ -57,6 +57,26 @@ This `struct` definition is maximally consistent with the existing `Range`. `a..b` and `a...b` are the same size and have the same fields, just with the expected difference in semantics. +The range `a...b` contains all `x` where `a <= x && x <= b`. As such, an +inclusive range is non-empty _iff_ `a <= b`. When the range is iterable, +a non-empty range will produce at least one item when iterated. Because +`T::MAX...T::MAX` is a non-empty range, the iteration needs extra handling +compared to a half-open `Range`. As such, `.next()` on an empty range +`y...y` will produce the value `y` and replace the range with the canonical +empty range for the type. Using methods on the the existing (but unstable) +[`Step` trait][step_trait], that's `1...0` for all currently-iterable +value types. Providing such a range is not a burden on the `T` type as +any such range is acceptable, and only `PartialOrd` is required so +it can be satisfied with an incomparable value `n` with `!(n <= n)`. + +Note that because ranges are not required to be well-formed, they have a +much stronger bound than just needing successor function: they require a +`b is-reachable-from a` predicate (as `a <= b`). Providing that efficiently +for a DAG walk, or even a simpler forward list walk, is a substantially +harder thing to do than providing a pair `(x, y)` such that `!(x <= y)`. + +[step_trait]: https://github.com/rust-lang/rust/issues/27741 + # Drawbacks There's a mismatch between pattern-`...` and expression-`...`, in that @@ -66,32 +86,9 @@ semantically.) The `...` vs. `..` distinction is the exact inversion of Ruby's syntax. -Not having a separate marker for `finished` or `empty` implies a requirement -on `T` that it's possible to provide values such that `b...a` is an empty -range. But a separate marker is a false invariant: whether a `finished` -field on the struct or a `Empty` variant of an enum, the range `10...0` still -desugars to a `RangeInclusive` with `finised: false` or of the `NonEmpty` -variant. And the fields are public, so even fixing the desugar cannot -guarantee the invariant. As a result, all code using a `RangeInclusive` -must still check whether a "`NonEmpty`" or "un`finished`" is actually finished. -The "can produce an empty range" requirement is not a hardship. It's trivial -for anything that can be stepped forward and backward, as all things which are -iterable in `std` are today. But ther are other possibilities as well. The -proof-of-concept implementation for this change is done using the `replace_one` -and `replace_zero` methods of the (existing but unstable) `Step` trait, as -`1...0` is of course an empty range. Something weirder, like walking along a -DAG, could use the fact that `PartialOrd` is sufficient, and produce a range -similar in character to `NaN...NaN`, which is empty as `(NaN <= NaN) == false`. -The exact details of what is required to make a range iterable is outside the -scope of this RFC, and will be decided in the [`step_by` issue][step_by]. - -Note that iterable ranges today have a much stronger bound than just -steppability: they require a `b is-reachable-from a` predicate (as `a <= b`). -Providing that efficiently for a DAG walk, or even a simpler forward list -walk, is a substantially harder thing to do that providing a pair `(x, y)` -such that `!(x <= y)`. - -[step_by]: https://github.com/rust-lang/rust/issues/27741 +This proposal makes the post-iteration values of the `start` and `end` fields +constant, and thus useless. Some of the alternatives would expose the +last value returned from the iteration, through a more complex interface. # Alternatives @@ -110,20 +107,30 @@ reevaluated for usefulness and conflicts with other proposed syntax. field is set once the ends match. But having the extra field in a language-level desugaring, catering to one library use-case is a little non-"hygienic". It is especially strange that the field isn't consistent - across the different `...` desugarings. + across the different `...` desugarings. And the presence of the public + field encourages checkinging it, which can be misleading as + `r.finished == false` does not guarantee that `r.count() > 0`. - `RangeInclusive` could be an enum with `Empty` and `NonEmpty` variants. - This is cleaner than the `finished` field, but makes all uses of the - type substantially more complex. For example, the clamp RFC would - naturally use a `RangeInclusive` parameter, but then the - unreliable-`Empty` vs `NonEmpty` distinction provides no value. It does - prevent looking at `start` after iteration has completed, but that is - of questionable value when `Range` allows it without issue, and disallowing - looking at `start` while allowing looking at `end` feels inconsistent. + This is cleaner than the `finished` field, but still has the problem that + there's no invariant maintained: while an `Empty` range is definitely empty, + a `NonEmpty` range might actually be empty. And requiring matching on every + use of the type is less ergonomic. For example, the clamp RFC would + naturally use a `RangeInclusive` parameter, but because it still needs + to `assert!(start <= end)` in the `NonEmpty` arm, the noise of the `Empty` + vs `NonEmpty` match provides it no value. - `a...b` only implements `IntoIterator`, not `Iterator`, by converting to a different type that does have the field. However, this means that `a.. .b` behaves differently to `a..b`, so `(a...b).map(|x| ...)` doesn't work (the `..` version of that is used reasonably often, in the author's experience) +- Different choices for the end of iteration are also possible. While neither + `start` nor `end` can always have the last value of the iteration while + still producing an empty range (consider what happens with `MIN...MIN` + and `MAX...MAX`), they could be closer. For example, `a...a` could become + `(a+1)...a` where possible, and `a...(a-1)` otherwise. +- The name of the `end` field could be different, perhaps `last`, to reflect + its different (inclusive) semantics from the `end` (exclusive) field on + the other ranges. # Unresolved questions From 0fdca1398a6746bce59849d60a0e056c9bc054bf Mon Sep 17 00:00:00 2001 From: Scott McMurray Date: Fri, 5 May 2017 22:02:38 -0700 Subject: [PATCH 3/4] Remove a lingering mention of `Empty` Thanks @liigo --- text/1192-inclusive-ranges.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/text/1192-inclusive-ranges.md b/text/1192-inclusive-ranges.md index cc8de0460f0..c7863b1bb02 100644 --- a/text/1192-inclusive-ranges.md +++ b/text/1192-inclusive-ranges.md @@ -41,8 +41,7 @@ Writing `a...b` in an expression desugars to expression desugars to `std::ops::RangeToInclusive { end: b }`. `RangeInclusive` implements the standard traits (`Clone`, `Debug` -etc.), and implements `Iterator`. The `Empty` variant is to allow the -`Iterator` implementation to work without hacks (see Alternatives). +etc.), and implements `Iterator`. The use of `...` in a pattern remains as testing for inclusion within that range, *not* a struct match. From d6ea75b194a4fca971c0f34283fb4fe220787397 Mon Sep 17 00:00:00 2001 From: Scott McMurray Date: Mon, 15 May 2017 23:49:02 -0700 Subject: [PATCH 4/4] Unspecify the post-iteration value of RangeInclusive --- text/1192-inclusive-ranges.md | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/text/1192-inclusive-ranges.md b/text/1192-inclusive-ranges.md index c7863b1bb02..b75e0e7f482 100644 --- a/text/1192-inclusive-ranges.md +++ b/text/1192-inclusive-ranges.md @@ -61,12 +61,13 @@ inclusive range is non-empty _iff_ `a <= b`. When the range is iterable, a non-empty range will produce at least one item when iterated. Because `T::MAX...T::MAX` is a non-empty range, the iteration needs extra handling compared to a half-open `Range`. As such, `.next()` on an empty range -`y...y` will produce the value `y` and replace the range with the canonical -empty range for the type. Using methods on the the existing (but unstable) -[`Step` trait][step_trait], that's `1...0` for all currently-iterable -value types. Providing such a range is not a burden on the `T` type as +`y...y` will produce the value `y` and adjust the range such that +`!(start <= end)`. Providing such a range is not a burden on the `T` type as any such range is acceptable, and only `PartialOrd` is required so it can be satisfied with an incomparable value `n` with `!(n <= n)`. +A caller must not, in general, expect any particular `start` or `end` +after iterating, and is encouraged to detect empty ranges with +`ExactSizeIterator::is_empty` instead of by observing fields directly. Note that because ranges are not required to be well-formed, they have a much stronger bound than just needing successor function: they require a @@ -74,6 +75,17 @@ much stronger bound than just needing successor function: they require a for a DAG walk, or even a simpler forward list walk, is a substantially harder thing to do than providing a pair `(x, y)` such that `!(x <= y)`. +Implementation note: For currently-iterable types, the initial implementation +of this will have the range become `1...0` after yielding the final value, +as that can be done using the `replace_one` and `replace_zero` methods on +the existing (but unstable) [`Step` trait][step_trait]. It's expected, +however, that the trait will change to allow more type-appropriate `impl`s. +For example, a `num::BitInt` may rather become empty by incrementing `start`, +as `Range` does, since it doesn't to need to worry about overflow. Even for +primitives, it could be advantageous to choose a different implementation, +perhaps using `.overflowing_add(1)` and swapping on overflow, or `a...a` +could become `(a+1)...a` where possible and `a...(a-1)` otherwise. + [step_trait]: https://github.com/rust-lang/rust/issues/27741 # Drawbacks @@ -122,11 +134,6 @@ reevaluated for usefulness and conflicts with other proposed syntax. this means that `a.. .b` behaves differently to `a..b`, so `(a...b).map(|x| ...)` doesn't work (the `..` version of that is used reasonably often, in the author's experience) -- Different choices for the end of iteration are also possible. While neither - `start` nor `end` can always have the last value of the iteration while - still producing an empty range (consider what happens with `MIN...MIN` - and `MAX...MAX`), they could be closer. For example, `a...a` could become - `(a+1)...a` where possible, and `a...(a-1)` otherwise. - The name of the `end` field could be different, perhaps `last`, to reflect its different (inclusive) semantics from the `end` (exclusive) field on the other ranges.