Const/static type annotation elision #2010

Closed
wants to merge 3 commits into from

schuster

A proposal to allow elision of type annotations in many cases for const and static items, as part of the ergonomics initiative

Rendered

```rust
const THE_ANSWER = 42; // nothing in RHS indicates this must be i16

fn get_big_number() -> i16 {
```
Member

Nitpick: -> i64.

Author

Thanks, fixed.

back in later.

Fallback is acceptable, however, if the overall type is still unique even
without the fallback rules, as in this example:
Member

Is it easy to enforce such a rule? For instance:

const fn foo<T, U>(_: T, u: U) -> U { u }
const A = foo(1, 2u64);  // should be ok?
const B = foo(1u64, 2);  // should be error?

Author

Your assumptions are correct: A should type-check, but B should not.

According to my understanding of Rust's type inference mechanisms, this is always enforceable. Either the unification algorithm comes up with a unique type, in which case it's okay, or it doesn't, in which case the expression results in a type error. My understanding is limited, though, so I'd be happy to learn about any counterexamples.
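
For comparison, here is a minimal sketch of the same distinction using `let` bindings in today's Rust, where inference already runs. It assumes a plain (non-const) version of `foo`, and `type_name_of` is a hypothetical helper introduced only for this illustration. Both bindings compile today because integer fallback is allowed for `let`; under the rule discussed here, only the first would be accepted for a const, since its overall type is unique without fallback.

```rust
use std::any::type_name;

// Plain (non-const) version of the `foo` from the question above.
fn foo<T, U>(_: T, u: U) -> U {
    u
}

// Hypothetical helper: reports the type the compiler inferred for a value.
fn type_name_of<T>(_: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    // The result type is pinned to u64 by the suffix; only the discarded
    // first argument relies on integer fallback.
    let a = foo(1, 2u64);
    // Here the result type itself comes from integer fallback.
    let b = foo(1u64, 2);
    println!("a: {}, b: {}", type_name_of(&a), type_name_of(&b));
}
```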

@est31
Member

est31 commented May 29, 2017

👎 , for several reasons:

  1. It would introduce an inconsistency with the other items living at the same level as consts and statics: fns. The types of functions can't be elided.
  2. It would make the language harder to learn, as you now need to know when you can omit the type and when you can't. The current rule is strict and simple. Yet another scary area for beginners to not know about!
  3. Like proposals that want to introduce the elision of function signature types via their bodies, it's bad for the reader to be forced to go through another layer of implementation to check the type. Especially as the reader may have to follow several layers of function definitions, e.g. you have const FOO = (Foo { i: 2}).bar().baz().boo(), where each of bar, baz and boo is a complicated function returning generic types that themselves depend on trait impls.
  4. The RFC author claims that IDEs can help with the aforementioned issue, but IDEs can also help to put the type annotation into place automatically for those who feel it's too much of a burden (see more on that below). You shouldn't be forced to use an IDE when reading code! I regularly read code on GitHub, for example, and don't want to clone something locally just so that I can find out which type a const has. That's not more ergonomics, that's less of it.
  5. The ergonomics of consts are just fine. Usually you have to make one const or two per crate, and then it's bearable to type out the type annotation. Sometimes you also need to define a bunch of consts in a row, but then their type is usually the same and you can just write a macro to define them. You'll arrive at a much smaller character count with a well engineered macro than with the proposal, as you won't have to type out "const" each time!
  6. There is nothing gained in raw character count by writing const FOOBAR = 43i32; instead of const FOOBAR: i32 = 43; (I'm not proposing that for integers/floats the default type is chosen, in fact I dislike this RFC less in the current form). The only place where you spare characters is when you type const FOO = (Foo { i: 2}).bar().baz().boo(), as that may return an arbitrarily complicated type. But these cases are precisely those cases where points 3 and 4 are extremely relevant. Limiting the functionality of this RFC to literals only solves issues 3 and 4, but (except for &str and ()) not this issue.
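
The macro approach mentioned in point 5 could look roughly like the sketch below. The macro name `consts` and the constant names are hypothetical, chosen only for illustration; the type is written once and the `const` keyword is generated for each entry.

```rust
// Hypothetical macro: declares several consts that share one type.
macro_rules! consts {
    ($ty:ty; $($name:ident = $value:expr),* $(,)?) => {
        $(pub const $name: $ty = $value;)*
    };
}

// The type appears once; each entry expands to `pub const NAME: u32 = value;`.
consts!(u32; WIDTH = 640, HEIGHT = 480, DEPTH = 24);

fn main() {
    println!("{}", WIDTH * HEIGHT * DEPTH);
}
```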

@strega-nil

@est31 as you laid this out, I see something that I think might be interesting: impl Trait for consts.

@glaebhoerl
Contributor

glaebhoerl commented May 30, 2017

A more conservative alternative would be to allow the type to be elided only when the RHS consists entirely of literals. So for example you could write const FOO = Foo { x: true }; instead of const FOO: Foo<bool> = Foo { x: true };, because in that case the type is completely obvious "from the definition itself", and in a sense redundant. The difference relative to the RFC would be that if the definition of the const refers to other consts (resp. statics) or const fns, then a type annotation would still be required. This would more-or-less address the objection of having to track down various definitions manually to be able to reconstruct in your head what type the compiler will infer, and also the one where changing an item in the API could otherwise have "non-local effects" where it would result in the types of other items also silently changing.

(This would presumably be true only for monomorphic literals, so for example 5 and None would still require an explicit type. An interesting question is whether e.g. given struct Bar(Foo);, you would be allowed to write const BAR = Bar(FOO);, that is whether references to other consts would be allowed if it can't affect the type of the result, but probably still no for simplicity and consistency's sake.)
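
As a rough illustration of the monomorphic/polymorphic distinction, the sketch below uses `let` bindings, which already infer types today. The generic `Foo<T>` definition and the `type_name_of` helper are assumptions made for this example, not part of the proposal.

```rust
use std::any::type_name;

// Assumed definition of the generic struct from the comment above.
struct Foo<T> {
    x: T,
}

// Hypothetical helper: reports the type the compiler inferred for a value.
fn type_name_of<T>(_: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    // `true` is a monomorphic literal, so the type Foo<bool> is fully
    // determined by the expression itself.
    let foo = Foo { x: true };
    // `None` alone does not determine its type parameter, so an annotation
    // is still needed here.
    let nothing: Option<u8> = None;
    println!("{} {} {:?}", type_name_of(&foo), foo.x, nothing);
}
```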

@schuster
Author

@est31 Regarding functions, this proposal is more like inferring the return type of a function, but not its argument types (perhaps that's what you meant). However, constants are most often very simple things where the type is obvious, while functions tend to be much harder to mentally type-check at a glance.

Generally, I expect most constants/statics will be simple expressions with an obvious type, and even the use of constant functions will likely be more for things like AtomicUsize::new(0) than something like (Foo { i: 2}).bar().baz().boo(). In those cases where the expression is more complicated and where documenting the code is important, developers should be encouraged to leave the type annotation in.

I agree that the complexity of when an annotation can be elided or not is problematic, and this is probably the biggest downside of the proposal. Ideally we eventually find that there's little harm in allowing the fallback rules to apply here and change the rule to always allow type elision. This proposal would be a conservative step towards that, but admittedly there's no guarantee it happens. In the meantime, I think good error messages could ease this pain for beginners.

@schuster
Author

@glaebhoerl Leaving aside const functions for a moment, would you say that even using other constants in the definition of a constant makes its type hard to determine in real programs? I would think in most cases where this happens the referenced constant's type would be obvious (by context and/or by the constant's name and/or because it's a well-known constant), but my experience is limited.

@eddyb
Member

eddyb commented May 31, 2017

FWIW, we have the compiler infrastructure to determine the type of a global by type-checking its body and you get automatic cycle detection, enforcing a DAG between such global items.
We'll need to do something like that if we want to add typeof too.
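
A small sketch of the dependency behaviour that is observable today, using hypothetical constant names. It shows the value-level check on const initializers, which is only an analogy for the type-level pass described above: items can refer to each other in any order, but a cycle is rejected at compile time.

```rust
// Items may refer to each other regardless of declaration order.
const TOTAL: u32 = BASE + 1;
const BASE: u32 = 41;

// A cycle between items is rejected at compile time; uncommenting these
// two lines makes the program fail to build.
// const LEFT: u32 = RIGHT;
// const RIGHT: u32 = LEFT;

fn main() {
    println!("{}", TOTAL);
}
```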

@crumblingstatue

crumblingstatue commented May 31, 2017

In my opinion, this should be consistent with the rules of global items always requiring annotations, but allowing inference for function-local items.

Under this rule, we could allow inference for consts that are local to a function. This would also help with the ergonomics of using const for forcing compile-time evaluation when calling a const fn.

@eddyb
Member

eddyb commented May 31, 2017

@crumblingstatue Items, placed in a function, are still quite global, e.g. you can use them in an impl of a public trait for a public type, that is placed within the same function.

@crumblingstatue

@eddyb Oh, never mind then.

Just curious: Is there any legitimate use case for impling an outside item inside of a function?

@eddyb
Member

eddyb commented May 31, 2017

It's not explicitly supported - it just results from the fact that everything that can go directly in a module also can be inside any block expression - yes, this includes extern crate and nested mod.
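
To make this concrete, here is a minimal sketch with hypothetical names (`Describe`, `Widget`, `setup`, `LABEL`). The const is only nameable inside the function, yet the impl written next to it applies to the whole crate, so the const still leaks into globally visible behaviour.

```rust
pub trait Describe {
    fn describe(&self) -> &'static str;
}

pub struct Widget;

fn setup() {
    // This const can only be named inside `setup`...
    const LABEL: &str = "widget";

    // ...but this impl is visible everywhere, and it uses the const.
    impl Describe for Widget {
        fn describe(&self) -> &'static str {
            LABEL
        }
    }
}

fn main() {
    setup();
    // The impl applies here even though it was written inside `setup`.
    println!("{}", Widget.describe());
}
```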

@glaebhoerl
Contributor

@schuster

Leaving aside const functions for a moment, would you say that even using other constants in the definition of a constant makes its type hard to determine in real programs?

I don't really have an opinion. My comment was just like "if we think that's a problem, then another option is..."

* Const functions may make it more difficult for a programmer to infer the type
of a const/static item just by reading it. Most likely, though, most uses of
const functions in this context will be things like `AtomicUsize::new(0)`
where the type is obvious.

Why is this most likely? Once there are fully fledged const functions they may be used for all sort of things.

Author

True, they can be used in lots of ways, but since the motivation for that RFC was to initialize types like AtomicUsize and Cell<T> in constant contexts, I'm assuming that will be the typical use-case. I could be wrong, though; perhaps people who were involved in that RFC discussion could provide more info.

In those more complicated cases where documenting the type is important for readability, the programmer should leave the type in. But mandating that practice by requiring annotations on all const/static items, even the simple ones, seems like overkill to me. I'll add a note about this to the RFC.

@aturon aturon added the T-lang Relevant to the language team, which will review and decide on the RFC. label Jun 1, 2017
@aturon aturon self-assigned this Jun 1, 2017
@arielb1
Contributor

arielb1 commented Jun 1, 2017

@crumblingstatue

Sure. e.g. if a function wants to do a visitor pass, then we can create the visitor and its impls inside that function instead of in the module, e.g. this: https://github.com/rust-lang/rust/blob/master/src/librustc_mir/transform/mod.rs#L74 - this is more useful if your functions are impl-items, so you would have to add the struct outside of the impl.

@aturon
Member

aturon commented Jun 6, 2017

Thanks @schuster for the RFC, and thanks to those who've commented so far!

It seems opinions thus far have been mixed. Let me start by covering what are, in my mind, the most important drawbacks raised so far.

  1. Moving away from the rule that "all top-level items are fully type-annotated".
  • The stated downside here is that the language becomes less consistent, and harder to learn, because the need for type annotations varies in a more subtle way.
  • There is, however, a simple counter-argument. If we allow statics and consts to leave off type annotations, then they come to resemble let bindings, leading to a simple rule: forms that bind variable names do not require type annotations; forms that define functions do.
  • More generally, I don't think that most people, when learning static and const definitions, think immediately in terms of the top-level items rule above; that's a pretty sophisticated generalization. Rather, they learn: "this is how you declare a constant".
  2. Having type annotations be sometimes required is again a source of inconsistency or unpredictability.
  • Undeniably true! However, again thinking by analogy to let, this is already the case with type inference. In isolation, let x = None; fails to compile and requests a type annotation. The situation with constants and statics would be identical. There is, however, an argument that integer fallback should be allowed, to make the match between let, const, and static more complete.

I think the above are the most significant drawbacks, so I'll stop there for the sake of keeping debate focused.

On the positive side, to me a pain point is the need for inelegant, noisy stuttering:

static COUNT: AtomicUsize = AtomicUsize::new(0);
const USER_VISIBLE: Flags = Flags { /* ... */ };

In cases like these, readability is helped by leaving off type information, because it's easier to quickly see the relevant information and not have to skip over stuttering.
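
For reference, a runnable version of the status quo, under the assumption of a simple stand-in `Flags` struct (the real type in the comment is not shown) and a hypothetical `bump` helper. Both annotations are required today even though the right-hand side already names the type.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-in for the `Flags` type mentioned above.
struct Flags {
    bits: u8,
}

// Today both annotations are mandatory, repeating the type on each side.
static COUNT: AtomicUsize = AtomicUsize::new(0);
const USER_VISIBLE: Flags = Flags { bits: 0b01 };

// Increments the counter and returns the new value
// (`fetch_add` returns the previous one).
fn bump() -> usize {
    COUNT.fetch_add(1, Ordering::SeqCst) + 1
}

fn main() {
    println!("{} {}", bump(), USER_VISIBLE.bits);
}
```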

I'd love to see examples from real code where it would be possible to leave off a type annotation, and where doing so would significantly impair the reader's ability to determine the type.

@glaebhoerl
Contributor

@aturon I think one potentially-scary-seeming drawback is that you might modify one part of your public API, and accidentally end up changing another part of it as well, because the type of the second was inferred from the first. I honestly don't have any idea whether this would be a thing that actually happens in practice. Has anyone had any experience where they changed the type of a thing in their API, got a type error on a const or static definition whose type depended on it (because the existing ascribed type was now incorrect), and thought "oh crap, I forgot that would also be affected", rather than "oh jeez I forgot to update that type accordingly"?

For that matter, it could happen cross-crate as well.

@le-jzr

le-jzr commented Jun 6, 2017

Wouldn't that "unexpected" change be what you'd intend to do anyway?

Accidental semver-incompatible changes on constants are a very unlikely problem that can be discovered by automated tools. As much as I'm against giving programmers more decisions, in this case I support elision. You can let the author decide whether they want the type inferred or not, same as with locals.

@burdges

burdges commented Jun 6, 2017

You can still teach this in parallel with eliding type annotations for let even if you require type annotations on exported const and static. Also, any exported const defaulting to i32 over usize will create problems.

@glaebhoerl
Contributor

@le-jzr

Wouldn't that "unexpected" change be what you'd intend to do anyway?

That's what my question ("Has anyone had any experience where ...") was about.

@schuster
Author

schuster commented Jun 9, 2017

@glaebhoerl To make this concrete, you're proposing a scenario like the following, right? Somewhere in a crate, we have:

const FOO: i32 = 5;

and elsewhere (in this crate or in another one that imports the first) we have:

const BAR = (FOO); // inferred type is (i32)

Then if someone changes the annotation on FOO to i64 instead, BAR's inferred type would change to (i64), even though we didn't touch BAR's definition. The problem being that this could break uses of BAR somewhere down the line.

I agree with @le-jzr that this seems unlikely in the cross-crate scenario, which seems like the more dangerous of the two (because the person who makes the change and the person who notices a problem are not the same). But like you, I'm also curious if anyone has had the kind of experience you're asking about.
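
A sketch of how this plays out today, with a hypothetical `takes_i32` consumer. With the annotation in place, changing FOO to i64 would produce a type error at BAR's own definition; the `let` binding shows the inferred alternative, where the same change would only surface at a later use.

```rust
const FOO: i32 = 5;

// Today the annotation pins BAR's type: if FOO were changed to i64, this
// line itself would stop compiling.
const BAR: i32 = FOO;

// Hypothetical downstream code that depends on the type staying i32.
fn takes_i32(x: i32) -> i32 {
    x * 2
}

fn main() {
    // A `let` binding already infers its type from FOO. If FOO became i64,
    // the error would appear at the call below, not at the binding.
    let bar = FOO;
    println!("{} {}", takes_i32(BAR), takes_i32(bar));
}
```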

@aturon
Member

aturon commented Jun 9, 2017

cc @rust-lang/lang, it'd be good to start getting some additional thoughts on the tradeoffs here. I've written up a summary with my own thoughts about some of the arguments.

@nikomatsakis
Contributor

nikomatsakis commented Jun 9, 2017

@aturon

To my mind, you hit the nail on the head with this observation:

There is, however, a simple counter-argument. If we allow statics and consts to leave off type annotations, then they come to resemble let bindings, leading to a simple rule: forms that bind variable names do not require type annotations; forms that define functions do.

In particular, I've been trying to put my finger on why this RFC feels like such a win to me, whereas eliding return types on functions does not. I think it is precisely a question of the analogy with let, and the fact that I am accustomed to declaring values without having to state their types.

I also frequently find that I am making constants that are not "top-level constants" but rather constants within a function. In that context, they are technically nested top-level items, but my brain does not process them that way:

fn process_items(vec: &[i32]) {
    const CHUNK_SIZE = 32; // requiring `: usize = 32` doesn't feel like a win here

    for chunk in vec.windows(CHUNK_SIZE) { ... }
}

On the other hand, this example raises an interesting point. =) In particular, that bit of code would not work with this RFC, since I believe it would require a type annotation and, even if it didn't, it would presumably infer to i32, which is not what I want here. That's disappointing, as I think integral constants are pretty dang common.

Regardless, there are other examples (e.g., string constants, or more complex constants) that fit the mold.
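
The integer case can be sketched in today's Rust as follows; the function names are hypothetical and `chunks` is used as the slice method. The const needs its `usize` annotation, whereas a `let` binding with the same unsuffixed literal gets `usize` from its later use, without any fallback to i32. That use-driven inference is what the const form would lack under this RFC.

```rust
fn count_chunks(items: &[i32]) -> usize {
    // Today the annotation is required, and it must be usize because
    // `chunks` takes a usize.
    const CHUNK_SIZE: usize = 32;
    items.chunks(CHUNK_SIZE).count()
}

fn count_chunks_let(items: &[i32]) -> usize {
    // Unsuffixed literal: the use as an argument to `chunks` makes
    // inference pick usize for this binding.
    let chunk_size = 32;
    items.chunks(chunk_size).count()
}

fn main() {
    let data: Vec<i32> = vec![0; 100];
    println!("{} {}", count_chunks(&data), count_chunks_let(&data));
}
```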

@kennytm
Member

kennytm commented Jun 9, 2017

I've checked my code,

  • Almost all const/static used are integers or integer arrays (usize, [u32; N] etc), so this RFC won't help at all.
  • Most complex types (Regex, HashMap etc) appear in lazy_static which is not able to benefit from this RFC.

@est31
Member

est31 commented Aug 15, 2017

Personally, I think it would still be better to use that default just to be consistent with other parts of Rust. What do others think (of that aspect and any other parts of this data)?

I think the defaults are weakly documented, and having to know them wouldn't be nice from a learnability perspective. I think it's harder to remember what the defaults are than to remember that you have to put type annotations on number literals.

@petrochenkov
Contributor

petrochenkov commented Aug 15, 2017

If RFC 2071 "Add impl Trait type alias and variable declarations" ends up with the less conservative alternative (underlying type of impl Trait is revealed in the current module), it would be more or less equivalent to one variation of this RFC - type inference for private consts/statics.

$vis const C = init_expr;

could be then desugared into

// EDIT: `Anything` is not part of the desugaring, it lives somewhere in the standard library.
trait Anything {}
impl<T> Anything for T {}

// Desugaring
$vis const C: impl Anything = init_expr;

If C is used in the current module, its type is not hidden.
If $vis is large enough for C to be nameable from other modules, it will be barely usable from there due to its anonymized type.
So the property of const signatures being "interfaces", like fn signatures, is kept, but only for outer modules.
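
The closest thing that can be run on stable Rust is return-position `impl Trait`, so the sketch below uses a function `c()` as an analogy for the const `C`. This is an assumption made for illustration: unlike the desugaring described above, the concrete type here is hidden from every caller, including those in the same module.

```rust
// Hypothetical marker trait, standing in for the `Anything` trait that
// would live somewhere in the standard library.
trait Anything {}
impl<T> Anything for T {}

// The body alone determines the concrete type (u8); the signature only
// promises "some type implementing Anything".
fn c() -> impl Anything {
    42u8
}

fn main() {
    let value = c();
    // Callers cannot name the type or use it as a u8, although the
    // compiler still knows its layout.
    println!("{}", std::mem::size_of_val(&value));
}
```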

cc @cramertj

@schuster
Author

@petrochenkov Which part of either RFC would allow the creation of a new alias like that? This one would only allow type elision if a unique type can be inferred locally, whereas many types could be inferred for your example (e.g. impl Anything1, impl Anything2, etc.). The other RFC doesn't seem to elide type annotations at all (unless I missed something; I didn't read it too closely).

So if both RFCs were accepted and implemented, I would expect your example (before desugaring) to result in a compiler error. I wouldn't expect the compiler to generate both a new trait and a new impl for it as part of type inference. Have I missed something?

@petrochenkov
Contributor

petrochenkov commented Aug 16, 2017

@schuster

The other RFC doesn't seem to elide type annotations at all (unless I missed something; I didn't read it too closely).

It doesn't, but it provides all the underlying mechanisms. Adding type elision through desugaring into impl Trait would be a trivial addition then.

I wouldn't expect the compiler to generate both a new trait and a new impl for it as part of type inference.

Right, Anything should live somewhere in the standard library, only the : impl Anything part is generated.

@liigo
Contributor

liigo commented Aug 18, 2017

const NAME: TYPE = VALUE;

When reading code, the TYPE matters more than the VALUE. The latter does not help much in understanding the code. If TYPE is elided, I (the reader, the programmer) must infer it myself; it's not obvious in some cases, and the compiler can't help me here (though rustc will infer it too).

Update: for local vars, we have context.

@eternaleye

eternaleye commented Aug 27, 2017

A question from me: Would this RFC potentially allow the following?

const FOO = |a, b| a + b;

I'm not sure if I'd be in favor of or against this being permitted.

On the one hand, it makes perfect sense syntactically - if you don't need to name the type an unnameable type is no problem, and the closure would syntactically only be able to capture consts and statics.

On the other hand, it would enable a much more terse syntax for declaring functions, one which I am not especially in favor of; see my objections to a request for such a syntax.
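
For context, the nearest form that compiles today names a function-pointer type explicitly. A minimal sketch, with the hypothetical name `ADD`: a non-capturing closure coerces to a `fn` pointer, and the annotation is what fixes the argument and return types.

```rust
// A non-capturing closure coerces to a function pointer, so it can be
// stored in a const once the pointer type is written out.
const ADD: fn(i32, i32) -> i32 = |a, b| a + b;

fn main() {
    println!("{}", ADD(2, 3));
}
```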

@dobkeratops

dobkeratops commented Aug 28, 2017

const FOO = |a, b| a + b;

Nice idea! I would be greatly in favour of allowing this; it would allow easy migration of code between lambdas and functions. For another take on this, see the path being pursued in the JAI language, where the idea is to make evolving code easy ('code goes through a maturation cycle'). (Tangentially, Kotlin also has a nice idea.)

@nikomatsakis
Contributor

@rfcbot fcp postpone

I'm moving to postpone the RFC. Overall, while I still think that there's a problem here to be solved, I'm not sure that we've quite found the right formula for doing it. I think we should put this off and return to it in the future, once some of the dust has settled.

One point that sticks out in my mind is that, based on the data that @schuster gathered, it looks like enabling i32 fallback would basically always pick the wrong type for simple things like const FOO = 22. This means you would still have to annotate cases of simple integer literals, and yet those appear to be the vast majority of constants!

On the other hand, there are two other non-trivial cases that would be helped:

  • The const SOMETHING: Foo = Foo { } and const SOMETHING: Foo = Foo(...) repetition (fairly common, based on @schuster's statistics).
  • Array literals where you have to say the length const Foo: [u8; 3] = [...]; often you can do const Foo: &[u8] = &[...] instead here though.
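
The array-length point can be sketched like this, with hypothetical constant names. The array form repeats the length in the annotation, while the slice-reference form mentioned above does not.

```rust
// Array type: the length is part of the annotation and must match the
// number of elements on the right-hand side.
const TABLE: [u8; 3] = [1, 2, 3];

// Slice reference: no length in the annotation.
const TABLE_REF: &[u8] = &[1, 2, 3];

fn main() {
    println!("{} {}", TABLE.len(), TABLE_REF.len());
}
```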

The alternatives I see:

  • Broader inference could help with integers, but I'd not want to go there just now. Maybe later, after we gain more experience in other areas (e.g. impl Trait), where it appears that broader inference will be needed.
  • Perhaps something syntactic like @petrochenkov suggested would be better, but that feels kind of scary to me. It's not something we have precedent for. I don't think we have a good idea what its corner cases are, or how it would feel to have two distinct kinds of inference co-existing in the compiler.

@rfcbot
Collaborator

rfcbot commented Aug 29, 2017

Team member @nikomatsakis has proposed to postpone this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added the proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. label Aug 29, 2017
@rfcbot
Collaborator

rfcbot commented Aug 30, 2017

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Aug 30, 2017
@schuster
Author

schuster commented Sep 4, 2017

For anyone who tries to tackle this in the future, here are the two main sticking points I see:

  1. Inference for numeric literals. Given the stats above, it appears the fallback rule would not help in most cases, so there's room for a better option. Like @nikomatsakis said, perhaps some broader kind of inference could be helpful (although that makes it more difficult to determine the item's type by looking at the item alone).
  2. Whether type inference for const/static items can use the types for bound names (e.g. structs, tuple structs, and functions). The main difference between the approach suggested in this RFC and @petrochenkov's "syntactic" approach is that the RFC allows those types to be used, while the syntactic approach does not. The concern seems to be that the non-local nature of those types could make it harder to read a const/static expression and infer its type in your head.

@rfcbot
Collaborator

rfcbot commented Sep 9, 2017

The final comment period is now complete.

@aturon
Member

aturon commented Sep 11, 2017

This RFC has been closed as postponed. While the lang team believes that there is room for improvement here, it's proven to be a much more complicated design space than hoped, and the payoff doesn't appear worth the complexity at this time.

Thanks, @schuster, for writing and shepherding the RFC!

@alexreg

alexreg commented Mar 23, 2018

What's the current status on this?

@scottmcm
Member

@alexreg I don't believe anything has changed here since the FCP proposal and sticking points comments above. That said, I wouldn't be surprised if there were a few cases of narrower proposals that could be interesting, like how eliding 'static was allowed in Rust 1.21.0.

@eddyb
Member

eddyb commented Aug 18, 2018

I've just realized that #2010 (comment) (const FOO = |a, b| a + b;) is a non-problem:

use std::any::Any;
const FOO: &Any = &|a, b| a + b;
error[E0282]: type annotations needed
 --> src/lib.rs:2:27
  |
2 | const FOO: &Any = &|a, b| a + b;
  |                           ^ cannot infer type

Since this feature alone, without additional global inference, would rely on type-checking the body to get its overall type, it would hit the same problem of not being able to infer the closure's signature.
Furthermore, closure and function item types do not contain their signatures directly, so any sort of global inference would have to go through the trait system, which is an even bigger no-no.

OTOH, something like |a| Option::unwrap_or(a, 0) would work, as this compiles today:

const FOO: &Any = &|a| Option::unwrap_or(a, 0);

So if we're worried about the first example but not the last one, then I think it wouldn't be a mistake to accept this feature. If we're worried about the last one too, we can't do much, because this works:

#![feature(fn_traits, unboxed_closures)]
use std::any::Any;
use std::marker::PhantomData;
trait CallAny {
    fn call_any(&self, args: Box<Any>) -> Box<Any>;
}
// HACK: force dispatch to use M, not just T.
struct Mark<T, M>(T, PhantomData<M>);
impl<A: 'static, R: 'static, F: Fn<A, Output=R>> CallAny for Mark<F, A> {
    fn call_any(&self, args: Box<Any>) -> Box<Any> {
        Box::new(self.0.call(*args.downcast::<A>().unwrap())) as Box<Any>
    }
}
const FOO: &CallAny = &Mark(|a| Option::unwrap_or(a, 0), PhantomData);
fn main() {
    println!("{:?}", (
        FOO.call_any(Box::new((None::<i32>,))).downcast::<i32>(),
        FOO.call_any(Box::new((Some(123),))).downcast::<i32>(),
    ));
}
(Ok(0), Ok(123))
