Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Const indexing and friends #25570

Merged
merged 2 commits into from
Dec 1, 2015
Merged

Const indexing and friends #25570

merged 2 commits into from
Dec 1, 2015

Conversation

oli-obk
Copy link
Contributor

@oli-obk oli-obk commented May 18, 2015

This PR allows the constant evaluation of index operations on constant arrays and repeat expressions. This allows index expressions to appear in the expression path of the length expression of a repeat expression or an array type.

An example is

const ARR: [usize; 5] = [1, 2, 3, 4, 5];
const ARR2: [usize; ARR[1]] = [42, 99];

In most other locations llvm's const evaluator figures it out already. This is not specific to index expressions and could be remedied in the future.

@nrc
Copy link
Member

nrc commented May 18, 2015

cc @eddyb

@nrc
Copy link
Member

nrc commented May 18, 2015

Given that we're in the post-1.0 world. We need to either make this not a breaking change or justify how this is effectively just fixing a bug (we'd also need to measure the impact of the change in this case). So I don't think we can land this as is.

I'm afraid I don't understand either of the two proposals for fixing the breaking change - could you explain them in a bit more detail please?

@oli-obk
Copy link
Contributor Author

oli-obk commented May 19, 2015

everything in this comment is outdated and probably wrong. It is left since other comments refer to it

rebase and fix of #25370

This is still a breaking change, but could either be fixed by introducing a static fallback or by making enum variants first class consts (which I can do, but this PR is already getting out of hand).
The failing test is was

enum E { V1(isize), V0 }
const C: &'static [E] = &[E::V0, E::V1(0xDEADBEE)];
static C1: E = C[1];
static C0: E = C[0];

The failure is was that enum variants can currently not be evaluated inside consts. This used to work, because the const slice was secretly a static slice.

run-pass/const-enum-vec-index.rs:14:16: 14:20 error: constant indexing failed: unsupported constant expr
run-pass/const-enum-vec-index.rs:14 static C1: E = C[1];
                                                   ^~~~

r? @pnkfelix or @nrc

Explanation of changes:

Array expressions, Repeat expressions and references are now available as const_vals. This allows indexing into arrays, repeats and references to either to be evaluated completely inside const_eval at compile-time and does not require llvm to solve this for us at optimization-time.

This requires a few tricks:

  1. const evaluation treats statics as consts and simply squashes them if able
  2. referencing and (auto-)dereferencing is done in the const-evaluator
  3. statics special-treat references to indexing operations
  4. normal indexing ops return by-val and are computed at compile-time (llvm never knows there was an array)
  5. Two references to the same element of a constant array are not guaranteed to have the same address references to arrays are again promoted to statics if they aren't directly dereferenced again
  6. Two references to the same constant are not guaranteed to have the same address This is by-design but not actually happening

I'm afraid I don't understand either of the two proposals for fixing the breaking change - could you explain them in a bit more detail please?

  1. introducing a static fallback
    if the const evaluation of an indexing operation fails, we can call the old code that did the indexing in llvm
    This would be the safe solution, as there might be other situations where static evaluation (llvm) works but constant evaluation doesn't.
  2. by making enum variants first class consts
    I already created new const_val variants for Array, Repeat and Ref. I could add an EnumVariant const val and process it properly. Right now there's just some magic that works if a c-like enum is const evaluated.
    This is the nice solution, as it's another step towards full const evaluation of all const-evaluable code.

I see now that only Solution 1. is viable in this PR, as there shouldn't be any potential for a breaking change even if mostly no code would break.

For Solution 2 I'll add another PR in the future, for now I'll just fix this by reintroducing the original fallback

@oli-obk oli-obk force-pushed the const_indexing branch 2 times, most recently from ccc582e to 8ce7fdb Compare May 19, 2015 09:09
@@ -877,12 +962,57 @@ pub fn eval_const_expr_partial<'tcx>(tcx: &ty::ctxt<'tcx>,
ast::ExprBlock(ref block) => {
match block.expr {
Some(ref expr) => try!(eval_const_expr_partial(tcx, &**expr, ety)),
None => const_int(0)
None => const_int(0) // huh???
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added in 24ece07

I believe it should be unreachable!() or a sess.span_bug but I might be wrong

cc @Kimundi

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh sorry, should have been researching better: #14183 (comment)

@oli-obk
Copy link
Contributor Author

oli-obk commented May 19, 2015

hold this off until #25609 is merged, I'll rebase then.

@@ -566,12 +610,20 @@ fn const_expr_unadjusted<'a, 'tcx>(cx: &CrateContext<'a, 'tcx>,
}

ast::ExprIndex(ref base, ref index) => {
match const_eval::eval_const_expr_partial(cx.tcx(), &e, None) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an alternative would be to check if base points to a static or a constant and then decide on what to do. Any thoughts?

@oli-obk
Copy link
Contributor Author

oli-obk commented May 23, 2015

I thought constants couldn't refer to statics at all. Am I missing something?

yea, that confused me quite a bit, too. As I read the docs and book, it means that if you use a constant in between, you again loose all guarantees of memory locations and uniqueness of values. Although the compiler actually evaluates all constants with intermediate statics as statics

EDIT: apparently the comment was deleted

@eddyb
Copy link
Member

eddyb commented May 23, 2015

@oli-obk I realized the code path is for the value with which a static is initialized, not a const.
A const cannot mention any static at all, it's the initializer value of a static which can.

Also, the error constants cannot refer to other statics, insert an intermediate constant instead [E0013] is very confusing, I'm not sure what it's trying to tell me.

@oli-obk
Copy link
Contributor Author

oli-obk commented May 23, 2015

oh wow, I got everything confused here... I was so sure statics were simply evaluated as constants if called from a constant...

So I guess my "try const eval, if fails try static eval" is fine?

@bors
Copy link
Contributor

bors commented May 24, 2015

☔ The latest upstream changes (presumably #25609) made this pull request unmergeable. Please resolve the merge conflicts.

@oli-obk
Copy link
Contributor Author

oli-obk commented May 24, 2015

Also, the error constants cannot refer to other statics, insert an intermediate constant instead [E0013] is very confusing, I'm not sure what it's trying to tell me.

It's trying to tell you you should replace the static with a constant, and make the static simply be a static A: T = C; where C is your new constant. Then your other constant that failed with the error can also reference C instead of A.

@oli-obk
Copy link
Contributor Author

oli-obk commented May 24, 2015

rebased, it was only a minor change (forwarding a function parameter). make check is running

@eddyb
Copy link
Member

eddyb commented May 24, 2015

@oli-obk that seems very contrived, in practice you either replace the referenced static with a const or you replace the referencing const with a static.

@oli-obk
Copy link
Contributor Author

oli-obk commented May 25, 2015

I'd rather remove the error completely and have constant evaluation treat statics as constants. But then my "try const eval, else fallback to static eval" method fails horribly.

check-stage1 passes

@eddyb
Copy link
Member

eddyb commented May 25, 2015

@oli-obk No, no, no: statics are not constant.
The main two usecases for static are interior mutability and link-time constants.
Neither of these can be given meaningful semantics when used in a constant.
IMO a static which could be const should be linted against.

@oli-obk
Copy link
Contributor Author

oli-obk commented May 25, 2015

now that is something i can do! useless static-lint is on my TODO list

@oli-obk
Copy link
Contributor Author

oli-obk commented Jun 8, 2015

rebased and removed the const references

This PR now only addresses const arrays, repeats and indexing into them.

NotOnFloat,
NotOnString,
NotOnBinary,
NotOnStruct,
NotOnTuple,
NotOnArray,
NotOnRepeat,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it should be a nested enum or something (is it done like this just for flattening? no, that doesn't make sense, below there's (u64, u64) payloads).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm okay with not trying to fold that particular change into this PR -- i.e. switching to a nested enum might make sense but also seems like it is a broader refactoring that we could develop later, no?)

@eddyb
Copy link
Member

eddyb commented Jun 8, 2015

I just checked and it seems that many of these cases just cause the compiler to abort (with SIGILL and no error message), for example (playpen):

fn main() {
    static A: i32 = 0;
    static B: i32 = *&A;
    println!("{:?}", B);
}

Though it only happens when the statics are inside a function, so I'm a bit confused as to what's going on.

IMO references to statics should not be dereferenced for any other purpose than to offset them (similar to &(*(0 as *const Struct)).field as *const _ as usize, which is no longer allowed).

Is it even possible to deny such uses after 1.0?
cc @alexcrichton @nikomatsakis

@oli-obk
Copy link
Contributor Author

oli-obk commented Jun 9, 2015

so... I came to the conclusion that indexing into and derefing statics never makes sense in a static environment. except when taking the address directly again, but that's apparently not ok:

Borrowed statics can specifically only have their address taken, not any number of other borrows such as borrowing fields, reading elements of an array, etc.

taking either of these by value also makes no sense... so i tried to forbid that, but this also breaks constants since there's no difference between a &'static that points to a constant and one that points to a static... so... one solution is to introduce &'const (even if just in the ty::Region enum), other solutions are welcome.

situations this is happening in the test-suite:
https://github.com/rust-lang/rust/blob/master/src/test/run-pass/const-autoderef.rs#L13
false positives:
https://github.com/rust-lang/rust/blob/master/src/test/run-pass/const-deref.rs#L13
https://github.com/rust-lang/rust/blob/master/src/test/run-pass/const-block-item.rs#L32
https://github.com/rust-lang/rust/blob/master/src/test/run-pass/const-fields-and-indexing.rs
https://github.com/rust-lang/rust/blob/master/src/test/run-pass/issue-17233.rs

@bors
Copy link
Contributor

bors commented Jun 14, 2015

☔ The latest upstream changes (presumably #26071) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Nov 14, 2015

☔ The latest upstream changes (presumably #29781) made this pull request unmergeable. Please resolve the merge conflicts.

@bluss
Copy link
Member

bluss commented Nov 14, 2015

The PR's message will go into the git log when this is merged. I think it's best to make sure it reflects the merged version, and I guess it's best to remove the outdated text?

@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 16, 2015

rebased.

@bluss, I moved it to my second comment

@bluss
Copy link
Member

bluss commented Nov 16, 2015

@oli-obk Thank you, that's neater!

@@ -416,6 +424,12 @@ pub enum ErrKind {
ExpectedConstTuple,
ExpectedConstStruct,
TupleIndexOutOfBounds,
IndexedNonVec,
IndexNotNatural,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I feel like I vaguely remember what "natural numbers" are from grade-school, but wouldn't "positive" be easier?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Positive doesn't include zero. (Usually the naturals start at zero.)

You could do IndexNotNonNegative, or just IndexNegative if you like. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(though perhaps this error is also issued when the index is a floating point number, i.e. a non-integer? That's the other reason one would say "Natural" ...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The non-integer case seems to be handed by the NotInt variant below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like IndexNegative, I'll go with that.

Natural numbers only include integers.

@nikomatsakis
Copy link
Contributor

So this PR looks good to me.

One question is whether we should feature-gate these changes? Otherwise it does represent an implicit commitment. OTOH, it seems like something we'll obviously want to support. I feel like we've chosen to feature-gate in other similar instances. But it seems like we need to make better procedure for moving forward with UNGATING. I know that @aturon made a bunch of issues, which is a first-step, we need to start reviewing them regularly as well, I imagine. Thoughts @rust-lang/lang?

@pnkfelix
Copy link
Member

@nikomatsakis I suppose the broader question is what criteria we should use in deciding whether to impose a feature gate.

I see three potential reasons for a feature gate:

  1. When we do not know whether we want a feature (in any form) at all
  2. When we have decided we want a feature, but there are details about its specified semantics that we are not sure about. (Note that there a range of potential semantics that we can assign, and there is a partial ordering between them, since we can reject code statically when it fails to adhere to some base set of rules ... so we might feature-gate some extension to a semantics, but leave the core ungated).
  3. When we think we know the semantics we want for a feature, but we are not confident that we can commit to the "current" implementation.

Does that sounds about right?

In this case, I think the answers are: (1.) we want this (but maybe others would disagree), (2.) any possible semantics we might choose is going to include what is outlined here, I think (i.e. this probably occupies a low point in the partial ordering of potential semantics that I alluded to above). So the only question mark for me is (3.): is the implementation sufficiently low-risk that we can side-step a gate? And I don't have a confident answer there.

@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 18, 2015

We already allow indexing into constants anywhere but a true constant (e.g. array lengths). It's just computed by llvm's constant evaluator in trans/consts.rs So this is more of a bugfix ;)

So imo it's only a question about (3.)

@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 18, 2015

addressed nits and added a test

@nikomatsakis
Copy link
Contributor

So let me be clear on my concern. In general, I think that constant expression evaluation is a kind of "ad-hoc, best-effort" mess whose limits are not well documented and are only understood by a relatively small set. The very fact that we have two constant evaluators that handle two distinct subsets -- and slightly differently, at times, though of course those are "just bugs" -- is sort of what I'm talking about. I would very much like to refactor things so that we have ONE constant evaluator that is following a relatively documented, comprehensible process. And I'm wary of accidentally enabling evaluation of expressions that turns out to work most of the time, but have rough edges. Let me give you another concrete example: we currently allow you to "dereference" an &'static pointer, which -- when it points into a static -- perhaps ought to be disallowed. Certainly it leads to #29884, which was partly because of a (rather ad-hoc) decision to support static recursion (#29719). Luckily we put a feature-gate on this decision, so we have some luxury to either prohibit derefs into statics or else fix #29719 before we are overly committed. So I think it's pretty reasonable to move slow in terms of allowing arbitrary stable code to make use of extended sets of constant expressions.

Now, indexing and dereferences are basically the same thing, so an example of something I'd like to be careful about is: what happens if you try to index into the content of a static here?

All this is not to say we need a feature-gate, but just to say that there is evidence that there is reason to think that @pnkfelix's condition #2 may apply here:

When we have decided we want a feature, but there are details about its specified semantics that we are not sure about.

Basically, I think it might be prudent to place a kind of feature gate around ALL expansions of constants until we have something a bit more disciplined (at least a unified evaluator, ideally an RFC laying out what is expected to work).

@nikomatsakis
Copy link
Contributor

I just want to be clear though: I'm also open to the idea that I'm being excessively cautious. Part of my problem I think is that I personally don't feel like I deeply understand the limits of the current system, and that makes me feel uneasy. Perhaps those more intimately familiar feel more comfortable.

@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 18, 2015

Now, indexing and dereferences are basically the same thing, so an example of something I'd like to be careful about is: what happens if you try to index into the content of a static here?

you get an error, because a static is not a true constant and const_eval aborts.

would very much like to refactor things so that we have ONE constant evaluator that is following a relatively documented, comprehensible process.

This is the target I've been doing all these PRs for. Once const_eval can do anything trans/consts.rs can, I want to change trans/consts.rs to be nothing but a translator for instances of the ConstVal enum.

All this is not to say we need a feature-gate, but just to say that there is evidence that there is reason to think that @pnkfelix's condition #2 may apply here:

I'm perfectly fine with a feature gate.

And I just found #29914 in trans/consts. Totally related to mixing two constant evaluators that implement different subsets.

@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 20, 2015

added a feature gate. make check passes.

@nikomatsakis
Copy link
Contributor

@oli-obk sorry, I was on PTO for a bit and fell behind with notifications, I'm catching up now. Anyway, thanks for adding the feature gate. I'm going to approve this then!

@bors r+

@bors
Copy link
Contributor

bors commented Dec 1, 2015

📌 Commit 6405122 has been approved by nikomatsakis

bors added a commit that referenced this pull request Dec 1, 2015
This PR allows the constant evaluation of index operations on constant arrays and repeat expressions. This allows index expressions to appear in the expression path of the length expression of a repeat expression or an array type.

An example is

```rust
const ARR: [usize; 5] = [1, 2, 3, 4, 5];
const ARR2: [usize; ARR[1]] = [42, 99];
```

In most other locations llvm's const evaluator figures it out already. This is not specific to index expressions and could be remedied in the future.
@bors
Copy link
Contributor

bors commented Dec 1, 2015

⌛ Testing commit 6405122 with merge eb1d018...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants