overlong bit shifts cause undefined behaviour #10183

thestinger · 2013-10-31T00:10:01Z

1 >> 24353125315

nikomatsakis · 2013-10-31T00:21:01Z

At some point we discussed this in a meeting and decided it was acceptable.

thestinger · 2013-10-31T00:22:00Z

@nikomatsakis: it's not just an unspecified result as I previously thought though, it breaks memory safety

nikomatsakis · 2013-10-31T00:25:18Z

Do you have a concrete example where this leads to breaking memory safety? I can imagine how it might theoretically happen (that is, the behavior of the program as a whole is technically undefined), but I'm not sure that it actually does? I guess I can imagine some compiler saying that "since i1 >> i2 implies i2 < 32, then I'll do some wacky optimization to some expression involving i2..."

thestinger · 2013-10-31T00:27:56Z

@nikomatsakis: LLVM won't reserve a register for the return value, and will just read arbitrary registers on a read of the value, so for example a bounds check could fail to work

nikomatsakis · 2013-10-31T00:29:29Z

@thestinger return value of what function? read of what value? Can you spell it out more?

thestinger · 2013-10-31T00:31:46Z

For example, a pattern like this:

let mut xs = [1, 2, 3];
let index = 1 >> 128; // this is `undef`
xs[index] = 100;

The indexing operation will essentially expand to this:

if index < 3 {
    *unsafe_index(xs, index) = 100;
} else {
    fail_bounds_check(...);
}

The index value as read for the comparison may be different than the value passed to unsafe_index.

http://llvm.org/docs/LangRef.html#undefined-values

In practice, it likely won't be in most cases, but in a more complex example where there was register pressure it would be.

thestinger · 2013-10-31T00:34:48Z

The key part:

an ‘undef‘ “variable” can arbitrarily change its value over its “live range”. This is true because the variable doesn’t actually have a live range. Instead, the value is logically read from arbitrary registers that happen to be around when needed, so the value is not necessarily consistent over time.

brson · 2013-10-31T01:42:31Z

Nominating.

nikomatsakis · 2013-10-31T01:56:36Z

I see. Fascinating! Is there a way to "capture" an undefined value and make it constant? For example, storing it to a stack slot and reading it back? (Presumably this load/store could be optimized away, so prob not). If so, we could capture the output of shift operations in this way.

pnkfelix · 2013-11-07T18:43:59Z

Accepted for P-backcompat lang. There are a variety of ways to deal with this. We need to choose one. Will discuss further (e.g. in tues mtg).

pcwalton · 2013-11-22T04:45:20Z

I don't think this is backwards incompatible at a language level. It will not cause code that was working OK to stop working. Nominating.

pnkfelix · 2013-12-19T18:44:43Z

changing priority to P-high. not assigning to 1.0 milestone.

The current implementation of `rotate_left` and `rotate_right` are incorrect when the rotation amount is 0, or a multiple of the input's bitsize. When `n = 0`, the expression (self >> n) | (self << ($BITS - n)) results in a shift left by `$BITS` bits, which is undefined behavior (see #10183), and currently results in a hardcoded `-1` value, instead of the original input value. Reducing `($BITS - n)` modulo `$BITS`, simplified to `(-n % $BITS)`, fixes this problem.

pnkfelix · 2014-10-01T14:58:35Z

@pcwalton Is it not true that the logic of your last comment generalize to cover e.g. "integer overflow is undefined behavior.", in the sense that one could argue "code that is not exercising the conditions that yield undefined behavior will continue to work fine" in that scenario too?

I had thought that we considered it an important property that we actually tie down the semantics for operations in order to not head off into the undefined behavior weeds.

(Sorry for revisiting an old topic; it came up on IRC.)

pnkfelix · 2014-10-01T15:00:44Z

context for previous comment: https://botbot.me/mozilla/rust-internals/2014-10-01/?msg=22667069&page=5

pythonesque · 2014-11-22T16:21:04Z

Why is this not blocking 1.0? It's UB in safe code. At the very least it should be very prominently documented in the reference.

thestinger · 2014-11-22T16:25:15Z

It was marked as a backwards compatibility issue after being nominated, but then it was downgraded to P-high after another nomination. I don't understand the reasoning behind it. It's possible to do this in safe code and have it appear to work as expected and this will need to change.

ben0x539 · 2014-11-22T17:47:46Z

What are our options here? Compile bitshifts with non-const rhs values to include asserts?

pythonesque · 2014-11-22T18:15:46Z

A bitmask of the right hand side would be the cheapest way to make the behavior defined. It is what x86 does by default so it would be free there, it would just prevent LLVM from turning it into undef. The behavior isn't very nice, but it's still way better than what we currently have, and if the current behavior is considered backwards compatible to change then so would this.

Another option would be to always turn the result into 0 (which I think most people expect when they do a long bit shift) wherever it went out of bounds. I believe it would require a branch, though, but maybe there's a less expensive way to do it. An assert would work too, it would be more expensive though. The thing that gives me pause here is that (1) bitshifts are mostly used where performance is paramount, so having the default option be too expensive might not be good, and (2) bounds check removal is likely to be a lot more optimized in LLVM than overlong bitshift check removal (so the effect might be larger, plus we don't have an abstraction like iterators that would make them always defined). OTOH, maybe nonconstant bitshifts are not that common.

A third option would be to add the ability to make bitshift not undef to LLVM, but I suspect that doesn't really change the question of how best to do it, just moves the discussion to a different repository. The biggest advantage to doing that would be that there wouldn't be additional overhead on --opt-level 0 (I'd imagine).

An unsafe (unchecked) version and a checked version would also be exposed, of course.

thestinger · 2014-11-23T01:53:19Z

It is what x86 does by default so it would be free there

It will only be free if LLVM knows how to optimize it out.

A third option would be to add the ability to make bitshift not undef to LLVM, but I suspect that doesn't really change the question of how best to do it, just moves the discussion to a different repository.

LLVM could be taught to have a non-portable result rather than UB.

emberian · 2014-11-23T15:25:02Z

My reading of the langref leads me to believe that this is not UB in the traditional sense, merely that the result will have an unspecified result, unless we are translating to one of the shift instructions that create a poison value (which is distinct from undef).

thestinger · 2014-11-23T15:34:16Z

@cmr: undef memory causes UB in many contexts, such as xs[undef]

thestinger · 2014-11-23T15:34:41Z

It's covered earlier in the discussion here.

emberian · 2014-11-23T15:35:52Z

Ah I see now, yes, I misunderstood your earlier comment.

steveklabnik · 2014-11-25T16:27:37Z

Today,

fn main() {
    let mut xs = [1i, 2i, 3i];
    let x = 1 >> 128;

    xs[x] = 100i;
}

gives me

hello.rs:3:13: 3:21 error: bitshift exceeds the type's number of bits, #[deny(exceeding_bitshifts)] on by default
hello.rs:3     let x = 1 >> 128;
                       ^~~~~~~~
error: aborting due to previous error

This error would seem to indicate that something has changed here? Or did I read the history of this ticket wrong?

glaebhoerl · 2014-11-25T16:38:26Z

@steveklabnik not up to date on the history but I would guess that the problem is when you shift by a value that's not a compile time constant?

thestinger · 2014-11-25T16:43:36Z

@steveklabnik: There hasn't been any change in regards to this issue. There is a lint to alert the programmer to these mistakes at compile-time but it is still UB if it's not a compile-time constant or even with compile-time constants if the lint is disabled (as it is just an optional warning).

steveklabnik · 2014-11-25T17:02:24Z

Can confirm that this code (with no -O stuff, I'd wonder if -O3 couldn't optimize it out)

    let mut xs = [1i, 2i, 3i];
    let idx = 64 + 1;
    let x = 1 >> idx;

    xs[x] = 100i;

doesn't throw the warning, and thus, compiles. On my system, it just does nothing, rather than crash or something. Woo UB!

thestinger · 2014-11-25T17:04:26Z

It's not fixed for shifts by constants either. The optional warning doesn't remove the responsibility of the compiler to prevent memory unsafety.

ruuda · 2015-02-16T21:06:37Z

Currently, the following code panics in debug mode but runs fine in release mode:

fn main() {
    let sh = 32;
    assert_eq!(0, 1_i32 << sh);
}

(Or try it in the playpen.)

I am not sure whether this is by design or whether it is a bug, but either way, the documentation of shl nor the documentation of i32 mentions this behaviour.

steveklabnik · 2015-02-16T23:06:01Z

The overflow RFC addresses this, and I have a bug open to reconcile all RFCs with the reference, so this will be taken care of as part of that.

thestinger · 2015-02-17T01:47:39Z

This is a memory safety hole in the implementation. It's an issue regardless of how the language is intended to work per RFCs...

srh · 2015-02-17T02:45:55Z

@steveklabnik How does the overflow RFC address this?

steveklabnik · 2015-02-17T02:56:25Z

Sorry, by 'address' I mean a decision was made, and I'll be documenting the result. Not that it's any different than today.

thestinger · 2015-02-17T02:59:26Z

This bug isn't asking for a change in semantics, it's filed to cover a known memory safety hole in the language. The fact that a clear solution has been specified doesn't mean the bug is fixed... pretending memory safety holes don't exist in the implementation doesn't make them go away.

Gankra · 2015-02-17T03:01:01Z

Agreed this issue should not be closed until UB in safe Rust does not occur.

ben0x539 · 2015-02-17T13:59:35Z

I don't think the RFC addresses this issue at all. Under "unresolved questions", it lists "shifts by an excessive number of bits". It just proposes debug assertions and doesn't prevent undefined behavior in release builds.

pnkfelix · 2015-03-20T09:13:44Z

Here is a variant of @ruud-v-a 's example that assert fails (in different places) in both -O0 and -O2 modes. (It was adapted from #23551.)

#![allow(exceeding_bitshifts)]

fn id<T>(x: T) -> T { x }

fn main() {
    let sh = 32;
    assert_eq!(0, 1_i32 << 32);     // note that this line "is the same" as the two below
    assert_eq!(0, 1_i32 << sh);     // fails here with -O0
    assert_eq!(0, 1_i32 << id(sh)); // fails here with -O2
}

playpen

pnkfelix · 2015-03-20T09:20:01Z

@ben0x539 I believe the intent of the RFC is to ensure that the fallback behaviors, when reasonable, do not expose undefined behavior from LLVM.

In these cases, it seems like the fallbacks should simply mask-out the higher order bits as appropriate for each type.

pnkfelix mentioned this issue Nov 7, 2013

floating point to integer casts can cause undefined behaviour #10184

Closed

pnkfelix mentioned this issue Dec 19, 2013

integer to floating point casts can cause undefined behaviour #10185

Closed

klutzy mentioned this issue May 11, 2014

rustllvm: Add LLVMRustArrayType #14083

Closed

thestinger mentioned this issue May 12, 2014

bitwise operation takes wrong value #11580

Closed

sneves mentioned this issue Jul 2, 2014

Fix rotate_{left, right} for multiple of bitsize rotation amounts #15324

Merged

sneves mentioned this issue Jul 16, 2014

Change Shl<T, T> (resp Shr<T, T>) for Int to Shl<uint, T> (resp Shr<uint, T>) #15407

Merged

ben0x539 mentioned this issue Oct 2, 2014

Should have a lint for statically detectable overlong bit shifts #17713

Closed

ben0x539 mentioned this issue Oct 9, 2014

Integer overflow semantics rust-lang/rfcs#359

Closed

pythonesque mentioned this issue Nov 22, 2014

Prominently document that overlong bitshifts cause undefined behavior #19214

Closed

Gankra mentioned this issue Dec 10, 2014

Correct bitv build failures and a bug in all() Gankra/rust#1

Merged

steveklabnik closed this as completed Feb 16, 2015

Gankra reopened this Feb 17, 2015

This was referenced Mar 20, 2015

overflowing left shift can lead to pseudo-byte with hidden bits #23551

Closed

Tracking issue for Integer Overflow (RFC 560) #22020

Closed

This was referenced Mar 20, 2015

overflowing-checking for rhs of shift operators #23536

Merged

Inaccuracies in the reference about bitshifts and overflows #23421

Closed

pnkfelix closed this as completed in 19fef72 Mar 24, 2015

overlong bit shifts cause undefined behaviour #10183

overlong bit shifts cause undefined behaviour #10183

Comments

thestinger commented Oct 31, 2013

nikomatsakis commented Oct 31, 2013

thestinger commented Oct 31, 2013

nikomatsakis commented Oct 31, 2013

thestinger commented Oct 31, 2013

nikomatsakis commented Oct 31, 2013

thestinger commented Oct 31, 2013

thestinger commented Oct 31, 2013

brson commented Oct 31, 2013

nikomatsakis commented Oct 31, 2013

pnkfelix commented Nov 7, 2013

pcwalton commented Nov 22, 2013

pnkfelix commented Dec 19, 2013

pnkfelix commented Oct 1, 2014

pnkfelix commented Oct 1, 2014

pythonesque commented Nov 22, 2014

thestinger commented Nov 22, 2014

ben0x539 commented Nov 22, 2014

pythonesque commented Nov 22, 2014

thestinger commented Nov 23, 2014

emberian commented Nov 23, 2014

thestinger commented Nov 23, 2014

thestinger commented Nov 23, 2014

emberian commented Nov 23, 2014

steveklabnik commented Nov 25, 2014

glaebhoerl commented Nov 25, 2014

thestinger commented Nov 25, 2014

steveklabnik commented Nov 25, 2014

thestinger commented Nov 25, 2014

ruuda commented Feb 16, 2015

steveklabnik commented Feb 16, 2015

thestinger commented Feb 17, 2015

srh commented Feb 17, 2015

steveklabnik commented Feb 17, 2015

thestinger commented Feb 17, 2015

Gankra commented Feb 17, 2015

ben0x539 commented Feb 17, 2015

pnkfelix commented Mar 20, 2015

pnkfelix commented Mar 20, 2015