-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
causes UB by calling mem::zeroed on any type #2
Comments
Well... my excuse is that the rust unsafe code guidelines didn't exist when the crate was made, and this would have been valid in C/C++ 😛 So uh.. thanks! Does Does the UB come from owning the uninitialised value, or does it come from having a live expression of type T, where T may be uninhabited, or something else? It seems like an intrinsic maybe needed or there will still be no way to do this without invoking UB? |
You might be interested in this discussion on internals that brought me here. :)
UB comes from operating on (in this case, assigning) a value of a type
You are fine as long as you avoid references. This is where you also need rust-lang/rfcs#2582. |
@RalfJung I see, the only bit that seems kindof a grey area for me is the fields access, eg.
Coming from C++, my intuition for how lvalues work is that they are essentially the same as implicit references. eg. auto x = *ptr + 1; This is actually taking the rvalue And this explains why the following does not actually read the memory referred to by auto y = &(*ptr).foo; Again In this model, the rules for uninit data are easy: you are just not allowed to have uninit data in an rvalue at any time. Now, going back to rust-land: if I try to find corresponding rules, it's much trickier, because now it matters whether an lvalue was derived from a reference vs a raw pointer or union access: let's call it raw_lvalue vs lvalue. Now the rule must be that neither lvalues nor rvalues can contain uninit data, but raw_lvalues can (or else the field offset in Without these rules, it's difficult to specify what kinds of operations are allowed to operate on uninit data: eg. you say that assignment cannot operate on uninit data, and you can derive that by saying that assignment operates on lvalues (and so raw_lvalues must first be converted to lvalues) whereas field access is defined for all three of raw_lvalues, lvalues and rvalues. |
This matches Rust.
Why do you think unions matter? I don't think they do. Actually I think the model should be uniform for all places (that's how lvalues are called in Rust), and the only thing it should assert is that the memory this points to is allocated (with the size given by the type). But so far that is just my model, not something we all agreed on yet. Only rvalues (for which we are still searching a better term in Rust land) have to be valid at their given type. |
OK, I think I understand. If I create a (normal) reference to an uninit lvalue: let x = &maybe_uninit.value; The uninitialised value is never in an rvalue, and so it's fine for it to be uninitialised. However, a reference to it is created which is an rvalue, and references must point to valid data to themselves be valid, and that's the source of the UB. If I could somehow create a reference such that the reference itself was an lvalue, then that reference could point to uninitialised data, without invoking UB, but that's not possible because As for naming rvalues, I think it would be tricky to improve on rvalue given how obscure the concept is, because there's never a good name for obscure concepts, and sometimes just using any existing precedent is better. Maybe "materialized", "actualized" or "reified" value. |
Exactly. Except that what exactly validity of references is is not yet finalized, we are collecting data and going to be discussing that in rust-lang/unsafe-code-guidelines#77. In my reading references have to point to allocated memory but it is okay if that memory is not initialized. But there might be reasons to require the stronger invariant.
Not sure what you mean here, but the point of rust-lang/rfcs#2582 is that you can turn a place immediately into an rvalue of type |
Reading through this again, I don't know why we talked about field access here. The UB is already in this line: // Construct a "fake" T. It's not valid, but the lambda shouldn't
// actually access it (which is why this is unsafe)
let x = mem::zeroed(); This "produces" a value from 0-filled memory for any type Do you think there is any chance this could use the |
I don't believe that version of the The way I get the address in this crate is by using pattern matching, so that Unfortunately, there is no equivalent of |
No, it specifically checks for that case first, before doing the field access. We went for that approach to make it easier to convert to rust-lang/rfcs#2582 once that lands, and to keep the references to uninitialized memory that we create to a minimum. |
@Diggsey did you have a chance to take a second look? I am fairly sure that your analysis above is incorrect and that The use of |
Yes, I don't have any concerns with the new implementation now, but was considering whether I should make this a breaking change: The existing (unsafe) The current documentation says that when using this method, you should not access the referenced value, but this is no longer sufficient: if I keep the method as-is, it will be invalid to call on any type which cannot be zeroed, so I would need to update the docs and add a second constructor without that constraint. However, it looks like in that case the new lint will still fire? (Shouldn't the lint be disabled for generic parameters if the method is unsafe, because it could simply be part of the contract that Anyway, I'm leaning towards just making it a breaking change to avoid any surprises, so this shouldn't be an issue. |
I've opened a PR: #3 Would you mind reviewing it? |
The lint only fires if the type is definitely not zero-able... so yeah I guess it will not apply here. I thought that code was in a macro, but it is not.
Ah, I didn't realize this would have to be a breaking change. Given the unsoundness of the old version, I wonder if it is worth either having a patch release that's still not sound but better (it could use a dangling reference instead of
Thanks! I'll take a look. |
Not sure why I need owner access for the review but thanks anyway. ;) |
Github wouldn't let me add you as a reviewer without first adding you as a contributor. Don't worry I'm not expecting you to help maintain 😛 |
Btw, I did find this line to be a little... dangerous: https://github.com/Gilnaa/memoffset/blob/master/src/offset_of.rs#L33 Maybe it's just personal taste, but I prefer to put as much of the unsafety outside of macros as possible, because in a macro, you might rely on something which today would be a syntax error, but in tomorrow's rust might be valid code and do something totally unexpected. |
I can't really imagine anything that would break that, do you have an example? Also this helps factor the code, which is important when dealing with unsafe. I'd rather not repeat myself. |
This code is highly problematic:
If
T
is, for example, a struct containing reference type, then this causes UB by initializing a reference with 0. The comment seems to say that it is okay to do this as long asx
is not accessed; that is not correct.That said, there currently is no correct way to do what you want. This bug can only really be fixed once
MaybeUninit
gets stabilized.(I also feel really uneasy with the extend to which this relies on type inference, but YMMV.^^)
The text was updated successfully, but these errors were encountered: