Some questions around what is UB and what is defined when using raw pointers #205
Comments
I think 2 can fall as UB under:
This is a quote from: the docs of UnsafeCell. and should also be mentioned here in my opinion. |
That's simply impossible. Consts don't have any address; when you take |
@bjorn3 you're talking in a mindset of the rustc low level. |
Re-read it again. So what you're saying is that every use of &CONST creates a copy of that const in read-only memory? |
Conceptually, yes. Within one object file the copies are deduplicated, but separate copies can exist in different crates.
Yes
As said above, within one object file it is likely equal, but between different crates it isn't.
Yes |
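A minimal sketch of the point about `&CONST`, assuming a hypothetical constant named `ANSWER`. Reading through either pointer is fine, but whether the two addresses compare equal is not something the language promises, so the example only prints that result.

```rust
// Hypothetical constant used only for illustration.
const ANSWER: u32 = 42;

/// Returns a pointer to a promoted copy of the constant.
/// The `'static` annotation makes the promotion explicit.
fn addr_of_answer() -> *const u32 {
    let r: &'static u32 = &ANSWER;
    r as *const u32
}

fn main() {
    let a = addr_of_answer();
    let b = {
        let r: &'static u32 = &ANSWER;
        r as *const u32
    };
    // SAFETY: both pointers come from `'static` references to a `u32`.
    let (va, vb) = unsafe { (*a, *b) };
    println!("values: {} {}", va, vb);
    // Unspecified: the copies may or may not be deduplicated.
    println!("same address: {}", a == b);
}
```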
I believe it is always allowed, but impedes some optimizations
Allowed, but Miri used to not support ptr<->int casts, so align_offset was invented. |
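For reference, a small sketch of how `align_offset` can stand in for a manual cast. The helper name `first_aligned_index` is made up for this example. The standard library documents that `align_offset` may return `usize::MAX` when it cannot compute an offset, so the sketch treats that as "no answer".

```rust
/// Hypothetical helper: index of the first byte in `bytes` whose address is
/// a multiple of `align`, or `None` if no such byte lies inside the slice.
fn first_aligned_index(bytes: &[u8], align: usize) -> Option<usize> {
    // `align_offset` panics if `align` is not a power of two.
    assert!(align.is_power_of_two());
    let offset = bytes.as_ptr().align_offset(align);
    // `usize::MAX` ("cannot be aligned") is never below `bytes.len()`.
    if offset < bytes.len() {
        Some(offset)
    } else {
        None
    }
}

fn main() {
    let buf = [0u8; 16];
    match first_aligned_index(&buf, 4) {
        Some(i) => println!("first 4-aligned byte is at index {}", i),
        None => println!("no aligned byte found"),
    }
}
```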
That's more an optimization that is sound for programs that don't exhibit UB, than the semantics of the language itself. In the abstract machine, using a When you write However, if you create a
Since the allocation is a local private temporary, that's like writing this:

    let x = 3;
    let y = 3;
    assert!(&x as *const _ as usize != &y as *const _ as usize);

We do not provide any guarantees about what the addresses of x and y are. You can compare their addresses - that's not UB - but there are no guarantees about what the outcome of that
This is not UB.
This is not UB either. At most it is a logic bug, and sometimes it even works correctly. |
@gnzlbg Thank you. I finally think I understand consts in Rust properly :)
So that's not a proper way to check alignment? (by looking at the implementation of |
That's how I understand it. It is not a good way to check alignment, but it is not UB per se. If that returns some incorrect result, your program might or might not do something later that is UB. |
4 is about alignment. What am I missing?
The conclusion was actually more like "not syntactically possible". Mutating read-only memory through raw ptr casting is UB, and the reference says so:
That's not a question. But know that int-ptr-casts are very poorly understood in C and LLVM, and Rust inherits this. But given that in Rust you can only cast usize with raw pointers, nothing can go wrong. These are even safe operations so I wonder why you might even think they are UB?
Indeed that's the only way to actually check alignment. |
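The check being discussed can be sketched as follows. The function name `is_aligned_for` is invented for this example. It only tests the address and says nothing about whether the pointer is otherwise valid to dereference.

```rust
use std::mem::align_of;

/// Hypothetical helper: true if the address of `ptr` is a multiple of
/// `T`'s alignment. Both the cast and the arithmetic are safe operations.
fn is_aligned_for<T>(ptr: *const u8) -> bool {
    (ptr as usize) % align_of::<T>() == 0
}

fn main() {
    let buf = [0u8; 16];
    for i in 0..4 {
        // `wrapping_add` never needs `unsafe`; the offsets stay in bounds anyway.
        let p = buf.as_ptr().wrapping_add(i);
        println!("offset {} aligned for u32: {}", i, is_aligned_for::<u32>(p));
    }
}
```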
That makes it sound like |
Sorry fixed. I meant 2.
I still think it's worth explaining: to people who aren't familiar with how Rust handles consts (like me before reading this thread :) ), it sounds confusing to be told that something "isn't possible" when they can easily do
If I wasn't clear enough: I meant, will it be UB when dereferencing? Can an int-ptr cast change the pointer in some edge case, or is it completely fine to cast and then dereference (assuming the original ptr was fine to dereference)?
What's wrong with |
Interesting. That would be weaker semantics than C, which does guarantee that different variables have distinct addresses (including locals), as long as those variables are all in scope. Has this been discussed before? |
I don't know but I think it might be worth it to open an issue to discuss that (it might be something worth guaranteeing, but if so, we should write that down somewhere). |
Okay, just filed #206. |
@gnzlbg @RalfJung Just as an example of the usize/align casting I saw right now: https://github.com/BurntSushi/rust-memchr/blob/master/src/x86/avx.rs#L41 |
That LGTM :/ |
@gnzlbg I didn't say it's not. I just couldn't find any official reference docs defining this casting (or not defining it). |
What would you like to have documented? Pointer to integer casts, integer to pointer casts, and integer arithmetic, are all safe. AFAICT the only unsafe operation there is dereferencing a raw pointer, and that's already documented in the reference and in the nomicon. |
@gnzlbg |
Your examples have only one unsafe operation: a pointer dereference. In the reference and the nomicon we document that, when you dereference a pointer, that pointer must not be null, it must be aligned for its type, it must be dereferenceable for the size of the type, and it must point to memory containing a valid value of the type. When you write the So is |
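As a rough illustration of the requirements listed above, here is a sketch with a hypothetical helper `checked_read`. Only the null and alignment conditions can be tested at runtime. Whether the pointer is dereferenceable remains the caller's obligation, which is why the function is marked `unsafe`.

```rust
use std::mem::align_of;

/// Hypothetical helper. Returns `None` for null or misaligned pointers.
///
/// # Safety
/// If `ptr` is non-null and aligned, it must point to a live, initialized `T`.
unsafe fn checked_read<T: Copy>(ptr: *const T) -> Option<T> {
    if ptr.is_null() || (ptr as usize) % align_of::<T>() != 0 {
        return None;
    }
    // SAFETY: non-null and aligned were checked above; the caller guarantees
    // the rest. This dereference is the only unsafe operation.
    Some(unsafe { *ptr })
}

fn main() {
    let x = 5u32;
    let got = unsafe { checked_read(&x as *const u32) };
    println!("{:?}", got);
    let none = unsafe { checked_read(std::ptr::null::<u32>()) };
    println!("{:?}", none);
}
```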
@gnzlbg OK, so I'll rephrase my questions accordingly :)
|
See https://github.com/rust-lang-nursery/reference/blob/master/src/expressions/operator-expr.md#type-cast-expressions - Casting a pointer to an |
Yes, modulo is better. Read the docs for
Yes it is; this follows directly from what alignment is. The address represented by the pointer must be divisible by the alignment without remainder. Is your question here how alignment is defined? That is something we could clarify in the Reference / Nomicon, I suppose. However, that doesn't seem to be the problem; you seem to know alignment is about being divisible without a remainder. But then I am surprised by these questions. I'm afraid if we start to list the answers to questions like "after I checked The only document that could answer such questions conclusively is a proper spec, at least a partial one, describing the behavior of these expressions. But then that would be really hard to read as well.
Correction: we don't require that for raw ptr derefs. Validity of values only comes up when "producing a value of some type" (what we call "typed copy" in the UCG). So you can do
Agreed, though "address of" sounds like Also note (this is mostly directed to anyone reading along), casting a ptr to To go slightly meta, I think we discussed most of the points in this issue. What do we want to do before closing it? I feel this is as good an opportunity as any to start a "FAQ" document in this repo where we can collect answers to various specific questions that come up, and that do not fit the "discussion topic" scheme. So the issue here could be closed by adding answers to some of these questions to the FAQ (I am not sure if all of them are eligible, e.g. I don't think we should repeat the |
Another question. |
Because the C standard doesn't guarantee |
Since that works on stable Rust, and stable Rust does not have undefined behavior, I'd say that it has to be. |
@gnzlbg I meant that this is for passing to FFI that will cast it back to a pointer and dereference it (i.e. syscalls) |
In Rust you can cast a pointer to a pointer-sized int, and you can pass that int to FFI. It's up to the unknown code on the other side to make sure it only does correct things with it, but from Rust's point of view, you can cast pointers to int and back without problems. |
It's weird to me that @bjorn3 is saying that long isn't guaranteed to be pointer-sized, because musl and the kernel use it in syscalls even for pointers. So I'm not really sure what to make of it. And should I use isize or c_long for registers in syscalls? |
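A minimal sketch of such a round trip, kept entirely on the Rust side. The function name `read_via_usize` is made up for this example. The casts are safe; only the final dereference needs `unsafe`.

```rust
/// Hypothetical helper: casts a reference to an integer and back, then reads
/// through the resulting raw pointer.
fn read_via_usize(value: &u32) -> u32 {
    let addr = value as *const u32 as usize; // pointer to integer: safe
    let ptr = addr as *const u32; // integer to pointer: safe
    // SAFETY: `ptr` has the address of `value`, a live, aligned and
    // initialized `u32` that outlives this call.
    unsafe { *ptr }
}

fn main() {
    let x = 7u32;
    println!("{}", read_via_usize(&x));
}
```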
@bjorn3 is probably referring to what the C standard guarantees, which is that A specific platform, like Linux + x86_64, can offer more guarantees. These guarantees just aren't necessarily portable to other platforms. |
Indeed. The Linux kernel (and apparently musl as well) pervasively assumes that For syscall code which is both arch- and platform-specific, you can use whatever you want – Edit: If you're wondering why Linux does that: |
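To see what a given target actually provides, one can print the sizes. This is only a sketch: the helper `long_is_pointer_sized` is hypothetical, and its result is a property of the platform being compiled for, not a portable guarantee.

```rust
use std::mem::size_of;
use std::os::raw::c_long;

/// Hypothetical helper: does C's `long` have the size of a pointer here?
fn long_is_pointer_sized() -> bool {
    size_of::<c_long>() == size_of::<*const ()>()
}

fn main() {
    println!(
        "c_long: {} bytes, pointer: {} bytes, usize: {} bytes",
        size_of::<c_long>(),
        size_of::<*const ()>(),
        size_of::<usize>()
    );
    // True on x86_64 Linux, but for example not on 64-bit Windows.
    println!("c_long is pointer-sized here: {}", long_is_pointer_sized());
}
```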
Thanks. That clarifies it for me. |
I think the questions here have been answered. |
Hi,
I've read quite a lot of the information out there and I still have a bunch of open questions about pointers. Answering them will help me, but I also think it would be good to add the answers to the docs for other people.
This list isn't exhaustive and I'll try to add stuff I see and don't know the answer to:
1. const variable. -> Is fine, but may result in different pointers to the same variable (see #205 (comment)).
2. const variable through raw pointer casting. -> Not allowed (see #205 (comment)).
3. usize and back.
4. ptr as usize. (Related to 3; less relevant now that we have align_offset, but people still do it in the wild.)