Add two new pointer-sized integer types; uptr and iptr #1635

strega-nil

usize and isize serve dual purposes; this RFC splits the purposes apart into two types.

@zofrex

zofrex commented Jun 2, 2016

Rendered

@Ericson2314
Contributor

It would be a lot of churn, but I think it's worth it. Even on "normal" architectures, it allows types to better explain intent, so I don't think this adds an extra mental burden there at all but rather does the opposite. Furthermore, they make some UB rules about pointers vs indexes a bit more intuitive when even in their "rawest form" they are different types.

Ultimately, just as Rust makes regular systems programming a lot less scary, I'd hope this would make embedded systems programming less scary.

@eddyb
Member

eddyb commented Jun 3, 2016

I'd be fine with this if it doesn't change any public APIs.

@Ericson2314
Contributor

@eddyb In std I hope that's true. But wasn't libc just stabilized?

@eddyb
Member

eddyb commented Jun 3, 2016

@Ericson2314 libc is versioned on crates.io AFAIK.

@Ericson2314
Contributor

Ah whew, good point.

@nrc added the T-lang label (Relevant to the language team, which will review and decide on the RFC.) Jun 3, 2016
@durka
Contributor

durka commented Jun 3, 2016

Given that uptr and iptr will not be defined on platforms like the CHERI, how will portable code be written? It seems like we need another #[cfg(target_has_pointer_to_integer_casts)].

As for point (2), new lints aren't breaking changes unless they are going to be upgraded to errors eventually.

Point (7) sounds like needless complexity and a source of gotchas (literals being different from variables). Can we just rely on std::ptr::null(), or even uptr::null()?

Finally, it would be good to see more code examples of current code that would become wrong, and the good code that would replace it using these new types. I do not understand the one example given. Why should the length of a slice be a pointer-sized-integer?

@strega-nil
Author

@durka

You cannot write portable code that converts from pointer types to integer types. It's unfortunate but true. Going back is even less portable.

The new lint isn't the breaking change. The breaking change is that usize is currently a pointer-sized type. The lint is in order to inform people that there is a breaking change, and also to catch people who are using the "wrong" integer types.

No, we can't. First is in consts: you can't use std::ptr::null() in a const context, at least on stable. Second, with no_core, how do you get a std::ptr::null()?

Because that's currently what usize is. That's the most correct signature for that function (and you can see in rusty-cheddar, what some people actually use for usize). It's not that it should be a pointer sized integer, it's that that's what usize is, and usize serves dual purposes. It's very confusing when writing FFI bindings, I can tell you :)
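To make the "dual purposes" point concrete, here is a minimal sketch in current Rust, where one slice yields two `usize` values with unrelated meanings. The helper name `two_roles` is hypothetical and chosen only for this illustration.

```rust
use std::mem::size_of;

/// Returns the two roles a `usize` can play for one slice:
/// (address of the first element, number of elements).
/// `two_roles` is a hypothetical helper for this sketch.
fn two_roles(data: &[u32]) -> (usize, usize) {
    let address = data.as_ptr() as usize; // pointer-sized integer role
    let length = data.len(); // object-size role
    (address, length)
}

fn main() {
    let data = [10u32, 20, 30];
    let (address, length) = two_roles(&data);
    // Both values have the same type, so the compiler cannot tell them apart.
    println!("address = {address:#x}, length = {length}");
    // On mainstream targets the two roles also share one width.
    assert_eq!(size_of::<usize>(), size_of::<*const u32>());
}
```

Nothing in the type system stops a caller from swapping the two values, which is the ambiguity the comment describes for FFI signatures.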


We want to support the embedded space, and new CPUs like the CHERI. These CPUs
do not support `usize` == `uptr`, and, in the case of the CHERI, don't support
`uptr` at all. Most CPUs don't actually support the idea that `uptr == usize`:
Contributor

Most CPUs don't actually support the idea that uptr == usize: just the currently popular ones.

[citation needed]

Author

I don't have any, it was just a nice phrase I heard; I'll take it out.

@oli-obk
Contributor

oli-obk commented Jun 3, 2016

First is in consts: you can't use std::ptr::null() in a const context, at least on stable.

That's a non-issue, since it will be stable some day

Second, with no_core, how do you get a std::ptr::null()?

by calling whatever intrinsic std::ptr::null() will use.


I don't really get why anyone wants to convert pointers to numbers... Can't you just work with *const T and *mut T and use the offset methods? Or if there's something missing, add a new method to the raw pointer types?
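As an illustration of that suggestion, the sketch below walks an array and measures a distance using only raw-pointer methods (`add` and `offset_from`, both in the standard library), with no pointer-to-integer cast. The helper names `sum_raw` and `distance` are hypothetical.

```rust
/// Sums `len` values starting at `start` using only raw-pointer methods.
///
/// # Safety
/// `start` must point to `len` initialized, readable `i32` values.
unsafe fn sum_raw(start: *const i32, len: usize) -> i32 {
    let mut total = 0;
    for i in 0..len {
        // `add` moves the pointer by `i` elements, not bytes.
        total += unsafe { *start.add(i) };
    }
    total
}

/// Distance in elements between two pointers into the same array.
///
/// # Safety
/// Both pointers must be derived from the same allocation.
unsafe fn distance(from: *const i32, to: *const i32) -> isize {
    unsafe { to.offset_from(from) }
}

fn main() {
    let values = [1, 2, 3, 4];
    let first = values.as_ptr();
    let last = unsafe { first.add(3) };
    println!("sum = {}", unsafe { sum_raw(first, values.len()) });
    println!("distance = {}", unsafe { distance(first, last) });
}
```

This covers iteration and subtraction; it does not cover uses such as hashing an address or tagging low bits, which are the cases where an integer form is usually requested.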

@durka
Contributor

durka commented Jun 3, 2016

@ubsan

You cannot write portable code that converts from pointer types to integer types. It's unfortunate but true. Going back is even less portable.

Yes, I know. That's the problem this RFC is trying to solve, I think. But you propose to introduce some new types that aren't even defined on some platforms. By "portable" I meant "compiles on multiple platforms", not necessarily "does the same thing on multiple platforms". It seems to need a cfg that tells you whether the types even exist, unless you have an exhaustive list so you can do #[cfg(target_arch = "cheri")], #[cfg(target_arch = "whatever")], and #[cfg(not(any(target_arch = "cheri", target_arch = "whatever")))] but that seems tiresome/un-future-proof. So I'm asking, what is the model for writing code that uses uptr/iptr on platforms that support them, and something else (not necessarily something sensible -- maybe println!("platform not supported"); abort();) on platforms that don't.
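The shape of code being asked about can be sketched with a cfg that exists today. This example uses the real `target_pointer_width` option purely as a stand-in; the `target_has_pointer_to_integer_casts` option mentioned earlier in the thread is a proposal and does not exist, and `address_of` is a hypothetical helper.

```rust
/// Converts a pointer to an integer address on targets in the allow-list.
/// `address_of` is a hypothetical helper for this sketch.
#[cfg(any(target_pointer_width = "32", target_pointer_width = "64"))]
fn address_of<T>(ptr: *const T) -> Option<usize> {
    Some(ptr as usize)
}

/// Fallback for targets outside the list above: report "unsupported".
#[cfg(not(any(target_pointer_width = "32", target_pointer_width = "64")))]
fn address_of<T>(_ptr: *const T) -> Option<usize> {
    None
}

fn main() {
    let value = 7u8;
    match address_of(&value) {
        Some(addr) => println!("address: {addr:#x}"),
        None => println!("platform not supported"),
    }
}
```

Both definitions share one signature, so callers compile everywhere and only observe the difference at run time.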

The new lint isn't the breaking change. The breaking change is that usize is currently a pointer-sized type. The lint is in order to inform people that there is a breaking change, and also to catch people who are using the "wrong" integer types.

OK, it would be good to make it clearer in the detailed design what the breaking change is, since as written it seems to say that the warning on casts is the change in question. I guess the breaking change is actually point (1) more than point (2)?

No, we can't. First is in consts: you can't use std::ptr::null() in a const context, at least on stable. Second, with no_core, how do you get a std::ptr::null()?

These are not really answering the substance of my objection :) which is that adding complexity and subtle differences between 0 as *const _/let x = 0; x as *const _ seems like a mistake. We can find some way to use things in const context, for example making std::ptr::null() a const fn once those are implemented better, or introducing uptr::NULL once associated consts are working.

Because that's currently what usize is. That's the most correct signature for that function (and you can see in rusty-cheddar, what some people actually use for usize). It's not that it should be a pointer sized integer, it's that that's what usize is, and usize serves dual purposes. It's very confusing when writing FFI bindings, I can tell you :)

So as I said, the example is confusing. Please explain better which one currently generates right and wrong code, and what the function signature would look like using your new types.

Thanks for responding to all my questions so soon :D

@petrochenkov
Contributor

petrochenkov commented Jun 3, 2016

Any concrete and practical plans to support platforms with sizeof(ptr) != sizeof(uN)? CHERI seems to be a highly experimental hardware.
Separation of usize and uptr is a serious regression from the current simple model, I wouldn't go this way without concrete plans for platform support. Experimental support can always be #[cfg]ed out in a custom rustc fork without official endorsement.
IIRC, there was a paragraph in the old FAQ, explaining that Rust isn't intended to support all possible exotic platforms that C supports and it's a tradeoff allowing to have a simpler model for integers, pointers, etc.

@anp
Member

anp commented Jun 3, 2016

The title is "Add two new pointer types..." -- should this maybe be "Add two new pointer-sized integer types..." ?

@kennytm
Member

kennytm commented Jun 3, 2016

I don't see the point of this RFC:

  1. The motivation is that usize ≠ uptr on CHERI
  2. But uptr won't be defined on CHERI according to detail design 4

So the RFC won't benefit CHERI at all??

If the point is to prevent the pointer ↔ integer cast, one could just add a lint, without introducing any new types.

If the motivation was FFI semantics of usize on platforms where size_t ≠ uintptr_t (if we want to support them), then stop doing that and use the libc types instead...

@mahkoh
Contributor

mahkoh commented Jun 3, 2016

An amusing RFC. I brought this topic up back in 2013 but was told that much of the stdlib relied on the fact that size_t == uintptr_t. But I can't imagine that a platform that is fundamentally incompatible with the C standard is relevant in any way. After all, the standard says that uintptr_t exists on all platforms and supports round-tripping of pointers. (Unless I'm mistaken.)

@mahkoh
Contributor

mahkoh commented Jun 3, 2016

That being said, the linux kernel uses multiple address spaces (that is, pointers that carry compile time information so that they don't accidentally get mixed.) That might (in some limited sense) justify the idea that pointers are not just integers (in the sense of bi-directional conversion.) This is not quite the same as segmented memory but does (in some sense) justify the claim of the RFC that not all programs use only one flat address space.

@Ericson2314
Contributor

Ericson2314 commented Jun 3, 2016

@mahkoh ah good point. There is the Mill's proposed ABI (and I assume probably others do this too) where virtual memory is done with one address space but per-process permissions, so the max object size in a "conceptual per-process address space" is indeed far smaller.

@mahkoh
Contributor

mahkoh commented Jun 3, 2016

This document describes a version of the GCC features used by the linux kernel: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1275.pdf (p 28 ff)

In particular it describes the case where pointers to different address spaces don't necessarily have the same size. This might or might not be relevant to this RFC but it describes some of the thoughts on pointers of people who work in embedded software.

@mahkoh
Contributor

mahkoh commented Jun 3, 2016

If one considers the case of multiple address spaces of different sizes, then one might also consider the case where the size of size_t depends on the address space. This seems to go beyond this RFC. Since the document posted above (2007) is not part of C11, it's not clear how useful this idea is and whether it should be considered in the overall design of usize vs. uptr.

@kennytm
Member

kennytm commented Jun 3, 2016

@mahkoh If there are multiple address spaces the *const/*mut types aren't sufficient either. There could be RawPtr<Space, T> which disables the integer → pointer conversion or makes it unsafe. It doesn't make a standard uptr/iptr type necessary.

@mahkoh
Contributor

mahkoh commented Jun 3, 2016

It doesn't make a standard uptr/iptr type necessary.

That is clear since one can simply choose usize = max(size_t, uintptr_t) on every platform. It is already the case that, in general, not all usize values are valid pointers or object sizes. One interesting thing mentioned in the RFC is that sizeof(size_t) is not necessarily sizeof(void *). Maybe that is the one thing where the documentation should be updated. I do not know if there is a case currently where size_t > uintptr_t (or even size_t != uintptr_t). But the guarantee that usize has the same size as ordinary pointers is certainly used in some contexts (e.g. transmute.)

It seems that the previous sentence already implies that any change that makes the size of usize unequal to the size of ordinary pointers is a breaking change.

Another interesting idea from the C standard is that function pointers are not necessarily related to ordinary pointers. I don't know if there are any platforms where this is the case. But if one considers the case where a platform has multiple address spaces, then one might also consider a case where code and data don't live in the same address spaces. This might also be something to incorporate into the RFC.
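The kind of code that leans on these size guarantees can be sketched as follows. `transmute` refuses to compile when the source and target types differ in size, so the helper below (hypothetically named `pointer_bits`) doubles as a compile-time check of the assumption that `usize` matches a data pointer. The function-pointer size comparison holds on mainstream targets; it is shown here as an observation, not as a documented guarantee.

```rust
use std::mem::{size_of, transmute};

/// Reinterprets a data pointer as a `usize` without an `as` cast.
/// This only compiles because both types have the same size on this target.
/// (An `as` cast is the preferred way to do this; `transmute` is used here
/// only to show the size dependency.)
fn pointer_bits(ptr: *const u8) -> usize {
    unsafe { transmute::<*const u8, usize>(ptr) }
}

fn main() {
    // Data pointers and usize share a width on mainstream targets.
    assert_eq!(size_of::<usize>(), size_of::<*const u8>());
    // So do function pointers and data pointers on those targets.
    assert_eq!(size_of::<fn()>(), size_of::<*const ()>());

    let byte = 5u8;
    let ptr: *const u8 = &byte;
    assert_eq!(pointer_bits(ptr), ptr as usize);
}
```

If `usize` were ever narrower or wider than a data pointer, `pointer_bits` would stop compiling, which is the breakage the comment anticipates.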

@mahkoh
Contributor

mahkoh commented Jun 3, 2016

I recall that there was some discussion related to function pointers recently where it was recommended that functions be cast to usize before being transmuted. I believe that @nikomatsakis was the one talking about this. That is certainly problematic when usize != uintptr_t.

@strega-nil changed the title from "Add two new pointer types; uptr and iptr" to "Add two new pointer-sized integer types; uptr and iptr" on Jun 4, 2016
@strega-nil
Author

strega-nil commented Jun 4, 2016

Alright, I'm going to respond to different people in multiple comments.

@durka The idea is that they're just completely unsupported, which means they're just not defined, on platforms like CHERI. Having some kind of cfg is a good idea.

Right, the breaking change is that "an integer which has the same number of bits as a pointer" isn't how usize would be defined. It'd be defined as "An integer type which is large enough to hold object size".

So, the issue there is that, with 0 as *const i32, it's obvious that it's a NULL pointer cast, and we can guarantee that it will cast to a NULL pointer. let x = 0uptr; x as *const i32 means "I want to cast the integer value 0 to a pointer", which in most cases means NULL, but in some cases on some architectures, does not. We could guarantee that [integer value 0] as *const i32 always results in the NULL pointer, regardless of the actual bit pattern of NULL, but then what if 0x00000000 is a valid bit pattern for pointers on this architecture? Then you're creating a NULL pointer when you wanted the zero pointer. Examples of such architectures include:

The Prime 50 series used segment 07777, offset 0 for the null pointer, at least for PL/I.

Some Honeywell-Bull mainframes use the bit pattern 06000 for (internal) null pointers.

The CDC Cyber 180 Series has 48-bit pointers consisting of a ring, segment, and offset. Most users (in ring 11) have null pointers of 0xB00000000000. It was common on old CDC ones-complement machines to use an all-one-bits word as a special flag for all kinds of data, including invalid addresses.

This also gives an example of an architecture with 48-bit pointers, which wouldn't be supported by our current model, unless we want to use a 48-bit usize. Note that two of these companies are defunct, although Bull is still making servers, supercomputers, and smartphones. It's unlikely to matter in practice; we could also guarantee a 0 bitpattern for NULL pointers. I'm uncomfortable making that guarantee in this RFC, however.

Neither of them is correct. The most correct would, in fact, be something like

void takes_slice_from_c(int const* ptr, void* len);

because a usize is guaranteed to be the same size as a pointer. Neither uintptr_t nor size_t have this guarantee, so Rust cannot follow C here, without using uN types (this is the main driving force behind this RFC, by the way). If nothing else comes out of this RFC, that definition should be changed.

The correct signature, assuming rustc chooses to make usize == size_t (and uptr == uintptr_t) on all supported platforms (which is a guarantee it should make):

void takes_slice_from_c(int const* ptr, size_t len);

(Rust doesn't change at all)
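For reference, the Rust side of such a signature might look like the sketch below. It assumes, as the comment does, that `usize` and C's `size_t` coincide on the target. The function `sum_slice_from_c` is a hypothetical variant that returns a value so the example can be checked; it is not the function from the RFC.

```rust
use std::os::raw::c_int;

/// Hypothetical Rust definition matching the C declaration
/// `int sum_slice_from_c(int const* ptr, size_t len);`
/// under the assumption that `usize` and `size_t` are the same type.
///
/// # Safety
/// `ptr` must point to `len` initialized, readable `c_int` values.
pub unsafe extern "C" fn sum_slice_from_c(ptr: *const c_int, len: usize) -> c_int {
    // Rebuild the slice from its two C-level parts.
    let slice = unsafe { std::slice::from_raw_parts(ptr, len) };
    slice.iter().sum()
}

fn main() {
    let values: [c_int; 3] = [1, 2, 3];
    let total = unsafe { sum_slice_from_c(values.as_ptr(), values.len()) };
    println!("total = {total}");
}
```

Here `len` is used strictly as an object size, which is the reading of `usize` that the comment argues the definition should settle on.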

@strega-nil
Author

strega-nil commented Jun 4, 2016

@petrochenkov

No, there are no concrete plans at this time. Those are examples to show how broken our current model is, not really intended to be the driving force behind the change. The main driving force is that usize currently serves a dual purpose; the object size, and the pointer integer type. This means that when you write functions which take a usize, you do not know what it is; is it an object size, of the kind in &[T] or Vec<T>, or is it some sort of pointer which has been cast, and you'd like to cast back.

The only thing I don't see Rust supporting is architectures where different pointer types are differently sized, and even then, you could make everything *const () sized. We shouldn't stop ourselves prematurely from supporting everything we can, so we can replace C everywhere.

@strega-nil
Author

@kennytm That's really not the major motivation. The motivation is threefold: 1) segmented architectures, where object size isn't equal to pointer size (which we want to support, because replacing C is 👍), 2) on architectures where converting from pointer to integer and back isn't supported, error, instead of silently failing, and 3) (and most important) currently in the language, there's no way to semantically differentiate between a pointer integer and a size integer. That difference should be made, in my opinion.
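Point 3 can be approximated in today's Rust with library-level newtypes, which gives a feel for what a language-level split would buy. This is only a sketch: `PtrInt` and `ObjSize` are hypothetical names standing in for the roles of `uptr` and `usize`, and they are not part of the RFC.

```rust
/// An integer obtained from a pointer (the role `uptr` would play).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PtrInt(usize);

/// A size or index into an object (the role `usize` would keep).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ObjSize(usize);

impl PtrInt {
    fn from_ptr<T>(ptr: *const T) -> Self {
        PtrInt(ptr as usize)
    }

    fn as_ptr<T>(self) -> *const T {
        self.0 as *const T
    }
}

/// Accepts only a size; passing a `PtrInt` here is a compile error.
fn first_n(data: &[u8], n: ObjSize) -> &[u8] {
    &data[..n.0.min(data.len())]
}

fn main() {
    let data = [1u8, 2, 3];
    let addr = PtrInt::from_ptr(data.as_ptr());
    println!("{:?} {:?}", addr, first_n(&data, ObjSize(2)));
    // first_n(&data, addr); // would not compile: expected ObjSize, found PtrInt
}
```

The trade-off raised elsewhere in the thread applies here too: every boundary between the two roles now needs an explicit conversion.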

Also, our definition of usize is just plain broken.

@strega-nil
Author

@mahkoh

Right, making size_of::<usize>() != size_of::<*const ()>() in all cases would be breaking, but it is a change I believe we have to make.

Function pointers are already not guaranteed to be the size of regular pointers.

I'm... confused what you mean by "functions be cast to usize before being transmuted".

@durka
Contributor

durka commented Jun 4, 2016

If we can't make let x = 0; x as *const _ make sense, then we shouldn't guarantee anything about 0 as *const _ either (maybe lint it), and just provide constants or const fns.
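For comparison, this is how the three spellings behave in Rust as it exists today, not under the rule the RFC proposes. `std::ptr::null` is a `const fn` in current stable Rust, so the "constants or const fns" route is available. The helper names are hypothetical.

```rust
use std::ptr;

// `std::ptr::null` is a `const fn`, so it can initialize a constant.
const NULL_I32: *const i32 = ptr::null();

/// Null pointer written as a literal-zero cast.
fn null_from_literal() -> *const i32 {
    0 as *const i32
}

/// Null pointer produced by casting a zero-valued variable.
fn null_from_variable() -> *const i32 {
    let x: usize = 0;
    x as *const i32
}

fn main() {
    // In current Rust all three are null pointers.
    assert!(NULL_I32.is_null());
    assert!(null_from_literal().is_null());
    assert!(null_from_variable().is_null());
}
```

The RFC's point (7) would have made the literal and variable forms differ on targets whose null pointer is not all-zero bits; that distinction is what the comment objects to.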

@strega-nil
Author

@durka Okay, if you think so. Now that I'm actually looking, I think it'd be fine to say that "bits 0 is NULL", so we may just allow let x = 0; x as *const _. I don't want to do that just yet, though.

@strega-nil
Author

Would anyone mind a "How do we teach this" section? I think #1636 is an awesome idea, and I'd love to implement it in my own RFC.

@durka
Contributor

durka commented Jun 4, 2016

Such a section would be great, especially since this is adding complexity and subtlety.

@mahkoh
Contributor

mahkoh commented Jun 4, 2016

Function pointers are already not guaranteed to be the size of regular pointers.

[citation needed]

I'm... confused what you mean by "functions be cast to usize before being transmuted".

transmute(f as usize)

@strega-nil
Author

strega-nil commented Jun 4, 2016

@mahkoh

Never specified otherwise, therefore it's not a guarantee (unless I'm completely wrong, but I can't find any guarantees about it).

You shouldn't be transmuting like that. It won't break code, but you can just do f as *const (). The other way, of course, requires a transmute, but you shouldn't be casting to usize first.
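The two directions described here can be sketched as follows. The helper names `to_data_ptr` and `to_fn_ptr` are hypothetical, and the round trip assumes a target where function and data pointers have the same size, which the thread notes is not universal.

```rust
use std::mem::transmute;

fn double(x: i32) -> i32 {
    x * 2
}

/// Function pointer to data pointer: a plain `as` cast is enough.
fn to_data_ptr(f: fn(i32) -> i32) -> *const () {
    f as *const ()
}

/// Data pointer to function pointer: this direction needs `transmute`.
///
/// # Safety
/// `ptr` must have been produced from a `fn(i32) -> i32`.
unsafe fn to_fn_ptr(ptr: *const ()) -> fn(i32) -> i32 {
    unsafe { transmute::<*const (), fn(i32) -> i32>(ptr) }
}

fn main() {
    let erased = to_data_ptr(double);
    let restored = unsafe { to_fn_ptr(erased) };
    println!("{}", restored(21));
}
```

No `usize` appears anywhere in the round trip, which is the point of the advice.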

@mahkoh
Contributor

mahkoh commented Jun 4, 2016

The usize type is an unsigned integer type with the same number of bits as the platform's pointer type. It can represent every memory address in the process.

Clearly this implies that there is a single platform pointer type including function pointers.

You shouldn't be transmuting like that.

Why? And more importantly: Why does @nikomatsakis say that one should transmute like that?

It won't break code, but you can just do f as *const ().

That makes no sense since you just said that function pointers and data pointers are incompatible. If anything, this cast has undefined behavior.

@strega-nil
Author

strega-nil commented Jun 4, 2016

@mahkoh

Look, what's behind this? Of course function and data pointers are incompatible. It's a cast, however, it can happen in safe code, it's not UB.

I very much doubt he ever said what you think he said, and if he did, he's wrong. Nobody is perfect.

That doesn't mean anything. Implications and reality are very, very different.

Edit: To be clear, what he said was

fn f(a: A);
let f: fn(B) = std::mem::transmute(f as usize);

instead of using

fn f(a: A);
let f: fn(B) = std::mem::transmute(f as fn(A));

in case you wanted to be lazy, due to the fn-types-are-zero-sized thing.
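The "fn-types-are-zero-sized thing" refers to the fact that naming a function gives a value of a unique, zero-sized function item type, and only a cast or coercion turns it into a pointer-sized `fn(...)` value. A small sketch, with hypothetical helper names:

```rust
use std::mem::{size_of, size_of_val};

fn negate(x: i32) -> i32 {
    -x
}

/// Size of the value you get by naming the function: a zero-sized item type.
fn fn_item_size() -> usize {
    size_of_val(&negate)
}

/// Size of a function pointer with `negate`'s signature.
fn fn_pointer_size() -> usize {
    size_of::<fn(i32) -> i32>()
}

fn main() {
    println!("item: {} bytes", fn_item_size());
    println!("pointer: {} bytes", fn_pointer_size());
    // The cast converts the zero-sized item into an actual pointer.
    let ptr = negate as fn(i32) -> i32;
    println!("{}", ptr(5));
}
```

Because `transmute` requires equal sizes, transmuting the bare item to a function pointer type is rejected, which is why a cast (to `fn(A)` or, less ideally, to `usize`) has to come first.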

@mahkoh
Contributor

mahkoh commented Jun 4, 2016

Of course function and data pointers are incompatible.

I literally quoted a part of the reference that talks about the ONE pointer type of a platform. And this is the natural way of looking at things since all relevant platforms do not distinguish between the two types. And it is also often recommended to use usize in place of function pointers since rust lacks function pointers (in the sense of not being a reference.) Of course all of this is common knowledge so I'm surprised that you think the opposite is common knowledge and call it "of course."

Maybe you could point out where you get this from.

I very much doubt he ever said what you think he said, and if he did, he's wrong. Nobody is perfect.

I'm quite sure he did but it's best to simply wait for him to comment on this. (But if you insist then I can also search for the comment. It should not take much time.)

In the meantime, since you've already acknowledged that there is no such thing as one pointer type (at present we have "ordinary data pointers", "function pointers", and possibly "data pointers that point into a different address space and possibly have a different pointer size"). I don't quite see why you would want to add types called "uptr" to the language when they clearly don't correspond to pointers. It is not clear if such a type would have to be compatible with only data pointers, only function pointers, or possibly both. Would a platform where data pointers and function pointers have a different size have this type? Or is it strictly for usage with data pointers? If so then the name seems quite confusing.

@mahkoh
Contributor

mahkoh commented Jun 4, 2016

Since rust doesn't have function pointers it's probably impossible to use it on platforms where function pointers are incompatible because the concept of a function pointer (not reference) simply cannot be expressed.

@eternaleye

eternaleye commented Jun 7, 2016

@mahkoh

After all, the standard says that uintptr_t exists on all platforms and supports round-tripping of pointers. (Unless I'm mistaken.)

https://internals.rust-lang.org/t/tootsie-pop-model-for-unsafe-code/3522/39

It supports round-tripping, but only for a very, very narrow (and honestly useless) definition of round-tripping.

The only operation that needs to behave sensibly on a round-tripped pointer is comparing as equal to the original, which permits the round-trip result being a pointer to a zero-length subobject at the same address as the beginning of the original pointer's pointee.

Such a pointer cannot be meaningfully dereferenced; my understanding is that CHERI takes advantage of this.
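The narrow guarantee being described, that a round-tripped pointer compares equal to the original, looks like this when written in Rust on a mainstream target. The helper `round_trip` is hypothetical, and the sketch deliberately checks only equality, since that is the one property the comment says the C standard promises.

```rust
/// Casts a pointer to an integer and back again.
fn round_trip(ptr: *const i32) -> *const i32 {
    let addr = ptr as usize;
    addr as *const i32
}

fn main() {
    let value = 42;
    let original: *const i32 = &value;
    let restored = round_trip(original);
    // Equality is the property under discussion; whether `restored` may be
    // dereferenced is a separate question that depends on the platform.
    assert_eq!(original, restored);
}
```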

@comex

comex commented Jun 7, 2016

Since rust doesn't have function pointers it's probably impossible to use it on platforms where function pointers are incompatible because the concept of a function pointer (not reference) simply cannot be expressed.

Wait, what? Are fn() types not function pointers? They're not references; they don't have lifetimes.

@mahkoh
Contributor

mahkoh commented Jun 7, 2016

It supports round-tripping, but only for a very, very narrow (and honestly useless) definition of round-tripping.

That interpretation seems far removed from practice: https://www.cl.cam.ac.uk/~pes20/cerberus/notes50-survey-discussion.html See question 5.

Wait, what? Are fn() types not function pointers?

They can be dereferenced without an unsafe block so they are certainly not pointers.

@comex

comex commented Jun 7, 2016

They can be dereferenced without an unsafe block so they are certainly not pointers.

In what way do they differ from pointers, other than not requiring unsafe? Are you implying that they inherit (data) references' requirement to always point to valid memory at the cost of UB? I guess that could be an issue in some cases if true, but I wouldn't take it as a given, especially since on the vast majority of architectures there is no benefit to making such an assumption...

By the way, for the record: one platform with differently sized data and function pointers is 16-bit x86, in some variants; on the other hand, POSIX forbids it.

@mahkoh
Contributor

mahkoh commented Jun 7, 2016

Are you implying that they inherit (data) references' requirement to always point to valid memory at the cost of UB?

If that were not the case then you could cause UB from safe code by calling such an invalid reference. Since that goes against the intention of rust, it seems reasonable to assume that creating such an object is UB.

POSIX forbids it.

The numeric values of function pointers are significant in some posix interfaces so it seems to require even more than just them being the same size. (Edit: Or maybe not since SIG_* are not necessarily defined to be numeric values. But I haven't bothered to look it up.)

@strega-nil
Author

strega-nil commented Jun 7, 2016

So, there's some confusion on the exact meaning of the different terms used here:

Pointer: A pointer type. Includes references (&T, &mut T) and raw pointers (*const T, *mut T) at least, and may include function pointers (fn(...)).

Reference: A type of pointer. Always points to valid data, guaranteed to never be null, not allowed to alias mutably (excluding UnsafeCell shenanigans): &T, &mut T.

Raw Pointer: Not guaranteed to point to valid data, not guaranteed to not be null, not guaranteed to not alias: *const T, *mut T.

Function pointers: Guaranteed to point to a valid callable function (probably, not actually sure on this), guaranteed to never be null, may be a pointer, may just have "pointer" in the name due to legacy reasons. It's not like they actually act or look like pointers. fn(), fn(T), fn(T, U), extern fn(), etc.
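One observable consequence of these definitions is the "never null" property: for references and function pointers the compiler can use the null bit pattern to represent `None`, so `Option` adds no space, while raw pointers may legitimately be null and get no such treatment. A minimal check, with the hypothetical helper `option_is_free`:

```rust
use std::mem::size_of;
use std::ptr;

/// True when wrapping `T` in `Option` costs no extra space,
/// which happens when `T` has an unused bit pattern such as null.
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // References and function pointers are never null.
    assert!(option_is_free::<&u8>());
    assert!(option_is_free::<fn()>());
    // Raw pointers may be null, so `Option` needs extra room.
    assert!(!option_is_free::<*const u8>());

    let raw: *const u8 = ptr::null();
    assert!(raw.is_null());
}
```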

@aturon self-assigned this Jun 23, 2016
@nikomatsakis
Contributor

Honestly, my biggest concern here is that I am very wary of turning out like C -- basically, my feeling in C is that there are a bazillion integer types and nobody knows how to use them (e.g., uintptr_t, ptrdiff_t, size_t, ssize_t, etc). I've basically never seen production code that doesn't at some point throw up its hands and do some dirty casts with a // FIXME that never gets fixed. (In part this is because, on most major platforms, these distinctions don't really matter.)

Moreover, whenever I've tried (in C) to be very precise about which of those different integer types I'm using, I wind up in a bind, because I find that (for some reason or another) I have some value that (e.g.) started out as a ptrdiff_t but winds up needing to get converted to size_t or something like that.

In contrast, I've basically found working in Rust to be a breath of fresh air. I guess the price I am paying for this is that my code is less portable, but I'm not sure how much this matters in practice. (There are some places where the same problems arise in Rust; typically when converting between usize and u32/u64. I think we could do better there, e.g. by allowing you to silently widen, and in particular to widen usize to u64 on all platforms.)

(Note that I feel basically the same about keeping a sharp distinction between "fn pointers", "other pointers", and "pointer-sized integers" -- it seems theoretically good, but in practice kind of a hassle, and doesn't seem to matter much in practice.)

That said, I think the portability thing is real. It'd be great to be able to target more platforms. To me, this all seems pretty related to @aturon's "pre-RFC" on how to handle platform support in the standard library. In particular, we may want some way to let people indicate that they intend to target more esoteric platforms, and thus opt-in to a certain amount of pain in the form of lints and the like, while still keeping the defaults relatively lax.
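The usize-to-u32/u64 friction mentioned above comes from the standard library not implementing `From<usize>` for `u64` (or the reverse), precisely because the relationship varies by platform. The portable spelling today is a checked conversion, sketched here with hypothetical helper names:

```rust
use std::convert::TryFrom;

/// Widens a length to `u64`; fails only where `usize` is wider than 64 bits.
fn widen(len: usize) -> Option<u64> {
    u64::try_from(len).ok()
}

/// Narrows a `u64` to `usize`; fails where the value does not fit.
fn narrow(n: u64) -> Option<usize> {
    usize::try_from(n).ok()
}

/// Narrows a length to `u32`; fails for lengths above `u32::MAX`.
fn to_u32(len: usize) -> Option<u32> {
    u32::try_from(len).ok()
}

fn main() {
    println!("{:?}", widen(5));
    println!("{:?}", narrow(7));
    println!("{:?}", to_u32(70_000));
}
```

A silent widening rule would remove the `Option` from `widen` on 32-bit and 64-bit targets, at the cost of code that no longer compiles unchanged on a wider target.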

@eternaleye

@nikomatsakis

and in particular to widen usize to u64 on all platforms.

Note that this would preclude any code relying on such behavior from supporting the RV128 variant of RISC-V in the future.

@nikomatsakis
Contributor

@eternaleye I spoke a bit loosely. I should have said "...on all platforms where that makes sense". But yes, this would make it more natural to target 32-bit or 64-bit than 128-bit. Targeting 128-bit would be an active choice (and I could imagine that in the future we might alter our lints or defaults once 128-bit is better established). The key point here is that we might have widening transforms or implicit integer conversions that vary from platform to platform. I think we should try to ensure that the defaults ensure portability across "common" architectures but not "all" architectures past and future.

@strega-nil
Author

strega-nil commented Jun 30, 2016

@nikomatsakis "Widening" transforms that make real Rust code not work in 20 years when many have switched to 128-bit are not a good idea. RV128 is only the first platform to support 128 bit.

As to iptr/uptr: it's confusing when going from Rust to C. See, neither intptr_t nor size_t actually make full sense as function arguments. We've defined usize|isize to be "the same size as a pointer". Now, let's look at a real platform: x86_16 (someone is working on a rust compiler for x86_16). Depending on the code model, that would mean that usize is either a: (tiny|small|medium) size_t, or (large|huge|compact) uintptr_t (this is due to the fact that intptr_t is always the same size: big enough to hold a void *far). We can probably assume that most projects will not use far pointers, so large model it is -- then the default of most programmers to go usize == size_t is incorrect. The same for many other 16-bit CPUs.

The CHERI is in a different situation; its pointers are 192 bits wide. intptr_t is 64 bits wide. size_t is also 64 bits wide. Will we really only allow interoperability with CHERI C code by taking usize parameters as a void*?

@joshtriplett
Member

The CHERI is in a different situation; its pointers are 192 bits wide. intptr_t is 64 bits wide.

Quoting the definition of intptr_t from the C99 standard, section 7.18.1.4:

The following type designates a signed integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:
intptr_t

Why does CHERI define an intptr_t incompatible with the C standard's definition? (And given that intptr_t is optional, why does CHERI define intptr_t at all, if it doesn't have an integer type the size of a pointer? Or alternatively, why not define an integer type the size of a pointer, even if it's an awkward type? C doesn't say intptr_t is the most efficient integer type to work with.)

@eternaleye

eternaleye commented Jul 28, 2016

@joshtriplett:

The phrasing @ubsan used is subtly incorrect, and likely comes from me being incautious in my phrasing when using CHERI to argue about pointer -> uintptr_t -> pointer conversions in unsafe code.

Specifically, on CHERI, both pointers and intptr_t are 64-bit - CHERI is a set of capability extensions on top of regular BERI MIPS. However, capabilities are 192 bits (or in CHERIv5, 256-bit with a 128-bit compressed form), and the interactions between pointers and capabilities influence dereferenceability, which was the question at hand in that discussion.

In particular, a pointer round-tripped through uintptr_t only has one behavior guaranteed by the spec: Compare as equal to the original pointer. Equality includes pointers to prefix subobjects, and that means a zero-length (non-dereferenceable) subobject can be used in order to prevent the possibility of capability violations by way of pointer arithmetic.

@strega-nil
Author

@eternaleye Actually, on further reading of the C standard, I would disagree with that assessment. I think the CHERI implementation is buggy, and they shouldn't implement intptr_t or uintptr_t.

@strega-nil
Author

This is unlikely to ever happen. I'm going to close it.

@strega-nil closed this Jul 28, 2016