rephrase UnsafeCell doc #48201

Merged
merged 10 commits into from
Mar 13, 2018

Conversation

NovemberZulu
Contributor

As shown by discussions on users.rust-lang.org [1], [2], the UnsafeCell doc is not totally clear. I tried to make the doc univocal regarding what is allowed and what is not. The edits are based on my understanding following [1].

Make UnsafeCell doc easier to follow
@rust-highfive
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @TimNN (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@pietroalbini pietroalbini added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 14, 2018
@kennytm kennytm added the A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools label Feb 16, 2018
@TimNN
Contributor

TimNN commented Feb 18, 2018

Re-assigning to docs team.

r? @steveklabnik

@rust-highfive rust-highfive assigned steveklabnik and unassigned TimNN Feb 18, 2018
@pietroalbini
Member

Thanks for the PR @NovemberZulu!

@steveklabnik, or someone else from @rust-lang/docs, can we get a review on this PR?

/// that there are no active mutable references or mutations when an immutable reference is obtained
/// from the cell. This is often done via runtime checks.
/// to do this. When `UnsafeCell<T>` _itself_ is immutably aliased, it is still safe to obtain
/// a mutable reference to its _interior_ and/or to mutate the interior. However, it is up to
Member

I'm not sure that putting italic in here brings anything.

Contributor Author

removed italic

@GuillaumeGomez
Member

Except for my little concern, it seems good to me.

remove italic as per @GuillaumeGomez suggestion
@steveklabnik
Member

steveklabnik commented Feb 27, 2018

I would really like someone from @rust-lang/libs or @rust-lang/lang to sign off here; changing an unsafe lang item's docs is something i'm super wary of doing without their input.

/// that there are no active mutable references or mutations when an immutable reference is obtained
/// from the cell. This is often done via runtime checks.
/// to do this. When `UnsafeCell<T>` itself is immutably aliased, it is still safe to obtain
/// a mutable reference to its interior and/or to mutate the interior. However, it is up to
Member

Can you reformat this as a list? Several of the entries are long, and the sentence becomes hard to follow.

/// to do this. When `UnsafeCell<T>` itself is immutably aliased, it is still safe to obtain
/// a mutable reference to its interior and/or to mutate the interior. However, it is up to
/// the abstraction designer to ensure that no two mutable references obtained this way are active
/// at the same time, there are no active immutable reference when a mutable reference is obtained
Member

@cramertj cramertj Feb 27, 2018

These rules are kind of confusingly stated. I think it should be something like the following:
(1) While a mutable reference exists, no second reference of any kind may exist.
(2) While a reference of any kind exists, all mutations must go through that reference. This means that mutations cannot occur while an immutable reference exists.

Contributor Author

I'll try to merge three rules into one, so it will be easier to follow.

@NovemberZulu
Contributor Author

As far as I understand -- and quite possibly I am wrong -- simply storing a "bad" reference is not a crime, it is just some bytes in memory, but actually using it is UB, so I talk about active references in the doc. What exactly counts as "using" as opposed to storing is another story, and should be defined in the memory model, I believe.

@nikomatsakis
Contributor

I tried to offer some feedback but found it harder than expected. I will say that I continue to be strongly opposed to the term "immutable reference" -- I know we use it a lot, but particularly in the context of UnsafeCell, it strikes me as quite confusing. I'd prefer that we either say "shared reference" or just &-reference (perhaps clearer).

I personally found the current form of "English-y paragraph" sort of confusing, to be honest. I think I'd prefer something that says like:

UnsafeCell is the only way that data reached via a &-reference can be correctly mutated: any data that is not stored within an UnsafeCell must be immutable when reachable via & reference. The UnsafeCell API itself is very simple: it gives you a raw pointer (*mut T) to its contents. It is up to you to use that raw pointer correctly.

The precise Rust aliasing rules are somewhat in flux, but the main points are not contentious:

  • If you create a safe reference with lifetime 'a (either a & or &mut reference) that is accessible by safe code (for example, because you returned it), then you must not access the data in any way that contradicts that reference for the remainder of 'a.
    • For example, that means that if you take the *mut T from an UnsafeCell<T> and cast it to an &T, then until that reference's lifetime expires, the data in T must remain immutable (modulo any UnsafeCell data found within T, of course). Similarly, if you create an &mut T reference that is released to safe code, then you must not access the data within the UnsafeCell until that reference expires.
  • At all times, you must avoid data races, meaning that if multiple threads have access to the same UnsafeCell, then any writes must have a proper happens-before relation to all other accesses (or use C++11 atomics).
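
A minimal sketch of the "raw pointer" point above, assuming a hypothetical single-threaded `Counter` type that is not part of this PR. It mutates through `&self` and stays within the rules only because it never hands a reference to the interior out to safe code:

```rust
use std::cell::UnsafeCell;

/// Hypothetical counter, used only for illustration.
struct Counter {
    value: UnsafeCell<u32>,
}

impl Counter {
    fn new() -> Self {
        Counter { value: UnsafeCell::new(0) }
    }

    /// Mutates through `&self` via the raw pointer from `UnsafeCell::get`.
    fn increment(&self) {
        let ptr: *mut u32 = self.value.get();
        // SAFETY: `Counter` is `!Sync` (it contains an `UnsafeCell`), and no
        // `&u32` or `&mut u32` into the cell is ever released to callers,
        // so nothing can alias this write.
        unsafe { *ptr += 1 };
    }

    /// Returns a copy of the interior, never a reference to it.
    fn get(&self) -> u32 {
        // SAFETY: same argument as in `increment`.
        unsafe { *self.value.get() }
    }
}

fn main() {
    let c = Counter::new();
    let a = &c;
    let b = &c; // two `&Counter` aliases are fine
    a.increment();
    b.increment();
    assert_eq!(c.get(), 2);
}
```

Returning copies instead of references is the design choice that keeps the safety argument short here; it is essentially what `Cell<T>` does for `Copy` types.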

@NovemberZulu
Contributor Author

I totally agree that proper wording here is a challenge -- the text must be univocal, concise, yet as simple as possible. I think that "shared reference" might be a bit confusing, because of the "shared with what?" question. &-reference is a bit of a jargon term, but &-reference contrasts with &mut-reference very nicely.

I think the best way is to state explicitly allowed code flows, at least for single-thread access -- multi-thread access is way more complicated (does Rust strictly define happens-before relation?) -- something like this:

UnsafeCell is the only way in core Rust that data reached via a &-reference can be correctly mutated: any data that is not stored within an UnsafeCell must be immutable when reachable via a &-reference. All other types that allow internal mutability, such as Cell<T> and RefCell<T>, use UnsafeCell to wrap their internal data. The UnsafeCell API itself is very simple: it gives you a raw pointer (*mut T) to its contents. It is up to you as the abstraction designer to use that raw pointer correctly.

The precise Rust aliasing rules are somewhat in flux, but the main points are not contentious:

  • If you create a safe reference with lifetime 'a (either a & or &mut reference) that is accessible by safe code (for example, because you returned it), then you must not access the data in any way that contradicts that reference for the remainder of 'a. For example, that means that if you take the *mut T from an UnsafeCell<T> and cast it to an &T, then until that reference's lifetime expires, the data in T must remain immutable (modulo any UnsafeCell data found within T, of course). Similarly, if you create an &mut T reference that is released to safe code, then you must not access the data within the UnsafeCell until that reference expires. The following two scenarios are explicitly legal:

    1. a &-reference can be released to safe code and there it can co-exist with other &-references, but not with a &mut-reference

    2. a &mut-reference may be released to safe code, provided neither other &mut-references nor even &-references co-exist with it. A &mut-reference must always be unique.

  • At all times, you must avoid data races, meaning that if multiple threads have access to the same UnsafeCell, then any writes must have a proper happens-before relation to all other accesses (or use atomics).
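
A sketch of the data-race bullet, assuming a hypothetical `OneShot` slot (the type, its fields, and its methods are invented for illustration). A release store on an `AtomicBool` paired with an acquire load gives the write to the `UnsafeCell` a happens-before relation to every later read:

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical one-shot slot: written at most once, then read by anyone.
struct OneShot {
    claimed: AtomicBool, // set by the single thread that wins the right to write
    ready: AtomicBool,   // set once the write to `data` is complete
    data: UnsafeCell<u64>,
}

// SAFETY: `data` is written by at most one thread (guarded by `claimed`) and
// read only after an Acquire load of `ready` observes the Release store.
unsafe impl Sync for OneShot {}

impl OneShot {
    fn new() -> Self {
        OneShot {
            claimed: AtomicBool::new(false),
            ready: AtomicBool::new(false),
            data: UnsafeCell::new(0),
        }
    }

    /// Stores `v` if nobody has published yet; returns whether it succeeded.
    fn publish(&self, v: u64) -> bool {
        // Only the first caller sees `false` here, so there is one writer.
        if self.claimed.swap(true, Ordering::Relaxed) {
            return false;
        }
        // SAFETY: we are the unique writer, and readers do not touch `data`
        // until they observe `ready == true`.
        unsafe { *self.data.get() = v };
        // Release: the write above happens-before any Acquire load seeing `true`.
        self.ready.store(true, Ordering::Release);
        true
    }

    fn try_read(&self) -> Option<u64> {
        if self.ready.load(Ordering::Acquire) {
            // SAFETY: the write is complete and no further writes can occur.
            Some(unsafe { *self.data.get() })
        } else {
            None
        }
    }
}

fn main() {
    let slot = Arc::new(OneShot::new());
    let writer = {
        let slot = Arc::clone(&slot);
        thread::spawn(move || slot.publish(42))
    };
    // Spin until the value has been published.
    let value = loop {
        if let Some(v) = slot.try_read() {
            break v;
        }
        thread::yield_now();
    };
    assert!(writer.join().unwrap());
    assert_eq!(value, 42);
}
```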

@nikomatsakis
Contributor

does Rust strictly define happens-before relation

We inherit the C++11 model, basically.

@NovemberZulu
Contributor Author

We inherit the C++11 model, basically.

(sigh) I was so hoping to get away from it...

@emberian
Member

@NovemberZulu why? because of how complex it is?

@NovemberZulu
Contributor Author

Because, as with numerous other C++ features, you never know when it is going to bite you in the ass

@emberian
Member

I'm confused. The C++11 memory model is extremely well understood. http://www.cl.cam.ac.uk/~pes20/weakmemory/#CPP

@NovemberZulu
Contributor Author

C++ is hardly an esoteric or obscure language, and the documentation is both readily available and extensive, that is certainly true. Still, to my mind, it is too easy to end up with code that looks OK, seems to work the way you expect it to work, but is, in fact, UB.

P.S. I think we are digressing.

@steveklabnik
Member

@nikomatsakis we use "reference" and "mutable reference", officially. Or at least, that's what I've been keeping consistent in every bit of docs I come across, it's hard, of course.

I agree "immutable reference" is a bit misleading, which is part of why we chose it this way so long ago.

@NovemberZulu
Contributor Author

we use "reference" and "mutable reference", officially.

I am afraid I have been using "reference" as an umbrella term for both &T and &mut T, adding to the confusion :( Following Liskov, &mut T IS-A &T, isn't it? BTW, what is misleading about "immutable reference"?

Anyway, maybe we can just use &T and &mut T in the UnsafeCell<T> doc?

@Kimundi
Member

Kimundi commented Mar 1, 2018

@NovemberZulu: A &T might allow some mutation inside the T if T uses internal mutability like Cell, so it's not always immutable.
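
To make this concrete, here is a small sketch (the `Hits` type and the `record` function are made up for illustration) of a value changing behind a plain `&T` because `T` contains a `Cell`:

```rust
use std::cell::Cell;

/// Hypothetical struct with one interior-mutable field.
struct Hits {
    count: Cell<u32>,
}

/// Takes only a `&Hits`, yet the observable state of `Hits` changes.
fn record(h: &Hits) {
    h.count.set(h.count.get() + 1);
}

fn main() {
    let h = Hits { count: Cell::new(0) };
    record(&h);
    record(&h);
    assert_eq!(h.count.get(), 2);
}
```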

@NovemberZulu
Contributor Author

A &T might allow some mutation inside the T if T uses internal mutability like Cell, so it's not always immutable.

Yes, true, and this is exactly the case we have. I still have a mental picture of T * const when thinking of UnsafeCell<T> :(

@nikomatsakis
Contributor

@steveklabnik

we use "reference" and "mutable reference", officially. Or at least, that's what I've been keeping consistent in every bit of docs I come across, it's hard, of course.

That makes sense, but it seems like sometimes we will want to clarify that we mean an &T and that we are excluding &mut T. In those cases, what do you do?

@steveklabnik
Member

steveklabnik commented Mar 1, 2018

If "reference" isn't clear enough, I often write &T, like @Kimundi did above:

A &T might allow some mutation inside the T if T uses internal mutability like Cell, so it's not always immutable.

@NovemberZulu
Contributor Author

Another attempt at rewording:

If you have a reference &SomeStruct, then normally in Rust all fields of SomeStruct are immutable. UnsafeCell<T> is the only core language feature to work around this restriction. All other types that allow internal mutability, such as Cell<T> and RefCell<T>, use UnsafeCell to wrap their internal data. The UnsafeCell API itself is technically very simple: it gives you a raw pointer *mut T to its contents. It is up to you as the abstraction designer to use that raw pointer correctly.

The precise Rust aliasing rules are somewhat in flux, but the main points are not contentious:

  • If you create a safe reference with lifetime 'a (either a &T or &mut T reference) that is accessible by safe code (for example, because you returned it), then you must not access the data in any way that contradicts that reference for the remainder of 'a. For example, that means that if you take the *mut T from an UnsafeCell<T> and cast it to an &T, then until that reference's lifetime expires, the data in T must remain immutable (modulo any UnsafeCell data found within T, of course). Similarly, if you create an &mut T reference that is released to safe code, then you must not access the data within the UnsafeCell until that reference expires.

  • At all times, you must avoid data races, meaning that if multiple threads have access to the same UnsafeCell, then any writes must have a proper happens-before relation to all other accesses (or use atomics).

To assist with proper design, the following scenarios are explicitly declared legal for single-thread code:

  1. A &T reference can be released to safe code and there it can co-exist with other &T references, but not with a &mut T.

  2. A &mut T reference may be released to safe code, provided neither other &mut T nor &T co-exist with it. A &mut T must always be unique.
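
One way an abstraction might enforce these two scenarios at runtime is sketched below. `Tracked`, `with`, and `with_mut` are hypothetical names invented for this example, and the closure-based API is an assumption chosen to keep it short; `RefCell<T>` in the standard library follows the same idea but hands out guard objects instead.

```rust
use std::cell::{Cell, UnsafeCell};

/// Hypothetical single-threaded cell with runtime borrow tracking.
struct Tracked<T> {
    // 0 = unborrowed, n > 0 = n active `&T`, -1 = one active `&mut T`.
    state: Cell<isize>,
    value: UnsafeCell<T>,
}

impl<T> Tracked<T> {
    fn new(value: T) -> Self {
        Tracked { state: Cell::new(0), value: UnsafeCell::new(value) }
    }

    /// Scenario 1: a `&T` may co-exist with other `&T`, but not with a `&mut T`.
    fn with<R>(&self, f: impl FnOnce(&T) -> R) -> Option<R> {
        let s = self.state.get();
        if s < 0 {
            return None; // a `&mut T` is active
        }
        self.state.set(s + 1);
        // SAFETY: state >= 0 means no `&mut T` exists, and `with_mut`
        // refuses to create one until the count drops back to 0.
        let r = f(unsafe { &*self.value.get() });
        self.state.set(self.state.get() - 1);
        Some(r)
    }

    /// Scenario 2: a `&mut T` must be unique.
    fn with_mut<R>(&self, f: impl FnOnce(&mut T) -> R) -> Option<R> {
        if self.state.get() != 0 {
            return None; // some reference to the interior is active
        }
        self.state.set(-1);
        // SAFETY: state was 0, so no other reference to the interior exists.
        let r = f(unsafe { &mut *self.value.get() });
        self.state.set(0);
        Some(r)
    }
}

fn main() {
    let t = Tracked::new(vec![1, 2, 3]);
    // Nested shared borrows are allowed.
    let sum = t.with(|a| t.with(|b| a.len() + b.len()));
    assert_eq!(sum, Some(Some(6)));
    // A mutable borrow is refused while a shared borrow is active.
    let refused = t.with(|_| t.with_mut(|v| v.push(4)).is_none());
    assert_eq!(refused, Some(true));
    // With nothing else borrowed, mutation goes through.
    assert_eq!(t.with_mut(|v| { v.push(4); v.len() }), Some(4));
}
```

Because the closures cannot return the reference they receive, no `&T` or `&mut T` outlives the call that created it. If a closure panics, the state is never reset and later borrows are simply refused, which is conservative rather than unsound.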

@nikomatsakis
Contributor

@NovemberZulu I like it quite a lot

@NovemberZulu
Contributor Author

Updated the pull request. I feel the transition between old and new text is a bit rough. Comments are welcome!

@pietroalbini
Member

Ping from triage @steveklabnik! The author pushed new commits, could you review them?

@steveklabnik
Member

I think it reads great, thanks!

@bors: r+ rollup

@bors
Contributor

bors commented Mar 12, 2018

📌 Commit 55be283 has been approved by steveklabnik

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 12, 2018
kennytm added a commit to kennytm/rust that referenced this pull request Mar 12, 2018
rephrase UnsafeCell doc

As shown by discussions on users.rust-lang.org [[1]], [[2]], the UnsafeCell doc is not totally clear. I tried to make the doc univocal regarding what is allowed and what is not. The edits are based on my understanding following [[1]].

[1]: https://users.rust-lang.org/t/unsafecell-behavior-details/1560
[2]: https://users.rust-lang.org/t/is-there-a-better-way-to-overload-index-indexmut-for-a-rc-refcell/15591/12
bors added a commit that referenced this pull request Mar 12, 2018
Rollup of 13 pull requests

- Successful merges: #48201, #48705, #48725, #48824, #48877, #48880, #48887, #48928, #48934, #48480, #48631, #48898, #48954
- Failed merges:
@bors bors merged commit 55be283 into rust-lang:master Mar 13, 2018
@NovemberZulu
Contributor Author

Everyone, thank you very much for your support!
