Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deref documentation: Remove references to "smart pointers" #91004

Closed
e2-71828 opened this issue Nov 18, 2021 · 16 comments · Fixed by #110340
Closed

Deref documentation: Remove references to "smart pointers" #91004

e2-71828 opened this issue Nov 18, 2021 · 16 comments · Fixed by #110340
Labels
A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools C-enhancement Category: An issue proposing an enhancement or a PR with one. E-help-wanted Call for participation: Help is requested to fix this issue. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@e2-71828
Copy link

e2-71828 commented Nov 18, 2021

The documentation for Deref reads, in part:

Implementing Deref for smart pointers makes accessing the data behind them convenient, which is why they implement Deref. On the other hand, the rules regarding Deref and DerefMut were designed specifically to accommodate smart pointers. Because of this, Deref should only be implemented for smart pointers to avoid confusion.

This warning has sparked significant confusion, which seems to stem from there being no consensus of what actually is a "smart pointer". The warning should be either removed or rewritten in terms of well-understood properties.

Relevant URLO discussions:

@JanBeh
Copy link
Contributor

JanBeh commented Nov 18, 2021

An alternative could be to actually define the term "smart pointer" somewhere. The reference contains a paragraph on smart pointers in section 4.1.13 on "pointer types", but lacks a definition.

However, it is at least questionable whether such a definition could

  • reflect whether Deref should or shouldn't be implemented

and at the same time

  • match the common understanding of the term (assuming there is one at all).

@ChrisDenton
Copy link
Member

Hm from a teaching standpoint, I'd definitely agree the docs shouldn't be introducing jargon without first defining it. And in this case I'm uncertain of the benefit of introducing it at all. In Rust, is the term actually used outside of Deref?

It may be better to directly explain what kinds of types Deref should be implemented for and what would help "avoid confusion". That said, the fuzziness around the term "smart pointer" does allow leeway in interpretation, whereas a too strict definition may not properly reflect how Deref is (or should be) used in practice.

@ChrisDenton
Copy link
Member

ChrisDenton commented Nov 18, 2021

Follow up: The Rust book does actually have a chapter on smart pointers. And then the term is used in the next chapter on Fearless Concurrency. Some quotes from the former:

In Rust, the different smart pointers defined in the standard library provide functionality beyond that provided by references.

In Rust, which uses the concept of ownership and borrowing, an additional difference between references and smart pointers is that references are pointers that only borrow data; in contrast, in many cases, smart pointers own the data they point to.

[String and Vec<T>] count as smart pointers because they own some memory and allow you to manipulate it. They also have metadata (such as their capacity) and extra capabilities or guarantees

The characteristic that distinguishes a smart pointer from an ordinary struct is that smart pointers implement the Deref and Drop traits.

@PatchMixolydic
Copy link
Contributor

The characteristic that distinguishes a smart pointer from an ordinary struct is that smart pointers implement the Deref and Drop traits.

Incidentally, String doesn't implement Drop since the inner Vec<u8>'s Drop implementation is sufficient.

@JanBeh
Copy link
Contributor

JanBeh commented Nov 19, 2021

An alternative could be to actually define the term "smart pointer" somewhere.

Perhaps a better way is to describe when Deref should or shouldn't be implemented. Then the term "smart pointer" or "smart reference" could (in the context of Rust) be defined as "anything that implements Deref", e.g. in the reference.

Currently some people say anything that implements Deref could be seen as smart pointer (depending on how you define it). Given the warning in Deref, this would currently be a circular definition: A smart pointer is something that implements Deref, but Deref should only be implemented for smart pointers.

If the warning was removed, the Rust community and the Rust reference could provide a clear definition what a smart pointer (or smart reference, see link above) actually is: Something that implements Deref.

@JanBeh
Copy link
Contributor

JanBeh commented Nov 19, 2021

Following my previous comment, the warning could be changed to something like:

"Warning: Implementing Deref on a data type will make this data type act like a smart pointer." Or: "turn it into a smart pointer." "This isn't always intended, e.g. when wrapping another type for other reasons."

@inquisitivecrystal inquisitivecrystal added A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools C-enhancement Category: An issue proposing an enhancement or a PR with one. I-needs-decision Issue: In need of a decision. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Nov 19, 2021
@inquisitivecrystal
Copy link
Contributor

I feel like actually writing out a definition of smart pointer could be helpful here. Something like "Implementing Deref will often cause instances of the implementing type to be silently converted to the target type. Please consider whether this is desirable before implementing Deref, and avoid implementing it if the distinction between the two types is important." This phrasing is awful, but maybe it captures some of what's intended?

@Ixrec
Copy link
Contributor

Ixrec commented Nov 21, 2021

While "smart pointer" was never a precise technical term in any language community, I think what the Deref docs are trying to get at by using the phrase is: A user/library-defined type which is generic over a type T, and which conceptually wraps a single T with indirection.

So my strawman rewrite of this paragraph:

Implementing Deref for smart pointers makes accessing the data behind them convenient, which is why they implement Deref. On the other hand, the rules regarding Deref and DerefMut were designed specifically to accommodate smart pointers. Because of this, Deref should only be implemented for smart pointers to avoid confusion.

would be something like this:

Generic wrapper types with runtime indirection such as Box<T>, Rc<T>, and Arc<T> implement Deref in order to make accessing the data behind them convenient. On the other hand, the rules regarding Deref and DerefMut were designed specifically to accommodate these pointer-like generic wrapper structs, sometimes known as "smart pointers". Because of this, Deref should only be implemented for pointer-like types to avoid confusion.

The TRPL chapter on Smart Pointers I think is acceptable, since that intro-level writing contains a lot of similar simplifications and vague concepts that an expert user would eventually discard, but we can definitely tweak the introduction to make it clearer that "smart pointer" is not a precise term. For example, changing "We’ve already encountered a few smart pointers" to "We've already encountered a few types that could be considered smart pointers".

@e2-71828
Copy link
Author

Generic wrapper types with runtime indirection such as Box, Rc, and Arc implement Deref in order to make accessing the data behind them convenient. On the other hand, the rules regarding Deref and DerefMut were designed specifically to accommodate these pointer-like generic wrapper structs, sometimes known as "smart pointers". Because of this, Deref should only be implemented for pointer-like types to avoid confusion.

This is definitely an improvement, but it would be nice to elaborate some on "to avoid confusion:" How might this confusion manifest itself if Deref is implemented improperly?

There's also the question of whether any non-"pointer-like" types can reasonably implement Deref. Is it reasonable for a newtype that enforces some property but stores its target value inline, for example:

#[derive(Copy,Clone,Eq,PartialEq,Ord,PartialOrd,Hash,Debug)]
pub struct PrimeUsize(usize);

impl PrimeUsize {
    fn new(usize)->Self { /* Primality check here */ }
}

impl Deref for PrimeUsize {
    type Target = usize;
    fn deref(&self)->&usize { &self.0 }
}

In many ways, this behaves like String: It's a container for an owned value with some additional properties enforced by construction. Is this a reasonable use of Deref or not, and how can that be made clear in the docs?

@JanBeh
Copy link
Contributor

JanBeh commented Nov 23, 2021

I think what the Deref docs are trying to get at by using the phrase is: A user/library-defined type which is generic over a type T, and which conceptually wraps a single T with indirection.

I think that's an assumption that isn't necessarily true. There have been different opinions whether a "smart pointer" needs to be generic (see also the aboved linked discussion thread "What are smart pointers?"). Most prominent example is String which, according to the Rust book, is a smart pointer to str. If we would consider such non-generic types to not be smart pointers, it still makes sense to implement Deref for some of them. There are several other examples such as PathBuf (pointing/deref-ing to Path), for example. This isn't just limited to types from the standard library but might also be applicable to other types.

Introduction in Chapter 15 ("Smart Pointers"):

We’ve already encountered a few smart pointers in this book, such as String and Vec in Chapter 8, although we didn’t call them smart pointers at the time. Both these types count as smart pointers because they own some memory and allow you to manipulate it. They also have metadata (such as their capacity) and extra capabilities or guarantees (such as with String ensuring its data will always be valid UTF-8).

Also note that a String is (much) more than a wrapper to str.

@scottmcm
Copy link
Member

scottmcm commented Dec 1, 2021

Perhaps it could help to elaborate on the "confusion", rather than reference smart pointers.

For example, it could discuss how any method you add to something which implements Deref is a potential shadowing risk for something on the target. (And thus why MaybeUninit -- which doesn't Deref -- has a method assume_init_drop, but ManuallyDrop -- which does Deref -- has an associated function drop that's not a method.)

@yaahc yaahc added E-help-wanted Call for participation: Help is requested to fix this issue. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. and removed I-needs-decision Issue: In need of a decision. labels Mar 9, 2022
@yaahc
Copy link
Member

yaahc commented Mar 9, 2022

We discussed this in our T-libs meeting today and our consensus was that we would welcome a PR that clarifies this, but that we don't feel comfortable making a decisions regarding the discussion so far in the absence of such a PR that introduces concrete wording. We definitely agree with the general direction of clarifying when to use Deref and how it restricts your API design decisions going forward, and that we shouldn't center jargon terminology like "smart pointer".

@danielhenrymantilla
Copy link
Contributor

Also relevant to the discussion: currently the standard library implements Deref for Cows, including something like Cow<'_, u8>, which in the owned case features no pointer/indirection.

In general, I find this whole warning to be a bit too opinionated and subjective, since, as we can see, it's difficult to come up with hard rules. I thus second @scottmcm's more practical approach of: "beware that implementing Deref and (inherent) methods can be footgunny or surprising; types that implement Deref ought to strive not to feature other methods".

@JanBeh
Copy link
Contributor

JanBeh commented Jul 1, 2022

See also Christopher Durham's attempt to define a smart pointer here.

@CAD97
Copy link
Contributor

CAD97 commented Jul 1, 2022

A bit more complete; here's how I currently lean towards defining a "smart pointer:"

/// A smart pointer which can can temporarily suspend being smart.
///
/// # Safety
///
/// - `from_raw(into_raw(self))` must return `self` without any change
///   in validity / permissions / provenance / aliasing / etc.
/// - `into_raw(self)` must return a valid pointer to the same object
///   as as dereferencing `self`.
/// - If `Self: DerefMut`, then the pointer returned by `into_raw` must
///   be valid for writes as if through `DerefMut` as well as for reads.
pub unsafe trait Tartare: Deref {
    /// Convert this smart pointer into a raw one.
    ///
    /// It is valid to dereference the raw pointer
    /// as if it were the smart pointer.
    fn into_raw(this: Self) -> NonNull<Self::Target>;
    /// Convert a raw pointer back into a smart one.
    ///
    /// Moves ownership back into `Self`, invalidating all copies of the
    /// raw pointer. Note that this means that it is *not valid* to do
    /// `forget(from_raw(ptr))` and continue using `ptr` for all `Tartare`.
    ///
    /// # Safety
    ///
    /// - `raw` must have come from `Self::into_raw`.
    unsafe fn from_raw(raw: NonNull<Self::Target>) -> Self;
}

(From a crate1 where I'm experimenting with providing a better(?)2 version of StableDeref.)

Details are of course extremely fungible (and docs are still incomplete), but the main idea is that a "smart pointer" is "raw pointer + smarts," so it should be defined by conversions into/from a raw pointer.

Footnotes

  1. The name tartare is a pun on tartare being a raw meat dish. fugu and sashimi were taken, but the connotation is still something raw which can be made safe with special care. (plz no squat)

  2. Or at least one that allows sidestepping the Box uniqueness question/problem.

@jmaargh
Copy link
Contributor

jmaargh commented Apr 14, 2023

I've taken a crack at drafting something that I think it clearer and gives more helpful guidance: #110340

Comments very much welcome and if this direction seems good I will polish for proper review.

rust-timer added a commit to rust-lang-ci/rust that referenced this issue Nov 4, 2023
Rollup merge of rust-lang#110340 - jmaargh:jmaargh/deref-docs, r=Mark-Simulacrum

Deref docs: expand and remove "smart pointer" qualifier

**Ready for review**

~~This is an unpolished draft to be sanity-checked~~

Fixes rust-lang#91004

~~Comments on substance and content of this are welcome. This is deliberately unpolished until ready to review so please try to stay focused on the big-picture.~~

~~Once this has been sanity checked, I will similarly update `DerefMut` and polish for review.~~
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools C-enhancement Category: An issue proposing an enhancement or a PR with one. E-help-wanted Call for participation: Help is requested to fix this issue. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.