Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ban private items in public APIs #136

Merged
merged 5 commits into from
Aug 12, 2014

Conversation

glaebhoerl
Copy link
Contributor

@huonw
Copy link
Member

huonw commented Jun 24, 2014

FWIW, this is actually partially implemented in the form of the visible_private_types lint, which just handles types occurring in public signatures, but it doesn't ping cases like

struct Private;
pub type Public = Private;
trait Private {}
pub fn public<T: Private>() {}

Both of these were explicit decisions, since they essentially allow exposing a public interface around private types. The trait one I regard as particularly important, since it is the only way (that I know) of exposing a polymorphic interface while disallowing it to be overridden (or methods called) by others, e.g. the trait bound may be used by function doing ridiculously unsafe things that have only been written to work with a small number of types, so external implementations would be breaking these assumptions. (The type one can just be done by creating a wrapper struct with a private field, but this then requires implementing all traits etc. for the wrapper.)

I'm in favour of extending the lint's rules to include statics etc., and I'm somewhat in favour of making these hard compiler errors (allows the exported/visibility rules to be made more precise).

Also, this is closely related to rust-lang/rust#10573.

@glaebhoerl
Copy link
Contributor Author

FWIW, this is actually partially implemented in the form of the visible_private_types lint

Yeah I know, probably should've mentioned it.

The trait one I regard as particularly important, since it is the only way (that I know) of exposing a polymorphic interface while disallowing it to be overridden (or methods called) by others

I know about this trick, but is it actually used anywhere? It feels kind of hacky to me, and I would rather have some kind of formal mechanism to accomplish this (like maybe #[no_external_impls]), if there's demand for it.

Also, this is closely related to rust-lang/rust#10573.

Ah yes, thanks for unearthing the evidence for my point in the Motivation :)

It leads to various "what if" scenarios we need to think about and deal with, and it's easier to avoid them altogether.

@SiegeLord
Copy link

I know about this trick, but is it actually used anywhere?

Multiple people have asked about this on Reddit. I've used it in my library. In my case, I didn't care about the external implementations, rather I wanted to implement default methods of a public trait in terms of the methods provided by the private super-trait.

@glaebhoerl
Copy link
Contributor Author

I added some of the mentioned examples to the Motivation and Drawbacks sections.

@SiegeLord
Copy link

I don't think the drawbacks section accurately portrays the practical use of the private super-trait idiom.

In my case, the pattern like looks like this:

struct PrivateData
{
    data: int,
}

trait PrivateTrait
{
    fn get_data<'l>(&'l self) -> PrivateData;
}

pub trait PublicTrait : PrivateTrait
{
    fn public_function(&self) -> int
    {
        self.get_data().data
    }
}

Making everything public as required by this RFC would leak those internal implementation details. In principle I could mark those API entries as permanently unstable, but surely that's what the privacy system should be used for.

@AbigailBuccaneer
Copy link

The D language has this - under the name Voldemort types - and considers them a feature rather than a drawback.

@glaebhoerl
Copy link
Contributor Author

@SiegeLord I'm fairly certain your use case could be satisfactorally reformulated using closed trait or #[no_external_impls] trait or whatever, but I don't know enough based on the provided code to say how. Is there an impl<T: PrivateTrait> PublicTrait for T? Is public_function ever implemented as something other than the default? Where and how are these traits used? And so on.

In any case, you've offered a quasi-objection but not a proposed resolution. What would be your preference? To do nothing? To carve out a special exception for private supertraits? Would it be intolerable if this functionality was lost even temporarily? Would adopting this rule only be acceptable to you if the closed trait / #[no_external_impls] trait substitute (presuming it is indeed a substitute) were adopted at the same time, instead of being discussed separately?

@AbigailBuccaneer Our (better designed, more principled) equivalent to that functionality would be RFC PR #105.

From a philosophical perspective (which is important!), semantic invariants should be upheld using semantic means whenever possible. (To put it another way, in the typechecking and other semantic analysis phases, rather than in the name resolution phase.) If the safety of your abstraction rests on the fact that external code can't write the name of a thing, that's brittle. C++ libraries had a technique of relying on private classes returned by public functions to prevent them being used except as temporary rvalues, because their name couldn't be written, which suddenly broke when C++11 added auto and decltype.

In this particular case, it's admittedly unlikely that we would gain something like SuperTraitOf<Trait>, but I think the correct attitude in general is to stick to principles and do it cleanly unless there's a clear reason we can't.

EDIT: @SiegeLord: Just to clarify, forbidding private supertraits is not a hill I would choose to die on, and I'd be mostly fine with carving out an exception. If we can outlaw the other 95% of cases, I'll be 95% satisfied. And the process could just as well be structured to first outlaw the other 95% in the first phase, and then remove private supertraits together with introducing the replacement functionality in the second phase (as long as that happens before 1.0, of course). I just wanted to make clear why I think consistency is the better option on the merits, and to try to suss out what you think.

@SiegeLord
Copy link

I think an exception to private super-traits it something that I would be content with. To me, private super-traits of public traits are closely related to private fields in public types. I.e. having a private field shows up in the documentation (in the sense that its presence is indicated by a comment) and it prevents you from instantiating an instance of the type manually. Private super-traits seem to act the same way except in type space: they (could) show up in documentation as a comment, and they prevent you from 'instantiating' a new trait/type that implements it. I've never studied type theory, so I don't know if this analogy is valid.

Either way, to me, private super-traits are not members of the public API any more than private fields are. In this sense, banning all the other cases of private items while allowing these is perfectly consistent, at least in my view. In particular, a private super trait does not have to show up in function signatures etc to be useful, e.g.:

trait A
{
    fn private_method(&self) {}
}

pub trait B : A
{
}

pub fn public_function<T: B>(a: T)
{
    a.private_method();
}

@glaebhoerl
Copy link
Contributor Author

I'd still be curious about further details of your use case mentioned in the previous comment.

@SiegeLord
Copy link

There is no impl<T: PrivateTrait> PublicTrait for T. Currently, public_function is not overridden, but I don't see why it can't be. The public trait methods are currently used by the user. I don't know what you refer to by 'And so on."

@zwarich
Copy link

zwarich commented Jul 18, 2014

Servo also uses this.

@glaebhoerl
Copy link
Contributor Author

@nick29581

#136 - Ban private items in public APIs - glaebhoerl
Not much to add to the title. For example using a private struct as a type in a public function.
We have a lint for part of this already.
Apparently this is used as a hack for some kind of fine-grained privacy for traits. Seems like we should address that properly.
Mostly negative feedback, but (I think) only because of the hacks it would break.
Lets discuss this.

As noted above, we could simply carve out an exception for private supertraits for now, which seems to be the only controversial part of the proposal, to keep that kind of code working. We could then revisit it later with potentially a more proper solution if we want to.

@zwarich
Copy link

zwarich commented Jul 21, 2014

I wouldn't really want this to be removed until we have actual abstract types in modules. I'm fine with it being feature-gated forever, though, and that might actually be the right thing to do, since we wouldn't want to support this if we did have abstract types. Since Rust has recursive modules, adding abstract types might be nontrivial.

@glaebhoerl
Copy link
Contributor Author

I wouldn't really want this to be removed

Which "this"? The whole enchilada? Just private supertraits? Some other specific part of it? (If it's not just private supertraits you're referring to, could you provide an example of what you would use the functionality for which wouldn't otherwise be possible?)

until we have actual abstract types in modules.

Abstract types in what sense? To put it another way, what difference are you thinking of relative to structs with private fields?

I'm fine with it being feature-gated forever, though, and that might actually be the right thing to do

I'd be OK with feature gating it, but I also don't see why it's desirable to keep it. From my perspective it's just tying up loose ends and removing the possibility of weirdness.

@lilyball
Copy link
Contributor

If it is a type declaration, items referred to in its definition must be public.

This seems like a mistake. I would think there's a definite use for being able to use a type declaration to make public specific parameterizations of private types. For example, I might have something like

// private struct
struct MyStruct<T> { ... }

// private marker types
struct MarkerOne;
struct MarkerTwo;

impl MyStruct<MarkerOne> { ... }
impl MyStruct<MarkerTwo> { ... }

// publicize those two variants
pub type StructOne = MyStruct<MarkerOne>;
pub type StructTwo = MyStruct<MarkerTwo>;

This would allow me to vend the two types StructOne and StructTwo, and provide functionality on these types (any pub function in the impl blocks would presumably be usable), but explicitly prevent anyone from creating a third MyStruct<OtherType>.

I think the better solution is to merely consider that a pub type declaration produces a public type, which is equivalent to a potentially-private type, without requiring the private type to be public. Thus the privacy of the items referred to in the type definition are irrelevant.

@arielb1
Copy link
Contributor

arielb1 commented Jul 24, 2014

@kballard

type statements are syntactic sugar (they have no effect on semantics). You could make the Markers impl a private trait and use it.

Currently I think the primary problem here is that private types can leak in public signatures and bounds - and would prefer to prevent only that.

@tupshin
Copy link

tupshin commented Jul 27, 2014

I'd love to see this implemented, but on the other hand, there really has to be a way to share private members between modules. In implementing a wrapper for a somewhat large C API, I tried to chuck it up into modules (https://github.com/tupshin/rust-cassandra-wrapper/tree/master/src/cassandra). Unfortunately, there is no adequate way right now to have structs wrapping unsafe private structs, and still adequately modularize the implementation. I don't know if there's a better way, but "protected" visibility, so it can be seen by sibling modules, is what comes to mind.

@pnkfelix
Copy link
Member

@tupshin I take it that you were unable to structure your code in a manner like this?

Example playpen code illustrating sharing a private type among child modules.

(Note that this works even when you break the nested submodules b and c off into their own files.)

@haberman
Copy link

Would this prohibit this pattern, which was suggested to me and I find quite useful?

pub struct FieldDef {
    priv field: int,
}

Rationale and motivation for this (as well as a link to sample code) are in this thread: https://mail.mozilla.org/pipermail/rust-dev/2014-January/008033.html

@sfackler
Copy link
Member

The pattern of having a private field in a struct? No.

@tupshin
Copy link

tupshin commented Jul 27, 2014

@pnkfelix I did an experimental rework to use that approach in a simple way by moving my structs into mod.rs. https://github.com/tupshin/rust-cassandra-wrapper/blob/master/src/cassandra/mod.rs#L28
A few thoughts:

  1. That makes the impl for my structs more verbose by requiring super:: in many places
  2. It separates the structs all into one file, and the impls into one file per struct, whereas it feels much more natural to have the struct and its corresponding impl defined together
  3. I couldn't figure out any way to call a sibling impl method when the struct is defined in the super class. https://github.com/tupshin/rust-cassandra-wrapper/blob/master/src/cassandra/cluster.rs#L20 results in "failed to resolve. Use of undeclared module CassCluster" now.

@aturon
Copy link
Member

aturon commented Jul 27, 2014

I feel like this RFC is a good idea, but I've been trying to think of a crisper pitch than "allowing this feels weird, and we should be conservative."

One concrete problem with allowing private items to leak is that you lose some local reasoning. You might expect that if an item is marked private, you can refactor at will without breaking clients. But with leakage, you can't make this determination based on the item alone: you have to look at the entire API to spot leakages (or, I guess, have the lint do so for you). Perhaps not a huge deal in practice, but worrying nonetheless.

Put another way, it seems useful to have privacy as a strict and simple way of knowing what downstream clients can depend on, and what they can't.

Allowing private supertraits seems OK because, assuming all the other leakage checks are in place, clients have no way to depend on or use the supertrait. It would effectively be a marker that the child trait is unimplementable by the client, and could be shown as such in rustdoc.

It also seems like the use of pub type could be replaced by a use of newtypes.

Assuming these particular cases can be cleanly handled, what's the argument for not doing this?

@lilyball
Copy link
Contributor

Assuming these particular cases can be cleanly handled, what's the argument for not doing this?

That's the wrong way to think about it. We don't put everything in the language and then argue to remove things. We keep the language as simple as possible and argue what to add.

Your points about local reasoning seem like they should be satisfied by the visible_private_types lint just as well as they would be with this RFC.

Given that we already have a lint to flag this sort of thing, what do we gain by turning it into a language rule?

@aturon
Copy link
Member

aturon commented Jul 28, 2014

That's the wrong way to think about it. We don't put everything in the language and then argue to remove things. We keep the language as simple as possible and argue what to add.

This doesn't seem particularly helpful. For one, this RFC can as easily be viewed as taking away a feature as adding one. But I think that's not really relevant: the real question is about the meaning of privacy in Rust. I was surprised that leakage of private items is allowed, and I'm trying to understand the rationale of the current design, assuming it was intentional.

If you do want to make this kind of general point, I think @glaebhoerl's argument in the RFC is the stronger one: absent a clear reason and coherent design for exposing private items, the conservative thing to do is disallow them.

Your points about local reasoning seem like they should be satisfied by the visible_private_types lint just as well as they would be with this RFC.

Not exactly. Since lints can be turned off, safely refactoring private items an existing code base requires auditing places where the lint is disabled. If leakage was disallowed by the language, though, you could reason about privacy truly locally.

@lilyball
Copy link
Contributor

Not exactly. Since lints can be turned off, safely refactoring private items an existing code base requires auditing places where the lint is disabled.

Are you concerned about your own code, or are you trying to force what you consider to be good coding practices upon other people? Because if you're concerned about your own code, don't turn off the lint.

Personally, I think it's not a language's place to try to force what the author considers good design upon the users. It should encourage good design, but it should also be able to get out of the way and let people do what they really want to, if they really want. This is basically the same reason why I think the RFC about enforcing capitalization rules was a bad idea.

We've already shown that there are cases where exposing private types allows you to do things you couldn't otherwise (such as have a public trait nobody outside your crate can implement). The problem with defaulting to being conservative and adding exceptions for things like private supertraits is you can only add exceptions for the things you can think of doing right now. Sticking with a warning instead of a hard rule lets people disable it if they come up with valid reasons to violate the rule.

As an example of a valid reason that hasn't been suggested yet, not too long ago (I think in another PR but I forget) I wrote some code for someone that used .scan() to implement a custom iterator-returning function. The function returned iter::Scan, and that type includes the state type as a type parameter. But the state was wholly irrelevant to this public API; all that I really wanted to expose was the notion of some type that implemented Iterator<T>. Since you cannot define a function with an existential return type, I ended up just using a private type as the state.

I believe the only alternative, if this RFC were to be implemented, would be to have to create a brand new type that implemented Iterator by hand (even if it were just a wrapper around a hidden iter::Scan). It's a lot of work for what was otherwise a rather simple function.

For what it's worth, my earlier suggestion that a pub type should be able to still refer to private types in the definition would have solved this, as I could have created a simple pub type to hide the state type. This is a good example of why I think that pub type needs to be an exception to the rule, and also a good example of how being overly conservative as a hard rule can lead to unwanted inflexibility.

@pnkfelix
Copy link
Member

@kballard wrote:

For what it's worth, my earlier suggestion that a pub type should be able to still refer to private types in the definition would have solved this, as I could have created a simple pub type to hide the state type.

This seems like a good way to have our cake and eat it too, in a disciplined manner. . . and I think there is precedent for it in ML, no? (It is possible I misremember.)


@aturon in response to your point about using a newtype struct instead of this, do note @huonw 's point that:

(The type one can just be done by creating a wrapper struct with a private field, but this then requires implementing all traits etc. for the wrapper.)

I often overlook the latter cost myself.


@glaebhoerl do you have any thoughts on this detail (on making an exception to the blanket rule for pub type) as the RFC author?

@pnkfelix
Copy link
Member

I suppose a problem with making an exception for pub type is that we need to take care in what holes that could open up in our reasoning process.

e.g. as has been said above, what one really wants here is an existential type (or maybe "abstract type", though I guess the two are the same under some mental models), where all you can do is pass around the instance but do not know what its internal structure is. But currently that is not what happens when you mix pub type and a private struct (with pub fields):

example playpen code

@zwarich
Copy link

zwarich commented Jul 28, 2014

@pnkfelix I gave up trying to relate Rust's decisions to ML modules earlier in the discussion, but ML solves this by separating modules from their signatures and providing a rich signature language. Ascribing a signature to a module may hide type components of the module entirely, make them abstract (in the sense of existential types), leave them manifest. It gets more complex than that (e.g. sharing constraints between distinct modules so that you can control external knowledge of implementation details), but there is a distinction made between namespace visibility and type abstraction. In Rust that distinction is missing, although field privacy can be used to emulate it in some ways.

@lilyball
Copy link
Contributor

@pnkfelix Yes, existential types is what I really want in that example. But as long as we don't have them, I'll settle for a pub type referencing something private, in order to at least hide the private type from the documentation of the API. It's not ideal, but it's better than having to create a brand new type and reimplement all the desired traits.

@arielb1
Copy link
Contributor

arielb1 commented Jul 28, 2014

@kballard

What you want is either existential types, "abstract return types" (which are "module-level" existential types), or a newtype.

pub type is neither - type is supposed to be a kind of syntactic sugar. The problem is that in Rust privacy controls both nameability and access. Safe refactoring ("modularity", "independence of implementation from interface") requires both - control over nameability to prevent name conflicts and control over access to prevent the implementation from leaking. Note that these are distinct things - name conflicts can be resolved by simple α-renaming while implementation leaks can't be (in general). type controls only nameability - supposed-to-be-private type details can still leak.

An issue here is that the semantics of "private items" leaking is unclear (if you impl a private struct with a public trait, can you access it?) - the best way to avoid this kind of mess is to ban these "unprincipled" leaks and support only principled ones.

The "allow this - someone may have a use for this you haven't thought of" viewpoint isn't the way the Rust type-system is designed - if the Rust designers haven't thought of a special use for a feature (especially a feature designed as syntactic sugar), it might have weird interactions with other Rust features (see GeneralizedNewtypeDeriving[http://www.haskell.org/pipermail/haskell-cafe/2010-March/074305.html] for such an example in Haskell). This is a distinction between Rust's typesystem and C++ templates, with the latter caring much less about these edge-case (but used-by-clever-developers) interactions.

@glaebhoerl
Copy link
Contributor Author

In the absence of GeneralizedNewtypeDeriving (which I don't expect Rust to get any time soon)

As far as that goes, GNTD is very closely related to the Transmute (née Coercible) trait for safe transmutes, as it merely involves safe-transmuting each of the base type impl's methods for use as the impl for the newtype. So we're likely to get them at approximately the same time, whenever that is.

@aturon
Copy link
Member

aturon commented Jul 30, 2014

@glaebhoerl Just to make sure you saw it: this RFC was discussed in yesterday's meeting, with no definite decision yet.

@glaebhoerl
Copy link
Contributor Author

@aturon Thanks! I saw.

And gah, it looks like I've royally screwed up the rebasing. :( What to do? But I've significantly updated the RFC based on the discussion here.

@huonw
Copy link
Member

huonw commented Aug 8, 2014

You should be able to use git reflog to go back to a pre-rebase state.

(Also, theoretically one should never rebase or amend an RFC; just new commits.)

@glaebhoerl
Copy link
Contributor Author

That seems to have worked. Thanks.

(For some reason I had internalized the rule as "don't squash commits", instead of "do not rebase at all".)

(And all this time, I had been interpreting git reflog as "re-flog", thinking it was some kind of scary interactive mess-fixer-upper, whereas it turns out be a not-nearly-as-scary "ref-log".)

@pnkfelix
Copy link
Member

Hey, privacy experts: Is this actually a problem: rust-lang/rust#17697 ?

(I'm trying to figure out if it should be cloned into a fresh issue, as I am doing with others that were closed around the same time by that owner.)

@glaebhoerl
Copy link
Contributor Author

My first guess is that this is just useless - you can't expose a private type, so there's no way its field being public could have any relevance - and so it's worth warning about, but not imperative to forbid. But it's possible that I'm missing something.

@dhardy
Copy link
Contributor

dhardy commented Nov 14, 2015

I have an issue with this RFC: it does not explain what is meant by the word public within the RFC text. What does it mean?

  1. That an item is usable within the module it is declared, but not necessarily outside that module
  2. That an item can be used in some way outside the crate, but without the requirement that it be nameable
  3. That an item be usable and nameable (the "public path" thing) from outside the crate

The meaning of this sentence, which is key to interpreting the RFC, is thus unclear:

Items that are explicitly declared as pub are always public.

From the comments at the bottom, I'm guessing "public" means 2. Unfortunately, this still leaves the issue of how to document the "unnameable" type, but I guess that can be dealt with (though it means using the name of a private module, most likely).

@petrochenkov
Copy link
Contributor

Unfortunately, this still leaves the issue of how to document the "unnameable" type, but I guess that can be dealt with

Let's say a function return a value of unnameable type. Then the docs on this function can describe what a user of this function can do with that value. For example, C++ docs do this when return type of something is unspecified.
Ideally such functions should return something like #1305 and not unnameable types.

FWIW, I tweaked the privacy pass to apply more strict rules (variant 3 from your list) and run it on rustc and standard library. Most of broken code contained actual deficiencies - types that should have been nameable, but weren't reexported by accident.

@petrochenkov
Copy link
Contributor

@dhardy
Also, this PR is not the actual version of the RFC, there's a later revision, which changes the rules completely.

@dhardy
Copy link
Contributor

dhardy commented Nov 15, 2015

@petrochenkov I realise the RFC has been modified since this PR, but this is still the place for comments.

Also, I am not surprised that variant 3 works for most code. If you read this PR, it reads two uses for exporting "private" (unnameable) items, and in both cases better solutions (type aliases/deriving new-types, and private trait members). IMO that would be the better solution than the current RFC, but it's not really important.

What I don't like about current RFC (aside from the contradiction in allowing pub on members of private structs) is that if you read it with variant 3 of my list in mind, it still all makes sense, with the sentence I quoted above then meaning *it is an error if an item marked pub is not public (i.e. nameable)`. Confusing.

@glaebhoerl glaebhoerl mentioned this pull request Mar 21, 2016
@Centril Centril added A-typesystem Type system related proposals & ideas A-privacy Privacy related proposals & ideas labels Nov 23, 2018
wycats pushed a commit to wycats/rust-rfcs that referenced this pull request Mar 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-privacy Privacy related proposals & ideas A-typesystem Type system related proposals & ideas
Projects
None yet
Development

Successfully merging this pull request may close these issues.