Disallow hyphens in Rust crate names #940

Merged

Conversation

lambda-fairy
Contributor

Disallow hyphens in Rust crate names, but continue allowing them in Cargo packages.

Rendered

@carllerche
Member

👎 Dashes are more readable and renames aren't the end of the world.

For example, I may have a crate syncbox-io, and using it would be:

extern crate "syncbox-io" as sio;

Once there are dashes or underscores in the name, it's probably easier to alias to something shorter anyway for use in code.

@sfackler
Member

sfackler commented Mar 5, 2015

Big 👍 here. People should try to make libraries they author as painless to use as possible, and forcing all consumers of a crate to rename it every time it's imported is not a great way to do that.

@lambda-fairy
Contributor Author

@carllerche

Dashes are more readable

The readability/aesthetics argument is subjective, and not something everyone agrees on.

The fact that all arguments for them come down to "they look nice" should sound at least a bit suspicious, especially when there are more objective arguments on the other side.

renames aren't the end of the world

They aren't, of course. But as @sfackler says, making this kind of renaming mandatory is not the way to go. It'll be yet another special case for newcomers to learn, and introductory tutorials to explain.

Quite a few features of Rust -- C-style syntax, lifetime elision, IntoIterator -- are geared toward reducing the number of concepts a beginner has to learn up front. Disallowing hyphens follows in this trend: beginners won't have to learn renaming syntax just to generate JSON.

Crate renaming will become optional sugar, as it should be.

Once there are dashes or underscores in the name, it's probably easier to alias to something shorter anyway for use in code.

Note that if we disallow hyphens, we'll be able to do this instead:

extern crate syncbox_io as sio;  // imaginary syntax

which compares favorably to

use std::fmt::Write as WriteFmt;

In fact, the hyphen is the only character we allow in crate names that isn't also valid in identifiers. Without it, we wouldn't need quotes around names at all. I'm not pushing for this syntax here, but such a change would be possible in the future.

(And before someone brings up Go: in that language the string is a path to the library, not a crate name. Note that Go doesn't allow hyphens in crate/package names either.)

@liigo
Contributor

liigo commented Mar 5, 2015 via email

@CloudiDust
Contributor

I'll admit that foo-bar looks better to me, but I still prefer removing -s from crate names, for fewer special cases in the language and better consistency across the ecosystem, so +1.

@tshepang
Member

tshepang commented Mar 6, 2015

let's do this... 👍

@codyps

codyps commented Mar 6, 2015

I suppose another alternative is to allow - in idents

@sfackler
Member

sfackler commented Mar 6, 2015

It's ambiguous - is a-b the ident a-b or the ident a minus the ident b?

@codyps

codyps commented Mar 6, 2015

@sfackler sure, we'd also need to change parsing rules (that would be a single ident), but it is an alternative.

@tshepang
Member

tshepang commented Mar 6, 2015

To avoid ambiguity, whitespace would be needed (e.g. let a-b = a - b;) in order to have - as an operator. Much code would break, and it's also too unusual.
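
A minimal sketch of why this is ambiguous under today's grammar: without surrounding whitespace, Rust already reads a-b as the subtraction a - b, so existing code depends on that parse. The function name difference is made up for the illustration.

```rust
// `a-b` without spaces is tokenized as `a`, `-`, `b`:
// a subtraction, not a single identifier.
fn difference(a: i32, b: i32) -> i32 {
    a-b
}

fn main() {
    let a = 5;
    let b = 3;
    // Identifiers separate words with underscores, so `a_b` is one name,
    // while `a-b` on the right-hand side is an expression.
    let a_b = a-b;
    assert_eq!(a_b, difference(a, b));
    println!("{}", a_b);
}
```

If - were allowed inside identifiers, every expression written like the one above would change meaning or stop compiling.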

@tshepang
Member

tshepang commented Mar 6, 2015

Is there any language that allows - as part of identifier?

@lambda-fairy
Contributor Author

@tshepang Lisp-family languages do, but they usually require spaces between tokens as well. I don't know of a C or ML-style language that does that.

@tshepang
Member

tshepang commented Mar 6, 2015

@lfairy I forgot about Lisp, thanks.

@dtantsur

dtantsur commented Mar 6, 2015

+1, using rustc-serialize is a bit weird at first

@ftxqxd
Contributor

ftxqxd commented Mar 7, 2015

👍 Not only is being forced to rename certain crates very frustrating, but it’s also annoying to have to choose between - and _ when choosing a crate name, and to remember which of - and _ a particular crate uses. Removing one of them seems to be a good course of action to me, and - is the best candidate for removal in my opinion by far (removing _ would be inconsistent with identifiers, having - requires renaming, and the other points already discussed).

@lifthrasiir
Contributor

I agree with @P1start's rationale. Encoding currently has both encoding_index_tests and encoding-index-singlebyte crates (and so on), primarily because they were registered by different authors with different preferences. Having one correct way would be appreciated.

@steveklabnik
Member

Big 👍 here.

@alexcrichton
Member

We talked about this RFC in today's meeting. One of the points in favor of using hyphens, which I don't think has been stressed enough, is that there is a large amount of precedent for using - in "package names". For example, GitHub repositories, Ruby gems (for namespacing), and npm all have a large and established convention of using a hyphen.

I am personally sympathetic with the hyphen convention, and I believe that going against the precedent-at-large to gain consistency is not worth it. I think that this RFC does a great job of laying out the alternatives (thanks!), and I would personally push for the fuzzy matching alternative to become the detailed design.

Of the two motivations, consistency and usability, fuzzy matching would solve the usability problem. There are probably still some details to hash out about precisely what happens, but overall I think it's the best direction to go in. We can continue the trend of hyphens set by other communities without sacrificing usability. In this situation we would probably retain the quoting syntax for renaming where necessary; it just wouldn't be needed if you used the canonical "extern crate" name for a crate.

@steveklabnik
Member

steveklabnik commented Mar 10, 2015 via email

@alexcrichton
Member

@steveklabnik could you elaborate a little more on why you think it doesn't solve the usability problem? I'm curious what use cases you have in mind and what possible extensions could be added (not that they should be, just curious!).

The specific use-case I have in mind is what's spelled out in the RFC:

extern crate "rustc-serialize" as rustc_serialize;
// vs
extern crate rustc_serialize;

@reem

reem commented Mar 10, 2015

-1 for similar reasons as @carllerche.

Hyphenated names are often meant to be renamed. For instance, I usually do extern crate "rustc-serialize" as serialize, especially now that we have $crate for macros.

Another case is distributing iron-related crates: perhaps I want to distribute a set of crates under iron-X (right now everything is just X, but that could change, and other frameworks will exist) that are meant to be imported as extern crate "iron-x" as x.

@steveklabnik
Member

@alexcrichton

Without hyphens, crate names are just identifiers. With hyphens and fuzzy matching, now I have to explain why it's not just an identifier, and what those fuzzy matchers are, when instead, we could all just use identifiers and sidestep this whole mess.

I guess that's the core of it: we can pile on rules to make hyphens look just like underscores, or we can just use underscores and call it a day.

It's not like there are a zillion packages right now, but I certainly would use a non-hyphenated package over a hyphenated one, if they provided similar functionality. It's ugly hoop-jumping for no actual benefit, imho.

@seanmonstar
Contributor

👍 while you certainly can rename to your liking, you shouldn't be forced to.

@CloudiDust
Contributor

@reem, I am thinking, would it be a good alternative to state in the guidelines that - is a "namespacing"/"tagging" symbol, and _ is for separating words in a name? So if a project foo has some component named bar baz, then the correct naming would be foo-bar_baz, not foo-bar-baz or foo_bar_baz.

But then on second thought, I'd rather we permit / as the namespacing symbol, and forbid -.

@everyone, would the above be a good alternative?

@alexcrichton
Member

Without hyphens, crate names are just identifiers. With hyphens and fuzzy matching, now I have to explain why it's not just an identifier, and what those fuzzy matchers are, when instead, we could all just use identifiers and sidestep this whole mess.

I guess that's the core of it: we can pile on rules to make hyphens look just like underscores, or we can just use underscores and call it a day.

I definitely agree that adding any form of fuzzy matching would be adding more complexity. This RFC classifies today's behavior as both complex and hindering usability. I'd like to tease apart these concerns in this discussion. You previously said "I don't think fuzzy matching really solves the usability problem", but it sounds like you're mostly worried about the complexity a fuzzy match would be adding?

To be clear, I'm not disputing that a fuzzy match adds complexity, only the claim that a fuzzy match does not solve the usability problem. I think @sfackler put it well: today a - in a crate name forces users to rename-on-import, but a fuzzy match does not have this restriction, because the "canonical name" would be the package name with hyphens translated to underscores.
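
A rough sketch of what such a fuzzy match could look like, assuming it only treats - and _ as equivalent. The helper names canonical_crate_name and fuzzy_matches are hypothetical and do not correspond to anything in rustc or Cargo.

```rust
/// Hypothetical helper: the "canonical name" of a package is its name
/// with every hyphen translated to an underscore.
fn canonical_crate_name(package: &str) -> String {
    package.replace('-', "_")
}

/// Hypothetical helper: a name written after `extern crate` matches a
/// package if both normalize to the same canonical name.
fn fuzzy_matches(requested: &str, package: &str) -> bool {
    canonical_crate_name(requested) == canonical_crate_name(package)
}

fn main() {
    // `extern crate rustc_serialize;` would find the `rustc-serialize` package.
    assert_eq!(canonical_crate_name("rustc-serialize"), "rustc_serialize");
    assert!(fuzzy_matches("rustc_serialize", "rustc-serialize"));
    // Unrelated names still do not match.
    assert!(!fuzzy_matches("rustc_serialize", "serialize"));
}
```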

It's not like there are a zillion packages right now, but I certainly would use a non-hyphenated package over a hyphenated one, if they provided similar functionality. It's ugly hoop-jumping for no actual benefit, imho.

I agree, and my motivation for pushing back on disallowing hyphens is that I like hyphens, not so much that I don't want to go migrate all the crates on crates.io :)

@alexcrichton
Member

@CloudiDust

Everyone, would the above be a good alternative?

I believe that scheme is similar to what Ruby employs, although I'm less sure about Python or NPM. It does clash, however, with the *-sys convention as well as the number 1 crate on crates.io, rustc-serialize (just as some data points).

@Valloric

Please don't add name fuzzy matching! Better leave it as is than add a possible footgun to the language. The fuzzy matching idea is much too clever for what crate importing is supposed to be doing.

Forcing everyone to just use the underscore is a better idea IMO, but damn-nigh everything is better than fuzzy matching the name. It's way too magic.

@CloudiDust
Contributor

@alexcrichton, we may fuzzy match / and - (note: both mandate renaming) instead of fuzzy matching - and _. And we may guide the community towards using / instead of -. (Or if some package uses - to separate words in a name, it will change to use _. bloom-filter looks good, bloom/filter may not be what the author intends.)

Also, we may mimic how lein (Clojure's package manager) does things, by treating a namespace-less name foo as the same as foo/foo. This worked well for Clojure, I think.

The mandatory renaming avoids name clashes nicely, without introducing the concept of package/crate namespacing into the compiler. (Besides foo = foo/foo, which can be a simple string rewrite.)

@lambda-fairy
Contributor Author

@alexcrichton

  • This means that with a package name of foo-bar Cargo will pass --crate-name foo_bar to the compiler, right?

  • In Cargo, you can name targets differently than the package itself, and I suspect we would want this to be an error, right?

    [lib]
    name = "foo-bar"

Yes and yes. I've reworded the section to clarify these points, thanks!

  • As a data point, I just checked and we have 0 crates currently that only differ on - vs _, so this should be an easy restriction to add to crates.io.

Great!
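
As an illustration of the crates.io data point above, here is a minimal sketch of a check for package names that differ only in - versus _. The function find_collisions is hypothetical; it is not the query that was actually run against crates.io.

```rust
use std::collections::HashMap;

/// Hypothetical check: returns pairs of package names that map to the
/// same crate name once hyphens are translated to underscores.
fn find_collisions<'a>(names: &[&'a str]) -> Vec<(&'a str, &'a str)> {
    // Maps each normalized name to the first package name that produced it.
    let mut seen: HashMap<String, &'a str> = HashMap::new();
    let mut collisions = Vec::new();
    for &name in names {
        let key = name.replace('-', "_");
        // Copy the stored name out so the map is free to be modified below.
        let previous = seen.get(&key).copied();
        match previous {
            Some(first) if first != name => collisions.push((first, name)),
            Some(_) => {} // exact duplicate of an earlier entry
            None => {
                seen.insert(key, name);
            }
        }
    }
    collisions
}

fn main() {
    let clash = find_collisions(&["foo-bar", "foo_bar", "baz"]);
    assert_eq!(clash, vec![("foo-bar", "foo_bar")]);
    println!("{:?}", clash);
}
```

A registry applying the restriction would reject a new name whenever a check like this reports a collision with an existing package.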

@CloudiDust
Contributor

If we are doing this (allow - in package names but not crate names), then we need a convention for -. Basically, I think we should use it for namespacing/tagging as if we are using /, so, the following are considered correct:

piston-some_component, sdl2-sys, regex-macros, bloom_filter.

(Note: macro packages are to be renamed, but I can also accept leaving them as an exception, as "foo macros" flows naturally; in contrast, "foo sys" doesn't.)

Also, when we create a new package in cargo, I hope it can warn about names that may break convention, like foo_sys, and for names containing - like foo-bar_baz, cargo can warn: "hyphens are for namespacing and tagging, does this look natural to you: foo/bar_baz?"

@lambda-fairy
Contributor Author

@CloudiDust I agree that we need an official convention on this, but I think that should be done in a separate PR.

@ncm

ncm commented Mar 14, 2015

If there is a separate mapping between file and crate names, it certainly doesn't need to be duplicated here.

Mapping "-" and "_" still seems like a wart. If it is always a suffix anyway, maybe "-" could map to "::" in identifier-land?

@ncm

ncm commented Mar 14, 2015

Or, could a "-anything" suffix be simply omitted from the identifier? "-suffixes" don't seem to disambiguate anything.

@CloudiDust
Contributor

@ncm, :: is for intra-crate namespacing, so I don't think we should do the - -> :: mapping. (And we should not be ignoring "-suffixes". Too magical.) However, you've given me a new idea.

@lfairy @alexcrichton, what about retaining - in package names, mapping them to /s in crate names, and making extern crate foo/bar_baz valid?

(Note, there are no quotes around foo/bar_baz, and the names would be bound to bar_baz, we are not doing funny things with a string literal, but / in a particular context.)

The advantage is that it is very clear hyphens/slashes are namespacing sigils. (And hyphens have to be mapped anyway.)

The disadvantage is that foo-sys (package) / foo/sys (crate) has to be renamed if there are multiple low-level ffi crates being imported. But this is no worse than the status quo.

@arthurprs

The current state of the RFC is reasonable, although I'd prefer to forbid "-" on both ends for the sake of consistency.

Does the core team expect to make a decision on this next week? I mean, the schedule is tight for the beta release.

@CloudiDust
Contributor

@arthurprs, I agree forbidding - on both ends is more consistent.

@everyone, I suppose one of the reasons to retain hyphens in package names is to avoid massive renaming on crates.io. And, hyphens are used for namespacing already.

I looked at crates.io and found that most packages I saw were (correctly) using - for namespacing and _ for word separation.

While the convention has not been officially established, I'd be very surprised if - were not going to be the official namespacing sigil.

Mapping hyphens to underscores has one important advantage: it is simple. But it has one disadvantage: the namespacing information is lost in source code. If all crates become foo_bar_bazs in source code, then it kind of defeats the purpose of using - in package names.

So hyphens -> slashes feels natural to me.

Note: extern crate foo/bar/baz binding foo/bar/baz to baz is just like use foo::bar::Baz binding foo::bar::Baz to Baz.
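
A small sketch of this proposed mapping, purely to make the binding rule concrete. The slash syntax was never adopted, and both helper names are hypothetical.

```rust
/// Hypothetical: map a package name to the proposed slash-separated
/// crate path by turning each hyphen into a slash.
fn package_to_crate_path(package: &str) -> String {
    package.replace('-', "/")
}

/// Hypothetical: the name bound in source code is the last path segment,
/// in the same way `use foo::bar::Baz` binds `Baz`.
fn binding_name(crate_path: &str) -> &str {
    // `rsplit` always yields at least one item, so the fallback only
    // exists to avoid an `unwrap`.
    crate_path.rsplit('/').next().unwrap_or(crate_path)
}

fn main() {
    let path = package_to_crate_path("foo-bar_baz");
    assert_eq!(path, "foo/bar_baz");
    assert_eq!(binding_name(&path), "bar_baz");
    // Two ffi packages would both bind `sys`, so one needs a rename.
    assert_eq!(binding_name("sdl2/sys"), binding_name("foo/sys"));
}
```

The last assertion shows the disadvantage mentioned earlier: every *-sys package would bind the same name.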

@codyps

codyps commented Mar 15, 2015

@CloudiDust which accepted RFC (or other documentation) specifies that - in crate names must be used for namespacing instead of as a normal separator?

Unless there is something, I view the discussion on namespacing crates as something that probably belongs in a separate RFC.

@lambda-fairy
Contributor Author

@CloudiDust Another disadvantage would be inconsistency: this / syntax isn't used anywhere else in the language, so it would feel strange in context. While it's an interesting idea, I'm hesitant to push such a change for this reason.

@arthurprs The RFC is written with a tight time frame in mind (all proposed changes are really easy to implement) so it shouldn't be an issue.

@jmesmon I think you're reading too much into his comment. He already mentions that "the convention has not been officially established".

@CloudiDust
Contributor

@lfairy, I'm afraid that mapping hyphens to underscores would tempt people to go "I import with underscores anyway, so I'll always name the packages with underscores only." But this may be a minor problem, and I may be worrying too much.

Let's go with this RFC as written then.

@liigo
Contributor

liigo commented Mar 16, 2015

+1
On Mar 13, 2015 9:45 AM, "Aidan Cully" notifications@github.com wrote:

I'll reiterate the comment I made on internals (http://internals.rust-lang.org/t/pre-rfc-resolve-support-for-hyphens-in-crate-names/1459/22?u=aidancully): for Rust 1.0, the emphasis should be on preserving forward compatibility with desired future language evolution. Namespaces are, I think, too large a concept to design on the 1.0 schedule, but they could impact extern crate usage. Therefore, I think we should reduce extern crate to the simplest core that can possibly work for 1.0, so that we leave room to expand it later. To me, that means disallowing hyphens in crate names (since the way we have to rename crates now may not be the way we want to rename them in the future, hyphens are a forward-compatibility risk), turning off crate renaming entirely (for now), and possibly re-adding them after an explicit design process post-1.0.

In other words, I am actually OK with the current behavior post-1.0, so long as it's the result of an explicit design process. I think 1.0 itself, though, should use the simplest system that can work: no hyphens, no crate renames.



@alexcrichton
Member

We discussed this RFC today and the reception was quite favorable of the RFC as-written. I'm going to hold off on merging for a bit to allow any final comments, but otherwise I think this is good to go, thanks @lfairy!

@alexcrichton alexcrichton self-assigned this Mar 19, 2015
@alexcrichton alexcrichton merged commit 19bf9bb into rust-lang:master Mar 19, 2015
@alexcrichton
Member

The current version of this RFC appears to have widespread support now by striking a nice balance between consistency in the language while allowing usage of naming conventions in external functions such as Cargo and crates.io. I've now merged this RFC, and thanks again @lfairy!

Tracking issue: rust-lang/rust#23533

@pnkfelix
Member

@alexcrichton was the intention here to also forbid hyphens in the "crate name" of an output binary?

It seems like we need to do something with our test suite to avoid lots of annoying warnings there.

@alexcrichton
Member

@pnkfelix currently it is invalid to manually specify --crate-name with a hyphen in it, but compiling foo-bar.rs as a binary will cause the compiler to emit foo-bar as a binary with foo_bar as the crate name (similar to what Cargo does). This means that the test suite shouldn't have any extra warnings.
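
A minimal sketch of the behavior described here, assuming the crate name is simply the file stem with hyphens replaced by underscores. The function crate_name_from_file is hypothetical and is not rustc's actual implementation.

```rust
use std::path::Path;

/// Hypothetical: derive a crate name from a source file path by taking
/// the file stem and translating hyphens to underscores. The output
/// binary itself would keep the hyphenated stem.
fn crate_name_from_file(path: &str) -> Option<String> {
    // `file_stem` strips the directory and the `.rs` extension;
    // `to_str` fails only for names that are not valid UTF-8.
    let stem = Path::new(path).file_stem()?.to_str()?;
    Some(stem.replace('-', "_"))
}

fn main() {
    assert_eq!(crate_name_from_file("foo-bar.rs"), Some("foo_bar".to_string()));
    println!("{:?}", crate_name_from_file("foo-bar.rs"));
}
```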

@pnkfelix
Member

@alexcrichton all I know is that when I currently compile tests by hand via the command lines I see from compiletest, I often see warnings.

e.g.:

% x86_64-apple-darwin/stage2/bin/rustc /Users/fklock/Dev/Mozilla/rust-lesser-box/src/test/compile-fail/borrowck-issue-14498.rs -L x86_64-apple-darwin/test/compile-fail/ --target=x86_64-apple-darwin -L x86_64-apple-darwin/test/compile-fail/borrowck-issue-14498.stage2-x86_64-apple-darwinlibaux -C prefer-dynamic -o x86_64-apple-darwin/test/compile-fail/borrowck-issue-14498.stage2-x86_64-apple-darwin --cfg rtopt --cfg debug -O -L x86_64-apple-darwin/rt
warning: crate names soon cannot contain hyphens: borrowck-issue-14498
...

(maybe I should just file a bug then.)

@alexcrichton
Member

@pnkfelix are you sure you're rebased on the current master? I believe I removed that warning very recently.

@pnkfelix
Member

@alexcrichton yeah okay I don't see it now.

@gsingh93

gsingh93 commented Apr 3, 2015

Now that quotes are removed, are crate names like static no longer valid?

Labels
A-modules Proposals relating to modules.