Pre-RFC: std aware Cargo #1

Merged
merged 10 commits into from
Mar 16, 2019
289 changes: 289 additions & 0 deletions text/0000-std-aware-cargo.md
@@ -0,0 +1,289 @@
- Feature Name: cargo_the_std_awakens
- Start Date: 2018-02-09
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

Currently, the `core` and `std` components of Rust are handled in a different way than Cargo handles other crate dependencies. This causes issues for non-mainstream targets, such as WASM, Embedded, and new not-yet-tier-1 targets. The following RFC proposes a roadmap to address these concerns in a consistent and incremental process.

You might want to give a brief summary of the "how it is achieved" here as well.

Owner Author

I was a little bit worried about repeating myself as this is explained without detail (in the Guide Level), and explained with detail (in the Reference Level).

I am unsure how to summarize without repeating the content of the Guide Level Explanation verbatim. I am open to suggestions!

I'll try to think of something to suggest.. :)

# Motivation
[motivation]: #motivation

In today's Rust environment, `core` and `std` are shipped as precompiled objects. This was done for a number of reasons, including faster compile times and a more consistent experience for users of these dependencies. This design has served the bulk of users fairly well. However, there are a number of less common uses of Rust that are not well served by this approach. Examples include:

* Supporting new/arbitrary targets, such as those defined by a ".json" file
* Making modifications to `core` or `std` through use of feature flags
* Applying different optimizations to `core` or `std`, such as `opt-level = 'z'` with `panic = "abort"`

I'd add that we'd really like to be able to have debug_assertions in core, alloc, and std. See threads like https://internals.rust-lang.org/t/make-vec-set-len-enforce-the-len-cap-invariant/8927?u=scottmcm

Owner Author

Hey @scottmcm, is this something you would see as a compile time configuration option of core, std, alloc, etc?

This RFC isn't meant to propose which features should or should not be available, but instead is focused on the method for selecting them.

@jamesmunns Yes, I do see it as a compile-time configuration option. (Ditto for having overflow checks on by default in debug but not in release -- right now that's hacked together with `#[rustc_inherit_overflow_checks]`, which I'd love to be able to remove.)

And I'm fine with the RFC not saying how those particular things should be accomplished, but I think they're valuable examples since it's the Motivation section.

Owner Author

@scottmcm Would you be interested in PR-ing (or suggesting text) for a section in "Future possibilities" to discuss a listing of possible flags to stabilize? I don't think this should impact the RFC, but might be good to centralize a list of things people are thinking of.

In general, I want to make it clear in that section that "this is a list of possible flags to stabilize, but this RFC does not guarantee any of the following will be stabilized".

Feel free to fork my repo and submit a PR to this branch, and I can accept it there.


Previously, these needs were somewhat addressed by the external tool [xargo], which managed the recompilation of these dependencies when necessary. However, this tool has since been [deprecated], and even when it was supported, it required a nightly version of the compiler for all operations.

This approach has [gathered support] from various [rust team members], and this RFC aims to take inspiration from tools and workflows like [xargo], and integrate them into Cargo itself.

[xargo]: https://github.com/japaric/xargo
[deprecated]: https://github.com/japaric/xargo/issues/193
[gathered support]: https://github.com/japaric/xargo/issues/193#issuecomment-359180429
[rust team members]: https://www.ncameron.org/blog/cargos-next-few-years/

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

This proposal aims to make `core` and `std` feel a little bit less like a special case compared to other dependencies for the end users of Cargo. It aims to minimize the number of new concepts introduced to achieve this, by letting users interact with, configure, modify, and patch `core` and `std` in a similar manner to other dependent crates.

This RFC proposes the following concrete changes, which may or may not be implemented in this order, and may be done incrementally. The details and caveats around these stages are discussed in the Reference Level Explanation.

1. Allow developers of root crates to recompile `core` (and `compiler-builtins`) when their desired target does not match one available as a `rustup target add` target, without the usage of a nightly compiler. This version of `core` would be built from the same source files used to build the current version of `rustc`/`cargo`.
2. Introduce the concept of "stable features" for `core`, which allow the end user to influence the behavior of their custom version of `core`, without the use of a nightly compiler.
3. Extend the new behaviors described in steps 1 and 2 to `std` (and `alloc`).
4. Allow the user to provide their own custom source versions of `core` and `std`, allowing for deep customizations when necessary. This will require a nightly version of the compiler.

As a new concept, the items above propose the existence of "stable features" for `core` and `std`. These features would be considered stable with the same degree of guarantees made for stability in the rest of the language. These features would allow configuration of certain functionalities of `core` or `std`, in a way decided at compile time.

How strong would the testing guarantees of stable features be? Do stable features mean a commitment to testing every combination of stable features on every tier-1 platform? If so, then that blows up the cost of CI by a factor of 2^n.


For example, we could propose a feature called `force-tiny-fmt`, which would use different algorithms to implement `fmt` for use on resource constrained systems. The developer of the root crate would be able to choose the default behavior, or the `force-tiny-fmt` behavior while still retaining the ability of using a stable compiler.

You may want to consider interactions with rust-lang#2492 ? (or if not relevant, explain why not somewhere...)

Owner Author

I am unsure how this is applicable, was that the right link? If so, could you please expand on what you mean? (The link currently goes to existential types).

The force-tiny-fmt feature flag I describe is purely theoretical, and is only used as an example for this RFC. If I need to make that more clear, please let me know.

(the link was intended)

So the way it might work is akin to `#[global_allocator]` and `#[panic_handler]` in that the standard library defines an `extern existential type DebugImplementation: SomeTrait = TheDefaultOne;` and then you can override that.

Owner Author

Ah, interesting! I was not aware of how these worked under the hood. Could you suggest any possible items to discuss here? Maybe as an addition to the open questions section?

I am proposing this as mostly just a consumer of core/std/cargo, so I am not fully aware of the full implications here (outside of discussions with some of the core team).

For now I'll cc @Ericson2314 since they wrote the RFC :)

As long as initially the features themselves are unstable (even if the mechanism for stable features exists), I wouldn't worry about that just yet :).



# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

A reference-level explanation is made for each of the items enumerated above.

## 1 - Allow developers of root crates to recompile `core`

### Use Case

For developers working with new targets not yet supported by the Rust project, this feature would allow the compilation of `core` for any target that can be specified as a valid [target json format].

[target json format]: https://rust-lang.github.io/rfcs/0131-target-specification.html

This functionality would be possible even with the use of a stable compiler.

Users of a nightly compiler would be able to set compile time feature flags for `core` through settings made in their `Cargo.toml`.

### Caveats

For users of a stable compiler, it would not be possible to modify the source code contents of `core`, or change any compile time features of `core` from the defaults used when publishing pre-compiled versions of `core`.

The source code used to build `core` would be the source that shipped with the compiler used to build the current project.

### User Interaction

When compiling for a non-standard target, users may specify their target using a json file, rather than a pre-defined target.

For example, currently a user may cross-compile by specifying a target known by Rust:

```sh
cargo build --target thumbv7em-none-eabihf
```

Users would also be able to specify a json file, by providing a path to the json file to be used.

```sh
cargo build --target thumbv7em-freertos-eabihf.json
```

In general, any of the following would prompt Cargo to recompile `core`, rather than use a pre-compiled version:

* A custom target json is used
* The root crate has modified the feature flags of `core`
* The root crate has set certain profile settings, such as opt-level, etc.
* The root crate has specified a `patch.sysroot` (this is defined in a later section)
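The conditions above can be read as a simple "any of" check. The following is a minimal sketch of that decision, not a description of how Cargo is or would be implemented. The `CoreBuildInputs` struct, its field names, and both functions are hypothetical names introduced only for illustration, and treating a `--target` value ending in `.json` as a custom target file is an assumption.

```rust
/// Hypothetical summary of the inputs Cargo would inspect.
/// None of these names come from Cargo itself.
#[derive(Debug, Default, Clone)]
pub struct CoreBuildInputs {
    /// Value passed to `--target`, if any.
    pub target: Option<String>,
    /// True if the root crate changed the feature flags of `core`.
    pub core_features_modified: bool,
    /// True if the profile sets options (opt-level, etc.) that differ
    /// from those used for the pre-compiled `core`.
    pub profile_differs: bool,
    /// True if the root crate has a `[patch.sysroot]` entry.
    pub has_sysroot_patch: bool,
}

/// Assumption: a `--target` value ending in `.json` names a custom target file.
pub fn is_custom_target(target: &str) -> bool {
    target.ends_with(".json")
}

/// Returns true if any of the four conditions from the list holds.
pub fn needs_core_rebuild(inputs: &CoreBuildInputs) -> bool {
    let custom_target = inputs.target.as_deref().map_or(false, is_custom_target);
    custom_target
        || inputs.core_features_modified
        || inputs.profile_differs
        || inputs.has_sysroot_patch
}
```

With these definitions, a build for `thumbv7em-none-eabihf` with no other changes would keep using the pre-compiled `core`, while the same build with `thumbv7em-freertos-eabihf.json` would trigger a rebuild.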

Users of a stable compiler would not be able to customize `core` outside of these profile settings.

For users of a nightly compiler, compile time features of `core` may be specified using the same syntax used for other crate dependencies. These specified features may include unstable features.

```toml
[dependencies.core]
default-features = false
features = [...]
```

It is not necessary to explicitly mention the dependency of `core`, unless changes to features are necessary.

Cargo would use the source of `core` located in the user's `SYSROOT` directory. This source code would be obtained in the same way as is necessary today, through the use of `rustup component add rust-src`. If this component is missing, Cargo would exit with an error code, and would prompt the user to execute the command specified above.
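As a rough sketch of the lookup described above, the check could look like the following. The function name `locate_core_source` and the error text are hypothetical, and the relative path in `CORE_SRC_SUBDIR` is an assumption about where the `rust-src` component places the `core` sources; the actual layout is decided by `rustup` and the Rust distribution.

```rust
use std::path::{Path, PathBuf};

/// Assumed location of the `core` sources relative to the sysroot.
/// The real layout is defined by the `rust-src` component and may differ.
pub const CORE_SRC_SUBDIR: &str = "lib/rustlib/src/rust/src/libcore";

/// Returns the directory holding the `core` sources, or an error message
/// prompting the user to install the `rust-src` component.
pub fn locate_core_source(sysroot: &Path) -> Result<PathBuf, String> {
    let candidate = sysroot.join(CORE_SRC_SUBDIR);
    if candidate.is_dir() {
        Ok(candidate)
    } else {
        Err(format!(
            "source for `core` not found under {}; run `rustup component add rust-src`",
            sysroot.display()
        ))
    }
}
```

Returning the hint as part of the error keeps the failure actionable, which matches the behavior proposed in the text: exit with an error and tell the user which command to run.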

### Technical Implications

#### Stabilization of JSON target format

As the custom target json files would become part of the stable interface of Cargo, the format used by these JSON files must be stabilized, and further changes must be made in a backwards compatible way to guarantee stability.

What risks does this pose for rustc in terms of changes in its model of specifying targets? Does it affect our ability to change the language itself?

Owner Author

IMO, this would just constrain the json file format we use to be formalized a bit, and prevent backwards-incompatible changes. From a brief conversation with @alexcrichton, it has only not been formalized because it's been nightly only anyway, and there was little demand to formalize.

OK; cool. :)

(as a matter of moving the RFC from a draft stage to being ready, I would write down reasonings such as this that you've made in the text)

I always wondered why JSON is used for target specifications. I assume for historical reasons?

I know that it would be more work, but how about switching to TOML? It would be consistent with the other configuration files used by Rust (Cargo.toml, Cargo.lock, .cargo/config) and has the big advantage that it supports comments, which seems very useful because some target options are anything but self-documenting. It would also allow grouping the options in sections, e.g. for separating architecture options (data-layout etc) from build options (linker-flavor etc) if we like.


#### Building of `compiler-builtins`

Currently, `compiler-builtins` contains components implemented in the C programming language. While these components have been highly optimized, using them would require the builder of the root crate to also have a working C compilation environment.

This RFC proposes instead to use the [pure rust implementation] when compiling for a custom target, removing the need for a C compiler.

While this may have code size or performance implications, this would allow for maximum portability.

[pure rust implementation]: https://github.com/rust-lang-nursery/compiler-builtins

#### `RUSTC_BOOTSTRAP`

It is necessary to use unstable features to build `core`. In order to allow users of a stable compiler to build `core`, we would set the `RUSTC_BOOTSTRAP` environment variable **ONLY** for the compilation of `core`.

This should be considered sound, as stable users may not change the source used to build `core`, or the features used to build `core`.
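To illustrate the "only for `core`" scoping, here is a minimal sketch that builds (but does not run) a `rustc` invocation and sets `RUSTC_BOOTSTRAP` solely when the crate being compiled is `core`. The function `rustc_command` and its parameters are hypothetical, and a real implementation would pass many more arguments; this only shows where the environment variable would be applied.

```rust
use std::process::Command;

/// Builds (but does not run) a `rustc` invocation for one crate.
/// `crate_name` and `source` are hypothetical inputs for illustration.
pub fn rustc_command(crate_name: &str, source: &str) -> Command {
    let mut cmd = Command::new("rustc");
    cmd.arg("--crate-name").arg(crate_name).arg(source);
    if crate_name == "core" {
        // Only the unmodified `core` build is given access to unstable
        // features; every other crate is compiled without the variable.
        cmd.env("RUSTC_BOOTSTRAP", "1");
    }
    cmd
}
```

Because the variable is attached to the individual `Command` rather than to the process environment, it does not leak into the compilation of the root crate or its other dependencies.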

## 2 - Introduce the concept of "stable features" for `core`

### Use Case

In some cases, it may be desirable to modify `core` in a set of predefined ways. For example, on some targets it may be necessary to have lighter weight machinery for `fmt`.

This step would provide a path for stabilization of compile time `core` features, which would be a subset of all total compile time features of `core`.

While not essential, I have an additional use case that fits with Rust's larger goals: reducing the complexity for the beginner.

To someone new to Rust, requiring the "nightly compiler" to embark on embedded development can feel unsettling. Nightly feels advanced & dangerous, stable feels safer and more secure. ("I thought embedded was a great fit for Rust, why can't the stable compiler version handle that yet?") It also increases the teaching complexity, as I've encountered writing some drafts of Rust in Action content.

@Ericson2314 Ericson2314 Feb 15, 2019

Here's the thing. A lot of the reason we don't just stabilize everything needed for low-level is it's more complex than it needs to be. There's so many things that are just....needlessly different between "regular" and embedded development. The needless barriers between standard libraries and regular libraries are just one example of this.

The switch to unstable Rustc feels spooky, but ultimately just means that more things are available. Everything that works with stable Rust also works with unstable Rust. It's more spooky than actually dangerous.

If your students ask, tell them it's so future students get a smoother experience and we aren't stuck in a situation that cannot improve like Clang/GCC and C/C++.


### Caveats

Initially, the list of stable compile time features for `core` would be empty, as none of the current features have had an explicit decision to be stable or not.

### User Interaction

Compile time features for `core` may be specified using the same Cargo.toml syntax used for other crates.

The syntax is the same when using `unstable` and `stable` features. However, `unstable` features may only be used with a nightly compiler, and use of an `unstable` feature with a stable compiler would result in a compile time error.

The syntax for these features would look as follows:

```toml
[dependencies.core]
default-features = false
features = [...]
```

It is not necessary to explicitly mention the dependency of `core`, unless changes to features are necessary.
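The stable/unstable rule described in this section can be sketched as a small validation step. This is only an illustration under stated assumptions: the `Channel` enum, the `check_core_features` function, and the idea of passing the stable feature list in as a set are all hypothetical, and the RFC does not prescribe how the list is stored or checked.

```rust
use std::collections::BTreeSet;

/// Hypothetical representation of the compiler channel in use.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Channel {
    Stable,
    Nightly,
}

/// Checks the requested `core` features against the stable list.
/// On nightly every feature is accepted; on stable, any feature that is
/// not in `stable_features` is returned in the error.
pub fn check_core_features(
    channel: Channel,
    stable_features: &BTreeSet<&str>,
    requested: &[&str],
) -> Result<(), Vec<String>> {
    if channel == Channel::Nightly {
        return Ok(());
    }
    let rejected: Vec<String> = requested
        .iter()
        .filter(|f| !stable_features.contains(*f))
        .map(|f| f.to_string())
        .collect();
    if rejected.is_empty() {
        Ok(())
    } else {
        Err(rejected)
    }
}
```

With an empty stable list, which is the proposed initial state, every requested feature is rejected on a stable compiler and accepted on nightly.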

### Technical Implications

#### Path to stabilization

The stabilization of a `core` feature flag would require a process similar to the stabilization of a feature in the language:

* Any new feature begins as unstable, requiring a nightly compiler
* When the feature is sufficiently explored, an RFC/PR can be made to `libcore` to promote this feature to stable

So we would add new feature flags to Cargo.toml of libcore? You may want to clarify this with less familiar readers...

Owner Author

Hmm, I hadn't actually considered that far. I would suppose that would be one possible way of implementation.

* When this has been accepted, the feature of `core` may be used with the stable compiler.

#### Implementation of Stable Features

There would be some mechanism of differentiating between flags used to build core, sorting them into the groups `unstable` and `stable`. This RFC does not prescribe a certain way of implementation.

That's fair; but you may want to suggest a few possible mechanisms to make this implementable.

Owner Author

Perhaps! But I am not an active developer of core or std, so I was trying to avoid something that makes sense to me, but doesn't make sense in practice.

Fair :) Maybe someone else can help out here? e.g. @alexcrichton?


## 3 - Extend the new behaviors described for `std` (and `alloc`)

### Use Case

Once the design and implications of these changes have been worked out for `core`, it will be necessary to extend these abilities to `std`, including components like `liballoc`.

### Caveats

In general, the same restrictions for building `core` will apply to building `std`. These include:

* Users of the stable compiler must use the source used to build the current rust compiler
* Only compile time features considered `stable` may be used outside of nightly. Initially the list of `stable` features would be empty, and stabilizing these features would require a PR/RFC to `libstd`.

### User Interaction

The building of `std` would respect the current build profile, including

The syntax for these features would look as follows:

```toml
[dependencies.std]
default-features = false
features = [
"force_alloc_system",
]
```

It is not necessary to explicitly mention the dependency of `std`, unless changes to features are necessary.

### Technical Implications

None beyond the technical implications listed for `core`.

## 4 - Allow the user to provide their own custom source versions of `core` and `std`

### Use Case

This will allow users of a nightly compiler to provide a custom version of `core` and `std`, without requiring the recompilation of the compiler itself.

### Caveats

As stability guarantees cannot be made around modified versions of `core` or `std`, a nightly compiler must always be used.

Why does replacing core or std with a custom version require nightly? How is it different from no_std and cargo source replacement, which are both available on stable and allow similar effects? It makes core and std special again because the user has the freedom to use custom source for any other crate but for some reason is not allowed to use custom source for core or std.

It seems to me that just replacing core or std with a custom version should be fine on stable, those custom versions should not be allowed to bypass stability like the standard version, so they will almost certainly require nightly to compile, but that's separate from the act of replacement.


### User Interaction

For this interaction, the existing `patch` syntax of Cargo.toml will be used. For example:

```toml
[patch.sysroot]
core = { path = 'my/local/core' }
std = { git = 'https://github.com/example/std' }
```

### Technical Implications

The `patch.sysroot` term will be introduced for patch when referring to components such as `std` and `core`.

# Drawbacks
[drawbacks]: #drawbacks

This RFC introduces new concepts to the use of Rust and Cargo, and could be confusing for current users of Rust who have not had to consider changes to `core` or `std` previously. However, in the normal case, most users are unlikely to need these settings, while they allow users that DO need to make changes to control important steps of the build process.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

> Why is this design the best in the space of possible designs?

This approach borrows from existing behaviors used by Cargo to allow configuration of `core` and `std`, as if they were a regular crate dependency.

This approach can also be developed and applied incrementally, allowing time to find corner cases not considered by this RFC.

> What other designs have been considered and what is the rationale for not choosing them?

To the author of this RFC's knowledge, there are no other open designs, other than the use of tools that wrap Cargo entirely, such as [xargo].

[xargo]: https://github.com/japaric/xargo

> What is the impact of not doing this?

By not doing this, Rust will continue to be difficult to use for users and platforms "on the edge", such as new platform developers or embedded and WASM users.

# Prior art
[prior-art]: #prior-art

* https://github.com/rust-lang/rfcs/pull/1133
* https://github.com/japaric/xargo

You may want to elaborate on these links and give a summary :)

rust-lang/cargo#2768 you might also want to link and also rust-lang/cargo#5002 and rust-lang/cargo#5003


# Unresolved questions
[unresolved-questions]: #unresolved-questions

## How are dependencies for `core` and `std` specified?

For example in a `no_core` or `no_std` crate, how would we tell Cargo **not** to build the `core` and/or `std` dependencies?

## Should `std` be rebuilt if `core` is rebuilt?

Is it necessary to rebuild `std` using the customized `core`, even if no changes to `std` are necessary?

## Should Cargo obtain or verify the source code for `libcore` or `libstd`?

Right now we depend on `rustup` to obtain the correct source code for these libraries, and we rely on the user not to tamper with the contents. Are these reasonable decisions?

Discuss possible ways to avoid tampering? If we rely on users not tampering, how do we communicate this effectively?

Owner Author

I guess the first thing we should decide is IF we should try to detect tampering, and whether that makes sense when the user owns their own PC anyway.

I am of the opinion that we shouldn't (as I think it is a cat and mouse game), however this was brought up by multiple people :)

> I guess the first thing we should decide is IF we should try to detect tampering, and whether that makes sense when the user owns their own PC anyway.

Right, but we should at least be sure that we can detect tampering.. ;) IOW, let's not decide we want it and find out later that we cannot.

> I am of the opinion that we shouldn't (as I think it is a cat and mouse game), however this was brought up by multiple people :)

Things like people wanting to use RUSTC_BOOTSTRAP on stable make me less sure; but maybe we can use social instead of technical means to discourage tampering?

Owner Author

IMO: Changing libcore/libstd would only affect the binaries produced by building this project, and would not be easy to "accidentally" end up with (except as a consumer of that binary). At the end of the day, anyone can patch the rust compiler for their own builds and distributions, which would defeat any measures we put in place.

Even if we bake in the CRCs of the source files into rustc, people can patch the rustc binary, or rebuild their own compiler. At the end of the day, the compiler is code running on their computer, which we can't do much about.

At best, I think tamper detection would serve only as a warning, "you changed the source, this is not supported behavior".

> At best, I think tamper detection would serve only as a warning, "you changed the source, this is not supported behavior".

If it's not too troublesome performance and implementation-wise, that seems like a decent solution; at least we have communicated what we do and do not consider stable in a direct way then. :)

Owner Author

Sure! I am just trying to avoid wasted engineering time, and even just baking the CRC of the total source into the compiler seems like it may be more trouble than it is worth. Let's see what other people's feedback is!

Yeah fair; for now you could discuss this as a possibility :)


## Should the custom built `libcore` and `libstd` reside locally or globally?

e.g., should the build artifacts be placed in `target/`, only usable by this project, or in `.cargo/`, to be possibly reused by multiple projects, if they happen to have the same settings?

## How do we handle `libcore` and `libstd`'s `Cargo.lock` file?

Right now these are built using the global lock file in `rust-lang/rust`. Should this always be true? How should Cargo handle this gracefully?

# Future possibilities
[future-possibilities]: #future-possibilities

## Unified `core` and `std`

With the mechanisms specified above, it could be possible to remove the separate concepts of `core` and `std` from the user, leaving only `std`.

By using stable feature flags for `std`, we could say that `std` as a crate with `default-features = false` would essentially be `no_core`, or with `features = ["core"]`, we would be the same as `no_std`.

This abstraction may not map to the actual implementation of `libcore` or `libstd`, but instead be an abstraction layer for the end developer.
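One way to picture the user-facing side of this idea is as a mapping from the `std` dependency settings to today's modes. The sketch below is purely illustrative: the `LegacyMode` enum, the `legacy_mode` function, and the feature name `"core"` are assumptions taken from the example in this section, and nothing here reflects how `libcore` or `libstd` would actually be structured.

```rust
/// Today's modes, as seen by the end developer.
#[derive(Debug, PartialEq, Eq)]
pub enum LegacyMode {
    /// Equivalent to `no_core`: nothing from `std` is enabled.
    NoCore,
    /// Equivalent to `no_std`: only the `core` feature is enabled.
    NoStd,
    /// The full standard library with default features.
    Std,
}

/// Maps hypothetical `[dependencies.std]` settings to the mode they would
/// correspond to today.
pub fn legacy_mode(default_features: bool, features: &[&str]) -> LegacyMode {
    if default_features {
        LegacyMode::Std
    } else if features.contains(&"core") {
        LegacyMode::NoStd
    } else {
        LegacyMode::NoCore
    }
}
```

Under this reading, `default-features = false` with no features corresponds to `no_core`, adding `features = ["core"]` corresponds to `no_std`, and leaving the defaults on corresponds to ordinary `std` usage.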

Go into how we could architect such an abstraction layer?

Owner Author

Perhaps I was unclear, but the "abstraction layer" is that everything except std goes away from the user's perspective, and today's no_core and no_std are just std with no/fewer features set.

I call it an abstraction layer, as under the hood it is unlikely (or undesirable) to try and merge libcore, liballoc, libstd, et al., for a number of reasons (discussed by the libs team at all-hands).

Let me know how much of that you would like me to add to the RFC or expand upon :)

> Let me know how much of that you would like me to add to the RFC or expand upon :)

Mostly, I think it would be good to consider how we might architect the facade that core and alloc would presumably turn into... possibly with some code snippets as examples.