Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pre-RFC: std aware Cargo #1

Merged
merged 10 commits into from
Mar 16, 2019
335 changes: 335 additions & 0 deletions text/0000-std-aware-cargo.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,335 @@
- Feature Name: cargo_the_std_awakens
- Start Date: 2018-02-09
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

As of Rust 1.33.0, the `core` and `std` components of Rust are handled in a different way than Cargo handles other crate dependencies. This causes issues for non-mainstream targets, such as WASM, Embedded, and new not-yet-tier-1 targets. The following RFC proposes a roadmap to address these concerns in a consistent and incremental process.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to give a brief summary of the "how it is achieved" here as well.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a little bit worried about repeating myself as this is explained without detail (in the Guide Level), and explained with detail (in the Reference Level).

I am unsure how to summarize without repeating the content of the Guide Level Explanation verbatim. I am open to suggestions!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to think of something to suggest.. :)

# Motivation
[motivation]: #motivation

In today's Rust environment, `core` and `std` are shipped as precompiled objects. This was done for a number of reasons, including faster compile times, and a more consistent experience for users of these dependencies. This design has served the bulk of users fairly well. However there are a number of less common uses of Rust, that are not well served by this approach. Examples include:

* Supporting new/arbitrary targets, such as those defined by a custom target (".json") file
* Modifying `core` or `std` through use of feature flags
* Users who would like to make different optimizations to `core` or `std`, such as `opt-level = 's'`, with `panic = "abort"`

Previously, these needs were somewhat addressed by the external tool [xargo], which managed the recompilation of these dependencies when necessary. However, this tool has become [deprecated], and even when supported, required a nightly version of the compiler for all operation.

This approach has [gathered support] from various [rust team members]. This RFC aims to take inspiration from tools and workflows used by tools like [xargo], integrating them into Cargo itself.

[xargo]: https://github.com/japaric/xargo
[deprecated]: https://github.com/japaric/xargo/issues/193
[gathered support]: https://github.com/japaric/xargo/issues/193#issuecomment-359180429
[rust team members]: https://www.ncameron.org/blog/cargos-next-few-years/

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

This proposal aims to make `core` and `std` feel a little bit less like a special case compared to other dependencies to the end users of Cargo. This proposal aims to minimize the number of new concepts introduced to achieve this, by interacting, configuring, modifying, and patching `core` and `std` in a similar manner to other dependent crates.

This RFC proposes the following concrete changes, which may or may not be implemented in this order, and may be done incrementally. The details and caveats around these stages are discussed in the [Reference Level Explanation][reference-level-explanation].

In this document, we use the term "root crate" to refer to the Rust project being built directly by Cargo. This crate contains the Cargo.toml used to guide the modifications described below. This would typically be a crate containing a binary application, or a standalone item, such as an `rlib`.

1. Allow developers of root crates to recompile `core` (and `compiler-builtins`) when their desired target does not match one available as a `rustup target add` target, without the usage of a nightly compiler. This version of `core` would be built from the same source files used to build the current version of `rustc`/`cargo`.
2. Allow the usage of Cargo features with the `core` library, additionally introducing the concept of "stable features" for `core`, which allow the end user to influence the behavior of their custom version of `core` without the use of a nightly compiler.
3. Extend the new behaviors described in step 1 and 2 for `std` (and `alloc`).
4. Allow the user to provide their own custom source versions of `core` and `std`, allowing for deep customizations when necessary. This will require a nightly version of the compiler.

As a new concept, the items above propose the existence of "stable features" for `core` and `std`. These features would be considered stable with the same degree of guarantees made for stability in the rest of the language. These features would allow configuration of certain functionalities of `core` or `std`, in a way decided at compile time.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How strong would the testing guarantees of stable features be? Do stable features mean a commitment to testing every combination of stable features on every tier-1 platform? If so, then that blows up the cost of CI by a factor of 2n.


For example, we could propose a feature called `force-tiny-fmt`, which would use different algorithms to implement `fmt` for use on resource constrained systems. The developer of the root crate would be able to choose the default behavior, or the `force-tiny-fmt` behavior while still retaining the ability of using a stable compiler.
jamesmunns marked this conversation as resolved.
Show resolved Hide resolved
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to consider interactions with rust-lang#2492 ? (or if not relevant, explain why not somewhere...)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unsure how this is applicable, was that the right link? If so, could you please expand on what you mean? (The link currently goes to existential types).

The force-tiny-fmt feature flag I describe is purely theoretical, and is only used as an example for this RFC. If I need to make that more clear, please let me know.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(the link was intended)

So the way it might work is akin to #[global_allocator] and #[panic_handler] in that the standard library defines an extern existential type DebugImplementation: SomeTrait = TheDefaultOne; and then you can override that.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, interesting! I was not aware of how these worked under the hood. Could you suggest any possible items to discuss here? Maybe as an addition to the open questions section?

I am proposing this as mostly just a consumer of core/std/cargo, so I am not fully aware of the full implications here (outside of discussions with some of the core team).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I'll cc @Ericson2314 since they wrote the RFC :)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as initially the features themselves are unstable (even if the mechanism for stable features exists), I wouldn't worry about that just yet :).



# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

A reference-level explanation is made for each of the items enumerated above.

## 1 - Allow developers of root crates to recompile `core`

### Use Case

For developers working with new targets not yet supported by the Rust project, this feature would allow the compilation of `core` for any target that can be specified as a valid [custom target specification].

[custom target specification]: https://rust-lang.github.io/rfcs/0131-target-specification.html

This functionality would be possible even with the use of a stable compiler.

Users of a nightly compiler would be able to set compile time feature flags for `core` through settings made in their `Cargo.toml`.

### Caveats

For users of a stable compiler, it would not be possible to modify the source code contents of `core`, or change any compile time features of `core` from the defaults used when publishing pre-compiled versions of `core`.

The source code used to build `core` would be the same as the compiler used for building the current project.

### User Interaction

When compiling for a non-standard target, users may specify their target using a target specification file, rather than a pre-defined target.

> NOTE: The current target specification is described in JSON, and contains some
> implementation details regarding the use of LLVM as the compiler backend. This
> RFC does not prescribe any changes to the Target Specification format, and is
> intended to work with whatever the current/stable method of specifying a
> custom target is.

For example, currently a user may cross-compile by specifying a target known by Rust:

```sh
cargo build --target thumbv7em-none-eabihf
```

Users would also be able to specify a target specification file, by providing a path to the file to be used.

```sh
cargo build --target thumbv7em-freertos-eabihf.json
```

In general, any of the following would prompt Cargo to recompile `core`, rather than use a pre-compiled version:

* A custom target specification is used
* The root crate has modified the feature flags of `core`
* The root crate has set certain profile settings, such as opt-level, etc.
* The root crate has specified a `patch.sysroot` (this is defined in a later section)

Users of a stable compiler would not be able to customize `core` outside of these profile settings.

For users of a nightly compiler, compile time features of `core` may be specified using the same syntax used for other crate dependencies. These specified features may include unstable features.

```toml
[dependencies.core]
default-features = false
features = [...]
```

It is not necessary to explicitly mention the dependency of `core`, unless changes to features are necessary.

Cargo would use the source of `core` located in the user's `SYSROOT` directory. This source code would be obtained in the same was as necessary today, through the use of `rustup component add rust-src`. If this component is missing, Cargo would exit with an error code, and would prompt the user to execute the command specified above.

### Technical Implications

#### Stabilization of a Target Specification Format

As the custom target specifications (currently JSON) would become part of the stable interface of Cargo. The format used by this file must become stabilized, and further changes must be made in a backwards compatible way to guarantee stability.

#### Building of `compiler-builtins`

Currently, `compiler-builtins` contains components implemented in the C programming language. While these dependencies have been highly optimized, the use of them would require the builder of the root crate to also have a working compilation environment for compilation in C.

This RFC proposes instead to use the [pure rust implementation] when compiling for a custom target, removing the need for a C compiler.

While this may have code size or performance implications, this would allow for maximum portability.

[pure rust implementation]: https://github.com/rust-lang-nursery/compiler-builtins

#### `RUSTC_BOOTSTRAP`

It is necessary to use unstable features to build `core`. To allow users of a stable compiler to build `core`, we would set the `RUSTC_BOOTSTRAP` environment variable **ONLY** for the compilation of `core`.

This should be considered sound, as stable users may not change the source used to build `core`, or the features used to build `core`.

## 2 - Introduce the concept of "stable features" for `core`

### Use Case

In some cases, it may be desirable to modify `core` in set of predefined manners. For example, on some targets it may be necessary to have lighter weight machinery for `fmt`.

This step would provide a path for stabilization of compile time `core` features, which would be a subset of all total compile time features of `core`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While not essential, I have an additional use case that fits with Rust's larger goals: reducing the complexity for the beginner.

To someone new to Rust, requiring the "nightly compiler" to embark on embedded development can feel unsettling. Nightly feels advanced & dangerous, stable feels safer and more secure. ("I thought embedded was a great fit for Rust, why can't the stable compiler version handle that yet?") It also increases the teaching complexity, as I've encountered writing some drafts of Rust in Action content.

Copy link

@Ericson2314 Ericson2314 Feb 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the thing. A lot the reason we don't just stabilize everything needed for low-level is it's more complex than it needs to be. There's so many things that are just....needlessly different between "regular" and embedded development. The needless barriers between standard libraries and regular libraries are just one example of this.

The switch to unstable Rustc feels spooky, but ultimately just means that more things are available. Everything that works with stable Rust also works with unstable Rust. It's more spooky than actually dangerous.

If you students ask, tell them it's so future students get a smoother experience and we aren't stuck in a situation that cannot improve like Clang/GCC and C/C++.


### Caveats

Initially, the list of stable compile time features for `core` would be empty, as none of the current features have had an explicit decision to be stable or not.

### User Interaction

Compile time features for `core` may be specified using the same Cargo.toml syntax used for other crates.

The syntax is the same when using `unstable` and `stable` features, however the former may only be used with a nightly compiler, and use of an `unstable` feature with a stable compiler would result in a compile time error.

The syntax for these features would look as follows:

```toml
[dependencies.core]
default-features = false
features = [...]
```

It is not necessary to explicitly mentione the dependency of `core`, unless changes to features are necessary.

### Technical Implications

#### Path to stabilization

The stabilization of a `core` feature flag would require a process similar to the stabilization of a feature in the language:

* Any new feature begins as unstable, requiring a nightly compiler
* When the feature is sufficiently explored, an RFC/PR can be made to `libcore` to promote this feature to stable
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we would add new feature flags to Cargo.toml of libcore? You may want to clarify this with less familiar readers...

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I hadn't actually considered that far. I would suppose that would be one possible way of implementation.

* When this has been accepted, the feature of `core` may be used with the stable compiler.

#### Implementation of Stable Features

There would be some mechanism of differentiating between flags used to build core, sorting them into the groups `unstable` and `stable`. This RFC does not prescribe a certain way of implementation.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair; but you may want to suggest a few possible mechanisms to make this implementable.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps! But I am not an active developer of core or std, so I was trying to avoid something that makes sense to me, but doesn't make sense in practice.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair :) Maybe someone else can help out here? e.g. @alexcrichton?


## 3 - Extend the new behaviors described for `std` (and `alloc`)

### Use Case

Once the design and implications of the changes have been made for `core`, it will be necessary to extend these abilities for `std`, including components like `liballoc`.

### Caveats

In general, the same restrictions for building `core` will apply to building `std`. These include:

* Users of the stable compiler must use the source used to build the current rust compiler
* Only compile time features considered `stable` may be used outside of nightly. Initially the list of `stable` features would be empty, and stabilizing these features would require a PR/RFC to `libstd`.

### User Interaction

The building of `std` would respect the current build profile, including optimization settings.

The syntax for these features would look as follows:

```toml
[dependencies.std]
default-features = false
features = [
"force_alloc_system",
]
```

It is not necessary to explicitly mention the dependency of `std`, unless changes to features are necessary.

### Technical Implications

None beyond the technical implications listed for `core`.

## 4 - Allow the user to provide their own custom source versions of `core` and `std`

### Use Case

This will allow users of a nightly compiler to provide a custom version of `core` and `std`, without requiring the recompilation of the compiler itself.

### Caveats

As stability guarantees cannot be made around modified versions of `core` or `std`, a nightly compiler must always be used.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why replacing core or std with a custom version requires nightly? How is it different from no_std and cargo source replacement which are both available on stable and allow similar effects? It makes core and std special again because the user has the freedom to use custom source for any other crate but for some reason is not allowed to use custom source for core or std.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that just replacing core or std with a custom version should be fine on stable, those custom versions should not be allowed to bypass stability like the standard version, so they will almost certainly require nightly to compile, but that's separate from the act of replacement.


### User Interaction

For this interaction, the existing `patch` syntax of Cargo.toml will be used. For example:

```toml
[patch.sysroot]
core = { path = 'my/local/core' }
std = { git = 'https://github.com/example/std' }
```

> NOTE: The use of `sysroot` as a category may be changed to a less loaded
> category name. This is likely an area for bikeshedding. `sysroot` will be
> used for the remainder of the document for consistency.

### Technical Implications

The `patch.sysroot` term will be introduced for patch when referring to components such as `std` and `core`.

# Drawbacks
[drawbacks]: #drawbacks

This RFC introduces new concepts to the use of Rust and Cargo, and could be confusing for current users of Rust who have not had to consider changes to `core` or `std` previously. However, in the normal case, most users are unlikely to need these settings, while they allow users that DO need to make changes to control important steps of the build process.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

> Why is this design the best in the space of possible designs?

This approach borrows from existing behaviors used by Cargo to allow configuration of `core` and `std`, as if they were a regular crate dependency.

This approach also offers an approach that can be developed and applied incrementally, allowing for time to find coner cases not considered by this RFC

> What other designs have been considered and what is the rationale for not choosing them?

To the author of this RFC's knowledge, there are no other open designs, other than the use tools that wrap Cargo entirely, such as [xargo].

[xargo]: https://github.com/japaric/xargo

> What is the impact of not doing this?

By not doing this, Rust will continue to be difficult to use for users and platforms "on the edge", such as new platform developers or embedded and WASM users.

# Prior art
[prior-art]: #prior-art

* [RFC1133] - This RFC from 2015 proposed making cargo aware of std. I still need to review in more detail to find the parts and syntax that may solve some open questions.
* [xargo] - This external tool was used to achieve a similar workflow as described above, limited to use with a nightly compiler
* [Cargo Issue 5002] - This issue proposed a syntax for explicit dependency on std
* [Cargo Issue 5003] - This issue discussed how to be backwards compatible with crates that don't explicitly mention std

[RFC1133]: https://github.com/rust-lang/rfcs/pull/1133
[Cargo Issue 5002]: https://github.com/rust-lang/cargo/issues/5002
[Cargo Issue 5003]: https://github.com/rust-lang/cargo/issues/5003

# Unresolved questions
[unresolved-questions]: #unresolved-questions

## How are dependencies (or non-dependency) on `core` and `std` specified?

For example in a `no_core` or `no_std` crate, how would we tell Cargo **not** to build the `core` and/or `std` dependencies?

## Should `std` be rebuilt if `core` is rebuilt?

Is it necessary to rebuild `std` using the customized `core`, even if no changes to `std` are necessary?

## Should Cargo obtain or verify the source code for `libcore` or `libstd`?

Right now we depend on `rustup` to obtain the correct source code for these libraries, and we rely on the user not to tamper with the contents. Are these reasonable decisions?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discuss possible ways to avoid tampering? If we rely on users not tampering, how do we communicate this effectively?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the first thing we should decide is IF we should try to detect tampering, and whether that makes sense when the user owns their own PC anyway.

I am of the opinion that we shouldn't (as I think it is a cat and mouse game), however this was brought up by multiple people :)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the first thing we should decide is IF we should try to detect tampering, and whether that makes sense when the user owns their own PC anyway.

Right, but we should at least be sure that we can detect tampering.. ;) IOW, let's not decide we want it and find out later that we cannot.

I am of the opinion that we shouldn't (as I think it is a cat and mouse game), however this was brought up by multiple people :)

Things like people wanting to use RUSTC_BOOTSTRAP on stable make me less sure; but maybe we can use social instead of technical means to discourage tampering?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO: Changing libcore/libstd would only affect the binaries produced by building this project, and would not be easy to "accidentally" end up with (except as a consumer of that binary). At the end of the day, anyone can patch the rust compiler for their own builds and distributions, which would defeat any measures we put in place.

Even if we bake in the CRCs of the source files into rustc, people can patch the rustc binary, or rebuild their own compiler. At the end of the day, the compiler is code running on their computer, which we can't do much about.

At best, I think tamper detection would serve only as a warning, "you changed the source, this is not supported behavior".

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At best, I think tamper detection would serve only as a warning, "you changed the source, this is not supported behavior".

If it's not too troublesome performance and implementation-wise, that seems like a decent solution; at least we have communicated what we do and do not consider stable in a direct way then. :)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! I am just trying to avoid wasted engineering time, and even just baking the CRC of the total source into the compiler seems like it may be more trouble than it is worth. Let's see what other people's feedback is!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah fair; for now you could discuss this as a possibility :)


## Should the custom built `libcore` and `libstd` reside locally or globally?

e.g., should the build artifacts be placed in `target/`, only usable by this project, or in `.cargo/`, to be possibly reused by multiple projects, if they happen to have the same settings?

## How do we handle `libcore` and `libstd`'s `Cargo.lock` file?

Right now these are built using the global lock file in `rust-lang/rust`. Should this always be true? How should Cargo handle this gracefully?

## Should profile changes always prompt a rebuild of `core`/`std`?

For example, if a user sets their debug build to use `opt-level = 'z'`, should this rebuild `core`/`std` to use that opt level? Or should an additional flag, such as `apply-to-sysroot` be required to opt-in to this behavior, unless otherwise needed?

This could increase compile times for users that have set profile overrides, but have not previously needed a custom `core` or `std`.

Another option in this area is to force the use of profile overrides, as specified by [RFC2822](https://github.com/rust-lang/rfcs/blob/master/text/2282-profile-dependencies.md).

## Should providing a custom `core` or `std` require a nightly compiler?

It is currently unknown whether it is possible to provide a custom version of `core` or `std` without unstable features, as there are some compiler intrinsics and "magic" that are necessary (the format macros and box keyword come to mind).

I initially wrote the RFC in this manner, however I was later convinced this was not possible to do.

I am of the opinion that if you could, then it should be allowed to use a stable compiler, but that might be too theoretical for this RFC.

We could also move forward with the current restriction to nightly, and allow that to be lifted later by a follow-on RFC if this is possible and necessary.

## Should we allow configurable `core` and `std`

If we are to uphold stability guarantees for all configurations of `core` and `std`, this could require testing 2^(n+m) versions of Rust, where `n` is the number of `core` features, and `m` is the number of `std` features. This would have a negative impact on CI times.

# Future possibilities
[future-possibilities]: #future-possibilities

## Unified `core` and `std`

With the mechanisms specified above, it could be possible to remove the concept of `core` and `std` from the user, leaving only `std`.

By using stable feature flags for `std`, we could say that `std` as a crate with `default-features = false` would essentially be `no_core`, or with `features = ["core"]`, we would be the same as `no_std`.

This abstraction may not map to the actual implementation of `libcore` or `libstd`, but instead be an abstraction layer for the end developer.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go into how we could architect such an abstraction layer?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps I was unclear, but the "abstraction layer" is that everything except std goes away from the user's perspective, and today's no_core and no_std are just std with no/fewer features set.

I call it an abstraction layer, as under the hood it is unlikely (or undesirable) to try and merge libcore, liballoc, libstd, et. al due to a number of reasons (discussed by the libs team at all-hands).

Let me know how much of that you would like me to add to the RFC or expand upon :)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know how much of that you would like me to add to the RFC or expand upon :)

Mostly, I think it would be good to consider how we might architect the facade that core and alloc would presumably turn into... possibly with some code snippets as examples.


## Stop shipping pre-compiled `core` and `std`

With the ability to build these crates on demand, we may want to decide not to ship `target` bundles for any users.

This would come at a cost of increased compile times, at least for the first build, if the artifacts are cached globally. However it would remove a mental snag of having to sometimes run `rustup target add`, and confusion from some users why parts of `std` and `core` have different optimization settings (particularly for debug builds) when debugging.