# Auto merge of #7216 - ehuss:build-std, r=alexcrichton
Basic standard library support.

This is not intended to be useful to anyone. If people want to try it, that's great, but do not rely on this. This is only for experimenting and setting up for future work.

This adds a flag `-Zbuild-std` to build the standard library with a project. The flag can also take optional comma-separated crate names, like `-Zbuild-std=core`. The default is `std,core,panic_unwind,compiler_builtins`.
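
For example, a hypothetical invocation (nightly only; the target triple here is just illustrative, since `--target` is currently required and the `rust-src` component must be installed, as noted below):

```console
$ rustup component add rust-src
$ cargo build -Zbuild-std --target x86_64-unknown-linux-gnu
$ cargo build -Zbuild-std=core --target x86_64-unknown-linux-gnu
```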

Closes rust-lang/wg-cargo-std-aware#10.

Note: I can probably break some of the refactoring into smaller PRs if necessary.

## Overview
The general concept here is to use two resolvers, and to combine everything in the Unit graph. There are a number of changes to support this:

- A synthetic workspace for the standard library is created to set up the patches and members correctly.
- Decouple `unit_dependencies` from `Context` to make it easier to manage.
- Add `features` to `Unit` to keep it unique and to remove the need to query a resolver.
- Add a `UnitDep` struct which encodes the edges between `Unit`s (a sketch follows this list). This removes the need to query a resolver for `extern_crate_name` and `public`.
- Remove `Resolver` from `BuildContext` to avoid any confusion and to keep the complexity focused in `unit_dependencies`.
- Remove `Links` from `Context` since it used the resolver. Adjusted so that instead of checking `links` at runtime, they are all checked at once in the beginning. Note that it does not check `links` for the standard library, but it should be safe? I think `compiler-rt` is the only `links` there?
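
For orientation, here is a minimal sketch of the new edge and graph types; the fields are inferred from how they are used in the diff below, and the real definitions live in `unit_dependencies`:

```rust
use std::collections::HashMap;

use crate::core::compiler::Unit;
use crate::core::InternedString;

/// Sketch only: one edge in the unit graph, carrying the information
/// that previously had to be queried from a resolver.
pub struct UnitDep<'a> {
    /// The dependency unit itself.
    pub unit: Unit<'a>,
    /// The name the dependency is linked under, i.e. what is passed to
    /// rustc as `--extern <name>=<path>`.
    pub extern_crate_name: InternedString,
    /// Whether this is a public dependency (`-Z public-dependency`).
    pub public: bool,
}

/// Every unit mapped to its direct dependencies.
pub type UnitGraph<'a> = HashMap<Unit<'a>, Vec<UnitDep<'a>>>;
```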

I currently went with a strategy of linking the standard library dependencies using `--extern` (instead of `--sysroot` or `-L`). This has some benefits but some significant drawbacks. See below for some questions.
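
Concretely, linking with `--extern` means the freshly built std artifacts are passed explicitly on the final rustc command line, roughly in this shape (paths and hashes are illustrative placeholders):

```console
rustc src/main.rs \
    --extern std=target/<triple>/debug/deps/libstd-<hash>.rlib \
    --extern compiler_builtins=target/<triple>/debug/deps/libcompiler_builtins-<hash>.rlib
```

The `--sysroot` alternative would instead assemble these artifacts into a sysroot-shaped directory tree and pass that single flag.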

## For future PRs
- Add Cargo.toml support. See rust-lang/wg-cargo-std-aware#5
- Source is not downloaded. It assumes you have run `rustup component add rust-src`. See rust-lang/wg-cargo-std-aware#11
- `cargo metadata` does not include any information about std. I don't know how this should work.
- `cargo clean` is not std-aware.
- `cargo fetch` does not fetch std dependencies.
- `cargo vendor` does not vendor std dependencies.
- `cargo pkgid` is not std-aware.
- `--target` is required on the command-line. This should default to host-as-target.
- `-p` is not std-aware.
- A synthetic `Cargo.toml` workspace is created which has to know about things like `rustc-std-workspace-core`. Perhaps rust-lang/rust should publish the source with this `Cargo.toml` already created?
- `compiler_builtins` uses default features (pure Rust implementation, etc.). See rust-lang/wg-cargo-std-aware#15
    - `compiler_builtins` may need to be built without debug assertions, see [this](https://github.com/rust-lang/rust/blob/8e917f48382c6afaf50568263b89d35fba5d98e4/src/bootstrap/bin/rustc.rs#L210-L214). Could maybe use profile overrides (a sketch follows this list).
- Panic issues:
    - `panic_abort` is not yet supported, though it should probably be easy. It could maybe look at the profile to determine which panic implementation to use? This requires more hard-coding in Cargo to know about rustc implementation details.
    - [This](https://github.com/rust-lang/rust/blob/8e917f48382c6afaf50568263b89d35fba5d98e4/src/bootstrap/bin/rustc.rs#L186-L201) should probably be handled where `panic` is set for `panic_abort` and `compiler_builtins`. I would like to get a test case for it. This can maybe be done with profile overrides?
- Using two resolvers is quite messy and causes a lot of complications. It would be ideal if it could only use one, though that may not be possible for the foreseeable future. See rust-lang/wg-cargo-std-aware#12
- Features are hard-coded. See rust-lang/wg-cargo-std-aware#13
- Lots of various platform-specific support is not included (musl, wasi, windows-gnu, etc.).
- The default `backtrace` is used, which requires a C compiler. See rust-lang/wg-cargo-std-aware#16
- Sanitizers are not built. See rust-lang/wg-cargo-std-aware#17
- proc_macro has some hacky code to synthesize its dependencies. See rust-lang/wg-cargo-std-aware#18. This may not be necessary if this uses `--sysroot` instead.
- Profile overrides cause weird linker errors.
  That is:
  ```toml
  [profile.dev.overrides.std]
  opt-level = 2
  ```
  Using `[profile.dev.overrides."*"]` works (a sketch follows this list). I tried fiddling with it, but couldn't figure it out.
  We may also want to consider altering the syntax for profile overrides. Having to repeat the same profile for `std` and `core` and `alloc` and everything else would not be ideal.
- ~~`Context::unit_deps` does not handle build overrides, see #7215.~~ FIXED
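
For reference, a sketch of the profile override forms discussed above; the `compiler_builtins` entry is a hypothetical, untested application of overrides to the debug-assertions issue:

```toml
# Per-crate override: currently produces the weird linker errors noted above.
[profile.dev.overrides.std]
opt-level = 2

# Wildcard override: applies to all dependencies, including std; this works.
[profile.dev.overrides."*"]
opt-level = 2

# Hypothetical: a targeted override for the compiler_builtins
# debug-assertions issue (untested).
[profile.dev.overrides.compiler_builtins]
debug-assertions = false
```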

## Questions for this PR
- I went with the strategy of using `--extern` to link the standard lib. This seems to work, and I haven't found any problems, but it seems risky. It also forces Cargo to know about certain implicit dependencies like `compiler_builtins` and `panic_*`. The alternative is to create a sysroot, copy all the crates to that directory, and pass `--sysroot`. However, this is complicated by pipelining, which would require special support to copy `.rmeta` files when they are generated. Let me know if you think I should use a different strategy. I'm on the fence here; using `--sysroot` may be safer, but it adds more complexity.
    - As an aside, if rustc ever tries to grab a crate from the sysroot that was not passed in via `--extern`, it results in duplicate lang items. For example, saying `extern crate proc_macro;` without specifying `proc_macro` as a dependency. We could prevent rustc from ever trying by passing `--sysroot=/nonexistent` (a sketch follows this list), or by adding an equivalent flag to rustc.
- How should this be tested? I added a janky integration test, but it has some drawbacks. It requires internet access. It is slow. Since it is slow, it reuses the same target directory for multiple tests, which makes it awkward to work with.
    - What interesting things are there to test?
    - We may want to disable the test before merging if it seems too annoying to make it the default. It requires rust-src to be downloaded, takes several minutes to run, and is somewhat platform-dependent.
- How to test that it is actually linking the correct standard library? I did tests locally with a modified libcore, but I can't think of a good way to do that in the test suite.
- I did not add `__CARGO_DEFAULT_LIB_METADATA` to the hash. I had a hard time coming up with a test case where it would matter.
    - My only thought is that it is a problem because libstd includes a dylib, which prevents the hash from being added to the filename. It does cause recompiles when switching between compilers, for example, when it normally wouldn't.
    - Very dumb question: Why exactly does libstd include a dylib? This can cause issues (see rust-lang/rust#56443).
    - This should probably change, but I want to better understand it first.
- The `bin_nostd` test needs to link libc on linux, and I'm not sure I understand why. I'm concerned there is something wrong there. libstd does not do that AFAIK.
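
A sketch of the `--sysroot=/nonexistent` idea from the first question (hypothetical, building on the `--extern` invocation sketched in the overview):

```console
# With every std crate passed explicitly via --extern, pointing the sysroot
# at a nonexistent directory means rustc can never silently fall back to a
# prebuilt crate and hit duplicate lang items.
rustc src/main.rs \
    --extern std=target/<triple>/debug/deps/libstd-<hash>.rlib \
    --sysroot /nonexistent
```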
bors committed Sep 3, 2019
2 parents 5da11a5 + 9382be1 commit 49fbf52
Showing 25 changed files with 1,117 additions and 385 deletions.
5 changes: 5 additions & 0 deletions ci/azure-test-all.yml
@@ -16,6 +16,11 @@ steps:
- bash: rustup component add clippy || echo "clippy not available"
displayName: "Install clippy (maybe)"

# This is needed for standard library tests.
- bash: rustup component add rust-src
condition: and(succeeded(), eq(variables['TOOLCHAIN'], 'nightly'))
displayName: "Install rust-src (maybe)"

# Deny warnings on CI to avoid warnings getting into the codebase, and note the
# `force-system-lib-on-osx` which is intended to fix compile issues on OSX where
# compiling curl from source on OSX yields linker errors on Azure.
10 changes: 9 additions & 1 deletion src/cargo/core/compiler/build_config.rs
@@ -156,7 +156,7 @@ pub enum CompileMode {
/// a test.
Check { test: bool },
/// Used to indicate benchmarks should be built. This is not used in
/// `Target`, because it is essentially the same as `Test` (indicating
/// `Unit`, because it is essentially the same as `Test` (indicating
/// `--test` should be passed to rustc) and by using `Test` instead it
/// allows some de-duping of Units to occur.
Bench,
@@ -221,6 +221,14 @@
}
}

/// Returns `true` if this is something that passes `--test` to rustc.
pub fn is_rustc_test(self) -> bool {
match self {
CompileMode::Test | CompileMode::Bench | CompileMode::Check { test: true } => true,
_ => false,
}
}

/// Returns `true` if this is the *execution* of a `build.rs` script.
pub fn is_run_custom_build(self) -> bool {
self == CompileMode::RunCustomBuild
16 changes: 1 addition & 15 deletions src/cargo/core/compiler/build_context/mod.rs
@@ -8,7 +8,7 @@ use crate::core::compiler::unit::UnitInterner;
use crate::core::compiler::{BuildConfig, BuildOutput, Kind, Unit};
use crate::core::profiles::Profiles;
use crate::core::{Dependency, Workspace};
use crate::core::{PackageId, PackageSet, Resolve};
use crate::core::{PackageId, PackageSet};
use crate::util::errors::CargoResult;
use crate::util::{profile, Cfg, Config, Rustc};

@@ -26,8 +26,6 @@ pub struct BuildContext<'a, 'cfg> {
pub ws: &'a Workspace<'cfg>,
/// The cargo configuration.
pub config: &'cfg Config,
/// The dependency graph for our build.
pub resolve: &'a Resolve,
pub profiles: &'a Profiles,
pub build_config: &'a BuildConfig,
/// Extra compiler args for either `rustc` or `rustdoc`.
@@ -48,7 +46,6 @@ pub struct BuildContext<'a, 'cfg> {
impl<'a, 'cfg> BuildContext<'a, 'cfg> {
pub fn new(
ws: &'a Workspace<'cfg>,
resolve: &'a Resolve,
packages: &'a PackageSet<'cfg>,
config: &'cfg Config,
build_config: &'a BuildConfig,
@@ -75,7 +72,6 @@ impl<'a, 'cfg> BuildContext<'a, 'cfg> {

Ok(BuildContext {
ws,
resolve,
packages,
config,
rustc,
@@ -90,16 +86,6 @@ impl<'a, 'cfg> BuildContext<'a, 'cfg> {
})
}

pub fn extern_crate_name(&self, unit: &Unit<'a>, dep: &Unit<'a>) -> CargoResult<String> {
self.resolve
.extern_crate_name(unit.pkg.package_id(), dep.pkg.package_id(), dep.target)
}

pub fn is_public_dependency(&self, unit: &Unit<'a>, dep: &Unit<'a>) -> bool {
self.resolve
.is_public_dep(unit.pkg.package_id(), dep.pkg.package_id())
}

/// Whether a dependency should be compiled for the host or target platform,
/// specified by `Kind`.
pub fn dep_platform_activated(&self, dep: &Dependency, kind: Kind) -> bool {
4 changes: 2 additions & 2 deletions src/cargo/core/compiler/compilation.rs
@@ -6,7 +6,7 @@ use std::path::PathBuf;
use semver::Version;

use super::BuildContext;
use crate::core::{Edition, Package, PackageId, Target};
use crate::core::{Edition, InternedString, Package, PackageId, Target};
use crate::util::{self, join_paths, process, CargoResult, CfgExpr, Config, ProcessBuilder};

pub struct Doctest {
@@ -16,7 +16,7 @@ pub struct Doctest {
pub target: Target,
/// Extern dependencies needed by `rustdoc`. The path is the location of
/// the compiled lib.
pub deps: Vec<(String, PathBuf)>,
pub deps: Vec<(InternedString, PathBuf)>,
}

/// A structure returning the result of a compilation.
4 changes: 1 addition & 3 deletions src/cargo/core/compiler/context/compilation_files.rs
@@ -522,9 +522,7 @@ fn compute_metadata<'a, 'cfg>(

// Also mix in enabled features to our metadata. This'll ensure that
// when changing feature sets each lib is separately cached.
bcx.resolve
.features_sorted(unit.pkg.package_id())
.hash(&mut hasher);
unit.features.hash(&mut hasher);

// Mix in the target-metadata of all the dependencies of this target.
{
160 changes: 48 additions & 112 deletions src/cargo/core/compiler/context/mod.rs
@@ -1,7 +1,6 @@
#![allow(deprecated)]
use std::collections::{HashMap, HashSet};
use std::collections::{BTreeSet, HashMap, HashSet};
use std::ffi::OsStr;
use std::fmt::Write;
use std::path::PathBuf;
use std::sync::{Arc, Mutex};

@@ -10,7 +9,7 @@ use jobserver::Client;

use crate::core::compiler::compilation;
use crate::core::compiler::Unit;
use crate::core::{Package, PackageId, Resolve};
use crate::core::PackageId;
use crate::util::errors::{CargoResult, CargoResultExt};
use crate::util::{internal, profile, Config};

@@ -19,11 +18,9 @@ use super::custom_build::{self, BuildDeps, BuildScriptOutputs, BuildScripts};
use super::fingerprint::Fingerprint;
use super::job_queue::JobQueue;
use super::layout::Layout;
use super::unit_dependencies::{UnitDep, UnitGraph};
use super::{BuildContext, Compilation, CompileMode, Executor, FileFlavor, Kind};

mod unit_dependencies;
use self::unit_dependencies::build_unit_dependencies;

mod compilation_files;
use self::compilation_files::CompilationFiles;
pub use self::compilation_files::{Metadata, OutputFile};
@@ -51,23 +48,18 @@ pub struct Context<'a, 'cfg> {
/// Linking information for each `Unit`.
/// See `build_map` for details.
pub build_scripts: HashMap<Unit<'a>, Arc<BuildScripts>>,
/// Used to check the `links` field in the manifest is not duplicated and
/// is used correctly.
pub links: Links,
/// Job server client to manage concurrency with other processes.
pub jobserver: Client,
/// "Primary" packages are the ones the user selected on the command-line
/// with `-p` flags. If no flags are specified, then it is the defaults
/// based on the current directory and the default workspace members.
primary_packages: HashSet<PackageId>,
/// The dependency graph of units to compile.
unit_dependencies: HashMap<Unit<'a>, Vec<Unit<'a>>>,
unit_dependencies: UnitGraph<'a>,
/// An abstraction of the files and directories that will be generated by
/// the compilation. This is `None` until after `unit_dependencies` has
/// been computed.
files: Option<CompilationFiles<'a, 'cfg>>,
/// Cache of packages, populated when they are downloaded.
package_cache: HashMap<PackageId, &'a Package>,

/// A flag indicating whether pipelining is enabled for this compilation
/// session. Pipelining largely only affects the edges of the dependency
@@ -82,7 +74,11 @@ }
}

impl<'a, 'cfg> Context<'a, 'cfg> {
pub fn new(config: &'cfg Config, bcx: &'a BuildContext<'a, 'cfg>) -> CargoResult<Self> {
pub fn new(
config: &'cfg Config,
bcx: &'a BuildContext<'a, 'cfg>,
unit_dependencies: UnitGraph<'a>,
) -> CargoResult<Self> {
// Load up the jobserver that we'll use to manage our parallelism. This
// is the same as the GNU make implementation of a jobserver, and
// intentionally so! It's hoped that we can interact with GNU make and
@@ -112,12 +108,10 @@ impl<'a, 'cfg> Context<'a, 'cfg> {
compiled: HashSet::new(),
build_scripts: HashMap::new(),
build_explicit_deps: HashMap::new(),
links: Links::new(),
jobserver,
primary_packages: HashSet::new(),
unit_dependencies: HashMap::new(),
unit_dependencies,
files: None,
package_cache: HashMap::new(),
rmeta_required: HashSet::new(),
pipelining,
})
@@ -203,18 +197,15 @@ impl<'a, 'cfg> Context<'a, 'cfg> {
// pass `--extern` for rlib deps and skip out on all other
// artifacts.
let mut doctest_deps = Vec::new();
for dep in self.dep_targets(unit) {
if dep.target.is_lib() && dep.mode == CompileMode::Build {
let outputs = self.outputs(&dep)?;
for dep in self.unit_deps(unit) {
if dep.unit.target.is_lib() && dep.unit.mode == CompileMode::Build {
let outputs = self.outputs(&dep.unit)?;
let outputs = outputs.iter().filter(|output| {
output.path.extension() == Some(OsStr::new("rlib"))
|| dep.target.for_host()
|| dep.unit.target.for_host()
});
for output in outputs {
doctest_deps.push((
self.bcx.extern_crate_name(unit, &dep)?,
output.path.clone(),
));
doctest_deps.push((dep.extern_crate_name, output.path.clone()));
}
}
}
Expand All @@ -227,7 +218,7 @@ impl<'a, 'cfg> Context<'a, 'cfg> {
});
}

let feats = self.bcx.resolve.features(unit.pkg.package_id());
let feats = &unit.features;
if !feats.is_empty() {
self.compilation
.cfgs
@@ -305,7 +296,8 @@ impl<'a, 'cfg> Context<'a, 'cfg> {
self.primary_packages
.extend(units.iter().map(|u| u.pkg.package_id()));

build_unit_dependencies(self, units)?;
self.record_units_requiring_metadata();

let files = CompilationFiles::new(
units,
host_layout,
@@ -357,38 +349,35 @@

/// For a package, return all targets which are registered as dependencies
/// for that package.
/// NOTE: This is deprecated, use `unit_deps` instead.
//
// TODO: this ideally should be `-> &[Unit<'a>]`.
pub fn dep_targets(&self, unit: &Unit<'a>) -> Vec<Unit<'a>> {
self.unit_dependencies[unit].clone()
self.unit_dependencies[unit]
.iter()
.map(|dep| dep.unit)
.collect()
}

pub fn is_primary_package(&self, unit: &Unit<'a>) -> bool {
self.primary_packages.contains(&unit.pkg.package_id())
/// Direct dependencies for the given unit.
pub fn unit_deps(&self, unit: &Unit<'a>) -> &[UnitDep<'a>] {
&self.unit_dependencies[unit]
}

/// Gets a package for the given package ID.
pub fn get_package(&self, id: PackageId) -> CargoResult<&'a Package> {
self.package_cache
.get(&id)
.cloned()
.ok_or_else(|| failure::format_err!("failed to find {}", id))
pub fn is_primary_package(&self, unit: &Unit<'a>) -> bool {
self.primary_packages.contains(&unit.pkg.package_id())
}

/// Returns the list of filenames read by cargo to generate the `BuildContext`
/// (all `Cargo.toml`, etc.).
pub fn build_plan_inputs(&self) -> CargoResult<Vec<PathBuf>> {
let mut inputs = Vec::new();
// Note that we're using the `package_cache`, which should have been
// populated by `build_unit_dependencies`, and only those packages are
// considered as all the inputs.
//
// (Notably, we skip dev-deps here if they aren't present.)
for pkg in self.package_cache.values() {
inputs.push(pkg.manifest_path().to_path_buf());
// Keep sorted for consistency.
let mut inputs = BTreeSet::new();
// Note: dev-deps are skipped if they are not present in the unit graph.
for unit in self.unit_dependencies.keys() {
inputs.insert(unit.pkg.manifest_path().to_path_buf());
}
inputs.sort();
Ok(inputs)
Ok(inputs.into_iter().collect())
}

fn check_collistions(&self) -> CargoResult<()> {
@@ -488,6 +477,20 @@ impl<'a, 'cfg> Context<'a, 'cfg> {
Ok(())
}

/// Records the list of units which are required to emit metadata.
///
/// Units which depend only on the metadata of others requires the others to
/// actually produce metadata, so we'll record that here.
fn record_units_requiring_metadata(&mut self) {
for (key, deps) in self.unit_dependencies.iter() {
for dep in deps {
if self.only_requires_rmeta(key, &dep.unit) {
self.rmeta_required.insert(dep.unit);
}
}
}
}

/// Returns whether when `parent` depends on `dep` if it only requires the
/// metadata file from `dep`.
pub fn only_requires_rmeta(&self, parent: &Unit<'a>, dep: &Unit<'a>) -> bool {
Expand All @@ -509,70 +512,3 @@ impl<'a, 'cfg> Context<'a, 'cfg> {
self.rmeta_required.contains(unit)
}
}

#[derive(Default)]
pub struct Links {
validated: HashSet<PackageId>,
links: HashMap<String, PackageId>,
}

impl Links {
pub fn new() -> Links {
Links {
validated: HashSet::new(),
links: HashMap::new(),
}
}

pub fn validate(&mut self, resolve: &Resolve, unit: &Unit<'_>) -> CargoResult<()> {
if !self.validated.insert(unit.pkg.package_id()) {
return Ok(());
}
let lib = match unit.pkg.manifest().links() {
Some(lib) => lib,
None => return Ok(()),
};
if let Some(&prev) = self.links.get(lib) {
let pkg = unit.pkg.package_id();

let describe_path = |pkgid: PackageId| -> String {
let dep_path = resolve.path_to_top(&pkgid);
let mut dep_path_desc = format!("package `{}`", dep_path[0]);
for dep in dep_path.iter().skip(1) {
write!(dep_path_desc, "\n ... which is depended on by `{}`", dep).unwrap();
}
dep_path_desc
};

failure::bail!(
"multiple packages link to native library `{}`, \
but a native library can be linked only once\n\
\n\
{}\nlinks to native library `{}`\n\
\n\
{}\nalso links to native library `{}`",
lib,
describe_path(prev),
lib,
describe_path(pkg),
lib
)
}
if !unit
.pkg
.manifest()
.targets()
.iter()
.any(|t| t.is_custom_build())
{
failure::bail!(
"package `{}` specifies that it links to `{}` but does not \
have a custom build script",
unit.pkg.package_id(),
lib
)
}
self.links.insert(lib.to_string(), unit.pkg.package_id());
Ok(())
}
}
2 changes: 1 addition & 1 deletion src/cargo/core/compiler/custom_build.rs
@@ -190,7 +190,7 @@ fn build_work<'a, 'cfg>(cx: &mut Context<'a, 'cfg>, unit: &Unit<'a>) -> CargoRes

// Be sure to pass along all enabled features for this package, this is the
// last piece of statically known information that we have.
for feat in bcx.resolve.features(unit.pkg.package_id()).iter() {
for feat in &unit.features {
cmd.env(&format!("CARGO_FEATURE_{}", super::envify(feat)), "1");
}
