From 52b3dbd6a3a9e73d68c8f4a2124c59b2ddfc62a7 Mon Sep 17 00:00:00 2001 From: Tbkhi Date: Sat, 24 Feb 2024 14:06:05 -0400 Subject: [PATCH 1/2] Update bootstrapping.md Additional external links, keyword formatting, and changes to text for clarity. Co-authored-by: Jieyou Xu <39484203+jieyouxu@users.noreply.github.com> --- .../bootstrapping/what-bootstrapping-does.md | 280 ++++++++++-------- 1 file changed, 150 insertions(+), 130 deletions(-) diff --git a/src/building/bootstrapping/what-bootstrapping-does.md b/src/building/bootstrapping/what-bootstrapping-does.md index 4dd8b1a8b..0b0c26f9f 100644 --- a/src/building/bootstrapping/what-bootstrapping-does.md +++ b/src/building/bootstrapping/what-bootstrapping-does.md @@ -2,6 +2,20 @@ +[*Bootstrapping*][boot] is the process of using a compiler to compile itself. +More accurately, it means using an older compiler to compile a newer version of +the same compiler. + +This raises a chicken-and-egg paradox: where did the first compiler come from? +It must have been written in a different language. In Rust's case it was +[written in OCaml][ocaml-compiler]. However it was abandoned long ago and the +only way to build a modern version of `rustc` is a slightly less modern version. + +This is exactly how [`./x.py`] works: it downloads the current beta release of +`rustc`, then uses it to compile the new compiler. + +[`./x.py`]: https://github.com/rust-lang/rust/blob/master/x.py + Note that this documentation mostly covers user-facing information. See [bootstrap/README.md][bootstrap-internals] to read about bootstrap internals. @@ -22,9 +36,9 @@ Compiling `rustc` is done in stages. Here's a diagram, adapted from Jynn Nelson' The `A`, `B`, `C`, and `D` show the ordering of the stages of bootstrapping. Blue nodes are downloaded, yellow nodes are built with the -stage0 compiler, and +`stage0` compiler, and green nodes are built with the -stage1 compiler. +`stage1` compiler. 
[rustconf22-talk]: https://www.youtube.com/watch?v=oUIjG-y4zaA @@ -49,35 +63,38 @@ graph TD The stage0 compiler is usually the current _beta_ `rustc` compiler and its associated dynamic libraries, -which `x.py` will download for you. -(You can also configure `x.py` to use something else.) +which `./x.py` will download for you. +(You can also configure `./x.py` to use something else.) + +The stage0 compiler is then used only to compile [`src/bootstrap`], +[`library/std`], and [`compiler/rustc`]. When assembling the libraries and +binaries that will become the stage1 `rustc` compiler, the freshly compiled +`std` and `rustc` are used. There are two concepts at play here: a compiler +(with its set of dependencies) and its 'target' or 'object' libraries (`std` +and `rustc`). Both are staged, but in a staggered manner. -The stage0 compiler is then used only to compile `src/bootstrap`, `std`, and `rustc`. -When assembling the libraries and binaries that will become the stage1 `rustc` -compiler, the freshly compiled `std` and `rustc` are used. -There are two concepts at play here: -a compiler (with its set of dependencies) -and its 'target' or 'object' libraries (`std` and `rustc`). -Both are staged, but in a staggered manner. +[`compiler/rustc`]: https://github.com/rust-lang/rust/tree/master/compiler/rustc +[`library/std`]: https://github.com/rust-lang/rust/tree/master/library/std +[`src/bootstrap`]: https://github.com/rust-lang/rust/tree/master/src/bootstrap ### Stage 1: from current code, by an earlier compiler -The rustc source code is then compiled with the stage0 compiler to produce the stage1 compiler. +The rustc source code is then compiled with the `stage0` compiler to produce the `stage1` compiler. ### Stage 2: the truly current compiler -We then rebuild our stage1 compiler with itself to produce the stage2 compiler. +We then rebuild our `stage1` compiler with itself to produce the `stage2` compiler. 
-In theory, the stage1 compiler is functionally identical to the stage2 compiler, +In theory, the `stage1` compiler is functionally identical to the `stage2` compiler, but in practice there are subtle differences. -In particular, the stage1 compiler itself was built by stage0 +In particular, the `stage1` compiler itself was built by `stage0` and hence not by the source in your working directory. -This means that the ABI generated by the stage0 compiler may not match the ABI that would have been -made by the stage1 compiler, which can cause problems for dynamic libraries, tests, and tools using +This means that the ABI generated by the `stage0` compiler may not match the ABI that would have been +made by the `stage1` compiler, which can cause problems for dynamic libraries, tests, and tools using `rustc_private`. -Note that the `proc_macro` crate avoids this issue with a C FFI layer called `proc_macro::bridge`, -allowing it to be used with stage 1. +Note that the `proc_macro` crate avoids this issue with a `C` `FFI` layer called `proc_macro::bridge`, +allowing it to be used with `stage1`. The `stage2` compiler is the one distributed with `rustup` and all other install methods. However, it takes a very long time to build @@ -89,13 +106,13 @@ See [Building the compiler](../how-to-build-and-run.html#building-the-compiler). ### Stage 3: the same-result test -Stage 3 is optional. To sanity check our new compiler, we -can build the libraries with the stage2 compiler. The result ought +Stage 3 is optional. To sanity check our new compiler we +can build the libraries with the `stage2` compiler. The result ought to be identical to before, unless something has broken. ### Building the stages -`x` tries to be helpful and pick the stage you most likely meant for each subcommand. +The script [`./x`] tries to be helpful and pick the stage you most likely meant for each subcommand. 
These defaults are as follows: - `check`: `--stage 0` @@ -110,25 +127,28 @@ You can always override the stage by passing `--stage N` explicitly. For more information about stages, [see below](#understanding-stages-of-bootstrap). +[`./x`]: https://github.com/rust-lang/rust/blob/master/x + ## Complications of bootstrapping -Since the build system uses the current beta compiler to build the stage-1 +Since the build system uses the current beta compiler to build a `stage1` bootstrapping compiler, the compiler source code can't use some features until they reach beta (because otherwise the beta compiler doesn't support them). On the other hand, for [compiler intrinsics][intrinsics] and internal features, the features _have_ to be used. Additionally, the compiler makes -heavy use of nightly features (`#![feature(...)]`). How can we resolve this +heavy use of `nightly` features (`#![feature(...)]`). How can we resolve this problem? There are two methods used: 1. The build system sets `--cfg bootstrap` when building with `stage0`, so we can use `cfg(not(bootstrap))` to only use features when built with `stage1`. -This is useful for e.g. features that were just stabilized, which require -`#![feature(...)]` when built with `stage0`, but not for `stage1`. +Setting `--cfg bootstrap` in this way is used for features that were just +stabilized, which require `#![feature(...)]` when built with `stage0`, but not +for `stage1`. 2. The build system sets `RUSTC_BOOTSTRAP=1`. This special variable means to -_break the stability guarantees_ of rust: Allow using `#![feature(...)]` with -a compiler that's not nightly. This should never be used except when -bootstrapping the compiler. +_break the stability guarantees_ of Rust: allowing use of `#![feature(...)]` +with a compiler that's not `nightly`. 
_Setting `RUSTC_BOOTSTRAP=1` should never +be used except when bootstrapping the compiler._ [boot]: https://en.wikipedia.org/wiki/Bootstrapping_(compilers) [intrinsics]: ../../appendix/glossary.md#intrinsic @@ -140,7 +160,7 @@ bootstrapping the compiler. This is a detailed look into the separate bootstrap stages. -The convention `x` uses is that: +The convention `./x` uses is that: - A `--stage N` flag means to run the stage N compiler (`stageN/rustc`). - A "stage N artifact" is a build artifact that is _produced_ by the stage N compiler. @@ -149,8 +169,8 @@ The convention `x` uses is that: #### Build artifacts -Anything you can build with `x` is a _build artifact_. -Build artifacts include, but are not limited to: +Anything you can build with `./x` is a _build artifact_. Build artifacts +include, but are not limited to: - binaries, like `stage0-rustc/rustc-main` - shared objects, like `stage0-sysroot/rustlib/libstd-6fae108520cf72fe.so` @@ -161,35 +181,36 @@ Build artifacts include, but are not limited to: #### Examples -- `./x build --stage 0` means to build with the beta `rustc`. -- `./x doc --stage 0` means to document using the beta `rustdoc`. +- `./x test tests/ui` means to build the `stage1` compiler and run + `compiletest` on it. If you're working on the compiler, this is normally the + test command you want. - `./x test --stage 0 library/std` means to run tests on the standard library - without building `rustc` from source ('build with stage 0, then test the + without building `rustc` from source ('build with `stage0`, then test the artifacts'). If you're working on the standard library, this is normally the test command you want. -- `./x test tests/ui` means to build the stage 1 compiler and run - `compiletest` on it. If you're working on the compiler, this is normally the - test command you want. +- `./x build --stage 0` means to build with the beta `rustc`. +- `./x doc --stage 0` means to document using the beta `rustdoc`. 
#### Examples of what *not* to do -- `./x test --stage 0 tests/ui` is not useful: it runs tests on the - _beta_ compiler and doesn't build `rustc` from source. Use `test tests/ui` - instead, which builds stage 1 from source. +- `./x test --stage 0 tests/ui` is not useful: it runs tests on the _beta_ + compiler and doesn't build `rustc` from source. Use `test tests/ui` instead, + which builds `stage1` from source. - `./x test --stage 0 compiler/rustc` builds the compiler but runs no tests: - it's running `cargo test -p rustc`, but cargo doesn't understand Rust's - tests. You shouldn't need to use this, use `test` instead (without arguments). + it's running `cargo test -p rustc`, but `cargo` doesn't understand Rust's + tests. You shouldn't need to use this, use `test` instead (without + arguments). - `./x build --stage 0 compiler/rustc` builds the compiler, but does not build - libstd or even libcore. Most of the time, you'll want `./x build -library` instead, which allows compiling programs without needing to define - lang items. + `libstd` or even `libcore`. Most of the time, you'll want `./x build library` + instead, which allows compiling programs without needing to define lang + items. ### Building vs. running Note that `build --stage N compiler/rustc` **does not** build the stage N compiler: instead it builds the stage N+1 compiler _using_ the stage N compiler. -In short, _stage 0 uses the stage0 compiler to create stage0 artifacts which +In short, _stage 0 uses the `stage0` compiler to create `stage0` artifacts which will later be uplifted to be the stage1 compiler_. In each stage, two major steps are performed: @@ -214,72 +235,71 @@ Stage N `std` is pretty much necessary for any useful work with the stage N comp Without it, you can only compile programs with `#![no_core]` -- not terribly useful! 
The reason these need to be different is because they aren't necessarily ABI-compatible: -there could be new layout optimizations, changes to MIR, or other changes -to Rust metadata on nightly that aren't present in beta. +there could be new layout optimizations, changes to `MIR`, or other changes +to Rust metadata on `nightly` that aren't present in beta. This is also where `--keep-stage 1 library/std` comes into play. Since most changes to the compiler don't actually change the ABI, once you've produced a -`std` in stage 1, you can probably just reuse it with a different compiler. -If the ABI hasn't changed, you're good to go, no need to spend time -recompiling that `std`. -`--keep-stage` simply assumes the previous compile is fine and copies those -artifacts into the appropriate place, skipping the cargo invocation. +`std` in `stage1`, you can probably just reuse it with a different compiler. If +the ABI hasn't changed, you're good to go, no need to spend time recompiling +that `std`. The flag `--keep-stage` simply instructs the build script to +assume the previous compile is fine and copy those artifacts into the +appropriate place, skipping the `cargo` invocation. ### Cross-compiling rustc *Cross-compiling* is the process of compiling code that will run on another architecture. For instance, you might want to build an ARM version of rustc using an x86 machine. -Building stage2 `std` is different when you are cross-compiling. +Building `stage2` `std` is different when you are cross-compiling. -This is because `x` uses a trick: if `HOST` and `TARGET` are the same, -it will reuse stage1 `std` for stage2! This is sound because stage1 `std` -was compiled with the stage1 compiler, i.e. a compiler using the source code +This is because `./x` uses the following logic: if `HOST` and `TARGET` are the same, +it will reuse `stage1` `std` for `stage2`! This is sound because `stage1` `std` +was compiled with the `stage1` compiler, i.e.
a compiler using the source code you currently have checked out. So it should be identical (and therefore ABI-compatible) to the `std` that `stage2/rustc` would compile. -However, when cross-compiling, stage1 `std` will only run on the host. -So the stage2 compiler has to recompile `std` for the target. +However, when cross-compiling, `stage1` `std` will only run on the host. +So the `stage2` compiler has to recompile `std` for the target. -(See in the table how stage2 only builds non-host `std` targets). +(See in the table how `stage2` only builds non-host `std` targets). ### Why does only libstd use `cfg(bootstrap)`? -NOTE: for docs on `cfg(bootstrap)` itself, see [Complications of Bootstrapping][complications]. +For docs on `cfg(bootstrap)` itself, see +[Complications of Bootstrapping](#complications-of-bootstrapping). -[complications]: #complications-of-bootstrapping +The `rustc` generated by the `stage0` compiler is linked to the freshly-built +`std`, which means that for the most part only `std` needs to be `cfg`-gated, +so that `rustc` can use features added to `std` immediately after their addition, +without need for them to get into the downloaded `beta` compiler. -The `rustc` generated by the stage0 compiler is linked to the freshly-built -`std`, which means that for the most part only `std` needs to be cfg-gated, -so that `rustc` can use features added to std immediately after their addition, -without need for them to get into the downloaded beta. +Note this is different from any other Rust program: `stage1` `rustc` +is built by the _beta_ compiler, but using the _master_ version of `libstd`! -Note this is different from any other Rust program: stage1 `rustc` -is built by the _beta_ compiler, but using the _master_ version of libstd! - -The only time `rustc` uses `cfg(bootstrap)` is when it adds internal lints -that use diagnostic items, or when it uses unstable library features that were recently changed. 
+The only time `rustc` uses `cfg(bootstrap)` is when it adds internal lints that +use diagnostic items, or when it uses unstable library features that were +recently changed. ### What is a 'sysroot'? -When you build a project with cargo, the build artifacts for dependencies -are normally stored in `target/debug/deps`. This only contains dependencies cargo +When you build a project with `cargo`, the build artifacts for dependencies +are normally stored in `target/debug/deps`. This only contains dependencies `cargo` knows about; in particular, it doesn't have the standard library. Where do -`std` or `proc_macro` come from? It comes from the **sysroot**, the root +`std` or `proc_macro` come from? They come from the **sysroot**, the root of a number of directories where the compiler loads build artifacts at runtime. -The sysroot doesn't just store the standard library, though - it includes +The `sysroot` doesn't just store the standard library, though - it includes anything that needs to be loaded at runtime. That includes (but is not limited to): -- `libstd`/`libtest`/`libproc_macro` -- The compiler crates themselves, when using `rustc_private`. In-tree these - are always present; out of tree, you need to install `rustc-dev` with rustup. -- `libLLVM.so`, the shared object file for the LLVM project. In-tree this is +- Libraries `libstd`/`libtest`/`libproc_macro`. +- Compiler crates themselves, when using `rustc_private`. In-tree these + are always present; out of tree, you need to install `rustc-dev` with `rustup`. +- Shared object file `libLLVM.so` for the LLVM project. In-tree this is either built from source or downloaded from CI; out-of-tree, you need to - install `llvm-tools-preview` with rustup. + install `llvm-tools-preview` with `rustup`. All the artifacts listed so far are *compiler* runtime dependencies.
You can see them with `rustc --print sysroot`: - ``` $ ls $(rustc --print sysroot)/lib libchalk_derive-0685d79833dc9b2b.so libstd-25c6acf8063a3802.so @@ -289,8 +309,7 @@ librustc_macros-5f0ec4a119c6ac86.so rustlib ``` There are also runtime dependencies for the standard library! These are in -`lib/rustlib`, not `lib/` directly. - +`lib/rustlib/`, not `lib/` directly. ``` $ ls $(rustc --print sysroot)/lib/rustlib/x86_64-unknown-linux-gnu/lib | head -n 5 libaddr2line-6c8e02b8fedc1e5f.rlib @@ -300,49 +319,49 @@ libcfg_if-512eb53291f6de7e.rlib libcompiler_builtins-ef2408da76957905.rlib ``` -`rustlib` includes libraries like `hashbrown` and `cfg_if`, which are not part -of the public API of the standard library, but are used to implement it. -`rustlib` is part of the search path for linkers, but `lib` will never be part -of the search path. +Directory `lib/rustlib/` includes libraries like `hashbrown` and `cfg_if`, which +are not part of the public API of the standard library, but are used to +implement it. Also `lib/rustlib/` is part of the search path for linkers, but +`lib` will never be part of the search path. #### -Z force-unstable-if-unmarked -Since `rustlib` is part of the search path, it means we have to be careful +Since `lib/rustlib/` is part of the search path we have to be careful about which crates are included in it. In particular, all crates except for the standard library are built with the flag `-Z force-unstable-if-unmarked`, which means that you have to use `#![feature(rustc_private)]` in order to load it (as opposed to the standard library, which is always available). The `-Z force-unstable-if-unmarked` flag has a variety of purposes to help -enforce that the correct crates are marked as unstable. It was introduced +enforce that the correct crates are marked as `unstable`. It was introduced primarily to allow rustc and the standard library to link to arbitrary crates on crates.io which do not themselves use `staged_api`. 
`rustc` also relies on -this flag to mark all of its crates as unstable with the `rustc_private` +this flag to mark all of its crates as `unstable` with the `rustc_private` feature so that each crate does not need to be carefully marked with `unstable`. This flag is automatically applied to all of `rustc` and the standard library by the bootstrap scripts. This is needed because the compiler and all of its -dependencies are shipped in the sysroot to all users. +dependencies are shipped in `sysroot` to all users. This flag has the following effects: -- Marks the crate as "unstable" with the `rustc_private` feature if it is not - itself marked as stable or unstable. +- Marks the crate as "`unstable`" with the `rustc_private` feature if it is not + itself marked as `stable` or `unstable`. - Allows these crates to access other forced-unstable crates without any need for attributes. Normally a crate would need a `#![feature(rustc_private)]` - attribute to use other unstable crates. However, that would make it + attribute to use other `unstable` crates. However, that would make it impossible for a crate from crates.io to access its own dependencies since that crate won't have a `feature(rustc_private)` attribute, but *everything* is compiled with `-Z force-unstable-if-unmarked`. Code which does not use `-Z force-unstable-if-unmarked` should include the -`#![feature(rustc_private)]` crate attribute to access these force-unstable -crates. This is needed for things that link `rustc`, such as `miri` or +`#![feature(rustc_private)]` crate attribute to access these forced-unstable +crates. This is needed for things which link `rustc` itself, such as `MIRI` or `clippy`.
You can find more discussion about sysroots in: -- The [rustdoc PR] explaining why it uses `extern crate` for dependencies loaded from sysroot +- The [rustdoc PR] explaining why it uses `extern crate` for dependencies loaded from `sysroot` - [Discussions about sysroot on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/deps.20in.20sysroot/) - [Discussions about building rustdoc out of tree](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/How.20to.20create.20an.20executable.20accessing.20.60rustc_private.60.3F) @@ -350,23 +369,25 @@ You can find more discussion about sysroots in: ## Passing flags to commands invoked by `bootstrap` -`x` allows you to pass stage-specific flags to `rustc` and `cargo` when bootstrapping. -The `RUSTFLAGS_BOOTSTRAP` environment variable is passed as `RUSTFLAGS` to the bootstrap stage -(stage0), and `RUSTFLAGS_NOT_BOOTSTRAP` is passed when building artifacts for later stages. -`RUSTFLAGS` will work, but also affects the build of `bootstrap` itself, so it will be rare to want -to use it. -Finally, `MAGIC_EXTRA_RUSTFLAGS` bypasses the `cargo` cache to pass flags to rustc without -recompiling all dependencies. - -`RUSTDOCFLAGS`, `RUSTDOCFLAGS_BOOTSTRAP`, and `RUSTDOCFLAGS_NOT_BOOTSTRAP` are analogous to -`RUSTFLAGS`, but for rustdoc. - -`CARGOFLAGS` will pass arguments to cargo itself (e.g. `--timings`). `CARGOFLAGS_BOOTSTRAP` and -`CARGOFLAGS_NOT_BOOTSTRAP` work analogously to `RUSTFLAGS_BOOTSTRAP`. - -`--test-args` will pass arguments through to the test runner. For `tests/ui`, this is -compiletest; for unit tests and doctests this is the `libtest` runner. Most test runner accept -`--help`, which you can use to find out the options accepted by the runner. +Conveniently `./x` allows you to pass stage-specific flags to `rustc` and +`cargo` when bootstrapping. 
The `RUSTFLAGS_BOOTSTRAP` environment variable is +passed as `RUSTFLAGS` to the bootstrap stage (`stage0`), and +`RUSTFLAGS_NOT_BOOTSTRAP` is passed when building artifacts for later stages. +`RUSTFLAGS` will work, but also affects the build of `bootstrap` itself, so it +will be rare to want to use it. Finally, `MAGIC_EXTRA_RUSTFLAGS` bypasses the +`cargo` cache to pass flags to rustc without recompiling all dependencies. + +- `RUSTDOCFLAGS`, `RUSTDOCFLAGS_BOOTSTRAP` and `RUSTDOCFLAGS_NOT_BOOTSTRAP` are + analogous to `RUSTFLAGS`, but for `rustdoc`. +- `CARGOFLAGS` will pass arguments to cargo itself (e.g. `--timings`). + `CARGOFLAGS_BOOTSTRAP` and `CARGOFLAGS_NOT_BOOTSTRAP` work analogously to + `RUSTFLAGS_BOOTSTRAP`. +- `--test-args` will pass arguments through to the test runner. For `tests/ui`, + this is `compiletest`. For unit tests and doc tests this is the `libtest` + runner. + +Most test runners accept `--help`, which you can use to find out the options +accepted by the runner. ## Environment Variables @@ -374,31 +395,30 @@ During bootstrapping, there are a bunch of compiler-internal environment variables that are used. If you are trying to run an intermediate version of `rustc`, sometimes you may need to set some of these environment variables manually. Otherwise, you get an error like the following: - ```text thread 'main' panicked at 'RUSTC_STAGE was not set: NotPresent', library/core/src/result.rs:1165:5 ``` If `./stageN/bin/rustc` gives an error about environment variables, that -usually means something is quite wrong -- or you're trying to compile e.g. -`rustc` or `std` or something that depends on environment variables. In -the unlikely case that you actually need to invoke rustc in such a situation, -you can tell the bootstrap shim to print all env variables by adding `-vvv` to your `x` command. +usually means something is quite wrong -- such as you're trying to compile +`rustc` or `std` or something which depends on environment variables.
In the +unlikely case that you actually need to invoke `rustc` in such a situation, you +can tell the bootstrap shim to print all `env` variables by adding `-vvv` to your +`x` command. Finally, bootstrap makes use of the [cc-rs crate] which has [its own -method][env-vars] of configuring C compilers and C flags via environment +method][env-vars] of configuring `C` compilers and `C` flags via environment variables. [cc-rs crate]: https://github.com/rust-lang/cc-rs [env-vars]: https://docs.rs/cc/latest/cc/#external-configuration-via-environment-variables -## Clarification of build command's stdout +## Clarification of build command's `stdout` -In this part, we will investigate the build command's stdout in an action +In this part, we will investigate the build command's `stdout` in an action (similar, but more detailed and complete documentation compare to topic above). When you execute `x build --dry-run` command, the build output will be something like the following: - ```text Building stage0 library artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu) Copying stage0 library from stage0 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu / x86_64-unknown-linux-gnu) @@ -418,18 +438,18 @@ local Rust source into libraries we can use. ### Copying stage0 \{std,rustc\} -This copies the library and compiler artifacts from Cargo into +This copies the library and compiler artifacts from `cargo` into `stage0-sysroot/lib/rustlib/{target-triple}/lib` ### Assembling stage1 compiler -This copies the libraries we built in "building stage0 ... artifacts" into -the stage1 compiler's lib directory. These are the host libraries that the +This copies the libraries we built in "building `stage0` ... artifacts" into +the `stage1` compiler's `lib/` directory. These are the host libraries that the compiler itself uses to run. These aren't actually used by artifacts the new -compiler generates. This step also copies the rustc and rustdoc binaries we +compiler generates. 
This step also copies the `rustc` and `rustdoc` binaries we generated into `build/$HOST/stage/bin`. -The stage1/bin/rustc is a fully functional compiler, but it doesn't yet have +The `stage1/bin/rustc` is a fully functional compiler, but it doesn't yet have any libraries to link built binaries or libraries to. The next 3 steps will provide those libraries for it; they are mostly equivalent to constructing -the stage1/bin compiler so we don't go through them individually. +the `stage1/bin` compiler so we don't go through them individually here. From 26705e5435d4450f32c7221ee697ccb3781147ff Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E8=AE=B8=E6=9D=B0=E5=8F=8B=20Jieyou=20Xu=20=28Joe=29?= <39484203+jieyouxu@users.noreply.github.com> Date: Sun, 13 Oct 2024 17:50:38 +0800 Subject: [PATCH 2/2] Reflow markdown --- .../bootstrapping/what-bootstrapping-does.md | 286 +++++++++--------- 1 file changed, 148 insertions(+), 138 deletions(-) diff --git a/src/building/bootstrapping/what-bootstrapping-does.md b/src/building/bootstrapping/what-bootstrapping-does.md index 0b0c26f9f..7d269ffa1 100644 --- a/src/building/bootstrapping/what-bootstrapping-does.md +++ b/src/building/bootstrapping/what-bootstrapping-does.md @@ -30,15 +30,16 @@ Note that this documentation mostly covers user-facing information. See - Stage 2: the truly current compiler - Stage 3: the same-result test -Compiling `rustc` is done in stages. Here's a diagram, adapted from Jynn Nelson's -[talk on bootstrapping][rustconf22-talk] at RustConf 2022, with detailed explanations below. +Compiling `rustc` is done in stages. Here's a diagram, adapted from Jynn +Nelson's [talk on bootstrapping][rustconf22-talk] at RustConf 2022, with +detailed explanations below. The `A`, `B`, `C`, and `D` show the ordering of the stages of bootstrapping. -Blue nodes are downloaded, -yellow nodes are built with the -`stage0` compiler, and -green nodes are built with the -`stage1` compiler. 
+Blue nodes are +downloaded, yellow +nodes are built with the `stage0` compiler, and green nodes are built with the `stage1` +compiler. [rustconf22-talk]: https://www.youtube.com/watch?v=oUIjG-y4zaA @@ -61,17 +62,16 @@ graph TD ### Stage 0: the pre-compiled compiler -The stage0 compiler is usually the current _beta_ `rustc` compiler -and its associated dynamic libraries, -which `./x.py` will download for you. -(You can also configure `./x.py` to use something else.) +The stage0 compiler is usually the current _beta_ `rustc` compiler and its +associated dynamic libraries, which `./x.py` will download for you. (You can +also configure `./x.py` to use something else.) The stage0 compiler is then used only to compile [`src/bootstrap`], [`library/std`], and [`compiler/rustc`]. When assembling the libraries and binaries that will become the stage1 `rustc` compiler, the freshly compiled `std` and `rustc` are used. There are two concepts at play here: a compiler -(with its set of dependencies) and its 'target' or 'object' libraries (`std` -and `rustc`). Both are staged, but in a staggered manner. +(with its set of dependencies) and its 'target' or 'object' libraries (`std` and +`rustc`). Both are staged, but in a staggered manner. [`compiler/rustc`]: https://github.com/rust-lang/rust/tree/master/compiler/rustc [`library/std`]: https://github.com/rust-lang/rust/tree/master/library/std @@ -79,41 +79,42 @@ and `rustc`). Both are staged, but in a staggered manner. ### Stage 1: from current code, by an earlier compiler -The rustc source code is then compiled with the `stage0` compiler to produce the `stage1` compiler. +The rustc source code is then compiled with the `stage0` compiler to produce the +`stage1` compiler. ### Stage 2: the truly current compiler -We then rebuild our `stage1` compiler with itself to produce the `stage2` compiler. +We then rebuild our `stage1` compiler with itself to produce the `stage2` +compiler. 
-In theory, the `stage1` compiler is functionally identical to the `stage2` compiler, -but in practice there are subtle differences. -In particular, the `stage1` compiler itself was built by `stage0` -and hence not by the source in your working directory. -This means that the ABI generated by the `stage0` compiler may not match the ABI that would have been -made by the `stage1` compiler, which can cause problems for dynamic libraries, tests, and tools using +In theory, the `stage1` compiler is functionally identical to the `stage2` +compiler, but in practice there are subtle differences. In particular, the +`stage1` compiler itself was built by `stage0` and hence not by the source in +your working directory. This means that the ABI generated by the `stage0` +compiler may not match the ABI that would have been made by the `stage1` +compiler, which can cause problems for dynamic libraries, tests, and tools using `rustc_private`. -Note that the `proc_macro` crate avoids this issue with a `C` `FFI` layer called `proc_macro::bridge`, -allowing it to be used with `stage1`. +Note that the `proc_macro` crate avoids this issue with a `C` FFI layer called +`proc_macro::bridge`, allowing it to be used with `stage1`. -The `stage2` compiler is the one distributed with `rustup` and all other install methods. -However, it takes a very long time to build -because one must first build the new compiler with an older compiler -and then use that to build the new compiler with itself. -For development, you usually only want the `stage1` compiler, -which you can build with `./x build library`. -See [Building the compiler](../how-to-build-and-run.html#building-the-compiler). +The `stage2` compiler is the one distributed with `rustup` and all other install +methods. However, it takes a very long time to build because one must first +build the new compiler with an older compiler and then use that to build the new +compiler with itself. 
For development, you usually only want the `stage1` +compiler, which you can build with `./x build library`. See [Building the +compiler](../how-to-build-and-run.html#building-the-compiler). ### Stage 3: the same-result test -Stage 3 is optional. To sanity check our new compiler we -can build the libraries with the `stage2` compiler. The result ought -to be identical to before, unless something has broken. +Stage 3 is optional. To sanity check our new compiler we can build the libraries +with the `stage2` compiler. The result ought to be identical to before, unless +something has broken. ### Building the stages -The script [`./x`] tries to be helpful and pick the stage you most likely meant for each subcommand. -These defaults are as follows: +The script [`./x`] tries to be helpful and pick the stage you most likely meant +for each subcommand. These defaults are as follows: - `check`: `--stage 0` - `doc`: `--stage 0` @@ -125,30 +126,31 @@ These defaults are as follows: You can always override the stage by passing `--stage N` explicitly. -For more information about stages, [see below](#understanding-stages-of-bootstrap). +For more information about stages, [see +below](#understanding-stages-of-bootstrap). [`./x`]: https://github.com/rust-lang/rust/blob/master/x ## Complications of bootstrapping Since the build system uses the current beta compiler to build a `stage1` -bootstrapping compiler, the compiler source code can't use some features -until they reach beta (because otherwise the beta compiler doesn't support -them). On the other hand, for [compiler intrinsics][intrinsics] and internal -features, the features _have_ to be used. Additionally, the compiler makes -heavy use of `nightly` features (`#![feature(...)]`). How can we resolve this -problem? +bootstrapping compiler, the compiler source code can't use some features until +they reach beta (because otherwise the beta compiler doesn't support them). 
On +the other hand, for [compiler intrinsics][intrinsics] and internal features, the +features _have_ to be used. Additionally, the compiler makes heavy use of +`nightly` features (`#![feature(...)]`). How can we resolve this problem? There are two methods used: + 1. The build system sets `--cfg bootstrap` when building with `stage0`, so we -can use `cfg(not(bootstrap))` to only use features when built with `stage1`. -Setting `--cfg bootstrap` in this way is used for features that were just -stabilized, which require `#![feature(...)]` when built with `stage0`, but not -for `stage1`. + can use `cfg(not(bootstrap))` to only use features when built with `stage1`. + Setting `--cfg bootstrap` in this way is used for features that were just + stabilized, which require `#![feature(...)]` when built with `stage0`, but + not for `stage1`. 2. The build system sets `RUSTC_BOOTSTRAP=1`. This special variable means to -_break the stability guarantees_ of Rust: allowing use of `#![feature(...)]` -with a compiler that's not `nightly`. _Setting `RUSTC_BOOTSTRAP=1` should never -be used except when bootstrapping the compiler._ + _break the stability guarantees_ of Rust: allowing use of `#![feature(...)]` + with a compiler that's not `nightly`. _Setting `RUSTC_BOOTSTRAP=1` should + never be used except when bootstrapping the compiler._ [boot]: https://en.wikipedia.org/wiki/Bootstrapping_(compilers) [intrinsics]: ../../appendix/glossary.md#intrinsic @@ -163,9 +165,10 @@ This is a detailed look into the separate bootstrap stages. The convention `./x` uses is that: - A `--stage N` flag means to run the stage N compiler (`stageN/rustc`). -- A "stage N artifact" is a build artifact that is _produced_ by the stage N compiler. -- The stage N+1 compiler is assembled from stage N *artifacts*. This - process is called _uplifting_. +- A "stage N artifact" is a build artifact that is _produced_ by the stage N + compiler. +- The stage N+1 compiler is assembled from stage N *artifacts*. 
This process is + called _uplifting_. #### Build artifacts @@ -181,9 +184,9 @@ include, but are not limited to: #### Examples -- `./x test tests/ui` means to build the `stage1` compiler and run - `compiletest` on it. If you're working on the compiler, this is normally the - test command you want. +- `./x test tests/ui` means to build the `stage1` compiler and run `compiletest` + on it. If you're working on the compiler, this is normally the test command + you want. - `./x test --stage 0 library/std` means to run tests on the standard library without building `rustc` from source ('build with `stage0`, then test the artifacts'). If you're working on the standard library, this is normally the @@ -198,17 +201,15 @@ include, but are not limited to: which builds `stage1` from source. - `./x test --stage 0 compiler/rustc` builds the compiler but runs no tests: it's running `cargo test -p rustc`, but `cargo` doesn't understand Rust's - tests. You shouldn't need to use this, use `test` instead (without - arguments). + tests. You shouldn't need to use this, use `test` instead (without arguments). - `./x build --stage 0 compiler/rustc` builds the compiler, but does not build `libstd` or even `libcore`. Most of the time, you'll want `./x build library` - instead, which allows compiling programs without needing to define lang - items. + instead, which allows compiling programs without needing to define lang items. ### Building vs. running -Note that `build --stage N compiler/rustc` **does not** build the stage N compiler: -instead it builds the stage N+1 compiler _using_ the stage N compiler. +Note that `build --stage N compiler/rustc` **does not** build the stage N +compiler: instead it builds the stage N+1 compiler _using_ the stage N compiler. In short, _stage 0 uses the `stage0` compiler to create `stage0` artifacts which will later be uplifted to be the stage1 compiler_. @@ -216,65 +217,69 @@ will later be uplifted to be the stage1 compiler_. 
In each stage, two major steps are performed: 1. `std` is compiled by the stage N compiler. -2. That `std` is linked to programs built by the stage N compiler, - including the stage N artifacts (stage N+1 compiler). +2. That `std` is linked to programs built by the stage N compiler, including the + stage N artifacts (stage N+1 compiler). This is somewhat intuitive if one thinks of the stage N artifacts as "just" -another program we are building with the stage N compiler: -`build --stage N compiler/rustc` is linking the stage N artifacts to the `std` -built by the stage N compiler. +another program we are building with the stage N compiler: `build --stage N +compiler/rustc` is linking the stage N artifacts to the `std` built by the stage +N compiler. ### Stages and `std` Note that there are two `std` libraries in play here: -1. The library _linked_ to `stageN/rustc`, which was built by stage N-1 (stage N-1 `std`) -2. The library _used to compile programs_ with `stageN/rustc`, which was - built by stage N (stage N `std`). -Stage N `std` is pretty much necessary for any useful work with the stage N compiler. -Without it, you can only compile programs with `#![no_core]` -- not terribly useful! +1. The library _linked_ to `stageN/rustc`, which was built by stage N-1 (stage + N-1 `std`) +2. The library _used to compile programs_ with `stageN/rustc`, which was built + by stage N (stage N `std`). + +Stage N `std` is pretty much necessary for any useful work with the stage N +compiler. Without it, you can only compile programs with `#![no_core]` -- not +terribly useful! -The reason these need to be different is because they aren't necessarily ABI-compatible: -there could be new layout optimizations, changes to `MIR`, or other changes -to Rust metadata on `nightly` that aren't present in beta. 
+The reason these need to be different is because they aren't necessarily +ABI-compatible: there could be new layout optimizations, changes to `MIR`, or +other changes to Rust metadata on `nightly` that aren't present in beta. This is also where `--keep-stage 1 library/std` comes into play. Since most changes to the compiler don't actually change the ABI, once you've produced a `std` in `stage1`, you can probably just reuse it with a different compiler. If the ABI hasn't changed, you're good to go, no need to spend time recompiling -that `std`. The flag `--keep-stage` simply instructs the build script to -assumes the previous compile is fine and copies those artifacts into the -appropriate place, skipping the `cargo` invocation. +that `std`. The flag `--keep-stage` simply instructs the build script to assume +the previous compile is fine and copy those artifacts into the appropriate +place, skipping the `cargo` invocation. ### Cross-compiling rustc -*Cross-compiling* is the process of compiling code that will run on another architecture. -For instance, you might want to build an ARM version of rustc using an x86 machine. -Building `stage2` `std` is different when you are cross-compiling. +*Cross-compiling* is the process of compiling code that will run on another +architecture. For instance, you might want to build an ARM version of rustc +using an x86 machine. Building `stage2` `std` is different when you are +cross-compiling. -This is because `./x` uses the following logic: if `HOST` and `TARGET` are the same, -it will reuse `stage1` `std` for `stage2`! This is sound because `stage1` `std` -was compiled with the `stage1` compiler, i.e. a compiler using the source code -you currently have checked out. So it should be identical (and therefore ABI-compatible) -to the `std` that `stage2/rustc` would compile. +This is because `./x` uses the following logic: if `HOST` and `TARGET` are the +same, it will reuse `stage1` `std` for `stage2`!
This is sound because `stage1` +`std` was compiled with the `stage1` compiler, i.e. a compiler using the source +code you currently have checked out. So it should be identical (and therefore +ABI-compatible) to the `std` that `stage2/rustc` would compile. -However, when cross-compiling, `stage1` `std` will only run on the host. -So the `stage2` compiler has to recompile `std` for the target. +However, when cross-compiling, `stage1` `std` will only run on the host. So the +`stage2` compiler has to recompile `std` for the target. (See in the table how `stage2` only builds non-host `std` targets). ### Why does only libstd use `cfg(bootstrap)`? -For docs on `cfg(bootstrap)` itself, see -[Complications of Bootstrapping](#complications-of-bootstrapping). +For docs on `cfg(bootstrap)` itself, see [Complications of +Bootstrapping](#complications-of-bootstrapping). The `rustc` generated by the `stage0` compiler is linked to the freshly-built -`std`, which means that for the most part only `std` needs to be `cfg`-gated, -so that `rustc` can use features added to `std` immediately after their addition, +`std`, which means that for the most part only `std` needs to be `cfg`-gated, so +that `rustc` can use features added to `std` immediately after their addition, without need for them to get into the downloaded `beta` compiler. -Note this is different from any other Rust program: `stage1` `rustc` -is built by the _beta_ compiler, but using the _master_ version of `libstd`! +Note this is different from any other Rust program: `stage1` `rustc` is built by +the _beta_ compiler, but using the _master_ version of `libstd`! The only time `rustc` uses `cfg(bootstrap)` is when it adds internal lints that use diagnostic items, or when it uses unstable library features that were @@ -282,24 +287,24 @@ recently changed. ### What is a 'sysroot'? -When you build a project with `cargo`, the build artifacts for dependencies -are normally stored in `target/debug/deps`. 
This only contains dependencies `cargo` -knows about; in particular, it doesn't have the standard library. Where do -`std` or `proc_macro` come from? They comes from the **sysroot**, the root -of a number of directories where the compiler loads build artifacts at runtime. -The `sysroot` doesn't just store the standard library, though - it includes -anything that needs to be loaded at runtime. That includes (but is not limited -to): +When you build a project with `cargo`, the build artifacts for dependencies are +normally stored in `target/debug/deps`. This only contains dependencies `cargo` +knows about; in particular, it doesn't have the standard library. Where do `std` +or `proc_macro` come from? They come from the **sysroot**, the root of a number +of directories where the compiler loads build artifacts at runtime. The +`sysroot` doesn't just store the standard library, though - it includes anything +that needs to be loaded at runtime. That includes (but is not limited to): - Libraries `libstd`/`libtest`/`libproc_macro`. -- Compiler crates themselves, when using `rustc_private`. In-tree these - are always present; out of tree, you need to install `rustc-dev` with `rustup`. -- Shared object file `libLLVM.so` for the LLVM project. In-tree this is - either built from source or downloaded from CI; out-of-tree, you need to - install `llvm-tools-preview` with `rustup`. - -All the artifacts listed so far are *compiler* runtime dependencies. You can -see them with `rustc --print sysroot`: +- Compiler crates themselves, when using `rustc_private`. In-tree these are + always present; out of tree, you need to install `rustc-dev` with `rustup`. +- Shared object file `libLLVM.so` for the LLVM project. In-tree this is either + built from source or downloaded from CI; out-of-tree, you need to install + `llvm-tools-preview` with `rustup`. + +All the artifacts listed so far are *compiler* runtime dependencies.
You can see +them with `rustc --print sysroot`: + ``` $ ls $(rustc --print sysroot)/lib libchalk_derive-0685d79833dc9b2b.so libstd-25c6acf8063a3802.so @@ -310,6 +315,7 @@ librustc_macros-5f0ec4a119c6ac86.so rustlib There are also runtime dependencies for the standard library! These are in `lib/rustlib/`, not `lib/` directly. + ``` $ ls $(rustc --print sysroot)/lib/rustlib/x86_64-unknown-linux-gnu/lib | head -n 5 libaddr2line-6c8e02b8fedc1e5f.rlib @@ -324,24 +330,23 @@ are not part of the public API of the standard library, but are used to implement it. Also `lib/rustlib/` is part of the search path for linkers, but `lib` will never be part of the search path. -#### -Z force-unstable-if-unmarked +#### `-Z force-unstable-if-unmarked` -Since `lib/rustlib/` is part of the search path we have to be careful -about which crates are included in it. In particular, all crates except for -the standard library are built with the flag `-Z force-unstable-if-unmarked`, -which means that you have to use `#![feature(rustc_private)]` in order to -load it (as opposed to the standard library, which is always available). +Since `lib/rustlib/` is part of the search path we have to be careful about +which crates are included in it. In particular, all crates except for the +standard library are built with the flag `-Z force-unstable-if-unmarked`, which +means that you have to use `#![feature(rustc_private)]` in order to load it (as +opposed to the standard library, which is always available). The `-Z force-unstable-if-unmarked` flag has a variety of purposes to help enforce that the correct crates are marked as `unstable`. It was introduced -primarily to allow rustc and the standard library to link to arbitrary crates -on crates.io which do not themselves use `staged_api`. `rustc` also relies on -this flag to mark all of its crates as `unstable` with the `rustc_private` -feature so that each crate does not need to be carefully marked with -`unstable`. 
- -This flag is automatically applied to all of `rustc` and the standard library -by the bootstrap scripts. This is needed because the compiler and all of its +primarily to allow rustc and the standard library to link to arbitrary crates on +crates.io which do not themselves use `staged_api`. `rustc` also relies on this +flag to mark all of its crates as `unstable` with the `rustc_private` feature so +that each crate does not need to be carefully marked with `unstable`. + +This flag is automatically applied to all of `rustc` and the standard library by +the bootstrap scripts. This is needed because the compiler and all of its dependencies are shipped in `sysroot` to all users. This flag has the following effects: @@ -361,9 +366,12 @@ crates. This is needed for things which link `rustc` its self, such as `MIRI` or `clippy`. You can find more discussion about sysroots in: -- The [rustdoc PR] explaining why it uses `extern crate` for dependencies loaded from `sysroot` -- [Discussions about sysroot on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/deps.20in.20sysroot/) -- [Discussions about building rustdoc out of tree](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/How.20to.20create.20an.20executable.20accessing.20.60rustc_private.60.3F) +- The [rustdoc PR] explaining why it uses `extern crate` for dependencies loaded + from `sysroot` +- [Discussions about sysroot on + Zulip](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/deps.20in.20sysroot/) +- [Discussions about building rustdoc out of + tree](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/How.20to.20create.20an.20executable.20accessing.20.60rustc_private.60.3F) [rustdoc PR]: https://github.com/rust-lang/rust/pull/76728 @@ -395,16 +403,17 @@ During bootstrapping, there are a bunch of compiler-internal environment variables that are used. 
If you are trying to run an intermediate version of `rustc`, sometimes you may need to set some of these environment variables manually. Otherwise, you get an error like the following: + ```text thread 'main' panicked at 'RUSTC_STAGE was not set: NotPresent', library/core/src/result.rs:1165:5 ``` -If `./stageN/bin/rustc` gives an error about environment variables, that -usually means something is quite wrong -- such as you're trying to compile -`rustc` or `std` or something which depends on environment variables. In the -unlikely case that you actually need to invoke `rustc` in such a situation, you -can tell the bootstrap shim to print all `env` variables by adding `-vvv` to your -`x` command. +If `./stageN/bin/rustc` gives an error about environment variables, that usually +means something is quite wrong -- such as you're trying to compile `rustc` or +`std` or something which depends on environment variables. In the unlikely case +that you actually need to invoke `rustc` in such a situation, you can tell the +bootstrap shim to print all `env` variables by adding `-vvv` to your `x` +command. Finally, bootstrap makes use of the [cc-rs crate] which has [its own method][env-vars] of configuring `C` compilers and `C` flags via environment @@ -419,6 +428,7 @@ In this part, we will investigate the build command's `stdout` in an action (similar, but more detailed and complete documentation compare to topic above). When you execute `x build --dry-run` command, the build output will be something like the following: + ```text Building stage0 library artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu) Copying stage0 library from stage0 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu / x86_64-unknown-linux-gnu) @@ -433,8 +443,8 @@ Building rustdoc for stage1 (x86_64-unknown-linux-gnu) ### Building stage0 {std,compiler} artifacts -These steps use the provided (downloaded, usually) compiler to compile the -local Rust source into libraries we can use. 
+These steps use the provided (downloaded, usually) compiler to compile the local +Rust source into libraries we can use. ### Copying stage0 \{std,rustc\} @@ -443,13 +453,13 @@ This copies the library and compiler artifacts from `cargo` into ### Assembling stage1 compiler -This copies the libraries we built in "building `stage0` ... artifacts" into -the `stage1` compiler's `lib/` directory. These are the host libraries that the +This copies the libraries we built in "building `stage0` ... artifacts" into the +`stage1` compiler's `lib/` directory. These are the host libraries that the compiler itself uses to run. These aren't actually used by artifacts the new compiler generates. This step also copies the `rustc` and `rustdoc` binaries we generated into `build/$HOST/stage/bin`. The `stage1/bin/rustc` is a fully functional compiler, but it doesn't yet have any libraries to link built binaries or libraries to. The next 3 steps will -provide those libraries for it; they are mostly equivalent to constructing -the `stage1/bin` compiler so we don't go through them individually here. +provide those libraries for it; they are mostly equivalent to constructing the +`stage1/bin` compiler so we don't go through them individually here.