Explain stages in terms of the compiler currently running (take N+1) (#857)
* Explain stages in terms of the compiler currently running
- Address some confusing points
+ stage N+1 -> stage N artifacts
+ Use more likely examples of an ABI break
+ stage N -> stage N compiler
- Mention why rustc occasionally uses `cfg(bootstrap)`
- Note that stage1 is built using two different versions
- Add lots of examples
+ `test src/test/ui` and `test compiler/rustc` run different compilers 😢
+ Separate examples of what to do from examples of what not to do
- 'ship stage 1 artifacts' -> 'ship stage 2 compiler'
This is hopefully less confusing.
* build -> x.py build
* Add section on build artifacts
* Improve wording
Co-authored-by: Camelid <37223377+camelid@users.noreply.github.com>
* uplifted -> assembled
Co-authored-by: Camelid <37223377+camelid@users.noreply.github.com>
parent 3b4462f582
commit fcc93a7043
|
|
@ -14,7 +14,7 @@ It must have been written in a different language. In Rust's case it was
|
|||
only way to build a modern version of rustc is a slightly less modern
|
||||
version.
|
||||
|
||||
This is exactly how `x.py` works: it downloads the current `beta` release of
|
||||
This is exactly how `x.py` works: it downloads the current beta release of
|
||||
rustc, then uses it to compile the new compiler.
|
||||
|
||||
## Stages of bootstrapping
|
||||
|
|
@ -71,6 +71,8 @@ These defaults are as follows:
|
|||
|
||||
You can always override the stage by passing `--stage N` explicitly.
|
||||
|
||||
For more information about stages, [see below](#understanding-stages-of-bootstrap).
|
||||
|
||||
## Complications of bootstrapping
|
||||
|
||||
Since the build system uses the current beta compiler to build the stage-1
|
||||
|
|
@ -122,43 +124,76 @@ contribution [here][bootstrap-build].
|
|||
|
||||
## Understanding stages of bootstrap
|
||||
|
||||
This is a detailed look into the separate bootstrap stages. When running
|
||||
`x.py` you will see output such as:
|
||||
### Overview
|
||||
|
||||
```txt
Building stage0 std artifacts
Copying stage0 std from stage0
Building stage0 compiler artifacts
Copying stage0 rustc from stage0
Building LLVM for x86_64-apple-darwin
Building stage0 codegen artifacts
Assembling stage1 compiler
Building stage1 std artifacts
Copying stage1 std from stage1
Building stage1 compiler artifacts
Copying stage1 rustc from stage1
Building stage1 codegen artifacts
Assembling stage2 compiler
Uplifting stage1 std
Copying stage2 std from stage1
Generating unstable book md files
Building stage0 tool unstable-book-gen
Building stage0 tool rustbook
Documenting standalone
Building rustdoc for stage2
Documenting book redirect pages
Documenting stage2 std
Building rustdoc for stage1
Documenting stage2 whitelisted compiler
Documenting stage2 compiler
Documenting stage2 rustdoc
Documenting error index
Uplifting stage1 rustc
Copying stage2 rustc from stage1
Building stage2 tool error_index_generator
```
|
||||
This is a detailed look into the separate bootstrap stages.
|
||||
|
||||
A deeper look into `x.py`'s phases can be seen here:
|
||||
The convention `x.py` uses is that:
|
||||
- A `--stage N` flag means to run the stage N compiler (`stageN/rustc`).
|
||||
- A "stage N artifact" is a build artifact that is _produced_ by the stage N compiler.
|
||||
- The "stage (N+1) compiler" is assembled from "stage N artifacts". This
|
||||
process is called _uplifting_.
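
The numbering convention above can be modeled as a tiny sketch. This is only an illustration: the three helper functions are hypothetical names invented here and are not part of `x.py` or rustbuild.

```rust
// Illustrative model of the stage-numbering convention; these helpers are
// hypothetical and are not part of `x.py` or rustbuild.

/// The compiler that `--stage N` runs is the stage N compiler.
fn compiler_run_by_flag(stage_flag: u32) -> u32 {
    stage_flag
}

/// Artifacts carry the stage number of the compiler that produced them.
fn artifacts_produced_by(compiler_stage: u32) -> u32 {
    compiler_stage
}

/// Stage N artifacts are assembled into the stage (N+1) compiler.
fn compiler_assembled_from(artifact_stage: u32) -> u32 {
    artifact_stage + 1
}

fn main() {
    for flag in 0..=1 {
        let compiler = compiler_run_by_flag(flag);
        let artifacts = artifacts_produced_by(compiler);
        let next = compiler_assembled_from(artifacts);
        println!(
            "--stage {}: stage{} compiler -> stage{} artifacts -> stage{} compiler",
            flag, compiler, artifacts, next
        );
    }
}
```

The point of the model is that only the last step changes the number: running and producing keep the stage, assembling increments it.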
|
||||
|
||||
#### Build artifacts
|
||||
|
||||
Anything you can build with `x.py` is a _build artifact_.
|
||||
Build artifacts include, but are not limited to:
|
||||
|
||||
- binaries, like `stage0-rustc/rustc-main`
|
||||
- shared objects, like `stage0-sysroot/rustlib/libstd-6fae108520cf72fe.so`
|
||||
- [rlib] files, like `stage0-sysroot/rustlib/libstd-6fae108520cf72fe.rlib`
|
||||
- HTML files generated by rustdoc, like `doc/std`
|
||||
|
||||
[rlib]: ../serialization.md
|
||||
|
||||
#### Examples
|
||||
|
||||
- `x.py build --stage 0` means to build with the beta `rustc`.
|
||||
- `x.py doc --stage 0` means to document using the beta `rustdoc`.
|
||||
- `x.py test --stage 0 library/std` means to run tests on the standard library
|
||||
without building `rustc` from source ('build with stage 0, then test the
|
||||
artifacts'). If you're working on the standard library, this is normally the
|
||||
test command you want.
|
||||
- `x.py test src/test/ui` means to build the stage 1 compiler and run
|
||||
`compiletest` on it. If you're working on the compiler, this is normally the
|
||||
test command you want.
|
||||
|
||||
#### Examples of what *not* to do
|
||||
|
||||
- `x.py test --stage 0 src/test/ui` is not meaningful: it runs tests on the
|
||||
_beta_ compiler and doesn't build `rustc` from source. Use `test src/test/ui`
|
||||
instead, which builds stage 1 from source.
|
||||
- `x.py test --stage 0 compiler/rustc` builds the compiler but runs no tests:
it's running `cargo test -p rustc`, but cargo doesn't understand Rust's
tests. You shouldn't need to use this; use `test` instead (without arguments).
|
||||
- `x.py build --stage 0 compiler/rustc` builds the compiler, but does not make
|
||||
it usable: the build artifacts are not assembled into the final compiler
|
||||
([#73519]). Use `x.py build library/std` instead, which puts the compiler in
|
||||
`stage1/rustc`.
|
||||
|
||||
[#73519]: https://github.com/rust-lang/rust/issues/73519
|
||||
|
||||
### Building vs. Running
|
||||
|
||||
|
||||
Note that `build --stage N compiler/rustc` **does not** build the stage N compiler:
|
||||
instead it builds the stage _N+1_ compiler _using_ the stage N compiler.
|
||||
|
||||
In short, _stage 0 uses the stage0 compiler to create stage0 artifacts which
|
||||
will later be uplifted to be the stage1 compiler_.
|
||||
|
||||
In each stage, two major steps are performed:
|
||||
|
||||
1. `std` is compiled by the stage N compiler.
|
||||
2. That `std` is linked to programs built by the stage N compiler, including
|
||||
the stage N artifacts (stage (N+1) compiler).
|
||||
|
||||
This is somewhat intuitive if one thinks of the stage N artifacts as "just"
|
||||
another program we are building with the stage N compiler:
|
||||
`build --stage N compiler/rustc` is linking the stage N artifacts to the `std`
|
||||
built by the stage N compiler.
|
||||
|
||||
Here is a chart of a full build using `x.py`:
|
||||
|
||||
<img alt="A diagram of the rustc compilation phases" src="../img/rustc_stages.svg" class="center" />
|
||||
|
||||
|
|
@ -166,6 +201,58 @@ Keep in mind this diagram is a simplification, i.e. `rustdoc` can be built at
|
|||
different stages, the process is a bit different when passing flags such as
|
||||
`--keep-stage`, or if there are non-host targets.
|
||||
|
||||
The stage 2 compiler is what is shipped to end-users.
|
||||
|
||||
### Stages and `std`
|
||||
|
||||
Note that there are two `std` libraries in play here:
|
||||
1. The library _linked_ to `stageN/rustc`, which was built by stage N-1 (stage N-1 `std`)
|
||||
2. The library _used to compile programs_ with `stageN/rustc`, which was
|
||||
built by stage N (stage N `std`).
|
||||
|
||||
Stage N `std` is pretty much necessary for any useful work with the stage N compiler.
|
||||
Without it, you can only compile programs with `#![no_core]` -- not terribly useful!
|
||||
|
||||
These need to be different because they aren't necessarily ABI-compatible:
there could be new layout optimizations, changes to MIR, or other changes
to Rust metadata on nightly that aren't present in beta.
|
||||
|
||||
This is also where `--keep-stage 1 library/std` comes into play. Since most
|
||||
changes to the compiler don't actually change the ABI, once you've produced a
|
||||
`std` in stage 1, you can probably just reuse it with a different compiler.
|
||||
If the ABI hasn't changed, you're good to go, no need to spend time
|
||||
recompiling that `std`.
|
||||
`--keep-stage` simply assumes the previous compile is fine and copies those
|
||||
artifacts into the appropriate place, skipping the cargo invocation.
|
||||
|
||||
### Cross-compiling
|
||||
|
||||
Building stage2 `std` is different depending on whether you are cross-compiling or not
|
||||
(see in the table how stage2 only builds non-host `std` targets).
|
||||
This is because `x.py` uses a trick: if `HOST` and `TARGET` are the same,
|
||||
it will reuse stage1 `std` for stage2! This is sound because stage1 `std`
|
||||
was compiled with the stage1 compiler, i.e. a compiler using the source code
|
||||
you currently have checked out. So it should be identical (and therefore ABI-compatible)
|
||||
to the `std` that `stage2/rustc` would compile.
|
||||
|
||||
However, when cross-compiling, stage1 `std` will only run on the host.
|
||||
So the stage2 compiler has to recompile `std` for the target.
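
The host/target rule described above can be sketched as a small decision function. This is a simplified model, not rustbuild's actual code: the `Stage2Std` enum, the `stage2_std` function, and the example target triples are all assumptions made for illustration.

```rust
// Hypothetical model of the decision described above; not rustbuild's code.

/// Where the `std` for a given target in the stage2 sysroot comes from.
#[derive(Debug, PartialEq)]
enum Stage2Std {
    /// Host and target match: reuse (uplift) the `std` built by stage1.
    ReuseStage1,
    /// Cross-compiling: the stage2 compiler recompiles `std` for the target.
    Rebuild,
}

fn stage2_std(host: &str, target: &str) -> Stage2Std {
    if host == target {
        Stage2Std::ReuseStage1
    } else {
        Stage2Std::Rebuild
    }
}

fn main() {
    // Example triples chosen for illustration only.
    let host = "x86_64-unknown-linux-gnu";
    let targets = [host, "aarch64-unknown-linux-gnu"];
    for target in targets.iter() {
        println!("{} -> {:?}", target, stage2_std(host, target));
    }
}
```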
|
||||
|
||||
### Why does only libstd use `cfg(bootstrap)`?
|
||||
|
||||
The `rustc` generated by the stage0 compiler is linked to the freshly-built
|
||||
`std`, which means that for the most part only `std` needs to be cfg-gated,
|
||||
so that `rustc` can use features added to std immediately after their addition,
|
||||
without need for them to get into the downloaded beta.
|
||||
|
||||
Note this is different from any other Rust program: stage1 `rustc`
|
||||
is built by the _beta_ compiler, but using the _master_ version of libstd!
|
||||
|
||||
The only time `rustc` uses `cfg(bootstrap)` is when it adds internal lints
|
||||
that use diagnostic items. This happens very rarely.
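
As a rough sketch of what such gating looks like, the snippet below selects between two definitions of the same function. It assumes the build system sets the `bootstrap` cfg only while the beta (stage0) compiler is doing the compiling. The function name `built_by` is hypothetical and does not come from `std` or `rustc`.

```rust
// Sketch of `cfg(bootstrap)` gating. The function name is hypothetical.

/// Selected when the crate is compiled with `--cfg bootstrap`.
#[cfg(bootstrap)]
fn built_by() -> &'static str {
    "stage0 (beta) compiler"
}

/// Selected when the `bootstrap` cfg is not set.
#[cfg(not(bootstrap))]
fn built_by() -> &'static str {
    "compiler built from the current source"
}

fn main() {
    println!("built by the {}", built_by());
}
```

Compiled on its own, without `--cfg bootstrap`, only the second definition exists. Recent compilers may also warn that `bootstrap` is an unexpected cfg name when the snippet is built outside the bootstrap process.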
|
||||
|
||||
### Directories and artifacts generated by x.py
|
||||
|
||||
The following tables indicate the outputs of various stage actions:
|
||||
|
||||
| Stage 0 Action | Output |
|
||||
|
|
@ -178,7 +265,7 @@ The following tables indicate the outputs of various stage actions:
|
|||
| copy `stage0-rustc (except executable)` | `build/HOST/stage0-sysroot/lib/rustlib/HOST` |
|
||||
| build `llvm` | `build/HOST/llvm` |
|
||||
| `stage0` builds `codegen` with `stage0-sysroot` | `build/HOST/stage0-codegen/HOST` |
|
||||
| `stage0` builds `rustdoc` with `stage0-sysroot` | `build/HOST/stage0-tools/HOST` |
|
||||
| `stage0` builds `rustdoc`, `clippy`, `miri`, with `stage0-sysroot` | `build/HOST/stage0-tools/HOST` |
|
||||
|
||||
`--stage=0` stops here.
|
||||
|
||||
|
|
@ -201,85 +288,11 @@ The following tables indicate the outputs of various stage actions:
|
|||
| copy (uplift) `stage1-sysroot` | `build/HOST/stage2/lib and build/HOST/stage2/lib/rustlib/HOST` |
|
||||
| `stage2` builds `test`/`std` (not HOST targets) | `build/HOST/stage2-std/TARGET` |
|
||||
| copy `stage2-std` (not HOST targets) | `build/HOST/stage2/lib/rustlib/TARGET` |
|
||||
| `stage2` builds `rustdoc` | `build/HOST/stage2-tools/HOST` |
|
||||
| `stage2` builds `rustdoc`, `clippy`, `miri` | `build/HOST/stage2-tools/HOST` |
|
||||
| copy `rustdoc` | `build/HOST/stage2/bin` |
|
||||
|
||||
`--stage=2` stops here.
|
||||
|
||||
Note that the convention `x.py` uses is that:
|
||||
- A "stage N artifact" is an artifact that is _produced_ by the stage N compiler.
|
||||
- The "stage (N+1) compiler" is assembled from "stage N artifacts".
|
||||
- A `--stage N` flag means build _with_ stage N.
|
||||
|
||||
In short, _stage 0 uses the stage0 compiler to create stage0 artifacts which
|
||||
will later be uplifted to stage1_.
|
||||
|
||||
Every time any of the main artifacts (`std` and `rustc`) are compiled, two
|
||||
steps are performed.
|
||||
When `std` is compiled by a stage N compiler, that `std` will be linked to
|
||||
programs built by the stage N compiler (including `rustc` built later
|
||||
on). It will also be used by the stage (N+1) compiler to link against itself.
|
||||
This is somewhat intuitive if one thinks of the stage (N+1) compiler as "just"
|
||||
another program we are building with the stage N compiler. In some ways, `rustc`
|
||||
(the binary, not the `rustbuild` step) could be thought of as one of the few
|
||||
`no_core` binaries out there.
|
||||
|
||||
So "stage0 std artifacts" are in fact the output of the downloaded stage0
|
||||
compiler, and are going to be used for anything built by the stage0 compiler:
|
||||
e.g. `rustc` artifacts. When it announces that it is "building stage1
|
||||
std artifacts" it has moved on to the next bootstrapping phase. This pattern
|
||||
continues in latter stages.
|
||||
|
||||
Also note that building host `std` and target `std` are different based on the
|
||||
stage (e.g. see in the table how stage2 only builds non-host `std` targets.
|
||||
This is because during stage2, the host `std` is uplifted from the "stage 1"
|
||||
`std` -- specifically, when "Building stage 1 artifacts" is announced, it is
|
||||
later copied into stage2 as well (both the compiler's `libdir` and the
|
||||
`sysroot`).
|
||||
|
||||
This `std` is pretty much necessary for any useful work with the compiler.
|
||||
Specifically, it's used as the `std` for programs compiled by the newly compiled
|
||||
compiler (so when you compile `fn main() { }` it is linked to the last `std`
|
||||
compiled with `x.py build library/std`).
|
||||
|
||||
The `rustc` generated by the stage0 compiler is linked to the freshly-built
|
||||
`std`, which means that for the most part only `std` needs to be cfg-gated,
|
||||
so that `rustc` can use featured added to std immediately after their addition,
|
||||
without need for them to get into the downloaded beta. The `std` built by the
|
||||
`stage1/bin/rustc` compiler, also known as "stage1 std artifacts", is not
|
||||
necessarily ABI-compatible with that compiler.
|
||||
That is, the `rustc` binary most likely could not use this `std` itself.
|
||||
It is however ABI-compatible with any programs that the `stage1/bin/rustc`
|
||||
binary builds (including itself), so in that sense they're paired.
|
||||
|
||||
This is also where `--keep-stage 1 library/std` comes into play. Since most
|
||||
changes to the compiler don't actually change the ABI, once you've produced a
|
||||
`std` in stage 1, you can probably just reuse it with a different compiler.
|
||||
If the ABI hasn't changed, you're good to go, no need to spend the time
|
||||
recompiling that `std`.
|
||||
`--keep-stage` simply assumes the previous compile is fine and copies those
|
||||
artifacts into the appropriate place, skipping the cargo invocation.
|
||||
|
||||
The reason we first build `std`, then `rustc`, is largely just
|
||||
because we want to minimize `cfg(stage0)` in the code for `rustc`.
|
||||
Currently `rustc` is always linked against a "new" `std` so it doesn't
|
||||
ever need to be concerned with differences in std; it can assume that the std is
|
||||
as fresh as possible.
|
||||
|
||||
The reason we need to build it twice is because of ABI compatibility.
|
||||
The beta compiler has it's own ABI, and then the `stage1/bin/rustc` compiler
|
||||
will produce programs/libraries with the new ABI.
|
||||
We used to build three times, but because we assume that the ABI is constant
|
||||
within a codebase, we presume that the libraries produced by the "stage2"
|
||||
compiler (produced by the `stage1/bin/rustc` compiler) is ABI-compatible with
|
||||
the `stage1/bin/rustc` compiler's produced libraries.
|
||||
What this means is that we can skip that final compilation -- and simply use the
|
||||
same libraries as the `stage2/bin/rustc` compiler uses itself for programs it
|
||||
links against.
|
||||
|
||||
This `stage2/bin/rustc` compiler is shipped to end-users, along with the
|
||||
`stage 1 {std,rustc}` artifacts.
|
||||
|
||||
## Passing stage-specific flags to `rustc`
|
||||
|
||||
`x.py` allows you to pass stage-specific flags to `rustc` when bootstrapping.
|
||||
|
|
@ -287,7 +300,7 @@ The `RUSTFLAGS_STAGE_0`, `RUSTFLAGS_STAGE_1` and `RUSTFLAGS_STAGE_2`
|
|||
environment variables pass the given flags when building stage 0, 1, and 2
|
||||
artifacts respectively.
|
||||
|
||||
Additionally, the `RUSTFLAGS_STAGE_NOT_0` variable, as its name suggests, pass
|
||||
Additionally, the `RUSTFLAGS_STAGE_NOT_0` variable, as its name suggests, passes
|
||||
the given arguments if the stage is not 0.
|
||||
|
||||
## Environment Variables
|
||||
|
|
|
|||