Add section about building an optimized version of `rustc`
parent c81bddf34e
commit 665bd2cdcc
@ -14,6 +14,7 @@
|
|||
- [Building Documentation](./building/compiler-documenting.md)
|
||||
- [Rustdoc overview](./rustdoc.md)
|
||||
- [Adding a new target](./building/new-target.md)
|
||||
- [Optimized build](./building/optimized-build.md)
|
||||
- [Testing the compiler](./tests/intro.md)
|
||||
- [Running tests](./tests/running.md)
|
||||
- [Testing with Docker](./tests/docker.md)
|
||||
|
|
|
|||
|
|
@ -0,0 +1,131 @@
|
|||
# Optimized build of the compiler
|
||||
|
||||
<!-- toc -->
|
||||
|
||||
There are multiple additional build configuration options and techniques that can be used to compile a
build of `rustc` that is as optimized as possible (for example when building `rustc` for a Linux
distribution). The status of these configuration options for various Rust targets is tracked [here].
This page describes how you can use these approaches when building `rustc` yourself.
|
||||
|
||||
[here]: https://github.com/rust-lang/rust/issues/103595
|
||||
|
||||
## Link-time optimization
|
||||
|
||||
Link-time optimization is a powerful compiler technique that can increase program performance. To
|
||||
enable (Thin-)LTO when building `rustc`, set the `rust.lto` config option to `"thin"`
|
||||
in `config.toml`:
|
||||
|
||||
```toml
|
||||
[rust]
|
||||
lto = "thin"
|
||||
```
|
||||
|
||||
> Note that LTO for `rustc` is currently supported and tested only for
|
||||
> the `x86_64-unknown-linux-gnu` target. Other targets *may* work, but no guarantees are provided.
|
||||
> Notably, LTO-optimized `rustc` currently produces [miscompilations] on Windows.
|
||||
|
||||
[miscompilations]: https://github.com/rust-lang/rust/issues/109114
|
||||
|
||||
Enabling LTO on Linux has [produced] speed-ups of up to 10%.
|
||||
|
||||
[produced]: https://github.com/rust-lang/rust/pull/101403#issuecomment-1288190019
|
||||
|
||||
## Memory allocator
|
||||
|
||||
Using a different memory allocator for `rustc` can provide significant performance benefits. If you
|
||||
want to enable the `jemalloc` allocator, you can set the `rust.jemalloc` option to `true`
|
||||
in `config.toml`:
|
||||
|
||||
```toml
|
||||
[rust]
|
||||
jemalloc = true
|
||||
```
|
||||
|
||||
> Note that this option is currently only supported for Linux and macOS targets.
|
||||
|
||||
## Codegen units
|
||||
|
||||
Reducing the number of codegen units per `rustc` crate can produce a faster build of the compiler.
|
||||
You can modify the number of codegen units for `rustc` and `libstd` in `config.toml` with the
|
||||
following options:
|
||||
|
||||
```toml
|
||||
[rust]
|
||||
codegen-units = 1
|
||||
codegen-units-std = 1
|
||||
```
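The `[rust]` options from the sections above can live in a single table. The following is a minimal sketch that writes them to a file. `CONFIG_OUT` and `write_optimized_config` are hypothetical names introduced for illustration, and the default output is a scratch file so that an existing `config.toml` is not overwritten. Keep in mind the platform restrictions noted above: LTO is only tested on `x86_64-unknown-linux-gnu`, and `jemalloc` is only supported on Linux and macOS.

```bash
#!/usr/bin/env bash
# Sketch: collect the [rust] options discussed above into one config file.
# write_optimized_config and CONFIG_OUT are illustrative names, not part of x.py.
write_optimized_config() {
    local out="$1"
    cat > "$out" <<'EOF'
[rust]
lto = "thin"
jemalloc = true
codegen-units = 1
codegen-units-std = 1
EOF
}

# Default to a temporary file; point CONFIG_OUT at config.toml once you are
# happy with the contents.
CONFIG_OUT="${CONFIG_OUT:-$(mktemp)}"
write_optimized_config "$CONFIG_OUT"
echo "wrote $CONFIG_OUT"
```

If your `config.toml` already has a `[rust]` table, merge these keys into it by hand instead of appending a second table.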
|
||||
|
||||
## Instruction set
|
||||
|
||||
By default, `rustc` is compiled for a generic (and conservative) instruction set architecture
|
||||
(depending on the selected target), to make it support as many CPUs as possible. If you want to
|
||||
compile `rustc` for a specific instruction set architecture, you can set the `target_cpu` compiler
|
||||
option in `RUSTFLAGS`:
|
||||
|
||||
```bash
|
||||
$ RUSTFLAGS="-C target_cpu=x86-64-v3" x.py build ...
|
||||
```
|
||||
|
||||
If you also want to compile LLVM for a specific instruction set, you can set `llvm` flags
|
||||
in `config.toml`:
|
||||
|
||||
```toml
|
||||
[llvm]
|
||||
cxxflags = "-march=x86-64-v3"
|
||||
cflags = "-march=x86-64-v3"
|
||||
```
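A compiler built for `x86-64-v3` will not run on CPUs that lack the corresponding instructions, so it can be worth checking the machines that will run it. The sketch below is one way to do that on Linux. The function name `supports_x86_64_v3` is hypothetical, and the flag list is an assumption based on the features that the `x86-64-v3` level adds (AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE) spelled the way `/proc/cpuinfo` reports them. It does not verify the older `x86-64-v2` baseline.

```bash
#!/usr/bin/env bash
# Sketch: return 0 if the given space-separated CPU flag list contains every
# flag that x86-64-v3 adds. Names follow /proc/cpuinfo (LZCNT appears as "abm").
supports_x86_64_v3() {
    local flags
    # Normalize tabs/newlines to single spaces and pad with spaces so that
    # whole-word matching works ("avx" must not match inside "avx2").
    flags=" $(printf '%s' "$1" | tr -s '[:space:]' ' ') "
    local needed="avx avx2 bmi1 bmi2 f16c fma abm movbe xsave"
    local f
    for f in $needed; do
        case "$flags" in
            *" $f "*) ;;
            *) echo "missing CPU flag: $f" >&2; return 1 ;;
        esac
    done
    return 0
}

# Usage on a Linux host: read the first "flags" line and test it.
if [ -r /proc/cpuinfo ]; then
    cpu_flags="$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
    if supports_x86_64_v3 "$cpu_flags"; then
        echo "host CPU reports the x86-64-v3 feature flags"
    else
        echo "host CPU does not report all x86-64-v3 feature flags"
    fi
fi
```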
|
||||
|
||||
## Profile-guided optimization
|
||||
|
||||
Applying profile-guided optimizations (or more generally, feedback-directed optimizations) can
improve `rustc` performance substantially, by up to 25%. However, these techniques cannot be
enabled with a single configuration option. Instead, they require a complex build workflow that
compiles `rustc` multiple times and profiles it on selected benchmarks.
|
||||
|
||||
There is a tool called `opt-dist` that is used to optimize `rustc` with [PGO] (profile-guided
|
||||
optimizations) and [BOLT] (a post-link binary optimizer) for builds distributed to end users. You
|
||||
can examine the tool, which is located in `src/tools/opt-dist`, and build a custom PGO build
|
||||
workflow based on it, or try to use it directly. Note that the tool is currently tightly coupled to
the way it is used in Rust's continuous integration workflows, and it might require some custom
changes to make it work in a different environment.
|
||||
|
||||
[PGO]: https://doc.rust-lang.org/rustc/profile-guided-optimization.html
|
||||
|
||||
[BOLT]: https://github.com/llvm/llvm-project/blob/main/bolt/README.md
|
||||
|
||||
To use the tool, you will need to provide some external dependencies:
|
||||
|
||||
- A Python3 interpreter (for executing `x.py`).
|
||||
- Compiled LLVM toolchain, with the `llvm-profdata` binary. Optionally, if you want to use BOLT,
  the `llvm-bolt` and `merge-fdata` binaries have to be available in the toolchain.
|
||||
- Downloaded [Rust benchmark suite].
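
Before starting a long PGO build, it may help to confirm that the LLVM binaries listed above are actually present. The following is a minimal sketch. `missing_llvm_tools` and `LLVM_BIN_DIR` are hypothetical names, and the assumption that the binaries sit in one `bin` directory of your toolchain may not match your layout.

```bash
#!/usr/bin/env bash
# Sketch: print the names of the requested tools that are NOT executable files
# in the given directory (space-separated, empty output if all are present).
missing_llvm_tools() {
    local bin_dir="$1"
    shift
    local tool missing=""
    for tool in "$@"; do
        [ -x "$bin_dir/$tool" ] || missing="$missing $tool"
    done
    echo "${missing# }"
}

# Usage: LLVM_BIN_DIR is an illustrative variable; set it to your toolchain's
# bin directory. llvm-bolt and merge-fdata only matter if you want BOLT.
LLVM_BIN_DIR="${LLVM_BIN_DIR:-/usr/bin}"
missing="$(missing_llvm_tools "$LLVM_BIN_DIR" llvm-profdata llvm-bolt merge-fdata)"
if [ -n "$missing" ]; then
    echo "not found in $LLVM_BIN_DIR: $missing"
fi
```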
|
||||
|
||||
These dependencies are provided to `opt-dist` by an implementation of the [`Environment`] trait. You
|
||||
can either implement the trait for your custom environment, by providing paths to these dependencies
|
||||
in its methods, or reuse one of the existing implementations (currently, there are implementations
for Linux and Windows). If you want your environment to support BOLT, return `true` from
|
||||
the `supports_bolt` method.
|
||||
|
||||
Here is an example of how `opt-dist` can be used with the default Linux environment (it assumes that
you execute the following commands on a Linux system):
|
||||
|
||||
1. Build the tool with the following command:
|
||||
```bash
|
||||
$ python3 x.py build tools/opt-dist
|
||||
```
|
||||
2. Run the tool with the `PGO_HOST` environment variable set to the 64-bit Linux target:
|
||||
```bash
|
||||
$ PGO_HOST=x86_64-unknown-linux-gnu ./build/host/stage0-tools-bin/opt-dist
|
||||
```
|
||||
Note that the default Linux environment expects several hardcoded paths to exist:
|
||||
- `/checkout` should contain a checkout of the Rust compiler repository that will be compiled.
|
||||
- `/rustroot` should contain the compiled LLVM toolchain (containing BOLT).
|
||||
- A Python 3 interpreter should be available under the `python3` binary.
|
||||
- `/tmp/rustc-perf` should contain a downloaded checkout of the Rust benchmark suite.
|
||||
|
||||
You can modify `LinuxEnvironment` (or implement your own) to override these paths.
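
Because the default Linux environment relies on the fixed paths listed above, a quick pre-flight check can save a failed run. This is a sketch only. `check_opt_dist_layout` is a hypothetical helper, and its optional prefix argument exists purely so the check can be tried against a scratch directory.

```bash
#!/usr/bin/env bash
# Sketch: verify that the directories expected by the default Linux
# environment exist. The optional first argument is a path prefix
# (empty by default, meaning the real absolute paths are checked).
check_opt_dist_layout() {
    local root="${1:-}"
    local status=0
    local dir
    for dir in /checkout /rustroot /tmp/rustc-perf; do
        if [ ! -d "$root$dir" ]; then
            echo "missing directory: $root$dir" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage: check the real paths, and separately that python3 is on PATH.
check_opt_dist_layout || echo "some paths expected by opt-dist are missing"
command -v python3 >/dev/null 2>&1 || echo "python3 not found on PATH"
```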
|
||||
|
||||
[`Environment`]: https://github.com/rust-lang/rust/blob/65e468f9c259749c210b1ae8972bfe14781f72f1/src/tools/opt-dist/src/environment/mod.rs#L8-L7
|
||||
|
||||
[Rust benchmark suite]: https://github.com/rust-lang/rustc-perf
|
||||