# Code generation

Code generation (or "codegen") is the part of the compiler
that actually generates an executable binary.
Usually, rustc uses LLVM for code generation,
but there is also support for [Cranelift] and [GCC].
The key is that rustc doesn't implement codegen itself.
It's worth noting, though, that in the Rust source code,
many parts of the backend have `codegen` in their names
(there are no hard boundaries).

[Cranelift]: https://github.com/bytecodealliance/wasmtime/tree/main/cranelift
[GCC]: https://github.com/rust-lang/rustc_codegen_gcc

> NOTE: If you are looking for hints on how to debug code generation bugs,
> please see [this section of the debugging chapter][debugging].

[debugging]: ./debugging.md

## What is LLVM?

[LLVM](https://llvm.org) is "a collection of modular and reusable compiler and
toolchain technologies". In particular, the LLVM project contains a pluggable
compiler backend (also called "LLVM"), which is used by many compiler projects,
including the `clang` C compiler and our beloved `rustc`.

LLVM takes input in the form of LLVM IR, which is essentially assembly code
with additional low-level types and annotations. These annotations are helpful
for optimizing both the LLVM IR and the generated machine code. The end
result of all this is (at long last) something executable (e.g. an ELF object,
an EXE, or wasm).

There are a few benefits to using LLVM:

- We don't have to write a whole compiler backend. This reduces implementation
  and maintenance burden.
- We benefit from the large suite of advanced optimizations that the LLVM
  project has been collecting.
- We can automatically compile Rust to any of the platforms for which LLVM has
  support. For example, as soon as LLVM added support for wasm, voila! rustc,
  clang, and a bunch of other languages were able to compile to wasm! (Well,
  there was some extra stuff to be done, but we were 90% there anyway).
- We and other compiler projects benefit from each other. For example, when the
  [Spectre and Meltdown security vulnerabilities][spectre] were discovered,
  only LLVM needed to be patched.

[spectre]: https://meltdownattack.com/
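One way to get a feel for the LLVM IR described in this section is to ask
rustc to emit it for a small function. The following is a minimal sketch, not
part of the compiler itself: the file name `add.rs` and the function `add` are
hypothetical, and the exact IR produced depends on the rustc and LLVM versions
in use.

```rust
// add.rs: a hypothetical file name used only for this illustration.
//
// To inspect the LLVM IR generated for this file, one could run:
//
//     rustc --crate-type=lib --emit=llvm-ir add.rs
//
// which writes a textual `add.ll` file to the current directory
// instead of producing a library.

/// Adds two integers, wrapping on overflow instead of panicking.
pub fn add(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}
```

The emitted file should contain a function definition corresponding to `add`
under a mangled name, alongside the low-level types and annotations mentioned
above.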

## Running LLVM, linking, and metadata generation

Once LLVM IR has been built for all of the functions, statics, etc., it is time
to start running LLVM and its optimization passes. LLVM IR is grouped into
"modules". Multiple "modules" can be codegened at the same time to aid in
multi-core utilization. These "modules" are what we refer to as _codegen
units_. These units were established way back during the monomorphization
collection phase.

Once LLVM produces objects from these modules, these objects are passed to the
linker along with, optionally, the metadata object, and an archive or an
executable is produced.

It is not necessarily the codegen phase described above that runs the
optimizations. With certain kinds of LTO (link-time optimization), the
optimization might happen at link time instead. It is also possible for some
optimizations to happen before objects are passed on to the linker and some to
happen during linking.

This all happens towards the very end of compilation. The code for this can be
found in [`rustc_codegen_ssa::back`][ssaback] and
[`rustc_codegen_llvm::back`][llvmback]. Sadly, this piece of code is not
really well-separated into LLVM-dependent and LLVM-independent code; the
[`rustc_codegen_ssa`][ssa] crate contains a fair amount of code specific to
the LLVM backend.

Once these components are done with their work, you end up with a number of
files in your filesystem corresponding to the outputs you have requested.

[ssa]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/index.html
[ssaback]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/back/index.html
[llvmback]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/back/index.html
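The idea of splitting work into codegen units and processing them in parallel
can be sketched with a toy model. Everything in it is hypothetical and exists
only for illustration: the `ToyCodegenUnit` type, the `partition` and
`codegen_in_parallel` functions, the round-robin strategy, and the fake object
names. rustc's real partitioning and its interaction with LLVM are
considerably more involved.

```rust
use std::thread;

/// A toy stand-in for a codegen unit: a named group of items.
/// (Hypothetical type; rustc's real representation is different.)
#[derive(Debug, Clone, PartialEq)]
pub struct ToyCodegenUnit {
    pub name: String,
    pub items: Vec<String>,
}

/// Splits `items` into at most `n` units using a simple round-robin scheme.
/// This is an assumption made for the sketch, not rustc's actual strategy.
pub fn partition(items: &[&str], n: usize) -> Vec<ToyCodegenUnit> {
    // Use at least one unit, and never more units than there are items.
    let n = n.clamp(1, items.len().max(1));
    let mut units: Vec<ToyCodegenUnit> = (0..n)
        .map(|i| ToyCodegenUnit { name: format!("cgu.{}", i), items: Vec::new() })
        .collect();
    for (i, item) in items.iter().enumerate() {
        units[i % n].items.push(item.to_string());
    }
    units
}

/// "Compiles" every unit on its own thread and returns one fake object
/// description per unit, in the same order as the input.
pub fn codegen_in_parallel(units: &[ToyCodegenUnit]) -> Vec<String> {
    thread::scope(|s| {
        // Spawn one scoped worker per unit; each borrows its unit immutably.
        let handles: Vec<_> = units
            .iter()
            .map(|unit| {
                s.spawn(move || format!("{}.o: {} item(s)", unit.name, unit.items.len()))
            })
            .collect();
        // Join in spawn order so the results line up with the units.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect::<Vec<String>>()
    })
}
```

In a real build, the number of codegen units can be influenced with the
`-C codegen-units` flag, which trades off parallelism against optimization
opportunities within a single unit.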