we got 3 (#1447)
This commit is contained in:
parent
4128e99571
commit
f54dffb9e1
|
|
@ -1,13 +1,16 @@
|
|||
# Code generation
|
||||
|
||||
Code generation or "codegen" is the part of the compiler that actually
|
||||
generates an executable binary. Usually, rustc uses LLVM for code generation;
|
||||
there is also support for [Cranelift]. The key is that rustc doesn't implement
|
||||
codegen itself. It's worth noting, though, that in the Rust source code, many
|
||||
parts of the backend have `codegen` in their names (there are no hard
|
||||
boundaries).
|
||||
Code generation (or "codegen") is the part of the compiler
|
||||
that actually generates an executable binary.
|
||||
Usually, rustc uses LLVM for code generation,
|
||||
bu there is also support for [Cranelift] and [GCC].
|
||||
The key is that rustc doesn't implement codegen itself.
|
||||
It's worth noting, though, that in the Rust source code,
|
||||
many parts of the backend have `codegen` in their names
|
||||
(there are no hard boundaries).
|
||||
|
||||
[Cranelift]: https://github.com/bytecodealliance/wasmtime/tree/HEAD/cranelift
|
||||
[Cranelift]: https://github.com/bytecodealliance/wasmtime/tree/main/cranelift
|
||||
[GCC]: https://github.com/rust-lang/rustc_codegen_gcc
|
||||
|
||||
> NOTE: If you are looking for hints on how to debug code generation bugs,
|
||||
> please see [this section of the debugging chapter][debugging].
|
||||
|
|
|
|||
|
|
@ -1,54 +1,57 @@
|
|||
# From MIR to Binaries
|
||||
|
||||
All of the preceding chapters of this guide have one thing in common: we never
|
||||
generated any executable machine code at all! With this chapter, all of that
|
||||
changes.
|
||||
All of the preceding chapters of this guide have one thing in common:
|
||||
we never generated any executable machine code at all!
|
||||
With this chapter, all of that changes.
|
||||
|
||||
So far, we've shown how the compiler can take raw source code in text format
|
||||
and transform it into [MIR]. We have also shown how the compiler does various
|
||||
analyses on the code to detect things like type or lifetime errors. Now, we
|
||||
will finally take the MIR and produce some executable machine code.
|
||||
So far,
|
||||
we've shown how the compiler can take raw source code in text format
|
||||
and transform it into [MIR].
|
||||
We have also shown how the compiler does various
|
||||
analyses on the code to detect things like type or lifetime errors.
|
||||
Now, we will finally take the MIR and produce some executable machine code.
|
||||
|
||||
[MIR]: ./mir/index.md
|
||||
|
||||
> NOTE: This part of a compiler is often called the _backend_. The term is a bit
|
||||
> overloaded because in the compiler source, it usually refers to the "codegen
|
||||
> backend" (i.e. LLVM or Cranelift). Usually, when you see the word "backend"
|
||||
> in this part, we are referring to the "codegen backend".
|
||||
> NOTE: This part of a compiler is often called the _backend_.
|
||||
> The term is a bit overloaded because in the compiler source,
|
||||
> it usually refers to the "codegen backend" (i.e. LLVM, Cranelift, or GCC).
|
||||
> Usually, when you see the word "backend" in this part,
|
||||
> we are referring to the "codegen backend".
|
||||
|
||||
So what do we need to do?
|
||||
|
||||
0. First, we need to collect the set of things to generate code for. In
|
||||
particular, we need to find out which concrete types to substitute for
|
||||
generic ones, since we need to generate code for the concrete types.
|
||||
Generating code for the concrete types (i.e. emitting a copy of the code for
|
||||
each concrete type) is called _monomorphization_, so the process of
|
||||
collecting all the concrete types is called _monomorphization collection_.
|
||||
0. First, we need to collect the set of things to generate code for.
|
||||
In particular,
|
||||
we need to find out which concrete types to substitute for generic ones,
|
||||
since we need to generate code for the concrete types.
|
||||
Generating code for the concrete types
|
||||
(i.e. emitting a copy of the code for each concrete type) is called _monomorphization_,
|
||||
so the process of collecting all the concrete types is called _monomorphization collection_.
|
||||
1. Next, we need to actually lower the MIR to a codegen IR
|
||||
(usually LLVM IR) for each concrete type we collected.
|
||||
2. Finally, we need to invoke LLVM or Cranelift, which runs a bunch of
|
||||
optimization passes, generates executable code, and links together an
|
||||
executable binary.
|
||||
2. Finally, we need to invoke the codegen backend,
|
||||
which runs a bunch of optimization passes,
|
||||
generates executable code,
|
||||
and links together an executable binary.
|
||||
|
||||
[codegen1]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_crate.html
|
||||
|
||||
The code for codegen is actually a bit complex due to a few factors:
|
||||
|
||||
- Support for multiple codegen backends (LLVM and Cranelift). We try to share as much
|
||||
backend code between them as possible, so a lot of it is generic over the
|
||||
codegen implementation. This means that there are often a lot of layers of
|
||||
abstraction.
|
||||
- Support for multiple codegen backends (LLVM, Cranelift, and GCC).
|
||||
We try to share as much backend code between them as possible,
|
||||
so a lot of it is generic over the codegen implementation.
|
||||
This means that there are often a lot of layers of abstraction.
|
||||
- Codegen happens asynchronously in another thread for performance.
|
||||
- The actual codegen is done by a third-party library (either LLVM or Cranelift).
|
||||
- The actual codegen is done by a third-party library (either of the 3 backends).
|
||||
|
||||
Generally, the [`rustc_codegen_ssa`][ssa] crate contains backend-agnostic code
|
||||
(i.e. independent of LLVM or Cranelift), while the [`rustc_codegen_llvm`][llvm]
|
||||
crate contains code specific to LLVM codegen.
|
||||
Generally, the [`rustc_codegen_ssa`][ssa] crate contains backend-agnostic code,
|
||||
while the [`rustc_codegen_llvm`][llvm] crate contains code specific to LLVM codegen.
|
||||
|
||||
[ssa]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/index.html
|
||||
[llvm]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/index.html
|
||||
|
||||
At a very high level, the entry point is
|
||||
[`rustc_codegen_ssa::base::codegen_crate`][codegen1]. This function starts the
|
||||
process discussed in the rest of this chapter.
|
||||
|
||||
[`rustc_codegen_ssa::base::codegen_crate`][codegen1].
|
||||
This function starts the process discussed in the rest of this chapter.
|
||||
|
|
|
|||
Loading…
Reference in New Issue