we got 3 (#1447)
parent
4128e99571
commit
f54dffb9e1
|
|
@ -1,13 +1,16 @@
|
||||||
# Code generation
|
# Code generation
|
||||||
|
|
||||||
Code generation or "codegen" is the part of the compiler that actually
|
Code generation (or "codegen") is the part of the compiler
|
||||||
generates an executable binary. Usually, rustc uses LLVM for code generation;
|
that actually generates an executable binary.
|
||||||
there is also support for [Cranelift]. The key is that rustc doesn't implement
|
Usually, rustc uses LLVM for code generation,
|
||||||
codegen itself. It's worth noting, though, that in the Rust source code, many
|
but there is also support for [Cranelift] and [GCC].
|
||||||
parts of the backend have `codegen` in their names (there are no hard
|
The key is that rustc doesn't implement codegen itself.
|
||||||
boundaries).
|
It's worth noting, though, that in the Rust source code,
|
||||||
|
many parts of the backend have `codegen` in their names
|
||||||
|
(there are no hard boundaries).
|
||||||
|
|
||||||
[Cranelift]: https://github.com/bytecodealliance/wasmtime/tree/HEAD/cranelift
|
[Cranelift]: https://github.com/bytecodealliance/wasmtime/tree/main/cranelift
|
||||||
|
[GCC]: https://github.com/rust-lang/rustc_codegen_gcc
|
||||||
|
|
||||||
> NOTE: If you are looking for hints on how to debug code generation bugs,
|
> NOTE: If you are looking for hints on how to debug code generation bugs,
|
||||||
> please see [this section of the debugging chapter][debugging].
|
> please see [this section of the debugging chapter][debugging].
|
||||||
|
|
|
||||||
|
|
@ -1,54 +1,57 @@
|
||||||
# From MIR to Binaries
|
# From MIR to Binaries
|
||||||
|
|
||||||
All of the preceding chapters of this guide have one thing in common: we never
|
All of the preceding chapters of this guide have one thing in common:
|
||||||
generated any executable machine code at all! With this chapter, all of that
|
we never generated any executable machine code at all!
|
||||||
changes.
|
With this chapter, all of that changes.
|
||||||
|
|
||||||
So far, we've shown how the compiler can take raw source code in text format
|
So far,
|
||||||
and transform it into [MIR]. We have also shown how the compiler does various
|
we've shown how the compiler can take raw source code in text format
|
||||||
analyses on the code to detect things like type or lifetime errors. Now, we
|
and transform it into [MIR].
|
||||||
will finally take the MIR and produce some executable machine code.
|
We have also shown how the compiler does various
|
||||||
|
analyses on the code to detect things like type or lifetime errors.
|
||||||
|
Now, we will finally take the MIR and produce some executable machine code.
|
||||||
|
|
||||||
[MIR]: ./mir/index.md
|
[MIR]: ./mir/index.md
|
||||||
|
|
||||||
> NOTE: This part of a compiler is often called the _backend_. The term is a bit
|
> NOTE: This part of a compiler is often called the _backend_.
|
||||||
> overloaded because in the compiler source, it usually refers to the "codegen
|
> The term is a bit overloaded because in the compiler source,
|
||||||
> backend" (i.e. LLVM or Cranelift). Usually, when you see the word "backend"
|
> it usually refers to the "codegen backend" (i.e. LLVM, Cranelift, or GCC).
|
||||||
> in this part, we are referring to the "codegen backend".
|
> Usually, when you see the word "backend" in this part,
|
||||||
|
> we are referring to the "codegen backend".
|
||||||
|
|
||||||
So what do we need to do?
|
So what do we need to do?
|
||||||
|
|
||||||
0. First, we need to collect the set of things to generate code for. In
|
0. First, we need to collect the set of things to generate code for.
|
||||||
particular, we need to find out which concrete types to substitute for
|
In particular,
|
||||||
generic ones, since we need to generate code for the concrete types.
|
we need to find out which concrete types to substitute for generic ones,
|
||||||
Generating code for the concrete types (i.e. emitting a copy of the code for
|
since we need to generate code for the concrete types.
|
||||||
each concrete type) is called _monomorphization_, so the process of
|
Generating code for the concrete types
|
||||||
collecting all the concrete types is called _monomorphization collection_.
|
(i.e. emitting a copy of the code for each concrete type) is called _monomorphization_,
|
||||||
|
so the process of collecting all the concrete types is called _monomorphization collection_.
|
||||||
1. Next, we need to actually lower the MIR to a codegen IR
|
1. Next, we need to actually lower the MIR to a codegen IR
|
||||||
(usually LLVM IR) for each concrete type we collected.
|
(usually LLVM IR) for each concrete type we collected.
|
||||||
2. Finally, we need to invoke LLVM or Cranelift, which runs a bunch of
|
2. Finally, we need to invoke the codegen backend,
|
||||||
optimization passes, generates executable code, and links together an
|
which runs a bunch of optimization passes,
|
||||||
executable binary.
|
generates executable code,
|
||||||
|
and links together an executable binary.
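
As a small illustration of step 0, here is a minimal sketch of what monomorphization means from the user's side. The function `describe` is a hypothetical example, not part of rustc. The sketch assumes only the standard library and shows that each concrete type used at a call site gets its own instantiation of the generic function.

```rust
use std::any::type_name;
use std::mem::size_of;

/// Hypothetical generic function, used only for illustration.
/// No machine code is emitted for `describe<T>` in the abstract;
/// one copy is emitted for each concrete `T` that is actually used.
fn describe<T>(_value: T) -> String {
    format!("{} ({} bytes)", type_name::<T>(), size_of::<T>())
}

fn main() {
    // These two calls require two instantiations:
    // `describe::<u8>` and `describe::<u64>`.
    println!("{}", describe(1u8));
    println!("{}", describe(1u64));
}
```

Monomorphization collection is what discovers that `describe::<u8>` and `describe::<u64>` are both needed here, so that the later steps have a concrete list of items to lower.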
|
||||||
|
|
||||||
[codegen1]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_crate.html
|
[codegen1]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_crate.html
|
||||||
|
|
||||||
The code for codegen is actually a bit complex due to a few factors:
|
The code for codegen is actually a bit complex due to a few factors:
|
||||||
|
|
||||||
- Support for multiple codegen backends (LLVM and Cranelift). We try to share as much
|
- Support for multiple codegen backends (LLVM, Cranelift, and GCC).
|
||||||
backend code between them as possible, so a lot of it is generic over the
|
We try to share as much backend code between them as possible,
|
||||||
codegen implementation. This means that there are often a lot of layers of
|
so a lot of it is generic over the codegen implementation.
|
||||||
abstraction.
|
This means that there are often a lot of layers of abstraction.
|
||||||
- Codegen happens asynchronously in another thread for performance.
|
- Codegen happens asynchronously in another thread for performance.
|
||||||
- The actual codegen is done by a third-party library (either LLVM or Cranelift).
|
- The actual codegen is done by a third-party library (one of the three backends).
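
To make the first point more concrete, the following is a toy sketch of driver code that is generic over a backend. Everything in it (`ToyBackend`, `LlvmLike`, `CraneliftLike`, `codegen_all`) is hypothetical and invented for illustration; it is not the actual trait hierarchy in `rustc_codegen_ssa`, which is considerably more layered.

```rust
/// Hypothetical backend interface; NOT the real rustc trait.
trait ToyBackend {
    /// Name of the backend, used here only to label the output.
    fn name(&self) -> &'static str;

    /// "Lower" one collected item. A real backend would build IR here;
    /// this sketch just produces a labelled string.
    fn lower(&self, item: &str) -> String {
        format!("{}: {}", self.name(), item)
    }
}

struct LlvmLike;
struct CraneliftLike;

impl ToyBackend for LlvmLike {
    fn name(&self) -> &'static str {
        "llvm-like"
    }
}

impl ToyBackend for CraneliftLike {
    fn name(&self) -> &'static str {
        "cranelift-like"
    }
}

/// Shared driver code, written once and generic over the backend.
fn codegen_all<B: ToyBackend>(backend: &B, items: &[&str]) -> Vec<String> {
    items.iter().map(|item| backend.lower(item)).collect()
}

fn main() {
    let items = ["describe::<u8>", "describe::<u64>"];
    for line in codegen_all(&LlvmLike, &items) {
        println!("{line}");
    }
    for line in codegen_all(&CraneliftLike, &items) {
        println!("{line}");
    }
}
```

The driver `codegen_all` never mentions a specific backend, which is the same trade-off described above: more shared code, at the cost of an extra layer of abstraction.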
|
||||||
|
|
||||||
Generally, the [`rustc_codegen_ssa`][ssa] crate contains backend-agnostic code
|
Generally, the [`rustc_codegen_ssa`][ssa] crate contains backend-agnostic code,
|
||||||
(i.e. independent of LLVM or Cranelift), while the [`rustc_codegen_llvm`][llvm]
|
while the [`rustc_codegen_llvm`][llvm] crate contains code specific to LLVM codegen.
|
||||||
crate contains code specific to LLVM codegen.
|
|
||||||
|
|
||||||
[ssa]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/index.html
|
[ssa]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/index.html
|
||||||
[llvm]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/index.html
|
[llvm]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/index.html
|
||||||
|
|
||||||
At a very high level, the entry point is
|
At a very high level, the entry point is
|
||||||
[`rustc_codegen_ssa::base::codegen_crate`][codegen1]. This function starts the
|
[`rustc_codegen_ssa::base::codegen_crate`][codegen1].
|
||||||
process discussed in the rest of this chapter.
|
This function starts the process discussed in the rest of this chapter.
|
||||||
|
|
||||||
|
|
|
||||||