Apply Centril suggestions
Co-Authored-By: Centril <twingoow@gmail.com>
parent 91ae0ab7e0
commit 1775a945aa
|
|
@ -50,7 +50,7 @@ we'll talk about that later.
|
||||||
doesn't seem to be able to optimize the pattern the [`simplify_try`] mir
|
doesn't seem to be able to optimize the pattern the [`simplify_try`] mir
|
||||||
opt looks for.
|
opt looks for.
|
||||||
- Rust code is _monomorphized_, which means making copies of all the generic
|
- Rust code is _monomorphized_, which means making copies of all the generic
|
||||||
code with the type parameters replaced by concrete types. In order to do
|
code with the type parameters replaced by concrete types. To do
|
||||||
this, we need to collect a list of what concrete types to generate code for.
|
this, we need to collect a list of what concrete types to generate code for.
|
||||||
This is called _monomorphization collection_.
|
This is called _monomorphization collection_.
|
||||||
- We then begin what is vaguely called _code generation_ or _codegen_.
|
- We then begin what is vaguely called _code generation_ or _codegen_.
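
To make the monomorphization bullet above concrete, here is a minimal sketch. It is not taken from the guide or from the compiler: `describe`, `describe_i32`, and `describe_str` are hypothetical names, and the two hand-written functions only stand in for the copies the compiler would conceptually generate.

```rust
use std::fmt::Display;

// One generic definition in the source code.
fn describe<T: Display>(value: T) -> String {
    format!("value = {}", value)
}

// Hand-written stand-ins for the copies monomorphization conceptually
// produces: the same body with `T` replaced by a concrete type.
// (Hypothetical names; real instantiations get compiler-chosen symbols.)
fn describe_i32(value: i32) -> String {
    format!("value = {}", value)
}

fn describe_str(value: &str) -> String {
    format!("value = {}", value)
}

fn main() {
    // These two call sites are the kind of use that collection would
    // record: `describe::<i32>` and `describe::<&str>`.
    assert_eq!(describe(42), describe_i32(42));
    assert_eq!(describe("hello"), describe_str("hello"));
    println!("{}", describe(42));
    println!("{}", describe("hello"));
}
```

The point of the sketch is only that the generic function is used at two concrete types, so two copies need code generated for them; a type that is never used with `describe` needs no copy.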
|
||||||
|
|
@ -105,7 +105,7 @@ satisfy/optimize for. For example,
|
||||||
- Compiler compilation speed: how long does it take to compile the compiler?
|
- Compiler compilation speed: how long does it take to compile the compiler?
|
||||||
This impacts contributors and compiler maintenance.
|
This impacts contributors and compiler maintenance.
|
||||||
- Compiler implementation complexity: building a compiler is one of the hardest
|
- Compiler implementation complexity: building a compiler is one of the hardest
|
||||||
things a person/group can do, and rust is not a very simple language, so how
|
things a person/group can do, and Rust is not a very simple language, so how
|
||||||
do we make the compiler's code base manageable?
|
do we make the compiler's code base manageable?
|
||||||
- Compiler correctness: the binaries produced by the compiler should do what
|
- Compiler correctness: the binaries produced by the compiler should do what
|
||||||
the input programs say they do, and should continue to do so despite the
|
the input programs say they do, and should continue to do so despite the
|
||||||
|
|
@ -119,14 +119,13 @@ satisfy/optimize for. For example,
|
||||||
always going on to its implementation.
|
always going on to its implementation.
|
||||||
- Limitations of other tools: rustc uses LLVM in its backend, and LLVM has some
|
- Limitations of other tools: rustc uses LLVM in its backend, and LLVM has some
|
||||||
strengths we leverage and some limitations/weaknesses we need to work around.
|
strengths we leverage and some limitations/weaknesses we need to work around.
|
||||||
- And others that I'm probably forgetting.
|
|
||||||
|
|
||||||
So, as you read through the rest of the guide, keep these things in mind. They
|
So, as you read through the rest of the guide, keep these things in mind. They
|
||||||
will often inform decisions that we make.
|
will often inform decisions that we make.
|
||||||
|
|
||||||
### Constant change
|
### Constant change
|
||||||
|
|
||||||
One thing to keep in mind is that `rustc` is a real production-quality product.
|
Keep in mind that `rustc` is a real production-quality product.
|
||||||
As such, it has its fair share of codebase churn and technical debt. A lot of
|
As such, it has its fair share of codebase churn and technical debt. A lot of
|
||||||
the designs discussed throughout this guide are idealized designs that are not
|
the designs discussed throughout this guide are idealized designs that are not
|
||||||
fully realized yet. And things keep changing so that it is hard to keep this
|
fully realized yet. And things keep changing so that it is hard to keep this
|
||||||
|
|
@ -139,19 +138,19 @@ to keep up with the requirements above.
|
||||||
|
|
||||||
As with most compilers, `rustc` uses some intermediate representations (IRs) to
|
As with most compilers, `rustc` uses some intermediate representations (IRs) to
|
||||||
facilitate computations. In general, working directly with the source code is
|
facilitate computations. In general, working directly with the source code is
|
||||||
extremely inconvenient. Source code is designed to be human-friendly while at
|
extremely inconvenient and error-prone. Source code is designed to be human-friendly while at
|
||||||
the same time being unambiguous, but it's less convenient for doing something
|
the same time being unambiguous, but it's less convenient for doing something
|
||||||
like, say, type checking.
|
like, say, type checking.
|
||||||
|
|
||||||
Instead most compilers, including `rustc`, build some sort of IR out of the
|
Instead most compilers, including `rustc`, build some sort of IR out of the
|
||||||
source code which is easier to analyze. `rustc` has a few IRs, each optimized
|
source code which is easier to analyze. `rustc` has a few IRs, each optimized
|
||||||
for different things:
|
for different purposes:
|
||||||
|
|
||||||
- Abstract Syntax Tree (AST): the abstract syntax tree is built from the stream
|
- Abstract Syntax Tree (AST): the abstract syntax tree is built from the stream
|
||||||
of tokens produced by the lexer directly from the source code. It represents
|
of tokens produced by the lexer directly from the source code. It represents
|
||||||
pretty much exactly what the user wrote. It helps to do some syntactic sanity
|
pretty much exactly what the user wrote. It helps to do some syntactic sanity
|
||||||
checking (e.g. checking that a type is expected where the user wrote one).
|
checking (e.g. checking that a type is expected where the user wrote one).
|
||||||
- High-level IR (HIR): This is a sort of very desugared AST. It's still close
|
- High-level IR (HIR): This is a sort of desugared AST. It's still close
|
||||||
to what the user wrote syntactically, but it includes some implicit things
|
to what the user wrote syntactically, but it includes some implicit things
|
||||||
such as some elided lifetimes, etc. This IR is amenable to type checking.
|
such as some elided lifetimes, etc. This IR is amenable to type checking.
|
||||||
- HAIR: This is an intermediate between HIR and MIR. This only exists to make
|
- HAIR: This is an intermediate between HIR and MIR. This only exists to make
|
||||||
|
|
@ -166,7 +165,7 @@ for different things:
|
||||||
MIRI). Because MIR is still generic, we can do a lot of analyses here more
|
MIRI). Because MIR is still generic, we can do a lot of analyses here more
|
||||||
efficiently than after monomorphization.
|
efficiently than after monomorphization.
|
||||||
- LLVM IR: This is the standard form of all input to the LLVM compiler. LLVM IR
|
- LLVM IR: This is the standard form of all input to the LLVM compiler. LLVM IR
|
||||||
is basically a sort of typed assembly language with lots of annotations. It's
|
is a sort of typed assembly language with lots of annotations. It's
|
||||||
a standard format that is used by all compilers that use LLVM (e.g. the clang
|
a standard format that is used by all compilers that use LLVM (e.g. the clang
|
||||||
C compiler also outputs LLVM IR). LLVM IR is designed to be easy for other
|
C compiler also outputs LLVM IR). LLVM IR is designed to be easy for other
|
||||||
compilers to emit and also rich enough for LLVM to run a bunch of
|
compilers to emit and also rich enough for LLVM to run a bunch of
|
||||||
|
|
@ -181,9 +180,9 @@ compiler does this to make incremental compilation possible -- that is, if the
|
||||||
user makes a change to their program and recompiles, we want to do as little
|
user makes a change to their program and recompiles, we want to do as little
|
||||||
redundant work as possible to produce the new binary.
|
redundant work as possible to produce the new binary.
|
||||||
|
|
||||||
In rustc, all the major steps above are organized as a bunch of queries that
|
In `rustc`, all the major steps above are organized as a bunch of queries that
|
||||||
call each other. For example, there is a query to ask for the type of something
|
call each other. For example, there is a query to ask for the type of something
|
||||||
and another to ask for the optimized MIR of a function, and so on. These
|
and another to ask for the optimized MIR of a function. These
|
||||||
queries can call each other and are all tracked through the query system, and
|
queries can call each other and are all tracked through the query system, and
|
||||||
the results of the queries are cached on disk so that we can tell which
|
the results of the queries are cached on disk so that we can tell which
|
||||||
queries' results changed from the last compilation and only redo those. This is
|
queries' results changed from the last compilation and only redo those. This is
|
||||||
|
|
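
The idea of queries that call each other and cache their results can be sketched roughly as follows. This is a toy under stated assumptions, not rustc's query system: `ToyCtxt`, `type_of`, and `describe` are hypothetical, and the cache lives in memory here, whereas the text above describes results being cached on disk.

```rust
use std::collections::HashMap;

/// Hypothetical toy context; it only loosely mirrors the role of a
/// central compiler context and is not rustc's real API.
struct ToyCtxt {
    /// Input "program": item name -> literal text.
    literals: HashMap<String, String>,
    /// Cached results of the `type_of` query.
    type_cache: HashMap<String, String>,
    /// How many times `type_of` actually had to compute something.
    executions: u32,
}

impl ToyCtxt {
    fn new() -> Self {
        ToyCtxt {
            literals: HashMap::new(),
            type_cache: HashMap::new(),
            executions: 0,
        }
    }

    fn set_literal(&mut self, name: &str, text: &str) {
        self.literals.insert(name.to_string(), text.to_string());
    }

    /// Query: the "type" of an item. Computed once, then served from cache.
    fn type_of(&mut self, name: &str) -> String {
        if let Some(cached) = self.type_cache.get(name) {
            return cached.clone();
        }
        self.executions += 1;
        let literal = self.literals.get(name).cloned().unwrap_or_default();
        let ty = if literal.parse::<i64>().is_ok() { "i64" } else { "&str" };
        self.type_cache.insert(name.to_string(), ty.to_string());
        ty.to_string()
    }

    /// Query that calls another query.
    fn describe(&mut self, name: &str) -> String {
        let ty = self.type_of(name);
        format!("{}: {}", name, ty)
    }
}

fn main() {
    let mut cx = ToyCtxt::new();
    cx.set_literal("x", "42");
    println!("{}", cx.describe("x"));
    // The second request for the same type is answered from the cache.
    println!("{}", cx.type_of("x"));
    println!("type_of computed {} time(s)", cx.executions);
}
```

The sketch leaves out invalidation entirely: deciding which cached results are still valid after the input changes is the part that makes incremental compilation work, and it is not modeled here.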
@ -209,7 +208,7 @@ to remain to ensure that unreachable functions still have their errors emitted.
|
||||||
|
|
||||||
Moreover, the compiler wasn't originally built to use a query system; the query
|
Moreover, the compiler wasn't originally built to use a query system; the query
|
||||||
system has been retrofitted into the compiler, so parts of it are not
|
system has been retrofitted into the compiler, so parts of it are not
|
||||||
query-fied yet. Also, LLVM isn't our code, so obviously that isn't querified
|
query-fied yet. Also, LLVM isn't our code, so that isn't querified
|
||||||
either. The plan is to eventually query-fy all of the steps listed in the
|
either. The plan is to eventually query-fy all of the steps listed in the
|
||||||
previous section, but as of this writing, only the steps between HIR and
|
previous section, but as of this writing, only the steps between HIR and
|
||||||
LLVM-IR are query-fied. That is, lexing and parsing are done all at once for
|
LLVM-IR are query-fied. That is, lexing and parsing are done all at once for
|
||||||
|
|
@ -239,8 +238,8 @@ Oh, and also the `rustc::ty` module defines the `TyCtxt` struct we mentioned bef
|
||||||
|
|
||||||
### Parallelism
|
### Parallelism
|
||||||
|
|
||||||
Compiler performance is a problem that we would very much like to improve on
|
Compiler performance is a problem that we would like to improve on
|
||||||
(and are always working on). One aspect of that is attempting to parallelize
|
(and are always working on). One aspect of that is parallelizing
|
||||||
`rustc` itself.
|
`rustc` itself.
|
||||||
|
|
||||||
Currently, there is only one part of rustc that is already parallel: codegen.
|
Currently, there is only one part of rustc that is already parallel: codegen.
|
||||||
|
|
|
||||||