Improve MIR sections in appendix

Add a new graphical representation of one of the examples of MIR and
basic blocks!

And make other minor improvements.
Camelid 2020-09-14 16:38:26 -07:00 committed by Joshua Nelson
parent 7bda848e3c
commit 231e3d9c83
1 changed file with 54 additions and 27 deletions


@@ -8,10 +8,10 @@ Rust-specific context.
## What is a control-flow graph? ## What is a control-flow graph?
A control-flow graph is a common term from compilers. If you've ever A control-flow graph (CFG) is a common term from compilers. If you've ever
used a flow-chart, then the concept of a control-flow graph will be used a flow-chart, then the concept of a control-flow graph will be
pretty familiar to you. It's a representation of your program that pretty familiar to you. It's a representation of your program that
exposes the underlying control flow in a very clear way. clearly exposes the underlying control flow.
A control-flow graph is structured as a set of **basic blocks** A control-flow graph is structured as a set of **basic blocks**
connected by edges. The key idea of a basic block is that it is a set connected by edges. The key idea of a basic block is that it is a set
@@ -44,12 +44,17 @@ if some_variable {
d = 1; d = 1;
``` ```
This would compile into four basic blocks: This would compile into four basic blocks in MIR. In textual form, it looks like
this:
```mir ```mir
BB0: { BB0: {
a = 1; a = 1;
if some_variable { goto BB1 } else { goto BB2 } if some_variable {
goto BB1;
} else {
goto BB2;
}
} }
BB1: { BB1: {
@@ -64,10 +69,33 @@ BB2: {
BB3: { BB3: {
d = 1; d = 1;
...; ...
} }
``` ```
In graphical form, it looks like this:
```
                BB0
       +--------------------+
       | a = 1;             |
       +--------------------+
             /       \
  if some_variable   else
           /           \
     BB1  /             \  BB2
    +-----------+   +-----------+
    | b = 1;    |   | c = 1;    |
    +-----------+   +-----------+
            \          /
             \        /
              \ BB3  /
            +----------+
            | d = 1;   |
            | ...      |
            +----------+
```
When using a control-flow graph, a loop simply appears as a cycle in When using a control-flow graph, a loop simply appears as a cycle in
the graph, and the `break` keyword translates into a path out of that the graph, and the `break` keyword translates into a path out of that
cycle. cycle.
@@ -82,10 +110,10 @@ and Michael I. Schwartzbach is an incredible resource!
_Dataflow analysis_ is a type of static analysis that is common in many _Dataflow analysis_ is a type of static analysis that is common in many
compilers. It describes a general technique, rather than a particular analysis. compilers. It describes a general technique, rather than a particular analysis.
The basic idea is that we can walk over a [CFG](#cfg) and keep track of what The basic idea is that we can walk over a [control-flow graph (CFG)](#cfg) and
some value could be. At the end of the walk, we might have shown that some keep track of what some value could be. At the end of the walk, we might have
claim is true or not necessarily true (e.g. "this variable must be shown that some claim is true or not necessarily true (e.g. "this variable must
initialized"). `rustc` tends to do dataflow analyses over the MIR, since that be initialized"). `rustc` tends to do dataflow analyses over the MIR, since MIR
is already a CFG. is already a CFG.
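
To make the "walk over a CFG and track a value" idea concrete, here is a
minimal sketch of a forward "definitely initialized" analysis. It is an
illustration under simplifying assumptions, not `rustc`'s dataflow framework:
the `Block` type and `definitely_init_on_entry` function are hypothetical, and
each block only records which variables it assigns. A variable counts as
initialized at a join point only if it was assigned on every incoming path, so
the join operation is set intersection.

```rust
use std::collections::BTreeSet;

/// A hypothetical basic block for this sketch: the variables it assigns
/// and the indices of its successor blocks.
struct Block {
    assigns: Vec<&'static str>,
    successors: Vec<usize>,
}

/// Computes, for each block, the set of variables that are definitely
/// initialized on entry. Block 0 is assumed to be the entry block.
fn definitely_init_on_entry(cfg: &[Block]) -> Vec<BTreeSet<&'static str>> {
    // Start optimistic: assume everything is initialized everywhere,
    // except at the entry block, where nothing is.
    let all: BTreeSet<&'static str> =
        cfg.iter().flat_map(|b| b.assigns.iter().copied()).collect();
    let mut entry: Vec<BTreeSet<&'static str>> = vec![all; cfg.len()];
    if let Some(first) = entry.first_mut() {
        first.clear();
    }

    // Propagate along edges until nothing changes (a fixed point).
    let mut changed = true;
    while changed {
        changed = false;
        for (i, block) in cfg.iter().enumerate() {
            // State on exit = state on entry plus this block's assignments.
            let mut exit = entry[i].clone();
            exit.extend(block.assigns.iter().copied());

            for &succ in &block.successors {
                // Join by intersection: keep only what every path provides.
                let joined: BTreeSet<&'static str> =
                    entry[succ].intersection(&exit).copied().collect();
                if joined != entry[succ] {
                    entry[succ] = joined;
                    changed = true;
                }
            }
        }
    }
    entry
}

fn main() {
    // BB0 branches to BB1 and BB2; only BB1 assigns `x`; both reach BB3.
    let cfg = vec![
        Block { assigns: vec!["a"], successors: vec![1, 2] },
        Block { assigns: vec!["b", "x"], successors: vec![3] },
        Block { assigns: vec!["c"], successors: vec![3] },
        Block { assigns: vec!["d"], successors: vec![] },
    ];
    let entry = definitely_init_on_entry(&cfg);
    println!("definitely initialized on entry to BB3: {:?}", entry[3]);
}
```

Because `x` is assigned on only one of the two paths into the last block, the
sketch does not report it as definitely initialized there, which is the "not
necessarily true" outcome described above.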
For example, suppose we want to check that `x` is initialized before it is used For example, suppose we want to check that `x` is initialized before it is used
@@ -207,17 +235,17 @@ such that the function is well-typed: `∃ T: (T: Debug) and well_typed(foo)`.
<a name="variance"></a> <a name="variance"></a>
## What is a DeBruijn Index? ## What is a de Bruijn Index?
DeBruijn indices are a way of representing which variables are bound in [De Bruijn indices][wikideb] are a way of representing using only integers which
which binders using only integers. They were [originally invented][wikideb] for variables are bound in which binders. They were originally invented for use in
use in lambda calculus evaluation. In `rustc`, we use a similar idea for the lambda calculus evaluation (see [this Wikipedia article][wikideb] for more). In
[representation of generic types][sub]. `rustc`, we use a similar idea for the [representation of generic types][sub].
[wikideb]: https://en.wikipedia.org/wiki/De_Bruijn_index [wikideb]: https://en.wikipedia.org/wiki/De_Bruijn_index
[sub]: ../generics.md [sub]: ../generics.md
Here is a basic example of how DeBruijn indices might be used for closures (we Here is a basic example of how de Bruijn indices might be used for closures (we
don't actually do this in `rustc` though): don't actually do this in `rustc` though):
```rust,ignore ```rust,ignore
@@ -231,7 +259,7 @@ don't actually do this in `rustc` though):
} }
``` ```
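
For a self-contained illustration in the original lambda-calculus setting, the
sketch below converts terms with named variables into terms with de Bruijn
indices. The types `Named` and `DeBruijn` and the function `to_de_bruijn` are
hypothetical and are not taken from `rustc`. The sketch also assumes that
index 0 refers to the innermost enclosing binder; some presentations start
counting at 1 instead.

```rust
/// Lambda terms written with named variables.
enum Named {
    Var(&'static str),
    Lam(&'static str, Box<Named>),
    App(Box<Named>, Box<Named>),
}

/// The same terms with names replaced by de Bruijn indices. A binder no
/// longer needs a name: `Var(n)` means "the variable bound `n` binders out".
#[derive(Debug, PartialEq)]
enum DeBruijn {
    Var(usize),
    Lam(Box<DeBruijn>),
    App(Box<DeBruijn>, Box<DeBruijn>),
}

/// Converts a named term, given the stack of binders currently in scope
/// (outermost first). Returns `None` if a variable appears free.
fn to_de_bruijn(term: &Named, binders: &mut Vec<&'static str>) -> Option<DeBruijn> {
    match term {
        Named::Var(name) => {
            // Search from the innermost binder outward, so that shadowing
            // resolves to the nearest binder.
            let index = binders.iter().rev().position(|b| b == name)?;
            Some(DeBruijn::Var(index))
        }
        Named::Lam(name, body) => {
            binders.push(*name);
            let body = to_de_bruijn(body, binders);
            binders.pop();
            Some(DeBruijn::Lam(Box::new(body?)))
        }
        Named::App(func, arg) => Some(DeBruijn::App(
            Box::new(to_de_bruijn(func, binders)?),
            Box::new(to_de_bruijn(arg, binders)?),
        )),
    }
}

fn main() {
    // The term `\x. \y. x y`.
    let body = Named::App(Box::new(Named::Var("x")), Box::new(Named::Var("y")));
    let term = Named::Lam("x", Box::new(Named::Lam("y", Box::new(body))));
    println!("{:?}", to_de_bruijn(&term, &mut Vec::new()));
}
```

In the example term, `x` is bound one binder further out than `y`, so under
the zero-based assumption `x` becomes index 1 and `y` becomes index 0.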
## What is co- and contra-variance? ## What are co- and contra-variance?
Check out the subtyping chapter from the Check out the subtyping chapter from the
[Rust Nomicon](https://doc.rust-lang.org/nomicon/subtyping.html). [Rust Nomicon](https://doc.rust-lang.org/nomicon/subtyping.html).
@@ -246,17 +274,16 @@ the type checker handles variance.
Let's describe the concepts of free vs bound in terms of program Let's describe the concepts of free vs bound in terms of program
variables, since that's the thing we're most familiar with. variables, since that's the thing we're most familiar with.
- Consider this expression, which creates a closure: `|a, - Consider this expression, which creates a closure: `|a, b| a + b`.
b| a + b`. Here, the `a` and `b` in `a + b` refer to the arguments Here, the `a` and `b` in `a + b` refer to the arguments that the closure will
that the closure will be given when it is called. We say that the be given when it is called. We say that the `a` and `b` there are **bound** to
`a` and `b` there are **bound** to the closure, and that the closure the closure, and that the closure signature `|a, b|` is a **binder** for the
signature `|a, b|` is a **binder** for the names `a` and `b` names `a` and `b` (because any references to `a` or `b` within refer to the
(because any references to `a` or `b` within refer to the variables variables that it introduces).
that it introduces). - Consider this expression: `a + b`. In this expression, `a` and `b` refer to
- Consider this expression: `a + b`. In this expression, `a` and `b` local variables that are defined *outside* of the expression. We say that
refer to local variables that are defined *outside* of the those variables **appear free** in the expression (i.e., they are **free**,
expression. We say that those variables **appear free** in the not **bound** (tied up)).
expression (i.e., they are **free**, not **bound** (tied up)).
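
The two cases in the list can be seen side by side in a small Rust program.
This is only an illustrative sketch; the function names `bound_example` and
`free_example` and the variable `offset` are made up for the example.

```rust
/// Both `a` and `b` in the body `a + b` are bound by the binder `|a, b|`.
fn bound_example() -> i32 {
    let add = |a: i32, b: i32| a + b;
    add(1, 2)
}

/// `a` is bound by the binder `|a|`, but `offset` appears free in the
/// closure body: it refers to a local variable defined outside the closure.
fn free_example() -> i32 {
    let offset = 10;
    let shift = |a: i32| a + offset;
    shift(1)
}

fn main() {
    println!("bound: {}", bound_example());
    println!("free: {}", free_example());
}
```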
So there you have it: a variable "appears free" in some So there you have it: a variable "appears free" in some
expression/statement/whatever if it refers to something defined expression/statement/whatever if it refers to something defined