add content

Niko Matsakis 2018-08-31 09:57:35 -04:00
parent 363ae64a2f
commit b92a92507f
6 changed files with 217 additions and 23 deletions


@ -55,6 +55,9 @@
- [MIR passes: getting the MIR for a function](./mir/passes.md) - [MIR passes: getting the MIR for a function](./mir/passes.md)
- [MIR optimizations](./mir/optimizations.md) - [MIR optimizations](./mir/optimizations.md)
- [The borrow checker](./borrow_check.md) - [The borrow checker](./borrow_check.md)
- [Tracking moves and initialization](./borrow_check/moves_and_initialization.md)
- [Move paths](./borrow_check/moves_and_initialization/move_paths.md)
- [MIR type checker](./borrow_check/type_check.md)
- [Region inference](./borrow_check/region_inference.md) - [Region inference](./borrow_check/region_inference.md)
- [Constant evaluation](./const-eval.md) - [Constant evaluation](./const-eval.md)
- [miri const evaluator](./miri.md) - [miri const evaluator](./miri.md)


@ -14,7 +14,10 @@ enforcing a number of properties:
At the time of this writing, the code is in a state of transition. The At the time of this writing, the code is in a state of transition. The
"main" borrow checker still works by processing [the HIR](hir.html), "main" borrow checker still works by processing [the HIR](hir.html),
but that is being phased out in favor of the MIR-based borrow checker. but that is being phased out in favor of the MIR-based borrow checker.
Doing borrow checking on MIR has two key advantages: Accordingly, this documentation focuses on the new, MIR-based borrow
checker.
Doing borrow checking on MIR has several advantages:
- The MIR is *far* less complex than the HIR; the radical desugaring - The MIR is *far* less complex than the HIR; the radical desugaring
helps prevent bugs in the borrow checker. (If you're curious, you helps prevent bugs in the borrow checker. (If you're curious, you
@ -30,30 +33,31 @@ Doing borrow checking on MIR has two key advantages:
The borrow checker source is found in The borrow checker source is found in
[the `rustc_mir::borrow_check` module][b_c]. The main entry point is [the `rustc_mir::borrow_check` module][b_c]. The main entry point is
the `mir_borrowck` query. At the time of this writing, MIR borrowck can operate the [`mir_borrowck`] query.
in several modes, but this text will describe only the mode when NLL is enabled
(what you get with `#![feature(nll)]`).
[b_c]: https://github.com/rust-lang/rust/tree/master/src/librustc_mir/borrow_check [b_c]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/index.html
[`mir_borrowck`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/fn.mir_borrowck.html
The overall flow of the borrow checker is as follows:
- We first create a **local copy** C of the MIR. In the coming steps, - We first create a **local copy** C of the MIR. In the coming steps,
we will modify this copy in place to modify the types and things to we will modify this copy in place to modify the types and things to
include references to the new regions that we are computing. include references to the new regions that we are computing.
- We then invoke `nll::replace_regions_in_mir` to modify this copy C. - We then invoke [`replace_regions_in_mir`] to modify this copy C.
Among other things, this function will replace all of the regions in Among other things, this function will replace all of the [regions](./appendix/glossary.html) in
the MIR with fresh [inference variables](./appendix/glossary.html). the MIR with fresh [inference variables](./appendix/glossary.html).
- (More details can be found in [the regionck section](./mir/regionck.html).) - Next, we perform a number of
- Next, we perform a number of [dataflow [dataflow analyses](./appendix/background.html#dataflow) that
analyses](./appendix/background.html#dataflow) compute what data is moved and when.
that compute what data is moved and when. The results of these analyses - We then do a [second type check](borrow_check/type_check.html) across the MIR:
are needed to do both borrow checking and region inference. the purpose of this type check is to determine all of the constraints between
- Using the move data, we can then compute the values of all the regions in the different regions.
MIR. - Next, we do [region inference](borrow_check/region_inference.html), which computes
- (More details can be found in [the NLL section](./mir/regionck.html).) the values of each region -- basically, points in the control-flow graph.
- Finally, the borrow checker itself runs, taking as input (a) the - At this point, we can compute the "borrows in scope" at each point.
results of move analysis and (b) the regions computed by the region - Finally, we do a second walk over the MIR, looking at the actions it
checker. This allows us to figure out which loans are still in scope does and reporting errors. For example, if we see a statement like
at any particular point. `*a + 1`, then we would check that the variable `a` is initialized
and that it is not mutably borrowed, as either of those would
require an error to be reported.
- Doing this check requires the results of all the previous analyses.
[`replace_regions_in_mir`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/fn.replace_regions_in_mir.html


@ -0,0 +1,50 @@
# Tracking moves and initialization
Part of the borrow checker's job is to track which variables are
"initialized" at any given point in time -- this also requires
figuring out where moves occur and tracking those.
## Initialization and moves
From a user's perspective, initialization -- giving a variable some
value -- and moves -- transferring ownership to another place -- might
seem like distinct topics. Indeed, our borrow checker error messages
often talk about them differently. But **within the borrow checker**,
they are not nearly as separate. Roughly speaking, the borrow checker
tracks the set of "initialized places" at any point in time. Assigning
to a previously uninitialized local variable adds it to that set;
moving from a local variable removes it from that set.
Consider this example:
```rust
fn foo() {
    let a: Vec<u32>;
    // a is not initialized yet
    a = vec![22];
    // a is initialized here
    std::mem::drop(a); // a is moved here
    // a is no longer initialized here
    let l = a.len(); //~ ERROR
}
```
Here you can see that `a` starts off as uninitialized; once it is
assigned, it becomes initialized. But when `drop(a)` is called, it
becomes uninitialized again.
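The set-based view described above can be sketched as a toy model. This is not how rustc implements it: the real analysis is a dataflow computation over the MIR control-flow graph, whereas the sketch below only handles straight-line code. The `Action` enum and the `check` and `foo_errors` functions are hypothetical names invented for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified events in a straight-line function body.
#[derive(Debug, Clone, Copy)]
enum Action<'a> {
    /// `x = ...`: adds `x` to the initialized set.
    Assign(&'a str),
    /// `f(x)` taking `x` by value: removes `x` from the set.
    Move(&'a str),
    /// Any read of `x`, such as `x.len()`.
    Use(&'a str),
}

/// Walks the actions in order and returns every place that was
/// moved or used while it was not in the initialized set.
fn check<'a>(actions: &[Action<'a>]) -> Vec<&'a str> {
    let mut initialized: HashSet<&'a str> = HashSet::new();
    let mut errors = Vec::new();
    for action in actions {
        match *action {
            Action::Assign(place) => {
                initialized.insert(place);
            }
            Action::Move(place) => {
                // `remove` returns false if the place was not initialized.
                if !initialized.remove(place) {
                    errors.push(place);
                }
            }
            Action::Use(place) => {
                if !initialized.contains(place) {
                    errors.push(place);
                }
            }
        }
    }
    errors
}

/// Mirrors `foo` above: assign `a`, move it into `drop`, then call `a.len()`.
fn foo_errors() -> Vec<&'static str> {
    check(&[Action::Assign("a"), Action::Move("a"), Action::Use("a")])
}
```

In this model, `foo_errors()` reports `a` once, for the `a.len()` call that follows the move.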
## Subsections
To make it easier to peruse, this section is broken into a number of
subsections:
- [Move paths](./moves_and_initialization/move_paths.html): the
  *move path* concept that we use to track which local variables (or parts of
  local variables, in some cases) are initialized.
- *Rest not yet written* =)


@ -0,0 +1,126 @@
# Move paths
In reality, it's not enough to track initialization at the granularity
of local variables. Sometimes we need to track, e.g., individual fields:
```rust
fn foo() {
    let a: (Vec<u32>, Vec<u32>) = (vec![22], vec![44]);
    // a.0 and a.1 are both initialized
    let b = a.0; // moves a.0
    // a.0 is not initialized, but a.1 still is
    let c = a.0; // ERROR
    let d = a.1; // OK
}
```
To handle this, we track initialization at the granularity of a **move
path**. A [`MovePath`] represents some location that the user can
initialize, move, etc. So e.g. there is a move-path representing the
local variable `a`, and there is a move-path representing `a.0`. Move
paths roughly correspond to the concept of a [`Place`] from MIR, but
they are indexed in ways that enable us to do move analysis more
efficiently.
[`MovePath`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/move_paths/struct.MovePath.html
[`Place`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc/mir/enum.Place.html
## Move path indices
Although there is a [`MovePath`] data structure, move paths are never
referenced directly. Instead, all the code passes around *indices* of
type
[`MovePathIndex`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/move_paths/indexes/struct.MovePathIndex.html). If
you need to get information about a move path, you use this index with
the [`move_paths` field of the `MoveData`][move_paths]. For example,
to convert a [`MovePathIndex`] `mpi` into a MIR [`Place`], you might
access the [`MovePath::place`] field like so:
```rust
move_data.move_paths[mpi].place
```
[move_paths]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/move_paths/struct.MoveData.html#structfield.move_paths
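[`MovePath::place`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/move_paths/struct.MovePath.html#structfield.place
[`MovePathIndex`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/move_paths/indexes/struct.MovePathIndex.html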
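To make the index-based access pattern concrete, here is a minimal standalone sketch. It assumes simplified stand-ins for the real types: `place` is a plain `String` rather than a MIR `Place`, `MovePaths` is a hypothetical wrapper around `Vec` (the compiler uses its own index-vector type), and `push` and `place_of` are helpers invented for this example.

```rust
use std::ops::Index;

/// Stand-in for the real `MovePathIndex`: a typed wrapper around a position.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct MovePathIndex(usize);

/// Simplified move path; the real `place` field holds a MIR `Place`.
#[derive(Debug)]
struct MovePath {
    place: String,
}

/// Hypothetical vector wrapper that can only be indexed by `MovePathIndex`.
#[derive(Debug, Default)]
struct MovePaths(Vec<MovePath>);

impl Index<MovePathIndex> for MovePaths {
    type Output = MovePath;
    fn index(&self, mpi: MovePathIndex) -> &MovePath {
        &self.0[mpi.0]
    }
}

#[derive(Debug, Default)]
struct MoveData {
    move_paths: MovePaths,
}

impl MoveData {
    /// Appends a move path and returns the index that now refers to it.
    fn push(&mut self, place: &str) -> MovePathIndex {
        let mpi = MovePathIndex(self.move_paths.0.len());
        self.move_paths.0.push(MovePath { place: place.to_string() });
        mpi
    }
}

/// Same shape as the expression in the text: index first, then read the field.
fn place_of(move_data: &MoveData, mpi: MovePathIndex) -> &str {
    &move_data.move_paths[mpi].place
}
```

Passing small `Copy` indices around instead of references avoids borrowing the whole table while other code is still using it, which is one common reason for this design.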
## Building move paths
One of the first things we do in the MIR borrow check is to construct
the set of move paths. This is done as part of the
[`MoveData::gather_moves`] function. This function uses a MIR visitor
called [`Gatherer`] to walk the MIR and look at how each [`Place`]
within is accessed. For each such [`Place`], it constructs a
corresponding [`MovePathIndex`]. It also records when/where that
particular move path is moved/initialized, but we'll get to that in a
later section.
[`Gatherer`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/move_paths/builder/struct.Gatherer.html
[`MoveData::gather_moves`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/move_paths/struct.MoveData.html#method.gather_moves
### Illegal move paths
We don't actually create move-paths for **every** [`Place`] that gets used.
In particular, if it is illegal to move from a [`Place`], then there
is no need for a [`MovePathIndex`]. Some examples:
- You cannot move from a static variable, so we do not create a [`MovePathIndex`]
for static variables.
- You cannot move an individual element of an array, so if we have e.g. `foo: [String; 3]`,
there would be no move-path for `foo[1]`.
- You cannot move from inside of a borrowed reference, so if we have e.g. `foo: &String`,
there would be no move-path for `*foo`.
These rules are enforced by the [`move_path_for`] function, which
converts a [`Place`] into a [`MovePathIndex`] -- in error cases like
those just discussed, the function returns an `Err`. This in turn
means we don't have to bother tracking whether those places are
initialized (which lowers overhead).
[`move_path_for`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/move_paths/builder/struct.Gatherer.html#method.move_path_for
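The three rules listed above can be sketched as a small recursive check. Everything here is a simplified assumption: the toy `Place` enum, the `IllegalMove` error type, and the `check_movable` function are hypothetical and are not the compiler's definitions. In particular, the `RefDeref` variant only models a dereference of a borrowed reference.

```rust
/// Toy model of a place expression (not the real MIR `Place`).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Place {
    /// A local variable such as `foo`.
    Local(String),
    /// A static variable.
    Static(String),
    /// A field projection such as `base.0`.
    Field(Box<Place>, usize),
    /// An array element such as `base[1]`.
    Index(Box<Place>, usize),
    /// `*base`, where `base` is a borrowed reference.
    RefDeref(Box<Place>),
}

/// Hypothetical reasons why no move path is created for a place.
#[derive(Debug, PartialEq, Eq)]
enum IllegalMove {
    Static,
    ArrayElement,
    BorrowedContent,
}

/// Returns `Ok(())` if a move path would be created for `place`,
/// or the reason it would be rejected.
fn check_movable(place: &Place) -> Result<(), IllegalMove> {
    match place {
        Place::Local(_) => Ok(()),
        Place::Static(_) => Err(IllegalMove::Static),
        // A field is movable only if everything it is projected from is.
        Place::Field(base, _) => check_movable(base),
        Place::Index(_, _) => Err(IllegalMove::ArrayElement),
        Place::RefDeref(_) => Err(IllegalMove::BorrowedContent),
    }
}
```

With `foo` as a local, `check_movable` accepts `foo` and `foo.0` but rejects `foo[1]` and `*foo`, matching the examples in the list.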
## Looking up a move-path
If you have a [`Place`] and you would like to convert it to a [`MovePathIndex`], you
can do that using the [`MovePathLookup`] structure found in the [`rev_lookup`] field
of [`MoveData`]. There are two different methods:
[`MovePathLookup`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/move_paths/struct.MovePathLookup.html
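[`rev_lookup`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/move_paths/struct.MoveData.html#structfield.rev_lookup
[`MoveData`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/move_paths/struct.MoveData.html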
- [`find_local`], which takes a [`mir::Local`] representing a local
variable. This is the easier method, because we **always** create a
[`MovePathIndex`] for every local variable.
- [`find`], which takes an arbitrary [`Place`]. This method is a bit
more annoying to use, precisely because we don't have a
[`MovePathIndex`] for **every** [`Place`] (as we just discussed in
the "illegal move paths" section). Therefore, [`find`] returns a
[`LookupResult`] indicating the closest path it was able to find
that exists (e.g., for `foo[1]`, it might return just the path for
`foo`).
[`find`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/move_paths/struct.MovePathLookup.html#method.find
[`find_local`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/move_paths/struct.MovePathLookup.html#method.find_local
[`mir::Local`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc/mir/struct.Local.html
[`LookupResult`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/move_paths/enum.LookupResult.html
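The "closest path" behavior of [`find`] can be illustrated with a toy lookup table. This sketch assumes that a place is written as a slice of string segments (for example `["foo", "[1]"]`) and that the table is a `HashMap`. The `insert` helper and the variant names of this local `LookupResult` are assumptions made for the example and should be checked against the real type.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct MovePathIndex(usize);

/// Simplified result of a lookup (variant names are assumptions).
#[derive(Debug, PartialEq, Eq)]
enum LookupResult {
    /// The place itself has a move path.
    Exact(MovePathIndex),
    /// Only some prefix of the place has a move path, if any.
    Parent(Option<MovePathIndex>),
}

/// Toy reverse lookup from place segments to move path indices.
#[derive(Debug, Default)]
struct MovePathLookup {
    paths: HashMap<Vec<String>, MovePathIndex>,
}

impl MovePathLookup {
    /// Registers a place and hands out the next free index.
    fn insert(&mut self, place: &[&str]) -> MovePathIndex {
        let mpi = MovePathIndex(self.paths.len());
        let key = place.iter().map(|s| s.to_string()).collect();
        self.paths.insert(key, mpi);
        mpi
    }

    /// Tries the full place first, then ever shorter prefixes.
    fn find(&self, place: &[&str]) -> LookupResult {
        let key: Vec<String> = place.iter().map(|s| s.to_string()).collect();
        for len in (1..=key.len()).rev() {
            if let Some(&mpi) = self.paths.get(&key[..len]) {
                return if len == key.len() {
                    LookupResult::Exact(mpi)
                } else {
                    LookupResult::Parent(Some(mpi))
                };
            }
        }
        LookupResult::Parent(None)
    }
}
```

Callers then decide what a non-exact answer means for them, for instance by treating `foo[1]` as initialized exactly when `foo` is.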
## Cross-references
As we noted above, move-paths are stored in a big vector and
referenced via their [`MovePathIndex`]. However, within this vector,
they are also structured into a tree. So for example if you have the
[`MovePathIndex`] for `a.b.c`, you can go to its parent move-path
`a.b`. You can also iterate over all children paths: so, from `a.b`,
you might iterate to find the path `a.b.c` (here you are iterating
just over the paths that the user **actually referenced**, not all
**possible** paths the user could have referenced). These references are
used for example in the [`has_any_child_of`] function, which checks
whether the dataflow results contain a value for the given move-path
(e.g., `a.b`) or any child of that move-path (e.g., `a.b.c`).
[`Place`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc/mir/enum.Place.html
[`has_any_child_of`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/at_location/struct.FlowAtLocation.html#method.has_any_child_of
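As a rough sketch of how a tree can live inside a flat vector, the toy model below links each move path to its parent, its first child, and its next sibling, and then walks those links the way a check like [`has_any_child_of`] might. The field layout, the `add` and `parent_of` helpers, and the use of a `HashSet` in place of real dataflow results are all assumptions made for illustration.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct MovePathIndex(usize);

/// Tree links stored as indices into the same vector.
#[derive(Debug)]
struct MovePath {
    parent: Option<MovePathIndex>,
    first_child: Option<MovePathIndex>,
    next_sibling: Option<MovePathIndex>,
}

#[derive(Debug, Default)]
struct MoveData {
    move_paths: Vec<MovePath>,
}

impl MoveData {
    /// Adds a path under `parent`, pushing it onto the parent's child list.
    fn add(&mut self, parent: Option<MovePathIndex>) -> MovePathIndex {
        let mpi = MovePathIndex(self.move_paths.len());
        // The previous first child (if any) becomes our next sibling.
        let next_sibling = match parent {
            Some(p) => self.move_paths[p.0].first_child.replace(mpi),
            None => None,
        };
        self.move_paths.push(MovePath { parent, first_child: None, next_sibling });
        mpi
    }

    fn parent_of(&self, mpi: MovePathIndex) -> Option<MovePathIndex> {
        self.move_paths[mpi.0].parent
    }

    /// True if `root` or any path below it is in `set`.
    fn has_any_child_of(&self, set: &HashSet<MovePathIndex>, root: MovePathIndex) -> bool {
        let mut todo = vec![root];
        while let Some(mpi) = todo.pop() {
            if set.contains(&mpi) {
                return true;
            }
            // Queue every direct child by following the sibling chain.
            let mut child = self.move_paths[mpi.0].first_child;
            while let Some(c) = child {
                todo.push(c);
                child = self.move_paths[c.0].next_sibling;
            }
        }
        false
    }
}
```

If only `a.b.c` is in the set, a query for `a.b` or `a` succeeds in this model, while a query for an unrelated local does not.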


@ -1,11 +1,11 @@
# MIR-based region checking (NLL) # Region inference (NLL)
The MIR-based region checking code is located in The MIR-based region checking code is located in
[the `rustc_mir::borrow_check::nll` module][nll]. (NLL, of course, [the `rustc_mir::borrow_check::nll` module][nll]. (NLL, of course,
stands for "non-lexical lifetimes", a term that will hopefully be stands for "non-lexical lifetimes", a term that will hopefully be
deprecated once they become the standard kind of lifetime.) deprecated once they become the standard kind of lifetime.)
[nll]: https://github.com/rust-lang/rust/tree/master/src/librustc_mir/borrow_check/nll [nll]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/index.html
The MIR-based region analysis consists of two major functions: The MIR-based region analysis consists of two major functions:


@ -0,0 +1,11 @@
# The MIR type-check
A key component of the borrow check is the
[MIR type-check](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/type_check/index.html).
This check walks the MIR and does a complete "type check" -- the same
kind you might find in any other language. In the process of doing
this type-check, we also uncover the region constraints that apply to
the program.