Improve doc of MIR queries & passes

chj 2022-08-16 19:10:19 +08:00 committed by Oli Scherer
parent d3f43d361d
commit 2f8dd37f16
2 changed files with 148 additions and 64 deletions


@ -110,8 +110,8 @@
- [The MIR (Mid-level IR)](./mir/index.md) - [The MIR (Mid-level IR)](./mir/index.md)
- [MIR construction](./mir/construction.md) - [MIR construction](./mir/construction.md)
- [MIR visitor and traversal](./mir/visitor.md) - [MIR visitor and traversal](./mir/visitor.md)
- [MIR passes: getting the MIR for a function](./mir/passes.md) - [MIR queries and passes: getting the MIR](./mir/passes.md)
- [Identifiers in the compiler](./identifiers.md) - [Identifiers in the Compiler](./identifiers.md)
- [Closure expansion](./closure.md) - [Closure expansion](./closure.md)
- [Inline assembly](./asm.md) - [Inline assembly](./asm.md)


@ -1,100 +1,184 @@
# MIR passes # MIR queries and passes
If you would like to get the MIR for a function (or constant, etc), If you would like to get the MIR:
you can use the `optimized_mir(def_id)` query. This will give you back
the final, optimized MIR. For foreign def-ids, we simply read the MIR - for a function - you can use the `optimized_mir(def_id)` query;
- for a promoted - you can use the `promoted_mir(def_id)` query.
These will give you back the final, optimized MIR. For foreign def-ids, we simply read the MIR
from the other crate's metadata. But for local def-ids, the query will from the other crate's metadata. But for local def-ids, the query will
construct the MIR and then iteratively optimize it by applying a construct the optimized MIR by requesting a pipeline of upstream queries[^query].
series of passes. This section describes how those passes work and how Each query will contain a series of passes.
you can extend them. This section describes how those queries and passes work and how you can extend them.
To produce the `optimized_mir(D)` for a given def-id `D`, the MIR To produce the optimized MIR for a given def-id `D`, `optimized_mir(D)`
passes through several suites of optimizations, each represented by a goes through several suites of passes, each grouped by a
query. Each suite consists of multiple optimizations and query. Each suite consists of passes which perform analysis, transformation or optimization.
transformations. These suites represent useful intermediate points Each query represents a useful intermediate point
where we want to access the MIR for type checking or other purposes: where we can access the MIR dialect for type checking or other purposes:
- `mir_build(D)` - not a query, but this constructs the initial MIR - `mir_built(D)` - it gives the initial MIR just after it's built;
- `mir_const(D)` - applies some simple transformations to make MIR ready for - `mir_const(D)` - it applies some simple transformation passes to make MIR ready for
constant evaluation; const qualification;
- `mir_validated(D)` - applies some more transformations, making MIR ready for - `mir_promoted(D)` - it extracts promotable temps into separate MIR bodies, and also makes MIR
borrow checking; ready for borrow checking;
- `optimized_mir(D)` - the final state, after all optimizations have been - `mir_drops_elaborated_and_const_checked(D)` - it performs borrow checking, runs major
performed. transformation passes (such as drop elaboration) and makes MIR ready for optimization;
- `optimized_mir(D)` - it performs all enabled optimizations and reaches the final state.
### Implementing and registering a pass [^query]: See the [Queries](../query.md) chapter for the general concept of query.
A `MirPass` is some bit of code that processes the MIR, typically, ## Implementing and registering a pass
but not always, transforming it along the way somehow. For example,
it might perform an optimization. The `MirPass` trait itself is found A `MirPass` is some bit of code that processes the MIR, typically transforming it along the way
in [the `rustc_mir_transform` crate][mirtransform], and it somehow. But it may also do other things like linting (e.g., [`CheckPackedRef`][lint1],
basically consists of one method, `run_pass`, that simply gets an [`CheckConstItemMutation`][lint2], [`FunctionItemReferences`][lint3], which implement `MirLint`) or
`&mut Mir` (along with the tcx and some information about where it optimization (e.g., [`SimplifyCfg`][opt1], [`RemoveUnneededDrops`][opt2]). While most MIR passes
came from). The MIR is therefore modified in place (which helps to are defined in the [`rustc_mir_transform`][mirtransform] crate, the `MirPass` trait itself is
keep things efficient). [found][mirpass] in the `rustc_middle` crate, and it basically consists of one primary method,
`run_pass`, that simply gets an `&mut Body` (along with the `tcx`).
The MIR is therefore modified in place (which helps to keep things efficient).
A basic example of a MIR pass is [`RemoveStorageMarkers`], which walks A basic example of a MIR pass is [`RemoveStorageMarkers`], which walks
the MIR and removes all storage marks if they won't be emitted during codegen. As you the MIR and removes all storage marks if they won't be emitted during codegen. As you
can see from its source, a MIR pass is defined by first defining a can see from its source, a MIR pass is defined by first defining a
dummy type, a struct with no fields, something like: dummy type, a struct with no fields:
```rust ```rust
struct MyPass; pub struct RemoveStorageMarkers;
``` ```
for which you then implement the `MirPass` trait. You can then insert for which we implement the `MirPass` trait. We can then insert
this pass into the appropriate list of passes found in a query like this pass into the appropriate list of passes found in a query like
`optimized_mir`, `mir_validated`, etc. (If this is an optimization, it `mir_built`, `optimized_mir`, etc. (If this is an optimization, it
should go into the `optimized_mir` list.) should go into the `optimized_mir` list.)
Another example of a simple MIR pass is [`CleanupNonCodegenStatements`][cleanup-pass], which walks
the MIR and removes all statements that are not relevant to code generation. As you can see from
its [source][cleanup-source], it is defined by first defining a dummy type, a struct with no
fields:
```rust
pub struct CleanupNonCodegenStatements;
```
for which we then implement the `MirPass` trait:
```rust
impl<'tcx> MirPass<'tcx> for CleanupNonCodegenStatements {
fn run_pass(&self, tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
...
}
}
```
We [register][pass-register] this pass inside the `mir_drops_elaborated_and_const_checked` query.
(If this is an optimization, it should go into the `optimized_mir` list.)
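The shape of a pass described above can be sketched outside the compiler. The following is a toy model only: `ToyMirPass`, `ToyBody`, `ToyStatement`, `ToyRemoveStorageMarkers` and `run_passes` are hypothetical stand-ins invented for illustration, not rustc's real `MirPass`, `Body` or pass manager. It shows a field-less struct implementing a one-method trait that mutates the body in place, plus a list of passes run in order.

```rust
// Toy model: every type here is a simplified, hypothetical stand-in.
// rustc's real `MirPass::run_pass` also receives a `TyCtxt`.

#[derive(Debug, Clone, PartialEq)]
enum ToyStatement {
    StorageLive(usize),
    StorageDead(usize),
    Assign(usize, i64),
}

#[derive(Debug, Default)]
struct ToyBody {
    statements: Vec<ToyStatement>,
}

/// One primary method that gets the body by `&mut`, so the pass edits in place.
trait ToyMirPass {
    fn run_pass(&self, body: &mut ToyBody);
}

/// A "dummy type": a struct with no fields that only carries the impl.
struct ToyRemoveStorageMarkers;

impl ToyMirPass for ToyRemoveStorageMarkers {
    fn run_pass(&self, body: &mut ToyBody) {
        // Keep every statement that is not a storage marker.
        body.statements.retain(|s| {
            !matches!(
                s,
                ToyStatement::StorageLive(_) | ToyStatement::StorageDead(_)
            )
        });
    }
}

/// Stand-in for the list of passes registered inside a query.
fn run_passes(body: &mut ToyBody, passes: &[&dyn ToyMirPass]) {
    for pass in passes {
        pass.run_pass(body);
    }
}
```

Calling `run_passes(&mut body, &[&ToyRemoveStorageMarkers])` on a body holding a `StorageLive`, an `Assign` and a `StorageDead` leaves only the `Assign`.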
If you are writing a pass, there's a good chance that you are going to If you are writing a pass, there's a good chance that you are going to
want to use a [MIR visitor]. MIR visitors are a handy way to walk all want to use a [MIR visitor]. MIR visitors are a handy way to walk all
the parts of the MIR, either to search for something or to make small the parts of the MIR, either to search for something or to make small
edits. edits.
### Stealing ## Stealing
The intermediate queries `mir_const()` and `mir_validated()` yield up The intermediate queries `mir_const()` and `mir_promoted()` yield up
a `&'tcx Steal<Mir<'tcx>>`, allocated using a `&'tcx Steal<Body<'tcx>>`, allocated using `tcx.alloc_steal_mir()`.
`tcx.alloc_steal_mir()`. This indicates that the result may be This indicates that the result may be **stolen** by a subsequent query - this is an
**stolen** by the next suite of optimizations - this is an
optimization to avoid cloning the MIR. Attempting to use a stolen optimization to avoid cloning the MIR. Attempting to use a stolen
result will cause a panic in the compiler. Therefore, it is important result will cause a panic in the compiler. Therefore, it is important
that you do not read directly from these intermediate queries except as that you do not accidentally read from these intermediate queries without
part of the MIR processing pipeline. the consideration of the dependency in the MIR processing pipeline.
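To make the borrow-versus-steal behaviour concrete, here is a minimal single-threaded sketch of a `Steal`-like cell. It is an assumption-laden simplification: this toy `Steal` is built on `RefCell<Option<T>>`, the `is_stolen` helper is added only for illustration, and the panic messages are invented here; the compiler's own type differs in its details.

```rust
use std::cell::{Ref, RefCell};

/// Toy stand-in for a stealable value: `Some` until stolen, then `None`.
struct Steal<T> {
    value: RefCell<Option<T>>,
}

impl<T> Steal<T> {
    fn new(value: T) -> Self {
        Steal {
            value: RefCell::new(Some(value)),
        }
    }

    /// Read access. Panics if the value has already been stolen.
    fn borrow(&self) -> Ref<'_, T> {
        Ref::map(self.value.borrow(), |opt| {
            opt.as_ref().expect("attempted to read a stolen value")
        })
    }

    /// Takes ownership without cloning. Panics on a second steal.
    fn steal(&self) -> T {
        self.value
            .borrow_mut()
            .take()
            .expect("attempted to steal an already stolen value")
    }

    /// Hypothetical helper, used here only to observe the state.
    fn is_stolen(&self) -> bool {
        self.value.borrow().is_none()
    }
}
```

With this sketch, `borrow()` works any number of times before `steal()` is called; afterwards, `borrow()` panics, which mirrors the compiler panic described above.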
Because of this stealing mechanism, some care must also be taken to Because of this stealing mechanism, some care must be taken to
ensure that, before the MIR at a particular phase in the processing ensure that, before the MIR at a particular phase in the processing
pipeline is stolen, anyone who may want to read from it has already pipeline is stolen, anyone who may want to read from it has already
done so. Concretely, this means that if you have some query `foo(D)` done so.
<!-- FIXME - What is force? Do we still have it in rustc? -->
Concretely, this means that if you have a query `foo(D)`
that wants to access the result of `mir_const(D)` or that wants to access the result of `mir_const(D)` or
`mir_validated(D)`, you need to have the successor pass "force" `mir_promoted(D)`, you need to have the successor pass "force"
`foo(D)` using `ty::queries::foo::force(...)`. This will force a query `foo(D)` using `ty::queries::foo::force(...)`. This will force a query
to execute even though you don't directly require its result. to execute even though you don't directly require its result.
As an example, consider MIR const qualification. It wants to read the > This mechanism is a bit dodgy. There is a discussion of more elegant
result produced by the `mir_const()` suite. However, that result will
be **stolen** by the `mir_validated()` suite. If nothing was done,
then `mir_const_qualif(D)` would succeed if it came before
`mir_validated(D)`, but fail otherwise. Therefore, `mir_validated(D)`
will **force** `mir_const_qualif` before it actually steals, thus
ensuring that the reads have already happened (remember that
[queries are memoized](../query.html), so executing a query twice
simply loads from a cache the second time):
```text
mir_const(D) --read-by--> mir_const_qualif(D)
| ^
stolen-by |
| (forces)
v |
mir_validated(D) ------------+
```
This mechanism is a bit dodgy. There is a discussion of more elegant
alternatives in [rust-lang/rust#41710]. alternatives in [rust-lang/rust#41710].
### Overview
Below is an overview of the stealing dependency in the MIR processing pipeline[^part]:
```mermaid
flowchart BT
mir_for_ctfe* --borrow--> id40
id5 --steal--> id40
mir_borrowck* --borrow--> id3
id41 --steal part 1--> id3
id40 --steal part 0--> id3
mir_const_qualif* -- borrow --> id2
id3 -- steal --> id2
id2 -- steal --> id1
id1([mir_built])
id2([mir_const])
id3([mir_promoted])
id40([mir_drops_elaborated_and_const_checked])
id41([promoted_mir])
id5([optimized_mir])
style id1 fill:#bbf
style id2 fill:#bbf
style id3 fill:#bbf
style id40 fill:#bbf
style id41 fill:#bbf
style id5 fill:#bbf
```
The stadium-shaped queries (e.g., `mir_built`) with a darker fill are the primary queries in the
pipeline, while the rectangular queries (e.g., `mir_const_qualif*`[^star]) with a lighter fill
are the subsequent queries that need to read the results from `&'tcx Steal<Body<'tcx>>`. Because of
the stealing mechanism, each rectangular query must run before any stadium-shaped query
at an equal or greater height in the dependency tree runs.
[^part]: The `mir_promoted` query yields a tuple
`(&'tcx Steal<Body<'tcx>>, &'tcx Steal<IndexVec<Promoted, Body<'tcx>>>)`. `promoted_mir` steals
part 1 (`&'tcx Steal<IndexVec<Promoted, Body<'tcx>>>`) and `mir_drops_elaborated_and_const_checked`
steals part 0 (`&'tcx Steal<Body<'tcx>>`). The two steals are independent of each other,
i.e., they can be performed separately.
[^star]: Note that the `*` suffix in the queries represents a set of queries with the same prefix.
For example, `mir_borrowck*` represents `mir_borrowck`, `mir_borrowck_const_arg` and
`mir_borrowck_opt_const_arg`.
### Example
As an example, consider MIR const qualification. It wants to read the result produced by the
`mir_const` query. However, that result will be **stolen** by the `mir_promoted` query at some
time in the pipeline. Before `mir_promoted` is ever queried, calling the `mir_const_qualif` query
will succeed since `mir_const` will produce (if queried the first time) or cache (if queried
multiple times) the `Steal` result and the result is **not** stolen yet. After `mir_promoted` is
queried, the result would be stolen and calling the `mir_const_qualif` query to read the result
would cause a panic.
Therefore, with this stealing mechanism, `mir_promoted` should guarantee any `mir_const_qualif*`
queries are called before it actually steals, thus ensuring that the reads have already happened
(remember that [queries are memoized](../query.html), so executing a query twice
simply loads from a cache the second time).
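The ordering requirement in this example can be sketched with a toy query context. Everything here is hypothetical: `ToyCtxt` and its fields are invented, the "MIR" is just a list of strings, and the "qualification" result is only the statement count. The method names merely echo the queries discussed above. The point is that the stealing step runs the reader first, and that the reader's memoized result remains available after the steal.

```rust
use std::cell::{Cell, RefCell};

/// Toy query context; not rustc's `TyCtxt`.
struct ToyCtxt {
    /// Result of the toy `mir_const` step; becomes `None` once stolen.
    mir_const: RefCell<Option<Vec<String>>>,
    /// Memoized result of the toy `mir_const_qualif` step.
    qualif_cache: Cell<Option<usize>>,
}

impl ToyCtxt {
    fn new(statements: Vec<String>) -> Self {
        ToyCtxt {
            mir_const: RefCell::new(Some(statements)),
            qualif_cache: Cell::new(None),
        }
    }

    /// Reader: needs the un-stolen body the first time, then uses the cache.
    fn mir_const_qualif(&self) -> usize {
        if let Some(cached) = self.qualif_cache.get() {
            return cached;
        }
        let guard = self.mir_const.borrow();
        let body = guard.as_ref().expect("mir_const was already stolen");
        let result = body.len();
        self.qualif_cache.set(Some(result));
        result
    }

    /// Stealer: makes sure the reader has run before taking the body.
    fn mir_promoted(&self) -> Vec<String> {
        let _ = self.mir_const_qualif();
        self.mir_const
            .borrow_mut()
            .take()
            .expect("mir_const was already stolen")
    }
}
```

If the call to `mir_const_qualif` inside `mir_promoted` were removed, a first call to `mir_const_qualif` made after `mir_promoted` would hit the `expect` and panic, which is the failure mode the paragraph above describes.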
[rust-lang/rust#41710]: https://github.com/rust-lang/rust/issues/41710 [rust-lang/rust#41710]: https://github.com/rust-lang/rust/issues/41710
[mirpass]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/trait.MirPass.html
[lint1]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/check_packed_ref/struct.CheckPackedRef.html
[lint2]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/check_const_item_mutation/struct.CheckConstItemMutation.html
[lint3]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/function_item_references/struct.FunctionItemReferences.html
[opt1]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/simplify/struct.SimplifyCfg.html
[opt2]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/remove_unneeded_drops/struct.RemoveUnneededDrops.html
[mirtransform]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/ [mirtransform]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/
[`RemoveStorageMarkers`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/remove_storage_markers/struct.RemoveStorageMarkers.html [`RemoveStorageMarkers`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/remove_storage_markers/struct.RemoveStorageMarkers.html
[cleanup-pass]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/cleanup_post_borrowck/struct.CleanupNonCodegenStatements.html
[cleanup-source]: https://github.com/rust-lang/rust/blob/e2b52ff73edc8b0b7c74bc28760d618187731fe8/compiler/rustc_mir_transform/src/cleanup_post_borrowck.rs#L27
[pass-register]: https://github.com/rust-lang/rust/blob/e2b52ff73edc8b0b7c74bc28760d618187731fe8/compiler/rustc_mir_transform/src/lib.rs#L413
[MIR visitor]: ./visitor.html [MIR visitor]: ./visitor.html