292 lines
11 KiB
Markdown
292 lines
11 KiB
Markdown
# Queries: demand-driven compilation
|
||
|
||
As described in [the high-level overview of the compiler][hl], the Rust compiler
|
||
is still (as of <!-- date: 2021-01 --> January 2021) transitioning from a
|
||
traditional "pass-based" setup to a "demand-driven" system. **The Compiler Query
|
||
System is the key to our new demand-driven organization.** The idea is pretty
|
||
simple. You have various queries that compute things about the input – for
|
||
example, there is a query called `type_of(def_id)` that, given the [def-id] of
|
||
some item, will compute the type of that item and return it to you.
|
||
|
||
[def-id]: appendix/glossary.md#def-id
|
||
[hl]: ./compiler-src.md
|
||
|
||
Query execution is **memoized** – so the first time you invoke a
|
||
query, it will go do the computation, but the next time, the result is
|
||
returned from a hashtable. Moreover, query execution fits nicely into
|
||
**incremental computation**; the idea is roughly that, when you do a
|
||
query, the result **may** be returned to you by loading stored data
|
||
from disk (but that's a separate topic we won't discuss further here).
|
||
|
||
The overall vision is that, eventually, the entire compiler
|
||
control-flow will be query driven. There will effectively be one
|
||
top-level query ("compile") that will run compilation on a crate; this
|
||
will in turn demand information about that crate, starting from the
|
||
*end*. For example:
|
||
|
||
- This "compile" query might demand to get a list of codegen-units
|
||
(i.e. modules that need to be compiled by LLVM).
|
||
- But computing the list of codegen-units would invoke some subquery
|
||
that returns the list of all modules defined in the Rust source.
|
||
- That query in turn would invoke something asking for the HIR.
|
||
- This keeps going further and further back until we wind up doing the
|
||
actual parsing.
|
||
|
||
However, that vision is not fully realized. Still, big chunks of the
|
||
compiler (for example, generating MIR) work exactly like this.
|
||
|
||
### Incremental Compilation in Detail
|
||
|
||
The [Incremental Compilation in Detail][query-model] chapter gives a more
|
||
in-depth description of what queries are and how they work.
|
||
If you intend to write a query of your own, this is a good read.
|
||
|
||
### Invoking queries
|
||
|
||
To invoke a query is simple. The tcx ("type context") offers a method
|
||
for each defined query. So, for example, to invoke the `type_of`
|
||
query, you would just do this:
|
||
|
||
```rust,ignore
|
||
let ty = tcx.type_of(some_def_id);
|
||
```
|
||
|
||
### How the compiler executes a query
|
||
|
||
So you may be wondering what happens when you invoke a query
|
||
method. The answer is that, for each query, the compiler maintains a
|
||
cache – if your query has already been executed, then, the answer is
|
||
simple: we clone the return value out of the cache and return it
|
||
(therefore, you should try to ensure that the return types of queries
|
||
are cheaply cloneable; insert a `Rc` if necessary).
|
||
|
||
#### Providers
|
||
|
||
If, however, the query is *not* in the cache, then the compiler will
|
||
try to find a suitable **provider**. A provider is a function that has
|
||
been defined and linked into the compiler somewhere that contains the
|
||
code to compute the result of the query.
|
||
|
||
**Providers are defined per-crate.** The compiler maintains,
|
||
internally, a table of providers for every crate, at least
|
||
conceptually. Right now, there are really two sets: the providers for
|
||
queries about the **local crate** (that is, the one being compiled)
|
||
and providers for queries about **external crates** (that is,
|
||
dependencies of the local crate). Note that what determines the crate
|
||
that a query is targeting is not the *kind* of query, but the *key*.
|
||
For example, when you invoke `tcx.type_of(def_id)`, that could be a
|
||
local query or an external query, depending on what crate the `def_id`
|
||
is referring to (see the [`self::keys::Key`][Key] trait for more
|
||
information on how that works).
|
||
|
||
Providers always have the same signature:
|
||
|
||
```rust,ignore
|
||
fn provider<'tcx>(
|
||
tcx: TyCtxt<'tcx>,
|
||
key: QUERY_KEY,
|
||
) -> QUERY_RESULT {
|
||
...
|
||
}
|
||
```
|
||
|
||
Providers take two arguments: the `tcx` and the query key.
|
||
They return the result of the query.
|
||
|
||
#### How providers are setup
|
||
|
||
When the tcx is created, it is given the providers by its creator using
|
||
the [`Providers`][providers_struct] struct. This struct is generated by
|
||
the macros here, but it is basically a big list of function pointers:
|
||
|
||
[providers_struct]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/query/struct.Providers.html
|
||
|
||
```rust,ignore
|
||
struct Providers {
|
||
type_of: for<'tcx> fn(TyCtxt<'tcx>, DefId) -> Ty<'tcx>,
|
||
...
|
||
}
|
||
```
|
||
|
||
At present, we have one copy of the struct for local crates, and one
|
||
for external crates, though the plan is that we may eventually have
|
||
one per crate.
|
||
|
||
These `Providers` structs are ultimately created and populated by
|
||
`rustc_driver`, but it does this by distributing the work
|
||
throughout the other `rustc_*` crates. This is done by invoking
|
||
various [`provide`][provide_fn] functions. These functions tend to look
|
||
something like this:
|
||
|
||
[provide_fn]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/fn.provide.html
|
||
|
||
```rust,ignore
|
||
pub fn provide(providers: &mut Providers) {
|
||
*providers = Providers {
|
||
type_of,
|
||
..*providers
|
||
};
|
||
}
|
||
```
|
||
|
||
That is, they take an `&mut Providers` and mutate it in place. Usually
|
||
we use the formulation above just because it looks nice, but you could
|
||
as well do `providers.type_of = type_of`, which would be equivalent.
|
||
(Here, `type_of` would be a top-level function, defined as we saw
|
||
before.) So, if we want to add a provider for some other query,
|
||
let's call it `fubar`, into the crate above, we might modify the `provide()`
|
||
function like so:
|
||
|
||
```rust,ignore
|
||
pub fn provide(providers: &mut Providers) {
|
||
*providers = Providers {
|
||
type_of,
|
||
fubar,
|
||
..*providers
|
||
};
|
||
}
|
||
|
||
fn fubar<'tcx>(tcx: TyCtxt<'tcx>, key: DefId) -> Fubar<'tcx> { ... }
|
||
```
|
||
|
||
N.B. Most of the `rustc_*` crates only provide **local
|
||
providers**. Almost all **extern providers** wind up going through the
|
||
[`rustc_metadata` crate][rustc_metadata], which loads the information
|
||
from the crate metadata. But in some cases there are crates that
|
||
provide queries for *both* local and external crates, in which case
|
||
they define both a `provide` and a `provide_extern` function, through
|
||
[`provide_both`][ext_provide_both], that `rustc_driver` can invoke.
|
||
|
||
[rustc_metadata]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_metadata/index.html
|
||
[ext_provide_both]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/attributes/fn.provide_both.html
|
||
|
||
### Adding a new kind of query
|
||
|
||
So suppose you want to add a new kind of query, how do you do so?
|
||
Well, defining a query takes place in two steps:
|
||
|
||
1. first, you have to specify the query name and arguments; and then,
|
||
2. you have to supply query providers where needed.
|
||
|
||
To specify the query name and arguments, you simply add an entry to
|
||
the big macro invocation in
|
||
[`compiler/rustc_middle/src/query/mod.rs`][query-mod], which looks something like:
|
||
|
||
[query-mod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/query/index.html
|
||
|
||
```rust,ignore
|
||
rustc_queries! {
|
||
Other {
|
||
/// Records the type of every item.
|
||
query type_of(key: DefId) -> Ty<'tcx> {
|
||
cache { key.is_local() }
|
||
}
|
||
}
|
||
|
||
...
|
||
}
|
||
```
|
||
|
||
Queries are grouped into categories (`Other`, `Codegen`, `TypeChecking`, etc.).
|
||
Each group contains one or more queries. Each query definition is broken up like
|
||
this:
|
||
|
||
```rust,ignore
|
||
query type_of(key: DefId) -> Ty<'tcx> { ... }
|
||
^^ ^^^^^^^ ^^^^^ ^^^^^^^^ ^^^
|
||
| | | | |
|
||
| | | | query modifiers
|
||
| | | result type of query
|
||
| | query key type
|
||
| name of query
|
||
query keyword
|
||
```
|
||
|
||
Let's go over them one by one:
|
||
|
||
- **Query keyword:** indicates a start of a query definition.
|
||
- **Name of query:** the name of the query method
|
||
(`tcx.type_of(..)`). Also used as the name of a struct
|
||
(`ty::queries::type_of`) that will be generated to represent
|
||
this query.
|
||
- **Query key type:** the type of the argument to this query.
|
||
This type must implement the [`ty::query::keys::Key`][Key] trait, which
|
||
defines (for example) how to map it to a crate, and so forth.
|
||
- **Result type of query:** the type produced by this query. This type
|
||
should (a) not use `RefCell` or other interior mutability and (b) be
|
||
cheaply cloneable. Interning or using `Rc` or `Arc` is recommended for
|
||
non-trivial data types.
|
||
- The one exception to those rules is the `ty::steal::Steal` type,
|
||
which is used to cheaply modify MIR in place. See the definition
|
||
of `Steal` for more details. New uses of `Steal` should **not** be
|
||
added without alerting `@rust-lang/compiler`.
|
||
- **Query modifiers:** various flags and options that customize how the
|
||
query is processed (mostly with respect to [incremental compilation][incrcomp]).
|
||
|
||
[Key]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_query_impl/keys/trait.Key.html
|
||
[incrcomp]: queries/incremental-compilation-in-detail.html#query-modifiers
|
||
|
||
So, to add a query:
|
||
|
||
- Add an entry to `rustc_queries!` using the format above.
|
||
- Link the provider by modifying the appropriate `provide` method;
|
||
or add a new one if needed and ensure that `rustc_driver` is invoking it.
|
||
|
||
#### Query structs and descriptions
|
||
|
||
For each kind, the `rustc_queries` macro will generate a "query struct"
|
||
named after the query. This struct is a kind of a place-holder
|
||
describing the query. Each such struct implements the
|
||
[`self::config::QueryConfig`][QueryConfig] trait, which has associated types for the
|
||
key/value of that particular query. Basically the code generated looks something
|
||
like this:
|
||
|
||
```rust,ignore
|
||
// Dummy struct representing a particular kind of query:
|
||
pub struct type_of<'tcx> { data: PhantomData<&'tcx ()> }
|
||
|
||
impl<'tcx> QueryConfig for type_of<'tcx> {
|
||
type Key = DefId;
|
||
type Value = Ty<'tcx>;
|
||
|
||
const NAME: QueryName = QueryName::type_of;
|
||
const CATEGORY: ProfileCategory = ProfileCategory::Other;
|
||
}
|
||
```
|
||
|
||
There is an additional trait that you may wish to implement called
|
||
[`self::config::QueryDescription`][QueryDescription]. This trait is
|
||
used during cycle errors to give a "human readable" name for the query,
|
||
so that we can summarize what was happening when the cycle occurred.
|
||
Implementing this trait is optional if the query key is `DefId`, but
|
||
if you *don't* implement it, you get a pretty generic error ("processing `foo`...").
|
||
You can put new impls into the `config` module. They look something like this:
|
||
|
||
[QueryConfig]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_query_system/query/config/trait.QueryConfig.html
|
||
[QueryDescription]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_query_system/query/config/trait.QueryDescription.html
|
||
|
||
```rust,ignore
|
||
impl<'tcx> QueryDescription for queries::type_of<'tcx> {
|
||
fn describe(tcx: TyCtxt, key: DefId) -> String {
|
||
format!("computing the type of `{}`", tcx.def_path_str(key))
|
||
}
|
||
}
|
||
```
|
||
|
||
Another option is to add `desc` modifier:
|
||
|
||
```rust,ignore
|
||
rustc_queries! {
|
||
Other {
|
||
/// Records the type of every item.
|
||
query type_of(key: DefId) -> Ty<'tcx> {
|
||
desc { |tcx| "computing the type of `{}`", tcx.def_path_str(key) }
|
||
}
|
||
}
|
||
}
|
||
```
|
||
|
||
`rustc_queries` macro will generate an appropriate `impl` automatically.
|
||
|
||
[query-model]: queries/incremental-compilation-in-detail.md
|