# Queries: demand-driven compilation As described in [the high-level overview of the compiler][hl], the Rust compiler is currently transitioning from a traditional "pass-based" setup to a "demand-driven" system. **The Compiler Query System is the key to our new demand-driven organization.** The idea is pretty simple. You have various queries that compute things about the input – for example, there is a query called `type_of(def_id)` that, given the [def-id] of some item, will compute the type of that item and return it to you. [def-id]: appendix/glossary.md#def-id [hl]: compiler-src.html Query execution is **memoized** – so the first time you invoke a query, it will go do the computation, but the next time, the result is returned from a hashtable. Moreover, query execution fits nicely into **incremental computation**; the idea is roughly that, when you do a query, the result **may** be returned to you by loading stored data from disk (but that's a separate topic we won't discuss further here). The overall vision is that, eventually, the entire compiler control-flow will be query driven. There will effectively be one top-level query ("compile") that will run compilation on a crate; this will in turn demand information about that crate, starting from the *end*. For example: - This "compile" query might demand to get a list of codegen-units (i.e. modules that need to be compiled by LLVM). - But computing the list of codegen-units would invoke some subquery that returns the list of all modules defined in the Rust source. - That query in turn would invoke something asking for the HIR. - This keeps going further and further back until we wind up doing the actual parsing. However, that vision is not fully realized. Still, big chunks of the compiler (for example, generating MIR) work exactly like this. ### Incremental Compilation in Detail The [Incremental Compilation in Detail][query-model] chapter gives a more in-depth description of what queries are and how they work. If you intend to write a query of your own, this is a good read. ### Invoking queries To invoke a query is simple. The tcx ("type context") offers a method for each defined query. So, for example, to invoke the `type_of` query, you would just do this: ```rust,ignore let ty = tcx.type_of(some_def_id); ``` ### How the compiler executes a query So you may be wondering what happens when you invoke a query method. The answer is that, for each query, the compiler maintains a cache – if your query has already been executed, then, the answer is simple: we clone the return value out of the cache and return it (therefore, you should try to ensure that the return types of queries are cheaply cloneable; insert a `Rc` if necessary). #### Providers If, however, the query is *not* in the cache, then the compiler will try to find a suitable **provider**. A provider is a function that has been defined and linked into the compiler somewhere that contains the code to compute the result of the query. **Providers are defined per-crate.** The compiler maintains, internally, a table of providers for every crate, at least conceptually. Right now, there are really two sets: the providers for queries about the **local crate** (that is, the one being compiled) and providers for queries about **external crates** (that is, dependencies of the local crate). Note that what determines the crate that a query is targeting is not the *kind* of query, but the *key*. For example, when you invoke `tcx.type_of(def_id)`, that could be a local query or an external query, depending on what crate the `def_id` is referring to (see the [`self::keys::Key`][Key] trait for more information on how that works). Providers always have the same signature: ```rust,ignore fn provider<'tcx>( tcx: TyCtxt<'tcx>, key: QUERY_KEY, ) -> QUERY_RESULT { ... } ``` Providers take two arguments: the `tcx` and the query key. They return the result of the query. #### How providers are setup When the tcx is created, it is given the providers by its creator using the [`Providers`][providers_struct] struct. This struct is generated by the macros here, but it is basically a big list of function pointers: [providers_struct]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/query/struct.Providers.html ```rust,ignore struct Providers { type_of: for<'tcx> fn(TyCtxt<'tcx>, DefId) -> Ty<'tcx>, ... } ``` At present, we have one copy of the struct for local crates, and one for external crates, though the plan is that we may eventually have one per crate. These `Providers` structs are ultimately created and populated by `librustc_driver`, but it does this by distributing the work throughout the other `rustc_*` crates. This is done by invoking various [`provide`][provide_fn] functions. These functions tend to look something like this: [provide_fn]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/fn.provide.html ```rust,ignore pub fn provide(providers: &mut Providers) { *providers = Providers { type_of, ..*providers }; } ``` That is, they take an `&mut Providers` and mutate it in place. Usually we use the formulation above just because it looks nice, but you could as well do `providers.type_of = type_of`, which would be equivalent. (Here, `type_of` would be a top-level function, defined as we saw before.) So, if we want to add a provider for some other query, let's call it `fubar`, into the crate above, we might modify the `provide()` function like so: ```rust,ignore pub fn provide(providers: &mut Providers) { *providers = Providers { type_of, fubar, ..*providers }; } fn fubar<'tcx>(tcx: TyCtxt<'tcx>, key: DefId) -> Fubar<'tcx> { ... } ``` N.B. Most of the `rustc_*` crates only provide **local providers**. Almost all **extern providers** wind up going through the [`rustc_metadata` crate][rustc_metadata], which loads the information from the crate metadata. But in some cases there are crates that provide queries for *both* local and external crates, in which case they define both a [`provide`][ext_provide] and a [`provide_extern`][ext_provide_extern] function that `rustc_driver` can invoke. [rustc_metadata]: https://github.com/rust-lang/rust/tree/master/src/librustc_metadata [ext_provide]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/attributes/fn.provide.html [ext_provide_extern]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/attributes/fn.provide_extern.html ### Adding a new kind of query So suppose you want to add a new kind of query, how do you do so? Well, defining a query takes place in two steps: 1. first, you have to specify the query name and arguments; and then, 2. you have to supply query providers where needed. To specify the query name and arguments, you simply add an entry to the big macro invocation in [`src/librustc_middle/query/mod.rs`][query-mod], which looks something like: [query-mod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/query/index.html ```rust,ignore rustc_queries! { Other { /// Records the type of every item. query type_of(key: DefId) -> Ty<'tcx> { cache { key.is_local() } } } ... } ``` Queries are grouped into categories (`Other`, `Codegen`, `TypeChecking`, etc.). Each group contains one or more queries. Each query definition is broken up like this: ```rust,ignore query type_of(key: DefId) -> Ty<'tcx> { ... } ^^ ^^^^^^^ ^^^^^ ^^^^^^^^ ^^^ | | | | | | | | | query modifiers | | | result type of query | | query key type | name of query query keyword ``` Let's go over them one by one: - **Query keyword:** indicates a start of a query definition. - **Name of query:** the name of the query method (`tcx.type_of(..)`). Also used as the name of a struct (`ty::queries::type_of`) that will be generated to represent this query. - **Query key type:** the type of the argument to this query. This type must implement the [`ty::query::keys::Key`][Key] trait, which defines (for example) how to map it to a crate, and so forth. - **Result type of query:** the type produced by this query. This type should (a) not use `RefCell` or other interior mutability and (b) be cheaply cloneable. Interning or using `Rc` or `Arc` is recommended for non-trivial data types. - The one exception to those rules is the `ty::steal::Steal` type, which is used to cheaply modify MIR in place. See the definition of `Steal` for more details. New uses of `Steal` should **not** be added without alerting `@rust-lang/compiler`. - **Query modifiers:** various flags and options that customize how the query is processed (mostly with respect to [incremental compilation][incrcomp]). [Key]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/query/keys/trait.Key.html [incrcomp]: queries/incremental-compilation-in-detail.html#query-modifiers So, to add a query: - Add an entry to `rustc_queries!` using the format above. - Link the provider by modifying the appropriate `provide` method; or add a new one if needed and ensure that `rustc_driver` is invoking it. #### Query structs and descriptions For each kind, the `rustc_queries` macro will generate a "query struct" named after the query. This struct is a kind of a place-holder describing the query. Each such struct implements the [`self::config::QueryConfig`][QueryConfig] trait, which has associated types for the key/value of that particular query. Basically the code generated looks something like this: ```rust,ignore // Dummy struct representing a particular kind of query: pub struct type_of<'tcx> { data: PhantomData<&'tcx ()> } impl<'tcx> QueryConfig for type_of<'tcx> { type Key = DefId; type Value = Ty<'tcx>; const NAME: QueryName = QueryName::type_of; const CATEGORY: ProfileCategory = ProfileCategory::Other; } ``` There is an additional trait that you may wish to implement called [`self::config::QueryDescription`][QueryDescription]. This trait is used during cycle errors to give a "human readable" name for the query, so that we can summarize what was happening when the cycle occurred. Implementing this trait is optional if the query key is `DefId`, but if you *don't* implement it, you get a pretty generic error ("processing `foo`..."). You can put new impls into the `config` module. They look something like this: [QueryConfig]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/query/trait.QueryConfig.html [QueryDescription]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_query_system/query/config/trait.QueryDescription.html ```rust,ignore impl<'tcx> QueryDescription for queries::type_of<'tcx> { fn describe(tcx: TyCtxt, key: DefId) -> String { format!("computing the type of `{}`", tcx.def_path_str(key)) } } ``` Another option is to add `desc` modifier: ```rust,ignore rustc_queries! { Other { /// Records the type of every item. query type_of(key: DefId) -> Ty<'tcx> { desc { |tcx| "computing the type of `{}`", tcx.def_path_str(key) } } } } ``` `rustc_queries` macro will generate an appropriate `impl` automatically. [query-model]: queries/incremental-compilation-in-detail.md