From c3a35021d6e607ecc7b05573875b49d476ded8f6 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Sat, 10 Mar 2018 05:32:47 -0500 Subject: [PATCH] add background material on trait queries --- src/SUMMARY.md | 8 +- src/traits-canonical-queries.md | 228 ++++++++++++++++++++++++++++++++ src/traits-canonicalization.md | 7 +- src/traits-slg.md | 1 + src/traits.md | 2 +- 5 files changed, 241 insertions(+), 5 deletions(-) create mode 100644 src/traits-canonical-queries.md create mode 100644 src/traits-slg.md diff --git a/src/SUMMARY.md b/src/SUMMARY.md index 9c1920d3..95a5ee57 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -23,12 +23,14 @@ - [Specialization](./trait-specialization.md) - [Trait solving (new-style)](./traits.md) - [Lowering to logic](./traits-lowering-to-logic.md) - - [Canonical queries](./traits-canonicalization.md) - - [Goals and clauses](./traits-goals-and-clauses.md) - - [Lowering rules](./traits-lowering-rules.md) + - [Goals and clauses](./traits-goals-and-clauses.md) + - [Lowering rules](./traits-lowering-rules.md) + - [Canonical queries](./traits-canonical-queries.md) + - [Canonicalization](./traits-canonicalization.md) - [Equality and associated types](./traits-associated-types.md) - [Region constraints](./traits-regions.md) - [Well-formedness checking](./traits-wf.md) + - [The SLG solver](./traits-slg.md) - [Bibliography](./traits-bibliography.md) - [Type checking](./type-checking.md) - [The MIR (Mid-level IR)](./mir.md) diff --git a/src/traits-canonical-queries.md b/src/traits-canonical-queries.md new file mode 100644 index 00000000..ec861eb4 --- /dev/null +++ b/src/traits-canonical-queries.md @@ -0,0 +1,228 @@ +# Canonical queries + +The "start" of the trait system is the **canonical query** (these are +both queries in the more general sense of the word -- something you +would like to know the answer to -- and in the +[rustc-specific sense](./query.html)). The idea is that the type +checker or other parts of the system, may in the course of doing their +thing want to know whether some trait is implemented for some type +(e.g., is `u32: Debug` true?). Or they may want to +[normalize some associated type](./traits-associated-types.html). + +This section covers queries at a fairly high level of abstraction. The +subsections look a bit more closely at how these ideas are implemented +in rustc. + +## The traditional, interactive Prolog query + +In a traditional Prolog system, when you start a query, the solver +will run off and start supplying you with every possible answer it can +find. So given something like this: + + ?- Vec: AsRef + +The solver might answer: + + Vec: AsRef<[i32]> + continue? (y/n) + +This `continue` bit is interesting. The idea in Prolog is that the +solver is finding **all possible** instantiations of your query that +are true. In this case, if we instantiate `?U = [i32]`, then the query +is true (note that a traditional Prolog interface does not, directly, +tell us a value for `?U`, but we can infer one by unifying the +response with our original query -- Rust's solver gives back a +substitution instead). If we were to hit `y`, the solver might then +give us another possible answer: + + Vec: AsRef> + continue? (y/n) + +This answer derives from the fact that there is a reflexive impl +(`impl AsRef for T`) for `AsRef`. If were to hit `y` again, +then we might get back a negative response: + + no + +Naturally, in some cases, there may be no possible answers, and hence +the solver will just give me back `no` right away: + + ?- Box: Copy + no + +In some cases, there might be an infinite number of responses. So for +example if I gave this query, and I kept hitting `y`, then the solver +would never stop giving me back answers: + + ?- Vec: Clone + Vec: Clone + continue? (y/n) + Vec>: Clone + continue? (y/n) + Vec>>: Clone + continue? (y/n) + Vec>>>: Clone + continue? (y/n) + +As you can imagine, the solver will gleefully keep adding another +layer of `Box` until we ask it to stop, or it runs out of memory. + +Another interesting thing is that queries might still have variables +in them. For example: + + ?- Rc: Clone + +might produce the answer: + + Rc: Clone + continue? (y/n) + +After all, `Rc` is true **no matter what type `?T` is**. + +## A trait query in rustc + +The trait queries in rustc work somewhat differently. Instead of +trying to enumerate **all possible** answers for you, they are looking +for an **unambiguous** answer. In particular, when they tells you the +value for a type variable, that means that this is the **only possible +instantiation** that you could use, given the current set of impls and +where-clauses, that would be provable. (Internally within the solver, +though, they can potentially enumerate all possible answers. See +[the description of the SLG solver](./traits-slg.html) for details.) + +The response to a trait query in rustc is typically a +`Result, NoSolution>` (where the `T` will vary a bit +depending on the query itself). The `Err(NoSolution)` case indicates +that the query was false and had no answers (e.g., `Box: Copy`). +Otherwise, the `QueryResult` gives back information about the possible answer(s) +we did find. It consists of four parts: + +- **Certainty:** tells you how sure we are of this answer. It can have two values: + - `Proven` means that the result is known to be true. + - This might be the result for trying to prove `Vec: Clone`, + say, or `Rc: Clone`. + - `Ambiguous` means that there were things we could not yet prove to + be either true *or* false, typically because more type information + was needed. (We'll see an example shortly.) + - This might be the result for trying to prove `Vec: Clone`. +- **Var values:** Values for each of the unbound inference variables + (like `?T`) that appeared in your original query. (Remember that in Prolog, + we had to infer these.) + - As we'll see in the example below, we can get back var values even + for `Ambiguous` cases. +- **Region constraints:** these are relations that must hold between + the lifetimes that you supplied as inputs. We'll ignore these here, + but see the + [section on handling regions in traits](./traits-regions.html) for + more details. +- **Value:** The query result also comes with a value of type `T`. For + some specialized queries -- like normalizing associated types -- + this is used to carry back an extra result, but it's often just + `()`. + +### Examples + +Let's work through an example query to see what all the parts mean. +Consider [the `Borrow` trait][borrow]. This trait has a number of +impls; among them, there are these two (for clarify, I've written the +`Sized` bounds explicitly): + +[borrow]: https://doc.rust-lang.org/std/borrow/trait.Borrow.html + +```rust +impl Borrow for T where T: ?Sized +impl Borrow<[T]> for Vec where T: Sized +``` + +**Example 1.** Imagine we are type-checking this (rather artificial) +bit of code: + +```rust +fn foo(a: A, vec_b: Option) where A: Borrow { } + +fn main() { + let mut t: Vec<_> = vec![]; // Type: Vec + let mut u: Option<_> = None; // Type: Option + foo(t, u); // Example 1: requires `Vec: Borrow` + ... +} +``` + +As the comments indicate, we first create two variables `t` and `u`; +`t` is an empty vector and `u` is a `None` option. Both of these +variables have unbound inference variables in their type: `?T` +represents the elements in the vector `t` and `?U` represents the +value stored in the option `u`. Next, we invoke `foo`; comparing the +signature of `foo` to its arguments, we wind up with `A = Vec` and +`B = ?U`.Therefore, the where clause on `foo` requires that `Vec: +Borrow`. This is thus our first example trait query. + +There are many possible solutions to the query `Vec: Borrow`; +for example: + +- `?U = Vec`, +- `?U = [?T]`, +- `?T = u32, ?U = [u32]` +- and so forth. + +Therefore, the result we get back would be as follows (I'm going to +ignore region constraints and the "value"): + +- Certainty: `Ambiguous` -- we're not sure yet if this holds +- Var values: `[?T = ?T, ?U = ?U]` -- we learned nothing about the values of the variables + +In short, the query result says that it is too soon to say much about +whether this trait is proven. During type-checking, this is not an +immediate error: instead, the type checker would hold on to this +requirement (`Vec: Borrow`) and wait. As we'll see in the next +example, it may happen that `?T` and `?U` wind up constrained from +other sources, in which case we can try the trait query again. + +**Example 2.** We can now extend our previous example a bit, +and assign a value to `u`: + +```rust +fn foo(a: A, vec_b: Option) where A: Borrow { } + +fn main() { + // What we saw before: + let mut t: Vec<_> = vec![]; // Type: Vec + let mut u: Option<_> = None; // Type: Option + foo(t, u); // `Vec: Borrow` => ambiguous + + // New stuff: + u = Some(vec![]); // ?U = Vec +} +``` + +As a result of this assignment, the type of `u` is forced to be +`Option>`, where `?V` represents the element type of the +vector. This in turn implies that `?U` is [unified] to `Vec`. + +[unified]: ./type-checking.html + +Let's suppose that the type checker decides to revisit the +"as-yet-unproven" trait obligation we saw before, `Vec: +Borrow`. `?U` is no longer an unbound inference variable; it now +has a value, &. So, if we "refresh" the query with that value, we get: + + Vec: Borrow> + +This time, there is only one impl that applies, the reflexive impl: + + impl Borrow for T where T: ?Sized + +Therefore, the trait checker will answer: + +- Certainty: `Proven` +- Var values: `[?T = ?T, ?V = ?T]` + +Here, it is saying that we have indeed proven that the obligation +holds, and we also know that `?T` and `?V` are the same type (but we +don't know what that type is yet!). + +(In fact, as the function ends here, the type checker would give an +error at this point, since the element types of `t` and `u` are still +not yet known, even though they are known to be the same.) + + diff --git a/src/traits-canonicalization.md b/src/traits-canonicalization.md index 9b71c847..94b1d6b8 100644 --- a/src/traits-canonicalization.md +++ b/src/traits-canonicalization.md @@ -1,7 +1,10 @@ # Canonicalization Canonicalization is the process of **isolating** an inference value -from its context. It is really based on a very simple concept: every +from its context. It is a key part of implementing +[canonical queries][cq]. + +Canonicalization is really based on a very simple concept: every [inference variable](./type-inference.html#vars) is always in one of two states: either it is **unbound**, in which case we don't know yet what type it is, or it is **bound**, in which case we do. So to @@ -12,6 +15,8 @@ starting from zero and numbered in a fixed order (left to right, for the most part, but really it doesn't matter as long as it is consistent). +[cq]: ./traits-canonical-queries.html + So, for example, if we have the type `X = (?T, ?U)`, where `?T` and `?U` are distinct, unbound inference variables, then the canonical form of `X` would be `(?0, ?1)`, where `?0` and `?1` represent these diff --git a/src/traits-slg.md b/src/traits-slg.md new file mode 100644 index 00000000..cdd52ece --- /dev/null +++ b/src/traits-slg.md @@ -0,0 +1 @@ +# The SLG solver diff --git a/src/traits.md b/src/traits.md index 18cb0c64..6994f33d 100644 --- a/src/traits.md +++ b/src/traits.md @@ -24,7 +24,7 @@ Trait solving is based around a few key ideas: - [Lazy normalization](./traits-associated-types.html), which is the technique we use to accommodate associated types when figuring out whether types are equal. -- [Region constraints](./traits-regions.md), which are accumulated +- [Region constraints](./traits-regions.html), which are accumulated during trait solving but mostly ignored. This means that trait solving effectively ignores the precise regions involved, always -- but we still remember the constraints on them so that those