add background material on trait queries

2018-03-10 05:32:47 -05:00 · 2018-03-10 05:32:47 -05:00 · c3a35021d6
parent 3c8a827a37
commit c3a35021d6
5 changed files with 241 additions and 5 deletions
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@ -23,12 +23,14 @@
    - [Specialization](./trait-specialization.md)
 - [Trait solving (new-style)](./traits.md)    
    - [Lowering to logic](./traits-lowering-to-logic.md)
-    - [Canonical queries](./traits-canonicalization.md)
      - [Goals and clauses](./traits-goals-and-clauses.md)
      - [Lowering rules](./traits-lowering-rules.md)
+    - [Canonical queries](./traits-canonical-queries.md)
+      - [Canonicalization](./traits-canonicalization.md)
    - [Equality and associated types](./traits-associated-types.md)
    - [Region constraints](./traits-regions.md)
    - [Well-formedness checking](./traits-wf.md)
+    - [The SLG solver](./traits-slg.md)
    - [Bibliography](./traits-bibliography.md)
 - [Type checking](./type-checking.md)
 - [The MIR (Mid-level IR)](./mir.md)
--- a/src/traits-canonical-queries.md
+++ b/src/traits-canonical-queries.md
@ -0,0 +1,228 @@
+# Canonical queries
+
+The "start" of the trait system is the **canonical query** (these are
+both queries in the more general sense of the word -- something you
+would like to know the answer to -- and in the
+[rustc-specific sense](./query.html)).  The idea is that the type
+checker or other parts of the system, may in the course of doing their
+thing want to know whether some trait is implemented for some type
+(e.g., is `u32: Debug` true?). Or they may want to
+[normalize some associated type](./traits-associated-types.html).
+
+This section covers queries at a fairly high level of abstraction. The
+subsections look a bit more closely at how these ideas are implemented
+in rustc.
+
+## The traditional, interactive Prolog query
+
+In a traditional Prolog system, when you start a query, the solver
+will run off and start supplying you with every possible answer it can
+find. So given something like this:
+
+    ?- Vec<i32>: AsRef<?U>
+    
+The solver might answer:
+
+    Vec<i32>: AsRef<[i32]>
+        continue? (y/n)
+
+This `continue` bit is interesting. The idea in Prolog is that the
+solver is finding **all possible** instantiations of your query that
+are true. In this case, if we instantiate `?U = [i32]`, then the query
+is true (note that a traditional Prolog interface does not, directly,
+tell us a value for `?U`, but we can infer one by unifying the
+response with our original query -- Rust's solver gives back a
+substitution instead). If we were to hit `y`, the solver might then
+give us another possible answer:
+
+    Vec<i32>: AsRef<Vec<i32>>
+        continue? (y/n)
+
+This answer derives from the fact that there is a reflexive impl
+(`impl<T> AsRef<T> for T`) for `AsRef`. If were to hit `y` again,
+then we might get back a negative response:
+
+    no
+
+Naturally, in some cases, there may be no possible answers, and hence
+the solver will just give me back `no` right away:
+
+    ?- Box<i32>: Copy
+        no
+
+In some cases, there might be an infinite number of responses. So for
+example if I gave this query, and I kept hitting `y`, then the solver
+would never stop giving me back answers:
+
+    ?- Vec<?U>: Clone
+       Vec<i32>: Clone
+         continue? (y/n)
+       Vec<Box<i32>>: Clone
+         continue? (y/n)
+       Vec<Box<Box<i32>>>: Clone
+         continue? (y/n)
+       Vec<Box<Box<Box<i32>>>>: Clone
+         continue? (y/n)
+
+As you can imagine, the solver will gleefully keep adding another
+layer of `Box` until we ask it to stop, or it runs out of memory.
+
+Another interesting thing is that queries might still have variables
+in them. For example:
+
+    ?- Rc<?T>: Clone
+    
+might produce the answer:
+
+    Rc<?T>: Clone
+        continue? (y/n)
+        
+After all, `Rc<?T>` is true **no matter what type `?T` is**.        
+
+## A trait query in rustc
+
+The trait queries in rustc work somewhat differently. Instead of
+trying to enumerate **all possible** answers for you, they are looking
+for an **unambiguous** answer. In particular, when they tells you the
+value for a type variable, that means that this is the **only possible
+instantiation** that you could use, given the current set of impls and
+where-clauses, that would be provable. (Internally within the solver,
+though, they can potentially enumerate all possible answers. See
+[the description of the SLG solver](./traits-slg.html) for details.)
+
+The response to a trait query in rustc is typically a
+`Result<QueryResult<T>, NoSolution>` (where the `T` will vary a bit
+depending on the query itself). The `Err(NoSolution)` case indicates
+that the query was false and had no answers (e.g., `Box<i32>: Copy`).
+Otherwise, the `QueryResult` gives back information about the possible answer(s)
+we did find. It consists of four parts:
+
+- **Certainty:** tells you how sure we are of this answer. It can have two values:
+  - `Proven` means that the result is known to be true.
+    - This might be the result for trying to prove `Vec<i32>: Clone`,
+      say, or `Rc<?T>: Clone`.
+  - `Ambiguous` means that there were things we could not yet prove to
+    be either true *or* false, typically because more type information
+    was needed. (We'll see an example shortly.)
+    - This might be the result for trying to prove `Vec<?T>: Clone`.
+- **Var values:** Values for each of the unbound inference variables
+  (like `?T`) that appeared in your original query. (Remember that in Prolog,
+  we had to infer these.)
+  - As we'll see in the example below, we can get back var values even
+    for `Ambiguous` cases.
+- **Region constraints:** these are relations that must hold between
+  the lifetimes that you supplied as inputs. We'll ignore these here,
+  but see the
+  [section on handling regions in traits](./traits-regions.html) for
+  more details.
+- **Value:** The query result also comes with a value of type `T`. For
+  some specialized queries -- like normalizing associated types --
+  this is used to carry back an extra result, but it's often just
+  `()`.
+
+### Examples
+
+Let's work through an example query to see what all the parts mean.
+Consider [the `Borrow` trait][borrow]. This trait has a number of
+impls; among them, there are these two (for clarify, I've written the
+`Sized` bounds explicitly):
+
+[borrow]: https://doc.rust-lang.org/std/borrow/trait.Borrow.html
+
+```rust
+impl<T> Borrow<T> for T where T: ?Sized
+impl<T> Borrow<[T]> for Vec<T> where T: Sized
+```
+
+**Example 1.** Imagine we are type-checking this (rather artificial)
+bit of code:
+
+```rust
+fn foo<A, B>(a: A, vec_b: Option<B>) where A: Borrow<B> { }
+
+fn main() {
+    let mut t: Vec<_> = vec![]; // Type: Vec<?T>
+    let mut u: Option<_> = None; // Type: Option<?U>
+    foo(t, u); // Example 1: requires `Vec<?T>: Borrow<?U>`
+    ...
+}
+```
+
+As the comments indicate, we first create two variables `t` and `u`;
+`t` is an empty vector and `u` is a `None` option. Both of these
+variables have unbound inference variables in their type: `?T`
+represents the elements in the vector `t` and `?U` represents the
+value stored in the option `u`.  Next, we invoke `foo`; comparing the
+signature of `foo` to its arguments, we wind up with `A = Vec<?T>` and
+`B = ?U`.Therefore, the where clause on `foo` requires that `Vec<?T>:
+Borrow<?U>`. This is thus our first example trait query.
+
+There are many possible solutions to the query `Vec<?T>: Borrow<?U>`;
+for example:
+
+- `?U = Vec<?T>`,
+- `?U = [?T]`,
+- `?T = u32, ?U = [u32]`
+- and so forth.
+
+Therefore, the result we get back would be as follows (I'm going to
+ignore region constraints and the "value"):
+
+- Certainty: `Ambiguous` -- we're not sure yet if this holds
+- Var values: `[?T = ?T, ?U = ?U]` -- we learned nothing about the values of the variables
+
+In short, the query result says that it is too soon to say much about
+whether this trait is proven. During type-checking, this is not an
+immediate error: instead, the type checker would hold on to this
+requirement (`Vec<?T>: Borrow<?U>`) and wait. As we'll see in the next
+example, it may happen that `?T` and `?U` wind up constrained from
+other sources, in which case we can try the trait query again.
+
+**Example 2.** We can now extend our previous example a bit,
+and assign a value to `u`:
+
+```rust
+fn foo<A, B>(a: A, vec_b: Option<B>) where A: Borrow<B> { }
+
+fn main() {
+    // What we saw before:
+    let mut t: Vec<_> = vec![]; // Type: Vec<?T>
+    let mut u: Option<_> = None; // Type: Option<?U>
+    foo(t, u); // `Vec<?T>: Borrow<?U>` => ambiguous
+    
+    // New stuff:
+    u = Some(vec![]); // ?U = Vec<?V>
+}
+```
+
+As a result of this assignment, the type of `u` is forced to be
+`Option<Vec<?V>>`, where `?V` represents the element type of the
+vector. This in turn implies that `?U` is [unified] to `Vec<?V>`.
+
+[unified]: ./type-checking.html
+
+Let's suppose that the type checker decides to revisit the
+"as-yet-unproven" trait obligation we saw before, `Vec<?T>:
+Borrow<?U>`. `?U` is no longer an unbound inference variable; it now
+has a value, &. So, if we "refresh" the query with that value, we get:
+
+    Vec<?T>: Borrow<Vec<?V>>
+    
+This time, there is only one impl that applies, the reflexive impl:
+
+    impl<T> Borrow<T> for T where T: ?Sized
+
+Therefore, the trait checker will answer:
+
+- Certainty: `Proven`
+- Var values: `[?T = ?T, ?V = ?T]`
+
+Here, it is saying that we have indeed proven that the obligation
+holds, and we also know that `?T` and `?V` are the same type (but we
+don't know what that type is yet!).
+
+(In fact, as the function ends here, the type checker would give an
+error at this point, since the element types of `t` and `u` are still
+not yet known, even though they are known to be the same.)
+
+
--- a/src/traits-canonicalization.md
+++ b/src/traits-canonicalization.md
@ -1,7 +1,10 @@
 # Canonicalization

 Canonicalization is the process of **isolating** an inference value
-from its context. It is really based on a very simple concept: every
+from its context. It is a key part of implementing
+[canonical queries][cq].
+
+Canonicalization is really based on a very simple concept: every
 [inference variable](./type-inference.html#vars) is always in one of
 two states: either it is **unbound**, in which case we don't know yet
 what type it is, or it is **bound**, in which case we do. So to
@ -12,6 +15,8 @@ starting from zero and numbered in a fixed order (left to right, for
 the most part, but really it doesn't matter as long as it is
 consistent).

+[cq]: ./traits-canonical-queries.html
+
 So, for example, if we have the type `X = (?T, ?U)`, where `?T` and
 `?U` are distinct, unbound inference variables, then the canonical
 form of `X` would be `(?0, ?1)`, where `?0` and `?1` represent these
--- a/src/traits-slg.md
+++ b/src/traits-slg.md
@ -0,0 +1 @@
+# The SLG solver
--- a/src/traits.md
+++ b/src/traits.md
@ -24,7 +24,7 @@ Trait solving is based around a few key ideas:
 - [Lazy normalization](./traits-associated-types.html), which is the
  technique we use to accommodate associated types when figuring out
  whether types are equal.
- [Region constraints](./traits-regions.md), which are accumulated
+- [Region constraints](./traits-regions.html), which are accumulated
  during trait solving but mostly ignored. This means that trait
  solving effectively ignores the precise regions involved, always --
  but we still remember the constraints on them so that those