Merge pull request #18 from nikomatsakis/master

remove chap-NNN labels, move some content from rustc
2018-01-19 06:52:49 -05:00 · 2018-01-19 06:52:49 -05:00 · 50978bf706
parent de5f18e01a 141528264b
commit 50978bf706
21 changed files with 822 additions and 19 deletions
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@ -1,19 +1,20 @@
 # Summary

-- [How to build the compiler and run what you built](./chap-010-how-to-build-and-run.md)
-- [Using the compiler testing framework](./chap-020-running-tests.md)
-- [Walkthrough: a typical contribution](./chap-030-walkthrough.md)
-- [Conventions used in the compiler](./chap-040-compiler-conventions.md)
-- [The parser](./chap-050-the-parser.md)
-- [Macro expansion](./chap-060-macro-expansion.md)
-- [Name resolution](./chap-070-name-resolution.md)
-- [HIR lowering](./chap-080-hir-lowering.md)
-- [Representing types (`ty` module in depth)](./chap-090-ty.md)
-- [Type inference](./chap-100-type-inference.md)
-- [Trait resolution](./chap-110-trait-resolution.md)
-- [Type checking](./chap-120-type-checking.md)
-- [MIR construction](./chap-130-mir-construction.md)
-- [MIR borrowck](./chap-140-mir-borrowck.md)
-- [MIR optimizations](./chap-150-mir-optimizations.md)
-- [trans: generating LLVM IR](./chap-160-trans.md)
+- [About this guide](./about-this-guide.md)
+- [How to build the compiler and run what you built](./how-to-build-and-run.md)
+- [Using the compiler testing framework](./running-tests.md)
+- [Walkthrough: a typical contribution](./walkthrough.md)
+- [High-level overview of the compiler source](./high-level-overview.md)
+- [The parser](./the-parser.md)
+- [Macro expansion](./macro-expansion.md)
+- [Name resolution](./name-resolution.md)
+- [HIR lowering](./hir-lowering.md)
+- [The `ty` module: representing types](./ty.md)
+- [Type inference](./type-inference.md)
+- [Trait resolution](./trait-resolution.md)
+- [Type checking](./type-checking.md)
+- [MIR construction](./mir-construction.md)
+- [MIR borrowck](./mir-borrowck.md)
+- [MIR optimizations](./mir-optimizations.md)
+- [trans: generating LLVM IR](./trans.md)
 - [Glossary](./glossary.md)
--- a/src/about-this-guide.md
+++ b/src/about-this-guide.md
@ -0,0 +1,14 @@
+# About this guide
+
+This guide is meant to help document how rustc -- the Rust compiler --
+works, as well as to help new contributors get involved in rustc
+development. It is not meant to replace code documentation -- each
+chapter gives only high-level details, the kinds of things that
+(ideally) don't change frequently.
+
+The guide itself is of course open source as well, and the sources can
+be found at [the GitHub repository]. If you find any mistakes in the
+guide, please file an issue about it -- or, even better, open a PR
+with a correction!
+
+[the GitHub repository]: https://github.com/rust-lang-nursery/rustc-guide/
--- a/src/chap-040-compiler-conventions.md
+++ b/src/chap-040-compiler-conventions.md
@ -1 +0,0 @@
-# Conventions used in the compiler
--- a/src/chap-090-ty.md
+++ b/src/chap-090-ty.md
@ -1 +0,0 @@
-# Representing types (`ty` module in depth)
--- a/src/chap-110-trait-resolution.md
+++ b/src/chap-110-trait-resolution.md
@ -1 +0,0 @@
-# Trait resolution
--- a/src/high-level-overview.md
+++ b/src/high-level-overview.md
@ -0,0 +1,141 @@
+# High-level overview of the compiler source
+
+## Crate structure
+
+The main Rust repository consists of a `src` directory, under which
+there live many crates. These crates contain the sources for the
+standard library and the compiler.  This document, of course, focuses
+on the latter.
+
+Rustc consists of a number of crates, including `syntax`,
+`rustc`, `rustc_back`, `rustc_trans`, `rustc_driver`, and
+many more. The source for each crate can be found in a directory
+like `src/libXXX`, where `XXX` is the crate name.
+
+(NB. The names and divisions of these crates are not set in
+stone and may change over time -- for the time being, we tend towards
+a finer-grained division to help with compilation time, though as
+incremental compilation improves that may change.)
+
+The dependency structure of these crates is roughly a diamond:
+
+```
+                  rustc_driver
+                /      |       \
+              /        |         \
+            /          |           \
+          /            v             \
+rustc_trans    rustc_borrowck   ...  rustc_metadata
+          \            |            /
+            \          |          /
+              \        |        /
+                \      v      /
+                    rustc
+                       |
+                       v
+                    syntax
+                    /    \
+                  /       \
+           syntax_pos  syntax_ext
+```
+
+The `rustc_driver` crate, at the top of this lattice, is effectively
+the "main" function for the Rust compiler. It doesn't have much "real
+code", but instead ties together all of the code defined in the other
+crates and defines the overall flow of execution. (As we transition
+more and more to the [query model](ty/maps/README.md), however, the
+"flow" of compilation is becoming less centrally defined.)
+
+At the other extreme, the `rustc` crate defines the common and
+pervasive data structures that all the rest of the compiler uses
+(e.g., how to represent types, traits, and the program itself). It
+also contains some amount of the compiler itself, although that is
+relatively limited.
+
+Finally, all the crates in the bulge in the middle define the bulk of
+the compiler -- they all depend on `rustc`, so that they can make use
+of the various types defined there, and they export public routines
+that `rustc_driver` will invoke as needed (more and more, what these
+crates export are "query definitions", but those are covered later
+on).
+
+Below `rustc` lie various crates that make up the parser and error
+reporting mechanism. For historical reasons, these crates do not have
+the `rustc_` prefix, but they are really just as much an internal part
+of the compiler and not intended to be stable (though they do wind up
+getting used by some crates in the wild, a practice we hope to
+gradually phase out).
+
+Each crate has a `README.md` file that describes, at a high level,
+what it contains, and tries to give some kind of explanation (some
+better than others).
+
+## The main stages of compilation
+
+The Rust compiler is in a bit of transition right now. It used to be a
+purely "pass-based" compiler, where we ran a number of passes over the
+entire program, and each did a particular check or transformation. We
+are gradually replacing this pass-based code with an alternative setup
+based on on-demand **queries**. In the query model, we work backwards,
+executing a *query* that expresses our ultimate goal (e.g., "compile
+this crate"). This query in turn may make other queries (e.g., "get me
+a list of all modules in the crate"). Those queries make other queries
+that ultimately bottom out in the base operations, like parsing the
+input, running the type-checker, and so forth. This on-demand model
+permits us to do exciting things like only do the minimal amount of
+work needed to type-check a single function. It also helps with
+incremental compilation. (For details on defining queries, check out
+`src/librustc/ty/maps/README.md`.)
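+
+To make the on-demand idea concrete, here is a minimal sketch of a
+memoizing query context. Everything in it (`Queries`, `parse`,
+`type_check`, `demo`) is a hypothetical illustration and not rustc's
+actual query machinery; it only shows that a high-level query pulls in
+just the work it needs and that repeated requests hit a cache.
+
+```rust
+use std::collections::HashMap;
+
+/// Toy query context. All names here are hypothetical.
+pub struct Queries {
+    source: HashMap<&'static str, &'static str>,
+    parsed: HashMap<&'static str, usize>,
+    /// Counts how often the base "parse" operation really ran.
+    pub parse_runs: usize,
+}
+
+impl Queries {
+    pub fn new(source: HashMap<&'static str, &'static str>) -> Self {
+        Queries { source, parsed: HashMap::new(), parse_runs: 0 }
+    }
+
+    /// Base operation: "parse" one function by counting its tokens.
+    fn parse(&mut self, name: &'static str) -> usize {
+        if let Some(&n) = self.parsed.get(name) {
+            return n; // cache hit: no work is repeated
+        }
+        self.parse_runs += 1;
+        let n = self.source[name].split_whitespace().count();
+        self.parsed.insert(name, n);
+        n
+    }
+
+    /// Higher-level query: demands only the parse of the function it needs.
+    pub fn type_check(&mut self, name: &'static str) -> bool {
+        self.parse(name) > 0
+    }
+}
+
+/// Type-checks `foo` twice and reports how many parses actually ran.
+pub fn demo() -> usize {
+    let mut src = HashMap::new();
+    src.insert("foo", "fn foo ( ) { }");
+    src.insert("bar", "fn bar ( ) { }");
+    let mut q = Queries::new(src);
+    q.type_check("foo");
+    q.type_check("foo"); // second request is answered from the cache
+    q.parse_runs // `bar` was never parsed at all
+}
+```
+
+In this sketch `demo()` reports a single parse: `bar` is never
+touched, and the second request for `foo` does no new work.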
+
+Regardless of the general setup, the basic operations that the
+compiler must perform are the same. The only thing that changes is
+whether these operations are invoked front-to-back, or on demand.  In
+order to compile a Rust crate, these are the general steps that we
+take:
+
+1. **Parsing input**
+    - This processes the `.rs` files and produces the AST ("abstract syntax tree").
+    - The AST is defined in `syntax/ast.rs`. It is intended to match the lexical
+      syntax of the Rust language quite closely.
+2. **Name resolution, macro expansion, and configuration**
+    - Once parsing is complete, we process the AST recursively, resolving paths
+      and expanding macros. This same process also processes `#[cfg]` nodes, and hence
+      may strip things out of the AST as well.
+3. **Lowering to HIR**
+    - Once name resolution completes, we convert the AST into the HIR,
+      or "high-level IR". The HIR is defined in `src/librustc/hir/`; that module also includes
+      the lowering code.
+    - The HIR is a lightly desugared variant of the AST. It is more processed than the
+      AST and more suitable for the analyses that follow. It is **not** required to match
+      the syntax of the Rust language.
+    - As a simple example, in the **AST**, we preserve the parentheses
+      that the user wrote, so `((1 + 2) + 3)` and `1 + 2 + 3` parse
+      into distinct trees, even though they are equivalent. In the
+      HIR, however, parentheses nodes are removed, and those two
+      expressions are represented in the same way.
+4. **Type-checking and subsequent analyses**
+    - An important step in processing the HIR is to perform type
+      checking. This process assigns types to every HIR expression,
+      for example, and also is responsible for resolving some
+      "type-dependent" paths, such as field accesses (`x.f` -- we
+      can't know what field `f` is being accessed until we know the
+      type of `x`) and associated type references (`T::Item` -- we
+      can't know what type `Item` is until we know what `T` is).
+    - Type checking creates "side-tables" (`TypeckTables`) that include
+      the types of expressions, the way to resolve methods, and so forth.
+    - After type-checking, we can do other analyses, such as privacy checking.
+5. **Lowering to MIR and post-processing**
+    - Once type-checking is done, we can lower the HIR into MIR ("middle IR"), which
+      is a **very** desugared version of Rust, well suited to the borrowck but also
+      certain high-level optimizations.
+6. **Translation to LLVM and LLVM optimizations**
+    - From MIR, we can produce LLVM IR.
+    - LLVM then runs its various optimizations, which produces a number of `.o` files
+      (one for each "codegen unit").
+7. **Linking**
+    - Finally, those `.o` files are linked together.
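+
+The parentheses example from the HIR lowering step can be sketched
+with two toy tree types. The `Ast`, `Hir`, and `lower` names below are
+made up for illustration and are far simpler than the real definitions
+in `syntax/ast.rs` and `src/librustc/hir/`; the point is only that the
+two inputs differ as ASTs but lower to the same HIR.
+
+```rust
+/// Toy AST: keeps the parentheses the user wrote.
+#[derive(Debug, PartialEq)]
+pub enum Ast {
+    Lit(i64),
+    Add(Box<Ast>, Box<Ast>),
+    Paren(Box<Ast>),
+}
+
+/// Toy HIR: has no parenthesis node at all.
+#[derive(Debug, PartialEq)]
+pub enum Hir {
+    Lit(i64),
+    Add(Box<Hir>, Box<Hir>),
+}
+
+/// Lowering drops `Paren` nodes and keeps everything else.
+pub fn lower(ast: &Ast) -> Hir {
+    match ast {
+        Ast::Lit(n) => Hir::Lit(*n),
+        Ast::Add(l, r) => Hir::Add(Box::new(lower(l)), Box::new(lower(r))),
+        Ast::Paren(inner) => lower(inner),
+    }
+}
+
+/// `1 + 2 + 3`: left-associative, with no `Paren` nodes.
+pub fn unparenthesized() -> Ast {
+    Ast::Add(
+        Box::new(Ast::Add(Box::new(Ast::Lit(1)), Box::new(Ast::Lit(2)))),
+        Box::new(Ast::Lit(3)),
+    )
+}
+
+/// `((1 + 2) + 3)`: same shape, but both pairs of parentheses are kept.
+pub fn parenthesized() -> Ast {
+    Ast::Paren(Box::new(Ast::Add(
+        Box::new(Ast::Paren(Box::new(Ast::Add(
+            Box::new(Ast::Lit(1)),
+            Box::new(Ast::Lit(2)),
+        )))),
+        Box::new(Ast::Lit(3)),
+    )))
+}
+```
+
+Comparing `parenthesized()` with `unparenthesized()` shows two
+distinct ASTs, while `lower` maps both to the same `Hir` value.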
+
+
+
+
+The first thing you may wonder if 
--- a/src/chap-080-hir-lowering.md
+++ b/src/chap-080-hir-lowering.md
--- a/src/chap-010-how-to-build-and-run.md
+++ b/src/chap-010-how-to-build-and-run.md
--- a/src/chap-060-macro-expansion.md
+++ b/src/chap-060-macro-expansion.md
--- a/src/chap-140-mir-borrowck.md
+++ b/src/chap-140-mir-borrowck.md
--- a/src/chap-130-mir-construction.md
+++ b/src/chap-130-mir-construction.md
--- a/src/chap-150-mir-optimizations.md
+++ b/src/chap-150-mir-optimizations.md
--- a/src/chap-070-name-resolution.md
+++ b/src/chap-070-name-resolution.md
--- a/src/chap-020-running-tests.md
+++ b/src/chap-020-running-tests.md
--- a/src/chap-050-the-parser.md
+++ b/src/chap-050-the-parser.md
--- a/src/trait-resolution.md
+++ b/src/trait-resolution.md
@ -0,0 +1,485 @@
+# Trait resolution
+
+This document describes the general process and points out some non-obvious
+things.
+
+**WARNING:** This material was moved verbatim from a rustc README, so
+it may not "fit" the style of the guide until it is adapted.
+
+## Major concepts
+
+Trait resolution is the process of pairing up an impl with each
+reference to a trait. So, for example, if there is a generic function like:
+
+```rust
+fn clone_slice<T:Clone>(x: &[T]) -> Vec<T> { /*...*/ }
+```
+
+and then a call to that function:
+
+```rust
+let v: Vec<isize> = clone_slice(&[1, 2, 3]);
+```
+
+it is the job of trait resolution to figure out whether there exists
+an impl of (in this case) `isize : Clone`.
+
+Note that in some cases, like generic functions, we may not be able to
+find a specific impl, but we can figure out that the caller must
+provide an impl. To see what I mean, consider the body of `clone_slice`:
+
+```rust
+fn clone_slice<T:Clone>(x: &[T]) -> Vec<T> {
+    let mut v = Vec::new();
+    for e in x {
+        v.push((*e).clone()); // (*)
+    }
+    v // return the cloned elements
+}
+```
+
+The line marked `(*)` is only legal if `T` (the type of `*e`)
+implements the `Clone` trait. Naturally, since we don't know what `T`
+is, we can't find the specific impl; but based on the bound `T:Clone`,
+we can say that there exists an impl which the caller must provide.
+
+We use the term *obligation* to refer to a trait reference in need of
+an impl.
+
+## Overview
+
+Trait resolution consists of three major parts:
+
+- SELECTION: Deciding how to resolve a specific obligation. For
+  example, selection might decide that a specific obligation can be
+  resolved by employing an impl which matches the self type, or by
+  using a parameter bound. In the case of an impl, selecting one
+  obligation can create *nested obligations* because of where clauses
+  on the impl itself. It may also require evaluating those nested
+  obligations to resolve ambiguities.
+
+- FULFILLMENT: The fulfillment code is what tracks that obligations
+  are completely fulfilled. Basically it is a worklist of obligations
+  to be selected: once selection is successful, the obligation is
+  removed from the worklist and any nested obligations are enqueued.
+
+- COHERENCE: The coherence checks are intended to ensure that there
+  are never overlapping impls, where two impls could be used with
+  equal precedence.
+
+## Selection
+
+Selection is the process of deciding whether an obligation can be
+resolved and, if so, how it is to be resolved (via impl, where clause, etc).
+The main interface is the `select()` function, which takes an obligation
+and returns a `SelectionResult`. There are three possible outcomes:
+
+- `Ok(Some(selection))` -- yes, the obligation can be resolved, and
+  `selection` indicates how. If the obligation was resolved via an impl,
+  then `selection` may also indicate nested obligations that are required
+  by the impl.
+
+- `Ok(None)` -- we are not yet sure whether the obligation can be
+  resolved or not. This happens most commonly when the obligation
+  contains unbound type variables.
+
+- `Err(err)` -- the obligation definitely cannot be resolved due to a
+  type error, or because there are no impls that could possibly apply,
+  etc.
+
+The basic algorithm for selection is broken into two big phases:
+candidate assembly and confirmation.
+
+### Candidate assembly
+
+Candidate assembly searches for impls/where-clauses/etc that might
+possibly be used to satisfy the obligation. Each of those is called
+a candidate. To avoid ambiguity, we want to find exactly one
+candidate that is definitively applicable. In some cases, we may not
+know whether an impl/where-clause applies or not -- this occurs when
+the obligation contains unbound inference variables.
+
+The basic idea for candidate assembly is to do a first pass in which
+we identify all possible candidates. During this pass, all that we do
+is try and unify the type parameters. (In particular, we ignore any
+nested where clauses.) Presuming that this unification succeeds, the
+impl is added as a candidate.
+
+Once this first pass is done, we can examine the set of candidates. If
+it is a singleton set, then we are done: this is the only impl in
+scope that could possibly apply. Otherwise, we can winnow down the set
+of candidates by using where clauses and other conditions. If this
+reduced set yields a single, unambiguous entry, we're good to go;
+otherwise, the result is considered ambiguous.
+
+#### The basic process: Inferring based on the impls we see
+
+This process is easier if we work through some examples. Consider
+the following trait:
+
+```rust
+trait Convert<Target> {
+    fn convert(&self) -> Target;
+}
+```
+
+This trait just has one method. It's about as simple as it gets. It
+converts from the (implicit) `Self` type to the `Target` type. If we
+wanted to permit conversion between `isize` and `usize`, we might
+implement `Convert` like so:
+
+```rust
+impl Convert<usize> for isize { /*...*/ } // isize -> usize
+impl Convert<isize> for usize { /*...*/ } // usize -> isize
+```
+
+Now imagine there is some code like the following:
+
+```rust
+let x: isize = ...;
+let y = x.convert();
+```
+
+The call to convert will generate a trait reference `Convert<$Y> for
+isize`, where `$Y` is the type variable representing the type of
+`y`. When we match this against the two impls we can see, we will find
+that only one remains: `Convert<usize> for isize`. Therefore, we can
+select this impl, which will cause the type of `$Y` to be unified to
+`usize`. (Note that while assembling candidates, we do the initial
+unifications in a transaction, so that they don't affect one another.)
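+
+The inference described above can be tried out in a small,
+self-contained sketch. The method bodies (plain `as` casts) and the
+helpers `is_usize` and `y_is_usize` are our own additions for
+illustration; only the trait and the two impl headers come from the
+example.
+
+```rust
+use std::any::TypeId;
+
+trait Convert<Target> {
+    fn convert(&self) -> Target;
+}
+
+// Bodies are placeholders chosen for this sketch.
+impl Convert<usize> for isize {
+    fn convert(&self) -> usize { *self as usize } // isize -> usize
+}
+
+impl Convert<isize> for usize {
+    fn convert(&self) -> isize { *self as isize } // usize -> isize
+}
+
+/// Hypothetical helper: reports whether the inferred type `T` is `usize`.
+fn is_usize<T: 'static>(_: &T) -> bool {
+    TypeId::of::<T>() == TypeId::of::<usize>()
+}
+
+pub fn y_is_usize() -> bool {
+    let x: isize = 22;
+    // No annotation on `y`: the only impl whose self type is `isize`
+    // is `Convert<usize> for isize`, so `$Y` is unified with `usize`.
+    let y = x.convert();
+    is_usize(&y)
+}
+```
+
+Note that `y` never receives a type annotation; the check in
+`is_usize` only observes what selection already decided.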
+
+There are tests to this effect in `src/test/run-pass`:
+
+- `traits-multidispatch-infer-convert-source-and-target.rs`
+- `traits-multidispatch-infer-convert-target.rs`
+
+#### Winnowing: Resolving ambiguities
+
+But what happens if there are multiple impls where all the types
+unify? Consider this example:
+
+```rust
+trait Get {
+    fn get(&self) -> Self;
+}
+
+impl<T:Copy> Get for T {
+    fn get(&self) -> T { *self }
+}
+
+impl<T:Get> Get for Box<T> {
+    fn get(&self) -> Box<T> { box get_it(&**self) }
+}
+```
+
+What happens when we invoke `get_it(&box 1_u16)`, for example (where
+`get_it` is presumably a small generic helper that calls `get` on its
+argument; its definition is not shown here)? In this case, the `Self`
+type is `Box<u16>` -- that unifies with both impls, because the first
+applies to all types, and the second to all boxes. In the olden days
+we'd have called this ambiguous. What we do now is run a second
+*winnowing* pass that considers where clauses and attempts to remove
+candidates -- in this case, the first impl only
+applies if `Box<u16> : Copy`, which doesn't hold. After winnowing,
+then, we are left with just one candidate, so we can proceed. There is
+a test of this in `src/test/run-pass/traits-conditional-dispatch.rs`.
+
+#### Matching
+
+Matching refers to the subroutines that decide whether a particular impl/where-clause/etc
+applies to a particular obligation. At the moment, this amounts to
+unifying the self types, but in the future we may also recursively
+consider some of the nested obligations, in the case of an impl.
+
+#### Lifetimes and selection
+
+Because of how lifetime inference works, it is not possible to
+give back immediate feedback as to whether a unification or subtype
+relationship between lifetimes holds or not. Therefore, lifetime
+matching is *not* considered during selection. This is reflected in
+the fact that subregion assignment is infallible. This may yield
+lifetime constraints that will later be found to be in error (in
+contrast, the non-lifetime-constraints have already been checked
+during selection and can never cause an error, though naturally they
+may lead to other errors downstream).
+
+#### Where clauses
+
+Besides an impl, the other major way to resolve an obligation is via a
+where clause. The selection process is always given a *parameter
+environment* which contains a list of where clauses, which are
+basically obligations that we can assume are satisfiable. We will iterate
+over that list and check whether our current obligation can be found
+in that list, and if so it is considered satisfied. More precisely, we
+want to check whether there is a where-clause obligation that is for
+the same trait (or some subtrait) and for which the self types match,
+using the definition of *matching* given above.
+
+Consider this simple example:
+
+```rust
+trait A1 { /*...*/ }
+trait A2 : A1 { /*...*/ }
+
+trait B { /*...*/ }
+
+fn foo<X:A2+B>() { /*...*/ }
+```
+
+Clearly we can use methods offered by `A1`, `A2`, or `B` within the
+body of `foo`. In each case, that will incur an obligation like `X :
+A1` or `X : A2`. The parameter environment will contain two
+where-clauses, `X : A2` and `X : B`. For each obligation, then, we
+search this list of where-clauses.  To resolve an obligation `X:A1`,
+we would note that `X:A2` implies that `X:A1`.
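+
+A runnable variant of this example is sketched below. The method
+names, their return values, and the `Thing` type are invented for the
+sketch. The interesting line is the call to `a1`: the where-clauses
+only mention `A2` and `B`, so that call is accepted because `X : A2`
+implies `X : A1`.
+
+```rust
+trait A1 { fn a1(&self) -> u32; }
+trait A2: A1 { fn a2(&self) -> u32; }
+
+trait B { fn b(&self) -> u32; }
+
+// Hypothetical type implementing all three traits.
+struct Thing;
+impl A1 for Thing { fn a1(&self) -> u32 { 1 } }
+impl A2 for Thing { fn a2(&self) -> u32 { 2 } }
+impl B for Thing { fn b(&self) -> u32 { 4 } }
+
+// The parameter environment here holds `X : A2` and `X : B`.
+fn foo<X: A2 + B>(x: &X) -> u32 {
+    // `x.a1()` incurs `X : A1`, resolved through the supertrait of `A2`.
+    x.a1() + x.a2() + x.b()
+}
+
+pub fn demo() -> u32 {
+    foo(&Thing)
+}
+```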
+
+### Confirmation
+
+Confirmation unifies the output type parameters of the trait with the
+values found in the obligation, possibly yielding a type error.  If we
+return to our example of the `Convert` trait from the previous
+section, confirmation is where an error would be reported if the
+impl specified that `Target` would be `usize` but the obligation
+asked for `char`. In that case the result of selection would be an error.
+
+### Selection during translation
+
+During type checking, we do not store the results of trait selection.
+We simply wish to verify that trait selection will succeed. Then
+later, at trans time, when we have all concrete types available, we
+can repeat the trait selection.  In this case, we do not consider any
+where-clauses to be in scope. We therefore know that each resolution
+will resolve to a particular impl.
+
+One interesting twist has to do with nested obligations. In general, in trans,
+we only need to do a "shallow" selection for an obligation. That is, we wish to
+identify which impl applies, but we do not (yet) need to decide how to select
+any nested obligations. Nonetheless, we *do* currently do a complete resolution,
+and that is because it can sometimes inform the results of type inference. That is,
+we do not have the full substitutions in terms of the type variables of the impl available
+to us, so we must run trait selection to figure everything out.
+
+Here is an example:
+
+```rust
+trait Foo { /*...*/ }
+impl<U,T:Bar<U>> Foo for Vec<T> { /*...*/ }
+
+impl Bar<usize> for isize { /*...*/ }
+```
+
+After one shallow round of selection for an obligation like `Vec<isize>
+: Foo`, we would know which impl we want, and we would know that
+`T=isize`, but we do not know the type of `U`.  We must select the
+nested obligation `isize : Bar<U>` to find out that `U=usize`.
+
+It would be good to only do *just as much* nested resolution as
+necessary. Currently, though, we just do a full resolution.
+
+# Higher-ranked trait bounds
+
+One of the more subtle concepts at work is *higher-ranked trait
+bounds*. An example of such a bound is `for<'a> MyTrait<&'a isize>`.
+Let's walk through how selection on higher-ranked trait references
+works.
+
+## Basic matching and skolemization leaks
+
+Let's walk through the test `compile-fail/hrtb-just-for-static.rs` to see
+how it works. The test starts with the trait `Foo`:
+
+```rust
+trait Foo<X> {
+    fn foo(&self, x: X) { }
+}
+```
+
+Let's say we have a function `want_hrtb` that wants a type which
+implements `Foo<&'a isize>` for any `'a`:
+
+```rust
+fn want_hrtb<T>() where T : for<'a> Foo<&'a isize> { /*...*/ }
+```
+
+Now we have a struct `AnyInt` that implements `Foo<&'a isize>` for any
+`'a`:
+
+```rust
+struct AnyInt;
+impl<'a> Foo<&'a isize> for AnyInt { }
+```
+
+And the question is, does `AnyInt : for<'a> Foo<&'a isize>`? We want the
+answer to be yes. The algorithm for figuring it out is closely related
+to the subtyping for higher-ranked types (which is described in
+`middle::infer::higher_ranked::doc`, but also in a [paper by SPJ] that
+I recommend you read). It has three steps:
+
+1. Skolemize the obligation.
+2. Match the impl against the skolemized obligation.
+3. Check for skolemization leaks.
+
+[paper by SPJ]: http://research.microsoft.com/en-us/um/people/simonpj/papers/higher-rank/
+
+So let's work through our example. The first thing we would do is to
+skolemize the obligation, yielding `AnyInt : Foo<&'0 isize>` (here `'0`
+represents skolemized region #0). Note that we now have no quantifiers;
+in terms of the compiler type, this changes from a `ty::PolyTraitRef`
+to a `TraitRef`. We would then create the `TraitRef` from the impl,
+using fresh variables for its bound regions (and thus getting
+`Foo<&'$a isize>`, where `'$a` is the inference variable for `'a`). Next
+we relate the two trait refs, yielding a graph with the constraint
+that `'0 == '$a`. Finally, we check for skolemization "leaks" -- a
+leak is basically any attempt to relate a skolemized region to another
+skolemized region, or to any region that pre-existed the impl match.
+The leak check is done by searching from the skolemized region to find
+the set of regions that it is related to in any way. This is called
+the "taint" set. To pass the check, that set must consist *solely* of
+itself and region variables from the impl. If the taint set includes
+any other region, then the match is a failure. In this case, the taint
+set for `'0` is `{'0, '$a}`, and hence the check will succeed.
+
+Let's consider a failure case. Imagine we also have a struct `StaticInt`:
+
+```rust
+struct StaticInt;
+impl Foo<&'static isize> for StaticInt { }
+```
+
+We want the obligation `StaticInt : for<'a> Foo<&'a isize>` to be
+considered unsatisfied. The check begins just as before. `'a` is
+skolemized to `'0` and the impl trait reference is instantiated to
+`Foo<&'static isize>`. When we relate those two, we get a constraint
+like `'static == '0`. This means that the taint set for `'0` is `{'0,
+'static}`, which fails the leak check.
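+
+Both cases can be put side by side in one compilable sketch. The
+`bool` return value of `want_hrtb` and the `demo` wrapper are added
+here purely so the result can be observed; the failing call is left
+commented out because the compiler rejects it.
+
+```rust
+trait Foo<X> {
+    fn foo(&self, _x: X) { }
+}
+
+// Returns `true` only so that a successful call is observable.
+fn want_hrtb<T>() -> bool
+where
+    T: for<'a> Foo<&'a isize>,
+{
+    true
+}
+
+struct AnyInt;
+impl<'a> Foo<&'a isize> for AnyInt { }
+
+struct StaticInt;
+impl Foo<&'static isize> for StaticInt { }
+
+pub fn demo() -> bool {
+    // Accepted: the impl for `AnyInt` covers every `'a`.
+    let ok = want_hrtb::<AnyInt>();
+    // Rejected if uncommented: the impl for `StaticInt` only covers
+    // `'static`, so the higher-ranked bound does not hold.
+    // want_hrtb::<StaticInt>();
+    ok
+}
+```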
+
+## Higher-ranked trait obligations
+
+Once the basic matching is done, we get to another interesting topic:
+how to deal with impl obligations. I'll work through a simple example
+here. Imagine we have the traits `Foo` and `Bar` and an associated impl:
+
+```rust
+trait Foo<X> {
+    fn foo(&self, x: X) { }
+}
+
+trait Bar<X> {
+    fn bar(&self, x: X) { }
+}
+
+impl<X,F> Foo<X> for F
+    where F : Bar<X>
+{
+}
+```
+
+Now let's say we have an obligation `for<'a> Foo<&'a isize>` and we match
+this impl. What obligation is generated as a result? We want to get
+`for<'a> Bar<&'a isize>`, but how does that happen?
+
+After the matching, we are in a position where we have a skolemized
+substitution like `X => &'0 isize`. If we apply this substitution to the
+impl obligations, we get `F : Bar<&'0 isize>`. Obviously this is not
+directly usable because the skolemized region `'0` cannot leak out of
+our computation.
+
+What we do is to create an inverse mapping from the taint set of `'0`
+back to the original bound region (`'a`, here) that `'0` resulted
+from. (This is done in `higher_ranked::plug_leaks`.) We know that the
+leak check passed, so this taint set consists solely of the skolemized
+region itself plus various intermediate region variables. We then walk
+the trait-reference and convert every region in that taint set back to
+a late-bound region, so in this case we'd wind up with `for<'a> F :
+Bar<&'a isize>`.
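+
+Seen from the outside, the effect is that a type with a higher-ranked
+`Bar` impl also satisfies the higher-ranked `Foo` bound through the
+blanket impl. A small sketch, in which the type `S` and the functions
+`want_foo` and `demo` are hypothetical names of ours:
+
+```rust
+trait Foo<X> {
+    fn foo(&self, _x: X) { }
+}
+
+trait Bar<X> {
+    fn bar(&self, _x: X) { }
+}
+
+impl<X, F> Foo<X> for F
+    where F: Bar<X>
+{
+}
+
+// Hypothetical type that implements `Bar<&'a isize>` for every `'a`.
+struct S;
+impl<'a> Bar<&'a isize> for S { }
+
+// Requires the higher-ranked bound; returns `true` so success is visible.
+fn want_foo<T: for<'a> Foo<&'a isize>>() -> bool {
+    true
+}
+
+pub fn demo() -> bool {
+    // Matching the blanket impl yields the nested obligation
+    // `for<'a> S : Bar<&'a isize>`, which the impl above satisfies.
+    want_foo::<S>()
+}
+```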
+
+# Caching and subtle considerations therewith
+
+In general we attempt to cache the results of trait selection.  This
+is a somewhat complex process. Part of the reason for this is that we
+want to be able to cache results even when all the types in the trait
+reference are not fully known. In that case, it may happen that the
+trait selection process is also influencing type variables, so we have
+to be able to not only cache the *result* of the selection process,
+but *replay* its effects on the type variables.
+
+## An example
+
+The high-level idea of how the cache works is that we first replace
+all unbound inference variables with skolemized versions. Therefore,
+if we had a trait reference `usize : Foo<$1>`, where `$n` is an unbound
+inference variable, we might replace it with `usize : Foo<%0>`, where
+`%n` is a skolemized type. We would then look this up in the cache.
+If we found a hit, the hit would tell us the immediate next step to
+take in the selection process: i.e., apply impl #22, or apply where
+clause `X : Foo<Y>`. Let's say in this case there is no hit.
+Therefore, we search through impls and where clauses and so forth, and
+we come to the conclusion that the only possible impl is this one,
+with def-id 22:
+
+```rust
+impl Foo<isize> for usize { /*...*/ } // Impl #22
+```
+
+We would then record in the cache `usize : Foo<%0> ==>
+ImplCandidate(22)`. Next we would confirm `ImplCandidate(22)`, which
+would (as a side-effect) unify `$1` with `isize`.
+
+Now, at some later time, we might come along and see a `usize :
+Foo<$3>`.  When skolemized, this would yield `usize : Foo<%0>`, just as
+before, and hence the cache lookup would succeed, yielding
+`ImplCandidate(22)`. We would confirm `ImplCandidate(22)` which would
+(as a side-effect) unify `$3` with `isize`.
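+
+The lookup part of this scheme can be sketched as follows. This is a
+toy model under our own assumptions: `Ty`, `skolemize`, and `Cache`
+are hypothetical, a trait reference is flattened to a list of types,
+and the replay of unification during confirmation is left out.
+
+```rust
+use std::collections::HashMap;
+
+/// Toy types: `Infer(n)` plays the role of `$n`, `Skol(n)` of `%n`.
+#[derive(Clone, Debug, PartialEq, Eq, Hash)]
+pub enum Ty { Usize, Isize, Infer(u32), Skol(u32) }
+
+/// Replaces inference variables by skolemized types, numbered in
+/// order of first appearance, so `$1` and `$3` give the same key.
+pub fn skolemize(args: &[Ty]) -> Vec<Ty> {
+    let mut seen: Vec<u32> = Vec::new();
+    args.iter().map(|t| match t {
+        Ty::Infer(v) => {
+            let idx = match seen.iter().position(|s| s == v) {
+                Some(i) => i,
+                None => { seen.push(*v); seen.len() - 1 }
+            };
+            Ty::Skol(idx as u32)
+        }
+        other => other.clone(),
+    }).collect()
+}
+
+/// Maps a skolemized trait reference to the def-id of an impl candidate.
+pub struct Cache { map: HashMap<Vec<Ty>, u32>, pub misses: usize }
+
+impl Cache {
+    pub fn new() -> Self { Cache { map: HashMap::new(), misses: 0 } }
+
+    /// `search` stands in for candidate assembly; it runs only on a miss.
+    pub fn select(&mut self, trait_ref: &[Ty], search: impl FnOnce() -> u32) -> u32 {
+        let key = skolemize(trait_ref);
+        if let Some(&hit) = self.map.get(&key) {
+            return hit;
+        }
+        self.misses += 1;
+        let found = search();
+        self.map.insert(key, found);
+        found
+    }
+}
+
+pub fn demo() -> (u32, u32, usize) {
+    let mut cache = Cache::new();
+    // `usize : Foo<$1>`, written as [self type, trait parameter].
+    let a = cache.select(&[Ty::Usize, Ty::Infer(1)], || 22);
+    // `usize : Foo<$3>` skolemizes to the same key, so the search is skipped.
+    let b = cache.select(&[Ty::Usize, Ty::Infer(3)], || unreachable!());
+    (a, b, cache.misses)
+}
+```
+
+Both lookups return candidate 22, and only the first counts as a miss.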
+
+## Where clauses and the local vs global cache
+
+One subtle interaction is that the results of trait lookup will vary
+depending on what where clauses are in scope. Therefore, we actually
+have *two* caches, a local and a global cache. The local cache is
+attached to the `ParamEnv` and the global cache attached to the
+`tcx`. We use the local cache whenever the result might depend on the
+where clauses that are in scope. The determination of which cache to
+use is done by the method `pick_candidate_cache` in `select.rs`. At
+the moment, we use a very simple, conservative rule: if there are any
+where-clauses in scope, then we use the local cache.  We used to try
+and draw finer-grained distinctions, but that led to a series of
+annoying and weird bugs like #22019 and #18290. This simple rule seems
+to be pretty clearly safe and also still retains a very high hit rate
+(~95% when compiling rustc).
+
+# Specialization
+
+Specialization is defined in the `specialize` module.
+
+The basic strategy is to build up a *specialization graph* during
+coherence checking. Insertion into the graph locates the right place
+to put an impl in the specialization hierarchy; if there is no right
+place (due to partial overlap but no containment), you get an overlap
+error. Specialization is consulted when selecting an impl (of course),
+and the graph is consulted when propagating defaults down the
+specialization hierarchy.
+
+You might expect that the specialization graph would be used during
+selection -- i.e., when actually performing specialization. This is
+not done for two reasons:
+
+- It's merely an optimization: given a set of candidates that apply,
+  we can determine the most specialized one by comparing them directly
+  for specialization, rather than consulting the graph. Given that we
+  also cache the results of selection, the benefit of this
+  optimization is questionable.
+
+- To build the specialization graph in the first place, we need to use
+  selection (because we need to determine whether one impl specializes
+  another). Dealing with this reentrancy would require some additional
+  mode switch for selection. Given that there seems to be no strong
+  reason to use the graph anyway, we stick with a simpler approach in
+  selection, and use the graph only for propagating default
+  implementations.
+
+Trait impl selection can succeed even when multiple impls can apply,
+as long as they are part of the same specialization family. In that
+case, it returns a *single* impl on success -- this is the most
+specialized impl *known* to apply. However, if there are any inference
+variables in play, the returned impl may not be the actual impl we
+will use at trans time. Thus, we take special care to avoid projecting
+associated types unless either (1) the associated type does not use
+`default` and thus cannot be overridden or (2) all input types are
+known concretely.
--- a/src/chap-160-trans.md
+++ b/src/chap-160-trans.md
--- a/src/ty.md
+++ b/src/ty.md
@ -0,0 +1,165 @@
+# The `ty` module: representing types
+
+The `ty` module defines how the Rust compiler represents types
+internally. It also defines the *typing context* (`tcx` or `TyCtxt`),
+which is the central data structure in the compiler.
+
+## The tcx and how it uses lifetimes
+
+The `tcx` ("typing context") is the central data structure in the
+compiler. It is the context that you use to perform all manner of
+queries. The struct `TyCtxt` defines a reference to this shared context:
+
+```rust
+tcx: TyCtxt<'a, 'gcx, 'tcx>
+//          --  ----  ----
+//          |   |     |
+//          |   |     innermost arena lifetime (if any)
+//          |   "global arena" lifetime
+//          lifetime of this reference
+```
+
+As you can see, the `TyCtxt` type takes three lifetime parameters.
+These lifetimes are perhaps the most complex thing to understand about
+the tcx. During Rust compilation, we allocate most of our memory in
+**arenas**, which are basically pools of memory that get freed all at
+once. When you see a reference with a lifetime like `'tcx` or `'gcx`,
+you know that it refers to arena-allocated data (or data that lives as
+long as the arenas, anyhow).
+
+We use two distinct levels of arenas. The outer level is the "global
+arena". This arena lasts for the entire compilation, so anything you
+allocate in it is freed only once compilation is essentially over
+(more precisely, when we shift to executing LLVM).
+
+To reduce peak memory usage, when we do type inference, we also use an
+inner level of arena. These arenas get thrown away once type inference
+is over. This is done because type inference generates a lot of
+"throw-away" types that are not particularly interesting after type
+inference completes, so keeping around those allocations would be
+wasteful.
+
+Often, we wish to write code that explicitly asserts that it is not
+taking place during inference. In that case, there is no "local"
+arena, and all the types that you can access are allocated in the
+global arena. To express this, we use the same lifetime for the
+`'gcx` and `'tcx` parameters of `TyCtxt`. Just to be a touch
+confusing, we tend to name that single lifetime `'tcx` in such
+contexts. Here is an example:
+
+```rust
+fn not_in_inference<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, def_id: DefId) {
+    //                                        ----  ----
+    //                                        Using the same lifetime here asserts
+    //                                        that the innermost arena accessible through
+    //                                        this reference *is* the global arena.
+}
+```
+
+In contrast, if you want code that is usable during type inference, you
+need to declare distinct `'gcx` and `'tcx` lifetime parameters:
+
+```rust
+fn maybe_in_inference<'a, 'gcx, 'tcx>(tcx: TyCtxt<'a, 'gcx, 'tcx>, def_id: DefId) {
+    //                                                ----  ----
+    //                                        Using different lifetimes here means that
+    //                                        the innermost arena *may* be distinct
+    //                                        from the global arena (but doesn't have to be).
+}
+```
+
+### Allocating and working with types
+
+Rust types are represented using the `Ty<'tcx>` type defined in the `ty`
+module (not to be confused with the `Ty` struct from [the HIR]). This
+is in fact a simple type alias for a reference with `'tcx` lifetime:
+
+```rust
+pub type Ty<'tcx> = &'tcx TyS<'tcx>;
+```
+
+[the HIR]: ../hir/README.md
+
+You can mostly ignore the `TyS` struct -- you will almost never
+access it explicitly. We always pass it by reference using the
+`Ty<'tcx>` alias -- the only exception, I think, is when defining
+inherent methods on types. Instances of `TyS` are only ever allocated
+in one of the rustc arenas (never e.g. on the stack).
+
+One common operation on types is to **match** and see what kinds of
+types they are. This is done by doing `match ty.sty`, sort of like this:
+
+```rust
+fn test_type<'tcx>(ty: Ty<'tcx>) {
+    match ty.sty {
+        ty::TyArray(elem_ty, len) => { ... }
+        ...
+    }
+}
+```
+
+The `sty` field (the origin of this name is unclear to me; perhaps
+structural type?) is of type `TypeVariants<'tcx>`, which is an enum
+defining all of the different kinds of types in the compiler.
+
+> NB: inspecting the `sty` field on types during type inference can be
+> risky, as there may be inference variables and other things to
+> consider, or sometimes types are not yet known that will become
+> known later.
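+
+To make the shape of this pattern concrete outside of rustc, here is a
+minimal, self-contained sketch. The names `ToyTy`, `ToyTyS`,
+`ToyVariants`, and `describe` are hypothetical stand-ins invented for
+illustration; they only mimic the "alias for a reference to a struct
+with a kind field" layout described above and are not the compiler's
+actual definitions.
+
+```rust
+// Toy stand-ins for illustration only; these are NOT the rustc definitions.
+#[derive(Debug)]
+enum ToyVariants<'tcx> {
+    Bool,
+    // An array kind holds a reference to its element type plus a length.
+    Array(ToyTy<'tcx>, u64),
+}
+
+#[derive(Debug)]
+struct ToyTyS<'tcx> {
+    sty: ToyVariants<'tcx>,
+}
+
+// Like `Ty<'tcx>`, the toy "type" is just a reference to the struct.
+type ToyTy<'tcx> = &'tcx ToyTyS<'tcx>;
+
+// Match on the kind field and recurse into nested types.
+fn describe<'tcx>(ty: ToyTy<'tcx>) -> String {
+    match ty.sty {
+        ToyVariants::Bool => "bool".to_string(),
+        ToyVariants::Array(elem_ty, len) => {
+            format!("[{}; {}]", describe(elem_ty), len)
+        }
+    }
+}
+```
+
+Because the fields bound in each arm are a reference and an integer
+(both `Copy`), matching through the reference does not move anything
+out of the underlying struct.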
+
+To allocate a new type, you can use the various `mk_` methods defined
+on the `tcx`. These have names that correspond mostly to the various kinds
+of type variants. For example:
+
+```rust
+let array_ty = tcx.mk_array(elem_ty, len * 2);
+```
+
+These methods all return a `Ty<'tcx>` -- note that the lifetime you
+get back is the lifetime of the innermost arena that this `tcx` has
+access to. In fact, types are always canonicalized and interned (so we
+never allocate exactly the same type twice) and are always allocated
+in the outermost arena where they can be (so, if they do not contain
+any inference variables or other "temporary" types, they will be
+allocated in the global arena). However, the lifetime `'tcx` is always
+a safe approximation, so that is what you get back.
+
+> NB. Because types are interned, it is possible to compare them for
+> equality efficiently using `==` -- however, this is almost never what
+> you want to do unless you happen to be hashing and looking for
+> duplicates. This is because often in Rust there are multiple ways to
+> represent the same type, particularly once inference is involved. If
+> you are going to be testing for type equality, you probably need to
+> start looking into the inference code to do it right.
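+
+The effect of interning can be sketched with a toy model. Everything
+below is an assumption made for illustration: `ToyInterner` is a
+hypothetical type that interns strings rather than types, and it leaks
+memory as a stand-in for "lives as long as the arena". It is not how
+rustc implements interning; it only shows why identical interned values
+can be compared by address, and why that comparison sees two different
+spellings as different values.
+
+```rust
+use std::collections::HashSet;
+
+// Hypothetical toy interner; rustc's real interner is arena-backed and
+// considerably more involved.
+struct ToyInterner {
+    seen: HashSet<&'static str>,
+}
+
+impl ToyInterner {
+    fn new() -> Self {
+        ToyInterner { seen: HashSet::new() }
+    }
+
+    // Returns the canonical copy of `name`, allocating only the first
+    // time a given string is seen.
+    fn intern(&mut self, name: &str) -> &'static str {
+        if let Some(&existing) = self.seen.get(name) {
+            return existing;
+        }
+        // Leaking stands in for "lives as long as the arena".
+        let leaked: &'static str = Box::leak(name.to_string().into_boxed_str());
+        self.seen.insert(leaked);
+        leaked
+    }
+}
+```
+
+Interning the same string twice yields the same reference, so
+`std::ptr::eq` on the two results is true, while a different spelling
+gets its own allocation and compares unequal. That is the sense in which
+the cheap comparison is purely structural.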
+
+You can also find various common types in the `tcx` itself by accessing
+`tcx.types.bool`, `tcx.types.char`, etc. (see `CommonTypes` for more).
+
+### Beyond types: Other kinds of arena-allocated data structures
+
+In addition to types, there are a number of other arena-allocated data
+structures that you can allocate, and which are found in this
+module. Here are a few examples:
+
+- `Substs`, allocated with `mk_substs` -- this will intern a slice of types, often used to
+  specify the values to be substituted for generics (e.g., `HashMap<i32, u32>`
+  would be represented as a slice `&'tcx [tcx.types.i32, tcx.types.u32]`).
+- `TraitRef`, typically passed by value -- a **trait reference**
+  consists of a reference to a trait along with its various type
+  parameters (including `Self`), like `i32: Display` (here, the def-id
+  would reference the `Display` trait, and the substs would contain
+  `i32`).
+- `Predicate` defines something the trait system has to prove (see `traits` module).
+
+### Import conventions
+
+Although there is no hard and fast rule, the `ty` module tends to be used like so:
+
+```rust
+use ty::{self, Ty, TyCtxt};
+```
+
+In particular, since they are so common, the `Ty` and `TyCtxt` types
+are imported directly. Other types are often referenced with an
+explicit `ty::` prefix (e.g., `ty::TraitRef<'tcx>`). But some modules
+choose to import a larger or smaller set of names explicitly.
--- a/src/chap-120-type-checking.md
+++ b/src/chap-120-type-checking.md
--- a/src/chap-100-type-inference.md
+++ b/src/chap-100-type-inference.md
--- a/src/chap-030-walkthrough.md
+++ b/src/chap-030-walkthrough.md