move over the `ty` README

2018-01-19 06:51:14 -05:00 · 2018-01-19 06:51:14 -05:00 · 141528264b
parent b440859ad3
commit 141528264b
2 changed files with 166 additions and 2 deletions
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@ -9,7 +9,7 @@
 - [Macro expansion](./macro-expansion.md)
 - [Name resolution](./name-resolution.md)
 - [HIR lowering](./hir-lowering.md)
- [Representing types (`ty` module in depth)](./ty.md)
+- [The `ty` module: representing types](./ty.md)
 - [Type inference](./type-inference.md)
 - [Trait resolution](./trait-resolution.md)
 - [Type checking](./type-checking.md)
--- a/src/ty.md
+++ b/src/ty.md
@ -1 +1,165 @@
-# Representing types (`ty` module in depth)
+# The `ty` module: representing types
+
+The `ty` module defines how the Rust compiler represents types
+internally. It also defines the *typing context* (`tcx` or `TyCtxt`),
+which is the central data structure in the compiler.
+
+## The tcx and how it uses lifetimes
+
+The `tcx` ("typing context") is the central data structure in the
+compiler. It is the context that you use to perform all manner of
+queries. The struct `TyCtxt` defines a reference to this shared context:
+
+```rust
+tcx: TyCtxt<'a, 'gcx, 'tcx>
+//          --  ----  ----
+//          |   |     |
+//          |   |     innermost arena lifetime (if any)
+//          |   "global arena" lifetime
+//          lifetime of this reference
+```
+
+As you can see, the `TyCtxt` type takes three lifetime parameters.
+These lifetimes are perhaps the most complex thing to understand about
+the tcx. During Rust compilation, we allocate most of our memory in
+**arenas**, which are basically pools of memory that get freed all at
+once. When you see a reference with a lifetime like `'tcx` or `'gcx`,
+you know that it refers to arena-allocated data (or data that lives as
+long as the arenas, anyhow).
+
+We use two distinct levels of arenas. The outer level is the "global
+arena". This arena lasts for the entire compilation: so anything you
+allocate in there is only freed once compilation is basically over
+(actually, when we shift to executing LLVM).
+
+To reduce peak memory usage, when we do type inference, we also use an
+inner level of arena. These arenas get thrown away once type inference
+is over. This is done because type inference generates a lot of
+"throw-away" types that are not particularly interesting after type
+inference completes, so keeping around those allocations would be
+wasteful.
+
+Often, we wish to write code that explicitly asserts that it is not
+taking place during inference. In that case, there is no "local"
+arena, and all the types that you can access are allocated in the
+global arena.  To express this, the idea is to use the same lifetime
+for the `'gcx` and `'tcx` parameters of `TyCtxt`. Just to be a touch
+confusing, we tend to use the name `'tcx` in such contexts. Here is an
+example:
+
+```rust
+fn not_in_inference<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, def_id: DefId) {
+    //                                        ----  ----
+    //                                        Using the same lifetime here asserts
+    //                                        that the innermost arena accessible through
+    //                                        this reference *is* the global arena.
+}
+```
+
+In contrast, if we want to code that can be usable during type inference, then you
+need to declare a distinct `'gcx` and `'tcx` lifetime parameter:
+
+```rust
+fn maybe_in_inference<'a, 'gcx, 'tcx>(tcx: TyCtxt<'a, 'gcx, 'tcx>, def_id: DefId) {
+    //                                                ----  ----
+    //                                        Using different lifetimes here means that
+    //                                        the innermost arena *may* be distinct
+    //                                        from the global arena (but doesn't have to be).
+}
+```
+
+### Allocating and working with types
+
+Rust types are represented using the `Ty<'tcx>` defined in the `ty`
+module (not to be confused with the `Ty` struct from [the HIR]). This
+is in fact a simple type alias for a reference with `'tcx` lifetime:
+
+```rust
+pub type Ty<'tcx> = &'tcx TyS<'tcx>;
+```
+
+[the HIR]: ../hir/README.md
+
+You can basically ignore the `TyS` struct -- you will basically never
+access it explicitly. We always pass it by reference using the
+`Ty<'tcx>` alias -- the only exception I think is to define inherent
+methods on types. Instances of `TyS` are only ever allocated in one of
+the rustc arenas (never e.g. on the stack).
+
+One common operation on types is to **match** and see what kinds of
+types they are. This is done by doing `match ty.sty`, sort of like this:
+
+```rust
+fn test_type<'tcx>(ty: Ty<'tcx>) {
+    match ty.sty {
+        ty::TyArray(elem_ty, len) => { ... }
+        ...
+    }
+}
+```
+
+The `sty` field (the origin of this name is unclear to me; perhaps
+structural type?) is of type `TypeVariants<'tcx>`, which is an enum
+defining all of the different kinds of types in the compiler.
+
+> NB: inspecting the `sty` field on types during type inference can be
+> risky, as there may be inference variables and other things to
+> consider, or sometimes types are not yet known that will become
+> known later.).
+
+To allocate a new type, you can use the various `mk_` methods defined
+on the `tcx`. These have names that correpond mostly to the various kinds
+of type variants. For example:
+
+```rust
+let array_ty = tcx.mk_array(elem_ty, len * 2);
+```
+
+These methods all return a `Ty<'tcx>` -- note that the lifetime you
+get back is the lifetime of the innermost arena that this `tcx` has
+access to. In fact, types are always canonicalized and interned (so we
+never allocate exactly the same type twice) and are always allocated
+in the outermost arena where they can be (so, if they do not contain
+any inference variables or other "temporary" types, they will be
+allocated in the global arena). However, the lifetime `'tcx` is always
+a safe approximation, so that is what you get back.
+
+> NB. Because types are interned, it is possible to compare them for
+> equality efficiently using `==` -- however, this is almost never what
+> you want to do unless you happen to be hashing and looking for
+> duplicates. This is because often in Rust there are multiple ways to
+> represent the same type, particularly once inference is involved. If
+> you are going to be testing for type equality, you probably need to
+> start looking into the inference code to do it right.
+
+You can also find various common types in the `tcx` itself by accessing
+`tcx.types.bool`, `tcx.types.char`, etc (see `CommonTypes` for more).
+
+### Beyond types: Other kinds of arena-allocated data structures
+
+In addition to types, there are a number of other arena-allocated data
+structures that you can allocate, and which are found in this
+module. Here are a few examples:
+
+- `Substs`, allocated with `mk_substs` -- this will intern a slice of types, often used to
+  specify the values to be substituted for generics (e.g., `HashMap<i32, u32>`
+  would be represented as a slice `&'tcx [tcx.types.i32, tcx.types.u32]`).
+- `TraitRef`, typically passed by value -- a **trait reference**
+  consists of a reference to a trait along with its various type
+  parameters (including `Self`), like `i32: Display` (here, the def-id
+  would reference the `Display` trait, and the substs would contain
+  `i32`).
+- `Predicate` defines something the trait system has to prove (see `traits` module).
+
+### Import conventions
+
+Although there is no hard and fast rule, the `ty` module tends to be used like so:
+
+```rust
+use ty::{self, Ty, TyCtxt};
+```
+
+In particular, since they are so common, the `Ty` and `TyCtxt` types
+are imported directly. Other types are often referenced with an
+explicit `ty::` prefix (e.g., `ty::TraitRef<'tcx>`). But some modules
+choose to import a larger or smaller set of names explicitly.