monomorphization chapter
parent e19762b57c
commit 44cba6e075
@ -52,7 +52,9 @@ newtype | a "newtype" is a wrapper around some other type (e.g.
NLL | [non-lexical lifetimes](../borrow_check/region_inference.html), an extension to Rust's borrowing system to make it be based on the control-flow graph.
node-id or NodeId | an index identifying a particular node in the AST or HIR; gradually being phased out and replaced with `HirId`.
obligation | something that must be proven by the trait system ([see more](../traits/resolution.html))
placeholder | **NOTE: skolemization is deprecated by placeholder** a way of handling subtyping around "for-all" types (e.g., `for<'a> fn(&'a u32)`) as well as solving higher-ranked trait bounds (e.g., `for<'a> T: Trait<'a>`). See [the chapter on placeholder and universes](../borrow_check/region_inference/placeholders_and_universes.md) for more details.
point | used in the NLL analysis to refer to some particular location in the MIR; typically used to refer to a node in the control-flow graph.
polymorphize | An optimization that avoids unnecessary monomorphization ([see more](../backend/monomorph.md#polymorphization))
projection | a general term for a "relative path", e.g. `x.f` is a "field projection", and `T::Item` is an ["associated type projection"](../traits/goals-and-clauses.html#trait-ref)
promoted constants | constants extracted from a function and lifted to static scope; see [this section](../mir/index.html#promoted) for more details.
provider | the function that executes a query ([see more](../query.html))
@ -63,7 +65,6 @@ rib | a data structure in the name resolver that keeps trac
sess | the compiler session, which stores global data used throughout compilation
side tables | because the AST and HIR are immutable once created, we often carry extra information about them in the form of hashtables, indexed by the id of a particular node.
sigil | like a keyword but composed entirely of non-alphanumeric tokens. For example, `&` is a sigil for references.
placeholder | **NOTE: skolemization is deprecated by placeholder** a way of handling subtyping around "for-all" types (e.g., `for<'a> fn(&'a u32)`) as well as solving higher-ranked trait bounds (e.g., `for<'a> T: Trait<'a>`). See [the chapter on placeholder and universes](../borrow_check/region_inference/placeholders_and_universes.md) for more details.
soundness | soundness is a technical term in type theory. Roughly, if a type system is sound, then if a program type-checks, it is type-safe; i.e. I can never (in safe Rust) force a value into a variable of the wrong type. (see "completeness").
span | a location in the user's source code, used primarily for error reporting. These are like a file-name/line-number/column tuple on steroids: they carry a start/end point, and also track macro expansions and compiler desugaring, all while being packed into a few bytes (really, it's an index into a table). See the Span datatype for more.
substs | the substitutions for a given generic type or item (e.g. the `i32`, `u32` in `HashMap<i32, u32>`)
@ -1,8 +1,84 @@
# Monomorphization

TODO

As you probably know, Rust has a very expressive type system with extensive
support for generic types. But of course, assembly is not generic, so we need
to figure out the concrete types of all the generics before the code can
execute.

Different languages handle this problem differently. For example, in some
languages, such as Java, we may not know the most precise type of a value until
runtime. In the case of Java, this is ok because (almost) all variables are
reference values anyway (i.e. pointers to a heap-allocated object). This
flexibility comes at the cost of performance, since all accesses to an object
must dereference a pointer.

Rust takes a different approach: it _monomorphizes_ all generic types. This
means that the compiler stamps out a different copy of the code of a generic
function for each concrete type needed. For example, if I use a `Vec<u64>` and
a `Vec<String>` in my code, then the generated binary will have two copies of
the generated code for `Vec`: one for `Vec<u64>` and another for `Vec<String>`.
The result is fast programs, but it comes at the cost of compile time (creating
all those copies can take a while) and binary size (all those copies might take
a lot of space).

Monomorphization is the first step in the backend of the Rust compiler.

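To make the "stamping out" described above a bit more concrete, here is a minimal sketch. The function `size_in_bytes` is a made-up name used only for illustration; the point is that each use with a different `T` refers to its own fully concrete instantiation.

```rust
use std::mem::size_of;

// Hypothetical generic function, for illustration only. Each distinct `T`
// it is used with gets its own monomorphized instantiation.
fn size_in_bytes<T>() -> usize {
    // Inside a given instantiation, `T` is a known concrete type, so this
    // value is fixed at compile time.
    size_of::<T>()
}

fn main() {
    // Two instantiations are needed here:
    // `size_in_bytes::<u64>` and `size_in_bytes::<u8>`.
    println!("{}", size_in_bytes::<u64>()); // prints "8"
    println!("{}", size_in_bytes::<u8>()); // prints "1"
}
```

Because no type information has to be looked up at runtime, each copy can be optimized for its concrete type, which is where both the speed and the extra code size come from.
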
## Collection

First, we need to figure out what concrete types we need for all the generic
things in our program. This is called _collection_, and the code that does this
is called the _monomorphization collector_.

Take this example:

```rust
// `peach` is not defined in the original snippet; a trivial generic
// definition is assumed here so that the example compiles.
fn peach<T>() {}

fn banana() {
    peach::<u64>();
}

fn main() {
    banana();
}
```

The monomorphization collector will give you a list of `[main, banana,
peach::<u64>]`. These are the functions that will have machine code generated
for them. The collector will also add things like statics to that list.

See [the collector rustdocs][collect] for more info.

[collect]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/monomorphize/collector/index.html

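The collection step can be pictured as a reachability walk starting from the program's roots. The sketch below is a toy model, not the collector's actual code: items are plain strings with their type arguments already substituted, and the call graph is a hand-written map. The names `collect` and `calls` are assumptions made for this illustration.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Toy model: walk a hand-written call graph from the roots and record every
// reachable item once. The real collector works on MIR, not on strings.
fn collect<'a>(roots: &[&'a str], calls: &HashMap<&'a str, Vec<&'a str>>) -> Vec<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut queue: VecDeque<&str> = roots.iter().copied().collect();
    let mut items = Vec::new();

    while let Some(item) = queue.pop_front() {
        // Skip items that were already collected.
        if !seen.insert(item) {
            continue;
        }
        items.push(item.to_string());
        // Everything this item calls must be collected as well.
        if let Some(callees) = calls.get(item) {
            queue.extend(callees.iter().copied());
        }
    }
    items
}

fn main() {
    // Hypothetical call graph mirroring the `main`/`banana`/`peach` example.
    let mut calls = HashMap::new();
    calls.insert("main", vec!["banana"]);
    calls.insert("banana", vec!["peach::<u64>"]);

    let items = collect(&["main"], &calls);
    println!("{:?}", items); // prints ["main", "banana", "peach::<u64>"]
}
```

A worklist with a "seen" set is used so that an item reachable along several paths is still emitted only once.
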
## Polymorphization

TODO

As mentioned above, monomorphization produces fast code, but it comes at the
cost of compile time and binary size. [MIR
optimizations](../mir/optimizations.md) can help a bit with this. Another
optimization currently under development is called _polymorphization_.

The general idea is that often we can share some code between monomorphized
copies of code. More precisely, if a MIR block is not dependent on a type
parameter, it may not need to be monomorphized into many copies. Consider the
following example:

```rust
pub fn f() {
    g::<bool>();
    g::<usize>();
}

fn g<T>() -> usize {
    let n = 1;
    let closure = || n;
    closure()
}
```

In this case, we would currently collect `[f, g::<bool>, g::<usize>,
g::<bool>::{{closure}}, g::<usize>::{{closure}}]`, but notice that the two
closures would be identical: they don't depend on the type parameter `T` of
function `g`. So we only need to emit one copy of the closure.

For more information, see [this thread on GitHub][polymorph].

[polymorph]: https://github.com/rust-lang/rust/issues/46477
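The same idea of sharing code that does not depend on a type parameter can be imitated by hand. The sketch below is only an analogy written at the source level, not what the polymorphization pass does: the generic outer function is a thin shell, and the real work lives in a non-generic inner function. The names `count_bytes` and `inner` are made up for this illustration.

```rust
// Generic outer shell: one instantiation per `S`, but each one is tiny.
fn count_bytes<S: AsRef<str>>(s: S) -> usize {
    // Non-generic inner function: it does not mention `S`, so a single copy
    // can serve every instantiation of `count_bytes`.
    fn inner(s: &str) -> usize {
        s.len()
    }
    inner(s.as_ref())
}

fn main() {
    println!("{}", count_bytes("abc")); // prints "3"
    println!("{}", count_bytes(String::from("hello"))); // prints "5"
}
```
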