Update name-resolution.md
parent c612fc3a72
commit 6b70577e7b
@@ -2,24 +2,27 @@
|
|||
|
||||
<!-- toc -->
|
||||
|
||||
In the previous chapters, we saw how the AST is built with all macros expanded.
|
||||
We saw how doing that requires doing some name resolution to resolve imports
|
||||
and macro names. In this chapter, we show how this is actually done and more.
|
||||
In the previous chapters, we saw how the [*Abstract Syntax Tree* (`AST`)][ast]
|
||||
is built with all `macros` expanded. We saw how doing that requires doing some
|
||||
name resolution to resolve imports and `macro` names. In this chapter, we show
|
||||
how this is actually done and more.
|
||||
|
||||
In fact, we don't do full name resolution during macro expansion -- we only
|
||||
resolve imports and macros at that time. This is required to know what to even
|
||||
expand. Later, after we have the whole AST, we do full name resolution to
|
||||
[ast]: https://en.wikipedia.org/wiki/Abstract_syntax_tree
|
||||
|
||||
In fact, we don't do full name resolution during `macro` expansion -- we only
|
||||
resolve imports and `macros` at that time. This is required to know what to even
|
||||
expand. Later, after we have the whole `AST`, we do full name resolution to
|
||||
resolve all names in the crate. This happens in [`rustc_resolve::late`][late].
|
||||
Unlike during macro expansion, in this late expansion, we only need to try to
|
||||
Unlike during `macro` expansion, in this late resolution, we only need to try to
|
||||
resolve a name once, since no new names can be added. If we fail to resolve a
|
||||
name now, then it is a compiler error.
|
||||
name, then it is a compiler error.
|
||||
|
||||
Name resolution can be complex. There are a few different namespaces (e.g.
|
||||
macros, values, types, lifetimes), and names may be valid at different (nested)
|
||||
scopes. Also, different types of names can fail to be resolved differently, and
|
||||
failures can happen differently at different scopes. For example, for a module
|
||||
scope, failure means no unexpanded macros and no unresolved glob imports in
|
||||
that module. On the other hand, in a function body, failure requires that a
|
||||
Name resolution can be complex. There are different namespaces (e.g.
|
||||
`macros`, values, types, lifetimes), and names may be valid at different (nested)
|
||||
scopes. Also, different types of names can fail resolution differently, and
|
||||
failures can happen differently at different scopes. For example, in a module
|
||||
scope, failure means no unexpanded `macros` and no unresolved glob imports in
|
||||
that module. On the other hand, in a function body scope, failure requires that a
|
||||
name be absent from the block we are in, all outer scopes, and the global
|
||||
scope.
|
||||
|
||||
|
|
@@ -27,7 +30,7 @@ scope.
|
|||
|
||||
## Basics
|
||||
|
||||
In our programs we can refer to variables, types, functions, etc, by giving them
|
||||
In our programs we refer to variables, types, functions, etc., by giving them
|
||||
a name. These names are not always unique. For example, take this valid Rust
|
||||
program:
|
||||
|
||||
|
|
@@ -37,38 +40,39 @@ let x: x = 1;
|
|||
let y: x = 2;
|
||||
```
|
||||
|
||||
How do we know on line 3 whether `x` is a type (u32) or a value (1)? These
|
||||
How do we know on line 3 whether `x` is a type (`u32`) or a value (1)? These
|
||||
conflicts are resolved during name resolution. In this specific case, name
|
||||
resolution defines that type names and variable names live in separate
|
||||
namespaces and therefore can co-exist.
|
||||
|
||||
Name resolution in Rust is a two-phase process. In the first phase, which runs
|
||||
during macro expansion, we build a tree of modules and resolve imports. Macro
|
||||
during `macro` expansion, we build a tree of modules and resolve imports. Macro
|
||||
expansion and name resolution communicate with each other via the
|
||||
[`ResolverAstLoweringExt`] trait.
|
||||
|
||||
The input to the second phase is the syntax tree, produced by parsing input
|
||||
files and expanding macros. This phase produces links from all the names in the
|
||||
files and expanding `macros`. This phase produces links from all the names in the
|
||||
source to relevant places where the name was introduced. It also generates
|
||||
helpful error messages, like typo suggestions, traits to import or lints about
|
||||
helpful error messages, like typo suggestions, `trait`s to import or lints about
|
||||
unused items.
|
||||
|
||||
A successful run of the second phase ([`Resolver::resolve_crate`]) creates a kind
|
||||
of an index that the rest of the compilation may use to ask about the names present
|
||||
(through the `hir::lowering::Resolver` interface).
|
||||
|
||||
The name resolution lives in the `rustc_resolve` crate, with the meat in
|
||||
The name resolution lives in the [`rustc_resolve`] crate, with the bulk in
|
||||
`lib.rs` and some helpers or symbol-type specific logic in the other modules.
|
||||
|
||||
[`Resolver::resolve_crate`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/struct.Resolver.html#method.resolve_crate
|
||||
[`ResolverAstLoweringExt`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast_lowering/trait.ResolverAstLoweringExt.html
|
||||
[`rustc_resolve`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/index.html
|
||||
|
||||
## Namespaces
|
||||
|
||||
Different kinds of symbols live in different namespaces ‒ e.g. types don't
|
||||
clash with variables. Such clashes are rare in practice, because variables start with a
|
||||
lower-case letter while types with upper case one, but this is only a
|
||||
convention. This is legal Rust code that'll compile (with warnings):
|
||||
lower-case letter while types start with an upper-case one, but this is only a
|
||||
convention. This is legal Rust code that will compile (with warnings):
|
||||
|
||||
```rust
|
||||
type x = u32;
|
||||
|
|
@@ -81,33 +85,37 @@ namespaces, the resolver keeps them separated and builds separate structures for
|
|||
them.
|
||||
|
||||
In other words, when the code talks about namespaces, it doesn't mean the module
|
||||
hierarchy, it's types vs. values vs. macros.
|
||||
hierarchy, it's types vs. values vs. `macros`.
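
As a rough illustration of "separate structures per namespace", here is a minimal sketch. The `Namespace`, `Names`, and `demo_names` names are hypothetical and are not the types `rustc_resolve` actually uses. The sketch only shows why the same identifier can be defined once per namespace without a clash.

```rust
use std::collections::HashMap;

/// The three namespaces named in the text. (Hypothetical toy enum, not rustc's type.)
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Namespace {
    Type,
    Value,
    Macro,
}

/// One table per namespace, so the same identifier can map to
/// unrelated definitions without clashing.
#[derive(Default)]
struct Names {
    tables: HashMap<Namespace, HashMap<String, String>>,
}

impl Names {
    /// Record that `name` means `what` in namespace `ns`.
    fn define(&mut self, ns: Namespace, name: &str, what: &str) {
        self.tables
            .entry(ns)
            .or_default()
            .insert(name.to_string(), what.to_string());
    }

    /// Look `name` up in namespace `ns` only; other namespaces are ignored.
    fn resolve(&self, ns: Namespace, name: &str) -> Option<&str> {
        self.tables.get(&ns)?.get(name).map(String::as_str)
    }
}

/// Mirrors the `type x = u32; let x: x = 1;` example from this chapter.
fn demo_names() -> Names {
    let mut names = Names::default();
    names.define(Namespace::Type, "x", "type alias for u32");
    names.define(Namespace::Value, "x", "local variable");
    names
}
```

Looking up `x` in the type table and in the value table gives two different answers, and the macro table has no entry for it at all.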
|
||||
|
||||
## Scopes and ribs
|
||||
|
||||
A name is visible only in a certain area of the source code. This forms a
|
||||
hierarchical structure, but not necessarily a simple one ‒ if one scope is
|
||||
part of another, it doesn't mean the name visible in the outer one is also
|
||||
visible in the inner one, or that it refers to the same thing.
|
||||
part of another, it doesn't mean a name visible in the outer scope is also
|
||||
visible in the inner scope, or that it refers to the same thing.
|
||||
|
||||
To cope with that, the compiler introduces the concept of Ribs. This is
|
||||
To cope with that, the compiler introduces the concept of [`Rib`]s. This is
|
||||
an abstraction of a scope. Every time the set of visible names potentially changes,
|
||||
a new rib is pushed onto a stack. The places where this can happen include for
|
||||
a new [`Rib`] is pushed onto a stack. The places where this can happen include, for
|
||||
example:
|
||||
|
||||
[`Rib`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/late/struct.Rib.html
|
||||
|
||||
* The obvious places ‒ curly braces enclosing a block, function boundaries,
|
||||
modules.
|
||||
* Introducing a let binding ‒ this can shadow another binding with the same
|
||||
* Introducing a `let` binding ‒ this can shadow another binding with the same
|
||||
name.
|
||||
* Macro expansion border ‒ to cope with macro hygiene.
|
||||
* Macro expansion border ‒ to cope with `macro` hygiene.
|
||||
|
||||
When searching for a name, the stack of ribs is traversed from the innermost
|
||||
When searching for a name, the stack of [`ribs`] is traversed from the innermost
|
||||
outwards. This helps to find the closest meaning of the name (the one not
|
||||
shadowed by anything else). The transition to outer rib may also affect
|
||||
what names are usable ‒ if there are nested functions (not closures),
|
||||
shadowed by anything else). The transition to an outer [`Rib`] may also affect
|
||||
what names are usable ‒ if there are nested functions (not `closure`s),
|
||||
the inner one can't access parameters and local bindings of the outer one,
|
||||
even though they should be visible by ordinary scoping rules. An example:
|
||||
|
||||
[`ribs`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/late/struct.LateResolutionVisitor.html#structfield.ribs
|
||||
|
||||
```rust
|
||||
fn do_something<T: Default>(val: T) { // <- New rib in both types and values (1)
|
||||
// `val` is accessible, as is the helper function
|
||||
|
|
@@ -126,45 +134,49 @@ fn do_something<T: Default>(val: T) { // <- New rib in both types and values (1)
|
|||
```
|
||||
|
||||
Because the rules for different namespaces are a bit different, each namespace
|
||||
has its own independent rib stack that is constructed in parallel to the others.
|
||||
In addition, there's also a rib stack for local labels (e.g. names of loops or
|
||||
has its own independent [`Rib`] stack that is constructed in parallel to the others.
|
||||
In addition, there's also a [`Rib`] stack for local labels (e.g. names of loops or
|
||||
blocks), which isn't a full namespace in its own right.
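
The innermost-outwards walk can be sketched as follows. This is a toy model under stated assumptions: the `RibKind`, `Rib`, `resolve_local`, and `rib` names here are hypothetical simplifications, not the real `rustc_resolve::late` definitions. The sketch tracks only local bindings in a single namespace, and it treats a nested `fn` boundary as a hard stop for locals.

```rust
use std::collections::HashMap;

/// What kind of boundary a rib represents. (Hypothetical, simplified.)
#[derive(PartialEq)]
enum RibKind {
    /// A block or `let` binding: outer locals stay visible.
    Normal,
    /// A nested `fn` item: outer locals become unusable past this rib.
    FnItem,
}

struct Rib {
    kind: RibKind,
    /// Maps a name to the id of the binding it refers to.
    bindings: HashMap<&'static str, u32>,
}

/// Walk the stack from the innermost rib outwards and return the first match.
fn resolve_local(ribs: &[Rib], name: &str) -> Option<u32> {
    for rib in ribs.iter().rev() {
        if let Some(&id) = rib.bindings.get(name) {
            return Some(id);
        }
        if rib.kind == RibKind::FnItem {
            // Locals of the enclosing function are not reachable from here.
            return None;
        }
    }
    None
}

/// Small constructor to keep examples short.
fn rib(kind: RibKind, bindings: &[(&'static str, u32)]) -> Rib {
    Rib { kind, bindings: bindings.iter().copied().collect() }
}
```

With a stack of `[FnItem {val: 1}, Normal {val: 2}]`, looking up `val` finds binding 2, the shadowing one. With `[FnItem {val: 1}, FnItem {inner: 3}]`, looking up `val` from the inner function finds nothing, which models the nested-function restriction described above.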
|
||||
|
||||
## Overall strategy
|
||||
|
||||
To perform the name resolution of the whole crate, the syntax tree is traversed
|
||||
top-down and every encountered name is resolved. This works for most kinds of
|
||||
names, because at the point of use of a name it is already introduced in the Rib
|
||||
names, because at the point of use of a name it is already introduced in the [`Rib`]
|
||||
hierarchy.
|
||||
|
||||
There are some exceptions to this. Items are a bit tricky, because they can be
|
||||
used before they are encountered ‒ therefore every block first needs to be scanned
|
||||
for items to fill in its Rib.
|
||||
for items to fill in its [`Rib`].
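
A small program makes the difference between items and `let` bindings concrete. The function and variable names are made up for illustration.

```rust
fn forward_reference() -> u32 {
    // `helper` is called here, before its definition appears in the block.
    let n = helper();

    // Items are visible throughout the enclosing block, so the call above resolves.
    fn helper() -> u32 {
        42
    }

    // By contrast, a `let` binding is visible only after its statement:
    // using `m` before the next line would be a resolution error.
    let m = 1;
    n + m
}
```

This compiles even though `helper` is used before it is defined, which is why a purely top-down traversal is not enough for items.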
|
||||
|
||||
Other, even more problematic cases are imports, which need recursive fixed-point
|
||||
resolution and macros, that need to be resolved and expanded before the rest of
|
||||
resolution, and `macros`, which need to be resolved and expanded before the rest of
|
||||
the code can be processed.
|
||||
|
||||
Therefore, the resolution is performed in multiple stages.
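
To give a feel for what "fixed-point resolution" of imports means, here is a deliberately simplified sketch. The `Import` struct and `resolve_imports` function are hypothetical, and the model ignores globs, visibility, namespaces, and macros, all of which the real resolver has to handle. It only shows the iterate-until-no-progress shape of the algorithm.

```rust
use std::collections::HashMap;

/// `use <from_module>::<name>` written inside `in_module`. (Hypothetical toy model.)
struct Import {
    in_module: &'static str,
    from_module: &'static str,
    name: &'static str,
}

/// Repeatedly try every unresolved import until a full pass makes no progress.
/// `defs` maps (module, name) to a definition id and is extended as imports resolve.
/// Returns the imports that could never be resolved.
fn resolve_imports(
    defs: &mut HashMap<(&'static str, &'static str), u32>,
    mut pending: Vec<Import>,
) -> Vec<Import> {
    loop {
        let before = pending.len();
        let mut still_pending = Vec::new();
        for import in pending {
            match defs.get(&(import.from_module, import.name)).copied() {
                // The target is known: the name is now also visible in `in_module`.
                Some(id) => {
                    defs.insert((import.in_module, import.name), id);
                }
                // Not known yet; a later pass may succeed once other imports resolve.
                None => still_pending.push(import),
            }
        }
        pending = still_pending;
        if pending.len() == before {
            return pending; // fixed point reached: nothing changed in this pass
        }
    }
}
```

If module `a` imports `f` from `b`, and `b` in turn imports `f` from `c`, the first pass can only resolve the second import. The next pass then resolves the first one. Whatever is still pending once a pass makes no progress is reported as unresolved.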
|
||||
|
||||
## Speculative crate loading
|
||||
|
||||
To give useful errors, rustc suggests importing paths into scope if they're
|
||||
To give useful errors, `rustc` suggests importing paths into scope if they're
|
||||
not found. How does it do this? It looks through every module of every crate
|
||||
and looks for possible matches. This even includes crates that haven't yet
|
||||
been loaded!
|
||||
|
||||
Loading crates for import suggestions that haven't yet been loaded is called
|
||||
_speculative crate loading_, because any errors it encounters shouldn't be
|
||||
reported: resolve decided to load them, not the user. The function that does
|
||||
this is `lookup_import_candidates` and lives in
|
||||
`rustc_resolve/src/diagnostics.rs`.
|
||||
Eagerly loading crates to include import suggestions that haven't yet been
|
||||
loaded is called _speculative crate loading_, because any errors it encounters
|
||||
shouldn't be reported: [`rustc_resolve`] decided to load them, not the user. The function
|
||||
that does this is [`lookup_import_candidates`] and lives in
|
||||
[`rustc_resolve::diagnostics`].
|
||||
|
||||
[`rustc_resolve`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/index.html
|
||||
[`lookup_import_candidates`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/struct.Resolver.html#method.lookup_import_candidates
|
||||
[`rustc_resolve::diagnostics`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/diagnostics/index.html
|
||||
|
||||
To tell the difference between speculative loads and loads initiated by the
|
||||
user, resolve passes around a `record_used` parameter, which is `false` when
|
||||
user, [`rustc_resolve`] passes around a `record_used` parameter, which is `false` when
|
||||
the load is speculative.
|
||||
|
||||
## TODO: [#16](https://github.com/rust-lang/rustc-dev-guide/issues/16)
|
||||
<!-- ## TODO: [#16](https://github.com/rust-lang/rustc-dev-guide/issues/16)
|
||||
|
||||
This is a result of the first pass of learning the code. It is definitely
|
||||
incomplete and not detailed enough. It also might be inaccurate in places.
|
||||
|
|
@@ -178,4 +190,4 @@ Still, it probably provides useful first guidepost to what happens in there.
|
|||
* The overall strategy description is a bit vague.
|
||||
* Where does the name `Rib` come from?
|
||||
* Does this thing have its own tests, or is it tested only as part of some e2e
|
||||
testing?
|
||||
testing? -->