sprinkle around a bunch of links

mark 2020-04-30 20:49:46 -05:00 committed by Who? Me?!
parent 6fee71e345
commit 5ab21a1318
1 changed files with 85 additions and 37 deletions

@ -55,15 +55,19 @@ iteration, this represents a compile error. Here is the [algorithm][original]:
1. Repeat until `queue` is empty (or we make no progress, which is an error): 1. Repeat until `queue` is empty (or we make no progress, which is an error):
0. [Resolve](./name-resolution.md) imports in our partially built crate as 0. [Resolve](./name-resolution.md) imports in our partially built crate as
much as possible. much as possible.
1. Collect as many macro invocations as possible from our partially built 1. Collect as many macro [`Invocation`s][inv] as possible from our
crate (fn-like, attributes, derives) and add them to the queue. partially built crate (fn-like, attributes, derives) and add them to the
queue.
2. Dequeue the first element, and attempt to resolve it. 2. Dequeue the first element, and attempt to resolve it.
3. If it's resolved: 3. If it's resolved:
0. Run the macro's expander function that consumes tokens or AST and 0. Run the macro's expander function that consumes a [`TokenStream`] or
produces tokens or AST (depending on the macro kind). AST and produces a [`TokenStream`] or [`AstFragment`] (depending on
the macro kind). (A `TokenStream` is a collection of [`TokenTree`s][`TokenTree`],
each of which is a token (punctuation, identifier, or literal) or a
delimited group (anything inside `()`/`[]`/`{}`)).
- At this point, we know everything about the macro itself and can - At this point, we know everything about the macro itself and can
call `set_expn_data` to fill in its properties in the global data call `set_expn_data` to fill in its properties in the global data;
-- that is the hygiene data associated with `ExpnId`. (See [the that is the hygiene data associated with `ExpnId`. (See [the
"Hygiene" section below][hybelow]). "Hygiene" section below][hybelow]).
1. Integrate that piece of AST into the big existing partially built 1. Integrate that piece of AST into the big existing partially built
AST. This is essentially where the "token-like mass" becomes a AST. This is essentially where the "token-like mass" becomes a
@ -94,6 +98,10 @@ iteration, this represents a compile error. Here is the [algorithm][original]:
[`DefCollector`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/def_collector/struct.DefCollector.html [`DefCollector`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/def_collector/struct.DefCollector.html
[`BuildReducedGraphVisitor`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/build_reduced_graph/struct.BuildReducedGraphVisitor.html [`BuildReducedGraphVisitor`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/build_reduced_graph/struct.BuildReducedGraphVisitor.html
[hybelow]: #hygiene-and-heirarchies [hybelow]: #hygiene-and-heirarchies
[`TokenTree`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/enum.TokenTree.html
[`TokenStream`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/struct.TokenStream.html
[inv]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.Invocation.html
[`AstFragment`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/enum.AstFragment.html
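The relationship between a `TokenStream` and its `TokenTree`s can be pictured with a small model. This is only a sketch under simplifying assumptions: the types below are hypothetical stand-ins that reuse the compiler's names, tokens are plain strings, and `count_tokens` is an illustrative helper that does not exist in rustc.

```rust
/// Hypothetical stand-in for the three delimiter kinds: `()`, `[]`, `{}`.
#[derive(Debug)]
enum Delimiter {
    Paren,
    Bracket,
    Brace,
}

/// Simplified model: a tree is either a single token or a delimited group
/// that contains a whole nested stream.
#[derive(Debug)]
enum TokenTree {
    Token(String),
    Delimited(Delimiter, TokenStream),
}

/// Simplified model: a stream is just a sequence of trees.
#[derive(Debug, Default)]
struct TokenStream(Vec<TokenTree>);

/// Illustrative helper (not a rustc function): counts the leaf tokens in a
/// stream, recursing into delimited groups. Delimiters are not counted.
fn count_tokens(stream: &TokenStream) -> usize {
    stream
        .0
        .iter()
        .map(|tree| match tree {
            TokenTree::Token(_) => 1,
            TokenTree::Delimited(_, inner) => count_tokens(inner),
        })
        .sum()
}
```

In this model, `f(a, b)` is a stream of two trees: the token `f` followed by one `Paren`-delimited group holding `a`, `,` and `b`.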
If we make no progress in an iteration, then we have reached a compilation If we make no progress in an iteration, then we have reached a compilation
error (e.g. an undefined macro). We attempt to recover from failures error (e.g. an undefined macro). We attempt to recover from failures
@ -110,6 +118,27 @@ macro names in the above algorithm. However, we don't try to resolve other
names yet. This happens later, as we will see in the [next names yet. This happens later, as we will see in the [next
chapter](./name-resolution.md). chapter](./name-resolution.md).
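The queue-driven loop and its "no progress" error condition can be sketched as a small fixed-point iteration. This is a toy model, not the compiler's implementation: `Invocation` is reduced to a macro name, `Expansion` and `expand_all` are hypothetical, and import resolution and AST integration are left out entirely.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical stand-in for a macro invocation: just the macro's name.
type Invocation = String;

/// Hypothetical expansion output: macros that the output defines, plus
/// further invocations found in the output.
#[derive(Clone, Default)]
struct Expansion {
    defines: Vec<(String, Expansion)>,
    invokes: Vec<Invocation>,
}

/// Expands until the queue is empty. Returns the order in which invocations
/// were expanded, or the invocations that could never be resolved.
fn expand_all(
    mut known: HashMap<String, Expansion>,
    initial: Vec<Invocation>,
) -> Result<Vec<Invocation>, Vec<Invocation>> {
    let mut queue: VecDeque<Invocation> = initial.into();
    let mut expanded = Vec::new();
    // Number of consecutive dequeues that failed to resolve.
    let mut stalled = 0;

    while let Some(inv) = queue.pop_front() {
        match known.get(&inv).cloned() {
            Some(exp) => {
                // Progress: the output may reveal new macros and new calls.
                stalled = 0;
                for (name, def) in exp.defines {
                    known.insert(name, def);
                }
                queue.extend(exp.invokes);
                expanded.push(inv);
            }
            None => {
                // Not resolvable yet; retry after other expansions.
                queue.push_back(inv);
                stalled += 1;
                // Every queued invocation failed since the last progress.
                if stalled == queue.len() {
                    return Err(queue.into_iter().collect());
                }
            }
        }
    }
    Ok(expanded)
}
```

An invocation of a macro that is only defined by another macro's output is retried after that other macro expands, while a name that nothing ever defines ends the loop with an error.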
Here are some other notable data structures involved in expansion and integration:
- [`Resolver`] - a trait used to break crate dependencies. This allows the resolver services to be used in [`rustc_ast`], despite [`rustc_resolve`] and pretty much everything else depending on [`rustc_ast`].
- [`ExtCtxt`]/[`ExpansionData`] - various intermediate data kept and used by expansion
infrastructure in the process of its work
- [`Annotatable`] - a piece of AST that can be an attribute target, almost the
same thing as `AstFragment`, except for types and patterns, which can be
produced by macros but cannot be annotated with attributes
- [`MacResult`] - a "polymorphic" AST fragment, something that can turn into a
different `AstFragment` depending on its [`AstFragmentKind`] - item,
or expression, or pattern etc.
[`rustc_ast`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/index.html
[`rustc_resolve`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/index.html
[`Resolver`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.Resolver.html
[`ExtCtxt`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.ExtCtxt.html
[`ExpansionData`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.ExpansionData.html
[`Annotatable`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/enum.Annotatable.html
[`MacResult`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.MacResult.html
[`AstFragmentKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/enum.AstFragmentKind.html
## Hygiene and Heirarchies ## Hygiene and Heirarchies
If you have ever used C/C++ preprocessor macros, you know that there are some If you have ever used C/C++ preprocessor macros, you know that there are some
@ -167,6 +196,10 @@ The context is attached to AST nodes. All AST nodes generated by macros have
context attached. Additionally, there may be other nodes that have context context attached. Additionally, there may be other nodes that have context
attached, such as some desugared syntax (non-macro-expanded nodes are attached, such as some desugared syntax (non-macro-expanded nodes are
considered to just have the "root" context, as described below). considered to just have the "root" context, as described below).
Throughout the compiler, we use [`Span`s][span] to refer to code locations.
This struct also has hygiene information attached to it, as we will see later.
[span]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/struct.Span.html
Because macro invocations and definitions can be nested, the syntax context of Because macro invocations and definitions can be nested, the syntax context of
a node must be a heirarchy. For example, if we expand a macro and there is a node must be a heirarchy. For example, if we expand a macro and there is
@ -184,24 +217,33 @@ an integer ID, assigned continuously starting from 0 as we discover new macro
calls. All heirarchies start at [`ExpnId::root()`][rootid], which is its own calls. All heirarchies start at [`ExpnId::root()`][rootid], which is its own
parent. parent.
The actual heirarchies are stored in [`HygieneData`][hd], and all of the All of the hygiene-related algorithms are implemented in
hygiene-related algorithms are implemented in [`rustc_span::hygiene`][hy], with [`rustc_span::hygiene`][hy], with the exception of some hacks
the exception of some hacks [`Resolver::resolve_crate_root`][hacks]. [`Resolver::resolve_crate_root`][hacks].
The actual heirarchies are stored in [`HygieneData`][hd]. This is a global
piece of data containing hygiene and expansion info that can be accessed from
any [`Ident`] without any context.
[`ExpnId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html [`ExpnId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html
[rootid]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html#method.root [rootid]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html#method.root
[hd]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.HygieneData.html [hd]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.HygieneData.html
[hy]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/index.html [hy]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/index.html
[hacks]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/struct.Resolver.html#method.resolve_crate_root [hacks]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/struct.Resolver.html#method.resolve_crate_root
[`Ident`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/symbol/struct.Ident.html
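As a rough illustration of IDs handed out from 0 and a root that is its own parent, here is a minimal sketch. The types reuse the compiler's names but are hypothetical simplifications: the real `HygieneData` holds much more, and `fresh_expn`, `parent` and `ancestors` as written here are assumptions for illustration only.

```rust
/// Simplified expansion ID: an index into the table below.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ExpnId(u32);

impl ExpnId {
    /// The root expansion always has ID 0.
    fn root() -> ExpnId {
        ExpnId(0)
    }
}

/// Simplified per-expansion data: only the child -> parent link.
struct ExpnData {
    parent: ExpnId,
}

/// Simplified stand-in for the global table of expansion data.
struct HygieneData {
    expn_data: Vec<ExpnData>,
}

impl HygieneData {
    /// Starts with just the root, which is its own parent.
    fn new() -> Self {
        HygieneData {
            expn_data: vec![ExpnData { parent: ExpnId::root() }],
        }
    }

    /// Hypothetical helper: allocates the next ID for a new expansion.
    fn fresh_expn(&mut self, parent: ExpnId) -> ExpnId {
        let id = ExpnId(self.expn_data.len() as u32);
        self.expn_data.push(ExpnData { parent });
        id
    }

    fn parent(&self, id: ExpnId) -> ExpnId {
        self.expn_data[id.0 as usize].parent
    }

    /// Hypothetical helper: walks parent links from `id` up to the root.
    fn ancestors(&self, mut id: ExpnId) -> Vec<ExpnId> {
        let mut chain = vec![id];
        while id != ExpnId::root() {
            id = self.parent(id);
            chain.push(id);
        }
        chain
    }
}
```

Because every ID is just an index into one table, any piece of code holding an `ExpnId` can look up its ancestry without extra context, which is the property the text attributes to the global data.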
### The Expansion Order Heirarchy ### The Expansion Order Heirarchy
The first heirarchy tracks the order of expansions, i.e., when a macro The first heirarchy tracks the order of expansions, i.e., when a macro
invocation is in the output of another macro. invocation is in the output of another macro.
Here, the children in the heirarchy will be the "innermost" tokens. Here, the children in the heirarchy will be the "innermost" tokens. The
[`ExpnData`] struct itself contains a subset of properties from both macro
definition and macro call available through global data.
[`ExpnData::parent`][edp] tracks the child -> parent link in this heirarchy. [`ExpnData::parent`][edp] tracks the child -> parent link in this heirarchy.
[`ExpnData`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html
[edp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html#structfield.parent [edp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html#structfield.parent
For example, For example,
@ -226,11 +268,20 @@ The second heirarchy tracks the order of macro definitions, i.e., when we are
expanding one macro another macro definition is revealed in its output. This expanding one macro another macro definition is revealed in its output. This
one is a bit tricky and more complex than the other two heirarchies. one is a bit tricky and more complex than the other two heirarchies.
Here, [`SyntaxContextData::parent`][scdp] is the child -> parent link here. [`SyntaxContext`][sc] represents a whole chain in this hierarchy via an ID.
[`SyntaxContext`][sc] is the whole chain in this hierarchy, and [`SyntaxContextData`][scd] contains data associated with the given
[`SyntaxContextData::outer_expns`][scdoe] are individual elements in the chain. `SyntaxContext`; mostly it is a cache for results of filtering that chain in
The "chaining operator" is [`SyntaxContext::apply_mark`][am] in compiler code. different ways. [`SyntaxContextData::parent`][scdp] is the child -> parent
link here, and [`SyntaxContextData::outer_expns`][scdoe] are individual
elements in the chain. The "chaining operator" is
[`SyntaxContext::apply_mark`][am] in compiler code.
A [`Span`][span], mentioned above, is actually just a compact representation of
a code location and `SyntaxContext`. Likewise, an [`Ident`] is just an interned
[`Symbol`] + `Span` (i.e. an interned string + hygiene data).
[`Symbol`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/symbol/struct.Symbol.html
[scd]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html
[scdp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.parent [scdp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.parent
[sc]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html [sc]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html
[scdoe]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.outer_expn [scdoe]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.outer_expn
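The idea of a `SyntaxContext` as an ID for a whole chain, extended one element at a time by a "chaining operator", can be sketched as follows. Everything here is a simplified, hypothetical model: the `Contexts` table, its reuse of existing entries, and the `chain` helper are assumptions for illustration, and the real `apply_mark` takes more into account than this sketch does.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ExpnId(u32);

/// ID of a whole chain of expansions; 0 is the root (empty) chain.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SyntaxContext(u32);

/// Stand-in for an interned string.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Symbol(u32);

/// One link of a chain: the newest element plus the rest of the chain.
struct SyntaxContextData {
    outer_expn: ExpnId,
    parent: SyntaxContext,
}

/// Hypothetical table that maps each `SyntaxContext` to its data.
struct Contexts {
    data: Vec<SyntaxContextData>,
}

impl Contexts {
    fn new() -> Self {
        // Entry 0 is the root context, which is its own parent.
        let root = SyntaxContextData { outer_expn: ExpnId(0), parent: SyntaxContext(0) };
        Contexts { data: vec![root] }
    }

    /// Simplified "chaining operator": extends `ctxt` by one expansion,
    /// reusing an existing entry when the same chain was built before.
    fn apply_mark(&mut self, ctxt: SyntaxContext, expn: ExpnId) -> SyntaxContext {
        let existing = self
            .data
            .iter()
            .enumerate()
            .skip(1) // never match the root entry
            .find(|(_, d)| d.parent == ctxt && d.outer_expn == expn)
            .map(|(i, _)| i);
        if let Some(i) = existing {
            return SyntaxContext(i as u32);
        }
        self.data.push(SyntaxContextData { outer_expn: expn, parent: ctxt });
        SyntaxContext((self.data.len() - 1) as u32)
    }

    /// Hypothetical helper: lists a chain's elements, newest first.
    fn chain(&self, mut ctxt: SyntaxContext) -> Vec<ExpnId> {
        let mut out = Vec::new();
        while ctxt != SyntaxContext(0) {
            let d = &self.data[ctxt.0 as usize];
            out.push(d.outer_expn);
            ctxt = d.parent;
        }
        out
    }
}

/// A code location plus a syntax context.
struct Span {
    lo: u32,
    hi: u32,
    ctxt: SyntaxContext,
}

/// An interned string plus a span, i.e. a name with hygiene data attached.
struct Ident {
    name: Symbol,
    span: Span,
}
```

Two identifiers with the same `Symbol` but different `span.ctxt` values are therefore distinguishable, which is what hygiene relies on.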
@ -323,6 +374,24 @@ There are two types of macros in Rust:
Rust parser will set aside the contents of macros and their invocations. Later, Rust parser will set aside the contents of macros and their invocations. Later,
macros are expanded using these portions of the code. macros are expanded using these portions of the code.
Some important data structures/interfaces here:
- [`SyntaxExtension`] - a lowered macro representation, contains its expander
function, which transforms a `TokenStream` or AST into another `TokenStream`
or AST + some additional data like stability, or a list of unstable features
allowed inside the macro.
- [`SyntaxExtensionKind`] - expander functions may have several different
signatures (take one token stream, or two, or a piece of AST, etc). This is
an enum that lists them.
- [`ProcMacro`]/[`TTMacroExpander`]/[`AttrProcMacro`]/[`MultiItemModifier`] -
traits representing the expander function signatures.
[`SyntaxExtension`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.SyntaxExtension.html
[`SyntaxExtensionKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/enum.SyntaxExtensionKind.html
[`ProcMacro`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.ProcMacro.html
[`TTMacroExpander`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.TTMacroExpander.html
[`AttrProcMacro`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.AttrProcMacro.html
[`MultiItemModifier`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.MultiItemModifier.html
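To make "several different signatures" concrete, here is a minimal sketch of an enum over expander functions. It is a hypothetical simplification: `TokenStream` is reduced to a list of strings, only two signatures are modeled, and `ExpanderKind`, `Extension` and `run` are illustrative names rather than the compiler's own definitions.

```rust
/// Hypothetical, heavily simplified token stream: a list of token strings.
type TokenStream = Vec<String>;

/// Simplified enum in the spirit of `SyntaxExtensionKind`: each variant
/// wraps an expander with a different signature.
enum ExpanderKind {
    /// Function-like macro: one token stream in, one out.
    Bang(Box<dyn Fn(TokenStream) -> TokenStream>),
    /// Attribute macro: attribute arguments plus the annotated item.
    Attr(Box<dyn Fn(TokenStream, TokenStream) -> TokenStream>),
}

/// Simplified lowered macro: the expander plus some additional data.
struct Extension {
    kind: ExpanderKind,
    /// Unstable features allowed inside the macro (unused in this sketch).
    allow_internal_unstable: Vec<String>,
}

/// Illustrative dispatcher: calls the expander with the arguments its
/// signature expects.
fn run(ext: &Extension, args: TokenStream, input: TokenStream) -> TokenStream {
    match &ext.kind {
        ExpanderKind::Bang(f) => f(input),
        ExpanderKind::Attr(f) => f(args, input),
    }
}
```

Keeping the signature in an enum lets the caller match once and pass each expander exactly the inputs it needs, rather than forcing every macro kind through one lowest-common-denominator function type.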
## Macros By Example ## Macros By Example
MBEs have their own parser distinct from the normal Rust parser. When macros MBEs have their own parser distinct from the normal Rust parser. When macros
@ -492,11 +561,10 @@ Custom derives are a special type of proc macro.
TODO: more? TODO: more?
## Notes from petrochenkov discussion ## Important Modules and Data Structures
TODO: sprinkle these links around the chapter... TODO: sprinkle these throughout the chapter as much as possible...
Where to find the code:
- librustc_span/hygiene.rs - structures related to hygiene and expansion that are kept in global data (can be accessed from any Ident without any context) - librustc_span/hygiene.rs - structures related to hygiene and expansion that are kept in global data (can be accessed from any Ident without any context)
- librustc_span/lib.rs - some secondary methods like macro backtrace using primary methods from hygiene.rs - librustc_span/lib.rs - some secondary methods like macro backtrace using primary methods from hygiene.rs
- librustc_builtin_macros - implementations of built-in macros (including macro attributes and derives) and some other early code generation facilities like injection of standard library imports or generation of test harness. - librustc_builtin_macros - implementations of built-in macros (including macro attributes and derives) and some other early code generation facilities like injection of standard library imports or generation of test harness.
@ -511,23 +579,3 @@ Where to find the code:
- librustc_ast/ext/tt - implementation of macro_rules, turns macro_rules DSL into something with signature Fn(TokenStream) -> TokenStream that can eat and produce tokens, @mark-i-m knows more about this - librustc_ast/ext/tt - implementation of macro_rules, turns macro_rules DSL into something with signature Fn(TokenStream) -> TokenStream that can eat and produce tokens, @mark-i-m knows more about this
- librustc_resolve/macros.rs - resolving macro paths, validating those resolutions, reporting various "not found"/"found, but it's unstable"/"expected x, found y" errors - librustc_resolve/macros.rs - resolving macro paths, validating those resolutions, reporting various "not found"/"found, but it's unstable"/"expected x, found y" errors
- librustc_middle/hir/map/def_collector.rs + librustc_resolve/build_reduced_graph.rs - integrate an AST fragment freshly expanded from a macro into various parent/child structures like module hierarchy or "definition paths" - librustc_middle/hir/map/def_collector.rs + librustc_resolve/build_reduced_graph.rs - integrate an AST fragment freshly expanded from a macro into various parent/child structures like module hierarchy or "definition paths"
Primary structures:
- HygieneData - global piece of data containing hygiene and expansion info that can be accessed from any Ident without any context
- ExpnId - ID of a macro call or desugaring (and also expansion of that call/desugaring, depending on context)
- ExpnInfo/InternalExpnData - a subset of properties from both macro definition and macro call available through global data
- SyntaxContext - ID of a chain of nested macro definitions (identified by ExpnIds)
- SyntaxContextData - data associated with the given SyntaxContext, mostly a cache for results of filtering that chain in different ways
- Span - a code location + SyntaxContext
- Ident - interned string (Symbol) + Span, i.e. a string with attached hygiene data
- TokenStream - a collection of TokenTrees
- TokenTree - a token (punctuation, identifier, or literal) or a delimited group (anything inside ()/[]/{})
- SyntaxExtension - a lowered macro representation, contains its expander function transforming a tokenstream or AST into tokenstream or AST + some additional data like stability, or a list of unstable features allowed inside the macro.
- SyntaxExtensionKind - expander functions may have several different signatures (take one token stream, or two, or a piece of AST, etc), this is an enum that lists them
- ProcMacro/TTMacroExpander/AttrProcMacro/MultiItemModifier - traits representing the expander signatures (TODO: change and rename the signatures into something more consistent)
- Resolver - a trait used to break crate dependencies (so resolver services can be used in librustc_ast, despite librustc_resolve and pretty much everything else depending on librustc_ast)
- ExtCtxt/ExpansionData - various intermediate data kept and used by expansion infra in the process of its work
- AstFragment - a piece of AST that can be produced by a macro (may include multiple homogeneous AST nodes, like e.g. a list of items)
- Annotatable - a piece of AST that can be an attribute target, almost same thing as AstFragment except for types and patterns that can be produced by macros but cannot be annotated with attributes (TODO: Merge into AstFragment)
- MacResult - a "polymorphic" AST fragment, something that can turn into a different AstFragment depending on its context (aka AstFragmentKind - item, or expression, or pattern etc.)
- Invocation/InvocationKind - a structure describing a macro call, these structures are collected by the expansion infra (InvocationCollector), queued, resolved, expanded when resolved, etc.