sprinkle around a bunch of links
parent 6fee71e345
commit 5ab21a1318
@@ -55,15 +55,19 @@ iteration, this represents a compile error. Here is the [algorithm][original]:
 1. Repeat until `queue` is empty (or we make no progress, which is an error):
     0. [Resolve](./name-resolution.md) imports in our partially built crate as
        much as possible.
-    1. Collect as many macro invocations as possible from our partially built
-       crate (fn-like, attributes, derives) and add them to the queue.
+    1. Collect as many macro [`Invocation`s][inv] as possible from our
+       partially built crate (fn-like, attributes, derives) and add them to the
+       queue.
     2. Dequeue the first element, and attempt to resolve it.
     3. If it's resolved:
-        0. Run the macro's expander function that consumes tokens or AST and
-           produces tokens or AST (depending on the macro kind).
+        0. Run the macro's expander function that consumes a [`TokenStream`] or
+           AST and produces a [`TokenStream`] or [`AstFragment`] (depending on
+           the macro kind). (A `TokenStream` is a collection of [`TokenTrees`],
+           each of which are a token (punctuation, identifier, or literal) or a
+           delimited group (anything inside `()`/`[]`/`{}`)).
             - At this point, we know everything about the macro itself and can
-              call `set_expn_data` to fill in its properties in the global data
-              -- that is the hygiene data associated with `ExpnId`. (See [the
+              call `set_expn_data` to fill in its properties in the global data;
+              that is the hygiene data associated with `ExpnId`. (See [the
               "Hygiene" section below][hybelow]).
         1. Integrate that piece of AST into the big existing partially built
            AST. This is essentially where the "token-like mass" becomes a
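The queue-driven loop documented in the hunk above can be sketched as a small standalone program. This is a toy model and not rustc's code: `Invocation` here is a hypothetical struct holding only a macro name, `Expander`, `expand_all`, and `demo_macros` are names invented for illustration, and "expansion" just returns a string plus any newly revealed invocations. It shows the queue discipline only: dequeue, try to resolve, expand if resolved, requeue otherwise, and report an error once a full pass over the queue makes no progress.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical stand-in for a macro invocation: just the macro's name.
#[derive(Debug, Clone, PartialEq)]
pub struct Invocation {
    pub name: String,
}

/// Hypothetical expander: returns an output fragment plus any nested
/// invocations revealed by the expansion.
pub type Expander = fn() -> (String, Vec<Invocation>);

/// Runs the queue until it is empty. Returns `Err` with the unresolved
/// invocations if a whole pass over the queue makes no progress.
pub fn expand_all(
    initial: Vec<Invocation>,
    macros: &HashMap<String, Expander>,
) -> Result<Vec<String>, Vec<Invocation>> {
    let mut queue: VecDeque<Invocation> = initial.into();
    let mut output = Vec::new();
    // Consecutive dequeues that failed to resolve since the last success.
    let mut stalled = 0;

    while let Some(invoc) = queue.pop_front() {
        match macros.get(&invoc.name).copied() {
            Some(expander) => {
                // Resolved: expand, integrate the output, queue nested calls.
                let (fragment, nested) = expander();
                output.push(fragment);
                queue.extend(nested);
                stalled = 0;
            }
            None => {
                // Unresolved: put it back and try the rest of the queue.
                queue.push_back(invoc);
                stalled += 1;
                if stalled >= queue.len() {
                    return Err(queue.into_iter().collect());
                }
            }
        }
    }
    Ok(output)
}

fn outer_expander() -> (String, Vec<Invocation>) {
    // The output of `outer` contains a call to `inner`.
    ("outer!".to_string(), vec![Invocation { name: "inner".to_string() }])
}

fn inner_expander() -> (String, Vec<Invocation>) {
    ("inner!".to_string(), Vec::new())
}

/// A tiny, made-up macro table for trying the loop out.
pub fn demo_macros() -> HashMap<String, Expander> {
    let mut macros: HashMap<String, Expander> = HashMap::new();
    macros.insert("outer".to_string(), outer_expander as Expander);
    macros.insert("inner".to_string(), inner_expander as Expander);
    macros
}
```

In this toy the macro table never changes, so an unresolved name stays unresolved; in the algorithm above, import resolution on each iteration is what can turn a previously stuck invocation into a resolvable one.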
@@ -94,6 +98,10 @@ iteration, this represents a compile error. Here is the [algorithm][original]:
 [`DefCollector`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/def_collector/struct.DefCollector.html
 [`BuildReducedGraphVisitor`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/build_reduced_graph/struct.BuildReducedGraphVisitor.html
 [hybelow]: #hygiene-and-heirarchies
+[`TokenTree`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/enum.TokenTree.html
+[`TokenStream`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/struct.TokenStream.html
+[inv]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.Invocation.html
+[`AstFragment`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/enum.AstFragment.html
 
 If we make no progress in an iteration, then we have reached a compilation
 error (e.g. an undefined macro). We attempt to recover from failures
@@ -110,6 +118,27 @@ macro names in the above algorithm. However, we don't try to resolve other
 names yet. This happens later, as we will see in the [next
 chapter](./name-resolution.md).
 
+Here are some other notable data structures involved in expansion and integration:
+- [`Resolver`] - a trait used to break crate dependencies. This allows the resolver services to be used in [`rustc_ast`], despite [`rustc_resolve`] and pretty much everything else depending on [`rustc_ast`].
+- [`ExtCtxt`]/[`ExpansionData`] - various intermediate data kept and used by expansion
+  infrastructure in the process of its work
+- [`Annotatable`] - a piece of AST that can be an attribute target, almost same
+  thing as AstFragment except for types and patterns that can be produced by
+  macros but cannot be annotated with attributes
+- [`MacResult`] - a "polymorphic" AST fragment, something that can turn into a
+  different `AstFragment` depending on its [`AstFragmentKind`] - item,
+  or expression, or pattern etc.
+
+[`rustc_ast`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/index.html
+[`rustc_resolve`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/index.html
+[`Resolver`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.Resolver.html
+[`ExtCtxt`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.ExtCtxt.html
+[`ExpansionData`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.ExpansionData.html
+[`Annotatable`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/enum.Annotatable.html
+[`MacResult`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.MacResult.html
+[`AstFragmentKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/enum.AstFragmentKind.html
+
+
 ## Hygiene and Heirarchies
 
 If you have ever used C/C++ preprocessor macros, you know that there are some
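The "polymorphic" wording used for `MacResult` in the hunk above may be easier to picture with a toy model. The sketch below is a simplification under stated assumptions, not the compiler's definitions: `ToyMacResult`, `ToyFragment`, `ToyFragmentKind`, `into_fragment`, and the string-based "AST" are all hypothetical names invented here. It shows a single macro result being turned into a different fragment depending on the kind requested at the invocation site.

```rust
/// Hypothetical, simplified analogue of `AstFragmentKind`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ToyFragmentKind {
    Expr,
    Items,
}

/// Hypothetical analogue of `AstFragment`, with strings standing in for AST.
#[derive(Debug, PartialEq)]
pub enum ToyFragment {
    Expr(String),
    Items(Vec<String>),
}

/// Hypothetical analogue of `MacResult`: a result that can be asked to become
/// an expression or a list of items. `None` means "cannot be that kind".
pub trait ToyMacResult {
    fn make_expr(&self) -> Option<String> {
        None
    }
    fn make_items(&self) -> Option<Vec<String>> {
        None
    }
}

/// Picks the conversion that matches the position the macro was invoked in.
pub fn into_fragment(result: &dyn ToyMacResult, kind: ToyFragmentKind) -> Option<ToyFragment> {
    match kind {
        ToyFragmentKind::Expr => result.make_expr().map(ToyFragment::Expr),
        ToyFragmentKind::Items => result.make_items().map(ToyFragment::Items),
    }
}

/// A made-up result that only knows how to be an expression.
pub struct Literal(pub i64);

impl ToyMacResult for Literal {
    fn make_expr(&self) -> Option<String> {
        Some(self.0.to_string())
    }
}
```

Asking `Literal` for an expression succeeds, while asking it for items yields `None`, which is the toy equivalent of a macro being used in a position it cannot expand into.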
@@ -167,6 +196,10 @@ The context is attached to AST nodes. All AST nodes generated by macros have
 context attached. Additionally, there may be other nodes that have context
 attached, such as some desugared syntax (non-macro-expanded nodes are
 considered to just have the "root" context, as described below).
+Throughout the compiler, we use [`Span`s][span] to refer to code locations.
+This struct also has hygiene information attached to it, as we will see later.
+
+[span]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/struct.Span.html
 
 Because macros invocations and definitions can be nested, the syntax context of
 a node must be a heirarchy. For example, if we expand a macro and there is
@@ -184,24 +217,33 @@ an integer ID, assigned continuously starting from 0 as we discover new macro
 calls. All heirarchies start at [`ExpnId::root()`][rootid], which is its own
 parent.
 
-The actual heirarchies are stored in [`HygieneData`][hd], and all of the
-hygiene-related algorithms are implemented in [`rustc_span::hygiene`][hy], with
-the exception of some hacks [`Resolver::resolve_crate_root`][hacks].
+All of the hygiene-related algorithms are implemented in
+[`rustc_span::hygiene`][hy], with the exception of some hacks
+[`Resolver::resolve_crate_root`][hacks].
+
+The actual heirarchies are stored in [`HygieneData`][hd]. This is a global
+piece of data containing hygiene and expansion info that can be accessed from
+any [`Ident`] without any context.
+
 
 [`ExpnId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html
 [rootid]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html#method.root
 [hd]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.HygieneData.html
 [hy]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/index.html
 [hacks]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/struct.Resolver.html#method.resolve_crate_root
+[`Ident`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/symbol/struct.Ident.html
 
 ### The Expansion Order Heirarchy
 
 The first heirarchy tracks the order of expansions, i.e., when a macro
 invocation is in the output of another macro.
 
-Here, the children in the heirarchy will be the "innermost" tokens.
+Here, the children in the heirarchy will be the "innermost" tokens. The
+[`ExpnData`] struct itself contains a subset of properties from both macro
+definition and macro call available through global data.
 [`ExpnData::parent`][edp] tracks the child -> parent link in this heirarchy.
 
+[`ExpnData`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html
 [edp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html#structfield.parent
 
 For example,
@@ -226,11 +268,20 @@ The second heirarchy tracks the order of macro definitions, i.e., when we are
 expanding one macro another macro definition is revealed in its output. This
 one is a bit tricky and more complex than the other two heirarchies.
 
-Here, [`SyntaxContextData::parent`][scdp] is the child -> parent link here.
-[`SyntaxContext`][sc] is the whole chain in this hierarchy, and
-[`SyntaxContextData::outer_expns`][scdoe] are individual elements in the chain.
-The "chaining operator" is [`SyntaxContext::apply_mark`][am] in compiler code.
+[`SyntaxContext`][sc] represents a whole chain in this hierarchy via an ID.
+[`SyntaxContextData`][scd] contains data associated with the given
+`SyntaxContext`; mostly it is a cache for results of filtering that chain in
+different ways. [`SyntaxContextData::parent`][scdp] is the child -> parent
+link here, and [`SyntaxContextData::outer_expns`][scdoe] are individual
+elements in the chain. The "chaining operator" is
+[`SyntaxContext::apply_mark`][am] in compiler code.
+
+A [`Span`][span], mentioned above, is actually just a compact representation of
+a code location and `SyntaxContext`. Likewise, an [`Ident`] is just an interned
+[`Symbol`] + `Span` (i.e. an interned string + hygiene data).
 
+[`Symbol`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/symbol/struct.Symbol.html
+[scd]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html
 [scdp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.parent
 [sc]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html
 [scdoe]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.outer_expn
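The relationships described in the two hygiene hunks above (an expansion hierarchy whose child -> parent links end at `ExpnId::root()`, a `Span` as a location plus a `SyntaxContext`, and an `Ident` as a name plus a `Span`) can be modelled in a few lines. This is a toy with hypothetical names (`ToyHygieneData`, `fresh_expn`, `ancestors`) and plain integers and strings; the real `HygieneData` is global, uses interning, and is considerably more involved.

```rust
/// Toy expansion ID: an integer assigned sequentially, with 0 as the root.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ExpnId(pub u32);

impl ExpnId {
    pub fn root() -> ExpnId {
        ExpnId(0)
    }
}

/// Toy stand-in for the hygiene table: `parents[i]` is the parent of
/// expansion `i`.
pub struct ToyHygieneData {
    parents: Vec<ExpnId>,
}

impl ToyHygieneData {
    pub fn new() -> Self {
        // The root expansion is its own parent.
        ToyHygieneData { parents: vec![ExpnId::root()] }
    }

    /// Registers a new expansion discovered inside `parent` and returns
    /// the next free ID.
    pub fn fresh_expn(&mut self, parent: ExpnId) -> ExpnId {
        let id = ExpnId(self.parents.len() as u32);
        self.parents.push(parent);
        id
    }

    pub fn parent(&self, id: ExpnId) -> ExpnId {
        self.parents[id.0 as usize]
    }

    /// Walks child -> parent links until the root is reached.
    pub fn ancestors(&self, mut id: ExpnId) -> Vec<ExpnId> {
        let mut chain = vec![id];
        while id != ExpnId::root() {
            id = self.parent(id);
            chain.push(id);
        }
        chain
    }
}

/// Toy syntax context: an ID standing for a whole chain of expansions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SyntaxContext(pub u32);

/// Toy span: a code location plus hygiene information.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub lo: u32,
    pub hi: u32,
    pub ctxt: SyntaxContext,
}

/// Toy identifier: a name plus a span. (rustc interns the name as a `Symbol`;
/// a `String` is used here only to keep the sketch self-contained.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Ident {
    pub name: String,
    pub span: Span,
}
```

Registering an expansion under the root and another one under that yields an ancestor chain of three IDs ending at the root, which mirrors "a macro invocation is in the output of another macro".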
@@ -323,6 +374,24 @@ There are two types of macros in Rust:
 Rust parser will set aside the contents of macros and their invocations. Later,
 macros are expanded using these portions of the code.
 
+Some important data structures/interfaces here:
+- [`SyntaxExtension`] - a lowered macro representation, contains its expander
+  function, which transforms a `TokenStream` or AST into another `TokenStream`
+  or AST + some additional data like stability, or a list of unstable features
+  allowed inside the macro.
+- [`SyntaxExtensionKind`] - expander functions may have several different
+  signatures (take one token stream, or two, or a piece of AST, etc). This is
+  an enum that lists them.
+- [`ProcMacro`]/[`TTMacroExpander`]/[`AttrProcMacro`]/[`MultiItemModifier`] -
+  traits representing the expander function signatures.
+
+[`SyntaxExtension`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.SyntaxExtension.html
+[`SyntaxExtensionKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/enum.SyntaxExtensionKind.html
+[`ProcMacro`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.ProcMacro.html
+[`TTMacroExpander`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.TTMacroExpander.html
+[`AttrProcMacro`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.AttrProcMacro.html
+[`MultiItemModifier`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.MultiItemModifier.html
+
 ## Macros By Example
 
 MBEs have their own parser distinct from the normal Rust parser. When macros
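To make "several different signatures" from the hunk above concrete, here is a minimal sketch of an enum over expander shapes. It is an illustration under heavy assumptions: `ToyTokenStream` is just a `Vec<String>`, `ToyExpanderKind` has only two hypothetical variants, and `ToySyntaxExtension`, `duplicate`, and `prepend_args` are invented names. The real `SyntaxExtensionKind` has more variants and stores trait objects rather than plain function pointers.

```rust
/// Toy token stream: a flat list of token strings.
pub type ToyTokenStream = Vec<String>;

/// Hypothetical enum listing two possible expander signatures.
pub enum ToyExpanderKind {
    /// Takes one token stream (a function-like macro).
    Bang(fn(ToyTokenStream) -> ToyTokenStream),
    /// Takes two token streams (attribute arguments and the annotated item).
    Attr(fn(ToyTokenStream, ToyTokenStream) -> ToyTokenStream),
}

/// Hypothetical lowered macro: the expander plus some additional data.
pub struct ToySyntaxExtension {
    pub kind: ToyExpanderKind,
    /// Stand-in for "a list of unstable features allowed inside the macro".
    pub allowed_unstable_features: Vec<String>,
}

impl ToySyntaxExtension {
    /// Calls the expander if the supplied inputs match its signature.
    pub fn expand(
        &self,
        args: ToyTokenStream,
        item: Option<ToyTokenStream>,
    ) -> Option<ToyTokenStream> {
        match (&self.kind, item) {
            (ToyExpanderKind::Bang(f), None) => Some(f(args)),
            (ToyExpanderKind::Attr(f), Some(item)) => Some(f(args, item)),
            // Wrong number of inputs for this kind of macro.
            _ => None,
        }
    }
}

/// Made-up function-like expander: emits its input twice.
pub fn duplicate(input: ToyTokenStream) -> ToyTokenStream {
    let mut out = input.clone();
    out.extend(input);
    out
}

/// Made-up attribute expander: emits the arguments followed by the item.
pub fn prepend_args(args: ToyTokenStream, item: ToyTokenStream) -> ToyTokenStream {
    let mut out = args;
    out.extend(item);
    out
}
```

Keeping the signature inside an enum lets the caller hold every macro in one table and dispatch on the variant at expansion time, which is the role the hunk assigns to `SyntaxExtensionKind`.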
@@ -492,11 +561,10 @@ Custom derives are a special type of proc macro.
 
 TODO: more?
 
-## Notes from petrochenkov discussion
+## Important Modules and Data Structures
 
-TODO: sprinkle these links around the chapter...
+TODO: sprinkle these throughout the chapter as much as possible...
 
 Where to find the code:
 - librustc_span/hygiene.rs - structures related to hygiene and expansion that are kept in global data (can be accessed from any Ident without any context)
 - librustc_span/lib.rs - some secondary methods like macro backtrace using primary methods from hygiene.rs
 - librustc_builtin_macros - implementations of built-in macros (including macro attributes and derives) and some other early code generation facilities like injection of standard library imports or generation of test harness.
@@ -511,23 +579,3 @@ Where to find the code:
 - librustc_ast/ext/tt - implementation of macro_rules, turns macro_rules DSL into something with signature Fn(TokenStream) -> TokenStream that can eat and produce tokens, @mark-i-m knows more about this
 - librustc_resolve/macros.rs - resolving macro paths, validating those resolutions, reporting various "not found"/"found, but it's unstable"/"expected x, found y" errors
 - librustc_middle/hir/map/def_collector.rs + librustc_resolve/build_reduced_graph.rs - integrate an AST fragment freshly expanded from a macro into various parent/child structures like module hierarchy or "definition paths"
-
-Primary structures:
-- HygieneData - global piece of data containing hygiene and expansion info that can be accessed from any Ident without any context
-- ExpnId - ID of a macro call or desugaring (and also expansion of that call/desugaring, depending on context)
-- ExpnInfo/InternalExpnData - a subset of properties from both macro definition and macro call available through global data
-- SyntaxContext - ID of a chain of nested macro definitions (identified by ExpnIds)
-- SyntaxContextData - data associated with the given SyntaxContext, mostly a cache for results of filtering that chain in different ways
-- Span - a code location + SyntaxContext
-- Ident - interned string (Symbol) + Span, i.e. a string with attached hygiene data
-- TokenStream - a collection of TokenTrees
-- TokenTree - a token (punctuation, identifier, or literal) or a delimited group (anything inside ()/[]/{})
-- SyntaxExtension - a lowered macro representation, contains its expander function transforming a tokenstream or AST into tokenstream or AST + some additional data like stability, or a list of unstable features allowed inside the macro.
-- SyntaxExtensionKind - expander functions may have several different signatures (take one token stream, or two, or a piece of AST, etc), this is an enum that lists them
-- ProcMacro/TTMacroExpander/AttrProcMacro/MultiItemModifier - traits representing the expander signatures (TODO: change and rename the signatures into something more consistent)
-- Resolver - a trait used to break crate dependencies (so resolver services can be used in librustc_ast, despite librustc_resolve and pretty much everything else depending on librustc_ast)
-- ExtCtxt/ExpansionData - various intermediate data kept and used by expansion infra in the process of its work
-- AstFragment - a piece of AST that can be produced by a macro (may include multiple homogeneous AST nodes, like e.g. a list of items)
-- Annotatable - a piece of AST that can be an attribute target, almost same thing as AstFragment except for types and patterns that can be produced by macros but cannot be annotated with attributes (TODO: Merge into AstFragment)
-- MacResult - a "polymorphic" AST fragment, something that can turn into a different AstFragment depending on its context (aka AstFragmentKind - item, or expression, or pattern etc.)
-- Invocation/InvocationKind - a structure describing a macro call, these structures are collected by the expansion infra (InvocationCollector), queued, resolved, expanded when resolved, etc.