Update macro-expansion.md
parent
6977f206f5
commit
2edd9e08d4
|
|
@ -2,25 +2,29 @@
|
||||||
|
|
||||||
<!-- toc -->
|
<!-- toc -->
|
||||||
|
|
||||||
> `rustc_ast`, `rustc_expand`, and `rustc_builtin_macros` are all undergoing
|
> N.B. [`rustc_ast`], [`rustc_expand`], and [`rustc_builtin_macros`] are all
|
||||||
> refactoring, so some of the links in this chapter may be broken.
|
> undergoing refactoring, so some of the links in this chapter may be broken.
|
||||||
|
|
||||||
Rust has a very powerful macro system. In the previous chapter, we saw how the
|
Rust has a very powerful `macro` system. In the previous chapter, we saw how
|
||||||
parser sets aside macros to be expanded (it temporarily uses [placeholders]).
|
the parser sets aside `macro`s to be expanded (using temporary [placeholders]).
|
||||||
This chapter is about the process of expanding those macros iteratively until
|
This chapter is about the process of expanding those `macro`s iteratively until
|
||||||
we have a complete AST for our crate with no unexpanded macros (or a compile
|
we have a complete [*Abstract Syntax Tree* (`AST`)][ast] for our crate with no
|
||||||
error).
|
unexpanded `macro`s (or a compile error).
|
||||||
|
|
||||||
|
[ast]: https://en.wikipedia.org/wiki/Abstract_syntax_tree
|
||||||
|
[`rustc_ast`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/index.html
|
||||||
|
[`rustc_expand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/index.html
|
||||||
|
[`rustc_builtin_macros`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_builtin_macros/index.html
|
||||||
[placeholders]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/placeholders/index.html
|
[placeholders]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/placeholders/index.html
|
||||||
|
|
||||||
First, we will discuss the algorithm that expands and integrates macro output
|
First, we discuss the algorithm that expands and integrates `macro` output into
|
||||||
into ASTs. Next, we will take a look at how hygiene data is collected. Finally,
|
`AST`s. Next, we take a look at how hygiene data is collected. Finally, we look
|
||||||
we will look at the specifics of expanding different types of macros.
|
at the specifics of expanding different types of `macro`s.
|
||||||
|
|
||||||
Many of the algorithms and data structures described below are in [`rustc_expand`],
|
Many of the algorithms and data structures described below are in [`rustc_expand`],
|
||||||
with basic data structures in [`rustc_expand::base`][base].
|
with fundamental data structures in [`rustc_expand::base`][base].
|
||||||
|
|
||||||
Also of note, `cfg` and `cfg_attr` are treated differently from other macros, and are
|
Also of note, `cfg` and `cfg_attr` are treated differently from other `macro`s, and are
|
||||||
handled in [`rustc_expand::config`][cfg].
|
handled in [`rustc_expand::config`][cfg].
|
||||||
|
|
||||||
[`rustc_expand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/index.html
|
[`rustc_expand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/index.html
|
||||||
|
|
@ -29,108 +33,112 @@ handled in [`rustc_expand::config`][cfg].
|
||||||
|
|
||||||
## Expansion and AST Integration
|
## Expansion and AST Integration
|
||||||
|
|
||||||
First of all, expansion happens at the crate level. Given the raw source code for
|
Firstly, expansion happens at the crate level. Given the raw source code for
|
||||||
a crate, the compiler will produce a massive AST with all macros expanded, all
|
a crate, the compiler will produce a massive `AST` with all `macro`s expanded, all
|
||||||
modules inlined, etc. The primary entry point for this process is the
|
modules inlined, etc. The primary entry point for this process is the
|
||||||
[`MacroExpander::fully_expand_fragment`][fef] method. With few exceptions, we
|
[`MacroExpander::fully_expand_fragment()`][fef] method. With few exceptions, we
|
||||||
use this method on the whole crate (see ["Eager Expansion"](#eager-expansion)
|
use this method on the whole crate (see ["Eager Expansion"](#eager-expansion)
|
||||||
below for more detailed discussion of edge case expansion issues).
|
below for more detailed discussion of edge case expansion issues).
|
||||||
|
|
||||||
[`rustc_builtin_macros`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_builtin_macros/index.html
|
[`rustc_builtin_macros`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_builtin_macros/index.html
|
||||||
[reb]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/build/index.html
|
[reb]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/build/index.html
|
||||||
|
|
||||||
At a high level, [`fully_expand_fragment`][fef] works in iterations. We keep a
|
At a high level, [`fully_expand_fragment()`][fef] works in iterations. We keep a
|
||||||
queue of unresolved macro invocations (that is, macros we haven't found the
|
queue of unresolved `macro` invocations (i.e. `macro`s we haven't found the
|
||||||
definition of yet). We repeatedly try to pick a macro from the queue, resolve
|
definition of yet). We repeatedly try to pick a `macro` from the queue, resolve
|
||||||
it, expand it, and integrate it back. If we can't make progress in an
|
it, expand it, and integrate it back. If we can't make progress in an
|
||||||
iteration, this represents a compile error. Here is the [algorithm][original]:
|
iteration, this represents a compile error. Here is the [algorithm][original]:
|
||||||
|
|
||||||
[fef]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.MacroExpander.html#method.fully_expand_fragment
|
[fef]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.MacroExpander.html#method.fully_expand_fragment
|
||||||
[original]: https://github.com/rust-lang/rust/pull/53778#issuecomment-419224049
|
[original]: https://github.com/rust-lang/rust/pull/53778#issuecomment-419224049
|
||||||
|
|
||||||
1. Initialize a `queue` of unresolved macros.
|
1. Initialize a `queue` of unresolved `macro`s.
|
||||||
2. Repeat until `queue` is empty (or we make no progress, which is an error):
|
2. Repeat until `queue` is empty (or we make no progress, which is an error):
|
||||||
1. [Resolve](./name-resolution.md) imports in our partially built crate as
|
1. [Resolve](./name-resolution.md) imports in our partially built crate as
|
||||||
much as possible.
|
much as possible.
|
||||||
2. Collect as many macro [`Invocation`s][inv] as possible from our
|
2. Collect as many `macro` [`Invocation`s][inv] as possible from our
|
||||||
partially built crate (fn-like, attributes, derives) and add them to the
|
partially built crate (`fn`-like, attributes, derives) and add them to the
|
||||||
queue.
|
queue.
|
||||||
3. Dequeue the first element, and attempt to resolve it.
|
3. Dequeue the first element and attempt to resolve it.
|
||||||
4. If it's resolved:
|
4. If it's resolved:
|
||||||
1. Run the macro's expander function that consumes a [`TokenStream`] or
|
1. Run the `macro`'s expander function that consumes a [`TokenStream`] or
|
||||||
AST and produces a [`TokenStream`] or [`AstFragment`] (depending on
|
`AST` and produces a [`TokenStream`] or [`AstFragment`] (depending on
|
||||||
the macro kind). (A `TokenStream` is a collection of [`TokenTree`s][tt],
|
the `macro` kind). (A [`TokenStream`] is a collection of [`TokenTree`s][tt],
|
||||||
each of which is a token (punctuation, identifier, or literal) or a
|
each of which is a token (punctuation, identifier, or literal) or a
|
||||||
delimited group (anything inside `()`/`[]`/`{}`)).
|
delimited group (anything inside `()`/`[]`/`{}`)).
|
||||||
- At this point, we know everything about the macro itself and can
|
- At this point, we know everything about the `macro` itself and can
|
||||||
call `set_expn_data` to fill in its properties in the global data;
|
call [`set_expn_data()`] to fill in its properties in the global
|
||||||
that is the hygiene data associated with `ExpnId`. (See [the
|
data; that is the [hygiene] data associated with [`ExpnId`] (see
|
||||||
"Hygiene" section below][hybelow]).
|
[Hygiene][hybelow] below).
|
||||||
2. Integrate that piece of AST into the big existing partially built
|
2. Integrate that piece of `AST` into the existing, though still
|
||||||
AST. This is essentially where the "token-like mass" becomes a
|
partially-built `AST`. This is essentially where the "token-like mass"
|
||||||
proper set-in-stone AST with side-tables. It happens as follows:
|
becomes a proper set-in-stone `AST` with side-tables. It happens as
|
||||||
- If the macro produces tokens (e.g. a proc macro), we parse into
|
follows:
|
||||||
an AST, which may produce parse errors.
|
- If the `macro` produces tokens (e.g. a `proc macro`), we parse into
|
||||||
- During expansion, we create `SyntaxContext`s (hierarchy 2). (See
|
an `AST`, which may produce parse errors.
|
||||||
[the "Hygiene" section below][hybelow])
|
- During expansion, we create [`SyntaxContext`]s (hierarchy 2) (see
|
||||||
- These three passes happen one after another on every AST fragment
|
[Hygiene][hybelow] below).
|
||||||
freshly expanded from a macro:
|
- These three passes happen one after another on every `AST` fragment
|
||||||
|
freshly expanded from a `macro`:
|
||||||
- [`NodeId`]s are assigned by [`InvocationCollector`]. This
|
- [`NodeId`]s are assigned by [`InvocationCollector`]. This
|
||||||
also collects new macro calls from this new AST piece and
|
also collects new `macro` calls from this new `AST` piece and
|
||||||
adds them to the queue.
|
adds them to the queue.
|
||||||
- ["Def paths"][defpath] are created and [`DefId`]s are
|
- ["Def paths"][defpath] are created and [`DefId`]s are
|
||||||
assigned to them by [`DefCollector`].
|
assigned to them by [`DefCollector`].
|
||||||
- Names are put into modules (from the resolver's point of
|
- Names are put into modules (from the resolver's point of
|
||||||
view) by [`BuildReducedGraphVisitor`].
|
view) by [`BuildReducedGraphVisitor`].
|
||||||
3. After expanding a single macro and integrating its output, continue
|
3. After expanding a single `macro` and integrating its output, continue
|
||||||
to the next iteration of [`fully_expand_fragment`][fef].
|
to the next iteration of [`fully_expand_fragment()`][fef].
|
||||||
5. If it's not resolved:
|
5. If it's not resolved:
|
||||||
1. Put the macro back in the queue
|
1. Put the `macro` back in the queue.
|
||||||
2. Continue to next iteration...
|
2. Continue to next iteration...
|
||||||
|
|
||||||
[defpath]: hir.md#identifiers-in-the-hir
|
|
||||||
[`NodeId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/node_id/struct.NodeId.html
|
|
||||||
[`InvocationCollector`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.InvocationCollector.html
|
|
||||||
[`DefId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/def_id/struct.DefId.html
|
|
||||||
[`DefCollector`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/def_collector/struct.DefCollector.html
|
|
||||||
[`BuildReducedGraphVisitor`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/build_reduced_graph/struct.BuildReducedGraphVisitor.html
|
|
||||||
[hybelow]: #hygiene-and-hierarchies
|
|
||||||
[tt]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/enum.TokenTree.html
|
|
||||||
[`TokenStream`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/struct.TokenStream.html
|
|
||||||
[inv]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.Invocation.html
|
|
||||||
[`AstFragment`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/enum.AstFragment.html
|
[`AstFragment`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/enum.AstFragment.html
|
||||||
|
[`BuildReducedGraphVisitor`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/build_reduced_graph/struct.BuildReducedGraphVisitor.html
|
||||||
|
[`DefCollector`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/def_collector/struct.DefCollector.html
|
||||||
|
[`DefId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/def_id/struct.DefId.html
|
||||||
|
[`ExpnId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html
|
||||||
|
[`InvocationCollector`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.InvocationCollector.html
|
||||||
|
[`NodeId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/node_id/struct.NodeId.html
|
||||||
|
[`set_expn_data()`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.LocalExpnId.html#method.set_expn_data
|
||||||
|
[`SyntaxContext`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html
|
||||||
|
[`TokenStream`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/struct.TokenStream.html
|
||||||
|
[defpath]: hir.md#identifiers-in-the-hir
|
||||||
|
[hybelow]: #hygiene-and-hierarchies
|
||||||
|
[hygiene]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/index.html
|
||||||
|
[inv]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.Invocation.html
|
||||||
|
[tt]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/enum.TokenTree.html
|
||||||
|
|
||||||
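The queue-driven loop described above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not how `rustc_expand` is implemented: macros are plain strings, and `Expansion`, `fully_expand`, and the `defs` table are hypothetical names introduced here. "Making no progress" is approximated by cycling through the whole queue without resolving anything.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical stand-in for the output of one macro expansion.
#[derive(Clone, Default)]
struct Expansion {
    /// Macro invocations found in the expanded output.
    invocations: Vec<String>,
    /// Macros that only become resolvable once this output is integrated.
    defines: Vec<String>,
}

/// Returns the order in which macros were expanded, or the invocations
/// that could never be resolved.
fn fully_expand(
    initial: &[&str],
    mut defs: HashMap<String, Expansion>,
) -> Result<Vec<String>, Vec<String>> {
    let mut queue: VecDeque<String> = initial.iter().map(|s| s.to_string()).collect();
    let mut order = Vec::new();
    // Number of consecutive dequeues that failed to resolve.
    let mut stalled = 0;

    while let Some(name) = queue.pop_front() {
        match defs.get(&name).cloned() {
            Some(expansion) => {
                // Resolved: "expand" it and integrate the output.
                stalled = 0;
                order.push(name);
                for defined in expansion.defines {
                    defs.entry(defined).or_default();
                }
                // Newly discovered invocations join the queue.
                queue.extend(expansion.invocations);
            }
            None => {
                // Not resolved yet: put it back and try the others first.
                queue.push_back(name);
                stalled += 1;
                if stalled >= queue.len() {
                    // Every queued invocation failed in a row: no progress.
                    return Err(queue.into_iter().collect());
                }
            }
        }
    }
    Ok(order)
}
```

In this sketch, an invocation whose definition is produced by a later expansion is simply retried, while an invocation that never resolves ends the loop with an error. A macro that keeps invoking itself would loop forever here; the sketch does not model any recursion limit.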
### Error Recovery
|
### Error Recovery
|
||||||
|
|
||||||
If we make no progress in an iteration, then we have reached a compilation
|
If we make no progress in an iteration, we have reached a compilation error
|
||||||
error (e.g. an undefined macro). We attempt to recover from failures
|
(e.g. an undefined `macro`). We attempt to recover from failures (i.e.
|
||||||
(unresolved macros or imports) for the sake of diagnostics. This allows
|
unresolved `macro`s or imports) with the intent of generating diagnostics.
|
||||||
compilation to continue past the first error, so that we can report more errors
|
Failure recovery happens by expanding unresolved `macro`s into
|
||||||
at a time. Recovery can't cause compilation to succeed. We know that it will
|
[`ExprKind::Err`][err] and allows compilation to continue past the first error
|
||||||
fail at this point. The recovery happens by expanding unresolved macros into
|
so that `rustc` can report more errors than just the original failure.
|
||||||
[`ExprKind::Err`][err].
|
|
||||||
|
|
||||||
[err]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/ast/enum.ExprKind.html#variant.Err
|
[err]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/ast/enum.ExprKind.html#variant.Err
|
||||||
|
|
||||||
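The recovery strategy can be illustrated with a small model. The `Expr` enum and the `recover` function below are hypothetical and only mimic the idea of replacing an unresolved invocation with an error node (the role `ExprKind::Err` plays in the text), so that later processing can continue and more than one diagnostic can be collected.

```rust
/// Hypothetical, heavily simplified expression type.
#[derive(Debug, PartialEq)]
enum Expr {
    Lit(i64),
    MacCall(String),
    /// Placeholder left behind by a failed expansion.
    Err,
}

/// Replaces every call to an unknown macro with `Expr::Err` and records
/// one diagnostic per failure instead of stopping at the first one.
fn recover(exprs: Vec<Expr>, known: &[&str]) -> (Vec<Expr>, Vec<String>) {
    let mut diagnostics = Vec::new();
    let recovered = exprs
        .into_iter()
        .map(|expr| match expr {
            Expr::MacCall(name) if !known.contains(&name.as_str()) => {
                diagnostics.push(format!("cannot find macro `{name}`"));
                Expr::Err
            }
            other => other,
        })
        .collect();
    (recovered, diagnostics)
}
```

The result still contains error nodes, which reflects the point made above: recovery exists to gather diagnostics, and compilation is already known to fail.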
### Name Resolution
|
### Name Resolution
|
||||||
|
|
||||||
Notice that name resolution is involved here: we need to resolve imports and
|
Notice that name resolution is involved here: we need to resolve imports and
|
||||||
macro names in the above algorithm. This is done in
|
`macro` names in the above algorithm. This is done in
|
||||||
[`rustc_resolve::macros`][mresolve], which resolves macro paths, validates
|
[`rustc_resolve::macros`][mresolve], which resolves `macro` paths, validates
|
||||||
those resolutions, and reports various errors (e.g. "not found" or "found, but
|
those resolutions, and reports various errors (e.g. "not found", "found, but
|
||||||
it's unstable" or "expected x, found y"). However, we don't try to resolve
|
it's unstable", "expected x, found y"). However, we don't try to resolve
|
||||||
other names yet. This happens later, as we will see in the [next
|
other names yet. This happens later, as we will see in the chapter: [Name
|
||||||
chapter](./name-resolution.md).
|
Resolution](./name-resolution.md).
|
||||||
|
|
||||||
[mresolve]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/macros/index.html
|
[mresolve]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/macros/index.html
|
||||||
|
|
||||||
### Eager Expansion
|
### Eager Expansion
|
||||||
|
|
||||||
_Eager expansion_ means that we expand the arguments of a macro invocation
|
_Eager expansion_ means we expand the arguments of a `macro` invocation before
|
||||||
before the macro invocation itself. This is implemented only for a few special
|
the `macro` invocation itself. This is implemented only for a few special
|
||||||
built-in macros that expect literals; expanding arguments first for some of
|
built-in `macro`s that expect literals; expanding arguments first for some of
|
||||||
these macros results in a smoother user experience. As an example, consider the
|
these `macro`s results in a smoother user experience. As an example, consider
|
||||||
following:
|
the following:
|
||||||
|
|
||||||
```rust,ignore
|
```rust,ignore
|
||||||
macro bar($i: ident) { $i }
|
macro bar($i: ident) { $i }
|
||||||
|
|
@ -139,35 +147,37 @@ macro foo($i: ident) { $i }
|
||||||
foo!(bar!(baz));
|
foo!(bar!(baz));
|
||||||
```
|
```
|
||||||
|
|
||||||
A lazy expansion would expand `foo!` first. An eager expansion would expand
|
A lazy-expansion would expand `foo!` first. An eager-expansion would expand
|
||||||
`bar!` first.
|
`bar!` first.
|
||||||
|
|
||||||
Eager expansion is not a generally available feature of Rust. Implementing
|
Eager-expansion is not a generally available feature of Rust. Implementing
|
||||||
eager expansion more generally would be challenging, but we implement it for a
|
eager-expansion more generally would be challenging, so we implement it for a
|
||||||
few special built-in macros for the sake of user experience. The built-in
|
few special built-in `macro`s for the sake of user-experience. The built-in
|
||||||
macros are implemented in [`rustc_builtin_macros`], along with some other early
|
`macro`s are implemented in [`rustc_builtin_macros`], along with some other
|
||||||
code generation facilities like injection of standard library imports or
|
early code generation facilities like injection of standard library imports or
|
||||||
generation of the test harness. There are some additional helpers for building
|
generation of the test harness. There are some additional helpers for building
|
||||||
their AST fragments in [`rustc_expand::build`][reb]. Eager expansion generally
|
`AST` fragments in [`rustc_expand::build`][reb]. Eager-expansion generally
|
||||||
performs a subset of the things that lazy (normal) expansion does. It is done by
|
performs a subset of the things that lazy (normal) expansion does. It is done
|
||||||
invoking [`fully_expand_fragment`][fef] on only part of a crate (as opposed to
|
by invoking [`fully_expand_fragment()`][fef] on only part of a crate (as opposed
|
||||||
the whole crate, like we normally do).
|
to the whole crate, like we normally do).
|
||||||
|
|
||||||
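A user-visible effect of eager expansion can be seen with the built-in `concat!`, which expects literal arguments. The snippet below is a small illustration rather than a description of the implementation; `GREETING` and `greeting` are names chosen for this example.

```rust
// `concat!` expects literals, yet it accepts a nested macro invocation:
// the argument `stringify!(world)` is expanded first and yields the
// string literal "world", which `concat!` then joins with "hello, ".
const GREETING: &str = concat!("hello, ", stringify!(world));

fn greeting() -> &'static str {
    GREETING
}
```

A `macro_rules!` macro that matches a `$x:literal` fragment would instead reject `stringify!(world)` as an argument, because its arguments are not expanded before matching.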
### Other Data Structures
|
### Other Data Structures
|
||||||
|
|
||||||
Here are some other notable data structures involved in expansion and integration:
|
Here are some other notable data structures involved in expansion and
|
||||||
- [`ResolverExpand`] - a trait used to break crate dependencies. This allows the
|
integration:
|
||||||
|
- [`ResolverExpand`] - a `trait` used to break crate dependencies. This allows the
|
||||||
resolver services to be used in [`rustc_ast`], despite [`rustc_resolve`] and
|
resolver services to be used in [`rustc_ast`], despite [`rustc_resolve`] and
|
||||||
pretty much everything else depending on [`rustc_ast`].
|
pretty much everything else depending on [`rustc_ast`].
|
||||||
- [`ExtCtxt`]/[`ExpansionData`] - various intermediate data kept and used by expansion
|
- [`ExtCtxt`]/[`ExpansionData`] - holds various intermediate expansion
|
||||||
infrastructure in the process of its work
|
infrastructure data.
|
||||||
- [`Annotatable`] - a piece of AST that can be an attribute target, almost same
|
- [`Annotatable`] - a piece of `AST` that can be an attribute target, almost the same
|
||||||
thing as AstFragment except for types and patterns that can be produced by
|
thing as [`AstFragment`] except for `type`s and patterns that can be produced by
|
||||||
macros but cannot be annotated with attributes
|
`macro`s but cannot be annotated with attributes.
|
||||||
- [`MacResult`] - a "polymorphic" AST fragment, something that can turn into a
|
- [`MacResult`] - a "polymorphic" `AST` fragment, something that can turn into
|
||||||
different `AstFragment` depending on its [`AstFragmentKind`] - item,
|
a different [`AstFragment`] depending on its [`AstFragmentKind`] (i.e. an item,
|
||||||
or expression, or pattern etc.
|
expression, pattern, etc.).
|
||||||
|
|
||||||
|
[`AstFragment`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/enum.AstFragment.html
|
||||||
[`rustc_ast`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/index.html
|
[`rustc_ast`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/index.html
|
||||||
[`rustc_resolve`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/index.html
|
[`rustc_resolve`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/index.html
|
||||||
[`ResolverExpand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.ResolverExpand.html
|
[`ResolverExpand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.ResolverExpand.html
|
||||||
|
|
@ -179,7 +189,7 @@ Here are some other notable data structures involved in expansion and integratio
|
||||||
|
|
||||||
## Hygiene and Hierarchies
|
## Hygiene and Hierarchies
|
||||||
|
|
||||||
If you have ever used C/C++ preprocessor macros, you know that there are some
|
If you have ever used the C/C++ preprocessor macros, you know that there are some
|
||||||
annoying and hard-to-debug gotchas! For example, consider the following C code:
|
annoying and hard-to-debug gotchas! For example, consider the following C code:
|
||||||
|
|
||||||
```c
|
```c
|
||||||
|
|
@ -213,16 +223,16 @@ we got `foo(0, 0)` because the macro defined its own `y`!
|
||||||
|
|
||||||
These are both examples of _macro hygiene_ issues. _Hygiene_ relates to how to
|
These are both examples of _macro hygiene_ issues. _Hygiene_ relates to how to
|
||||||
handle names defined _within a macro_. In particular, a hygienic macro system
|
handle names defined _within a macro_. In particular, a hygienic macro system
|
||||||
prevents errors due to names introduced within a macro. Rust macros are hygienic
|
prevents errors due to names introduced within a macro. Rust `macro`s are hygienic
|
||||||
in that they do not allow one to write the sorts of bugs above.
|
in that they do not allow one to write the sorts of bugs above.
|
||||||
|
|
||||||
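The effect of hygiene on local variables can be observed directly in stable Rust. In this small sketch, `add_ten!` and `hygiene_demo` are names made up for illustration; the macro introduces its own `y`, which does not capture the caller's `y`.

```rust
macro_rules! add_ten {
    ($x:expr) => {{
        // This `y` belongs to the macro definition's context.
        let y = 10;
        $x + y
    }};
}

fn hygiene_demo() -> i32 {
    // This `y` belongs to the caller's context.
    let y = 1;
    // `$x` is the caller's `y` (1); the macro adds its own `y` (10).
    add_ten!(y)
}
```

With a purely textual substitution, as in the C preprocessor, the argument `y` would refer to the macro's inner variable and the sum would be 20. Here the two names are kept apart, so `hygiene_demo()` evaluates to 11.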
At a high level, hygiene within the Rust compiler is accomplished by keeping
|
At a high level, hygiene within the Rust compiler is accomplished by keeping
|
||||||
track of the context where a name is introduced and used. We can then
|
track of the context where a name is introduced and used. We can then
|
||||||
disambiguate names based on that context. Future iterations of the macro system
|
disambiguate names based on that context. Future iterations of the `macro` system
|
||||||
will allow greater control to the macro author to use that context. For example,
|
will allow greater control to the `macro` author to use that context. For example,
|
||||||
a macro author may want to introduce a new name to the context where the macro
|
a `macro` author may want to introduce a new name to the context where the `macro`
|
||||||
was called. Alternately, the macro author may be defining a variable for use
|
was called. Alternately, the `macro` author may be defining a variable for use
|
||||||
only within the macro (i.e. it should not be visible outside the macro).
|
only within the `macro` (i.e. it should not be visible outside the `macro`).
|
||||||
|
|
||||||
[code_dir]: https://github.com/rust-lang/rust/tree/master/compiler/rustc_expand/src/mbe
|
[code_dir]: https://github.com/rust-lang/rust/tree/master/compiler/rustc_expand/src/mbe
|
||||||
[code_mp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/mbe/macro_parser
|
[code_mp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/mbe/macro_parser
|
||||||
|
|
@ -230,18 +240,18 @@ only within the macro (i.e. it should not be visible outside the macro).
|
||||||
[code_parse_int]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/mbe/macro_parser/struct.TtParser.html#method.parse_tt
|
[code_parse_int]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/mbe/macro_parser/struct.TtParser.html#method.parse_tt
|
||||||
[parsing]: ./the-parser.html
|
[parsing]: ./the-parser.html
|
||||||
|
|
||||||
The context is attached to AST nodes. All AST nodes generated by macros have
|
The context is attached to `AST` nodes. All `AST` nodes generated by `macro`s have
|
||||||
context attached. Additionally, there may be other nodes that have context
|
context attached. Additionally, there may be other nodes that have context
|
||||||
attached, such as some desugared syntax (non-macro-expanded nodes are
|
attached, such as some desugared syntax (non-`macro`-expanded nodes are
|
||||||
considered to just have the "root" context, as described below).
|
considered to just have the "root" context, as described below).
|
||||||
Throughout the compiler, we use [`rustc_span::Span`s][span] to refer to code locations.
|
Throughout the compiler, we use [`rustc_span::Span`s][span] to refer to code locations.
|
||||||
This struct also has hygiene information attached to it, as we will see later.
|
This struct also has hygiene information attached to it, as we will see later.
|
||||||
|
|
||||||
[span]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/struct.Span.html
|
[span]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/struct.Span.html
|
||||||
|
|
||||||
Because macro invocations and definitions can be nested, the syntax context of
|
Because `macro` invocations and definitions can be nested, the syntax context of
|
||||||
a node must be a hierarchy. For example, if we expand a macro and there is
|
a node must be a hierarchy. For example, if we expand a `macro` and there is
|
||||||
another macro invocation or definition in the generated output, then the syntax
|
another `macro` invocation or definition in the generated output, then the syntax
|
||||||
context should reflect the nesting.
|
context should reflect the nesting.
|
||||||
|
|
||||||
However, it turns out that there are actually a few types of context we may
|
However, it turns out that there are actually a few types of context we may
|
||||||
|
|
@ -249,13 +259,13 @@ want to track for different purposes. Thus, there are not just one but _three_
|
||||||
expansion hierarchies that together comprise the hygiene information for a
|
expansion hierarchies that together comprise the hygiene information for a
|
||||||
crate.
|
crate.
|
||||||
|
|
||||||
All of these hierarchies need some sort of "macro ID" to identify individual
|
All of these hierarchies need some sort of "`macro` ID" to identify individual
|
||||||
elements in the chain of expansions. This ID is [`ExpnId`]. All macros receive
|
elements in the chain of expansions. This ID is [`ExpnId`]. All `macro`s receive
|
||||||
an integer ID, assigned continuously starting from 0 as we discover new macro
|
an integer ID, assigned continuously starting from 0 as we discover new `macro`
|
||||||
calls. All hierarchies start at [`ExpnId::root()`][rootid], which is its own
|
calls. All hierarchies start at [`ExpnId::root()`][rootid], which is its own
|
||||||
parent.
|
parent.
|
||||||
|
|
||||||
[`rustc_span::hygiene`][hy] contains all of the hygiene-related algorithms
|
The [`rustc_span::hygiene`][hy] library contains all of the hygiene-related algorithms
|
||||||
(with the exception of some hacks in [`Resolver::resolve_crate_root`][hacks])
|
(with the exception of some hacks in [`Resolver::resolve_crate_root`][hacks])
|
||||||
and structures related to hygiene and expansion that are kept in global data.
|
and structures related to hygiene and expansion that are kept in global data.
|
||||||
|
|
||||||
|
|
@ -273,18 +283,18 @@ any [`Ident`] without any context.
|
||||||
|
|
||||||
### The Expansion Order Hierarchy
|
### The Expansion Order Hierarchy
|
||||||
|
|
||||||
The first hierarchy tracks the order of expansions, i.e., when a macro
|
The first hierarchy tracks the order of expansions, i.e., when a `macro`
|
||||||
invocation is in the output of another macro.
|
invocation is in the output of another `macro`.
|
||||||
|
|
||||||
Here, the children in the hierarchy will be the "innermost" tokens. The
|
Here, the children in the hierarchy will be the "innermost" tokens. The
|
||||||
[`ExpnData`] struct itself contains a subset of properties from both macro
|
[`ExpnData`] struct itself contains a subset of properties from both `macro`
|
||||||
definition and macro call available through global data.
|
definition and `macro` call available through global data.
|
||||||
[`ExpnData::parent`][edp] tracks the child -> parent link in this hierarchy.
|
[`ExpnData::parent`][edp] tracks the child-to-parent link in this hierarchy.
|
||||||
|
|
||||||
[`ExpnData`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html
|
[`ExpnData`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html
|
||||||
[edp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html#structfield.parent
|
[edp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html#structfield.parent
|
||||||
|
|
||||||
For example,
|
For example:
|
||||||
|
|
||||||
```rust,ignore
|
```rust,ignore
|
||||||
macro_rules! foo { () => { println!(); } }
|
macro_rules! foo { () => { println!(); } }
|
||||||
|
|
@ -292,25 +302,25 @@ macro_rules! foo { () => { println!(); } }
|
||||||
fn main() { foo!(); }
|
fn main() { foo!(); }
|
||||||
```
|
```
|
||||||
|
|
||||||
In this code, the AST nodes that are finally generated would have hierarchy
|
In this code, the `AST` nodes that are finally generated would have hierarchy
|
||||||
`root -> id(foo) -> id(println)`.
|
`root -> id(foo) -> id(println)`.
|
||||||
|
|
||||||
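The child-to-parent links of this hierarchy can be modelled with a small table. This is a toy sketch: `ToyExpnId` and `ToyExpnTable` are hypothetical and only borrow the ideas stated above, namely integer IDs handed out from 0 and a root that is its own parent.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyExpnId(usize);

/// Each entry stores (parent, description); the index is the ID.
struct ToyExpnTable {
    data: Vec<(ToyExpnId, &'static str)>,
}

impl ToyExpnTable {
    fn new() -> Self {
        // ID 0 is the root, which is its own parent.
        Self { data: vec![(ToyExpnId(0), "root")] }
    }

    /// Registers a new expansion discovered in the output of `parent`.
    fn fresh(&mut self, parent: ToyExpnId, descr: &'static str) -> ToyExpnId {
        self.data.push((parent, descr));
        ToyExpnId(self.data.len() - 1)
    }

    /// Walks the parent links up to the root and returns the chain
    /// from the root down to `id`.
    fn chain(&self, mut id: ToyExpnId) -> Vec<&'static str> {
        let mut out = vec![self.data[id.0].1];
        while self.data[id.0].0 != id {
            id = self.data[id.0].0;
            out.push(self.data[id.0].1);
        }
        out.reverse();
        out
    }
}
```

Registering `foo` under the root and `println` under `foo` reproduces the `root -> id(foo) -> id(println)` chain from the example above.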
### The Macro Definition Hierarchy
|
### The Macro Definition Hierarchy
|
||||||
|
|
||||||
The second hierarchy tracks the order of macro definitions, i.e., when we are
|
The second hierarchy tracks the order of `macro` definitions, i.e., when we are
|
||||||
expanding one macro, another macro definition is revealed in its output. This
|
expanding one `macro`, another `macro` definition is revealed in its output. This
|
||||||
one is a bit tricky and more complex than the other two hierarchies.
|
one is a bit tricky and more complex than the other two hierarchies.
|
||||||
|
|
||||||
[`SyntaxContext`][sc] represents a whole chain in this hierarchy via an ID.
|
[`SyntaxContext`][sc] represents a whole chain in this hierarchy via an ID.
|
||||||
[`SyntaxContextData`][scd] contains data associated with the given
|
[`SyntaxContextData`][scd] contains data associated with the given
|
||||||
`SyntaxContext`; mostly it is a cache for results of filtering that chain in
|
[`SyntaxContext`][sc]; mostly it is a cache for results of filtering that chain in
|
||||||
different ways. [`SyntaxContextData::parent`][scdp] is the child -> parent
|
different ways. [`SyntaxContextData::parent`][scdp] is the child-to-parent
|
||||||
link here, and [`SyntaxContextData::outer_expns`][scdoe] are individual
|
link here, and [`SyntaxContextData::outer_expns`][scdoe] are individual
|
||||||
elements in the chain. The "chaining operator" is
|
elements in the chain. The "chaining-operator" is
|
||||||
[`SyntaxContext::apply_mark`][am] in compiler code.
|
[`SyntaxContext::apply_mark`][am] in compiler code.
|
||||||
|
|
||||||
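The chain structure described above can be sketched with a small toy model. This is not the real `rustc_span::hygiene` code: `ToyHygiene`, its `chain` helper, and the simplified `ExpnId`/`SyntaxContext` newtypes are hypothetical stand-ins that only mirror the idea of a child-to-parent link plus an `apply_mark`-style chaining operator.

```rust
use std::collections::HashMap;

/// Toy stand-in for an expansion ID (one per macro expansion).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ExpnId(pub u32);

/// Toy stand-in for `SyntaxContext`: an ID that names a whole chain.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SyntaxContext(pub u32);

/// Per-context data: the last element of the chain and the rest of the chain.
#[derive(Clone, Copy, Debug)]
pub struct SyntaxContextData {
    pub outer_expn: ExpnId,
    pub parent: SyntaxContext,
}

/// Hypothetical interner for contexts (an assumption of this sketch).
pub struct ToyHygiene {
    contexts: Vec<SyntaxContextData>,
    marks: HashMap<(SyntaxContext, ExpnId), SyntaxContext>,
}

impl ToyHygiene {
    /// The empty context at the hierarchy root.
    pub const ROOT: SyntaxContext = SyntaxContext(0);

    pub fn new() -> Self {
        // Index 0 is the root context; in this toy model it is its own parent.
        let root = SyntaxContextData { outer_expn: ExpnId(0), parent: Self::ROOT };
        ToyHygiene { contexts: vec![root], marks: HashMap::new() }
    }

    /// The "chaining operator": `X` becomes `X -> expn`.
    pub fn apply_mark(&mut self, ctxt: SyntaxContext, expn: ExpnId) -> SyntaxContext {
        if let Some(&existing) = self.marks.get(&(ctxt, expn)) {
            return existing; // the same chain always gets the same ID
        }
        let new = SyntaxContext(self.contexts.len() as u32);
        self.contexts.push(SyntaxContextData { outer_expn: expn, parent: ctxt });
        self.marks.insert((ctxt, expn), new);
        new
    }

    /// Follows child-to-parent links and returns the chain, root side first.
    pub fn chain(&self, mut ctxt: SyntaxContext) -> Vec<ExpnId> {
        let mut out = Vec::new();
        while ctxt != Self::ROOT {
            let data = self.contexts[ctxt.0 as usize];
            out.push(data.outer_expn);
            ctxt = data.parent;
        }
        out.reverse();
        out
    }
}
```

In this model, applying `id(n)` to `ROOT -> id(m)` and applying `id(n)` directly to `ROOT` produce two different contexts even though both chains end in the same expansion, which is why a context ID cannot be recovered from its last element alone.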
A [`Span`][span], mentioned above, is actually just a compact representation of
|
A [`Span`][span], mentioned above, is actually just a compact representation of
|
||||||
a code location and `SyntaxContext`. Likewise, an [`Ident`] is just an interned
|
a code location and [`SyntaxContext`][sc]. Likewise, an [`Ident`] is just an interned
|
||||||
[`Symbol`] + `Span` (i.e. an interned string + hygiene data).
|
[`Symbol`] + `Span` (i.e. an interned string + hygiene data).
|
||||||
|
|
||||||
[`Symbol`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/symbol/struct.Symbol.html
|
[`Symbol`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/symbol/struct.Symbol.html
|
||||||
|
|
@ -320,13 +330,13 @@ a code location and `SyntaxContext`. Likewise, an [`Ident`] is just an interned
|
||||||
[scdoe]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.outer_expn
|
[scdoe]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.outer_expn
|
||||||
[am]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html#method.apply_mark
|
[am]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html#method.apply_mark
|
||||||
|
|
||||||
For built-in macros, we use the context:
|
For built-in `macro`s, we use the context:
|
||||||
`SyntaxContext::empty().apply_mark(expn_id)`, and such macros are considered to
|
`SyntaxContext::empty().apply_mark(expn_id)`, and such `macro`s are considered to
|
||||||
be defined at the hierarchy root. We do the same for proc-macros because we
|
be defined at the hierarchy root. We do the same for `proc macro`s because we
|
||||||
haven't implemented cross-crate hygiene yet.
|
haven't implemented cross-crate hygiene yet.
|
||||||
|
|
||||||
If the token had context `X` before being produced by a macro then after being
|
If the token had context `X` before being produced by a `macro` then after being
|
||||||
produced by the macro it has context `X -> macro_id`. Here are some examples:
|
produced by the `macro` it has context `X -> macro_id`. Here are some examples:
|
||||||
|
|
||||||
Example 0:
|
Example 0:
|
||||||
|
|
||||||
|
|
@ -356,7 +366,7 @@ after the first expansion, then `ROOT -> id(m) -> id(n)`.
|
||||||
Example 2:
|
Example 2:
|
||||||
|
|
||||||
Note that these chains are not entirely determined by their last element, in
|
Note that these chains are not entirely determined by their last element, in
|
||||||
other words `ExpnId` is not isomorphic to `SyntaxContext`.
|
other words [`ExpnId`] is not isomorphic to [`SyntaxContext`][sc].
|
||||||
|
|
||||||
```rust,ignore
|
```rust,ignore
|
||||||
macro m($i: ident) { macro n() { ($i, bar) } }
|
macro m($i: ident) { macro n() { ($i, bar) } }
|
||||||
|
|
@ -369,15 +379,16 @@ After all expansions, `foo` has context `ROOT -> id(n)` and `bar` has context
|
||||||
|
|
||||||
Finally, one last thing to mention is that currently, this hierarchy is subject
|
Finally, one last thing to mention is that currently, this hierarchy is subject
|
||||||
to the ["context transplantation hack"][hack]. Basically, the more modern (and
|
to the ["context transplantation hack"][hack]. Basically, the more modern (and
|
||||||
experimental) `macro` macros have stronger hygiene than the older MBE system,
|
experimental) `macro` `macro`s have stronger hygiene than the older MBE system,
|
||||||
but this can result in weird interactions between the two. The hack is intended
|
but this can result in weird interactions between the two. The hack is intended
|
||||||
to make things "just work" for now.
|
to make things "just work" for now.
|
||||||
|
|
||||||
|
[`ExpnId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html
|
||||||
[hack]: https://github.com/rust-lang/rust/pull/51762#issuecomment-401400732
|
[hack]: https://github.com/rust-lang/rust/pull/51762#issuecomment-401400732
|
||||||
|
|
||||||
### The Call-site Hierarchy
|
### The Call-site Hierarchy
|
||||||
|
|
||||||
The third and final hierarchy tracks the location of macro invocations.
|
The third and final hierarchy tracks the location of `macro` invocations.
|
||||||
|
|
||||||
In this hierarchy [`ExpnData::call_site`][callsite] is the child -> parent link.
|
In this hierarchy [`ExpnData::call_site`][callsite] is the child -> parent link.
|
||||||
|
|
||||||
|
|
@ -392,39 +403,39 @@ macro foo($i: ident) { $i }
|
||||||
foo!(bar!(baz));
|
foo!(bar!(baz));
|
||||||
```
|
```
|
||||||
|
|
||||||
For the `baz` AST node in the final output, the expansion-order hierarchy is
|
For the `baz` `AST` node in the final output, the expansion-order hierarchy is
|
||||||
`ROOT -> id(foo) -> id(bar) -> baz`, while the call-site hierarchy is `ROOT ->
|
`ROOT -> id(foo) -> id(bar) -> baz`, while the call-site hierarchy is `ROOT ->
|
||||||
baz`.
|
baz`.
|
||||||
|
|
||||||
### Macro Backtraces
|
### Macro Backtraces
|
||||||
|
|
||||||
Macro backtraces are implemented in [`rustc_span`] using the hygiene machinery
|
`macro` backtraces are implemented in [`rustc_span`] using the hygiene machinery
|
||||||
in [`rustc_span::hygiene`][hy].
|
in [`rustc_span::hygiene`][hy].
|
||||||
|
|
||||||
[`rustc_span`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/index.html
|
[`rustc_span`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/index.html
|
||||||
|
|
||||||
## Producing Macro Output
|
## Producing Macro Output
|
||||||
|
|
||||||
Above, we saw how the output of a macro is integrated into the AST for a crate,
|
Above, we saw how the output of a `macro` is integrated into the `AST` for a crate,
|
||||||
and we also saw how the hygiene data for a crate is generated. But how do we
|
and we also saw how the hygiene data for a crate is generated. But how do we
|
||||||
actually produce the output of a macro? It depends on the type of macro.
|
actually produce the output of a `macro`? It depends on the type of `macro`.
|
||||||
|
|
||||||
There are two types of macros in Rust:
|
There are two types of `macro`s in Rust:
|
||||||
`macro_rules!` macros (a.k.a. "Macros By Example" (MBE)) and procedural macros
|
`macro_rules!` `macro`s (a.k.a. "Macros By Example" (MBE)) and procedural `macro`s
|
||||||
(or "proc macros"; including custom derives). During the parsing phase, the normal
|
(or "proc `macro`s"; including custom derives). During the parsing phase, the normal
|
||||||
Rust parser will set aside the contents of macros and their invocations. Later,
|
Rust parser will set aside the contents of `macro`s and their invocations. Later,
|
||||||
macros are expanded using these portions of the code.
|
`macro`s are expanded using these portions of the code.
|
||||||
|
|
||||||
Some important data structures/interfaces here:
|
Some important data structures/interfaces here:
|
||||||
- [`SyntaxExtension`] - a lowered macro representation, contains its expander
|
- [`SyntaxExtension`] - a lowered `macro` representation, contains its expander
|
||||||
function, which transforms a `TokenStream` or AST into another `TokenStream`
|
function, which transforms a `TokenStream` or `AST` into another `TokenStream`
|
||||||
or AST + some additional data like stability, or a list of unstable features
|
or `AST` + some additional data like stability, or a list of unstable features
|
||||||
allowed inside the macro.
|
allowed inside the `macro`.
|
||||||
- [`SyntaxExtensionKind`] - expander functions may have several different
|
- [`SyntaxExtensionKind`] - expander functions may have several different
|
||||||
signatures (take one token stream, or two, or a piece of AST, etc). This is
|
signatures (take one token stream, or two, or a piece of `AST`, etc). This is
|
||||||
an enum that lists them.
|
an enum that lists them.
|
||||||
- [`BangProcMacro`]/[`TTMacroExpander`]/[`AttrProcMacro`]/[`MultiItemModifier`] -
|
- [`BangProcMacro`]/[`TTMacroExpander`]/[`AttrProcMacro`]/[`MultiItemModifier`] -
|
||||||
traits representing the expander function signatures.
|
`trait`s representing the expander function signatures.
|
||||||
|
|
||||||
[`SyntaxExtension`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.SyntaxExtension.html
|
[`SyntaxExtension`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.SyntaxExtension.html
|
||||||
[`SyntaxExtensionKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/enum.SyntaxExtensionKind.html
|
[`SyntaxExtensionKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/enum.SyntaxExtensionKind.html
|
||||||
|
|
@ -435,11 +446,11 @@ Some important data structures/interfaces here:
|
||||||
|
|
||||||
## Macros By Example
|
## Macros By Example
|
||||||
|
|
||||||
MBEs have their own parser distinct from the normal Rust parser. When macros
|
MBEs have their own parser distinct from the normal Rust parser. When `macro`s
|
||||||
are expanded, we may invoke the MBE parser to parse and expand a macro. The
|
are expanded, we may invoke the MBE parser to parse and expand a `macro`. The
|
||||||
MBE parser, in turn, may call the normal Rust parser when it needs to bind a
|
MBE parser, in turn, may call the normal Rust parser when it needs to bind a
|
||||||
metavariable (e.g. `$my_expr`) while parsing the contents of a macro
|
metavariable (e.g. `$my_expr`) while parsing the contents of a `macro`
|
||||||
invocation. The code for macro expansion is in
|
invocation. The code for `macro` expansion is in
|
||||||
[`compiler/rustc_expand/src/mbe/`][code_dir].
|
[`compiler/rustc_expand/src/mbe/`][code_dir].
|
||||||
|
|
||||||
### Example
|
### Example
|
||||||
|
|
@ -467,8 +478,8 @@ special tokens, such as `EOF`, which indicates that there are no more tokens.
|
||||||
Token trees resulting from paired parentheses-like characters (`(`...`)`,
|
Token trees resulting from paired parentheses-like characters (`(`...`)`,
|
||||||
`[`...`]`, and `{`...`}`) – they include the open and close and all the tokens
|
`[`...`]`, and `{`...`}`) – they include the open and close and all the tokens
|
||||||
in between (we do require that parentheses-like characters be balanced). Having
|
in between (we do require that parentheses-like characters be balanced). Having
|
||||||
macro expansion operate on token streams rather than the raw bytes of a source
|
`macro` expansion operate on token streams rather than the raw bytes of a source
|
||||||
file abstracts away a lot of complexity. The macro expander (and much of the
|
file abstracts away a lot of complexity. The `macro` expander (and much of the
|
||||||
rest of the compiler) doesn't really care that much about the exact line and
|
rest of the compiler) doesn't really care that much about the exact line and
|
||||||
column of some syntactic construct in the code; it cares about what constructs
|
column of some syntactic construct in the code; it cares about what constructs
|
||||||
are used in the code. Using tokens allows us to care about _what_ without
|
are used in the code. Using tokens allows us to care about _what_ without
|
||||||
|
|
@ -481,21 +492,21 @@ Whenever we refer to the "example _invocation_", we mean the following snippet:
|
||||||
printer!(print foo); // Assume `foo` is a variable defined somewhere else...
|
printer!(print foo); // Assume `foo` is a variable defined somewhere else...
|
||||||
```
|
```
|
||||||
|
|
||||||
The process of expanding the macro invocation into the syntax tree
|
The process of expanding the `macro` invocation into the syntax tree
|
||||||
`println!("{}", foo)` and then expanding that into a call to `Display::fmt` is
|
`println!("{}", foo)` and then expanding that into a call to `Display::fmt` is
|
||||||
called _macro expansion_, and it is the topic of this chapter.
|
called _`macro` expansion_, and it is the topic of this chapter.
|
||||||
|
|
||||||
### The MBE parser
|
### The MBE parser
|
||||||
|
|
||||||
There are two parts to MBE expansion: parsing the definition and parsing the
|
There are two parts to MBE expansion: parsing the definition and parsing the
|
||||||
invocations. Interestingly, both are done by the macro parser.
|
invocations. Interestingly, both are done by the `macro` parser.
|
||||||
|
|
||||||
Basically, the MBE parser is like an NFA-based regex parser. It uses an
|
Basically, the MBE parser is like an NFA-based regex parser. It uses an
|
||||||
algorithm similar in spirit to the [Earley parsing
|
algorithm similar in spirit to the [Earley parsing
|
||||||
algorithm](https://en.wikipedia.org/wiki/Earley_parser). The macro parser is
|
algorithm](https://en.wikipedia.org/wiki/Earley_parser). The `macro` parser is
|
||||||
defined in [`compiler/rustc_expand/src/mbe/macro_parser.rs`][code_mp].
|
defined in [`compiler/rustc_expand/src/mbe/macro_parser.rs`][code_mp].
|
||||||
|
|
||||||
The interface of the macro parser is as follows (this is slightly simplified):
|
The interface of the `macro` parser is as follows (this is slightly simplified):
|
||||||
|
|
||||||
```rust,ignore
|
```rust,ignore
|
||||||
fn parse_tt(
|
fn parse_tt(
|
||||||
|
|
@ -505,7 +516,7 @@ fn parse_tt(
|
||||||
) -> ParseResult
|
) -> ParseResult
|
||||||
```
|
```
|
||||||
|
|
||||||
We use these items in macro parser:
|
We use these items in `macro` parser:
|
||||||
|
|
||||||
- `parser` is a reference to the state of a normal Rust parser, including the
|
- `parser` is a reference to the state of a normal Rust parser, including the
|
||||||
token stream and parsing session. The token stream is what we are about to
|
token stream and parsing session. The token stream is what we are about to
|
||||||
|
|
@ -529,47 +540,47 @@ three cases has occurred:
|
||||||
"No rule expected token _blah_".
|
"No rule expected token _blah_".
|
||||||
- Error: some fatal error has occurred _in the parser_. For example, this
|
- Error: some fatal error has occurred _in the parser_. For example, this
|
||||||
happens if there is more than one pattern match, since that indicates
|
happens if there is more than one pattern match, since that indicates
|
||||||
the macro is ambiguous.
|
the `macro` is ambiguous.
|
||||||
|
|
||||||
The full interface is defined [here][code_parse_int].
|
The full interface is defined [here][code_parse_int].
|
||||||
|
|
||||||
The macro parser does pretty much exactly the same as a normal regex parser with
|
The `macro` parser does pretty much exactly the same as a normal regex parser with
|
||||||
one exception: in order to parse different types of metavariables, such as
|
one exception: in order to parse different types of metavariables, such as
|
||||||
`ident`, `block`, `expr`, etc., the macro parser must sometimes call back to the
|
`ident`, `block`, `expr`, etc., the `macro` parser must sometimes call back to the
|
||||||
normal Rust parser.
|
normal Rust parser.
|
||||||
|
|
||||||
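The callback to the normal parser can be illustrated with a deliberately tiny matcher. This is only a sketch under strong assumptions: tokens are plain strings, the matcher is a flat sequence with no repetitions (so no NFA is needed), and `Matcher`, `parse_ident`, and `match_tokens` are hypothetical names that do not exist in `rustc_expand`.

```rust
/// One element of a toy matcher: a literal token or an `ident` metavariable.
#[derive(Debug, Clone, PartialEq)]
pub enum Matcher {
    Literal(&'static str),
    Ident(&'static str), // the metavariable name, e.g. "mvar"
}

/// Stand-in for "calling back to the normal Rust parser" to parse an `ident`.
fn parse_ident(tok: &str) -> Option<String> {
    let mut chars = tok.chars();
    let first = chars.next()?;
    let ok = (first.is_alphabetic() || first == '_')
        && chars.all(|c| c.is_alphanumeric() || c == '_');
    if ok { Some(tok.to_string()) } else { None }
}

/// Matches `tokens` against `matcher`, returning metavariable bindings.
/// `None` plays the role of the "Failure" outcome.
pub fn match_tokens(
    matcher: &[Matcher],
    tokens: &[&str],
) -> Option<Vec<(String, String)>> {
    if matcher.len() != tokens.len() {
        return None;
    }
    let mut bindings = Vec::new();
    for (m, tok) in matcher.iter().zip(tokens) {
        match m {
            // Literal tokens are compared directly by the macro parser.
            Matcher::Literal(expected) => {
                if expected != tok {
                    return None;
                }
            }
            // Metavariables are delegated to the (stand-in) normal parser.
            Matcher::Ident(name) => {
                bindings.push((name.to_string(), parse_ident(tok)?));
            }
        }
    }
    Some(bindings)
}
```

Matching `[Literal("print"), Ident("mvar")]` against the tokens `print foo` binds `mvar` to `foo`, while `print 42` fails because the stand-in parser rejects `42` as an identifier.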
As mentioned above, both definitions and invocations of macros are parsed using
|
As mentioned above, both definitions and invocations of `macro`s are parsed using
|
||||||
the macro parser. This is extremely non-intuitive and self-referential. The code
|
the `macro` parser. This is extremely non-intuitive and self-referential. The code
|
||||||
to parse macro _definitions_ is in
|
to parse `macro` _definitions_ is in
|
||||||
[`compiler/rustc_expand/src/mbe/macro_rules.rs`][code_mr]. It defines the pattern for
|
[`compiler/rustc_expand/src/mbe/macro_rules.rs`][code_mr]. It defines the pattern for
|
||||||
matching for a macro definition as `$( $lhs:tt => $rhs:tt );+`. In other words,
|
matching for a `macro` definition as `$( $lhs:tt => $rhs:tt );+`. In other words,
|
||||||
a `macro_rules` definition should have in its body at least one occurrence of a
|
a `macro_rules` definition should have in its body at least one occurrence of a
|
||||||
token tree followed by `=>` followed by another token tree. When the compiler
|
token tree followed by `=>` followed by another token tree. When the compiler
|
||||||
comes to a `macro_rules` definition, it uses this pattern to match the two token
|
comes to a `macro_rules` definition, it uses this pattern to match the two token
|
||||||
trees per rule in the definition of the macro _using the macro parser itself_.
|
trees per rule in the definition of the `macro` _using the `macro` parser itself_.
|
||||||
In our example definition, the metavariable `$lhs` would match the patterns of
|
In our example definition, the metavariable `$lhs` would match the patterns of
|
||||||
both arms: `(print $mvar:ident)` and `(print twice $mvar:ident)`. And `$rhs`
|
both arms: `(print $mvar:ident)` and `(print twice $mvar:ident)`. And `$rhs`
|
||||||
would match the bodies of both arms: `{ println!("{}", $mvar); }` and `{
|
would match the bodies of both arms: `{ println!("{}", $mvar); }` and `{
|
||||||
println!("{}", $mvar); println!("{}", $mvar); }`. The parser would keep this
|
println!("{}", $mvar); println!("{}", $mvar); }`. The parser would keep this
|
||||||
knowledge around for when it needs to expand a macro invocation.
|
knowledge around for when it needs to expand a `macro` invocation.
|
||||||
|
|
||||||
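The definition pattern can be tried out in ordinary user code. The sketch below is not how `macro_rules.rs` is written; `split_rules!` and `example_rules` are hypothetical names, and the trailing `$(;)?` is an addition so that a final semicolon is accepted. It only shows that `$( $lhs:tt => $rhs:tt );+` is enough to split a definition body into one token tree pair per rule.

```rust
/// Splits a `macro_rules!`-style body into (matcher, transcriber) pairs.
/// Each `$lhs` and `$rhs` is captured as a single token tree and then
/// turned into a string so the result can be inspected at runtime.
macro_rules! split_rules {
    ($( $lhs:tt => $rhs:tt );+ $(;)?) => {
        vec![$( (stringify!($lhs), stringify!($rhs)) ),+]
    };
}

/// Feeds the two arms of the chapter's `printer` example to `split_rules!`.
pub fn example_rules() -> Vec<(&'static str, &'static str)> {
    split_rules!(
        (print $mvar:ident) => { println!("{}", $mvar); };
        (print twice $mvar:ident) => { println!("{}", $mvar); println!("{}", $mvar); };
    )
}
```

Calling `example_rules()` yields two pairs, one per arm. The exact spacing of the `stringify!` output is not guaranteed, so it is best used for inspection rather than comparison.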
When the compiler comes to a macro invocation, it parses that invocation using
|
When the compiler comes to a `macro` invocation, it parses that invocation using
|
||||||
the same NFA-based macro parser that is described above. However, the matcher
|
the same NFA-based `macro` parser that is described above. However, the matcher
|
||||||
used is the first token tree (`$lhs`) extracted from the arms of the macro
|
used is the first token tree (`$lhs`) extracted from the arms of the `macro`
|
||||||
_definition_. Using our example, we would try to match the token stream `print
|
_definition_. Using our example, we would try to match the token stream `print
|
||||||
foo` from the invocation against the matchers `print $mvar:ident` and `print
|
foo` from the invocation against the matchers `print $mvar:ident` and `print
|
||||||
twice $mvar:ident` that we previously extracted from the definition. The
|
twice $mvar:ident` that we previously extracted from the definition. The
|
||||||
algorithm is exactly the same, but when the macro parser comes to a place in the
|
algorithm is exactly the same, but when the `macro` parser comes to a place in the
|
||||||
current matcher where it needs to match a _non-terminal_ (e.g. `$mvar:ident`),
|
current matcher where it needs to match a _non-terminal_ (e.g. `$mvar:ident`),
|
||||||
it calls back to the normal Rust parser to get the contents of that
|
it calls back to the normal Rust parser to get the contents of that
|
||||||
non-terminal. In this case, the Rust parser would look for an `ident` token,
|
non-terminal. In this case, the Rust parser would look for an `ident` token,
|
||||||
which it finds (`foo`) and returns to the macro parser. Then, the macro parser
|
which it finds (`foo`) and returns to the `macro` parser. Then, the `macro` parser
|
||||||
proceeds in parsing as normal. Also, note that exactly one of the matchers from
|
proceeds in parsing as normal. Also, note that exactly one of the matchers from
|
||||||
the various arms should match the invocation; if there is more than one match,
|
the various arms should match the invocation; if there is more than one match,
|
||||||
the parse is ambiguous, while if there are no matches at all, there is a syntax
|
the parse is ambiguous, while if there are no matches at all, there is a syntax
|
||||||
error.
|
error.
|
||||||
|
|
||||||
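To see arm selection from the outside, here is a variant of the example `printer` macro. It is an adaptation for illustration: the arms keep the matchers quoted above, but the bodies return strings instead of calling `println!`, and `run_printer` is a hypothetical helper.

```rust
/// Variant of the `printer` example whose arms return the formatted values,
/// so the arm that matched is observable from the length of the result.
macro_rules! printer {
    (print $mvar:ident) => {
        vec![format!("{}", $mvar)]
    };
    (print twice $mvar:ident) => {
        vec![format!("{}", $mvar), format!("{}", $mvar)]
    };
}

/// Invokes the macro once per arm with a local variable `foo`.
pub fn run_printer() -> (Vec<String>, Vec<String>) {
    let foo = 7;
    (printer!(print foo), printer!(print twice foo))
}
```

For `print twice foo`, the first matcher does not apply: `$mvar:ident` can bind `twice`, but the token `foo` is left over, so only the second arm matches.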
For more information about the macro parser's implementation, see the comments
|
For more information about the `macro` parser's implementation, see the comments
|
||||||
in [`compiler/rustc_expand/src/mbe/macro_parser.rs`][code_mp].
|
in [`compiler/rustc_expand/src/mbe/macro_parser.rs`][code_mp].
|
||||||
|
|
||||||
### `macro`s and Macros 2.0
|
### `macro`s and Macros 2.0
|
||||||
|
|
@ -577,21 +588,21 @@ in [`compiler/rustc_expand/src/mbe/macro_parser.rs`][code_mp].
|
||||||
There is an old and mostly undocumented effort to improve the MBE system, give
|
There is an old and mostly undocumented effort to improve the MBE system, give
|
||||||
it more hygiene-related features, better scoping and visibility rules, etc. There
|
it more hygiene-related features, better scoping and visibility rules, etc. There
|
||||||
hasn't been a lot of work on this recently, unfortunately. Internally, `macro`
|
hasn't been a lot of work on this recently, unfortunately. Internally, `macro`
|
||||||
macros use the same machinery as today's MBEs; they just have additional
|
`macro`s use the same machinery as today's MBEs; they just have additional
|
||||||
syntactic sugar and are allowed to be in namespaces.
|
syntactic sugar and are allowed to be in namespaces.
|
||||||
|
|
||||||
## Procedural Macros
|
## Procedural Macros
|
||||||
|
|
||||||
Procedural macros are also expanded during parsing, as mentioned above.
|
Procedural `macro`s are also expanded during parsing, as mentioned above.
|
||||||
However, they use a rather different mechanism. Rather than having a parser in
|
However, they use a rather different mechanism. Rather than having a parser in
|
||||||
the compiler, procedural macros are implemented as custom, third-party crates.
|
the compiler, procedural `macro`s are implemented as custom, third-party crates.
|
||||||
The compiler will compile the proc macro crate and specially annotated
|
The compiler will compile the proc `macro` crate and specially annotated
|
||||||
functions in them (i.e. the proc macro itself), passing them a stream of tokens.
|
functions in them (i.e. the proc `macro` itself), passing them a stream of tokens.
|
||||||
|
|
||||||
The proc macro can then transform the token stream and output a new token
|
The proc `macro` can then transform the token stream and output a new token
|
||||||
stream, which is synthesized into the AST.
|
stream, which is synthesized into the `AST`.
|
||||||
|
|
||||||
It's worth noting that the token stream type used by proc macros is _stable_,
|
It's worth noting that the token stream type used by proc `macro`s is _stable_,
|
||||||
so `rustc` does not use it internally (since our internal data structures are
|
so `rustc` does not use it internally (since our internal data structures are
|
||||||
unstable). The compiler's token stream is
|
unstable). The compiler's token stream is
|
||||||
[`rustc_ast::tokenstream::TokenStream`][rustcts], as previously. This is
|
[`rustc_ast::tokenstream::TokenStream`][rustcts], as previously. This is
|
||||||
|
|
@ -610,6 +621,6 @@ TODO: more here. [#1160](https://github.com/rust-lang/rustc-dev-guide/issues/116
|
||||||
|
|
||||||
### Custom Derive
|
### Custom Derive
|
||||||
|
|
||||||
Custom derives are a special type of proc macro.
|
Custom derives are a special type of proc `macro`.
|
||||||
|
|
||||||
TODO: more? [#1160](https://github.com/rust-lang/rustc-dev-guide/issues/1160)
|
TODO: more? [#1160](https://github.com/rust-lang/rustc-dev-guide/issues/1160)
|
||||||
|
|
|
||||||