expand notes on expansion hierarchies

mark 2020-04-30 20:08:36 -05:00 committed by Who? Me?!
parent ba8620f34a
commit f05ff9c30d
1 changed files with 128 additions and 57 deletions

@@ -163,82 +163,153 @@ only within the macro (i.e. it should not be visible outside the macro).
 [code_parse_int]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/mbe/macro_parser/fn.parse_tt.html
 [parsing]: ./the-parser.html
-TODO: expand these notes
-- Many AST nodes have some sort of syntax context, especially nodes from macros.
-- When we ask what is the syntax context of a node, the answer actually differs by what we are trying to do. Thus, we don't just keep track of a single context. There are in fact 3 different types of context used for different things.
-- Each type of context is tracked by an "expansion hierarchy". As we expand macros, new macro calls or macro definitions may be generated, leading to some nesting. This nesting is where the hierarchies come from. Each hierarchy tracks some different aspect, though, as we will see.
-- There are 3 expansion hierarchies
-- All macros receive an integer ID assigned continuously starting from 0 as we discover new macro calls
-- This is used as the `expn_id` where needed.
-- All hierarchies start at ExpnId::root, which is its own parent
-- The context of a node consists of a chain of expansions leading to `ExpnId::root`. A non-macro-expanded node has syntax context 0 (`SyntaxContext::empty()`) which represents just the root node.
-- There are vectors in `HygieneData` that contain expansion info.
-- There are entries here for both `SyntaxContext::empty()` and `ExpnId::root`, but they aren't used much.
-1. Tracks expansion order: when a macro invocation is in the output of another macro.
-...
-expn_id2
-expn_id1
-InternalExpnData::parent is the child->parent link. That is the expn_id1 points to expn_id2 points to ...
-Ex:
-macro_rules! foo { () => { println!(); } }
-fn main() { foo!(); }
-// Then AST nodes that are finally generated would have parent(expn_id_println) -> parent(expn_id_foo), right?
-2. Tracks macro definitions: when we are expanding one macro another macro definition is revealed in its output.
-...
-SyntaxContext2
-SyntaxContext1
-SyntaxContextData::parent is the child->parent link here.
-SyntaxContext is the whole chain in this hierarchy, and SyntaxContextData::outer_expns are individual elements in the chain.
+The context is attached to AST nodes. All AST nodes generated by macros have
+context attached. Additionally, there may be other nodes that have context
+attached, such as some desugared syntax (non-macro-expanded nodes are
+considered to just have the "root" context, as described below).
+Because macro invocations and definitions can be nested, the syntax context of
+a node must be a hierarchy. For example, if we expand a macro and there is
+another macro invocation or definition in the generated output, then the syntax
+context should reflect the nesting.
+However, it turns out that there are actually a few types of context we may
+want to track for different purposes. Thus, there is not just one but _three_
+expansion hierarchies that together comprise the hygiene information for a
+crate.
+All of these hierarchies need some sort of "macro ID" to identify individual
+elements in the chain of expansions. This ID is [`ExpnId`]. All macros receive
+an integer ID, assigned continuously starting from 0 as we discover new macro
+calls. All hierarchies start at [`ExpnId::root()`][rootid], which is its own
+parent.
+The actual hierarchies are stored in [`HygieneData`][hd], and all of the
+hygiene-related algorithms are implemented in [`rustc_span::hygiene`][hy], with
+the exception of some hacks in [`Resolver::resolve_crate_root`][hacks].
+[`ExpnId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html
+[rootid]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html#method.root
+[hd]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.HygieneData.html
+[hy]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/index.html
+[hacks]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/struct.Resolver.html#method.resolve_crate_root
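As a rough illustration of the ID scheme described in the added notes, here is a minimal sketch of a table that hands out consecutive integer IDs and records a child -> parent link for each one. `ToyExpnId` and `ToyExpnTable` are hypothetical names invented for this sketch and are not the real `ExpnId` or `HygieneData` types; the sketch also assumes the root occupies ID 0.

```rust
/// Toy stand-in for an expansion ID (hypothetical; not rustc's `ExpnId`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyExpnId(u32);

/// Toy table that stores, for each ID, the ID of its parent expansion.
struct ToyExpnTable {
    parents: Vec<ToyExpnId>,
}

impl ToyExpnTable {
    /// Assumption of this sketch: the root takes ID 0.
    const ROOT: ToyExpnId = ToyExpnId(0);

    /// Start with only the root, which is its own parent.
    fn new() -> Self {
        ToyExpnTable { parents: vec![Self::ROOT] }
    }

    /// Hand out the next unused integer as the ID of a newly found macro call.
    fn fresh(&mut self, parent: ToyExpnId) -> ToyExpnId {
        let id = ToyExpnId(self.parents.len() as u32);
        self.parents.push(parent);
        id
    }

    /// Look up the child -> parent link.
    fn parent(&self, id: ToyExpnId) -> ToyExpnId {
        self.parents[id.0 as usize]
    }

    /// Follow child -> parent links from `start` until the root is reached.
    fn chain_to_root(&self, start: ToyExpnId) -> Vec<ToyExpnId> {
        let mut chain = vec![start];
        let mut current = start;
        while current != Self::ROOT {
            current = self.parent(current);
            chain.push(current);
        }
        chain
    }
}

fn main() {
    let mut table = ToyExpnTable::new();
    // A macro call written directly in the source.
    let outer = table.fresh(ToyExpnTable::ROOT);
    // A macro call discovered in the output of `outer`.
    let inner = table.fresh(outer);
    println!("{:?}", table.chain_to_root(inner));
}
```

Walking from `inner` visits `inner`, then `outer`, then the root, which is the kind of chain each hierarchy records with its own parent link.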
-- For built-in macros (e.g. `line!()`) or stable proc macros: tokens produced by the macro are given the context `SyntaxContext::empty().apply_mark(expn_id)`
-- Such macros are considered to have been defined at the root.
-- For proc macros this is because they are always cross-crate and we don't have cross-crate hygiene implemented.
-The second hierarchy has the context transplantation hack. See https://github.com/rust-lang/rust/pull/51762#issuecomment-401400732.
-If the token had context X before being produced by a macro then after being produced by the macro it has context X -> macro_id.
-Ex:
-```rust
-macro m() { ident }
-```
-Here `ident` originally has context SyntaxContext::root(). `ident` has context ROOT -> id(m) after it's produced by m.
-The "chaining operator" is `apply_mark` in compiler code.
-Ex:
-```rust
-macro m() { macro n() { ident } }
-```
-In this example the ident has context ROOT originally, then ROOT -> id(m), then ROOT -> id(m) -> id(n).
-Note that these chains are not entirely determined by their last element, in other words ExpnId is not isomorphic to SyntaxCtxt.
-Ex:
-```rust
-macro m($i: ident) { macro n() { ($i, bar) } }
-m!(foo);
-```
-After all expansions, foo has context ROOT -> id(n) and bar has context ROOT -> id(m) -> id(n)
+### The Expansion Order Hierarchy
+The first hierarchy tracks the order of expansions, i.e., when a macro
+invocation is in the output of another macro.
+Here, the children in the hierarchy will be the "innermost" tokens.
+[`ExpnData::parent`][edp] tracks the child -> parent link in this hierarchy.
+[edp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html#structfield.parent
+For example,
+```rust,ignore
+macro_rules! foo { () => { println!(); } }
+fn main() { foo!(); }
+```
+In this code, the AST nodes that are finally generated would have the hierarchy:
+```
+root
+    expn_id_foo
+        expn_id_println
+```
+### The Macro Definition Hierarchy
+The second hierarchy tracks the order of macro definitions, i.e., when
+expanding one macro reveals another macro definition in its output. This
+one is a bit tricky and more complex than the other two hierarchies.
-3. Call-site: tracks the location of the macro invocation.
-Ex:
-If foo!(bar!(ident)) expands into ident
-then hierarchy 1 is root -> foo -> bar -> ident
-but hierarchy 3 is root -> ident
-ExpnInfo::call_site is the child-parent link in this case.
-- Hygiene-related algorithms are entirely in hygiene.rs
-- Some hacks in `resolve_crate_root`, though.
+Here, [`SyntaxContextData::parent`][scdp] is the child -> parent link.
+[`SyntaxContext`][sc] is the whole chain in this hierarchy, and
+[`SyntaxContextData::outer_expns`][scdoe] are individual elements in the chain.
+The "chaining operator" is [`SyntaxContext::apply_mark`][am] in compiler code.
+[scdp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.parent
+[sc]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html
+[scdoe]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.outer_expn
+[am]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html#method.apply_mark
+For built-in macros, we use the context
+`SyntaxContext::empty().apply_mark(expn_id)`, and such macros are considered to
+be defined at the hierarchy root. We do the same for proc-macros because we
+haven't implemented cross-crate hygiene yet.
+If the token had context `X` before being produced by a macro then after being
+produced by the macro it has context `X -> macro_id`. Here are some examples:
+Example 0:
+```rust,ignore
+macro m() { ident }
+m!();
+```
+Here `ident` originally has context [`SyntaxContext::root()`][scr]. `ident` has
+context `ROOT -> id(m)` after it's produced by `m`.
+[scr]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html#method.root
+Example 1:
+```rust,ignore
+macro m() { macro n() { ident } }
+m!();
+n!();
+```
+In this example the `ident` has context `ROOT` originally, then `ROOT -> id(m)`
+after the first expansion, then `ROOT -> id(m) -> id(n)`.
+Example 2:
+Note that these chains are not entirely determined by their last element, in
+other words `ExpnId` is not isomorphic to `SyntaxContext`.
+```rust,ignore
+macro m($i: ident) { macro n() { ($i, bar) } }
+m!(foo);
+```
+After all expansions, `foo` has context `ROOT -> id(n)` and `bar` has context
+`ROOT -> id(m) -> id(n)`.
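To make the point of Example 2 concrete, the following is a small sketch that models a context as the full chain of marks and mimics the "chaining operator". `ToyCtxt` and `ToyExpnId` are hypothetical stand-ins written for this note. They ignore everything the real `SyntaxContext::apply_mark` does beyond appending to the chain, and the numeric IDs for `m` and `n` are arbitrary.

```rust
/// Toy stand-in for an expansion ID (hypothetical; not rustc's `ExpnId`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyExpnId(u32);

/// Toy syntax context: the whole chain of marks; the root is the empty chain.
#[derive(Clone, Debug, PartialEq, Eq)]
struct ToyCtxt {
    marks: Vec<ToyExpnId>,
}

impl ToyCtxt {
    fn root() -> Self {
        ToyCtxt { marks: Vec::new() }
    }

    /// Simplified "chaining operator": turn context `X` into `X -> expn`.
    fn apply_mark(&self, expn: ToyExpnId) -> Self {
        let mut marks = self.marks.clone();
        marks.push(expn);
        ToyCtxt { marks }
    }

    /// The last element of the chain, or `None` for the root context.
    fn outer_expn(&self) -> Option<ToyExpnId> {
        self.marks.last().copied()
    }
}

/// Contexts of `foo` and `bar` after all expansions, as stated in Example 2.
fn example_two() -> (ToyCtxt, ToyCtxt) {
    let m = ToyExpnId(1); // arbitrary ID chosen for this sketch
    let n = ToyExpnId(2); // arbitrary ID chosen for this sketch
    // Per the text, `foo` ends up with `ROOT -> id(n)`.
    let foo = ToyCtxt::root().apply_mark(n);
    // Per the text, `bar` ends up with `ROOT -> id(m) -> id(n)`.
    let bar = ToyCtxt::root().apply_mark(m).apply_mark(n);
    (foo, bar)
}

fn main() {
    let (foo, bar) = example_two();
    // Same last element, different chains.
    assert_eq!(foo.outer_expn(), bar.outer_expn());
    assert_ne!(foo, bar);
    println!("foo: {:?}\nbar: {:?}", foo, bar);
}
```

Both contexts end in the same expansion, yet they are different chains, which is why the last element alone cannot stand in for the whole context.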
+Finally, one last thing to mention is that currently, this hierarchy is subject
+to the ["context transplantation hack"][hack]. Basically, the more modern (and
+experimental) `macro` macros have stronger hygiene than the older MBE system,
+but this can result in weird interactions between the two. The hack is intended
+to make things "just work" for now.
+[hack]: https://github.com/rust-lang/rust/pull/51762#issuecomment-401400732
+### The Call-site Hierarchy
+The third and final hierarchy tracks the location of macro invocations.
+In this hierarchy [`ExpnData::call_site`][callsite] is the child -> parent link.
+[callsite]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html#structfield.call_site
+Here is an example:
+```rust,ignore
+macro bar($i: ident) { $i }
+macro foo($i: ident) { $i }
+foo!(bar!(baz));
+```
+For the `baz` AST node in the final output, the first hierarchy is `ROOT ->
+id(foo) -> id(bar) -> baz`, while the third hierarchy is `ROOT -> baz`.
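The difference between the first and third hierarchies for `baz` can be sketched by giving every element two independent child -> parent links and walking each one separately. Everything in this sketch is hypothetical: `ToyEntry`, `Link`, and the index-based links are made up for illustration. In particular, the real `ExpnData::call_site` is a span rather than an index, and the table simply encodes the two chains stated in the text.

```rust
/// Index of the root entry in the toy table.
const ROOT: usize = 0;

/// Toy element with one link per hierarchy (hypothetical layout).
struct ToyEntry {
    name: &'static str,
    /// Expansion-order link: which entry produced this one.
    parent: usize,
    /// Call-site link, simplified to an index into the same table.
    call_site: usize,
}

/// Selects which of the two links to follow.
#[derive(Clone, Copy)]
enum Link {
    Parent,
    CallSite,
}

/// Table for `foo!(bar!(baz))`, encoding the chains given in the text.
fn example_table() -> Vec<ToyEntry> {
    vec![
        ToyEntry { name: "ROOT", parent: ROOT, call_site: ROOT },
        ToyEntry { name: "id(foo)", parent: ROOT, call_site: ROOT },
        ToyEntry { name: "id(bar)", parent: 1, call_site: ROOT },
        ToyEntry { name: "baz", parent: 2, call_site: ROOT },
    ]
}

/// Follow the chosen link from `start` up to the root and return the names
/// ordered from the root down to `start`.
fn chain(table: &[ToyEntry], start: usize, link: Link) -> Vec<&'static str> {
    let mut names = vec![table[start].name];
    let mut current = start;
    while current != ROOT {
        current = match link {
            Link::Parent => table[current].parent,
            Link::CallSite => table[current].call_site,
        };
        names.push(table[current].name);
    }
    names.reverse();
    names
}

fn main() {
    let table = example_table();
    println!("{}", chain(&table, 3, Link::Parent).join(" -> "));
    println!("{}", chain(&table, 3, Link::CallSite).join(" -> "));
}
```

Following `parent` from `baz` passes through both expansions, while following `call_site` reaches the root in one step, matching the two chains given for `baz`.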
 ## Producing Macro Output