add discussion transcript so we don't lose it
This commit is contained in:
parent
2dabf0f58d
commit
a1d1860a11
|
|
@ -210,3 +210,706 @@ TODO: maybe something about macros 2.0?
|
|||
[code_mr]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax_expand/mbe/macro_rules
|
||||
[code_parse_int]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax_expand/mbe/macro_parser/fn.parse.html
|
||||
[parsing]: ./the-parser.html
|
||||
|
||||
|
||||
# Discussion about hygiene
|
||||
|
||||
The rest of this chapter is a dump of a discussion between `mark-i-m` and
|
||||
`petrochenkov` about Macro Expansion and Hygiene. I am pasting it here so that
|
||||
it never gets lost until we can make it into a proper chapter.
|
||||
|
||||
```txt
|
||||
mark-i-m: @Vadim Petrochenkov Hi :wave:
|
||||
I was wondering if you would have a chance sometime in the next month or so to
|
||||
just have a zulip discussion where you tell us (WG-learning) everything you
|
||||
know about macros/expansion/hygiene. We were thinking this could be less formal
|
||||
(and less work for you) than compiler lecture series lecture... thoughts?
|
||||
|
||||
mark-i-m: The goal is to fill out that long-standing gap in the rustc-guide
|
||||
|
||||
Vadim Petrochenkov: Ok, I'm at UTC+03:00 and generally available in the
|
||||
evenings (or weekends).
|
||||
|
||||
mark-i-m: @Vadim Petrochenkov Either of those works for me (your evenings are
|
||||
about lunch time for me :) ) Is there a particular date that would work best
|
||||
for you?
|
||||
|
||||
mark-i-m: @WG-learning Does anyone else have a preferred date?
|
||||
|
||||
Vadim Petrochenkov:
|
||||
|
||||
Is there a particular date that would work best for you?
|
||||
|
||||
Nah, not much difference. (If something changes for a specific day, I'll
|
||||
notify.)
|
||||
|
||||
Santiago Pastorino: week days are better, but I'd say let's wait for @Vadim
|
||||
Petrochenkov to say when they are ready for it and we can set a date
|
||||
|
||||
Santiago Pastorino: also, we should record this so ... I guess it doesn't
|
||||
matter that much when :)
|
||||
|
||||
mark-i-m:
|
||||
|
||||
also, we should record this so ... I guess it doesn't matter that much when
|
||||
:)
|
||||
|
||||
@Santiago Pastorino My thinking was to just use zulip, so we would have the log
|
||||
|
||||
mark-i-m: @Vadim Petrochenkov @WG-learning How about 2 weeks from now: July 24
|
||||
at 5pm UTC time (if I did the math right, that should be evening for Vadim)
|
||||
|
||||
Amanjeev Sethi: i can try and do this but I am starting a new job that week so
|
||||
cannot promise.
|
||||
|
||||
Santiago Pastorino:
|
||||
|
||||
Vadim Petrochenkov @WG-learning How about 2 weeks from now: July 24 at 5pm
|
||||
UTC time (if I did the math right, that should be evening for Vadim)
|
||||
|
||||
works perfect for me
|
||||
|
||||
Santiago Pastorino: @mark-i-m I have access to the compiler calendar so I can
|
||||
add something there
|
||||
|
||||
Santiago Pastorino: let me know if you want to add an event to the calendar, I
|
||||
can do that
|
||||
|
||||
Santiago Pastorino: how long it would be?
|
||||
|
||||
mark-i-m:
|
||||
|
||||
let me know if you want to add an event to the calendar, I can do that
|
||||
|
||||
mark-i-m: That could be good :+1:
|
||||
|
||||
mark-i-m:
|
||||
|
||||
how long it would be?
|
||||
|
||||
Let's start with 30 minutes, and if we need to schedule another we cna
|
||||
|
||||
Vadim Petrochenkov:
|
||||
|
||||
5pm UTC
|
||||
|
||||
1-2 hours later would be better, 5pm UTC is not evening enough.
|
||||
|
||||
Vadim Petrochenkov: How exactly do you plan the meeting to go (aka how much do
|
||||
I need to prepare)?
|
||||
|
||||
Santiago Pastorino:
|
||||
|
||||
5pm UTC
|
||||
|
||||
1-2 hours later would be better, 5pm UTC is not evening enough.
|
||||
|
||||
Scheduled for 7pm UTC then
|
||||
|
||||
Santiago Pastorino:
|
||||
|
||||
How exactly do you plan the meeting to go (aka how much do I need to
|
||||
prepare)?
|
||||
|
||||
/cc @mark-i-m
|
||||
|
||||
mark-i-m: @Vadim Petrochenkov
|
||||
|
||||
How exactly do you plan the meeting to go (aka how much do I need to
|
||||
prepare)?
|
||||
|
||||
My hope was that this could be less formal than for a compiler lecture series,
|
||||
but it would be nice if you could have in your mind a tour of the design and
|
||||
the code
|
||||
|
||||
That is, imagine that a new person was joining the compiler team and needed to
|
||||
get up to speed about macros/expansion/hygiene. What would you tell such a
|
||||
person?
|
||||
|
||||
mark-i-m: @Vadim Petrochenkov Are we still on for tomorrow at 7pm UTC?
|
||||
|
||||
Vadim Petrochenkov: Yes.
|
||||
|
||||
Santiago Pastorino: @Vadim Petrochenkov @mark-i-m I've added an event on rust
|
||||
compiler team calendar
|
||||
|
||||
mark-i-m: @WG-learning @Vadim Petrochenkov Hello!
|
||||
|
||||
mark-i-m: We will be starting in ~7 minutes
|
||||
|
||||
mark-i-m: :wave:
|
||||
|
||||
Vadim Petrochenkov: I'm here.
|
||||
|
||||
mark-i-m: Cool :)
|
||||
|
||||
Santiago Pastorino: hello @Vadim Petrochenkov
|
||||
|
||||
mark-i-m: Shall we start?
|
||||
|
||||
mark-i-m: First off, @Vadim Petrochenkov Thanks for doing this!
|
||||
|
||||
Vadim Petrochenkov: Here's some preliminary data I prepared.
|
||||
|
||||
Vadim Petrochenkov: Below I'll assume #62771 and #62086 has landed.
|
||||
|
||||
Vadim Petrochenkov: Where to find the code: libsyntax_pos/hygiene.rs -
|
||||
structures related to hygiene and expansion that are kept in global data (can
|
||||
be accessed from any Ident without any context) libsyntax_pos/lib.rs - some
|
||||
secondary methods like macro backtrace using primary methods from hygiene.rs
|
||||
libsyntax_ext - implementations of built-in macros (including macro attributes
|
||||
and derives) and some other early code generation facilities like injection of
|
||||
standard library imports or generation of test harness. libsyntax/config.rs -
|
||||
implementation of cfg/cfg_attr (they treated specially from other macros),
|
||||
should probably be moved into libsyntax/ext. libsyntax/tokenstream.rs +
|
||||
libsyntax/parse/token.rs - structures for compiler-side tokens, token trees,
|
||||
and token streams. libsyntax/ext - various expansion-related stuff
|
||||
libsyntax/ext/base.rs - basic structures used by expansion
|
||||
libsyntax/ext/expand.rs - some expansion structures and the bulk of expansion
|
||||
infrastructure code - collecting macro invocations, calling into resolve for
|
||||
them, calling their expanding functions, and integrating the results back into
|
||||
AST libsyntax/ext/placeholder.rs - the part of expand.rs responsible for
|
||||
"integrating the results back into AST" basicallly, "placeholder" is a
|
||||
temporary AST node replaced with macro expansion result nodes
|
||||
libsyntax/ext/builer.rs - helper functions for building AST for built-in macros
|
||||
in libsyntax_ext (and user-defined syntactic plugins previously), can probably
|
||||
be moved into libsyntax_ext these days libsyntax/ext/proc_macro.rs +
|
||||
libsyntax/ext/proc_macro_server.rs - interfaces between the compiler and the
|
||||
stable proc_macro library, converting tokens and token streams between the two
|
||||
representations and sending them through C ABI libsyntax/ext/tt -
|
||||
implementation of macro_rules, turns macro_rules DSL into something with
|
||||
signature Fn(TokenStream) -> TokenStream that can eat and produce tokens,
|
||||
@mark-i-m knows more about this librustc_resolve/macros.rs - resolving macro
|
||||
paths, validating those resolutions, reporting various "not found"/"found, but
|
||||
it's unstable"/"expected x, found y" errors librustc/hir/map/def_collector.rs +
|
||||
librustc_resolve/build_reduced_graph.rs - integrate an AST fragment freshly
|
||||
expanded from a macro into various parent/child structures like module
|
||||
hierarchy or "definition paths"
|
||||
|
||||
Primary structures: HygieneData - global piece of data containing hygiene and
|
||||
expansion info that can be accessed from any Ident without any context ExpnId -
|
||||
ID of a macro call or desugaring (and also expansion of that call/desugaring,
|
||||
depending on context) ExpnInfo/InternalExpnData - a subset of properties from
|
||||
both macro definition and macro call available through global data
|
||||
SyntaxContext - ID of a chain of nested macro definitions (identified by
|
||||
ExpnIds) SyntaxContextData - data associated with the given SyntaxContext,
|
||||
mostly a cache for results of filtering that chain in different ways Span - a
|
||||
code location + SyntaxContext Ident - interned string (Symbol) + Span, i.e. a
|
||||
string with attached hygiene data TokenStream - a collection of TokenTrees
|
||||
TokenTree - a token (punctuation, identifier, or literal) or a delimited group
|
||||
(anything inside ()/[]/{}) SyntaxExtension - a lowered macro representation,
|
||||
contains its expander function transforming a tokenstream or AST into
|
||||
tokenstream or AST + some additional data like stability, or a list of unstable
|
||||
features allowed inside the macro. SyntaxExtensionKind - expander functions
|
||||
may have several different signatures (take one token stream, or two, or a
|
||||
piece of AST, etc), this is an enum that lists them
|
||||
ProcMacro/TTMacroExpander/AttrProcMacro/MultiItemModifier - traits representing
|
||||
the expander signatures (TODO: change and rename the signatures into something
|
||||
more consistent) trait Resolver - a trait used to break crate dependencies (so
|
||||
resolver services can be used in libsyntax, despite librustc_resolve and pretty
|
||||
much everything else depending on libsyntax) ExtCtxt/ExpansionData - various
|
||||
intermediate data kept and used by expansion infra in the process of its work
|
||||
AstFragment - a piece of AST that can be produced by a macro (may include
|
||||
multiple homogeneous AST nodes, like e.g. a list of items) Annotatable - a
|
||||
piece of AST that can be an attribute target, almost same thing as AstFragment
|
||||
except for types and patterns that can be produced by macros but cannot be
|
||||
annotated with attributes (TODO: Merge into AstFragment) trait MacResult - a
|
||||
"polymorphic" AST fragment, something that can turn into a different
|
||||
AstFragment depending on its context (aka AstFragmentKind - item, or
|
||||
expression, or pattern etc.) Invocation/InvocationKind - a structure describing
|
||||
a macro call, these structures are collected by the expansion infra
|
||||
(InvocationCollector), queued, resolved, expanded when resolved, etc.
|
||||
|
||||
Primary algorithms / actions: TODO
|
||||
|
||||
mark-i-m: Very useful :+1:
|
||||
|
||||
mark-i-m: @Vadim Petrochenkov Zulip doesn't have an indication of typing, so
|
||||
I'm not sure if you are waiting for me or not
|
||||
|
||||
Vadim Petrochenkov: The TODO part should be about how a crate transitions from
|
||||
the state "macros exist as written in source" to "all macros are expanded", but
|
||||
I didn't write it yet.
|
||||
|
||||
Vadim Petrochenkov: (That should probably better happen off-line.)
|
||||
|
||||
Vadim Petrochenkov: Now, if you have any questions?
|
||||
|
||||
mark-i-m: Thanks :)
|
||||
|
||||
mark-i-m: /me is still reading :P
|
||||
|
||||
mark-i-m: Ok
|
||||
|
||||
mark-i-m: So I guess my first question is about hygiene, since that remains the
|
||||
most mysterious to me... My understanding is that the parser outputs AST nodes,
|
||||
where each node has a Span
|
||||
|
||||
mark-i-m: In the absence of macros and desugaring, what does the syntax context
|
||||
of an AST node look like?
|
||||
|
||||
mark-i-m: @Vadim Petrochenkov
|
||||
|
||||
Vadim Petrochenkov: Not each node, but many of them. When a node is not
|
||||
macro-expanded, its context is 0.
|
||||
|
||||
Vadim Petrochenkov: aka SyntaxContext::empty()
|
||||
|
||||
Vadim Petrochenkov: it's a chain that consists of one expansion - expansion 0
|
||||
aka ExpnId::root.
|
||||
|
||||
mark-i-m: Do all expansions start at root?
|
||||
|
||||
Vadim Petrochenkov: Also, SyntaxContext:empty() is its own father.
|
||||
|
||||
mark-i-m: Is this actually stored somewhere or is it a logical value?
|
||||
|
||||
Vadim Petrochenkov: All expansion hyerarchies (there are several of them) start
|
||||
at ExpnId::root.
|
||||
|
||||
Vadim Petrochenkov: Vectors in HygieneData has entries for both ctxt == 0 and
|
||||
expn_id == 0.
|
||||
|
||||
Vadim Petrochenkov: I don't think anyone looks into them much though.
|
||||
|
||||
mark-i-m: Ok
|
||||
|
||||
Vadim Petrochenkov: Speaking of multiple hierarchies...
|
||||
|
||||
mark-i-m: Go ahead :)
|
||||
|
||||
Vadim Petrochenkov: One is parent (expn_id1) -> parent(expn_id2) -> ...
|
||||
|
||||
Vadim Petrochenkov: This is the order in which macros are expanded.
|
||||
|
||||
Vadim Petrochenkov: Well.
|
||||
|
||||
Vadim Petrochenkov: When we are expanding one macro another macro is revealed
|
||||
in its output.
|
||||
|
||||
Vadim Petrochenkov: That's the parent-child relation in this hierarchy.
|
||||
|
||||
Vadim Petrochenkov: InternalExpnData::parent is the child->parent link.
|
||||
|
||||
mark-i-m: So in the above chain expn_id1 is the child?
|
||||
|
||||
Vadim Petrochenkov: Yes.
|
||||
|
||||
Vadim Petrochenkov: The second one is parent (SyntaxContext1) ->
|
||||
parent(SyntaxContext2) -> ...
|
||||
|
||||
Vadim Petrochenkov: This is about nested macro definitions. When we are
|
||||
expanding one macro another macro definition is revealed in its output.
|
||||
|
||||
Vadim Petrochenkov: SyntaxContextData::parent is the child->parent link here.
|
||||
|
||||
Vadim Petrochenkov: So, SyntaxContext is the whole chain in this hierarchy, and
|
||||
outer_expns are individual elements in the chain.
|
||||
|
||||
mark-i-m: So for example, suppose I have the following:
|
||||
|
||||
macro_rules! foo { () => { println!(); } }
|
||||
|
||||
fn main() { foo!(); }
|
||||
|
||||
Then AST nodes that are finally generated would have parent(expn_id_println) ->
|
||||
parent(expn_id_foo), right?
|
||||
|
||||
Vadim Petrochenkov: Pretty common construction (at least it was, before
|
||||
refactorings) is SyntaxContext::empty().apply_mark(expn_id), which means...
|
||||
|
||||
Vadim Petrochenkov:
|
||||
|
||||
Then AST nodes that are finally generated would have
|
||||
parent(expn_id_println) -> parent(expn_id_foo), right?
|
||||
|
||||
Yes.
|
||||
|
||||
mark-i-m:
|
||||
|
||||
and outer_expns are individual elements in the chain.
|
||||
|
||||
Sorry, what is outer_expns?
|
||||
|
||||
Vadim Petrochenkov: SyntaxContextData::outer_expn
|
||||
|
||||
mark-i-m: Thanks :) Please continue
|
||||
|
||||
Vadim Petrochenkov: ...which means a token produced by a built-in macro (which
|
||||
is defined in the root effectively).
|
||||
|
||||
mark-i-m: Where does the expn_id come from?
|
||||
|
||||
Vadim Petrochenkov: Or a stable proc macro, which are always considered to be
|
||||
defined in the root because they are always cross-crate, and we don't have the
|
||||
cross-crate hygiene implemented, ha-ha.
|
||||
|
||||
Vadim Petrochenkov:
|
||||
|
||||
Where does the expn_id come from?
|
||||
|
||||
Vadim Petrochenkov: ID of the built-in macro call like line!().
|
||||
|
||||
Vadim Petrochenkov: Assigned continuously from 0 to N as soon as we discover
|
||||
new macro calls.
|
||||
|
||||
mark-i-m: Sorry, I didn't quite understand. Do you mean that only built-in
|
||||
macros receive continuous IDs?
|
||||
|
||||
Vadim Petrochenkov: So, the second hierarchy has a catch - the context
|
||||
transplantation hack -
|
||||
https://github.com/rust-lang/rust/pull/51762#issuecomment-401400732.
|
||||
|
||||
Vadim Petrochenkov:
|
||||
|
||||
Do you mean that only built-in macros receive continuous IDs?
|
||||
|
||||
Vadim Petrochenkov: No, all macro calls receive ID.
|
||||
|
||||
Vadim Petrochenkov: Built-ins have the typical pattern
|
||||
SyntaxContext::empty().apply_mark(expn_id) for syntax contexts produced by
|
||||
them.
|
||||
|
||||
mark-i-m: I see, but this pattern is only used for built-ins, right?
|
||||
|
||||
Vadim Petrochenkov: And also all stable proc macros, see the comments above.
|
||||
|
||||
mark-i-m: Got it
|
||||
|
||||
Vadim Petrochenkov: The third hierarchy is call-site hierarchy.
|
||||
|
||||
Vadim Petrochenkov: If foo!(bar!(ident)) expands into ident
|
||||
|
||||
Vadim Petrochenkov: then hierarchy 1 is root -> foo -> bar -> ident
|
||||
|
||||
Vadim Petrochenkov: but hierarchy 3 is root -> ident
|
||||
|
||||
Vadim Petrochenkov: ExpnInfo::call_site is the child-parent link in this case.
|
||||
|
||||
mark-i-m: When we expand, do we expand foo first or bar? Why is there a
|
||||
hierarchy 1 here? Is that foo expands first and it expands to something that
|
||||
contains bar!(ident)?
|
||||
|
||||
Vadim Petrochenkov: Ah, yes, let's assume both foo and bar are identity macros.
|
||||
|
||||
Vadim Petrochenkov: Then foo!(bar!(ident)) -> expand -> bar!(ident) -> expand
|
||||
-> ident
|
||||
|
||||
Vadim Petrochenkov: If bar were expanded first, that would be eager expansion -
|
||||
https://github.com/rust-lang/rfcs/pull/2320.
|
||||
|
||||
mark-i-m: And after we expand only foo! presumably whatever intermediate state
|
||||
has heirarchy 1 of root->foo->(bar_ident), right?
|
||||
|
||||
Vadim Petrochenkov: (We have it hacked into some built-in macros, but not
|
||||
generally.)
|
||||
|
||||
Vadim Petrochenkov:
|
||||
|
||||
And after we expand only foo! presumably whatever intermediate state has
|
||||
heirarchy 1 of root->foo->(bar_ident), right?
|
||||
|
||||
Vadim Petrochenkov: Yes.
|
||||
|
||||
mark-i-m: Got it :)
|
||||
|
||||
mark-i-m: It looks like we have ~5 minutes left. This has been very helpful
|
||||
already, but I also have more questions. Shall we try to schedule another
|
||||
meeting in the future?
|
||||
|
||||
Vadim Petrochenkov: Sure, why not.
|
||||
|
||||
Vadim Petrochenkov: A thread for offline questions-answers would be good too.
|
||||
|
||||
mark-i-m:
|
||||
|
||||
A thread for offline questions-answers would be good too.
|
||||
|
||||
I don't mind using this thread, since it already has a lot of info in it. We
|
||||
also plan to summarize the info from this thread into the rustc-guide.
|
||||
|
||||
Sure, why not.
|
||||
|
||||
Unfortunately, I'm unavailable for a few weeks. Would August 21-ish work for
|
||||
you (and @WG-learning )?
|
||||
|
||||
mark-i-m: @Vadim Petrochenkov Thanks very much for your time and knowledge!
|
||||
|
||||
mark-i-m: One last question: are there more hierarchies?
|
||||
|
||||
Vadim Petrochenkov: Not that I know of. Three + the context transplantation
|
||||
hack is already more complex than I'd like.
|
||||
|
||||
mark-i-m: Yes, one wonders what it would be like if one also had to think about
|
||||
eager expansion...
|
||||
|
||||
Santiago Pastorino: sorry but I couldn't follow that much today, will read it
|
||||
when I have some time later
|
||||
|
||||
Santiago Pastorino: btw https://github.com/rust-lang/rustc-guide/issues/398
|
||||
|
||||
mark-i-m: @Vadim Petrochenkov Would 7pm UTC on August 21 work for a followup?
|
||||
|
||||
Vadim Petrochenkov: Tentatively yes.
|
||||
|
||||
mark-i-m: @Vadim Petrochenkov @WG-learning Does this still work for everyone?
|
||||
|
||||
Vadim Petrochenkov: August 21 is still ok.
|
||||
|
||||
mark-i-m: @WG-learning @Vadim Petrochenkov We will start in ~30min
|
||||
|
||||
Vadim Petrochenkov: Oh. Thanks for the reminder, I forgot about this entirely.
|
||||
|
||||
mark-i-m: Hello!
|
||||
|
||||
Vadim Petrochenkov: (I'll be here in a couple of minutes.)
|
||||
|
||||
Vadim Petrochenkov: Ok, I'm here.
|
||||
|
||||
mark-i-m: Hi :)
|
||||
|
||||
Vadim Petrochenkov: Hi.
|
||||
|
||||
mark-i-m: so last time, we talked about the 3 context heirarchies
|
||||
|
||||
Vadim Petrochenkov: Right.
|
||||
|
||||
mark-i-m: Was there anything you wanted to add to that? If not, I think it
|
||||
would be good to get a big-picture... Given some piece of rust code, how do we
|
||||
get to the point where things are expanded and hygiene context is computed?
|
||||
|
||||
mark-i-m: (I'm assuming that hygiene info is computed as we expand stuff, since
|
||||
I don't think you can discover it beforehand)
|
||||
|
||||
Vadim Petrochenkov: Ok, let's move from hygiene to expansion.
|
||||
|
||||
Vadim Petrochenkov: Especially given that I don't remember the specific hygiene
|
||||
algorithms like adjust in detail.
|
||||
|
||||
Vadim Petrochenkov:
|
||||
|
||||
Given some piece of rust code, how do we get to the point where things are
|
||||
expanded
|
||||
|
||||
So, first of all, the "some piece of rust code" is the whole crate.
|
||||
|
||||
mark-i-m: Just to confirm, the algorithms are well-encapsulated, right? Like a
|
||||
function or a struct as opposed to a bunch of conventions distributed across
|
||||
the codebase?
|
||||
|
||||
Vadim Petrochenkov: We run fully_expand_fragment in it.
|
||||
|
||||
Vadim Petrochenkov:
|
||||
|
||||
Just to confirm, the algorithms are well-encapsulated, right?
|
||||
|
||||
Yes, the algorithmic parts are entirely inside hygiene.rs.
|
||||
|
||||
Vadim Petrochenkov: Ok, some are in fn resolve_crate_root, but those are hacks.
|
||||
|
||||
Vadim Petrochenkov: (Continuing about expansion.) If fully_expand_fragment is
|
||||
run not on a whole crate, it means that we are performing eager expansion.
|
||||
|
||||
Vadim Petrochenkov: Eager expansion is done for arguments of some built-in
|
||||
macros that expect literals.
|
||||
|
||||
Vadim Petrochenkov: It generally performs a subset of actions performed by the
|
||||
non-eager expansion.
|
||||
|
||||
Vadim Petrochenkov: So, I'll talk about non-eager expansion for now.
|
||||
|
||||
mark-i-m: Eager expansion is not exposed as a language feature, right? i.e. it
|
||||
is not possible for me to write an eager macro?
|
||||
|
||||
Vadim Petrochenkov:
|
||||
https://github.com/rust-lang/rust/pull/53778#issuecomment-419224049 (vvv The
|
||||
link is explained below vvv )
|
||||
|
||||
Vadim Petrochenkov:
|
||||
|
||||
Eager expansion is not exposed as a language feature, right? i.e. it is not
|
||||
possible for me to write an eager macro?
|
||||
|
||||
Yes, it's entirely an ability of some built-in macros.
|
||||
|
||||
Vadim Petrochenkov: Not exposed for general use.
|
||||
|
||||
Vadim Petrochenkov: fully_expand_fragment works in iterations.
|
||||
|
||||
Vadim Petrochenkov: Iterations looks roughly like this:
|
||||
- Resolve imports in our partially built crate as much as possible.
|
||||
- Collect as many macro invocations as possible from our partially built crate
|
||||
(fn-like, attributes, derives) from the crate and add them to the queue.
|
||||
|
||||
Vadim Petrochenkov: Take a macro from the queue, and attempt to resolve it.
|
||||
|
||||
Vadim Petrochenkov: If it's resolved - run its expander function that
|
||||
consumes tokens or AST and produces tokens or AST (depending on the macro
|
||||
kind).
|
||||
|
||||
Vadim Petrochenkov: (If it's not resolved, then put it back into the
|
||||
queue.)
|
||||
|
||||
Vadim Petrochenkov: ^^^ That's where we fill in the hygiene data associated
|
||||
with ExpnIds.
|
||||
|
||||
mark-i-m: When we put it back in the queue?
|
||||
|
||||
mark-i-m: or do you mean the collect step in general?
|
||||
|
||||
Vadim Petrochenkov: Once we resolved the macro call to the macro definition we
|
||||
know everything about the macro and can call set_expn_data to fill in its
|
||||
properties in the global data.
|
||||
|
||||
Vadim Petrochenkov: I mean, immediately after successful resolution.
|
||||
|
||||
Vadim Petrochenkov: That's the first part of hygiene data, the second one is
|
||||
associated with SyntaxContext rather than with ExpnId, it's filled in later
|
||||
during expansion.
|
||||
|
||||
Vadim Petrochenkov: So, after we run the macro's expander function and got a
|
||||
piece of AST (or got tokens and parsed them into a piece of AST) we need to
|
||||
integrate that piece of AST into the big existing partially built AST.
|
||||
|
||||
Vadim Petrochenkov: This integration is a really important step where the next
|
||||
things happen:
|
||||
- NodeIds are assigned.
|
||||
|
||||
Vadim Petrochenkov: "def paths"s and their IDs (DefIds) are created
|
||||
|
||||
Vadim Petrochenkov: Names are put into modules from the resolver point of
|
||||
view.
|
||||
|
||||
Vadim Petrochenkov: So, we are basically turning some vague token-like mass
|
||||
into proper set in stone hierarhical AST and side tables.
|
||||
|
||||
Vadim Petrochenkov: Where exactly this happens - NodeIds are assigned by
|
||||
InvocationCollector (which also collects new macro calls from this new AST
|
||||
piece and adds them to the queue), DefIds are created by DefCollector, and
|
||||
modules are filled by BuildReducedGraphVisitor.
|
||||
|
||||
Vadim Petrochenkov: These three passes run one after another on every AST
|
||||
fragment freshly expanded from a macro.
|
||||
|
||||
Vadim Petrochenkov: After expanding a single macro and integrating its output
|
||||
we again try to resolve all imports in the crate, and then return to the big
|
||||
queue processing loop and pick up the next macro.
|
||||
|
||||
Vadim Petrochenkov: Repeat until there's no more macros. Vadim Petrochenkov:
|
||||
|
||||
mark-i-m: The integration step is where we would get parser errors too right?
|
||||
|
||||
mark-i-m: Also, when do we know definitively that resolution has failed for
|
||||
particular ident?
|
||||
|
||||
Vadim Petrochenkov:
|
||||
|
||||
The integration step is where we would get parser errors too right?
|
||||
|
||||
Yes, if the macro produced tokens (rather than AST directly) and we had to
|
||||
parse them.
|
||||
|
||||
Vadim Petrochenkov:
|
||||
|
||||
when do we know definitively that resolution has failed for particular
|
||||
ident?
|
||||
|
||||
So, ident is looked up in a number of scopes during resolution. From closest
|
||||
like the current block or module, to far away like preludes or built-in types.
|
||||
|
||||
Vadim Petrochenkov: If lookup is certainly failed in all of the scopes, then
|
||||
it's certainly failed.
|
||||
|
||||
mark-i-m: This is after all expansions and integrations are done, right?
|
||||
|
||||
Vadim Petrochenkov: "Certainly" is determined differently for different scopes,
|
||||
e.g. for a module scope it means no unexpanded macros and no unresolved glob
|
||||
imports in that module.
|
||||
|
||||
Vadim Petrochenkov:
|
||||
|
||||
This is after all expansions and integrations are done, right?
|
||||
|
||||
For macro and import names this happens during expansions and integrations.
|
||||
|
||||
mark-i-m: Makes sense
|
||||
|
||||
Vadim Petrochenkov: For all other names we certainly know whether a name is
|
||||
resolved successfully or not on the first attempt, because no new names can
|
||||
appear.
|
||||
|
||||
Vadim Petrochenkov: (They are resolved in a later pass, see
|
||||
librustc_resolve/late.rs.)
|
||||
|
||||
mark-i-m: And if at the end of the iteration, there are still things in the
|
||||
queue that can't be resolve, this represents an error, right?
|
||||
|
||||
mark-i-m: i.e. an undefined macro?
|
||||
|
||||
Vadim Petrochenkov: Yes, if we make no progress during an iteration, then we
|
||||
are stuck and that state represent an error.
|
||||
|
||||
Vadim Petrochenkov: We attempt to recover though, using dummies expanding into
|
||||
nothing or ExprKind::Err or something like that for unresolved macros.
|
||||
|
||||
mark-i-m: This is for the purposes of diagnostics, though, right?
|
||||
|
||||
Vadim Petrochenkov: But if we are going through recovery, then compilation must
|
||||
result in an error anyway.
|
||||
|
||||
Vadim Petrochenkov: Yes, that's for diagnostics, without recovery we would
|
||||
stuck at the first unresolved macro or import. Vadim Petrochenkov:
|
||||
|
||||
So, about the SyntaxContext hygiene...
|
||||
|
||||
Vadim Petrochenkov: New syntax contexts are created during macro expansion.
|
||||
|
||||
Vadim Petrochenkov: If the token had context X before being produced by a
|
||||
macro, e.g. here ident has context SyntaxContext::root(): Vadim Petrochenkov:
|
||||
|
||||
macro m() { ident }
|
||||
|
||||
Vadim Petrochenkov: , then after being produced by the macro it has context X
|
||||
-> macro_id.
|
||||
|
||||
Vadim Petrochenkov: I.e. our ident has context ROOT -> id(m) after it's
|
||||
produced by m.
|
||||
|
||||
Vadim Petrochenkov: The "chaining operator" -> is apply_mark in compiler code.
|
||||
Vadim Petrochenkov:
|
||||
|
||||
macro m() { macro n() { ident } }
|
||||
|
||||
Vadim Petrochenkov: In this example the ident has context ROOT originally, then
|
||||
ROOT -> id(m), then ROOT -> id(m) -> id(n).
|
||||
|
||||
Vadim Petrochenkov: Note that these chains are not entirely determined by their
|
||||
last element, in other words ExpnId is not isomorphic to SyntaxCtxt.
|
||||
|
||||
Vadim Petrochenkov: Couterexample: Vadim Petrochenkov:
|
||||
|
||||
macro m($i: ident) { macro n() { ($i, bar) } }
|
||||
|
||||
m!(foo);
|
||||
|
||||
Vadim Petrochenkov: foo has context ROOT -> id(n) and bar has context ROOT ->
|
||||
id(m) -> id(n) after all the expansions.
|
||||
|
||||
mark-i-m: Cool :)
|
||||
|
||||
mark-i-m: It looks like we are out of time
|
||||
|
||||
mark-i-m: Is there anything you wanted to add?
|
||||
|
||||
mark-i-m: We can schedule another meeting if you would like
|
||||
|
||||
Vadim Petrochenkov: Yep, 23.06 already. No, I think this is an ok point to
|
||||
stop.
|
||||
|
||||
mark-i-m: :+1:
|
||||
|
||||
mark-i-m: Thanks @Vadim Petrochenkov ! This was very helpful
|
||||
|
||||
Vadim Petrochenkov: Yeah, we can schedule another one. So far it's been like 1
|
||||
hour of meetings per month? Certainly not a big burden.
|
||||
```
|
||||
|
|
|
|||
Loading…
Reference in New Issue