update parser chapter

Mark Mansi 2019-11-11 10:36:45 -06:00 committed by Who? Me?!
parent 7616ab36bc
commit a50b8f144f
4 changed files with 33 additions and 24 deletions

@@ -38,7 +38,7 @@
- [Incremental compilation](./queries/incremental-compilation.md)
- [Incremental compilation In Detail](./queries/incremental-compilation-in-detail.md)
- [Debugging and Testing](./incrcomp-debugging.md)
- [The parser](./the-parser.md)
- [Lexing and Parsing](./the-parser.md)
- [`#[test]` Implementation](./test-implementation.md)
- [Macro expansion](./macro-expansion.md)
- [Name resolution](./name-resolution.md)

@@ -24,7 +24,7 @@ Item | Kind | Short description | Chapter |
`SourceFile` | struct | Part of the `SourceMap`. Maps AST nodes to their source code for a single source file. Was previously called FileMap | [The parser] | [src/libsyntax_pos/lib.rs](https://doc.rust-lang.org/nightly/nightly-rustc/syntax/source_map/struct.SourceFile.html)
`SourceMap` | struct | Maps AST nodes to their source code. It is composed of `SourceFile`s. Was previously called CodeMap | [The parser] | [src/libsyntax/source_map.rs](https://doc.rust-lang.org/nightly/nightly-rustc/syntax/source_map/struct.SourceMap.html)
`Span` | struct | A location in the user's source code, used for error reporting primarily | [Emitting Diagnostics] | [src/libsyntax_pos/span_encoding.rs](https://doc.rust-lang.org/nightly/nightly-rustc/syntax_pos/struct.Span.html)
`StringReader` | struct | This is the lexer used during parsing. It consumes characters from the raw source code being compiled and produces a series of tokens for use by the rest of the parser | [The parser] | [src/libsyntax/parse/lexer/mod.rs](https://doc.rust-lang.org/nightly/nightly-rustc/syntax/parse/lexer/struct.StringReader.html)
`StringReader` | struct | This is the lexer used during parsing. It consumes characters from the raw source code being compiled and produces a series of tokens for use by the rest of the parser | [The parser] | [src/librustc_parse/lexer/mod.rs](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/lexer/struct.StringReader.html)
`syntax::token_stream::TokenStream` | struct | An abstract sequence of tokens, organized into `TokenTree`s | [The parser], [Macro expansion] | [src/libsyntax/tokenstream.rs](https://doc.rust-lang.org/nightly/nightly-rustc/syntax/tokenstream/struct.TokenStream.html)
`TraitDef` | struct | This struct contains a trait's definition with type information | [The `ty` modules] | [src/librustc/ty/trait_def.rs](https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/trait_def/struct.TraitDef.html)
`TraitRef` | struct | The combination of a trait and its input types (e.g. `P0: Trait<P1...Pn>`) | [Trait Solving: Goals and Clauses], [Trait Solving: Lowering impls] | [src/librustc/ty/sty.rs](https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/struct.TraitRef.html)

@@ -1,5 +1,8 @@
# Macro expansion
> `libsyntax`, `librustc_expand`, and `libsyntax_ext` are all undergoing
> refactoring, so some of the links in this chapter may be broken.
Macro expansion happens during parsing. `rustc` has two parsers, in fact: the
normal Rust parser, and the macro parser. During the parsing phase, the normal
Rust parser will set aside the contents of macros and their invocations. Later,

@@ -1,29 +1,35 @@
# The Parser
# Lexing and Parsing
The parser is responsible for converting raw Rust source code into a structured
> The parser and lexer are currently undergoing a lot of refactoring, so parts
> of this chapter may be out of date.
The very first thing the compiler does is take the program (in Unicode
characters) and turn it into something the compiler can work with more
conveniently than strings. This happens in two stages: Lexing and Parsing.
Lexing takes strings and turns them into streams of tokens. For example,
`a.b + c` would be turned into the tokens `a`, `.`, `b`, `+`, and `c`.
The lexer lives in [`librustc_lexer`][lexer].
[lexer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/index.html
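As a rough illustration of the lexing step, here is a minimal toy lexer that handles the `a.b + c` example. The `Token` enum and `lex` function are hypothetical names invented for this sketch and are not the `librustc_lexer` API, which recognizes many more token kinds (literals, comments, lifetimes, and so on).

```rust
/// A toy token type (hypothetical; the real lexer's token kinds are far richer).
#[derive(Debug, Clone, PartialEq)]
enum Token {
    /// An identifier such as `a` or `foo_bar`.
    Ident(String),
    /// A single punctuation character such as `.` or `+`.
    Punct(char),
}

/// Splits `src` into identifiers and single-character punctuation,
/// skipping whitespace between tokens.
fn lex(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();

    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            // Whitespace only separates tokens; it produces none itself.
            chars.next();
        } else if c.is_alphabetic() || c == '_' {
            // Consume the longest run of identifier characters.
            let mut ident = String::new();
            while let Some(&d) = chars.peek() {
                if d.is_alphanumeric() || d == '_' {
                    ident.push(d);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(Token::Ident(ident));
        } else {
            // Anything else is treated as one punctuation token.
            tokens.push(Token::Punct(c));
            chars.next();
        }
    }

    tokens
}
```

Calling `lex("a.b + c")` yields `Ident("a")`, `Punct('.')`, `Ident("b")`, `Punct('+')`, `Ident("c")`, matching the token list above.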
Parsing then takes streams of tokens and turns them into a structured
form which is easier for the compiler to work with, usually called an [*Abstract
Syntax Tree*][ast]. An AST mirrors the structure of a Rust program in memory,
Syntax Tree*][ast] (AST). An AST mirrors the structure of a Rust program in memory,
using a `Span` to link a particular AST node back to its source text.
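To make the role of spans concrete, here is a small sketch of an AST for the expression `a.b + c` in which every node carries a byte range into the source. The `Span`, `Expr`, `example_ast`, and `snippet` definitions are simplified, hypothetical stand-ins invented for illustration; the real `Span` and AST types in the compiler are considerably more involved.

```rust
/// A half-open byte range `lo..hi` into the source text
/// (a hypothetical stand-in for the compiler's `Span`).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    lo: usize,
    hi: usize,
}

/// A toy expression AST; each node records where it came from.
#[derive(Debug, PartialEq)]
enum Expr {
    Ident { name: String, span: Span },
    Field { base: Box<Expr>, field: String, span: Span },
    Add { lhs: Box<Expr>, rhs: Box<Expr>, span: Span },
}

impl Expr {
    /// Returns the span covering this whole node.
    fn span(&self) -> Span {
        match self {
            Expr::Ident { span, .. }
            | Expr::Field { span, .. }
            | Expr::Add { span, .. } => *span,
        }
    }
}

/// Builds, by hand, the AST a parser might produce for `a.b + c`.
fn example_ast() -> Expr {
    let a = Expr::Ident { name: "a".to_string(), span: Span { lo: 0, hi: 1 } };
    let a_b = Expr::Field {
        base: Box::new(a),
        field: "b".to_string(),
        span: Span { lo: 0, hi: 3 },
    };
    let c = Expr::Ident { name: "c".to_string(), span: Span { lo: 6, hi: 7 } };
    Expr::Add { lhs: Box::new(a_b), rhs: Box::new(c), span: Span { lo: 0, hi: 7 } }
}

/// Recovers the source text that a span refers to.
fn snippet(src: &str, span: Span) -> &str {
    &src[span.lo..span.hi]
}
```

With `src = "a.b + c"`, `snippet(src, example_ast().span())` returns the whole expression, while the span of the left operand recovers just `a.b`. This is the same idea that lets diagnostics point at the exact piece of source an AST node came from.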
The bulk of the parser lives in the [libsyntax] crate.
The AST is defined in [`libsyntax`][libsyntax], along with some definitions for
tokens and token streams, data structures/traits for mutating ASTs, and shared
definitions for other AST-related parts of the compiler (like the lexer and
macro-expansion).
Like most parsers, the parsing process is composed of two main steps:
- lexical analysis turns a stream of characters into a stream of token trees
- parsing turns the token trees into an AST
The `syntax` crate contains several main players:
- a [`SourceMap`] for mapping AST nodes to their source code
- the [ast module] contains types corresponding to each AST node
- a [`StringReader`] for lexing source code into tokens
- the [parser module] and [`Parser`] struct are in charge of actually parsing
tokens into AST nodes,
- and a [visit module] for walking the AST and inspecting or mutating the AST
nodes.
The parser is defined in [`librustc_parse`][librustc_parse], along with a
high-level interface to the lexer and some validation routines that run after
macro expansion. In particular, the [`rustc_parse::parser`][parser] module
contains the parser implementation.
The main entrypoint to the parser is via the various `parse_*` functions in the
[parser module]. They let you do things like turn a [`SourceFile`][sourcefile]
[parser][parser] module. They let you do things like turn a [`SourceFile`][sourcefile]
(e.g. the source in a single file) into a token stream, create a parser from
the token stream, and then execute the parser to get a `Crate` (the root AST
node).
@@ -44,14 +50,14 @@ Code for lexical analysis is split between two crates:
specific data structures. Specifically, it adds `Span` information to tokens
returned by `rustc_lexer` and interns identifiers.
[libsyntax]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/index.html
[rustc_errors]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_errors/index.html
[ast]: https://en.wikipedia.org/wiki/Abstract_syntax_tree
[`SourceMap`]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/source_map/struct.SourceMap.html
[ast module]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/ast/index.html
[parser module]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/parse/index.html
[librustc_parse]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/index.html
[parser]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/index.html
[`Parser`]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/parse/parser/struct.Parser.html
[`StringReader`]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/parse/lexer/struct.StringReader.html
[`StringReader`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/lexer/struct.StringReader.html
[visit module]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/visit/index.html
[sourcefile]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/source_map/struct.SourceFile.html