minor edits
parent 26558f882c
commit 91d6ab9de4
@@ -1,8 +1,8 @@
 # Lexing and Parsing
 
-The very first thing the compiler does is take the program (in Unicode) and
-transmute it into a data format the compiler can work with more conveniently
-than strings. This happens in two stages: Lexing and Parsing.
+The very first thing the compiler does is take the program (in UTF-8 Unicode text)
+and turn it into a data format the compiler can work with more conveniently than strings.
+This happens in two stages: Lexing and Parsing.
 
 1. _Lexing_ takes strings and turns them into streams of [tokens]. For
 example, `foo.bar + buz` would be turned into the tokens `foo`, `.`, `bar`,
@@ -13,38 +13,36 @@ than strings. This happens in two stages: Lexing and Parsing.
 
 2. _Parsing_ takes streams of tokens and turns them into a structured form
 which is easier for the compiler to work with, usually called an [*Abstract
-Syntax Tree* (`AST`)][ast] .
+Syntax Tree* (AST)][ast] .
 
 
-An `AST` mirrors the structure of a Rust program in memory, using a `Span` to
-link a particular `AST` node back to its source text. The `AST` is defined in
+An AST mirrors the structure of a Rust program in memory, using a `Span` to
+link a particular AST node back to its source text. The AST is defined in
 [`rustc_ast`][rustc_ast], along with some definitions for tokens and token
-streams, data structures/`trait`s for mutating `AST`s, and shared definitions for
-other `AST`-related parts of the compiler (like the lexer and
-`macro`-expansion).
+streams, data structures/traits for mutating ASTs, and shared definitions for
+other AST-related parts of the compiler (like the lexer and
+macro-expansion).
 
 The lexer is developed in [`rustc_lexer`][lexer].
 
 The parser is defined in [`rustc_parse`][rustc_parse], along with a
 high-level interface to the lexer and some validation routines that run after
-`macro` expansion. In particular, the [`rustc_parse::parser`][parser] contains
+macro expansion. In particular, the [`rustc_parse::parser`][parser] contains
 the parser implementation.
 
 The main entrypoint to the parser is via the various `parse_*` functions and others in
 [rustc_parse][rustc_parse]. They let you do things like turn a [`SourceFile`][sourcefile]
 (e.g. the source in a single file) into a token stream, create a parser from
-the token stream, and then execute the parser to get a [`Crate`] (the root `AST`
+the token stream, and then execute the parser to get a [`Crate`] (the root AST
 node).
 
-To minimize the amount of copying that is done, both [`StringReader`] and
-[`Parser`] have lifetimes which bind them to the parent [`ParseSess`]. This
-contains all the information needed while parsing, as well as the [`SourceMap`]
-itself.
+To minimize the amount of copying that is done,
+both [`StringReader`] and [`Parser`] have lifetimes which bind them to the parent [`ParseSess`].
+This contains all the information needed while parsing, as well as the [`SourceMap`] itself.
 
-Note that while parsing, we may encounter `macro` definitions or invocations. We
-set these aside to be expanded (see [Macro Expansion](./macro-expansion.md)).
-Expansion itself may require parsing the output of a `macro`, which may reveal
-more `macro`s to be expanded, and so on.
+Note that while parsing, we may encounter macro definitions or invocations.
+We set these aside to be expanded (see [Macro Expansion](./macro-expansion.md)).
+Expansion itself may require parsing the output of a macro, which may reveal more macros to be expanded, and so on.
 
 ## More on Lexical Analysis
 
||||||
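Note on the lexing step described in the first hunk: the sketch below is a minimal, hypothetical illustration of turning `foo.bar + buz` into a stream of tokens. The `Token` enum and the `lex` function are invented for this note and are not the `rustc_lexer` API; the real lexer also handles literals, comments, and much more.

```rust
/// Toy token kinds. Hypothetical names, not rustc's real token types.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),
    Dot,
    Plus,
}

/// Split `src` into tokens, skipping whitespace.
/// Returns an error message for any character this sketch does not know.
fn lex(src: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            // Whitespace only separates tokens; it produces none.
            chars.next();
        } else if c == '.' {
            chars.next();
            tokens.push(Token::Dot);
        } else if c == '+' {
            chars.next();
            tokens.push(Token::Plus);
        } else if c.is_alphabetic() || c == '_' {
            // Consume the longest run of identifier characters.
            let mut ident = String::new();
            while let Some(&d) = chars.peek() {
                if d.is_alphanumeric() || d == '_' {
                    ident.push(d);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(Token::Ident(ident));
        } else {
            return Err(format!("unexpected character {c:?}"));
        }
    }
    Ok(tokens)
}

fn main() {
    let tokens = lex("foo.bar + buz").expect("input uses only known characters");
    // Prints: [Ident("foo"), Dot, Ident("bar"), Plus, Ident("buz")]
    println!("{tokens:?}");
}
```

A parser would then consume a token stream like this one and build a tree from it, which is the second stage the changed text describes.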