[overview.md] Add command line argument parsing, lexer stages, and parser outline

Chris Simpkins 2020-04-03 01:41:04 -04:00 committed by Who? Me?!
parent a43ef4d3b3
commit 0783019c12
1 changed file with 66 additions and 66 deletions

@@ -19,16 +19,16 @@ we'll talk about that later.
**TODO: someone else should confirm this vvv**
- User writes a program and invokes `rustc` on it (possibly through `cargo`).
- First, we parse command line flags, etc. This is done in [`librustc_driver`].
We now know what the exact work is we need to do (e.g. which nightly features
are enabled, whether we are doing a `check`-only build or emitting LLVM-IR or
a full compilation).
- Then, we start to do compilation...
- We first [_lex_ the user program][lex]. This turns the program into a stream
of _tokens_ (yes, the same sort of tokens as `proc_macros` (sort of)).
[`StringReader`] from [`librustc_parse`] integrates [`librustc_lexer`] with
`rustc` data structures.
- The compile process begins when a user writes a Rust source program as text and invokes the `rustc` compiler on it. Command line options define the work that the compiler needs to perform. For example, it is possible to enable nightly features, perform `check`-only builds, or emit LLVM-IR rather than run the entire compile process described here. The `rustc` call may be indirect, through `cargo`.
- Command line argument parsing occurs in [`librustc_driver`]. This crate defines the compile configuration that the user requests.
- The raw Rust source text is analyzed by a low-level lexer located in [`librustc_lexer`]. At this stage, the source text is turned into a stream of atomic source code units known as _tokens_. (**TODO**: chrissimpkins - Maybe discuss Unicode handling during this stage?)
- The token stream passes through a higher-level lexer located in [`librustc_parse`] to prepare for the next stage of the compile process. The [`StringReader`] struct is used at this stage to perform a set of validations and turn strings into interned symbols.
- (**TODO**: chrissimpkins - Expand info on parser) We then [_parse_ the stream of tokens][parser] to build an Abstract Syntax Tree (AST).
- Macro expansion (**TODO** chrissimpkins)
- AST validation (**TODO** chrissimpkins)
- Name resolution (**TODO** chrissimpkins)
- Early linting (**TODO** chrissimpkins)
- We then [_parse_ the stream of tokens][parser] to build an Abstract Syntax
Tree (AST).
- We then take the AST and [convert it to High-Level Intermediate