From a12e9e31a32d7ede8bfffdb3d668393757e7a2ad Mon Sep 17 00:00:00 2001
From: Chris Simpkins
Date: Sun, 5 Apr 2020 22:50:28 -0400
Subject: [PATCH] [overview.md] add documentation of lexer support for Unicode encoding

---
 src/overview.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/overview.md b/src/overview.md
index 751a01ec..d641718c 100644
--- a/src/overview.md
+++ b/src/overview.md
@@ -28,8 +28,8 @@ we'll talk about that later.
   to the rest of the compilation process as a [`rustc_interface::Config`].
 - The raw Rust source text is analyzed by a low-level lexer located in
   [`librustc_lexer`]. At this stage, the source text is turned into a stream of
-  atomic source code units known as _tokens_. (**TODO**: chrissimpkins - Maybe
-  discuss Unicode handling during this stage?)
+  atomic source code units known as _tokens_. The lexer supports the Unicode
+  character encoding.
 - The token stream passes through a higher-level lexer located in
   [`librustc_parse`] to prepare for the next stage of the compile process. The
   [`StringReader`] struct is used at this stage to perform a set of validations