Update parallel-rustc.md (#1926)

Co-authored-by: SparrowLii <liyuan179@huawei.com>
Co-authored-by: Jieyou Xu <jieyouxu@outlook.com>
Tbkhi 2024-11-08 05:59:36 -03:00 committed by GitHub
parent 5d7107b836
commit c423c5636d
1 changed files with 70 additions and 51 deletions


@ -1,31 +1,46 @@
# Parallel Compilation # Parallel Compilation
As of <!-- date-check --> August 2022, the only stage of the compiler that <div class="warning">
is already parallel is codegen. Some parts of the compiler already have The parallel front-end is currently (as of November 2024) undergoing significant
parallel implementations, such as query evaluation, type check and changes, so this page contains quite a bit of outdated information.
monomorphization, but the general version of the compiler does not include
these parallelization functions. **To try out the current parallel compiler**,
one can install rustc from source code with `parallel-compiler = true` in
the `config.toml`.
The lack of parallelism at other stages (for example, macro expansion) also Tracking issue: <https://github.com/rust-lang/rust/issues/113349>
represents an opportunity for improving compiler performance. </div>
These next few sections describe where and how parallelism is currently used, As of <!-- date-check --> November 2024, most of the Rust compiler is now
and the current status of making parallel compilation the default in `rustc`. parallelized.
## Codegen - The codegen part is executed concurrently by default. You can use the `-C
codegen-units=n` option to control the number of concurrent tasks.
- The parts after HIR lowering to codegen such as type checking, borrow
checking, and MIR optimization are parallelized in the nightly version.
Currently, they are executed in serial by default, and parallelization is
manually enabled by the user using the `-Z threads=n` option.
- Other parts, such as lexical parsing, HIR lowering, and macro expansion, are
still executed in serial mode.
During [monomorphization][monomorphization] the compiler splits up all the code to <div class="warning">
The following sections are kept for now but are quite outdated.
</div>
---
[codegen]: backend/codegen.md
## Code Generation
During monomorphization the compiler splits up all the code to
be generated into smaller chunks called _codegen units_. These are then generated by be generated into smaller chunks called _codegen units_. These are then generated by
independent instances of LLVM running in parallel. At the end, the linker independent instances of LLVM running in parallel. At the end, the linker
is run to combine all the codegen units together into one binary. This process is run to combine all the codegen units together into one binary. This process
occurs in the `rustc_codegen_ssa::base` module. occurs in the [`rustc_codegen_ssa::base`] module.
[`rustc_codegen_ssa::base`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/index.html
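
The split into codegen units followed by a final link step can be pictured with a small, self-contained sketch. This is not how `rustc_codegen_ssa::base` is written: `CodegenUnit`, `partition`, and `compile_and_link` are hypothetical names, the "compilation" is only string formatting, and plain `std::thread::scope` stands in for the real LLVM worker machinery.

```rust
use std::thread;

/// Hypothetical stand-in for a codegen unit: a named group of items to compile.
struct CodegenUnit {
    name: String,
    items: Vec<String>,
}

/// Split `items` into at most `n` units of roughly equal size.
fn partition(items: Vec<String>, n: usize) -> Vec<CodegenUnit> {
    let n = n.max(1);
    // Ceiling division, clamped to 1 so `chunks` never receives 0.
    let chunk = ((items.len() + n - 1) / n).max(1);
    items
        .chunks(chunk)
        .enumerate()
        .map(|(i, c)| CodegenUnit { name: format!("cgu.{i}"), items: c.to_vec() })
        .collect()
}

/// "Compile" every unit on its own thread, then "link" the outputs in unit order.
fn compile_and_link(units: &[CodegenUnit]) -> String {
    let objects: Vec<String> = thread::scope(|s| {
        let handles: Vec<_> = units
            .iter()
            .map(|u| s.spawn(move || format!("{}[{}]", u.name, u.items.join(","))))
            .collect();
        // Joining in spawn order keeps the final output deterministic.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    objects.join(" ")
}
```

Because each unit only reads its own items, the threads need no locks; the only synchronization point is the join before the link step.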
## Data Structures ## Data Structures
The underlying thread-safe data-structures used in the parallel compiler The underlying thread-safe data-structures used in the parallel compiler
can be found in the `rustc_data_structures::sync` module. These data structures can be found in the [`rustc_data_structures::sync`] module. These data structures
are implemented differently depending on whether `parallel-compiler` is true. are implemented differently depending on whether `parallel-compiler` is true.
| data structure | parallel | non-parallel | | data structure | parallel | non-parallel |
@ -45,34 +60,39 @@ are implemented differently depending on whether `parallel-compiler` is true.
| LockGuard | parking_lot::MutexGuard | std::cell::RefMut | | LockGuard | parking_lot::MutexGuard | std::cell::RefMut |
| MappedLockGuard | parking_lot::MappedMutexGuard | std::cell::RefMut | | MappedLockGuard | parking_lot::MappedMutexGuard | std::cell::RefMut |
- These thread-safe data structures interspersed during compilation can - These thread-safe data structures are interspersed during compilation which
cause a lot of lock contention, which actually degrades performance as the can cause lock contention resulting in degraded performance as the number of
number of threads increases beyond 4. This inspires us to audit the use threads increases beyond 4. So we audit the use of these data structures
of these data structures, leading to either refactoring to reduce use of which leads to either a refactoring so as to reduce the use of shared state,
shared state, or persistent documentation covering invariants, atomicity, or the authoring of persistent documentation covering the specifics of the
and lock orderings. invariants, the atomicity, and the lock orderings.
- On the other hand, we still need to figure out what other invariants - On the other hand, we still need to figure out what other invariants
during compilation might not hold in parallel compilation. during compilation might not hold in parallel compilation.
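
The idea of one name backed by two implementations can be sketched as follows. This is only an illustration under stated assumptions: `parallel_sketch` is a made-up `cfg` name (so a recent compiler may warn about an unexpected `cfg`), `std::sync::Mutex` stands in for `parking_lot::Mutex` to keep the example dependency-free, and `count_words` is a hypothetical caller.

```rust
use std::collections::HashMap;

// Thread-safe variant, selected only when the made-up cfg is set.
#[cfg(parallel_sketch)]
mod imp {
    use std::sync::{Mutex, MutexGuard};

    pub struct Lock<T>(Mutex<T>);

    impl<T> Lock<T> {
        pub fn new(value: T) -> Self { Lock(Mutex::new(value)) }
        pub fn lock(&self) -> MutexGuard<'_, T> { self.0.lock().unwrap() }
        pub fn into_inner(self) -> T { self.0.into_inner().unwrap() }
    }
}

// Single-threaded variant: same API, but no atomic operations at all.
#[cfg(not(parallel_sketch))]
mod imp {
    use std::cell::{RefCell, RefMut};

    pub struct Lock<T>(RefCell<T>);

    impl<T> Lock<T> {
        pub fn new(value: T) -> Self { Lock(RefCell::new(value)) }
        pub fn lock(&self) -> RefMut<'_, T> { self.0.borrow_mut() }
        pub fn into_inner(self) -> T { self.0.into_inner() }
    }
}

use imp::Lock;

/// Caller code is identical in both configurations.
fn count_words(words: &[&str]) -> HashMap<String, usize> {
    let table: Lock<HashMap<String, usize>> = Lock::new(HashMap::new());
    for word in words {
        // The guard is dropped at the end of this statement.
        *table.lock().entry(word.to_string()).or_insert(0) += 1;
    }
    table.into_inner()
}
```

Keeping each guard alive for a single statement, as in `count_words`, is one way to limit the lock contention described above.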
### WorkLocal [`rustc_data_structures::sync`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_data_structures/sync/index.html
`WorkLocal` is a special data structure implemented for parallel compiler. ### WorkerLocal
It holds worker-locals values for each thread in a thread pool. You can only
access the worker local value through the Deref impl on the thread pool it
was constructed on. It will panic otherwise.
`WorkLocal` is used to implement the `Arena` allocator in the parallel [`WorkerLocal`] is a special data structure implemented for the parallel compiler. It
environment, which is critical in parallel queries. Its implementation holds worker-local values for each thread in a thread pool. You can only
is located in the `rustc-rayon-core::worker_local` module. However, in the access the worker-local value through the `Deref` `impl` on the thread pool it
non-parallel compiler, it is implemented as `(OneThread<T>)`, whose `T` was constructed on. It panics otherwise.
`WorkerLocal` is used to implement the `Arena` allocator in the parallel
environment, which is critical in parallel queries. Its implementation is
located in the [`rustc_data_structures::sync::worker_local`] module. However,
in the non-parallel compiler, it is implemented as `(OneThread<T>)`, whose `T`
can be accessed directly through `Deref::deref`. can be accessed directly through `Deref::deref`.
[`rustc_data_structures::sync::worker_local`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_data_structures/sync/worker_local/index.html
[`WorkerLocal`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_data_structures/sync/worker_local/struct.WorkerLocal.html
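
A rough model of the behavior described above (one slot per worker, access through `Deref`, panic from any other thread) might look like this. It is a sketch, not the real `WorkerLocal`: `WorkerLocalSketch`, `WORKER_INDEX`, and `sum_per_worker` are hypothetical, the "pool" is a handful of scoped threads, and the sketch relies on `T: Sync` instead of the `unsafe` code a real implementation would need.

```rust
use std::cell::Cell;
use std::ops::Deref;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

thread_local! {
    /// Index of the current thread inside the (hypothetical) pool, if any.
    static WORKER_INDEX: Cell<Option<usize>> = Cell::new(None);
}

/// One value per worker thread; each worker only ever sees its own slot.
struct WorkerLocalSketch<T> {
    slots: Vec<T>,
}

impl<T> WorkerLocalSketch<T> {
    fn new(workers: usize, init: impl FnMut(usize) -> T) -> Self {
        WorkerLocalSketch { slots: (0..workers).map(init).collect() }
    }
}

impl<T> Deref for WorkerLocalSketch<T> {
    type Target = T;

    fn deref(&self) -> &T {
        // Panics when called from a thread that never registered as a worker.
        let index = WORKER_INDEX
            .with(|i| i.get())
            .expect("accessed outside of a worker thread");
        &self.slots[index]
    }
}

/// Each worker adds `index + 1` to its own slot; the slots are summed afterwards.
fn sum_per_worker(workers: usize) -> usize {
    let counters = WorkerLocalSketch::new(workers, |_| AtomicUsize::new(0));
    thread::scope(|s| {
        for index in 0..workers {
            let counters = &counters;
            s.spawn(move || {
                WORKER_INDEX.with(|i| i.set(Some(index)));
                // `Deref` selects this thread's slot; no other thread touches it.
                counters.fetch_add(index + 1, Ordering::Relaxed);
            });
        }
    });
    counters.slots.iter().map(|c| c.load(Ordering::Relaxed)).sum()
}
```

Since every worker writes only to its own slot, the slots never contend with each other, which is what makes this pattern attractive for per-thread arenas.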
## Parallel Iterator ## Parallel Iterator
The parallel iterators provided by the [`rayon`] crate are easy ways The parallel iterators provided by the [`rayon`] crate are easy ways to
to implement parallelism. In the current implementation of the parallel implement parallelism. In the current implementation of the parallel compiler
compiler we use a custom [fork][rustc-rayon] of [`rayon`] to run tasks in parallel. we use a custom [fork][rustc-rayon] of `rayon` to run tasks in parallel.
Some iterator functions are implemented to run loops in parallel Some iterator functions are implemented to run loops in parallel
when `parallel-compiler` is true. when `parallel-compiler` is true.
@ -88,10 +108,9 @@ when `parallel-compiler` is true.
| **ModuleItems::par_impl_items**(&self, f: impl Fn(ImplItemId)) | run `f` on all impl items in the module | rustc_middle::hir | | **ModuleItems::par_impl_items**(&self, f: impl Fn(ImplItemId)) | run `f` on all impl items in the module | rustc_middle::hir |
| **ModuleItems::par_foreign_items**(&self, f: impl Fn(ForeignItemId)) | run `f` on all foreign items in the module | rustc_middle::hir | | **ModuleItems::par_foreign_items**(&self, f: impl Fn(ForeignItemId)) | run `f` on all foreign items in the module | rustc_middle::hir |
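
The shape of such a helper, running the same closure either serially or across threads, can be sketched without `rayon`. The names below (`par_for_each_sketch`, `sum_of_squares`) are hypothetical and `std::thread::scope` is swapped in for the `rayon` fork, so this only illustrates the pattern, not the compiler's actual functions.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical helper: run `f` on every item, spread over up to `threads` threads.
/// With `threads <= 1` it degrades to a plain serial loop.
fn par_for_each_sketch<T: Sync>(items: &[T], threads: usize, f: impl Fn(&T) + Sync) {
    if threads <= 1 || items.len() <= 1 {
        items.iter().for_each(|item| f(item));
        return;
    }
    // Ceiling division; at least 1 because both operands are >= 2 here.
    let chunk = (items.len() + threads - 1) / threads;
    let f = &f;
    thread::scope(|s| {
        for part in items.chunks(chunk) {
            s.spawn(move || part.iter().for_each(|item| f(item)));
        }
    });
}

/// Example caller: the closure must be safe to call from several threads at once.
fn sum_of_squares(values: &[u64], threads: usize) -> u64 {
    let total = AtomicU64::new(0);
    par_for_each_sketch(values, threads, |v| {
        total.fetch_add(v * v, Ordering::Relaxed);
    });
    total.load(Ordering::Relaxed)
}
```

The `Fn + Sync` bound is the important part: a loop body can only be handed to such a helper if it does not mutate shared state without synchronization.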
There are a lot of loops in the compiler which can possibly be There are a lot of loops in the compiler which can possibly be parallelized
parallelized using these functions. As of <!-- date-check--> August using these functions. As of <!-- date-check--> August 2022, scenarios where
2022, scenarios where the parallel iterator function has been used the parallel iterator function has been used are as follows:
are as follows:
| caller | scenario | callee | | caller | scenario | callee |
| ------------------------------------------------------- | ------------------------------------------------------------ | ------------------------ | | ------------------------------------------------------- | ------------------------------------------------------------ | ------------------------ |
@ -113,9 +132,9 @@ There are still many loops that have the potential to use parallel iterators.
## Query System ## Query System
The query model has some properties that make it actually feasible to evaluate The query model has some properties that make it actually feasible to evaluate
multiple queries in parallel without too much of an effort: multiple queries in parallel without too much effort:
- All data a query provider can access is accessed via the query context, so - All data a query provider can access is via the query context, so
the query context can take care of synchronizing access. the query context can take care of synchronizing access.
- Query results are required to be immutable so they can safely be used by - Query results are required to be immutable so they can safely be used by
different threads concurrently. different threads concurrently.
@ -135,31 +154,31 @@ When a query `foo` is evaluated, the cache table for `foo` is locked.
the compiler uses an extra thread *(named deadlock handler)* to detect, remove and the compiler uses an extra thread *(named deadlock handler)* to detect, remove and
report the cycle error. report the cycle error.
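
The properties listed above (access through the query context, immutable results, a locked cache table) can be illustrated with a deliberately simplified model. `QueryCtxt`, `len_of`, and `run_queries_in_parallel` are hypothetical names; the sketch holds a single lock for the whole evaluation and does not model cycle detection or the deadlock handler.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical query context owning the cache table for one query, `len_of`.
struct QueryCtxt {
    cache: Mutex<HashMap<String, Arc<usize>>>,
    provider_calls: AtomicUsize,
}

impl QueryCtxt {
    fn new() -> Self {
        QueryCtxt { cache: Mutex::new(HashMap::new()), provider_calls: AtomicUsize::new(0) }
    }

    /// Returns the cached, immutable result, computing it at most once per key.
    fn len_of(&self, key: &str) -> Arc<usize> {
        // The cache table stays locked while the query is evaluated.
        let mut cache = self.cache.lock().unwrap();
        if let Some(hit) = cache.get(key) {
            return Arc::clone(hit);
        }
        self.provider_calls.fetch_add(1, Ordering::Relaxed);
        let result = Arc::new(key.len()); // the "provider"
        cache.insert(key.to_string(), Arc::clone(&result));
        result
    }
}

/// Every thread asks for every key; returns (sum of all results, provider calls).
fn run_queries_in_parallel(keys: &[&str], threads: usize) -> (usize, usize) {
    let cx = QueryCtxt::new();
    let total = AtomicUsize::new(0);
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for key in keys {
                    total.fetch_add(*cx.len_of(key), Ordering::Relaxed);
                }
            });
        }
    });
    (total.load(Ordering::Relaxed), cx.provider_calls.load(Ordering::Relaxed))
}
```

Because results are shared as immutable `Arc` values, threads can keep using a result after the lock is released, and each key is computed once no matter how many threads request it.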
Parallel query still has a lot of work to do, most of which is related to The parallel query feature still has implementation work to do, most of which is
the previous `Data Structures` and `Parallel Iterators`. See [this tracking issue][tracking]. related to the previous `Data Structures` and `Parallel Iterators`. See [this
open feature tracking issue][tracking].
## Rustdoc ## Rustdoc
As of <!-- date-check--> November 2022, there are still a number of steps As of <!-- date-check--> November 2022, there are still a number of steps to
to complete before rustdoc rendering can be made parallel. More details on complete before `rustdoc` rendering can be made parallel (see an open discussion
this issue can be found [here][parallel-rustdoc]. of [parallel `rustdoc`][parallel-rustdoc]).
## Resources ## Resources
Here are some resources that can be used to learn more (note that some of them Here are some resources that can be used to learn more:
are a bit out of date):
- [This IRLO thread by alexchricton about performance][irlo1]
- [This IRLO thread by Zoxc, one of the pioneers of the effort][irlo0] - [This IRLO thread by Zoxc, one of the pioneers of the effort][irlo0]
- [This list of interior mutability in the compiler by nikomatsakis][imlist] - [This list of interior mutability in the compiler by nikomatsakis][imlist]
- [This IRLO thread by alexcrichton about performance][irlo1]
[`rayon`]: https://crates.io/crates/rayon [`rayon`]: https://crates.io/crates/rayon
[rustc-rayon]: https://github.com/rust-lang/rustc-rayon [Arc]: https://doc.rust-lang.org/std/sync/struct.Arc.html
[irlo0]: https://internals.rust-lang.org/t/parallelizing-rustc-using-rayon/6606
[imlist]: https://github.com/nikomatsakis/rustc-parallelization/blob/master/interior-mutability-list.md [imlist]: https://github.com/nikomatsakis/rustc-parallelization/blob/master/interior-mutability-list.md
[irlo0]: https://internals.rust-lang.org/t/parallelizing-rustc-using-rayon/6606
[irlo1]: https://internals.rust-lang.org/t/help-test-parallel-rustc/11503 [irlo1]: https://internals.rust-lang.org/t/help-test-parallel-rustc/11503
[tracking]: https://github.com/rust-lang/rust/issues/48685
[monomorphization]: backend/monomorph.md [monomorphization]: backend/monomorph.md
[parallel-rustdoc]: https://github.com/rust-lang/rust/issues/82741 [parallel-rustdoc]: https://github.com/rust-lang/rust/issues/82741
[Arc]: https://doc.rust-lang.org/std/sync/struct.Arc.html
[Rc]: https://doc.rust-lang.org/std/rc/struct.Rc.html [Rc]: https://doc.rust-lang.org/std/rc/struct.Rc.html
[rustc-rayon]: https://github.com/rust-lang/rustc-rayon
[tracking]: https://github.com/rust-lang/rust/issues/48685