Update some statements

SparrowLii 2022-08-25 19:47:13 +08:00 committed by Tshepang Mbambo
parent 09dd4d4f49
commit bbdb0ca29c
1 changed file with 27 additions and 33 deletions

@@ -1,14 +1,15 @@
 # Parallel Compilation
 
 As of <!-- date-check --> August 2022, the only stage of the compiler that
-is already parallel is codegen. Some other parts of the nightly compiler
-have parallel implementations, such as query evaluation, type check and
-monomorphization, but there is still a lot of work to be done. The lack of
-parallelism at other stages (for example, macro expansion) also represents
-an opportunity for improving compiler performance.
+is already parallel is codegen. Some parts of the compiler already have
+parallel implementations, such as query evaluation, type check and
+monomorphization, but the general version of the compiler does not include
+these parallelization functions. **To try out the current parallel compiler**,
+one can install rustc from source code with `parallel-compiler = true` in
+the `config.toml`.
 
-**To try out the current parallel compiler**, one can install rustc from
-source code with `parallel-compiler = true` in the `config.toml`.
+The lack of parallelism at other stages (for example, macro expansion) also
+represents an opportunity for improving compiler performance.
 
 These next few sections describe where and how parallelism is currently used,
 and the current status of making parallel compilation the default in `rustc`.
@@ -45,9 +46,15 @@ are implemented diferently depending on whether `parallel-compiler` is true.
 | MappedLockGuard | parking_lot::MappedMutexGuard | std::cell::RefMut |
 | MetadataRef | [`OwningRef<Box<dyn Erased + Send + Sync>, [u8]>`][OwningRef] | [`OwningRef<Box<dyn Erased>, [u8]>`][OwningRef] |
 
-- There are currently a lot of global data structures that need to be made
-thread-safe. A key strategy here has been converting interior-mutable
-data-structures (e.g. `Cell`) into their thread-safe siblings (e.g. `Mutex`).
+- These thread-safe data structures interspersed during compilation can
+cause a lot of lock contention, which actually degrades performance as the
+number of threads increases beyond 4. This inspires us to audit the use
+of these data structures, leading to either refactoring to reduce use of
+shared state, or persistent documentation covering invariants, atomicity,
+and lock orderings.
+
+- On the other hand, we still need to figure out what other invariants
+during compilation might not hold in parallel compilation.
 
 ### WorkLocal
 
@@ -64,10 +71,10 @@ can be accessed directly through `Deref::deref`.
 
 ## Parallel Iterator
 
-The parallel iterators provided by the [`rayon`] crate are efficient
-ways to achieve parallelization. The current nightly rustc uses (a custom
-fork of) [`rayon`] to run tasks in parallel. The custom fork allows the
-execution of DAGs of tasks, not just trees.
+The parallel iterators provided by the [`rayon`] crate are easy ways
+to implement parallelism. In the current implementation of the parallel
+compiler we use a custom fork of [`rayon`] to run tasks in parallel.
+*(more information wanted here)*
 
 Some iterator functions are implemented in the current nightly compiler to
 run loops in parallel when `parallel-compiler` is true.
@@ -124,9 +131,11 @@ When a query `foo` is evaluated, the cache table for `foo` is locked.
 start evaluating.
 - If there *is* another query invocation for the same key in progress, we
 release the lock, and just block the thread until the other invocation has
-computed the result we are waiting for. **Deadlocks are possible**, in which
-case `rustc_query_system::query::job::deadlock()` will be called to detect
-and remove the deadlock and then return cycle error as the query result.
+computed the result we are waiting for. **Cycle error detection** in the parallel
+compiler requires more complex logic than in single-threaded mode. When
+worker threads in parallel queries stop making progress due to interdependence,
+the compiler uses an extra thread *(named deadlock handler)* to detect, remove and
+report the cycle error.
 
 Parallel query still has a lot of work to do, most of which is related to
 the previous `Data Structures` and `Parallel Iterators`. See [this tracking issue][tracking].
@@ -137,22 +146,7 @@ As of <!-- date-check--> May 2022, there are still a number of steps
 to complete before rustdoc rendering can be made parallel. More details on
 this issue can be found [here][parallel-rustdoc].
 
-## Current Status
+## Resources
 
-As of <!-- date-check --> May 2022, work on explicitly parallelizing the
-compiler has stalled. There is a lot of design and correctness work that needs
-to be done.
-
-As of <!-- date-check --> May 2022, much of this effort is on hold due
-to lack of manpower. We have a working prototype with promising performance
-gains in many cases. However, there are two blockers:
-
-- It's not clear what invariants need to be upheld that might not hold in the
-face of concurrency. An auditing effort was underway, but seems to have
-stalled at some point.
-
-- There is a lot of lock contention, which actually degrades performance as the
-number of threads increases beyond 4.
-
 Here are some resources that can be used to learn more (note that some of them
 are a bit out of date):