Adds a dev guide section on Rust Coverage
This commit is contained in:
parent
2adddd99f2
commit
a7c82e3ba9
|
|
@ -147,6 +147,7 @@
|
||||||
- [Backend Agnostic Codegen](./backend/backend-agnostic.md)
|
- [Backend Agnostic Codegen](./backend/backend-agnostic.md)
|
||||||
- [Implicit Caller Location](./backend/implicit-caller-location.md)
|
- [Implicit Caller Location](./backend/implicit-caller-location.md)
|
||||||
- [Profile-guided Optimization](./profile-guided-optimization.md)
|
- [Profile-guided Optimization](./profile-guided-optimization.md)
|
||||||
|
- [LLVM Source-Based Code Coverage](./llvm-coverage-instrumentation.md)
|
||||||
- [Sanitizers Support](./sanitizers.md)
|
- [Sanitizers Support](./sanitizers.md)
|
||||||
- [Debugging Support in the Rust Compiler](./debugging-support-in-rustc.md)
|
- [Debugging Support in the Rust Compiler](./debugging-support-in-rustc.md)
|
||||||
|
|
||||||
|
|
|
||||||
Binary file not shown.
|
After Width: | Height: | Size: 11 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 332 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 313 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 407 KiB |
|
|
@ -0,0 +1,627 @@
|
||||||
|
# LLVM Source-Based Code Coverage
|
||||||
|
|
||||||
|
`rustc` supports detailed source-based code and test coverage analysis
|
||||||
|
with a command line option (`-Z instrument-coverage`) that instruments Rust
|
||||||
|
libraries and binaries with additional instructions and data, at compile time.
|
||||||
|
|
||||||
|
The coverage instrumentation injects calls to the LLVM intrinsic instruction
|
||||||
|
[`llvm.instrprof.increment`][llvm-instrprof-increment] at code branches
|
||||||
|
(based on a MIR-based control flow analysis), and LLVM converts these to
|
||||||
|
instructions that increment static counters, when executed. The LLVM coverage
|
||||||
|
instrumentation also requires a [Coverage Map] that encodes source metadata,
|
||||||
|
mapping counter IDs--directly and indirectly--to the file locations (with
|
||||||
|
start and end line and column).
|
||||||
|
|
||||||
|
Rust libraries, with or without coverage instrumentation, can be linked into
|
||||||
|
instrumented binaries. When the program is executed and cleanly terminates,
|
||||||
|
LLVM libraries write the final counter values to a file (`default.profraw` or
|
||||||
|
a custom file set through environment variable `LLVM_PROFILE_FILE`).
|
||||||
|
|
||||||
|
Developers use existing LLVM coverage analysis tools to decode `.profraw`
|
||||||
|
files, with corresponding Coverage Maps (from matching binaries that produced
|
||||||
|
them), and generate various reports for analysis, for example:
|
||||||
|
|
||||||
|
<img alt="Screenshot of sample `llvm-cov show` result, for function add_quoted_string"
|
||||||
|
src="img/llvm-cov-show-01.png" class="center"/>
|
||||||
|
<br/>
|
||||||
|
|
||||||
|
Detailed instructions and examples are documented in the
|
||||||
|
[Rust Unstable Book (under _source-based-code-coverage_)][unstable-book-sbcc].
|
||||||
|
|
||||||
|
[llvm-instrprof-increment]: https://llvm.org/docs/LangRef.html#llvm-instrprof-increment-intrinsic
|
||||||
|
[Coverage Map]: https://llvm.org/docs/CoverageMappingFormat.html
|
||||||
|
[unstable-book-sbcc]: https://doc.rust-lang.org/nightly/unstable-book/compiler-flags/source-based-code-coverage.html
|
||||||
|
|
||||||
|
### Rust symbol mangling
|
||||||
|
|
||||||
|
`-Z instrument-coverage` automatically enables Rust symbol mangling `v0` (as
|
||||||
|
if the user specified `-Z symbol-mangling-version=v0` option when invoking
|
||||||
|
`rustc`) to ensure consistent and reversible name mangling. This has two
|
||||||
|
important benefits:
|
||||||
|
|
||||||
|
1. LLVM coverage tools can analyze coverage over multiple runs, including some
|
||||||
|
changes to source code; so mangled names must be consistent across compilations.
|
||||||
|
2. LLVM coverage reports can report coverage by function, and even separates
|
||||||
|
out the coverage counts of each unique instantiation of a generic function,
|
||||||
|
if invoked with multiple type substitution variations.
|
||||||
|
|
||||||
|
## Components of LLVM Coverage Instrumentation in `rustc`
|
||||||
|
|
||||||
|
### LLVM Runtime Dependency
|
||||||
|
|
||||||
|
Coverage data is only generated by running the executable Rust program. `rustc`
|
||||||
|
statically links coverage-instrumented binaries with LLVM runtime code
|
||||||
|
([compiler-rt][compiler-rt-profile]) that implements program hooks
|
||||||
|
(such as an `exit` hook) to write the counter values to the `.profraw` file.
|
||||||
|
|
||||||
|
In the `rustc` source tree, `library/profiler_builtins` bundles the LLVM
|
||||||
|
`compiler-rt` code into a Rust library crate. (When building `rustc`, the
|
||||||
|
`profiler_builtins` library is only included when `profiler = true` is set
|
||||||
|
in `rustc`'s `config.toml`.)
|
||||||
|
|
||||||
|
When compiling with `-Z instrument-coverage`,
|
||||||
|
[`CrateLoader::postprocess()`][crate-loader-postprocess] dynamically loads the
|
||||||
|
`profiler_builtins` library by calling `inject_profiler_runtime()`.
|
||||||
|
|
||||||
|
[compiler-rt-profile]: https://github.com/llvm/llvm-project/tree/master/compiler-rt/lib/profile
|
||||||
|
[crate-loader-postprocess]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_metadata/creader/struct.CrateLoader.html#method.postprocess
|
||||||
|
|
||||||
|
### MIR Pass: `InstrumentCoverage`
|
||||||
|
|
||||||
|
Coverage instrumentation is performed on the MIR with a [MIR pass][mir-passes]
|
||||||
|
called [`InstrumentCoverage`][mir-instrument-coverage]. This MIR pass analyzes
|
||||||
|
the control flow graph (CFG)--represented by MIR `BasicBlock`s--to identify
|
||||||
|
code branches, and injects additional [`Coverage`][coverage-statement]
|
||||||
|
statements into the `BasicBlock`s.
|
||||||
|
|
||||||
|
A MIR `Coverage` statement is a virtual instruction that indicates a counter
|
||||||
|
should be incremented when its adjacent statemeents are executed, to count
|
||||||
|
a span of code ([`CodeRegion`][code-region]). It counts the number of times a
|
||||||
|
branch is executed, and also specifies the exact location of that code span in
|
||||||
|
the Rust source code.
|
||||||
|
|
||||||
|
Note that many of these `Coverage` statements will *not* be converted into
|
||||||
|
physical counters (or any other executable instructions) in the final binary.
|
||||||
|
Some of them will be (see `CoverageKind::`[`Counter`][counter-coverage-kind]),
|
||||||
|
but other counters can be computed on the fly, when generating a coverage
|
||||||
|
report, by mapping a `CodeRegion` to a
|
||||||
|
`CoverageKind`::[`Expression`][expression-coverage-kind].
|
||||||
|
|
||||||
|
As an example:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
fn some_func(flag: bool) {
|
||||||
|
// increment Counter(1)
|
||||||
|
...
|
||||||
|
if flag {
|
||||||
|
// increment Counter(2)
|
||||||
|
...
|
||||||
|
} else {
|
||||||
|
// count = Expression(1) = Counter(1) - Counter(2)
|
||||||
|
...
|
||||||
|
}
|
||||||
|
// count = Expression(2) = Counter(1) + Zero
|
||||||
|
// or, alternatively, Expression(2) = Counter(2) + Expression(1)
|
||||||
|
...
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
In this example, four contiguous code regions are counted while only
|
||||||
|
incrementing two counters.
|
||||||
|
|
||||||
|
CFG analysis is used to not only determine *where* the branches are, for
|
||||||
|
conditional expressions like `if`, `else`, `match`, and `loop`, but also to
|
||||||
|
determine where expressions can be used in place of physical counters.
|
||||||
|
|
||||||
|
The advantages of optimizing coverage through expressions are more pronounced
|
||||||
|
with loops. Loops generally include at least one conditional branch that
|
||||||
|
determines when to break out of a loop (a `while` condition, or an `if` or
|
||||||
|
`match` with a `break`). In MIR, this is typically lowered to a `SwitchInt`,
|
||||||
|
with one branch to stay in the loop, and another branch to break out of the
|
||||||
|
loop. The branch that breaks out will almost always execute less often,
|
||||||
|
so `InstrumentCoverage` chooses to add a `Counter` to that branch, and an
|
||||||
|
`Expression(continue) = Counter(loop) - Counter(break)` to the branch that
|
||||||
|
continues.
|
||||||
|
|
||||||
|
The `InstrumentCoverage` MIR pass is documented in
|
||||||
|
[more detail below][instrument-coverage-pass-details].
|
||||||
|
|
||||||
|
[mir-passes]: mir/passes.md
|
||||||
|
[mir-instrument-coverage]: https://github.com/rust-lang/rust/tree/master/compiler/rustc_mir/src/transform/coverage
|
||||||
|
[code-region]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/coverage/struct.CodeRegion.html
|
||||||
|
[counter-coverage-kind]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/coverage/enum.CoverageKind.html#variant.Counter
|
||||||
|
[expression-coverage-kind]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/coverage/enum.CoverageKind.html#variant.Expression
|
||||||
|
[coverage-statement]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.StatementKind.html#variant.Coverage
|
||||||
|
[instrument-coverage-pass-details]: #implementation-details-of-the-instrumentcoverage-mir-pass
|
||||||
|
|
||||||
|
### Counter Injection and Coverage Map Pre-staging
|
||||||
|
|
||||||
|
When the compiler enters [the Codegen phase][backend-lowering-mir], with a
|
||||||
|
coverage-enabled MIR, [`codegen_statement()`][codegen-statement] converts each
|
||||||
|
MIR `Statement` into some backend-specific action or instruction.
|
||||||
|
`codegen_statement()` forwards `Coverage` statements to
|
||||||
|
[`codegen_coverage()`][codegen-coverage]:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub fn codegen_statement(&mut self, mut bx: Bx, statement: &mir::Statement<'tcx>) -> Bx {
|
||||||
|
...
|
||||||
|
match statement.kind {
|
||||||
|
...
|
||||||
|
mir::StatementKind::Coverage(box ref coverage) => {
|
||||||
|
self.codegen_coverage(&mut bx, coverage.clone());
|
||||||
|
bx
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
`codegen_coverage()` handles each `CoverageKind` as follows:
|
||||||
|
|
||||||
|
* For all `CoverageKind`s, Coverage data (counter ID, expression equation
|
||||||
|
and ID, and code regions) are passed to the backend's `Builder`, to
|
||||||
|
populate data structures that will be used to generate the crate's
|
||||||
|
"Coverage Map". (See the [`FunctionCoverage`][function-coverage] `struct`.)
|
||||||
|
* For `CoverageKind::Counter`s, an instruction is injected in the backend
|
||||||
|
IR to increment the physical counter, by calling the `BuilderMethod`
|
||||||
|
[`instrprof_increment()`][instrprof-increment].
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub fn codegen_coverage(&self, bx: &mut Bx, coverage: Coverage) {
|
||||||
|
let Coverage { kind, code_region } = coverage;
|
||||||
|
match kind {
|
||||||
|
CoverageKind::Counter { function_source_hash, id } => {
|
||||||
|
if let Some(code_region) = code_region {
|
||||||
|
bx.add_coverage_counter(self.instance, id, code_region);
|
||||||
|
}
|
||||||
|
...
|
||||||
|
bx.instrprof_increment(fn_name, hash, num_counters, index);
|
||||||
|
}
|
||||||
|
CoverageKind::Expression { id, lhs, op, rhs } => {
|
||||||
|
bx.add_coverage_counter_expression(self.instance, id, lhs, op, rhs, code_region);
|
||||||
|
}
|
||||||
|
CoverageKind::Unreachable => {
|
||||||
|
...
|
||||||
|
```
|
||||||
|
_[code snippet trimmed for brevity]_
|
||||||
|
|
||||||
|
> The function name `instrprof_increment()` is taken from the LLVM intrinsic
|
||||||
|
call of the same name ([`llvm.instrprof.increment`][llvm-instrprof-increment]),
|
||||||
|
and uses the same arguments and types; but note that, up to and through this
|
||||||
|
stage (even though modeled after LLVM's implementation for code coverage
|
||||||
|
instrumentation), the data and instructions are not strictly LLVM-specific.
|
||||||
|
>
|
||||||
|
> But since LLVM is the only Rust-supported backend with the tooling to
|
||||||
|
process this form of coverage instrumentation, the backend for `Coverage`
|
||||||
|
statements is only implemented for LLVM, at this time.
|
||||||
|
|
||||||
|
[backend-lowering-mir]: backend/lowering-mir.md
|
||||||
|
[codegen-statement]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/mir/struct.FunctionCx.html#method.codegen_statement
|
||||||
|
[codegen-coverage]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/mir/struct.FunctionCx.html#method.codegen_coverage
|
||||||
|
[function-coverage]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/coverageinfo/map/struct.FunctionCoverage.html
|
||||||
|
[instrprof-increment]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/trait.BuilderMethods.html#tymethod.instrprof_increment
|
||||||
|
|
||||||
|
### Coverage Map Generation
|
||||||
|
|
||||||
|
With the instructions to increment counters now implemented in LLVM IR,
|
||||||
|
the last remaining step is to inject the LLVM IR variables that hold the
|
||||||
|
static data for the coverage map.
|
||||||
|
|
||||||
|
`rustc_codegen_llvm`'s [`compile_codegen_unit()`][compile-codegen-unit] calls
|
||||||
|
[`coverageinfo_finalze()`][coverageinfo-finalize],
|
||||||
|
which delegates its implementation to the
|
||||||
|
[`rustc_codegen_llvm::coverageinfo::mapgen`][mapgen-finalize] module.
|
||||||
|
|
||||||
|
For each function `Instance` (code-generated from MIR, including multiple
|
||||||
|
instances of the same MIR for generic functions that have different type
|
||||||
|
substitution combinations), `mapgen`'s `finalize()` method queries the
|
||||||
|
`Instance`-associated `FunctionCoverage` for its `Counter`s, `Expression`s,
|
||||||
|
and `CodeRegion`s; and calls LLVM codegen APIs to generate
|
||||||
|
properly-configured variables in LLVM IR, according to very specific
|
||||||
|
details of the [_LLVM Coverage Mapping Format_][coverage-mapping-format]
|
||||||
|
(Version 4).[^llvm-and-covmap-versions]
|
||||||
|
|
||||||
|
[^llvm-and-covmap-versions]: The Rust compiler currently supports
|
||||||
|
_LLVM Coverage Mapping Format_ Version 4 (the most up-to-date version
|
||||||
|
of the format, at the time of this writing) for improved compatibility
|
||||||
|
with other LLVM-based compilers (like _Clang_), and to take advantage
|
||||||
|
of some format optimizations. Version 4 was introduced in _LLVM 11_,
|
||||||
|
which is currently the default LLVM version for Rust. Note that the
|
||||||
|
Rust compiler optionally supports some earlier LLVM versions,
|
||||||
|
prior to _LLVM 11_. If `rustc` is configured to use an incompatible
|
||||||
|
version of LLVM, compiling with `-Z instrument-coverage` will
|
||||||
|
generate an error message.
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub fn finalize<'ll, 'tcx>(cx: &CodegenCx<'ll, 'tcx>) {
|
||||||
|
let mut function_coverage_map = match cx.coverage_context() {
|
||||||
|
Some(ctx) => ctx.take_function_coverage_map(),
|
||||||
|
None => return,
|
||||||
|
};
|
||||||
|
...
|
||||||
|
add_unreachable_coverage(tcx, &mut function_coverage_map);
|
||||||
|
|
||||||
|
let mut mapgen = CoverageMapGenerator::new();
|
||||||
|
|
||||||
|
for (instance, function_coverage) in function_coverage_map {
|
||||||
|
...
|
||||||
|
let coverage_mapping_buffer = llvm::build_byte_buffer(|coverage_mapping_buffer| {
|
||||||
|
mapgen.write_coverage_mapping(expressions, counter_regions, coverage_mapping_buffer);
|
||||||
|
});
|
||||||
|
```
|
||||||
|
_[code snippet trimmed for brevity]_
|
||||||
|
|
||||||
|
One noteable step, performed by `mapgen::finalize()` before processing the
|
||||||
|
`Instance`s and their `FunctionCoverage`s, is the call to
|
||||||
|
[`add_unreachable_functions()`][add-unreachable-coverage].
|
||||||
|
|
||||||
|
When finalizing the coverage map, `FunctionCoverage` only has the `CodeRegion`s and counters for
|
||||||
|
the functions that went through codegen; such as public functions and "used" functions
|
||||||
|
(functions referenced by other "used" or public items). Any other functions (considered unused
|
||||||
|
or "Unreachable") were still parsed and processed through the MIR stage.
|
||||||
|
|
||||||
|
The set of unreachable functions is computed via the set difference of all MIR
|
||||||
|
`DefId`s (`tcx` query `mir_keys`) minus the codegenned `DefId`s
|
||||||
|
(`tcx` query `collect_and_partition_mono_items`). `add_unreachable_functions()`
|
||||||
|
computes the set of unreachable functions, queries the `tcx` for the
|
||||||
|
previously-computed `CodeRegions`, for each unreachabe MIR, and adds those code
|
||||||
|
regions to one of the non-generic codegenned functions (non-generic avoids
|
||||||
|
potentially injecting the unreachable coverage multiple times for multiple
|
||||||
|
instantiations).
|
||||||
|
|
||||||
|
[compile-codegen-unit]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/base/fn.compile_codegen_unit.html
|
||||||
|
[coverageinfo-finalize]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/context/struct.CodegenCx.html#method.coverageinfo_finalize
|
||||||
|
[mapgen-finalize]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/coverageinfo/mapgen/fn.finalize.html
|
||||||
|
[coverage-mapping-format]: https://llvm.org/docs/CoverageMappingFormat.html
|
||||||
|
[add-unreachable-coverage]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/coverageinfo/mapgen/fn.add_unreachable_coverage.html
|
||||||
|
|
||||||
|
## Testing LLVM Coverage
|
||||||
|
|
||||||
|
Coverage instrumentation in the MIR is validated by a `mir-opt` test:
|
||||||
|
[`instrument-coverage`][mir-opt-test].
|
||||||
|
|
||||||
|
More complete testing of end-to-end coverage instrumentation and reports are done
|
||||||
|
in the `run-make-fulldeps` tests, with sample Rust programs (to be instrumented)
|
||||||
|
in the [`coverage`][coverage-test-samples] directory, and the actual tests and expected
|
||||||
|
results in [`coverage-reports`].
|
||||||
|
|
||||||
|
In addition to testing the final result, two intermediate results are also validated
|
||||||
|
to catch potential regression errors early: Minimum `CoverageSpan`s computed during
|
||||||
|
the `InstrumentCoverage` MIR pass are saved in `mir_dump` [Spanview][spanview-debugging]
|
||||||
|
files and compared to expected results in [`coverage-spanview`].
|
||||||
|
|
||||||
|
Finally, the [`coverage-llvmir`] test compares compiles a simple Rust program with
|
||||||
|
`-Z instrument-coverage` and compares the compiled program's LLVM IR to expected
|
||||||
|
LLVM IR instructions and structured data for a coverage-enabled program, including
|
||||||
|
various checks for Coverage Map-related metadata and the LLVM intrinsic calls to
|
||||||
|
increment the runtime counters.
|
||||||
|
|
||||||
|
Expected results for both the `mir-opt` tests and the `coverage*` tests under
|
||||||
|
`run-make-fulldeps` can be refreshed by running:
|
||||||
|
|
||||||
|
```shell
|
||||||
|
$ ./x.py test src/test/<test-type> --blessed
|
||||||
|
```
|
||||||
|
|
||||||
|
[mir-opt-test]: https://github.com/rust-lang/rust/blob/master/src/test/mir-opt/instrument_coverage.rs
|
||||||
|
[coverage-test-samples]: https://github.com/rust-lang/rust/tree/master/src/test/run-make-fulldeps/coverage
|
||||||
|
[`coverage-reports`]: https://github.com/rust-lang/rust/tree/master/src/test/run-make-fulldeps/coverage-reports
|
||||||
|
[`coverage-spanview`]: https://github.com/rust-lang/rust/tree/master/src/test/run-make-fulldeps/coverage-spanview
|
||||||
|
[spanview-debugging]: compiler-debugging.md#viewing-spanview-output
|
||||||
|
[`coverage-llvmir`]: https://github.com/rust-lang/rust/tree/master/src/test/run-make-fulldeps/coverage-llvmir
|
||||||
|
|
||||||
|
## Implementation Details of the `InstrumentCoverage` MIR Pass
|
||||||
|
|
||||||
|
The bulk of the implementation of the `InstrumentCoverage` MIR pass is performed
|
||||||
|
by the [`Instrumentor`][instrumentor]. For each MIR (each non-const, non-inlined
|
||||||
|
function, generic, or closure), the `Instrumentor`'s constructor prepares a
|
||||||
|
[`CoverageGraph`][coverage-graph] and then executes
|
||||||
|
[`inject_counters()`][inject-counters].
|
||||||
|
|
||||||
|
```rust
|
||||||
|
Instrumentor::new(&self.name(), tcx, mir_body).inject_counters();
|
||||||
|
```
|
||||||
|
|
||||||
|
The `CoverageGraph` is a coverage-specific simplification of the MIR control
|
||||||
|
flow graph (CFG). Its nodes are [`BasicCoverageBlock`s][bcb], which
|
||||||
|
encompass one or more sequentially-executed MIR `BasicBlock`s
|
||||||
|
(with no internal branching), plus a `CoverageKind` counter (to
|
||||||
|
be added, via coverage analysis), and an optional set of additional counters
|
||||||
|
to count incoming edges (if there are more than one).
|
||||||
|
|
||||||
|
The `Instrumentor`'s `inject_counters()` uses the `CoverageGraph` to
|
||||||
|
compute the best places to inject coverage counters, as MIR `Statement`s,
|
||||||
|
with the following steps:
|
||||||
|
|
||||||
|
1. Depending on the debugging configurations in `rustc`'s, `config.toml`,
|
||||||
|
and `rustc` command line flags, various debugging features may be enabled
|
||||||
|
to enhance `debug!()` messages in logs, and to generate various "dump" files,
|
||||||
|
to help developers understand the MIR transformation process for coverage.
|
||||||
|
Most of the debugging features are implemented in the [`debug`][debug]
|
||||||
|
sub-module.
|
||||||
|
2. [`generate_coverage_spans()`][generate-coverage-spans] computes the minimum set of distinct,
|
||||||
|
non-branching code regions, from the MIR. These `CoverageSpan`s
|
||||||
|
represent a span of code that must be counted.
|
||||||
|
3. [`make_bcb_counters()`][make-bcb-counters] generates `CoverageKind::Counter`s and
|
||||||
|
`CoverageKind::Expression`s for each `CoverageSpan`, plus additional
|
||||||
|
`intermediate_expressions`[^intermediate-expressions], not associated with any `CodeRegion`, but
|
||||||
|
are required to compute a final `Expression` value for a `CodeRegion`.
|
||||||
|
4. Inject the new counters into the MIR, as new `StatementKind::Coverage`
|
||||||
|
statements. This is done by three distinct functions:
|
||||||
|
- `inject_coverage_span_counters()`
|
||||||
|
- `inject_indirect_counters()`
|
||||||
|
- `inject_intermediate_expression()`, called for each intermediate expression
|
||||||
|
returned from `make_bcb_counters()`
|
||||||
|
|
||||||
|
[^intermediate-expressions]: Intermediate expressions are sometimes required
|
||||||
|
because `Expression`s are limited to binary additions or subtractions. For
|
||||||
|
example, `A + (B - C)` might represent an `Expression` count computed from three
|
||||||
|
other counters, `A`, `B`, and `C`, but computing that value requires an
|
||||||
|
intermediate expression for `B - C`.
|
||||||
|
|
||||||
|
[instrumentor]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/struct.Instrumentor.html
|
||||||
|
[coverage-graph]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/graph/struct.CoverageGraph.html
|
||||||
|
[inject-counters]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/struct.Instrumentor.html#method.inject_counters
|
||||||
|
[bcb]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/graph/struct.BasicCoverageBlock.html
|
||||||
|
[debug]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/debug
|
||||||
|
[generate-coverage-spans]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/spans/struct.CoverageSpans.html#method.generate_coverage_spans
|
||||||
|
[make-bcb-counters]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/counters/struct.BcbCounters.html#method.make_bcb_counters
|
||||||
|
|
||||||
|
### The `CoverageGraph`
|
||||||
|
|
||||||
|
The [`CoverageGraph`][coverage-graph] is derived from the MIR (`mir::Body`).
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let basic_coverage_blocks = CoverageGraph::from_mir(mir_body);
|
||||||
|
```
|
||||||
|
|
||||||
|
Like `mir::Body`, the `CoverageGraph` is also a
|
||||||
|
[`DirectedGraph`][directed-graph]. Both graphs represent the function's
|
||||||
|
fundamental control flow, with many of the same
|
||||||
|
[`graph trait`][graph-traits]s, supporting `start_node()`, `num_nodes()`,
|
||||||
|
`successors()`, `predecessors()`, and `is_dominated_by()`.
|
||||||
|
|
||||||
|
For anyone that knows how to work with the [MIR, as a CFG][mir-dev-guide], the
|
||||||
|
`CoverageGraph` will be familiar, and can be used in much the same way.
|
||||||
|
The nodes of the `CoverageGraph` are `BasicCoverageBlock`s (BCBs), which
|
||||||
|
index into an `IndexVec` of `BasicCoverageBlockData`. This is analagous
|
||||||
|
to the MIR CFG of `BasicBlock`s that index `BasicBlockData`.
|
||||||
|
|
||||||
|
Each `BasicCoverageBlockData` captures one or more MIR `BasicBlock`s,
|
||||||
|
exclusively, and represents the maximal-length sequence of `BasicBlocks`
|
||||||
|
without conditional branches.
|
||||||
|
|
||||||
|
[`compute_basic_coverage_blocks()`][compute-basic-coverage-blocks] builds the
|
||||||
|
`CoverageGraph` as a coverage-specific simplification of the MIR CFG. In
|
||||||
|
contrast with the [`SimplifyCfg`][simplify-cfg] MIR pass, this step does
|
||||||
|
not alter the MIR itself, because the `CoverageGraph` aggressively simplifies
|
||||||
|
the CFG, and ignores nodes that are not relevant to coverage. For example:
|
||||||
|
|
||||||
|
* The BCB CFG ignores (excludes) branches considered not relevant
|
||||||
|
to the current coverage solution. It excludes unwind-related code[^78544]
|
||||||
|
that is injected by the Rust compiler but has no physical source
|
||||||
|
code to count, which allows a `Call`-terminated BasicBlock
|
||||||
|
to be merged with its successor, within a single BCB.
|
||||||
|
* A `Goto`-terminated `BasicBlock` can be merged with its successor
|
||||||
|
***as long as*** it has the only incoming edge to the successor `BasicBlock`.
|
||||||
|
* Some BasicBlock terminators support Rust-specific concerns--like borrow-checking--that are
|
||||||
|
not relevant to coverage analysis. `FalseUnwind`, for example, can be treated the same as
|
||||||
|
a `Goto` (potentially merged with its successor into the same BCB).
|
||||||
|
|
||||||
|
[^78544]: (Note, however, that Issue [#78544][rust-lang/rust#78544] considers
|
||||||
|
providing future support for coverage of programs that intentionally
|
||||||
|
`panic`, as an option, with some non-trivial cost.)
|
||||||
|
|
||||||
|
The BCB CFG is critical to simplifying the coverage analysis by ensuring graph path-based
|
||||||
|
queries (`is_dominated_by()`, `predecessors`, `successors`, etc.) have branch (control flow)
|
||||||
|
significance.
|
||||||
|
|
||||||
|
To visualize the `CoverageGraph`, you can generate a _graphviz_ `*.dot`
|
||||||
|
file with the following `rustc` flags:[^graphviz-dark-mode]
|
||||||
|
|
||||||
|
[^graphviz-dark-mode]: This image also applies `-Z graphviz-dark-mode`, to
|
||||||
|
produce a Graphviz document with "dark mode" styling. If you use a dark mode or
|
||||||
|
theme in your development environment, you will probably want to use this
|
||||||
|
option so you can review the graphviz output without straining your vision.
|
||||||
|
|
||||||
|
```shell
|
||||||
|
$ rustc -Z instrument-coverage -Z dump-mir=InstrumentCoverage \
|
||||||
|
-Z dump-mir-graphviz some_rust_source.rs
|
||||||
|
```
|
||||||
|
|
||||||
|
The `-Z dump-mir` flag requests [MIR debugging
|
||||||
|
output][mir-debugging] (generating `*.mir` files, by default).
|
||||||
|
`-Z dump-mir-graphviz` additionally generates `*.dot` files.
|
||||||
|
`-Z dump-mir=InstrumentCoverage` restricts these files to the
|
||||||
|
`InstrumentCoverage` pass. All files are written to the `./mir_dump/`
|
||||||
|
directory, by default.
|
||||||
|
|
||||||
|
Files with names ending in `.-------.InstrumentCoverage.0.dot` contain the
|
||||||
|
_graphviz_ representations of a `CoverageGraph` (one for each MIR, that is,
|
||||||
|
for each function and closure):
|
||||||
|
|
||||||
|
<img alt="cropped image of a sample CoverageGraph in graphviz format"
|
||||||
|
src="img/coverage-graphviz-01.png" style="border: 1px solid gray" class="center"/>
|
||||||
|
<br/>
|
||||||
|
|
||||||
|
This image shows each `BasicCoverageBlock` as a rectangular _node_, with
|
||||||
|
directional edges (the arrows) leading from each node to its `successors()`.
|
||||||
|
The nodes contain information in sections:
|
||||||
|
|
||||||
|
1. The gray header has a label showing the BCB ID (or _index_ for looking up
|
||||||
|
its `BasicCoverageBlockData`).
|
||||||
|
2. The first content section shows the assigned `Counter` or `Expression` for
|
||||||
|
each contiguous section of code. (There may be more than one `Expression`
|
||||||
|
incremented by the same `Counter` for discontigous sections of code representing
|
||||||
|
the same sequential actions.) Note the code is represented by the line and
|
||||||
|
column ranges (for example: `52:28-52:33`, representing the original source
|
||||||
|
line 52, for columns 28-33). These are followed by the MIR `Statement` or
|
||||||
|
`Terminator` represented by that source range. (How these coverage regions
|
||||||
|
are determined is discussed in the following section.)
|
||||||
|
3. The final section(s) show the MIR `BasicBlock`s (by ID/index and its
|
||||||
|
`TerminatorKind`) contained in this BCB. The last BCB is separated out because
|
||||||
|
its `successors()` determine the edges leading out of the BCB, and into
|
||||||
|
the `leading_bb()` (first `BasicBlock`) of each successor BCB.
|
||||||
|
|
||||||
|
Note, to find the `BasicCoverageBlock` from a final BCB `Terminator`'s
|
||||||
|
successor `BasicBlock`, there is an index and helper
|
||||||
|
function--[`bcb_from_bb()`][bcb-from-bb]--to look up a `BasicCoverageBlock` from _any_
|
||||||
|
contained `BasicBlock`.
|
||||||
|
|
||||||
|
[directed-graph]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_data_structures/graph/trait.DirectedGraph.html
|
||||||
|
[graph-traits]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_data_structures/graph/index.html#traits
|
||||||
|
[mir-dev-guide]: mir/index.md
|
||||||
|
[compute-basic-coverage-blocks]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/graph/struct.CoverageGraph.html#method.compute_basic_coverage_blocks
|
||||||
|
[simplify-cfg]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/simplify/struct.SimplifyCfg.html
|
||||||
|
[rust-lang/rust#78544]: https://github.com/rust-lang/rust/issues/78544
|
||||||
|
[mir-debugging]: mir/debugging.md
|
||||||
|
[bcb-from-bb]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/graph/struct.CoverageGraph.html#method.bcb_from_bb
|
||||||
|
|
||||||
|
### `CoverageSpans`
|
||||||
|
|
||||||
|
The `struct` [`CoverageSpans`][coverage-spans] builds and refines a final set of
|
||||||
|
[`CoverageSpan`][coverage-span]s, each representing the largest contiguous `Span`
|
||||||
|
of source within a single BCB. By definition--since each `Span` falls within a
|
||||||
|
BCB, the `Span` is also non-branching; so if any code in that `Span` has executed,
|
||||||
|
all code in the `Span` will have executed, the same number of times.
|
||||||
|
|
||||||
|
[`CoverageSpans::generate_coverage_spans()`][generate-coverage-spans] constructs
|
||||||
|
an initial set of `CoverageSpan`s from the `Span`s associated with each MIR
|
||||||
|
`Statement` and `Terminator`.
|
||||||
|
|
||||||
|
The final stage of `generate_coverage_spans()` is handled by
|
||||||
|
[`to_refined_spans()`][to-refined-spans], which iterates through the `CoverageSpan`s,
|
||||||
|
merges and de-duplicates them, and returns an optimial, minimal set of `CoverageSpan`s
|
||||||
|
that can be used to assign coverage `Counter`s or `Expression`s, one-for-one.
|
||||||
|
|
||||||
|
An visual, interactive representation of the final `CoverageSpan`s can be
|
||||||
|
generated with the following `rustc` flags:
|
||||||
|
|
||||||
|
```shell
|
||||||
|
$ rustc -Zinstrument-coverage -Zdump-mir=InstrumentCoverage \
|
||||||
|
-Zdump-mir-spanview some_rust_source.rs
|
||||||
|
```
|
||||||
|
|
||||||
|
These flags request Spanview output for the `InstrumentCoverage` pass, and the
|
||||||
|
resulting files (one for each MIR, that is, for each function or closure) can be
|
||||||
|
found in the `mir_dump` directory (by default), with the extension:
|
||||||
|
`.-------.InstrumentCoverage.0.html`.
|
||||||
|
|
||||||
|
<img alt="cropped image of a sample Spanview in a browser"
|
||||||
|
src="img/coverage-spanview-01.png" style="border: 1px solid gray" class="center"/>
|
||||||
|
<br/>
|
||||||
|
|
||||||
|
The image above shows one such example. The orange and blue backgrounds
|
||||||
|
highlight alternating `CoverageSpan`s from the refined set. Hovering over a
|
||||||
|
line expands the output on that line to show the MIR `BasicBlock` IDs covered
|
||||||
|
by each `CoverageSpan`. While hovering, the `CoverageSpan` under the pointer
|
||||||
|
also has a _tooltip_ block of text, showing even more detail, including the
|
||||||
|
MIR `Statement`s and `Terminator`s contributing to the `CoverageSpan`, and
|
||||||
|
their individual `Span`s (which should be encapsulated within the code region
|
||||||
|
of the refined `CoverageSpan`)
|
||||||
|
|
||||||
|
[coverage-spans]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/spans/struct.CoverageSpans.html
|
||||||
|
[coverage-span]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/spans/struct.CoverageSpan.html
|
||||||
|
[to-refined-spans]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/spans/struct.CoverageSpans.html#method.to_refined_spans
|
||||||
|
|
||||||
|
### `make_bcb_counters()`
|
||||||
|
|
||||||
|
[`make_bcb_counters()`][make-bcb-counters] traverses the `CoverageGraph` and adds a
|
||||||
|
`Counter` or `Expression` to every BCB. It uses _Control Flow Analysis_
|
||||||
|
to determine where an `Expression` can be used in place of a `Counter`.
|
||||||
|
`Expressions` have no runtime overhead, so if a viable expression (adding or
|
||||||
|
subtracting two other counters or expressions) can compute the same result as
|
||||||
|
an embedded counter, an `Expression` is preferred.
|
||||||
|
|
||||||
|
[`TraverseCoverageGraphWithLoops`][traverse-coverage-graph-with-loops]
|
||||||
|
provides a traversal order that ensures all `BasicCoverageBlock` nodes in a
|
||||||
|
loop are visited before visiting any node outside that loop. The traversal
|
||||||
|
state includes a `context_stack`, with the current loop's context information
|
||||||
|
(if in a loop), as well as context for nested loops.
|
||||||
|
|
||||||
|
Within loops, nodes with multiple outgoing edges (generally speaking, these
|
||||||
|
are BCBs terminated in a `SwitchInt`) can be optimized when at least one
|
||||||
|
branch exits the loop and at least one branch stays within the loop. (For an
|
||||||
|
`if` or `while`, there are only two branches, but a `match` may have more.)
|
||||||
|
|
||||||
|
A branch that does not exit the loop should be counted by `Expression`, if
|
||||||
|
possible. Note that some situations require assigning counters to BCBs before
|
||||||
|
they are visited by traversal, so the `counter_kind` (`CoverageKind` for
|
||||||
|
a `Counter` or `Expression`) may have already been assigned, in which case
|
||||||
|
one of the other branches should get the `Expression`.
|
||||||
|
|
||||||
|
For a node with more than two branches (such as for more than two
|
||||||
|
`match` patterns), only one branch can be optimized by `Expression`. All
|
||||||
|
others require a `Counter` (unless its BCB `counter_kind` was previously
|
||||||
|
assigned).
|
||||||
|
|
||||||
|
A branch expression is derived from the equation:
|
||||||
|
|
||||||
|
```text
|
||||||
|
Counter(branching_node) = SUM(Counter(branches))
|
||||||
|
```
|
||||||
|
|
||||||
|
It's important to
|
||||||
|
be aware that the `branches` in this equation are the outgoing _edges_
|
||||||
|
from the `branching_node`, but a `branch`'s target node may have other
|
||||||
|
incoming edges. Given the following graph, for example, the count for
|
||||||
|
`B` is the sum of its two incoming edges:
|
||||||
|
|
||||||
|
<img alt="Example graph with multiple incoming edges to a branch node"
|
||||||
|
src="img/coverage-branch-counting-01.png" class="center" style="width: 25%">
|
||||||
|
<br/>
|
||||||
|
|
||||||
|
In this situation, BCB node `B` may require an edge counter for its
|
||||||
|
"edge from A", and that edge might be computed from an `Expression`,
|
||||||
|
`Counter(A) - Counter(C)`. But an expression for the BCB _node_ `B`
|
||||||
|
would be the sum of all incoming edges:
|
||||||
|
|
||||||
|
```text
|
||||||
|
Expression((Counter(A) - Counter(C)) + SUM(Counter(remaining_edges)))
|
||||||
|
```
|
||||||
|
|
||||||
|
Note that this is only one possible configuration. The actual choice
|
||||||
|
of `Counter` vs. `Expression` also depends on the order of counter
|
||||||
|
assignments, and whether a BCB or incoming edge counter already has
|
||||||
|
its `Counter` or `Expression`.
|
||||||
|
|
||||||
|
[bcb-counters]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/counters/struct.BcbCounters.html
|
||||||
|
[traverse-coverage-graph-with-loops]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/graph/struct.TraverseCoverageGraphWithLoops.html
|
||||||
|
|
||||||
|
### Injecting counters into a MIR `BasicBlock`
|
||||||
|
|
||||||
|
With the refined `CoverageSpan`s, and after all `Counter`s and `Expression`s are
|
||||||
|
created, the final step is to inject the `StatementKind::Coverage` statements
|
||||||
|
into the MIR. There are three distinct sources, handled by the following
|
||||||
|
functions:
|
||||||
|
|
||||||
|
- [`inject_coverage_span_counters()`][inject-coverage-span-counters] injects the
|
||||||
|
counter from each `CoverageSpan`'s BCB.
|
||||||
|
- [`inject_indirect_counters()`][inject-indirect-counters] injects counters
|
||||||
|
for any BCB not assigned to a `CoverageSpan`, and for all edge counters.
|
||||||
|
These counters don't have `CoverageSpan`s.
|
||||||
|
- [`inject_intermediate_expression()`][inject-intermediate-expression] injects
|
||||||
|
the intermediate expressions returned from `make_bcb_counters()`. These
|
||||||
|
counters aren't associated with any BCB, edge, or `CoverageSpan`.
|
||||||
|
|
||||||
|
These three functions inject the `Coverage` statements into the MIR.
|
||||||
|
`Counter`s and `Expression`s with `CoverageSpan`s add `Coverage` statements
|
||||||
|
to a corresponding `BasicBlock`, with a `CodeRegion` computed from the
|
||||||
|
refined `Span` and current `SourceMap`.
|
||||||
|
|
||||||
|
All other `Coverage` statements have a `CodeRegion` of `None`, but they
|
||||||
|
still must be injected because they contribute to other `Expression`s.
|
||||||
|
|
||||||
|
Finally, edge's with a `CoverageKind::Counter` require a new `BasicBlock`,
|
||||||
|
so the counter is only incremented when traversing the branch edge.
|
||||||
|
|
||||||
|
[inject-coverage-span-counters]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/struct.Instrumentor.html#method.inject_coverage_span_counters
|
||||||
|
[inject-indirect-counters]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/struct.Instrumentor.html#method.inject_indirect_counters
|
||||||
|
[inject-intermediate-expression]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/fn.inject_intermediate_expression.html
|
||||||
|
|
||||||
|
### Additional Debugging Support
|
||||||
|
|
||||||
|
See the
|
||||||
|
[crate documentation for `rustc_mir::transform::coverage::debug`][coverage-debugging]
|
||||||
|
for a detailed description of the debug output, logging, and configuration options
|
||||||
|
available to developers working on the `InstrumentCoverage` pass.
|
||||||
|
|
||||||
|
[coverage-debugging]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/coverage/debug/index.html
|
||||||
|
|
@ -29,7 +29,8 @@ LLVM actually [supports multiple forms][clang-pgo] of PGO:
|
||||||
- GCOV-based profiling, where code coverage infrastructure is used to collect
|
- GCOV-based profiling, where code coverage infrastructure is used to collect
|
||||||
profiling information.
|
profiling information.
|
||||||
- Front-end based instrumentation, where the compiler front-end (e.g. Clang)
|
- Front-end based instrumentation, where the compiler front-end (e.g. Clang)
|
||||||
inserts instrumentation intrinsics into the LLVM IR it generates.
|
inserts instrumentation intrinsics into the LLVM IR it generates (but see the
|
||||||
|
[^note-instrument-coverage]"Note").
|
||||||
- IR-level instrumentation, where LLVM inserts the instrumentation intrinsics
|
- IR-level instrumentation, where LLVM inserts the instrumentation intrinsics
|
||||||
itself during optimization passes.
|
itself during optimization passes.
|
||||||
|
|
||||||
|
|
@ -44,6 +45,11 @@ optimized. Instrumentation-based PGO has two components: a compile-time
|
||||||
component and run-time component, and one needs to understand the overall
|
component and run-time component, and one needs to understand the overall
|
||||||
workflow to see how they interact.
|
workflow to see how they interact.
|
||||||
|
|
||||||
|
[^note-instrument-coverage]: Note: `rustc` now supports front-end-based coverage
|
||||||
|
instrumentation, via the experimental option
|
||||||
|
[`-Z instrument-coverage`](../llvm-coverage-instrumentation), but using these
|
||||||
|
coverage results for PGO has not been attempted at this time.
|
||||||
|
|
||||||
### Overall Workflow
|
### Overall Workflow
|
||||||
|
|
||||||
Generating a PGO-optimized program involves the following four steps:
|
Generating a PGO-optimized program involves the following four steps:
|
||||||
|
|
|
||||||
Loading…
Reference in New Issue