add notes about generating llvm ir
parent 894893860d
commit e0d07aad5f
@ -261,29 +261,65 @@ $ firefox maybe_init_suffix.pdf # Or your favorite pdf viewer
|
||||||
## Debugging LLVM
|
## Debugging LLVM
|
||||||
[debugging-llvm]: #debugging-llvm
|
[debugging-llvm]: #debugging-llvm
|
||||||
|
|
||||||
LLVM is a big project on its own that probably needs to have its own debugging
|
> NOTE: If you are looking for info about code generation, please see [this
|
||||||
document (not that I could find one). But here are some tips that are important
|
> chapter][codegen] instead.
|
||||||
in a rustc context:
|
|
||||||
|
[codegen]: codegen.html
|
||||||
|
|
||||||
|
This section is about debugging compiler bugs in code generation (e.g. why the
|
||||||
|
compiler generated some piece of code or crashed in LLVM). LLVM is a big
|
||||||
|
project on its own that probably needs to have its own debugging document (not
|
||||||
|
that I could find one). But here are some tips that are important in a rustc
|
||||||
|
context:
|
||||||
|
|
||||||
|
As a general rule, compilers generate lots of information from analyzing code.
|
||||||
|
Thus, a useful first step is usually to find a minimal example. One way to do
|
||||||
|
this is to
|
||||||
|
|
||||||
|
1. create a new crate that reproduces the issue (e.g. adding whatever crate is
|
||||||
|
at fault as a dependency, and using it from there)
|
||||||
|
|
||||||
|
2. minimize the crate by removing external dependencies; that is, moving
|
||||||
|
everything relevant to the new crate
|
||||||
|
|
||||||
|
3. further minimize the issue by making the code shorter (there are tools that
|
||||||
|
help with this like `creduce`)
|
||||||
|
|
||||||
The official compilers (including nightlies) have LLVM assertions disabled,
|
The official compilers (including nightlies) have LLVM assertions disabled,
|
||||||
which means that LLVM assertion failures can show up as compiler crashes (not
|
which means that LLVM assertion failures can show up as compiler crashes (not
|
||||||
ICEs but "real" crashes) and other sorts of weird behavior. If you are
|
ICEs but "real" crashes) and other sorts of weird behavior. If you are
|
||||||
encountering these, it is a good idea to try using a compiler with LLVM
|
encountering these, it is a good idea to try using a compiler with LLVM
|
||||||
assertions enabled - either an "alt" nightly or a compiler you build yourself
|
assertions enabled - either an "alt" nightly or a compiler you build yourself
|
||||||
by setting `[llvm] assertions=true` in your config.toml - and
|
by setting `[llvm] assertions=true` in your config.toml - and see whether
|
||||||
see whether anything turns up.
|
anything turns up.
|
||||||
|
|
||||||
The rustc build process builds the LLVM tools into
|
The rustc build process builds the LLVM tools into
|
||||||
`./build/<host-triple>/llvm/bin`. They can be called directly.
|
`./build/<host-triple>/llvm/bin`. They can be called directly.
|
||||||
|
|
||||||
The default rustc compilation pipeline has multiple codegen units, which is hard
|
The default rustc compilation pipeline has multiple codegen units, which is
|
||||||
to replicate manually and means that LLVM is called multiple times in parallel.
|
hard to replicate manually and means that LLVM is called multiple times in
|
||||||
If you can get away with it (i.e. if it doesn't make your bug disappear),
|
parallel. If you can get away with it (i.e. if it doesn't make your bug
|
||||||
passing `-C codegen-units=1` to rustc will make debugging easier.
|
disappear), passing `-C codegen-units=1` to rustc will make debugging easier.
|
||||||
|
|
||||||
If you want to play with the optimization pipeline, you can use the opt tool
|
To get rustc to generate LLVM IR, you need to pass the `--emit=llvm-ir` flag. If
|
||||||
from `./build/<host-triple>/llvm/bin/` with the the LLVM IR emitted by rustc.
|
you are building via cargo, use the `RUSTFLAGS` environment variable (e.g.
|
||||||
Note that rustc emits different IR depending on whether `-O` is enabled, even
|
`RUSTFLAGS='--emit=llvm-ir'`). This causes rustc to spit out LLVM IR into the
|
||||||
|
target directory.
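
A minimal sketch of the direct `rustc` invocation, assuming `rustc` is on your `PATH`. The scratch directory, the file `demo.rs` and the function `add_one` are hypothetical names used only for illustration:

```bash
# Write a tiny crate root into a scratch directory (hypothetical example code).
workdir=$(mktemp -d)
cat > "$workdir/demo.rs" <<'EOF'
#[no_mangle]
pub fn add_one(x: u32) -> u32 {
    x.wrapping_add(1)
}
EOF

if command -v rustc >/dev/null 2>&1; then
    # --emit=llvm-ir writes demo.ll into --out-dir.
    # -C debuginfo=0 and -C codegen-units=1 keep the IR small and in one module.
    rustc --crate-type=lib --emit=llvm-ir -C debuginfo=0 -C codegen-units=1 \
        --out-dir "$workdir" "$workdir/demo.rs"
    # Show where the function appears in the emitted IR.
    grep -n 'add_one' "$workdir/demo.ll"
else
    echo "rustc not found; skipping" >&2
fi
```

`#[no_mangle]` is used here only so that the symbol is easy to find in the `.ll` file.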
|
||||||
|
|
||||||
|
`cargo llvm-ir [options] path` spits out the LLVM IR for a particular function
|
||||||
|
at `path`. (`cargo install cargo-asm` installs `cargo asm` and `cargo
|
||||||
|
llvm-ir`). `--build-type=debug` emits code for debug builds. There are also
|
||||||
|
other useful options. Also, debug info in LLVM IR can clutter the output a lot:
|
||||||
|
`RUSTFLAGS="-C debuginfo=0"` is really useful.
|
||||||
|
|
||||||
|
`RUSTFLAGS="-C save-temps"` outputs LLVM bitcode (not the same as IR) at
|
||||||
|
different stages during compilation, which is sometimes useful. One just needs
|
||||||
|
to convert the bitcode files to `.ll` files using `llvm-dis`, which should be in
|
||||||
|
your local build of rustc.
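
The conversion step could be scripted roughly as follows. This is a sketch: `disassemble_bitcode` and the `LLVM_DIS` variable are hypothetical names, and the directory in the usage comment is an assumption about where your build puts its temporaries:

```bash
# LLVM_DIS is an assumed variable; point it at the llvm-dis from your rustc build,
# e.g. ./build/<host-triple>/llvm/bin/llvm-dis. Falls back to llvm-dis on PATH.
LLVM_DIS="${LLVM_DIS:-llvm-dis}"

# Hypothetical helper: disassemble every .bc file in a directory to a .ll file.
disassemble_bitcode() {
    dir="$1"
    for bc in "$dir"/*.bc; do
        # Skip the unexpanded pattern when the directory has no bitcode files.
        [ -e "$bc" ] || continue
        "$LLVM_DIS" "$bc" -o "${bc%.bc}.ll" || return 1
    done
}

# Usage (directory is an assumption): disassemble_bitcode target/debug/deps
```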
|
||||||
|
|
||||||
|
If you want to play with the optimization pipeline, you can use the `opt` tool
|
||||||
|
from `./build/<host-triple>/llvm/bin/` with the LLVM IR emitted by rustc. Note
|
||||||
|
that rustc emits different IR depending on whether `-O` is enabled, even
|
||||||
without LLVM's optimizations, so if you want to play with the IR rustc emits,
|
without LLVM's optimizations, so if you want to play with the IR rustc emits,
|
||||||
you should:
|
you should:
|
||||||
|
|
||||||
|
|
@ -295,21 +331,21 @@ $ $OPT -S -O2 < my-file.ll > my
|
||||||
```
|
```
|
||||||
|
|
||||||
If you just want to get the LLVM IR during the LLVM pipeline, to e.g. see which
|
If you just want to get the LLVM IR during the LLVM pipeline, to e.g. see which
|
||||||
IR causes an optimization-time assertion to fail, or to see when
|
IR causes an optimization-time assertion to fail, or to see when LLVM performs
|
||||||
LLVM performs a particular optimization, you can pass the rustc flag
|
a particular optimization, you can pass the rustc flag `-C
|
||||||
`-C llvm-args=-print-after-all`, and possibly add
|
llvm-args=-print-after-all`, and possibly add `-C
|
||||||
`-C llvm-args='-filter-print-funcs=EXACT_FUNCTION_NAME` (e.g.
|
llvm-args='-filter-print-funcs=EXACT_FUNCTION_NAME'` (e.g. `-C
|
||||||
`-C llvm-args='-filter-print-funcs=_ZN11collections3str21_$LT$impl$u20$str$GT$\
|
llvm-args='-filter-print-funcs=_ZN11collections3str21_$LT$impl$u20$str$GT$\
|
||||||
7replace17hbe10ea2e7c809b0bE'`).
|
7replace17hbe10ea2e7c809b0bE'`).
|
||||||
|
|
||||||
That produces a lot of output into standard error, so you'll want to pipe
|
That produces a lot of output into standard error, so you'll want to pipe that
|
||||||
that to some file. Also, if you are using neither `-filter-print-funcs` nor
|
to some file. Also, if you are using neither `-filter-print-funcs` nor `-C
|
||||||
`-C codegen-units=1`, then, because the multiple codegen units run in parallel,
|
codegen-units=1`, then, because the multiple codegen units run in parallel, the
|
||||||
the printouts will mix together and you won't be able to read anything.
|
printouts will mix together and you won't be able to read anything.
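
Putting these flags together, a hedged sketch of capturing the per-pass IR for one function might look like this. The file `demo.rs`, the function `add_one` and the log name `passes.log` are hypothetical, and `rustc` is assumed to be on your `PATH`:

```bash
workdir=$(mktemp -d)
cat > "$workdir/demo.rs" <<'EOF'
#[no_mangle]
pub fn add_one(x: u32) -> u32 {
    x.wrapping_add(1)
}
EOF

if command -v rustc >/dev/null 2>&1; then
    # One codegen unit plus a function filter keeps the printouts readable.
    # LLVM prints to standard error, so redirect it to a file.
    rustc --crate-type=lib -O -C codegen-units=1 \
        -C llvm-args='-print-after-all -filter-print-funcs=add_one' \
        --out-dir "$workdir" "$workdir/demo.rs" 2> "$workdir/passes.log"
    wc -l "$workdir/passes.log"
else
    echo "rustc not found; skipping" >&2
fi
```

For a real crate the function name must be the exact mangled symbol, as in the example above.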
|
||||||
|
|
||||||
If you want just the IR for a specific function (say, you want to see
|
If you want just the IR for a specific function (say, you want to see why it
|
||||||
why it causes an assertion or doesn't optimize correctly), you can use
|
causes an assertion or doesn't optimize correctly), you can use `llvm-extract`,
|
||||||
`llvm-extract`, e.g.
|
e.g.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
$ ./build/$TRIPLE/llvm/bin/llvm-extract \
|
$ ./build/$TRIPLE/llvm/bin/llvm-extract \
|
||||||
|
|
@ -319,4 +355,32 @@ $ ./build/$TRIPLE/llvm/bin/llvm-extract \
|
||||||
> extracted.ll
|
> extracted.ll
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Filing LLVM bug reports
|
||||||
|
|
||||||
|
When filing an LLVM bug report, you will probably want some sort of minimal
|
||||||
|
working example that demonstrates the problem. The Godbolt compiler explorer is
|
||||||
|
really helpful for this.
|
||||||
|
|
||||||
|
1. Once you have some LLVM IR for the problematic code (see above), you can
|
||||||
|
create a minimal working example with Godbolt. Go to
|
||||||
|
[gcc.godbolt.org](https://gcc.godbolt.org).
|
||||||
|
|
||||||
|
2. Choose `LLVM-IR` as the programming language.
|
||||||
|
|
||||||
|
3. Use `llc` to compile the IR to a particular target as is:
|
||||||
|
- There are some useful flags: `-mattr` enables target features, `-march=`
|
||||||
|
selects the target, `-mcpu=` selects the CPU, etc.
|
||||||
|
- Commands like `llc -march=help` output all architectures available, which
|
||||||
|
is useful because sometimes the Rust arch names and the LLVM names do not
|
||||||
|
match.
|
||||||
|
- If you have compiled rustc yourself somewhere, in the target directory
|
||||||
|
you have binaries for `llc`, `opt`, etc.
|
||||||
|
|
||||||
|
4. If you want to optimize the LLVM-IR, you can use `opt` to see how the LLVM
|
||||||
|
optimizations transform it.
|
||||||
|
|
||||||
|
5. Once you have a godbolt link demonstrating the issue, it is pretty easy to
|
||||||
|
file an LLVM bug.
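
Steps 3 and 4 can also be tried locally before pasting into Godbolt. The following is a sketch only: `repro.ll` and its contents are a hypothetical stand-in for your extracted IR, the `LLC` and `OPT` variables are assumed names for the tool paths, and the x86-64 target, CPU and feature choices are just examples:

```bash
workdir=$(mktemp -d)
# Hypothetical stand-in for the IR extracted from rustc.
cat > "$workdir/repro.ll" <<'EOF'
define i32 @add_one(i32 %x) {
  %y = add i32 %x, 1
  ret i32 %y
}
EOF

# Assumed variables; point them at ./build/<host-triple>/llvm/bin/ if needed.
LLC="${LLC:-llc}"
OPT="${OPT:-opt}"

if command -v "$LLC" >/dev/null 2>&1; then
    # Compile the IR as is for a chosen target, CPU and feature set.
    "$LLC" -march=x86-64 -mcpu=haswell -mattr=+avx2 \
        "$workdir/repro.ll" -o "$workdir/repro.s"
fi

if command -v "$OPT" >/dev/null 2>&1; then
    # Run the -O2 pipeline and keep the result as textual IR.
    "$OPT" -S -O2 < "$workdir/repro.ll" > "$workdir/repro.opt.ll"
fi
```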
|
||||||
|
|
||||||
|
|
||||||
[env-logger]: https://docs.rs/env_logger/0.4.3/env_logger/
|
[env-logger]: https://docs.rs/env_logger/0.4.3/env_logger/
|
||||||
|
|
|
||||||