Added Rustc Debugger Support Chapter
This commit is contained in:
parent
fd50812b27
commit
5236600afd
|
|
@ -84,6 +84,7 @@
|
|||
- [Updating LLVM](./codegen/updating-llvm.md)
|
||||
- [Debugging LLVM](./codegen/debugging.md)
|
||||
- [Profile-guided Optimization](./profile-guided-optimization.md)
|
||||
- [Debugging Support in Rust Compiler](./debugging-support-in-rustc.md)
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
|
|
@ -0,0 +1,321 @@
|
|||
# Debugging support in the Rust compiler
|
||||
|
||||
This document explains the state of debugging tools support in the Rust compiler (rustc).
|
||||
The document gives an overview of debugging tools like GDB, LLDB etc. and infrastrcture
|
||||
around Rust compiler to debug Rust code. If you want to learn how to debug the Rust compiler
|
||||
itself, then you must see [Debugging the Compiler] page.
|
||||
|
||||
The material is gathered from YouTube video [Tom Tromey discusses debugging support in rustc].
|
||||
|
||||
## Preliminaries
|
||||
|
||||
### Debuggers
|
||||
|
||||
According to Wikipedia
|
||||
|
||||
> A [debugger or debugging tool] is a computer program that is used to test and debug
|
||||
> other programs (the "target" program).
|
||||
|
||||
Writing a debugger from scratch for a language requires a lot of work, especially if
|
||||
debuggers have to be supported on various platforms. GDB and LLDB, however, can be
|
||||
extended to support debugging a language. This is the path that Rust has chosen.
|
||||
This document's main goal is to document the said debuggers support in Rust compiler.
|
||||
|
||||
### DWARF
|
||||
|
||||
According to the [DWARF] standard website
|
||||
|
||||
> DWARF is a debugging file format used by many compilers and debuggers to support source level
|
||||
> debugging. It addresses the requirements of a number of procedural languages,
|
||||
> such as C, C++, and Fortran, and is designed to be extensible to other languages.
|
||||
> DWARF is architecture independent and applicable to any processor or operating system.
|
||||
> It is widely used on Unix, Linux and other operating systems,
|
||||
> as well as in stand-alone environments.
|
||||
|
||||
DWARF reader is a program that consumes the DWARF format and creates debugger compatible output.
|
||||
This program may live in the compiler itself. DWARF uses a data structure called
|
||||
Debugging Information Entry (DIE) which stores the information as "tags" to denote functions,
|
||||
variables etc., e.g., `DW_TAG_variable`, `DW_TAG_pointer_type`, `DW_TAG_subprogram` etc.
|
||||
You can also invent your own tags and attributes.
|
||||
|
||||
## Supported debuggers
|
||||
|
||||
### GDB
|
||||
|
||||
We have our own fork of GDB - [https://github.com/rust-dev-tools/gdb]
|
||||
|
||||
#### Rust expression parser
|
||||
|
||||
To be able to show debug output we need an expression parser.
|
||||
This (GDB) expression parser is written in [Bison] and is only a subset of Rust expressions.
|
||||
This means that this parser can parse only a subset of Rust expressions.
|
||||
GDB parser was written from scratch and has no relation to any other parser.
|
||||
For example, this parser is not related to Rustc's parser.
|
||||
|
||||
GDB has Rust like value and type output. It can print values and types in a way
|
||||
that look like Rust syntax in the output. Or when you print a type as [ptype] in GDB,
|
||||
it also looks like Rust source code. Checkout the documentation in the [manual for GDB/Rust].
|
||||
|
||||
#### Parser extensions
|
||||
|
||||
Expression parser has a couple of extensions in it to facilitate features that you cannot do
|
||||
with Rust. Some limitations are listed in the [manual for GDB/Rust]. There is some special
|
||||
code in the DWARF reader in GDB to support the extensions.
|
||||
|
||||
A couple of examples of DWARF reader support needed are as follows -
|
||||
|
||||
1. Enum: Needed for support for enum types. The Rustc writes the information about enum into
|
||||
DWARF and GDB reads the DWARF to understand where is the tag field or is there a tag
|
||||
field or is the tag slot shared with non-zero optimization etc.
|
||||
|
||||
2. Dissect trait objects: DWARF extension where the trait object's description in the DWARF
|
||||
also points to a stub description of the corresponding vtable which in turn points to the
|
||||
concrete type for which this trait object exists. This means that you can do a `print *object`
|
||||
for that trait object, and GDB will understand how to find the correct type of the payload in
|
||||
the trait object.
|
||||
|
||||
**TODO**: Figure out if the following should be mentioned in the GDB-Rust document rather than
|
||||
this guide page so there is no duplication. This is regarding the following comments:
|
||||
|
||||
[This comment by Tom](https://github.com/rust-lang/rustc-guide/pull/316#discussion_r284027340)
|
||||
> gdb's Rust extensions and limitations are documented in the gdb manual:
|
||||
https://sourceware.org/gdb/onlinedocs/gdb/Rust.html -- however, this neglects to mention that
|
||||
gdb convenience variables and registers follow the gdb $ convention, and that the Rust parser
|
||||
implements the gdb @ extension.
|
||||
|
||||
[This question by Aman](https://github.com/rust-lang/rustc-guide/pull/316#discussion_r285401353)
|
||||
> @tromey do you think we should mention this part in the GDB-Rust document rather than this
|
||||
document so there is no duplication etc.?
|
||||
|
||||
#### Developer notes
|
||||
|
||||
* This work is now upstream. Bugs can be reported in [GDB Bugzilla].
|
||||
|
||||
### LLDB
|
||||
|
||||
We have our own fork of LLDB - [https://github.com/rust-lang/lldb]
|
||||
|
||||
Fork of LLVM project - [https://github.com/rust-lang/llvm-project]
|
||||
|
||||
LLDB currently only works on macOS because of a dependency issue. This issue was easier to
|
||||
solve for macOS as compared to Linux. However, Tom has a possible solution which can enable
|
||||
us to ship LLDB everywhere.
|
||||
|
||||
#### Rust expression parser
|
||||
|
||||
This expression parser is written in C++. It is a type of [Recursive Descent parser].
|
||||
Implements slightly less of the Rust language than GDB. LLDB has Rust like value and type output.
|
||||
|
||||
#### Parser extensions
|
||||
|
||||
There is some special code in the DWARF reader in LLDB to support the extensions.
|
||||
A couple of examples of DWARF reader support needed are as follows -
|
||||
|
||||
1. Enum: Needed for support for enum types. The Rustc writes the information about
|
||||
enum into DWARF and LLDB reads the DWARF to understand where is the tag field or
|
||||
is there a tag field or is the tag slot shared with non-zero optimization etc.
|
||||
In other words, it has enum support as well.
|
||||
|
||||
#### Developer notes
|
||||
|
||||
* None of the LLDB work is upstream. This [rust-lang/lldb wiki page] explains a few details.
|
||||
* The reason for forking LLDB is that LLDB recently removed all the other language plugins
|
||||
due to lack of maintenance.
|
||||
* LLDB has a plugin architecture but that does not work for language support.
|
||||
* LLDB is available via Rust build (`rustup`).
|
||||
* GDB generally works better on Linux.
|
||||
|
||||
## DWARF and Rustc
|
||||
|
||||
[DWARF] is the standard way compilers generate debugging information that debuggers read.
|
||||
It is _the_ debugging format on macOS and Linux. It is a multi-language, extensible format
|
||||
and is mostly good enough for Rust's purposes. Hence, the current implementation reuses DWARF's
|
||||
concepts. This is true even if some of the concepts in DWARF do not align with Rust
|
||||
semantically because generally there can be some kind of mapping between the two.
|
||||
|
||||
We have some DWARF extensions that the Rust compiler emits and the debuggers understand that
|
||||
are _not_ in the DWARF standard.
|
||||
|
||||
* Rust compiler will emit DWARF for a virtual table, and this `vtable` object will have a
|
||||
`DW_AT_containing_type` that points to the real type. This lets debuggers dissect a trait object
|
||||
pointer to correctly find the payload. E.g., here's such a DIE, from a test case in the gdb
|
||||
repository:
|
||||
|
||||
```asm
|
||||
<1><1a9>: Abbrev Number: 3 (DW_TAG_structure_type)
|
||||
<1aa> DW_AT_containing_type: <0x1b4>
|
||||
<1ae> DW_AT_name : (indirect string, offset: 0x23d): vtable
|
||||
<1b2> DW_AT_byte_size : 0
|
||||
<1b3> DW_AT_alignment : 8
|
||||
```
|
||||
|
||||
* The other extension is that the Rust compiler can emit a tagless discriminated union.
|
||||
See [DWARF feature request] for this item.
|
||||
|
||||
### Current limitations of DWARF
|
||||
|
||||
* Traits - require a bigger change than normal to DWARF, on how to represent Traits in DWARF.
|
||||
* DWARF provides no way to differentiate between Structs and Tuples. Rust compiler emits
|
||||
fields with `__0` and debuggers look for a sequence of such names to overcome this limitation.
|
||||
For example, in this case the debugger would look at a field via `x.__0` instead of `x.0`.
|
||||
This is resolved via the Rust parser in the debugger so now you can do `x.0`.
|
||||
|
||||
DWARF relies on debuggers to know some information about platform ABI.
|
||||
Rust does not do that all the time.
|
||||
|
||||
## Developer notes
|
||||
|
||||
This section is from the talk about certain aspects of development.
|
||||
|
||||
## What is missing
|
||||
|
||||
### Shipping GDB in Rustup
|
||||
|
||||
Tracking issue: [https://github.com/rust-lang/rust/issues/34457]
|
||||
|
||||
Shipping GDB requires change to Rustup delivery system. To manage Rustup build size and
|
||||
times we need to build GDB separately, on its own and somehow provide the artifacts produced
|
||||
to be included in the final build. However, if we can ship GDB with rustup, it will simplify
|
||||
the development process by having compiler emit new debug info which can be readily consumed.
|
||||
|
||||
Main issue in achieving this is setting up dependencies. One such dependency is Python. That
|
||||
is why we have our own fork of GDB because one of the drivers is patched on Rust's side to
|
||||
check the correct version of Python (Python 2.7 in this case. *Note: Python3 is not chosen
|
||||
for this purpose because Python's stable ABI is limited and is not sufficient for GDB's needs.
|
||||
See [https://docs.python.org/3/c-api/stable.html]*).
|
||||
|
||||
This is to keep updates to debugger as fast as possible as we make changes to the debugging symbols.
|
||||
In essence, to ship the debugger as soon as new debugging info is added. GDB only releases
|
||||
every six months or so. However, the changes that are
|
||||
not related to Rust itself should ideally be first merged to upstream eventually.
|
||||
|
||||
### Code signing for LLDB debug server on macOS
|
||||
|
||||
According to Wikipedia, [System Integrity Protection] is
|
||||
|
||||
> System Integrity Protection (SIP, sometimes referred to as rootless) is a security feature
|
||||
> of Apple's macOS operating system introduced in OS X El Capitan. It comprises a number of
|
||||
> mechanisms that are enforced by the kernel. A centerpiece is the protection of system-owned
|
||||
> files and directories against modifications by processes without a specific "entitlement",
|
||||
> even when executed by the root user or a user with root privileges (sudo).
|
||||
|
||||
It prevents processes using `ptrace` syscall. If a process wants to use `ptrace` it has to be
|
||||
code signed. The certificate that signs it has to be trusted on your machine.
|
||||
|
||||
See [Apple developer documentation for System Integrity Protection].
|
||||
|
||||
We may need to sign up with Apple and get the keys to do this signing. Tom has looked into if
|
||||
Mozilla cannot do this because it is at the maximum number of
|
||||
keys it is allowed to sign. Tom does not know if Mozilla could get more keys.
|
||||
|
||||
Alternatively, Tom suggests that maybe a Rust legal entity is needed to get the keys via Apple.
|
||||
This problem is not technical in nature. If we had such a key we could sign GDB as well and
|
||||
ship that.
|
||||
|
||||
### DWARF and Traits
|
||||
|
||||
Rust traits are not emitted into DWARF at all. The impact of this is calling a method `x.method()`
|
||||
does not work as is. The reason being that method is implemented by a trait, as opposed
|
||||
to a type. That information is not present so finding trait methods is missing.
|
||||
|
||||
DWARF has a notion of interface types (possibly added for Java). Tom's idea was to use this
|
||||
interface type as traits.
|
||||
|
||||
DWARF only deals with concrete names, not the reference types. So, a given implementation of a
|
||||
trait for a type would be one of these interfaces (`DW_tag_interface` type). Also, the type for
|
||||
which it is implemented would describe all the interfaces this type implements. This requires a
|
||||
DWARF extension.
|
||||
|
||||
Issue on Github: [https://github.com/rust-lang/rust/issues/33014]
|
||||
|
||||
## Typical process for a Debug Info change (LLVM)
|
||||
|
||||
LLVM has Debug Info (DI) builders. This is the primary thing that Rust calls into.
|
||||
This is why we need to change LLVM first because that is emitted first and not DWARF directly.
|
||||
This is a kind of metadata that you construct and hand-off to LLVM. For the Rustc/LLVM hand-off
|
||||
some LLVM DI builder methods are called to construct representation of a type.
|
||||
|
||||
The steps of this process are as follows -
|
||||
|
||||
1. LLVM needs changing.
|
||||
|
||||
LLVM does not emit Interface types at all, so this needs to be implemented in the LLVM first.
|
||||
|
||||
Get sign off on LLVM maintainers that this is a good idea.
|
||||
|
||||
2. Change the DWARF extension.
|
||||
|
||||
3. Update the debuggers.
|
||||
|
||||
Update DWARF readers, expression evaluators.
|
||||
|
||||
4. Update Rust compiler.
|
||||
|
||||
Change it to emit this new information.
|
||||
|
||||
### Procedural macro stepping
|
||||
|
||||
A deeply profound question is that how do you actually debug a procedural macro?
|
||||
What is the location you emit for a macro expansion? Consider some of the following cases -
|
||||
|
||||
* You can emit location of the invocation of the macro.
|
||||
* You can emit the location of the definition of the macro.
|
||||
* You can emit locations of the content of the macro.
|
||||
|
||||
RFC: [https://github.com/rust-lang/rfcs/pull/2117]
|
||||
|
||||
Focus is to let macros decide what to do. This can be achieved by having some kind of attribute
|
||||
that lets the macro tell the compiler where the line marker should be. This affects where you
|
||||
set the breakpoints and what happens when you step it.
|
||||
|
||||
## Future work
|
||||
|
||||
#### Name mangling changes
|
||||
|
||||
* New demangler in `libiberty` (gcc source tree).
|
||||
* New demangler in LLVM or LLDB.
|
||||
|
||||
**TODO**: Check the location of the demangler source.
|
||||
[Question on Github](https://github.com/rust-lang/rustc-guide/pull/316#discussion_r283062536).
|
||||
|
||||
#### Reuse Rust compiler for expressions
|
||||
|
||||
This is an important idea because debuggers by and large do not try to implement type
|
||||
inference. You need to be much more explicit when you type into the debugger than your
|
||||
actual source code. So, you cannot just copy and paste an expression from your source
|
||||
code to debugger and expect the same answer but this would be nice. This can be helped
|
||||
by using compiler.
|
||||
|
||||
It is certainly doable but it is a large project. You certainly need a bridge to the
|
||||
debugger because the debugger alone has access to the memory. Both GDB (gcc) and LLDB (clang)
|
||||
have this feature. LLDB uses Clang to compile code to JIT and GDB can do the same with GCC.
|
||||
|
||||
Both debuggers expression evaluation implement both a superset and a subset of Rust.
|
||||
They implement just the expression language but they also add some extensions like GDB has
|
||||
convenience variables. Therefore, if you are taking this route then you not only need
|
||||
to do this bridge but may have to add some mode to let the compiler understand some extensions.
|
||||
|
||||
#### Windows debugging (PDB) is missing
|
||||
|
||||
This is a complete unknown.
|
||||
|
||||
[Tom Tromey discusses debugging support in rustc]: https://www.youtube.com/watch?v=elBxMRSNYr4
|
||||
[Debugging the Compiler]: compiler-debugging.md
|
||||
[debugger or debugging tool]: https://en.wikipedia.org/wiki/Debugger
|
||||
[Bison]: https://www.gnu.org/software/bison/
|
||||
[ptype]: https://ftp.gnu.org/old-gnu/Manuals/gdb/html_node/gdb_109.html
|
||||
[rust-lang/lldb wiki page]: https://github.com/rust-lang/lldb/wiki
|
||||
[DWARF]: http://dwarfstd.org
|
||||
[manual for GDB/Rust]: https://sourceware.org/gdb/onlinedocs/gdb/Rust.html
|
||||
[GDB Bugzilla]: https://sourceware.org/bugzilla/
|
||||
[Recursive Descent parser]: https://en.wikipedia.org/wiki/Recursive_descent_parser
|
||||
[System Integrity Protection]: https://en.wikipedia.org/wiki/System_Integrity_Protection
|
||||
[https://github.com/rust-dev-tools/gdb]: https://github.com/rust-dev-tools/gdb
|
||||
[DWARF feature request]: http://dwarfstd.org/ShowIssue.php?issue=180517.2
|
||||
[https://docs.python.org/3/c-api/stable.html]: https://docs.python.org/3/c-api/stable.html
|
||||
[https://github.com/rust-lang/rfcs/pull/2117]: https://github.com/rust-lang/rfcs/pull/2117
|
||||
[https://github.com/rust-lang/rust/issues/33014]: https://github.com/rust-lang/rust/issues/33014
|
||||
[https://github.com/rust-lang/rust/issues/34457]: https://github.com/rust-lang/rust/issues/34457
|
||||
[Apple developer documentation for System Integrity Protection]: https://developer.apple.com/library/archive/releasenotes/MacOSX/WhatsNewInOSX/Articles/MacOSX10_11.html#//apple_ref/doc/uid/TP40016227-SW11
|
||||
[https://github.com/rust-lang/lldb]: https://github.com/rust-lang/lldb
|
||||
[https://github.com/rust-lang/llvm-project]: https://github.com/rust-lang/llvm-project
|
||||
Loading…
Reference in New Issue