Compare commits
14 Commits
db0e0c41c0
...
b3605e5e9d
| Author | SHA1 | Date |
|---|---|---|
| | b3605e5e9d | |
| | 4c6d66ccb0 | |
| | 4233695fea | |
| | 33eaf36815 | |
| | 980acc5eee | |
| | e0a39188f1 | |
| | 9d7ba8573d | |
| | c963b4ad93 | |
| | a02af2f135 | |
| | 4185dca095 | |
| | a2c80e6e23 | |
| | 7b921990fc | |
| | fc47bce9f7 | |
| | 733773ce12 | |
@ -101,10 +101,13 @@
|
|||
- [The `rustdoc` test suite](./rustdoc-internals/rustdoc-test-suite.md)
|
||||
- [The `rustdoc-gui` test suite](./rustdoc-internals/rustdoc-gui-test-suite.md)
|
||||
- [The `rustdoc-json` test suite](./rustdoc-internals/rustdoc-json-test-suite.md)
|
||||
- [GPU offload internals](./offload/internals.md)
|
||||
- [Installation](./offload/installation.md)
|
||||
- [Autodiff internals](./autodiff/internals.md)
|
||||
- [Installation](./autodiff/installation.md)
|
||||
- [How to debug](./autodiff/debugging.md)
|
||||
- [Autodiff flags](./autodiff/flags.md)
|
||||
- [Type Trees](./autodiff/type-trees.md)
|
||||
- [Current limitations](./autodiff/limitations.md)
|
||||
|
||||
# Source Code Representation
|
||||
|
|
@ -121,8 +124,9 @@
|
|||
- [Feature gate checking](./feature-gate-ck.md)
|
||||
- [Lang Items](./lang-items.md)
|
||||
- [The HIR (High-level IR)](./hir.md)
|
||||
- [Lowering AST to HIR](./ast-lowering.md)
|
||||
- [Debugging](./hir-debugging.md)
|
||||
- [Lowering AST to HIR](./hir/lowering.md)
|
||||
- [Ambig/Unambig Types and Consts](./hir/ambig-unambig-ty-and-consts.md)
|
||||
- [Debugging](./hir/debugging.md)
|
||||
- [The THIR (Typed High-level IR)](./thir.md)
|
||||
- [The MIR (Mid-level IR)](./mir/index.md)
|
||||
- [MIR construction](./mir/construction.md)
|
||||
|
|
|
|||
|
|
@ -0,0 +1,118 @@
|
|||
# Type Trees in Enzyme
|
||||
|
||||
This document describes type trees as used by Enzyme for automatic differentiation.
|
||||
|
||||
## What are Type Trees?
|
||||
|
||||
Type trees in Enzyme are a way to represent the types of variables, including their activity (e.g., whether they are active, duplicated, or contain duplicated data) for automatic differentiation. They provide a structured way for Enzyme to understand how to handle different data types during the differentiation process.
|
||||
|
||||
## Representing Rust Types as Type Trees
|
||||
|
||||
Enzyme needs to understand the structure and properties of Rust types to perform automatic differentiation correctly. This is where type trees come in. They provide a detailed map of a type, including pointer indirections and the underlying concrete data types.
|
||||
|
||||
The `-enzyme-rust-type` flag in Enzyme helps in interpreting types more accurately in the context of Rust's memory layout and type system.
|
||||
|
||||
### Primitive Types
|
||||
|
||||
#### Floating-Point Types (`f32`, `f64`)
|
||||
|
||||
Consider a Rust reference to a 32-bit floating-point number, `&f32`.
|
||||
|
||||
In LLVM IR, this might be represented, for instance, as an `i8*` (a generic byte pointer) that is then `bitcast` to a `float*`. Consider the following LLVM IR function:
|
||||
|
||||
```llvm
define internal void @callee(i8* %x) {
start:
  %x.dbg.spill = bitcast i8* %x to float*
  ; ...
  ret void
}
```
|
||||
|
||||
When Enzyme analyzes this function (with appropriate flags like `-enzyme-rust-type`), it might produce the following type information for the argument `%x` and the result of the bitcast:
|
||||
|
||||
```llvm
i8* %x: {[-1]:Pointer, [-1,0]:Float@float}
%x.dbg.spill = bitcast i8* %x to float*: {[-1]:Pointer, [-1,0]:Float@float}
```
|
||||
|
||||
**Understanding the Type Tree: `{[-1]:Pointer, [-1,0]:Float@float}`**
|
||||
|
||||
This string is the type tree representation. Let's break it down:
|
||||
|
||||
* **`{ ... }`**: This encloses the set of type information for the variable.
* **`[-1]:Pointer`**:
  * `[-1]` is an index or path. In this context, `-1` often refers to the base memory location or the immediate value pointed to.
  * `Pointer` indicates that the variable `%x` itself is treated as a pointer.
* **`[-1,0]:Float@float`**:
  * `[-1,0]` is a path. It means: start with the base item `[-1]` (the pointer), and then look at offset `0` from the memory location it points to.
  * `Float` is the `CConcreteType` (from `enzyme_ffi.rs`, corresponding to `DT_Float`). It signifies that the data at this location is a floating-point number.
  * `@float` is a subtype or specific variant of `Float`. In this case, it specifies a single-precision float (like Rust's `f32`).
|
||||
|
||||
A reference to an `f64` (e.g., `&f64`) is handled very similarly. The LLVM IR might cast to `double*`:
|
||||
```llvm
define internal void @callee(i8* %x) {
start:
  %x.dbg.spill = bitcast i8* %x to double*
  ; ...
  ret void
}
```
|
||||
|
||||
And the type tree would be:
|
||||
|
||||
```llvm
i8* %x: {[-1]:Pointer, [-1,0]:Float@double}
```
|
||||
The key difference is `@double`, indicating a double-precision float.
|
||||
|
||||
This level of detail allows Enzyme to know, for example, that if `x` is an active variable in differentiation, the floating-point value it points to needs to be handled according to AD rules for its specific precision.
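
To make the notation concrete, the following is a minimal sketch that models a type tree as plain Rust data and prints it in the textual form shown above. The names `Kind`, `Entry`, `TypeTree` and `ref_to_float` are hypothetical and exist only for this illustration; they are not the types that rustc or Enzyme actually use.

```rust
use std::fmt;

/// Hypothetical, simplified stand-in for the concrete type of one entry.
#[derive(Debug)]
enum Kind {
    Pointer,
    /// The string is the precision suffix, e.g. "float" or "double".
    Float(&'static str),
}

/// One `path:kind` pair, such as `[-1,0]:Float@float`.
struct Entry {
    path: Vec<i64>,
    kind: Kind,
}

/// A type tree is modelled here as an ordered list of entries.
struct TypeTree(Vec<Entry>);

impl fmt::Display for TypeTree {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let entries: Vec<String> = self
            .0
            .iter()
            .map(|e| {
                let path: Vec<String> = e.path.iter().map(|i| i.to_string()).collect();
                let kind = match &e.kind {
                    Kind::Pointer => "Pointer".to_string(),
                    Kind::Float(precision) => format!("Float@{precision}"),
                };
                format!("[{}]:{}", path.join(","), kind)
            })
            .collect();
        write!(f, "{{{}}}", entries.join(", "))
    }
}

/// Builds the tree for a reference to a single float of the given precision.
fn ref_to_float(precision: &'static str) -> TypeTree {
    TypeTree(vec![
        Entry { path: vec![-1], kind: Kind::Pointer },
        Entry { path: vec![-1, 0], kind: Kind::Float(precision) },
    ])
}

fn main() {
    println!("{}", ref_to_float("float"));
    println!("{}", ref_to_float("double"));
}
```

Calling `ref_to_float("float")` reproduces the `&f32` string from the example, and `ref_to_float("double")` the `&f64` one; only the precision suffix differs.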
|
||||
|
||||
### Compound Types
|
||||
|
||||
#### Structs
|
||||
|
||||
Consider a Rust struct `T` with two `f32` fields (e.g., a reference `&T`):
|
||||
|
||||
```rust
struct T {
    x: f32,
    y: f32,
}

// And a function taking a reference to it:
// fn callee(t: &T) { /* ... */ }
```
|
||||
|
||||
In LLVM IR, a pointer to this struct might be initially represented as `i8*` and then cast to the specific struct type, like `{ float, float }*`:
|
||||
|
||||
```llvm
define internal void @callee(i8* %t) {
start:
  %t.dbg.spill = bitcast i8* %t to { float, float }*
  ; ...
  ret void
}
```
|
||||
|
||||
The Enzyme type analysis output for `%t` would be:
|
||||
|
||||
```llvm
i8* %t: {[-1]:Pointer, [-1,0]:Float@float, [-1,4]:Float@float}
```
|
||||
|
||||
**Understanding the Struct Type Tree: `{[-1]:Pointer, [-1,0]:Float@float, [-1,4]:Float@float}`**
|
||||
|
||||
* **`[-1]:Pointer`**: As before, this indicates that `%t` is a pointer.
* **`[-1,0]:Float@float`**:
  * This describes the first field of the struct (`x`).
  * `[-1,0]` means: from the memory location pointed to by `%t` (`-1`), at offset `0` bytes.
  * `Float@float` indicates this field is an `f32`.
* **`[-1,4]:Float@float`**:
  * This describes the second field of the struct (`y`).
  * `[-1,4]` means: from the memory location pointed to by `%t` (`-1`), at offset `4` bytes.
  * `Float@float` indicates this field is also an `f32`.
|
||||
|
||||
The offset `4` comes from the size of the first field (`f32` is 4 bytes). If the first field were, for example, an `f64` (8 bytes), the second field might be at offset `[-1,8]`. Enzyme uses these offsets to pinpoint the exact memory location of each field within the struct.
|
||||
|
||||
This detailed mapping is crucial for Enzyme to correctly track the activity of individual struct fields during automatic differentiation.
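
The byte offsets that appear in the type tree can be cross-checked from the Rust side with `std::mem::offset_of!`. This is only a sketch: it assumes `#[repr(C)]`, which the struct in the example above does not have, because Rust's default layout does not guarantee field order. `U` is a hypothetical variant of `T` whose first field is an `f64`.

```rust
use std::mem::offset_of;

/// Same fields as `T` above; `#[repr(C)]` is an assumption added here
/// so that the field order (and therefore the offsets) is fixed.
#[repr(C)]
struct T {
    x: f32,
    y: f32,
}

/// Hypothetical variant with an 8-byte first field.
#[repr(C)]
struct U {
    x: f64,
    y: f32,
}

/// Byte offsets of `x` and `y` inside `T`.
fn field_offsets_t() -> (usize, usize) {
    (offset_of!(T, x), offset_of!(T, y))
}

/// Byte offsets of `x` and `y` inside `U`.
fn field_offsets_u() -> (usize, usize) {
    (offset_of!(U, x), offset_of!(U, y))
}

fn main() {
    // With `repr(C)`, `T` places `y` directly after the 4-byte `x`.
    println!("T: {:?}", field_offsets_t());
    // In `U`, `y` follows the 8-byte `x`.
    println!("U: {:?}", field_offsets_u());
}
```

Under these assumptions the offsets for `T` match the `[-1,0]` and `[-1,4]` paths above, and `U` shows the second field moving to offset `8`.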
|
||||
|
|
@ -174,8 +174,8 @@ compiler, you can use it instead of the JSON file for both arguments.
|
|||
## Promoting a target from tier 2 (target) to tier 2 (host)
|
||||
|
||||
There are two levels of tier 2 targets:
|
||||
a) Targets that are only cross-compiled (`rustup target add`)
|
||||
b) Targets that [have a native toolchain][tier2-native] (`rustup toolchain install`)
|
||||
- Targets that are only cross-compiled (`rustup target add`)
|
||||
- Targets that [have a native toolchain][tier2-native] (`rustup toolchain install`)
|
||||
|
||||
[tier2-native]: https://doc.rust-lang.org/nightly/rustc/target-tier-policy.html#tier-2-with-host-tools
|
||||
|
||||
|
|
|
|||
|
|
@ -553,7 +553,7 @@ compiler](#linting-early-in-the-compiler).
|
|||
|
||||
|
||||
[AST nodes]: the-parser.md
|
||||
[AST lowering]: ast-lowering.md
|
||||
[AST lowering]: ./hir/lowering.md
|
||||
[HIR nodes]: hir.md
|
||||
[MIR nodes]: mir/index.md
|
||||
[macro expansion]: macro-expansion.md
|
||||
|
|
|
|||
|
|
@ -5,7 +5,7 @@
|
|||
The HIR – "High-Level Intermediate Representation" – is the primary IR used
|
||||
in most of rustc. It is a compiler-friendly representation of the abstract
|
||||
syntax tree (AST) that is generated after parsing, macro expansion, and name
|
||||
resolution (see [Lowering](./ast-lowering.html) for how the HIR is created).
|
||||
resolution (see [Lowering](./hir/lowering.md) for how the HIR is created).
|
||||
Many parts of HIR resemble Rust surface syntax quite closely, with
|
||||
the exception that some of Rust's expression forms have been desugared away.
|
||||
For example, `for` loops are converted into a `loop` and do not appear in
|
||||
|
|
|
|||
|
|
@ -0,0 +1,63 @@
|
|||
# Ambig/Unambig Types and Consts
|
||||
|
||||
Type and const arguments in the HIR can be in two kinds of positions: ambiguous (ambig) or unambiguous (unambig). Ambig positions are where
|
||||
it would be valid to parse either a type or a const; unambig positions are where only one kind would be valid to
|
||||
parse.
|
||||
|
||||
```rust
fn func<T, const N: usize>(arg: T) {
    //                          ^ Unambig type position
    let a: _ = arg;
    //     ^ Unambig type position

    func::<T, N>(arg);
    //     ^  ^
    //     ^^^^ Ambig position

    let _: [u8; 10];
    //      ^^  ^^ Unambig const position
    //      ^^ Unambig type position
}
```
|
||||
|
||||
Most types/consts in ambig positions are able to be disambiguated as either a type or const during parsing. Single segment paths are always represented as types in the AST but may get resolved to a const parameter during name resolution, then lowered to a const argument during ast-lowering. The only generic arguments which remain ambiguous after lowering are inferred generic arguments (`_`) in path segments. For example, in `Foo<_>` it is not clear whether the `_` argument is an inferred type argument, or an inferred const argument.
|
||||
|
||||
In unambig positions, inferred arguments are represented with [`hir::TyKind::Infer`][ty_infer] or [`hir::ConstArgKind::Infer`][const_infer] depending on whether it is a type or const position respectively.
|
||||
In ambig positions, inferred arguments are represented with `hir::GenericArg::Infer`.
|
||||
|
||||
A naive implementation of this would leave potentially 5 places where, judging from the structure of the HIR, you might expect to find an inferred type/const:
|
||||
1. In unambig type position as a `hir::TyKind::Infer`
|
||||
2. In unambig const arg position as a `hir::ConstArgKind::Infer`
|
||||
3. In an ambig position as a [`GenericArg::Type(TyKind::Infer)`][generic_arg_ty]
|
||||
4. In an ambig position as a [`GenericArg::Const(ConstArgKind::Infer)`][generic_arg_const]
|
||||
5. In an ambig position as a [`GenericArg::Infer`][generic_arg_infer]
|
||||
|
||||
Note that places 3 and 4 would never actually be possible to encounter as we always lower to `GenericArg::Infer` in generic arg position.
|
||||
|
||||
This has a few failure modes:
|
||||
- People may write visitors which check for `GenericArg::Infer` but forget to check for `hir::TyKind/ConstArgKind::Infer`, only handling infers in ambig positions by accident.
|
||||
- People may write visitors which check for `hir::TyKind/ConstArgKind::Infer` but forget to check for `GenericArg::Infer`, only handling infers in unambig positions by accident.
|
||||
- People may write visitors which check for `GenericArg::Type/Const(TyKind/ConstArgKind::Infer)` and `GenericArg::Infer`, not realising that we never represent inferred types/consts in ambig positions as a `GenericArg::Type/Const`.
|
||||
- People may write visitors which check for *only* `TyKind::Infer` and not `ConstArgKind::Infer`, forgetting that there are also inferred const arguments (and vice versa).
|
||||
|
||||
To make writing HIR visitors less error prone when caring about inferred types/consts we have a relatively complex system:
|
||||
|
||||
1. We have different types in the compiler for when a type or const is in an ambig or unambig position: `hir::Ty<AmbigArg>` for ambig positions and `hir::Ty<()>` for unambig positions. [`AmbigArg`][ambig_arg] is an uninhabited type which we use in the `Infer` variant of `TyKind` and `ConstArgKind` to selectively "disable" it if we are in an ambig position.
|
||||
|
||||
2. The [`visit_ty`][visit_ty] and [`visit_const_arg`][visit_const_arg] methods on HIR visitors only accept the ambig position versions of types/consts. Unambig types/consts are implicitly converted to ambig types/consts during the visiting process, with the `Infer` variant handled by a dedicated [`visit_infer`][visit_infer] method.
|
||||
|
||||
This has a number of benefits:
|
||||
- It's clear that `GenericArg::Type/Const` cannot represent inferred type/const arguments
|
||||
- Implementors of `visit_ty` and `visit_const_arg` will never encounter inferred types/consts making it impossible to write a visitor that seems to work right but handles edge cases wrong
|
||||
- The `visit_infer` method handles *all* cases of inferred type/consts in the HIR making it easy for visitors to handle inferred type/consts in one dedicated place and not forget cases
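
The uninhabited-type trick can be illustrated outside the compiler. The following is a simplified, self-contained model, not the real `rustc_hir` definitions: the `TyKind` and `GenericArg` enums and the `describe_*` functions here are hypothetical toy versions that only mirror the idea of parameterising the `Infer` variant over `()` or `AmbigArg`.

```rust
/// Uninhabited marker: no value of this type can ever be constructed.
enum AmbigArg {}

/// Toy stand-in for `hir::TyKind`. The type parameter decides whether
/// the `Infer` variant can actually be built.
enum TyKind<Unambig = ()> {
    Path(&'static str),
    Infer(Unambig),
}

/// Toy stand-in for `hir::GenericArg`: an ambig position holds either a
/// type that cannot be `Infer`, or the dedicated `Infer` variant.
enum GenericArg {
    Type(TyKind<AmbigArg>),
    Infer,
}

/// Unambig position: `Infer(())` is an ordinary, constructible value.
fn describe_unambig(kind: &TyKind) -> String {
    match kind {
        TyKind::Path(name) => format!("path `{name}`"),
        TyKind::Infer(()) => "inferred `_`".to_string(),
    }
}

/// Ambig position: the `Infer` arm can never run, and the empty match
/// on the uninhabited payload proves that to the compiler.
fn describe_ambig(kind: &TyKind<AmbigArg>) -> String {
    match kind {
        TyKind::Path(name) => format!("path `{name}`"),
        TyKind::Infer(never) => match *never {},
    }
}

/// All inferred arguments in ambig positions arrive through one variant.
fn describe_arg(arg: &GenericArg) -> String {
    match arg {
        GenericArg::Type(kind) => describe_ambig(kind),
        GenericArg::Infer => "inferred `_` (type or const, not yet known)".to_string(),
    }
}

fn main() {
    println!("{}", describe_unambig(&TyKind::Infer(())));
    println!("{}", describe_arg(&GenericArg::Type(TyKind::Path("T"))));
    println!("{}", describe_arg(&GenericArg::Infer));
}
```

In this model, writing `GenericArg::Type(TyKind::Infer(..))` is impossible because no `AmbigArg` value exists to put in the payload, which is the property the first benefit above relies on.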
|
||||
|
||||
[ty_infer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.TyKind.html#variant.Infer
|
||||
[const_infer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.ConstArgKind.html#variant.Infer
|
||||
[generic_arg_ty]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.GenericArg.html#variant.Type
|
||||
[generic_arg_const]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.GenericArg.html#variant.Const
|
||||
[generic_arg_infer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.GenericArg.html#variant.Infer
|
||||
[ambig_arg]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.AmbigArg.html
|
||||
[visit_ty]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/intravisit/trait.Visitor.html#method.visit_ty
|
||||
[visit_const_arg]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/intravisit/trait.Visitor.html#method.visit_const_arg
|
||||
[visit_infer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/intravisit/trait.Visitor.html#method.visit_infer
|
||||
|
|
@ -1,6 +1,6 @@
|
|||
# AST lowering
|
||||
|
||||
The AST lowering step converts AST to [HIR](hir.html).
|
||||
The AST lowering step converts AST to [HIR](../hir.md).
|
||||
This means many structures are removed if they are irrelevant
|
||||
for type analysis or similar syntax agnostic analyses. Examples
|
||||
of such structures include but are not limited to
|
||||
|
|
@ -0,0 +1,71 @@
|
|||
# Installation
|
||||
|
||||
In the future, `std::offload` should become available in nightly builds for users. For now, everyone still needs to build rustc from source.
|
||||
|
||||
## Build instructions
|
||||
|
||||
First you need to clone and configure the Rust repository:
|
||||
```bash
git clone --depth=1 git@github.com:rust-lang/rust.git
cd rust
./configure --enable-llvm-link-shared --release-channel=nightly --enable-llvm-assertions --enable-offload --enable-enzyme --enable-clang --enable-lld --enable-option-checking --enable-ninja --disable-docs
```
|
||||
|
||||
Afterwards you can build rustc using:
|
||||
```bash
./x.py build --stage 1 library
```
|
||||
|
||||
Afterwards, `rustup toolchain link` will allow you to use it through cargo:
|
||||
```
rustup toolchain link offload build/host/stage1
rustup toolchain install nightly # enables -Z unstable-options
```
|
||||
|
||||
|
||||
|
||||
## Build instructions for LLVM itself
|
||||
```bash
git clone --depth=1 git@github.com:llvm/llvm-project.git
cd llvm-project
mkdir build
cd build
cmake -G Ninja ../llvm -DLLVM_TARGETS_TO_BUILD="host,AMDGPU,NVPTX" -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_PROJECTS="clang;lld" -DLLVM_ENABLE_RUNTIMES="offload,openmp" -DLLVM_ENABLE_PLUGINS=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=.
ninja
ninja install
```
|
||||
This gives you a working LLVM build.
|
||||
|
||||
|
||||
## Testing
|
||||
Run:
|
||||
```
./x.py test --stage 1 tests/codegen/gpu_offload
```
|
||||
|
||||
## Usage
|
||||
It is important to use a clang compiler built on the same LLVM as rustc. Just calling clang without the full path will likely use your system clang, which will probably be incompatible.
|
||||
```
/absolute/path/to/rust/build/x86_64-unknown-linux-gnu/stage1/bin/rustc --edition=2024 --crate-type cdylib src/main.rs --emit=llvm-ir -O -C lto=fat -Cpanic=abort -Zoffload=Enable
/absolute/path/to/rust/build/x86_64-unknown-linux-gnu/llvm/bin/clang++ -fopenmp --offload-arch=native -g -O3 main.ll -o main -save-temps
LIBOMPTARGET_INFO=-1 ./main
```
|
||||
The first step will generate a `main.ll` file, which has enough instructions to cause the offload runtime to move data to and from a GPU.
|
||||
The second step will use clang as the compilation driver to compile our IR file down to a working binary. Only a very small Rust subset will work out of the box here, unless
|
||||
you use features like build-std, which are not covered by this guide. Look at the codegen test to get a feeling for how to write a working example.
|
||||
In the last step you can run your binary. If all went well, you will see a data transfer being reported:
|
||||
```
omptarget device 0 info: Entering OpenMP data region with being_mapper at unknown:0:0 with 1 arguments:
omptarget device 0 info: tofrom(unknown)[1024]
omptarget device 0 info: Creating new map entry with HstPtrBase=0x00007fffffff9540, HstPtrBegin=0x00007fffffff9540, TgtAllocBegin=0x0000155547200000, TgtPtrBegin=0x0000155547200000, Size=1024, DynRefCount=1, HoldRefCount=0, Name=unknown
omptarget device 0 info: Copying data from host to device, HstPtr=0x00007fffffff9540, TgtPtr=0x0000155547200000, Size=1024, Name=unknown
omptarget device 0 info: OpenMP Host-Device pointer mappings after block at unknown:0:0:
omptarget device 0 info: Host Ptr Target Ptr Size (B) DynRefCount HoldRefCount Declaration
omptarget device 0 info: 0x00007fffffff9540 0x0000155547200000 1024 1 0 unknown at unknown:0:0
// some other output
omptarget device 0 info: Exiting OpenMP data region with end_mapper at unknown:0:0 with 1 arguments:
omptarget device 0 info: tofrom(unknown)[1024]
omptarget device 0 info: Mapping exists with HstPtrBegin=0x00007fffffff9540, TgtPtrBegin=0x0000155547200000, Size=1024, DynRefCount=0 (decremented, delayed deletion), HoldRefCount=0
omptarget device 0 info: Copying data from device to host, TgtPtr=0x0000155547200000, HstPtr=0x00007fffffff9540, Size=1024, Name=unknown
omptarget device 0 info: Removing map entry with HstPtrBegin=0x00007fffffff9540, TgtPtrBegin=0x0000155547200000, Size=1024, Name=unknown
```
|
||||
|
|
@ -0,0 +1,9 @@
|
|||
# std::offload
|
||||
|
||||
This module is under active development. Once upstream, it should allow Rust developers to run Rust code on GPUs.
|
||||
We aim to develop a `rusty` GPU programming interface, which is safe, convenient and sufficiently fast by default.
|
||||
This includes automatic data movement to and from the GPU, in an efficient way. We will (later)
|
||||
also offer more advanced, possibly unsafe, interfaces which allow a higher degree of control.
|
||||
|
||||
The implementation is based on LLVM's "offload" project, which is already used by OpenMP to run Fortran or C++ code on GPUs.
|
||||
While the project is under development, users will need to call other compilers like clang to finish the compilation process.
|
||||
|
|
@ -410,7 +410,7 @@ For more details on bootstrapping, see
|
|||
- Guide: [The HIR](hir.md)
|
||||
- Guide: [Identifiers in the HIR](hir.md#identifiers-in-the-hir)
|
||||
- Guide: [The `HIR` Map](hir.md#the-hir-map)
|
||||
- Guide: [Lowering `AST` to `HIR`](ast-lowering.md)
|
||||
- Guide: [Lowering `AST` to `HIR`](./hir/lowering.md)
|
||||
- How to view `HIR` representation for your code `cargo rustc -- -Z unpretty=hir-tree`
|
||||
- Rustc `HIR` definition: [`rustc_hir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html)
|
||||
- Main entry point: **TODO**
|
||||
|
|
|
|||
|
|
@ -7,8 +7,8 @@ This is a guide for how to profile rustc with [perf](https://perf.wiki.kernel.or
|
|||
- Get a clean checkout of rust-lang/master, or whatever it is you want
|
||||
to profile.
|
||||
- Set the following settings in your `bootstrap.toml`:
|
||||
- `debuginfo-level = 1` - enables line debuginfo
|
||||
- `jemalloc = false` - lets you do memory use profiling with valgrind
|
||||
- `rust.debuginfo-level = 1` - enables line debuginfo
|
||||
- `rust.jemalloc = false` - lets you do memory use profiling with valgrind
|
||||
- leave everything else the defaults
|
||||
- Run `./x build` to get a full build
|
||||
- Make a rustup toolchain pointing to that result
|
||||
|
|
|
|||