Merge fc47bce9f7 into 4c6d66ccb0

Merge pull request #2447 from rust-lang/offload-docs
initial instructions for gpu offload
2025-06-19 16:57:50 +08:00 · 2025-06-18 17:24:36 -07:00 · 2025-06-18 17:22:50 -07:00 · 2025-06-19 00:04:36 +02:00 · 2025-06-19 00:03:33 +02:00 · 2025-06-18 15:30:14 +01:00
12 changed files with 275 additions and 10 deletions
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@ -101,10 +101,13 @@
 	- [The `rustdoc` test suite](./rustdoc-internals/rustdoc-test-suite.md)
 	- [The `rustdoc-gui` test suite](./rustdoc-internals/rustdoc-gui-test-suite.md)
 	- [The `rustdoc-json` test suite](./rustdoc-internals/rustdoc-json-test-suite.md)
+- [GPU offload internals](./offload/internals.md)
+    - [Installation](./offload/installation.md)
 - [Autodiff internals](./autodiff/internals.md)
    - [Installation](./autodiff/installation.md)
    - [How to debug](./autodiff/debugging.md)
    - [Autodiff flags](./autodiff/flags.md)
+    - [Type Trees](./autodiff/type-trees.md)
    - [Current limitations](./autodiff/limitations.md)

 # Source Code Representation
@ -121,8 +124,9 @@
    - [Feature gate checking](./feature-gate-ck.md)
    - [Lang Items](./lang-items.md)
 - [The HIR (High-level IR)](./hir.md)
-    - [Lowering AST to HIR](./ast-lowering.md)
-    - [Debugging](./hir-debugging.md)
+    - [Lowering AST to HIR](./hir/lowering.md)
+    - [Ambig/Unambig Types and Consts](./hir/ambig-unambig-ty-and-consts.md)
+    - [Debugging](./hir/debugging.md)
 - [The THIR (Typed High-level IR)](./thir.md)
 - [The MIR (Mid-level IR)](./mir/index.md)
    - [MIR construction](./mir/construction.md)
--- a/src/autodiff/type-trees.md
+++ b/src/autodiff/type-trees.md
@ -0,0 +1,118 @@
+# Type Trees in Enzyme
+
+This document describes type trees as used by Enzyme for automatic differentiation.
+
+## What are Type Trees?
+
+Type trees in Enzyme are a way to represent the types of variables, including their activity (e.g., whether they are active, duplicated, or contain duplicated data) for automatic differentiation. They provide a structured way for Enzyme to understand how to handle different data types during the differentiation process.
+
+## Representing Rust Types as Type Trees
+
+Enzyme needs to understand the structure and properties of Rust types to perform automatic differentiation correctly. This is where type trees come in. They provide a detailed map of a type, including pointer indirections and the underlying concrete data types.
+
+The `-enzyme-rust-type` flag in Enzyme helps in interpreting types more accurately in the context of Rust's memory layout and type system.
+
+### Primitive Types
+
+#### Floating-Point Types (`f32`, `f64`)
+
+Consider a Rust reference to a 32-bit floating-point number, `&f32`.
+
+In LLVM IR, this might be represented, for instance, as an `i8*` (a generic byte pointer) that is then `bitcast` to a `float*`. Consider the following LLVM IR function:
+
+```llvm
+define internal void @callee(i8* %x) {
+start:
+  %x.dbg.spill = bitcast i8* %x to float*
+  ; ...
+  ret void
+}
+```
+
+When Enzyme analyzes this function (with appropriate flags like `-enzyme-rust-type`), it might produce the following type information for the argument `%x` and the result of the bitcast:
+
+```llvm
+i8* %x: {[-1]:Pointer, [-1,0]:Float@float}
+%x.dbg.spill = bitcast i8* %x to float*: {[-1]:Pointer, [-1,0]:Float@float}
+```
+
+**Understanding the Type Tree: `{[-1]:Pointer, [-1,0]:Float@float}`**
+
+This string is the type tree representation. Let's break it down:
+
+*   **`{ ... }`**: This encloses the set of type information for the variable.
+*   **`[-1]:Pointer`**:
+    *   `[-1]` is an index or path. In this context, `-1` often refers to the base memory location or the immediate value pointed to.
+    *   `Pointer` indicates that the variable `%x` itself is treated as a pointer.
+*   **`[-1,0]:Float@float`**:
+    *   `[-1,0]` is a path. It means: start with the base item `[-1]` (the pointer), and then look at offset `0` from the memory location it points to.
+    *   `Float` is the `CConcreteType` (from `enzyme_ffi.rs`, corresponding to `DT_Float`). It signifies that the data at this location is a floating-point number.
+    *   `@float` is a subtype or specific variant of `Float`. In this case, it specifies a single-precision float (like Rust's `f32`).
+
+A reference to an `f64` (e.g., `&f64`) is handled very similarly. The LLVM IR might cast to `double*`:
+```llvm
+define internal void @callee(i8* %x) {
+start:
+  %x.dbg.spill = bitcast i8* %x to double*
+  ; ...
+  ret void
+}
+```
+
+And the type tree would be:
+
+```llvm
+i8* %x: {[-1]:Pointer, [-1,0]:Float@double}
+```
+The key difference is `@double`, indicating a double-precision float.
+
+This level of detail allows Enzyme to know, for example, that if `x` is an active variable in differentiation, the floating-point value it points to needs to be handled according to AD rules for its specific precision.
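
To make the notation concrete, here is a minimal Rust sketch that splits a printed type tree such as `{[-1]:Pointer, [-1,0]:Float@float}` into its path and type components. The `Entry` struct and the `parse_type_tree` function are hypothetical names chosen for illustration; this is not Enzyme's or rustc's parser, and it only handles the flat `{[path]:Kind@subtype, ...}` shape shown in the examples above.

```rust
/// One entry of a printed type tree, e.g. `[-1,0]:Float@float`.
/// (Hypothetical type, used only for this illustration.)
#[derive(Debug, PartialEq)]
struct Entry {
    /// The index path, e.g. `[-1, 0]`.
    path: Vec<i64>,
    /// The concrete type name, e.g. `Pointer` or `Float`.
    kind: String,
    /// The optional part after `@`, e.g. `float` or `double`.
    subtype: Option<String>,
}

/// Parses the flat textual form `{[p0,p1]:Kind@sub, ...}`.
/// Returns `None` if the input does not match that shape.
fn parse_type_tree(s: &str) -> Option<Vec<Entry>> {
    let inner = s.trim().strip_prefix('{')?.strip_suffix('}')?;
    let mut entries = Vec::new();
    let mut rest = inner.trim();
    while !rest.is_empty() {
        // Each entry starts with a bracketed path followed by `:`.
        let after_open = rest.strip_prefix('[')?;
        let (path_str, after_path) = after_open.split_once("]:")?;
        let path = path_str
            .split(',')
            .map(|p| p.trim().parse::<i64>().ok())
            .collect::<Option<Vec<_>>>()?;
        // The type name runs until the next comma (assumed not to contain one).
        let (ty, tail) = match after_path.split_once(',') {
            Some((ty, tail)) => (ty.trim(), tail.trim_start()),
            None => (after_path.trim(), ""),
        };
        let (kind, subtype) = match ty.split_once('@') {
            Some((kind, sub)) => (kind.to_string(), Some(sub.to_string())),
            None => (ty.to_string(), None),
        };
        entries.push(Entry { path, kind, subtype });
        rest = tail;
    }
    Some(entries)
}

fn main() {
    let tree = "{[-1]:Pointer, [-1,0]:Float@float}";
    for entry in parse_type_tree(tree).expect("well-formed type tree") {
        println!("{:?} -> {} {:?}", entry.path, entry.kind, entry.subtype);
    }
}
```

Running the same function on the `&f64` tree differs only in the subtype of the second entry, which becomes `double`.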
+
+### Compound Types
+
+#### Structs
+
+Consider a Rust struct `T` with two `f32` fields (e.g., a reference `&T`):
+
+```rust
+struct T {
+    x: f32,
+    y: f32,
+}
+
+// And a function taking a reference to it:
+// fn callee(t: &T) { /* ... */ }
+```
+
+In LLVM IR, a pointer to this struct might be initially represented as `i8*` and then cast to the specific struct type, like `{ float, float }*`:
+
+```llvm
+define internal void @callee(i8* %t) {
+start:
+  %t.dbg.spill = bitcast i8* %t to { float, float }*
+  ; ...
+  ret void
+}
+```
+
+The Enzyme type analysis output for `%t` would be:
+
+```llvm
+i8* %t: {[-1]:Pointer, [-1,0]:Float@float, [-1,4]:Float@float}
+```
+
+**Understanding the Struct Type Tree: `{[-1]:Pointer, [-1,0]:Float@float, [-1,4]:Float@float}`**
+
+*   **`[-1]:Pointer`**: As before, this indicates that `%t` is a pointer.
+*   **`[-1,0]:Float@float`**:
+    *   This describes the first field of the struct (`x`).
+    *   `[-1,0]` means: from the memory location pointed to by `%t` (`-1`), at offset `0` bytes.
+    *   `Float@float` indicates this field is an `f32`.
+*   **`[-1,4]:Float@float`**:
+    *   This describes the second field of the struct (`y`).
+    *   `[-1,4]` means: from the memory location pointed to by `%t` (`-1`), at offset `4` bytes.
+    *   `Float@float` indicates this field is also an `f32`.
+
+The offset `4` comes from the size of the first field (`f32` is 4 bytes). If the first field were, for example, an `f64` (8 bytes), the second field might be at offset `[-1,8]`. Enzyme uses these offsets to pinpoint the exact memory location of each field within the struct.
+
+This detailed mapping is crucial for Enzyme to correctly track the activity of individual struct fields during automatic differentiation.
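
The byte offsets in the struct example above can be reproduced from Rust itself. The following sketch assumes `#[repr(C)]` so that the field order is fixed (Rust's default layout does not guarantee it) and uses `std::mem::offset_of!` to build the same textual form. The helper `pointer_type_tree` and the second struct `U` are hypothetical and exist only for this illustration; they are not part of rustc or Enzyme.

```rust
use std::mem::offset_of;

/// The struct from the example, with a fixed C layout (an assumption of this sketch).
#[allow(dead_code)]
#[repr(C)]
struct T {
    x: f32,
    y: f32,
}

/// A variant whose first field is an `f64`, to show how the second offset moves.
#[allow(dead_code)]
#[repr(C)]
struct U {
    a: f64,
    b: f32,
}

/// Hypothetical helper: formats the type tree of a pointer to a struct of floats
/// from `(byte offset, subtype)` pairs.
fn pointer_type_tree(fields: &[(usize, &str)]) -> String {
    let mut parts = vec!["[-1]:Pointer".to_string()];
    for (offset, subtype) in fields {
        parts.push(format!("[-1,{offset}]:Float@{subtype}"));
    }
    format!("{{{}}}", parts.join(", "))
}

fn main() {
    let t = pointer_type_tree(&[
        (offset_of!(T, x), "float"),
        (offset_of!(T, y), "float"),
    ]);
    // Prints {[-1]:Pointer, [-1,0]:Float@float, [-1,4]:Float@float}
    println!("{t}");

    let u = pointer_type_tree(&[
        (offset_of!(U, a), "double"),
        (offset_of!(U, b), "float"),
    ]);
    // The second field now sits at offset 8, after the 8-byte f64.
    println!("{u}");
}
```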
--- a/src/building/new-target.md
+++ b/src/building/new-target.md
@ -174,8 +174,8 @@ compiler, you can use it instead of the JSON file for both arguments.
 ## Promoting a target from tier 2 (target) to tier 2 (host)

 There are two levels of tier 2 targets:
-  a) Targets that are only cross-compiled (`rustup target add`)
-  b) Targets that [have a native toolchain][tier2-native] (`rustup toolchain install`)
+- Targets that are only cross-compiled (`rustup target add`)
+- Targets that [have a native toolchain][tier2-native] (`rustup toolchain install`)

 [tier2-native]: https://doc.rust-lang.org/nightly/rustc/target-tier-policy.html#tier-2-with-host-tools

--- a/src/diagnostics.md
+++ b/src/diagnostics.md
@ -553,7 +553,7 @@ compiler](#linting-early-in-the-compiler).


 [AST nodes]: the-parser.md
-[AST lowering]: ast-lowering.md
+[AST lowering]: ./hir/lowering.md
 [HIR nodes]: hir.md
 [MIR nodes]: mir/index.md
 [macro expansion]: macro-expansion.md
--- a/src/hir.md
+++ b/src/hir.md
@ -5,7 +5,7 @@
 The HIR – "High-Level Intermediate Representation" – is the primary IR used
 in most of rustc. It is a compiler-friendly representation of the abstract
 syntax tree (AST) that is generated after parsing, macro expansion, and name
-resolution (see [Lowering](./ast-lowering.html) for how the HIR is created).
+resolution (see [Lowering](./hir/lowering.md) for how the HIR is created).
 Many parts of HIR resemble Rust surface syntax quite closely, with
 the exception that some of Rust's expression forms have been desugared away.
 For example, `for` loops are converted into a `loop` and do not appear in
--- a/src/hir/ambig-unambig-ty-and-consts.md
+++ b/src/hir/ambig-unambig-ty-and-consts.md
@ -0,0 +1,63 @@
+# Ambig/Unambig Types and Consts
+
+Type and const args in the HIR can be in two kinds of positions: ambiguous (ambig) or unambiguous (unambig). Ambig positions are where
+it would be valid to parse either a type or a const; unambig positions are where only one kind would be valid to
+parse.
+
+```rust
+fn func<T, const N: usize>(arg: T) {
+    //                           ^ Unambig type position
+    let a: _ = arg; 
+    //     ^ Unambig type position
+
+    func::<T, N>(arg);
+    //     ^  ^
+    //     ^^^^ Ambig position 
+
+    let _: [u8; 10];
+    //      ^^  ^^ Unambig const position
+    //      ^^ Unambig type position
+}
+
+```
+
+Most types/consts in ambig positions are able to be disambiguated as either a type or const during parsing. Single segment paths are always represented as types in the AST but may get resolved to a const parameter during name resolution, then lowered to a const argument during ast-lowering. The only generic arguments which remain ambiguous after lowering are inferred generic arguments (`_`) in path segments. For example, in `Foo<_>` it is not clear whether the `_` argument is an inferred type argument, or an inferred const argument.
+
+In unambig positions, inferred arguments are represented with [`hir::TyKind::Infer`][ty_infer] or [`hir::ConstArgKind::Infer`][const_infer] depending on whether it is a type or const position respectively.
+In ambig positions, inferred arguments are represented with `hir::GenericArg::Infer`.
+
+A naive implementation of this would result in there being potentially 5 places where you might think an inferred type/const could be found in the HIR from looking at the structure of the HIR:
+1. In unambig type position as a `hir::TyKind::Infer`
+2. In unambig const arg position as a `hir::ConstArgKind::Infer`
+3. In an ambig position as a [`GenericArg::Type(TyKind::Infer)`][generic_arg_ty]
+4. In an ambig position as a [`GenericArg::Const(ConstArgKind::Infer)`][generic_arg_const]
+5. In an ambig position as a [`GenericArg::Infer`][generic_arg_infer]
+
+Note that places 3 and 4 would never actually be possible to encounter as we always lower to `GenericArg::Infer` in generic arg position. 
+
+This has a few failure modes:
+- People may write visitors which check for `GenericArg::Infer` but forget to check for `hir::TyKind/ConstArgKind::Infer`, only handling infers in ambig positions by accident.
+- People may write visitors which check for `hir::TyKind/ConstArgKind::Infer` but forget to check for `GenericArg::Infer`, only handling infers in unambig positions by accident.
+- People may write visitors which check for `GenericArg::Type/Const(TyKind/ConstArgKind::Infer)` and `GenericArg::Infer`, not realising that we never represent inferred types/consts in ambig positions as a `GenericArg::Type/Const`.
+- People may write visitors which check for *only* `TyKind::Infer` and not `ConstArgKind::Infer` forgetting that there are also inferred const arguments (and vice versa).
+
+To make writing HIR visitors less error prone when caring about inferred types/consts we have a relatively complex system:
+
+1. We have different types in the compiler for when a type or const is in an unambig or ambig position: `hir::Ty<()>` and `hir::Ty<AmbigArg>` respectively. [`AmbigArg`][ambig_arg] is an uninhabited type which we use in the `Infer` variant of `TyKind` and `ConstArgKind` to selectively "disable" it if we are in an ambig position.
+
+2. The [`visit_ty`][visit_ty] and [`visit_const_arg`][visit_const_arg] methods on HIR visitors only accept the ambig position versions of types/consts. Unambig types/consts are implicitly converted to ambig types/consts during the visiting process, with the `Infer` variant handled by a dedicated [`visit_infer`][visit_infer] method.
+
+This has a number of benefits:
+- It's clear that `GenericArg::Type/Const` cannot represent inferred type/const arguments
+- Implementors of `visit_ty` and `visit_const_arg` will never encounter inferred types/consts, making it impossible to write a visitor that seems to work right but handles edge cases wrong
+- The `visit_infer` method handles *all* cases of inferred type/consts in the HIR making it easy for visitors to handle inferred type/consts in one dedicated place and not forget cases
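
The mechanism described above can be sketched with a heavily simplified model. Everything below is a stand-in written for illustration: the real `rustc_hir` types carry lifetimes, spans, and many more variants, and the function names `try_as_ambig`, `describe_ambig`, and `visit` are hypothetical. The sketch only shows how an uninhabited type parameter makes the `Infer` variant impossible to construct in the ambig version of a type.

```rust
/// Uninhabited: no value of this type can ever be constructed.
enum AmbigArg {}

/// Simplified stand-in for `hir::TyKind`. With the default parameter `()`,
/// `Infer(())` is constructible (unambig position). With `AmbigArg`, it is not.
enum TyKind<Unambig = ()> {
    Path(&'static str),
    Infer(Unambig),
}

/// Converts an unambig type into its ambig form, or returns `None` for `_`,
/// which has to be handled separately.
fn try_as_ambig(ty: TyKind<()>) -> Option<TyKind<AmbigArg>> {
    match ty {
        TyKind::Path(name) => Some(TyKind::Path(name)),
        TyKind::Infer(()) => None,
    }
}

/// Code that receives the ambig form never has a real `Infer` case to handle.
fn describe_ambig(ty: &TyKind<AmbigArg>) -> &'static str {
    match ty {
        TyKind::Path(name) => *name,
        // `never` has the uninhabited type `AmbigArg`, so this arm cannot run.
        TyKind::Infer(never) => match *never {},
    }
}

/// Mimics the visitor dispatch: inferred types go to one dedicated place,
/// everything else goes through the ambig-only path.
fn visit(ty: TyKind<()>) -> String {
    match try_as_ambig(ty) {
        Some(ambig) => format!("visit_ty({})", describe_ambig(&ambig)),
        None => "visit_infer".to_string(),
    }
}

fn main() {
    println!("{}", visit(TyKind::Path("u8")));
    println!("{}", visit(TyKind::Infer(())));
}
```

In this model a function taking `TyKind<AmbigArg>` cannot silently mishandle `_`, because no such value exists; the only way to observe an inferred type is the branch that plays the role of `visit_infer`.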
+
+[ty_infer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.TyKind.html#variant.Infer
+[const_infer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.ConstArgKind.html#variant.Infer
+[generic_arg_ty]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.GenericArg.html#variant.Type
+[generic_arg_const]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.GenericArg.html#variant.Const
+[generic_arg_infer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.GenericArg.html#variant.Infer
+[ambig_arg]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.AmbigArg.html
+[visit_ty]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/intravisit/trait.Visitor.html#method.visit_ty
+[visit_const_arg]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/intravisit/trait.Visitor.html#method.visit_const_arg
+[visit_infer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/intravisit/trait.Visitor.html#method.visit_infer
--- a/src/hir/debugging.md
+++ b/src/hir/debugging.md
--- a/src/hir/lowering.md
+++ b/src/hir/lowering.md
@ -1,6 +1,6 @@
 # AST lowering

-The AST lowering step converts AST to [HIR](hir.html).
+The AST lowering step converts AST to [HIR](../hir.md).
 This means many structures are removed if they are irrelevant
 for type analysis or similar syntax agnostic analyses. Examples
 of such structures include but are not limited to
--- a/src/offload/installation.md
+++ b/src/offload/installation.md
@ -0,0 +1,71 @@
+# Installation
+
+In the future, `std::offload` should become available in nightly builds for users. For now, everyone still needs to build rustc from source. 
+
+## Build instructions
+
+First you need to clone and configure the Rust repository:
+```bash
+git clone --depth=1 git@github.com:rust-lang/rust.git
+cd rust
+./configure --enable-llvm-link-shared --release-channel=nightly --enable-llvm-assertions --enable-offload --enable-enzyme --enable-clang --enable-lld --enable-option-checking --enable-ninja --disable-docs
+```
+
+Afterwards you can build rustc using:
+```bash
+./x.py build --stage 1 library
+```
+
+Afterwards, `rustup toolchain link` will allow you to use it through cargo:
+```
+rustup toolchain link offload build/host/stage1
+rustup toolchain install nightly # enables -Z unstable-options
+```
+
+
+
+## Build instructions for LLVM itself
+```bash
+git clone --depth=1 git@github.com:llvm/llvm-project.git 
+cd llvm-project
+mkdir build
+cd build
+cmake -G Ninja ../llvm -DLLVM_TARGETS_TO_BUILD="host,AMDGPU,NVPTX" -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_PROJECTS="clang;lld" -DLLVM_ENABLE_RUNTIMES="offload,openmp" -DLLVM_ENABLE_PLUGINS=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=.
+ninja
+ninja install
+```
+This gives you a working LLVM build.
+
+
+## Testing
+Run:
+```
+./x.py test --stage 1 tests/codegen/gpu_offload
+```
+
+## Usage
+It is important to use a clang compiler built on the same LLVM as rustc. Just calling clang without the full path will likely use your system clang, which will probably be incompatible.
+```
+/absolute/path/to/rust/build/x86_64-unknown-linux-gnu/stage1/bin/rustc --edition=2024 --crate-type cdylib src/main.rs --emit=llvm-ir  -O -C lto=fat -Cpanic=abort -Zoffload=Enable
+/absolute/path/to/rust/build/x86_64-unknown-linux-gnu/llvm/bin/clang++ -fopenmp --offload-arch=native -g  -O3 main.ll -o main -save-temps
+LIBOMPTARGET_INFO=-1  ./main
+```
+The first step will generate a `main.ll` file, which has enough instructions to cause the offload runtime to move data to and from a GPU.
+The second step will use clang as the compilation driver to compile our IR file down to a working binary. Only a very small Rust subset will work out of the box here, unless
+you use features like build-std, which are not covered by this guide. Look at the codegen test to get a feeling for how to write a working example.
+In the last step you can run your binary. If all went well, you will see a data transfer being reported:
+```
+omptarget device 0 info: Entering OpenMP data region with being_mapper at unknown:0:0 with 1 arguments:
+omptarget device 0 info: tofrom(unknown)[1024]
+omptarget device 0 info: Creating new map entry with HstPtrBase=0x00007fffffff9540, HstPtrBegin=0x00007fffffff9540, TgtAllocBegin=0x0000155547200000, TgtPtrBegin=0x0000155547200000, Size=1024, DynRefCount=1, HoldRefCount=0, Name=unknown
+omptarget device 0 info: Copying data from host to device, HstPtr=0x00007fffffff9540, TgtPtr=0x0000155547200000, Size=1024, Name=unknown
+omptarget device 0 info: OpenMP Host-Device pointer mappings after block at unknown:0:0:
+omptarget device 0 info: Host Ptr           Target Ptr         Size (B) DynRefCount HoldRefCount Declaration
+omptarget device 0 info: 0x00007fffffff9540 0x0000155547200000 1024     1           0            unknown at unknown:0:0
+// some other output
+omptarget device 0 info: Exiting OpenMP data region with end_mapper at unknown:0:0 with 1 arguments:
+omptarget device 0 info: tofrom(unknown)[1024]
+omptarget device 0 info: Mapping exists with HstPtrBegin=0x00007fffffff9540, TgtPtrBegin=0x0000155547200000, Size=1024, DynRefCount=0 (decremented, delayed deletion), HoldRefCount=0
+omptarget device 0 info: Copying data from device to host, TgtPtr=0x0000155547200000, HstPtr=0x00007fffffff9540, Size=1024, Name=unknown
+omptarget device 0 info: Removing map entry with HstPtrBegin=0x00007fffffff9540, TgtPtrBegin=0x0000155547200000, Size=1024, Name=unknown
+```
--- a/src/offload/internals.md
+++ b/src/offload/internals.md
@ -0,0 +1,9 @@
+# std::offload
+
+This module is under active development. Once upstream, it should allow Rust developers to run Rust code on GPUs.
+We aim to develop a `rusty` GPU programming interface, which is safe, convenient and sufficiently fast by default.
+This includes automatic data movement to and from the GPU, in an efficient way. We will (later)
+also offer more advanced, possibly unsafe, interfaces which allow a higher degree of control.
+
+The implementation is based on LLVM's "offload" project, which is already used by OpenMP to run Fortran or C++ code on GPUs.
+While the project is under development, users will need to call other compilers like clang to finish the compilation process.
--- a/src/overview.md
+++ b/src/overview.md
@ -410,7 +410,7 @@ For more details on bootstrapping, see
  - Guide: [The HIR](hir.md)
  - Guide: [Identifiers in the HIR](hir.md#identifiers-in-the-hir)
  - Guide: [The `HIR` Map](hir.md#the-hir-map)
-  - Guide: [Lowering `AST` to `HIR`](ast-lowering.md)
+  - Guide: [Lowering `AST` to `HIR`](./hir/lowering.md)
  - How to view `HIR` representation for your code `cargo rustc -- -Z unpretty=hir-tree`
  - Rustc `HIR` definition: [`rustc_hir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html)
  - Main entry point: **TODO**
--- a/src/profiling/with_perf.md
+++ b/src/profiling/with_perf.md
@ -7,8 +7,8 @@ This is a guide for how to profile rustc with [perf](https://perf.wiki.kernel.or
 - Get a clean checkout of rust-lang/master, or whatever it is you want
  to profile.
 - Set the following settings in your `bootstrap.toml`:
-  - `debuginfo-level = 1` - enables line debuginfo
-  - `jemalloc = false` - lets you do memory use profiling with valgrind
+  - `rust.debuginfo-level = 1` - enables line debuginfo
+  - `rust.jemalloc = false` - lets you do memory use profiling with valgrind
  - leave everything else the defaults
 - Run `./x build` to get a full build
 - Make a rustup toolchain pointing to that result
Author	SHA1	Message	Date
Karan Janthe	b3605e5e9d	Merge `fc47bce9f7` into `4c6d66ccb0`	2025-06-19 16:57:50 +08:00
Manuel Drehwald	4c6d66ccb0	Merge pull request #2447 from rust-lang/offload-docs initial instructions for gpu offload	2025-06-18 17:24:36 -07:00
Manuel Drehwald	4233695fea	initial instructions for gpu offload	2025-06-18 17:22:50 -07:00
Tshepang Mbambo	33eaf36815	Merge pull request #2476 from rust-lang/tshepang-patch-1 fix markup	2025-06-19 00:04:36 +02:00
Tshepang Mbambo	980acc5eee	fix markup That was intended to be a list. Also, the order is not relevant.	2025-06-19 00:03:33 +02:00
Boxy	e0a39188f1	Merge pull request #2474 from BoxyUwU/ambig_unambig_ty_consts Document Ambig vs Unambig Type/Consts	2025-06-18 15:30:14 +01:00
Boxy	9d7ba8573d	Reviews	2025-06-18 15:28:44 +01:00
Boxy	c963b4ad93	Add links	2025-06-17 18:09:06 +01:00
Boxy	a02af2f135	Write chapter on Unambig vs Ambig Types/Consts	2025-06-17 18:09:06 +01:00
Boxy	4185dca095	Stub chapter and consolidate under `/hir/`	2025-06-17 18:09:02 +01:00
nora	a2c80e6e23	Merge pull request #2475 from lolbinarycat/patch-3 Profiling with perf: specify the section of bootstrap settings.	2025-06-17 18:34:03 +02:00
lolbinarycat	7b921990fc	Profiling with perf: specify the section of bootstrap settings.	2025-06-17 11:31:04 -05:00
Karan Janthe	fc47bce9f7	Merge branch 'rust-lang:master' into master	2025-05-13 20:32:47 +05:30
karan	733773ce12	basic type docs for auto diff	2025-05-13 20:32:08 +05:30