Compare commits

...

8 Commits

Author SHA1 Message Date
xizheyin 42e6a03e82
Merge 23d77abfc0 into 4c6d66ccb0 2025-06-19 16:57:50 +08:00
Manuel Drehwald 4c6d66ccb0
Merge pull request #2447 from rust-lang/offload-docs
initial instructions for gpu offload
2025-06-18 17:24:36 -07:00
Manuel Drehwald 4233695fea initial instructions for gpu offload 2025-06-18 17:22:50 -07:00
Tshepang Mbambo 33eaf36815
Merge pull request #2476 from rust-lang/tshepang-patch-1
fix markup
2025-06-19 00:04:36 +02:00
Tshepang Mbambo 980acc5eee
fix markup
That was intended to be a list.

Also, the order is not relevant.
2025-06-19 00:03:33 +02:00
xizheyin 23d77abfc0
Add Section How queries interact with external crate metadata
Signed-off-by: xizheyin <xizheyin@smail.nju.edu.cn>
2025-06-16 23:47:04 +08:00
xizheyin e03ee80811
change key in Provider example (local) into `LocalDefId`
Signed-off-by: xizheyin <xizheyin@smail.nju.edu.cn>
2025-06-15 21:03:22 +08:00
xizheyin 07e05bbc46 Refinement of Providers into Providers and ExternProviders
Signed-off-by: xizheyin <xizheyin@smail.nju.edu.cn>
2025-06-15 20:59:08 +08:00
5 changed files with 161 additions and 35 deletions

View File

@ -101,6 +101,8 @@
- [The `rustdoc` test suite](./rustdoc-internals/rustdoc-test-suite.md)
- [The `rustdoc-gui` test suite](./rustdoc-internals/rustdoc-gui-test-suite.md)
- [The `rustdoc-json` test suite](./rustdoc-internals/rustdoc-json-test-suite.md)
- [GPU offload internals](./offload/internals.md)
- [Installation](./offload/installation.md)
- [Autodiff internals](./autodiff/internals.md)
- [Installation](./autodiff/installation.md)
- [How to debug](./autodiff/debugging.md)

View File

@ -174,8 +174,8 @@ compiler, you can use it instead of the JSON file for both arguments.
## Promoting a target from tier 2 (target) to tier 2 (host)
There are two levels of tier 2 targets:
a) Targets that are only cross-compiled (`rustup target add`)
b) Targets that [have a native toolchain][tier2-native] (`rustup toolchain install`)
- Targets that are only cross-compiled (`rustup target add`)
- Targets that [have a native toolchain][tier2-native] (`rustup toolchain install`)
[tier2-native]: https://doc.rust-lang.org/nightly/rustc/target-tier-policy.html#tier-2-with-host-tools

View File

@ -0,0 +1,71 @@
# Installation
In the future, `std::offload` should become available in nightly builds for users. For now, everyone still needs to build rustc from source.
## Build instructions
First you need to clone and configure the Rust repository:
```bash
git clone --depth=1 git@github.com:rust-lang/rust.git
cd rust
./configure --enable-llvm-link-shared --release-channel=nightly --enable-llvm-assertions --enable-offload --enable-enzyme --enable-clang --enable-lld --enable-option-checking --enable-ninja --disable-docs
```
Afterwards you can build rustc using:
```bash
./x.py build --stage 1 library
```
Afterwards rustc toolchain link will allow you to use it through cargo:
```
rustup toolchain link offload build/host/stage1
rustup toolchain install nightly # enables -Z unstable-options
```
## Build instruction for LLVM itself
```bash
git clone --depth=1 git@github.com:llvm/llvm-project.git
cd llvm-project
mkdir build
cd build
cmake -G Ninja ../llvm -DLLVM_TARGETS_TO_BUILD="host,AMDGPU,NVPTX" -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_PROJECTS="clang;lld" -DLLVM_ENABLE_RUNTIMES="offload,openmp" -DLLVM_ENABLE_PLUGINS=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=.
ninja
ninja install
```
This gives you a working LLVM build.
## Testing
run
```
./x.py test --stage 1 tests/codegen/gpu_offload
```
## Usage
It is important to use a clang compiler build on the same llvm as rustc. Just calling clang without the full path will likely use your system clang, which probably will be incompatible.
```
/absolute/path/to/rust/build/x86_64-unknown-linux-gnu/stage1/bin/rustc --edition=2024 --crate-type cdylib src/main.rs --emit=llvm-ir -O -C lto=fat -Cpanic=abort -Zoffload=Enable
/absolute/path/to/rust/build/x86_64-unknown-linux-gnu/llvm/bin/clang++ -fopenmp --offload-arch=native -g -O3 main.ll -o main -save-temps
LIBOMPTARGET_INFO=-1 ./main
```
The first step will generate a `main.ll` file, which has enough instructions to cause the offload runtime to move data to and from a gpu.
The second step will use clang as the compilation driver to compile our IR file down to a working binary. Only a very small Rust subset will work out of the box here, unless
you use features like build-std, which are not covered by this guide. Look at the codegen test to get a feeling for how to write a working example.
In the last step you can run your binary, if all went well you will see a data transfer being reported:
```
omptarget device 0 info: Entering OpenMP data region with being_mapper at unknown:0:0 with 1 arguments:
omptarget device 0 info: tofrom(unknown)[1024]
omptarget device 0 info: Creating new map entry with HstPtrBase=0x00007fffffff9540, HstPtrBegin=0x00007fffffff9540, TgtAllocBegin=0x0000155547200000, TgtPtrBegin=0x0000155547200000, Size=1024, DynRefCount=1, HoldRefCount=0, Name=unknown
omptarget device 0 info: Copying data from host to device, HstPtr=0x00007fffffff9540, TgtPtr=0x0000155547200000, Size=1024, Name=unknown
omptarget device 0 info: OpenMP Host-Device pointer mappings after block at unknown:0:0:
omptarget device 0 info: Host Ptr Target Ptr Size (B) DynRefCount HoldRefCount Declaration
omptarget device 0 info: 0x00007fffffff9540 0x0000155547200000 1024 1 0 unknown at unknown:0:0
// some other output
omptarget device 0 info: Exiting OpenMP data region with end_mapper at unknown:0:0 with 1 arguments:
omptarget device 0 info: tofrom(unknown)[1024]
omptarget device 0 info: Mapping exists with HstPtrBegin=0x00007fffffff9540, TgtPtrBegin=0x0000155547200000, Size=1024, DynRefCount=0 (decremented, delayed deletion), HoldRefCount=0
omptarget device 0 info: Copying data from device to host, TgtPtr=0x0000155547200000, HstPtr=0x00007fffffff9540, Size=1024, Name=unknown
omptarget device 0 info: Removing map entry with HstPtrBegin=0x00007fffffff9540, TgtPtrBegin=0x0000155547200000, Size=1024, Name=unknown
```

9
src/offload/internals.md Normal file
View File

@ -0,0 +1,9 @@
# std::offload
This module is under active development. Once upstream, it should allow Rust developers to run Rust code on GPUs.
We aim to develop a `rusty` GPU programming interface, which is safe, convenient and sufficiently fast by default.
This includes automatic data movement to and from the GPU, in a efficient way. We will (later)
also offer more advanced, possibly unsafe, interfaces which allow a higher degree of control.
The implementation is based on LLVM's "offload" project, which is already used by OpenMP to run Fortran or C++ code on GPUs.
While the project is under development, users will need to call other compilers like clang to finish the compilation process.

View File

@ -71,22 +71,24 @@ are cheaply cloneable; insert an `Rc` if necessary).
If, however, the query is *not* in the cache, then the compiler will
call the corresponding **provider** function. A provider is a function
implemented in a specific module and **manually registered** into the
[`Providers`][providers_struct] struct during compiler initialization.
The macro system generates the [`Providers`][providers_struct] struct,
which acts as a function table for all query implementations, where each
implemented in a specific module and **manually registered** into either
the [`Providers`][providers_struct] struct (for local crate queries) or
the [`ExternProviders`][extern_providers_struct] struct (for external crate queries)
during compiler initialization. The macro system generates both structs,
which act as function tables for all query implementations, where each
field is a function pointer to the actual provider.
**Note:** The `Providers` struct is generated by macros and acts as a function table for all query implementations.
It is **not** a Rust trait, but a plain struct with function pointer fields.
**Note:** Both the `Providers` and `ExternProviders` structs are generated by macros and act as function tables for all query implementations.
They are **not** Rust traits, but plain structs with function pointer fields.
**Providers are defined per-crate.** The compiler maintains,
internally, a table of providers for every crate, at least
conceptually. Right now, there are really two sets: the providers for
queries about the **local crate** (that is, the one being compiled)
and providers for queries about **external crates** (that is,
dependencies of the local crate). Note that what determines the crate
that a query is targeting is not the *kind* of query, but the *key*.
conceptually. There are two sets of providers:
- The `Providers` struct for queries about the **local crate** (that is, the one being compiled)
- The `ExternProviders` struct for queries about **external crates** (that is,
dependencies of the local crate)
Note that what determines the crate that a query is targeting is not the *kind* of query, but the *key*.
For example, when you invoke `tcx.type_of(def_id)`, that could be a
local query or an external query, depending on what crate the `def_id`
is referring to (see the [`self::keys::Key`][Key] trait for more
@ -119,22 +121,22 @@ they define both a `provide` and a `provide_extern` function, through
### How providers are set up
When the tcx is created, it is given the providers by its creator using
the [`Providers`][providers_struct] struct. This struct is generated by
the macros here, but it is basically a big list of function pointers:
[providers_struct]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/query/struct.Providers.html
When the tcx is created, it is given both the local and external providers by its creator using
the `Providers` struct from `rustc_middle::util`. This struct contains both the local and external providers:
```rust,ignore
struct Providers {
type_of: for<'tcx> fn(TyCtxt<'tcx>, DefId) -> Ty<'tcx>,
// ... one field for each query
pub struct Providers {
pub queries: crate::query::Providers, // Local crate providers
pub extern_queries: crate::query::ExternProviders, // External crate providers
pub hooks: crate::hooks::Providers,
}
```
Each of these provider structs is generated by the macros and contains function pointers for their respective queries.
#### How are providers registered?
The `Providers` struct is filled in during compiler initialization, mainly by the `rustc_driver` crate.
The provider structs are filled in during compiler initialization, mainly by the `rustc_driver` crate.
But the actual provider functions are implemented in various `rustc_*` crates (like `rustc_middle`, `rustc_hir_analysis`, etc).
To register providers, each crate exposes a [`provide`][provide_fn] function that looks like this:
@ -142,17 +144,20 @@ To register providers, each crate exposes a [`provide`][provide_fn] function tha
[provide_fn]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/fn.provide.html
```rust,ignore
pub fn provide(providers: &mut Providers) {
*providers = Providers {
type_of,
// ... add more providers here
..*providers
};
pub fn provide(providers: &mut rustc_middle::util::Providers) {
providers.queries.type_of = type_of;
// ... add more local providers here
providers.extern_queries.type_of = extern_type_of;
// ... add more external providers here
providers.hooks.some_hook = some_hook;
// ... add more hooks here
}
```
- This function takes a mutable reference to the `Providers` struct and sets the fields to point to the correct provider functions.
- You can also assign fields individually, e.g. `providers.type_of = type_of;`.
- You can assign fields individually for each provider type (local, external, and hooks).
#### Adding a new provider
@ -160,18 +165,57 @@ Suppose you want to add a new query called `fubar`. You would:
1. Implement the provider function:
```rust,ignore
fn fubar<'tcx>(tcx: TyCtxt<'tcx>, key: DefId) -> Fubar<'tcx> { ... }
fn fubar<'tcx>(tcx: TyCtxt<'tcx>, key: LocalDefId) -> Fubar<'tcx> { ... }
```
2. Register it in the `provide` function:
```rust,ignore
pub fn provide(providers: &mut Providers) {
*providers = Providers {
fubar,
..*providers
};
pub fn provide(providers: &mut rustc_middle::util::Providers) {
providers.queries.fubar = fubar;
}
```
### How queries interact with external crate metadata
When a query is made for an external crate (i.e., a dependency), the query system needs to load the information from that crate's metadata.
This is handled by the [`rustc_metadata` crate][rustc_metadata], which is responsible for decoding and providing the information stored in the `.rmeta` files.
The process works like this:
1. When a query is made, the query system first checks if the `DefId` refers to a local or external crate by checking if `def_id.krate == LOCAL_CRATE`.
This determines whether to use the local provider from [`Providers`][providers_struct] or the external provider from [`ExternProviders`][extern_providers_struct].
2. For external crates, the query system will look for a provider in the [`ExternProviders`][extern_providers_struct] struct.
The `rustc_metadata` crate registers these external providers through the `provide_extern` function in `rustc_metadata/src/rmeta/decoder/cstore_impl.rs`. Just like:
```rust
pub fn provide_extern(providers: &mut ExternProviders) {
providers.foo = |tcx, def_id| {
// Load and decode metadata for external crate
let cdata = CStore::from_tcx(tcx).get_crate_data(def_id.krate);
cdata.foo(def_id.index)
};
// Register other external providers...
}
```
3. The metadata is stored in a binary format in `.rmeta` files that contains pre-computed information about the external crate, such as types, function signatures, trait implementations, and other information needed by the compiler. When an external query is made, the `rustc_metadata` crate:
- Loads the `.rmeta` file for the external crate
- Decodes the metadata using the `Decodable` trait
- Returns the decoded information to the query system
This approach avoids recompiling external crates, allows for faster compilation of dependent crates, and enables incremental compilation to work across crate boundaries.
Here is a simplified example, when you call `tcx.type_of(def_id)` for a type defined in an external crate, the query system will:
1. Detect that the `def_id` refers to an external crate by checking `def_id.krate != LOCAL_CRATE`
2. Call the appropriate provider from `ExternProviders` which was registered by `rustc_metadata`
3. The provider will load and decode the type information from the external crate's metadata
4. Return the decoded type to the caller
This is why most `rustc_*` crates only need to provide local providers - the external providers are handled by the metadata system.
The only exception is when a crate needs to provide special handling for external queries, in which case it would implement both local and external providers.
[rustc_metadata]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_metadata/index.html
[providers_struct]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/query/struct.Providers.html
[extern_providers_struct]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/query/struct.ExternProviders.html
---
## Adding a new query