Prepend temp files with per-invocation random string to avoid temp filename conflicts

https://github.com/rust-lang/rust/issues/139407 uncovered a very subtle unsoundness with incremental codegen, failing compilation sessions (due to assembler errors), and the "prefer hard linking over copying files" strategy we use in the compiler for file management.

Specifically, imagine we're building a single file 3 times, all with `-Csave-temps -Cincremental=...`. Let's call the object file we're building for the codegen unit for `main` "`XXX.o`" just for clarity since it's probably some gigantic hash name:

```
#[inline(never)]
#[cfg(any(rpass1, rpass3))]
fn a() -> i32 {
    0
}

#[cfg(any(cfail2))]
fn a() -> i32 {
    1
}

fn main() {
    evil::evil();
    assert_eq!(a(), 0);
}

mod evil {
    #[cfg(any(rpass1, rpass3))]
    pub fn evil() {
        unsafe {
            std::arch::asm!("/* */");
        }
    }

    #[cfg(any(cfail2))]
    pub fn evil() {
        unsafe {
            std::arch::asm!("missing");
        }
    }
}
```

Session 1 (`rpass1`):
* Type-check, borrow-check, etc.
* Serialize the dep graph to the incremental working directory `.../s-...-working/`.
* Codegen object file to a temp file `XXX.rcgu.o` which is spit out in the cwd.
* Hard-link[^1] `XXX.rcgu.o` to the incremental working directory `.../s-...-working/XXX.o`.
* Save-temps option means we don't delete `XXX.rcgu.o`.
* Link the binary and stuff.
* Finalize[^2] the working incremental session by renaming `.../s-...-working` to `s-...-asjkdhsjakd` (some other finalized incr comp session dir name).

Session 2 (`cfail2`):
* Load artifacts from the previous *finalized* incremental session, namely the dep graph.
* Type-check, borrow-check, etc. since the file has changed, so most dep graph nodes are red.
* Serialize the dep graph to the incremental working directory `.../s-...-working/`.
* Codegen object file to a temp file `XXX.rcgu.o`. **HERE IS THE PROBLEM**: The hard-link is still set up to point to the inode from `XXX.o` from the first session, so this also modifies the `XXX.o` in the previous finalized session directory.
* Codegen emits an error b/c `missing` is not an instruction, so we abort before finalizing the incremental session. Specifically, this means that the *previous* session is the last finalized session.

Session 3 (`rpass3`):
* Load artifacts from the previous *finalized* incremental session, namely the dep graph. NOTE that this is from session 1.
* All the dep graph nodes are green since we are basically replaying session 1.
* Codegen object file `XXX.o`, which is detected as *reused* from session 1 since dep nodes were green. That means we **reuse** `XXX.o` which had been dirtied from session 2.
* Link the binary and stuff.

This results in a binary which reuses some of the build artifacts from session 2, but thinks it's from session 1.

At this point, I hope it's clear to see that the incremental results from session 1 were dirtied from session 2, but we reuse them as if session 1 was the previous (finalized) incremental session we ran. This is at best really buggy, and at worst **unsound**.

This isn't limited to `-C save-temps`, since there are other combinations of flags that may keep around temporary files (hard linked) in the working directory (like `-C debuginfo=1 -C split-debuginfo=unpacked` on darwin, for example).

---

This PR implements a fix which is to prepend temp filenames with a random string that is generated per invocation of rustc. This string is not *deterministic*, but temporary files are transient anyways, so I don't believe this is a problem.

That means that temp files are now something like... `{crate-name}.{cgu}.{invocation_temp}.rcgu.o`, where `{invocation_temp}` is the new temporary string we generate per invocation of rustc.

Fixes https://github.com/rust-lang/rust/issues/139407
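The hard-link hazard described in the commit message above can be reproduced outside the compiler. The following is a minimal sketch using only the standard library, not rustc's actual implementation: `invocation_temp`, `temp_object_name`, and `simulate` are hypothetical helpers, the file contents are placeholder strings, and the token generator (a randomly seeded `RandomState`) is an assumption standing in for whatever rustc uses internally.

```rust
use std::collections::hash_map::RandomState;
use std::fs;
use std::hash::{BuildHasher, Hasher};
use std::io;
use std::path::Path;

/// Hypothetical stand-in for the per-invocation random string.
fn invocation_temp() -> String {
    // `RandomState` is randomly seeded, so even an empty hash yields a
    // value that differs between calls and between processes.
    let n = RandomState::new().build_hasher().finish();
    format!("{n:016x}")
}

/// Hypothetical helper mirroring the `{crate-name}.{cgu}.{invocation_temp}.rcgu.o` shape.
fn temp_object_name(krate: &str, cgu: &str, token: &str) -> String {
    format!("{krate}.{cgu}.{token}.rcgu.o")
}

/// Simulates two sessions and returns the contents of the "finalized"
/// object file after the second session has written its temp file.
fn simulate(dir: &Path, session1_temp: &str, session2_temp: &str) -> io::Result<String> {
    let finalized = dir.join("finalized");
    fs::create_dir_all(&finalized)?;
    let saved = finalized.join("XXX.o");

    // Session 1: write the temp object, then hard-link it into the session dir.
    let temp1 = dir.join(session1_temp);
    fs::write(&temp1, "session 1 object")?;
    fs::hard_link(&temp1, &saved)?;

    // Session 2: write its temp object. If the name is the same, the existing
    // file is truncated in place, so the hard-linked copy changes too.
    fs::write(dir.join(session2_temp), "session 2 object")?;

    fs::read_to_string(&saved)
}
```

Calling `simulate` with the same temp name for both sessions returns the session 2 contents, which is the dirtied-artifact situation from the walkthrough. Calling it with two names built by `temp_object_name` from distinct `invocation_temp()` tokens leaves the finalized file with the session 1 contents.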
README.md
This is a collaborative effort to build a guide that explains how rustc works. The aim of the guide is to help new contributors get oriented to rustc, as well as to help more experienced folks in figuring out some new part of the compiler that they haven't worked on before.
You can read the latest version of the guide here.
You may also find the rustdocs for the compiler itself useful. Note that these are not intended as a guide; it's recommended that you search for the docs you're looking for instead of reading them top to bottom.
For documentation on developing the standard library, see std-dev-guide.
Contributing to the guide
The guide is useful today, but it has a lot of work still to go.
If you'd like to help improve the guide, we'd love to have you! You can find plenty of issues on the issue tracker. Just post a comment on the issue you would like to work on to make sure that we don't accidentally duplicate work. If you think something is missing, please open an issue about it!
If you don't know how the compiler works, that is not a problem! In that case, we will schedule a bit of time for you to talk with someone who does know the code, or who wants to pair with you and figure it out. Then you can work on writing up what you learned.
In general, when writing about a particular part of the compiler's code, we recommend that you link to the relevant parts of the rustc rustdocs.
Build Instructions
To build a local static HTML site, install mdbook with:
> cargo install mdbook mdbook-linkcheck2 mdbook-toc mdbook-mermaid
and execute the following command in the root of the repository:
> mdbook build --open
The build files are found in the book/html directory.
Link Validations
We use mdbook-linkcheck2 to validate URLs included in our documentation. Link
checking is not run by default locally, though it is in CI. To enable it
locally, set the environment variable ENABLE_LINKCHECK=1 like in the
following example.
$ ENABLE_LINKCHECK=1 mdbook serve
Table of Contents
We use mdbook-toc to auto-generate TOCs for long sections. You can invoke the preprocessor by
including the <!-- toc --> marker at the place where you want the TOC.
Synchronizing josh subtree with rustc
This repository is linked to rust-lang/rust as a josh subtree. You can use the following commands to synchronize the subtree in both directions.
You'll need to install josh-proxy locally via
cargo +stable install josh-proxy --git https://github.com/josh-project/josh --tag r24.10.04
Older versions of josh-proxy may not round-trip commits losslessly, so it is important to install this exact version.
Pull changes from rust-lang/rust into this repository
- Check out a new branch that will be used to create a PR into rust-lang/rustc-dev-guide
- Run the pull command
  $ cargo run --manifest-path josh-sync/Cargo.toml rustc-pull
- Push the branch to your fork and create a PR into rustc-dev-guide
Push changes from this repository into rust-lang/rust
- Run the push command to create a branch named <branch-name> in a rustc fork under the <gh-username> account
  $ cargo run --manifest-path josh-sync/Cargo.toml rustc-push <branch-name> <gh-username>
- Create a PR from <branch-name> into rust-lang/rust
Minimal git config
To keep the implementation simple, the josh-sync script calls out to system git. This means that the git invocation may be influenced by global (or local) git configuration.
If your global git config sets fetch.prunetags = true, you may observe "Nothing to pull" even when you know rustc-pull has something to pull. Other configurations may cause unexpected outcomes as well.
To minimize the likelihood of this happening, you may wish to keep a separate minimal git config that only has [user] entries from global git config, then repoint system git to use the minimal git config instead. E.g.
$ GIT_CONFIG_GLOBAL=/path/to/minimal/gitconfig GIT_CONFIG_SYSTEM='' cargo +stable run --manifest-path josh-sync/Cargo.toml -- rustc-pull
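For a tool that shells out to git, the same isolation can be applied per command instead of in the calling shell. The sketch below is an illustration only and is not taken from josh-sync: `git_with_minimal_config` and `env_override` are hypothetical helpers, and the config path is whatever minimal gitconfig you maintain. It relies only on the two environment variables shown in the command above.

```rust
use std::ffi::OsStr;
use std::path::Path;
use std::process::Command;

/// Hypothetical helper: build a `git` command that reads only the given
/// minimal config as its global config and no system config.
fn git_with_minimal_config(minimal_config: &Path, args: &[&str]) -> Command {
    let mut cmd = Command::new("git");
    cmd.env("GIT_CONFIG_GLOBAL", minimal_config)
        .env("GIT_CONFIG_SYSTEM", "")
        .args(args);
    cmd
}

/// Hypothetical helper: read back an environment override recorded on the
/// command, without running it.
fn env_override(cmd: &Command, key: &str) -> Option<String> {
    cmd.get_envs()
        .find(|(k, _)| *k == OsStr::new(key))
        .and_then(|(_, v)| v.map(|v| v.to_string_lossy().into_owned()))
}
```

Building the command with `git_with_minimal_config(Path::new("/path/to/minimal/gitconfig"), &["fetch", "origin"])` and then calling `.status()` on it would run the fetch under the minimal configuration; `env_override` is only there to inspect what was set.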