Add Karrq's salsa chapter (#529)
* add Karrq's salsa chapter * add youtu.be short url
parent
52720a5708
commit
a3f4e929a3
|
|
@ -15,6 +15,6 @@ level = 1
|
||||||
|
|
||||||
[output.linkcheck]
|
[output.linkcheck]
|
||||||
follow-web-links = true
|
follow-web-links = true
|
||||||
exclude = [ "crates\\.io", "gcc\\.godbolt\\.org", "youtube\\.com", "dl\\.acm\\.org" ]
|
exclude = [ "crates\\.io", "gcc\\.godbolt\\.org", "youtube\\.com", "youtu\\.be", "dl\\.acm\\.org" ]
|
||||||
cache-timeout = 172800
|
cache-timeout = 172800
|
||||||
warning-policy = "error"
|
warning-policy = "error"
|
||||||
|
|
|
||||||
|
|
@ -38,6 +38,7 @@
|
||||||
- [Incremental compilation](./queries/incremental-compilation.md)
|
- [Incremental compilation](./queries/incremental-compilation.md)
|
||||||
- [Incremental compilation In Detail](./queries/incremental-compilation-in-detail.md)
|
- [Incremental compilation In Detail](./queries/incremental-compilation-in-detail.md)
|
||||||
- [Debugging and Testing](./incrcomp-debugging.md)
|
- [Debugging and Testing](./incrcomp-debugging.md)
|
||||||
|
- [Salsa](./salsa.md)
|
||||||
- [Lexing and Parsing](./the-parser.md)
|
- [Lexing and Parsing](./the-parser.md)
|
||||||
- [`#[test]` Implementation](./test-implementation.md)
|
- [`#[test]` Implementation](./test-implementation.md)
|
||||||
- [Macro expansion](./macro-expansion.md)
|
- [Macro expansion](./macro-expansion.md)
|
||||||
|
|
|
||||||
|
|
@ -0,0 +1,214 @@
|
||||||
|
# How Salsa works
|
||||||
|
|
||||||
|
This chapter is based on the explanation given by Niko Matsakis in this
|
||||||
|
[video](https://www.youtube.com/watch?v=_muY4HjSqVw) about
|
||||||
|
[Salsa](https://github.com/salsa-rs/salsa).
|
||||||
|
|
||||||
|
> Salsa is not used directly in rustc, but it is used extensively for
|
||||||
|
> rust-analyzer and may be integrated into the compiler in the future.
|
||||||
|
|
||||||
|
## What is Salsa?
|
||||||
|
|
||||||
|
Salsa is a library for incremental recomputation: it reuses
|
||||||
|
computation that has already been done in the past to increase the efficiency
|
||||||
|
of future computations.
|
||||||
|
|
||||||
|
The objectives of Salsa are:
|
||||||
|
* Provide that functionality in an automatic way, so reusing old computations
|
||||||
|
is done automatically by the library
|
||||||
|
* Do so in a "sound", or "correct", way, therefore leading to the same
|
||||||
|
results as if it had been done from scratch
|
||||||
|
|
||||||
|
Salsa's actual model is much richer, allowing many kinds of inputs and many
|
||||||
|
different outputs.
|
||||||
|
For example, integrating Salsa with an IDE could mean that the inputs could be
|
||||||
|
the manifest (`Cargo.toml`), entire source files (`foo.rs`), snippets and so
|
||||||
|
on; the outputs of such an integration could range from a binary executable, to
|
||||||
|
lints, types (for example, if a user selects a certain variable and wishes to
|
||||||
|
see its type), completions, etc.
|
||||||
|
|
||||||
|
## How does it work?
|
||||||
|
|
||||||
|
The first thing that Salsa has to do is identify the "base inputs" [^EN1].
|
||||||
|
|
||||||
|
Then Salsa also has to identify intermediate, "derived" values. These are
|
||||||
|
values that the library produces itself; for each derived value there's a
|
||||||
|
"pure" function that computes it.
|
||||||
|
|
||||||
|
For example, there might be a function `ast(x: Path) -> AST`. The produced
|
||||||
|
`AST` isn't a final value; it's an intermediate value that the library would
|
||||||
|
use for the computation.
|
||||||
|
|
||||||
|
This means that when you try to compute with the library, Salsa is going to
|
||||||
|
compute various derived values, and eventually read the input and produce the
|
||||||
|
result for the asked computation.
|
||||||
|
|
||||||
|
In the course of computing, Salsa tracks which inputs were accessed and which
|
||||||
|
values are derived. This information is used to determine what's going to
|
||||||
|
happen when the inputs change: are the derived values still valid?
|
||||||
|
|
||||||
|
This doesn't necessarily mean that each computation downstream from the input
|
||||||
|
is going to be checked, which could be costly. Salsa only needs to check each
|
||||||
|
downstream computation until it finds one that isn't changed. At that point, it
|
||||||
|
won't check other derived computations since they wouldn't need to change.
|
||||||
|
|
||||||
|
It's helpful to think about this as a graph with nodes. Each derived value
|
||||||
|
has a dependency on other values, which could themselves be either base or
|
||||||
|
derived. Base values don't have a dependency.
|
||||||
|
|
||||||
|
```ignore
|
||||||
|
I <- A <- C ...
|
||||||
|
|
|
||||||
|
J <- B <--+
|
||||||
|
```
|
||||||
|
|
||||||
|
When an input `I` changes, the derived value `A` could change. The derived
|
||||||
|
value `B`, which does not depend on `I`, `A`, or any value derived from `A` or
|
||||||
|
`I`, is not subject to change. Therefore, Salsa can reuse the computation done
|
||||||
|
for `B` in the past, without having to compute it again.
|
||||||
|
|
||||||
|
The computation could also terminate early. Keeping the same graph as before,
|
||||||
|
say that input `I` has changed in some way (and input `J` hasn't) but, when
|
||||||
|
computing `A` again, it's found that `A` hasn't changed from the previous
|
||||||
|
computation. This leads to an "early termination", because there's no need to
|
||||||
|
check if `C` needs to change, since both of `C`'s direct inputs, `A` and `B`,
|
||||||
|
haven't changed.
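
The reuse and early-termination behaviour described above can be sketched in
plain Rust. This is only an illustration under stated assumptions: the `Graph`
type, its fields, and the arithmetic inside each node are hypothetical, and
the sketch compares remembered dependency values directly, which is a
simplification and not a description of how Salsa is implemented.

```rust
/// Hypothetical graph from the diagram: I <- A <- C and J <- B <- C.
/// Each memo stores the dependency values it was computed from.
#[derive(Default)]
struct Graph {
    i: i64,                       // base input I
    j: i64,                       // base input J
    a: Option<(i64, i64)>,        // (value of I seen, result of A)
    b: Option<(i64, i64)>,        // (value of J seen, result of B)
    c: Option<((i64, i64), i64)>, // ((A, B) seen, result of C)
    runs: [u32; 3],               // how often A, B and C were really recomputed
}

impl Graph {
    fn a(&mut self) -> i64 {
        if let Some((seen, value)) = self.a {
            if seen == self.i {
                return value; // input unchanged: reuse the old result
            }
        }
        self.runs[0] += 1;
        let value = self.i / 10; // some pure function of I (made up)
        self.a = Some((self.i, value));
        value
    }

    fn b(&mut self) -> i64 {
        if let Some((seen, value)) = self.b {
            if seen == self.j {
                return value;
            }
        }
        self.runs[1] += 1;
        let value = self.j * 2; // some pure function of J (made up)
        self.b = Some((self.j, value));
        value
    }

    fn c(&mut self) -> i64 {
        let deps = (self.a(), self.b());
        if let Some((seen, value)) = self.c {
            if seen == deps {
                return value; // early termination: A and B are unchanged
            }
        }
        self.runs[2] += 1;
        let value = deps.0 + deps.1;
        self.c = Some((deps, value));
        value
    }
}

fn main() {
    let mut g = Graph { i: 10, j: 3, ..Default::default() };
    assert_eq!(g.c(), 7);
    assert_eq!(g.runs, [1, 1, 1]);

    // I changes, but A still evaluates to 1, so C is not recomputed
    // and B is never touched.
    g.i = 15;
    assert_eq!(g.c(), 7);
    assert_eq!(g.runs, [2, 1, 1]);

    // I changes in a way that changes A, so C has to be recomputed.
    g.i = 25;
    assert_eq!(g.c(), 8);
    assert_eq!(g.runs, [3, 1, 2]);
}
```

The `runs` counters make the two effects visible: `B` is computed once and
reused throughout, and `C` is recomputed only when `A` actually produces a
different value.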
|
||||||
|
|
||||||
|
## Key Salsa concepts
|
||||||
|
|
||||||
|
### Query
|
||||||
|
|
||||||
|
A query is some value that Salsa can access in the course of computation. Each
|
||||||
|
query can have a number of keys (from 0 to many), and all queries have a
|
||||||
|
result, akin to functions. 0-key queries are called "input" queries.
|
||||||
|
|
||||||
|
### Database
|
||||||
|
|
||||||
|
The database is basically the context for the entire computation; it's meant to
|
||||||
|
store Salsa's internal state, all intermediate values for each query, and
|
||||||
|
anything else that the computation might need. The database must know all the
|
||||||
|
queries that the library is going to do before it can be built, but they don't
|
||||||
|
need to be specified in the same place.
|
||||||
|
|
||||||
|
After the database is formed, it can be accessed with queries that are very
|
||||||
|
similar to functions. Since each query's result is stored in the database,
|
||||||
|
when a query is invoked N times, it will return N **cloned** results, without
|
||||||
|
having to recompute the query (unless the input has changed in such a way that
|
||||||
|
it warrants recomputation).
|
||||||
|
|
||||||
|
For each input query (0-key), a "set" method is generated, allowing the user to
|
||||||
|
change the output of such a query and cause previously memoized values to be
|
||||||
|
potentially invalidated.
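
As a rough mental model of the two paragraphs above, the following plain-Rust
sketch hand-writes a tiny "database" with one input, its "set" method, and one
memoized derived query. Everything here is an assumption made for
illustration: the `Database` struct, its fields, and the fake `Ast(...)`
string are hypothetical, and the memo is dropped eagerly on every "set",
which is simpler than what Salsa does.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a database; not the type that Salsa generates.
#[derive(Default)]
struct Database {
    source_text: HashMap<String, String>, // storage for the input
    ast_memo: HashMap<String, String>,    // memoized results of `ast`
    ast_runs: u32,                        // how often `ast` was really computed
}

impl Database {
    /// The "set" method for the input. Changing the input throws away the
    /// memoized value that was derived from it.
    fn set_source_text(&mut self, name: &str, text: &str) {
        self.source_text.insert(name.to_string(), text.to_string());
        self.ast_memo.remove(name);
    }

    /// A derived query: returns a clone of the stored result when there is
    /// one, and only computes (and stores) the result otherwise.
    fn ast(&mut self, name: &str) -> String {
        if let Some(memo) = self.ast_memo.get(name) {
            return memo.clone();
        }
        self.ast_runs += 1;
        let text = self.source_text.get(name).cloned().unwrap_or_default();
        let ast = format!("Ast({})", text.trim()); // stand-in for real parsing
        self.ast_memo.insert(name.to_string(), ast.clone());
        ast
    }
}

fn main() {
    let mut db = Database::default();
    db.set_source_text("foo.rs", "fn main() {}");

    // Two invocations, two cloned results, one computation.
    let first = db.ast("foo.rs");
    let second = db.ast("foo.rs");
    assert_eq!(first, second);
    assert_eq!(db.ast_runs, 1);

    // Setting the input again forces the next `ast` call to recompute.
    db.set_source_text("foo.rs", "fn main() { }");
    db.ast("foo.rs");
    assert_eq!(db.ast_runs, 2);
}
```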
|
||||||
|
|
||||||
|
### Query Groups
|
||||||
|
|
||||||
|
A query group is a set of queries which have been defined together as a unit.
|
||||||
|
The database is formed by combining query groups. Query groups are akin to
|
||||||
|
"Salsa modules" [^EN2].
|
||||||
|
|
||||||
|
The queries in a query group are just a set of methods in a trait.
|
||||||
|
|
||||||
|
To create a query group, a trait annotated with a specific attribute
|
||||||
|
(`#[salsa::query_group(...)]`) has to be created.
|
||||||
|
|
||||||
|
An argument must also be provided to said attribute as it will be used by Salsa
|
||||||
|
to create a struct to be used later when the database is created.
|
||||||
|
|
||||||
|
Example input query group:
|
||||||
|
|
||||||
|
```rust,ignore
|
||||||
|
/// This attribute will process this tree, produce this tree as output, and produce
|
||||||
|
/// a bunch of intermediate stuff that Salsa also uses. One of these things is a
|
||||||
|
/// "StorageStruct", whose name we have specified in the attribute.
|
||||||
|
///
|
||||||
|
/// This query group is a bunch of **input** queries, that do not rely on any
|
||||||
|
/// derived input.
|
||||||
|
#[salsa::query_group(InputsStorage)]
|
||||||
|
pub trait Inputs {
|
||||||
|
/// This attribute (`#[salsa::input]`) indicates that this query is a base
|
||||||
|
/// input, therefore `set_manifest` is going to be auto-generated
|
||||||
|
#[salsa::input]
|
||||||
|
fn manifest(&self) -> Manifest;
|
||||||
|
|
||||||
|
#[salsa::input]
|
||||||
|
fn source_text(&self, name: String) -> String;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
To create a **derived** query group, one must specify which other query groups
|
||||||
|
this one depends on by specifying them as supertraits, as seen in the following
|
||||||
|
example:
|
||||||
|
|
||||||
|
```rust,ignore
|
||||||
|
/// This query group is going to contain queries that depend on derived values. A
|
||||||
|
/// query group can access another query group's queries by specifying the
|
||||||
|
/// dependency as a supertrait. Query groups can be stacked as much as needed using
|
||||||
|
/// that pattern.
|
||||||
|
#[salsa::query_group(ParserStorage)]
|
||||||
|
pub trait Parser: Inputs {
|
||||||
|
/// This query `ast` is not an input query, it's a derived query; this means
|
||||||
|
/// that a definition is necessary.
|
||||||
|
fn ast(&self, name: String) -> String;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
When creating a derived query, the implementation of said query must be defined
|
||||||
|
outside the trait. The definition must take a database parameter as an `impl
|
||||||
|
Trait` (or `dyn Trait`), where `Trait` is the query group that the definition
|
||||||
|
belongs to, in addition to the other keys.
|
||||||
|
|
||||||
|
```rust,ignore
|
||||||
|
/// This is going to be the definition of the `ast` query in the `Parser` trait.
|
||||||
|
/// So, when the query `ast` is invoked, and it needs to be recomputed, Salsa is
|
||||||
|
/// going to call this function and give it the database as `impl Parser`.
|
||||||
|
/// The function doesn't need to be aware of all the queries of all the query groups.
|
||||||
|
fn ast(db: &impl Parser, name: String) -> String {
|
||||||
|
// Note: `impl Parser` is used here, but `dyn Parser` works just as well.
|
||||||
|
/* code */
|
||||||
|
// By passing an `impl Parser`, this is allowed
|
||||||
|
let source_text = db.source_text(name);
|
||||||
|
/* do the actual parsing */
|
||||||
|
return ast;
|
||||||
|
}
|
||||||
|
```
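
To see why the supertrait bound is what makes the input queries reachable
from the query definition, here is a self-contained sketch that uses ordinary
traits only, with no Salsa attributes and therefore no memoization. The trait
bodies, the unit struct `MyDatabase`, and the returned strings are assumptions
made up for this illustration; Salsa generates the real implementations.

```rust
/// Plain-Rust stand-in for the `Inputs` query group.
trait Inputs {
    fn source_text(&self, name: String) -> String;
}

/// `Parser: Inputs` means that anything implementing `Parser` also offers
/// the methods of `Inputs`.
trait Parser: Inputs {
    fn ast(&self, name: String) -> String;
}

/// The query definition lives outside the trait and takes the database as
/// `&impl Parser`. Thanks to the supertrait it can call `source_text`.
fn ast(db: &impl Parser, name: String) -> String {
    let source_text = db.source_text(name);
    format!("Ast({})", source_text) // stand-in for real parsing
}

/// Hypothetical database; here the impls are written by hand.
struct MyDatabase;

impl Inputs for MyDatabase {
    fn source_text(&self, name: String) -> String {
        format!("contents of {}", name)
    }
}

impl Parser for MyDatabase {
    fn ast(&self, name: String) -> String {
        // Forward to the free function, as the generated code would
        // whenever the value has to be (re)computed.
        ast(self, name)
    }
}

fn main() {
    let db = MyDatabase;
    assert_eq!(db.ast("foo.rs".to_string()), "Ast(contents of foo.rs)");
}
```

The free function only names `Parser` in its signature, which is why it does
not need to know about every query group that ends up in the database.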
|
||||||
|
|
||||||
|
Eventually, after all the query groups have been defined, the database can be
|
||||||
|
created by declaring a struct.
|
||||||
|
|
||||||
|
To specify which query groups are going to be part of the database, an attribute
|
||||||
|
(`#[salsa::database(...)]`) must be added. The argument of said attribute is a
|
||||||
|
list of identifiers, specifying the query groups' **storages**.
|
||||||
|
|
||||||
|
```rust,ignore
|
||||||
|
/// This attribute specifies which query groups are going to be in the database
|
||||||
|
#[salsa::database(InputsStorage, ParserStorage)]
|
||||||
|
#[derive(Default)] //optional!
|
||||||
|
struct MyDatabase {
|
||||||
|
/// You also need this one field
|
||||||
|
runtime: salsa::Runtime<MyDatabase>,
|
||||||
|
}
|
||||||
|
/// And this trait has to be implemented
|
||||||
|
impl salsa::Database for MyDatabase {
|
||||||
|
fn salsa_runtime(&self) -> &salsa::Runtime<MyDatabase> {
|
||||||
|
&self.runtime
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Example usage:
|
||||||
|
|
||||||
|
```rust,ignore
|
||||||
|
fn main() {
|
||||||
|
let db = MyDatabase::default();
|
||||||
|
db.set_manifest(...);
|
||||||
|
db.set_source_text(...);
|
||||||
|
loop {
|
||||||
|
db.ast(...); // will reuse results
|
||||||
|
db.set_source_text(...);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
[^EN1]: "They are not something that you **inaudible** but something that you kinda get **inaudible** from the outside" [3:23](https://youtu.be/_muY4HjSqVw?t=203).
|
||||||
|
|
||||||
|
[^EN2]: What is a Salsa module?
|
||||||