Commit Graph

114 Commits

Author SHA1 Message Date
Josh Bleecher Snyder a2bff7c296 cmd/compile: make pre-elimination of rulegen bounds checks more precise
In cases in which we had a named value whose args were all _,
like this rule from ARM.rules:

(MOVBUreg x:(MOVBUload _ _)) -> (MOVWreg x)

We previously inserted

_ = x.Args[1]

even though it is unnecessary.
This change eliminates this pointless bounds check.
And in other cases, we now check bounds just as far as strictly necessary.

No significant movement on any compiler metrics.
Just nicer (and less) code.

Passes toolstash-check -all.

Change-Id: I075dfe9f926cc561cdc705e9ddaab563164bed3a
Reviewed-on: https://go-review.googlesource.com/c/go/+/221781
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-03-02 17:40:11 +00:00
Josh Bleecher Snyder 5e4da0adac cmd/compile: add streamlined Block Reset+AddControl routines
For use in rewrite rules. Shrinks cmd/compile:

compile 20082104  19967416  -114688 -0.571%

Passes toolstash-check -all.

Change-Id: Ic856508b27ec5b7fb9b6ca63e955a7139ae7dc30
Reviewed-on: https://go-review.googlesource.com/c/go/+/221780
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-03-02 17:40:00 +00:00
Josh Bleecher Snyder d7c073ecbf cmd/compile: add specialized Value reset for OpCopy
This:

* Simplifies and shortens the generated code for rewrite rules.
* Shrinks cmd/compile by 86k (0.4%) and makes it easier to compile.
* Removes the stmt boundary code wrangling from Value.reset,
  in favor of doing it in the one place where it actually does some work,
  namely the writebarrier pass. (This was ascertained by inspecting the
  code for cases in which notStmtBoundary values were generated.)

Passes toolstash-check -all.

Change-Id: I25671d4c4bbd772f235195d11da090878ea2cc07
Reviewed-on: https://go-review.googlesource.com/c/go/+/221421
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2020-03-02 16:24:47 +00:00
Josh Bleecher Snyder 7913f7dfcf cmd/compile: add specialized AddArgN functions for rewrite rules
This shrinks the compiler without impacting performance.
(The performance-sensitive part of rewrite rules is the non-match case.)
Passes toolstash-check -all.

Executable size:

file    before    after     Δ       %       
compile 20356168  20163960  -192208 -0.944% 
total   115599376 115407168 -192208 -0.166% 

Text size:

file                       before   after    Δ       %       
cmd/compile/internal/ssa.s 3928309  3778774  -149535 -3.807% 
total                      18862943 18713408 -149535 -0.793% 

Memory allocated compiling package SSA:

SSA               12.7M ± 0%        12.5M ± 0%  -1.74%  (p=0.008 n=5+5)

Compiler speed impact:

name        old time/op       new time/op       delta
Template          211ms ± 1%        211ms ± 2%    ~     (p=0.832 n=49+49)
Unicode          82.8ms ± 2%       83.2ms ± 2%  +0.44%  (p=0.022 n=46+49)
GoTypes           726ms ± 1%        728ms ± 2%    ~     (p=0.076 n=46+48)
Compiler          3.39s ± 2%        3.40s ± 2%    ~     (p=0.633 n=48+49)
SSA               7.71s ± 1%        7.65s ± 1%  -0.78%  (p=0.000 n=45+44)
Flate             134ms ± 1%        134ms ± 1%    ~     (p=0.195 n=50+49)
GoParser          167ms ± 1%        167ms ± 1%    ~     (p=0.390 n=47+47)
Reflect           453ms ± 3%        452ms ± 2%    ~     (p=0.492 n=48+49)
Tar               184ms ± 3%        184ms ± 2%    ~     (p=0.862 n=50+48)
XML               248ms ± 2%        248ms ± 2%    ~     (p=0.096 n=49+47)
[Geo mean]        415ms             415ms       -0.03%

name        old user-time/op  new user-time/op  delta
Template          273ms ± 1%        273ms ± 2%    ~     (p=0.711 n=48+48)
Unicode           117ms ± 6%        117ms ± 5%    ~     (p=0.633 n=50+50)
GoTypes           972ms ± 2%        974ms ± 1%  +0.29%  (p=0.016 n=47+49)
Compiler          4.46s ± 6%        4.51s ± 6%    ~     (p=0.093 n=50+50)
SSA               10.4s ± 1%        10.3s ± 2%  -0.94%  (p=0.000 n=45+50)
Flate             166ms ± 2%        167ms ± 2%    ~     (p=0.148 n=49+48)
GoParser          202ms ± 1%        202ms ± 2%  -0.28%  (p=0.014 n=47+49)
Reflect           594ms ± 2%        594ms ± 2%    ~     (p=0.717 n=48+49)
Tar               224ms ± 2%        224ms ± 2%    ~     (p=0.805 n=50+49)
XML               311ms ± 1%        310ms ± 1%    ~     (p=0.177 n=49+48)
[Geo mean]        537ms             537ms       +0.01%


Change-Id: I562b9f349b34ddcff01771769e6dbbc80604da7a
Reviewed-on: https://go-review.googlesource.com/c/go/+/221237
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-03-01 15:27:58 +00:00
Josh Bleecher Snyder 2cf3ebaf3d cmd/compile: add dedicated ARM64BitField aux type
The goal here is improved AuxInt printing in ssa.html.
Instead of displaying an inscrutable encoded integer,
it displays something like

v25 (28) = UBFX <int> [lsb=4,width=8] v52

which is much nicer for debugging.

Change-Id: I40713ff7f4a857c4557486cdf73c2dff137511ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/221420
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-02-28 14:52:13 +00:00
Josh Bleecher Snyder a3fc77aa7e cmd/compile: add ellipsis rule diagnostics to rulegen
These detect opportunities to convert a rule to use an ellipsis,
and provide better error messages when something goes wrong.

This change was used to generate all the preceding changes
converting rules to use ellipses. This change is at the end of those
changes rather than the beginning in order to avoid log spam during rule
generation (say during a git bisection).

The preceding changes collectively shrink the cmd/compile binary by ~2.2%.

Part of this detection is also warning when the presence of an
unmentioned aux or auxint could cause conversion to an ellipsis
rule to change the sematics of the rule.

For example:

(Div64 x y) -> (DIV x y)

looks like a promising rule for an ellipsis. However, Div64 has an auxint,
and (on most platforms) DIV does not. An ellipsis rule would keep the
auxint intact, rather than zeroing it, which can infere with CSE.
So this change flags this rule as doing implicit zeroing;
it should be replaced by

(Div64 [a] x y) -> (DIV x y)

which makes it clear that the auxint is being zeroed.

This detection is not foolproof, but it currently has no false positives.
If false positives arise in the future, we will need to gate the output.

Change-Id: Ie21f284579e5d6e75aa304d0deb024d41ede528b
Reviewed-on: https://go-review.googlesource.com/c/go/+/217014
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2020-02-28 13:21:44 +00:00
Daniel Martí 7855d6835d cmd/compile: add a script to measure ssa/gen's coverage
Since rulegen is only tested by inspecting and running its output code,
we have no good way to see if any chunks of its source are actually
being unused.

Code coverage only works as part of 'go test', since it needs to
instrument our code. Add a script that sets up a tiny test for that
purpose, with a quick example on how to use it.

We need to use a script, because there's no other way to make this work
without breaking 'go run *.go'. It's far more common to run the
generator than to obtain a coverage profile, so this solution seems like
the right tradeoff, and we don't break existing users.

The script isn't terribly portable, but that's okay for now.

At the time of wriging, coverage sits at 89.7%. I've manually skimmed
main.go and rulegen.go, and practically all unused code is either error
handling, or optional code like *genLog and "if false". A couple of
small exceptions stand out, though I'm not paying attention to them in
this CL.

While at it, inline a couple of tiny unusedInspector methods that were
only needed once or twice.

Change-Id: I78c5fb47c8536d70e546a437637d4428ec7adfaa
Reviewed-on: https://go-review.googlesource.com/c/go/+/212760
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-21 18:30:22 +00:00
Josh Bleecher Snyder 60651a1bc8 cmd/compile: add rule location to some rulegen logging
This requires threading location information through varCount.

This provides much more useful error messages.

Change-Id: If5ff942cbbbf386724eda15a523c181c137fac20
Reviewed-on: https://go-review.googlesource.com/c/go/+/216221
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2020-02-21 04:25:16 +00:00
Josh Bleecher Snyder 6dd11bcb35 cmd/compile: remove chunking of rewrite rules
We added chunking of rewrite rules to speed up compiling package SSA.
This series of changes has significantly shrunk the number of
rewrite rules, and they are no longer being added nearly as fast.
Now that we are sharing v.Args across multiple rewrite rules,
there is additional benefit to having more rules in a single function.
Removing chunking now has an incidental impact on compiling package SSA,
marginally speeds up other compilation, shrinks the cmd/compile binary,
and simplifies the code.

name        old time/op       new time/op       delta
Template          211ms ± 2%        210ms ± 2%  -0.50%  (p=0.000 n=91+97)
Unicode          81.9ms ± 3%       81.8ms ± 3%    ~     (p=0.179 n=96+91)
GoTypes           731ms ± 2%        731ms ± 1%    ~     (p=0.442 n=94+96)
Compiler          3.43s ± 2%        3.41s ± 2%  -0.36%  (p=0.001 n=98+94)
SSA               8.30s ± 2%        8.32s ± 2%  +0.19%  (p=0.034 n=94+95)
Flate             135ms ± 2%        134ms ± 1%  -0.30%  (p=0.006 n=98+94)
GoParser          167ms ± 1%        167ms ± 1%  -0.22%  (p=0.001 n=92+94)
Reflect           453ms ± 2%        453ms ± 3%    ~     (p=0.306 n=98+97)
Tar               184ms ± 2%        183ms ± 2%  -0.31%  (p=0.012 n=94+94)
XML               249ms ± 2%        248ms ± 1%  -0.26%  (p=0.002 n=96+92)
[Geo mean]        419ms             418ms       -0.21%

name        old user-time/op  new user-time/op  delta
Template          273ms ± 2%        272ms ± 2%  -0.46%  (p=0.000 n=93+96)
Unicode           116ms ± 4%        117ms ± 4%    ~     (p=0.433 n=98+98)
GoTypes           977ms ± 2%        977ms ± 1%    ~     (p=0.971 n=92+99)
Compiler          4.56s ± 6%        4.53s ± 6%    ~     (p=0.081 n=100+100)
SSA               11.1s ± 2%        11.1s ± 2%    ~     (p=0.064 n=99+96)
Flate             167ms ± 2%        167ms ± 1%  -0.24%  (p=0.004 n=95+96)
GoParser          203ms ± 1%        203ms ± 2%  -0.14%  (p=0.049 n=96+97)
Reflect           595ms ± 2%        595ms ± 2%    ~     (p=0.544 n=95+92)
Tar               225ms ± 2%        224ms ± 2%    ~     (p=0.562 n=99+99)
XML               312ms ± 2%        311ms ± 1%    ~     (p=0.050 n=97+93)
[Geo mean]        543ms             542ms       -0.13%

Change-Id: I8d34ab59f154b28f20c6f9e416b976bfce339baa
Reviewed-on: https://go-review.googlesource.com/c/go/+/216220
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-21 02:29:11 +00:00
Josh Bleecher Snyder 1bc116b73c cmd/compile: extract function for splitting up x:(Foo) in rewrite rule fragments
We had three implementations.

Refactor, and document the shared implementation.

While we're here, improve the docs for func unbalanced.

Change-Id: I612cce79de15a864247afe377d3739d04a56b9bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/216219
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2020-02-21 01:48:01 +00:00
Josh Bleecher Snyder a3f234c706 cmd/compile: reduce bounds checks in generated rewrite rules
CL 213703 converted generated rewrite rules for commutative ops
to use loops instead of duplicated code.

However, it loaded args using expressions like
v.Args[i] and v.Args[i^1], which the compiler could
not eliminate bounds for (including with all outstanding
prove CLs).

Also, given a series of separate rewrite rules for the same op,
we generated bounds checks for every rewrite rule, even though
we were repeatedly loading the same set of args.

This change reduces both sets of bounds checks.

Instead of loading v.Args[i] and v.Args[i^1] for commutative loops,
we now preload v.Args[0] and v.Args[1] into local variables,
and then swap them (as needed) in the commutative loop post statement.

And we now load all top level v.Args into local variables
at the beginning of every rewrite rule function.

The second optimization is the more significant,
but the first helps a little, and they play together
nicely from the perspective of generating the code.

This does increase register pressure, but the reduced bounds
checks more than compensate.

Note that the vast majority of rewrite rules evaluated
are not applied, so the prologue is the most important
part of the rewrite rules.

There is one subtle aspect to the new generated code.
Because the top level v.Args are shared across rewrite rules,
and rule evaluation can swap v_0 and v_1, v_0 and v_1
can end up being swapped from one rule to the next.
That is OK, because any time a rule does not get applied,
they will have been swapped exactly twice.

Passes toolstash-check -all.

name        old time/op       new time/op       delta
Template          213ms ± 2%        211ms ± 2%  -0.85%  (p=0.000 n=92+96)
Unicode          83.5ms ± 2%       83.2ms ± 2%  -0.41%  (p=0.004 n=95+90)
GoTypes           737ms ± 2%        733ms ± 2%  -0.51%  (p=0.000 n=91+94)
Compiler          3.45s ± 2%        3.43s ± 2%  -0.44%  (p=0.000 n=99+100)
SSA               8.54s ± 1%        8.32s ± 2%  -2.56%  (p=0.000 n=96+99)
Flate             136ms ± 2%        135ms ± 1%  -0.47%  (p=0.000 n=96+96)
GoParser          169ms ± 1%        168ms ± 1%  -0.33%  (p=0.000 n=96+93)
Reflect           456ms ± 3%        455ms ± 3%    ~     (p=0.261 n=95+94)
Tar               186ms ± 2%        185ms ± 2%  -0.48%  (p=0.000 n=94+95)
XML               251ms ± 1%        250ms ± 1%  -0.51%  (p=0.000 n=91+94)
[Geo mean]        424ms             421ms       -0.68%

name        old user-time/op  new user-time/op  delta
Template          275ms ± 1%        274ms ± 2%  -0.55%  (p=0.000 n=95+98)
Unicode           118ms ± 4%        118ms ± 4%    ~     (p=0.642 n=98+90)
GoTypes           983ms ± 1%        980ms ± 1%  -0.30%  (p=0.000 n=93+93)
Compiler          4.56s ± 6%        4.52s ± 6%  -0.72%  (p=0.003 n=100+100)
SSA               11.4s ± 1%        11.1s ± 1%  -2.50%  (p=0.000 n=96+97)
Flate             168ms ± 1%        167ms ± 1%  -0.49%  (p=0.000 n=92+92)
GoParser          204ms ± 1%        204ms ± 2%  -0.27%  (p=0.003 n=99+96)
Reflect           599ms ± 2%        598ms ± 2%    ~     (p=0.116 n=95+92)
Tar               227ms ± 2%        225ms ± 2%  -0.57%  (p=0.000 n=95+98)
XML               313ms ± 2%        312ms ± 1%  -0.37%  (p=0.000 n=89+95)
[Geo mean]        547ms             544ms       -0.61%

file    before    after     Δ       %
compile 21113112  21109016  -4096   -0.019%
total   131704940 131700844 -4096   -0.003%

Change-Id: Id6c39e0367e597c0c75b8a4b1eb14cc3cbd11956
Reviewed-on: https://go-review.googlesource.com/c/go/+/216218
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-21 00:55:58 +00:00
Josh Bleecher Snyder 8484d409ac cmd/compile: add ellipsis syntax for op-only rewrite rules
This change introduces a new syntax for rewrite rules
that only change a Value's Op. See #36380 for more discussion.

Updating rewrite rules to use ellipses will happen
in follow-up CLs.

Change-Id: I8c56e85de24607579d79729575c89ca80805ba5c
Reviewed-on: https://go-review.googlesource.com/c/go/+/213898
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-20 22:34:08 +00:00
Josh Bleecher Snyder 0f7088ade5 cmd/compile: dump contents when rulegen generates invalid code
It's much easier to debug when you can see
the contents in order to interpret the error message.

Change-Id: I03bbb9dd3071aeca9577cc725a60d43f78118cf4
Reviewed-on: https://go-review.googlesource.com/c/go/+/215717
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2020-02-20 22:33:55 +00:00
Josh Bleecher Snyder 2dfdd85c4f cmd/compile: document non-commutative rule detection
This documentation was lost in CL 213703.
This change restores it.

Change-Id: I544f15771d8a7390893efbda93478b46095ccf3c
Reviewed-on: https://go-review.googlesource.com/c/go/+/215541
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-20 22:33:07 +00:00
Josh Bleecher Snyder 4084c125cc cmd/compile: normalize whitespace around square brackets
I noticed some instances of "[ " and " ]" in the rewrite rules.
Normalizing them helps catch possible future duplicate rules.

Change-Id: I892fd7e9b4019ed304f0a61fa2bb7f7e47ef8f38
Reviewed-on: https://go-review.googlesource.com/c/go/+/213682
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2020-02-20 18:38:43 +00:00
Josh Bleecher Snyder bd6d78ef37 cmd/compile: use loops to handle commutative ops in rules
Prior to this change, we generated additional rules at rulegen time
for all possible combinations of args to commutative ops.
This is simple and works well, but leads to lots of generated rules.
This in turn has increased the size of the compiler,
made it hard to compile package ssa on small machines,
and provided a disincentive to mark some ops as commutative.

This change reworks how we handle commutative ops.
Instead of generating a rule per argument permutation,
we generate a series of nested loops, one for each commutative op.
Each loop tries both possible argument orderings.

I also considered attempting to canonicalize the inputs to the
rewrite rules. However, because either or both arguments might be
nothing more than an identifier, and because there can be arbitrary
conditions to evaluate during matching, I did not see how to proceed.

The duplicate rule detection now sorts arguments to commutative ops,
so that it can detect commutative-only duplicates.

There may be further optimizations to the new generated code.
In particular, we may not be removing as many bounds checks as before;
I have not investigated deeply. If more work here is needed,
we could do it with more hints or with improvements to the prove pass.

This change has almost no impact on the generated code.
It does not pass toolstash-check, however. In a handful of functions,
for reasons I do not understand, there are minor position changes.

For the entire series ending at this change,
there is negligible compiler performance impact.

The compiler binary shrinks by about 15%,
and package ssa shrinks by about 25%.
Package ssa also compiles ~25% faster with ~25% less memory.

Change-Id: Ia2ee9ceae7be08a17342319d4e31b0bb238a2ee4
Reviewed-on: https://go-review.googlesource.com/c/go/+/213703
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-20 17:34:07 +00:00
Josh Bleecher Snyder 73269b8e75 cmd/compile: generate commutative rules when a condition is present
The commutative rule generator has an optimization
where given (Add x y) with no other uses of x and y in the matching rule,
it doesn't generate the commutative match (Add y x).

However, if there is also a condition referring to x or y,
such as (Add x y) && isFoo(x), then we should generate the commutative rule.
This change parses the condition, extracts all idents, and takes them
into consideration.

This doesn't yield any new optimizations now.
However, it is the right thing to do;
otherwise we'll have to track it down and fix it again later.

It is also expensive now, in terms of additional generated code.
However, it will be much, much less expensive soon,
once our generated code for commutative ops gets smaller.

Change-Id: I52c2016c884bbc7789bf8dfe9b9c56061bc028ad
Reviewed-on: https://go-review.googlesource.com/c/go/+/213702
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-20 14:59:27 +00:00
Josh Bleecher Snyder 24fa159a2e cmd/compile: add a flag to print the source line for generated rules
When working on rulegen, I often find myself
searching the rules files to find the source of
generated code. Add a flag to make that easier.

The flag needs to be off by default,
so that adding a single rule doesn't cause a massive diff.

Change-Id: I5a6f09129dc6fceef7c9cd1ad7eee24f3880ba91
Reviewed-on: https://go-review.googlesource.com/c/go/+/213700
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-20 14:58:21 +00:00
Josh Bleecher Snyder 49f8d45994 cmd/compile: delete duplicate rules
Add logic during rulegen to detect exact duplicates
(after applying commutativity),
and clean up existing duplicates.

Change-Id: I7179f40fc48e236c74b74f429ec9f0f100026530
Reviewed-on: https://go-review.googlesource.com/c/go/+/213699
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-20 05:04:39 +00:00
Josh Bleecher Snyder a63ed10d44 cmd/compile: don't generate commutative rules for (Op x x)
If the two commutative arguments are perfectly identical,
then swapping them will never have an effect.

Passes toolstash-check for the relevant architectures,
that is, linux-386, linux-386-387, linux-amd64, linux-s390x.

Change-Id: I19f91644867d8d174bd01f872abe4809013872ea
Reviewed-on: https://go-review.googlesource.com/c/go/+/213698
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-20 04:51:33 +00:00
Josh Bleecher Snyder 715e8bbe63 cmd/compile: factor out opIsCommutative from commute1
Change-Id: I989a66c98dcca8168e35dd9834fc1365e0a1d881
Reviewed-on: https://go-review.googlesource.com/c/go/+/213697
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-20 03:27:46 +00:00
Michael Munday 6ec4c71eef cmd/compile: add SSA rules for s390x compare-and-branch instructions
This commit adds SSA rules for the s390x combined compare-and-branch
instructions. These have a shorter encoding than separate compare
and branch instructions and they also don't clobber the condition
code (a.k.a. flag register) reducing pressure on the flag allocator.

I have deleted the 'loop_test.go' file and replaced it with a new
codegen test which performs a wider range of checks.

Object sizes from compilebench:

name                      old object-bytes  new object-bytes  delta
Template                        562kB ± 0%        561kB ± 0%   -0.28%  (p=0.000 n=10+10)
Unicode                         217kB ± 0%        217kB ± 0%   -0.17%  (p=0.000 n=10+10)
GoTypes                        2.03MB ± 0%       2.02MB ± 0%   -0.59%  (p=0.000 n=10+10)
Compiler                       8.16MB ± 0%       8.11MB ± 0%   -0.62%  (p=0.000 n=10+10)
SSA                            27.4MB ± 0%       27.0MB ± 0%   -1.45%  (p=0.000 n=10+10)
Flate                           356kB ± 0%        356kB ± 0%   -0.12%  (p=0.000 n=10+10)
GoParser                        438kB ± 0%        436kB ± 0%   -0.51%  (p=0.000 n=10+10)
Reflect                        1.37MB ± 0%       1.37MB ± 0%   -0.42%  (p=0.000 n=10+10)
Tar                             485kB ± 0%        483kB ± 0%   -0.39%  (p=0.000 n=10+10)
XML                             630kB ± 0%        621kB ± 0%   -1.45%  (p=0.000 n=10+10)
[Geo mean]                     1.14MB            1.13MB        -0.60%

name                      old text-bytes    new text-bytes    delta
HelloSize                       763kB ± 0%        754kB ± 0%   -1.30%  (p=0.000 n=10+10)
CmdGoSize                      10.7MB ± 0%       10.6MB ± 0%   -0.91%  (p=0.000 n=10+10)
[Geo mean]                     2.86MB            2.82MB        -1.10%

Change-Id: Ibca55d9c0aa1254aee69433731ab5d26a43a7c18
Reviewed-on: https://go-review.googlesource.com/c/go/+/198037
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-10-08 10:03:04 +00:00
Michael Munday c7d81bc086 cmd/compile: reduce amount of code generated for block rewrite rules
Add a Reset method to blocks that allows us to reduce the amount of
code we generate for block rewrite rules.

Thanks to Cherry for suggesting a similar fix to this in CL 196557.

Compilebench result:

name                      old time/op       new time/op       delta
Template                        211ms ± 1%        211ms ± 1%   -0.30%  (p=0.028 n=19+20)
Unicode                        83.7ms ± 3%       83.0ms ± 2%   -0.79%  (p=0.029 n=18+19)
GoTypes                         757ms ± 1%        755ms ± 1%   -0.31%  (p=0.034 n=19+19)
Compiler                        3.51s ± 1%        3.50s ± 1%   -0.20%  (p=0.013 n=18+18)
SSA                             11.7s ± 1%        11.7s ± 1%   -0.38%  (p=0.000 n=19+19)
Flate                           131ms ± 1%        130ms ± 1%   -0.32%  (p=0.024 n=18+18)
GoParser                        162ms ± 1%        162ms ± 1%     ~     (p=0.059 n=20+18)
Reflect                         471ms ± 0%        470ms ± 0%   -0.24%  (p=0.045 n=20+17)
Tar                             187ms ± 1%        186ms ± 1%     ~     (p=0.157 n=20+20)
XML                             255ms ± 1%        255ms ± 1%     ~     (p=0.461 n=19+20)
LinkCompiler                    754ms ± 2%        755ms ± 2%     ~     (p=0.919 n=17+17)
ExternalLinkCompiler            2.82s ±16%        2.37s ±10%  -15.94%  (p=0.000 n=20+20)
LinkWithoutDebugCompiler        439ms ± 4%        442ms ± 6%     ~     (p=0.461 n=18+19)
StdCmd                          25.8s ± 2%        25.5s ± 1%   -0.95%  (p=0.000 n=20+20)

name                      old user-time/op  new user-time/op  delta
Template                        240ms ± 8%        238ms ± 7%     ~     (p=0.301 n=20+20)
Unicode                         107ms ±18%        104ms ±13%     ~     (p=0.149 n=20+20)
GoTypes                         883ms ± 3%        888ms ± 2%     ~     (p=0.211 n=20+20)
Compiler                        4.22s ± 1%        4.20s ± 1%     ~     (p=0.077 n=20+18)
SSA                             14.1s ± 1%        14.1s ± 2%     ~     (p=0.192 n=20+20)
Flate                           145ms ±10%        148ms ± 5%     ~     (p=0.126 n=20+18)
GoParser                        186ms ± 7%        186ms ± 7%     ~     (p=0.779 n=20+20)
Reflect                         538ms ± 3%        541ms ± 3%     ~     (p=0.192 n=20+20)
Tar                             218ms ± 4%        217ms ± 6%     ~     (p=0.835 n=19+20)
XML                             298ms ± 5%        298ms ± 5%     ~     (p=0.749 n=19+20)
LinkCompiler                    818ms ± 5%        825ms ± 8%     ~     (p=0.461 n=20+20)
ExternalLinkCompiler            1.55s ± 4%        1.53s ± 5%     ~     (p=0.063 n=20+18)
LinkWithoutDebugCompiler        460ms ±12%        460ms ± 7%     ~     (p=0.925 n=20+20)

name                      old object-bytes  new object-bytes  delta
Template                        554kB ± 0%        554kB ± 0%     ~     (all equal)
Unicode                         215kB ± 0%        215kB ± 0%     ~     (all equal)
GoTypes                        2.01MB ± 0%       2.01MB ± 0%     ~     (all equal)
Compiler                       7.97MB ± 0%       7.97MB ± 0%   +0.00%  (p=0.000 n=20+20)
SSA                            26.8MB ± 0%       26.9MB ± 0%   +0.27%  (p=0.000 n=20+20)
Flate                           340kB ± 0%        340kB ± 0%     ~     (all equal)
GoParser                        434kB ± 0%        434kB ± 0%     ~     (all equal)
Reflect                        1.34MB ± 0%       1.34MB ± 0%     ~     (all equal)
Tar                             480kB ± 0%        480kB ± 0%     ~     (all equal)
XML                             622kB ± 0%        622kB ± 0%     ~     (all equal)

name                      old export-bytes  new export-bytes  delta
Template                       20.4kB ± 0%       20.4kB ± 0%     ~     (all equal)
Unicode                        8.21kB ± 0%       8.21kB ± 0%     ~     (all equal)
GoTypes                        36.6kB ± 0%       36.6kB ± 0%     ~     (all equal)
Compiler                        115kB ± 0%        115kB ± 0%   +0.08%  (p=0.000 n=20+20)
SSA                             141kB ± 0%        141kB ± 0%   +0.07%  (p=0.000 n=20+20)
Flate                          5.11kB ± 0%       5.11kB ± 0%     ~     (all equal)
GoParser                       8.93kB ± 0%       8.93kB ± 0%     ~     (all equal)
Reflect                        11.8kB ± 0%       11.8kB ± 0%     ~     (all equal)
Tar                            10.9kB ± 0%       10.9kB ± 0%     ~     (all equal)
XML                            17.4kB ± 0%       17.4kB ± 0%     ~     (all equal)

name                      old text-bytes    new text-bytes    delta
HelloSize                       742kB ± 0%        742kB ± 0%     ~     (all equal)
CmdGoSize                      10.7MB ± 0%       10.7MB ± 0%     ~     (all equal)

name                      old data-bytes    new data-bytes    delta
HelloSize                      10.7kB ± 0%       10.7kB ± 0%     ~     (all equal)
CmdGoSize                       312kB ± 0%        312kB ± 0%     ~     (all equal)

name                      old bss-bytes     new bss-bytes     delta
HelloSize                       122kB ± 0%        122kB ± 0%     ~     (all equal)
CmdGoSize                       146kB ± 0%        146kB ± 0%     ~     (all equal)

name                      old exe-bytes     new exe-bytes     delta
HelloSize                      1.10MB ± 0%       1.10MB ± 0%     ~     (all equal)
CmdGoSize                      14.9MB ± 0%       14.9MB ± 0%     ~     (all equal)

Change-Id: Ic89a8e62423b3d9fd9391159e0663acf450803b5
Reviewed-on: https://go-review.googlesource.com/c/go/+/198419
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2019-10-07 09:19:12 +00:00
Michael Munday 9c2e7e8bed cmd/compile: allow multiple SSA block control values
Control values are used to choose which successor of a block is
jumped to. Typically a control value takes the form of a 'flags'
value that represents the result of a comparison. Some
architectures however use a variable in a register as a control
value.

Up until now we have managed with a single control value per block.
However some architectures (e.g. s390x and riscv64) have combined
compare-and-branch instructions that take two variables in registers
as parameters. To generate these instructions we need to support 2
control values per block.

This CL allows up to 2 control values to be used in a block in
order to support the addition of compare-and-branch instructions.
I have implemented s390x compare-and-branch instructions in a
different CL.

Passes toolstash-check -all.

Results of compilebench:

name                      old time/op       new time/op       delta
Template                        208ms ± 1%        209ms ± 1%    ~     (p=0.289 n=20+20)
Unicode                        83.7ms ± 1%       83.3ms ± 3%  -0.49%  (p=0.017 n=18+18)
GoTypes                         748ms ± 1%        748ms ± 0%    ~     (p=0.460 n=20+18)
Compiler                        3.47s ± 1%        3.48s ± 1%    ~     (p=0.070 n=19+18)
SSA                             11.5s ± 1%        11.7s ± 1%  +1.64%  (p=0.000 n=19+18)
Flate                           130ms ± 1%        130ms ± 1%    ~     (p=0.588 n=19+20)
GoParser                        160ms ± 1%        161ms ± 1%    ~     (p=0.211 n=20+20)
Reflect                         465ms ± 1%        467ms ± 1%  +0.42%  (p=0.007 n=20+20)
Tar                             184ms ± 1%        185ms ± 2%    ~     (p=0.087 n=18+20)
XML                             253ms ± 1%        253ms ± 1%    ~     (p=0.377 n=20+18)
LinkCompiler                    769ms ± 2%        774ms ± 2%    ~     (p=0.070 n=19+19)
ExternalLinkCompiler            3.59s ±11%        3.68s ± 6%    ~     (p=0.072 n=20+20)
LinkWithoutDebugCompiler        446ms ± 5%        454ms ± 3%  +1.79%  (p=0.002 n=19+20)
StdCmd                          26.0s ± 2%        26.0s ± 2%    ~     (p=0.799 n=20+20)

name                      old user-time/op  new user-time/op  delta
Template                        238ms ± 5%        240ms ± 5%    ~     (p=0.142 n=20+20)
Unicode                         105ms ±11%        106ms ±10%    ~     (p=0.512 n=20+20)
GoTypes                         876ms ± 2%        873ms ± 4%    ~     (p=0.647 n=20+19)
Compiler                        4.17s ± 2%        4.19s ± 1%    ~     (p=0.093 n=20+18)
SSA                             13.9s ± 1%        14.1s ± 1%  +1.45%  (p=0.000 n=18+18)
Flate                           145ms ±13%        146ms ± 5%    ~     (p=0.851 n=20+18)
GoParser                        185ms ± 5%        188ms ± 7%    ~     (p=0.174 n=20+20)
Reflect                         534ms ± 3%        538ms ± 2%    ~     (p=0.105 n=20+18)
Tar                             215ms ± 4%        211ms ± 9%    ~     (p=0.079 n=19+20)
XML                             295ms ± 6%        295ms ± 5%    ~     (p=0.968 n=20+20)
LinkCompiler                    832ms ± 4%        837ms ± 7%    ~     (p=0.707 n=17+20)
ExternalLinkCompiler            1.58s ± 8%        1.60s ± 4%    ~     (p=0.296 n=20+19)
LinkWithoutDebugCompiler        478ms ±12%        489ms ±10%    ~     (p=0.429 n=20+20)

name                      old object-bytes  new object-bytes  delta
Template                        559kB ± 0%        559kB ± 0%    ~     (all equal)
Unicode                         216kB ± 0%        216kB ± 0%    ~     (all equal)
GoTypes                        2.03MB ± 0%       2.03MB ± 0%    ~     (all equal)
Compiler                       8.07MB ± 0%       8.07MB ± 0%  -0.06%  (p=0.000 n=20+20)
SSA                            27.1MB ± 0%       27.3MB ± 0%  +0.89%  (p=0.000 n=20+20)
Flate                           343kB ± 0%        343kB ± 0%    ~     (all equal)
GoParser                        441kB ± 0%        441kB ± 0%    ~     (all equal)
Reflect                        1.36MB ± 0%       1.36MB ± 0%    ~     (all equal)
Tar                             487kB ± 0%        487kB ± 0%    ~     (all equal)
XML                             632kB ± 0%        632kB ± 0%    ~     (all equal)

name                      old export-bytes  new export-bytes  delta
Template                       18.5kB ± 0%       18.5kB ± 0%    ~     (all equal)
Unicode                        7.92kB ± 0%       7.92kB ± 0%    ~     (all equal)
GoTypes                        35.0kB ± 0%       35.0kB ± 0%    ~     (all equal)
Compiler                        109kB ± 0%        110kB ± 0%  +0.72%  (p=0.000 n=20+20)
SSA                             137kB ± 0%        138kB ± 0%  +0.58%  (p=0.000 n=20+20)
Flate                          4.89kB ± 0%       4.89kB ± 0%    ~     (all equal)
GoParser                       8.49kB ± 0%       8.49kB ± 0%    ~     (all equal)
Reflect                        11.4kB ± 0%       11.4kB ± 0%    ~     (all equal)
Tar                            10.5kB ± 0%       10.5kB ± 0%    ~     (all equal)
XML                            16.7kB ± 0%       16.7kB ± 0%    ~     (all equal)

name                      old text-bytes    new text-bytes    delta
HelloSize                       761kB ± 0%        761kB ± 0%    ~     (all equal)
CmdGoSize                      10.8MB ± 0%       10.8MB ± 0%    ~     (all equal)

name                      old data-bytes    new data-bytes    delta
HelloSize                      10.7kB ± 0%       10.7kB ± 0%    ~     (all equal)
CmdGoSize                       312kB ± 0%        312kB ± 0%    ~     (all equal)

name                      old bss-bytes     new bss-bytes     delta
HelloSize                       122kB ± 0%        122kB ± 0%    ~     (all equal)
CmdGoSize                       146kB ± 0%        146kB ± 0%    ~     (all equal)

name                      old exe-bytes     new exe-bytes     delta
HelloSize                      1.13MB ± 0%       1.13MB ± 0%    ~     (all equal)
CmdGoSize                      15.1MB ± 0%       15.1MB ± 0%    ~     (all equal)

Change-Id: I3cc2f9829a109543d9a68be4a21775d2d3e9801f
Reviewed-on: https://go-review.googlesource.com/c/go/+/196557
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Keith Randall <khr@golang.org>
2019-10-02 09:56:36 +00:00
Daniel Martí bdf0fe5448 cmd/compile: minor simplifications in rulegen
First, be consistent about declaring typ as &b.Func.Config.Types and
not &config.Types. Not particularly better, and it barely changes the
output, but we're more consistent now.

Second, remove a bit of duplication when handling the typ, auxint, and
aux variables.

Third and last, remove a stray canFail assignment; we ended up setting
that in add, not breakf, so it's not necessary to set it manually if we
don't use breakf.

Updates #33644.

Change-Id: I75999cb223a201969266fbfeae043599fa27fac5
Reviewed-on: https://go-review.googlesource.com/c/go/+/196803
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-09-27 18:58:08 +00:00
Michael Munday cf03238020 cmd/compile: use numeric condition code masks on s390x
Prior to this CL conditional branches on s390x always used an
extended mnemonic such as BNE, BLT and so on to represent branch
instructions with different condition code masks. This CL adds
support for numeric condition code masks to the s390x SSA backend
so that we can encode the condition under which a Block's
successor is chosen as a field in that Block rather than in its
type.

This change will be useful as we come to add support for combined
compare-and-branch instructions. Rather than trying to add extended
mnemonics for every possible combination of mask and compare-and-
branch instruction we can instead use a single mnemonic for each
instruction.

Change-Id: Idb7458f187b50906877d683695c291dff5279553
Reviewed-on: https://go-review.googlesource.com/c/go/+/197178
Reviewed-by: Keith Randall <khr@golang.org>
2019-09-26 14:47:12 +00:00
Daniel Martí 870080752d cmd/compile: reduce rulegen's output by 200 KiB
First, renove unnecessary "// cond:" lines from the generated files.
This shaves off about ~7k lines.

Second, join "if cond { break }" statements via "||", which allows us to
deduplicate a large number of them. This shaves off another ~25k lines.

This change is not for readability or simplicity; but rather, to avoid
unnecessary verbosity that makes the generated files larger. All in all,
git reports that the generated files overall weigh ~200KiB less, or
about 2.7% less.

While at it, add a -trace flag to rulegen.

Updates #33644.

Change-Id: I3fac0290a6066070cc62400bf970a4ae0929470a
Reviewed-on: https://go-review.googlesource.com/c/go/+/196498
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-09-23 16:36:10 +00:00
Daniel Martí 7d16e44d4f cmd/compile: reduce the regexp work in rulegen
As correctly pointed out by Giovanni Bajo, doing a single regexp pass
should be much faster than doing hundreds per architecture. We can then
use a map to keep track of what ops are handled in each file. And the
amount of saved work is evident:

	name     old time/op         new time/op         delta
	Rulegen          2.48s ± 1%          2.02s ± 1%  -18.44%  (p=0.008 n=5+5)

	name     old user-time/op    new user-time/op    delta
	Rulegen          10.9s ± 1%           8.9s ± 0%  -18.27%  (p=0.008 n=5+5)

	name     old sys-time/op     new sys-time/op     delta
	Rulegen          209ms ±28%          236ms ±18%     ~     (p=0.310 n=5+5)

	name     old peak-RSS-bytes  new peak-RSS-bytes  delta
	Rulegen          178MB ± 3%          176MB ± 3%     ~     (p=0.548 n=5+5)

The speed-up is so large that we don't need to parallelize it anymore;
the numbers above are with the removed goroutines. Adding them back in
doesn't improve performance noticeably at all:

	name     old time/op         new time/op         delta
	Rulegen          2.02s ± 1%          2.01s ± 1%   ~     (p=0.421 n=5+5)

	name     old user-time/op    new user-time/op    delta
	Rulegen          8.90s ± 0%          8.96s ± 1%   ~     (p=0.095 n=5+5)

While at it, remove an unused method.

Change-Id: I328b56e63b64a9ab48147e67e7d5a385c795ec54
Reviewed-on: https://go-review.googlesource.com/c/go/+/195739
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-09-17 18:22:37 +00:00
Daniel Martí 0e0bff840e cmd/compiler: skip some go/printer work in rulegen
We use go/format on the final output, so don't bother with the added
tabwriter work to align comments when using go/printer.

	name     old time/op         new time/op         delta
	Rulegen          2.53s ± 2%          2.48s ± 1%  -2.20%  (p=0.032 n=5+5)

	name     old user-time/op    new user-time/op    delta
	Rulegen          11.2s ± 1%          10.8s ± 0%  -3.72%  (p=0.008 n=5+5)

	name     old sys-time/op     new sys-time/op     delta
	Rulegen          218ms ±17%          207ms ±19%    ~     (p=0.548 n=5+5)

	name     old peak-RSS-bytes  new peak-RSS-bytes  delta
	Rulegen          184MB ± 3%          175MB ± 4%    ~     (p=0.056 n=5+5)

Change-Id: I53bad2ab15cace67415f2171fffcd13ed596e62b
Reviewed-on: https://go-review.googlesource.com/c/go/+/195219
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-09-16 21:37:16 +00:00
Daniel Martí 1581bb9843 cmd/compile: stop using go/types in rulegen
Using go/types to get rid of all unused variables in CL 189798 was a
neat idea, but it was pretty expensive. go/types is a full typechecker,
which does a lot more work than we actually need. Moreover, we had to
run it multiple times, to catch variables that became unused after
removing existing unused variables.

Instead, write our own little detector for unused imports and variables.
It doesn't use ast.Walk, as we need to know what fields we're
inspecting. For example, in "foo := bar", "foo" is declared, and "bar"
is used, yet they both appear as simple *ast.Ident cases under ast.Walk.

The code is documented to explain how unused variables are detected in a
single syntax tree pass. Since this happens after we've generated a
complete go/ast.File, we don't need to worry about our own simplified
node types.

The generated code is the same, but rulegen is much faster and uses less
memory at its peak, so it should scale better with time.

With 'benchcmd Rulegen go run *.go' on perflock, we get:

	name     old time/op         new time/op         delta
	Rulegen          4.00s ± 0%          3.41s ± 1%  -14.70%  (p=0.008 n=5+5)

	name     old user-time/op    new user-time/op    delta
	Rulegen          14.1s ± 1%          10.6s ± 1%  -24.62%  (p=0.008 n=5+5)

	name     old sys-time/op     new sys-time/op     delta
	Rulegen          318ms ±26%          263ms ± 9%     ~     (p=0.056 n=5+5)

	name     old peak-RSS-bytes  new peak-RSS-bytes  delta
	Rulegen          231MB ± 4%          181MB ± 3%  -21.69%  (p=0.008 n=5+5)

Change-Id: I8387d52818f6131357868ad348dac8c96d926191
Reviewed-on: https://go-review.googlesource.com/c/go/+/191782
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-09-12 10:48:58 +00:00
Ainar Garipov 0efbd10157 all: fix typos
Use the following (suboptimal) script to obtain a list of possible
typos:

  #!/usr/bin/env sh

  set -x

  git ls-files |\
    grep -e '\.\(c\|cc\|go\)$' |\
    xargs -n 1\
    awk\
    '/\/\// { gsub(/.*\/\//, ""); print; } /\/\*/, /\*\// { gsub(/.*\/\*/, ""); gsub(/\*\/.*/, ""); }' |\
    hunspell -d en_US -l |\
    grep '^[[:upper:]]\{0,1\}[[:lower:]]\{1,\}$' |\
    grep -v -e '^.\{1,4\}$' -e '^.\{16,\}$' |\
    sort -f |\
    uniq -c |\
    awk '$1 == 1 { print $2; }'

Then, go through the results manually and fix the most obvious typos in
the non-vendored code.

Change-Id: I3cb5830a176850e1a0584b8a40b47bde7b260eae
Reviewed-on: https://go-review.googlesource.com/c/go/+/193848
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-09-08 17:28:20 +00:00
Keith Randall b91b3d9c31 cmd/compile: remove auxSymInt32
We never used it, might as well get rid of it.

Change-Id: I5c23c93e90173bff9ac1fc1b8ae1e2025215d6eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/191938
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-08-28 00:40:22 +00:00
Daniel Martí 79dee788ec cmd/compile: teach rulegen to remove unused decls
First, add cpu and memory profiling flags, as these are useful to see
where rulegen is spending its time. It now takes many seconds to run on
a recent laptop, so we have to keep an eye on what it's doing.

Second, stop writing '_ = var' lines to keep imports and variables used
at all times. Now that rulegen removes all such unused names, they're
unnecessary.

To perform the removal, lean on go/types to first detect what names are
unused. We can configure it to give us all the type-checking errors in a
file, so we can collect all "declared but not used" errors in a single
pass.

We then use astutil.Apply to remove the relevant nodes based on the line
information from each unused error. This allows us to apply the changes
without having to do extra parser+printer roundtrips to plaintext, which
are far too expensive.

We need to do multiple such passes, as removing an unused variable
declaration might then make another declaration unused. Two passes are
enough to clean every file at the moment, so add a limit of three passes
for now to avoid eating cpu uncontrollably by accident.

The resulting performance of the changes above is a ~30% loss across the
table, since go/types is fairly expensive. The numbers were obtained
with 'benchcmd Rulegen go run *.go', which involves compiling rulegen
itself, but that seems reflective of how the program is used.

	name     old time/op         new time/op         delta
	Rulegen          5.61s ± 0%          7.36s ± 0%  +31.17%  (p=0.016 n=5+4)

	name     old user-time/op    new user-time/op    delta
	Rulegen          7.20s ± 1%          9.92s ± 1%  +37.76%  (p=0.016 n=5+4)

	name     old sys-time/op     new sys-time/op     delta
	Rulegen          135ms ±19%          169ms ±17%  +25.66%  (p=0.032 n=5+5)

	name     old peak-RSS-bytes  new peak-RSS-bytes  delta
	Rulegen         71.0MB ± 2%         85.6MB ± 2%  +20.56%  (p=0.008 n=5+5)

We can live with a bit more resource usage, but the time/op getting
close to 10s isn't good. To win that back, introduce concurrency in
main.go. This further increases resource usage a bit, but the real time
on this quad-core laptop is greatly reduced. The final benchstat is as
follows:

	name     old time/op         new time/op         delta
	Rulegen          5.61s ± 0%          3.97s ± 1%   -29.26%  (p=0.008 n=5+5)

	name     old user-time/op    new user-time/op    delta
	Rulegen          7.20s ± 1%         13.91s ± 1%   +93.09%  (p=0.008 n=5+5)

	name     old sys-time/op     new sys-time/op     delta
	Rulegen          135ms ±19%          269ms ± 9%   +99.17%  (p=0.008 n=5+5)

	name     old peak-RSS-bytes  new peak-RSS-bytes  delta
	Rulegen         71.0MB ± 2%        226.3MB ± 1%  +218.72%  (p=0.008 n=5+5)

It might be possible to reduce the cpu or memory usage in the future,
such as configuring go/types to do less work, or taking shortcuts to
avoid having to run it many times. For now, ~2x cpu and ~4x memory usage
seems like a fair trade for a faster and better rulegen.

Finally, we can remove the old code that tried to remove some unused
variables in a hacky and unmaintainable way.

Change-Id: Iff9e83e3f253babf5a1bd48cc993033b8550cee6
Reviewed-on: https://go-review.googlesource.com/c/go/+/189798
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-08-27 17:04:18 +00:00
Daniel Martí 1a53915c6b cmd/compile: initial rulegen rewrite
rulegen.go produces plaintext Go code directly, which was fine for a
while. However, that's started being a bottleneck for making code
generation more complex, as we can only generate code directly one line
at a time.

Some workarounds were used, like multiple layers of buffers to generate
chunks of code, to then use strings.Contains to see whether variables
need to be defined or not. However, that's error-prone, verbose, and
difficult to work with.

A better approach is to generate an intermediate syntax tree in memory,
which we can inspect and modify easily. For example, we could run a
number of "passes" on the syntax tree before writing to disk, such as
removing unused variables, simplifying logic, or moving declarations
closer to their uses.

This is the first step in that direction, without changing any of the
generated code. We didn't use go/ast directly, as it's too complex for
our needs. In particular, we only need a few kinds of simple statements,
but we do want to support arbitrary expressions. As such, define a
simple set of statement structs, and add thin layers for printer.Fprint
and ast.Inspect.

A nice side effect of this change, besides removing some buffers and
string handling, is that we can now avoid passing so many parameters
around. And, while we add over a hundred lines of code, the tricky
pieces of code are now a bit simpler to follow.

While at it, apply some cleanups, such as replacing isVariable with
token.IsIdentifier, and consistently using log.Fatalf.

Follow-up CLs will start improving the generated code, also simplifying
the rulegen code itself. I've added some TODOs for the low-hanging fruit
that I intend to work on right after.

Updates #30810.

Change-Id: Ic371c192b29c85dfc4a001be7fbcbeec85facc9d
Reviewed-on: https://go-review.googlesource.com/c/go/+/177539
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-08-27 16:50:45 +00:00
Daniel Martí 0fbf681840 cmd/compile: reduce rulegen's for loop verbosity
A lot of the naked for loops begin like:

	for {
		v := b.Control
		if v.Op != OpConstBool {
			break
		}
		...
		return true
	}

Instead, write them out in a more compact and readable way:

	for v.Op == OpConstBool {
		...
		return true
	}

This requires the addition of two bytes.Buffer writers, as this helps us
make a decision based on future pieces of generated code. This probably
makes rulegen slightly slower, but that's not noticeable; the code
generation still takes ~3.5s on my laptop, excluding build time.

The "v := b.Control" declaration can be moved to the top of each
function. Even though the rules can modify b.Control when firing, they
also make the function return, so v can't be used again.

While at it, remove three unnecessary lines from the top of each
rewriteBlock func.

In total, this results in ~4k lines removed from the generated code, and
a slight improvement in readability.

Change-Id: I317e4c6a4842c64df506f4513375475fad2aeec5
Reviewed-on: https://go-review.googlesource.com/c/go/+/167399
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-03-22 15:41:51 +00:00
Josh Bleecher Snyder 80b6812d7b cmd/compile: move flagalloc op splitting to rewrite rules
Flagalloc has the unenviable task of splitting
flag-generating ops that have been merged with loads
when the flags need to "spilled" (i.e. regenerated).
Since there weren't very many of them, there was a hard-coded list
of ops and bespoke code written to split them.

This change migrates load splitting into rewrite rules,
to make them easier to maintain.

Change-Id: I7750eafb888a802206c410f9c341b3133e7748b8
Reviewed-on: https://go-review.googlesource.com/c/go/+/166978
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-03-19 16:55:40 +00:00
Michael Munday 04f1b65cc6 cmd/compile: try and access last argument first in rulegen
This reduces the number of extra bounds check hints we need to
insert. For example, rather than producing:

	_ = v.Args[2]
	x := v.Args[0]
	y := v.Args[1]
	z := v.Args[2]

We now produce:

	z := v.Args[2]
	x := v.Args[0]
	y := v.Args[1]

This gets rid of about 7000 lines of code from the rewrite rules.

Change-Id: I1291cf0f82e8d035a6d65bce7dee6cedee04cbcd
Reviewed-on: https://go-review.googlesource.com/c/go/+/167397
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-13 15:11:37 +00:00
Josh Bleecher Snyder 53948127d3 cmd/compile: make rulegen magic variable prediction more precise
The sheer length of the generated rules files makes my
editor and git client unhappy.
This change is a small step towards shortening them.

We recognize a few magic variables during rulegen: b, config, fe, typ.
Of these, only b appears prone to false positives.
By tightening the heuristic and fixing one case in MIPS.rules,
we can make the heuristic enough that it has no failures.
That allows us to remove the hedge assignments to _,
removing 3000 pointless lines of code.

Change-Id: I080cde5db28c8277cb3fd9ddcd829306c9a27785
Reviewed-on: https://go-review.googlesource.com/c/go/+/166979
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-12 20:13:46 +00:00
Josh Bleecher Snyder 48d3c32ba9 cmd/compile: teach rulegen to |-expand multiple |s in a single op
I want to be able to write

MOV(Q|Q|L|L|L|W|W|B)loadidx(1|8|1|4|8|1|2|1)

instead of

MOV(Qloadidx1|Qloadidx8|Lloadidx1|Lloadidx4|Lloadidx8|Wloadidx1|Wloadidx2|Bloadidx1)

in rewrite rules.

Both are fairly cryptic and hard to review, but the former
is at least compact, which helps to not obscure the structure
of the rest of the rule.

Support that by adjusting rulegen's expansion.

Instead of looking for an op that begins with "(", ends with " ",
and has exactly one set of parens in it, look for everything of the
form "(...|...)".

That has false positives: Go code in the && conditions and AuxInt expressions.
Those are easily checked for syntactically: && conditions are between && and ->,
and AuxInt expressions are inside square brackets.
After ruling out those false positives, we can keep everything else,
regardless of where it is.

No change to the generated code for existing rules.

Change-Id: I5b70a190e268989504f53cb2cce2f9a50170d8a2
Reviewed-on: https://go-review.googlesource.com/c/go/+/166737
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-11 17:40:52 +00:00
Josh Bleecher Snyder b4fbd291b3 cmd/compile: normalize more whitespace in rewrite rules
If you write a rewrite rule:

(something) && noteRule("X")-> (something)

then rulegen will panic with an error message about commutativity.
The real problem is the lack of a space between the ) and the ->.
Normalize that bit of whitespace too.

Change-Id: Idbd53687cd0398fe275ff2702667688cad05b4ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/166427
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-03-11 17:18:40 +00:00
Keith Randall 7502ed3b90 cmd/compile: when merging instructions, prefer line number of faulting insn
Normally this happens when combining a sign extension and a load.  We
want the resulting combo-instruction to get the line number of the
load, not the line number of the sign extension.

For each rule, compute where we should get its line number by finding
a value on the match side that can fault.  Use that line number for
all the new values created on the right-hand side.

Fixes #27201

Change-Id: I19b3c6f468fff1a3c0bfbce2d6581828557064a3
Reviewed-on: https://go-review.googlesource.com/c/156937
Reviewed-by: David Chase <drchase@google.com>
2019-01-14 22:41:33 +00:00
Brad Fitzpatrick 3813edf26e all: use "reports whether" consistently in the few places that didn't
Go documentation style for boolean funcs is to say:

    // Foo reports whether ...
    func Foo() bool

(rather than "returns true if")

This CL also replaces 4 uses of "iff" with the same "reports whether"
wording, which doesn't lose any meaning, and will prevent people from
sending typo fixes when they don't realize it's "if and only if". In
the past I think we've had the typo CLs updated to just say "reports
whether". So do them all at once.

(Inspired by the addition of another "returns true if" in CL 146938
in fd_plan9.go)

Created with:

$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true iff" | grep -v vendor)
$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" | grep -v vendor)

Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f
Reviewed-on: https://go-review.googlesource.com/c/147037
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-11-02 22:47:58 +00:00
Josh Bleecher Snyder 3bab4373c7 cmd/compile: make fmt available in rewrite rules
During development and debugging, I often want to
write noteRule(fmt.Sprintf(...)), and end up
manually adding the import to the generated code.
Let's just make it always available instead.

Change-Id: I1e2d47c98ba056e1b5da42e35fb6ad26f1d9cc3d
Reviewed-on: https://go-review.googlesource.com/c/145207
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
2018-10-28 18:46:36 +00:00
Keith Randall 31ef3846a7 cmd/compile: add rulegen diagnostic
When rulegen complains about a missing type, report the line number
in the rules file.

Change-Id: Ic7c19e1d5f29547911909df5788945848a6080ff
Reviewed-on: https://go-review.googlesource.com/114004
Reviewed-by: David Chase <drchase@google.com>
2018-05-22 16:40:34 +00:00
Shawn Smith d3beea8c52 all: fix misspellings
GitHub-Last-Rev: 468df242d0
GitHub-Pull-Request: golang/go#23935
Change-Id: If751ce3ffa3a4d5e00a3138211383d12cb6b23fc
Reviewed-on: https://go-review.googlesource.com/95577
Run-TryBot: Andrew Bonventre <andybons@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Bonventre <andybons@golang.org>
2018-02-20 21:02:58 +00:00
Giovanni Bajo 70fd25e4e1 cmd/compile: normalize spaces in rewrite rule comments.
In addition to look nicer to the eye, this allows to reformat
and indent rules without causing spurious changes to the generated
file, making it easier to spot functional changes.

After this CL, all CLs that will aggregate rules through
the new "|" functionality should cause no changes to the
generated files.

Change-Id: Icec283585ba8d7b91c79d76513c1d83dca4b30aa
Reviewed-on: https://go-review.googlesource.com/95216
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-02-20 18:14:38 +00:00
philhofer 2d0172c3a7 cmd/compile/internal/ssa: emit csel on arm64
Introduce a new SSA pass to generate CondSelect intstrutions,
and add CondSelect lowering rules for arm64.

In order to make the CSEL instruction easier to optimize,
and to simplify the introduction of CSNEG, CSINC, and CSINV
in the future, modify the CSEL instruction to accept a condition
code in the aux field.

Notably, this change makes the go1 Gzip benchmark
more than 10% faster.

Benchmarks on a Cavium ThunderX:

name                      old time/op    new time/op    delta
BinaryTree17-96              15.9s ± 6%     16.0s ± 4%     ~     (p=0.968 n=10+9)
Fannkuch11-96                7.17s ± 0%     7.00s ± 0%   -2.43%  (p=0.000 n=8+9)
FmtFprintfEmpty-96           208ns ± 1%     207ns ± 0%     ~     (p=0.152 n=10+8)
FmtFprintfString-96          379ns ± 0%     375ns ± 0%   -0.95%  (p=0.000 n=10+9)
FmtFprintfInt-96             385ns ± 0%     383ns ± 0%   -0.52%  (p=0.000 n=9+10)
FmtFprintfIntInt-96          591ns ± 0%     586ns ± 0%   -0.85%  (p=0.006 n=7+9)
FmtFprintfPrefixedInt-96     656ns ± 0%     667ns ± 0%   +1.71%  (p=0.000 n=10+10)
FmtFprintfFloat-96           967ns ± 0%     984ns ± 0%   +1.78%  (p=0.000 n=10+10)
FmtManyArgs-96              2.35µs ± 0%    2.25µs ± 0%   -4.63%  (p=0.000 n=9+8)
GobDecode-96                31.0ms ± 0%    30.8ms ± 0%   -0.36%  (p=0.006 n=9+9)
GobEncode-96                24.4ms ± 0%    24.5ms ± 0%   +0.30%  (p=0.000 n=9+9)
Gzip-96                      1.60s ± 0%     1.43s ± 0%  -10.58%  (p=0.000 n=9+10)
Gunzip-96                    167ms ± 0%     169ms ± 0%   +0.83%  (p=0.000 n=8+9)
HTTPClientServer-96          311µs ± 1%     308µs ± 0%   -0.75%  (p=0.000 n=10+10)
JSONEncode-96               65.0ms ± 0%    64.8ms ± 0%   -0.25%  (p=0.000 n=9+8)
JSONDecode-96                262ms ± 1%     261ms ± 1%     ~     (p=0.579 n=10+10)
Mandelbrot200-96            18.0ms ± 0%    18.1ms ± 0%   +0.17%  (p=0.000 n=8+10)
GoParse-96                  14.0ms ± 0%    14.1ms ± 1%   +0.42%  (p=0.003 n=9+10)
RegexpMatchEasy0_32-96       644ns ± 2%     645ns ± 2%     ~     (p=0.836 n=10+10)
RegexpMatchEasy0_1K-96      3.70µs ± 0%    3.49µs ± 0%   -5.58%  (p=0.000 n=10+10)
RegexpMatchEasy1_32-96       662ns ± 2%     657ns ± 2%     ~     (p=0.137 n=10+10)
RegexpMatchEasy1_1K-96      4.47µs ± 0%    4.31µs ± 0%   -3.48%  (p=0.000 n=10+10)
RegexpMatchMedium_32-96      844ns ± 2%     849ns ± 1%     ~     (p=0.208 n=10+10)
RegexpMatchMedium_1K-96      179µs ± 0%     182µs ± 0%   +1.20%  (p=0.000 n=10+10)
RegexpMatchHard_32-96       10.0µs ± 0%    10.1µs ± 0%   +0.48%  (p=0.000 n=10+9)
RegexpMatchHard_1K-96        297µs ± 0%     297µs ± 0%   -0.14%  (p=0.000 n=10+10)
Revcomp-96                   3.08s ± 0%     3.13s ± 0%   +1.56%  (p=0.000 n=9+9)
Template-96                  276ms ± 2%     275ms ± 1%     ~     (p=0.393 n=10+10)
TimeParse-96                1.37µs ± 0%    1.36µs ± 0%   -0.53%  (p=0.000 n=10+7)
TimeFormat-96               1.40µs ± 0%    1.42µs ± 0%   +0.97%  (p=0.000 n=10+10)
[Geo mean]                   264µs          262µs        -0.77%

Change-Id: Ie54eee4b3092af53e6da3baa6d1755098f57f3a2
Reviewed-on: https://go-review.googlesource.com/55670
Run-TryBot: Philip Hofer <phofer@umich.edu>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-02-20 06:00:54 +00:00
Keith Randall b657c00243 cmd/compile: add | operator to make rewrite rules more succinct
Instead of

(And64 x x) -> x
(And32 x x) -> x
(And16 x x) -> x
(And8  x x) -> x

we can now do:

(And(64|32|16|8) x x) -> x

Any part of an opcode can have a parenthesized, |-separated list of possibilites.
The rule is then expanded using each piece of the | combo.
If there are multiple | clauses, they get expanded in tandem.
(All the first positions, then all the second positions, etc.)
All places | opcodes appear must have the same count.

A more complicated example:

(MOV(L|SS)load [off1] {sym1} (LEAQ4 [off2] {sym2} ptr idx) mem) && is32Bit(off1+off2) && canMergeSym(sym1, sym2) ->
	(MOV(L|SS)loadidx4 [off1+off2] {mergeSym(sym1,sym2)} ptr idx mem)

This meta-rule generates 2 rules, a MOVL and a MOVSS rule.

This CL is carefully orchestrated to not change the generated rules file at all.
In some cases, this means we can't align the rules nicely because it changes
the whitespace in the generated code.  I'll clean that up as a separate step.

There are many more opportunites to compactify rules using this new mechanism.
I've just done some examples, there's more to do.

Change-Id: I8a5e748cd0761ccbb12d09b01925b2f1f4b2f608
Reviewed-on: https://go-review.googlesource.com/86595
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-02-19 22:23:40 +00:00
Joe Tsai b53088a634 Revert "go/printer: forbid empty line before first comment in block"
This reverts commit 08f19bbde1.

Reason for revert:
The changed transformation takes effect on a larger set
of code snippets than expected.

For example, this:
    func foo() {

        // Comment
        bar()

    }
becomes:
    func foo() {
        // Comment
        bar()

    }

This is an unintended consequence.

Change-Id: Ifca88d6267dab8a8170791f7205124712bf8ace8
Reviewed-on: https://go-review.googlesource.com/81335
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Joe Tsai <joetsai@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-12-01 01:12:26 +00:00
Joe Tsai 08f19bbde1 go/printer: forbid empty line before first comment in block
To improve readability when exported fields are removed,
forbid the printer from emitting an empty line before the first comment
in a const, var, or type block.
Also, when printing the "Has filtered or unexported fields." message,
add an empty line before it to separate the message from the struct
or interfact contents.

Before the change:
<<<
type NamedArg struct {

        // Name is the name of the parameter placeholder.
        //
        // If empty, the ordinal position in the argument list will be
        // used.
        //
        // Name must omit any symbol prefix.
        Name string

        // Value is the value of the parameter.
        // It may be assigned the same value types as the query
        // arguments.
        Value interface{}
        // contains filtered or unexported fields
}
>>>

After the change:
<<<
type NamedArg struct {
        // Name is the name of the parameter placeholder.
        //
        // If empty, the ordinal position in the argument list will be
        // used.
        //
        // Name must omit any symbol prefix.
        Name string

        // Value is the value of the parameter.
        // It may be assigned the same value types as the query
        // arguments.
        Value interface{}

        // contains filtered or unexported fields
}
>>>

Fixes #18264

Change-Id: I9fe17ca39cf92fcdfea55064bd2eaa784ce48c88
Reviewed-on: https://go-review.googlesource.com/71990
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-11-02 18:17:22 +00:00