Commit Graph

2714 Commits

Author SHA1 Message Date
Daniel Martí 83f0af1742 cmd/compile: remove a few unnecessary gotos
Rework the logic to remove them. These were the low hanging fruit,
with labels that were used only once and logic that was fairly
straightforward.

Change-Id: I02a01c59c247b8b2972d8d73ff23f96f271de038
Reviewed-on: https://go-review.googlesource.com/63410
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-09-22 16:55:47 +00:00
Matthew Dempsky f260ae6523 cmd/compile/internal/types: unexport Type.Copy
It's only used/needed by SubstAny.

CL prepared with gorename.

Change-Id: I243138f9dcc4e6af9b81a7746414e6d7b3ba10a2
Reviewed-on: https://go-review.googlesource.com/65311
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
2017-09-22 16:42:43 +00:00
Daniel Martí 3d741349f5 cmd/compile: collect reasons in inlining test
If we use -gcflags='-m -m', the compiler should give us a reason why a
func couldn't be inlined. Add the extra -m necessary for that extra info
and use it to give better test failures. For example, for the func in
the TODO:

	--- FAIL: TestIntendedInlining (1.53s)
		inl_test.go:104: runtime.nextFreeFast was not inlined: function too complex

We might increase the number of -m flags to get more information at some
later point, such as getting details on how close the func was to the
inlining budget.

Also started using regexes, as the output parsing is getting a bit too
complex for manual string handling.

While at it, also refactored the test to not buffer the entire output
into memory. This is fine in practice, but it won't scale well as we add
more packages or we depend more on the compiler's debugging output.

For example, "go build -a -gcflags='-m -m' std" prints nearly 40MB of
plaintext - and we only need to see the output line by line anyway.

Updates #21851.

Change-Id: I00986ff360eb56e4e9737b65a6be749ef8540643
Reviewed-on: https://go-review.googlesource.com/63810
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-09-22 10:59:14 +00:00
Ben Shi 9732485851 cmd/compile: optimized ARM code with BFX/BFXU
BFX&BFXU were introduced in ARMv6T2. A single BFX or BFXU is
more efficiently than a pair of left-shift/right-shift in bit
field extraction.

This patch implements this optimization. And the benchmark tests
show big improvement in special cases and little change in total.

1. There is big improvement in a special test case.
name                     old time/op    new time/op    delta
BFX-4                       665µs ± 1%     595µs ± 0%  -10.61%  (p=0.000 n=20+20)
(The test case: https://github.com/benshi001/ugo1/blob/master/bfx_test.go)

2. The compilecmp benchmark shows no regression.
name        old time/op       new time/op       delta
Template          2.33s ± 2%        2.34s ± 2%    ~     (p=0.356 n=9+10)
Unicode           1.32s ± 2%        1.30s ± 2%    ~     (p=0.139 n=9+8)
GoTypes           7.77s ± 1%        7.76s ± 1%    ~     (p=0.780 n=10+9)
Compiler          37.3s ± 1%        37.1s ± 1%    ~     (p=0.211 n=10+9)
SSA               84.3s ± 2%        84.3s ± 2%    ~     (p=0.842 n=10+9)
Flate             1.45s ± 1%        1.45s ± 3%    ~     (p=0.853 n=10+10)
GoParser          1.83s ± 2%        1.83s ± 2%    ~     (p=0.739 n=10+10)
Reflect           5.08s ± 2%        5.09s ± 2%    ~     (p=0.720 n=9+10)
Tar               2.44s ± 1%        2.44s ± 2%    ~     (p=0.684 n=10+10)
XML               2.62s ± 2%        2.62s ± 2%    ~     (p=0.529 n=10+10)
[Geo mean]        4.80s             4.79s       -0.06%

name        old user-time/op  new user-time/op  delta
Template          2.76s ± 2%        2.75s ± 3%    ~     (p=0.893 n=10+10)
Unicode           1.63s ± 1%        1.60s ± 1%  -2.07%  (p=0.000 n=8+9)
GoTypes           9.54s ± 1%        9.52s ± 1%    ~     (p=0.215 n=10+10)
Compiler          46.0s ± 1%        46.0s ± 1%    ~     (p=0.853 n=10+10)
SSA                110s ± 1%         110s ± 1%    ~     (p=0.838 n=10+10)
Flate             1.69s ± 3%        1.69s ± 5%    ~     (p=0.957 n=10+10)
GoParser          2.15s ± 2%        2.15s ± 2%    ~     (p=0.749 n=10+10)
Reflect           6.03s ± 1%        5.99s ± 2%    ~     (p=0.060 n=9+10)
Tar               3.02s ± 2%        2.99s ± 2%    ~     (p=0.214 n=10+10)
XML               3.10s ± 2%        3.08s ± 2%    ~     (p=0.732 n=9+10)
[Geo mean]        5.82s             5.79s       -0.41%

name        old text-bytes    new text-bytes    delta
HelloSize         589kB ± 0%        589kB ± 0%    ~     (all equal)

name        old data-bytes    new data-bytes    delta
HelloSize        5.46kB ± 0%       5.46kB ± 0%    ~     (all equal)

name        old bss-bytes     new bss-bytes     delta
HelloSize        76.9kB ± 0%       76.9kB ± 0%    ~     (all equal)

name        old exe-bytes     new exe-bytes     delta
HelloSize        1.03MB ± 0%       1.03MB ± 0%    ~     (all equal)

3. The go1 benchmark shows little change in total. (excluding noise)
name                     old time/op    new time/op    delta
BinaryTree17-4              41.5s ± 1%     41.6s ± 1%    ~     (p=0.373 n=30+26)
Fannkuch11-4                23.6s ± 1%     23.6s ± 1%  +0.28%  (p=0.003 n=29+30)
FmtFprintfEmpty-4           826ns ± 1%     827ns ± 1%    ~     (p=0.155 n=30+30)
FmtFprintfString-4         1.35µs ± 1%    1.35µs ± 1%    ~     (p=0.499 n=30+30)
FmtFprintfInt-4            1.43µs ± 1%    1.41µs ± 1%  -1.19%  (p=0.000 n=30+30)
FmtFprintfIntInt-4         2.15µs ± 1%    2.11µs ± 1%  -1.78%  (p=0.000 n=30+30)
FmtFprintfPrefixedInt-4    2.21µs ± 1%    2.21µs ± 1%    ~     (p=0.881 n=30+30)
FmtFprintfFloat-4          4.41µs ± 1%    4.44µs ± 0%  +0.64%  (p=0.000 n=30+30)
FmtManyArgs-4              8.06µs ± 1%    8.06µs ± 0%    ~     (p=0.871 n=30+30)
GobDecode-4                 103ms ± 1%     104ms ± 2%  +0.54%  (p=0.013 n=28+29)
GobEncode-4                92.4ms ± 1%    92.6ms ± 1%    ~     (p=0.447 n=30+29)
Gzip-4                      4.17s ± 1%     4.06s ± 1%  -2.56%  (p=0.000 n=29+30)
Gunzip-4                    603ms ± 1%     602ms ± 1%    ~     (p=0.423 n=30+30)
HTTPClientServer-4          688µs ± 2%     674µs ± 3%  -2.09%  (p=0.000 n=29+30)
JSONEncode-4                237ms ± 1%     237ms ± 1%    ~     (p=0.061 n=29+30)
JSONDecode-4                907ms ± 1%     910ms ± 1%    ~     (p=0.061 n=30+30)
Mandelbrot200-4            41.7ms ± 0%    41.7ms ± 0%  +0.19%  (p=0.000 n=24+20)
GoParse-4                  45.7ms ± 2%    45.5ms ± 2%  -0.29%  (p=0.005 n=30+30)
RegexpMatchEasy0_32-4      1.27µs ± 0%    1.27µs ± 0%  +0.12%  (p=0.031 n=30+30)
RegexpMatchEasy0_1K-4      7.77µs ± 4%    7.73µs ± 3%    ~     (p=0.169 n=30+30)
RegexpMatchEasy1_32-4      1.29µs ± 1%    1.29µs ± 1%    ~     (p=0.126 n=30+30)
RegexpMatchEasy1_1K-4      10.4µs ± 3%    10.3µs ± 2%  -1.32%  (p=0.004 n=30+29)
RegexpMatchMedium_32-4     2.06µs ± 0%    2.06µs ± 0%    ~     (p=0.071 n=30+30)
RegexpMatchMedium_1K-4      531µs ± 1%     530µs ± 0%    ~     (p=0.121 n=30+23)
RegexpMatchHard_32-4       28.7µs ± 1%    28.6µs ± 1%  -0.21%  (p=0.001 n=30+27)
RegexpMatchHard_1K-4        860µs ± 1%     857µs ± 1%    ~     (p=0.105 n=30+27)
Revcomp-4                  67.3ms ± 2%    67.3ms ± 2%    ~     (p=0.805 n=29+29)
Template-4                  1.08s ± 1%     1.08s ± 1%    ~     (p=0.260 n=30+30)
TimeParse-4                7.04µs ± 0%    7.04µs ± 0%    ~     (p=0.315 n=30+30)
TimeFormat-4               13.2µs ± 1%    13.2µs ± 1%    ~     (p=0.077 n=30+30)
[Geo mean]                  715µs          713µs       -0.30%

name                     old speed      new speed      delta
GobDecode-4              7.42MB/s ± 1%  7.38MB/s ± 2%  -0.54%  (p=0.011 n=28+29)
GobEncode-4              8.30MB/s ± 1%  8.29MB/s ± 1%    ~     (p=0.484 n=30+29)
Gzip-4                   4.65MB/s ± 2%  4.78MB/s ± 1%  +2.73%  (p=0.000 n=30+30)
Gunzip-4                 32.2MB/s ± 1%  32.2MB/s ± 1%    ~     (p=0.357 n=30+30)
JSONEncode-4             8.18MB/s ± 1%  8.19MB/s ± 1%    ~     (p=0.052 n=29+30)
JSONDecode-4             2.14MB/s ± 1%  2.13MB/s ± 1%    ~     (p=0.074 n=30+29)
GoParse-4                1.27MB/s ± 1%  1.27MB/s ± 2%    ~     (p=0.618 n=24+30)
RegexpMatchEasy0_32-4    25.2MB/s ± 0%  25.2MB/s ± 0%  -0.12%  (p=0.031 n=30+30)
RegexpMatchEasy0_1K-4     132MB/s ± 5%   132MB/s ± 2%    ~     (p=0.171 n=30+30)
RegexpMatchEasy1_32-4    24.8MB/s ± 1%  24.9MB/s ± 1%    ~     (p=0.106 n=30+30)
RegexpMatchEasy1_1K-4    98.4MB/s ± 3%  99.6MB/s ± 4%  +1.19%  (p=0.011 n=30+30)
RegexpMatchMedium_32-4    483kB/s ± 1%   484kB/s ± 1%    ~     (p=0.426 n=30+30)
RegexpMatchMedium_1K-4   1.93MB/s ± 1%  1.93MB/s ± 0%    ~     (p=0.157 n=30+17)
RegexpMatchHard_32-4     1.12MB/s ± 1%  1.12MB/s ± 0%  +0.33%  (p=0.001 n=30+24)
RegexpMatchHard_1K-4     1.19MB/s ± 1%  1.19MB/s ± 1%    ~     (p=0.290 n=30+30)
Revcomp-4                37.8MB/s ± 2%  37.8MB/s ± 1%    ~     (p=0.815 n=29+29)
Template-4               1.80MB/s ± 1%  1.80MB/s ± 1%    ~     (p=0.586 n=30+30)
[Geo mean]               6.80MB/s       6.81MB/s       +0.25%

fixes #20966

Change-Id: Idb5567bbe988c875315b8c98c128957cd474ccc5
Reviewed-on: https://go-review.googlesource.com/64950
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
2017-09-21 12:41:04 +00:00
Cherry Zhang 99c757adb5 cmd/compile: use a counter to track whether writebarrier rewriting is done
Use a counter, instead of a loop, to see whether there are more
writebarrier ops in the current block that need to be rewritten.

No visible change in normal compiler speed benchmarks.

Passes toolstash -cmp on std cmd.

Fixes #20416.

Change-Id: Ifbbde23611cd668c35b8a4a3e9a92726bfe19956
Reviewed-on: https://go-review.googlesource.com/60310
Reviewed-by: David Chase <drchase@google.com>
2017-09-20 23:57:19 +00:00
Matthew Dempsky 93e97ef066 cmd/compile/internal/gc: update comment in plive.go
onebitwalktype1 no longer appears to be a bottleneck for the mentioned
test case. In fact, we appear to compile it significantly faster now
than Go 1.4 did (~1.8s vs ~3s).

Fixes #21951.

Change-Id: I315313e906092a7d6ff4ff60a918d80a4cff7a7f
Reviewed-on: https://go-review.googlesource.com/65110
Reviewed-by: Keith Randall <khr@golang.org>
2017-09-20 23:07:40 +00:00
Ilya Tocar 475df0ebcc cmd/compile/internal/gc: better inliner diagnostics
When debugging inliner with -m -m print cost of complex functions,
instead of simple "function too complex". This helps to understand,
how close to inlining is this particular function.

Change-Id: I6871f69b5b914d23fd0b43a24d7c6fc928f4b716
Reviewed-on: https://go-review.googlesource.com/63330
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-20 21:38:06 +00:00
Ilya Tocar 101fbc2c82 runtime: make nextFreeFast inlinable
https://golang.org/cl/22598 made nextFreeFast inlinable.
But during https://golang.org/cl/63611 it was discovered, that it is no longer inlinable.
Reduce number of statements below inlining threshold to make it inlinable again.
Also update tests, to prevent regressions.
Doesn't reduce readability.

Change-Id: Ia672784dd48ed3b1ab46e390132f1094fe453de5
Reviewed-on: https://go-review.googlesource.com/65030
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2017-09-20 20:27:13 +00:00
Matthew Dempsky 0d73f1e333 cmd/compile: change liveness-related functions into methods
No functional change; just making the code slightly more idiomatic.

Passes toolstash-check.

Change-Id: I66d14a8410bbecf260d0ea5683564aa413ce5747
Reviewed-on: https://go-review.googlesource.com/65070
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-20 20:19:51 +00:00
Matthew Dempsky 39983cf491 cmd/compile: refactor onebitwalktype1
The existing logic tried to advance the offset for each variable's
width, but then tried to undo this logic with the array and struct
handling code. It can all be much simpler by only worrying about
computing offsets within the array and struct code.

While here, include a short-circuit for zero-width arrays to fix a
pedantic compiler failure case.

Passes toolstash-check.

Fixes #20739.

Change-Id: I98af9bb512a33e3efe82b8bf1803199edb480640
Reviewed-on: https://go-review.googlesource.com/64471
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-09-20 18:11:52 +00:00
Matthew Dempsky e06a64a476 cmd/compile/internal/syntax: fix source buffer refilling
The previous code seems to have an off-by-1 in it somewhere, the
consequence being that we didn't properly preserve all of the old
buffer contents that we intended to.

After spending a while looking at the existing window-shifting logic,
I wasn't able to understand exactly how it was supposed to work or
where the issue was, so I rewrote it to be (at least IMO) more
obviously correct.

Fixes #21938.

Change-Id: I1ed7bbc1e1751a52ab5f7cf0411ae289586dc345
Reviewed-on: https://go-review.googlesource.com/64830
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-09-20 17:47:26 +00:00
Matthew Dempsky a53e853964 cmd/compile/internal/types: simplify dclstack
We used to backup symbol declarations using complete Syms, but this
was unnecessary: very few of Sym's fields were actually needed. Also,
to restore a symbol, we had to re-Lookup the Sym in its Pkg.

By introducing a new dedicated dsym type for this purpose, we can
address both of these deficiencies.

Passes toolstash-check.

Change-Id: I39f3d672b301f84a3a62b9b34b4b2770cb25df79
Reviewed-on: https://go-review.googlesource.com/64811
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-09-20 17:47:12 +00:00
Michael Munday 2cb61aa3f7 cmd/compile: stop rematerializable ops from clobbering flags
Rematerializable ops can be inserted after the flagalloc phase,
they must therefore not clobber flags. This CL adds a check to
ensure this doesn't happen and fixes the instances where it
does currently.

amd64: ADDQconst and ADDLconst were recently changed to be
rematerializable in CL 54393 (only in tip, not 1.9). That change
has been reverted.

s390x: MOVDaddr could clobber flags when using dynamic linking due
to a ADD with immediate instruction. Change the code generation to
use LA/LAY instead.

Fixes #21080.

Change-Id: Ia85c882afa2a820a309e93775354b3169ec6d034
Reviewed-on: https://go-review.googlesource.com/63030
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2017-09-20 17:10:58 +00:00
Matthew Dempsky 3628c2d52f cmd/compile: remove {Mark,Pop}dcl calls in bimport
These were previously only relevant for recording scoping level so
that invalid 'fallthrough' statements could be rejected. However,
that's handled differently since CL 61130 (in particular, there's no
use of types.Block anymore), so these calls can be safely removed.

Passes toolstash-check.

Change-Id: I8631b156594df85b8d39f57acad3ebcf099d52f9
Reviewed-on: https://go-review.googlesource.com/64810
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-09-20 17:03:52 +00:00
Michael Munday 7582494e06 cmd/compile: add s390x intrinsics for Ceil, Floor, Round and Trunc
Ceil, Floor and Trunc are pre-existing intrinsics. Round is a new
function and has been added as an intrinsic in this CL. All of the
functions can be implemented as a single 'LOAD FP INTEGER'
instruction, FIDBR, on s390x.

name   old time/op  new time/op  delta
Ceil   2.34ns ± 0%  0.85ns ± 0%  -63.74%  (p=0.000 n=5+4)
Floor  2.33ns ± 0%  0.85ns ± 1%  -63.35%  (p=0.008 n=5+5)
Round  4.23ns ± 0%  0.85ns ± 0%  -79.89%  (p=0.000 n=5+4)
Trunc  2.35ns ± 0%  0.85ns ± 0%  -63.83%  (p=0.029 n=4+4)

Change-Id: Idee7ba24a2899d12bf9afee4eedd6b4aaad3c510
Reviewed-on: https://go-review.googlesource.com/63890
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-09-20 10:01:35 +00:00
Keith Randall 1787ced894 cmd/compile: remove Symbol wrappers from Aux fields
We used to have {Arg,Auto,Extern}Symbol structs with which we wrapped
a *gc.Node or *obj.LSym before storing them in the Aux field
of an ssa.Value.  This let the SSA part of the compiler distinguish
between autos and args, for example.  We no longer need the wrappers
as we can query the underlying objects directly.

There was also some sloppy usage, where VarDef had a *gc.Node
directly in its Aux field, whereas the use of that variable had
that *gc.Node wrapped in an AutoSymbol. Thus the Aux fields didn't
match (using ==) when they probably should.
This sloppy usage cleanup is the only thing in the CL that changes the
generated code - we can get rid of some more unused auto variables if
the matching happens reliably.

Removing this wrapper also lets us get rid of the varsyms cache
(which was used to prevent wrapping the same *gc.Node twice).

Change-Id: I0dedf8f82f84bfee413d310342b777316bd1d478
Reviewed-on: https://go-review.googlesource.com/64452
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-09-19 22:03:10 +00:00
Matthew Dempsky 7c8a9615c0 cmd/compile: fix stack frame info for calls in receiver slot
Previously, after inlining a call, we made a second pass to rewrite
the AST's position information to record the inlined stack frame. The
call arguments were part of this AST, but it would be incorrect to
rewrite them too, so extra effort was made to temporarily remove them
while the position rewriting was done.

However, this extra logic was only done for regular arguments: it was
not done for receiver arguments. Consequently if m was inlined in
"f().m(g(), h())", g and h would have correct call frames, but f would
appear to be called by m.

The fix taken by this CL is to merge setpos into inlsubst and only
rewrite position information for nodes that were actually copied from
the original function AST body. As a side benefit, this eliminates an
extra AST pass and some AST walking code.

Fixes #21879.

Change-Id: I22b25c208313fc25c358d3a2eebfc9b012400084
Reviewed-on: https://go-review.googlesource.com/64470
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-09-19 18:35:24 +00:00
Matthew Dempsky 3066dbad52 cmd/compile: cleanup toolstash pacifier from OXFALL removal
Change-Id: Ide7fe6b09247b7a6befbdfc2d6ce5988aa1df323
Reviewed-on: https://go-review.googlesource.com/61131
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-09-19 18:20:29 +00:00
Matthew Dempsky 4347baac7d cmd/compile: eliminate OXFALL
Previously, we used OXFALL vs OFALL to distinguish fallthrough
statements that had been validated. Because in the Node AST we flatten
statement blocks, OXCASE and OXFALL needed to keep track of their
block scopes for this purpose.

Now that we have an AST that keeps these separate, we can just perform
the validation earlier.

Passes toolstash-check.

Fixes #14540.

Change-Id: I8421eaba16c2b3b72c9c5483b5cf20b14261385e
Reviewed-on: https://go-review.googlesource.com/61130
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-09-19 18:08:50 +00:00
Tobias Klauser 8bdf0b72b0 cmd/compile: simplify range expression
Found by running gofmt -s on the file in question.

Change-Id: I84511bd2bc75dff196930a7a87ecf5a2aca2fbb8
Reviewed-on: https://go-review.googlesource.com/64310
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-19 00:29:58 +00:00
Matthew Dempsky bb2f0da23a cmd/compile: fix compiler crash on recursive types
By setting both a valid size and alignment for broken recursive types,
we can appease some more safety checks and prevent compiler crashes.

Fixes #21882.

Change-Id: Ibaa137d8aa2c2a9d521462f144d7016c4abfd6e7
Reviewed-on: https://go-review.googlesource.com/64430
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-09-18 21:49:43 +00:00
Daniel Martí 71c9454f99 cmd/compile: remove some redundant types in decls
As per golint's suggestions.

Change-Id: Ie0c6ad9aa5dc69966a279562a341c7b095c47ede
Reviewed-on: https://go-review.googlesource.com/64192
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-17 09:51:38 +00:00
Michael Munday a5d6b41449 cmd/compile: test constant folded integer to/from float conversions
Improves test coverage of the rules added in CL 63795 and would have
detected the bug fixed by CL 63950.

Change-Id: I107ee8d8e0b6684ce85b2446bd5018c5a03d608a
Reviewed-on: https://go-review.googlesource.com/64130
Reviewed-by: Keith Randall <khr@golang.org>
2017-09-16 09:17:58 +00:00
Ben Shi a07176b45a cmd/compile: optimize ARM code with MULAF/MULSF/MULAD/MULSD
The go compiler can generate better ARM code with those more
efficient FP instructions. And there is little improvement
in total but big improvement in special cases.

1. The size of pkg/linux_arm/math.a shrinks by 2.4%.

2. there is neither improvement nor regression in compilecmp benchmark.
name        old time/op       new time/op       delta
Template          2.32s ± 2%        2.32s ± 1%    ~     (p=1.000 n=9+10)
Unicode           1.32s ± 4%        1.32s ± 4%    ~     (p=0.912 n=10+10)
GoTypes           7.76s ± 1%        7.79s ± 1%    ~     (p=0.447 n=9+10)
Compiler          37.4s ± 2%        37.2s ± 2%    ~     (p=0.218 n=10+10)
SSA               84.8s ± 2%        85.0s ± 1%    ~     (p=0.604 n=10+9)
Flate             1.45s ± 2%        1.44s ± 2%    ~     (p=0.075 n=10+10)
GoParser          1.82s ± 1%        1.81s ± 1%    ~     (p=0.190 n=10+10)
Reflect           5.06s ± 1%        5.05s ± 1%    ~     (p=0.315 n=10+9)
Tar               2.37s ± 1%        2.37s ± 2%    ~     (p=0.912 n=10+10)
XML               2.56s ± 1%        2.58s ± 2%    ~     (p=0.089 n=10+10)
[Geo mean]        4.77s             4.77s       -0.08%

name        old user-time/op  new user-time/op  delta
Template          2.74s ± 2%        2.75s ± 2%    ~     (p=0.856 n=9+10)
Unicode           1.61s ± 4%        1.62s ± 3%    ~     (p=0.693 n=10+10)
GoTypes           9.55s ± 1%        9.49s ± 2%    ~     (p=0.056 n=9+10)
Compiler          45.9s ± 1%        45.8s ± 1%    ~     (p=0.345 n=9+10)
SSA                110s ± 1%         110s ± 1%    ~     (p=0.763 n=9+10)
Flate             1.68s ± 2%        1.68s ± 3%    ~     (p=0.616 n=10+10)
GoParser          2.14s ± 4%        2.14s ± 1%    ~     (p=0.825 n=10+9)
Reflect           5.95s ± 1%        5.97s ± 3%    ~     (p=0.951 n=9+10)
Tar               2.94s ± 3%        2.93s ± 2%    ~     (p=0.359 n=10+10)
XML               3.03s ± 3%        3.07s ± 6%    ~     (p=0.166 n=10+10)
[Geo mean]        5.76s             5.77s       +0.12%

name        old text-bytes    new text-bytes    delta
HelloSize         588kB ± 0%        588kB ± 0%    ~     (all equal)

name        old data-bytes    new data-bytes    delta
HelloSize        5.46kB ± 0%       5.46kB ± 0%    ~     (all equal)

name        old bss-bytes     new bss-bytes     delta
HelloSize        72.9kB ± 0%       72.9kB ± 0%    ~     (all equal)

name        old exe-bytes     new exe-bytes     delta
HelloSize        1.03MB ± 0%       1.03MB ± 0%    ~     (all equal)

3. The performance of Mandelbrot200 improves 15%, though little
   improvement in total.
name                     old time/op    new time/op    delta
BinaryTree17-4              41.7s ± 1%     41.7s ± 1%     ~     (p=0.264 n=29+23)
Fannkuch11-4                24.2s ± 0%     24.1s ± 1%   -0.13%  (p=0.050 n=30+30)
FmtFprintfEmpty-4           826ns ± 1%     824ns ± 1%   -0.24%  (p=0.038 n=25+30)
FmtFprintfString-4         1.38µs ± 1%    1.38µs ± 0%   -0.42%  (p=0.000 n=27+25)
FmtFprintfInt-4            1.46µs ± 1%    1.46µs ± 0%     ~     (p=0.060 n=30+23)
FmtFprintfIntInt-4         2.11µs ± 1%    2.08µs ± 0%   -1.04%  (p=0.000 n=30+30)
FmtFprintfPrefixedInt-4    2.23µs ± 1%    2.22µs ± 1%   -0.51%  (p=0.000 n=30+30)
FmtFprintfFloat-4          4.49µs ± 1%    4.48µs ± 1%   -0.22%  (p=0.004 n=26+30)
FmtManyArgs-4              8.06µs ± 1%    8.12µs ± 1%   +0.68%  (p=0.000 n=25+30)
GobDecode-4                 104ms ± 1%     104ms ± 2%     ~     (p=0.362 n=29+29)
GobEncode-4                92.9ms ± 1%    92.8ms ± 2%     ~     (p=0.786 n=30+30)
Gzip-4                      4.12s ± 1%     4.12s ± 1%     ~     (p=0.314 n=30+30)
Gunzip-4                    602ms ± 1%     603ms ± 1%     ~     (p=0.164 n=30+30)
HTTPClientServer-4          659µs ± 1%     655µs ± 2%   -0.64%  (p=0.006 n=25+28)
JSONEncode-4                234ms ± 1%     235ms ± 1%   +0.29%  (p=0.050 n=30+30)
JSONDecode-4                912ms ± 0%     911ms ± 0%     ~     (p=0.385 n=18+24)
Mandelbrot200-4            49.2ms ± 0%    41.7ms ± 0%  -15.35%  (p=0.000 n=25+27)
GoParse-4                  46.3ms ± 1%    46.3ms ± 2%     ~     (p=0.572 n=30+30)
RegexpMatchEasy0_32-4      1.29µs ± 1%    1.27µs ± 0%   -1.59%  (p=0.000 n=30+30)
RegexpMatchEasy0_1K-4      7.62µs ± 4%    7.71µs ± 3%     ~     (p=0.074 n=30+30)
RegexpMatchEasy1_32-4      1.31µs ± 0%    1.30µs ± 1%   -0.71%  (p=0.000 n=23+30)
RegexpMatchEasy1_1K-4      10.3µs ± 3%    10.3µs ± 5%     ~     (p=0.105 n=30+30)
RegexpMatchMedium_32-4     2.06µs ± 1%    2.06µs ± 1%     ~     (p=0.100 n=30+30)
RegexpMatchMedium_1K-4      533µs ± 1%     534µs ± 1%     ~     (p=0.254 n=29+30)
RegexpMatchHard_32-4       28.9µs ± 0%    28.9µs ± 0%     ~     (p=0.154 n=30+30)
RegexpMatchHard_1K-4        868µs ± 1%     867µs ± 0%     ~     (p=0.729 n=30+23)
Revcomp-4                  66.9ms ± 1%    67.2ms ± 2%     ~     (p=0.102 n=28+29)
Template-4                  1.07s ± 1%     1.06s ± 1%   -0.53%  (p=0.000 n=30+30)
TimeParse-4                7.07µs ± 1%    7.01µs ± 0%   -0.85%  (p=0.000 n=30+25)
TimeFormat-4               13.1µs ± 0%    13.2µs ± 1%   +0.77%  (p=0.000 n=27+27)
[Geo mean]                  721µs          716µs        -0.70%

name                     old speed      new speed      delta
GobDecode-4              7.38MB/s ± 1%  7.37MB/s ± 2%     ~     (p=0.399 n=29+29)
GobEncode-4              8.26MB/s ± 1%  8.27MB/s ± 2%     ~     (p=0.790 n=30+30)
Gzip-4                   4.71MB/s ± 1%  4.71MB/s ± 1%     ~     (p=0.885 n=30+30)
Gunzip-4                 32.2MB/s ± 1%  32.2MB/s ± 1%     ~     (p=0.190 n=30+30)
JSONEncode-4             8.28MB/s ± 1%  8.25MB/s ± 1%     ~     (p=0.053 n=30+30)
JSONDecode-4             2.13MB/s ± 0%  2.12MB/s ± 1%     ~     (p=0.072 n=18+30)
GoParse-4                1.25MB/s ± 1%  1.25MB/s ± 2%     ~     (p=0.863 n=30+30)
RegexpMatchEasy0_32-4    24.8MB/s ± 0%  25.2MB/s ± 1%   +1.61%  (p=0.000 n=30+30)
RegexpMatchEasy0_1K-4     134MB/s ± 4%   133MB/s ± 3%     ~     (p=0.074 n=30+30)
RegexpMatchEasy1_32-4    24.5MB/s ± 0%  24.6MB/s ± 1%   +0.72%  (p=0.000 n=23+30)
RegexpMatchEasy1_1K-4    99.1MB/s ± 3%  99.8MB/s ± 5%     ~     (p=0.105 n=30+30)
RegexpMatchMedium_32-4    483kB/s ± 1%   487kB/s ± 1%   +0.83%  (p=0.002 n=30+30)
RegexpMatchMedium_1K-4   1.92MB/s ± 1%  1.92MB/s ± 1%     ~     (p=0.058 n=30+30)
RegexpMatchHard_32-4     1.10MB/s ± 0%  1.11MB/s ± 0%     ~     (p=0.804 n=30+30)
RegexpMatchHard_1K-4     1.18MB/s ± 0%  1.18MB/s ± 0%     ~     (all equal)
Revcomp-4                38.0MB/s ± 1%  37.8MB/s ± 2%     ~     (p=0.098 n=28+29)
Template-4               1.82MB/s ± 1%  1.83MB/s ± 1%   +0.55%  (p=0.000 n=29+29)
[Geo mean]               6.79MB/s       6.79MB/s        +0.09%

Change-Id: Ia91991c2c5c59c5df712de85a83b13a21c0a554b
Reviewed-on: https://go-review.googlesource.com/63770
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-09-15 22:30:34 +00:00
isharipo 8c67f210a1 cmd/internal/obj: change Prog.From3 to RestArgs ([]Addr)
This change makes it easier to express instructions
with arbitrary number of operands.

Rationale: previous approach with operand "hiding" does
not scale well, AVX and especially AVX512 have many
instructions with 3+ operands.

x86 asm backend is updated to handle up to 6 explicit operands.
It also fixes issue with 4-th immediate operand type checks.
All `ytab` tables are updated accordingly.

Changes to non-x86 backends only include these patterns:
`p.From3 = X` => `p.SetFrom3(X)`
`p.From3.X = Y` => `p.GetFrom3().X = Y`

Over time, other backends can adapt Prog.RestArgs
and reduce the amount of workarounds.

-- Performance --

x/benchmark/build:

$ benchstat upstream.bench patched.bench
name      old time/op                 new time/op                 delta
Build-48                  21.7s ± 2%                  21.8s ± 2%   ~     (p=0.218 n=10+10)

name      old binary-size             new binary-size             delta
Build-48                  10.3M ± 0%                  10.3M ± 0%   ~     (all equal)

name      old build-time/op           new build-time/op           delta
Build-48                  21.7s ± 2%                  21.8s ± 2%   ~     (p=0.218 n=10+10)

name      old build-peak-RSS-bytes    new build-peak-RSS-bytes    delta
Build-48                  145MB ± 5%                  148MB ± 5%   ~     (p=0.218 n=10+10)

name      old build-user+sys-time/op  new build-user+sys-time/op  delta
Build-48                  21.0s ± 2%                  21.2s ± 2%   ~     (p=0.075 n=10+10)

Microbenchmark shows a slight slowdown.

name        old time/op  new time/op  delta
AMD64asm-4  49.5ms ± 1%  49.9ms ± 1%  +0.67%  (p=0.001 n=23+15)

func BenchmarkAMD64asm(b *testing.B) {
  for i := 0; i < b.N; i++ {
    TestAMD64EndToEnd(nil)
    TestAMD64Encoder(nil)
  }
}

Change-Id: I4f1d37b5c2c966da3f2127705ccac9bff0038183
Reviewed-on: https://go-review.googlesource.com/63490
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-09-15 21:05:03 +00:00
Alessandro Arzilli e1cf2be7a8 cmd/compile: fix lexical block of captured variables
Variables captured by a closure were always assigned to the root scope
in their declaration function. Using decl.Name.Defn.Pos will result in
the correct scope for both the declaration function and the capturing
function.

Fixes #21515

Change-Id: I3960aface3c4fc97e15b36191a74a7bed5b5ebc1
Reviewed-on: https://go-review.googlesource.com/56830
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-09-15 21:02:59 +00:00
David Crawshaw 27e80f7c4d cmd/compile: replace GOROOT in //line directives
The compiler replaces any path of the form /path/to/goroot/src/net/port.go
with GOROOT/src/net/port.go so that the same object file is
produced if the GOROOT is moved. It was skipping this transformation
for any absolute path into the GOROOT that came from //line directives,
such as those generated by cmd/cgo.

Fixes #21373
Fixes #21720
Fixes #21825

Change-Id: I2784c701b4391cfb92e23efbcb091a84957d61dd
Reviewed-on: https://go-review.googlesource.com/63693
Run-TryBot: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-09-15 17:18:43 +00:00
Todd Neal af86083812 cmd/compile: fix typo in floating point rule
Change-Id: Idfb64fcb26f48d5b70bab872f9a3d96a036be681
Reviewed-on: https://go-review.googlesource.com/63950
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-09-15 03:07:43 +00:00
Kunpei Sakai 5a986eca86 all: fix article typos
a -> an

Change-Id: I7362bdc199e83073a712be657f5d9ba16df3077e
Reviewed-on: https://go-review.googlesource.com/63850
Reviewed-by: Rob Pike <r@golang.org>
2017-09-15 02:39:16 +00:00
Michael Munday 95b146e8eb cmd/compile: improve floating point constant propagation
Add generic rules to propagate floating point constants through
comparisons and integer conversions. These new rules seldom trigger
in the standard library so there is no performance change, however
I think it is worth adding them anyway for completeness.

Change-Id: I9db5222746508a2996f1cafb72f4e0cf2541de07
Reviewed-on: https://go-review.googlesource.com/63795
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-09-14 23:08:33 +00:00
Lynn Boger 40e25895e3 cmd/compile,math: improve int<->float conversions on ppc64x
The functions Float64bits and Float64frombits perform
poorly on ppc64x because the int<->float conversions
often result in load and store sequences to handle the
type change. This patch adds more rules to recognize
those sequences and use register to register moves
and avoid unnecessary loads and stores where possible.

There were some existing rules to improve these conversions,
but this provides additional improvements. Included here:

- New instruction FCFIDS to improve on conversion to 32 bit
- Rename Xf2i64 and Xi2f64 as MTVSRD, MFVSRD, to match the asm
- Add rules to lower some of the load/store sequences for
- Added new go asm to ppc64.s testcase.
conversions

Improvements:

BenchmarkAbs-16                2.16          0.93          -56.94%
BenchmarkCopysign-16           2.66          1.18          -55.64%
BenchmarkRound-16              4.82          2.69          -44.19%
BenchmarkSignbit-16            1.71          1.14          -33.33%
BenchmarkFrexp-16              11.4          7.94          -30.35%
BenchmarkLogb-16               10.4          7.34          -29.42%
BenchmarkLdexp-16              15.7          11.2          -28.66%
BenchmarkIlogb-16              10.2          7.32          -28.24%
BenchmarkPowInt-16             69.6          55.9          -19.68%
BenchmarkModf-16               10.1          8.19          -18.91%
BenchmarkLog2-16               17.4          14.3          -17.82%
BenchmarkCbrt-16               45.0          37.3          -17.11%
BenchmarkAtanh-16              57.6          48.3          -16.15%
BenchmarkRemainder-16          76.6          65.4          -14.62%
BenchmarkGamma-16              26.0          22.5          -13.46%
BenchmarkPowFrac-16            197           174           -11.68%
BenchmarkMod-16                112           99.8          -10.89%
BenchmarkAsinh-16              59.9          53.7          -10.35%
BenchmarkAcosh-16              44.8          40.3          -10.04%

Updates #21390

Change-Id: I56cc991fc2e55249d69518d4e1ba76cc23904e35
Reviewed-on: https://go-review.googlesource.com/63290
Reviewed-by: Michael Munday <mike.munday@ibm.com>
2017-09-14 12:14:00 +00:00
Daniel Martí f351dbfa4d cmd/compile: expand inlining test to multiple pkgs
Rework the test to work with any number of std packages. This was done
to include a few funcs from unicode/utf8. Adding more will be much
simpler too.

While at it, add more runtime funcs by searching for "inlined" or
"inlining" in the git log of its directory. These are: addb, subtractb,
fastrand and noescape.

Updates #21851.

Change-Id: I4fb2bd8aa6a5054218f9b36cb19d897ac533710e
Reviewed-on: https://go-review.googlesource.com/63611
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-14 04:49:58 +00:00
zhongtao.chen 99414a5b1d cmd/compile: limit the number of simultaneously opened files to avoid EMFILE/ENFILE errors
If the Go packages with enough source files,it will cause EMFILE/ENFILE error,
Fix this by limiting the number of simultaneously opened files.

Fixes #21621

Change-Id: I8555d79242d2f90771e37e073b7540fc7194a64a
Reviewed-on: https://go-review.googlesource.com/57751
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-14 04:02:44 +00:00
Matthew Dempsky c64e793850 cmd/compile: simplify exporting ONAME nodes
These two special cases are unnecessary:

1) "~b%d" references only appear during walk, to handle "return"
statements implicitly assigning to blank result parameters. Even if
they could appear, the "inlined and customized version" accidentally
diverged from p.sym in golang.org/cl/33911.

2) The Vargen case is already identical to the default case, and it
never overlaps with the remaining "T.method" case.

Passes toolstash-check.

Change-Id: I03f7e5b75b707b43afc8ed6eb90f43ba93ed17ae
Reviewed-on: https://go-review.googlesource.com/63272
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
2017-09-13 22:07:35 +00:00
Matthew Dempsky 577967799c cmd/compile: simplify exporting OTYPE nodes
We only export packages that typechecked successfully, and OTYPE nodes
will always have their Type field set.

Changes the package export format, but only in the compiler-specific
section. No version bump necessary.

Change-Id: I722f5827e73948fceb0432bc8b3b22471fea8f61
Reviewed-on: https://go-review.googlesource.com/63273
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
2017-09-13 18:31:16 +00:00
Daniel Martí 0d8a3b208c cmd/compile: add TestIntendedInlining from runtime
Move it from the runtime package, as we will soon add more packages and
functions for it to check.

The test used the testEnv func, which cleaned certain environment
variables from a command, so it was moved to internal/testenv under a
more descriptive (and less ambiguous) name. Add a simple godoc to it
too.

For #21851.

Change-Id: I6f39c1f23b45377718355fafe66ffd87047d8ab6
Reviewed-on: https://go-review.googlesource.com/63550
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2017-09-13 18:10:31 +00:00
Martin Möhrmann 1d3ad6733e runtime: refactor hmap.extra.overflow array into two separate fields
This makes it easier to deduce from the field names which overflow
field corresponds to h.buckets and which to h.oldbuckets by aligning
the naming with the buckets fields in hmap.

Change-Id: I8d6a729229a190db0212bac012ead1a3c13cf5d0
Reviewed-on: https://go-review.googlesource.com/62411
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-09-13 06:43:01 +00:00
Daniel Martí 6d33df1d65 cmd/compile: remove redundant switch label
This label was added automatically by grind to remove gotos. As of
today, it's completely useless, as none of its uses need a label to
begin with.

While at it, remove all the redundant breaks too. Leave those that are
the single statement in a case clause body, as that's the style used
throughout std and cmd to clarify when cases are empty.

Change-Id: I3e20068b66b759614e903beab1cc9b2709b31063
Reviewed-on: https://go-review.googlesource.com/62950
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-12 19:39:46 +00:00
Martin Möhrmann 137e4a6c63 cmd/compile: improve single blank variable handling in walkrange
Refactor walkrange to treat "for _ = range a" as "for range a".

This avoids generating some later discarded nodes in the compiler.

Passes toolstash -cmp.

Change-Id: Ifb2e1ca3b8519cbb67e8ad5aad514af9d18f1ec4
Reviewed-on: https://go-review.googlesource.com/61017
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-09-12 05:50:54 +00:00
Daniel Martí 27a70ea560 cmd/compile: simplify a few early var declarations
These were likely written in C or added by an automated tool. Either
way, they're unnecessary now. Clean up the code.

Change-Id: I56de2c7bb60ebab8c500803a8b6586bdf4bf75c7
Reviewed-on: https://go-review.googlesource.com/62951
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-11 21:16:39 +00:00
Lynn Boger fa3fe2e3c6 cmd/compile, math/bits: add rotate rules to PPC64.rules
This adds rules to match the code in math/bits RotateLeft,
RotateLeft32, and RotateLef64 to allow them to be inlined.

The rules are complicated because the code in these function
use different types, and the non-const version of these
shifts generate Mask and Carry instructions that become
subexpressions during the match process.

Also adds a testcase to asm_test.go.

Improvement in math/bits:

BenchmarkRotateLeft-16       1.57     1.32      -15.92%
BenchmarkRotateLeft32-16     1.60     1.37      -14.37%
BenchmarkRotateLeft64-16     1.57     1.32      -15.92%

Updates #21390

Change-Id: Ib6f17669ecc9cab54f18d690be27e2225ca654a4
Reviewed-on: https://go-review.googlesource.com/59932
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-09-11 20:44:22 +00:00
Lynn Boger b74b43de68 cmd/compile: request r12 for indirect calls on ppc64le
On ppc64le, functions compiled with -shared expect r12 to
hold the function's address for indirect calls. Previously
this was enforced by generating a move instruction if the
address wasn't already in r12. This change avoids that extra
move by requesting r12 in the CALL ops that do indirect calls.

As a result of adding support for plugins on ppc64le, it was
discovered that there would be more cases where this extra
move was needed, so this seemed like a better solution.

Updates #20756

Change-Id: I6770885a46990f78c6d2902a715dcdaa822192a1
Reviewed-on: https://go-review.googlesource.com/62890
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-09-11 18:02:33 +00:00
Ben Shi 2899c3e8cb cmd/compile: optimize ARM code with NMULF/NMULD
NMULF and NMULD are efficient FP instructions, and the go compiler can
use them to generate better code.

The benchmark tests of my patch did not show general change, but big
improvement in special cases.

1.A special test case improved 12.6%.
https://github.com/benshi001/ugo1/blob/master/fpmul_test.go
name                     old time/op    new time/op    delta
FPMul-4                     398µs ± 1%     348µs ± 1%  -12.64%  (p=0.000 n=40+40)

2. the compilecmp test showed little change.
name        old time/op       new time/op       delta
Template          2.30s ± 1%        2.31s ± 1%    ~     (p=0.754 n=17+19)
Unicode           1.31s ± 3%        1.32s ± 5%    ~     (p=0.265 n=20+20)
GoTypes           7.73s ± 2%        7.73s ± 1%    ~     (p=0.925 n=20+20)
Compiler          37.0s ± 1%        37.3s ± 2%  +0.79%  (p=0.002 n=19+20)
SSA               83.8s ± 4%        83.5s ± 2%    ~     (p=0.964 n=20+17)
Flate             1.43s ± 2%        1.44s ± 1%    ~     (p=0.602 n=20+20)
GoParser          1.82s ± 2%        1.81s ± 2%    ~     (p=0.141 n=19+20)
Reflect           5.08s ± 2%        5.08s ± 3%    ~     (p=0.835 n=20+19)
Tar               2.36s ± 1%        2.35s ± 1%    ~     (p=0.195 n=18+17)
XML               2.57s ± 2%        2.56s ± 1%    ~     (p=0.283 n=20+17)
[Geo mean]        4.74s             4.75s       +0.05%

name        old user-time/op  new user-time/op  delta
Template          2.75s ± 2%        2.75s ± 0%    ~     (p=0.620 n=20+15)
Unicode           1.59s ± 4%        1.60s ± 4%    ~     (p=0.479 n=20+19)
GoTypes           9.48s ± 1%        9.47s ± 1%    ~     (p=0.743 n=20+20)
Compiler          45.7s ± 1%        45.7s ± 1%    ~     (p=0.482 n=19+20)
SSA                109s ± 1%         109s ± 2%    ~     (p=0.800 n=18+20)
Flate             1.67s ± 3%        1.67s ± 3%    ~     (p=0.598 n=19+18)
GoParser          2.15s ± 4%        2.13s ± 3%    ~     (p=0.153 n=20+20)
Reflect           5.95s ± 2%        5.95s ± 2%    ~     (p=0.961 n=19+20)
Tar               2.93s ± 2%        2.92s ± 3%    ~     (p=0.242 n=20+19)
XML               3.02s ± 3%        3.04s ± 3%    ~     (p=0.233 n=19+18)
[Geo mean]        5.74s             5.74s       -0.04%

name        old text-bytes    new text-bytes    delta
HelloSize         588kB ± 0%        588kB ± 0%    ~     (all equal)

name        old data-bytes    new data-bytes    delta
HelloSize        5.46kB ± 0%       5.46kB ± 0%    ~     (all equal)

name        old bss-bytes     new bss-bytes     delta
HelloSize        72.9kB ± 0%       72.9kB ± 0%    ~     (all equal)

name        old exe-bytes     new exe-bytes     delta
HelloSize        1.03MB ± 0%       1.03MB ± 0%    ~     (all equal)

3. The go1 benchmark showed little change in total.
name                     old time/op    new time/op    delta
BinaryTree17-4              41.8s ± 1%     41.8s ± 1%    ~     (p=0.388 n=40+39)
Fannkuch11-4                24.1s ± 1%     24.1s ± 1%    ~     (p=0.077 n=40+40)
FmtFprintfEmpty-4           834ns ± 1%     831ns ± 1%  -0.31%  (p=0.002 n=40+37)
FmtFprintfString-4         1.34µs ± 1%    1.34µs ± 0%    ~     (p=0.387 n=40+40)
FmtFprintfInt-4            1.44µs ± 1%    1.44µs ± 1%    ~     (p=0.421 n=40+40)
FmtFprintfIntInt-4         2.09µs ± 0%    2.09µs ± 1%    ~     (p=0.589 n=40+39)
FmtFprintfPrefixedInt-4    2.32µs ± 1%    2.33µs ± 1%  +0.15%  (p=0.001 n=40+40)
FmtFprintfFloat-4          4.51µs ± 0%    4.44µs ± 1%  -1.50%  (p=0.000 n=40+40)
FmtManyArgs-4              7.94µs ± 0%    7.97µs ± 0%  +0.36%  (p=0.001 n=32+40)
GobDecode-4                 104ms ± 1%     102ms ± 2%  -1.27%  (p=0.000 n=39+37)
GobEncode-4                90.5ms ± 1%    90.9ms ± 2%  +0.40%  (p=0.006 n=37+40)
Gzip-4                      4.10s ± 2%     4.08s ± 1%  -0.30%  (p=0.004 n=40+40)
Gunzip-4                    603ms ± 0%     602ms ± 1%    ~     (p=0.303 n=37+40)
HTTPClientServer-4          672µs ± 3%     658µs ± 2%  -2.08%  (p=0.000 n=39+37)
JSONEncode-4                238ms ± 1%     239ms ± 0%  +0.26%  (p=0.001 n=40+25)
JSONDecode-4                884ms ± 1%     885ms ± 1%  +0.16%  (p=0.012 n=40+40)
Mandelbrot200-4            49.3ms ± 0%    49.3ms ± 0%    ~     (p=0.588 n=40+38)
GoParse-4                  46.3ms ± 1%    46.4ms ± 2%    ~     (p=0.487 n=40+40)
RegexpMatchEasy0_32-4      1.28µs ± 1%    1.28µs ± 0%  +0.12%  (p=0.003 n=40+40)
RegexpMatchEasy0_1K-4      7.78µs ± 5%    7.78µs ± 4%    ~     (p=0.825 n=40+40)
RegexpMatchEasy1_32-4      1.29µs ± 1%    1.29µs ± 0%    ~     (p=0.659 n=40+40)
RegexpMatchEasy1_1K-4      10.3µs ± 3%    10.4µs ± 2%    ~     (p=0.266 n=40+40)
RegexpMatchMedium_32-4     2.05µs ± 1%    2.05µs ± 0%  -0.18%  (p=0.002 n=40+28)
RegexpMatchMedium_1K-4      533µs ± 1%     534µs ± 1%    ~     (p=0.397 n=37+40)
RegexpMatchHard_32-4       28.9µs ± 1%    28.9µs ± 1%  -0.22%  (p=0.002 n=40+40)
RegexpMatchHard_1K-4        868µs ± 1%     870µs ± 1%  +0.21%  (p=0.015 n=40+40)
Revcomp-4                  67.3ms ± 1%    67.2ms ± 2%    ~     (p=0.262 n=38+39)
Template-4                  1.07s ± 1%     1.07s ± 1%    ~     (p=0.276 n=40+40)
TimeParse-4                7.16µs ± 1%    7.16µs ± 1%    ~     (p=0.610 n=39+40)
TimeFormat-4               13.3µs ± 1%    13.3µs ± 1%    ~     (p=0.617 n=38+40)
[Geo mean]                  720µs          719µs       -0.13%

name                     old speed      new speed      delta
GobDecode-4              7.39MB/s ± 1%  7.49MB/s ± 2%  +1.25%  (p=0.000 n=39+38)
GobEncode-4              8.48MB/s ± 1%  8.45MB/s ± 2%  -0.40%  (p=0.005 n=37+40)
Gzip-4                   4.74MB/s ± 2%  4.75MB/s ± 1%  +0.30%  (p=0.018 n=40+40)
Gunzip-4                 32.2MB/s ± 0%  32.2MB/s ± 1%    ~     (p=0.272 n=36+40)
JSONEncode-4             8.15MB/s ± 1%  8.13MB/s ± 0%  -0.26%  (p=0.003 n=40+25)
JSONDecode-4             2.19MB/s ± 1%  2.19MB/s ± 1%    ~     (p=0.676 n=40+40)
GoParse-4                1.25MB/s ± 2%  1.25MB/s ± 2%    ~     (p=0.823 n=40+40)
RegexpMatchEasy0_32-4    25.1MB/s ± 1%  25.1MB/s ± 0%  -0.12%  (p=0.006 n=40+40)
RegexpMatchEasy0_1K-4     132MB/s ± 5%   132MB/s ± 5%    ~     (p=0.821 n=40+40)
RegexpMatchEasy1_32-4    24.7MB/s ± 1%  24.7MB/s ± 0%    ~     (p=0.630 n=40+40)
RegexpMatchEasy1_1K-4    99.1MB/s ± 3%  98.8MB/s ± 2%    ~     (p=0.268 n=40+40)
RegexpMatchMedium_32-4    487kB/s ± 2%   490kB/s ± 0%  +0.51%  (p=0.001 n=40+40)
RegexpMatchMedium_1K-4   1.92MB/s ± 1%  1.92MB/s ± 1%    ~     (p=0.208 n=39+40)
RegexpMatchHard_32-4     1.11MB/s ± 1%  1.11MB/s ± 0%  +0.36%  (p=0.000 n=40+33)
RegexpMatchHard_1K-4     1.18MB/s ± 1%  1.18MB/s ± 1%    ~     (p=0.207 n=40+37)
Revcomp-4                37.8MB/s ± 1%  37.8MB/s ± 2%    ~     (p=0.276 n=38+39)
Template-4               1.82MB/s ± 1%  1.81MB/s ± 1%    ~     (p=0.122 n=38+40)
[Geo mean]               6.81MB/s       6.81MB/s       +0.06%

fixes #19843

Change-Id: Ief3a0c2b15f59d40c7b40f2784eeb71196685b59
Reviewed-on: https://go-review.googlesource.com/61150
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-11 13:10:33 +00:00
Martin Möhrmann 098126103e cmd/compile: preserve escape information for map literals
While some map literals were marked non-escaping that information
was lost when creating the corresponding OMAKE node which made map
literals always heap allocated.

Copying the escape information to the corresponding OMAKE node allows
stack allocation of hmap and a map bucket for non escaping map literals.

Fixes #21830

Change-Id: Ife0b020fffbc513f1ac009352f2ecb110d6889c9
Reviewed-on: https://go-review.googlesource.com/62790
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-11 05:54:46 +00:00
Matthew Dempsky b65cffdcd8 cmd/compile: slightly more idiomatic println code
Updates #21808.

Change-Id: I0314426afcfeed17b1111040110d7f2b0e209526
Reviewed-on: https://go-review.googlesource.com/62430
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-08 23:58:57 +00:00
Keith Randall 02deb77f6d cmd/compile: fix println()
println with no arguments accidentally doesn't print a newline.

Introduced at CL 55097

Fixes #21808

Change-Id: I9fc7b4271b9b31e4c9b6078f055195dc3907b62c
Reviewed-on: https://go-review.googlesource.com/62390
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-08 20:10:48 +00:00
Michael Munday 9da29b687f cmd/compile: propagate constants through math.Float{32,64}{,from}bits
This CL adds generic SSA rules to propagate constants through raw bits
conversions between floats and integers. This allows constants to
propagate through some math functions. For example, math.Copysign(0, -1)
is now constant folded to a load of -0.0.

Requires a fix to the ARM assembler which loaded -0.0 as +0.0.

Change-Id: I52649a4691077c7414f19d17bb599a6743c23ac2
Reviewed-on: https://go-review.googlesource.com/62250
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-09-08 17:24:03 +00:00
Matthew Dempsky 13772edbbd cmd/compile: simplify exporting universal 'error' type
There shouldn't be any problems setting error's "Orig" (underlying)
type to a separate anonymous interface, as this is already how
go/types defines it.

Change-Id: I44e9c4048ffe362ce329e8306632e38b5ccfecff
Reviewed-on: https://go-review.googlesource.com/61790
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Alan Donovan <adonovan@google.com>
2017-09-06 20:14:17 +00:00
Martin Möhrmann 6c102e141c cmd/compile: avoid stack allocation of a map bucket for large constant hints
runtime.makemap will allocate map buckets on the heap for hints larger
than the number of elements a single map bucket can hold.

Do not allocate any map bucket on the stack if it is known at compile time
that hint is larger than the number of elements one map bucket can hold.
This avoids zeroing and reserving memory on the stack that will not be used.

Change-Id: I1a5ab853fb16f6a18d67674a77701bf0cf29b550
Reviewed-on: https://go-review.googlesource.com/60450
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-09-06 04:38:12 +00:00
Martin Möhrmann 9c3f268558 cmd/compile: set hiter type for map iterator in order pass
Previously the type was first set to uint8 and then corrected
later in walkrange.

Change-Id: I9e4b597710e8a5fad39dde035df85676bc8d2874
Reviewed-on: https://go-review.googlesource.com/61032
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-09-05 21:00:59 +00:00