Commit Graph

11738 Commits

Author SHA1 Message Date
Daniel Martí 19ee2ef950 cmd/compile: introduce gc.Node.copy method
When making a shallow copy of a node, various methods were used,
including calling nod(OXXX, nil, nil) and then overwriting it, or
"n1 := *n" and then using &n1.

Add a copy method instead, simplifying all of those and making them
consistent.

Passes toolstash -cmp on std cmd.

Change-Id: I3f3fc88bad708edc712bf6d87214cda4ddc43b01
Reviewed-on: https://go-review.googlesource.com/72710
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-04-03 12:08:39 +00:00
Giovanni Bajo 321bd8c93b cmd/compile: in prove, simplify logic of branch pushing
prove used a complex logic when trying to prove branch conditions:
tryPushBranch() was sometimes leaving a checkpoint on the factsTable,
sometimes not, and the caller was supposed to check the return value
to know what to do.

Since we're going to make the prove descend logic a little bit more
complex by adding also induction variables, simplify the tryPushBranch
logic, by removing any factsTable checkpoint handling from it.

Passes toolstash -cmp.

Change-Id: Idfb1703df8a455f612f93158328b36c461560781
Reviewed-on: https://go-review.googlesource.com/104035
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-04-03 11:14:26 +00:00
Matthew Dempsky 26e0e8a840 cmd/compile: improve declaration position precision
Previously, n.Pos was reassigned to lineno when declare was called,
which might not match where the identifier actually appeared in the
source. This caused a loss of position precision for function
parameters (which were all declared at the last parameter's position),
and required some clumsy workarounds in bimport.go.

This CL changes declare to leave n.Pos alone and also fixes a few
places where n.Pos was not being set correctly.

Change-Id: Ibe5b5fd30609c684367207df701f9a1bfa82867f
Reviewed-on: https://go-review.googlesource.com/104275
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-04-03 05:45:44 +00:00
Robert Griesemer c65a2781be cmd/compile: better handling of incorrect type switches
Don't report errors if we don't have a correct type switch
guard; instead ignore it and leave it to the type-checker
to report the error. This leads to better error messages
concentrating on the type switch guard rather than errors
around (confusing) syntactic details.

Also clean up some code setting up AssertExpr (they never
have a nil Type field) and remove some incorrect TODOs.

Fixes #24470.

Change-Id: I69512f36e0417e3b5ea9c8856768e04b19d654a8
Reviewed-on: https://go-review.googlesource.com/103615
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-04-03 05:34:20 +00:00
quasilyte da02dcda02 cmd/link/internal/ld: make Thearch unexported
s/Thearch/thearch/

This reduces the amount of exported global variables,
which in turn could make it easier to refactor them later.

Also updated somewhat vague comment about ld.Thearch.
There is no need for Thearch to be exported as Archinit is
called by ld.Main.

Updates #22095

Change-Id: I266b291f6eac0165f70c51964738206e066cea08
Reviewed-on: https://go-review.googlesource.com/103878
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-02 23:39:16 +00:00
Matthew Dempsky fac7d5dd95 cmd/compile: simplify exportsym debug message
No need to disambiguate if we're exporting or reexporting, because
it's obvious from the output.

Change-Id: I59053d34dc6f8b29e20749c7b03c3cb4f4d641ff
Reviewed-on: https://go-review.googlesource.com/104236
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-04-02 23:19:52 +00:00
Matthew Dempsky ce1252a610 cmd/compile: simplify exportsym flags and logic
We used to have three Sym flags for dealing with export/reexport:
Export, Package, and Exported.

Export and Package were used to distinguish whether a symbol is
exported or package-scope (i.e., mutually exclusive), except that for
local declarations Export served double-duty as tracking whether the
symbol had been added to exportlist.

Meanwhile, imported declarations that needed reexporting could be
added to exportlist multiple times, necessitating a flag to track
whether they'd already been written out by exporter.

Simplify all of these into a single OnExportList flag so that we can
ensure symbols on exportlist are present exactly once. Merge
reexportsym into exportsym so there's a single place where we append
to exportlist.

Code that used to set Exported to prevent a symbol from being exported
can now just set OnExportList before calling declare to prevent it
from even appearing on exportlist.

Lastly, drop the IsAlias check in exportsym: we call exportsym too
early for local symbols to detect if they're an alias, and we never
reexport aliases.

Passes toolstash-check.

Change-Id: Icdea3719105dc169fcd7651606589cd08b0a80ff
Reviewed-on: https://go-review.googlesource.com/103865
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-02 23:12:53 +00:00
Matthew Dempsky 096d96779a cmd/compile: cleanup Order.cleanTempNoPop slightly
Passes toolstash-check.

Change-Id: Ia769e719e89e508201711775ea3e2cb3979387fa
Reviewed-on: https://go-review.googlesource.com/102215
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2018-04-02 22:58:09 +00:00
Matthew Dempsky bcc8edfd8a cmd/compile: simplify reexport logic
Currently, we reexport any package-scope constant, function, type, or
variable declarations needed by an inlineable function body. However,
now that we have an early pass to walk inlineable function bodies
(golang.org/cl/74110), we can simplify the logic for finding these
declarations.

The binary export format supports writing out type declarations
in-place at their first use. Also, it always writes out constants by
value, so their declarations never need to be reexported.

Notably, we attempted this before (golang.org/cl/36170) and had to
revert it (golang.org/cl/45911). However, this was because while
writing out inline bodies, we could discover variable/function
dependencies. By collecting variable/function dependencies during
inlineable function discovery, we avoid this problem.

While here, get rid of isInlineable. We already typecheck inlineable
function bodies during inlFlood, so it's become a no-op. Just move the
comment explaining parameter numbering to its caller.

Change-Id: Ibbfaafce793733675d3a2ad98791758583055666
Reviewed-on: https://go-review.googlesource.com/103864
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-04-02 22:41:56 +00:00
Matthew Dempsky c0841ecd87 cmd/compile: disable instrumentation for no-race packages earlier
Rather than checking for each function whether the package supports
instrumentation, check once up front.

Relatedly, tweak the logic for preventing inlining calls to runtime
functions from instrumented packages. Previously, we simply disallowed
inlining runtime functions altogether when instrumenting. With this
CL, it's only disallowed from packages that are actually being
instrumented. That is, now intra-runtime calls can be inlined.

Updates #19054.

Change-Id: I88c97b48bf70193a8a3ee18d952dcb26b0369d55
Reviewed-on: https://go-review.googlesource.com/102815
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-02 21:50:05 +00:00
Matthew Dempsky 45ce10fa3a cmd/compile: use newfuncname in dclfunc
Eliminates an inconsistency between user functions and generated
functions.

Passes toolstash-check.

Change-Id: I946b511ca81d88a0024b5932cb50f3d8b9e808f4
Reviewed-on: https://go-review.googlesource.com/103863
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-04-02 19:30:39 +00:00
Daniel Martí 2722650415 cmd: remove some unused parameters
Change-Id: I9d2a4b8df324897e264d30801e95ddc0f0e75f3a
Reviewed-on: https://go-review.googlesource.com/102337
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-04-02 15:51:31 +00:00
Matthew Dempsky 0250ef910f cmd/compile: refactor constant rewriting
Extract all rewrite-to-OLITERAL expressions to use a single setconst
helper function.

Does not pass toolstash-check for two reasons:

1) We now consistently clear Left/Right/etc when rewriting Nodes into
OLITERALs, which results in their inlining complexity being correctly
computed. So more functions can now be inlined.

2) We preserve Pos, so PC line tables change somewhat.

Change-Id: I2b5c293bee7c69c2ccd704677f5aba4ec40e3155
Reviewed-on: https://go-review.googlesource.com/103860
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2018-04-02 04:09:29 +00:00
Matthew Dempsky fdf33730e1 cmd/compile: refactor constant node constructors
Passes toolstash-check.

Change-Id: I6a2d46e69d4d3a06858c80c4ea1ad3f5a58f6956
Reviewed-on: https://go-review.googlesource.com/103859
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-01 06:55:54 +00:00
Bryan Chan 625f2dccd4 cmd/compile/internal/ssa: handle symbol address comparisons consistently
CL 38338 introduced SSA rules to optimize two types of pointer equality
tests: a pointer compared with itself, and comparison of addresses taken
of two symbols which may have the same base. This patch adds rules to
apply the same optimization to pointer inequality tests, which also ensures
that two pointers to zero-width types cannot be both equal and unequal
at the same time.

Fixes #24503.

Change-Id: Ic828aeb86ae2e680caf66c35f4c247674768a9ba
Reviewed-on: https://go-review.googlesource.com/102275
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-31 21:37:13 +00:00
Daniel Martí ec60e4a0d2 cmd/compile/internal/ssa: add initial README
This is the first version of an introductory document that should help
developers who want to get started with this package.

I recently started poking around this part of the compiler, and was
confused by a few basic ideas such as memory arguments. I also hadn't
heard about GOSSAFUNC until another developer pointed it out. Both of
those are essential if one wants to do any non-trivial work here.

This document can of course be expanded with more pointers and tips to
better understand this package's code and behavior. Its intent is not to
cover all of its features; but it should be enough for most developers
to start playing with it without extensive compiler experience.

Change-Id: Ifd2d047fbd038ab50f4625a15c4d49932b42fd66
Reviewed-on: https://go-review.googlesource.com/99715
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-03-31 20:18:09 +00:00
Josh Bleecher Snyder fa3e9d27f3 cmd/compile: compile all functions concurrently
CL 40693 added concurrent backend compilation support,
and used it for user-provided functions.
Autogenerated functions were still compiled serially.
This CL brings them into the fold.
As of this CL, when requested,
no functions are compiled serially.

There generally aren't many autogenerated functions.
When there are, this CL can help a lot,
because autogenerated functions are usually short.
Many short functions is the best case scenario
for concurrent compilation; see CL 41192.

One example of such a package comes from Dave Cheney's benchjuju:
github.com/juju/govmomi/vim25/types.
It has thousands of autogenerated functions.
This CL improves performance on the entire benchmark
by around a second on my machine at c=8, or about ~5%.

Updates #15756

Change-Id: Ia21e302b2469a9ed743df02244ec7ebde55b32f3
Reviewed-on: https://go-review.googlesource.com/41503
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-03-30 21:38:50 +00:00
Richard Musiol a9ba3e30ac cmd/compile: add SSA config options useAvg and useHmul
This commit allows architectures to disable optimizations that need the
Avg* and Hmul* operations.

WebAssembly has no such operations, so using them as an optimization
but then having to emulate them with multiple instructions makes no
sense, especially since the WebAssembly compiler may do the same
optimizations internally.

Updates #18892

Change-Id: If57b59e3235482a9e0ec334a7312b3e3b5fc2b61
Reviewed-on: https://go-review.googlesource.com/103256
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-03-30 21:34:26 +00:00
Richard Musiol 80e69220c8 go/build, go/types, cmd/dist: add js/wasm architecture
This is the first commit of a series that will add WebAssembly
as an architecture target. The design document can be found at
https://docs.google.com/document/d/131vjr4DH6JFnb-blm_uRdaC0_Nv3OUwjEY5qVCxCup4.

The GOARCH name "wasm" is the official abbreviation of WebAssembly.
The GOOS name "js" got chosen because initially the host environment
that executes WebAssembly bytecode will be web browsers and Node.js,
which both use JavaScript to embed WebAssembly. Other GOOS values
may be possible later, see:
https://github.com/WebAssembly/design/blob/master/NonWeb.md

Updates #18892

Change-Id: Ia25b4fa26bba8029c25923f48ad009fd3681933a
Reviewed-on: https://go-review.googlesource.com/102835
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-30 21:34:18 +00:00
Josh Bleecher Snyder b638760dad cmd/compile: consider full number of struct components to deciding on inlining ==
Change-Id: I6bfbbce2ec5dfc7f9f99dbd82e51c2b0edacc87a
Reviewed-on: https://go-review.googlesource.com/59334
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-30 19:04:43 +00:00
Alberto Donizetti 3b0b8bcd68 test/codegen: port stack-related tests to codegen
And delete them from asm_test.

Change-Id: Idfe1249052d82d15b9c30b292c78656a0bf7b48d
Reviewed-on: https://go-review.googlesource.com/103315
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-30 08:08:06 +00:00
Matthew Dempsky 64bf90576e cmd/compile: refactor memory op constructors in SSA builder
Pulling these operations out separately so it's easier to add
instrumentation calls.

Passes toolstash-check.

Updates #19054.

Change-Id: If6a537124a87bac2eceff1d66d9df5ebb3bf07be
Reviewed-on: https://go-review.googlesource.com/102816
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-29 20:51:13 +00:00
Josh Bleecher Snyder 014a9048d4 cmd/compile: prefer to evict a rematerializable register
This resolves a long-standing regalloc TODO:
If you must evict a register, choose to evict a register
containing a rematerializable value, since that value
won't need to be spilled.

Provides very minor performance and size improvements.

name                     old time/op    new time/op    delta
BinaryTree17-8              2.20s ± 3%     2.18s ± 2%  -0.77%  (p=0.000 n=45+49)
Fannkuch11-8                2.14s ± 2%     2.15s ± 2%  +0.73%  (p=0.000 n=43+44)
FmtFprintfEmpty-8          30.6ns ± 4%    30.2ns ± 3%  -1.14%  (p=0.000 n=50+48)
FmtFprintfString-8         54.5ns ± 6%    53.6ns ± 5%  -1.64%  (p=0.001 n=50+48)
FmtFprintfInt-8            58.0ns ± 7%    57.6ns ± 4%    ~     (p=0.220 n=50+50)
FmtFprintfIntInt-8         85.3ns ± 2%    84.8ns ± 3%  -0.62%  (p=0.001 n=44+47)
FmtFprintfPrefixedInt-8    93.9ns ± 6%    93.6ns ± 5%    ~     (p=0.706 n=50+48)
FmtFprintfFloat-8           178ns ± 4%     177ns ± 4%    ~     (p=0.107 n=49+50)
FmtManyArgs-8               376ns ± 4%     374ns ± 3%  -0.58%  (p=0.013 n=45+50)
GobDecode-8                4.77ms ± 2%    4.76ms ± 3%    ~     (p=0.059 n=47+46)
GobEncode-8                4.04ms ± 2%    3.99ms ± 3%  -1.13%  (p=0.000 n=49+49)
Gzip-8                      177ms ± 2%     180ms ± 3%  +1.43%  (p=0.000 n=48+48)
Gunzip-8                   28.5ms ± 6%    28.3ms ± 5%    ~     (p=0.104 n=50+49)
HTTPClientServer-8         72.1µs ± 1%    72.0µs ± 1%  -0.15%  (p=0.042 n=48+42)
JSONEncode-8               9.81ms ± 5%   10.03ms ± 6%  +2.29%  (p=0.000 n=50+49)
JSONDecode-8               39.2ms ± 3%    39.3ms ± 2%    ~     (p=0.095 n=49+49)
Mandelbrot200-8            3.48ms ± 2%    3.46ms ± 2%  -0.80%  (p=0.000 n=47+48)
GoParse-8                  2.54ms ± 3%    2.51ms ± 3%  -1.35%  (p=0.000 n=49+49)
RegexpMatchEasy0_32-8      66.0ns ± 7%    65.7ns ± 8%    ~     (p=0.331 n=50+50)
RegexpMatchEasy0_1K-8       155ns ± 4%     154ns ± 4%    ~     (p=0.986 n=49+50)
RegexpMatchEasy1_32-8      62.6ns ± 8%    62.2ns ± 5%    ~     (p=0.395 n=50+49)
RegexpMatchEasy1_1K-8       260ns ± 5%     255ns ± 3%  -1.92%  (p=0.000 n=49+49)
RegexpMatchMedium_32-8     92.9ns ± 2%    91.8ns ± 2%  -1.25%  (p=0.000 n=46+48)
RegexpMatchMedium_1K-8     27.7µs ± 3%    27.0µs ± 2%  -2.59%  (p=0.000 n=49+49)
RegexpMatchHard_32-8       1.23µs ± 4%    1.21µs ± 2%  -2.16%  (p=0.000 n=49+44)
RegexpMatchHard_1K-8       36.4µs ± 2%    35.7µs ± 2%  -1.87%  (p=0.000 n=48+49)
Revcomp-8                   274ms ± 2%     276ms ± 3%  +0.70%  (p=0.034 n=45+48)
Template-8                 45.1ms ± 8%    45.1ms ± 8%    ~     (p=0.643 n=50+50)
TimeParse-8                 223ns ± 2%     223ns ± 2%    ~     (p=0.401 n=47+47)
TimeFormat-8                245ns ± 2%     246ns ± 3%    ~     (p=0.758 n=49+50)
[Geo mean]                 36.5µs         36.3µs       -0.54%


name        old object-bytes  new object-bytes  delta
Template          480kB ± 0%        480kB ± 0%    ~     (all equal)
Unicode           214kB ± 0%        214kB ± 0%    ~     (all equal)
GoTypes          1.54MB ± 0%       1.54MB ± 0%  -0.03%  (p=0.008 n=5+5)
Compiler         5.75MB ± 0%       5.75MB ± 0%    ~     (all equal)
SSA              14.6MB ± 0%       14.6MB ± 0%  -0.01%  (p=0.008 n=5+5)
Flate             300kB ± 0%        300kB ± 0%  -0.01%  (p=0.008 n=5+5)
GoParser          366kB ± 0%        366kB ± 0%    ~     (all equal)
Reflect          1.20MB ± 0%       1.20MB ± 0%    ~     (all equal)
Tar               413kB ± 0%        413kB ± 0%    ~     (all equal)
XML               529kB ± 0%        528kB ± 0%  -0.13%  (p=0.008 n=5+5)
[Geo mean]        909kB             909kB       -0.02%


Change-Id: I46d37a55197683a98913f35801dc2b0d609653c8
Reviewed-on: https://go-review.googlesource.com/103240
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-29 18:47:18 +00:00
Paul Jolly 5d8d3d52b0 cmd/go: make generate pass correct GOPACKAGE to XTest files
The existing behaviour of go generate is to pass GOPACKAGE=p
to all package files, including XTest files. This however is
incorrect as the package name for the XTest files is p_test.

Fixes #24594

Change-Id: I96b6e5777ec511cdcf1a6267a43f4d8c544c4af3
Reviewed-on: https://go-review.googlesource.com/103415
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-29 17:58:14 +00:00
Hana Kim 3ac17f86be cmd/trace: make span tables pretty
Mostly same as golang.org/cl/102156, except the parts that
deal with different data types.

Change-Id: I061b858b73898725e3bf175ed022c2e3e55fc485
Reviewed-on: https://go-review.googlesource.com/103158
Reviewed-by: Andrew Bonventre <andybons@golang.org>
2018-03-29 17:56:54 +00:00
Ian Lance Taylor 9761a162f0 cmd/go: don't try to initialize cover profile for go test -c
Using go test -c makes you responsible for managing and merging the
coverage profile yourself.

Fixes #24588

Change-Id: I2037a91ceb904f9f35d76c7b5e5fae6bcbed4e46
Reviewed-on: https://go-review.googlesource.com/103395
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-29 16:37:26 +00:00
Ian Lance Taylor f961272e59 cmd/go: reject relative tmpdir
Fixes #23264

Change-Id: Ib6c343dc8e32949c6de72cb628cace2e8fabc302
Reviewed-on: https://go-review.googlesource.com/103236
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-28 23:01:52 +00:00
Josh Bleecher Snyder dafca7de0f cmd/compile: prefer rematerialization to copying
Fixes #24132


name                     old time/op    new time/op    delta
BinaryTree17-8              2.18s ± 2%     2.15s ± 2%  -1.28%  (p=0.000 n=25+26)
Fannkuch11-8                2.16s ± 3%     2.13s ± 3%  -1.54%  (p=0.000 n=27+30)
FmtFprintfEmpty-8          29.9ns ± 3%    29.6ns ± 3%  -1.08%  (p=0.001 n=29+26)
FmtFprintfString-8         53.6ns ± 2%    54.0ns ± 4%    ~     (p=0.193 n=28+29)
FmtFprintfInt-8            56.8ns ± 3%    57.0ns ± 3%    ~     (p=0.330 n=29+29)
FmtFprintfIntInt-8         85.3ns ± 2%    85.8ns ± 3%  +0.56%  (p=0.042 n=30+29)
FmtFprintfPrefixedInt-8    94.1ns ± 5%    99.0ns ± 8%  +5.20%  (p=0.000 n=27+30)
FmtFprintfFloat-8           183ns ± 4%     182ns ± 3%    ~     (p=0.619 n=30+26)
FmtManyArgs-8               369ns ± 2%     369ns ± 2%    ~     (p=0.748 n=27+29)
GobDecode-8                4.78ms ± 2%    4.75ms ± 1%    ~     (p=0.051 n=28+27)
GobEncode-8                4.06ms ± 3%    4.07ms ± 3%    ~     (p=0.781 n=29+30)
Gzip-8                      178ms ± 2%     177ms ± 2%    ~     (p=0.171 n=29+30)
Gunzip-8                   28.2ms ± 7%    28.0ms ± 4%    ~     (p=0.155 n=30+30)
HTTPClientServer-8         71.5µs ± 3%    71.3µs ± 1%    ~     (p=0.913 n=25+27)
JSONEncode-8               9.71ms ± 5%    9.86ms ± 4%  +1.55%  (p=0.015 n=28+30)
JSONDecode-8               38.8ms ± 2%    39.3ms ± 2%  +1.41%  (p=0.000 n=28+29)
Mandelbrot200-8            3.47ms ± 6%    3.44ms ± 3%    ~     (p=0.183 n=28+28)
GoParse-8                  2.55ms ± 2%    2.54ms ± 3%  -0.58%  (p=0.003 n=27+29)
RegexpMatchEasy0_32-8      66.0ns ± 5%    65.3ns ± 4%    ~     (p=0.124 n=30+30)
RegexpMatchEasy0_1K-8       152ns ± 2%     152ns ± 3%    ~     (p=0.881 n=30+30)
RegexpMatchEasy1_32-8      62.9ns ± 9%    62.7ns ± 7%    ~     (p=0.717 n=30+30)
RegexpMatchEasy1_1K-8       263ns ± 3%     263ns ± 4%    ~     (p=0.909 n=30+29)
RegexpMatchMedium_32-8     93.4ns ± 3%    89.3ns ± 2%  -4.32%  (p=0.000 n=29+29)
RegexpMatchMedium_1K-8     27.5µs ± 3%    27.1µs ± 2%  -1.46%  (p=0.000 n=30+27)
RegexpMatchHard_32-8       1.33µs ± 3%    1.31µs ± 3%  -1.50%  (p=0.000 n=27+28)
RegexpMatchHard_1K-8       39.4µs ± 2%    39.1µs ± 2%  -0.54%  (p=0.027 n=28+28)
Revcomp-8                   274ms ± 4%     276ms ± 2%  +0.67%  (p=0.048 n=29+28)
Template-8                 45.1ms ± 5%    44.6ms ± 7%  -1.22%  (p=0.029 n=30+29)
TimeParse-8                 227ns ± 3%     224ns ± 3%  -1.25%  (p=0.000 n=28+27)
TimeFormat-8                248ns ± 3%     245ns ± 3%  -1.33%  (p=0.002 n=30+29)
[Geo mean]                 36.6µs         36.5µs       -0.32%


Change-Id: I24083f0013506b77e2d9da99c40ae2f67803285e
Reviewed-on: https://go-review.googlesource.com/101076
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-28 18:43:49 +00:00
Josh Bleecher Snyder 6a74fe2f15 cmd/compile: strength reduce more x86 constant multiplication
The additions were machine-generated.

The change for x * 7 avoids a reg-reg move,
reducing the number of instructions from 3 to 2.

Change-Id: Ib002e39f29ca5e46cfdb8daaf87ddc7ba50a17e5
Reviewed-on: https://go-review.googlesource.com/102395
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-28 18:33:10 +00:00
Josh Bleecher Snyder 8623503fe5 cmd/compile/internal/gc: speed up arith const tests
This change reduces the time to run this test on my machine from

real	0m2.491s
user	0m3.020s
sys	0m0.331s

to

real	0m0.237s
user	0m0.180s
sys	0m0.173s

This will make it reasonable to add more constants to the test.
I am also hopeful that it might help a bit with intermittent
cmd/compile/internal/gc test timeouts on the build dashboard
on the slower builders.

The time savings are entirely in compilation time,
by avoiding generating one giant func main.
Instead, generate tables of tests to be run,
which are translated into static data,
and then loop over those tests.

While we're here, do some minor cleanup:

* Remove the _ssa suffix on function names,
  as that was only needed during ssa bootstrapping.
* Clean up error handling during test generation.
* Make functions single-line, to reduce future diff sizes.
  Diffing giant files is slow.

Change-Id: Ic5fccdb71679169bea756c7d33c07d05e4801860
Reviewed-on: https://go-review.googlesource.com/102956
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-03-28 17:44:28 +00:00
Alberto Donizetti 360c19157a cmd/compile: print accurate escape reason for non-const-length slices
This change makes `-m -m` print a better explanation for the case
where a slice is marked as escaping and heap-allocated because it
has a non-constant len/cap.

Fixes #24578

Change-Id: I0ebafb77c758a99857d72b365817bdba7b446cc0
Reviewed-on: https://go-review.googlesource.com/102895
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-03-28 16:56:03 +00:00
Ian Lance Taylor 7e34ac1f4c cmd/go: add more C compiler/linker options to whitelist
Fixes #23937

Change-Id: Ie63d91355d1a724d0012d99d457d939deeeb8d3e
Reviewed-on: https://go-review.googlesource.com/102818
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Bonventre <andybons@golang.org>
2018-03-28 05:58:35 +00:00
Ian Lance Taylor 976a852d4c cmd/go: if -race, don't run coverage on runtime packages
Don't compile the runtime packages with coverage when using the race
detector. The user can, perhaps accidentally, request coverage for the
runtime by using -coverpkg=all. If using the race detector, the
runtime package coverage will call into the race detector before it
has been initialized. This will cause the program to crash
mysteriously on startup.

Fixes #23882

Change-Id: I9a63867a9138797d8b8afb0856ae21079accdb27
Reviewed-on: https://go-review.googlesource.com/94898
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Bonventre <andybons@golang.org>
2018-03-27 22:41:14 +00:00
Ian Lance Taylor ad0ebc3994 cmd/go: with -x, don't report removing a non-existent objdir
Fixes #24389
Fixes #24396

Change-Id: I37399528700e2a39d9523d7c41bdc929618eb095
Reviewed-on: https://go-review.googlesource.com/102619
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-27 19:55:14 +00:00
Yuval Pavel Zholkover 0a5be12f5c cmd/internal/obj/arm: add DMB instruction
Change-Id: Ib67a61d5b37af210ff15d60d72bd5238b9c2d0ca
Reviewed-on: https://go-review.googlesource.com/94815
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-27 19:54:44 +00:00
Ben Shi fc7a72596b cmd/internal/obj/arm64: add LDPW/LDPSW/STPW to arm64 assembler
1. STPW stores the lower 32-bit words of a pair of registers to memory.
2. LDPW loads two 32-bit words from memory, zero extends them to 64-bit,
and then copies to a pair of registers.
3. LDPSW does the same as LDPW, except a sign extension.

This CL implements those 3 instructions and adds test cases.

Change-Id: Ied9834d8240240d23ce00e086b4ea456e1611f1a
Reviewed-on: https://go-review.googlesource.com/99956
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-27 19:13:17 +00:00
Hana Kim aaeaad6870 cmd/trace: assign a unique span id for slice representation
Spans are represented using Async Event types of chrome trace viewer.
According to the doc, the 'id' should be unique within category, scope.

https://docs.google.com/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYMH4h6I0nSsKchNAySU/preview#heading=h.jh64i9l3vwa1

Use the index in the task's span slice as the slice id, so it
can be unique within the task. The scope is the task id which
is unique.

This fixes a visualization bug that caused incorrect or missing
presentation of nested spans.

Change-Id: If1537ee00247f71fa967abfe45569a9e7dbcdce7
Reviewed-on: https://go-review.googlesource.com/102697
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-03-27 19:03:54 +00:00
Michael Munday 331c187b17 cmd/compile: simplify Neg lowering on s390x
No need to sign extend input to Neg8 and Neg16.

Change-Id: I7896c83c9cdf84a34098582351a4aabf61cd6fdd
Reviewed-on: https://go-review.googlesource.com/102675
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-27 19:03:42 +00:00
Matthew Dempsky 7b177b1a03 cmd/compile: fix method set computation for shadowed methods
In expandmeth, we call expand1/expand0 to build a list of all
candidate methods to promote, and then we use dotpath to prune down
which names actually resolve to a promoted method and how.

However, previously we still computed "followsptr" based on the
expand1/expand0 traversal (which is depth-first), rather than
dotpath (which is breadth-first). The result is that we could
sometimes end up miscomputing whether a particular promoted method
involves a pointer traversal, which could result in bad code
generation for method trampolines.

Fixes #24547.

Change-Id: I57dc014466d81c165b05d78b98610dc3765b7a90
Reviewed-on: https://go-review.googlesource.com/102618
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-03-27 18:56:36 +00:00
Alberto Donizetti 377a2cb2d2 cmd/compile: reduce allocations in regAllocState.regalloc
name      old time/op       new time/op       delta
Template        281ms ± 2%        282ms ± 3%    ~     (p=0.428 n=19+20)
Unicode         138ms ± 6%        138ms ± 7%    ~     (p=0.813 n=19+20)
GoTypes         901ms ± 2%        895ms ± 2%    ~     (p=0.050 n=19+20)
Compiler        4.25s ± 1%        4.23s ± 1%  -0.31%  (p=0.031 n=19+18)
SSA             9.77s ± 1%        9.78s ± 1%    ~     (p=0.512 n=20+20)
Flate           187ms ± 3%        187ms ± 4%    ~     (p=0.687 n=20+19)
GoParser        224ms ± 4%        222ms ± 3%    ~     (p=0.301 n=20+20)
Reflect         576ms ± 2%        576ms ± 2%    ~     (p=0.620 n=20+20)
Tar             262ms ± 3%        263ms ± 3%    ~     (p=0.599 n=19+18)
XML             322ms ± 4%        322ms ± 2%    ~     (p=0.512 n=20+20)

name      old user-time/op  new user-time/op  delta
Template        403ms ± 3%        399ms ± 5%    ~     (p=0.149 n=17+20)
Unicode         217ms ±12%        217ms ± 9%    ~     (p=0.883 n=20+20)
GoTypes         1.24s ± 3%        1.24s ± 3%    ~     (p=0.718 n=20+20)
Compiler        5.90s ± 3%        5.84s ± 5%    ~     (p=0.217 n=18+20)
SSA             14.0s ± 6%        14.1s ± 5%    ~     (p=0.235 n=19+20)
Flate           253ms ± 6%        254ms ± 5%    ~     (p=0.749 n=20+19)
GoParser        309ms ± 7%        307ms ± 5%    ~     (p=0.398 n=20+20)
Reflect         772ms ± 3%        771ms ± 3%    ~     (p=0.901 n=20+19)
Tar             368ms ± 5%        369ms ± 8%    ~     (p=0.429 n=20+20)
XML             435ms ± 5%        434ms ± 5%    ~     (p=0.841 n=20+20)

name      old alloc/op      new alloc/op      delta
Template       39.0MB ± 0%       38.9MB ± 0%  -0.21%  (p=0.000 n=20+19)
Unicode        29.0MB ± 0%       29.0MB ± 0%  -0.03%  (p=0.000 n=20+20)
GoTypes         116MB ± 0%        115MB ± 0%  -0.33%  (p=0.000 n=20+20)
Compiler        498MB ± 0%        496MB ± 0%  -0.37%  (p=0.000 n=19+20)
SSA            1.41GB ± 0%       1.40GB ± 0%  -0.24%  (p=0.000 n=20+20)
Flate          25.0MB ± 0%       25.0MB ± 0%  -0.22%  (p=0.000 n=20+19)
GoParser       31.0MB ± 0%       30.9MB ± 0%  -0.23%  (p=0.000 n=20+17)
Reflect        77.1MB ± 0%       77.0MB ± 0%  -0.12%  (p=0.000 n=20+20)
Tar            39.7MB ± 0%       39.6MB ± 0%  -0.17%  (p=0.000 n=20+20)
XML            44.9MB ± 0%       44.8MB ± 0%  -0.29%  (p=0.000 n=20+20)

name      old allocs/op     new allocs/op     delta
Template         386k ± 0%         385k ± 0%  -0.28%  (p=0.000 n=20+20)
Unicode          337k ± 0%         336k ± 0%  -0.07%  (p=0.000 n=20+20)
GoTypes         1.20M ± 0%        1.20M ± 0%  -0.41%  (p=0.000 n=20+20)
Compiler        4.71M ± 0%        4.68M ± 0%  -0.52%  (p=0.000 n=20+20)
SSA             11.7M ± 0%        11.6M ± 0%  -0.31%  (p=0.000 n=20+19)
Flate            238k ± 0%         237k ± 0%  -0.28%  (p=0.000 n=18+20)
GoParser         320k ± 0%         319k ± 0%  -0.34%  (p=0.000 n=20+19)
Reflect          961k ± 0%         959k ± 0%  -0.12%  (p=0.000 n=20+20)
Tar              397k ± 0%         396k ± 0%  -0.23%  (p=0.000 n=20+20)
XML              419k ± 0%         417k ± 0%  -0.39%  (p=0.000 n=20+19)

Change-Id: Ic7ec3614808d9892c1cab3991b996b7a3b8eff21
Reviewed-on: https://go-review.googlesource.com/102676
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-27 18:03:39 +00:00
HaraldNordgren a42ea51ae9 cmd/go: print each import error only once
This change prevents import errors from being printed multiple times.
Creating a bare-bones package 'p' with only one file importing itself
and running 'go build p', the current implementation gives this error
message:

	can't load package: import cycle not allowed
	package p
		imports p
	import cycle not allowed
	package p
		imports p

With this change we will show the message only once.

Updates #23295

Change-Id: I653b34c1c06c279f3df514f12ec0b89745a7e64a
Reviewed-on: https://go-review.googlesource.com/86535
Reviewed-by: Harald Nordgren <haraldnordgren@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-27 14:07:14 +00:00
Alberto Donizetti b63b0f2b75 cmd/compile: allocate less in regalloc's liveValues
Instrumenting the compiler shows that, at the end of liveValues, the
values of the workList's cap are distributed as:

  cap	 freq
  1  	 0.006
  2  	 0.002
  4  	 0.237
  8  	 0.272
  16	 0.254
  32	 0.141
  64	 0.062
  128	 0.02
  256	 0.005
  512	 0.001
  1024	 0.0

Since the initial workList slice allocation is always on the stack
(as the variable does not escape), we can aggressively pre-allocate a
big backing array at (almost) no cost. This will save several
allocations in liveValues calls that end up having a large workList,
with no performance penalties for calls that have a small workList.

name      old time/op       new time/op       delta
Template        284ms ± 3%        282ms ± 3%    ~     (p=0.201 n=20+20)
Unicode         138ms ± 7%        138ms ± 7%    ~     (p=0.718 n=20+20)
GoTypes         905ms ± 2%        895ms ± 1%  -1.10%  (p=0.003 n=19+18)
Compiler        4.26s ± 1%        4.25s ± 1%  -0.38%  (p=0.038 n=20+19)
SSA             9.85s ± 2%        9.80s ± 1%    ~     (p=0.061 n=20+19)
Flate           187ms ± 6%        186ms ± 5%    ~     (p=0.289 n=20+20)
GoParser        227ms ± 3%        225ms ± 3%    ~     (p=0.072 n=20+20)
Reflect         578ms ± 2%        575ms ± 2%    ~     (p=0.059 n=18+20)
Tar             263ms ± 2%        265ms ± 3%    ~     (p=0.224 n=19+20)
XML             323ms ± 3%        325ms ± 2%    ~     (p=0.127 n=20+20)

name      old user-time/op  new user-time/op  delta
Template        406ms ± 6%        404ms ± 4%    ~     (p=0.314 n=20+20)
Unicode         220ms ± 6%        215ms ±11%    ~     (p=0.077 n=18+20)
GoTypes         1.25s ± 3%        1.24s ± 4%    ~     (p=0.461 n=20+20)
Compiler        5.95s ± 2%        5.84s ± 5%  -1.93%  (p=0.007 n=20+20)
SSA             14.4s ± 4%        14.2s ± 4%    ~     (p=0.108 n=20+20)
Flate           257ms ± 6%        252ms ± 9%    ~     (p=0.063 n=20+20)
GoParser        317ms ± 5%        312ms ± 6%  -1.85%  (p=0.049 n=20+20)
Reflect         779ms ± 2%        774ms ± 3%    ~     (p=0.253 n=20+20)
Tar             371ms ± 4%        374ms ± 4%    ~     (p=0.327 n=20+20)
XML             440ms ± 5%        442ms ± 5%    ~     (p=0.678 n=20+20)

name      old alloc/op      new alloc/op      delta
Template       39.4MB ± 0%       39.0MB ± 0%  -0.96%  (p=0.000 n=20+20)
Unicode        29.1MB ± 0%       29.0MB ± 0%  -0.13%  (p=0.000 n=20+20)
GoTypes         117MB ± 0%        116MB ± 0%  -0.88%  (p=0.000 n=20+20)
Compiler        502MB ± 0%        498MB ± 0%  -0.77%  (p=0.000 n=19+20)
SSA            1.42GB ± 0%       1.40GB ± 0%  -0.80%  (p=0.000 n=20+20)
Flate          25.3MB ± 0%       25.0MB ± 0%  -1.10%  (p=0.000 n=20+19)
GoParser       31.3MB ± 0%       31.0MB ± 0%  -1.05%  (p=0.000 n=20+20)
Reflect        77.9MB ± 0%       77.1MB ± 0%  -1.03%  (p=0.000 n=20+20)
Tar            40.0MB ± 0%       39.7MB ± 0%  -0.80%  (p=0.000 n=20+20)
XML            45.2MB ± 0%       44.9MB ± 0%  -0.72%  (p=0.000 n=20+20)

name      old allocs/op     new allocs/op     delta
Template         392k ± 0%         386k ± 0%  -1.44%  (p=0.000 n=20+20)
Unicode          337k ± 0%         337k ± 0%  -0.22%  (p=0.000 n=20+20)
GoTypes         1.22M ± 0%        1.20M ± 0%  -1.33%  (p=0.000 n=20+20)
Compiler        4.76M ± 0%        4.71M ± 0%  -1.12%  (p=0.000 n=20+20)
SSA             11.8M ± 0%        11.7M ± 0%  -1.00%  (p=0.000 n=20+20)
Flate            241k ± 0%         238k ± 0%  -1.49%  (p=0.000 n=20+20)
GoParser         324k ± 0%         320k ± 0%  -1.17%  (p=0.000 n=20+20)
Reflect          981k ± 0%         961k ± 0%  -2.11%  (p=0.000 n=20+20)
Tar              402k ± 0%         397k ± 0%  -1.29%  (p=0.000 n=20+20)
XML              424k ± 0%         419k ± 0%  -1.10%  (p=0.000 n=19+20)

Change-Id: If46667ae98eee2d47a615cad05e18df0629d8388
Reviewed-on: https://go-review.googlesource.com/102495
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-27 08:16:50 +00:00
Alberto Donizetti fdee46ee70 cmd/compile: use more ORs in generic.rules
No changes in the actual generated compiler code.

Change-Id: I206a7bf7b60f70a73640119fc92974f79ed95a6b
Reviewed-on: https://go-review.googlesource.com/102416
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2018-03-27 07:49:49 +00:00
Tim Wright 131901e80d cmd/go, cmd/link, runtime: enable PIE build mode, cgo race tests on FreeBSD
Fixes #24546

Change-Id: I99ebd5bc18e5c5e42eee4689644a7a8b02405f31
Reviewed-on: https://go-review.googlesource.com/102616
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-27 02:50:29 +00:00
Brad Fitzpatrick 48db2c01b4 all: use strings.Builder instead of bytes.Buffer where appropriate
I grepped for "bytes.Buffer" and "buf.String" and mostly ignored test
files. I skipped a few on purpose and probably missed a few others,
but otherwise I think this should be most of them.

Updates #18990

Change-Id: I5a6ae4296b87b416d8da02d7bfaf981d8cc14774
Reviewed-on: https://go-review.googlesource.com/102479
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-26 23:05:53 +00:00
Hana Kim f0eca373be cmd/trace: add /userspans, /userspan pages
Change-Id: Ifbefb659a8df3b079d69679871af444b179deaeb
Reviewed-on: https://go-review.googlesource.com/102599
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-03-26 22:30:52 +00:00
Austin Clements 2d8181e7b5 cmd/compile: clarify unsigned interpretation of AuxInt
The way Value.AuxInt represents unsigned numbers is currently
documented in genericOps.go, which is not the most obvious place for
it. Move that documentation to Value.AuxInt. Furthermore, to make it
harder to use incorrectly, introduce a Value.AuxUnsigned accessor that
returns the zero-extended value of Value.AuxInt.

Change-Id: I85030c3c68761404058a430e0b1c7464591b2f42
Reviewed-on: https://go-review.googlesource.com/102597
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-26 19:08:24 +00:00
Tobias Klauser a29e25b82c cmd/cgo: add support for GOARCH=sparc64
Even though GOARCH=sparc64 is not supported by gc (yet), it is easy to
make cgo already support it.

This e.g. allows to generate Go type definitions for linux/sparc64 in
the golang.org/x/sys/unix package without using gccgo.

Change-Id: I8886c81e7c895a0d93e350d81ed653fb59d95dd8
Reviewed-on: https://go-review.googlesource.com/102555
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-26 18:58:51 +00:00
David Chase a934e34e87 cmd/compile: invoke gdb more carefully in ssa/debug_test.go
Gdb can be sensitive to contents of .gdbinit, and to run
this test properly needs to have runtime/runtime-gdb.py
on the auto load safe path.  Therefore, turn off .gdbinit
loading and explicitly add $GOROOT/runtime to the safe
load path.

This should make ssa/debug_test.go run more consistently.

Updates #24464.

Change-Id: I63ed17c032cb3773048713ce51fca3a3f86e79b6
Reviewed-on: https://go-review.googlesource.com/102598
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-26 18:44:52 +00:00
Ilya Tocar 3db3826a57 cmd/internal/obj/x86: use PutOpBytesLit in more places
We already replaced most loops with PutOpBytesLit where possible,
do this in a last few places.

Change-Id: I8c90de017810145a12394fa6b887755e9111b22a
Reviewed-on: https://go-review.googlesource.com/102276
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-26 18:20:18 +00:00
David Chase f7404974da cmd/compile: finish GOEXPERIMENT=preemptibleloops repair
A newish check for branch-likely on single-successor blocks
caught a case where the preemption-check inserter was
setting "likely" on an unconditional branch.

Fixed by checking for that case before setting likely.

Also removed an overconservative restriction on parallel
compilation for GOEXPERIMENT=preemptibleloops; it works
fine, it is just another control-flow transformation.

Change-Id: I8e786e6281e0631cac8d80cff67bfb6402b4d225
Reviewed-on: https://go-review.googlesource.com/102317
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-26 17:44:14 +00:00
Michael Munday 8afa8a3374 cmd/compile: use 32-bit comparisons where possible on s390x
We use 32-bit operations for 8- and 16-bit arithmetic, so use them
for comparisons too. This won't change performance but it is more
consistent and makes testing 8- and 16-bit comparison codegen
slightly more straightforward (for follow up CL).

Also fix a typo and add some additional double sign and zero
extension rules to remove the operations inserted by the comparison
rules.

Change-Id: I89ec1b0e09cb8be8090cf007be283ad88bba75a4
Reviewed-on: https://go-review.googlesource.com/102556
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-26 17:41:34 +00:00
Hana Kim 68a1c9c400 internal/trace: compute span stats as computing goroutine stats
Move part of UserSpan event processing from cmd/trace.analyzeAnnotations
to internal/trace.GoroutineStats that returns analyzed per-goroutine
execution information. Now the execution information includes list of
spans and their execution information.

cmd/trace.analyzeAnnotations utilizes the span execution information
from internal/trace.GoroutineStats and connects them with task
information.

Change-Id: Ib7f79a3ba652a4ae55cd81ea17565bcc7e241c5c
Reviewed-on: https://go-review.googlesource.com/101917
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Peter Weinberger <pjw@google.com>
2018-03-26 16:59:01 +00:00
Ilya Tocar 24cd112086 cmd/compile/internal/ssa: optimize away double NEG on amd64
When lowering some ops on amd64 we generate additional NEGQ.
This may result in code like this:

NEGQ R12
NEGQ R12

Optimize it away. Gain is not significant, about ~0.5% gain in geomean
in compress/flate and 200 bytes codesize reduction in go tool.

Full results below:

name                             old time/op    new time/op    delta
Encode/Digits/Huffman/1e4-6        65.8µs ± 0%    65.7µs ± 0%  -0.21%  (p=0.010 n=10+9)
Encode/Digits/Huffman/1e5-6         633µs ± 0%     632µs ± 0%    ~     (p=0.370 n=8+9)
Encode/Digits/Huffman/1e6-6        6.30ms ± 1%    6.29ms ± 1%    ~     (p=0.796 n=10+10)
Encode/Digits/Speed/1e4-6           281µs ± 0%     280µs ± 1%  -0.34%  (p=0.043 n=8+10)
Encode/Digits/Speed/1e5-6          2.66ms ± 0%    2.66ms ± 0%  -0.09%  (p=0.043 n=10+10)
Encode/Digits/Speed/1e6-6          26.3ms ± 0%    26.3ms ± 0%    ~     (p=0.190 n=10+10)
Encode/Digits/Default/1e4-6         554µs ± 0%     557µs ± 0%  +0.46%  (p=0.001 n=9+10)
Encode/Digits/Default/1e5-6        8.63ms ± 1%    8.62ms ± 1%    ~     (p=0.912 n=10+10)
Encode/Digits/Default/1e6-6        92.7ms ± 1%    92.2ms ± 1%    ~     (p=0.052 n=10+10)
Encode/Digits/Compression/1e4-6     558µs ± 1%     557µs ± 1%    ~     (p=0.481 n=10+10)
Encode/Digits/Compression/1e5-6    8.58ms ± 0%    8.61ms ± 1%    ~     (p=0.315 n=8+10)
Encode/Digits/Compression/1e6-6    92.3ms ± 1%    92.4ms ± 1%    ~     (p=0.971 n=10+10)
Encode/Twain/Huffman/1e4-6         89.5µs ± 0%    89.0µs ± 1%  -0.48%  (p=0.001 n=9+9)
Encode/Twain/Huffman/1e5-6          727µs ± 1%     728µs ± 0%    ~     (p=0.604 n=10+9)
Encode/Twain/Huffman/1e6-6         7.21ms ± 0%    7.19ms ± 1%    ~     (p=0.696 n=8+10)
Encode/Twain/Speed/1e4-6            320µs ± 1%     321µs ± 1%    ~     (p=0.353 n=10+10)
Encode/Twain/Speed/1e5-6           2.63ms ± 0%    2.62ms ± 1%  -0.33%  (p=0.016 n=8+10)
Encode/Twain/Speed/1e6-6           25.8ms ± 0%    25.8ms ± 0%    ~     (p=0.360 n=10+8)
Encode/Twain/Default/1e4-6          677µs ± 1%     671µs ± 1%  -0.88%  (p=0.000 n=10+10)
Encode/Twain/Default/1e5-6         10.5ms ± 1%    10.3ms ± 0%  -2.06%  (p=0.000 n=10+10)
Encode/Twain/Default/1e6-6          113ms ± 1%     111ms ± 1%  -1.96%  (p=0.000 n=10+9)
Encode/Twain/Compression/1e4-6      688µs ± 0%     679µs ± 1%  -1.30%  (p=0.000 n=7+10)
Encode/Twain/Compression/1e5-6     11.6ms ± 1%    11.3ms ± 1%  -2.10%  (p=0.000 n=10+10)
Encode/Twain/Compression/1e6-6      126ms ± 1%     124ms ± 0%  -1.57%  (p=0.000 n=10+10)
[Geo mean]                         3.45ms         3.44ms       -0.46%

name                             old speed      new speed      delta
Encode/Digits/Huffman/1e4-6       152MB/s ± 0%   152MB/s ± 0%  +0.21%  (p=0.009 n=10+9)
Encode/Digits/Huffman/1e5-6       158MB/s ± 0%   158MB/s ± 0%    ~     (p=0.336 n=8+9)
Encode/Digits/Huffman/1e6-6       159MB/s ± 1%   159MB/s ± 1%    ~     (p=0.781 n=10+10)
Encode/Digits/Speed/1e4-6        35.6MB/s ± 0%  35.7MB/s ± 1%  +0.34%  (p=0.020 n=8+10)
Encode/Digits/Speed/1e5-6        37.6MB/s ± 0%  37.7MB/s ± 0%  +0.09%  (p=0.049 n=10+10)
Encode/Digits/Speed/1e6-6        38.0MB/s ± 0%  38.0MB/s ± 0%    ~     (p=0.146 n=10+10)
Encode/Digits/Default/1e4-6      18.0MB/s ± 0%  18.0MB/s ± 0%  -0.45%  (p=0.002 n=9+10)
Encode/Digits/Default/1e5-6      11.6MB/s ± 1%  11.6MB/s ± 1%    ~     (p=0.644 n=10+10)
Encode/Digits/Default/1e6-6      10.8MB/s ± 1%  10.8MB/s ± 1%  +0.51%  (p=0.044 n=10+10)
Encode/Digits/Compression/1e4-6  17.9MB/s ± 1%  17.9MB/s ± 1%    ~     (p=0.468 n=10+10)
Encode/Digits/Compression/1e5-6  11.7MB/s ± 0%  11.6MB/s ± 1%    ~     (p=0.322 n=8+10)
Encode/Digits/Compression/1e6-6  10.8MB/s ± 1%  10.8MB/s ± 1%    ~     (p=0.983 n=10+10)
Encode/Twain/Huffman/1e4-6        112MB/s ± 0%   112MB/s ± 1%  +0.42%  (p=0.002 n=8+9)
Encode/Twain/Huffman/1e5-6        138MB/s ± 1%   137MB/s ± 0%    ~     (p=0.616 n=10+9)
Encode/Twain/Huffman/1e6-6        139MB/s ± 0%   139MB/s ± 1%    ~     (p=0.652 n=8+10)
Encode/Twain/Speed/1e4-6         31.3MB/s ± 1%  31.2MB/s ± 1%    ~     (p=0.342 n=10+10)
Encode/Twain/Speed/1e5-6         38.0MB/s ± 0%  38.1MB/s ± 1%  +0.33%  (p=0.011 n=8+10)
Encode/Twain/Speed/1e6-6         38.8MB/s ± 0%  38.7MB/s ± 0%    ~     (p=0.325 n=10+8)
Encode/Twain/Default/1e4-6       14.8MB/s ± 1%  14.9MB/s ± 1%  +0.88%  (p=0.000 n=10+10)
Encode/Twain/Default/1e5-6       9.48MB/s ± 1%  9.68MB/s ± 0%  +2.11%  (p=0.000 n=10+10)
Encode/Twain/Default/1e6-6       8.86MB/s ± 1%  9.03MB/s ± 1%  +1.97%  (p=0.000 n=10+9)
Encode/Twain/Compression/1e4-6   14.5MB/s ± 0%  14.7MB/s ± 1%  +1.31%  (p=0.000 n=7+10)
Encode/Twain/Compression/1e5-6   8.63MB/s ± 1%  8.82MB/s ± 1%  +2.17%  (p=0.000 n=10+10)
Encode/Twain/Compression/1e6-6   7.92MB/s ± 1%  8.05MB/s ± 1%  +1.59%  (p=0.000 n=10+10)
[Geo mean]                       29.0MB/s       29.1MB/s       +0.47%

// symSizeComp `which go` go_old:

section differences:
global text (code) = 203 bytes (0.005131%)
read-only data = 1 bytes (0.000057%)
Total difference 204 bytes (0.003297%)

Change-Id: Ie2cdfa1216472d78694fff44d215b3b8e71cf7bf
Reviewed-on: https://go-review.googlesource.com/102277
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-26 16:19:53 +00:00
Hana (Hyang-Ah) Kim ea1f483240 cmd/trace: beautify goroutine page
- Summary: also includes links to pprof data.
- Sortable table: sorting is done on server-side. The intention is
  that later, I want to add pagination feature and limit the page
  size the browser has to handle.
- Stacked horizontal bar graph to present total time breakdown.
- Human-friendly time representation.
- No dependency on external fancy javascript libraries to allow
  it to function without an internet connection.

Change-Id: I91e5c26746e59ad0329dfb61e096e11f768c7b73
Reviewed-on: https://go-review.googlesource.com/102156
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Bonventre <andybons@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-03-26 15:36:56 +00:00
Alberto Donizetti 2ba98f1ae9 cmd/compile: avoid some allocations in regalloc
Compilebench:
name      old time/op       new time/op       delta
Template        283ms ± 3%        281ms ± 4%    ~     (p=0.242 n=20+20)
Unicode         137ms ± 6%        135ms ± 6%    ~     (p=0.194 n=20+19)
GoTypes         890ms ± 2%        883ms ± 1%  -0.74%  (p=0.001 n=19+19)
Compiler        4.21s ± 2%        4.20s ± 2%  -0.40%  (p=0.033 n=20+19)
SSA             9.86s ± 2%        9.68s ± 1%  -1.80%  (p=0.000 n=20+19)
Flate           185ms ± 5%        185ms ± 7%    ~     (p=0.429 n=20+20)
GoParser        222ms ± 3%        222ms ± 4%    ~     (p=0.588 n=19+20)
Reflect         572ms ± 2%        570ms ± 3%    ~     (p=0.113 n=19+20)
Tar             263ms ± 4%        259ms ± 2%  -1.41%  (p=0.013 n=20+20)
XML             321ms ± 2%        321ms ± 4%    ~     (p=0.835 n=20+19)

name      old user-time/op  new user-time/op  delta
Template        400ms ± 5%        405ms ± 5%    ~     (p=0.096 n=20+20)
Unicode         217ms ± 8%        213ms ± 8%    ~     (p=0.242 n=20+20)
GoTypes         1.23s ± 3%        1.22s ± 3%    ~     (p=0.923 n=19+20)
Compiler        5.76s ± 6%        5.81s ± 2%    ~     (p=0.687 n=20+19)
SSA             14.2s ± 4%        14.0s ± 4%    ~     (p=0.121 n=20+20)
Flate           248ms ± 7%        251ms ±10%    ~     (p=0.369 n=20+20)
GoParser        308ms ± 5%        305ms ± 6%    ~     (p=0.336 n=19+20)
Reflect         771ms ± 2%        766ms ± 2%    ~     (p=0.113 n=20+19)
Tar             370ms ± 5%        362ms ± 7%  -2.06%  (p=0.036 n=19+20)
XML             435ms ± 4%        432ms ± 5%    ~     (p=0.369 n=20+20)

name      old alloc/op      new alloc/op      delta
Template       39.5MB ± 0%       39.4MB ± 0%  -0.20%  (p=0.000 n=20+20)
Unicode        29.1MB ± 0%       29.1MB ± 0%    ~     (p=0.064 n=20+20)
GoTypes         117MB ± 0%        117MB ± 0%  -0.17%  (p=0.000 n=20+20)
Compiler        503MB ± 0%        502MB ± 0%  -0.15%  (p=0.000 n=19+19)
SSA            1.42GB ± 0%       1.42GB ± 0%  -0.16%  (p=0.000 n=20+20)
Flate          25.3MB ± 0%       25.3MB ± 0%  -0.19%  (p=0.000 n=20+20)
GoParser       31.4MB ± 0%       31.3MB ± 0%  -0.14%  (p=0.000 n=20+18)
Reflect        78.1MB ± 0%       77.9MB ± 0%  -0.34%  (p=0.000 n=20+19)
Tar            40.1MB ± 0%       40.0MB ± 0%  -0.17%  (p=0.000 n=20+20)
XML            45.3MB ± 0%       45.2MB ± 0%  -0.13%  (p=0.000 n=20+20)

name      old allocs/op     new allocs/op     delta
Template         393k ± 0%         392k ± 0%  -0.21%  (p=0.000 n=20+19)
Unicode          337k ± 0%         337k ± 0%  -0.02%  (p=0.000 n=20+20)
GoTypes         1.22M ± 0%        1.22M ± 0%  -0.21%  (p=0.000 n=20+20)
Compiler        4.77M ± 0%        4.76M ± 0%  -0.16%  (p=0.000 n=20+20)
SSA             11.8M ± 0%        11.8M ± 0%  -0.12%  (p=0.000 n=20+20)
Flate            242k ± 0%         241k ± 0%  -0.20%  (p=0.000 n=20+20)
GoParser         324k ± 0%         324k ± 0%  -0.14%  (p=0.000 n=20+20)
Reflect          985k ± 0%         981k ± 0%  -0.38%  (p=0.000 n=20+20)
Tar              403k ± 0%         402k ± 0%  -0.19%  (p=0.000 n=20+20)
XML              424k ± 0%         424k ± 0%  -0.16%  (p=0.000 n=19+20)

Change-Id: I131e382b64cd6db11a9263a477d45d80c180c499
Reviewed-on: https://go-review.googlesource.com/102421
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-25 18:27:49 +00:00
Daniel Martí b1892d740e cmd/compile/internal/gc: various cleanups
Remove a couple of unnecessary var declarations, an unused sort.Sort
type, and simplify a range by using the two-name variant.

Change-Id: Ia251f634db0bfbe8b1d553b8659272ddbd13b2c3
Reviewed-on: https://go-review.googlesource.com/102336
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-24 19:44:18 +00:00
Daniel Nephin 5526ef1c51 cmd/test2json: document missing "skip" action
Change-Id: I906e61170279f0647598e2fd4fa931aac1b69288
GitHub-Last-Rev: f6df43e8e1
GitHub-Pull-Request: golang/go#24517
Reviewed-on: https://go-review.googlesource.com/102396
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-24 18:37:22 +00:00
Alberto Donizetti a27cd4fd31 test/codegen: port tbz/tbnz arm64 tests
And delete them from asm_test.

Change-Id: I34fcf85ae8ce09cd146fe4ce6a0ae7616bd97e2d
Reviewed-on: https://go-review.googlesource.com/102296
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-03-24 09:35:53 +00:00
Giovanni Bajo d54902ece9 cmd/compile: in prove, shortcircuit self-facts
Sometimes, we can end up calling update with a self-relation
about a variable (x REL x). In this case, there is no need
to record anything: the relation is unsatisfiable if and only
if it doesn't contain eq.

This also helps avoiding infinite loop in next CL that will
introduce transitive closure of relations.

Passes toolstash -cmp.

Change-Id: Ic408452ec1c13653f22ada35466ec98bc14aaa8e
Reviewed-on: https://go-review.googlesource.com/100276
Reviewed-by: Austin Clements <austin@google.com>
2018-03-24 03:06:21 +00:00
Giovanni Bajo 385d936fb2 cmd/compile: in prove, fail fast when unsat is found
When an unsatisfiable relation is recorded in the facts table,
there is no need to compute further relations or updates
additional data structures.

Since we're about to transitively propagate relations, make
sure to fail as fast as possible to avoid doing useless work
in dead branches.

Passes toolstash -cmp.

Change-Id: I23eed376d62776824c33088163c7ac9620abce85
Reviewed-on: https://go-review.googlesource.com/100275
Reviewed-by: Austin Clements <austin@google.com>
2018-03-24 03:06:01 +00:00
Giovanni Bajo 79112707bb cmd/compile: add patterns for bit set/clear/complement on amd64
This patch completes implementation of BT(Q|L), and adds support
for BT(S|R|C)(Q|L).

Example of code changes from time.(*Time).addSec:

        if t.wall&hasMonotonic != 0 {
  0x1073465               488b08                  MOVQ 0(AX), CX
  0x1073468               4889ca                  MOVQ CX, DX
  0x107346b               48c1e93f                SHRQ $0x3f, CX
  0x107346f               48c1e13f                SHLQ $0x3f, CX
  0x1073473               48f7c1ffffffff          TESTQ $-0x1, CX
  0x107347a               746b                    JE 0x10734e7

        if t.wall&hasMonotonic != 0 {
  0x1073435               488b08                  MOVQ 0(AX), CX
  0x1073438               480fbae13f              BTQ $0x3f, CX
  0x107343d               7363                    JAE 0x10734a2

Another example:

                        t.wall = t.wall&nsecMask | uint64(dsec)<<nsecShift | hasMonotonic
  0x10734c8               4881e1ffffff3f          ANDQ $0x3fffffff, CX
  0x10734cf               48c1e61e                SHLQ $0x1e, SI
  0x10734d3               4809ce                  ORQ CX, SI
  0x10734d6               48b90000000000000080    MOVQ $0x8000000000000000, CX
  0x10734e0               4809f1                  ORQ SI, CX
  0x10734e3               488908                  MOVQ CX, 0(AX)

                        t.wall = t.wall&nsecMask | uint64(dsec)<<nsecShift | hasMonotonic
  0x107348b		4881e2ffffff3f		ANDQ $0x3fffffff, DX
  0x1073492		48c1e61e		SHLQ $0x1e, SI
  0x1073496		4809f2			ORQ SI, DX
  0x1073499		480fbaea3f		BTSQ $0x3f, DX
  0x107349e		488910			MOVQ DX, 0(AX)

Go1 benchmarks seem unaffected, and I would be surprised
otherwise:

name                     old time/op    new time/op     delta
BinaryTree17-4              2.64s ± 4%      2.56s ± 9%  -2.92%  (p=0.008 n=9+9)
Fannkuch11-4                2.90s ± 1%      2.95s ± 3%  +1.76%  (p=0.010 n=10+9)
FmtFprintfEmpty-4          35.3ns ± 1%     34.5ns ± 2%  -2.34%  (p=0.004 n=9+8)
FmtFprintfString-4         57.0ns ± 1%     58.4ns ± 5%  +2.52%  (p=0.029 n=9+10)
FmtFprintfInt-4            59.8ns ± 3%     59.8ns ± 6%    ~     (p=0.565 n=10+10)
FmtFprintfIntInt-4         93.9ns ± 3%     91.2ns ± 5%  -2.94%  (p=0.014 n=10+9)
FmtFprintfPrefixedInt-4     107ns ± 6%      104ns ± 6%    ~     (p=0.099 n=10+10)
FmtFprintfFloat-4           187ns ± 3%      188ns ± 3%    ~     (p=0.505 n=10+9)
FmtManyArgs-4               410ns ± 1%      415ns ± 6%    ~     (p=0.649 n=8+10)
GobDecode-4                5.30ms ± 3%     5.27ms ± 3%    ~     (p=0.436 n=10+10)
GobEncode-4                4.62ms ± 5%     4.47ms ± 2%  -3.24%  (p=0.001 n=9+10)
Gzip-4                      197ms ± 4%      193ms ± 3%    ~     (p=0.123 n=10+10)
Gunzip-4                   30.4ms ± 3%     30.1ms ± 3%    ~     (p=0.481 n=10+10)
HTTPClientServer-4         76.3µs ± 1%     76.0µs ± 1%    ~     (p=0.236 n=8+9)
JSONEncode-4               10.5ms ± 9%     10.3ms ± 3%    ~     (p=0.280 n=10+10)
JSONDecode-4               42.3ms ±10%     41.3ms ± 2%    ~     (p=0.053 n=9+10)
Mandelbrot200-4            3.80ms ± 2%     3.72ms ± 2%  -2.15%  (p=0.001 n=9+10)
GoParse-4                  2.88ms ±10%     2.81ms ± 2%    ~     (p=0.247 n=10+10)
RegexpMatchEasy0_32-4      69.5ns ± 4%     68.6ns ± 2%    ~     (p=0.171 n=10+10)
RegexpMatchEasy0_1K-4       165ns ± 3%      162ns ± 3%    ~     (p=0.137 n=10+10)
RegexpMatchEasy1_32-4      65.7ns ± 6%     64.4ns ± 2%  -2.02%  (p=0.037 n=10+10)
RegexpMatchEasy1_1K-4       278ns ± 2%      279ns ± 3%    ~     (p=0.991 n=8+9)
RegexpMatchMedium_32-4     99.3ns ± 3%     98.5ns ± 4%    ~     (p=0.457 n=10+9)
RegexpMatchMedium_1K-4     30.1µs ± 1%     30.4µs ± 2%    ~     (p=0.173 n=8+10)
RegexpMatchHard_32-4       1.40µs ± 2%     1.41µs ± 4%    ~     (p=0.565 n=10+10)
RegexpMatchHard_1K-4       42.5µs ± 1%     41.5µs ± 3%  -2.13%  (p=0.002 n=8+9)
Revcomp-4                   332ms ± 4%      328ms ± 5%    ~     (p=0.720 n=9+10)
Template-4                 48.3ms ± 2%     49.6ms ± 3%  +2.56%  (p=0.002 n=8+10)
TimeParse-4                 252ns ± 2%      249ns ± 3%    ~     (p=0.116 n=9+10)
TimeFormat-4                262ns ± 4%      252ns ± 3%  -4.01%  (p=0.000 n=9+10)

name                     old speed      new speed       delta
GobDecode-4               145MB/s ± 3%    146MB/s ± 3%    ~     (p=0.436 n=10+10)
GobEncode-4               166MB/s ± 5%    172MB/s ± 2%  +3.28%  (p=0.001 n=9+10)
Gzip-4                   98.6MB/s ± 4%  100.4MB/s ± 3%    ~     (p=0.123 n=10+10)
Gunzip-4                  639MB/s ± 3%    645MB/s ± 3%    ~     (p=0.481 n=10+10)
JSONEncode-4              185MB/s ± 8%    189MB/s ± 3%    ~     (p=0.280 n=10+10)
JSONDecode-4             46.0MB/s ± 9%   47.0MB/s ± 2%  +2.21%  (p=0.046 n=9+10)
GoParse-4                20.1MB/s ± 9%   20.6MB/s ± 2%    ~     (p=0.239 n=10+10)
RegexpMatchEasy0_32-4     460MB/s ± 4%    467MB/s ± 2%    ~     (p=0.165 n=10+10)
RegexpMatchEasy0_1K-4    6.19GB/s ± 3%   6.28GB/s ± 3%    ~     (p=0.165 n=10+10)
RegexpMatchEasy1_32-4     487MB/s ± 5%    497MB/s ± 2%  +2.00%  (p=0.043 n=10+10)
RegexpMatchEasy1_1K-4    3.67GB/s ± 2%   3.67GB/s ± 3%    ~     (p=0.963 n=8+9)
RegexpMatchMedium_32-4   10.1MB/s ± 3%   10.1MB/s ± 4%    ~     (p=0.435 n=10+9)
RegexpMatchMedium_1K-4   34.0MB/s ± 1%   33.7MB/s ± 2%    ~     (p=0.173 n=8+10)
RegexpMatchHard_32-4     22.9MB/s ± 2%   22.7MB/s ± 4%    ~     (p=0.565 n=10+10)
RegexpMatchHard_1K-4     24.0MB/s ± 3%   24.7MB/s ± 3%  +2.64%  (p=0.001 n=9+9)
Revcomp-4                 766MB/s ± 4%    775MB/s ± 5%    ~     (p=0.720 n=9+10)
Template-4               40.2MB/s ± 2%   39.2MB/s ± 3%  -2.47%  (p=0.002 n=8+10)

The rules match ~1800 times during all.bash.

Fixes #18943

Change-Id: I64be1ada34e89c486dfd935bf429b35652117ed4
Reviewed-on: https://go-review.googlesource.com/94766
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-24 02:38:50 +00:00
isharipo 3afd2d7fc8 cmd/compile/internal/gc: properly initialize ssa.Func Type field
The ssa.Func has Type field that is described as
function signature type.

It never gets any value and remains nil.
This leads to "<T>" signature printed representation.

Given this function declaration:
	func foo(x int, f func() string) (int, error)

GOSSAFUNC printed it as below:
	compiling foo
	foo <T>

After this change:
	compiling foo
	foo func(int, func() string) (int, error)

Change-Id: Iec5eec8aac5c76ff184659e30f41b2f5fe86d329
Reviewed-on: https://go-review.googlesource.com/102375
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-03-24 02:18:52 +00:00
Matthew Dempsky ea668e18a6 cmd/compile: always write pack files
By always writing out pack files, the object file format can be
simplified somewhat. In particular, the export data format will no
longer require escaping, because the pack file provides appropriate
framing.

This CL does not affect build systems that use -pack, which includes
all major Go build systems (cmd/go, gb, bazel).

Also, existing package import logic already distinguishes pack/object
files based on file contents rather than file extension.

The only exception is cmd/pack, which specially handled object files
created by cmd/compile when used with the 'c' mode. This mode is
extended to now recognize the pack files produced by cmd/compile and
handle them as before.

Passes toolstash-check.

Updates #21705.
Updates #24512.

Change-Id: Idf131013bfebd73a5cde7e087eb19964503a9422
Reviewed-on: https://go-review.googlesource.com/102236
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-24 00:51:24 +00:00
Matthew Dempsky 699b0d4e52 cmd/link: skip __.PKGDEF in archives
The __.PKGDEF file is a compiler object file only intended for other
compilers. Also, for build systems that use -linkobj, all of the
information it contains is present within the linker object files
already, so look for it there instead.

This requires a little bit of code reorganization. Significantly,
previously when loading an archive file, the __.PKGDEF file was
authoritative on whether the package was "main" and/or "safe". Now
that we're using the Go object files instead, there's the issue that
there can be multiple Go object files in an archive (because when
using assembly, each assembly file becomes its own additional object
file).

The solution taken here is to check if any object file within the
package declares itself as "main" and/or "safe".

Updates #24512.

Change-Id: I70243a293bdf34b8555c0bf1833f8933b2809449
Reviewed-on: https://go-review.googlesource.com/102281
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-24 00:17:33 +00:00
Matthew Dempsky 946af1b658 cmd/link: make sure we're hashing __.PKGDEF in genhash
This is currently always the case because loadobjfile complains if
it's not, but that will be changed soon.

Updates #24512.

Change-Id: I262daca765932a0f4cea3fcc1cc80ca90de07a59
Reviewed-on: https://go-review.googlesource.com/102280
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-23 22:28:03 +00:00
Hyang-Ah Hana Kim 58734039bd Revert "cmd/vendor/.../pprof: refresh from upstream@a74ae6f"
This reverts commit c6e69ec7f9.

Reason for revert: Broke builders. #24508

Change-Id: I66abff0dd14ec6e1f8d8d982ccfb0438633b639d
Reviewed-on: https://go-review.googlesource.com/102316
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2018-03-23 15:09:04 +00:00
Hana (Hyang-Ah) Kim c6e69ec7f9 cmd/vendor/.../pprof: refresh from upstream@a74ae6f
Merges updates listed in
0e0e5b725...a74ae6f

Update #24443

cmd/vendor/vendor.json was updated manually.

Change-Id: I15d5fe82ac18263d4d54f5773cee0e197e93dd59
Reviewed-on: https://go-review.googlesource.com/101736
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
2018-03-23 14:36:29 +00:00
Matthew Dempsky 50921bfa2e cmd/compile: change unsafeUintptrTag from var to const
Change-Id: Ie30878199e24cce5b75428e6b602c017ebd16642
Reviewed-on: https://go-review.googlesource.com/102175
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2018-03-22 19:38:06 +00:00
Daniel Martí 02798ed936 cmd/compile: use more range fors in gc
Slightly simplifies the code. Made sure to exclude the cases that would
change behavior, such as when the iterated value is a string, when the
index is modified within the body, or when the slice is modified.

Also checked that all the elements are of pointer type, to avoid the
corner case where non-pointer types could be copied by mistake.

Change-Id: Iea64feb2a9a6a4c94ada9ff3ace40ee173505849
Reviewed-on: https://go-review.googlesource.com/100557
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-03-22 18:38:19 +00:00
Austin Clements 48f990b4a5 cmd/compile: fix GOEXPERIMENT=preemptibleloops type-checking
This experiment has gone stale. It causes a type-checking failure
because the condition of the OIF produced by range loop lowering has
type "untyped bool". Fix this by typechecking the whole OIF statement,
not just its condition.

This doesn't quite fix the whole experiment, but it gets further.
Something about preemption point insertion is causing failures like
"internal compiler error: likeliness prediction 1 for block b10 with 1
successors" in cmd/compile/internal/gc.

Change-Id: I7d80d618d7c91c338bf5f2a8dc174d582a479df3
Reviewed-on: https://go-review.googlesource.com/102157
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-03-22 18:20:31 +00:00
Travis Bischel 4f7b774822 cmd/compile: specialize Move up to 79B on amd64
Move currently uses mov instructions directly up to 31 bytes and then
switches to duffcopy. Moving 31 bytes is 4 instructions corresponding to
two loads and two stores, (or 6 if !useSSE) depending on the usage,
duffcopy is five (one or two mov, two or three lea, one call).

This adds direct mov instructions for Move's of size 32, 48, and 64 with
sse and for only size 32 without.
With useSSE:
- 32 is 4 instructions (byte +/- comparison below)
- 33 thru 48 is 6
- 49 thru 64 is 8

Without:
- 32 is 8

Note that the only platform with useSSE set to false is plan 9. I have
built three projects based off tip and tip with this patch and the
project's byte size is equal to or less than they were prior.

The basis of this change is that copying data with instructions directly
is nearly free, whereas calling into duffcopy adds a bit of overhead.
This is most noticeable in range statements where elements are 32+
bytes. For code with the following pattern:

func Benchmark32Range(b *testing.B) {
        var f s32
        for _, count := range []int{10, 100, 1000, 10000} {
                name := strconv.Itoa(count)
                b.Run(name, func(b *testing.B) {
                        base := make([]s32, count)
                        for i := 0; i < b.N; i++ {
                                for _, v := range base {
                                        f = v
                                }
                        }
                })
        }
        _ = f
}

These are the resulting benchmarks:
Benchmark16Range/10-4        19.1          19.1          +0.00%
Benchmark16Range/100-4       169           170           +0.59%
Benchmark16Range/1000-4      1684          1691          +0.42%
Benchmark16Range/10000-4     18147         18124         -0.13%
Benchmark31Range/10-4        141           142           +0.71%
Benchmark31Range/100-4       1407          1410          +0.21%
Benchmark31Range/1000-4      14070         14074         +0.03%
Benchmark31Range/10000-4     141781        141759        -0.02%
Benchmark32Range/10-4        71.4          32.2          -54.90%
Benchmark32Range/100-4       695           326           -53.09%
Benchmark32Range/1000-4      7166          3313          -53.77%
Benchmark32Range/10000-4     72571         35425         -51.19%
Benchmark64Range/10-4        87.8          64.9          -26.08%
Benchmark64Range/100-4       868           629           -27.53%
Benchmark64Range/1000-4      9355          6907          -26.17%
Benchmark64Range/10000-4     94463         70385         -25.49%
Benchmark79Range/10-4        177           152           -14.12%
Benchmark79Range/100-4       1769          1531          -13.45%
Benchmark79Range/1000-4      17893         15532         -13.20%
Benchmark79Range/10000-4     178947        155551        -13.07%
Benchmark80Range/10-4        99.6          99.7          +0.10%
Benchmark80Range/100-4       987           985           -0.20%
Benchmark80Range/1000-4      10573         10560         -0.12%
Benchmark80Range/10000-4     106792        106639        -0.14%

For runtime's BenchCopyFat* benchmarks:
CopyFat8-4     0.40ns ± 0%  0.40ns ± 0%      ~     (all equal)
CopyFat12-4    0.40ns ± 0%  0.80ns ± 0%  +100.00%  (p=0.000 n=9+9)
CopyFat16-4    0.40ns ± 0%  0.80ns ± 0%  +100.00%  (p=0.000 n=10+8)
CopyFat24-4    0.80ns ± 0%  0.40ns ± 0%   -50.00%  (p=0.001 n=8+9)
CopyFat32-4    2.01ns ± 0%  0.40ns ± 0%   -80.10%  (p=0.000 n=8+8)
CopyFat64-4    2.87ns ± 0%  0.40ns ± 0%   -86.07%  (p=0.000 n=8+10)
CopyFat128-4   4.82ns ± 0%  4.82ns ± 0%      ~     (p=1.000 n=8+8)
CopyFat256-4   8.83ns ± 0%  8.83ns ± 0%      ~     (p=1.000 n=8+8)
CopyFat512-4   16.9ns ± 0%  16.9ns ± 0%      ~     (all equal)
CopyFat520-4   14.6ns ± 0%  14.6ns ± 1%      ~     (p=0.529 n=8+9)
CopyFat1024-4  32.9ns ± 0%  33.0ns ± 0%    +0.20%  (p=0.041 n=8+9)

Function calls are not benefitted as much due how they are compiled, but
other benchmarks I ran show that calling function with 64 byte elements
is marginally improved.

The main downside with this change is that it may increase binary sizes
depending on the size of the copy, but this change also decreases
binaries for moves of 48 bytes or less.

For the following code:
package main

type size [32]byte

//go:noinline
func use(t size) {
}

//go:noinline
func get() size {
	var z size
	return z
}

func main() {
	var a size
	use(a)
}

Changing size around gives the following assembly leading up to the call
(the initialization and actual call are removed):

tip func call with 32B arg: 27B
    48 89 e7                 mov    %rsp,%rdi
    48 8d 74 24 20           lea    0x20(%rsp),%rsi
    48 89 6c 24 f0           mov    %rbp,-0x10(%rsp)
    48 8d 6c 24 f0           lea    -0x10(%rsp),%rbp
    e8 53 ab ff ff           callq  448964 <runtime.duffcopy+0x364>
    48 8b 6d 00              mov    0x0(%rbp),%rbp

modified: 19B (-8B)
    0f 10 44 24 20           movups 0x20(%rsp),%xmm0
    0f 11 04 24              movups %xmm0,(%rsp)
    0f 10 44 24 30           movups 0x30(%rsp),%xmm0
    0f 11 44 24 10           movups %xmm0,0x10(%rsp)
-
tip with 47B arg: 29B
    48 8d 7c 24 0f           lea    0xf(%rsp),%rdi
    48 8d 74 24 40           lea    0x40(%rsp),%rsi
    48 89 6c 24 f0           mov    %rbp,-0x10(%rsp)
    48 8d 6c 24 f0           lea    -0x10(%rsp),%rbp
    e8 43 ab ff ff           callq  448964 <runtime.duffcopy+0x364>
    48 8b 6d 00              mov    0x0(%rbp),%rbp

modified: 20B (-9B)
    0f 10 44 24 40           movups 0x40(%rsp),%xmm0
    0f 11 44 24 0f           movups %xmm0,0xf(%rsp)
    0f 10 44 24 50           movups 0x50(%rsp),%xmm0
    0f 11 44 24 1f           movups %xmm0,0x1f(%rsp)
-
tip with 64B arg: 27B
    48 89 e7                 mov    %rsp,%rdi
    48 8d 74 24 40           lea    0x40(%rsp),%rsi
    48 89 6c 24 f0           mov    %rbp,-0x10(%rsp)
    48 8d 6c 24 f0           lea    -0x10(%rsp),%rbp
    e8 1f ab ff ff           callq  448948 <runtime.duffcopy+0x348>
    48 8b 6d 00              mov    0x0(%rbp),%rbp

modified: 39B [+12B]
    0f 10 44 24 40           movups 0x40(%rsp),%xmm0
    0f 11 04 24              movups %xmm0,(%rsp)
    0f 10 44 24 50           movups 0x50(%rsp),%xmm0
    0f 11 44 24 10           movups %xmm0,0x10(%rsp)
    0f 10 44 24 60           movups 0x60(%rsp),%xmm0
    0f 11 44 24 20           movups %xmm0,0x20(%rsp)
    0f 10 44 24 70           movups 0x70(%rsp),%xmm0
    0f 11 44 24 30           movups %xmm0,0x30(%rsp)
-
tip with 79B arg: 29B
    48 8d 7c 24 0f           lea    0xf(%rsp),%rdi
    48 8d 74 24 60           lea    0x60(%rsp),%rsi
    48 89 6c 24 f0           mov    %rbp,-0x10(%rsp)
    48 8d 6c 24 f0           lea    -0x10(%rsp),%rbp
    e8 09 ab ff ff           callq  448948 <runtime.duffcopy+0x348>
    48 8b 6d 00              mov    0x0(%rbp),%rbp

modified: 46B [+17B]
    0f 10 44 24 60           movups 0x60(%rsp),%xmm0
    0f 11 44 24 0f           movups %xmm0,0xf(%rsp)
    0f 10 44 24 70           movups 0x70(%rsp),%xmm0
    0f 11 44 24 1f           movups %xmm0,0x1f(%rsp)
    0f 10 84 24 80 00 00     movups 0x80(%rsp),%xmm0
    00
    0f 11 44 24 2f           movups %xmm0,0x2f(%rsp)
    0f 10 84 24 90 00 00     movups 0x90(%rsp),%xmm0
    00
    0f 11 44 24 3f           movups %xmm0,0x3f(%rsp)

So, at best we save 9B, at worst we gain 17. I do not think that copying
around 65+B sized types is common enough to bloat program sizes. Using
bincmp on the go binary itself shows a zero byte difference; there are
gains and losses all over. One of the largest gains in binary size comes
from cmd/go/internal/cache.(*Cache).Get, which passes around a 64 byte
sized type -- this is one of the cases I would expect to be benefitted
by this change.

I think that this marginal improvement in struct copying for 64 byte
structs is worth it: most data structs / work items I use in my programs
are small, but few are smaller than 32 bytes: with one slice, the budget
is up. The 32 rule alone would allow another 16 bytes, the 48 and 64
rules allow another 32 and 48.

Change-Id: I19a8f9190d5d41825091f17f268f4763bfc12a62
Reviewed-on: https://go-review.googlesource.com/100718
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-22 18:17:37 +00:00
Alberto Donizetti fc6280d4b0 test/codegen: port direct comparisons with memory tests
And remove them from asm_test.

Change-Id: I1ca29b40546d6de06f20bfd550ed8ff87f495454
Reviewed-on: https://go-review.googlesource.com/102115
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-22 17:20:09 +00:00
Carlos Eduardo Seo 6633bb2aa7 cmd/compile/internal/ppc64, runtime internal/atomic, sync/atomic: implement faster atomics for ppc64x
This change implements faster atomics for ppc64x based on the ISA 2.07B,
Appendix B.2 recommendations, replacing SYNC/ISYNC by LWSYNC in some
cases.

Updates #21348

name                                           old time/op new time/op    delta
Cond1-16                                           955ns     856ns      -10.33%
Cond2-16                                          2.38µs    2.03µs      -14.59%
Cond4-16                                          5.90µs    5.44µs       -7.88%
Cond8-16                                          12.1µs    11.1µs       -8.42%
Cond16-16                                         27.0µs    25.1µs       -7.04%
Cond32-16                                         59.1µs    55.5µs       -6.14%
LoadMostlyHits/*sync_test.DeepCopyMap-16          22.1ns    24.1ns       +9.02%
LoadMostlyHits/*sync_test.RWMutexMap-16            252ns     249ns       -1.20%
LoadMostlyHits/*sync.Map-16                       16.2ns    16.3ns         ~
LoadMostlyMisses/*sync_test.DeepCopyMap-16        22.3ns    22.6ns         ~
LoadMostlyMisses/*sync_test.RWMutexMap-16          249ns     247ns       -0.51%
LoadMostlyMisses/*sync.Map-16                     12.7ns    12.7ns         ~
LoadOrStoreBalanced/*sync_test.RWMutexMap-16      1.27µs    1.17µs       -7.54%
LoadOrStoreBalanced/*sync.Map-16                  1.12µs    1.10µs       -2.35%
LoadOrStoreUnique/*sync_test.RWMutexMap-16        1.75µs    1.68µs       -3.84%
LoadOrStoreUnique/*sync.Map-16                    2.07µs    1.97µs       -5.13%
LoadOrStoreCollision/*sync_test.DeepCopyMap-16    15.8ns    15.9ns         ~
LoadOrStoreCollision/*sync_test.RWMutexMap-16      496ns     424ns      -14.48%
LoadOrStoreCollision/*sync.Map-16                 6.07ns    6.07ns         ~
Range/*sync_test.DeepCopyMap-16                   1.65µs    1.64µs         ~
Range/*sync_test.RWMutexMap-16                     278µs     288µs       +3.75%
Range/*sync.Map-16                                2.00µs    2.01µs         ~
AdversarialAlloc/*sync_test.DeepCopyMap-16        3.45µs    3.44µs         ~
AdversarialAlloc/*sync_test.RWMutexMap-16          226ns     227ns         ~
AdversarialAlloc/*sync.Map-16                     1.09µs    1.07µs       -2.36%
AdversarialDelete/*sync_test.DeepCopyMap-16        553ns     550ns       -0.57%
AdversarialDelete/*sync_test.RWMutexMap-16         273ns     274ns         ~
AdversarialDelete/*sync.Map-16                     247ns     249ns         ~
UncontendedSemaphore-16                           79.0ns    65.5ns      -17.11%
ContendedSemaphore-16                              112ns      97ns      -13.77%
MutexUncontended-16                               3.34ns    2.51ns      -24.69%
Mutex-16                                           266ns     191ns      -28.26%
MutexSlack-16                                      226ns     159ns      -29.55%
MutexWork-16                                       377ns     338ns      -10.14%
MutexWorkSlack-16                                  335ns     308ns       -8.20%
MutexNoSpin-16                                     196ns     184ns       -5.91%
MutexSpin-16                                       710ns     666ns       -6.21%
Once-16                                           1.29ns    1.29ns         ~
Pool-16                                           8.64ns    8.71ns         ~
PoolOverflow-16                                   1.60µs    1.44µs      -10.25%
SemaUncontended-16                                5.39ns    4.42ns      -17.96%
SemaSyntNonblock-16                                539ns     483ns      -10.42%
SemaSyntBlock-16                                   413ns     354ns      -14.20%
SemaWorkNonblock-16                                305ns     258ns      -15.36%
SemaWorkBlock-16                                   266ns     229ns      -14.06%
RWMutexUncontended-16                             12.9ns     9.7ns      -24.80%
RWMutexWrite100-16                                 203ns     147ns      -27.47%
RWMutexWrite10-16                                  177ns     119ns      -32.74%
RWMutexWorkWrite100-16                             435ns     403ns       -7.39%
RWMutexWorkWrite10-16                              642ns     611ns       -4.79%
WaitGroupUncontended-16                           4.67ns    3.70ns      -20.92%
WaitGroupAddDone-16                                402ns     355ns      -11.54%
WaitGroupAddDoneWork-16                            208ns     250ns      +20.09%
WaitGroupWait-16                                  1.21ns    1.21ns         ~
WaitGroupWaitWork-16                              5.91ns    5.87ns       -0.81%
WaitGroupActuallyWait-16                          92.2ns    85.8ns       -6.91%

Updates #21348

Change-Id: Ibb9b271d11b308264103829e176c6d9fe8f867d3
Reviewed-on: https://go-review.googlesource.com/95175
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2018-03-22 14:13:01 +00:00
Tim Wright 88129f0cb2 all: enable c-shared/c-archive support for freebsd/amd64
Fixes #14327
Much of the code is based on the linux/amd64 code that implements these
build modes, and code is shared where possible.

Change-Id: Ia510f2023768c0edbc863aebc585929ec593b332
Reviewed-on: https://go-review.googlesource.com/93875
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-21 21:56:20 +00:00
isharipo ff5cf43df5 runtime,sync/atomic: replace asm BYTEs with insts for x86
For each replacement, test case is added to new 386enc.s file
with exception of EMMS, SYSENTER, MFENCE and LFENCE as they
are already covered in amd64enc.s (same on amd64 and 386).

The replacement became less obvious after go vet suggested changes
Before:
	BYTE $0x0f; BYTE $0x7f; BYTE $0x44; BYTE $0x24; BYTE $0x08
Changed to MOVQ (this form is being tested):
	MOVQ M0, 8(SP)
Refactored to FP-relative access (go vet advice):
	MOVQ M0, val+4(FP)

Change-Id: I56b87cf3371b6ad81ad0cd9db2033aee407b5818
Reviewed-on: https://go-review.googlesource.com/101475
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-03-21 20:51:04 +00:00
Daniel Martí 77c3ef6f6f cmd/doc: use empty GOPATH when running the tests
Otherwise, a populated GOPATH might result in failures such as:

	$ go test
	[...] no buildable Go source files in [...]/gopherjs/compiler/natives/src/crypto/rand
	exit status 1

Move the initialization of the dirs walker out of the init func, so that
we can control its behavior in the tests.

Updates #24464.

Change-Id: I4b26a7d3d6809bdd8e9b6b0556d566e7855f80fe
Reviewed-on: https://go-review.googlesource.com/101836
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-21 13:43:22 +00:00
Alberto Donizetti 041c5d8348 cmd/trace: remove unused variable in tests
Unused variables in closures are currently not diagnosed by the
compiler (this is Issue #3059), while go/types catches them.

One unused variable in the cmd/trace tests is causing the go/types
test that typechecks the whole standard library to fail:

  FAIL: TestStdlib (8.05s)
    stdlib_test.go:223: cmd/trace/annotations_test.go:241:6: gcTime
    declared but not used
  FAIL

Remove it.

Updates #24464

Change-Id: I0f1b9db6ae1f0130616ee649bdbfdc91e38d2184
Reviewed-on: https://go-review.googlesource.com/101815
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2018-03-21 11:10:03 +00:00
Ilya Tocar 983dcf70ba cmd/compile/internal/ssa: update regalloc in loops
Currently we don't lift spill out of loop if loop contains call.
However often we have code like this:

for .. {
    if hard_case {
	call()
    }
    // simple case, without call
}

So instead of checking for any call, check for unavoidable call.
For #22698 cases I see:
mime/quotedprintable/Writer-6                   10.9µs ± 4%      9.2µs ± 3%   -15.02%  (p=0.000 n=8+8)
And:
compress/flate/Encode/Twain/Huffman/1e4-6       99.4µs ± 6%     90.9µs ± 0%    -8.57%  (p=0.000 n=8+8)
compress/flate/Encode/Twain/Huffman/1e5-6       760µs ± 1%      725µs ± 1%     -4.56%  (p=0.000 n=8+8)
compress/flate/Encode/Twain/Huffman/1e6-6       7.55ms ± 0%      7.24ms ± 0%     -4.07%  (p=0.000 n=8+7)

There are no significant changes on go1 benchmarks.
But for cases with runtime arch checks, where we call generic version on old hardware,
there are respectable performance gains:
math/RoundToEven-6                             1.43ns ± 0%     1.25ns ± 0%   -12.59%  (p=0.001 n=7+7)
math/bits/OnesCount64-6                        1.60ns ± 1%     1.42ns ± 1%   -11.32%  (p=0.000 n=8+8)

Also on some runtime benchmarks loops have less loads and higher performance:
runtime/RuneIterate/range1/ASCII-6             15.6ns ± 1%     13.9ns ± 1%   -10.74%  (p=0.000 n=7+8)
runtime/ArrayEqual-6                           3.22ns ± 0%     2.86ns ± 2%   -11.06%  (p=0.000 n=7+8)

Fixes #22698
Updates #22234

Change-Id: I0ae2f19787d07a9026f064366dedbe601bf7257a
Reviewed-on: https://go-review.googlesource.com/84055
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-03-20 21:02:39 +00:00
Alberto Donizetti be371edd67 test/codegen: port comparisons tests to codegen
And delete them from asm_test.

Change-Id: I64c512bfef3b3da6db5c5d29277675dade28b8ab
Reviewed-on: https://go-review.googlesource.com/101595
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-03-20 19:38:06 +00:00
Than McIntosh f45c07e84a cmd/compile: fix regression in DWARF inlined routine variable tracking
Fix a bug in the code that generates the pre-inlined variable
declaration table used as raw material for emitting DWARF inline
routine records. The fix for issue 23704 altered the recipe for
assigning file/line/col to variables in one part of the compiler, but
didn't update a similar recipe in the code for variable tracking.
Added a new test that should catch problems of a similar nature.

Fixes #24460.

Change-Id: I255c036637f4151aa579c0e21d123fd413724d61
Reviewed-on: https://go-review.googlesource.com/101676
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-03-20 18:56:52 +00:00
Michael Munday ae10914e67 cmd/compile: mark LAA and LAAG as clobbering flags on s390x
The atomic add instructions modify the condition code and so need to
be marked as clobbering flags.

Fixes #24449.

Change-Id: Ic69c8d775fbdbfb2a56c5e0cfca7a49c0d7f6897
Reviewed-on: https://go-review.googlesource.com/101455
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-20 09:44:50 +00:00
Fangming.Fang 9c312245ac cmd/asm: fix bug about VMOV instruction (move a vector element to another) on ARM64
This change fixes index error when encoding VMOV instruction which pattern
is vmov Vn.<T>[index], Vd.<T>[index]

Change-Id: I949166e6dfd63fb0a9365f183b6c50d452614f9d
Reviewed-on: https://go-review.googlesource.com/101335
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-20 03:45:04 +00:00
Fangming.Fang 7673e30503 cmd/asm: fix bug about VMOV instruction (move register to vector element) on ARM64
This change fixes index error when encoding VMOV instruction which pattern is
VMOV Rn, V.<T>[index]. For example VMOV R1, V1.S[1] is assembled as VMOV R1, V1.S[0]

Fixes #24400
Change-Id: I82b5edc8af4e06862bc4692b119697c6bb7dc3fb
Reviewed-on: https://go-review.googlesource.com/101297
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-20 03:43:37 +00:00
Vladimir Kuzmin c12b185a6e cmd/compile: avoid mapaccess at m[k]=append(m[k]..
Currently rvalue m[k] is transformed during walk into:

        tmp1 := *mapaccess(m, k)
        tmp2 := append(tmp1, ...)
        *mapassign(m, k) = tmp2

However, this is suboptimal, as we could instead produce just:
        tmp := mapassign(m, k)
        *tmp := append(*tmp, ...)

Optimization is possible only if during Order it may tell that m[k] is
exactly the same at left and right part of assignment. It doesn't work:
1) m[f(k)] = append(m[f(k)], ...)
2) sink, m[k] = sink, append(m[k]...)
3) m[k] = append(..., m[k],...)

Benchmark:
name                           old time/op    new time/op    delta
MapAppendAssign/Int32/256-8      33.5ns ± 3%    22.4ns ±10%  -33.24%  (p=0.000 n=16+18)
MapAppendAssign/Int32/65536-8    68.2ns ± 6%    48.5ns ±29%  -28.90%  (p=0.000 n=20+20)
MapAppendAssign/Int64/256-8      34.3ns ± 4%    23.3ns ± 5%  -32.23%  (p=0.000 n=17+18)
MapAppendAssign/Int64/65536-8    65.9ns ± 7%    61.2ns ±19%   -7.06%  (p=0.002 n=18+20)
MapAppendAssign/Str/256-8         116ns ±12%      79ns ±16%  -31.70%  (p=0.000 n=20+19)
MapAppendAssign/Str/65536-8       134ns ±15%     111ns ±45%  -16.95%  (p=0.000 n=19+20)

name                           old alloc/op   new alloc/op   delta
MapAppendAssign/Int32/256-8       47.0B ± 0%     46.0B ± 0%   -2.13%  (p=0.000 n=19+18)
MapAppendAssign/Int32/65536-8     27.0B ± 0%     20.7B ±30%  -23.33%  (p=0.000 n=20+20)
MapAppendAssign/Int64/256-8       47.0B ± 0%     46.0B ± 0%   -2.13%  (p=0.000 n=20+17)
MapAppendAssign/Int64/65536-8     27.0B ± 0%     27.0B ± 0%     ~     (all equal)
MapAppendAssign/Str/256-8         94.0B ± 0%     78.0B ± 0%  -17.02%  (p=0.000 n=20+16)
MapAppendAssign/Str/65536-8       54.0B ± 0%     54.0B ± 0%     ~     (all equal)

Fixes #24364
Updates #5147

Change-Id: Id257d052b75b9a445b4885dc571bf06ce6f6b409
Reviewed-on: https://go-review.googlesource.com/100838
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-20 01:47:07 +00:00
fanzha02 910c3a9dfc cmd/asm: add ARM64 assembler check for incorrect input
Current ARM64 assembler has no check for the invalid value of both
shift amount and post-index immediate offset of LD1/ST1. This patch
adds the check.

This patch also fixes the printing error of register number equals
to 31, which should be printed as ZR instead of R31. Test cases
are also added.

Change-Id: I476235f3ab3a3fc91fe89c5a3149a4d4529c05c7
Reviewed-on: https://go-review.googlesource.com/100255
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
2018-03-19 23:45:50 +00:00
Alberto Donizetti 5a4e09837c test/codegen: port maps test to codegen
And delete them from asm_test.

Change-Id: I3cf0934706a640136cb0f646509174f8c1bf3363
Reviewed-on: https://go-review.googlesource.com/101395
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-03-19 13:39:34 +00:00
Alberto Donizetti b61b1d2c57 test/codegen: port structs test to codegen
And delete them from asm_test.

Change-Id: Ia286239a3d8f3915f2ca25dbcb39f3354a4f8aea
Reviewed-on: https://go-review.googlesource.com/101138
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-18 16:53:53 +00:00
Daniel Martí 2767c4e285 cmd/go: remove some unused parameters
Change-Id: I441b3045e76afc1c561914926c14efc8a116c8a7
Reviewed-on: https://go-review.googlesource.com/101195
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-16 21:01:28 +00:00
David Chase b30bf958da cmd/compile: enable scopes unconditionally
This revives Alessandro Arzilli's CL to enable scopes
whenever any dwarf is emitted (with optimization or not),
adds a test that detects this changes and shows that it
creates more truthful debugging output.

Reverted change to ssa/debug_test tests made when
scopes were disabled during dwarflocationlist development.

Also included are updates to the Delve test output (it
had fallen out of sync; creating test output for one
updates it for all) and minor naming changes in
ssa/debug_test.

Compile-time/space changes (relative to tip including dwarflocationlists):

benchstat -geomean after.log scopes.log
name        old time/op     new time/op     delta
Template        182ms ± 1%      182ms ± 1%    ~     (p=0.666 n=9+9)
Unicode        82.8ms ± 1%     86.6ms ±14%    ~     (p=0.211 n=9+10)
GoTypes         611ms ± 1%      616ms ± 2%  +0.97%  (p=0.001 n=10+9)
Compiler        2.95s ± 1%      2.95s ± 0%    ~     (p=0.573 n=10+8)
SSA             6.70s ± 1%      6.81s ± 1%  +1.68%  (p=0.000 n=9+10)
Flate           117ms ± 1%      118ms ± 1%  +0.60%  (p=0.036 n=9+8)
GoParser        145ms ± 1%      145ms ± 1%    ~     (p=1.000 n=9+9)
Reflect         398ms ± 1%      396ms ± 1%    ~     (p=0.053 n=9+10)
Tar             171ms ± 1%      171ms ± 1%    ~     (p=0.356 n=9+10)
XML             214ms ± 1%      214ms ± 1%    ~     (p=0.605 n=9+9)
StdCmd          12.4s ± 2%      12.4s ± 1%    ~     (p=1.000 n=9+9)
[Geo mean]      506ms           509ms       +0.71%

name        old user-ns/op  new user-ns/op  delta
Template         254M ± 4%       249M ± 6%    ~     (p=0.155 n=10+10)
Unicode          121M ±11%       124M ± 6%    ~     (p=0.516 n=10+10)
GoTypes          824M ± 2%       869M ± 5%  +5.49%  (p=0.001 n=8+10)
Compiler        4.01G ± 2%      4.02G ± 1%    ~     (p=0.561 n=9+9)
SSA             10.0G ± 2%      10.2G ± 2%  +2.29%  (p=0.000 n=9+10)
Flate            154M ± 7%       154M ± 7%    ~     (p=0.960 n=10+9)
GoParser         190M ± 7%       196M ± 6%    ~     (p=0.064 n=9+10)
Reflect          528M ± 2%       517M ± 3%  -1.97%  (p=0.025 n=10+10)
Tar              227M ± 5%       232M ± 3%    ~     (p=0.061 n=9+10)
XML              286M ± 4%       283M ± 4%    ~     (p=0.343 n=9+9)
[Geo mean]       502M            508M       +1.09%

name        old text-bytes  new text-bytes  delta
HelloSize        672k ± 0%       672k ± 0%  +0.01%  (p=0.000 n=10+10)
CmdGoSize       7.21M ± 0%      7.21M ± 0%  -0.00%  (p=0.000 n=10+10)
[Geo mean]      2.20M           2.20M       +0.00%

name        old data-bytes  new data-bytes  delta
HelloSize       9.88k ± 0%      9.88k ± 0%    ~     (all equal)
CmdGoSize        248k ± 0%       248k ± 0%    ~     (all equal)
[Geo mean]      49.5k           49.5k       +0.00%

name        old bss-bytes   new bss-bytes   delta
HelloSize        125k ± 0%       125k ± 0%    ~     (all equal)
CmdGoSize        144k ± 0%       144k ± 0%  -0.04%  (p=0.000 n=10+10)
[Geo mean]       135k            135k       -0.02%

name        old exe-bytes   new exe-bytes   delta
HelloSize       1.30M ± 0%      1.34M ± 0%  +3.15%  (p=0.000 n=10+10)
CmdGoSize       13.5M ± 0%      13.9M ± 0%  +2.70%  (p=0.000 n=10+10)
[Geo mean]      4.19M           4.31M       +2.92%

Change-Id: Id53b8d57bd00440142ccbd39b95710e14e083fb5
Reviewed-on: https://go-review.googlesource.com/101217
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-03-16 20:25:10 +00:00
Matthew Dempsky 86a338960d reflect: sort exported methods first
By moving exported methods to the front of method lists, filtering
down to only the exported methods just needs a count of how many
exported methods exist, which the compiler can statically
provide. This allows getting rid of the exported method cache.

For #22075.

Change-Id: I8eeb274563a2940e1347c34d673f843ae2569064
Reviewed-on: https://go-review.googlesource.com/100846
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-15 21:56:08 +00:00
Matthew Dempsky 91bbe5388d cmd/compile: sort method sets earlier
By sorting method sets earlier, we can change the interface
satisfaction problem from taking O(NM) time to O(N+M). This is the
same algorithm already used by runtime and reflect for dynamic
interface satisfaction testing.

For #22075.

Change-Id: I3d889f0227f37704535739bbde11f5107b4eea17
Reviewed-on: https://go-review.googlesource.com/100845
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-03-15 21:53:01 +00:00
David Chase 1c24ffbf93 cmd/compile: turn on DWARF locations lists for ssa vars
This changes the default setting for -dwarflocationlists
from false to true, removes the flag from ssa/debug_test.go,
and updates runtime/runtime-gdb_test.go to match a change
in debugging output for composite variables.

Current benchmarks (perflock, -count 10)

benchstat -geomean before.log after.log
name        old time/op     new time/op     delta
Template        175ms ± 0%      182ms ± 1%   +3.68%  (p=0.000 n=8+9)
Unicode        82.0ms ± 2%     82.8ms ± 1%   +0.96%  (p=0.019 n=9+9)
GoTypes         590ms ± 1%      611ms ± 1%   +3.42%  (p=0.000 n=9+10)
Compiler        2.85s ± 0%      2.95s ± 1%   +3.60%  (p=0.000 n=9+10)
SSA             6.42s ± 1%      6.70s ± 1%   +4.31%  (p=0.000 n=10+9)
Flate           113ms ± 2%      117ms ± 1%   +3.11%  (p=0.000 n=10+9)
GoParser        140ms ± 1%      145ms ± 1%   +3.47%  (p=0.000 n=10+9)
Reflect         384ms ± 0%      398ms ± 1%   +3.56%  (p=0.000 n=8+9)
Tar             165ms ± 1%      171ms ± 1%   +3.33%  (p=0.000 n=9+9)
XML             207ms ± 2%      214ms ± 1%   +3.41%  (p=0.000 n=9+9)
StdCmd          11.8s ± 2%      12.4s ± 2%   +4.41%  (p=0.000 n=10+9)
[Geo mean]      489ms           506ms        +3.38%

name        old user-ns/op  new user-ns/op  delta
Template         247M ± 4%       254M ± 4%   +2.76%  (p=0.040 n=10+10)
Unicode          118M ±16%       121M ±11%     ~     (p=0.364 n=10+10)
GoTypes          805M ± 2%       824M ± 2%   +2.37%  (p=0.003 n=9+8)
Compiler        3.92G ± 2%      4.01G ± 2%   +2.20%  (p=0.001 n=9+9)
SSA             9.63G ± 4%     10.00G ± 2%   +3.81%  (p=0.000 n=10+9)
Flate            155M ±10%       154M ± 7%     ~     (p=0.718 n=9+10)
GoParser         184M ±11%       190M ± 7%     ~     (p=0.220 n=10+9)
Reflect          506M ± 4%       528M ± 2%   +4.27%  (p=0.000 n=10+10)
Tar              224M ± 4%       227M ± 5%     ~     (p=0.207 n=10+9)
XML              272M ± 7%       286M ± 4%   +5.23%  (p=0.010 n=10+9)
[Geo mean]       489M            502M        +2.76%

name        old text-bytes  new text-bytes  delta
HelloSize        672k ± 0%       672k ± 0%     ~     (all equal)
CmdGoSize       7.21M ± 0%      7.21M ± 0%     ~     (all equal)
[Geo mean]      2.20M           2.20M        +0.00%

name        old data-bytes  new data-bytes  delta
HelloSize       9.88k ± 0%      9.88k ± 0%     ~     (all equal)
CmdGoSize        248k ± 0%       248k ± 0%     ~     (all equal)
[Geo mean]      49.5k           49.5k        +0.00%

name        old bss-bytes   new bss-bytes   delta
HelloSize        125k ± 0%       125k ± 0%     ~     (all equal)
CmdGoSize        144k ± 0%       144k ± 0%     ~     (all equal)
[Geo mean]       135k            135k        +0.00%

name        old exe-bytes   new exe-bytes   delta
HelloSize       1.10M ± 0%      1.30M ± 0%  +17.82%  (p=0.000 n=10+10)
CmdGoSize       11.6M ± 0%      13.5M ± 0%  +16.90%  (p=0.000 n=10+10)
[Geo mean]      3.57M           4.19M       +17.36%

Change-Id: I250055813cadd25cebee8da1f9a7f995a6eae432
Reviewed-on: https://go-review.googlesource.com/100738
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-03-15 21:34:17 +00:00
Heschi Kreinick 1814a0595c cmd/trace: filter tasks by log text
Add a search box to the top of the user task views that only displays
tasks containing a particular log message.

Change-Id: I92f4aa113f930954e8811416901e37824f0eb884
Reviewed-on: https://go-review.googlesource.com/100843
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2018-03-15 19:59:22 +00:00
Alberto Donizetti cceee685be test/codegen: port floats tests to codegen
And delete them from asm_test.

Change-Id: Ibdaca3496eefc73c731b511ddb9636a1f3dff68c
Reviewed-on: https://go-review.googlesource.com/100915
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-15 18:05:59 +00:00
Keith Randall 9d4215311b runtime: identify special functions by flag instead of address
When there are plugins, there may not be a unique copy of runtime
functions like goexit, mcall, etc.  So identifying them by entry
address is problematic.  Instead, keep track of each special function
using a field in the symbol table.  That way, multiple copies of
the same runtime function will be treated identically.

Fixes #24351
Fixes #23133

Change-Id: Iea3232df8a6af68509769d9ca618f530cc0f84fd
Reviewed-on: https://go-review.googlesource.com/100739
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-15 17:31:57 +00:00
Daniel Martí cd2cb6e3f5 cmd/compile: cache sparse maps across ssa passes
This is done for sparse sets already, but it was missing for sparse
maps. Only affects deadstore and regalloc, as they're the only ones that
use sparse maps.

name                 old time/op    new time/op    delta
DSEPass-4               247µs ± 0%     216µs ± 0%  -12.75%  (p=0.008 n=5+5)
DSEPassBlock-4         3.05ms ± 1%    2.87ms ± 1%   -6.02%  (p=0.002 n=6+6)
CSEPass-4              2.30ms ± 0%    2.32ms ± 0%   +0.53%  (p=0.026 n=6+6)
CSEPassBlock-4         23.8ms ± 0%    23.8ms ± 0%     ~     (p=0.931 n=6+5)
DeadcodePass-4         51.7µs ± 1%    51.5µs ± 2%     ~     (p=0.429 n=5+6)
DeadcodePassBlock-4     734µs ± 1%     742µs ± 3%     ~     (p=0.394 n=6+6)
MultiPass-4             152µs ± 0%     149µs ± 2%     ~     (p=0.082 n=5+6)
MultiPassBlock-4       2.67ms ± 1%    2.41ms ± 2%   -9.77%  (p=0.008 n=5+5)

name                 old alloc/op   new alloc/op   delta
DSEPass-4              41.2kB ± 0%     0.1kB ± 0%  -99.68%  (p=0.002 n=6+6)
DSEPassBlock-4          560kB ± 0%       4kB ± 0%  -99.34%  (p=0.026 n=5+6)
CSEPass-4               189kB ± 0%     189kB ± 0%     ~     (all equal)
CSEPassBlock-4         3.10MB ± 0%    3.10MB ± 0%     ~     (p=0.444 n=5+5)
DeadcodePass-4         10.5kB ± 0%    10.5kB ± 0%     ~     (all equal)
DeadcodePassBlock-4     164kB ± 0%     164kB ± 0%     ~     (all equal)
MultiPass-4             240kB ± 0%     199kB ± 0%  -17.06%  (p=0.002 n=6+6)
MultiPassBlock-4       3.60MB ± 0%    2.99MB ± 0%  -17.06%  (p=0.002 n=6+6)

name                 old allocs/op  new allocs/op  delta
DSEPass-4                8.00 ± 0%      4.00 ± 0%  -50.00%  (p=0.002 n=6+6)
DSEPassBlock-4            240 ± 0%       120 ± 0%  -50.00%  (p=0.002 n=6+6)
CSEPass-4                9.00 ± 0%      9.00 ± 0%     ~     (all equal)
CSEPassBlock-4          1.35k ± 0%     1.35k ± 0%     ~     (all equal)
DeadcodePass-4           3.00 ± 0%      3.00 ± 0%     ~     (all equal)
DeadcodePassBlock-4      9.00 ± 0%      9.00 ± 0%     ~     (all equal)
MultiPass-4              11.0 ± 0%      10.0 ± 0%   -9.09%  (p=0.002 n=6+6)
MultiPassBlock-4          165 ± 0%       150 ± 0%   -9.09%  (p=0.002 n=6+6)

Change-Id: I43860687c88f33605eb1415f36473c5cfe8fde4a
Reviewed-on: https://go-review.googlesource.com/98449
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-03-15 17:24:39 +00:00
Giovanni Bajo a35ec9a59e cmd/compile: implement CMOV on amd64
This builds upon the branchelim pass, activating it for amd64 and
lowering CondSelect. Special care is made to FPU instructions for
NaN handling.

Benchmark results on Xeon E5630 (Westmere EP):

name                      old time/op    new time/op    delta
BinaryTree17-16              4.99s ± 9%     4.66s ± 2%     ~     (p=0.095 n=5+5)
Fannkuch11-16                4.93s ± 3%     5.04s ± 2%     ~     (p=0.548 n=5+5)
FmtFprintfEmpty-16          58.8ns ± 7%    61.4ns ±14%     ~     (p=0.579 n=5+5)
FmtFprintfString-16          114ns ± 2%     114ns ± 4%     ~     (p=0.603 n=5+5)
FmtFprintfInt-16             181ns ± 4%     125ns ± 3%  -30.90%  (p=0.008 n=5+5)
FmtFprintfIntInt-16          263ns ± 2%     217ns ± 2%  -17.34%  (p=0.008 n=5+5)
FmtFprintfPrefixedInt-16     230ns ± 1%     212ns ± 1%   -7.99%  (p=0.008 n=5+5)
FmtFprintfFloat-16           411ns ± 3%     344ns ± 5%  -16.43%  (p=0.008 n=5+5)
FmtManyArgs-16               828ns ± 4%     790ns ± 2%   -4.59%  (p=0.032 n=5+5)
GobDecode-16                10.9ms ± 4%    10.8ms ± 5%     ~     (p=0.548 n=5+5)
GobEncode-16                9.52ms ± 5%    9.46ms ± 2%     ~     (p=1.000 n=5+5)
Gzip-16                      334ms ± 2%     337ms ± 2%     ~     (p=0.548 n=5+5)
Gunzip-16                   64.4ms ± 1%    65.0ms ± 1%   +1.00%  (p=0.008 n=5+5)
HTTPClientServer-16          156µs ± 3%     155µs ± 3%     ~     (p=0.690 n=5+5)
JSONEncode-16               21.0ms ± 1%    21.8ms ± 0%   +3.76%  (p=0.016 n=5+4)
JSONDecode-16               95.1ms ± 0%    95.7ms ± 1%     ~     (p=0.151 n=5+5)
Mandelbrot200-16            6.38ms ± 1%    6.42ms ± 1%     ~     (p=0.095 n=5+5)
GoParse-16                  5.47ms ± 2%    5.36ms ± 1%   -1.95%  (p=0.016 n=5+5)
RegexpMatchEasy0_32-16       111ns ± 1%     111ns ± 1%     ~     (p=0.635 n=5+4)
RegexpMatchEasy0_1K-16       408ns ± 1%     411ns ± 2%     ~     (p=0.087 n=5+5)
RegexpMatchEasy1_32-16       103ns ± 1%     104ns ± 1%     ~     (p=0.484 n=5+5)
RegexpMatchEasy1_1K-16       659ns ± 2%     652ns ± 1%     ~     (p=0.571 n=5+5)
RegexpMatchMedium_32-16      176ns ± 2%     174ns ± 1%     ~     (p=0.476 n=5+5)
RegexpMatchMedium_1K-16     58.6µs ± 4%    57.7µs ± 4%     ~     (p=0.548 n=5+5)
RegexpMatchHard_32-16       3.07µs ± 3%    3.04µs ± 4%     ~     (p=0.421 n=5+5)
RegexpMatchHard_1K-16       89.2µs ± 1%    87.9µs ± 2%   -1.52%  (p=0.032 n=5+5)
Revcomp-16                   575ms ± 0%     587ms ± 2%   +2.12%  (p=0.032 n=4+5)
Template-16                  110ms ± 1%     107ms ± 3%   -3.00%  (p=0.032 n=5+5)
TimeParse-16                 463ns ± 0%     462ns ± 0%     ~     (p=0.810 n=5+4)
TimeFormat-16                538ns ± 0%     535ns ± 0%   -0.63%  (p=0.024 n=5+5)

name                      old speed      new speed      delta
GobDecode-16              70.7MB/s ± 4%  71.4MB/s ± 5%     ~     (p=0.452 n=5+5)
GobEncode-16              80.7MB/s ± 5%  81.2MB/s ± 2%     ~     (p=1.000 n=5+5)
Gzip-16                   58.2MB/s ± 2%  57.7MB/s ± 2%     ~     (p=0.452 n=5+5)
Gunzip-16                  302MB/s ± 1%   299MB/s ± 1%   -0.99%  (p=0.008 n=5+5)
JSONEncode-16             92.4MB/s ± 1%  89.1MB/s ± 0%   -3.63%  (p=0.016 n=5+4)
JSONDecode-16             20.4MB/s ± 0%  20.3MB/s ± 1%     ~     (p=0.135 n=5+5)
GoParse-16                10.6MB/s ± 2%  10.8MB/s ± 1%   +2.00%  (p=0.016 n=5+5)
RegexpMatchEasy0_32-16     286MB/s ± 1%   285MB/s ± 3%     ~     (p=1.000 n=5+5)
RegexpMatchEasy0_1K-16    2.51GB/s ± 1%  2.49GB/s ± 2%     ~     (p=0.095 n=5+5)
RegexpMatchEasy1_32-16     309MB/s ± 1%   307MB/s ± 1%     ~     (p=0.548 n=5+5)
RegexpMatchEasy1_1K-16    1.55GB/s ± 2%  1.57GB/s ± 1%     ~     (p=0.690 n=5+5)
RegexpMatchMedium_32-16   5.68MB/s ± 2%  5.73MB/s ± 1%     ~     (p=0.579 n=5+5)
RegexpMatchMedium_1K-16   17.5MB/s ± 4%  17.8MB/s ± 4%     ~     (p=0.500 n=5+5)
RegexpMatchHard_32-16     10.4MB/s ± 3%  10.5MB/s ± 4%     ~     (p=0.460 n=5+5)
RegexpMatchHard_1K-16     11.5MB/s ± 1%  11.7MB/s ± 2%   +1.57%  (p=0.032 n=5+5)
Revcomp-16                 442MB/s ± 0%   433MB/s ± 2%   -2.05%  (p=0.032 n=4+5)
Template-16               17.7MB/s ± 1%  18.2MB/s ± 3%   +3.12%  (p=0.032 n=5+5)

Change-Id: I6972e8f35f2b31f9a42ac473a6bf419a18022558
Reviewed-on: https://go-review.googlesource.com/100935
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-15 16:41:59 +00:00
James Cowgill 423111081b cmd/internal/obj/mips: load/store even float registers first
There is a bug in Octeon III processors where storing an odd floating
point register after it has recently been written to by a double
floating point operation will store the old value from before the double
operation (there are some extra details - the operation and store
must be a certain number of cycles apart). However, this bug does not
occur if the even register is stored first. Currently the bug only
happens on big endian because go always loads the even register first on
little endian.

Workaround the bug by always loading / storing the even floating point
register first. Since this is just an instruction reordering, it should
have no performance penalty. This follows other compilers like GCC which
will always store the even register first (although you do have to set
the ISA level to MIPS I to prevent it from using SDC1).

Change-Id: I5e73daa4d724ca1df7bf5228aab19f53f26a4976
Reviewed-on: https://go-review.googlesource.com/97735
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-15 15:40:39 +00:00
Geoff Berry e244a7a7d3 cmd/compile/internal/ssa: add patterns for arm64 bitfield opcodes
Add patterns to match common idioms for EXTR, BFI, BFXIL, SBFIZ, SBFX,
UBFIZ and UBFX opcodes.

go1 benchmarks results on Amberwing:
name                   old time/op    new time/op    delta
FmtManyArgs               786ns ± 2%     714ns ± 1%  -9.20%  (p=0.000 n=10+10)
Gzip                      437ms ± 0%     402ms ± 0%  -7.99%  (p=0.000 n=10+10)
FmtFprintfIntInt          196ns ± 0%     182ns ± 0%  -7.28%  (p=0.000 n=10+9)
FmtFprintfPrefixedInt     207ns ± 0%     199ns ± 0%  -3.86%  (p=0.000 n=10+10)
FmtFprintfFloat           324ns ± 0%     316ns ± 0%  -2.47%  (p=0.000 n=10+8)
FmtFprintfInt             119ns ± 0%     117ns ± 0%  -1.68%  (p=0.000 n=10+9)
GobDecode                12.8ms ± 2%    12.6ms ± 1%  -1.62%  (p=0.002 n=10+10)
JSONDecode               94.4ms ± 1%    93.4ms ± 0%  -1.10%  (p=0.000 n=10+10)
RegexpMatchEasy0_32       247ns ± 0%     245ns ± 0%  -0.65%  (p=0.000 n=10+10)
RegexpMatchMedium_32      314ns ± 0%     312ns ± 0%  -0.64%  (p=0.000 n=10+10)
RegexpMatchEasy0_1K       541ns ± 0%     538ns ± 0%  -0.55%  (p=0.000 n=10+9)
TimeParse                 450ns ± 1%     448ns ± 1%  -0.42%  (p=0.035 n=9+9)
RegexpMatchEasy1_32       244ns ± 0%     243ns ± 0%  -0.41%  (p=0.000 n=10+10)
GoParse                  6.03ms ± 0%    6.00ms ± 0%  -0.40%  (p=0.002 n=10+10)
RegexpMatchEasy1_1K       779ns ± 0%     777ns ± 0%  -0.26%  (p=0.000 n=10+10)
RegexpMatchHard_32       2.75µs ± 0%    2.74µs ± 1%  -0.06%  (p=0.026 n=9+9)
BinaryTree17              11.7s ± 0%     11.6s ± 0%    ~     (p=0.089 n=10+10)
HTTPClientServer         89.1µs ± 1%    89.5µs ± 2%    ~     (p=0.436 n=10+10)
RegexpMatchHard_1K       78.9µs ± 0%    79.5µs ± 2%    ~     (p=0.469 n=10+10)
FmtFprintfEmpty          58.5ns ± 0%    58.5ns ± 0%    ~     (all equal)
GobEncode                12.0ms ± 1%    12.1ms ± 0%    ~     (p=0.075 n=10+10)
Revcomp                   669ms ± 0%     668ms ± 0%    ~     (p=0.091 n=7+9)
Mandelbrot200            5.35ms ± 0%    5.36ms ± 0%  +0.07%  (p=0.000 n=9+9)
RegexpMatchMedium_1K     52.1µs ± 0%    52.1µs ± 0%  +0.10%  (p=0.000 n=9+9)
Fannkuch11                3.25s ± 0%     3.26s ± 0%  +0.36%  (p=0.000 n=9+10)
FmtFprintfString          114ns ± 1%     115ns ± 0%  +0.52%  (p=0.011 n=10+10)
JSONEncode               20.2ms ± 0%    20.3ms ± 0%  +0.65%  (p=0.000 n=10+10)
Template                 91.3ms ± 0%    92.3ms ± 0%  +1.08%  (p=0.000 n=10+10)
TimeFormat                484ns ± 0%     495ns ± 1%  +2.30%  (p=0.000 n=9+10)

There are some opportunities to improve this change further by adding
patterns to match the "extended register" versions of ADD/SUB/CMP, but I
think that should be evaluated on its own.  The regressions in Template
and TimeFormat would likely be recovered by this, as they seem to be due
to generating:

    ubfiz x0, x0, #3, #8
    add x1, x2, x0

instead of

    add x1, x2, x0, lsl #3

Change-Id: I5644a8d70ac7a98e784a377a2b76ab47a3415a4b
Reviewed-on: https://go-review.googlesource.com/88355
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-15 14:10:41 +00:00