Commit Graph

1180 Commits

Author SHA1 Message Date
Keith Randall 9dac0a8132 runtime: on a signal, set traceback address to a deferreturn call
When a function triggers a signal (like a segfault which translates to
a nil pointer exception) during execution, a sigpanic handler is just
below it on the stack.  The function itself did not stop at a
safepoint, so we have to figure out what safepoint we should use to
scan its stack frame.

Previously we used the site of the most recent defer to get the live
variables at the signal site. That answer is not quite correct, as
explained in #27518. Instead, use the site of a deferreturn call.
It has all the right variables marked as live (no args, all the return
values, except those that escape to the heap, in which case the
corresponding PAUTOHEAP variables will be live instead).
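As an illustration of the situation described above (not the reproducer from #27518), the following sketch shows a function that faults in the middle of its body while a deferred closure still needs its named results. The names node and readVal are made up for this example.

```go
package main

import "fmt"

// node is a hypothetical type used only for this illustration.
type node struct{ val int }

// readVal dereferences p. When p is nil the hardware fault is turned into a
// run-time panic at a point that is not a normal safepoint, and the deferred
// closure then runs and writes to the named result err.
func readVal(p *node) (v int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return p.val, nil
}

func main() {
	v, err := readVal(&node{val: 7})
	fmt.Println(v, err)

	_, err = readVal(nil)
	fmt.Println(err)
}
```

The return values have to stay live at the faulting instruction because the deferred closure may still assign to them, which is the property the commit message attributes to the deferreturn call site.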

This CL requires stack objects, so that all the local variables
and args referenced by the deferred closures keep the right variables alive.

Fixes #27518

Change-Id: Id45d8a8666759986c203181090b962e2981e48ca
Reviewed-on: https://go-review.googlesource.com/c/134637
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-03 19:54:23 +00:00
Keith Randall cbafcc55e8 cmd/compile,runtime: implement stack objects
Rework how the compiler+runtime handles stack-allocated variables
whose address is taken.

Direct references to such variables work as before. References through
pointers, however, use a new mechanism. The new mechanism is more
precise than the old "ambiguously live" mechanism. It computes liveness
at runtime based on the actual references among objects on the stack.

Each function records all of its address-taken objects in a FUNCDATA.
These are called "stack objects". The runtime then uses that
information while scanning a stack to find all of the stack objects on
a stack. It then does a mark phase on the stack objects, using all the
pointers found on the stack (and ancillary structures, like defer
records) as the root set. Only stack objects which are found to be
live during this mark phase will be scanned and thus retain any heap
objects they point to.
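The mark phase described above can be pictured with a toy model. This is only a sketch under simplifying assumptions: stack objects are represented as entries in a slice, pointers as slice indices, and the names stackObject and markLive are invented here; the real runtime works on addresses and FUNCDATA records.

```go
package main

import "fmt"

// stackObject is a toy stand-in for an address-taken stack variable.
// refs holds the indices of the other stack objects it points to.
type stackObject struct {
	name string
	refs []int
}

// markLive returns, for each stack object, whether it is reachable from the
// root set. roots are the indices of objects pointed to directly from the
// stack (or from ancillary structures such as defer records).
func markLive(objs []stackObject, roots []int) []bool {
	live := make([]bool, len(objs))
	work := append([]int(nil), roots...)
	for len(work) > 0 {
		i := work[len(work)-1]
		work = work[:len(work)-1]
		if live[i] {
			continue
		}
		live[i] = true
		work = append(work, objs[i].refs...)
	}
	return live
}

func main() {
	objs := []stackObject{
		{name: "a", refs: []int{1}},
		{name: "b"},
		{name: "c", refs: []int{1}}, // no root and no live object points to c
	}
	live := markLive(objs, []int{0})
	for i, o := range objs {
		fmt.Printf("%s live=%v\n", o.name, live[i])
	}
}
```

In this model only objects marked live would be scanned, so anything reachable solely through c would not be retained.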

A subsequent CL will remove all the "ambiguously live" logic from
the compiler, so that the stack object tracing will be required.
For this CL, the stack tracing is all redundant with the current
ambiguously live logic.

Update #22350

Change-Id: Ide19f1f71a5b6ec8c4d54f8f66f0e9a98344772f
Reviewed-on: https://go-review.googlesource.com/c/134155
Reviewed-by: Austin Clements <austin@google.com>
2018-10-03 19:52:49 +00:00
Clément Chigot a3a69afff8 cmd/dist: add AIX operating system.
This commit adds the AIX operating system to the cmd/dist package for the
ppc64 architecture.

The stack guard is increased because of syscalls made inside the runtime
which need a larger stack.

Disable cmd/vet/all tests until aix/ppc64 is fully available.

Change-Id: I7e3caf86724249ae564a152d90c1cbd4de288814
Reviewed-on: https://go-review.googlesource.com/c/138715
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-10-03 13:38:38 +00:00
Ben Shi c3304edf10 cmd/internal/obj/arm: delete unnecessary code
In the arm assembler, "AMOVW" never falls into optab
case 13, so the check "if p.As == AMOVW" is useless.

Change-Id: Iec241d5b4cffb358a1477f470619dc9a6287884a
Reviewed-on: https://go-review.googlesource.com/c/138575
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-03 01:05:56 +00:00
Alessandro Arzilli 1c8943bd59 cmd/link: move DIE of global variables to their compile unit
The DIEs for global variables were all assigned to the first emitted
compile unit in debug_info, regardless of what it was. Move them
instead to their respective compile units.

Change-Id: If794fa0ba4702f5b959c6e8c16119b16e7ecf6d8
Reviewed-on: https://go-review.googlesource.com/137235
Reviewed-by: Than McIntosh <thanm@google.com>
2018-09-27 11:58:35 +00:00
Brad Fitzpatrick da0d1a44ba all: use strings.ReplaceAll and bytes.ReplaceAll where applicable
I omitted vendor directories and anything necessary for bootstrapping.
(Tested by bootstrapping with Go 1.4)

Updates #27864

Change-Id: I7d9b68d0372d3a34dee22966cca323513ece7e8a
Reviewed-on: https://go-review.googlesource.com/137856
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-09-26 22:14:25 +00:00
fanzha02 e7f5f3eca4 cmd/internal/obj/arm64: add error report for invalid base register
The current assembler accepts a non-integer register as the base register,
which should be an illegal combination.

Add the test cases.

Change-Id: Ia21596bbb5b1e212e34bd3a170748ae788860422
Reviewed-on: https://go-review.googlesource.com/134575
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-12 15:45:13 +00:00
fanzha02 7ab4b5586d cmd/internal/obj/arm64: add CONSTRAINED UNPREDICTABLE behavior check for some load/store
According to the ARM64 manual, it is "constrained unpredictable behavior"
if the src and dst registers of some load/store instructions are the same.
To prevent such unpredictable behavior completely, add a check to the
assembler for the load/store instructions that it supports.

Add test cases.

Update #25823

Change-Id: I64c14ad99ee543d778e7ec8ae6516a532293dbb3
Reviewed-on: https://go-review.googlesource.com/120660
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-06 19:42:03 +00:00
fanzha02 c430adf136 cmd/internal/obj/arm64: encode float constants into FMOVS/FMOVD instructions
The current assembler rewrites every float constant except 0.0 into a value
loaded from memory, which is slow. This patch uses the FMOVS/FMOVD
instructions to move floating-point constants that fit the instructions'
immediate encoding directly into SIMD&FP destination registers. The
chipfloat7() function checks whether a constant can be encoded this way.
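To make "available constants" concrete: the ARM64 FMOV immediate form has an 8-bit field with one sign bit, three exponent bits and four fraction bits, so it can express values of the form ±(16+m)/16 × 2^e with 0 <= m <= 15 and -3 <= e <= 4 (0.125 through 31.0). The sketch below tests that property directly. The function name fitsFMOVImmediate is hypothetical, and this is not the assembler's chipfloat7() implementation.

```go
package main

import (
	"fmt"
	"math"
)

// fitsFMOVImmediate reports whether f can be written as
// ±(16+m)/16 × 2^e with 0 <= m <= 15 and -3 <= e <= 4.
// Hypothetical helper for illustration only.
func fitsFMOVImmediate(f float64) bool {
	if f == 0 || math.IsNaN(f) || math.IsInf(f, 0) {
		return false
	}
	// Frexp returns frac in [0.5, 1) such that |f| = frac × 2^exp.
	frac, exp := math.Frexp(math.Abs(f))
	e := exp - 1 // exponent once the mantissa is normalized to [1, 2)
	if e < -3 || e > 4 {
		return false
	}
	// Only four fraction bits exist, so mantissa × 16 must be an integer.
	scaled := frac * 2 * 16
	return scaled == math.Trunc(scaled)
}

func main() {
	for _, f := range []float64{1.0, 0.5, -2.5, 31.0, 0.125, 0.1, 32.0, 0.0} {
		fmt.Printf("%v: %v\n", f, fitsFMOVImmediate(f))
	}
}
```

Note that 0.0 is not in this set, which is consistent with the commit message treating it as a separate case.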

go1 benchmark results.
name                     old time/op    new time/op    delta
BinaryTree17-8              6.27s ± 1%     6.27s ± 1%    ~     (p=0.762 n=10+8)
Fannkuch11-8                5.42s ± 1%     5.38s ± 0%  -0.63%  (p=0.000 n=10+10)
FmtFprintfEmpty-8          92.9ns ± 1%    93.4ns ± 0%  +0.47%  (p=0.004 n=9+8)
FmtFprintfString-8          169ns ± 2%     170ns ± 4%    ~     (p=0.378 n=10+10)
FmtFprintfInt-8             197ns ± 1%     196ns ± 1%  -0.77%  (p=0.009 n=10+9)
FmtFprintfIntInt-8          284ns ± 1%     286ns ± 1%    ~     (p=0.051 n=10+10)
FmtFprintfPrefixedInt-8     419ns ± 0%     422ns ± 1%  +0.69%  (p=0.038 n=6+10)
FmtFprintfFloat-8           458ns ± 0%     463ns ± 1%  +1.14%  (p=0.000 n=10+10)
FmtManyArgs-8              1.35µs ± 2%    1.36µs ± 1%  +0.91%  (p=0.043 n=10+10)
GobDecode-8                16.0ms ± 2%    15.5ms ± 1%   -3.39%  (p=0.000 n=10+10)
GobEncode-8                11.9ms ± 3%    11.4ms ± 1%   -3.98%  (p=0.000 n=10+9)
Gzip-8                      621ms ± 0%     625ms ± 0%   +0.59%  (p=0.000 n=9+10)
Gunzip-8                   74.0ms ± 1%    74.3ms ± 0%     ~     (p=0.059 n=9+8)
HTTPClientServer-8          116µs ± 1%     116µs ± 1%     ~     (p=0.165 n=10+10)
JSONEncode-8               29.3ms ± 1%    29.5ms ± 0%   +0.72%  (p=0.001 n=10+10)
JSONDecode-8                145ms ± 1%     148ms ± 2%   +2.06%  (p=0.000 n=10+10)
Mandelbrot200-8            9.67ms ± 0%    9.48ms ± 1%   -1.92%  (p=0.000 n=8+10)
GoParse-8                  7.55ms ± 0%    7.60ms ± 0%   +0.57%  (p=0.000 n=9+10)
RegexpMatchEasy0_32-8       234ns ± 0%     210ns ± 0%  -10.13%  (p=0.000 n=8+10)
RegexpMatchEasy0_1K-8       753ns ± 1%     729ns ± 0%   -3.17%  (p=0.000 n=10+8)
RegexpMatchEasy1_32-8       225ns ± 0%     224ns ± 0%   -0.44%  (p=0.000 n=9+9)
RegexpMatchEasy1_1K-8      1.03µs ± 0%    1.04µs ± 1%   +1.29%  (p=0.000 n=10+10)
RegexpMatchMedium_32-8      320ns ± 3%     296ns ± 6%   -7.50%  (p=0.000 n=10+10)
RegexpMatchMedium_1K-8     77.0µs ± 5%    73.6µs ± 1%     ~     (p=0.393 n=10+10)
RegexpMatchHard_32-8       3.93µs ± 0%    3.89µs ± 1%   -0.95%  (p=0.000 n=10+9)
RegexpMatchHard_1K-8        120µs ± 5%     115µs ± 1%     ~     (p=0.739 n=10+10)
Revcomp-8                   1.07s ± 0%     1.08s ± 1%   +0.63%  (p=0.000 n=10+9)
Template-8                  165ms ± 1%     163ms ± 1%   -1.05%  (p=0.001 n=8+10)
TimeParse-8                 751ns ± 1%     749ns ± 1%     ~     (p=0.209 n=10+10)
TimeFormat-8                759ns ± 1%     751ns ± 1%   -0.96%  (p=0.001 n=10+10)

name                     old speed      new speed      delta
GobDecode-8              48.0MB/s ± 2%  49.6MB/s ± 1%   +3.50%  (p=0.000 n=10+10)
GobEncode-8              64.5MB/s ± 3%  67.1MB/s ± 1%   +4.08%  (p=0.000 n=10+9)
Gzip-8                   31.2MB/s ± 0%  31.1MB/s ± 0%   -0.55%  (p=0.000 n=9+8)
Gunzip-8                  262MB/s ± 1%   261MB/s ± 0%     ~     (p=0.059 n=9+8)
JSONEncode-8             66.3MB/s ± 1%  65.8MB/s ± 0%   -0.72%  (p=0.001 n=10+10)
JSONDecode-8             13.4MB/s ± 1%  13.2MB/s ± 1%   -2.02%  (p=0.000 n=10+10)
GoParse-8                7.67MB/s ± 0%  7.63MB/s ± 0%   -0.57%  (p=0.000 n=9+10)
RegexpMatchEasy0_32-8     136MB/s ± 0%   152MB/s ± 0%  +11.45%  (p=0.000 n=10+10)
RegexpMatchEasy0_1K-8    1.36GB/s ± 1%  1.40GB/s ± 0%   +3.25%  (p=0.000 n=10+8)
RegexpMatchEasy1_32-8     142MB/s ± 0%   143MB/s ± 0%   +0.35%  (p=0.000 n=10+9)
RegexpMatchEasy1_1K-8     992MB/s ± 0%   980MB/s ± 1%   -1.27%  (p=0.000 n=10+10)
RegexpMatchMedium_32-8   3.12MB/s ± 3%  3.38MB/s ± 6%   +8.17%  (p=0.000 n=10+10)
RegexpMatchMedium_1K-8   13.3MB/s ± 5%  13.9MB/s ± 1%     ~     (p=0.362 n=10+10)
RegexpMatchHard_32-8     8.14MB/s ± 0%  8.21MB/s ± 1%   +0.95%  (p=0.000 n=10+9)
RegexpMatchHard_1K-8     8.54MB/s ± 5%  8.90MB/s ± 1%     ~     (p=0.636 n=10+10)
Revcomp-8                 238MB/s ± 0%   236MB/s ± 1%   -0.63%  (p=0.000 n=10+9)
Template-8               11.8MB/s ± 1%  11.9MB/s ± 1%   +1.07%  (p=0.001 n=8+10)

Change-Id: I57b372d8dcd47e6aec39893843b20385d5d9c37e
Reviewed-on: https://go-review.googlesource.com/129555
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05 14:47:02 +00:00
Ben Shi 1018a80fe8 cmd/internal/obj/arm64: support more atomic instructions
LDADDALD (64-bit) and LDADDALW (32-bit) are already supported.
This CL adds support for LDADDALH (16-bit) and LDADDALB (8-bit).

Change-Id: I4eac61adcec226d618dfce88618a2b98f5f1afe7
Reviewed-on: https://go-review.googlesource.com/132135
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-04 20:29:33 +00:00
Michael Munday 6f9b94ab66 cmd/compile: implement OnesCount{8,16,32,64} intrinsics on s390x
This CL implements the math/bits.OnesCount{8,16,32,64} functions
as intrinsics on s390x using the 'population count' (popcnt)
instruction. This instruction was released as the 'population-count'
facility which uses the same facility bit (45) as the
'distinct-operands' facility which is a pre-requisite for Go on
s390x. We can therefore use it without a feature check.

The s390x popcnt instruction treats a 64 bit register as a vector
of 8 bytes, summing the number of ones in each byte individually.
It then writes the results to the corresponding bytes in the
output register. Therefore to implement OnesCount{16,32,64} we
need to sum the individual byte counts using some extra
instructions. To do this efficiently I've added some additional
pseudo operations to the s390x SSA backend.

Unlike other architectures the new instruction sequence is faster
for OnesCount8, so that is implemented using the intrinsic.

name         old time/op  new time/op  delta
OnesCount    3.21ns ± 1%  1.35ns ± 0%  -58.00%  (p=0.000 n=20+20)
OnesCount8   0.91ns ± 1%  0.81ns ± 0%  -11.43%  (p=0.000 n=20+20)
OnesCount16  1.51ns ± 3%  1.21ns ± 0%  -19.71%  (p=0.000 n=20+17)
OnesCount32  1.91ns ± 0%  1.12ns ± 1%  -41.60%  (p=0.000 n=19+20)
OnesCount64  3.18ns ± 4%  1.35ns ± 0%  -57.52%  (p=0.000 n=20+20)

Change-Id: Id54f0bd28b6db9a887ad12c0d72fcc168ef9c4e0
Reviewed-on: https://go-review.googlesource.com/114675
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-03 14:35:38 +00:00
Zheng Xu 8f4fd3f34e build: support frame-pointer for arm64
Supporting frame-pointer makes Linux's perf and other profilers much more useful
because it lets them gather a stack trace efficiently on profiling events. Major
changes include:
1. save FP in the word below the one RSP points to (proposed by Cherry and Austin)
2. adjust some specific offsets in runtime assembly and wrapper code
3. add FP support to the goroutine scheduler
4. adjust link stack overflow check to take the extra word into account
5. adjust nosplit test cases to enable frame sizes which are 16-byte aligned

Performance impacts on go1 benchmarks:

Enable frame-pointer (by default)

name                      old time/op    new time/op    delta
BinaryTree17-46              5.94s ± 0%     6.00s ± 0%  +1.03%  (p=0.029 n=4+4)
Fannkuch11-46                2.84s ± 1%     2.77s ± 0%  -2.58%  (p=0.008 n=5+5)
FmtFprintfEmpty-46          55.0ns ± 1%    58.9ns ± 1%  +7.06%  (p=0.008 n=5+5)
FmtFprintfString-46          102ns ± 0%     105ns ± 0%  +2.94%  (p=0.008 n=5+5)
FmtFprintfInt-46             118ns ± 0%     117ns ± 1%  -1.19%  (p=0.000 n=4+5)
FmtFprintfIntInt-46          181ns ± 0%     182ns ± 1%    ~     (p=0.444 n=5+5)
FmtFprintfPrefixedInt-46     215ns ± 1%     214ns ± 0%    ~     (p=0.254 n=5+4)
FmtFprintfFloat-46           292ns ± 0%     296ns ± 0%  +1.46%  (p=0.029 n=4+4)
FmtManyArgs-46               720ns ± 0%     732ns ± 0%  +1.72%  (p=0.008 n=5+5)
GobDecode-46                9.82ms ± 1%   10.03ms ± 2%  +2.10%  (p=0.008 n=5+5)
GobEncode-46                8.14ms ± 0%    8.72ms ± 1%  +7.14%  (p=0.008 n=5+5)
Gzip-46                      420ms ± 0%     424ms ± 0%  +0.92%  (p=0.008 n=5+5)
Gunzip-46                   48.2ms ± 0%    48.4ms ± 0%  +0.41%  (p=0.008 n=5+5)
HTTPClientServer-46          201µs ± 4%     201µs ± 0%    ~     (p=0.730 n=5+4)
JSONEncode-46               17.1ms ± 0%    17.7ms ± 1%  +3.80%  (p=0.008 n=5+5)
JSONDecode-46               88.0ms ± 0%    90.1ms ± 0%  +2.42%  (p=0.008 n=5+5)
Mandelbrot200-46            5.06ms ± 0%    5.07ms ± 0%    ~     (p=0.310 n=5+5)
GoParse-46                  5.04ms ± 0%    5.12ms ± 0%  +1.53%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-46       117ns ± 0%     117ns ± 0%    ~     (all equal)
RegexpMatchEasy0_1K-46       332ns ± 0%     329ns ± 0%  -0.78%  (p=0.008 n=5+5)
RegexpMatchEasy1_32-46       104ns ± 0%     113ns ± 0%  +8.65%  (p=0.029 n=4+4)
RegexpMatchEasy1_1K-46       563ns ± 0%     569ns ± 0%  +1.10%  (p=0.008 n=5+5)
RegexpMatchMedium_32-46      167ns ± 2%     177ns ± 1%  +5.74%  (p=0.008 n=5+5)
RegexpMatchMedium_1K-46     49.5µs ± 0%    53.4µs ± 0%  +7.81%  (p=0.008 n=5+5)
RegexpMatchHard_32-46       2.56µs ± 1%    2.72µs ± 0%  +6.01%  (p=0.008 n=5+5)
RegexpMatchHard_1K-46       77.0µs ± 0%    81.8µs ± 0%  +6.24%  (p=0.016 n=5+4)
Revcomp-46                   631ms ± 1%     627ms ± 1%    ~     (p=0.095 n=5+5)
Template-46                 81.8ms ± 0%    86.3ms ± 0%  +5.55%  (p=0.008 n=5+5)
TimeParse-46                 423ns ± 0%     432ns ± 0%  +2.32%  (p=0.008 n=5+5)
TimeFormat-46                478ns ± 2%     497ns ± 1%  +3.89%  (p=0.008 n=5+5)
[Geo mean]                  71.6µs         73.3µs       +2.45%

name                      old speed      new speed      delta
GobDecode-46              78.1MB/s ± 1%  76.6MB/s ± 2%  -2.04%  (p=0.008 n=5+5)
GobEncode-46              94.3MB/s ± 0%  88.0MB/s ± 1%  -6.67%  (p=0.008 n=5+5)
Gzip-46                   46.2MB/s ± 0%  45.8MB/s ± 0%  -0.91%  (p=0.008 n=5+5)
Gunzip-46                  403MB/s ± 0%   401MB/s ± 0%  -0.41%  (p=0.008 n=5+5)
JSONEncode-46              114MB/s ± 0%   109MB/s ± 1%  -3.66%  (p=0.008 n=5+5)
JSONDecode-46             22.0MB/s ± 0%  21.5MB/s ± 0%  -2.35%  (p=0.008 n=5+5)
GoParse-46                11.5MB/s ± 0%  11.3MB/s ± 0%  -1.51%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-46     272MB/s ± 0%   272MB/s ± 1%    ~     (p=0.190 n=4+5)
RegexpMatchEasy0_1K-46    3.08GB/s ± 0%  3.11GB/s ± 0%  +0.77%  (p=0.008 n=5+5)
RegexpMatchEasy1_32-46     306MB/s ± 0%   283MB/s ± 0%  -7.63%  (p=0.029 n=4+4)
RegexpMatchEasy1_1K-46    1.82GB/s ± 0%  1.80GB/s ± 0%  -1.07%  (p=0.008 n=5+5)
RegexpMatchMedium_32-46   5.99MB/s ± 0%  5.64MB/s ± 1%  -5.77%  (p=0.016 n=4+5)
RegexpMatchMedium_1K-46   20.7MB/s ± 0%  19.2MB/s ± 0%  -7.25%  (p=0.008 n=5+5)
RegexpMatchHard_32-46     12.5MB/s ± 1%  11.8MB/s ± 0%  -5.66%  (p=0.008 n=5+5)
RegexpMatchHard_1K-46     13.3MB/s ± 0%  12.5MB/s ± 1%  -6.01%  (p=0.008 n=5+5)
Revcomp-46                 402MB/s ± 1%   405MB/s ± 1%    ~     (p=0.095 n=5+5)
Template-46               23.7MB/s ± 0%  22.5MB/s ± 0%  -5.25%  (p=0.008 n=5+5)
[Geo mean]                82.2MB/s       79.6MB/s       -3.26%

Disable frame-pointer (GOEXPERIMENT=noframepointer)

name                      old time/op    new time/op    delta
BinaryTree17-46              5.94s ± 0%     5.96s ± 0%  +0.39%  (p=0.029 n=4+4)
Fannkuch11-46                2.84s ± 1%     2.79s ± 1%  -1.68%  (p=0.008 n=5+5)
FmtFprintfEmpty-46          55.0ns ± 1%    55.2ns ± 3%    ~     (p=0.794 n=5+5)
FmtFprintfString-46          102ns ± 0%     103ns ± 0%  +0.98%  (p=0.016 n=5+4)
FmtFprintfInt-46             118ns ± 0%     115ns ± 0%  -2.54%  (p=0.029 n=4+4)
FmtFprintfIntInt-46          181ns ± 0%     179ns ± 0%  -1.10%  (p=0.000 n=5+4)
FmtFprintfPrefixedInt-46     215ns ± 1%     213ns ± 0%    ~     (p=0.143 n=5+4)
FmtFprintfFloat-46           292ns ± 0%     300ns ± 0%  +2.83%  (p=0.029 n=4+4)
FmtManyArgs-46               720ns ± 0%     739ns ± 0%  +2.64%  (p=0.008 n=5+5)
GobDecode-46                9.82ms ± 1%    9.78ms ± 1%    ~     (p=0.151 n=5+5)
GobEncode-46                8.14ms ± 0%    8.12ms ± 1%    ~     (p=0.690 n=5+5)
Gzip-46                      420ms ± 0%     420ms ± 0%    ~     (p=0.548 n=5+5)
Gunzip-46                   48.2ms ± 0%    48.0ms ± 0%  -0.33%  (p=0.032 n=5+5)
HTTPClientServer-46          201µs ± 4%     199µs ± 3%    ~     (p=0.548 n=5+5)
JSONEncode-46               17.1ms ± 0%    17.2ms ± 0%    ~     (p=0.056 n=5+5)
JSONDecode-46               88.0ms ± 0%    88.6ms ± 0%  +0.64%  (p=0.008 n=5+5)
Mandelbrot200-46            5.06ms ± 0%    5.07ms ± 0%    ~     (p=0.548 n=5+5)
GoParse-46                  5.04ms ± 0%    5.07ms ± 0%  +0.65%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-46       117ns ± 0%     112ns ± 4%  -4.27%  (p=0.016 n=4+5)
RegexpMatchEasy0_1K-46       332ns ± 0%     330ns ± 1%    ~     (p=0.095 n=5+5)
RegexpMatchEasy1_32-46       104ns ± 0%     110ns ± 1%  +5.29%  (p=0.029 n=4+4)
RegexpMatchEasy1_1K-46       563ns ± 0%     567ns ± 2%    ~     (p=0.151 n=5+5)
RegexpMatchMedium_32-46      167ns ± 2%     166ns ± 0%    ~     (p=0.333 n=5+4)
RegexpMatchMedium_1K-46     49.5µs ± 0%    49.6µs ± 0%    ~     (p=0.841 n=5+5)
RegexpMatchHard_32-46       2.56µs ± 1%    2.49µs ± 0%  -2.81%  (p=0.008 n=5+5)
RegexpMatchHard_1K-46       77.0µs ± 0%    75.8µs ± 0%  -1.55%  (p=0.008 n=5+5)
Revcomp-46                   631ms ± 1%     628ms ± 0%    ~     (p=0.095 n=5+5)
Template-46                 81.8ms ± 0%    84.3ms ± 1%  +3.05%  (p=0.008 n=5+5)
TimeParse-46                 423ns ± 0%     425ns ± 0%  +0.52%  (p=0.008 n=5+5)
TimeFormat-46                478ns ± 2%     478ns ± 1%    ~     (p=1.000 n=5+5)
[Geo mean]                  71.6µs         71.6µs       -0.01%

name                      old speed      new speed      delta
GobDecode-46              78.1MB/s ± 1%  78.5MB/s ± 1%    ~     (p=0.151 n=5+5)
GobEncode-46              94.3MB/s ± 0%  94.5MB/s ± 1%    ~     (p=0.690 n=5+5)
Gzip-46                   46.2MB/s ± 0%  46.2MB/s ± 0%    ~     (p=0.571 n=5+5)
Gunzip-46                  403MB/s ± 0%   404MB/s ± 0%  +0.33%  (p=0.032 n=5+5)
JSONEncode-46              114MB/s ± 0%   113MB/s ± 0%    ~     (p=0.056 n=5+5)
JSONDecode-46             22.0MB/s ± 0%  21.9MB/s ± 0%  -0.64%  (p=0.008 n=5+5)
GoParse-46                11.5MB/s ± 0%  11.4MB/s ± 0%  -0.64%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-46     272MB/s ± 0%   285MB/s ± 4%  +4.74%  (p=0.016 n=4+5)
RegexpMatchEasy0_1K-46    3.08GB/s ± 0%  3.10GB/s ± 1%    ~     (p=0.151 n=5+5)
RegexpMatchEasy1_32-46     306MB/s ± 0%   290MB/s ± 1%  -5.21%  (p=0.029 n=4+4)
RegexpMatchEasy1_1K-46    1.82GB/s ± 0%  1.81GB/s ± 2%    ~     (p=0.151 n=5+5)
RegexpMatchMedium_32-46   5.99MB/s ± 0%  6.02MB/s ± 1%    ~     (p=0.063 n=4+5)
RegexpMatchMedium_1K-46   20.7MB/s ± 0%  20.7MB/s ± 0%    ~     (p=0.659 n=5+5)
RegexpMatchHard_32-46     12.5MB/s ± 1%  12.8MB/s ± 0%  +2.88%  (p=0.008 n=5+5)
RegexpMatchHard_1K-46     13.3MB/s ± 0%  13.5MB/s ± 0%  +1.58%  (p=0.008 n=5+5)
Revcomp-46                 402MB/s ± 1%   405MB/s ± 0%    ~     (p=0.095 n=5+5)
Template-46               23.7MB/s ± 0%  23.0MB/s ± 1%  -2.95%  (p=0.008 n=5+5)
[Geo mean]                82.2MB/s       82.3MB/s       +0.04%

The frame pointer is enabled by default on Linux and can be disabled by setting GOEXPERIMENT=noframepointer.

Fixes #10110

Change-Id: I1bfaca6dba29a63009d7c6ab04ed7a1413d9479e
Reviewed-on: https://go-review.googlesource.com/61511
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-08-29 18:28:34 +00:00
Ben Shi 84374d4de5 cmd/internal/obj: support more arm64 FP instructions
ARM64 also supports floating-point LDP (load pair) and STP (store pair).
This CL adds the implementation and corresponding test cases for
FLDPD/FLDPS/FSTPD/FSTPS.

Change-Id: I45f112012a4e097bfaf023d029b36e6cbc7a5859
Reviewed-on: https://go-review.googlesource.com/125438
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-08-24 03:00:59 +00:00
Iskander Sharipov ed2f84a94e cmd/internal/obj/arm64: simplify some bool expressions
Replace `!(o1 != 0)` with `o1 == 0` (for readability).

Found using https://go-critic.github.io/overview.html#boolExprSimplify-ref

Change-Id: I4fc035458f530973f9be15b38441ec7b5fb591ec
Reviewed-on: https://go-review.googlesource.com/123377
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-22 16:24:52 +00:00
Jordan Rhee aa311fecda cmd/link: support windows/arm
Enable the Go linker to generate executables for windows/arm.

Generates PE relocation tables, which are used by Windows to
dynamically relocate the Go binary in memory. Windows on ARM
requires all modules to be relocatable, unlike x86/amd64 which are
permitted to have fixed base addresses.

Updates #26148

Change-Id: Ie63964ff52c2377e121b2885e9d05ec3ed8dc1cd
Reviewed-on: https://go-review.googlesource.com/125648
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-21 22:11:05 +00:00
Ian Lance Taylor 7e64377903 cmd/compile: only support -race and -msan where they work
Consolidate decision about whether -race and -msan options are
supported in cmd/internal/sys. Use consolidated functions in
cmd/compile and cmd/go. Use a copy of them in cmd/dist; cmd/dist can't
import cmd/internal/sys because Go 1.4 doesn't have it.

Fixes #24315

Change-Id: I9cecaed4895eb1a2a49379b4848db40de66d32a9
Reviewed-on: https://go-review.googlesource.com/121816
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-21 03:38:27 +00:00
Xia Bin 477b7e5f4d cmd/internal/obj: remove pointless validation
s.Func.Text can only be nil at this point; otherwise there is a
bug in the compiler's Go runtime.

Change-Id: Ib2ff9bb977352838e67f2b98a69468f6f350c1f3
Reviewed-on: https://go-review.googlesource.com/123535
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-08-20 22:16:46 +00:00
Ben Shi 84feb4bbb7 cmd/internal/obj/arm64: add register indexed FMOVS/FMOVD
This CL adds register indexed FMOVS/FMOVD.
FMOVS Fx, (Rn)(Rm)
FMOVS Fx, (Rn)(Rm<<2)
FMOVD Fx, (Rn)(Rm)
FMOVD Fx, (Rn)(Rm<<3)
FMOVS (Rn)(Rm), Fx
FMOVS (Rn)(Rm<<2), Fx
FMOVD (Rn)(Rm), Fx
FMOVD (Rn)(Rm<<3), Fx

Change-Id: Id76de6a4be96b64cf79d7e9a1962d9d49cb462f2
Reviewed-on: https://go-review.googlesource.com/123995
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-08-20 15:15:16 +00:00
Ben Shi 26d62b4ca9 cmd/internal/obj/arm64: add SWPALD/SWPALW/SWPALH/SWPALB
These new instructions have acquire/release semantics, in addition to the
normal atomic SWPD/SWPW/SWPH/SWPB.

Change-Id: I24821a4d21aebc342897ae52903aef612c8d8a4a
Reviewed-on: https://go-review.googlesource.com/128476
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-08-20 14:09:51 +00:00
Ian Lance Taylor 65fa2b615b cmd/internal/objfile: only consider executable segments for load address
Reportedly on some new Fedora systems the linker is producing extra
load segments, basically making the dynamic section non-executable.
We were assuming that the first load segment could be used to
determine the program's load offset, but that is no longer true.
Use the first executable load segment instead.

Fixes #26369

Change-Id: I5ee31ddeef2e8caeed3112edc5149065a6448456
Reviewed-on: https://go-review.googlesource.com/127895
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-03 23:35:53 +00:00
Ben Shi 9594ba4fe5 cmd/internal/obj/arm64: fix incorrect rejection of legal instructions
"BFI $0, R1, $7, R2" is expected to copy bit 0~6 from R1 to R2, and
leave R2's other bits unchanged.
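The expected effect can be written as a small Go model of a bitfield insert. The function bfi and its parameter order are invented for this sketch; it only mirrors the behaviour described in the sentence above.

```go
package main

import "fmt"

// bfi models a bitfield insert: it copies the low width bits of src into dst
// starting at bit lsb and leaves every other bit of dst unchanged.
// Hypothetical helper; it assumes lsb+width <= 64.
func bfi(dst, src uint64, lsb, width uint) uint64 {
	mask := (uint64(1)<<width - 1) << lsb
	return dst&^mask | (src<<lsb)&mask
}

func main() {
	// Model of "BFI $0, R1, $7, R2" with R1 = 0x2a and R2 = all ones:
	// bits 0-6 of the result come from R1, bits 7-63 from R2.
	r1 := uint64(0x2a)
	r2 := ^uint64(0)
	fmt.Printf("%#x\n", bfi(r2, r1, 0, 7))
}
```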

But the assembler rejects it with error "illegal bit number", and
BFIW/SBFIZ/SBFIZW/UBFIZ/UBFIZW have the same problem.

This CL fixes that issue and adds corresponding test cases.

fixes #26736

Change-Id: Ie0090a0faa38a49dd9b096a0f435987849800b76
Reviewed-on: https://go-review.googlesource.com/127159
Run-TryBot: Ben Shi <powerman1st@163.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-08-03 15:44:22 +00:00
Ben Shi e351a16005 cmd/internal/obj/arm64: reject incorrect form of LDP/STP
"LDP (R0), (F0, F1)" and "STP (F1, F2), (R0)" are
silently accepted by the arm64 assembler without
any error message. This CL fixes that bug.

fixes #26556.

Change-Id: Ib6fae81956deb39a4ffd95e9409acc8dad3ab2d2
Reviewed-on: https://go-review.googlesource.com/125637
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-07-30 15:31:06 +00:00
Ian Lance Taylor 274fde9a36 cmd/internal/buildid: close ELF file after reading note
Updates #26400

Change-Id: I1747d1f1018521cdfa4b3ed13412a944829967cf
Reviewed-on: https://go-review.googlesource.com/124235
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-07-16 23:37:14 +00:00
Than McIntosh ec88f781c2 cmd/compile: call objabi.PathToPrefix when emitting abstract fn
When generating an abstract function DIE, call objabi.PathToPrefix on
the import path so as to be consistent with how the linker handles
import paths. This is intended to resolve another problem with DWARF
inline info generation in which there are multiple inconsistent
versions of an abstract function DIE for a function whose package path
is rewritten/canonicalized by objabi.PathToPrefix.

Fixes #26237

Change-Id: I4b64c090ae43a1ad87f47587a1a71f19bc5fc8e8
Reviewed-on: https://go-review.googlesource.com/123036
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-07-10 18:54:39 +00:00
Ian Lance Taylor 7254cfc37b cmd/go: revert "output coverage report even if there are no test files"
Original CL description:

    When using test -cover or -coverprofile the output for "no test files"
    is the same format as for "no tests to run".

Reverting because this CL changed cmd/go to build test binaries for
packages that have no tests, leading to extra work and confusion.

Updates #24570
Fixes #25789
Fixes #26157
Fixes #26242

Change-Id: Ibab1307d39dfaec0de9359d6d96706e3910c8efd
Reviewed-on: https://go-review.googlesource.com/122518
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2018-07-09 14:42:21 +00:00
Michael Munday 9fa988547a cmd/internal/obj/s390x: increase maximum number of loop iterations
The maximum number of 'spanz' iterations that the s390x assembler
performs to reach a fixed point for relative offsets was 10. This
turned out to be too aggressive for one example of auto-generated
fuzzing code. Increase the number of iterations by 10x to reduce
the likelihood that the limit will be hit again. This limit only
exists to help find bugs in the assembler.
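The idea of iterating to a fixed point under an iteration cap can be shown with a toy layout loop. Everything here is invented for illustration (the inst type, the sizes, the reach of the short encoding, and the layout function); it is not the s390x assembler's spanz code.

```go
package main

import (
	"errors"
	"fmt"
)

// inst is a toy instruction. target is the index of the branch target,
// or -1 for a non-branch. long records whether the long encoding is chosen.
type inst struct {
	target int
	long   bool
}

const (
	shortSize  = 4  // bytes used by a plain instruction or a short branch
	longSize   = 8  // bytes used by a long branch
	shortRange = 16 // toy reach of the short branch encoding, in bytes
)

// layout assigns offsets, widens any branch whose target is out of short
// range, and repeats until nothing changes or maxIter passes have been used.
func layout(prog []inst, maxIter int) ([]int, error) {
	offsets := make([]int, len(prog)+1)
	for iter := 0; iter < maxIter; iter++ {
		pc := 0
		for i, in := range prog {
			offsets[i] = pc
			if in.long {
				pc += longSize
			} else {
				pc += shortSize
			}
		}
		offsets[len(prog)] = pc

		changed := false
		for i := range prog {
			in := &prog[i]
			if in.target < 0 || in.long {
				continue
			}
			d := offsets[in.target] - offsets[i]
			if d < 0 {
				d = -d
			}
			if d > shortRange {
				in.long = true
				changed = true
			}
		}
		if !changed {
			return offsets, nil
		}
	}
	return nil, errors.New("layout did not reach a fixed point")
}

func main() {
	prog := []inst{{target: 5}, {target: -1}, {target: -1}, {target: -1}, {target: -1}, {target: -1}}
	offsets, err := layout(prog, 10)
	fmt.Println(offsets, err)
}
```

Widening one branch moves every later instruction, which can push other branches out of range, so inputs with many interdependent branches may need more passes than a small cap allows.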

master at tip does not fail with the example code in the issue, so I
have not submitted it as a test (it is also quite large).
I tested this change with the example code at the commit given, and
it fixes the issue.

Fixes #25269.

Change-Id: I0e44948957a7faff51c7d27c0b7746ed6e2d47bb
Reviewed-on: https://go-review.googlesource.com/122235
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-07-05 07:21:50 +00:00
Tobias Klauser 2ee6bfbd7e cmd/internal/obj: follow convention for generated code comment
Follow the convention (https://golang.org/s/generatedcode) for generated
code in stringer.go.
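The convention referenced here asks for a comment line matching the regular expression `^// Code generated .* DO NOT EDIT\.$`. A short sketch of a checker follows; the function name isGeneratedComment is made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// generatedRE is the pattern given by the generated-code convention.
var generatedRE = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// isGeneratedComment reports whether a single source line marks the file
// as generated. Hypothetical helper for illustration only.
func isGeneratedComment(line string) bool {
	return generatedRE.MatchString(line)
}

func main() {
	lines := []string{
		`// Code generated by "stringer -type=Foo"; DO NOT EDIT.`,
		`// generated by stringer; DO NOT EDIT`,
	}
	for _, l := range lines {
		fmt.Printf("%v  %s\n", isGeneratedComment(l), l)
	}
}
```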

Change-Id: I7b5fbb04ba03e8ac77a9a0a402088669469de858
Reviewed-on: https://go-review.googlesource.com/122015
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-07-03 16:14:43 +00:00
Wei Xiao 0a7ac93c27 cmd/compile: improve atomic add intrinsics with ARMv8.1 new instruction
ARMv8.1 adds a new instruction (LDADDAL) for atomic memory operations. This
CL improves the existing atomic add intrinsics with the new instruction. Since the
new instruction is only guaranteed to be present from ARMv8.1 onward, we guard its
usage with a check of the CPU feature.
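For context, this is the kind of contended atomic add that such an intrinsic serves. The sketch uses the public sync/atomic package; that sync/atomic.AddInt64 is lowered to the same intrinsic as the runtime's internal Xadd64 on arm64 is an assumption here, and the function name countTo is made up.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countTo starts the given number of goroutines, each of which atomically
// increments a shared counter perWorker times, and returns the final value.
func countTo(workers, perWorker int) int64 {
	var total int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				atomic.AddInt64(&total, 1)
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&total)
}

func main() {
	fmt.Println(countTo(8, 1000))
}
```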

Performance result on ARMv8.1 machine:
name        old time/op  new time/op  delta
Xadd-224    1.05µs ± 6%  0.02µs ± 4%  -98.06%  (p=0.000 n=10+8)
Xadd64-224  1.05µs ± 3%  0.02µs ±13%  -98.10%  (p=0.000 n=9+10)
[Geo mean]  1.05µs       0.02µs       -98.08%

Performance result on ARMv8.0 machine:
name        old time/op  new time/op  delta
Xadd-46      538ns ± 1%   541ns ± 1%  +0.62%  (p=0.000 n=9+9)
Xadd64-46    505ns ± 1%   508ns ± 0%  +0.48%  (p=0.003 n=9+8)
[Geo mean]   521ns        524ns       +0.55%

Change-Id: If4b5d8d0e2d6f84fe1492a4f5de0789910ad0ee9
Reviewed-on: https://go-review.googlesource.com/81877
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-06-21 14:52:43 +00:00
Richard Musiol e083dc6307 runtime, syscall/js: add support for callbacks from JavaScript
This commit adds support for JavaScript callbacks back into
WebAssembly. This is an experimental API, just like the rest of the
syscall/js package. The time package now also uses this mechanism
to properly support timers without resorting to a busy loop.

JavaScript code can call into the same entry point multiple times.
The new RUN register is used to keep track of the program's
run state. Possible values are: starting, running, paused and exited.
If no goroutine is ready any more, the scheduler can put the
program into the "paused" state and the WebAssembly code will
stop running. When a callback occurs, the JavaScript code puts
the callback data into a queue and then calls into WebAssembly
to allow the Go code to continue running.
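
The run states and the callback queue described above can be modelled roughly as follows. This is a simplified sketch, not the runtime's implementation: the program type, its fields and its methods are hypothetical, and callback data is reduced to a string.

```go
package main

import "fmt"

// runState mirrors the four states named in the commit message.
type runState int

const (
	starting runState = iota
	running
	paused
	exited
)

// program is a hypothetical model of the Go side of the bridge.
type program struct {
	state   runState
	queue   []string // callback data queued by the JavaScript side
	handled []string // callbacks the Go side has processed
}

// resume models a call into the entry point: run until nothing is
// ready any more, then pause.
func (p *program) resume() {
	if p.state == exited {
		return
	}
	p.state = running
	for len(p.queue) > 0 {
		cb := p.queue[0]
		p.queue = p.queue[1:]
		p.handled = append(p.handled, cb)
	}
	p.state = paused
}

// callback models the JavaScript side: enqueue the data, then call
// back into the program so it can continue running.
func (p *program) callback(data string) {
	p.queue = append(p.queue, data)
	p.resume()
}

func main() {
	p := &program{}
	p.callback("click")
	fmt.Println(p.state == paused, p.handled)
}
```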

Updates #18892
Updates #25506

Change-Id: Ib8701cfa0536d10d69bd541c85b0e2a754eb54fb
Reviewed-on: https://go-review.googlesource.com/114197
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-06-14 21:50:53 +00:00
Lynn Boger 9e9ff565cd runtime/race: implement race detector for ppc64le
This adds the support to enable the race detector for ppc64le.

Added runtime/race_ppc64le.s to manage the calls from Go to the
LLVM tsan functions, mostly converting from the Go ABI to the
PPC64 ABI expected by Clang generated code.

Changed racewalk.go to call racefuncenterfp instead of racefuncenter
on ppc64le to allow the caller pc to be obtained in the asm code
before calling the tsan version.

Changed the set up code for racecallbackthunk so it doesn't use
the autogenerated save and restore of the link register since that
sequence uses registers inconsistent with the normal ppc64 ABI.

Made various changes to recognize that race is supported for
ppc64le.

Ensured that tls_g is updated and accessible from race_linux_ppc64le.s
so that the race ctx can be obtained and passed to tsan functions.

This enables the race tests for ppc64le in cmd/dist/test.go and
increases the timeout when running the benchmarks with the -race
option to avoid timing out.

Updates #24354, #23731

Change-Id: Ib97dc7ac313e6313c836dc7d2fb698f9d8fba3ef
Reviewed-on: https://go-review.googlesource.com/107935
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-06-11 17:45:36 +00:00
Dennis Kuhnert ebb8a1f8e6 cmd/go: output coverage report even if there are no test files
When using test -cover or -coverprofile the output for "no test files"
is the same format as for "no tests to run".

Fixes #24570

Change-Id: If05609411676d42d94c1feac4bc839974fae2cc1
Reviewed-on: https://go-review.googlesource.com/115095
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-06-06 01:10:20 +00:00
Tim Cooper 161874da2a all: update comment URLs from HTTP to HTTPS, where possible
Each URL was manually verified to ensure it did not serve up incorrect
content.

Change-Id: I4dc846227af95a73ee9a3074d0c379ff0fa955df
Reviewed-on: https://go-review.googlesource.com/115798
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
2018-06-01 21:52:00 +00:00
Ben Shi 3d6e4ec0a8 cmd/internal/obj/arm64: fix two issues in the assembler
There are two issues in the arm64 assembler.

1. "CMPW $0x22220000, RSP" is encoded to 5b44a4d2ff031b6b, which
   is the combination of "MOVD $0x22220000, Rtmp" and
   "NEGSW Rtmp, ZR".
   The right encoding should be a combination of
   "MOVD $0x22220000, Rtmp" and "CMPW Rtmp, RSP".

2. "AND $0x22220000, R2, RSP" is encoded to 5b44a4d25f601b00,
   which is the combination of "MOVD $0x22220000, Rtmp" and
   an illegal instruction.
   The right behavior should be an error report of
   "illegal combination", since "AND Rtmp, RSP, RSP" is invalid
   in armv8.

This CL fixes the above 2 issues and adds more test cases.

Fixes #25557

Change-Id: Ia510be26b58a229f5dfe8a5fa0b35569b2d566e7
Reviewed-on: https://go-review.googlesource.com/114796
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-06-01 15:24:42 +00:00
Tim Cooper 555eb70db2 all: regenerate stringer files
Change-Id: I34838320047792c4719837591e848b87ccb7f5ab
Reviewed-on: https://go-review.googlesource.com/115058
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-29 20:35:41 +00:00
isharipo 67fe8b5677 cmd/internal/obj/x86: add missing Yi8 for VEX ytabs
This change adds Yi8 forms for every ytab that had them before the AVX-512
patch. The rationale is backwards compatibility.

EVEX forms remain strict and unchanged as they're not bound to any
backwards-compatibility issues.

Fixes #25510

Change-Id: Icd692266010ed64c9fe47cc837afc2edf2ad2d1d
Reviewed-on: https://go-review.googlesource.com/114136
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-05-23 15:34:19 +00:00
Martin Möhrmann f045ddc624 internal/cpu: add experiment to disable CPU features with GODEBUGCPU
The Go compiler must be built with GOEXPERIMENT=debugcpu for this to be active.

The GODEBUGCPU environment variable can be used to disable usage of
specific processor features in the Go standard library.
This is useful for testing and benchmarking different code paths that
are guarded by internal/cpu variable checks.

Use of processor features cannot be enabled through GODEBUGCPU.

To disable usage of AVX and SSE41 cpu features on GOARCH amd64 use:
GODEBUGCPU=avx=0,sse41=0

The special "all" option can be used to disable all options:
GODEBUGCPU=all=0
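
The option syntax above can be illustrated with a small parser sketch. It is not the internal/cpu implementation: parseDebugCPU and the feature map are hypothetical, and the only behavior assumed is what the message states, namely that "name=0" disables a feature, "all=0" disables everything, and nothing can be enabled.

```go
package main

import (
	"fmt"
	"strings"
)

// parseDebugCPU applies a GODEBUGCPU-style string to a feature map.
// It is a hypothetical helper written for illustration only.
func parseDebugCPU(env string, features map[string]bool) {
	for _, field := range strings.Split(env, ",") {
		key, value, ok := strings.Cut(field, "=")
		if !ok || value != "0" {
			continue // features can only be disabled, never enabled
		}
		if key == "all" {
			for name := range features {
				features[name] = false
			}
			continue
		}
		if _, known := features[key]; known {
			features[key] = false
		}
	}
}

func main() {
	features := map[string]bool{"avx": true, "sse41": true, "sse2": true}
	parseDebugCPU("avx=0,sse41=0", features)
	fmt.Println(features)
}
```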

Updates #12805
Updates #15403

Change-Id: I699c2e6f74d98472b6fb4b1e5ffbf29b15697aab
Reviewed-on: https://go-review.googlesource.com/91737
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-22 18:49:31 +00:00
Austin Clements c5ed10f3be runtime: support for debugger function calls
This adds a mechanism for debuggers to safely inject calls to Go
functions on amd64. Debuggers must participate in a protocol with the
runtime, and need to know how to lay out a call frame, but the runtime
support takes care of the details of handling live pointers in
registers, stack growth, and detecting the trickier conditions when it
is unsafe to inject a user function call.

Fixes #21678.
Updates derekparker/delve#119.

Change-Id: I56d8ca67700f1f77e19d89e7fc92ab337b228834
Reviewed-on: https://go-review.googlesource.com/109699
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 15:55:05 +00:00
Austin Clements 9f95c9db23 cmd/compile, cmd/internal/obj: record register maps in binary
This adds FUNCDATA and PCDATA that records the register maps much like
the existing live arguments maps and live locals maps. The register
map is indexed independently from the argument and locals maps since
changes in register liveness tend not to correlate with changes to
argument and local liveness.

This is the final CL toward adding safe-points everywhere. The
following CLs will optimize liveness analysis to bring down the cost.
The effect of this CL is:

name        old time/op       new time/op       delta
Template          195ms ± 2%        197ms ± 1%    ~     (p=0.136 n=9+9)
Unicode          98.4ms ± 2%       99.7ms ± 1%  +1.39%  (p=0.004 n=10+10)
GoTypes           685ms ± 1%        700ms ± 1%  +2.06%  (p=0.000 n=9+9)
Compiler          3.28s ± 1%        3.34s ± 0%  +1.71%  (p=0.000 n=9+8)
SSA               7.79s ± 1%        7.91s ± 1%  +1.55%  (p=0.000 n=10+9)
Flate             133ms ± 2%        133ms ± 2%    ~     (p=0.190 n=10+10)
GoParser          161ms ± 2%        164ms ± 3%  +1.83%  (p=0.015 n=10+10)
Reflect           450ms ± 1%        457ms ± 1%  +1.62%  (p=0.000 n=10+10)
Tar               183ms ± 2%        185ms ± 1%  +0.91%  (p=0.008 n=9+10)
XML               234ms ± 1%        238ms ± 1%  +1.60%  (p=0.000 n=9+9)
[Geo mean]        411ms             417ms       +1.40%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.47M ± 0%        1.51M ± 0%  +2.79%  (p=0.000 n=10+10)

Compared to just before "cmd/internal/obj: consolidate emitting entry
stack map", the cumulative effect of adding stack maps everywhere and
register maps is:

name        old time/op       new time/op       delta
Template          185ms ± 2%        197ms ± 1%   +6.42%  (p=0.000 n=10+9)
Unicode          96.3ms ± 3%       99.7ms ± 1%   +3.60%  (p=0.000 n=10+10)
GoTypes           658ms ± 0%        700ms ± 1%   +6.37%  (p=0.000 n=10+9)
Compiler          3.14s ± 1%        3.34s ± 0%   +6.53%  (p=0.000 n=9+8)
SSA               7.41s ± 2%        7.91s ± 1%   +6.71%  (p=0.000 n=9+9)
Flate             126ms ± 1%        133ms ± 2%   +6.15%  (p=0.000 n=10+10)
GoParser          153ms ± 1%        164ms ± 3%   +6.89%  (p=0.000 n=10+10)
Reflect           437ms ± 1%        457ms ± 1%   +4.59%  (p=0.000 n=10+10)
Tar               178ms ± 1%        185ms ± 1%   +4.18%  (p=0.000 n=10+10)
XML               223ms ± 1%        238ms ± 1%   +6.39%  (p=0.000 n=10+9)
[Geo mean]        394ms             417ms        +5.78%

name        old alloc/op      new alloc/op      delta
Template         34.5MB ± 0%       38.0MB ± 0%  +10.19%  (p=0.000 n=10+10)
Unicode          29.3MB ± 0%       30.3MB ± 0%   +3.56%  (p=0.000 n=8+9)
GoTypes           113MB ± 0%        125MB ± 0%  +10.89%  (p=0.000 n=10+10)
Compiler          510MB ± 0%        575MB ± 0%  +12.79%  (p=0.000 n=10+10)
SSA              1.46GB ± 0%       1.64GB ± 0%  +12.40%  (p=0.000 n=10+10)
Flate            23.9MB ± 0%       25.9MB ± 0%   +8.56%  (p=0.000 n=10+10)
GoParser         28.0MB ± 0%       30.8MB ± 0%  +10.08%  (p=0.000 n=10+10)
Reflect          77.6MB ± 0%       84.3MB ± 0%   +8.63%  (p=0.000 n=10+10)
Tar              34.1MB ± 0%       37.0MB ± 0%   +8.44%  (p=0.000 n=10+10)
XML              42.7MB ± 0%       47.2MB ± 0%  +10.75%  (p=0.000 n=10+10)
[Geo mean]       76.0MB            83.3MB        +9.60%

name        old allocs/op     new allocs/op     delta
Template           321k ± 0%         337k ± 0%   +4.98%  (p=0.000 n=10+10)
Unicode            337k ± 0%         340k ± 0%   +1.04%  (p=0.000 n=10+9)
GoTypes           1.13M ± 0%        1.18M ± 0%   +4.85%  (p=0.000 n=10+10)
Compiler          4.67M ± 0%        4.96M ± 0%   +6.25%  (p=0.000 n=10+10)
SSA               11.7M ± 0%        12.3M ± 0%   +5.69%  (p=0.000 n=10+10)
Flate              216k ± 0%         226k ± 0%   +4.52%  (p=0.000 n=10+9)
GoParser           271k ± 0%         283k ± 0%   +4.52%  (p=0.000 n=10+10)
Reflect            927k ± 0%         972k ± 0%   +4.78%  (p=0.000 n=10+10)
Tar                318k ± 0%         333k ± 0%   +4.56%  (p=0.000 n=10+10)
XML                376k ± 0%         395k ± 0%   +5.04%  (p=0.000 n=10+10)
[Geo mean]         730k              764k        +4.61%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.46M ± 0%        1.51M ± 0%   +3.66%  (p=0.000 n=10+10)

For #24543.

Change-Id: I91e003dc64151916b384274884bf02a2d6862547
Reviewed-on: https://go-review.googlesource.com/109353
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 15:55:03 +00:00
isharipo 5437cde96c cmd/asm: enable AVX512
- Uncomment tests for AVX512 encoder
- Permit instruction suffixes for x86
- Permit limited reg list [reg-reg] syntax for x86 for multi-source ops
- EVEX encoding support in obj/x86 (Z-cases, asmevex, etc.)
- optabs and ytabs generated by x86avxgen (https://golang.org/cl/107216)

Note: suffix formatting implemented with updated CConv function.
Now arch asm backend should register formatting function by
calling RegisterOpSuffix.

Updates #22779

Change-Id: I076a167ee49582700e058c56ad74e6696710c8c8
Reviewed-on: https://go-review.googlesource.com/113315
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-05-22 14:57:15 +00:00
Austin Clements 02495da6b6 cmd/internal/obj: consolidate emitting entry stack map
The obj package needs to emit the PCDATA to select the entry stack map
before calling morestack. Currently this is copied for every
architecture. Since we're about to change how this works, consolidate
all of these copies into a single helper function.

For #24543.

Change-Id: Ia92d94de78f8e23fd06dba747c43e03e5989f67b
Reviewed-on: https://go-review.googlesource.com/109346
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 14:43:35 +00:00
isharipo d2c68bb65f cmd/internal/obj/x86: fix VPERMQ and VPERMPD ytab
Fixes invalid encoding of VPERMQ and VPERMPD that use
negative immediate argument.

Fixes #25418
Updates #25420

Change-Id: Idd8180c4c632a76b76f3a68efd5f930d94431994
Reviewed-on: https://go-review.googlesource.com/113615
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-05-17 16:20:25 +00:00
Tobias Klauser ca3364836f cmd/internal/objfile, debug/macho: support disassembling arm64 Mach-O objects
Fixes #25423

Change-Id: I6bed0726b8f4c7d607a3df271b2ab1006e96fa75
Reviewed-on: https://go-review.googlesource.com/113356
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-16 15:32:50 +00:00
David Chase 7bac2a95f6 cmd/compile: plumb prologueEnd into DWARF
This marks the first instruction after the prologue for
consumption by debuggers, specifically Delve, who asked
for it.  gdb appears to ignore it, lldb appears to use it.

The bits for end-of-prologue and beginning-of-epilogue
are added to Pos (reducing maximum line number by 4x, to
1048575).  They're added in cmd/internal/obj/<ARCH>.go
(currently x86 only), so the compiler-proper need not
deal with them.
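
The arithmetic behind the new line number limit can be checked with a short sketch. The names are hypothetical, and the 22-bit starting width is inferred from the stated 4x reduction rather than read from the source.

```go
package main

import "fmt"

const (
	lineBitsBefore = 22 // assumed: inferred from "4x" and 1048575
	flagBits       = 2  // one bit each for prologue end and epilogue begin
	lineBitsAfter  = lineBitsBefore - flagBits
)

// maxLine returns the largest line number that fits in the given
// number of bits.
func maxLine(bits uint) uint {
	return 1<<bits - 1
}

func main() {
	fmt.Println(maxLine(lineBitsBefore), maxLine(lineBitsAfter))
}
```

Taking two bits away from the line field divides the number of representable lines by four, which is where the 1048575 limit comes from.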

The linker currently does nothing with beginning-of-epilogue,
but the plumbing exists to make it easier in the future.

This also upgrades the line number table to DWARF version 3.

This CL includes a regression in the coverage for
testdata/i22558.gdb-dbg.nexts. This appears to be a gdb
artifact, but the fix would be in the preceding CL in the
stack.

Change-Id: I3bda5f46a0ed232d137ad48f65a14835c742c506
Reviewed-on: https://go-review.googlesource.com/110416
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-14 14:12:31 +00:00
David Chase c2c1822b12 cmd/compile: assign and preserve statement boundaries.
A new pass run after ssa building (before any other
optimization) identifies the "first" ssa node for each
statement. Other "noise" nodes are tagged as being never
appropriate for a statement boundary (e.g., VarKill, VarDef,
Phi).

Rewrite, deadcode, cse, and nilcheck are modified to move
the statement boundaries forward whenever possible if a
boundary-tagged ssa value is removed; never-boundary nodes
are ignored in this search (some operations involving
constants are also tagged as never-boundary and also ignored
because they are likely to be moved or removed during
optimization).

Code generation treats all nodes except those explicitly
marked as statement boundaries as "not statement" nodes,
and floats statement boundaries to the beginning of each
same-line run of instructions found within a basic block.
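
One way to read the "floating" step is sketched below. This is an interpretation, not the code generator itself: the inst type and floatBoundaries are hypothetical, and a basic block is reduced to a slice of instructions carrying a line number and a statement mark.

```go
package main

import "fmt"

// inst is a hypothetical, simplified instruction record.
type inst struct {
	line   int
	isStmt bool
}

// floatBoundaries moves any statement mark found in a run of
// consecutive same-line instructions to the first instruction of
// that run. Every other instruction ends up "not statement".
func floatBoundaries(block []inst) {
	for i := 0; i < len(block); {
		marked := false
		j := i
		for j < len(block) && block[j].line == block[i].line {
			marked = marked || block[j].isStmt
			block[j].isStmt = false
			j++
		}
		block[i].isStmt = marked
		i = j
	}
}

func main() {
	block := []inst{{1, false}, {1, true}, {2, false}, {2, false}, {3, true}}
	floatBoundaries(block)
	fmt.Println(block)
}
```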

Line number html conversion was modified to make statement
boundary nodes a bit more obvious by prepending a "+".

The code in fuse.go that glued together the value slices
of two blocks produced a result that depended on the
former capacities (not lengths) of the two slices.  This
causes differences in the 386 bootstrap, and also can
sometimes put values into an order that does a worse job
of preserving statement boundaries when values are removed.

Portions of two delve tests that had caught problems were
incorporated into ssa/debug_test.go.  There are some
opportunities to do better with optimized code, but the
next-ing is not lying or overly jumpy.

Over 4 CLs, compilebench geomean measured binary size
increase of 3.5% and compile user time increase of 3.8%
(this is after optimization to reuse a sparse map instead
of creating multiple maps.)

This CL worsens the optimized-debugging experience with
Delve; we need to work with the delve team so that
they can use the is_stmt marks that we're emitting now.

The reference output changes from time to time depending
on other changes in the compiler, sometimes better,
sometimes worse.

This CL now includes a test ensuring that 99+% of the lines
in the Go command itself (a handy optimized binary) include
is_stmt markers.

Change-Id: I359c94e06843f1eb41f9da437bd614885aa9644a
Reviewed-on: https://go-review.googlesource.com/102435
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-14 14:09:49 +00:00
Ben Shi bec2f51b07 cmd/internal/obj/arm: fix wrong encoding of MUL
The arm assembler incorrectly encodes the following instructions.
"MUL R2, R4" -> 0xe0040492 ("MUL R4, R2, R4")
"MUL R2, R4, R4" -> 0xe0040492 ("MUL R4, R2, R4")

The CL fixes that issue.

Fixes #25347

Change-Id: I883716c7bc51c5f64837ae7d81342f94540a58cb
Reviewed-on: https://go-review.googlesource.com/112737
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-14 01:53:39 +00:00
quasilyte e405c9cca3 cmd/internal/obj/x86: use named consts for movtab Z-cases
Use 0-terminated opbyte sequences for Zlit-like movtabs instead of E=0xff.

movCodeFullPtr is unused (load full ptr is unsupported), but it should
be removed in a separate CL (if removed at all).

Passes toolstash-check.

Change-Id: I28436718d93b017153de0e50e3bcec344ea4ee05
Reviewed-on: https://go-review.googlesource.com/107076
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-05-11 14:40:34 +00:00
Richard Musiol bf23a4e61d cmd/internal/obj/wasm: avoid invalid offsets for Load/Store
Offsets for Load and Store instructions have type i32. Bad index
expression offsets can cause an offset to be larger than MaxUint32,
which is not allowed. One example for this is the test test/index0.go.

Generate valid code by adding a guard to the responsible rewrite rule.
Also emit a proper error when using such a bad index in assembly code.
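
The guard amounts to a range check on the offset. The sketch below shows the idea only: validOffset is a hypothetical helper, not the actual rewrite rule or assembler check.

```go
package main

import (
	"fmt"
	"math"
)

// validOffset reports whether off can be used as a Load/Store offset,
// assuming the offset must be non-negative and no larger than
// MaxUint32.
func validOffset(off int64) bool {
	return off >= 0 && off <= math.MaxUint32
}

func main() {
	fmt.Println(validOffset(8), validOffset(math.MaxUint32+1))
}
```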

Change-Id: Ie90adcbf3ae3861c26680eb81790f28692913ccf
Reviewed-on: https://go-review.googlesource.com/111955
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-05-10 12:05:17 +00:00
Lynn Boger 28edaf4584 cmd/compile,test: combine byte loads and stores on ppc64le
CL 74410 added rules to combine consecutive byte loads and
stores when the byte order was little endian for ppc64le. This
is the corresponding change for bytes that are in big endian order.
These rules are all intended for a little endian target arch.
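
For orientation, this is the kind of source pattern such combining rules are meant to match. The functions are illustrative and written in the style of encoding/binary. Whether a given compiler version turns each one into a single load is an assumption that the tests in test/codegen exist to verify.

```go
package main

import "fmt"

// loadBE32 assembles four bytes in big endian order. On a little
// endian target, a combined load would also need a byte reversal.
func loadBE32(b []byte) uint32 {
	_ = b[3] // one bounds check up front
	return uint32(b[3]) | uint32(b[2])<<8 | uint32(b[1])<<16 | uint32(b[0])<<24
}

// loadLE32 assembles four bytes in little endian order.
func loadLE32(b []byte) uint32 {
	_ = b[3] // one bounds check up front
	return uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24
}

func main() {
	b := []byte{0x12, 0x34, 0x56, 0x78}
	fmt.Printf("%#x %#x\n", loadBE32(b), loadLE32(b))
}
```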

This adds new testcases in test/codegen/memcombine.go

Fixes #22496
Updates #24242

Benchmark improvement for encoding/binary:
name                      old time/op    new time/op    delta
ReadSlice1000Int32s-16      11.0µs ± 0%     9.0µs ± 0%  -17.47%  (p=0.029 n=4+4)
ReadStruct-16               2.47µs ± 1%    2.48µs ± 0%   +0.67%  (p=0.114 n=4+4)
ReadInts-16                  642ns ± 1%     630ns ± 1%   -2.02%  (p=0.029 n=4+4)
WriteInts-16                 654ns ± 0%     653ns ± 1%   -0.08%  (p=0.629 n=4+4)
WriteSlice1000Int32s-16     8.75µs ± 0%    8.20µs ± 0%   -6.19%  (p=0.029 n=4+4)
PutUint16-16                1.16ns ± 0%    0.93ns ± 0%  -19.83%  (p=0.029 n=4+4)
PutUint32-16                1.16ns ± 0%    0.93ns ± 0%  -19.83%  (p=0.029 n=4+4)
PutUint64-16                1.85ns ± 0%    0.93ns ± 0%  -49.73%  (p=0.029 n=4+4)
LittleEndianPutUint16-16    1.03ns ± 0%    0.93ns ± 0%   -9.71%  (p=0.029 n=4+4)
LittleEndianPutUint32-16    0.93ns ± 0%    0.93ns ± 0%     ~     (all equal)
LittleEndianPutUint64-16    0.93ns ± 0%    0.93ns ± 0%     ~     (all equal)
PutUvarint32-16             43.0ns ± 0%    43.1ns ± 0%   +0.12%  (p=0.429 n=4+4)
PutUvarint64-16              174ns ± 0%     175ns ± 0%   +0.29%  (p=0.429 n=4+4)

Updates made to functions in gcm.go to enable their matching. An existing
testcase prevents these functions from being replaced by those in encoding/binary
due to import dependencies.

Change-Id: Idb3bd1e6e7b12d86cd828fb29cb095848a3e485a
Reviewed-on: https://go-review.googlesource.com/98136
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-08 13:15:39 +00:00
Austin Clements 44286b17c5 runtime: replace system goroutine whitelist with symbol test
Currently isSystemGoroutine has a hard-coded list of known entry
points into system goroutines. This list is annoying to maintain. For
example, it's missing the ensureSigM goroutine.

Replace it with a check that simply looks for any goroutine with
a runtime function as its entry point, with a few exceptions. This also
matches the definition recently added to the trace viewer (CL 81315).
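
The shape of the new check can be sketched as a prefix test plus an exception set. This is an illustration rather than the runtime's code: isSystemEntry is a hypothetical helper working on symbol names, and the exception shown is an assumed example, since the message does not list the exceptions.

```go
package main

import (
	"fmt"
	"strings"
)

// exceptions holds entry points that live in the runtime package but
// run user code. The entry here is an assumed example.
var exceptions = map[string]bool{
	"runtime.main": true,
}

// isSystemEntry reports whether a goroutine whose entry point has the
// given symbol name would count as a system goroutine.
func isSystemEntry(name string) bool {
	return strings.HasPrefix(name, "runtime.") && !exceptions[name]
}

func main() {
	for _, name := range []string{"runtime.ensureSigM", "runtime.main", "main.worker"} {
		fmt.Println(name, isSystemEntry(name))
	}
}
```

With a rule like this, an entry point such as ensureSigM is covered without anyone having to add it to a list.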

Change-Id: Iaed723d4a6e8c2ffb7c0c48fbac1688b00b30f01
Reviewed-on: https://go-review.googlesource.com/81655
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-07 21:38:40 +00:00
fanzha02 9d72e8c686 cmd/internal/obj/arm64: fix illegal 4-operand instructions accepted arm64 bug
The current assembler accepts MUL* related instructions with 4 operands,
such as "MUL R1, R2, R3, R4", which is illegal.

The fix adds field information to Optab, with a value of C_NONE,
C_REG, etc., so the assembler can use p.From3Type for checking
in oplook.

Add test cases.

Fixes #25059

Change-Id: I0656319383c460696b392197bf5960b987f8fc97
Reviewed-on: https://go-review.googlesource.com/109295
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
2018-05-07 15:12:35 +00:00