Commit Graph

3316 Commits

Author SHA1 Message Date
Martin Möhrmann 38e7177c94 cmd/compile: fix length overflow when appending elements to a slice
Instead of testing len(slice)+numNewElements > cap(slice) use
uint(len(slice)+numNewElements) > uint(cap(slice)) to test
if a slice needs to be grown in an append operation.

This prevents a possible overflow when len(slice) is near the maximum
int value and the addition of a constant number of new elements
makes it overflow and wrap around to a negative number which is
smaller than the capacity of the slice.

Appending a slice to a slice with append(s1, s2...) already used
a uint comparison to test slice capacity and therefore was not
vulnerable to the same overflow issue.

Fixes: #29190

Change-Id: I41733895838b4f80a44f827bf900ce931d8be5ca
Reviewed-on: https://go-review.googlesource.com/c/154037
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-12-14 05:48:18 +00:00
Keith Randall 05bbec7357 cmd/compile: don't combine load+op if the op has SymAddr arguments
By combining the load+op, we may force the op to happen earlier in
the store chain. That might force the SymAddr operation earlier, and
in particular earlier than its corresponding VarDef. That leads to
an invalid schedule, so avoid that.

This is kind of a hack to work around the issue presented. I think
the underlying problem, that LEAQ is not directly ordered with respect
to its vardef, is the real problem. The benefit of this CL is that
it fixes the immediate issue, is small, and obviously won't break
anything. A real fix for this issue is much more invasive.

The go binary is unchanged in size.
This situation just doesn't occur very often.

Fixes #28445

Change-Id: I13a765e13f075d5b6808a355ef3c43cdd7cd47b6
Reviewed-on: https://go-review.googlesource.com/c/153641
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-12-12 21:13:15 +00:00
Keith Randall 01a1eaa10c cmd/compile: use innermost line number for -S
When functions are inlined, for instructions in the inlined body, does
-S print the location of the call, or the location of the body? Right
now, we do the former. I'd like to do the latter by default, it makes
much more sense when reading disassembly. With mid-stack inlining
enabled in more cases, this quandry will come up more often.

The original behavior is still available with -S=2. Some tests
use this mode (so they can find assembly generated by a particular
source line).

This helped me with understanding what the compiler was doing
while fixing #29007.

Change-Id: Id14a3a41e1b18901e7c5e460aa4caf6d940ed064
Reviewed-on: https://go-review.googlesource.com/c/153241
Reviewed-by: David Chase <drchase@google.com>
2018-12-11 20:24:45 +00:00
David Chase ea6259d5e9 cmd/compile: check for negative upper bound to IsSliceInBounds
IsSliceInBounds(x, y) asserts that y is not negative, but
there were cases where this is not true.  Change code
generation to ensure that this is true when it's not obviously
true.  Prove phase cleans a few of these out.

With this change the compiler text section is 0.06% larger,
that is, not very much.  Benchmarking still TBD, may need
to wait for access to a benchmarking box (next week).

Also corrected run.go to handle '?' in -update_errors output.

Fixes #28797.

Change-Id: Ia8af90bc50a91ae6e934ef973def8d3f398fac7b
Reviewed-on: https://go-review.googlesource.com/c/152477
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-12-07 23:04:58 +00:00
Austin Clements 6454a09a97 cmd/compile: omit write barriers for slice clears of go:notinheap pointers
Currently,

  for i := range a {
    a[i] = nil
  }

will compile to have write barriers even if a is a slice of pointers
to go:notinheap types. This happens because the optimization that
transforms this into a memclr only asks it a's element type has
pointers, and not if it specifically has heap pointers.

Fix this by changing arrayClear to use HasHeapPointer instead of
types.Haspointers. We probably shouldn't have both of these functions,
since a pointer to a notinheap type is effectively a uintptr, but
that's not going to change in this CL.

Change-Id: I284b85bdec6ae1e641f894e8f577989facdb0cf1
Reviewed-on: https://go-review.googlesource.com/c/152723
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-12-05 21:54:54 +00:00
Robert Griesemer cd47e8944d cmd/compile: avoid multiple errors regarding misuse of ... in signatures
Follow-up on #28450 (golang.org/cl/152417).

Updates #28450.
Fixes #29107.

Change-Id: Ib4b4fe582c35315a4f71cf6dbc7f7f2f24b37ec1
Reviewed-on: https://go-review.googlesource.com/c/152758
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-12-05 20:59:58 +00:00
smasher164 a7af474359 cmd/compile: improve error message for non-final variadic parameter
Previously, when a function signature had defined a non-final variadic
parameter, the error message always referred to the type associated with that
parameter. However, if the offending parameter's name was part of an identifier
list with a variadic type, one could misinterpret the message, thinking the
problem had been with one of the other names in the identifer list.

    func bar(a, b ...int) {}
clear ~~~~~~~^       ^~~~~~~~ confusing

This change updates the error message and sets the column position to that of
the offending parameter's name, if it exists.

Fixes #28450.

Change-Id: I076f560925598ed90e218c25d70f9449ffd9b3ea
Reviewed-on: https://go-review.googlesource.com/c/152417
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-12-05 01:35:59 +00:00
Cherry Zhang be09bdf589 cmd/compile: fix unnamed parameter handling in escape analysis
For recursive functions, the parameters were iterated using
fn.Name.Defn.Func.Dcl, which does not include unnamed/blank
parameters. This results in a mismatch in formal-actual
assignments, for example,

func f(_ T, x T)

f(a, b) should result in { _=a, x=b }, but the escape analysis
currently sees only { x=a } and drops b on the floor. This may
cause b to not escape when it should (or a escape when it should
not).

Fix this by using fntype.Params().FieldSlice() instead, which
does include unnamed parameters.

Also add a sanity check that ensures all the actual parameters
are consumed.

Fixes #29000

Change-Id: Icd86f2b5d71e7ebbab76e375b7702f62efcf59ae
Reviewed-on: https://go-review.googlesource.com/c/152617
Reviewed-by: Keith Randall <khr@golang.org>
2018-12-04 23:01:00 +00:00
Keith Randall 4a1a783dda cmd/compile: fix static initializer
staticcopy of a struct or array should recursively call itself, not
staticassign.

This fixes an issue where a struct with a slice in it is copied during
static initialization. In this case, the backing array for the slice
is duplicated, and each copy of the slice refers to a different
backing array, which is incorrect.  That issue has existed since at
least Go 1.2.

I'm not sure why this was never noticed. It seems like a pretty
obvious bug if anyone modifies the resulting slice.

In any case, we started to notice when the optimization in CL 140301
landed.  Here is basically what happens in issue29013b.go:
1) The error above happens, so we get two backing stores for what
   should be the same slice.
2) The code for initializing those backing stores is reused.
   But not duplicated: they are the same Node structure.
3) The order pass allocates temporaries for the map operations.
   For the first instance, things work fine and two temporaries are
   allocated and stored in the OKEY nodes. For the second instance,
   the order pass decides new temporaries aren't needed, because
   the OKEY nodes already have temporaries in them.
   But the order pass also puts a VARKILL of the temporaries between
   the two instance initializations.
4) In this state, the code is technically incorrect. But before
   CL 140301 it happens to work because the temporaries are still
   correctly initialized when they are used for the second time. But then...
5) The new CL 140301 sees the VARKILLs and decides to reuse the
   temporary for instance 1 map 2 to initialize the instance 2 map 1
   map. Because the keys aren't re-initialized, instance 2 map 1
   gets the wrong key inserted into it.

Fixes #29013

Change-Id: I840ce1b297d119caa706acd90e1517a5e47e9848
Reviewed-on: https://go-review.googlesource.com/c/152081
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-12-03 16:48:21 +00:00
David Chase 624e197c71 cmd/compile: decrease inlining call cost from 60 to 57
A Go user made a well-documented request for a slightly
lower threshold.  I tested against a selection of other
people's benchmarks, and saw a tiny benefit (possibly noise)
at equally tiny cost, and no unpleasant surprises observed
in benchmarking.

I.e., might help, doesn't hurt, low risk, request was
delivered on a silver platter.

It did, however, change the behavior of one test because
now bytes.Buffer.Grow is eligible for inlining.

Updates #19348.

Change-Id: I85e3088a4911290872b8c6bda9601b5354c48695
Reviewed-on: https://go-review.googlesource.com/c/151977
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-12-01 15:03:28 +00:00
Ben Shi c042fedbc8 test/codegen: add arithmetic tests for 386/amd64/arm/arm64
This CL adds several test cases of arithmetic operations for
386/amd64/arm/arm64.

Change-Id: I362687c06249f31091458a1d8c45fc4d006b616a
Reviewed-on: https://go-review.googlesource.com/c/151897
Run-TryBot: Ben Shi <powerman1st@163.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-12-01 05:17:44 +00:00
Robert Griesemer a37d95c74a cmd/compile: fix constant index bounds check and error message
While here, rename nonnegintconst to indexconst (because that's
what it is) and add Fatalf calls where we are not expecting the
indexconst call to fail, and fixed wrong comparison in smallintconst.

Fixes #23781.

Change-Id: I86eb13081c450943b1806dfe3ae368872f76639a
Reviewed-on: https://go-review.googlesource.com/c/151599
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-11-30 23:48:00 +00:00
Keith Randall 2140975ebd cmd/compile: eliminate write barriers when writing non-heap ptrs
We don't need a write barrier if:
1) The location we're writing to doesn't hold a heap pointer, and
2) The value we're writing isn't a heap pointer.

The freshly returned value from runtime.newobject satisfies (1).
Pointers to globals, and the contents of the read-only data section satisfy (2).

This is particularly helpful for code like:
p := []string{"abc", "def", "ghi"}

Where the compiler generates:
   a := new([3]string)
   move(a, statictmp_)  // eliminates write barriers here
   p := a[:]

For big slice literals, this makes the code a smaller and faster to
compile.

Update #13554. Reduces the compile time by ~10% and RSS by ~30%.

Change-Id: Icab81db7591c8777f68e5d528abd48c7e44c87eb
Reviewed-on: https://go-review.googlesource.com/c/151498
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-11-29 22:23:02 +00:00
Keith Randall 2b4f24a2d2 cmd/compile: randomize value order in block for testing
A little bit of compiler stress testing. Randomize the order
of the values in a block before every phase. This randomization
makes sure that we're not implicitly depending on that order.

Currently the random seed is a hash of the function name.
It provides determinism, but sacrifices some coverage.
Other arrangements are possible (env var, ...) but require
more setup.

Fixes #20178

Change-Id: Idae792a23264bd9a3507db6ba49b6d591a608e83
Reviewed-on: https://go-review.googlesource.com/c/33909
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-11-28 17:13:46 +00:00
Keith Randall 0b79dde112 cmd/compile: don't use CMOV ops to compute load addresses
We want to issue loads as soon as possible, especially when they
are going to miss in the cache. Using a conditional move (CMOV) here:

i := ...
if cond {
   i++
}
... = a[i]

means that we have to wait for cond to be computed before the load
is issued. Without a CMOV, if the branch is predicted correctly the
load can be issued in parallel with computing cond.
Even if the branch is predicted incorrectly, maybe the speculative
load is close to the real load, and we get a prefetch for free.
In the worst case, when the prediction is wrong and the address is
way off, we only lose by the time difference between the CMOV
latency (~2 cycles) and the mispredict restart latency (~15 cycles).

We only squash CMOVs that affect load addresses. Results of CMOVs
that are used for other things (store addresses, store values) we
use as before.

Fixes #26306

Change-Id: I82ca14b664bf05e1d45e58de8c4d9c775a127ca1
Reviewed-on: https://go-review.googlesource.com/c/145717
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-11-27 17:22:37 +00:00
Clément Chigot 5680874e0c test: fix nilptr5 for AIX
This commit fixes a mistake made in CL 144538.
This nilcheck can be removed because OpPPC64LoweredMove will fault if
arg0 is nil, as it's used to store. Further information can be found in
cmd/compile/internal/ssa/nilcheck.go.

Change-Id: Ifec0080c00eb1f94a8c02f8bf60b93308e71b119
Reviewed-on: https://go-review.googlesource.com/c/151298
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-11-27 15:36:08 +00:00
Brian Kessler 319787a528 cmd/compile: intrinsify math/bits.Div on amd64
Note that the intrinsic implementation panics separately for overflow and
divide by zero, which matches the behavior of the pure go implementation.
There is a modest performance improvement after intrinsic implementation.

name     old time/op  new time/op  delta
Div-4    53.0ns ± 1%  47.0ns ± 0%  -11.28%  (p=0.008 n=5+5)
Div32-4  18.4ns ± 0%  18.5ns ± 1%     ~     (p=0.444 n=5+5)
Div64-4  53.3ns ± 0%  47.5ns ± 4%  -10.77%  (p=0.008 n=5+5)

Updates #28273

Change-Id: Ic1688ecc0964acace2e91bf44ef16f5fb6b6bc82
Reviewed-on: https://go-review.googlesource.com/c/144378
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-27 05:04:25 +00:00
Keith Randall eb6c433eb3 cmd/compile: don't convert non-Go-constants to OLITERALs
Don't convert values that aren't Go constants, like
uintptr(unsafe.Pointer(nil)), to a literal constant. This avoids
assuming they are constants for things like indexing, array sizes,
case duplication, etc.

Also, nil is an allowed duplicate in switches. CTNILs aren't Go constants.

Fixes #28078
Fixes #28079

Change-Id: I9ab8af47098651ea09ef10481787eae2ae2fb445
Reviewed-on: https://go-review.googlesource.com/c/151320
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-11-27 01:21:41 +00:00
Keith Randall 6fff980cf1 cmd/compile: initialize sparse slice literals dynamically
When a slice composite literal is sparse, initialize it dynamically
instead of statically.

s := []int{5:5, 20:20}

To initialize the backing store for s, use 2 constant writes instead
of copying from a static array with 21 entries.

This CL also fixes pathologies in the compiler when the slice is
*very* sparse.

Fixes #23780

Change-Id: Iae95c6e6f6a0e2994675cbc750d7a4dd6436b13b
Reviewed-on: https://go-review.googlesource.com/c/151319
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-11-26 22:50:48 +00:00
Keith Randall 1602e49701 cmd/compile: don't constant-fold non-Go constants in the frontend
Abort evconst if its argument isn't a Go constant. The SSA backend
will do the optimizations in question later. They tend to be weird
cases, like uintptr(unsafe.Pointer(uintptr(1))).

Fix OADDSTR and OCOMPLEX cases in isGoConst.
OADDSTR has its arguments in n.List, not n.Left and n.Right.
OCOMPLEX might have a 2-result function as its arg in List[0]
(in which case it isn't a Go constant).

Fixes #24760

Change-Id: Iab312d994240d99b3f69bfb33a443607e872b01d
Reviewed-on: https://go-review.googlesource.com/c/151338
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-11-26 22:49:57 +00:00
Keith Randall ca3749230b cmd/compile: allow bodyless function if it is linkname'd
In assembly free packages (aka "complete" or "pure go"), allow
bodyless functions if they are linkname'd to something else.

Presumably the thing the function is linkname'd to has a definition.
If not, the linker will complain. And linkname is unsafe, so we expect
users to know what they are doing.

Note this handles only one direction, where the linkname directive
is in the local package. If the linkname directive is in the remote
package, this CL won't help. (See os/signal/sig.s for an example.)

Fixes #23311

Change-Id: I824361b4b582ee05976d94812e5b0e8b0f7a18a6
Reviewed-on: https://go-review.googlesource.com/c/151318
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-11-26 20:00:59 +00:00
Clément Chigot 9fe9853ae5 cmd/compile: fix nilcheck for AIX
This commit adapts compile tool to create correct nilchecks for AIX.

AIX allows to load a nil pointer. Therefore, the default nilcheck
which issues a load must be replaced by a CMP instruction followed by a
store at 0x0 if the value is nil. The store will trigger a SIGSEGV as on
others OS.

The nilcheck algorithm must be adapted to do not remove nilcheck if it's
only a read. Stores are detected with v.Type.IsMemory().

Tests related to nilptr must be adapted to the previous changements.
nilptr.go cannot be used as it's because the AIX address space starts at
1<<32.

Change-Id: I9f5aaf0b7e185d736a9b119c0ed2fe4e5bd1e7af
Reviewed-on: https://go-review.googlesource.com/c/144538
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-26 14:13:53 +00:00
Clément Chigot 041526c6ef runtime: handle 64bits addresses for AIX
This commit allows the runtime to handle 64bits addresses returned by
mmap syscall on AIX.

Mmap syscall returns addresses on 59bits on AIX. But the Arena
implementation only allows addresses with less than 48 bits.
This commit increases the arena size up to 1<<60 for aix/ppc64.

Update: #25893

Change-Id: Iea72e8a944d10d4f00be915785e33ae82dd6329e
Reviewed-on: https://go-review.googlesource.com/c/138736
Reviewed-by: Austin Clements <austin@google.com>
2018-11-26 14:06:28 +00:00
Austin Clements 9255688610 cmd/asm: rename -symabis to -gensymabis
Currently, both asm and compile have a -symabis flag, but in asm it's
a boolean flag that means to generate a symbol ABIs file and in the
compiler its a string flag giving the path of the symbol ABIs file to
consume. I'm worried about this false symmetry biting us in the
future, so rename asm's flag to -gensymabis.

Updates #27539.

Change-Id: I8b9c18a852d2838099718f8989813f19d82e7434
Reviewed-on: https://go-review.googlesource.com/c/149818
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-11-16 21:57:50 +00:00
Martin Möhrmann 75798e8ada runtime: make processor capability variable naming platform specific
The current support_XXX variables are specific for the
amd64 and 386 platforms.

Prefix processor capability variables by architecture to have a
consistent naming scheme and avoid reuse of the existing
variables for new platforms.

This also aligns naming of runtime variables closer with internal/cpu
processor capability variable names.

Change-Id: I3eabb29a03874678851376185d3a62e73c1aff1d
Reviewed-on: https://go-review.googlesource.com/c/91435
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-14 20:30:31 +00:00
Milan Knezevic c92e73b702 cmd/compile/internal/gc: OMUL should be evaluated when using soft-float
When using soft-float, OMUL might be rewritten to function call
so we should ensure it was evaluated first.

Fixes #28688

Change-Id: I30b87501782fff62d35151f394a1c22b0d490c6c
Reviewed-on: https://go-review.googlesource.com/c/148837
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-11-14 18:52:15 +00:00
Emmanuel T Odeke 6d620fc42e test: move empty header file in builddir, buildrundir to temp directory
Move the empty header file created by "builddir", "buildrundir"
directives to t.tempDir. The file was accidentally placed in the
same directory as the source code and this was a vestige of CL 146999.

Fixes #28781

Change-Id: I3d2ada5f9e8bf4ce4f015b9bd379b311592fe3ce
Reviewed-on: https://go-review.googlesource.com/c/149458
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-14 00:22:40 +00:00
Austin Clements a3c70e28ed test: fix ABI mismatch in fixedbugs/issue19507
Because run.go doesn't pass the package being compiled to the compiler
via the -p flag, it can't match up the main·f symbol from the
assembler with the "func f" stub in Go, so it doesn't produce the
correct assembly stub.

Fix this by removing the package prefix from the assembly definition.

Alternatively, we could make run.go pass -p to the compiler, but it's
nicer to remove these package prefixes anyway.

Should fix the linux-arm builder, which was broken by the introduction
of function ABIs in CL 147160.

Updates #27539.

Change-Id: Id62b7701e1108a21a5ad48ffdb5dad4356c273a6
Reviewed-on: https://go-review.googlesource.com/c/149483
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-13 23:44:52 +00:00
Keith Randall 0098f8aeac runtime: when using explicit argmap, also use arglen
When we set an explicit argmap, we may want only a prefix of that
argmap.  Argmap is set when the function is reflect.makeFuncStub or
reflect.methodValueCall. In this case, arglen specifies how much of
the args section is actually live. (It could be either all the args +
results, or just the args.)

Fixes #28750

Change-Id: Idf060607f15a298ac591016994e58e22f7f92d83
Reviewed-on: https://go-review.googlesource.com/c/149217
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-11-13 22:52:09 +00:00
Austin Clements 0f5dfbcfd7 cmd/go, cmd/dist: plumb symabis from assembler to compiler
For #27539.

Change-Id: I0e27f142224e820205fb0e65ad03be7eba93da14
Reviewed-on: https://go-review.googlesource.com/c/146999
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-11-12 20:46:41 +00:00
Austin Clements 7f1dd3ae4d test: minor simplification to run.go
This is a little clearer, and we're about to need the .s file list in
one more place, so this will cut down on duplication.

Change-Id: I4da8bf03a0469fb97565b0841c40d505657b574e
Reviewed-on: https://go-review.googlesource.com/c/146998
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-11-12 20:46:39 +00:00
Josh Bleecher Snyder 8607b2e825 cmd/compile: optimize A->B->C Moves that include VarDefs
We have an existing optimization that recognizes
memory moves of the form A -> B -> C and converts
them into A -> C, in the hopes that the store to
B will be end up being dead and thus eliminated.

However, when A, B, and C are large types,
the front end sometimes emits VarDef ops for the moves.
This change adds an optimization to match that pattern.

This required changing an old compiler test.
The test assumed that a temporary was required
to deal with a large return value.
With this optimization in place, that temporary
ended up being eliminated.

Triggers 649 times during 'go build -a std cmd'.

Cuts 16k off cmd/go.

name        old object-bytes  new object-bytes  delta
Template          507kB ± 0%        507kB ± 0%  -0.15%  (p=0.008 n=5+5)
Unicode           225kB ± 0%        225kB ± 0%    ~     (all equal)
GoTypes          1.85MB ± 0%       1.85MB ± 0%    ~     (all equal)
Flate             328kB ± 0%        328kB ± 0%    ~     (all equal)
GoParser          402kB ± 0%        402kB ± 0%  -0.00%  (p=0.008 n=5+5)
Reflect          1.41MB ± 0%       1.41MB ± 0%  -0.20%  (p=0.008 n=5+5)
Tar               458kB ± 0%        458kB ± 0%    ~     (all equal)
XML               601kB ± 0%        599kB ± 0%  -0.21%  (p=0.008 n=5+5)

Change-Id: I9b5f25c8663a0b772ad1ee51fa61f74b74d26dd3
Reviewed-on: https://go-review.googlesource.com/c/143479
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
2018-11-11 14:18:33 +00:00
Lynn Boger 4ae49b5921 cmd/compile: use ANDCC, ORCC, XORCC to avoid CMP on ppc64x
This change makes use of the cc versions of the AND, OR, XOR
instructions, omitting the need for a CMP instruction.

In many test programs and in the go binary, this reduces the
size of 20-30 functions by at least 1 instruction, many in
runtime.

Testcase added to test/codegen/comparisons.go

Change-Id: I6cc1ca8b80b065d7390749c625bc9784b0039adb
Reviewed-on: https://go-review.googlesource.com/c/143059
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-11-09 19:40:52 +00:00
Keith Randall 13baf4b2cd cmd/compile: encourage inlining of functions with single-call bodies
This is a simple tweak to allow a bit more mid-stack inlining.
In cases like this:

func f() {
    g()
}

We'd really like to inline f into its callers. It can't hurt.

We implement this optimization by making calls a bit cheaper, enough
to afford a single call in the function body, but not 2.
The remaining budget allows for some argument modification, or perhaps
a wrapping conditional:

func f(x int) {
    g(x, 0)
}
func f(x int) {
    if x > 0 {
        g()
    }
}

Update #19348

Change-Id: Ifb1ea0dd1db216c3fd5c453c31c3355561fe406f
Reviewed-on: https://go-review.googlesource.com/c/147361
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
2018-11-08 17:29:23 +00:00
Keith Randall 95a4f793c0 cmd/compile: don't deadcode eliminate labels
Dead-code eliminating labels is tricky because there might
be gotos that can still reach them.

Bug probably introduced with CL 91056

Fixes #28616

Change-Id: I6680465134e3486dcb658896f5172606cc51b104
Reviewed-on: https://go-review.googlesource.com/c/147817
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Iskander Sharipov <iskander.sharipov@intel.com>
2018-11-06 18:50:16 +00:00
Ian Lance Taylor a540aa338a test: add test that gccgo failed to compile
Updates #28601

Change-Id: I734fc5ded153126d384f0df912ecd4d208005e49
Reviewed-on: https://go-review.googlesource.com/c/147537
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-11-05 20:54:58 +00:00
Robert Griesemer e6305380a0 cmd/compile: reintroduce work-around for cyclic alias declarations
This change re-introduces (temporarily) a work-around for recursive
alias type declarations, originally in https://golang.org/cl/35831/
(intended as fix for #18640). The work-around was removed later
for a more comprehensive cycle detection check. That check
contained a subtle error which made the code appear to work,
while in fact creating incorrect types internally. See #25838
for details.

By re-introducing the original work-around, we eliminate problems
with many simple recursive type declarations involving aliases;
specifically cases such as #27232 and #27267. However, the more
general problem remains.

This CL also fixes the subtle error (incorrect variable use when
analyzing a type cycle) mentioned above and now issues a fatal
error with a reference to the relevant issue (rather than crashing
later during the compilation). While not great, this is better
than the current status. The long-term solution will need to
address these cycles (see #25838).

As a consequence, several old test cases are not accepted anymore
by the compiler since they happened to work accidentally only.
This CL disables parts or all code of those test cases. The issues
are: #18640, #23823, and #24939.

One of the new test cases (fixedbugs/issue27232.go) exposed a
go/types issue. The test case is excluded from the go/types test
suite and an issue was filed (#28576).

Updates #18640.
Updates #23823.
Updates #24939.
Updates #25838.
Updates #28576.

Fixes #27232.
Fixes #27267.

Change-Id: I6c2d10da98bfc6f4f445c755fcaab17fc7b214c5
Reviewed-on: https://go-review.googlesource.com/c/147286
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-11-05 20:30:19 +00:00
Matthew Dempsky b2c397e537 cmd/compile: disallow converting string to notinheap slice
Unlikely to happen in practice, but easy enough to prevent and might
as well do so for completeness.

Fixes #28243.

Change-Id: I848c3af49cb923f088e9490c6a79373e182fad08
Reviewed-on: https://go-review.googlesource.com/c/142719
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2018-11-02 19:53:59 +00:00
Clément Chigot 85525c56ab all: skip unsupported tests on AIX
This commit skips tests which aren't yet supported on AIX.

nosplit.go is disabled because stackGuardMultiplier is increased for
syscalls.

Change-Id: Ib5ff9a4539c7646bcb6caee159f105ff8a160ad7
Reviewed-on: https://go-review.googlesource.com/c/146939
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
2018-11-02 16:12:08 +00:00
Keith Randall 0ad332d80c cmd/compile: implement some moves using non-overlapping reads&writes
For moves >8,<16 bytes, do a move using non-overlapping loads/stores
if it would require no more instructions.

This helps a bit with the case when the move is from a static
constant, because then the code to materialize the value being moved
is smaller.

Change-Id: Ie47a5a7c654afeb4973142b0a9922faea13c9b54
Reviewed-on: https://go-review.googlesource.com/c/146019
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-30 20:27:03 +00:00
Keith Randall f14067f3c1 cmd/compile: when comparing 0-size types, make sure expr side-effects survive
Fixes #23837

Change-Id: I53f524d87946a0065f28a4ddbe47b40f2b43c459
Reviewed-on: https://go-review.googlesource.com/c/145757
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-10-30 17:45:19 +00:00
Ben Shi 455ef3f6bc test/codegen: improve arithmetic tests
This CL fixes several typos and adds two more cases
to arithmetic test.

Change-Id: I086560162ea351e2166866e444e2317da36c1729
Reviewed-on: https://go-review.googlesource.com/c/145210
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-30 14:39:53 +00:00
Ben Shi 5f5ea3fd4d cmd/compile: optimize amd64's ADDQconstmodify/ADDLconstmodify
This CL optimize amd64's code:
"ADDQ $-1, MEM_OP" -> "DECQ MEM_OP"
"ADDL $-1, MEM_OP" -> "DECL MEM_OP"

1. The total size of pkg/linux_amd64 (excluding cmd/compile)
decreases about 0.1KB.

2. The go1 benchmark shows little regression, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              2.60s ± 5%     2.64s ± 3%  +1.53%  (p=0.000 n=38+39)
Fannkuch11-4                2.37s ± 2%     2.38s ± 2%    ~     (p=0.950 n=40+40)
FmtFprintfEmpty-4          40.4ns ± 5%    40.5ns ± 5%    ~     (p=0.711 n=40+40)
FmtFprintfString-4         72.4ns ± 5%    72.3ns ± 3%    ~     (p=0.485 n=40+40)
FmtFprintfInt-4            79.7ns ± 3%    80.1ns ± 3%    ~     (p=0.124 n=40+40)
FmtFprintfIntInt-4          126ns ± 3%     127ns ± 3%  +0.71%  (p=0.027 n=40+40)
FmtFprintfPrefixedInt-4     153ns ± 4%     153ns ± 2%    ~     (p=0.604 n=40+40)
FmtFprintfFloat-4           206ns ± 5%     210ns ± 5%  +1.79%  (p=0.002 n=40+40)
FmtManyArgs-4               498ns ± 3%     496ns ± 3%    ~     (p=0.099 n=40+40)
GobDecode-4                6.48ms ± 6%    6.47ms ± 7%    ~     (p=0.686 n=39+40)
GobEncode-4                5.95ms ± 7%    5.96ms ± 6%    ~     (p=0.670 n=40+34)
Gzip-4                      224ms ± 6%     223ms ± 5%    ~     (p=0.143 n=40+40)
Gunzip-4                   36.5ms ± 4%    36.5ms ± 4%    ~     (p=0.556 n=40+40)
HTTPClientServer-4         60.7µs ± 2%    59.9µs ± 3%  -1.20%  (p=0.000 n=39+39)
JSONEncode-4               9.03ms ± 4%    9.04ms ± 4%    ~     (p=0.589 n=40+40)
JSONDecode-4               49.4ms ± 4%    49.2ms ± 4%    ~     (p=0.276 n=40+40)
Mandelbrot200-4            3.80ms ± 4%    3.79ms ± 4%    ~     (p=0.837 n=40+40)
GoParse-4                  3.15ms ± 5%    3.13ms ± 5%    ~     (p=0.240 n=40+40)
RegexpMatchEasy0_32-4      72.9ns ± 3%    72.0ns ± 8%  -1.25%  (p=0.003 n=40+40)
RegexpMatchEasy0_1K-4       229ns ± 5%     230ns ± 4%    ~     (p=0.318 n=40+40)
RegexpMatchEasy1_32-4      66.9ns ± 3%    67.3ns ± 7%    ~     (p=0.817 n=40+40)
RegexpMatchEasy1_1K-4       371ns ± 5%     370ns ± 4%    ~     (p=0.275 n=40+40)
RegexpMatchMedium_32-4      106ns ± 4%     104ns ± 7%  -2.28%  (p=0.000 n=40+40)
RegexpMatchMedium_1K-4     32.0µs ± 2%    31.4µs ± 3%  -2.08%  (p=0.000 n=40+40)
RegexpMatchHard_32-4       1.54µs ± 7%    1.52µs ± 3%  -1.80%  (p=0.007 n=39+40)
RegexpMatchHard_1K-4       45.8µs ± 4%    45.5µs ± 3%    ~     (p=0.707 n=40+40)
Revcomp-4                   401ms ± 5%     401ms ± 6%    ~     (p=0.935 n=40+40)
Template-4                 62.4ms ± 4%    61.2ms ± 3%  -1.85%  (p=0.000 n=40+40)
TimeParse-4                 315ns ± 2%     318ns ± 3%  +1.10%  (p=0.002 n=40+40)
TimeFormat-4                297ns ± 3%     298ns ± 3%    ~     (p=0.238 n=40+40)
[Geo mean]                 45.8µs         45.7µs       -0.22%

name                     old speed      new speed      delta
GobDecode-4               119MB/s ± 6%   119MB/s ± 7%    ~     (p=0.684 n=39+40)
GobEncode-4               129MB/s ± 7%   128MB/s ± 6%    ~     (p=0.413 n=40+34)
Gzip-4                   86.6MB/s ± 6%  87.0MB/s ± 6%    ~     (p=0.145 n=40+40)
Gunzip-4                  532MB/s ± 4%   532MB/s ± 4%    ~     (p=0.556 n=40+40)
JSONEncode-4              215MB/s ± 4%   215MB/s ± 4%    ~     (p=0.583 n=40+40)
JSONDecode-4             39.3MB/s ± 4%  39.5MB/s ± 4%    ~     (p=0.277 n=40+40)
GoParse-4                18.4MB/s ± 5%  18.5MB/s ± 5%    ~     (p=0.229 n=40+40)
RegexpMatchEasy0_32-4     439MB/s ± 3%   445MB/s ± 8%  +1.28%  (p=0.003 n=40+40)
RegexpMatchEasy0_1K-4    4.46GB/s ± 4%  4.45GB/s ± 4%    ~     (p=0.343 n=40+40)
RegexpMatchEasy1_32-4     479MB/s ± 3%   476MB/s ± 7%    ~     (p=0.855 n=40+40)
RegexpMatchEasy1_1K-4    2.76GB/s ± 5%  2.77GB/s ± 4%    ~     (p=0.250 n=40+40)
RegexpMatchMedium_32-4   9.36MB/s ± 4%  9.58MB/s ± 6%  +2.31%  (p=0.001 n=40+40)
RegexpMatchMedium_1K-4   32.0MB/s ± 2%  32.7MB/s ± 3%  +2.12%  (p=0.000 n=40+40)
RegexpMatchHard_32-4     20.7MB/s ± 7%  21.1MB/s ± 3%  +1.95%  (p=0.005 n=40+40)
RegexpMatchHard_1K-4     22.4MB/s ± 4%  22.5MB/s ± 3%    ~     (p=0.689 n=40+40)
Revcomp-4                 634MB/s ± 5%   634MB/s ± 6%    ~     (p=0.935 n=40+40)
Template-4               31.1MB/s ± 3%  31.7MB/s ± 3%  +1.88%  (p=0.000 n=40+40)
[Geo mean]                129MB/s        130MB/s       +0.62%

Change-Id: I9d61ee810d900920c572cbe89e2f1626bfed12b7
Reviewed-on: https://go-review.googlesource.com/c/145209
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-30 00:22:58 +00:00
Robert Griesemer 70dd90c4a9 cmd/compile: revert "typecheck types and funcs before consts"
This reverts commit 9ce87a63b9.

The fix addresses the specific test case, but not the general
problem.

Updates #24755.

Change-Id: I0ba8463b41b099b1ebf49759f88a423b40f70d58
Reviewed-on: https://go-review.googlesource.com/c/145617
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-29 19:51:57 +00:00
Daniel Martí 9ce87a63b9 cmd/compile: typecheck types and funcs before consts
This way, once the constant declarations are typechecked, all named
types are fully typechecked and have all of their methods added.

Usually this isn't important, as methods and interfaces cannot be used
in constant declarations. However, it can lead to confusing and
incorrect errors, such as:

	$ cat f.go
	package p

	type I interface{ F() }
	type T struct{}

	const _ = I(T{})

	func (T) F() {}
	$ go build f.go
	./f.go:6:12: cannot convert T literal (type T) to type I:
		T does not implement I (missing F method)

The error is clearly wrong, as T does have an F method. If we ensure
that all funcs are typechecked before all constant declarations, we get
the correct error:

	$ go build f2.go
	# command-line-arguments
	./f.go:6:7: const initializer I(T literal) is not a constant

Fixes #24755.

Change-Id: I182b60397b9cac521d9a9ffadb11b42fd42e42fe
Reviewed-on: https://go-review.googlesource.com/c/115096
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-10-29 18:10:54 +00:00
Josh Bleecher Snyder 15c4575293 cmd/compile: convert arguments as needed
CL 114797 reworked how arguments get written to the stack.
Some type conversions got lost in the process. Restore them.

Fixes #28390
Updates #28430

Change-Id: Ia0d37428d7d615c865500bbd1a7a4167554ee34f
Reviewed-on: https://go-review.googlesource.com/c/144598
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-28 18:22:36 +00:00
Keith Randall 9f291d1fc3 cmd/compile: fix rule for combining loads with compares
Unlike normal load+op opcodes, the load+compare opcode does
not clobber its non-load argument. Allow the load+compare merge
to happen even if the non-load argument is used elsewhere.

Noticed when investigating issue #28417.

Change-Id: Ibc48d1f2e06ae76034c59f453815d263e8ec7288
Reviewed-on: https://go-review.googlesource.com/c/145097
Reviewed-by: Ainar Garipov <gugl.zadolbal@gmail.com>
Reviewed-by: Ben Shi <powerman1st@163.com>
2018-10-27 00:59:54 +00:00
Keith Randall dd789550a7 cmd/compile: intrinsify math/bits.Sub on amd64
name             old time/op  new time/op  delta
Sub-8            1.12ns ± 1%  1.17ns ± 1%   +5.20%          (p=0.008 n=5+5)
Sub32-8          1.11ns ± 0%  1.11ns ± 0%     ~     (all samples are equal)
Sub64-8          1.12ns ± 0%  1.18ns ± 1%   +5.00%          (p=0.016 n=4+5)
Sub64multiple-8  4.10ns ± 1%  0.86ns ± 1%  -78.93%          (p=0.008 n=5+5)

Fixes #28273

Change-Id: Ibcb6f2fd32d987c3bcbae4f4cd9d335a3de98548
Reviewed-on: https://go-review.googlesource.com/c/144258
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-25 19:47:27 +00:00
Keith Randall 899f3a2892 cmd/compile: intrinsify math/bits.Add on amd64
name             old time/op  new time/op  delta
Add-8            1.11ns ± 0%  1.18ns ± 0%   +6.31%  (p=0.029 n=4+4)
Add32-8          1.02ns ± 0%  1.02ns ± 1%     ~     (p=0.333 n=4+5)
Add64-8          1.11ns ± 1%  1.17ns ± 0%   +5.79%  (p=0.008 n=5+5)
Add64multiple-8  4.35ns ± 1%  0.86ns ± 0%  -80.22%  (p=0.000 n=5+4)

The individual ops are a bit slower (but still very fast).
Using the ops in carry chains is very fast.

Update #28273

Change-Id: Id975f76df2b930abf0e412911d327b6c5b1befe5
Reviewed-on: https://go-review.googlesource.com/c/144257
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-25 19:47:00 +00:00
Robert Griesemer 6761b1eb1b cmd/compile: better errors for structs with conflicting fields and methods
If a field and method have the same name, mark the respective struct field
so that we don't report follow-on errors when the field/method is accessed.

Per suggestion of @mdempsky.

Fixes #28268.

Change-Id: Ia1ca4cdfe9bacd3739d1fd7ca5e014ca094245ee
Reviewed-on: https://go-review.googlesource.com/c/144259
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-10-24 20:39:37 +00:00
Robert Griesemer 5538ecadca cmd/compile: better error for embedded field referring to missing import
Fixes #27938.

Change-Id: I16263ac6c0b8903b8a16f02e8db0e1a16d1c95b4
Reviewed-on: https://go-review.googlesource.com/c/144261
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-10-24 20:08:18 +00:00
ChrisALiles 13d5cd7847 cmd/compile: use proved bounds to remove signed division fix-ups
prove is able to find 94 occurrences in std cmd where a divisor
can't have the value -1. The change removes
the extraneous fix-up code for these cases.

Fixes #25239

Change-Id: Ic184de971f47cc57c702eb72805b8e291c14035d
Reviewed-on: https://go-review.googlesource.com/c/130215
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-23 02:29:44 +00:00
Keith Randall dca769dca9 cmd/compile: in append(f()), type convert appended items
The second and subsequent return values from f() need to be
converted to the element type of the first return value from f()
(which must be a slice).

Fixes #22327

Change-Id: I5c0a424812c82c1b95b6d124c5626cfc4408bdb6
Reviewed-on: https://go-review.googlesource.com/c/142718
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-10-22 17:30:57 +00:00
Ben Shi 95dda75bde cmd/compile: optimize store combination on 386/amd64
This CL add 3 rules to combine byte-store to word-store on386 and
amd64.

Change-Id: Iffd9cda42f1961680c81def4edc773ad58f211b3
Reviewed-on: https://go-review.googlesource.com/c/143057
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-19 02:21:04 +00:00
Ian Lance Taylor 0a8e347751 test: update issue5089.go for recent gccgo changes
As of https://golang.org/cl/43456 gccgo now gives a better error
message for this test.

Before:
    fixedbugs/issue5089.go:13:1: error: redefinition of ‘bufio.Buffered’: receiver name changed
     func (b *bufio.Reader) Buffered() int { // ERROR "non-local|redefinition"
     ^
    fixedbugs/issue5089.go:11:13: note: previous definition of ‘bufio.Buffered’ was here
     import "bufio" // GCCGO_ERROR "previous"
                 ^

Now:
    fixedbugs/issue5089.go:13:7: error: may not define methods on non-local type
     func (b *bufio.Reader) Buffered() int { // ERROR "non-local|redefinition"
           ^

Change-Id: I4112ca8d91336f6369f780c1d45b8915b5e8e235
Reviewed-on: https://go-review.googlesource.com/c/130955
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-18 17:53:42 +00:00
Ian Lance Taylor 8ccafb1ac7 test: add fixedbugs/bug506 for gccgo
Building with gccgo failed with an undefined symbol error from an
unnecessary hash function.

Updates #19773

Change-Id: Ic78bf1b086ff5ee26d464089c0e14987d3fe8b02
Reviewed-on: https://go-review.googlesource.com/c/130956
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-18 04:57:41 +00:00
Ben Shi 4158734097 test/codegen: add more combined load/store test cases
This CL adds more combined load/store test cases for 386/amd64.

Change-Id: I0a483a6ed0212b65c5e84d67ed8c9f50c389ce2d
Reviewed-on: https://go-review.googlesource.com/c/142878
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-18 01:57:54 +00:00
Matthew Dempsky 5185744962 cmd/compile: remove obsolete "safe" mode
Nowadays there are better ways to safely run untrusted Go programs, like
NaCl and gVisor.

Change-Id: I20c45f13a50dbcf35c343438b720eb93e7b4e13a
Reviewed-on: https://go-review.googlesource.com/c/142717
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2018-10-17 19:00:37 +00:00
Josh Bleecher Snyder f2a676536f test: limit runoutput concurrency with -v
This appears to have simply been an oversight.

Change-Id: Ia5d1309b3ebc99c9abbf0282397693272d8178aa
Reviewed-on: https://go-review.googlesource.com/c/142885
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-17 16:42:42 +00:00
Robert Griesemer e861c3e003 cmd/compile: simplified test case (cleanup)
Follow-up on https://golang.org/cl/124595; no semantic changes.

Updates #26411.

Change-Id: Ic1c4622dbf79529ff61530f9c25ec742c2abe5ca
Reviewed-on: https://go-review.googlesource.com/c/142720
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-16 23:11:02 +00:00
Emmanuel T Odeke 0b63086f64 cmd/compile: fix label redefinition error column numbers
Ensure that label redefinition error column numbers
print the actual start of the label instead of the
position of the label's delimiting token ":".

For example, given this program:

package main

func main() {

            foo:
   foo:
foo:
foo            :
}

* Before:
main.go:5:13: label foo defined and not used
main.go:6:7: label foo already defined at main.go:5:13
main.go:7:4: label foo already defined at main.go:5:13
main.go:8:16: label foo already defined at main.go:5:13

* After:
main.go:5:13: label foo defined and not used
main.go:6:4: label foo already defined at main.go:5:13
main.go:7:1: label foo already defined at main.go:5:13
main.go:8:1: label foo already defined at main.go:5:13

Fixes #26411

Change-Id: I8eb874b97fdc8862547176d57ac2fa0f075f2367
Reviewed-on: https://go-review.googlesource.com/c/124595
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-10-16 22:32:14 +00:00
Filippo Valsorda a52289ef2b Revert "fmt: fix incorrect format of whole-number floats when using %#v"
Numbers without decimals are valid Go representations of whole-number
floats. That is, "var x float64 = 5" is valid Go. Avoid breakage in
tests that expect a certain output from %#v by reverting to it.

To guarantee the right type is generated by a print use %T(%#v) instead.

Added a test to lock in this behavior.

This reverts commit 7c7cecc184.

Fixes #27634
Updates #26363

Change-Id: I544c400a0903777dd216452a7e86dfe60b0b0283
Reviewed-on: https://go-review.googlesource.com/c/142597
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2018-10-16 21:54:35 +00:00
Lynn Boger 39fa301bdc test/codegen: enable more tests for ppc64/ppc64le
Adding cases for ppc64,ppc64le to the codegen tests
where appropriate.

Change-Id: Idf8cbe88a4ab4406a4ef1ea777bd15a58b68f3ed
Reviewed-on: https://go-review.googlesource.com/c/142557
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-16 19:00:53 +00:00
Ben Shi 4b78fe57a8 cmd/compile: optimize 386's load/store combination
This CL adds more combinations of two consequtive MOVBload/MOVBstore
to a unique MOVWload/MOVWstore.

1. The size of the go executable decreases about 4KB, and the total
size of pkg/linux_386 (excluding cmd/compile) decreases about 1.5KB.

2. There is no regression in the go1 benchmark result, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              3.28s ± 2%     3.29s ± 2%    ~     (p=0.151 n=40+40)
Fannkuch11-4                3.52s ± 1%     3.51s ± 1%  -0.28%  (p=0.002 n=40+40)
FmtFprintfEmpty-4          45.4ns ± 4%    45.0ns ± 4%  -0.89%  (p=0.019 n=40+40)
FmtFprintfString-4         81.9ns ± 7%    81.3ns ± 1%    ~     (p=0.660 n=40+25)
FmtFprintfInt-4            91.9ns ± 9%    91.4ns ± 9%    ~     (p=0.249 n=40+40)
FmtFprintfIntInt-4          143ns ± 4%     143ns ± 4%    ~     (p=0.760 n=40+40)
FmtFprintfPrefixedInt-4     184ns ± 3%     183ns ± 4%    ~     (p=0.485 n=40+40)
FmtFprintfFloat-4           408ns ± 3%     409ns ± 3%    ~     (p=0.961 n=40+40)
FmtManyArgs-4               597ns ± 4%     602ns ± 3%    ~     (p=0.413 n=40+40)
GobDecode-4                7.13ms ± 6%    7.14ms ± 6%    ~     (p=0.859 n=40+40)
GobEncode-4                6.86ms ± 9%    6.94ms ± 7%    ~     (p=0.162 n=40+40)
Gzip-4                      395ms ± 4%     396ms ± 3%    ~     (p=0.099 n=40+40)
Gunzip-4                   40.9ms ± 4%    41.1ms ± 3%    ~     (p=0.064 n=40+40)
HTTPClientServer-4         63.6µs ± 2%    63.6µs ± 3%    ~     (p=0.832 n=36+39)
JSONEncode-4               16.1ms ± 3%    15.8ms ± 3%  -1.60%  (p=0.001 n=40+40)
JSONDecode-4               61.0ms ± 3%    61.5ms ± 4%    ~     (p=0.065 n=40+40)
Mandelbrot200-4            5.16ms ± 3%    5.18ms ± 3%    ~     (p=0.056 n=40+40)
GoParse-4                  3.25ms ± 2%    3.23ms ± 3%    ~     (p=0.727 n=40+40)
RegexpMatchEasy0_32-4      90.2ns ± 3%    89.3ns ± 6%  -0.98%  (p=0.002 n=40+40)
RegexpMatchEasy0_1K-4       812ns ± 3%     815ns ± 3%    ~     (p=0.309 n=40+40)
RegexpMatchEasy1_32-4       103ns ± 6%     103ns ± 5%    ~     (p=0.680 n=40+40)
RegexpMatchEasy1_1K-4      1.01µs ± 4%    1.02µs ± 3%    ~     (p=0.326 n=40+33)
RegexpMatchMedium_32-4      120ns ± 4%     120ns ± 5%    ~     (p=0.834 n=40+40)
RegexpMatchMedium_1K-4     40.1µs ± 3%    39.5µs ± 4%  -1.35%  (p=0.000 n=40+40)
RegexpMatchHard_32-4       2.27µs ± 6%    2.23µs ± 4%  -1.67%  (p=0.011 n=40+40)
RegexpMatchHard_1K-4       67.2µs ± 3%    67.2µs ± 3%    ~     (p=0.149 n=40+40)
Revcomp-4                   1.84s ± 2%     1.86s ± 3%  +0.70%  (p=0.020 n=40+40)
Template-4                 69.0ms ± 4%    69.8ms ± 3%  +1.20%  (p=0.003 n=40+40)
TimeParse-4                 438ns ± 3%     439ns ± 4%    ~     (p=0.650 n=40+40)
TimeFormat-4                412ns ± 3%     412ns ± 3%    ~     (p=0.888 n=40+40)
[Geo mean]                 65.2µs         65.2µs       -0.04%

name                     old speed      new speed      delta
GobDecode-4               108MB/s ± 6%   108MB/s ± 6%    ~     (p=0.855 n=40+40)
GobEncode-4               112MB/s ± 9%   111MB/s ± 8%    ~     (p=0.159 n=40+40)
Gzip-4                   49.2MB/s ± 4%  49.1MB/s ± 3%    ~     (p=0.102 n=40+40)
Gunzip-4                  474MB/s ± 3%   472MB/s ± 3%    ~     (p=0.063 n=40+40)
JSONEncode-4              121MB/s ± 3%   123MB/s ± 3%  +1.62%  (p=0.001 n=40+40)
JSONDecode-4             31.9MB/s ± 3%  31.6MB/s ± 4%    ~     (p=0.070 n=40+40)
GoParse-4                17.9MB/s ± 2%  17.9MB/s ± 3%    ~     (p=0.696 n=40+40)
RegexpMatchEasy0_32-4     355MB/s ± 3%   358MB/s ± 5%  +0.99%  (p=0.002 n=40+40)
RegexpMatchEasy0_1K-4    1.26GB/s ± 3%  1.26GB/s ± 3%    ~     (p=0.381 n=40+40)
RegexpMatchEasy1_32-4     310MB/s ± 5%   310MB/s ± 4%    ~     (p=0.655 n=40+40)
RegexpMatchEasy1_1K-4    1.01GB/s ± 4%  1.01GB/s ± 3%    ~     (p=0.351 n=40+33)
RegexpMatchMedium_32-4   8.32MB/s ± 4%  8.34MB/s ± 5%    ~     (p=0.696 n=40+40)
RegexpMatchMedium_1K-4   25.6MB/s ± 3%  25.9MB/s ± 4%  +1.36%  (p=0.000 n=40+40)
RegexpMatchHard_32-4     14.1MB/s ± 6%  14.3MB/s ± 4%  +1.64%  (p=0.011 n=40+40)
RegexpMatchHard_1K-4     15.2MB/s ± 3%  15.2MB/s ± 3%    ~     (p=0.147 n=40+40)
Revcomp-4                 138MB/s ± 2%   137MB/s ± 3%  -0.70%  (p=0.021 n=40+40)
Template-4               28.1MB/s ± 4%  27.8MB/s ± 3%  -1.19%  (p=0.003 n=40+40)
[Geo mean]               83.7MB/s       83.7MB/s       +0.03%

Change-Id: I2a2b3a942b5c45467491515d201179fd192e65c9
Reviewed-on: https://go-review.googlesource.com/c/141650
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-16 07:17:11 +00:00
Ben Shi 3785be3093 test/codegen: fix confusing test cases
ARMv7's MULAF/MULSF/MULAD/MULSD are not fused,
this CL fixes the confusing test cases.

Change-Id: I35022e207e2f0d24a23a7f6f188e41ba8eee9886
Reviewed-on: https://go-review.googlesource.com/c/142439
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Akhil Indurti <aindurti@gmail.com>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-10-16 07:17:02 +00:00
Daniel Martí 7f3313133e cmd/compile: don't panic on invalid map key declarations
In golang.org/cl/75310, the compiler's typechecker was changed so that
map key types were validated at a later stage, to make sure that all the
necessary type information was present.

This still worked for map type declarations, but caused a regression for
top-level map variable declarations. These now caused a fatal panic
instead of a typechecking error.

The cause was that checkMapKeys was run too early, before all
typechecking was done. In particular, top-level map variable
declarations are typechecked as external declarations, much later than
where checkMapKeys was run.

Add a test case for both exported and unexported top-level map
declarations, and add a second call to checkMapKeys at the actual end of
typechecking. Simply moving the one call isn't a good solution either;
the comments expand on that.

Fixes #28058.

Change-Id: Ia5febb01a1d877447cf66ba44fb49a7e0f4f18a5
Reviewed-on: https://go-review.googlesource.com/c/140417
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-10-15 22:11:26 +00:00
Martin Möhrmann a0f57c3fd0 cmd/compile: avoid string allocations when map key is struct or array literal
x = map[string(byteslice)] is already optimized by the compiler to avoid a
string allocation. This CL generalizes this optimization to:

x = map[T1{ ... Tn{..., string(byteslice), ...} ... }]
where T1 to Tn is a nesting of struct and array literals.

Found in a hot code path that used a struct of strings made from []byte
slices to make a map lookup.

There are no uses of the more generalized optimization in the standard library.
Passes toolstash -cmp.

MapStringConversion/32/simple    21.9ns ± 2%    21.9ns ± 3%      ~     (p=0.995 n=17+20)
MapStringConversion/32/struct    28.8ns ± 3%    22.0ns ± 2%   -23.80%  (p=0.000 n=20+20)
MapStringConversion/32/array     28.5ns ± 2%    21.9ns ± 2%   -23.14%  (p=0.000 n=19+16)
MapStringConversion/64/simple    21.0ns ± 2%    21.1ns ± 3%      ~     (p=0.072 n=19+18)
MapStringConversion/64/struct    72.4ns ± 3%    21.3ns ± 2%   -70.53%  (p=0.000 n=20+20)
MapStringConversion/64/array     72.8ns ± 1%    21.0ns ± 2%   -71.13%  (p=0.000 n=17+19)

name                           old allocs/op  new allocs/op  delta
MapStringConversion/32/simple      0.00           0.00           ~     (all equal)
MapStringConversion/32/struct      0.00           0.00           ~     (all equal)
MapStringConversion/32/array       0.00           0.00           ~     (all equal)
MapStringConversion/64/simple      0.00           0.00           ~     (all equal)
MapStringConversion/64/struct      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
MapStringConversion/64/array       1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)

Change-Id: I483b4d84d8d74b1025b62c954da9a365e79b7a3a
Reviewed-on: https://go-review.googlesource.com/c/116275
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-15 19:22:07 +00:00
Alberto Donizetti 7c96d87eda test/codegen: test ppc64 TrailingZeros, OnesCount codegen
This change adds codegen tests for the intrinsification on ppc64 of
the OnesCount{64,32,16,8}, and TrailingZeros{64,32,16,8} math/bits
functions.

Change-Id: Id3364921fbd18316850e15c8c71330c906187fdb
Reviewed-on: https://go-review.googlesource.com/c/141897
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2018-10-15 16:53:03 +00:00
Keith Randall 63e964e174 cmd/compile: provide types for all order-allocated temporaries
Ensure that we correctly type the stack temps for regular closures,
method function closures, and slice literals.

Then we don't need to override the dummy types later.
Furthermore, this allows order to reuse temporaries of these types.

OARRAYLIT doesn't need a temporary as far as I can tell, so I
removed that case from order.

Change-Id: Ic58520fa50c90639393ff78f33d3c831d5c4acb9
Reviewed-on: https://go-review.googlesource.com/c/140306
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-15 16:07:52 +00:00
Ben Shi 93e27e01af test/codegen: add tests of FMA for arm/arm64
This CL adds tests of fused multiplication-accumulation
on arm/arm64.

Change-Id: Ic85d5277c0d6acb7e1e723653372dfaf96824a39
Reviewed-on: https://go-review.googlesource.com/c/141652
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-15 14:51:30 +00:00
Ben Shi c3208842e1 test/codegen: add tests for multiplication-subtraction
This CL adds tests for armv7's MULS and arm64's MSUBW.

Change-Id: Id0fd5d26fd477e4ed14389b0d33cad930423eb5b
Reviewed-on: https://go-review.googlesource.com/c/141651
Run-TryBot: Ben Shi <powerman1st@163.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-15 02:41:33 +00:00
Keith Randall 389e942745 cmd/compile: reuse temporaries in order pass
Instead of allocating a new temporary each time one
is needed, keep a list of temporaries which are free
(have already been VARKILLed on every path) and use
one of them.

Should save a lot of stack space. In a function like this:

func main() {
     fmt.Printf("%d %d\n", 2, 3)
     fmt.Printf("%d %d\n", 4, 5)
     fmt.Printf("%d %d\n", 6, 7)
}

The three [2]interface{} arrays used to hold the ... args
all use the same autotmp, instead of 3 different autotmps
as happened previous to this CL.

Change-Id: I2d728e226f81e05ae68ca8247af62014a1b032d3
Reviewed-on: https://go-review.googlesource.com/c/140301
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-14 05:21:00 +00:00
Keith Randall 0e9f8a21f8 runtime,cmd/compile: pass strings and slices to convT2{E,I} by value
When we pass these types by reference, we usually have to allocate
temporaries on the stack, initialize them, then pass their address
to the conversion functions. It's simpler to pass these types
directly by value.

This particularly applies to conversions needed for fmt.Printf
(to interface{} for constructing a [...]interface{}).

func f(a, b, c string) {
     fmt.Printf("%s %s\n", a, b)
     fmt.Printf("%s %s\n", b, c)
}

This function's stack frame shrinks from 200 to 136 bytes, and
its code shrinks from 535 to 453 bytes.

The go binary shrinks 0.3%.

Update #24286

Aside: for this function f, we don't really need to allocate
temporaries for the convT2E function. We could use the address
of a, b, and c directly. That might get similar (or maybe better?)
improvements. I investigated a bit, but it seemed complicated
to do it safely. This change was much easier.

Change-Id: I78cbe51b501fb41e1e324ce4203f0de56a1db82d
Reviewed-on: https://go-review.googlesource.com/c/135377
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-10-14 03:46:51 +00:00
Keith Randall 653a4bd8d4 cmd/compile: optimize loads from readonly globals into constants
Instead of
   MOVB go.string."foo"(SB), AX
do
   MOVB $102, AX

When we know the global we're loading from is readonly, we can
do that read at compile time.

I've made this arch-dependent mostly because the cases where this
happens often are memory->memory moves, and those don't get
decomposed until lowering.

Did amd64/386/arm/arm64. Other architectures could follow.

Update #26498

Change-Id: I41b1dc831b2cd0a52dac9b97f4f4457888a46389
Reviewed-on: https://go-review.googlesource.com/c/141118
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-10-14 02:54:40 +00:00
Ben Shi bac6a2925c test/codegen: add more arm64 test cases
This CL adds 3 combined load test cases for arm64.

Change-Id: I2c67308c40fd8a18f9f2d16c6d12911dcdc583e2
Reviewed-on: https://go-review.googlesource.com/c/140700
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-11 15:14:06 +00:00
Ben Shi 27965c1436 cmd/compile: optimize 386's ADDLconstmodifyidx4
This CL optimize ADDLconstmodifyidx4 to INCL/DECL, when the
constant is +1/-1.

1. The total size of pkg/linux_386/ decreases 28 bytes, excluding
cmd/compile.

2. There is no regression in the go1 benchmark test, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              3.25s ± 2%     3.23s ± 3%  -0.70%  (p=0.040 n=30+30)
Fannkuch11-4                3.50s ± 1%     3.47s ± 1%  -0.68%  (p=0.000 n=30+30)
FmtFprintfEmpty-4          44.6ns ± 3%    44.8ns ± 3%  +0.46%  (p=0.029 n=30+30)
FmtFprintfString-4         79.0ns ± 3%    78.7ns ± 3%    ~     (p=0.053 n=30+30)
FmtFprintfInt-4            89.2ns ± 2%    89.4ns ± 3%    ~     (p=0.665 n=30+29)
FmtFprintfIntInt-4          142ns ± 3%     142ns ± 3%    ~     (p=0.435 n=30+30)
FmtFprintfPrefixedInt-4     182ns ± 2%     182ns ± 2%    ~     (p=0.964 n=30+30)
FmtFprintfFloat-4           407ns ± 3%     411ns ± 4%    ~     (p=0.080 n=30+30)
FmtManyArgs-4               597ns ± 3%     593ns ± 4%    ~     (p=0.222 n=30+30)
GobDecode-4                7.09ms ± 6%    7.07ms ± 7%    ~     (p=0.633 n=30+30)
GobEncode-4                6.81ms ± 9%    6.81ms ± 8%    ~     (p=0.982 n=30+30)
Gzip-4                      398ms ± 4%     400ms ± 6%    ~     (p=0.177 n=30+30)
Gunzip-4                   41.3ms ± 3%    40.6ms ± 4%  -1.71%  (p=0.005 n=30+30)
HTTPClientServer-4         63.4µs ± 3%    63.4µs ± 4%    ~     (p=0.646 n=30+28)
JSONEncode-4               16.0ms ± 3%    16.1ms ± 3%    ~     (p=0.057 n=30+30)
JSONDecode-4               63.3ms ± 8%    63.1ms ± 7%    ~     (p=0.786 n=30+30)
Mandelbrot200-4            5.17ms ± 3%    5.15ms ± 8%    ~     (p=0.654 n=30+30)
GoParse-4                  3.24ms ± 3%    3.23ms ± 2%    ~     (p=0.091 n=30+30)
RegexpMatchEasy0_32-4       103ns ± 4%     103ns ± 4%    ~     (p=0.575 n=30+30)
RegexpMatchEasy0_1K-4       823ns ± 2%     821ns ± 3%    ~     (p=0.827 n=30+30)
RegexpMatchEasy1_32-4       113ns ± 3%     112ns ± 3%    ~     (p=0.076 n=30+30)
RegexpMatchEasy1_1K-4      1.02µs ± 4%    1.01µs ± 5%    ~     (p=0.087 n=30+30)
RegexpMatchMedium_32-4      129ns ± 3%     127ns ± 4%  -1.55%  (p=0.009 n=30+30)
RegexpMatchMedium_1K-4     39.3µs ± 4%    39.7µs ± 3%    ~     (p=0.054 n=30+30)
RegexpMatchHard_32-4       2.15µs ± 4%    2.15µs ± 4%    ~     (p=0.712 n=30+30)
RegexpMatchHard_1K-4       66.0µs ± 3%    65.1µs ± 3%  -1.32%  (p=0.002 n=30+30)
Revcomp-4                   1.85s ± 2%     1.85s ± 3%    ~     (p=0.168 n=30+30)
Template-4                 69.5ms ± 7%    68.9ms ± 6%    ~     (p=0.250 n=28+28)
TimeParse-4                 434ns ± 3%     432ns ± 4%    ~     (p=0.629 n=30+30)
TimeFormat-4                403ns ± 4%     408ns ± 3%  +1.23%  (p=0.019 n=30+29)
[Geo mean]                 65.5µs         65.3µs       -0.20%

name                     old speed      new speed      delta
GobDecode-4               108MB/s ± 6%   109MB/s ± 6%    ~     (p=0.636 n=30+30)
GobEncode-4               113MB/s ±10%   113MB/s ± 9%    ~     (p=0.982 n=30+30)
Gzip-4                   48.8MB/s ± 4%  48.6MB/s ± 5%    ~     (p=0.178 n=30+30)
Gunzip-4                  470MB/s ± 3%   479MB/s ± 4%  +1.72%  (p=0.006 n=30+30)
JSONEncode-4              121MB/s ± 3%   120MB/s ± 3%    ~     (p=0.057 n=30+30)
JSONDecode-4             30.7MB/s ± 8%  30.8MB/s ± 8%    ~     (p=0.784 n=30+30)
GoParse-4                17.9MB/s ± 3%  17.9MB/s ± 2%    ~     (p=0.090 n=30+30)
RegexpMatchEasy0_32-4     309MB/s ± 4%   309MB/s ± 3%    ~     (p=0.530 n=30+30)
RegexpMatchEasy0_1K-4    1.24GB/s ± 2%  1.25GB/s ± 3%    ~     (p=0.976 n=30+30)
RegexpMatchEasy1_32-4     282MB/s ± 3%   284MB/s ± 3%  +0.81%  (p=0.041 n=30+30)
RegexpMatchEasy1_1K-4    1.00GB/s ± 3%  1.01GB/s ± 4%    ~     (p=0.091 n=30+30)
RegexpMatchMedium_32-4   7.71MB/s ± 3%  7.84MB/s ± 4%  +1.71%  (p=0.000 n=30+30)
RegexpMatchMedium_1K-4   26.1MB/s ± 4%  25.8MB/s ± 3%    ~     (p=0.051 n=30+30)
RegexpMatchHard_32-4     14.9MB/s ± 4%  14.9MB/s ± 4%    ~     (p=0.712 n=30+30)
RegexpMatchHard_1K-4     15.5MB/s ± 3%  15.7MB/s ± 3%  +1.34%  (p=0.003 n=30+30)
Revcomp-4                 138MB/s ± 2%   137MB/s ± 3%    ~     (p=0.174 n=30+30)
Template-4               28.0MB/s ± 6%  28.2MB/s ± 6%    ~     (p=0.251 n=28+28)
[Geo mean]               82.3MB/s       82.6MB/s       +0.36%

Change-Id: I389829699ffe9500a013fcf31be58a97e98043e1
Reviewed-on: https://go-review.googlesource.com/c/140701
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-11 01:20:34 +00:00
Keith Randall ceb0c371d9 cmd/compile: make []byte("...") more efficient
Do []byte(string) conversions more efficiently when the string
is a constant. Instead of calling stringtobyteslice, allocate
just the space we need and encode the initialization directly.

[]byte("foo") rewrites to the following pseudocode:

var s [3]byte // on heap or stack, depending on whether b escapes
s = *(*[3]byte)(&"foo"[0]) // initialize s from the string
b = s[:]

which generates this assembly:

	0x001d 00029 (tmp1.go:9)	LEAQ	type.[3]uint8(SB), AX
	0x0024 00036 (tmp1.go:9)	MOVQ	AX, (SP)
	0x0028 00040 (tmp1.go:9)	CALL	runtime.newobject(SB)
	0x002d 00045 (tmp1.go:9)	MOVQ	8(SP), AX
	0x0032 00050 (tmp1.go:9)	MOVBLZX	go.string."foo"+2(SB), CX
	0x0039 00057 (tmp1.go:9)	MOVWLZX	go.string."foo"(SB), DX
	0x0040 00064 (tmp1.go:9)	MOVW	DX, (AX)
	0x0043 00067 (tmp1.go:9)	MOVB	CL, 2(AX)
// Then the slice is b = {AX, 3, 3}

The generated code is still not optimal, as it still does load/store
from read-only memory instead of constant stores.  Next CL...

Update #26498
Fixes #10170

Change-Id: I4b990b19f9a308f60c8f4f148934acffefe0a5bd
Reviewed-on: https://go-review.googlesource.com/c/140698
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-10 16:10:40 +00:00
Ben Shi 3933302550 cmd/compile: add indexed form for several 386 instructions
This CL implements indexed memory operands for the following instructions.
(ADD|SUB|MUL|AND|OR|XOR)Lload -> (ADD|SUB|MUL|AND|OR|XOR)Lloadidx4
(ADD|SUB|AND|OR|XOR)Lmodify -> (ADD|SUB|AND|OR|XOR)Lmodifyidx4
(ADD|AND|OR|XOR)Lconstmodify -> (ADD|AND|OR|XOR)Lconstmodifyidx4

1. The total size of pkg/linux_386/ decreases about 2.5KB, excluding
cmd/compile/ .

2. There is little regression in the go1 benchmark test, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              3.25s ± 3%     3.25s ± 3%    ~     (p=0.218 n=40+40)
Fannkuch11-4                3.53s ± 1%     3.53s ± 1%    ~     (p=0.303 n=40+40)
FmtFprintfEmpty-4          44.9ns ± 3%    45.6ns ± 3%  +1.48%  (p=0.030 n=40+36)
FmtFprintfString-4         78.7ns ± 5%    80.1ns ± 7%    ~     (p=0.217 n=36+40)
FmtFprintfInt-4            90.2ns ± 6%    89.8ns ± 5%    ~     (p=0.659 n=40+38)
FmtFprintfIntInt-4          140ns ± 5%     141ns ± 5%  +1.00%  (p=0.027 n=40+40)
FmtFprintfPrefixedInt-4     185ns ± 3%     183ns ± 3%    ~     (p=0.104 n=40+40)
FmtFprintfFloat-4           411ns ± 4%     406ns ± 3%  -1.37%  (p=0.005 n=40+40)
FmtManyArgs-4               590ns ± 4%     598ns ± 4%  +1.35%  (p=0.008 n=40+40)
GobDecode-4                7.16ms ± 5%    7.10ms ± 5%    ~     (p=0.335 n=40+40)
GobEncode-4                6.85ms ± 7%    6.74ms ± 9%    ~     (p=0.058 n=38+40)
Gzip-4                      400ms ± 4%     399ms ± 2%  -0.34%  (p=0.003 n=40+33)
Gunzip-4                   41.4ms ± 3%    41.4ms ± 4%  -0.12%  (p=0.020 n=40+40)
HTTPClientServer-4         64.1µs ± 4%    63.5µs ± 2%  -1.07%  (p=0.000 n=39+37)
JSONEncode-4               15.9ms ± 2%    15.9ms ± 3%    ~     (p=0.103 n=40+40)
JSONDecode-4               62.2ms ± 4%    61.6ms ± 3%  -0.98%  (p=0.006 n=39+40)
Mandelbrot200-4            5.18ms ± 3%    5.14ms ± 4%    ~     (p=0.125 n=40+40)
GoParse-4                  3.29ms ± 2%    3.27ms ± 2%  -0.66%  (p=0.006 n=40+40)
RegexpMatchEasy0_32-4       103ns ± 4%     103ns ± 4%    ~     (p=0.632 n=40+40)
RegexpMatchEasy0_1K-4       830ns ± 3%     828ns ± 3%    ~     (p=0.563 n=40+40)
RegexpMatchEasy1_32-4       113ns ± 4%     113ns ± 4%    ~     (p=0.494 n=40+40)
RegexpMatchEasy1_1K-4      1.03µs ± 4%    1.03µs ± 4%    ~     (p=0.665 n=40+40)
RegexpMatchMedium_32-4      130ns ± 4%     129ns ± 3%    ~     (p=0.458 n=40+40)
RegexpMatchMedium_1K-4     39.4µs ± 3%    39.7µs ± 3%    ~     (p=0.825 n=40+40)
RegexpMatchHard_32-4       2.16µs ± 4%    2.15µs ± 4%    ~     (p=0.137 n=40+40)
RegexpMatchHard_1K-4       65.2µs ± 3%    65.4µs ± 4%    ~     (p=0.160 n=40+40)
Revcomp-4                   1.87s ± 2%     1.87s ± 1%  +0.17%  (p=0.019 n=33+33)
Template-4                 69.4ms ± 3%    69.8ms ± 3%  +0.60%  (p=0.009 n=40+40)
TimeParse-4                 437ns ± 4%     438ns ± 4%    ~     (p=0.234 n=40+40)
TimeFormat-4                408ns ± 3%     408ns ± 3%    ~     (p=0.904 n=40+40)
[Geo mean]                 65.7µs         65.6µs       -0.08%

name                     old speed      new speed      delta
GobDecode-4               107MB/s ± 5%   108MB/s ± 5%    ~     (p=0.336 n=40+40)
GobEncode-4               112MB/s ± 6%   114MB/s ± 9%  +1.95%  (p=0.036 n=37+40)
Gzip-4                   48.5MB/s ± 4%  48.6MB/s ± 2%  +0.28%  (p=0.003 n=40+33)
Gunzip-4                  469MB/s ± 4%   469MB/s ± 4%  +0.11%  (p=0.021 n=40+40)
JSONEncode-4              122MB/s ± 2%   122MB/s ± 3%    ~     (p=0.105 n=40+40)
JSONDecode-4             31.2MB/s ± 4%  31.5MB/s ± 4%  +0.99%  (p=0.007 n=39+40)
GoParse-4                17.6MB/s ± 2%  17.7MB/s ± 2%  +0.66%  (p=0.007 n=40+40)
RegexpMatchEasy0_32-4     310MB/s ± 4%   310MB/s ± 4%    ~     (p=0.384 n=40+40)
RegexpMatchEasy0_1K-4    1.23GB/s ± 3%  1.24GB/s ± 3%    ~     (p=0.186 n=40+40)
RegexpMatchEasy1_32-4     283MB/s ± 3%   281MB/s ± 4%    ~     (p=0.855 n=40+40)
RegexpMatchEasy1_1K-4    1.00GB/s ± 4%  1.00GB/s ± 4%    ~     (p=0.665 n=40+40)
RegexpMatchMedium_32-4   7.68MB/s ± 4%  7.73MB/s ± 3%    ~     (p=0.359 n=40+40)
RegexpMatchMedium_1K-4   26.0MB/s ± 3%  25.8MB/s ± 3%    ~     (p=0.825 n=40+40)
RegexpMatchHard_32-4     14.8MB/s ± 3%  14.9MB/s ± 4%    ~     (p=0.136 n=40+40)
RegexpMatchHard_1K-4     15.7MB/s ± 3%  15.7MB/s ± 4%    ~     (p=0.150 n=40+40)
Revcomp-4                 136MB/s ± 1%   136MB/s ± 1%  -0.09%  (p=0.028 n=32+33)
Template-4               28.0MB/s ± 3%  27.8MB/s ± 3%  -0.59%  (p=0.010 n=40+40)
[Geo mean]               82.1MB/s       82.3MB/s       +0.25%

Change-Id: Ifa387a251056678326d3508aa02753b70bf7e5d0
Reviewed-on: https://go-review.googlesource.com/c/140303
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-09 03:55:08 +00:00
Igor Zhilianin 04dc1b2443 all: fix a bunch of misspellings
Change-Id: I94cebca86706e072fbe3be782d3edbe0e22b9432
GitHub-Last-Rev: 8e15a40545
GitHub-Pull-Request: golang/go#28067
Reviewed-on: https://go-review.googlesource.com/c/140437
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-10-08 03:12:03 +00:00
Keith Randall 6933d76a7e cmd/compile: allow VARDEF at top level
This was missed as part of adding a top-level VARDEF
for stack tracing (CL 134156).

Fixes #28055

Change-Id: Id14748dfccb119197d788867d2ec6a3b3c9835cf
Reviewed-on: https://go-review.googlesource.com/c/140304
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
2018-10-06 16:28:04 +00:00
Igor Zhilianin f90e89e675 all: fix a bunch of misspellings
Change-Id: If2954bdfc551515403706b2cd0dde94e45936e08
GitHub-Last-Rev: d4cfc41a55
GitHub-Pull-Request: golang/go#28049
Reviewed-on: https://go-review.googlesource.com/c/140299
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-06 15:40:03 +00:00
uropek f1973f3164 test: fix spelling of `caught be the compiler` to `caught by the compiler`
Change-Id: Id21cdce35963dcdb96cc06252170590224c5aa17
GitHub-Last-Rev: 429dad0ceb
GitHub-Pull-Request: golang/go#28000
Reviewed-on: https://go-review.googlesource.com/c/139424
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-10-04 00:49:49 +00:00
Keith Randall c91ce3cc7b test: stress test for stack objects
Allocate a long linked list on the stack. This tests both
lots of live stack objects, and lots of intra-stack pointers
to those objects.

Change-Id: I169e067416455737774851633b1e5367e10e1cf2
Reviewed-on: https://go-review.googlesource.com/c/135296
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-03 19:54:29 +00:00
Keith Randall 9dac0a8132 runtime: on a signal, set traceback address to a deferreturn call
When a function triggers a signal (like a segfault which translates to
a nil pointer exception) during execution, a sigpanic handler is just
below it on the stack.  The function itself did not stop at a
safepoint, so we have to figure out what safepoint we should use to
scan its stack frame.

Previously we used the site of the most recent defer to get the live
variables at the signal site. That answer is not quite correct, as
explained in #27518. Instead, use the site of a deferreturn call.
It has all the right variables marked as live (no args, all the return
values, except those that escape to the heap, in which case the
corresponding PAUTOHEAP variables will be live instead).

This CL requires stack objects, so that all the local variables
and args referenced by the deferred closures keep the right variables alive.

Fixes #27518

Change-Id: Id45d8a8666759986c203181090b962e2981e48ca
Reviewed-on: https://go-review.googlesource.com/c/134637
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-03 19:54:23 +00:00
Keith Randall 9a8372f8bd cmd/compile,runtime: remove ambiguously live logic
The previous CL introduced stack objects. This CL removes the old
ambiguously live liveness analysis. After this CL we're relying
on stack objects exclusively.

Update a bunch of liveness tests to reflect the new world.

Fixes #22350

Change-Id: I739b26e015882231011ce6bc1a7f426049e59f31
Reviewed-on: https://go-review.googlesource.com/c/134156
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-03 19:54:16 +00:00
Cherry Zhang c96e3bcc97 cmd/compile: fix type of OffPtr in some optimization rules
In some optimization rules the type of generated OffPtr was
incorrectly set to the type of the pointee, instead of the
pointer. When the OffPtr value is spilled, this may generate
a spill of the wrong type, e.g. a floating point spill of an
integer (pointer) value. On Wasm, this leads to invalid
bytecode.

Fixes #27961.

Change-Id: I5d464847eb900ed90794105c0013a1a7330756cc
Reviewed-on: https://go-review.googlesource.com/c/139257
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Richard Musiol <neelance@gmail.com>
2018-10-03 15:01:47 +00:00
Carlos Eduardo Seo 9aed4cc395 cmd/compile: instrinsify math/bits.Mul on ppc64x
Add SSA rules to intrinsify Mul/Mul64 on ppc64x.

benchmark             old ns/op     new ns/op     delta
BenchmarkMul-40       8.80          0.93          -89.43%
BenchmarkMul32-40     1.39          1.39          +0.00%
BenchmarkMul64-40     5.39          0.93          -82.75%

Updates #24813

Change-Id: I6e95bfbe976a2278bd17799df184a7fbc0e57829
Reviewed-on: https://go-review.googlesource.com/138917
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2018-10-02 18:56:06 +00:00
Keith Randall ef50373983 reflect: ensure correct scanning of return values
During a call to a reflect-generated function or method (via
makeFuncStub or methodValueCall), when should we scan the return
values?

When we're starting a reflect call, the space on the stack for the
return values is not initialized yet, as it contains whatever junk was
on the stack of the caller at the time. The return space must not be
scanned during a GC.

When we're finishing a reflect call, the return values are
initialized, and must be scanned during a GC to make sure that any
pointers in the return values are found and their referents retained.

When the GC stack walk comes across a reflect call in progress on the
stack, it needs to know whether to scan the results or not. It doesn't
know the progress of the reflect call, so it can't decide by
itself. The reflect package needs to tell it.

This CL adds another slot in the frame of makeFuncStub and
methodValueCall so we can put a boolean in there which tells the
runtime whether to scan the results or not.

This CL also adds the args length to reflectMethodValue so the
runtime can restrict its scanning to only the args section (not the
results) if the reflect package says the results aren't ready yet.

Do a delicate dance in the reflect package to set the "results are
valid" bit. We need to make sure we set the bit only after we've
copied the results back to the stack. But we must set the bit before
we drop reflect's copy of the results. Otherwise, we might have a
state where (temporarily) no one has a live copy of the results.
That's the state we were observing in issue #27695 before this CL.

The bitmap used by the runtime currently contains only the args.
(Actually, it contains all the bits, but the size is set so we use
only the args portion.) This is safe for early in a reflect call, but
unsafe late in a reflect call. The test issue27695.go demonstrates
this unsafety. We change the bitmap to always include both args
and results, and decide at runtime which portion to use.

issue27695.go only has a test for method calls. Function calls were ok
because there wasn't a safepoint between when reflect dropped its copy
of the return values and when the caller is resumed. This may change
when we introduce safepoints everywhere.

This truncate-to-only-the-args was part of CL 9888 (in 2015). That
part of the CL fixed the problem demonstrated in issue27695b.go but
introduced the problem demonstrated in issue27695.go.

TODO, in another CL: simplify FuncLayout and its test. stack return
value is now identical to frametype.ptrdata + frametype.gcdata.

Fixes #27695

Change-Id: I2d49b34e34a82c6328b34f02610587a291b25c5f
Reviewed-on: https://go-review.googlesource.com/137440
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-09-29 20:25:24 +00:00
Ben Shi 5aeecc4530 cmd/compile: optimize arm64's code with more shifted operations
This CL optimizes arm64's NEG/MVN/TST/CMN with a shifted operand.

1. The total size of pkg/android_arm64 decreases about 0.2KB, excluding
cmd/compile/ .

2. The go1 benchmark shows no regression, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              16.4s ± 1%     16.4s ± 1%    ~     (p=0.914 n=29+29)
Fannkuch11-4                8.72s ± 0%     8.72s ± 0%    ~     (p=0.274 n=30+29)
FmtFprintfEmpty-4           174ns ± 0%     174ns ± 0%    ~     (all equal)
FmtFprintfString-4          370ns ± 0%     370ns ± 0%    ~     (all equal)
FmtFprintfInt-4             419ns ± 0%     419ns ± 0%    ~     (all equal)
FmtFprintfIntInt-4          672ns ± 1%     675ns ± 2%    ~     (p=0.217 n=28+30)
FmtFprintfPrefixedInt-4     806ns ± 0%     806ns ± 0%    ~     (p=0.402 n=30+28)
FmtFprintfFloat-4          1.09µs ± 0%    1.09µs ± 0%  +0.02%  (p=0.011 n=22+27)
FmtManyArgs-4              2.67µs ± 0%    2.68µs ± 0%    ~     (p=0.279 n=29+30)
GobDecode-4                33.1ms ± 1%    33.1ms ± 0%    ~     (p=0.052 n=28+29)
GobEncode-4                29.6ms ± 0%    29.6ms ± 0%  +0.08%  (p=0.013 n=28+29)
Gzip-4                      1.38s ± 2%     1.39s ± 2%    ~     (p=0.071 n=29+29)
Gunzip-4                    139ms ± 0%     139ms ± 0%    ~     (p=0.265 n=29+29)
HTTPClientServer-4          789µs ± 4%     785µs ± 4%    ~     (p=0.206 n=29+28)
JSONEncode-4               49.7ms ± 0%    49.6ms ± 0%  -0.24%  (p=0.000 n=30+30)
JSONDecode-4                266ms ± 1%     267ms ± 1%  +0.34%  (p=0.000 n=30+30)
Mandelbrot200-4            16.6ms ± 0%    16.6ms ± 0%    ~     (p=0.835 n=28+30)
GoParse-4                  15.9ms ± 0%    15.8ms ± 0%  -0.29%  (p=0.000 n=27+30)
RegexpMatchEasy0_32-4       380ns ± 0%     381ns ± 0%  +0.18%  (p=0.000 n=30+30)
RegexpMatchEasy0_1K-4      1.18µs ± 0%    1.19µs ± 0%  +0.23%  (p=0.000 n=30+30)
RegexpMatchEasy1_32-4       357ns ± 0%     358ns ± 0%  +0.28%  (p=0.000 n=29+29)
RegexpMatchEasy1_1K-4      2.04µs ± 0%    2.04µs ± 0%  +0.06%  (p=0.006 n=30+30)
RegexpMatchMedium_32-4      589ns ± 0%     590ns ± 0%  +0.24%  (p=0.000 n=28+30)
RegexpMatchMedium_1K-4      162µs ± 0%     162µs ± 0%  -0.01%  (p=0.027 n=26+29)
RegexpMatchHard_32-4       9.58µs ± 0%    9.58µs ± 0%    ~     (p=0.935 n=30+30)
RegexpMatchHard_1K-4        287µs ± 0%     287µs ± 0%    ~     (p=0.387 n=29+30)
Revcomp-4                   2.50s ± 0%     2.50s ± 0%  -0.10%  (p=0.020 n=28+28)
Template-4                  310ms ± 0%     310ms ± 1%    ~     (p=0.406 n=30+30)
TimeParse-4                1.68µs ± 0%    1.68µs ± 0%  +0.03%  (p=0.014 n=30+17)
TimeFormat-4               1.65µs ± 0%    1.66µs ± 0%  +0.32%  (p=0.000 n=27+29)
[Geo mean]                  247µs          247µs       +0.05%

name                     old speed      new speed      delta
GobDecode-4              23.2MB/s ± 0%  23.2MB/s ± 0%  -0.08%  (p=0.032 n=27+29)
GobEncode-4              26.0MB/s ± 0%  25.9MB/s ± 0%  -0.10%  (p=0.011 n=29+29)
Gzip-4                   14.1MB/s ± 2%  14.0MB/s ± 2%    ~     (p=0.081 n=29+29)
Gunzip-4                  139MB/s ± 0%   139MB/s ± 0%    ~     (p=0.290 n=29+29)
JSONEncode-4             39.0MB/s ± 0%  39.1MB/s ± 0%  +0.25%  (p=0.000 n=29+30)
JSONDecode-4             7.30MB/s ± 1%  7.28MB/s ± 1%  -0.33%  (p=0.000 n=30+30)
GoParse-4                3.65MB/s ± 0%  3.66MB/s ± 0%  +0.29%  (p=0.000 n=27+30)
RegexpMatchEasy0_32-4    84.1MB/s ± 0%  84.0MB/s ± 0%  -0.17%  (p=0.000 n=30+28)
RegexpMatchEasy0_1K-4     864MB/s ± 0%   862MB/s ± 0%  -0.24%  (p=0.000 n=30+30)
RegexpMatchEasy1_32-4    89.5MB/s ± 0%  89.3MB/s ± 0%  -0.18%  (p=0.000 n=28+24)
RegexpMatchEasy1_1K-4     502MB/s ± 0%   502MB/s ± 0%  -0.05%  (p=0.008 n=30+29)
RegexpMatchMedium_32-4   1.70MB/s ± 0%  1.69MB/s ± 0%  -0.59%  (p=0.000 n=29+30)
RegexpMatchMedium_1K-4   6.31MB/s ± 0%  6.31MB/s ± 0%  +0.05%  (p=0.005 n=30+26)
RegexpMatchHard_32-4     3.34MB/s ± 0%  3.34MB/s ± 0%    ~     (all equal)
RegexpMatchHard_1K-4     3.57MB/s ± 0%  3.57MB/s ± 0%    ~     (all equal)
Revcomp-4                 102MB/s ± 0%   102MB/s ± 0%  +0.10%  (p=0.022 n=28+28)
Template-4               6.26MB/s ± 0%  6.26MB/s ± 1%    ~     (p=0.768 n=30+30)
[Geo mean]               24.2MB/s       24.1MB/s       -0.08%

Change-Id: I494f9db7f8a568a00e9c74ae25086a58b2221683
Reviewed-on: https://go-review.googlesource.com/137976
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-28 15:05:17 +00:00
Ben Shi d60cf39f8e cmd/compile: optimize arm64's MADD and MSUB
This CL implements constant folding for MADD/MSUB on arm64.

1. The total size of pkg/android_arm64/ decreases about 4KB,
   excluding cmd/compile/ .

2. There is no regression in the go1 benchmark, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              16.4s ± 1%     16.5s ± 1%  +0.24%  (p=0.008 n=29+29)
Fannkuch11-4                8.73s ± 0%     8.71s ± 0%  -0.15%  (p=0.000 n=29+29)
FmtFprintfEmpty-4           174ns ± 0%     174ns ± 0%    ~     (all equal)
FmtFprintfString-4          370ns ± 0%     372ns ± 2%  +0.53%  (p=0.007 n=24+30)
FmtFprintfInt-4             419ns ± 0%     419ns ± 0%    ~     (all equal)
FmtFprintfIntInt-4          673ns ± 1%     661ns ± 1%  -1.81%  (p=0.000 n=30+27)
FmtFprintfPrefixedInt-4     806ns ± 0%     805ns ± 0%    ~     (p=0.957 n=28+27)
FmtFprintfFloat-4          1.09µs ± 0%    1.09µs ± 0%  -0.04%  (p=0.001 n=22+30)
FmtManyArgs-4              2.67µs ± 0%    2.68µs ± 0%  +0.03%  (p=0.045 n=29+28)
GobDecode-4                33.2ms ± 1%    32.5ms ± 1%  -2.11%  (p=0.000 n=29+29)
GobEncode-4                29.5ms ± 0%    29.2ms ± 0%  -1.04%  (p=0.000 n=28+28)
Gzip-4                      1.39s ± 2%     1.38s ± 1%  -0.48%  (p=0.023 n=30+30)
Gunzip-4                    139ms ± 0%     139ms ± 0%    ~     (p=0.616 n=30+28)
HTTPClientServer-4          766µs ± 4%     758µs ± 3%  -1.03%  (p=0.013 n=28+29)
JSONEncode-4               49.7ms ± 0%    49.6ms ± 0%  -0.24%  (p=0.000 n=30+30)
JSONDecode-4                266ms ± 0%     268ms ± 1%  +1.07%  (p=0.000 n=29+30)
Mandelbrot200-4            16.6ms ± 0%    16.6ms ± 0%    ~     (p=0.248 n=30+29)
GoParse-4                  15.9ms ± 0%    16.0ms ± 0%  +0.76%  (p=0.000 n=29+29)
RegexpMatchEasy0_32-4       381ns ± 0%     380ns ± 0%  -0.14%  (p=0.000 n=30+30)
RegexpMatchEasy0_1K-4      1.18µs ± 0%    1.19µs ± 1%  +0.30%  (p=0.000 n=29+30)
RegexpMatchEasy1_32-4       357ns ± 0%     357ns ± 0%    ~     (all equal)
RegexpMatchEasy1_1K-4      2.04µs ± 0%    2.05µs ± 0%  +0.50%  (p=0.000 n=26+28)
RegexpMatchMedium_32-4      590ns ± 0%     589ns ± 0%  -0.12%  (p=0.000 n=30+23)
RegexpMatchMedium_1K-4      162µs ± 0%     162µs ± 0%    ~     (p=0.318 n=28+25)
RegexpMatchHard_32-4       9.56µs ± 0%    9.56µs ± 0%    ~     (p=0.072 n=30+29)
RegexpMatchHard_1K-4        287µs ± 0%     287µs ± 0%  -0.02%  (p=0.005 n=28+28)
Revcomp-4                   2.50s ± 0%     2.51s ± 0%    ~     (p=0.246 n=29+29)
Template-4                  312ms ± 1%     313ms ± 1%  +0.46%  (p=0.002 n=30+30)
TimeParse-4                1.68µs ± 0%    1.67µs ± 0%  -0.31%  (p=0.000 n=27+29)
TimeFormat-4               1.66µs ± 0%    1.64µs ± 0%  -0.92%  (p=0.000 n=29+26)
[Geo mean]                  247µs          246µs       -0.15%

name                     old speed      new speed      delta
GobDecode-4              23.1MB/s ± 1%  23.6MB/s ± 0%  +2.17%  (p=0.000 n=29+28)
GobEncode-4              26.0MB/s ± 0%  26.3MB/s ± 0%  +1.05%  (p=0.000 n=28+28)
Gzip-4                   14.0MB/s ± 2%  14.1MB/s ± 1%  +0.47%  (p=0.026 n=30+30)
Gunzip-4                  139MB/s ± 0%   139MB/s ± 0%    ~     (p=0.624 n=30+28)
JSONEncode-4             39.1MB/s ± 0%  39.2MB/s ± 0%  +0.24%  (p=0.000 n=30+30)
JSONDecode-4             7.31MB/s ± 0%  7.23MB/s ± 1%  -1.07%  (p=0.000 n=28+30)
GoParse-4                3.65MB/s ± 0%  3.62MB/s ± 0%  -0.77%  (p=0.000 n=29+29)
RegexpMatchEasy0_32-4    84.0MB/s ± 0%  84.1MB/s ± 0%  +0.18%  (p=0.000 n=28+30)
RegexpMatchEasy0_1K-4     864MB/s ± 0%   861MB/s ± 1%  -0.29%  (p=0.000 n=29+30)
RegexpMatchEasy1_32-4    89.5MB/s ± 0%  89.5MB/s ± 0%    ~     (p=0.841 n=28+28)
RegexpMatchEasy1_1K-4     502MB/s ± 0%   500MB/s ± 0%  -0.51%  (p=0.000 n=29+29)
RegexpMatchMedium_32-4   1.69MB/s ± 0%  1.70MB/s ± 0%  +0.41%  (p=0.000 n=26+30)
RegexpMatchMedium_1K-4   6.31MB/s ± 0%  6.30MB/s ± 0%    ~     (p=0.129 n=30+25)
RegexpMatchHard_32-4     3.35MB/s ± 0%  3.35MB/s ± 0%    ~     (p=0.657 n=30+29)
RegexpMatchHard_1K-4     3.57MB/s ± 0%  3.57MB/s ± 0%    ~     (all equal)
Revcomp-4                 102MB/s ± 0%   101MB/s ± 0%    ~     (p=0.213 n=29+29)
Template-4               6.22MB/s ± 1%  6.19MB/s ± 1%  -0.42%  (p=0.005 n=30+29)
[Geo mean]               24.1MB/s       24.2MB/s       +0.08%

Change-Id: I6c02d3c9975f6bd8bc215cb1fc14d29602b45649
Reviewed-on: https://go-review.googlesource.com/138095
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-28 15:03:17 +00:00
Than McIntosh 31d19c0ba3 test: add testcase for gccgo compile failure
Also includes a small tweak to test/run.go to allow package names
with Unicode letters (as opposed to just ASCII chars).

Updates #27836

Change-Id: Idbf0bdea24174808cddcb69974dab820eb13e521
Reviewed-on: https://go-review.googlesource.com/138075
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-27 15:01:24 +00:00
David Heuschmann ae9c822f78 cmd/compile: use more specific error message for assignment mismatch
Show a more specifc error message in the form of "%d variables but %v
returns %d values" if an assignment mismatch occurs with a function
or method call on the right.

Fixes #27595

Change-Id: Ibc97d070662b08f150ac22d686059cf224e012ab
Reviewed-on: https://go-review.googlesource.com/135575
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-09-27 00:35:06 +00:00
Brian Kessler 9eb53ab9bc cmd/compile: intrinsify math/bits.Mul
Add SSA rules to intrinsify Mul/Mul64 (AMD64 and ARM64).
SSA rules for other functions and architectures are left as a future
optimization.  Benchmark results on AMD64/ARM64 before and after SSA
implementation are below.

amd64
name     old time/op  new time/op  delta
Add-4    1.78ns ± 0%  1.85ns ±12%     ~     (p=0.397 n=4+5)
Add32-4  1.71ns ± 1%  1.70ns ± 0%     ~     (p=0.683 n=5+5)
Add64-4  1.80ns ± 2%  1.77ns ± 0%   -1.22%  (p=0.048 n=5+5)
Sub-4    1.78ns ± 0%  1.78ns ± 0%     ~     (all equal)
Sub32-4  1.78ns ± 1%  1.78ns ± 0%     ~     (p=1.000 n=5+5)
Sub64-4  1.78ns ± 1%  1.78ns ± 0%     ~     (p=0.968 n=5+4)
Mul-4    11.5ns ± 1%   1.8ns ± 2%  -84.39%  (p=0.008 n=5+5)
Mul32-4  1.39ns ± 0%  1.38ns ± 3%     ~     (p=0.175 n=5+5)
Mul64-4  6.85ns ± 1%  1.78ns ± 1%  -73.97%  (p=0.008 n=5+5)
Div-4    57.1ns ± 1%  56.7ns ± 0%     ~     (p=0.087 n=5+5)
Div32-4  18.0ns ± 0%  18.0ns ± 0%     ~     (all equal)
Div64-4  56.4ns ±10%  53.6ns ± 1%     ~     (p=0.071 n=5+5)

arm64
name      old time/op  new time/op  delta
Add-96    5.51ns ± 0%  5.51ns ± 0%     ~     (all equal)
Add32-96  5.51ns ± 0%  5.51ns ± 0%     ~     (all equal)
Add64-96  5.52ns ± 0%  5.51ns ± 0%     ~     (p=0.444 n=5+5)
Sub-96    5.51ns ± 0%  5.51ns ± 0%     ~     (all equal)
Sub32-96  5.51ns ± 0%  5.51ns ± 0%     ~     (all equal)
Sub64-96  5.51ns ± 0%  5.51ns ± 0%     ~     (all equal)
Mul-96    34.6ns ± 0%   5.0ns ± 0%  -85.52%  (p=0.008 n=5+5)
Mul32-96  4.51ns ± 0%  4.51ns ± 0%     ~     (all equal)
Mul64-96  21.1ns ± 0%   5.0ns ± 0%  -76.26%  (p=0.008 n=5+5)
Div-96    64.7ns ± 0%  64.7ns ± 0%     ~     (all equal)
Div32-96  17.0ns ± 0%  17.0ns ± 0%     ~     (all equal)
Div64-96  53.1ns ± 0%  53.1ns ± 0%     ~     (all equal)

Updates #24813

Change-Id: I9bda6d2102f65cae3d436a2087b47ed8bafeb068
Reviewed-on: https://go-review.googlesource.com/129415
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-09-26 20:35:57 +00:00
Alberto Donizetti 8c1c6702f1 test: restore binary.BigEndian use in checkbce
CL 136855 removed the encoding/binary dependency from the checkbce.go
test by defining a local Uint64 to fix the noopt builder; then a more
general mechanism to skip tests on the noopt builder was introduced in
CL 136898, so we can now restore the binary.Uint64 calls in testbce.

Change-Id: I3efbb41be0bfc446a7e638ce6a593371ead2684f
Reviewed-on: https://go-review.googlesource.com/137056
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-24 21:20:51 +00:00
Brad Fitzpatrick b3369063e5 test: skip some tests on noopt builder
Adds a new build tag "gcflags_noopt" that can be used in test/*.go
tests.

Fixes #27833

Change-Id: I4ea0ccd9e9e58c4639de18645fec81eb24a3a929
Reviewed-on: https://go-review.googlesource.com/136898
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-09-24 20:56:48 +00:00
Keith Randall 9774fa6f40 cmd/compile: fix precedence order bug
&^ and << have equal precedence.  Add some parentheses to make sure
we shift before we andnot.

Fixes #27829

Change-Id: Iba8576201f0f7c52bf9795aaa75d15d8f9a76811
Reviewed-on: https://go-review.googlesource.com/136899
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-09-24 17:43:55 +00:00
Jongmin Kim c22c7607d3 test/bench/garbage: update Benchmarks Game URL to new page
The existing URL in comment points to an Alioth page which was
deprecated (and not working), so use the new Benchmarks Game URL.

Change-Id: Ifd694382a44a24c44acbed3fe1b17bca6dab998f
Reviewed-on: https://go-review.googlesource.com/136835
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-09-24 17:11:42 +00:00
Alberto Donizetti 6054fef17f test: fix bcecheck test on noopt builder
The noopt builder is configured by setting GO_GCFLAGS=-N -l, but the
test/run.go test harness doesn't look at GO_GCFLAGS when processing
"errorcheck" files, it just calls compile:

  cmdline := []string{goTool(), "tool", "compile", /* etc */}

This is working as intended, since it makes the tests more robust and
independent from the environment; errorcheck files are supposed to set
additional building flags, when needed, like in:

  // errorcheck -0 -N -l

The test/bcecheck.go test used to work on the noopt builder (even if
bce is not active on -N -l) because the test was auto-contained and
the file always compiled with optimizations enabled.

In CL 107355, a new bce test dependent on an external package
(encoding.binary) was added. On the noopt builder the external package
is built using -N -l, and this causes a test failure that broke the
noopt builder:

  https://build.golang.org/log/b2be319536285e5807ee9d66d6d0ec4d57433768

To reproduce the failure, one can do:

  $ go install -a -gcflags="-N -l" std
  $ go run run.go -- checkbce.go

This change fixes the noopt builder breakage by removing the bce test
dependency on encoding/binary by defining a local Uint64() function to
be used in the test.

Change-Id: Ife71aab662001442e715c32a0b7d758349a63ff1
Reviewed-on: https://go-review.googlesource.com/136855
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-09-24 16:54:52 +00:00
Iskander Sharipov 499fbb1a8a cmd/compile/internal/gc: unify self-assignment checks in esc.go
Move slice self-assign check into isSelfAssign function.
Make debug output consistent for all self-assignment cases.

Change-Id: I0e4cc7b3c1fcaeace7226dd80a0dc1ea97347a55
Reviewed-on: https://go-review.googlesource.com/136276
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-20 09:46:11 +00:00
Robert Griesemer ae37f5a397 cmd/compile: fix error message for &T{} literal mismatch
See the change and comment in typecheck.go for a detailed explanation.

Fixes #26855.

Change-Id: I7867f948490fc0873b1bd849048cda6acbc36e76
Reviewed-on: https://go-review.googlesource.com/136395
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-09-20 00:07:35 +00:00
Iskander Sharipov c03d0e4fec cmd/compile/internal/gc: handle arith ops in samesafeexpr
Teach samesafeexpr to handle arithmetic unary and binary ops.

It makes map lookup optimization possible in

	m[k+1] = append(m[k+1], ...)
	m[-k] = append(m[-k], ...)
	... etc

Does not cover "+" for strings (concatenation).

Change-Id: Ibbb16ac3faf176958da344be1471b06d7cf33a6c
Reviewed-on: https://go-review.googlesource.com/135795
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-09-19 12:03:58 +00:00
Ben Shi c6bf9a8109 cmd/compile: optimize AMD64's bit wise operation
Currently "arr[idx] |= 0x80" is compiled to MOVLload->BTSL->MOVLstore.
And this CL optimizes it to a single BTSLconstmodify. Other bit wise
operations with a direct memory operand are also implemented.

1. The size of the executable bin/go decreases about 4KB, and the total size
of pkg/linux_amd64 (excluding cmd/compile) decreases about 0.6KB.

2. There a little improvement in the go1 benchmark test (excluding noise).
name                     old time/op    new time/op    delta
BinaryTree17-4              2.66s ± 4%     2.66s ± 3%    ~     (p=0.596 n=49+49)
Fannkuch11-4                2.38s ± 2%     2.32s ± 2%  -2.69%  (p=0.000 n=50+50)
FmtFprintfEmpty-4          42.7ns ± 4%    43.2ns ± 7%  +1.31%  (p=0.009 n=50+50)
FmtFprintfString-4         71.0ns ± 5%    72.0ns ± 3%  +1.33%  (p=0.000 n=50+50)
FmtFprintfInt-4            80.7ns ± 4%    80.6ns ± 3%    ~     (p=0.931 n=50+50)
FmtFprintfIntInt-4          125ns ± 3%     126ns ± 4%    ~     (p=0.051 n=50+50)
FmtFprintfPrefixedInt-4     158ns ± 1%     142ns ± 3%  -9.84%  (p=0.000 n=36+50)
FmtFprintfFloat-4           215ns ± 4%     212ns ± 4%  -1.23%  (p=0.002 n=50+50)
FmtManyArgs-4               519ns ± 3%     510ns ± 3%  -1.77%  (p=0.000 n=50+50)
GobDecode-4                6.49ms ± 6%    6.52ms ± 5%    ~     (p=0.866 n=50+50)
GobEncode-4                5.93ms ± 8%    6.01ms ± 7%    ~     (p=0.076 n=50+50)
Gzip-4                      222ms ± 4%     224ms ± 8%  +0.80%  (p=0.001 n=50+50)
Gunzip-4                   36.6ms ± 5%    36.4ms ± 4%    ~     (p=0.093 n=50+50)
HTTPClientServer-4         59.1µs ± 1%    58.9µs ± 2%  -0.24%  (p=0.039 n=49+48)
JSONEncode-4               9.23ms ± 4%    9.21ms ± 5%    ~     (p=0.244 n=50+50)
JSONDecode-4               48.8ms ± 4%    48.7ms ± 4%    ~     (p=0.653 n=50+50)
Mandelbrot200-4            3.81ms ± 4%    3.80ms ± 3%    ~     (p=0.834 n=50+50)
GoParse-4                  3.20ms ± 5%    3.19ms ± 5%    ~     (p=0.494 n=50+50)
RegexpMatchEasy0_32-4      78.1ns ± 2%    77.4ns ± 3%  -0.86%  (p=0.005 n=50+50)
RegexpMatchEasy0_1K-4       233ns ± 3%     233ns ± 3%    ~     (p=0.074 n=50+50)
RegexpMatchEasy1_32-4      74.2ns ± 3%    73.4ns ± 3%  -1.06%  (p=0.000 n=50+50)
RegexpMatchEasy1_1K-4       369ns ± 2%     364ns ± 4%  -1.41%  (p=0.000 n=36+50)
RegexpMatchMedium_32-4      109ns ± 4%     107ns ± 3%  -2.06%  (p=0.001 n=50+50)
RegexpMatchMedium_1K-4     31.5µs ± 3%    30.8µs ± 3%  -2.20%  (p=0.000 n=50+50)
RegexpMatchHard_32-4       1.57µs ± 3%    1.56µs ± 2%  -0.57%  (p=0.016 n=50+50)
RegexpMatchHard_1K-4       47.4µs ± 4%    47.0µs ± 3%  -0.82%  (p=0.008 n=50+50)
Revcomp-4                   414ms ± 7%     412ms ± 7%    ~     (p=0.285 n=50+50)
Template-4                 64.3ms ± 4%    62.7ms ± 3%  -2.44%  (p=0.000 n=50+50)
TimeParse-4                 316ns ± 3%     313ns ± 3%    ~     (p=0.122 n=50+50)
TimeFormat-4                291ns ± 3%     293ns ± 3%  +0.80%  (p=0.001 n=50+50)
[Geo mean]                 46.5µs         46.2µs       -0.81%

name                     old speed      new speed      delta
GobDecode-4               118MB/s ± 6%   118MB/s ± 5%    ~     (p=0.863 n=50+50)
GobEncode-4               130MB/s ± 9%   128MB/s ± 8%    ~     (p=0.076 n=50+50)
Gzip-4                   87.4MB/s ± 4%  86.8MB/s ± 7%  -0.78%  (p=0.002 n=50+50)
Gunzip-4                  531MB/s ± 5%   533MB/s ± 4%    ~     (p=0.093 n=50+50)
JSONEncode-4              210MB/s ± 4%   211MB/s ± 5%    ~     (p=0.247 n=50+50)
JSONDecode-4             39.8MB/s ± 4%  39.9MB/s ± 4%    ~     (p=0.654 n=50+50)
GoParse-4                18.1MB/s ± 5%  18.2MB/s ± 5%    ~     (p=0.493 n=50+50)
RegexpMatchEasy0_32-4     410MB/s ± 2%   413MB/s ± 3%  +0.86%  (p=0.004 n=50+50)
RegexpMatchEasy0_1K-4    4.39GB/s ± 3%  4.38GB/s ± 3%    ~     (p=0.063 n=50+50)
RegexpMatchEasy1_32-4     432MB/s ± 3%   436MB/s ± 3%  +1.07%  (p=0.000 n=50+50)
RegexpMatchEasy1_1K-4    2.77GB/s ± 2%  2.81GB/s ± 4%  +1.46%  (p=0.000 n=36+50)
RegexpMatchMedium_32-4   9.16MB/s ± 3%  9.35MB/s ± 4%  +2.09%  (p=0.001 n=50+50)
RegexpMatchMedium_1K-4   32.5MB/s ± 3%  33.2MB/s ± 3%  +2.25%  (p=0.000 n=50+50)
RegexpMatchHard_32-4     20.4MB/s ± 3%  20.5MB/s ± 2%  +0.56%  (p=0.017 n=50+50)
RegexpMatchHard_1K-4     21.6MB/s ± 4%  21.8MB/s ± 3%  +0.83%  (p=0.008 n=50+50)
Revcomp-4                 613MB/s ± 4%   618MB/s ± 7%    ~     (p=0.152 n=48+50)
Template-4               30.2MB/s ± 4%  30.9MB/s ± 3%  +2.49%  (p=0.000 n=50+50)
[Geo mean]                127MB/s        128MB/s       +0.64%

Change-Id: If405198283855d75697f66cf894b2bef458f620e
Reviewed-on: https://go-review.googlesource.com/135422
Reviewed-by: Keith Randall <khr@golang.org>
2018-09-19 03:00:58 +00:00
Keith Randall c6118af558 cmd/compile: don't do floating point optimization x+0 -> x
That optimization is not valid if x == -0.

The test is a bit tricky because 0 == -0. We distinguish
0 from -0 with 1/0 == inf, 1/-0 == -inf.

This has been a bug since CL 24790 in Go 1.8. Probably doesn't
warrant a backport.

Fixes #27718

Note: the optimization x-0 -> x is actually valid.
But it's probably best to take it out, so as to not confuse readers.

Change-Id: I99f16a93b45f7406ec8053c2dc759a13eba035fa
Reviewed-on: https://go-review.googlesource.com/135701
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-18 20:27:09 +00:00
fanzha02 a19a83c8ef cmd/compile: optimize math.Float64(32)bits and math.Float64(32)frombits on arm64
Use float <-> int register moves without conversion instead of stores
and loads to move float <-> int values.

Math package benchmark results.
name                 old time/op  new time/op  delta
Acosh                 153ns ± 0%   147ns ± 0%   -3.92%  (p=0.000 n=10+10)
Asinh                 183ns ± 0%   177ns ± 0%   -3.28%  (p=0.000 n=10+10)
Atanh                 157ns ± 0%   155ns ± 0%   -1.27%  (p=0.000 n=10+10)
Atan2                 118ns ± 0%   117ns ± 1%   -0.59%  (p=0.003 n=10+10)
Cbrt                  119ns ± 0%   114ns ± 0%   -4.20%  (p=0.000 n=10+10)
Copysign             7.51ns ± 0%  6.51ns ± 0%  -13.32%  (p=0.000 n=9+10)
Cos                  73.1ns ± 0%  70.6ns ± 0%   -3.42%  (p=0.000 n=10+10)
Cosh                  119ns ± 0%   121ns ± 0%   +1.68%  (p=0.000 n=10+9)
ExpGo                 154ns ± 0%   149ns ± 0%   -3.05%  (p=0.000 n=9+10)
Expm1                 101ns ± 0%    99ns ± 0%   -1.88%  (p=0.000 n=10+10)
Exp2Go                150ns ± 0%   146ns ± 0%   -2.67%  (p=0.000 n=10+10)
Abs                  7.01ns ± 0%  6.01ns ± 0%  -14.27%  (p=0.000 n=10+9)
Mod                   234ns ± 0%   212ns ± 0%   -9.40%  (p=0.000 n=9+10)
Frexp                34.5ns ± 0%  30.0ns ± 0%  -13.04%  (p=0.000 n=10+10)
Gamma                 112ns ± 0%   111ns ± 0%   -0.89%  (p=0.000 n=10+10)
Hypot                73.6ns ± 0%  68.6ns ± 0%   -6.79%  (p=0.000 n=10+10)
HypotGo              77.1ns ± 0%  72.1ns ± 0%   -6.49%  (p=0.000 n=10+10)
Ilogb                31.0ns ± 0%  28.0ns ± 0%   -9.68%  (p=0.000 n=10+10)
J0                    437ns ± 0%   434ns ± 0%   -0.62%  (p=0.000 n=10+10)
J1                    433ns ± 0%   431ns ± 0%   -0.46%  (p=0.000 n=10+10)
Jn                    927ns ± 0%   922ns ± 0%   -0.54%  (p=0.000 n=10+10)
Ldexp                41.5ns ± 0%  37.0ns ± 0%  -10.84%  (p=0.000 n=9+10)
Log                   124ns ± 0%   118ns ± 0%   -4.84%  (p=0.000 n=10+9)
Logb                 34.0ns ± 0%  32.0ns ± 0%   -5.88%  (p=0.000 n=10+10)
Log1p                 110ns ± 0%   108ns ± 0%   -1.82%  (p=0.000 n=10+10)
Log10                 136ns ± 0%   132ns ± 0%   -2.94%  (p=0.000 n=10+10)
Log2                 51.6ns ± 0%  47.1ns ± 0%   -8.72%  (p=0.000 n=10+10)
Nextafter32          33.0ns ± 0%  30.5ns ± 0%   -7.58%  (p=0.000 n=10+10)
Nextafter64          29.0ns ± 0%  26.5ns ± 0%   -8.62%  (p=0.000 n=10+10)
PowInt                169ns ± 0%   160ns ± 0%   -5.33%  (p=0.000 n=10+10)
PowFrac               375ns ± 0%   361ns ± 0%   -3.73%  (p=0.000 n=10+10)
RoundToEven          14.0ns ± 0%  12.5ns ± 0%  -10.71%  (p=0.000 n=10+10)
Remainder             206ns ± 0%   192ns ± 0%   -6.80%  (p=0.000 n=10+9)
Signbit              6.01ns ± 0%  5.51ns ± 0%   -8.32%  (p=0.000 n=10+9)
Sin                  70.1ns ± 0%  69.6ns ± 0%   -0.71%  (p=0.000 n=10+10)
Sincos               99.1ns ± 0%  99.6ns ± 0%   +0.50%  (p=0.000 n=9+10)
SqrtGoLatency         178ns ± 0%   146ns ± 0%  -17.70%  (p=0.000 n=8+10)
SqrtPrime            9.19µs ± 0%  9.20µs ± 0%   +0.01%  (p=0.000 n=9+9)
Tanh                  125ns ± 1%   127ns ± 0%   +1.36%  (p=0.000 n=10+10)
Y0                    428ns ± 0%   426ns ± 0%   -0.47%  (p=0.000 n=10+10)
Y1                    431ns ± 0%   429ns ± 0%   -0.46%  (p=0.000 n=10+9)
Yn                    906ns ± 0%   901ns ± 0%   -0.55%  (p=0.000 n=10+10)
Float64bits          4.50ns ± 0%  3.50ns ± 0%  -22.22%  (p=0.000 n=10+10)
Float64frombits      4.00ns ± 0%  3.50ns ± 0%  -12.50%  (p=0.000 n=10+9)
Float32bits          4.50ns ± 0%  3.50ns ± 0%  -22.22%  (p=0.002 n=8+10)
Float32frombits      4.00ns ± 0%  3.50ns ± 0%  -12.50%  (p=0.000 n=10+10)

Change-Id: Iba829e15d5624962fe0c699139ea783efeefabc2
Reviewed-on: https://go-review.googlesource.com/129715
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-17 20:49:04 +00:00
Iskander Sharipov 859cf7fc0f cmd/compile/internal/gc: handle array slice self-assign in esc.go
Instead of skipping all OSLICEARR, skip only ones with non-pointer
array type. For pointers to arrays, it's safe to apply the
self-assignment slicing optimizations.

Refactored the matching code into separate function for readability.

This is an extension to already existing optimization.

On its own, it does not improve any code under std, but
it opens some new optimization opportunities. One
of them is described in the referenced issue.

Updates #7921

Change-Id: I08ac660d3ef80eb15fd7933fb73cf53ded9333ad
Reviewed-on: https://go-review.googlesource.com/133375
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-17 11:14:58 +00:00
Keith Randall b1f656b1ce cmd/compile: fold address calculations into CMPload[const] ops
Makes go binary smaller by 0.2%.

I noticed this in autogenerated equal methods, and there are
probably a lot of those.

Change-Id: I4e04eb3653fbceb9dd6a4eee97ceab1fa4d10b72
Reviewed-on: https://go-review.googlesource.com/135379
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-09-14 19:42:09 +00:00
Lynn Boger 8dbd9afbb0 cmd/compile: improve rules for PPC64.rules
This adds some improvements to the rules for PPC64 to eliminate
unnecessary zero or sign extends, and fix some rule for truncates
which were not always using the correct sign instruction.

This reduces of size of many functions by 1 or 2 instructions and
can improve performance in cases where the execution time depends
on small loops where at least 1 instruction was removed and where that
loop contributes a significant amount of the total execution time.

Included is a testcase for codegen to verify the sign/zero extend
instructions are omitted.

An example of the improvement (strings):
IndexAnyASCII/256:1-16     392ns ± 0%   369ns ± 0%  -5.79%  (p=0.000 n=1+10)
IndexAnyASCII/256:2-16     397ns ± 0%   376ns ± 0%  -5.23%  (p=0.000 n=1+9)
IndexAnyASCII/256:4-16     405ns ± 0%   384ns ± 0%  -5.19%  (p=1.714 n=1+6)
IndexAnyASCII/256:8-16     427ns ± 0%   403ns ± 0%  -5.57%  (p=0.000 n=1+10)
IndexAnyASCII/256:16-16    441ns ± 0%   418ns ± 1%  -5.33%  (p=0.000 n=1+10)
IndexAnyASCII/4096:1-16   5.62µs ± 0%  5.27µs ± 1%  -6.31%  (p=0.000 n=1+10)
IndexAnyASCII/4096:2-16   5.67µs ± 0%  5.29µs ± 0%  -6.67%  (p=0.222 n=1+8)
IndexAnyASCII/4096:4-16   5.66µs ± 0%  5.28µs ± 1%  -6.66%  (p=0.000 n=1+10)
IndexAnyASCII/4096:8-16   5.66µs ± 0%  5.31µs ± 1%  -6.10%  (p=0.000 n=1+10)
IndexAnyASCII/4096:16-16  5.70µs ± 0%  5.33µs ± 1%  -6.43%  (p=0.182 n=1+10)

Change-Id: I739a6132b505936d39001aada5a978ff2a5f0500
Reviewed-on: https://go-review.googlesource.com/129875
Reviewed-by: David Chase <drchase@google.com>
2018-09-13 18:24:53 +00:00
erifan01 8149db4f64 cmd/compile: intrinsify math.RoundToEven and math.Abs on arm64
math.RoundToEven can be done by one arm64 instruction FRINTND, intrinsify it to improve performance.
The current pure Go implementation of the function Abs is translated into five instructions on arm64:
str, ldr, and, str, ldr. The intrinsic implementation requires only one instruction, so in terms of
performance, intrinsify it is worthwhile.

Benchmarks:
name           old time/op  new time/op  delta
Abs-8          3.50ns ± 0%  1.50ns ± 0%  -57.14%  (p=0.000 n=10+10)
RoundToEven-8  9.26ns ± 0%  1.50ns ± 0%  -83.80%  (p=0.000 n=10+10)

Change-Id: I9456b26ab282b544dfac0154fc86f17aed96ac3d
Reviewed-on: https://go-review.googlesource.com/116535
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-13 14:52:51 +00:00
fanzha02 d5377c2026 test: fix the wrong test of math.Copysign(c, -1) for arm64
The CL 132915 added the wrong codegen test for math.Copysign(c, -1),
it should test that AND is not emitted. This CL fixes this error.

Change-Id: Ida1d3d54ebfc7f238abccbc1f70f914e1b5bfd91
Reviewed-on: https://go-review.googlesource.com/134815
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-12 15:34:20 +00:00
Ben Shi 9f2411894b cmd/compile: optimize arm's bit operation
BFC (Bit Field Clear) was introduced in ARMv7, which can simplify
ANDconst and BICconst. And this CL implements that optimization.

1. The total size of pkg/android_arm decreases about 3KB, excluding
cmd/compile/.

2. There is no regression in the go1 benchmark result, and some
cases (FmtFprintfEmpty-4 and RegexpMatchMedium_32-4) even get
slight improvement.

name                     old time/op    new time/op    delta
BinaryTree17-4              25.3s ± 1%     25.2s ± 1%    ~     (p=0.072 n=30+29)
Fannkuch11-4                13.3s ± 0%     13.3s ± 0%  +0.13%  (p=0.000 n=30+26)
FmtFprintfEmpty-4           407ns ± 0%     394ns ± 0%  -3.19%  (p=0.000 n=26+28)
FmtFprintfString-4          664ns ± 0%     662ns ± 0%  -0.22%  (p=0.000 n=30+30)
FmtFprintfInt-4             712ns ± 0%     706ns ± 0%  -0.79%  (p=0.000 n=30+30)
FmtFprintfIntInt-4         1.06µs ± 0%    1.05µs ± 0%  -0.38%  (p=0.000 n=30+30)
FmtFprintfPrefixedInt-4    1.16µs ± 0%    1.16µs ± 0%  -0.13%  (p=0.000 n=30+29)
FmtFprintfFloat-4          2.24µs ± 0%    2.23µs ± 0%  -0.51%  (p=0.000 n=29+21)
FmtManyArgs-4              4.09µs ± 0%    4.06µs ± 0%  -0.83%  (p=0.000 n=28+30)
GobDecode-4                55.0ms ± 5%    55.4ms ± 5%    ~     (p=0.307 n=30+30)
GobEncode-4                51.2ms ± 1%    51.9ms ± 1%  +1.23%  (p=0.000 n=29+30)
Gzip-4                      2.64s ± 0%     2.60s ± 0%  -1.35%  (p=0.000 n=30+29)
Gunzip-4                    309ms ± 0%     308ms ± 0%  -0.27%  (p=0.000 n=30+30)
HTTPClientServer-4         1.03ms ± 5%    1.02ms ± 4%    ~     (p=0.117 n=30+29)
JSONEncode-4                101ms ± 2%     101ms ± 2%    ~     (p=0.338 n=29+29)
JSONDecode-4                383ms ± 2%     382ms ± 2%    ~     (p=0.751 n=26+30)
Mandelbrot200-4            18.4ms ± 0%    18.4ms ± 0%  -0.10%  (p=0.000 n=29+29)
GoParse-4                  22.6ms ± 0%    22.5ms ± 0%  -0.39%  (p=0.000 n=30+30)
RegexpMatchEasy0_32-4       761ns ± 0%     750ns ± 0%  -1.47%  (p=0.000 n=26+29)
RegexpMatchEasy0_1K-4      4.33µs ± 0%    4.34µs ± 0%  +0.27%  (p=0.000 n=25+28)
RegexpMatchEasy1_32-4       809ns ± 0%     795ns ± 0%  -1.74%  (p=0.000 n=27+25)
RegexpMatchEasy1_1K-4      5.54µs ± 0%    5.53µs ± 0%  -0.18%  (p=0.000 n=29+29)
RegexpMatchMedium_32-4     1.11µs ± 0%    1.08µs ± 0%  -2.78%  (p=0.000 n=27+29)
RegexpMatchMedium_1K-4      255µs ± 0%     255µs ± 0%  -0.02%  (p=0.029 n=30+30)
RegexpMatchHard_32-4       14.7µs ± 0%    14.7µs ± 0%  -0.28%  (p=0.000 n=30+29)
RegexpMatchHard_1K-4        439µs ± 0%     439µs ± 0%    ~     (p=0.907 n=23+27)
Revcomp-4                  41.9ms ± 1%    41.9ms ± 1%    ~     (p=0.230 n=28+30)
Template-4                  522ms ± 1%     528ms ± 1%  +1.25%  (p=0.000 n=30+30)
TimeParse-4                3.34µs ± 0%    3.35µs ± 0%  +0.23%  (p=0.000 n=30+27)
TimeFormat-4               6.06µs ± 0%    6.13µs ± 0%  +1.08%  (p=0.000 n=29+29)
[Geo mean]                  384µs          382µs       -0.37%

name                     old speed      new speed      delta
GobDecode-4              14.0MB/s ± 5%  13.9MB/s ± 5%    ~     (p=0.308 n=30+30)
GobEncode-4              15.0MB/s ± 1%  14.8MB/s ± 1%  -1.22%  (p=0.000 n=29+30)
Gzip-4                   7.36MB/s ± 0%  7.46MB/s ± 0%  +1.35%  (p=0.000 n=30+30)
Gunzip-4                 62.8MB/s ± 0%  63.0MB/s ± 0%  +0.27%  (p=0.000 n=30+30)
JSONEncode-4             19.2MB/s ± 2%  19.2MB/s ± 2%    ~     (p=0.312 n=29+29)
JSONDecode-4             5.05MB/s ± 3%  5.08MB/s ± 2%    ~     (p=0.356 n=29+30)
GoParse-4                2.56MB/s ± 0%  2.57MB/s ± 0%  +0.39%  (p=0.000 n=23+27)
RegexpMatchEasy0_32-4    42.0MB/s ± 0%  42.6MB/s ± 0%  +1.50%  (p=0.000 n=26+28)
RegexpMatchEasy0_1K-4     236MB/s ± 0%   236MB/s ± 0%  -0.27%  (p=0.000 n=25+28)
RegexpMatchEasy1_32-4    39.6MB/s ± 0%  40.2MB/s ± 0%  +1.73%  (p=0.000 n=27+27)
RegexpMatchEasy1_1K-4     185MB/s ± 0%   185MB/s ± 0%  +0.18%  (p=0.000 n=29+29)
RegexpMatchMedium_32-4    900kB/s ± 0%   920kB/s ± 0%  +2.22%  (p=0.000 n=29+29)
RegexpMatchMedium_1K-4   4.02MB/s ± 0%  4.02MB/s ± 0%  +0.07%  (p=0.004 n=30+27)
RegexpMatchHard_32-4     2.17MB/s ± 0%  2.18MB/s ± 0%  +0.46%  (p=0.000 n=30+26)
RegexpMatchHard_1K-4     2.33MB/s ± 0%  2.33MB/s ± 0%    ~     (all equal)
Revcomp-4                60.6MB/s ± 1%  60.7MB/s ± 1%    ~     (p=0.207 n=28+30)
Template-4               3.72MB/s ± 1%  3.67MB/s ± 1%  -1.23%  (p=0.000 n=30+30)
[Geo mean]               12.9MB/s       12.9MB/s       +0.29%

Change-Id: I07f497f8bb476c950dc555491d00c9066fb64a4e
Reviewed-on: https://go-review.googlesource.com/134232
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-11 14:37:51 +00:00
Iskander Sharipov b7f9c640b7 test: extend noescape bytes.Buffer test suite
Added some more cases that should be guarded against regression.

Change-Id: I9f1dda2fd0be9b6e167ef1cc018fc8cce55c066c
Reviewed-on: https://go-review.googlesource.com/134017
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-09-07 18:49:12 +00:00
erifan01 204cc14bdd cmd/compile: implement non-constant rotates using ROR on arm64
Add some rules to match the Go code like:
	y &= 63
	x << y | x >> (64-y)
or
	y &= 63
	x >> y | x << (64-y)
as a ROR instruction. Make math/bits.RotateLeft faster on arm64.

Extends CL 132435 to arm64.

Benchmarks of math/bits.RotateLeftxxN:
name            old time/op       new time/op       delta
RotateLeft-8    3.548750ns +- 1%  2.003750ns +- 0%  -43.54%  (p=0.000 n=8+8)
RotateLeft8-8   3.925000ns +- 0%  3.925000ns +- 0%     ~     (p=1.000 n=8+8)
RotateLeft16-8  3.925000ns +- 0%  3.927500ns +- 0%     ~     (p=0.608 n=8+8)
RotateLeft32-8  3.925000ns +- 0%  2.002500ns +- 0%  -48.98%  (p=0.000 n=8+8)
RotateLeft64-8  3.536250ns +- 0%  2.003750ns +- 0%  -43.34%  (p=0.000 n=8+8)

Change-Id: I77622cd7f39b917427e060647321f5513973232c
Reviewed-on: https://go-review.googlesource.com/122542
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-07 14:52:02 +00:00
Ben Shi 031a35ec84 cmd/compile: optimize 386's comparison
Optimization of "(CMPconst [0] (ANDL x y)) -> (TESTL x y)" only
get benefits if there is no further use of the result of x&y. A
condition of uses==1 will have slight improvements.

1. The code size of pkg/linux_386 decreases about 300 bytes, excluding
cmd/compile/.

2. The go1 benchmark shows no regression, and even a slight improvement
in test case FmtFprintfEmpty-4, excluding noise.

name                     old time/op    new time/op    delta
BinaryTree17-4              3.34s ± 3%     3.32s ± 2%    ~     (p=0.197 n=30+30)
Fannkuch11-4                3.48s ± 2%     3.47s ± 1%  -0.33%  (p=0.015 n=30+30)
FmtFprintfEmpty-4          46.3ns ± 4%    44.8ns ± 4%  -3.33%  (p=0.000 n=30+30)
FmtFprintfString-4         78.8ns ± 7%    77.3ns ± 5%    ~     (p=0.098 n=30+26)
FmtFprintfInt-4            90.2ns ± 1%    90.0ns ± 7%  -0.23%  (p=0.027 n=18+30)
FmtFprintfIntInt-4          144ns ± 4%     143ns ± 5%    ~     (p=0.945 n=30+29)
FmtFprintfPrefixedInt-4     180ns ± 4%     180ns ± 5%    ~     (p=0.858 n=30+30)
FmtFprintfFloat-4           409ns ± 4%     406ns ± 3%  -0.87%  (p=0.028 n=30+30)
FmtManyArgs-4               611ns ± 5%     608ns ± 4%    ~     (p=0.812 n=30+30)
GobDecode-4                7.30ms ± 5%    7.26ms ± 5%    ~     (p=0.522 n=30+29)
GobEncode-4                6.90ms ± 7%    6.82ms ± 4%    ~     (p=0.086 n=29+28)
Gzip-4                      396ms ± 4%     400ms ± 4%  +0.99%  (p=0.026 n=30+30)
Gunzip-4                   41.1ms ± 3%    41.2ms ± 3%    ~     (p=0.495 n=30+30)
HTTPClientServer-4         63.7µs ± 3%    63.3µs ± 2%    ~     (p=0.113 n=29+29)
JSONEncode-4               16.1ms ± 2%    16.1ms ± 2%  -0.30%  (p=0.041 n=30+30)
JSONDecode-4               60.9ms ± 3%    61.2ms ± 6%    ~     (p=0.187 n=30+30)
Mandelbrot200-4            5.17ms ± 2%    5.19ms ± 3%    ~     (p=0.676 n=30+30)
GoParse-4                  3.28ms ± 3%    3.25ms ± 2%  -0.97%  (p=0.002 n=30+30)
RegexpMatchEasy0_32-4       103ns ± 4%     104ns ± 4%    ~     (p=0.352 n=30+30)
RegexpMatchEasy0_1K-4       849ns ± 2%     845ns ± 2%    ~     (p=0.381 n=30+30)
RegexpMatchEasy1_32-4       113ns ± 4%     113ns ± 4%    ~     (p=0.795 n=30+30)
RegexpMatchEasy1_1K-4      1.03µs ± 3%    1.03µs ± 4%    ~     (p=0.275 n=25+30)
RegexpMatchMedium_32-4      132ns ± 3%     132ns ± 3%    ~     (p=0.970 n=30+30)
RegexpMatchMedium_1K-4     41.4µs ± 3%    41.4µs ± 3%    ~     (p=0.212 n=30+30)
RegexpMatchHard_32-4       2.22µs ± 4%    2.22µs ± 4%    ~     (p=0.399 n=30+30)
RegexpMatchHard_1K-4       67.2µs ± 3%    67.6µs ± 4%    ~     (p=0.359 n=30+30)
Revcomp-4                   1.84s ± 2%     1.83s ± 2%    ~     (p=0.532 n=30+30)
Template-4                 69.1ms ± 4%    68.8ms ± 3%    ~     (p=0.146 n=30+30)
TimeParse-4                 441ns ± 3%     442ns ± 3%    ~     (p=0.154 n=30+30)
TimeFormat-4                413ns ± 3%     414ns ± 3%    ~     (p=0.275 n=30+30)
[Geo mean]                 66.2µs         66.0µs       -0.28%

name                     old speed      new speed      delta
GobDecode-4               105MB/s ± 5%   106MB/s ± 5%    ~     (p=0.514 n=30+29)
GobEncode-4               111MB/s ± 5%   113MB/s ± 4%  +1.37%  (p=0.046 n=28+28)
Gzip-4                   49.1MB/s ± 4%  48.6MB/s ± 4%  -0.98%  (p=0.028 n=30+30)
Gunzip-4                  472MB/s ± 4%   472MB/s ± 3%    ~     (p=0.496 n=30+30)
JSONEncode-4              120MB/s ± 2%   121MB/s ± 2%  +0.29%  (p=0.042 n=30+30)
JSONDecode-4             31.9MB/s ± 3%  31.7MB/s ± 6%    ~     (p=0.186 n=30+30)
GoParse-4                17.6MB/s ± 3%  17.8MB/s ± 2%  +0.98%  (p=0.002 n=30+30)
RegexpMatchEasy0_32-4     309MB/s ± 4%   307MB/s ± 4%    ~     (p=0.501 n=30+30)
RegexpMatchEasy0_1K-4    1.21GB/s ± 2%  1.21GB/s ± 2%    ~     (p=0.301 n=30+30)
RegexpMatchEasy1_32-4     283MB/s ± 4%   282MB/s ± 3%    ~     (p=0.877 n=30+30)
RegexpMatchEasy1_1K-4    1.00GB/s ± 3%  0.99GB/s ± 4%    ~     (p=0.276 n=25+30)
RegexpMatchMedium_32-4   7.54MB/s ± 3%  7.55MB/s ± 3%    ~     (p=0.528 n=30+30)
RegexpMatchMedium_1K-4   24.7MB/s ± 3%  24.7MB/s ± 3%    ~     (p=0.203 n=30+30)
RegexpMatchHard_32-4     14.4MB/s ± 4%  14.4MB/s ± 4%    ~     (p=0.407 n=30+30)
RegexpMatchHard_1K-4     15.3MB/s ± 3%  15.1MB/s ± 4%    ~     (p=0.306 n=30+30)
Revcomp-4                 138MB/s ± 2%   139MB/s ± 2%    ~     (p=0.520 n=30+30)
Template-4               28.1MB/s ± 4%  28.2MB/s ± 3%    ~     (p=0.149 n=30+30)
[Geo mean]               81.5MB/s       81.5MB/s       +0.06%

Change-Id: I7f75425f79eec93cdd8fdd94db13ad4f61b6a2f5
Reviewed-on: https://go-review.googlesource.com/133657
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-09-07 01:20:45 +00:00
fanzha02 2e5c32518c cmd/compile: optimize math.Copysign on arm64
Add rewrite rules to optimize math.Copysign() when the second
argument is negative floating point constant.

For example, math.Copysign(c, -2): The previous compile output is
"AND $9223372036854775807, R0, R0; ORR $-9223372036854775808, R0, R0".
The optimized compile output is "ORR $-9223372036854775808, R0, R0"

Math package benchmark results.
name                   old time/op  new time/op  delta
Copysign-8             2.61ns ± 2%  2.49ns ± 0%  -4.55%  (p=0.000 n=10+10)
Cos-8                  43.0ns ± 0%  41.5ns ± 0%  -3.49%  (p=0.000 n=10+10)
Cosh-8                 98.6ns ± 0%  98.1ns ± 0%  -0.51%  (p=0.000 n=10+10)
ExpGo-8                 107ns ± 0%   105ns ± 0%  -1.87%  (p=0.000 n=10+10)
Exp2Go-8                100ns ± 0%   100ns ± 0%  +0.39%  (p=0.000 n=10+8)
Max-8                  6.56ns ± 2%  6.45ns ± 1%  -1.63%  (p=0.002 n=10+10)
Min-8                  6.66ns ± 3%  6.47ns ± 2%  -2.82%  (p=0.006 n=10+10)
Mod-8                   107ns ± 1%   104ns ± 1%  -2.72%  (p=0.000 n=10+10)
Frexp-8                11.5ns ± 1%  11.0ns ± 0%  -4.56%  (p=0.000 n=8+10)
HypotGo-8              19.4ns ± 0%  19.4ns ± 0%  +0.36%  (p=0.019 n=10+10)
Ilogb-8                8.63ns ± 0%  8.51ns ± 0%  -1.36%  (p=0.000 n=10+10)
Jn-8                    584ns ± 0%   585ns ± 0%  +0.17%  (p=0.000 n=7+8)
Ldexp-8                13.8ns ± 0%  13.5ns ± 0%  -2.17%  (p=0.002 n=8+10)
Logb-8                 10.2ns ± 0%   9.9ns ± 0%  -2.65%  (p=0.000 n=10+7)
Nextafter64-8          7.54ns ± 0%  7.51ns ± 0%  -0.37%  (p=0.000 n=10+10)
Remainder-8            73.5ns ± 1%  70.4ns ± 1%  -4.27%  (p=0.000 n=10+10)
SqrtGoLatency-8        79.6ns ± 0%  76.2ns ± 0%  -4.30%  (p=0.000 n=9+10)
Yn-8                    582ns ± 0%   579ns ± 0%  -0.52%  (p=0.000 n=10+10)

Change-Id: I0c9cd1ea87435e7b8bab94b4e79e6e29785f25b1
Reviewed-on: https://go-review.googlesource.com/132915
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-06 19:57:25 +00:00
Iskander Sharipov 9c2be4c22d bytes: remove bootstrap array from Buffer
Rationale: small buffer optimization does not work and it has
made things slower since 2014. Until we can make it work,
we should prefer simpler code that also turns out to be more
efficient.

With this change, it's possible to use
NewBuffer(make([]byte, 0, bootstrapSize)) to get the desired
stack-allocated initial buffer since escape analysis can
prove the created slice to be non-escaping.

New implementation key points:

    - Zero value bytes.Buffer performs better than before
    - You can have a truly stack-allocated buffer, and it's not even limited to 64 bytes
    - The unsafe.Sizeof(bytes.Buffer{}) is reduced significantly
    - Empty writes don't cause allocations

Buffer benchmarks from bytes package:

    name                       old time/op    new time/op    delta
    ReadString-8                 9.20µs ± 1%    9.22µs ± 1%     ~     (p=0.148 n=10+10)
    WriteByte-8                  28.1µs ± 0%    26.2µs ± 0%   -6.78%  (p=0.000 n=10+10)
    WriteRune-8                  64.9µs ± 0%    65.0µs ± 0%   +0.16%  (p=0.000 n=10+10)
    BufferNotEmptyWriteRead-8     469µs ± 0%     461µs ± 0%   -1.76%  (p=0.000 n=9+10)
    BufferFullSmallReads-8        108µs ± 0%     108µs ± 0%   -0.21%  (p=0.000 n=10+10)

    name                       old speed      new speed      delta
    ReadString-8               3.56GB/s ± 1%  3.55GB/s ± 1%     ~     (p=0.165 n=10+10)
    WriteByte-8                 146MB/s ± 0%   156MB/s ± 0%   +7.26%  (p=0.000 n=9+10)
    WriteRune-8                 189MB/s ± 0%   189MB/s ± 0%   -0.16%  (p=0.000 n=10+10)

    name                       old alloc/op   new alloc/op   delta
    ReadString-8                 32.8kB ± 0%    32.8kB ± 0%     ~     (all equal)
    WriteByte-8                   0.00B          0.00B          ~     (all equal)
    WriteRune-8                   0.00B          0.00B          ~     (all equal)
    BufferNotEmptyWriteRead-8    4.72kB ± 0%    4.67kB ± 0%   -1.02%  (p=0.000 n=10+10)
    BufferFullSmallReads-8       3.44kB ± 0%    3.33kB ± 0%   -3.26%  (p=0.000 n=10+10)

    name                       old allocs/op  new allocs/op  delta
    ReadString-8                   1.00 ± 0%      1.00 ± 0%     ~     (all equal)
    WriteByte-8                    0.00           0.00          ~     (all equal)
    WriteRune-8                    0.00           0.00          ~     (all equal)
    BufferNotEmptyWriteRead-8      3.00 ± 0%      3.00 ± 0%     ~     (all equal)
    BufferFullSmallReads-8         3.00 ± 0%      2.00 ± 0%  -33.33%  (p=0.000 n=10+10)

The most notable thing in go1 benchmarks is reduced allocs in HTTPClientServer (-1 alloc):

    HTTPClientServer-8           64.0 ± 0%      63.0 ± 0%  -1.56%  (p=0.000 n=10+10)

For more explanations and benchmarks see the referenced issue.

Updates #7921

Change-Id: Ica0bf85e1b70fb4f5dc4f6a61045e2cf4ef72aa3
Reviewed-on: https://go-review.googlesource.com/133715
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-09-06 19:33:18 +00:00
taylorza 4a095b87d3 cmd/compile: don't crash reporting misuse of shadowed built-in function
The existing implementation causes a compiler panic if a function parameter shadows a built-in function, and then calling that shadowed name.

Fixes #27356
Change-Id: I1ffb6dc01e63c7f499e5f6f75f77ce2318f35bcd
Reviewed-on: https://go-review.googlesource.com/132876
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-06 02:49:21 +00:00
Ben Shi 067dfce21f cmd/compile: optimize ARM's comparision
Optimize (CMPconst [0] (ADD x y)) to (CMN x y) will only get benefits
when the result of the addition is no longer used, otherwise there
might be even performance drop. And this CL fixes that issue for
CMP/CMN/TST/TEQ.

There is little regression in the go1 benchmark (excluding noise),
and the test case JSONDecode-4 even gets improvement.

name                     old time/op    new time/op    delta
BinaryTree17-4              21.6s ± 1%     21.6s ± 0%  -0.22%  (p=0.013 n=30+30)
Fannkuch11-4                11.1s ± 0%     11.1s ± 0%  +0.11%  (p=0.000 n=30+29)
FmtFprintfEmpty-4           297ns ± 0%     297ns ± 0%  +0.08%  (p=0.007 n=26+28)
FmtFprintfString-4          589ns ± 1%     589ns ± 0%    ~     (p=0.659 n=30+25)
FmtFprintfInt-4             644ns ± 1%     650ns ± 0%  +0.88%  (p=0.000 n=30+24)
FmtFprintfIntInt-4          964ns ± 0%     977ns ± 0%  +1.33%  (p=0.000 n=30+30)
FmtFprintfPrefixedInt-4    1.06µs ± 0%    1.07µs ± 0%  +1.31%  (p=0.000 n=29+27)
FmtFprintfFloat-4          1.89µs ± 0%    1.92µs ± 0%  +1.25%  (p=0.000 n=29+29)
FmtManyArgs-4              3.63µs ± 0%    3.67µs ± 0%  +1.33%  (p=0.000 n=29+27)
GobDecode-4                38.1ms ± 1%    37.9ms ± 1%  -0.60%  (p=0.000 n=29+29)
GobEncode-4                35.3ms ± 2%    35.2ms ± 1%    ~     (p=0.286 n=30+30)
Gzip-4                      2.36s ± 0%     2.37s ± 2%    ~     (p=0.277 n=24+28)
Gunzip-4                    264ms ± 1%     264ms ± 1%    ~     (p=0.104 n=28+30)
HTTPClientServer-4         1.04ms ± 4%    1.02ms ± 4%  -1.65%  (p=0.000 n=28+28)
JSONEncode-4               78.5ms ± 1%    79.6ms ± 1%  +1.34%  (p=0.000 n=27+28)
JSONDecode-4                379ms ± 4%     352ms ± 5%  -7.09%  (p=0.000 n=29+30)
Mandelbrot200-4            17.6ms ± 0%    17.6ms ± 0%    ~     (p=0.206 n=28+29)
GoParse-4                  21.9ms ± 1%    22.1ms ± 1%  +0.87%  (p=0.000 n=28+26)
RegexpMatchEasy0_32-4       631ns ± 0%     641ns ± 0%  +1.63%  (p=0.000 n=29+30)
RegexpMatchEasy0_1K-4      4.11µs ± 0%    4.11µs ± 0%    ~     (p=0.700 n=30+30)
RegexpMatchEasy1_32-4       670ns ± 0%     679ns ± 0%  +1.37%  (p=0.000 n=21+30)
RegexpMatchEasy1_1K-4      5.31µs ± 0%    5.26µs ± 0%  -1.03%  (p=0.000 n=25+28)
RegexpMatchMedium_32-4      905ns ± 0%     906ns ± 0%  +0.14%  (p=0.001 n=30+30)
RegexpMatchMedium_1K-4      192µs ± 0%     191µs ± 0%  -0.45%  (p=0.000 n=29+27)
RegexpMatchHard_32-4       11.8µs ± 0%    11.7µs ± 0%  -0.39%  (p=0.000 n=29+28)
RegexpMatchHard_1K-4        347µs ± 0%     347µs ± 0%    ~     (p=0.084 n=29+30)
Revcomp-4                  37.5ms ± 1%    37.5ms ± 1%    ~     (p=0.279 n=29+29)
Template-4                  519ms ± 2%     519ms ± 2%    ~     (p=0.652 n=28+29)
TimeParse-4                2.83µs ± 0%    2.78µs ± 0%  -1.90%  (p=0.000 n=27+28)
TimeFormat-4               5.79µs ± 0%    5.60µs ± 0%  -3.23%  (p=0.000 n=29+29)
[Geo mean]                  331µs          330µs       -0.16%

name                     old speed      new speed      delta
GobDecode-4              20.1MB/s ± 1%  20.3MB/s ± 1%  +0.61%  (p=0.000 n=29+29)
GobEncode-4              21.7MB/s ± 2%  21.8MB/s ± 1%    ~     (p=0.294 n=30+30)
Gzip-4                   8.23MB/s ± 1%  8.20MB/s ± 2%    ~     (p=0.099 n=26+28)
Gunzip-4                 73.5MB/s ± 1%  73.4MB/s ± 1%    ~     (p=0.107 n=28+30)
JSONEncode-4             24.7MB/s ± 1%  24.4MB/s ± 1%  -1.32%  (p=0.000 n=27+28)
JSONDecode-4             5.13MB/s ± 4%  5.52MB/s ± 5%  +7.65%  (p=0.000 n=29+30)
GoParse-4                2.65MB/s ± 1%  2.63MB/s ± 1%  -0.87%  (p=0.000 n=28+26)
RegexpMatchEasy0_32-4    50.7MB/s ± 0%  49.9MB/s ± 0%  -1.58%  (p=0.000 n=29+29)
RegexpMatchEasy0_1K-4     249MB/s ± 0%   249MB/s ± 0%    ~     (p=0.342 n=30+28)
RegexpMatchEasy1_32-4    47.7MB/s ± 0%  47.1MB/s ± 0%  -1.39%  (p=0.000 n=26+30)
RegexpMatchEasy1_1K-4     193MB/s ± 0%   195MB/s ± 0%  +1.04%  (p=0.000 n=25+28)
RegexpMatchMedium_32-4   1.10MB/s ± 0%  1.10MB/s ± 0%  -0.42%  (p=0.000 n=30+26)
RegexpMatchMedium_1K-4   5.33MB/s ± 0%  5.36MB/s ± 0%  +0.43%  (p=0.000 n=29+29)
RegexpMatchHard_32-4     2.72MB/s ± 0%  2.73MB/s ± 0%  +0.37%  (p=0.000 n=29+30)
RegexpMatchHard_1K-4     2.95MB/s ± 0%  2.95MB/s ± 0%    ~     (all equal)
Revcomp-4                67.8MB/s ± 1%  67.7MB/s ± 1%    ~     (p=0.273 n=29+29)
Template-4               3.74MB/s ± 2%  3.74MB/s ± 2%    ~     (p=0.665 n=28+29)
[Geo mean]               15.2MB/s       15.2MB/s       +0.21%

Change-Id: Ifed1fb8cc02d5ca52c8bc6c21b6b5bf6dbb2701a
Reviewed-on: https://go-review.googlesource.com/132115
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05 14:54:08 +00:00
Iskander Sharipov 4cf33e361a cmd/compile/internal/gc: fix mayAffectMemory in esc.go
For OINDEX and other Left+Right nodes, we want the whole
node to be considered as "may affect memory" if either
of Left or Right affect memory. Initial implementation
only considered node as such if both Left and Right were non-safe.

Change-Id: Icfb965a0b4c24d8f83f3722216db068dad2eba95
Reviewed-on: https://go-review.googlesource.com/133275
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-09-05 14:16:25 +00:00
Iskander Sharipov bcf3e063cc test: remove go:noinline from escape_because.go
File is compiled with "-l" flag, so go:noinline is redundant.

Change-Id: Ia269f3b9de9466857fc578ba5164613393e82369
Reviewed-on: https://go-review.googlesource.com/133295
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05 10:45:58 +00:00
Tobias Klauser d7fc2205d4 test: fix nilptr3 check for wasm
CL 131735 only updated nilptr3.go for the adjusted nil check. Adjust
nilptr3_wasm.go as well.

Change-Id: I4a6257d32bb212666fe768dac53901ea0b051138
Reviewed-on: https://go-review.googlesource.com/133495
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-09-05 09:57:32 +00:00
Michael Munday f94de9c9fb cmd/compile: make math/bits.RotateLeft{32,64} intrinsics on s390x
Extends CL 132435 to s390x. s390x has 32- and 64-bit variable
rotate left instructions.

Change-Id: Ic4f1ebb0e0543207ed2fc8c119e0163b428138a5
Reviewed-on: https://go-review.googlesource.com/133035
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05 08:29:02 +00:00
Ben Shi 0e9f1de0b7 cmd/compile: optimize arm64's comparison
Add more optimization with TST/CMN.

1. A tiny benchmark shows more than 12% improvement.
TSTCMN-4                    378µs ± 0%     332µs ± 0%  -12.15%  (p=0.000 n=30+27)
(https://github.com/benshi001/ugo1/blob/master/tstcmn_test.go)

2. There is little regression in the go1 benchmark, excluding noise.

name                     old time/op    new time/op    delta
BinaryTree17-4              19.1s ± 0%     19.1s ± 0%    ~     (p=0.994 n=28+29)
Fannkuch11-4                10.0s ± 0%     10.0s ± 0%    ~     (p=0.198 n=30+25)
FmtFprintfEmpty-4           233ns ± 0%     233ns ± 0%  +0.14%  (p=0.002 n=24+30)
FmtFprintfString-4          428ns ± 0%     428ns ± 0%    ~     (all equal)
FmtFprintfInt-4             472ns ± 0%     472ns ± 0%    ~     (all equal)
FmtFprintfIntInt-4          725ns ± 0%     725ns ± 0%    ~     (all equal)
FmtFprintfPrefixedInt-4     889ns ± 0%     888ns ± 0%    ~     (p=0.632 n=28+30)
FmtFprintfFloat-4          1.20µs ± 0%    1.20µs ± 0%  +0.05%  (p=0.001 n=18+30)
FmtManyArgs-4              3.00µs ± 0%    2.99µs ± 0%  -0.07%  (p=0.001 n=27+30)
GobDecode-4                42.1ms ± 0%    42.2ms ± 0%  +0.29%  (p=0.000 n=28+28)
GobEncode-4                38.6ms ± 9%    38.8ms ± 9%    ~     (p=0.912 n=30+30)
Gzip-4                      2.07s ± 1%     2.05s ± 1%  -0.64%  (p=0.000 n=29+30)
Gunzip-4                    175ms ± 0%     175ms ± 0%  -0.15%  (p=0.001 n=30+30)
HTTPClientServer-4          872µs ± 5%     880µs ± 6%    ~     (p=0.196 n=30+29)
JSONEncode-4               88.5ms ± 1%    89.8ms ± 1%  +1.49%  (p=0.000 n=23+24)
JSONDecode-4                393ms ± 1%     390ms ± 1%  -0.89%  (p=0.000 n=28+30)
Mandelbrot200-4            19.5ms ± 0%    19.5ms ± 0%    ~     (p=0.405 n=29+28)
GoParse-4                  19.9ms ± 0%    20.0ms ± 0%  +0.27%  (p=0.000 n=30+30)
RegexpMatchEasy0_32-4       431ns ± 0%     431ns ± 0%    ~     (p=1.000 n=30+30)
RegexpMatchEasy0_1K-4      1.61µs ± 0%    1.61µs ± 0%    ~     (p=0.527 n=26+26)
RegexpMatchEasy1_32-4       443ns ± 0%     443ns ± 0%    ~     (all equal)
RegexpMatchEasy1_1K-4      2.58µs ± 1%    2.58µs ± 1%    ~     (p=0.578 n=27+25)
RegexpMatchMedium_32-4      740ns ± 0%     740ns ± 0%    ~     (p=0.357 n=30+30)
RegexpMatchMedium_1K-4      223µs ± 0%     223µs ± 0%  +0.16%  (p=0.000 n=30+29)
RegexpMatchHard_32-4       12.3µs ± 0%    12.3µs ± 0%    ~     (p=0.236 n=27+27)
RegexpMatchHard_1K-4        371µs ± 0%     371µs ± 0%  +0.09%  (p=0.000 n=30+27)
Revcomp-4                   2.85s ± 0%     2.85s ± 0%    ~     (p=0.057 n=28+25)
Template-4                  408ms ± 1%     409ms ± 1%    ~     (p=0.117 n=29+29)
TimeParse-4                1.93µs ± 0%    1.93µs ± 0%    ~     (p=0.535 n=29+28)
TimeFormat-4               1.99µs ± 0%    1.99µs ± 0%    ~     (p=0.168 n=29+28)
[Geo mean]                  306µs          307µs       +0.07%

name                     old speed      new speed      delta
GobDecode-4              18.3MB/s ± 0%  18.2MB/s ± 0%  -0.31%  (p=0.000 n=28+29)
GobEncode-4              19.9MB/s ± 8%  19.8MB/s ± 9%    ~     (p=0.923 n=30+30)
Gzip-4                   9.39MB/s ± 1%  9.45MB/s ± 1%  +0.65%  (p=0.000 n=29+30)
Gunzip-4                  111MB/s ± 0%   111MB/s ± 0%  +0.15%  (p=0.001 n=30+30)
JSONEncode-4             21.9MB/s ± 1%  21.6MB/s ± 1%  -1.45%  (p=0.000 n=23+23)
JSONDecode-4             4.94MB/s ± 1%  4.98MB/s ± 1%  +0.84%  (p=0.000 n=27+30)
GoParse-4                2.91MB/s ± 0%  2.90MB/s ± 0%  -0.34%  (p=0.000 n=21+22)
RegexpMatchEasy0_32-4    74.1MB/s ± 0%  74.1MB/s ± 0%    ~     (p=0.469 n=29+28)
RegexpMatchEasy0_1K-4     634MB/s ± 0%   634MB/s ± 0%    ~     (p=0.978 n=24+28)
RegexpMatchEasy1_32-4    72.2MB/s ± 0%  72.2MB/s ± 0%    ~     (p=0.064 n=27+29)
RegexpMatchEasy1_1K-4     396MB/s ± 1%   396MB/s ± 1%    ~     (p=0.583 n=27+25)
RegexpMatchMedium_32-4   1.35MB/s ± 0%  1.35MB/s ± 0%    ~     (all equal)
RegexpMatchMedium_1K-4   4.60MB/s ± 0%  4.59MB/s ± 0%  -0.14%  (p=0.000 n=30+26)
RegexpMatchHard_32-4     2.61MB/s ± 0%  2.61MB/s ± 0%    ~     (all equal)
RegexpMatchHard_1K-4     2.76MB/s ± 0%  2.76MB/s ± 0%    ~     (all equal)
Revcomp-4                89.1MB/s ± 0%  89.1MB/s ± 0%    ~     (p=0.059 n=28+25)
Template-4               4.75MB/s ± 1%  4.75MB/s ± 1%    ~     (p=0.106 n=29+29)
[Geo mean]               18.3MB/s       18.3MB/s       -0.07%

Change-Id: I3cd76ce63e84b0c3cebabf9fa3573b76a7343899
Reviewed-on: https://go-review.googlesource.com/124935
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05 02:51:28 +00:00
Ben Shi b444215116 cmd/compile: optimize ARM64's code with MADD/MSUB
MADD does MUL-ADD in a single instruction, and MSUB does the
similiar simplification for MUL-SUB.

The CL implements the optimization with MADD/MSUB.

1. The total size of pkg/android_arm64/ decreases about 20KB,
excluding cmd/compile/.

2. The go1 benchmark shows a little improvement for RegexpMatchHard_32-4
and Template-4, excluding noise.

name                     old time/op    new time/op    delta
BinaryTree17-4              16.3s ± 1%     16.5s ± 1%  +1.41%  (p=0.000 n=26+28)
Fannkuch11-4                8.79s ± 1%     8.76s ± 0%  -0.36%  (p=0.000 n=26+28)
FmtFprintfEmpty-4           172ns ± 0%     172ns ± 0%    ~     (all equal)
FmtFprintfString-4          362ns ± 1%     364ns ± 0%  +0.55%  (p=0.000 n=30+30)
FmtFprintfInt-4             416ns ± 0%     416ns ± 0%    ~     (p=0.099 n=22+30)
FmtFprintfIntInt-4          655ns ± 1%     660ns ± 1%  +0.76%  (p=0.000 n=30+30)
FmtFprintfPrefixedInt-4     810ns ± 0%     809ns ± 0%  -0.08%  (p=0.009 n=29+29)
FmtFprintfFloat-4          1.08µs ± 0%    1.09µs ± 0%  +0.61%  (p=0.000 n=30+29)
FmtManyArgs-4              2.70µs ± 0%    2.69µs ± 0%  -0.23%  (p=0.000 n=29+28)
GobDecode-4                32.2ms ± 1%    32.1ms ± 1%  -0.39%  (p=0.000 n=27+26)
GobEncode-4                27.4ms ± 2%    27.4ms ± 1%    ~     (p=0.864 n=28+28)
Gzip-4                      1.53s ± 1%     1.52s ± 1%  -0.30%  (p=0.031 n=29+29)
Gunzip-4                    146ms ± 0%     146ms ± 0%  -0.14%  (p=0.001 n=25+30)
HTTPClientServer-4         1.00ms ± 4%    0.98ms ± 6%  -1.65%  (p=0.001 n=29+30)
JSONEncode-4               67.3ms ± 1%    67.2ms ± 1%    ~     (p=0.520 n=28+28)
JSONDecode-4                329ms ± 5%     330ms ± 4%    ~     (p=0.142 n=30+30)
Mandelbrot200-4            17.3ms ± 0%    17.3ms ± 0%    ~     (p=0.055 n=26+29)
GoParse-4                  16.9ms ± 1%    17.0ms ± 1%  +0.82%  (p=0.000 n=30+30)
RegexpMatchEasy0_32-4       382ns ± 0%     382ns ± 0%    ~     (all equal)
RegexpMatchEasy0_1K-4      1.33µs ± 0%    1.33µs ± 0%  -0.25%  (p=0.000 n=30+27)
RegexpMatchEasy1_32-4       361ns ± 0%     361ns ± 0%  -0.08%  (p=0.002 n=30+28)
RegexpMatchEasy1_1K-4      2.11µs ± 0%    2.09µs ± 0%  -0.54%  (p=0.000 n=30+29)
RegexpMatchMedium_32-4      594ns ± 0%     592ns ± 0%  -0.32%  (p=0.000 n=30+30)
RegexpMatchMedium_1K-4      173µs ± 0%     172µs ± 0%  -0.77%  (p=0.000 n=29+27)
RegexpMatchHard_32-4       10.4µs ± 0%    10.1µs ± 0%  -3.63%  (p=0.000 n=28+27)
RegexpMatchHard_1K-4        306µs ± 0%     301µs ± 0%  -1.64%  (p=0.000 n=29+30)
Revcomp-4                   2.51s ± 1%     2.52s ± 0%  +0.18%  (p=0.017 n=26+27)
Template-4                  394ms ± 3%     382ms ± 3%  -3.22%  (p=0.000 n=28+28)
TimeParse-4                1.67µs ± 0%    1.67µs ± 0%  +0.05%  (p=0.030 n=27+30)
TimeFormat-4               1.72µs ± 0%    1.70µs ± 0%  -0.79%  (p=0.000 n=28+26)
[Geo mean]                  259µs          259µs       -0.33%

name                     old speed      new speed      delta
GobDecode-4              23.8MB/s ± 1%  23.9MB/s ± 1%  +0.40%  (p=0.001 n=27+26)
GobEncode-4              28.0MB/s ± 2%  28.0MB/s ± 1%    ~     (p=0.863 n=28+28)
Gzip-4                   12.7MB/s ± 1%  12.7MB/s ± 1%  +0.32%  (p=0.026 n=29+29)
Gunzip-4                  133MB/s ± 0%   133MB/s ± 0%  +0.15%  (p=0.001 n=24+30)
JSONEncode-4             28.8MB/s ± 1%  28.9MB/s ± 1%    ~     (p=0.475 n=28+28)
JSONDecode-4             5.89MB/s ± 4%  5.87MB/s ± 5%    ~     (p=0.174 n=29+30)
GoParse-4                3.43MB/s ± 0%  3.40MB/s ± 1%  -0.83%  (p=0.000 n=28+30)
RegexpMatchEasy0_32-4    83.6MB/s ± 0%  83.6MB/s ± 0%    ~     (p=0.848 n=28+29)
RegexpMatchEasy0_1K-4     768MB/s ± 0%   770MB/s ± 0%  +0.25%  (p=0.000 n=30+27)
RegexpMatchEasy1_32-4    88.5MB/s ± 0%  88.5MB/s ± 0%    ~     (p=0.086 n=29+29)
RegexpMatchEasy1_1K-4     486MB/s ± 0%   489MB/s ± 0%  +0.54%  (p=0.000 n=30+29)
RegexpMatchMedium_32-4   1.68MB/s ± 0%  1.69MB/s ± 0%  +0.60%  (p=0.000 n=30+23)
RegexpMatchMedium_1K-4   5.90MB/s ± 0%  5.95MB/s ± 0%  +0.85%  (p=0.000 n=18+20)
RegexpMatchHard_32-4     3.07MB/s ± 0%  3.18MB/s ± 0%  +3.72%  (p=0.000 n=29+26)
RegexpMatchHard_1K-4     3.35MB/s ± 0%  3.40MB/s ± 0%  +1.69%  (p=0.000 n=30+30)
Revcomp-4                 101MB/s ± 0%   101MB/s ± 0%  -0.18%  (p=0.018 n=26+27)
Template-4               4.92MB/s ± 4%  5.09MB/s ± 3%  +3.31%  (p=0.000 n=28+28)
[Geo mean]               22.4MB/s       22.6MB/s       +0.62%

Change-Id: I8f304b272785739f57b3c8f736316f658f8c1b2a
Reviewed-on: https://go-review.googlesource.com/129119
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-04 20:41:58 +00:00
Matthew Dempsky f7a633aa79 cmd/compile: use "N variables but M values" error for OAS
Makes the error message more consistent between OAS and OAS2.

Fixes #26616.

Change-Id: I07ab46c5ef8a37efb2cb557632697f5d1bf789f7
Reviewed-on: https://go-review.googlesource.com/131280
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-09-04 19:05:56 +00:00
Alexey Naidonov 669fa8f36a cmd/compile: remove unnecessary nil-check
Removes unnecessary nil-check when referencing offset from an
address. Suggested by Keith Randall in golang/go#27180.

Updates golang/go#27180

Change-Id: I326ed7fda7cfa98b7e4354c811900707fee26021
Reviewed-on: https://go-review.googlesource.com/131735
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-04 17:44:14 +00:00
Iskander Sharipov 67ac554d79 cmd/compile/internal/gc: fix invalid positions for sink nodes in esc.go
Make OAS2 and OAS2FUNC sink locations point to the assignment position,
not the nth LHS position.

Fixes #26987

Change-Id: Ibeb9df2da754da8b6638fe1e49e813f37515c13c
Reviewed-on: https://go-review.googlesource.com/129315
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-03 16:46:59 +00:00
Michael Munday 6f9b94ab66 cmd/compile: implement OnesCount{8,16,32,64} intrinsics on s390x
This CL implements the math/bits.OnesCount{8,16,32,64} functions
as intrinsics on s390x using the 'population count' (popcnt)
instruction. This instruction was released as the 'population-count'
facility which uses the same facility bit (45) as the
'distinct-operands' facility which is a pre-requisite for Go on
s390x. We can therefore use it without a feature check.

The s390x popcnt instruction treats a 64 bit register as a vector
of 8 bytes, summing the number of ones in each byte individually.
It then writes the results to the corresponding bytes in the
output register. Therefore to implement OnesCount{16,32,64} we
need to sum the individual byte counts using some extra
instructions. To do this efficiently I've added some additional
pseudo operations to the s390x SSA backend.

Unlike other architectures the new instruction sequence is faster
for OnesCount8, so that is implemented using the intrinsic.

name         old time/op  new time/op  delta
OnesCount    3.21ns ± 1%  1.35ns ± 0%  -58.00%  (p=0.000 n=20+20)
OnesCount8   0.91ns ± 1%  0.81ns ± 0%  -11.43%  (p=0.000 n=20+20)
OnesCount16  1.51ns ± 3%  1.21ns ± 0%  -19.71%  (p=0.000 n=20+17)
OnesCount32  1.91ns ± 0%  1.12ns ± 1%  -41.60%  (p=0.000 n=19+20)
OnesCount64  3.18ns ± 4%  1.35ns ± 0%  -57.52%  (p=0.000 n=20+20)

Change-Id: Id54f0bd28b6db9a887ad12c0d72fcc168ef9c4e0
Reviewed-on: https://go-review.googlesource.com/114675
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-03 14:35:38 +00:00
Iskander Sharipov ff468a43be cmd/compile/internal/gc: better handling of self-assignments in esc.go
Teach escape analysis to recognize these assignment patterns
as not causing the src to leak:

	val.x = val.y
	val.x[i] = val.y[j]
	val.x1.x2 = val.x1.y2
	... etc

Helps to avoid "leaking param" with assignments showed above.
The implementation is based on somewhat similiar xs=xs[a:b]
special case that is ignored by the escape analysis.

We may figure out more generalized version of this,
but this one looks like a safe step into that direction.

Updates #14858

Change-Id: I6fe5bfedec9c03bdc1d7624883324a523bd11fde
Reviewed-on: https://go-review.googlesource.com/126395
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-09-03 14:28:51 +00:00
Giovanni Bajo dd5e9b32ff cmd/compile: add testcase for #24876
This is still not fixed, the testcase reflects that there are still
a few boundchecks. Let's fix the good alternative with an explicit
test though.

Updates #24876

Change-Id: I4da35eb353e19052bd7b69ea6190a69ced8b9b3d
Reviewed-on: https://go-review.googlesource.com/107355
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-02 10:34:51 +00:00
Giovanni Bajo f02cc88f46 test: relax whitespaces matching in codegen tests
The codegen testsuite uses regexp to parse the syntax, but it doesn't
have a way to tell line comments containing checks from line comments
containing English sentences. This means that any syntax error (that
is, non-matching regexp) is currently ignored and not reported.

There were some tests in memcombine.go that had an extraneous space
and were thus effectively disabled. It would be great if we could
report it as a syntax error, but for now we just punt and swallow the
spaces as a workaround, to avoid the same mistake again.

Fixes #25452

Change-Id: Ic7747a2278bc00adffd0c199ce40937acbbc9cf0
Reviewed-on: https://go-review.googlesource.com/113835
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-09-02 10:31:37 +00:00
Giovanni Bajo 09ea3c08e8 cmd/compile: in prove, fix fence-post implications for unsigned domain
Fence-post implications of the form "x-1 >= w && x > min ⇒ x > w"
were not correctly handling unsigned domain, by always checking signed
limits.

This bug was uncovered once we taught prove that len(x) is always
>= 0 in the signed domain.

In the code being miscompiled (s[len(s)-1]), prove checks
whether len(s)-1 >= len(s) in the unsigned domain; if it proves
that this is always false, it can remove the bound check.

Notice that len(s)-1 >= len(s) can be true for len(s) = 0 because
of the wrap-around, so this is something prove should not be
able to deduce.

But because of the bug, the gate condition for the fence-post
implication was len(s) > MinInt64 instead of len(s) > 0; that
condition would be good in the signed domain but not in the
unsigned domain. And since in CL105635 we taught prove that
len(s) >= 0, the condition incorrectly triggered
(len(s) >= 0 > MinInt64) and things were going downfall.

Fixes #27251
Fixes #27289

Change-Id: I3dbcb1955ac5a66a0dcbee500f41e8d219409be5
Reviewed-on: https://go-review.googlesource.com/132495
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-31 08:54:38 +00:00
Andrew Bonventre 5ac2476748 cmd/compile: make math/bits.RotateLeft* an intrinsic on amd64
Previously, pattern matching was good enough to achieve good performance
for the RotateLeft* functions, but the inlining cost for them was much
too high. Make RotateLeft* intrinsic on amd64 as a stop-gap for now to
reduce inlining costs.

This should be done (or at least looked at) for other architectures
as well.

Updates golang/go#17566

Change-Id: I6a106ff00b6c4e3f490650af3e083ed2be00c819
Reviewed-on: https://go-review.googlesource.com/132435
Run-TryBot: Andrew Bonventre <andybons@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-08-30 22:48:28 +00:00
Cherry Zhang 289dce24c1 test: fix nosplit test on 386
The 120->124 change in https://go-review.googlesource.com/c/go/+/61511/21/test/nosplit.go#143
looks accidental. Change back to 120.

Change-Id: I1690a8ae2d32756ba05544d2ed1baabfa64e1704
Reviewed-on: https://go-review.googlesource.com/131958
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-30 12:06:34 +00:00
Cherry Zhang 54f9c0416a cmd/compile: count nil check as use in dead auto elim
Nil check is special in that it has no use but we must keep it.
Count it as a use of the auto.

Fixes #27278.

Change-Id: I857c3d0db2ebdca1bc342b4993c0dac5c01e067f
Reviewed-on: https://go-review.googlesource.com/131955
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-30 02:19:37 +00:00
Zheng Xu 8f4fd3f34e build: support frame-pointer for arm64
Supporting frame-pointer makes Linux's perf and other profilers much more useful
because it lets them gather a stack trace efficiently on profiling events. Major
changes include:
1. save FP on the word below where RSP is pointing to (proposed by Cherry and Austin)
2. adjust some specific offsets in runtime assembly and wrapper code
3. add support to FP in goroutine scheduler
4. adjust link stack overflow check to take the extra word into account
5. adjust nosplit test cases to enable frame sizes which are 16 bytes aligned

Performance impacts on go1 benchmarks:

Enable frame-pointer (by default)

name                      old time/op    new time/op    delta
BinaryTree17-46              5.94s ± 0%     6.00s ± 0%  +1.03%  (p=0.029 n=4+4)
Fannkuch11-46                2.84s ± 1%     2.77s ± 0%  -2.58%  (p=0.008 n=5+5)
FmtFprintfEmpty-46          55.0ns ± 1%    58.9ns ± 1%  +7.06%  (p=0.008 n=5+5)
FmtFprintfString-46          102ns ± 0%     105ns ± 0%  +2.94%  (p=0.008 n=5+5)
FmtFprintfInt-46             118ns ± 0%     117ns ± 1%  -1.19%  (p=0.000 n=4+5)
FmtFprintfIntInt-46          181ns ± 0%     182ns ± 1%    ~     (p=0.444 n=5+5)
FmtFprintfPrefixedInt-46     215ns ± 1%     214ns ± 0%    ~     (p=0.254 n=5+4)
FmtFprintfFloat-46           292ns ± 0%     296ns ± 0%  +1.46%  (p=0.029 n=4+4)
FmtManyArgs-46               720ns ± 0%     732ns ± 0%  +1.72%  (p=0.008 n=5+5)
GobDecode-46                9.82ms ± 1%   10.03ms ± 2%  +2.10%  (p=0.008 n=5+5)
GobEncode-46                8.14ms ± 0%    8.72ms ± 1%  +7.14%  (p=0.008 n=5+5)
Gzip-46                      420ms ± 0%     424ms ± 0%  +0.92%  (p=0.008 n=5+5)
Gunzip-46                   48.2ms ± 0%    48.4ms ± 0%  +0.41%  (p=0.008 n=5+5)
HTTPClientServer-46          201µs ± 4%     201µs ± 0%    ~     (p=0.730 n=5+4)
JSONEncode-46               17.1ms ± 0%    17.7ms ± 1%  +3.80%  (p=0.008 n=5+5)
JSONDecode-46               88.0ms ± 0%    90.1ms ± 0%  +2.42%  (p=0.008 n=5+5)
Mandelbrot200-46            5.06ms ± 0%    5.07ms ± 0%    ~     (p=0.310 n=5+5)
GoParse-46                  5.04ms ± 0%    5.12ms ± 0%  +1.53%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-46       117ns ± 0%     117ns ± 0%    ~     (all equal)
RegexpMatchEasy0_1K-46       332ns ± 0%     329ns ± 0%  -0.78%  (p=0.008 n=5+5)
RegexpMatchEasy1_32-46       104ns ± 0%     113ns ± 0%  +8.65%  (p=0.029 n=4+4)
RegexpMatchEasy1_1K-46       563ns ± 0%     569ns ± 0%  +1.10%  (p=0.008 n=5+5)
RegexpMatchMedium_32-46      167ns ± 2%     177ns ± 1%  +5.74%  (p=0.008 n=5+5)
RegexpMatchMedium_1K-46     49.5µs ± 0%    53.4µs ± 0%  +7.81%  (p=0.008 n=5+5)
RegexpMatchHard_32-46       2.56µs ± 1%    2.72µs ± 0%  +6.01%  (p=0.008 n=5+5)
RegexpMatchHard_1K-46       77.0µs ± 0%    81.8µs ± 0%  +6.24%  (p=0.016 n=5+4)
Revcomp-46                   631ms ± 1%     627ms ± 1%    ~     (p=0.095 n=5+5)
Template-46                 81.8ms ± 0%    86.3ms ± 0%  +5.55%  (p=0.008 n=5+5)
TimeParse-46                 423ns ± 0%     432ns ± 0%  +2.32%  (p=0.008 n=5+5)
TimeFormat-46                478ns ± 2%     497ns ± 1%  +3.89%  (p=0.008 n=5+5)
[Geo mean]                  71.6µs         73.3µs       +2.45%

name                      old speed      new speed      delta
GobDecode-46              78.1MB/s ± 1%  76.6MB/s ± 2%  -2.04%  (p=0.008 n=5+5)
GobEncode-46              94.3MB/s ± 0%  88.0MB/s ± 1%  -6.67%  (p=0.008 n=5+5)
Gzip-46                   46.2MB/s ± 0%  45.8MB/s ± 0%  -0.91%  (p=0.008 n=5+5)
Gunzip-46                  403MB/s ± 0%   401MB/s ± 0%  -0.41%  (p=0.008 n=5+5)
JSONEncode-46              114MB/s ± 0%   109MB/s ± 1%  -3.66%  (p=0.008 n=5+5)
JSONDecode-46             22.0MB/s ± 0%  21.5MB/s ± 0%  -2.35%  (p=0.008 n=5+5)
GoParse-46                11.5MB/s ± 0%  11.3MB/s ± 0%  -1.51%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-46     272MB/s ± 0%   272MB/s ± 1%    ~     (p=0.190 n=4+5)
RegexpMatchEasy0_1K-46    3.08GB/s ± 0%  3.11GB/s ± 0%  +0.77%  (p=0.008 n=5+5)
RegexpMatchEasy1_32-46     306MB/s ± 0%   283MB/s ± 0%  -7.63%  (p=0.029 n=4+4)
RegexpMatchEasy1_1K-46    1.82GB/s ± 0%  1.80GB/s ± 0%  -1.07%  (p=0.008 n=5+5)
RegexpMatchMedium_32-46   5.99MB/s ± 0%  5.64MB/s ± 1%  -5.77%  (p=0.016 n=4+5)
RegexpMatchMedium_1K-46   20.7MB/s ± 0%  19.2MB/s ± 0%  -7.25%  (p=0.008 n=5+5)
RegexpMatchHard_32-46     12.5MB/s ± 1%  11.8MB/s ± 0%  -5.66%  (p=0.008 n=5+5)
RegexpMatchHard_1K-46     13.3MB/s ± 0%  12.5MB/s ± 1%  -6.01%  (p=0.008 n=5+5)
Revcomp-46                 402MB/s ± 1%   405MB/s ± 1%    ~     (p=0.095 n=5+5)
Template-46               23.7MB/s ± 0%  22.5MB/s ± 0%  -5.25%  (p=0.008 n=5+5)
[Geo mean]                82.2MB/s       79.6MB/s       -3.26%

Disable frame-pointer (GOEXPERIMENT=noframepointer)

name                      old time/op    new time/op    delta
BinaryTree17-46              5.94s ± 0%     5.96s ± 0%  +0.39%  (p=0.029 n=4+4)
Fannkuch11-46                2.84s ± 1%     2.79s ± 1%  -1.68%  (p=0.008 n=5+5)
FmtFprintfEmpty-46          55.0ns ± 1%    55.2ns ± 3%    ~     (p=0.794 n=5+5)
FmtFprintfString-46          102ns ± 0%     103ns ± 0%  +0.98%  (p=0.016 n=5+4)
FmtFprintfInt-46             118ns ± 0%     115ns ± 0%  -2.54%  (p=0.029 n=4+4)
FmtFprintfIntInt-46          181ns ± 0%     179ns ± 0%  -1.10%  (p=0.000 n=5+4)
FmtFprintfPrefixedInt-46     215ns ± 1%     213ns ± 0%    ~     (p=0.143 n=5+4)
FmtFprintfFloat-46           292ns ± 0%     300ns ± 0%  +2.83%  (p=0.029 n=4+4)
FmtManyArgs-46               720ns ± 0%     739ns ± 0%  +2.64%  (p=0.008 n=5+5)
GobDecode-46                9.82ms ± 1%    9.78ms ± 1%    ~     (p=0.151 n=5+5)
GobEncode-46                8.14ms ± 0%    8.12ms ± 1%    ~     (p=0.690 n=5+5)
Gzip-46                      420ms ± 0%     420ms ± 0%    ~     (p=0.548 n=5+5)
Gunzip-46                   48.2ms ± 0%    48.0ms ± 0%  -0.33%  (p=0.032 n=5+5)
HTTPClientServer-46          201µs ± 4%     199µs ± 3%    ~     (p=0.548 n=5+5)
JSONEncode-46               17.1ms ± 0%    17.2ms ± 0%    ~     (p=0.056 n=5+5)
JSONDecode-46               88.0ms ± 0%    88.6ms ± 0%  +0.64%  (p=0.008 n=5+5)
Mandelbrot200-46            5.06ms ± 0%    5.07ms ± 0%    ~     (p=0.548 n=5+5)
GoParse-46                  5.04ms ± 0%    5.07ms ± 0%  +0.65%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-46       117ns ± 0%     112ns ± 4%  -4.27%  (p=0.016 n=4+5)
RegexpMatchEasy0_1K-46       332ns ± 0%     330ns ± 1%    ~     (p=0.095 n=5+5)
RegexpMatchEasy1_32-46       104ns ± 0%     110ns ± 1%  +5.29%  (p=0.029 n=4+4)
RegexpMatchEasy1_1K-46       563ns ± 0%     567ns ± 2%    ~     (p=0.151 n=5+5)
RegexpMatchMedium_32-46      167ns ± 2%     166ns ± 0%    ~     (p=0.333 n=5+4)
RegexpMatchMedium_1K-46     49.5µs ± 0%    49.6µs ± 0%    ~     (p=0.841 n=5+5)
RegexpMatchHard_32-46       2.56µs ± 1%    2.49µs ± 0%  -2.81%  (p=0.008 n=5+5)
RegexpMatchHard_1K-46       77.0µs ± 0%    75.8µs ± 0%  -1.55%  (p=0.008 n=5+5)
Revcomp-46                   631ms ± 1%     628ms ± 0%    ~     (p=0.095 n=5+5)
Template-46                 81.8ms ± 0%    84.3ms ± 1%  +3.05%  (p=0.008 n=5+5)
TimeParse-46                 423ns ± 0%     425ns ± 0%  +0.52%  (p=0.008 n=5+5)
TimeFormat-46                478ns ± 2%     478ns ± 1%    ~     (p=1.000 n=5+5)
[Geo mean]                  71.6µs         71.6µs       -0.01%

name                      old speed      new speed      delta
GobDecode-46              78.1MB/s ± 1%  78.5MB/s ± 1%    ~     (p=0.151 n=5+5)
GobEncode-46              94.3MB/s ± 0%  94.5MB/s ± 1%    ~     (p=0.690 n=5+5)
Gzip-46                   46.2MB/s ± 0%  46.2MB/s ± 0%    ~     (p=0.571 n=5+5)
Gunzip-46                  403MB/s ± 0%   404MB/s ± 0%  +0.33%  (p=0.032 n=5+5)
JSONEncode-46              114MB/s ± 0%   113MB/s ± 0%    ~     (p=0.056 n=5+5)
JSONDecode-46             22.0MB/s ± 0%  21.9MB/s ± 0%  -0.64%  (p=0.008 n=5+5)
GoParse-46                11.5MB/s ± 0%  11.4MB/s ± 0%  -0.64%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-46     272MB/s ± 0%   285MB/s ± 4%  +4.74%  (p=0.016 n=4+5)
RegexpMatchEasy0_1K-46    3.08GB/s ± 0%  3.10GB/s ± 1%    ~     (p=0.151 n=5+5)
RegexpMatchEasy1_32-46     306MB/s ± 0%   290MB/s ± 1%  -5.21%  (p=0.029 n=4+4)
RegexpMatchEasy1_1K-46    1.82GB/s ± 0%  1.81GB/s ± 2%    ~     (p=0.151 n=5+5)
RegexpMatchMedium_32-46   5.99MB/s ± 0%  6.02MB/s ± 1%    ~     (p=0.063 n=4+5)
RegexpMatchMedium_1K-46   20.7MB/s ± 0%  20.7MB/s ± 0%    ~     (p=0.659 n=5+5)
RegexpMatchHard_32-46     12.5MB/s ± 1%  12.8MB/s ± 0%  +2.88%  (p=0.008 n=5+5)
RegexpMatchHard_1K-46     13.3MB/s ± 0%  13.5MB/s ± 0%  +1.58%  (p=0.008 n=5+5)
Revcomp-46                 402MB/s ± 1%   405MB/s ± 0%    ~     (p=0.095 n=5+5)
Template-46               23.7MB/s ± 0%  23.0MB/s ± 1%  -2.95%  (p=0.008 n=5+5)
[Geo mean]                82.2MB/s       82.3MB/s       +0.04%

Frame-pointer is enabled on Linux by default but can be disabled by setting: GOEXPERIMENT=noframepointer.

Fixes #10110

Change-Id: I1bfaca6dba29a63009d7c6ab04ed7a1413d9479e
Reviewed-on: https://go-review.googlesource.com/61511
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-08-29 18:28:34 +00:00
Dave Brophy 7c7cecc184 fmt: fix incorrect format of whole-number floats when using %#v
This fixes the unwanted behaviour where printing a zero float with the
#v fmt verb outputs "0" - e.g. missing the trailing decimal. This means
that the output would be interpreted as an int rather than a float when
parsed as Go source. After this change the the output is "0.0".

Fixes #26363

Change-Id: Ic5c060522459cd5ce077675d47c848b22ddc34fa
GitHub-Last-Rev: adfb061363
GitHub-Pull-Request: golang/go#26383
Reviewed-on: https://go-review.googlesource.com/123956
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Rob Pike <r@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-08-28 20:15:15 +00:00
Ben Shi 3ca3e89bb6 cmd/compile: optimize arm64 with indexed FP load/store
The FP load/store on arm64 have register indexed forms. And this
CL implements this optimization.

1. The total size of pkg/android_arm64 (excluding cmd/compile)
decreases about 400 bytes.

2. There is no regression in the go1 benchmark, the test case
GobEncode even gets slight improvement, excluding noise.

name                     old time/op    new time/op    delta
BinaryTree17-4              19.0s ± 0%     19.0s ± 1%    ~     (p=0.817 n=29+29)
Fannkuch11-4                9.94s ± 0%     9.95s ± 0%  +0.03%  (p=0.010 n=24+30)
FmtFprintfEmpty-4           233ns ± 0%     233ns ± 0%    ~     (all equal)
FmtFprintfString-4          427ns ± 0%     427ns ± 0%    ~     (p=0.649 n=30+30)
FmtFprintfInt-4             471ns ± 0%     471ns ± 0%    ~     (all equal)
FmtFprintfIntInt-4          730ns ± 0%     730ns ± 0%    ~     (all equal)
FmtFprintfPrefixedInt-4     889ns ± 0%     889ns ± 0%    ~     (all equal)
FmtFprintfFloat-4          1.21µs ± 0%    1.21µs ± 0%  +0.04%  (p=0.012 n=20+30)
FmtManyArgs-4              2.99µs ± 0%    2.99µs ± 0%    ~     (p=0.651 n=29+29)
GobDecode-4                42.4ms ± 1%    42.3ms ± 1%  -0.27%  (p=0.001 n=29+28)
GobEncode-4                37.8ms ±11%    36.0ms ± 0%  -4.67%  (p=0.000 n=30+26)
Gzip-4                      1.98s ± 1%     1.96s ± 1%  -1.26%  (p=0.000 n=30+30)
Gunzip-4                    175ms ± 0%     175ms ± 0%    ~     (p=0.988 n=29+29)
HTTPClientServer-4          854µs ± 5%     860µs ± 5%    ~     (p=0.236 n=28+29)
JSONEncode-4               88.8ms ± 0%    87.9ms ± 0%  -1.00%  (p=0.000 n=24+26)
JSONDecode-4                390ms ± 1%     392ms ± 2%  +0.48%  (p=0.025 n=30+30)
Mandelbrot200-4            19.5ms ± 0%    19.5ms ± 0%    ~     (p=0.894 n=24+29)
GoParse-4                  20.3ms ± 0%    20.1ms ± 1%  -0.94%  (p=0.000 n=27+26)
RegexpMatchEasy0_32-4       451ns ± 0%     451ns ± 0%    ~     (p=0.578 n=30+30)
RegexpMatchEasy0_1K-4      1.63µs ± 0%    1.63µs ± 0%    ~     (p=0.298 n=30+28)
RegexpMatchEasy1_32-4       431ns ± 0%     434ns ± 0%  +0.67%  (p=0.000 n=30+29)
RegexpMatchEasy1_1K-4      2.60µs ± 0%    2.64µs ± 0%  +1.36%  (p=0.000 n=28+26)
RegexpMatchMedium_32-4      744ns ± 0%     744ns ± 0%    ~     (p=0.474 n=29+29)
RegexpMatchMedium_1K-4      223µs ± 0%     223µs ± 0%  -0.08%  (p=0.038 n=26+30)
RegexpMatchHard_32-4       12.2µs ± 0%    12.3µs ± 0%  +0.27%  (p=0.000 n=29+30)
RegexpMatchHard_1K-4        373µs ± 0%     373µs ± 0%    ~     (p=0.219 n=29+28)
Revcomp-4                   2.84s ± 0%     2.84s ± 0%    ~     (p=0.130 n=28+28)
Template-4                  394ms ± 1%     392ms ± 1%  -0.52%  (p=0.001 n=30+30)
TimeParse-4                1.93µs ± 0%    1.93µs ± 0%    ~     (p=0.587 n=29+30)
TimeFormat-4               2.00µs ± 0%    2.00µs ± 0%  +0.07%  (p=0.001 n=28+27)
[Geo mean]                  306µs          305µs       -0.17%

name                     old speed      new speed      delta
GobDecode-4              18.1MB/s ± 1%  18.2MB/s ± 1%  +0.27%  (p=0.001 n=29+28)
GobEncode-4              20.3MB/s ±10%  21.3MB/s ± 0%  +4.64%  (p=0.000 n=30+26)
Gzip-4                   9.79MB/s ± 1%  9.91MB/s ± 1%  +1.28%  (p=0.000 n=30+30)
Gunzip-4                  111MB/s ± 0%   111MB/s ± 0%    ~     (p=0.988 n=29+29)
JSONEncode-4             21.8MB/s ± 0%  22.1MB/s ± 0%  +1.02%  (p=0.000 n=24+26)
JSONDecode-4             4.97MB/s ± 1%  4.95MB/s ± 2%  -0.45%  (p=0.031 n=30+30)
GoParse-4                2.85MB/s ± 1%  2.88MB/s ± 1%  +1.03%  (p=0.000 n=30+26)
RegexpMatchEasy0_32-4    70.9MB/s ± 0%  70.9MB/s ± 0%    ~     (p=0.904 n=29+28)
RegexpMatchEasy0_1K-4     627MB/s ± 0%   627MB/s ± 0%    ~     (p=0.156 n=30+30)
RegexpMatchEasy1_32-4    74.2MB/s ± 0%  73.7MB/s ± 0%  -0.67%  (p=0.000 n=30+29)
RegexpMatchEasy1_1K-4     393MB/s ± 0%   388MB/s ± 0%  -1.34%  (p=0.000 n=28+26)
RegexpMatchMedium_32-4   1.34MB/s ± 0%  1.34MB/s ± 0%    ~     (all equal)
RegexpMatchMedium_1K-4   4.59MB/s ± 0%  4.59MB/s ± 0%  +0.07%  (p=0.035 n=25+30)
RegexpMatchHard_32-4     2.61MB/s ± 0%  2.61MB/s ± 0%  -0.11%  (p=0.002 n=28+30)
RegexpMatchHard_1K-4     2.75MB/s ± 0%  2.75MB/s ± 0%  +0.15%  (p=0.001 n=30+24)
Revcomp-4                89.4MB/s ± 0%  89.4MB/s ± 0%    ~     (p=0.140 n=28+28)
Template-4               4.93MB/s ± 1%  4.95MB/s ± 1%  +0.51%  (p=0.001 n=30+30)
[Geo mean]               18.4MB/s       18.4MB/s       +0.37%

Change-Id: I9a6b521a971b21cfb51064e8e9b853cef8a1d071
Reviewed-on: https://go-review.googlesource.com/124636
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-08-28 02:37:18 +00:00
Ben Shi 2b69ad0b3c cmd/compile: optimize arm's comparison
The CMP/CMN/TST/TEQ perform similar to SUB/ADD/AND/XOR except
the result is abondoned, and only NZCV flags are affected.

This CL implements further optimization with them.

1. A micro benchmark test gets more than 9% improvment.
TSTTEQ-4                   6.99ms ± 0%    6.35ms ± 0%  -9.15%  (p=0.000 n=33+36)
(https://github.com/benshi001/ugo1/blob/master/tstteq2_test.go)

2. The go1 benckmark shows no regression, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              25.7s ± 1%     25.7s ± 1%    ~     (p=0.830 n=40+40)
Fannkuch11-4                13.3s ± 0%     13.2s ± 0%  -0.65%  (p=0.000 n=40+34)
FmtFprintfEmpty-4           394ns ± 0%     394ns ± 0%    ~     (p=0.819 n=40+40)
FmtFprintfString-4          677ns ± 0%     677ns ± 0%  +0.06%  (p=0.039 n=39+40)
FmtFprintfInt-4             707ns ± 0%     706ns ± 0%  -0.14%  (p=0.000 n=40+39)
FmtFprintfIntInt-4         1.04µs ± 0%    1.04µs ± 0%  +0.10%  (p=0.000 n=29+31)
FmtFprintfPrefixedInt-4    1.10µs ± 0%    1.11µs ± 0%  +0.65%  (p=0.000 n=39+37)
FmtFprintfFloat-4          2.27µs ± 0%    2.26µs ± 0%  -0.53%  (p=0.000 n=39+40)
FmtManyArgs-4              3.96µs ± 0%    3.96µs ± 0%  +0.10%  (p=0.000 n=39+40)
GobDecode-4                53.4ms ± 1%    52.8ms ± 2%  -1.10%  (p=0.000 n=39+39)
GobEncode-4                50.3ms ± 3%    50.4ms ± 2%    ~     (p=0.089 n=40+39)
Gzip-4                      2.62s ± 0%     2.64s ± 0%  +0.60%  (p=0.000 n=40+39)
Gunzip-4                    312ms ± 0%     312ms ± 0%  +0.02%  (p=0.030 n=40+39)
HTTPClientServer-4         1.01ms ± 7%    0.98ms ± 7%  -2.37%  (p=0.000 n=40+39)
JSONEncode-4                126ms ± 1%     126ms ± 1%  -0.38%  (p=0.004 n=39+39)
JSONDecode-4                423ms ± 0%     426ms ± 2%  +0.72%  (p=0.001 n=39+40)
Mandelbrot200-4            18.4ms ± 0%    18.4ms ± 0%  +0.04%  (p=0.000 n=38+40)
GoParse-4                  22.8ms ± 0%    22.6ms ± 0%  -0.68%  (p=0.000 n=35+40)
RegexpMatchEasy0_32-4       699ns ± 0%     704ns ± 0%  +0.73%  (p=0.000 n=27+40)
RegexpMatchEasy0_1K-4      4.27µs ± 0%    4.26µs ± 0%  -0.09%  (p=0.000 n=35+38)
RegexpMatchEasy1_32-4       741ns ± 0%     735ns ± 0%  -0.85%  (p=0.000 n=40+35)
RegexpMatchEasy1_1K-4      5.53µs ± 0%    5.49µs ± 0%  -0.69%  (p=0.000 n=39+40)
RegexpMatchMedium_32-4     1.07µs ± 0%    1.04µs ± 2%  -2.34%  (p=0.000 n=40+40)
RegexpMatchMedium_1K-4      261µs ± 0%     261µs ± 0%  -0.16%  (p=0.000 n=40+39)
RegexpMatchHard_32-4       14.9µs ± 0%    14.9µs ± 0%  -0.18%  (p=0.000 n=39+40)
RegexpMatchHard_1K-4        445µs ± 0%     446µs ± 0%  +0.09%  (p=0.000 n=36+34)
Revcomp-4                  41.8ms ± 1%    41.8ms ± 1%    ~     (p=0.595 n=39+38)
Template-4                  530ms ± 1%     528ms ± 1%  -0.49%  (p=0.000 n=40+40)
TimeParse-4                3.39µs ± 0%    3.42µs ± 0%  +0.98%  (p=0.000 n=36+38)
TimeFormat-4               6.12µs ± 0%    6.07µs ± 0%  -0.81%  (p=0.000 n=34+38)
[Geo mean]                  384µs          383µs       -0.24%

name                     old speed      new speed      delta
GobDecode-4              14.4MB/s ± 1%  14.5MB/s ± 2%  +1.11%  (p=0.000 n=39+39)
GobEncode-4              15.3MB/s ± 3%  15.2MB/s ± 2%    ~     (p=0.104 n=40+39)
Gzip-4                   7.40MB/s ± 1%  7.36MB/s ± 0%  -0.60%  (p=0.000 n=40+39)
Gunzip-4                 62.2MB/s ± 0%  62.1MB/s ± 0%  -0.02%  (p=0.047 n=40+39)
JSONEncode-4             15.4MB/s ± 1%  15.4MB/s ± 2%  +0.39%  (p=0.002 n=39+39)
JSONDecode-4             4.59MB/s ± 0%  4.56MB/s ± 2%  -0.71%  (p=0.000 n=39+40)
GoParse-4                2.54MB/s ± 0%  2.56MB/s ± 0%  +0.72%  (p=0.000 n=26+40)
RegexpMatchEasy0_32-4    45.8MB/s ± 0%  45.4MB/s ± 0%  -0.75%  (p=0.000 n=38+40)
RegexpMatchEasy0_1K-4     240MB/s ± 0%   240MB/s ± 0%  +0.09%  (p=0.000 n=35+38)
RegexpMatchEasy1_32-4    43.1MB/s ± 0%  43.5MB/s ± 0%  +0.84%  (p=0.000 n=40+39)
RegexpMatchEasy1_1K-4     185MB/s ± 0%   186MB/s ± 0%  +0.69%  (p=0.000 n=39+40)
RegexpMatchMedium_32-4    936kB/s ± 1%   959kB/s ± 2%  +2.38%  (p=0.000 n=40+40)
RegexpMatchMedium_1K-4   3.92MB/s ± 0%  3.93MB/s ± 0%  +0.18%  (p=0.000 n=39+40)
RegexpMatchHard_32-4     2.15MB/s ± 0%  2.15MB/s ± 0%  +0.19%  (p=0.000 n=40+40)
RegexpMatchHard_1K-4     2.30MB/s ± 0%  2.30MB/s ± 0%    ~     (all equal)
Revcomp-4                60.8MB/s ± 1%  60.8MB/s ± 1%    ~     (p=0.600 n=39+38)
Template-4               3.66MB/s ± 1%  3.68MB/s ± 1%  +0.46%  (p=0.000 n=40+40)
[Geo mean]               12.8MB/s       12.8MB/s       +0.27%

Change-Id: I849161169ecf0876a04b7c1d3990fa8d1435215e
Reviewed-on: https://go-review.googlesource.com/122855
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-08-27 15:49:46 +00:00
Alberto Donizetti 42cc4ca30a cmd/compile: prevent overflow in walkinrange
In the compiler frontend, walkinrange indiscriminately calls Int64()
on const CTINT nodes, even though Int64's return value is undefined
for anything over 2⁶³ (in practise, it'll return a negative number).

This causes the introduction of bad constants during rewrites of
unsigned expressions, which make the compiler reject valid Go
programs.

This change introduces a preliminary check that Int64() is safe to
call on the consts on hand. If it isn't, walkinrange exits without
doing any rewrite.

Fixes #27143

Change-Id: I2017073cae65468a521ff3262d4ea8ab0d7098d9
Reviewed-on: https://go-review.googlesource.com/130735
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-08-26 21:52:27 +00:00
Ben Shi e03220a594 cmd/compile: optimize 386 code with FLDPI
FLDPI pushes the constant pi to 387's register stack, which is
more efficient than MOVSSconst/MOVSDconst.

1. This optimization reduces 0.3KB of the total size of pkg/linux_386
(exlcuding cmd/compile).

2. There is little regression in the go1 benchmark.
name                     old time/op    new time/op    delta
BinaryTree17-4              3.30s ± 3%     3.30s ± 2%    ~     (p=0.759 n=40+39)
Fannkuch11-4                3.53s ± 1%     3.54s ± 1%    ~     (p=0.168 n=40+40)
FmtFprintfEmpty-4          45.5ns ± 3%    45.6ns ± 3%    ~     (p=0.553 n=40+40)
FmtFprintfString-4         78.4ns ± 3%    78.3ns ± 3%    ~     (p=0.593 n=40+40)
FmtFprintfInt-4            88.8ns ± 2%    89.9ns ± 2%    ~     (p=0.083 n=40+33)
FmtFprintfIntInt-4          140ns ± 4%     140ns ± 4%    ~     (p=0.656 n=40+40)
FmtFprintfPrefixedInt-4     180ns ± 2%     181ns ± 3%  +0.53%  (p=0.050 n=40+40)
FmtFprintfFloat-4           408ns ± 4%     411ns ± 3%    ~     (p=0.112 n=40+40)
FmtManyArgs-4               599ns ± 3%     602ns ± 3%    ~     (p=0.784 n=40+40)
GobDecode-4                7.24ms ± 6%    7.30ms ± 5%    ~     (p=0.171 n=40+40)
GobEncode-4                6.98ms ± 5%    6.89ms ± 8%    ~     (p=0.107 n=40+40)
Gzip-4                      396ms ± 4%     396ms ± 3%    ~     (p=0.852 n=40+40)
Gunzip-4                   41.3ms ± 3%    41.5ms ± 4%    ~     (p=0.221 n=40+40)
HTTPClientServer-4         63.4µs ± 3%    63.4µs ± 2%    ~     (p=0.895 n=39+40)
JSONEncode-4               17.5ms ± 2%    17.5ms ± 3%    ~     (p=0.090 n=40+40)
JSONDecode-4               60.6ms ± 3%    60.1ms ± 4%    ~     (p=0.184 n=40+40)
Mandelbrot200-4            7.80ms ± 3%    7.78ms ± 2%    ~     (p=0.512 n=40+40)
GoParse-4                  3.30ms ± 3%    3.28ms ± 2%  -0.61%  (p=0.034 n=40+40)
RegexpMatchEasy0_32-4       104ns ± 4%     103ns ± 4%    ~     (p=0.118 n=40+40)
RegexpMatchEasy0_1K-4       850ns ± 2%     848ns ± 2%    ~     (p=0.370 n=40+40)
RegexpMatchEasy1_32-4       112ns ± 4%     112ns ± 4%    ~     (p=0.848 n=40+40)
RegexpMatchEasy1_1K-4      1.04µs ± 4%    1.03µs ± 4%    ~     (p=0.333 n=40+40)
RegexpMatchMedium_32-4      132ns ± 4%     131ns ± 3%    ~     (p=0.527 n=40+40)
RegexpMatchMedium_1K-4     43.4µs ± 3%    43.5µs ± 3%    ~     (p=0.111 n=40+40)
RegexpMatchHard_32-4       2.24µs ± 4%    2.24µs ± 4%    ~     (p=0.441 n=40+40)
RegexpMatchHard_1K-4       67.9µs ± 3%    68.0µs ± 3%    ~     (p=0.095 n=40+40)
Revcomp-4                   1.84s ± 2%     1.84s ± 2%    ~     (p=0.677 n=40+40)
Template-4                 68.4ms ± 3%    68.6ms ± 3%    ~     (p=0.345 n=40+40)
TimeParse-4                 433ns ± 3%     433ns ± 3%    ~     (p=0.403 n=40+40)
TimeFormat-4                407ns ± 3%     406ns ± 3%    ~     (p=0.900 n=40+40)
[Geo mean]                 67.1µs         67.2µs       +0.04%

name                     old speed      new speed      delta
GobDecode-4               106MB/s ± 5%   105MB/s ± 5%    ~     (p=0.173 n=40+40)
GobEncode-4               110MB/s ± 5%   112MB/s ± 9%    ~     (p=0.104 n=40+40)
Gzip-4                   49.0MB/s ± 4%  49.1MB/s ± 4%    ~     (p=0.836 n=40+40)
Gunzip-4                  471MB/s ± 3%   468MB/s ± 4%    ~     (p=0.218 n=40+40)
JSONEncode-4              111MB/s ± 2%   111MB/s ± 3%    ~     (p=0.090 n=40+40)
JSONDecode-4             32.0MB/s ± 3%  32.3MB/s ± 4%    ~     (p=0.194 n=40+40)
GoParse-4                17.6MB/s ± 3%  17.7MB/s ± 2%  +0.62%  (p=0.035 n=40+40)
RegexpMatchEasy0_32-4     307MB/s ± 4%   309MB/s ± 4%  +0.70%  (p=0.041 n=40+40)
RegexpMatchEasy0_1K-4    1.20GB/s ± 3%  1.21GB/s ± 2%    ~     (p=0.353 n=40+40)
RegexpMatchEasy1_32-4     285MB/s ± 3%   284MB/s ± 4%    ~     (p=0.384 n=40+40)
RegexpMatchEasy1_1K-4     988MB/s ± 4%   992MB/s ± 3%    ~     (p=0.335 n=40+40)
RegexpMatchMedium_32-4   7.56MB/s ± 4%  7.57MB/s ± 4%    ~     (p=0.314 n=40+40)
RegexpMatchMedium_1K-4   23.6MB/s ± 3%  23.6MB/s ± 3%    ~     (p=0.107 n=40+40)
RegexpMatchHard_32-4     14.3MB/s ± 4%  14.3MB/s ± 4%    ~     (p=0.429 n=40+40)
RegexpMatchHard_1K-4     15.1MB/s ± 3%  15.1MB/s ± 3%    ~     (p=0.099 n=40+40)
Revcomp-4                 138MB/s ± 2%   138MB/s ± 2%    ~     (p=0.658 n=40+40)
Template-4               28.4MB/s ± 3%  28.3MB/s ± 3%    ~     (p=0.331 n=40+40)
[Geo mean]               80.8MB/s       80.8MB/s       +0.09%

Change-Id: I0cb715eead68ade097a302e7fb80ccbd1d1b511e
Reviewed-on: https://go-review.googlesource.com/130975
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-25 02:39:49 +00:00
Ben Shi 3bc34385fa cmd/compile: introduce more read-modify-write operations for amd64
Add suport of read-modify-write for AND/SUB/AND/OR/XOR on amd64.

1. The total size of pkg/linux_amd64 decreases about 4KB, excluding
cmd/compile.

2. The go1 benchmark shows a little improvement, excluding noise.

name                     old time/op    new time/op    delta
BinaryTree17-4              2.63s ± 3%     2.65s ± 4%   +1.01%  (p=0.037 n=35+35)
Fannkuch11-4                2.33s ± 2%     2.39s ± 2%   +2.49%  (p=0.000 n=35+35)
FmtFprintfEmpty-4          45.4ns ± 5%    40.8ns ± 6%  -10.09%  (p=0.000 n=35+35)
FmtFprintfString-4         73.3ns ± 4%    70.9ns ± 3%   -3.23%  (p=0.000 n=30+35)
FmtFprintfInt-4            79.9ns ± 4%    79.5ns ± 3%     ~     (p=0.736 n=34+35)
FmtFprintfIntInt-4          126ns ± 4%     125ns ± 4%     ~     (p=0.083 n=35+35)
FmtFprintfPrefixedInt-4     152ns ± 6%     152ns ± 3%     ~     (p=0.855 n=34+35)
FmtFprintfFloat-4           215ns ± 4%     213ns ± 4%     ~     (p=0.066 n=35+35)
FmtManyArgs-4               522ns ± 3%     506ns ± 3%   -3.15%  (p=0.000 n=35+35)
GobDecode-4                6.45ms ± 8%    6.51ms ± 7%   +0.96%  (p=0.026 n=35+35)
GobEncode-4                6.10ms ± 6%    6.02ms ± 8%     ~     (p=0.160 n=35+35)
Gzip-4                      228ms ± 3%     221ms ± 3%   -2.92%  (p=0.000 n=35+35)
Gunzip-4                   37.5ms ± 4%    37.2ms ± 3%   -0.78%  (p=0.036 n=35+35)
HTTPClientServer-4         58.7µs ± 2%    59.2µs ± 1%   +0.80%  (p=0.000 n=33+33)
JSONEncode-4               12.0ms ± 3%    12.2ms ± 3%   +1.84%  (p=0.008 n=35+35)
JSONDecode-4               57.0ms ± 4%    56.6ms ± 3%     ~     (p=0.320 n=35+35)
Mandelbrot200-4            3.82ms ± 3%    3.79ms ± 3%     ~     (p=0.074 n=35+35)
GoParse-4                  3.21ms ± 5%    3.24ms ± 4%     ~     (p=0.119 n=35+35)
RegexpMatchEasy0_32-4      76.3ns ± 4%    75.4ns ± 4%   -1.14%  (p=0.014 n=34+33)
RegexpMatchEasy0_1K-4       251ns ± 4%     254ns ± 3%   +1.28%  (p=0.016 n=35+35)
RegexpMatchEasy1_32-4      69.6ns ± 3%    70.1ns ± 3%   +0.82%  (p=0.005 n=35+35)
RegexpMatchEasy1_1K-4       367ns ± 4%     376ns ± 4%   +2.47%  (p=0.000 n=35+35)
RegexpMatchMedium_32-4      108ns ± 5%     104ns ± 4%   -3.18%  (p=0.000 n=35+35)
RegexpMatchMedium_1K-4     33.8µs ± 3%    32.7µs ± 3%   -3.27%  (p=0.000 n=35+35)
RegexpMatchHard_32-4       1.55µs ± 3%    1.52µs ± 3%   -1.64%  (p=0.000 n=35+35)
RegexpMatchHard_1K-4       46.6µs ± 3%    46.6µs ± 4%     ~     (p=0.149 n=35+35)
Revcomp-4                   416ms ± 7%     412ms ± 6%   -0.95%  (p=0.033 n=33+35)
Template-4                 64.3ms ± 3%    62.4ms ± 7%   -2.94%  (p=0.000 n=35+35)
TimeParse-4                 320ns ± 2%     322ns ± 3%     ~     (p=0.589 n=35+35)
TimeFormat-4                300ns ± 3%     300ns ± 3%     ~     (p=0.597 n=35+35)
[Geo mean]                 47.4µs         47.0µs        -0.86%

name                     old speed      new speed      delta
GobDecode-4               119MB/s ± 7%   118MB/s ± 7%   -0.96%  (p=0.027 n=35+35)
GobEncode-4               126MB/s ± 7%   127MB/s ± 6%     ~     (p=0.157 n=34+34)
Gzip-4                   85.3MB/s ± 3%  87.9MB/s ± 3%   +3.02%  (p=0.000 n=35+35)
Gunzip-4                  518MB/s ± 4%   522MB/s ± 3%   +0.79%  (p=0.037 n=35+35)
JSONEncode-4              162MB/s ± 3%   159MB/s ± 3%   -1.81%  (p=0.009 n=35+35)
JSONDecode-4             34.1MB/s ± 4%  34.3MB/s ± 3%     ~     (p=0.318 n=35+35)
GoParse-4                18.0MB/s ± 5%  17.9MB/s ± 4%     ~     (p=0.117 n=35+35)
RegexpMatchEasy0_32-4     419MB/s ± 3%   425MB/s ± 4%   +1.46%  (p=0.003 n=32+33)
RegexpMatchEasy0_1K-4    4.07GB/s ± 4%  4.02GB/s ± 3%   -1.28%  (p=0.014 n=35+35)
RegexpMatchEasy1_32-4     460MB/s ± 3%   456MB/s ± 4%   -0.82%  (p=0.004 n=35+35)
RegexpMatchEasy1_1K-4    2.79GB/s ± 4%  2.72GB/s ± 4%   -2.39%  (p=0.000 n=35+35)
RegexpMatchMedium_32-4   9.23MB/s ± 4%  9.53MB/s ± 4%   +3.16%  (p=0.000 n=35+35)
RegexpMatchMedium_1K-4   30.3MB/s ± 3%  31.3MB/s ± 3%   +3.38%  (p=0.000 n=35+35)
RegexpMatchHard_32-4     20.7MB/s ± 3%  21.0MB/s ± 3%   +1.67%  (p=0.000 n=35+35)
RegexpMatchHard_1K-4     22.0MB/s ± 3%  21.9MB/s ± 4%     ~     (p=0.277 n=35+33)
Revcomp-4                 612MB/s ± 7%   618MB/s ± 6%   +0.96%  (p=0.034 n=33+35)
Template-4               30.2MB/s ± 3%  31.1MB/s ± 6%   +3.05%  (p=0.000 n=35+35)
[Geo mean]                123MB/s        124MB/s        +0.64%

Change-Id: Ia025da272e07d0069413824bfff3471b106d6280
Reviewed-on: https://go-review.googlesource.com/121535
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-24 23:38:25 +00:00
Kazuhiro Sera ad644d2e86 all: fix typos detected by github.com/client9/misspell
Change-Id: Iadb3c5de8ae9ea45855013997ed70f7929a88661
GitHub-Last-Rev: ae85bcf82b
GitHub-Pull-Request: golang/go#26920
Reviewed-on: https://go-review.googlesource.com/128955
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-08-23 15:54:07 +00:00
Yury Smolsky 1484270aec test: restore tests for the reject unsafe code option
Tests in test/safe were neglected after moving to the run.go
framework. This change restores them.

These tests are skipped for go/types via -+ option.

Fixes #25668

Change-Id: I8fe26574a76fa7afa8664c467d7c2e6334f1bba9
Reviewed-on: https://go-review.googlesource.com/124660
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-08-22 17:54:09 +00:00
Yury Smolsky 02fecd33f6 test: remove errchk, the perl script
gc tests do not depend on errchk.

Fixes #25669

Change-Id: I99eb87bb9677897b9167d4fc9a6321fa66cd9116
Reviewed-on: https://go-review.googlesource.com/115955
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-22 16:28:36 +00:00
Ben Shi a0a7e9fc0c cmd/compile: implement "OPC $imm, (mem)" for 386
New read-modify-write operations are introduced in this CL for 386.

1. The total size of pkg/linux_386 decreases about 10KB (excluding
cmd/compile).

2. The go1 benchmark shows little regression.
name                     old time/op    new time/op    delta
BinaryTree17-4              3.32s ± 4%     3.29s ± 2%    ~     (p=0.059 n=30+30)
Fannkuch11-4                3.49s ± 1%     3.46s ± 1%  -0.92%  (p=0.001 n=30+30)
FmtFprintfEmpty-4          47.7ns ± 2%    46.8ns ± 5%  -1.93%  (p=0.011 n=25+30)
FmtFprintfString-4         79.5ns ± 7%    80.2ns ± 3%  +0.89%  (p=0.001 n=28+29)
FmtFprintfInt-4            90.5ns ± 2%    92.1ns ± 2%  +1.82%  (p=0.014 n=22+30)
FmtFprintfIntInt-4          141ns ± 1%     144ns ± 3%  +2.23%  (p=0.013 n=22+30)
FmtFprintfPrefixedInt-4     183ns ± 2%     184ns ± 3%    ~     (p=0.080 n=21+30)
FmtFprintfFloat-4           409ns ± 3%     412ns ± 3%  +0.83%  (p=0.040 n=30+30)
FmtManyArgs-4               597ns ± 6%     607ns ± 4%  +1.71%  (p=0.006 n=30+30)
GobDecode-4                7.21ms ± 5%    7.18ms ± 6%    ~     (p=0.665 n=30+30)
GobEncode-4                7.17ms ± 6%    7.09ms ± 7%    ~     (p=0.117 n=29+30)
Gzip-4                      413ms ± 4%     399ms ± 4%  -3.48%  (p=0.000 n=30+30)
Gunzip-4                   41.3ms ± 4%    41.7ms ± 3%  +1.05%  (p=0.011 n=30+30)
HTTPClientServer-4         63.5µs ± 3%    62.9µs ± 2%  -0.97%  (p=0.017 n=30+27)
JSONEncode-4               20.3ms ± 5%    20.1ms ± 5%  -1.16%  (p=0.004 n=30+30)
JSONDecode-4               66.2ms ± 4%    67.7ms ± 4%  +2.21%  (p=0.000 n=30+30)
Mandelbrot200-4            5.16ms ± 3%    5.18ms ± 3%    ~     (p=0.123 n=30+30)
GoParse-4                  3.23ms ± 2%    3.27ms ± 2%  +1.08%  (p=0.006 n=30+30)
RegexpMatchEasy0_32-4      98.9ns ± 5%    97.1ns ± 4%  -1.83%  (p=0.006 n=30+30)
RegexpMatchEasy0_1K-4       842ns ± 3%     842ns ± 3%    ~     (p=0.550 n=30+30)
RegexpMatchEasy1_32-4       107ns ± 4%     105ns ± 4%  -1.93%  (p=0.012 n=30+30)
RegexpMatchEasy1_1K-4      1.03µs ± 4%    1.04µs ± 4%    ~     (p=0.304 n=30+30)
RegexpMatchMedium_32-4      132ns ± 2%     129ns ± 4%  -2.02%  (p=0.000 n=21+30)
RegexpMatchMedium_1K-4     44.1µs ± 4%    43.8µs ± 3%    ~     (p=0.641 n=30+30)
RegexpMatchHard_32-4       2.26µs ± 4%    2.23µs ± 4%  -1.28%  (p=0.023 n=30+30)
RegexpMatchHard_1K-4       68.1µs ± 3%    68.6µs ± 4%    ~     (p=0.089 n=30+30)
Revcomp-4                   1.85s ± 2%     1.84s ± 2%    ~     (p=0.072 n=30+30)
Template-4                 69.2ms ± 3%    68.5ms ± 3%  -1.04%  (p=0.012 n=30+30)
TimeParse-4                 441ns ± 3%     446ns ± 4%  +1.21%  (p=0.001 n=30+30)
TimeFormat-4                415ns ± 3%     415ns ± 3%    ~     (p=0.436 n=30+30)
[Geo mean]                 67.0µs         66.9µs       -0.17%

name                     old speed      new speed      delta
GobDecode-4               107MB/s ± 5%   107MB/s ± 6%    ~     (p=0.663 n=30+30)
GobEncode-4               107MB/s ± 6%   108MB/s ± 7%    ~     (p=0.117 n=29+30)
Gzip-4                   47.0MB/s ± 4%  48.7MB/s ± 4%  +3.61%  (p=0.000 n=30+30)
Gunzip-4                  470MB/s ± 4%   466MB/s ± 4%  -1.05%  (p=0.011 n=30+30)
JSONEncode-4             95.6MB/s ± 5%  96.7MB/s ± 5%  +1.16%  (p=0.005 n=30+30)
JSONDecode-4             29.3MB/s ± 4%  28.7MB/s ± 4%  -2.17%  (p=0.000 n=30+30)
GoParse-4                17.9MB/s ± 2%  17.7MB/s ± 2%  -1.06%  (p=0.007 n=30+30)
RegexpMatchEasy0_32-4     323MB/s ± 5%   329MB/s ± 4%  +1.93%  (p=0.006 n=30+30)
RegexpMatchEasy0_1K-4    1.22GB/s ± 3%  1.22GB/s ± 3%    ~     (p=0.496 n=30+30)
RegexpMatchEasy1_32-4     298MB/s ± 4%   303MB/s ± 4%  +1.84%  (p=0.017 n=30+30)
RegexpMatchEasy1_1K-4     995MB/s ± 4%   989MB/s ± 4%    ~     (p=0.307 n=30+30)
RegexpMatchMedium_32-4   7.56MB/s ± 4%  7.74MB/s ± 4%  +2.46%  (p=0.000 n=22+30)
RegexpMatchMedium_1K-4   23.2MB/s ± 4%  23.4MB/s ± 3%    ~     (p=0.651 n=30+30)
RegexpMatchHard_32-4     14.2MB/s ± 4%  14.3MB/s ± 4%  +1.29%  (p=0.021 n=30+30)
RegexpMatchHard_1K-4     15.0MB/s ± 3%  14.9MB/s ± 4%    ~     (p=0.069 n=30+29)
Revcomp-4                 138MB/s ± 2%   138MB/s ± 2%    ~     (p=0.072 n=30+30)
Template-4               28.1MB/s ± 3%  28.4MB/s ± 3%  +1.05%  (p=0.012 n=30+30)
[Geo mean]               79.7MB/s       80.2MB/s       +0.60%

Change-Id: I44a1dfc942c9a385904553c4fe1fa8e509c8aa31
Reviewed-on: https://go-review.googlesource.com/120916
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-22 04:12:42 +00:00
Ben Shi 90f2fa0037 cmd/compile: optimize 386 code with MULLload/DIVSSload/DIVSDload
IMULL/DIVSS/DIVSD all can take the source operand from memory
directly. And this CL implement that optimization.

1. The total size of pkg/linux_386 decreases about 84KB (excluding
cmd/compile).

2. The go1 benchmark shows little regression in total (excluding noise).
name                     old time/op    new time/op    delta
BinaryTree17-4              3.29s ± 2%     3.27s ± 4%    ~     (p=0.192 n=30+30)
Fannkuch11-4                3.49s ± 2%     3.54s ± 1%  +1.48%  (p=0.000 n=30+30)
FmtFprintfEmpty-4          45.9ns ± 3%    46.3ns ± 4%  +0.89%  (p=0.037 n=30+30)
FmtFprintfString-4         78.8ns ± 3%    78.7ns ± 4%    ~     (p=0.209 n=30+27)
FmtFprintfInt-4            91.0ns ± 2%    90.3ns ± 2%  -0.82%  (p=0.031 n=30+27)
FmtFprintfIntInt-4          142ns ± 4%     143ns ± 4%    ~     (p=0.136 n=30+30)
FmtFprintfPrefixedInt-4     181ns ± 3%     183ns ± 4%  +1.40%  (p=0.005 n=30+30)
FmtFprintfFloat-4           404ns ± 4%     408ns ± 3%    ~     (p=0.397 n=30+30)
FmtManyArgs-4               601ns ± 3%     609ns ± 5%    ~     (p=0.059 n=30+30)
GobDecode-4                7.21ms ± 5%    7.24ms ± 5%    ~     (p=0.612 n=30+30)
GobEncode-4                6.91ms ± 6%    6.91ms ± 6%    ~     (p=0.797 n=30+30)
Gzip-4                      398ms ± 6%     399ms ± 4%    ~     (p=0.173 n=30+30)
Gunzip-4                   41.7ms ± 3%    41.8ms ± 3%    ~     (p=0.423 n=30+30)
HTTPClientServer-4         62.3µs ± 2%    62.7µs ± 3%    ~     (p=0.085 n=29+30)
JSONEncode-4               21.0ms ± 4%    20.7ms ± 5%  -1.39%  (p=0.014 n=30+30)
JSONDecode-4               66.3ms ± 3%    67.4ms ± 1%  +1.71%  (p=0.003 n=30+24)
Mandelbrot200-4            5.15ms ± 3%    5.16ms ± 3%    ~     (p=0.697 n=30+30)
GoParse-4                  3.24ms ± 3%    3.27ms ± 4%  +0.91%  (p=0.032 n=30+30)
RegexpMatchEasy0_32-4       101ns ± 5%      99ns ± 4%  -1.82%  (p=0.008 n=29+30)
RegexpMatchEasy0_1K-4       848ns ± 4%     841ns ± 2%  -0.77%  (p=0.043 n=30+30)
RegexpMatchEasy1_32-4       106ns ± 6%     106ns ± 3%    ~     (p=0.939 n=29+30)
RegexpMatchEasy1_1K-4      1.02µs ± 3%    1.03µs ± 4%    ~     (p=0.297 n=28+30)
RegexpMatchMedium_32-4      129ns ± 4%     127ns ± 4%    ~     (p=0.073 n=30+30)
RegexpMatchMedium_1K-4     43.9µs ± 3%    43.8µs ± 3%    ~     (p=0.186 n=30+30)
RegexpMatchHard_32-4       2.24µs ± 4%    2.22µs ± 4%    ~     (p=0.332 n=30+29)
RegexpMatchHard_1K-4       68.0µs ± 4%    67.5µs ± 3%    ~     (p=0.290 n=30+30)
Revcomp-4                   1.85s ± 3%     1.85s ± 3%    ~     (p=0.358 n=30+30)
Template-4                 69.6ms ± 3%    70.0ms ± 4%    ~     (p=0.273 n=30+30)
TimeParse-4                 445ns ± 3%     441ns ± 3%    ~     (p=0.494 n=30+30)
TimeFormat-4                412ns ± 3%     412ns ± 6%    ~     (p=0.841 n=30+30)
[Geo mean]                 66.7µs         66.8µs       +0.13%

name                     old speed      new speed      delta
GobDecode-4               107MB/s ± 5%   106MB/s ± 5%    ~     (p=0.615 n=30+30)
GobEncode-4               111MB/s ± 6%   111MB/s ± 6%    ~     (p=0.790 n=30+30)
Gzip-4                   48.8MB/s ± 6%  48.7MB/s ± 4%    ~     (p=0.167 n=30+30)
Gunzip-4                  465MB/s ± 3%   465MB/s ± 3%    ~     (p=0.420 n=30+30)
JSONEncode-4             92.4MB/s ± 4%  93.7MB/s ± 5%  +1.42%  (p=0.015 n=30+30)
JSONDecode-4             29.3MB/s ± 3%  28.8MB/s ± 1%  -1.72%  (p=0.003 n=30+24)
GoParse-4                17.9MB/s ± 3%  17.7MB/s ± 4%  -0.89%  (p=0.037 n=30+30)
RegexpMatchEasy0_32-4     317MB/s ± 8%   324MB/s ± 4%  +2.14%  (p=0.006 n=30+30)
RegexpMatchEasy0_1K-4    1.21GB/s ± 4%  1.22GB/s ± 2%  +0.77%  (p=0.036 n=30+30)
RegexpMatchEasy1_32-4     298MB/s ± 7%   299MB/s ± 4%    ~     (p=0.511 n=30+30)
RegexpMatchEasy1_1K-4    1.00GB/s ± 3%  1.00GB/s ± 4%    ~     (p=0.304 n=28+30)
RegexpMatchMedium_32-4   7.75MB/s ± 4%  7.82MB/s ± 4%    ~     (p=0.089 n=30+30)
RegexpMatchMedium_1K-4   23.3MB/s ± 3%  23.4MB/s ± 3%    ~     (p=0.181 n=30+30)
RegexpMatchHard_32-4     14.3MB/s ± 4%  14.4MB/s ± 4%    ~     (p=0.320 n=30+29)
RegexpMatchHard_1K-4     15.1MB/s ± 4%  15.2MB/s ± 3%    ~     (p=0.273 n=30+30)
Revcomp-4                 137MB/s ± 3%   137MB/s ± 3%    ~     (p=0.352 n=30+30)
Template-4               27.9MB/s ± 3%  27.7MB/s ± 4%    ~     (p=0.277 n=30+30)
[Geo mean]               79.9MB/s       80.1MB/s       +0.15%

Change-Id: I97333cd8ddabb3c7c88ca5aa9e14a005b74d306d
Reviewed-on: https://go-review.googlesource.com/120695
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-22 03:38:38 +00:00
Ben Shi 705f3c74e6 cmd/compile: optimize AMD64 with DIVSSload and DIVSDload
DIVSSload & DIVSDload directly operate on a memory operand. And
binary size can be reduced by them, while the performance is
not affected.

The total size of pkg/linux_amd64 (excluding cmd/compile) decreases
about 6KB.

There is little regression in the go1 benchmark test (excluding noise).
name                     old time/op    new time/op    delta
BinaryTree17-4              2.63s ± 4%     2.62s ± 4%    ~     (p=0.809 n=30+30)
Fannkuch11-4                2.40s ± 2%     2.40s ± 2%    ~     (p=0.109 n=30+30)
FmtFprintfEmpty-4          43.1ns ± 4%    43.2ns ± 9%    ~     (p=0.168 n=30+30)
FmtFprintfString-4         73.6ns ± 4%    74.1ns ± 4%    ~     (p=0.069 n=30+30)
FmtFprintfInt-4            81.0ns ± 3%    81.4ns ± 5%    ~     (p=0.350 n=30+30)
FmtFprintfIntInt-4          127ns ± 4%     129ns ± 4%  +0.99%  (p=0.021 n=30+30)
FmtFprintfPrefixedInt-4     156ns ± 4%     155ns ± 4%    ~     (p=0.415 n=30+30)
FmtFprintfFloat-4           219ns ± 4%     218ns ± 4%    ~     (p=0.071 n=30+30)
FmtManyArgs-4               522ns ± 3%     518ns ± 3%  -0.68%  (p=0.034 n=30+30)
GobDecode-4                6.49ms ± 6%    6.52ms ± 6%    ~     (p=0.832 n=30+30)
GobEncode-4                6.10ms ± 9%    6.14ms ± 7%    ~     (p=0.485 n=30+30)
Gzip-4                      227ms ± 1%     224ms ± 4%    ~     (p=0.484 n=24+30)
Gunzip-4                   37.2ms ± 3%    36.8ms ± 4%    ~     (p=0.889 n=30+30)
HTTPClientServer-4         58.9µs ± 1%    58.7µs ± 2%  -0.42%  (p=0.003 n=28+28)
JSONEncode-4               12.0ms ± 3%    12.0ms ± 4%    ~     (p=0.523 n=30+30)
JSONDecode-4               54.6ms ± 4%    54.5ms ± 4%    ~     (p=0.708 n=30+30)
Mandelbrot200-4            3.78ms ± 4%    3.81ms ± 3%  +0.99%  (p=0.016 n=30+30)
GoParse-4                  3.20ms ± 4%    3.20ms ± 5%    ~     (p=0.994 n=30+30)
RegexpMatchEasy0_32-4      77.0ns ± 4%    75.9ns ± 3%  -1.39%  (p=0.006 n=29+30)
RegexpMatchEasy0_1K-4       255ns ± 4%     253ns ± 4%    ~     (p=0.091 n=30+30)
RegexpMatchEasy1_32-4      69.7ns ± 3%    70.3ns ± 4%    ~     (p=0.120 n=30+30)
RegexpMatchEasy1_1K-4       373ns ± 2%     378ns ± 3%  +1.43%  (p=0.000 n=21+26)
RegexpMatchMedium_32-4      107ns ± 2%     108ns ± 4%  +1.50%  (p=0.012 n=22+30)
RegexpMatchMedium_1K-4     34.0µs ± 1%    34.3µs ± 3%  +1.08%  (p=0.008 n=24+30)
RegexpMatchHard_32-4       1.53µs ± 3%    1.54µs ± 3%    ~     (p=0.234 n=30+30)
RegexpMatchHard_1K-4       46.7µs ± 4%    47.0µs ± 4%    ~     (p=0.420 n=30+30)
Revcomp-4                   411ms ± 7%     415ms ± 6%    ~     (p=0.059 n=30+30)
Template-4                 65.5ms ± 5%    66.9ms ± 4%  +2.21%  (p=0.001 n=30+30)
TimeParse-4                 317ns ± 3%     311ns ± 3%  -1.97%  (p=0.000 n=30+30)
TimeFormat-4                293ns ± 3%     294ns ± 3%    ~     (p=0.243 n=30+30)
[Geo mean]                 47.4µs         47.5µs       +0.17%

name                     old speed      new speed      delta
GobDecode-4               118MB/s ± 5%   118MB/s ± 6%    ~     (p=0.832 n=30+30)
GobEncode-4               125MB/s ± 7%   125MB/s ± 7%    ~     (p=0.625 n=29+30)
Gzip-4                   85.3MB/s ± 1%  86.6MB/s ± 4%    ~     (p=0.486 n=24+30)
Gunzip-4                  522MB/s ± 3%   527MB/s ± 4%    ~     (p=0.889 n=30+30)
JSONEncode-4              162MB/s ± 3%   162MB/s ± 4%    ~     (p=0.520 n=30+30)
JSONDecode-4             35.5MB/s ± 4%  35.6MB/s ± 4%    ~     (p=0.701 n=30+30)
GoParse-4                18.1MB/s ± 4%  18.1MB/s ± 4%    ~     (p=0.891 n=29+30)
RegexpMatchEasy0_32-4     416MB/s ± 4%   422MB/s ± 3%  +1.43%  (p=0.005 n=29+30)
RegexpMatchEasy0_1K-4    4.01GB/s ± 4%  4.04GB/s ± 4%    ~     (p=0.091 n=30+30)
RegexpMatchEasy1_32-4     460MB/s ± 3%   456MB/s ± 5%    ~     (p=0.123 n=30+30)
RegexpMatchEasy1_1K-4    2.74GB/s ± 2%  2.70GB/s ± 3%  -1.33%  (p=0.000 n=22+26)
RegexpMatchMedium_32-4   9.39MB/s ± 3%  9.19MB/s ± 4%  -2.06%  (p=0.001 n=28+30)
RegexpMatchMedium_1K-4   30.1MB/s ± 1%  29.8MB/s ± 3%  -1.04%  (p=0.008 n=24+30)
RegexpMatchHard_32-4     20.9MB/s ± 3%  20.8MB/s ± 3%    ~     (p=0.234 n=30+30)
RegexpMatchHard_1K-4     21.9MB/s ± 4%  21.8MB/s ± 4%    ~     (p=0.420 n=30+30)
Revcomp-4                 619MB/s ± 7%   612MB/s ± 7%    ~     (p=0.059 n=30+30)
Template-4               29.6MB/s ± 4%  29.0MB/s ± 4%  -2.16%  (p=0.002 n=30+30)
[Geo mean]                123MB/s        123MB/s       -0.33%

Change-Id: Ia59e077feae4f2824df79059daea4d0f678e3e4c
Reviewed-on: https://go-review.googlesource.com/120275
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-08-22 02:47:49 +00:00
Ian Lance Taylor 7e64377903 cmd/compile: only support -race and -msan where they work
Consolidate decision about whether -race and -msan options are
supported in cmd/internal/sys. Use consolidated functions in
cmd/compile and cmd/go. Use a copy of them in cmd/dist; cmd/dist can't
import cmd/internal/sys because Go 1.4 doesn't have it.

Fixes #24315

Change-Id: I9cecaed4895eb1a2a49379b4848db40de66d32a9
Reviewed-on: https://go-review.googlesource.com/121816
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-21 03:38:27 +00:00
Ilya Tocar a3593685cf cmd/compile/internal/ssa: remove useless zero extension
We generate MOVBLZX for byte-sized LoadReg, so
(MOVBQZX (LoadReg (Arg))) is the same as
(LoadReg (Arg)). Remove those zero extension where possible.
Triggers several times during all.bash.

Fixes #25378
Updates #15300

Change-Id: If50656e66f217832a13ee8f49c47997f4fcc093a
Reviewed-on: https://go-review.googlesource.com/115617
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-20 21:38:20 +00:00
Ilya Tocar c292b32f33 cmd/compile: enable disjoint memmove inlining on amd64
Memmove can use AVX/prefetches/other optional instructions, so
only do it for small sizes, when call overhead dominates.

Change-Id: Ice5e93deb11462217f7fb5fc350b703109bb4090
Reviewed-on: https://go-review.googlesource.com/112517
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
2018-08-20 21:10:12 +00:00