Commit Graph

2283 Commits

Author SHA1 Message Date
Mikio Hara a0d330f1dd runtime, runtime/cgo: revert CL 18814; don't drop signal stack in new thread on dragonfly
This change reverts CL 18814 which is a workaroud for older DragonFly
BSD kernels, and fixes #13945 and #13947 in a more general way the
same as other platforms except NetBSD.

This is a followup to CL 29491.

Updates #16329.

Change-Id: I771670bc672c827f2b3dbc7fd7417c49897cb991
Reviewed-on: https://go-review.googlesource.com/29971
Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-09-28 14:45:06 +00:00
Ian Lance Taylor eb268cb321 runtime: minor simplifications to signal code
Change setsig, setsigstack, getsig, raise, raiseproc to take uint32 for
signal number parameter, as that is the type mostly used for signal
numbers.  Same for dieFromSignal, sigInstallGoHandler, raisebadsignal.

Remove setsig restart parameter, as it is always either true or
irrelevant.

Don't check the handler in setsigstack, as the only caller does that
anyhow.

Don't bother to convert the handler from sigtramp to sighandler in
getsig, as it will never be called when the handler is sigtramp or
sighandler.

Don't check the return value from rt_sigaction in the GNU/Linux version
of setsigstack; no other setsigstack checks it, and it never fails.

Change-Id: I6bbd677e048a77eddf974dd3d017bc3c560fbd48
Reviewed-on: https://go-review.googlesource.com/29953
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-28 13:12:47 +00:00
Alex Brainman db82cf4e50 runtime: use RtlGenRandom instead of CryptGenRandom
This change replaces the use of CryptGenRandom with RtlGenRandom in
Windows to generate cryptographically random numbers during process
startup. RtlGenRandom uses the same RNG as CryptGenRandom, but it has many
fewer DLL dependencies and so does not affect process startup time as
much.

This makes running simple Go program on my computers faster.

Windows XP:
benchmark                      old ns/op     new ns/op     delta
BenchmarkRunningGoProgram-2     47408573      10784148      -77.25%

Windows 7 (VM):
benchmark                    old ns/op     new ns/op     delta
BenchmarkRunningGoProgram     16260390      12792150      -21.33%

Windows 7:
benchmark                      old ns/op     new ns/op     delta
BenchmarkRunningGoProgram-2     13600778      10050574      -26.10%

Fixes #15589

Change-Id: I2816239a2056e3d4a6dcd86a6fa2bb619c6008fe
Reviewed-on: https://go-review.googlesource.com/29700
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-28 05:21:53 +00:00
Ian Lance Taylor fdc167164e runtime: remove sigmask type, use sigset instead
The OS-independent sigmask type was not pulling its weight. Replace it
with the OS-dependent sigset type. This requires adding an OS-specific
sigaddset function, but permits removing the OS-specific sigmaskToSigset
function.

Change-Id: I43307b512b0264ec291baadaea902f05ce212305
Reviewed-on: https://go-review.googlesource.com/29950
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-27 21:33:44 +00:00
Ian Lance Taylor 097a581dc0 runtime: simplify signalstack by dropping nil as argument
Change the two calls to signalstack(nil) to inline the code
instead (it's two lines).

Change-Id: Ie92a05494f924f279e40ac159f1b677fda18f281
Reviewed-on: https://go-review.googlesource.com/29854
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-27 17:08:29 +00:00
Elias Naur 60482d8a8b runtime: relax SetFinalizer documentation to allow &local
The SetFinalizer documentation states that

"The argument obj must be a pointer to an object allocated by calling
new or by taking the address of a composite literal."

which precludes pointers to local variables. According to a comment
on #6591, this case is expected to work. This CL updates the documentation
for SetFinalizer accordingly.

Fixes #6591

Change-Id: Id861b3436bc1c9521361ea2d51c1ce74a121c1af
Reviewed-on: https://go-review.googlesource.com/29592
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-09-27 16:38:47 +00:00
Michael Munday 3436f0776f reflect, runtime: optimize Value.Call on s390x and add benchmark
Use an MVC loop to copy arguments in runtime.call* rather than copying
bytes individually.

I've added the benchmark CallArgCopy to test the speed of Value.Call
for various argument sizes.

name                    old speed      new speed       delta
CallArgCopy/size=128     439MB/s ± 1%    582MB/s ± 1%   +32.41%  (p=0.000 n=10+10)
CallArgCopy/size=256     695MB/s ± 1%   1172MB/s ± 1%   +68.67%  (p=0.000 n=10+10)
CallArgCopy/size=1024    573MB/s ± 8%   4175MB/s ± 2%  +628.11%  (p=0.000 n=10+10)
CallArgCopy/size=4096   1.46GB/s ± 2%  10.19GB/s ± 1%  +600.52%  (p=0.000 n=10+10)
CallArgCopy/size=65536  1.51GB/s ± 0%  12.30GB/s ± 1%  +716.30%   (p=0.000 n=9+10)

Change-Id: I87dae4809330e7964f6cb4a9e40e5b3254dd519d
Reviewed-on: https://go-review.googlesource.com/28096
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-27 16:00:19 +00:00
Cherry Zhang 9d4b40f55d runtime, cmd/compile: implement and use DUFFCOPY on ARM64
Change-Id: I8984eac30e5df78d4b94f19412135d3cc36969f8
Reviewed-on: https://go-review.googlesource.com/29910
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-09-27 15:07:31 +00:00
Austin Clements f8b2314c56 runtime: optimize defer code
This optimizes deferproc and deferreturn in various ways.

The most important optimization is that it more carefully arranges to
prevent preemption or stack growth. Currently we do this by switching
to the system stack on every deferproc and every deferreturn. While we
need to be on the system stack for the slow path of allocating and
freeing defers, in the common case we can fit in the nosplit stack.
Hence, this change pushes the system stack switch down into the slow
paths and makes everything now exposed to the user stack nosplit. This
also eliminates the need for various acquirem/releasem pairs, since we
are now preventing preemption by preventing stack split checks.

As another smaller optimization, we special case the common cases of
zero-sized and pointer-sized defer frames to respectively skip the
copy and perform the copy in line instead of calling memmove.

This speeds up the runtime defer benchmark by 42%:

name           old time/op  new time/op  delta
Defer-4        75.1ns ± 1%  43.3ns ± 1%  -42.31%   (p=0.000 n=8+10)

In reality, this speeds up defer by about 2.2X. The two benchmarks
below compare a Lock/defer Unlock pair (DeferLock) with a Lock/Unlock
pair (NoDeferLock). NoDeferLock establishes a baseline cost, so these
two benchmarks together show that this change reduces the overhead of
defer from 61.4ns to 27.9ns.

name           old time/op  new time/op  delta
DeferLock-4    77.4ns ± 1%  43.9ns ± 1%  -43.31%  (p=0.000 n=10+10)
NoDeferLock-4  16.0ns ± 0%  15.9ns ± 0%   -0.39%    (p=0.000 n=9+8)

This also shaves 34ns off cgo calls:

name       old time/op  new time/op  delta
CgoNoop-4   122ns ± 1%  88.3ns ± 1%  -27.72%  (p=0.000 n=8+9)

Updates #14939, #16051.

Change-Id: I2baa0dea378b7e4efebbee8fca919a97d5e15f38
Reviewed-on: https://go-review.googlesource.com/29656
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-26 22:01:35 +00:00
Austin Clements d211c2d377 runtime: implement getcallersp in Go
This makes it possible to inline getcallersp. getcallersp is on the
hot path of defers, so this slightly speeds up defer:

name           old time/op  new time/op  delta
Defer-4        78.3ns ± 2%  75.1ns ± 1%  -4.00%   (p=0.000 n=9+8)

Updates #14939.

Change-Id: Icc1cc4cd2f0a81fc4c8344432d0b2e783accacdd
Reviewed-on: https://go-review.googlesource.com/29655
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-26 22:01:32 +00:00
Austin Clements aaf4099a5c runtime: update malloc.go documentation
The big documentation comment at the top of malloc.go has gotten
woefully out of date. Update it.

Change-Id: Ibdb1bdcfdd707a6dc9db79d0633a36a28882301b
Reviewed-on: https://go-review.googlesource.com/29731
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-09-26 22:00:53 +00:00
Austin Clements f67c9de656 runtime: document MemStats
This documents all fields in MemStats and more clearly documents where
mstats differs from MemStats.

Fixes #15849.

Change-Id: Ie09374bcdb3a5fdd2d25fe4bba836aaae92cb1dd
Reviewed-on: https://go-review.googlesource.com/28972
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2016-09-26 22:00:50 +00:00
Austin Clements 2098e5d39a runtime: eliminate memstats.heap_reachable
We used to compute an estimate of the reachable heap size that was
different from the marked heap size. This ultimately caused more
problems than it solved, so we pulled it out, but memstats still has
both heap_reachable and heap_marked, and there are some leftover TODOs
about the problems with this estimate.

Clean this up by eliminating heap_reachable in favor of heap_marked
and deleting the stale TODOs.

Change-Id: I713bc20a7c90683d2b43ff63c0b21a440269cc4d
Reviewed-on: https://go-review.googlesource.com/29271
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-09-26 22:00:47 +00:00
Austin Clements ec9c84c884 runtime: disentangle next_gc from GC trigger
Back in Go 1.4, memstats.next_gc was both the heap size at which GC
would trigger, and the size GC kept the heap under. When we switched
to concurrent GC in Go 1.5, we got somewhat confused and made this
variable the trigger heap size, while gcController.heapGoal became the
goal heap size.

memstats.next_gc is exposed to the user via MemStats.NextGC, while
gcController.heapGoal is not. This is unfortunate because 1) the heap
goal is far more useful for diagnostics, and 2) the trigger heap size
is just part of the GC trigger heuristic, which means it wouldn't be
useful to an application even if it tried to use it.

We never noticed this mess because MemStats.NextGC is practically
undocumented. Now that we're trying to document MemStats, it became
clear that this field had diverged from its original usefulness.

Clean up this mess by shuffling things back around so that next_gc is
the goal heap size and the new (unexposed) memstats.gc_trigger field
is the trigger heap size. This eliminates gcController.heapGoal.

Updates #15849.

Change-Id: I2cbbd43b1d78bdf613cb43f53488bd63913189b7
Reviewed-on: https://go-review.googlesource.com/29270
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-09-26 22:00:44 +00:00
Ian Lance Taylor e2e11f02a4 runtime: unify Unix implementations of unminit
Change-Id: I2cbb13eb85876ad05a52cbd498a9b86e7a28899c
Reviewed-on: https://go-review.googlesource.com/29772
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-26 19:33:26 +00:00
Ian Lance Taylor ac24388e5e runtime: merge setting new signal mask in minit
All the variants that sets the new signal mask in minit do the same
thing, so merge them. This requires an OS-specific sigdelset function;
the function already exists for linux, and is now added for other OS's.

Change-Id: Ie96f6f02e2cf09c43005085985a078bd9581f670
Reviewed-on: https://go-review.googlesource.com/29771
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-26 18:29:32 +00:00
Ian Lance Taylor c2735039f3 runtime: unify sigtrampgo
Combine the various versions of sigtrampgo into a single function in
signal_unix.go. This requires defining a fixsigcode method on sigctxt
for all operating systems; it only does something on Darwin. This also
requires changing the darwin/amd64 signal handler to call sigreturn
itself, rather than relying on sigtrampgo to call sigreturn for it. We
can then drop the Darwin sigreturn function, as it is no longer used.

Change-Id: I5a0b9d2d2c141957e151b41e694efeb20e4b4b9a
Reviewed-on: https://go-review.googlesource.com/29761
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-26 17:22:42 +00:00
Ian Lance Taylor d15295c679 runtime: unify handling of alternate signal stack
Change all Unix systems to use stackt for the alternate signal
stack (some were using sigaltstackt). Add OS-specific setSignalstackSP
function to handle different types for ss_sp field, and unify all
OS-specific signalstack functions into one. Unify handling of alternate
signal stack in OS-specific minit and sigtrampgo functions via new
functions minitSignalstack and setGsignalStack.

Change-Id: Idc316dc69b1dd725717acdf61a1cd8b9f33ed174
Reviewed-on: https://go-review.googlesource.com/29757
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-26 04:07:31 +00:00
Ian Lance Taylor f05cd4cde5 runtime: simplify conditions testing g.paniconfault
Implement a comment by Ralph Corderoy on CL 29754.

Change-Id: I22bbede211ddcb8a057f16b4f47d335a156cc8d2
Reviewed-on: https://go-review.googlesource.com/29756
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-25 20:29:50 +00:00
Ian Lance Taylor 343bec53c7 runtime: merge sigpanic_unix.go into signal_unix.go
Change-Id: Iba541045b4878405834c637095627631b6559a35
Reviewed-on: https://go-review.googlesource.com/29754
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-25 19:27:05 +00:00
Dmitry Vyukov 38765eba73 runtime/race: don't crash on invalid PCs
Currently raceSymbolizeCode uses funcline, which is internal runtime
function which crashes on incorrect PCs. Use FileLine instead,
it is public and does not crash on invalid data.

Note: FileLine returns "?" file on failure. That string is not NUL-terminated,
so we need to additionally check what FileLine returns.

Fixes #17190

Change-Id: Ic6fbd4f0e68ddd52e9b2dd25e625b50adcb69a98
Reviewed-on: https://go-review.googlesource.com/29714
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-09-25 12:22:04 +00:00
Dmitry Vyukov c14050646f runtime: fix newextram PC passed to race detector
PC passed to racegostart is expected to be a return PC
of the go statement. Race runtime will subtract 1 from the PC
before symbolization. Passing start PC of a function is wrong.
Add sys.PCQuantum to the function start PC.

Update #17190

Change-Id: Ia504c49e79af84ed4ea360c2aea472b370ea8bf5
Reviewed-on: https://go-review.googlesource.com/29712
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-09-25 12:15:40 +00:00
Ian Lance Taylor 159a90b93a runtime: merge Unix sighandler functions
Replace all the Unix sighandler functions with a single instance.
Push the relatively small amount of processor-specific code into five
methods on sigctxt: sigpc, sigsp, siglr, fault, preparePanic.
(Some processors already had a fault method.)

Change-Id: Ib459412ff8f7e0f5ad06bfd43eb827c8b196fc32
Reviewed-on: https://go-review.googlesource.com/29752
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-09-25 03:55:33 +00:00
Keith Randall 60074b0fd3 runtime: remove TestCollisions from -short
Takes a bit too long to run it all the time.

Fixes #17217
Update #17104

Change-Id: I4802190ea16ee0f436a7f95b093ea0f995f5b11d
Reviewed-on: https://go-review.googlesource.com/29751
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-24 03:10:13 +00:00
Ian Lance Taylor ab552aa3b6 runtime: unify some signal handling functions
Unify the OS-specific versions of msigsave, msigrestore, sigblock,
updatesigmask, and unblocksig into single versions in signal_unix.go.
To do this, make sigprocmask work the same way on all systems, which
required adding a definition of sigprocmask for linux and openbsd.
Also add a single OS-specific function sigmaskToSigset.

Change-Id: I7cbf75131dddb57eeefe648ef845b0791404f785
Reviewed-on: https://go-review.googlesource.com/29689
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-09-24 01:39:48 +00:00
David Crawshaw ed915ad421 runtime: use sched_yield instead of pthread_yield
Attempt to fix the linux-amd64-clang builder, which broke
with CL 29472.

Turns out pthread_yield is a non-portable Linux function, and
should have #define _GNU_SOURCE before #include <pthread.h>.
GCC doesn't complain about this, but Clang does:

	./raceprof.go:44:3: warning: implicit declaration of function 'pthread_yield' is invalid in C99 [-Wimplicit-function-declaration]

(Though the error, while explicable, certainly could be clearer.)

There is a portable POSIX equivalent, sched_yield, so this
CL uses it instead.

Change-Id: I58ca7a3f73a2b3697712fdb02e72a8027c391169
Reviewed-on: https://go-review.googlesource.com/29675
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-23 04:32:38 +00:00
David Crawshaw b4c9829c22 runtime: check plugin-loaded moduledata addresses
Inspired by difficulties with plugin support on darwin.

Change-Id: I2cef8410837946454e75d00e94e46791f03f2267
Reviewed-on: https://go-review.googlesource.com/29391
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-09-23 02:16:17 +00:00
Ian Lance Taylor 9d8522fdc7 cmd/compile: don't instrument copy and append in runtime
Instrumenting copy and append for the race detector changes them to call
different functions. In the runtime package the alternate functions are
not marked as nosplit. This caused a crash in the SIGPROF handler when
invoked on a non-Go thread in a program built with the race detector. In
some cases the handler can call copy, the race detector changed that to
a call to a non-nosplit function, the function tried to check the stack
guard, and crashed because it was running on a non-Go thread. The
SIGPROF handler is written carefully to avoid such problems, but hidden
function calls are difficult to avoid.

Fix this by changing the compiler to not instrument copy and append when
compiling the runtime package. Change the runtime package to add
explicit race checks for the only code I could find where copy is used
to write to user data (append is never used).

Change-Id: I11078a66c0aaa459a7d2b827b49f4147922050af
Reviewed-on: https://go-review.googlesource.com/29472
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2016-09-22 23:37:17 +00:00
Ian Lance Taylor 9e028b70ea runtime: merge signal[12]_unix.go into signal_unix.go
Requires adding a sigfwd function for Solaris, as previously
signal2_unix.go was not built for Solaris.

Change-Id: Iea3ff0ddfa15af573813eb075bead532b324a3fc
Reviewed-on: https://go-review.googlesource.com/29550
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-21 23:04:34 +00:00
Mikio Hara 5db80c30e6 runtime: revert CL 18835; don't install new signal stack unconditionally on dragonfly
This change reverts CL 18835 which is a workaroud for older DragonFly
BSD kernels, and fixes #14051, #14052 and #14067 in a more general way
the same as other platforms except NetBSD.

This change also bumps the minimum required version of DragonFly BSD
kernel to 4.4.4.

Fixes #16329.

Change-Id: I0b44b6afa675f5ed9523914226bd9ec4809ba5ae
Reviewed-on: https://go-review.googlesource.com/29491
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-09-21 22:18:06 +00:00
Lynn Boger b4efd09d18 cmd/link: split large elf text sections on ppc64x
Some applications built with Go on ppc64x with external linking
can fail to link with relocation truncation errors if the elf
text section that is generated is larger than 2^26 bytes and that
section contains a call instruction (bl) which calls a function
beyond the limit addressable by the 24 bit field in the
instruction.

This solution consists of generating multiple text sections where
each is small enough to allow the GNU linker to resolve the calls
by generating long branch code where needed.  Other changes were added
to handle differences in processing when multiple text sections exist.

Some adjustments were required to the computation of a method's address
when using the method offset table when there are multiple text sections.

The number of possible section headers was increased to allow for up
to 128 text sections.  A test case was also added.

Fixes #15823.

Change-Id: If8117b0e0afb058cbc072258425a35aef2363c92
Reviewed-on: https://go-review.googlesource.com/27790
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-09-21 20:23:49 +00:00
Austin Clements c03925edd3 runtime: remove unnecessary atomics from heapBitSetType
These used to be necessary when racing with updates to the mark bit,
but since the mark bit is no longer in the bitmap and the checkmark is
only updated with the world stopped, we can now always use regular
writes to update the type information in the heap bitmap.

Somewhat surprisingly, this has basically no overall performance
effect beyond the usual noise, but it does clean up the code.

Change-Id: I3933d0b4c0bc1c9bcf6313613515c0b496212105
Reviewed-on: https://go-review.googlesource.com/29277
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-09-21 15:08:16 +00:00
Austin Clements ab59235729 runtime: consistency check for G rescan position
Issue #17099 shows a failure that indicates we rescanned a stack twice
concurrently during mark termination, which suggests that the rescan
list became inconsistent. Add a simple check when we dequeue something
from the rescan list that it claims to be at the index where we found
it.

Change-Id: I6a267da4154a2e7b7d430cb4056e6bae978eaf62
Reviewed-on: https://go-review.googlesource.com/29280
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-09-20 18:37:32 +00:00
Austin Clements 39ce6eb9ec runtime: report GCSys and OtherSys in heap profile
The comment block at the end of the heap profile includes *almost*
everything from MemStats. Add the missing fields. These are useful for
debugging RSS that has gone to GC-internal data structures.

Change-Id: I0ee8a918d49629e28fd8fd2bf6861c4529461c24
Reviewed-on: https://go-review.googlesource.com/29276
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-09-20 18:37:29 +00:00
Cherry Zhang 38cd79889e cmd/compile: simplify div/mod on ARM
On ARM, DIV, DIVU, MOD, MODU are pseudo instructions that makes
runtime calls _div/_udiv/_mod/_umod, which themselves are wrappers
of udiv. The udiv function does the real thing.

Instead of generating these pseudo instructions, call to udiv
directly. This removes one layer of wrappers (which has an awkward
way of passing argument), and also allows combining DIV and MOD
if both results are needed.

Change-Id: I118afc3986db3a1daabb5c1e6e57430888c91817
Reviewed-on: https://go-review.googlesource.com/29390
Reviewed-by: David Chase <drchase@google.com>
2016-09-20 13:40:48 +00:00
Keith Randall 6129f37367 cmd/compile: inline convT2{I,E} when result doesn't escape
No point in calling a function when we can build the interface
using a known type (or itab) and the address of a local.

Get rid of third arg (preallocated stack space) to convT2{I,E}.

Makes go binary smaller by 0.2%

benchmark                   old ns/op     new ns/op     delta
BenchmarkEfaceInteger-8     16.7          10.1          -39.52%

Update #17118
Update #15375

Change-Id: I9724a1f802bfa1e3957bf1856b55558278e198a2
Reviewed-on: https://go-review.googlesource.com/29373
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-09-19 02:37:08 +00:00
David Crawshaw 0cbb12f0bb plugin: new package for loading plugins
Includes a linux implementation.

Change-Id: Iacc2ed7da760ae9deebc928adf2b334b043b07ec
Reviewed-on: https://go-review.googlesource.com/27823
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-09-16 17:54:40 +00:00
David Crawshaw 8607bed744 runtime: avoid dependence on main symbol
For -buildmode=plugin, this lets the linker drop the main.main symbol
out of the binary while including most of the runtime.

(In the future it should be possible to drop the entire runtime
package from plugins.)

Change-Id: I3e7a024ddf5cc945e3d8b84bf37a0b7cb2a00eb6
Reviewed-on: https://go-review.googlesource.com/27821
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-09-16 14:49:27 +00:00
David Crawshaw eced6754c2 cmd/link: -buildmode=plugin support for linux
This CL contains several linker changes to support creating plugins.

It collects the exported plugin symbols provided by the compiler and
includes them in the moduledata.

It treats a binary as being dynamically linked if it imports the plugin
package. This lets the dynamic linker de-duplicate symbols.

Change-Id: I099b6f38dda26306eba5c41dbe7862f5a5918d95
Reviewed-on: https://go-review.googlesource.com/27820
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-09-16 14:49:13 +00:00
Matthew Dempsky 27eebbabc2 cmd/compile, runtime: remove throwreturn
Change-Id: If8d27cf1cd8d650ed0ba332448d3174d80b6b0ca
Reviewed-on: https://go-review.googlesource.com/29217
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-15 12:13:34 +00:00
Martin Möhrmann 150de948ee cmd/compile: intrinsify slicebytetostringtmp when not instrumenting
when not instrumenting:
- Intrinsify uses of slicebytetostringtmp within the runtime package
  in the ssa backend.
- Pass OARRAYBYTESTRTMP nodes to the compiler backends for lowering
  instead of generating calls to slicebytetostringtmp.

name                    old time/op  new time/op  delta
ConcatStringAndBytes-4  27.9ns ± 2%  24.7ns ± 2%  -11.52%  (p=0.000 n=43+43)

Fixes #17044

Change-Id: I51ce9c3b93284ce526edd0234f094e98580faf2d
Reviewed-on: https://go-review.googlesource.com/29017
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-14 21:58:14 +00:00
Josh Bleecher Snyder 9980b70cb4 runtime: limit the number of map overflow buckets
Consider repeatedly adding many items to a map
and then deleting them all, as in #16070. The map
itself doesn't need to grow above the high water
mark of number of items. However, due to random
collisions, the map can accumulate overflow
buckets.

Prior to this CL, those overflow buckets were
never removed, which led to a slow memory leak.

The problem with removing overflow buckets is
iterators. The obvious approach is to repack
keys and values and eliminate unused overflow
buckets. However, keys, values, and overflow
buckets cannot be manipulated without disrupting
iterators.

This CL takes a different approach, which is to
reuse the existing map growth mechanism,
which is well established, well tested, and
safe in the presence of iterators.
When a map has accumulated enough overflow buckets
we trigger map growth, but grow into a map of the
same size as before. The old overflow buckets will
be left behind for garbage collection.

For the code in #16070, instead of climbing
(very slowly) forever, memory usage now cycles
between 264mb and 483mb every 15 minutes or so.

To avoid increasing the size of maps,
the overflow bucket counter is only 16 bits.
For large maps, the counter is incremented
stochastically.

Fixes #16070

Change-Id: If551d77613ec6836907efca58bda3deee304297e
Reviewed-on: https://go-review.googlesource.com/25049
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-13 17:53:32 +00:00
Cherry Zhang 38d35e714a cmd/compile, runtime/internal/atomic: intrinsify And8, Or8 on ARM64
Also add assembly implementation, in case intrinsics is disabled.

Change-Id: Iff0a8a8ce326651bd29f6c403f5ec08dd3629993
Reviewed-on: https://go-review.googlesource.com/28979
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-13 02:09:15 +00:00
Keith Randall d00a3cead8 runtime: make gdb test resilient to line numbering
Don't break on line number, instead break on the actual call.
This makes the test more robust to line numbering changes in the backend.

A CL (28950) changed the generated code line numbering slightly.  A MOVW
$0, R0 instruction at the start of the function changed to line
10 (because several constant zero instructions got CSEd, and one gets
picked arbitrarily).  That's too fragile for a test.

Change-Id: I5d6a8ef0603de7d727585004142780a527e70496
Reviewed-on: https://go-review.googlesource.com/29085
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-12 23:13:12 +00:00
Michael Munday f1515a01fd runtime, math/big: allow R0 on s390x to contain values other than 0
The new SSA backend for s390x can use R0 as a general purpose register.
This change modifies assembly code to either avoid using R0 entirely
or explicitly set R0 to 0.

R0 can still be safely used as 0 in address calculations.

Change-Id: I3efa723e9ef322a91a408bd8c31768d7858526c8
Reviewed-on: https://go-review.googlesource.com/28976
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-12 18:06:01 +00:00
Michael Munday d38d59ffb5 runtime: fix SIGILL in checkvectorfacility on s390x
STFLE does not necessarily write to all the double-words that are
requested. It is therefore necessary to clear the target memory
before calling STFLE in order to ensure that the facility list does
not contain false positives.

Fixes #17032.

Change-Id: I7bec9ade7103e747b72f08562fe57e6f091bd89f
Reviewed-on: https://go-review.googlesource.com/28850
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-09 00:34:22 +00:00
Cherry Zhang 14ff7cc94c runtime/cgo: fix callback on big-endian MIPS64
Use MOVW, instead of MOVV, to pass an int32 arg. Also no need to
restore arg registers.

Fix big-endian MIPS64 build.

Change-Id: Ib43c71075c988153e5e5c5c6e7297b3fee28652a
Reviewed-on: https://go-review.googlesource.com/28830
Reviewed-by: Minux Ma <minux@golang.org>
2016-09-09 00:30:28 +00:00
Josh Bleecher Snyder 07bcc16547 runtime: simplify getargp
Change-Id: I9ed62e8a6d8b9204c18748efd7845adabf3460b9
Reviewed-on: https://go-review.googlesource.com/28775
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-08 17:10:22 +00:00
Martin Möhrmann 252093f120 runtime: remove maxstring
Before this CL the runtime prevented printing of overlong strings with the print
function when the length of the string was determined to be corrupted.
Corruption was checked by comparing the string size against the limit
which was stored in maxstring.

However maxstring was not updated everywhere were go strings were created
e.g. for string constants during compile time. Thereby the check for maximum
string length prevented the printing of some valid strings.

The protection maxstring provided did not warrant the bookkeeping
and global synchronization needed to keep maxstring updated to the
correct limit everywhere.

Fixes #16999

Change-Id: I62cc2f4362f333f75b77f199ce1a71aac0ff7aeb
Reviewed-on: https://go-review.googlesource.com/28813
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-08 15:57:01 +00:00
Ilya Tocar 0cff219c12 strings: use AVX2 for Index if available
IndexHard4-4      1.50ms ± 2%  0.71ms ± 0%  -52.36%  (p=0.000 n=20+19)

This also fixes a bug, that caused a string of length 16 to use
two 8-byte comparisons instead of one 16-byte. And adds a test for
cases when partial_match fails.

Change-Id: I1ee8fc4e068bb36c95c45de78f067c822c0d9df0
Reviewed-on: https://go-review.googlesource.com/22551
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-07 10:43:13 +00:00
Austin Clements 8259cf3c72 runtime/debug: enable TestFreeOSMemory on all arches
TestFreeOSMemory was disabled on many arches because of issue #9993.
Since that's been fixed, enable the test everywhere.

Change-Id: I298c38c3e04128d9c8a1f558980939d5699bea03
Reviewed-on: https://go-review.googlesource.com/27403
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
2016-09-06 21:05:58 +00:00
Austin Clements 1b9499b069 syscall: make Getpagesize return page size from runtime
syscall.Getpagesize currently returns hard-coded page sizes on all
architectures (some of which are probably always wrong, and some of
which are definitely not always right). The runtime now has this
information, queried from the OS during runtime init, so make
syscall.Getpagesize return the page size that the runtime knows.

Updates #10180.

Change-Id: I4daa6fbc61a2193eb8fa9e7878960971205ac346
Reviewed-on: https://go-review.googlesource.com/25051
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-06 21:05:55 +00:00
Austin Clements 6dda7b2f5f runtime: don't hard-code physical page size
Now that the runtime fetches the true physical page size from the OS,
make the physical page size used by heap growth a variable instead of
a constant. This isn't used in any performance-critical paths, so it
shouldn't be an issue.

sys.PhysPageSize is also renamed to sys.DefaultPhysPageSize to make it
clear that it's not necessarily the true page size. There are no uses
of this constant any more, but we'll keep it around for now.

Updates #12480 and #10180.

Change-Id: I6c23b9df860db309c38c8287a703c53817754f03
Reviewed-on: https://go-review.googlesource.com/25022
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-09-06 21:05:53 +00:00
Austin Clements 276a52de55 runtime: fetch physical page size from the OS
Currently the physical page size assumed by the runtime is hard-coded.
On Linux the runtime at least fetches the OS page size during init and
sanity checks against the hard-coded value, but they may still differ.
On other OSes we wouldn't even notice.

Add support on all OSes to fetch the actual OS physical page size
during runtime init and lift the sanity check of PhysPageSize from the
Linux init code to general malloc init. Currently this is the only use
of the retrieved page size, but we'll add more shortly.

Updates #12480 and #10180.

Change-Id: I065f2834bc97c71d3208edc17fd990ec9058b6da
Reviewed-on: https://go-review.googlesource.com/25050
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-09-06 21:05:50 +00:00
Austin Clements d7de8b6d23 runtime: assume 64kB physical pages on ARM
Currently we assume the physical page size on ARM is 4kB. While this
is usually true, the architecture also supports 16kB and 64kB physical
pages, and Linux (and possibly other OSes) can be configured to use
these larger page sizes.

With Go 1.6, such a configuration could potentially run, but generally
resulted in memory corruption or random panics. With current master,
this configuration will cause the runtime to panic during init on
Linux when it checks the true physical page size (and will still cause
corruption or panics on other OSes).

However, the assumed physical page size only has to be a multiple of
the true physical page size, the scavenger can now deal with large
physical page sizes, and the rest of the runtime can deal with a
larger assumed physical page size than the true size. Hence, there's
little disadvantage to conservatively setting the assumed physical
page size to 64kB on ARM.

This may result in some extra memory use, since we can only return
memory at multiples of the assumed physical page size. However, it is
a simple change that should make Go run on systems configured for
larger page sizes. The following commits will make the runtime query
the actual physical page size from the OS, but this is a simple step
there.

Updates #12480.

Change-Id: I851829595bc9e0c76235c847a7b5f62ad82b5302
Reviewed-on: https://go-review.googlesource.com/25021
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
2016-09-06 21:05:47 +00:00
Austin Clements cf4f1d07a1 runtime: bound scanobject to ~100 µs
Currently the time spent in scanobject is proportional to the size of
the object being scanned. Since scanobject is non-preemptible, large
objects can cause significant goroutine (and even whole application)
delays through several means:

1. If a GC assist picks up a large object, the allocating goroutine is
   blocked for the whole scan, even if that scan well exceeds that
   goroutine's debt.

2. Since the scheduler does not run on the P performing a large object
   scan, goroutines in that P's run queue do not run unless they are
   stolen by another P (which can take some time). If there are a few
   large objects, all of the Ps may get tied up so the scheduler
   doesn't run anywhere.

3. Even if a large object is scanned by a background worker and other
   Ps are still running the scheduler, the large object scan doesn't
   flush background credit until the whole scan is done. This can
   easily cause all allocations to block in assists, waiting for
   credit, causing an effective STW.

Fix this by splitting large objects into 128 KB "oblets" and scanning
at most one oblet at a time. Since we can scan 1–2 MB/ms, this equates
to bounding scanobject at roughly 100 µs. This improves assist
behavior both because assists can no longer get "unlucky" and be stuck
scanning a large object, and because it causes the background worker
to flush credit and unblock assists more frequently when scanning
large objects. This also improves GC parallelism if the heap consists
primarily of a small number of very large objects by letting multiple
workers scan a large objects in parallel.

Fixes #10345. Fixes #16293.

This substantially improves goroutine latency in the benchmark from
issue #16293, which exercises several forms of very large objects:

name                 old max-latency    new max-latency    delta
SliceNoPointer-12           154µs ± 1%        155µs ±  2%     ~     (p=0.087 n=13+12)
SlicePointer-12             314ms ± 1%       5.94ms ±138%  -98.11%  (p=0.000 n=19+20)
SliceLivePointer-12        1148ms ± 0%       4.72ms ±167%  -99.59%  (p=0.000 n=19+20)
MapNoPointer-12           72509µs ± 1%        408µs ±325%  -99.44%  (p=0.000 n=19+18)
ChanPointer-12              313ms ± 0%       4.74ms ±140%  -98.49%  (p=0.000 n=18+20)
ChanLivePointer-12         1147ms ± 0%       3.30ms ±149%  -99.71%  (p=0.000 n=19+20)

name                 old P99.9-latency  new P99.9-latency  delta
SliceNoPointer-12           113µs ±25%         107µs ±12%     ~     (p=0.153 n=20+18)
SlicePointer-12          309450µs ± 0%         133µs ±23%  -99.96%  (p=0.000 n=20+20)
SliceLivePointer-12         961ms ± 0%        1.35ms ±27%  -99.86%  (p=0.000 n=20+20)
MapNoPointer-12            448µs ±288%         119µs ±18%  -73.34%  (p=0.000 n=18+20)
ChanPointer-12           309450µs ± 0%         134µs ±23%  -99.96%  (p=0.000 n=20+19)
ChanLivePointer-12          961ms ± 0%        1.35ms ±27%  -99.86%  (p=0.000 n=20+20)

This has negligible effect on all metrics from the garbage, JSON, and
HTTP x/benchmarks.

It shows slight improvement on some of the go1 benchmarks,
particularly Revcomp, which uses some multi-megabyte buffers:

name                      old time/op    new time/op    delta
BinaryTree17-12              2.46s ± 1%     2.47s ± 1%  +0.32%  (p=0.012 n=20+20)
Fannkuch11-12                2.82s ± 0%     2.81s ± 0%  -0.61%  (p=0.000 n=17+20)
FmtFprintfEmpty-12          50.8ns ± 5%    50.5ns ± 2%    ~     (p=0.197 n=17+19)
FmtFprintfString-12          131ns ± 1%     132ns ± 0%  +0.57%  (p=0.000 n=20+16)
FmtFprintfInt-12             117ns ± 0%     116ns ± 0%  -0.47%  (p=0.000 n=15+20)
FmtFprintfIntInt-12          180ns ± 0%     179ns ± 1%  -0.78%  (p=0.000 n=16+20)
FmtFprintfPrefixedInt-12     186ns ± 1%     185ns ± 1%  -0.55%  (p=0.000 n=19+20)
FmtFprintfFloat-12           263ns ± 1%     271ns ± 0%  +2.84%  (p=0.000 n=18+20)
FmtManyArgs-12               741ns ± 1%     742ns ± 1%    ~     (p=0.190 n=19+19)
GobDecode-12                7.44ms ± 0%    7.35ms ± 1%  -1.21%  (p=0.000 n=20+20)
GobEncode-12                6.22ms ± 1%    6.21ms ± 1%    ~     (p=0.336 n=20+19)
Gzip-12                      220ms ± 1%     219ms ± 1%    ~     (p=0.130 n=19+19)
Gunzip-12                   37.9ms ± 0%    37.9ms ± 1%    ~     (p=1.000 n=20+19)
HTTPClientServer-12         82.5µs ± 3%    82.6µs ± 3%    ~     (p=0.776 n=20+19)
JSONEncode-12               16.4ms ± 1%    16.5ms ± 2%  +0.49%  (p=0.003 n=18+19)
JSONDecode-12               53.7ms ± 1%    54.1ms ± 1%  +0.71%  (p=0.000 n=19+18)
Mandelbrot200-12            4.19ms ± 1%    4.20ms ± 1%    ~     (p=0.452 n=19+19)
GoParse-12                  3.38ms ± 1%    3.37ms ± 1%    ~     (p=0.123 n=19+19)
RegexpMatchEasy0_32-12      72.1ns ± 1%    71.8ns ± 1%    ~     (p=0.397 n=19+17)
RegexpMatchEasy0_1K-12       242ns ± 0%     242ns ± 0%    ~     (p=0.168 n=17+20)
RegexpMatchEasy1_32-12      72.1ns ± 1%    72.1ns ± 1%    ~     (p=0.538 n=18+19)
RegexpMatchEasy1_1K-12       385ns ± 1%     384ns ± 1%    ~     (p=0.388 n=20+20)
RegexpMatchMedium_32-12      112ns ± 1%     112ns ± 3%    ~     (p=0.539 n=20+20)
RegexpMatchMedium_1K-12     34.4µs ± 2%    34.4µs ± 2%    ~     (p=0.628 n=18+18)
RegexpMatchHard_32-12       1.80µs ± 1%    1.80µs ± 1%    ~     (p=0.522 n=18+19)
RegexpMatchHard_1K-12       54.0µs ± 1%    54.1µs ± 1%    ~     (p=0.647 n=20+19)
Revcomp-12                   387ms ± 1%     369ms ± 5%  -4.89%  (p=0.000 n=17+19)
Template-12                 62.3ms ± 1%    62.0ms ± 0%  -0.48%  (p=0.002 n=20+17)
TimeParse-12                 314ns ± 1%     314ns ± 0%    ~     (p=1.011 n=20+13)
TimeFormat-12                358ns ± 0%     354ns ± 0%  -1.12%  (p=0.000 n=17+20)
[Geo mean]                  53.5µs         53.3µs       -0.23%

Change-Id: I2a0a179d1d6bf7875dd054b7693dd12d2a340132
Reviewed-on: https://go-review.googlesource.com/23540
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-09-06 19:27:33 +00:00
Austin Clements b275e55d86 runtime: clean up more traces of the old mark bit
Commit 59877bf renamed bitMarked to bitScan, since the bitmap is no
longer used for marking. However, there were several other references
to this strewn about comments and in some other constant names. Fix
these up, too.

Change-Id: I4183d28c6b01977f1d75a99ad55b150f2211772d
Reviewed-on: https://go-review.googlesource.com/28450
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-09-06 19:26:08 +00:00
Erik Staab 66121ce8a9 runtime: remove redundant expression from SetFinalizer
The previous if condition already checks the same expression and doesn't
have side effects.

Change-Id: Ieaf30a786572b608d0a883052b45fd3f04bc6147
Reviewed-on: https://go-review.googlesource.com/28475
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-06 00:52:26 +00:00
Josh Bleecher Snyder 0e7e43688d runtime: remove a load and shift from scanobject
hbits.morePointers and hbits.isPointer both
do a load and a shift. Do it only once.

Benchmarks using compilebench (because it is
the benchmark I have the most tooling around),
on a quiet machine.

name       old time/op      new time/op      delta
Template        291ms ±14%       290ms ±15%    ~          (p=0.702 n=100+99)
Unicode         143ms ± 9%       142ms ± 9%    ~           (p=0.126 n=99+98)
GoTypes         934ms ± 4%       933ms ± 4%    ~         (p=0.937 n=100+100)
Compiler        4.92s ± 2%       4.90s ± 1%  -0.28%        (p=0.003 n=98+98)

name       old user-ns/op   new user-ns/op   delta
Template   360user-ms ± 5%  355user-ms ± 4%  -1.37%        (p=0.000 n=97+96)
Unicode    178user-ms ± 6%  176user-ms ± 6%  -1.24%        (p=0.001 n=96+99)
GoTypes    1.22user-s ± 5%  1.21user-s ± 5%  -0.94%      (p=0.000 n=100+100)
Compiler   6.50user-s ± 2%  6.44user-s ± 3%  -0.94%        (p=0.000 n=96+98)

On amd64, before:

"".scanobject t=1 size=581 args=0x10 locals=0x78

After:

"".scanobject t=1 size=540 args=0x10 locals=0x78


Change-Id: I420ac3704549d484a5d85e19fea82c85da389514
Reviewed-on: https://go-review.googlesource.com/22712
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-02 19:17:43 +00:00
Dmitry Vyukov cd285f1c6f runtime: fix global buffer reset in StopTrace
We reset global buffer only if its pos != 0.
We ought to do it always, but queue it only if pos != 0.
This is a latent bug. Currently it does not fire because
whenever we create a global buffer, we increment pos.

Change-Id: I01e28ae88ce9a5412497c524391b8b7cb443ffd9
Reviewed-on: https://go-review.googlesource.com/25574
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-09-02 19:14:11 +00:00
Gleb Stepanov 59877bfaaf runtime: rename variable
Rename variable to bitScan according to
TODO comment.

Change-Id: I81dd8cc1ca28c0dc9308a654ad65cdf5b2fd2ce3
Reviewed-on: https://go-review.googlesource.com/25175
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-09-02 17:28:41 +00:00
Austin Clements 3df926d52a runtime: improve message when a bad pointer is found on the stack
Currently this message says "invalid stack pointer", which could be
interpreted as the value of SP being invalid. Change it to "invalid
pointer found on stack" to emphasize that it's a pointer on the stack
that's invalid.

Updates #16948.

Change-Id: I753624f8cc7e08cf13d3ea5d9c790cc4af9fa372
Reviewed-on: https://go-review.googlesource.com/28430
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-09-02 17:04:37 +00:00
Ilya Tocar 44f1854c9d bytes: Use the same algorithm as strings for Index
name                     old time/op    new time/op      delta
IndexByte32-48             9.05ns ± 7%      9.59ns ±11%     +5.93%  (p=0.001 n=19+20)
IndexByte4K-48              118ns ± 4%       122ns ± 8%     +3.52%  (p=0.002 n=19+19)
IndexByte4M-48              172µs ±13%       188µs ±12%     +9.49%  (p=0.000 n=20+20)
IndexByte64M-48            8.00ms ±14%      8.05ms ±23%       ~     (p=0.799 n=20+20)
IndexBytePortable32-48     41.7ns ±15%      42.5ns ±12%       ~     (p=0.372 n=20+20)
IndexBytePortable4K-48     3.08µs ±16%      3.26µs ±10%     +5.77%  (p=0.018 n=20+20)
IndexBytePortable4M-48     3.12ms ±17%      3.20ms ±10%       ~     (p=0.157 n=20+20)
IndexBytePortable64M-48    54.0ms ±14%      55.3ms ±14%       ~     (p=0.640 n=20+20)
Index32-48                  230ns ±12%        46ns ± 6%    -79.87%  (p=0.000 n=20+19)
Index4K-48                 43.2µs ± 9%       3.2µs ±12%    -92.58%  (p=0.000 n=19+20)
Index4M-48                 44.4ms ± 7%       3.3ms ±13%    -92.59%  (p=0.000 n=19+20)
Index64M-48                 714ms ±10%        56ms ± 8%    -92.22%  (p=0.000 n=19+19)
IndexEasy32-48             52.7ns ±10%      31.0ns ±11%    -41.21%  (p=0.000 n=20+20)
IndexEasy4K-48              139ns ± 5%      1598ns ± 6%  +1046.37%  (p=0.000 n=19+19)
IndexEasy4M-48              179µs ± 8%      1674µs ±10%   +834.31%  (p=0.000 n=19+20)
IndexEasy64M-48            8.56ms ±10%     27.82ms ±16%   +225.14%  (p=0.000 n=19+20)

name                     old speed      new speed        delta
IndexByte32-48           3.52GB/s ± 7%    3.35GB/s ±11%     -4.99%  (p=0.001 n=20+20)
IndexByte4K-48           34.5GB/s ± 7%    33.2GB/s ±10%     -3.67%  (p=0.002 n=20+20)
IndexByte4M-48           24.6GB/s ±14%    22.4GB/s ±14%     -8.73%  (p=0.000 n=20+20)
IndexByte64M-48          8.42GB/s ±16%    8.42GB/s ±19%       ~     (p=0.799 n=20+20)
IndexBytePortable32-48    770MB/s ±13%     756MB/s ±11%       ~     (p=0.383 n=20+20)
IndexBytePortable4K-48   1.34GB/s ±14%    1.26GB/s ±10%     -5.76%  (p=0.018 n=20+20)
IndexBytePortable4M-48   1.35GB/s ±15%    1.31GB/s ±11%       ~     (p=0.157 n=20+20)
IndexBytePortable64M-48  1.25GB/s ±16%    1.22GB/s ±13%       ~     (p=0.640 n=20+20)
Index32-48                138MB/s ± 8%     687MB/s ± 8%   +398.57%  (p=0.000 n=19+20)
Index4K-48               94.9MB/s ± 9%  1280.5MB/s ±11%  +1249.11%  (p=0.000 n=19+20)
Index4M-48               94.6MB/s ± 7%  1278.5MB/s ±12%  +1250.99%  (p=0.000 n=19+20)
Index64M-48              94.2MB/s ±10%  1210.9MB/s ± 8%  +1185.04%  (p=0.000 n=19+19)
IndexEasy32-48            608MB/s ±10%    1035MB/s ±10%    +70.15%  (p=0.000 n=20+20)
IndexEasy4K-48           29.3GB/s ± 6%     2.6GB/s ± 6%    -91.24%  (p=0.000 n=19+19)
IndexEasy4M-48           23.3GB/s ±10%     2.5GB/s ± 9%    -89.23%  (p=0.000 n=20+20)
IndexEasy64M-48          7.86GB/s ±11%    2.42GB/s ±14%    -69.18%  (p=0.000 n=19+20)

Change-Id: Ia191f0a6ca80e113397d9ed98d25f195768b65bc
Reviewed-on: https://go-review.googlesource.com/22550
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-01 18:05:50 +00:00
Joe Tsai 6fb4b15f98 Revert "runtime: improve memmove for amd64"
This reverts commit 3607c5f4f1.

This was causing failures on amd64 machines without AVX.

Fixes #16939

Change-Id: I70080fbb4e7ae791857334f2bffd847d08cb25fa
Reviewed-on: https://go-review.googlesource.com/28274
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-31 21:07:35 +00:00
Kevin Burke ffa2bd27a4 runtime: fix typo
Change-Id: I47e3cfa8b49e3d0b55c91387df31488b37038a8f
Reviewed-on: https://go-review.googlesource.com/28225
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-31 16:24:45 +00:00
Denis Nagorny 3607c5f4f1 runtime: improve memmove for amd64
Use AVX if available on 4th generation of Intel(TM) Core(TM) processors.

(collected on E5 2609v3 @1.9GHz)
name                        old speed      new speed       delta
Memmove/1-6                  158MB/s ± 0%    172MB/s ± 0%    +9.09% (p=0.000 n=16+16)
Memmove/2-6                  316MB/s ± 0%    345MB/s ± 0%    +9.09% (p=0.000 n=18+16)
Memmove/3-6                  517MB/s ± 0%    517MB/s ± 0%      ~ (p=0.445 n=16+16)
Memmove/4-6                  687MB/s ± 1%    690MB/s ± 0%    +0.35% (p=0.000 n=20+17)
Memmove/5-6                  729MB/s ± 0%    729MB/s ± 0%    +0.01% (p=0.000 n=16+18)
Memmove/6-6                  875MB/s ± 0%    875MB/s ± 0%    +0.01% (p=0.000 n=18+18)
Memmove/7-6                 1.02GB/s ± 0%   1.02GB/s ± 1%      ~ (p=0.139 n=19+20)
Memmove/8-6                 1.26GB/s ± 0%   1.26GB/s ± 0%    +0.00% (p=0.000 n=18+18)
Memmove/9-6                 1.42GB/s ± 0%   1.42GB/s ± 0%    +0.00% (p=0.000 n=17+18)
Memmove/10-6                1.58GB/s ± 0%   1.58GB/s ± 0%    +0.00% (p=0.000 n=19+19)
Memmove/11-6                1.74GB/s ± 0%   1.74GB/s ± 0%    +0.00% (p=0.001 n=18+17)
Memmove/12-6                1.90GB/s ± 0%   1.90GB/s ± 0%    +0.00% (p=0.000 n=19+19)
Memmove/13-6                2.05GB/s ± 0%   2.05GB/s ± 0%    +0.00% (p=0.000 n=18+19)
Memmove/14-6                2.21GB/s ± 0%   2.21GB/s ± 0%    +0.00% (p=0.000 n=16+20)
Memmove/15-6                2.37GB/s ± 0%   2.37GB/s ± 0%    +0.00% (p=0.004 n=19+20)
Memmove/16-6                2.53GB/s ± 0%   2.53GB/s ± 0%    +0.00% (p=0.000 n=16+16)
Memmove/32-6                4.67GB/s ± 0%   4.67GB/s ± 0%    +0.00% (p=0.000 n=17+17)
Memmove/64-6                8.67GB/s ± 0%   8.64GB/s ± 0%    -0.33% (p=0.000 n=18+17)
Memmove/128-6               12.6GB/s ± 0%   11.6GB/s ± 0%    -8.05% (p=0.000 n=16+19)
Memmove/256-6               16.3GB/s ± 0%   16.6GB/s ± 0%    +1.66% (p=0.000 n=20+18)
Memmove/512-6               21.5GB/s ± 0%   24.4GB/s ± 0%   +13.35% (p=0.000 n=18+17)
Memmove/1024-6              24.7GB/s ± 0%   33.7GB/s ± 0%   +36.12% (p=0.000 n=18+18)
Memmove/2048-6              27.3GB/s ± 0%   43.3GB/s ± 0%   +58.77% (p=0.000 n=19+17)
Memmove/4096-6              37.5GB/s ± 0%   50.5GB/s ± 0%   +34.56% (p=0.000 n=19+19)
MemmoveUnalignedDst/1-6      135MB/s ± 0%    146MB/s ± 0%    +7.69% (p=0.000 n=16+14)
MemmoveUnalignedDst/2-6      271MB/s ± 0%    292MB/s ± 0%    +7.69% (p=0.000 n=18+18)
MemmoveUnalignedDst/3-6      438MB/s ± 0%    438MB/s ± 0%      ~ (p=0.352 n=16+19)
MemmoveUnalignedDst/4-6      584MB/s ± 0%    584MB/s ± 0%      ~ (p=0.876 n=17+17)
MemmoveUnalignedDst/5-6      631MB/s ± 1%    632MB/s ± 0%    +0.25% (p=0.000 n=20+17)
MemmoveUnalignedDst/6-6      759MB/s ± 0%    759MB/s ± 0%    +0.00% (p=0.000 n=19+16)
MemmoveUnalignedDst/7-6      885MB/s ± 0%    883MB/s ± 1%      ~ (p=0.647 n=18+20)
MemmoveUnalignedDst/8-6     1.08GB/s ± 0%   1.08GB/s ± 0%    +0.00% (p=0.035 n=19+18)
MemmoveUnalignedDst/9-6     1.22GB/s ± 0%   1.22GB/s ± 0%      ~ (p=0.251 n=18+17)
MemmoveUnalignedDst/10-6    1.35GB/s ± 0%   1.35GB/s ± 0%      ~ (p=0.327 n=17+18)
MemmoveUnalignedDst/11-6    1.49GB/s ± 0%   1.49GB/s ± 0%      ~ (p=0.531 n=18+19)
MemmoveUnalignedDst/12-6    1.63GB/s ± 0%   1.63GB/s ± 0%      ~ (p=0.886 n=19+18)
MemmoveUnalignedDst/13-6    1.76GB/s ± 0%   1.76GB/s ± 1%    -0.24% (p=0.006 n=18+20)
MemmoveUnalignedDst/14-6    1.90GB/s ± 0%   1.90GB/s ± 0%      ~ (p=0.818 n=20+19)
MemmoveUnalignedDst/15-6    2.03GB/s ± 0%   2.03GB/s ± 0%      ~ (p=0.294 n=17+16)
MemmoveUnalignedDst/16-6    2.17GB/s ± 0%   2.17GB/s ± 0%      ~ (p=0.602 n=16+18)
MemmoveUnalignedDst/32-6    4.05GB/s ± 0%   4.05GB/s ± 0%    +0.00% (p=0.010 n=18+17)
MemmoveUnalignedDst/64-6    7.59GB/s ± 0%   7.59GB/s ± 0%    +0.00% (p=0.022 n=18+16)
MemmoveUnalignedDst/128-6   11.1GB/s ± 0%   11.4GB/s ± 0%    +2.79% (p=0.000 n=18+17)
MemmoveUnalignedDst/256-6   16.4GB/s ± 0%   16.7GB/s ± 0%    +1.59% (p=0.000 n=20+17)
MemmoveUnalignedDst/512-6   15.7GB/s ± 0%   21.3GB/s ± 0%   +35.87% (p=0.000 n=18+20)
MemmoveUnalignedDst/1024-6  16.0GB/s ±20%   31.5GB/s ± 0%   +96.93% (p=0.000 n=20+14)
MemmoveUnalignedDst/2048-6  19.6GB/s ± 0%   42.1GB/s ± 0%  +115.16% (p=0.000 n=17+18)
MemmoveUnalignedDst/4096-6  6.41GB/s ± 0%  33.18GB/s ± 0%  +417.56% (p=0.000 n=17+18)
MemmoveUnalignedSrc/1-6      171MB/s ± 0%    166MB/s ± 0%    -3.33% (p=0.000 n=19+16)
MemmoveUnalignedSrc/2-6      343MB/s ± 0%    342MB/s ± 1%    -0.41% (p=0.000 n=17+20)
MemmoveUnalignedSrc/3-6      508MB/s ± 0%    493MB/s ± 1%    -2.90% (p=0.000 n=17+17)
MemmoveUnalignedSrc/4-6      677MB/s ± 0%    660MB/s ± 2%    -2.55% (p=0.000 n=17+20)
MemmoveUnalignedSrc/5-6      790MB/s ± 0%    790MB/s ± 0%      ~ (p=0.139 n=17+17)
MemmoveUnalignedSrc/6-6      948MB/s ± 0%    946MB/s ± 1%      ~ (p=0.330 n=17+19)
MemmoveUnalignedSrc/7-6     1.11GB/s ± 0%   1.11GB/s ± 0%    -0.05% (p=0.026 n=17+17)
MemmoveUnalignedSrc/8-6     1.38GB/s ± 0%   1.38GB/s ± 0%      ~ (p=0.091 n=18+16)
MemmoveUnalignedSrc/9-6     1.42GB/s ± 0%   1.40GB/s ± 1%    -1.04% (p=0.000 n=19+20)
MemmoveUnalignedSrc/10-6    1.58GB/s ± 0%   1.56GB/s ± 1%    -1.15% (p=0.000 n=18+19)
MemmoveUnalignedSrc/11-6    1.73GB/s ± 0%   1.71GB/s ± 1%    -1.30% (p=0.000 n=20+20)
MemmoveUnalignedSrc/12-6    1.89GB/s ± 0%   1.87GB/s ± 1%    -1.18% (p=0.000 n=17+20)
MemmoveUnalignedSrc/13-6    2.05GB/s ± 0%   2.02GB/s ± 1%    -1.18% (p=0.000 n=17+20)
MemmoveUnalignedSrc/14-6    2.21GB/s ± 0%   2.18GB/s ± 1%    -1.14% (p=0.000 n=17+20)
MemmoveUnalignedSrc/15-6    2.36GB/s ± 0%   2.34GB/s ± 1%    -1.04% (p=0.000 n=17+20)
MemmoveUnalignedSrc/16-6    2.52GB/s ± 0%   2.49GB/s ± 1%    -1.26% (p=0.000 n=19+20)
MemmoveUnalignedSrc/32-6    4.82GB/s ± 0%   4.61GB/s ± 0%    -4.40% (p=0.000 n=19+20)
MemmoveUnalignedSrc/64-6    5.03GB/s ± 4%   7.97GB/s ± 0%   +58.55% (p=0.000 n=20+16)
MemmoveUnalignedSrc/128-6   11.1GB/s ± 0%   11.2GB/s ± 0%    +0.52% (p=0.000 n=17+18)
MemmoveUnalignedSrc/256-6   16.5GB/s ± 0%   16.4GB/s ± 0%    -0.10% (p=0.000 n=20+18)
MemmoveUnalignedSrc/512-6   21.0GB/s ± 0%   22.1GB/s ± 0%    +5.48% (p=0.000 n=14+17)
MemmoveUnalignedSrc/1024-6  24.9GB/s ± 0%   31.9GB/s ± 0%   +28.20% (p=0.000 n=19+20)
MemmoveUnalignedSrc/2048-6  23.3GB/s ± 0%   33.8GB/s ± 0%   +45.22% (p=0.000 n=17+19)
MemmoveUnalignedSrc/4096-6  37.3GB/s ± 0%   42.7GB/s ± 0%   +14.30% (p=0.000 n=17+17)

Change-Id: Iab488d93a293cdf573ab5cd89b95a818bbb5d531
Reviewed-on: https://go-review.googlesource.com/22515
Run-TryBot: Denis Nagorny <denis.nagorny@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-31 16:03:30 +00:00
Josh Bleecher Snyder 2b74de3ed9 runtime: rename fastrand1 to fastrand
Change-Id: I37706ff0a3486827c5b072c95ad890ea87ede847
Reviewed-on: https://go-review.googlesource.com/28210
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-30 23:59:21 +00:00
Cherry Zhang f9dafc742d cmd/compile, runtime, etc: get rid of constant FP registers
On ARM64, MIPS64, and PPC64, some floating point registers were
reserved for constants 0, 1, 2, 0.5, etc. This CL removes them.

On ARM64, they are never used. On MIPS64 and PPC64, the only use
case is a multiplication-by-2 in the old backend of the compiler,
which is replaced with an addition.

Change-Id: I737cbf43283756e3408964fc88c567a938c57036
Reviewed-on: https://go-review.googlesource.com/28095
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-30 23:16:17 +00:00
Cherry Zhang b2e0e9688a cmd/compile: remove Zero and NilCheck for newobject
Recognize runtime.newobject and don't Zero or NilCheck it.

Fixes #15914 (?)
Updates #15390.

TBD: add test

Change-Id: Ia3bfa5c2ddbe2c27c92d9f68534a713b5ce95934
Reviewed-on: https://go-review.googlesource.com/27930
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-08-30 23:10:43 +00:00
Keith Randall 842b05832f all: use testing.GoToolPath instead of "go"
This change makes sure that tests are run with the correct
version of the go tool.  The correct version is the one that
we invoked with "go test", not the one that is first in our path.

Fixes #16577

Change-Id: If22c8f8c3ec9e7c35d094362873819f2fbb8559b
Reviewed-on: https://go-review.googlesource.com/28089
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-30 22:49:11 +00:00
Martin Möhrmann 0dae9dfb08 cmd/compile: improve string iteration performance
Generate a for loop for ranging over strings that only needs to call
the runtime function charntorune for non ASCII characters.

This provides faster iteration over ASCII characters and slightly
faster iteration for other characters.

The runtime function charntorune is changed to take an index from where
to start decoding and returns the index after the last byte belonging
to the decoded rune.

All call sites of charntorune in the runtime are replaced by a for loop
that will be transformed by the compiler instead of calling the charntorune
function directly.

go binary size decreases by 80 bytes.
godoc binary size increases by around 4 kilobytes.

runtime:

name                           old time/op  new time/op  delta
RuneIterate/range/ASCII-4      43.7ns ± 3%  10.3ns ± 4%  -76.33%  (p=0.000 n=44+45)
RuneIterate/range/Japanese-4   72.5ns ± 2%  62.8ns ± 2%  -13.41%  (p=0.000 n=49+50)
RuneIterate/range1/ASCII-4     43.5ns ± 2%  10.4ns ± 3%  -76.18%  (p=0.000 n=50+50)
RuneIterate/range1/Japanese-4  72.5ns ± 2%  62.9ns ± 2%  -13.26%  (p=0.000 n=50+49)
RuneIterate/range2/ASCII-4     43.5ns ± 3%  10.3ns ± 2%  -76.22%  (p=0.000 n=48+47)
RuneIterate/range2/Japanese-4  72.4ns ± 2%  62.7ns ± 2%  -13.47%  (p=0.000 n=50+50)

strings:

name                 old time/op    new time/op    delta
IndexRune-4            64.7ns ± 5%    22.4ns ± 3%  -65.43%  (p=0.000 n=25+21)
MapNoChanges-4          269ns ± 2%     157ns ± 2%  -41.46%  (p=0.000 n=23+24)
Fields-4               23.0ms ± 2%    19.7ms ± 2%  -14.35%  (p=0.000 n=25+25)
FieldsFunc-4           23.1ms ± 2%    19.6ms ± 2%  -14.94%  (p=0.000 n=25+24)

name                 old speed      new speed      delta
Fields-4             45.6MB/s ± 2%  53.2MB/s ± 2%  +16.87%  (p=0.000 n=24+25)
FieldsFunc-4         45.5MB/s ± 2%  53.5MB/s ± 2%  +17.57%  (p=0.000 n=25+24)

Updates #13162

Change-Id: I79ffaf828d82bf9887592f08e5cad883e9f39701
Reviewed-on: https://go-review.googlesource.com/27853
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Martin Möhrmann <martisch@uos.de>
2016-08-30 18:17:20 +00:00
Keith Randall 0d7a2241cb runtime: update a few comments
noescape is now 0 instructions with the SSA backend.
fast atomics are no longer a TODO (at least for amd64).

Change-Id: Ib6e06f7471bef282a47ba236d8ce95404bb60a42
Reviewed-on: https://go-review.googlesource.com/28087
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-30 18:16:28 +00:00
Carlos Eduardo Seo aaa6b53524 runtime: insufficient padding in the `p` structure
The current padding in the 'p' struct is hardcoded at 64 bytes. It should be the
cache line size. On ppc64x, the current value is only okay because sys.CacheLineSize
is wrong at 64 bytes. This change fixes that by making the padding equal to the
cache line size. It also fixes the cache line size for ppc64/ppc64le to 128 bytes.

Fixes #16477

Change-Id: Ib7ec5195685116eb11ba312a064f41920373d4a3
Reviewed-on: https://go-review.googlesource.com/25370
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-29 23:22:51 +00:00
Martin Möhrmann e6f9f39ce5 cmd/compile: generate makeslice calls with int arguments
Where possible generate calls to runtime makeslice with int arguments
during compile time instead of makeslice with int64 arguments.

This eliminates converting arguments for calls to makeslice with
int64 arguments for platforms where int64 values do not fit into
arguments of type int.

godoc 386 binary shrinks by approximately 12 kilobyte.

amd64:
name         old time/op  new time/op  delta
MakeSlice-2  29.8ns ± 1%  29.8ns ± 1%   ~     (p=1.000 n=24+24)

386:
name         old time/op  new time/op  delta
MakeSlice-2  52.3ns ± 0%  45.9ns ± 0%  -12.17%  (p=0.000 n=25+22)

Fixes  #15357

Change-Id: Icb8701bb63c5a83877d26c8a4b78e782ba76de7c
Reviewed-on: https://go-review.googlesource.com/27851
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-08-29 18:25:33 +00:00
Emmanuel Odeke 7c04633e0c all: fix obsolete inferno-os links
Fixes #16911.

Fix obsolete inferno-os links, since code.google.com shutdown.
This CL points to the right files by replacing
http://code.google.com/p/inferno-os/source/browse
with
https://bitbucket.org/inferno-os/inferno-os/src/default

To implement the change I wrote and ran this script in the root:
$ grep -Rn 'http://code.google.com/p/inferno-os/source/browse' * \
| cut -d":" -f1 | while read F;do perl -pi -e \
's/http:\/\/code.google.com\/p\/inferno-os\/source\/browse/https:\/\/bitbucket.org\/inferno-os\/inferno-os\/src\/default/g'
$F;done

I excluded any cmd/vendor changes from the commit.

Change-Id: Iaaf828ac8f6fc949019fd01832989d00b29b6749
Reviewed-on: https://go-review.googlesource.com/27994
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-29 04:54:42 +00:00
David Crawshaw bee4206764 runtime: have typelinksinit work forwards
For reasons I have forgotten typelinksinit processed modules backwards.
(I suspect this was an attempt to process types in the executing
binary first.)

It does not appear to be necessary, and it is not the order we want
when a module can be loaded at an arbitrary point during a program's
execution as a plugin. So reverse the order.

While here, make it safe to call typelinksinit multiple times.

Change-Id: Ie10587c55c8e5efa0542981efb6eb3c12dd59e8c
Reviewed-on: https://go-review.googlesource.com/27822
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-08-26 21:22:30 +00:00
Keith Randall 320ddcf834 cmd/compile: inline atomics from runtime/internal/atomic on amd64
Inline atomic reads and writes on amd64.  There's no reason
to pay the overhead of a call for these.

To keep atomic loads from being reordered, we make them
return a <value,memory> tuple.

Change the meaning of resultInArg0 for tuple-generating ops
to mean the first part of the result tuple, not the second.
This means we can always put the store part of the tuple last,
matching how arguments are laid out.  This requires reordering
the outputs of add32carry and sub32carry and their descendents
in various architectures.

benchmark                    old ns/op     new ns/op     delta
BenchmarkAtomicLoad64-8      2.09          0.26          -87.56%
BenchmarkAtomicStore64-8     7.54          5.72          -24.14%

TBD (in a different CL): Cas, Or8, ...

Change-Id: I713ea88e7da3026c44ea5bdb56ed094b20bc5207
Reviewed-on: https://go-review.googlesource.com/27641
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-08-25 20:09:04 +00:00
Josh Bleecher Snyder 71ab9fa312 all: fix assembly vet issues
Add missing function prototypes.
Fix function prototypes.
Use FP references instead of SP references.
Fix variable names.
Update comments.
Clean up whitespace. (Not for vet.)

All fairly minor fixes to make vet happy.

Updates #11041

Change-Id: Ifab2cdf235ff61cdc226ab1d84b8467b5ac9446c
Reviewed-on: https://go-review.googlesource.com/27713
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-25 18:52:31 +00:00
Ian Lance Taylor f29ec7d74a runtime: remove unused type sigtabtt
The type sigtabtt was introduced by an automated tool in
https://golang.org/cl/167550043. It was the Go version of the C type
SigTab. However, when the C code using SigTab was converted to Go in
https://golang.org/cl/168500044 it was rewritten to use a different Go
type, sigTabT, rather than sigtabtt (the difference being that sigTabT
uses string where sigtabtt uses *int8 from the C type char*). So this is
just a dreg from the conversion that was never actually used.

Change-Id: I2ec6eb4b25613bf5e5ad1dbba1f4b5ff20f80f55
Reviewed-on: https://go-review.googlesource.com/27691
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-25 03:51:24 +00:00
Michael Munday 61d5daea0a runtime: use clock_gettime for time.now() on s390x
This should improve the precision of time.now() from microseconds
to nanoseconds.

Also, modify runtime.nanotime to keep it consistent with cleanup
done to time.now.

Updates #11222 for s390x.

Change-Id: I27864115ea1fee7299360d9003cd3a8355f624d3
Reviewed-on: https://go-review.googlesource.com/27710
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-25 02:05:31 +00:00
Keith Randall 3e270ab80b cmd/compile: clean up ctz ops
Now that we have ops that can return 2 results, have BSF return a result
and flags.  We can then get rid of the redundant comparison and use CMOV
instead of CMOVconst ops.

Get rid of a bunch of the ops we don't use.  Ctz{8,16}, plus all the Clzs,
and CMOVNEs.  I don't think we'll ever use them, and they would be easy
to add back if needed.

Change-Id: I8858a1d017903474ea7e4002fc76a6a86e7bd487
Reviewed-on: https://go-review.googlesource.com/27630
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-08-23 23:45:12 +00:00
Ian Lance Taylor d00890b5f3 runtime: add msan calls before calling traceback functions
Tell msan that the arguments to the traceback functions are initialized,
in case the traceback functions are compiled with -fsanitize=memory.

Change-Id: I3ab0816604906c6cd7086245e6ae2e7fa62fe354
Reviewed-on: https://go-review.googlesource.com/24856
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-23 16:31:16 +00:00
Ian Lance Taylor fe251d2581 runtime: remove unused function in test
Change-Id: I43f14cdd9eb4a1d5471fc88c1b4759ceb2c674cf
Reviewed-on: https://go-review.googlesource.com/24817
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-23 14:09:52 +00:00
Ian Lance Taylor 2d85e87f08 runtime/cgo: add tsan acquire/release around setenv/unsetenv
Change-Id: Iabb25e97714d070c31c657559a97a3bfc979da18
Reviewed-on: https://go-review.googlesource.com/25403
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-23 14:07:58 +00:00
Ian Lance Taylor dc9755c2a2 runtime: add missing race and msan checks to reflect functions
Add missing race and msan checks to reflect.typedmmemove and
reflect.typedslicecopy. Missing these checks caused the race detector
to miss races and caused msan to issue false positive errors.

Fixes #16281.

Change-Id: I500b5f92bd68dc99dd5d6f297827fd5d2609e88b
Reviewed-on: https://go-review.googlesource.com/24760
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2016-08-23 13:12:15 +00:00
Carlos Eduardo Seo 0df5ab7e65 runtime: Use clock_gettime to get current time on ppc64x
Fetch the current time in nanoseconds, not microseconds, by using
clock_gettime rather than gettimeofday.

Updates #11222

Change-Id: I1c2c1b88f80ae82002518359436e19099061c6fb
Reviewed-on: https://go-review.googlesource.com/26790
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Minux Ma <minux@golang.org>
2016-08-23 05:37:05 +00:00
Josh Bleecher Snyder e2103adb6c crypto/*, runtime: nacl asm fixes
Found by vet.

Updates #11041

Change-Id: I5217b3e20c6af435d7500d6bb487b9895efe6605
Reviewed-on: https://go-review.googlesource.com/27493
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-08-22 19:50:41 +00:00
Josh Bleecher Snyder 5abfc97e84 runtime: use correct MOV for plan9 brk_ ret value
Updates #11041

Change-Id: I78f8d48f00cfbb451e37c868cc472ef06ea0fd95
Reviewed-on: https://go-review.googlesource.com/27491
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-22 19:49:08 +00:00
Josh Bleecher Snyder e80376ca6b runtime: ignore closeonexec ret val on openbsd/arm
Fixes #16641
Updates #11041

Change-Id: I087208a486f535d74135591b2c9a73168cf80e1a
Reviewed-on: https://go-review.googlesource.com/27490
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-22 19:40:09 +00:00
Dmitry Vyukov 747a158ef3 runtime: speed up StartTrace with lots of blocked goroutines
In StartTrace we emit EvGoCreate for all existing goroutines.
This includes stack unwind to obtain current stack.
Real Go programs can contain hundreds of thousands of blocked goroutines.
For such programs StartTrace can take up to a second (few ms per goroutine).

Obtain current stack ID once and use it for all EvGoCreate events.

This speeds up StartTrace with 10K blocked goroutines from 20ms to 4 ms
(win for StartTrace called from net/http/pprof hander will be bigger
as stack is deeper).

Change-Id: I9e5ff9468331a840f8fdcdd56c5018c2cfde61fc
Reviewed-on: https://go-review.googlesource.com/25573
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2016-08-22 17:40:10 +00:00
Josh Bleecher Snyder 7c5f33b173 runtime: cull dead code
They are unused, and vet wants them to have
a function prototype.

Updates #11041

Change-Id: Idedc96ddd3c3cf1b1d2ab6d98796367eab29f032
Reviewed-on: https://go-review.googlesource.com/27492
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-22 16:41:34 +00:00
Josh Bleecher Snyder 4af1148079 cmd/vet: improve asmdecl parameter handling
The asmdecl check had hand-rolled code that
calculated the size and offset of parameters
based only on the AST.
It included a list of known named types.

This CL changes asmdecl to use go/types instead.
This allows us to easily handle named types.
It also adds support for structs, arrays,
and complex parameters.

It improves the default names given to unnamed
parameters. Previously, all anonymous arguments were
called "unnamed", and the first anonymous return
argument was called "ret".
Anonymous arguments are now called arg, arg1, arg2,
etc., depending on the index in the argument list.
Return arguments are ret, ret1, ret2.

This CL also fixes a bug in the printing of
composite data type sizes.

Updates #11041

Change-Id: I1085116a26fe6199480b680eff659eb9ab31769b
Reviewed-on: https://go-review.googlesource.com/27150
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
2016-08-22 15:42:06 +00:00
Josh Bleecher Snyder 880c967ccd runtime: minor string/rune optimizations
Eliminate a spill in concatstrings.
Provide bounds elim hints in runetochar.
No significant benchmark movement.

Before:
"".runetochar t=1 size=412 args=0x28 locals=0x0
"".concatstrings t=1 size=736 args=0x30 locals=0x98

After:
"".runetochar t=1 size=337 args=0x28 locals=0x0
"".concatstrings t=1 size=711 args=0x30 locals=0x90

Change-Id: Icce646976cb20a223163b7e72a54761193ac17e3
Reviewed-on: https://go-review.googlesource.com/27460
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-22 15:19:31 +00:00
Michael Munday fa897643a1 runtime: remove unnecessary calls to memclr
Go will have already cleared the structs (the original C wouldn't
have).

Change-Id: I4a5a0cfd73953181affc158d188aae2ce281bb33
Reviewed-on: https://go-review.googlesource.com/27435
Run-TryBot: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-20 18:00:09 +00:00
Dmitry Vyukov 14e5951166 runtime: increase malloc size classes
When we calculate class sizes, in some cases we discard considerable
amounts of memory without an apparent reason. For example, we choose
size 8448 with 6 objects in 7 pages. But we can well use object
size 9472, which is also 6 objects in 7 pages but +1024 bytes (+12.12%).

Increase class sizes to the max value that leads to the same
page count/number of objects. Full list of affected size classes:

class 36: pages: 2 size: 1664->1792 +128 (7.69%)
class 39: pages: 1 size: 2560->2688 +128 (5.0%)
class 40: pages: 3 size: 2816->3072 +256 (9.9%)
class 41: pages: 2 size: 3072->3200 +128 (4.16%)
class 42: pages: 3 size: 3328->3456 +128 (3.84%)
class 44: pages: 3 size: 4608->4864 +256 (5.55%)
class 47: pages: 4 size: 6400->6528 +128 (2.0%)
class 48: pages: 5 size: 6656->6784 +128 (1.92%)
class 51: pages: 7 size: 8448->9472 +1024 (12.12%)
class 52: pages: 6 size: 8704->9728 +1024 (11.76%)
class 53: pages: 5 size: 9472->10240 +768 (8.10%)
class 54: pages: 4 size: 10496->10880 +384 (3.65%)
class 57: pages: 7 size: 14080->14336 +256 (1.81%)
class 59: pages: 9 size: 16640->18432 +1792 (10.76%)
class 60: pages: 7 size: 17664->19072 +1408 (7.97%)
class 62: pages: 8 size: 21248->21760 +512 (2.40%)
class 64: pages: 10 size: 24832->27264 +2432 (9.79%)
class 65: pages: 7 size: 28416->28672 +256 (0.90%)

name                      old time/op    new time/op    delta
BinaryTree17-12              2.59s ± 5%     2.52s ± 4%    ~     (p=0.132 n=6+6)
Fannkuch11-12                2.13s ± 3%     2.17s ± 3%    ~     (p=0.180 n=6+6)
FmtFprintfEmpty-12          47.0ns ± 3%    46.6ns ± 1%    ~     (p=0.355 n=6+5)
FmtFprintfString-12          131ns ± 0%     131ns ± 1%    ~     (p=0.476 n=4+6)
FmtFprintfInt-12             121ns ± 6%     122ns ± 2%    ~     (p=0.511 n=6+6)
FmtFprintfIntInt-12          182ns ± 2%     186ns ± 1%  +2.20%  (p=0.015 n=6+6)
FmtFprintfPrefixedInt-12     184ns ± 5%     181ns ± 2%    ~     (p=0.645 n=6+6)
FmtFprintfFloat-12           272ns ± 7%     265ns ± 1%    ~     (p=1.000 n=6+5)
FmtManyArgs-12               783ns ± 2%     802ns ± 2%  +2.38%  (p=0.017 n=6+6)
GobDecode-12                7.04ms ± 4%    7.00ms ± 2%    ~     (p=1.000 n=6+6)
GobEncode-12                6.36ms ± 6%    6.17ms ± 6%    ~     (p=0.240 n=6+6)
Gzip-12                      242ms ±14%     233ms ± 7%    ~     (p=0.310 n=6+6)
Gunzip-12                   36.6ms ±22%    36.0ms ± 9%    ~     (p=0.841 n=5+5)
HTTPClientServer-12         93.1µs ±29%    88.0µs ±32%    ~     (p=0.240 n=6+6)
JSONEncode-12               27.1ms ±39%    26.2ms ±35%    ~     (p=0.589 n=6+6)
JSONDecode-12               71.7ms ±36%    71.5ms ±36%    ~     (p=0.937 n=6+6)
Mandelbrot200-12            4.78ms ±10%    4.70ms ±16%    ~     (p=0.394 n=6+6)
GoParse-12                  4.86ms ±34%    4.95ms ±36%    ~     (p=1.000 n=6+6)
RegexpMatchEasy0_32-12       110ns ±37%     110ns ±36%    ~     (p=0.660 n=6+6)
RegexpMatchEasy0_1K-12       240ns ±38%     234ns ±47%    ~     (p=0.554 n=6+6)
RegexpMatchEasy1_32-12      77.2ns ± 2%    77.2ns ±10%    ~     (p=0.699 n=6+6)
RegexpMatchEasy1_1K-12       337ns ± 5%     331ns ± 4%    ~     (p=0.552 n=6+6)
RegexpMatchMedium_32-12      125ns ±13%     132ns ±26%    ~     (p=0.561 n=6+6)
RegexpMatchMedium_1K-12     35.9µs ± 3%    36.1µs ± 5%    ~     (p=0.818 n=6+6)
RegexpMatchHard_32-12       1.81µs ± 4%    1.82µs ± 5%    ~     (p=0.452 n=5+5)
RegexpMatchHard_1K-12       52.4µs ± 2%    54.4µs ± 3%  +3.84%  (p=0.002 n=6+6)
Revcomp-12                   401ms ± 2%     390ms ± 1%  -2.82%  (p=0.002 n=6+6)
Template-12                 54.5ms ± 3%    54.6ms ± 1%    ~     (p=0.589 n=6+6)
TimeParse-12                 294ns ± 1%     298ns ± 2%    ~     (p=0.160 n=6+6)
TimeFormat-12                323ns ± 4%     318ns ± 5%    ~     (p=0.297 n=6+6)

name                      old speed      new speed      delta
GobDecode-12               109MB/s ± 4%   110MB/s ± 2%    ~     (p=1.000 n=6+6)
GobEncode-12               121MB/s ± 6%   125MB/s ± 6%    ~     (p=0.240 n=6+6)
Gzip-12                   80.4MB/s ±12%  83.3MB/s ± 7%    ~     (p=0.310 n=6+6)
Gunzip-12                  495MB/s ±41%   541MB/s ± 9%    ~     (p=0.931 n=6+5)
JSONEncode-12             80.7MB/s ±39%  82.8MB/s ±34%    ~     (p=0.589 n=6+6)
JSONDecode-12             30.4MB/s ±40%  31.0MB/s ±37%    ~     (p=0.937 n=6+6)
GoParse-12                13.2MB/s ±33%  13.2MB/s ±35%    ~     (p=1.000 n=6+6)
RegexpMatchEasy0_32-12     321MB/s ±34%   326MB/s ±34%    ~     (p=0.699 n=6+6)
RegexpMatchEasy0_1K-12    4.49GB/s ±31%  4.74GB/s ±37%    ~     (p=0.589 n=6+6)
RegexpMatchEasy1_32-12     414MB/s ± 2%   415MB/s ± 9%    ~     (p=0.699 n=6+6)
RegexpMatchEasy1_1K-12    3.03GB/s ± 5%  3.09GB/s ± 4%    ~     (p=0.699 n=6+6)
RegexpMatchMedium_32-12   7.99MB/s ±12%  7.68MB/s ±22%    ~     (p=0.589 n=6+6)
RegexpMatchMedium_1K-12   28.5MB/s ± 3%  28.4MB/s ± 5%    ~     (p=0.818 n=6+6)
RegexpMatchHard_32-12     17.7MB/s ± 4%  17.0MB/s ±15%    ~     (p=0.351 n=5+6)
RegexpMatchHard_1K-12     19.6MB/s ± 2%  18.8MB/s ± 3%  -3.67%  (p=0.002 n=6+6)
Revcomp-12                 634MB/s ± 2%   653MB/s ± 1%  +2.89%  (p=0.002 n=6+6)
Template-12               35.6MB/s ± 3%  35.5MB/s ± 1%    ~     (p=0.615 n=6+6)

Change-Id: I465a47f74227f316e3abea231444f48c7a30ef85
Reviewed-on: https://go-review.googlesource.com/24493
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-08-19 21:24:28 +00:00
Austin Clements 3de7dbb191 runtime: fix check for vacuous page boundary rounding again
The previous fix for this, commit 336dad2a, had everything right in
the commit message, but reversed the test in the code. Fix the test in
the code.

This reversal effectively disabled the scavenger on large page systems
*except* in the rare cases where this code was originally wrong, which
is why it didn't obviously show up in testing.

Fixes #16644. Again. :(

Change-Id: I27cce4aea13de217197db4b628f17860f27ce83e
Reviewed-on: https://go-review.googlesource.com/27402
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-19 20:16:43 +00:00
Austin Clements 244efebe7f runtime: fix out of date comments
The transition from mark 1 to mark 2 no longer enqueues new root
marking jobs, but some of the comments still refer to this. Fix these
comments.

Change-Id: I3f98628dba32c5afe30495ab495da42b32291e9e
Reviewed-on: https://go-review.googlesource.com/24965
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-08-19 18:15:54 +00:00
Josh Bleecher Snyder 604efe1281 runtime: disable TestCgoCallbackGC on FreeBSD
The trybot flakes are a nuisance.

Updates #16396

Change-Id: I8202adb554391676ba82bca44d784c6a81bf2085
Reviewed-on: https://go-review.googlesource.com/27313
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-18 17:13:39 +00:00
David Chase 5b9ff11c3d cmd/compile: ppc64le working, not optimized enough
This time with the cherry-pick from the proper patch of
the old CL.

Stack size increased.
Corrected NaN-comparison glitches.
Marked g register as clobbered by calls.
Fixed shared libraries.

live_ssa.go still disabled because of differences.
Presumably turning on more optimization will fix
both the stack size and the live_ssa.go glitches.

Enhanced debugging output for shared libs test.

Rebased onto master.

Updates #16010.

Change-Id: I40864faf1ef32c118fb141b7ef8e854498e6b2c4
Reviewed-on: https://go-review.googlesource.com/27159
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-08-18 16:34:47 +00:00
Jaana Burcu Dogan c2322b7ea6 runtime: fix the absolute URL to pprof tools
Change-Id: I82eaf5c14a5b8b9ec088409f946adf7b5fd5dbe3
Reviewed-on: https://go-review.googlesource.com/27311
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-17 23:22:53 +00:00
Austin Clements 336dad2a07 runtime: fix check for vacuous page boundary rounding
sysUnused (e.g., madvise MADV_FREE) is only sensible to call on
physical page boundaries, so scavengelist rounds in the bounds of the
region being released to the nearest physical page boundaries.
However, if the region is smaller than a physical page and neither the
start nor end fall on a boundary, then rounding the start up to a page
boundary and the end down to a page boundary will result in end < start.
Currently, we only give up on the region if start == end, so if we
encounter end < start, we'll call madvise with a negative length and
the madvise will fail.

Issue #16644 gives a concrete example of this:

    start = 0x1285ac000
    end   = 0x1285ae000 (1 8K page)

This leads to the rounded values

    start = 0x1285b0000
    end   = 0x1285a0000

which leads to len = -65536.

Fix this by giving up on the region if end <= start, not just if
end == start.

Fixes #16644.

Change-Id: I8300db492dbadc82ac1ad878318b36bcb7c39524
Reviewed-on: https://go-review.googlesource.com/27230
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-17 14:04:16 +00:00
Keith Randall e492d9f018 runtime: fix map iterator concurrent map check
We should check whether there is a concurrent writer at the
start of every mapiternext, not just in mapaccessK (which is
only called during certain map growth situations).

Tests turned off by default because they are inherently flaky.

Fixes #16278

Change-Id: I8b72cab1b8c59d1923bec6fa3eabc932e4e91542
Reviewed-on: https://go-review.googlesource.com/24749
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-08-16 21:52:44 +00:00
Josh Bleecher Snyder 562d06fc23 cmd/compile: inline _, ok = i.(T)
We already inlined

_, ok = e.(T)
_, ok = i.(E)
_, ok = e.(E)

The only ok-only variants not inlined are now

_, ok = i.(I)
_, ok = e.(I)

These call getitab, so are non-trivial.

Change-Id: Ie45fd8933ee179a679b92ce925079b94cff0ee12
Reviewed-on: https://go-review.googlesource.com/26658
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-16 15:24:33 +00:00
Josh Bleecher Snyder 6f74c0774c runtime: move printing of extra newline
No functional changes, makes vet happy.

Updates #11041

Change-Id: I59f3aba46d19b86d605508978652d76a1fe7ac7b
Reviewed-on: https://go-review.googlesource.com/27125
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-16 14:37:17 +00:00
Keith Randall 88c8b7c7f9 Merge remote-tracking branch 'origin/dev.ssa' into merge
Merging from dev.ssa back into master.

Contains complete SSA backends for arm, arm64, 386, amd64p32.
Work in progress for PPC64.

Change-Id: Ifd7075e3ec6f88f776e29f8c7fd55830328897fd
2016-08-15 17:07:16 -07:00
Keith Randall c069bc4996 [dev.ssa] cmd/compile: implement GO386=387
Last part of the 386 SSA port.

Modify the x86 backend to simulate SSE registers and
instructions with 387 registers and instructions.
The simulation isn't terribly performant, but it works,
and the old implementation wasn't very performant either.
Leaving to people who care about 387 to optimize if they want.

Turn on SSA backend for 386 by default.

Fixes #16358

Change-Id: I678fb59132620b2c47e993c1c10c4c21135f70c0
Reviewed-on: https://go-review.googlesource.com/25271
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-10 17:41:01 +00:00
Shenghou Ma 26015b9563 runtime: make stack 16-byte aligned for external code in _rt0_amd64_linux_lib
Fixes #16618.

Change-Id: Iffada12e8672bbdbcf2e787782c497e2c45701b1
Reviewed-on: https://go-review.googlesource.com/25550
Run-TryBot: Minux Ma <minux@golang.org>
Reviewed-by: Arjan Van De Ven <arjan.van.de.ven@intel.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-05 23:56:07 +00:00
Shenghou Ma 9fde86b012 runtime, syscall: fix kernel gettimeofday ABI change on iOS 10
Fixes #16570 on iOS.

Thanks Daniel Burhans for reporting the bug and testing the fix.

Change-Id: I43ae7b78c8f85a131ed3d93ea59da9f32a02cd8f
Reviewed-on: https://go-review.googlesource.com/25481
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-05 20:47:34 +00:00
Keith Randall 01dbfb81a0 [dev.ssa] Merge commit 'f135c326402aaa757aa96aad283a91873d4ae124' into mergebranch
Pick up shared library fix in dev.ssa.

Change-Id: I5bdd0e9e0f1d6f7c14b518343ee323ed9a894b9c
2016-08-04 10:52:24 -07:00
David Crawshaw f135c32640 runtime: initialize hash algs before typemap
When compiling with -buildmode=shared, a map[int32]*_type is created for
each extra module mapping duplicate types back to a canonical object.
This is done in the function typelinksinit, which is called before the
init function that sets up the hash functions for the map
implementation. The result is typemap becomes unusable after
runtime initialization.

The fix in this CL is to move algorithm init before typelinksinit in
the runtime setup process. (For 1.8, we may want to turn typemap into
a sorted slice of types and use binary search.)

Manually tested on GOOS=linux with:

	GOHOSTARCH=386 GOARCH=386 ./make.bash && \
		go install -buildmode=shared std && \
		cd ../test && \
		go run run.go -linkshared

Fixes #16590

Change-Id: Idc08c50cc70d20028276fbf564509d2cd5405210
Reviewed-on: https://go-review.googlesource.com/25469
Run-TryBot: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-04 17:39:05 +00:00
Keith Randall d2286ea284 [dev.ssa] Merge remote-tracking branch 'origin/master' into mergebranch
Semi-regular merge from tip into dev.ssa.

Change-Id: Iadb60e594ef65a99c0e1404b14205fa67c32a9e9
2016-08-04 10:08:20 -07:00
Brad Fitzpatrick 2da5633eb9 runtime: fix nanotime for macOS Sierra, again.
macOS Sierra beta4 changed the kernel interface for getting time.
DX now optionally points to an address for additional info.
Set it to zero to avoid corrupting memory.

Fixes #16570

Change-Id: I9f537e552682045325cdbb68b7d0b4ddafade14a
Reviewed-on: https://go-review.googlesource.com/25400
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Quentin Smith <quentin@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-02 20:17:50 +00:00
Rhys Hiltner ccca9c9cc0 runtime: reduce GC assist extra credit
Mutator goroutines that allocate memory during the concurrent mark
phase are required to spend some time assisting the garbage
collector. The magnitude of this mandatory assistance is proportional
to the goroutine's allocation debt and subject to the assistance
ratio as calculated by the pacer.

When assisting the garbage collector, a mutator goroutine will go
beyond paying off its allocation debt. It will build up extra credit
to amortize the overhead of the assist.

In fast-allocating applications with high assist ratios, building up
this credit can take the affected goroutine's entire time slice.
Reduce the penalty on each goroutine being selected to assist the GC
in two ways, to spread the responsibility more evenly.

First, do a consistent amount of extra scan work without regard for
the pacer's assistance ratio. Second, reduce the magnitude of the
extra scan work so it can be completed within a few hundred
microseconds.

Commentary on gcOverAssistWork is by Austin Clements, originally in
https://golang.org/cl/24704

Updates #14812
Fixes #16432

Change-Id: I436f899e778c20daa314f3e9f0e2a1bbd53b43e1
Reviewed-on: https://go-review.googlesource.com/25155
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Chris Broadfoot <cbro@golang.org>
2016-07-27 18:56:04 +00:00
Austin Clements b11fff3886 runtime/pprof: document use of pprof package
Currently the pprof package gives almost no guidance for how to use it
and, despite the standard boilerplate used to create CPU and memory
profiles, this boilerplate appears nowhere in the pprof documentation.

Update the pprof package documentation to give the standard
boilerplate in a form people can copy, paste, and tweak. This
boilerplate is based on rsc's 2011 blog post on profiling Go programs
at https://blog.golang.org/profiling-go-programs, which is where I
always go when I need to copy-paste the boilerplate.

Change-Id: I74021e494ea4dcc6b56d6fb5e59829ad4bb7b0be
Reviewed-on: https://go-review.googlesource.com/25182
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-07-26 22:16:55 +00:00
Keith Randall df2f813bd2 [dev.ssa] cmd/compile: 386 port now works
GOARCH=386 SSATEST=1 ./all.bash passes

Caveat: still needs changes to test/ files to use *_ssa.go versions.  I
won't check those changes in with this CL because the builders will
complain as they don't have SSATEST=1.

Mostly minor fixes.

Implement float <-> uint32 in assembly.  It seems the simplest option
for now.

GO386=387 does not work.  That's why I can't make SSA the default for
386 yet.

Change-Id: Ic4d4402104d32bcfb1fd612f5bb6539f9acb8ae0
Reviewed-on: https://go-review.googlesource.com/25119
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-07-21 20:41:18 +00:00
Ian Lance Taylor ff227b8a56 runtime: add explicit `INT $3` at end of Darwin amd64 sigtramp
The omission of this instruction could confuse the traceback code if a
SIGPROF occurred during a signal handler.  The traceback code would
trace up to sigtramp, but would then get confused because it would see a
PC address that did not appear to be in the function.

Fixes #16453.

Change-Id: I2b3d53e0b272fb01d9c2cb8add22bad879d3eebc
Reviewed-on: https://go-review.googlesource.com/25104
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-07-21 01:04:22 +00:00
Austin Clements f407ca9288 runtime: support smaller physical pages than PhysPageSize
Most operations need an upper bound on the physical page size, which
is what sys.PhysPageSize is for (this is checked at runtime init on
Linux). However, a few operations need a *lower* bound on the physical
page size. Introduce a "minPhysPageSize" constant to act as this lower
bound and use it where it makes sense:

1) In addrspace_free, we have to query each page in the given range.
   Currently we increment by the upper bound on the physical page
   size, which means we may skip over pages if the true size is
   smaller. Worse, we currently pass a result buffer that only has
   enough room for one page. If there are actually multiple pages in
   the range passed to mincore, the kernel will overflow this buffer.
   Fix these problems by incrementing by the lower-bound on the
   physical page size and by passing "1" for the length, which the
   kernel will round up to the true physical page size.

2) In the write barrier, the bad pointer check tests for pointers to
   the first physical page, which are presumably small integers
   masquerading as pointers. However, if physical pages are smaller
   than we think, we may have legitimate pointers below
   sys.PhysPageSize. Hence, use minPhysPageSize for this test since
   pointers should never fall below that.

In particular, this applies to ARM64 and MIPS. The runtime is
configured to use 64kB pages on ARM64, but by default Linux uses 4kB
pages. Similarly, the runtime assumes 16kB pages on MIPS, but both 4kB
and 16kB kernel configurations are common. This also applies to ARM on
systems where the runtime is recompiled to deal with a larger page
size. It is also a step toward making the runtime use only a
dynamically-queried page size.

Change-Id: I1fdfd18f6e7cbca170cc100354b9faa22fde8a69
Reviewed-on: https://go-review.googlesource.com/25020
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Austin Clements <austin@google.com>
2016-07-20 18:28:43 +00:00
Cherry Zhang 7b9873b9b9 [dev.ssa] cmd/internal/obj, etc.: add and use NEGF, NEGD instructions on ARM
Updates #15365.

Change-Id: I372a5617c2c7d91de545cac0464809b96711b63a
Reviewed-on: https://go-review.googlesource.com/24646
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2016-07-20 18:15:37 +00:00
Dmitry Vyukov d73ca5f4d8 runtime/race: fix memory leak
The leak was reported internally on a sever canary that runs for days.
After a day server consumes 5.6GB, after 6 days -- 12.2GB.
The leak is exposed by the added benchmark.
The leak is fixed upstream in :
http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/tsan/rtl/tsan_rtl_thread.cc?view=diff&r1=276102&r2=276103&pathrev=276103

Fixes #16441

Change-Id: I9d4f0adef48ca6cf2cd781b9a6990ad4661ba49b
Reviewed-on: https://go-review.googlesource.com/25091
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
2016-07-20 14:17:44 +00:00
Ian Lance Taylor 50048a4e8e runtime: add as many extra M's as needed
When a non-Go thread calls into Go, the runtime needs an M to run the Go
code. The runtime keeps a list of extra M's available. When the last
extra M is allocated, the needextram field is set to tell it to allocate
a new extra M as soon as it is running in Go. This ensures that an extra
M will always be available for the next thread.

However, if many threads need an extra M at the same time, this
serializes them all. One thread will get an extra M with the needextram
field set. All the other threads will see that there is no M available
and will go to sleep. The one thread that succeeded will create a new
extra M. One lucky thread will get it. All the other threads will see
that there is no M available and will go to sleep. The effect is
thundering herd, as all the threads looking for an extra M go through
the process one by one. This seems to have a particularly bad effect on
the FreeBSD scheduler for some reason.

With this change, we track the number of threads waiting for an M, and
create all of them as soon as one thread gets through. This still means
that all the threads will fight for the lock to pick up the next M. But
at least each thread that gets the lock will succeed, instead of going
to sleep only to fight again.

This smooths out the performance greatly on FreeBSD, reducing the
average wall time of `testprogcgo CgoCallbackGC` by 74%.  On GNU/Linux
the average wall time goes down by 9%.

Fixes #13926
Fixes #16396

Change-Id: I6dc42a4156085a7ed4e5334c60b39db8f8ef8fea
Reviewed-on: https://go-review.googlesource.com/25047
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2016-07-20 13:31:55 +00:00
Cherry Zhang 7d70f84f54 [dev.ssa] cmd/compile: add floating point optimizations in SSA for ARM
Add some simplification rules for floating point ops.

cmd/internal/obj/arm supports instructions that compare FP register
to 0, but runtime softfloat simulator does not. This CL adds these
instructions to softfloat simulator as well.

Updates #15365.

Change-Id: I29405b2bfcb4c8cf106cb7a1a811409fec91b170
Reviewed-on: https://go-review.googlesource.com/24790
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-07-16 03:13:22 +00:00
Josh Bleecher Snyder 4054769a31 runtime/internal/atomic: fix assembly arg sizes
Change-Id: I80ccf40cd3930aff908ee64f6dcbe5f5255198d3
Reviewed-on: https://go-review.googlesource.com/24914
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-07-14 16:35:37 +00:00
Ian Lance Taylor 29ed5da5f2 runtime/pprof: don't print extraneous 0 after goexit
This fixes erroneous handling of the more result parameter of
runtime.Frames.Next.

Fixes #16349.

Change-Id: I4f1c0263dafbb883294b31dbb8922b9d3e650200
Reviewed-on: https://go-review.googlesource.com/24911
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-07-13 21:18:19 +00:00
Keith Randall efefd11725 [dev.ssa] Merge remote-tracking branch 'origin/master' into mergebranch
Semi-regular merge of tip into dev.ssa.

Change-Id: I855817c4746237792a2dab6eaf471087a3646be4
2016-07-13 11:12:44 -07:00
Ian Lance Taylor b30814bbd6 runtime: add ctxt parameter to cgocallback called from Go
The cgocallback function picked up a ctxt parameter in CL 22508.
That CL updated the assembler implementation, but there are a few
mentions in Go code that were not updated. This CL fixes that.

Fixes #16326

Change-Id: I5f68e23565c6a0b11057aff476d13990bff54a66
Reviewed-on: https://go-review.googlesource.com/24848
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
2016-07-12 16:39:00 +00:00
Ian Lance Taylor 12f2b4ff0e runtime: fix case in KeepAlive comment
Fixes #16299.

Change-Id: I76f541c7f11edb625df566f2f1035147b8bcd9dd
Reviewed-on: https://go-review.googlesource.com/24830
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-07-08 16:50:26 +00:00
Ian Lance Taylor fad2bbdc6a runtime: fix nanotime for macOS Sierra
In the beta version of the macOS Sierra (10.12) release, the
gettimeofday system call changed on x86. Previously it always returned
the time in the AX/DX registers. Now, if AX is returned as 0, it means
that the system call has stored the values into the memory pointed to by
the first argument, just as the libc gettimeofday function does. The
libc function handles both cases, and we need to do so as well.

Fixes #16272.

Change-Id: Ibe5ad50a2c5b125e92b5a4e787db4b5179f6b723
Reviewed-on: https://go-review.googlesource.com/24812
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-07-08 03:17:18 +00:00
Ian Lance Taylor 84bb9e62f0 runtime: handle selects with duplicate channels in shrinkstack
The shrinkstack code locks all the channels a goroutine is waiting for,
but didn't handle the case of the same channel appearing in the list
multiple times. This led to a deadlock. The channels are sorted so it's
easy to avoid locking the same channel twice.

Fixes #16286.

Change-Id: Ie514805d0532f61c942e85af5b7b8ac405e2ff65
Reviewed-on: https://go-review.googlesource.com/24815
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-07-08 02:05:40 +00:00
Austin Clements 9c8809f82a runtime/internal/sys: implement Ctz and Bswap in assembly for 386
Ctz is a hot-spot in the Go 1.7 memory manager. In SSA it's
implemented as an intrinsic that compiles to a few instructions, but
on the old backend (all architectures other than amd64), it's
implemented as a fairly complex Go function. As a result, switching to
bitmap-based allocation was a significant hit to allocation-heavy
workloads like BinaryTree17 on non-SSA platforms.

For unknown reasons, this hit 386 particularly hard. We can regain a
lot of the lost performance by implementing Ctz in assembly on the
386. This isn't as good as an intrinsic, since it still generates a
function call and prevents useful inlining, but it's much better than
the pure Go implementation:

name                      old time/op    new time/op    delta
BinaryTree17-12              3.59s ± 1%     3.06s ± 1%  -14.74%  (p=0.000 n=19+20)
Fannkuch11-12                3.72s ± 1%     3.64s ± 1%   -2.09%  (p=0.000 n=17+19)
FmtFprintfEmpty-12          52.3ns ± 3%    52.3ns ± 3%     ~     (p=0.829 n=20+19)
FmtFprintfString-12          156ns ± 1%     148ns ± 3%   -5.20%  (p=0.000 n=18+19)
FmtFprintfInt-12             137ns ± 1%     136ns ± 1%   -0.56%  (p=0.000 n=19+13)
FmtFprintfIntInt-12          227ns ± 2%     225ns ± 2%   -0.93%  (p=0.000 n=19+17)
FmtFprintfPrefixedInt-12     210ns ± 1%     208ns ± 1%   -0.91%  (p=0.000 n=19+17)
FmtFprintfFloat-12           375ns ± 1%     371ns ± 1%   -1.06%  (p=0.000 n=19+18)
FmtManyArgs-12               995ns ± 2%     978ns ± 1%   -1.63%  (p=0.000 n=17+17)
GobDecode-12                9.33ms ± 1%    9.19ms ± 0%   -1.59%  (p=0.000 n=20+17)
GobEncode-12                7.73ms ± 1%    7.73ms ± 1%     ~     (p=0.771 n=19+20)
Gzip-12                      375ms ± 1%     374ms ± 1%     ~     (p=0.141 n=20+18)
Gunzip-12                   61.8ms ± 1%    61.8ms ± 1%     ~     (p=0.602 n=20+20)
HTTPClientServer-12         87.7µs ± 2%    86.9µs ± 3%   -0.87%  (p=0.024 n=19+20)
JSONEncode-12               20.2ms ± 1%    20.4ms ± 0%   +0.53%  (p=0.000 n=18+19)
JSONDecode-12               65.3ms ± 0%    65.4ms ± 1%     ~     (p=0.385 n=16+19)
Mandelbrot200-12            4.11ms ± 1%    4.12ms ± 0%   +0.29%  (p=0.020 n=19+19)
GoParse-12                  3.75ms ± 1%    3.61ms ± 2%   -3.90%  (p=0.000 n=20+20)
RegexpMatchEasy0_32-12       104ns ± 0%     103ns ± 0%   -0.96%  (p=0.000 n=13+16)
RegexpMatchEasy0_1K-12       805ns ± 1%     803ns ± 1%     ~     (p=0.189 n=18+18)
RegexpMatchEasy1_32-12       111ns ± 0%     111ns ± 3%     ~     (p=1.000 n=14+19)
RegexpMatchEasy1_1K-12      1.00µs ± 1%    1.00µs ± 1%   +0.50%  (p=0.003 n=19+19)
RegexpMatchMedium_32-12      133ns ± 2%     133ns ± 2%     ~     (p=0.218 n=20+20)
RegexpMatchMedium_1K-12     41.2µs ± 1%    42.2µs ± 1%   +2.52%  (p=0.000 n=18+16)
RegexpMatchHard_32-12       2.35µs ± 1%    2.38µs ± 1%   +1.53%  (p=0.000 n=18+18)
RegexpMatchHard_1K-12       70.9µs ± 2%    72.0µs ± 1%   +1.42%  (p=0.000 n=19+17)
Revcomp-12                   1.06s ± 0%     1.05s ± 0%   -1.36%  (p=0.000 n=20+18)
Template-12                 86.2ms ± 1%    84.6ms ± 0%   -1.89%  (p=0.000 n=20+18)
TimeParse-12                 425ns ± 2%     428ns ± 1%   +0.77%  (p=0.000 n=18+19)
TimeFormat-12                517ns ± 1%     519ns ± 1%   +0.43%  (p=0.001 n=20+19)
[Geo mean]                  74.3µs         73.5µs        -1.05%

Prior to this commit, BinaryTree17-12 on 386 was 33% slower than at
the go1.6 tag. With this commit, it's 13% slower.

On arm and arm64, BinaryTree17-12 is only ~5% slower than it was at
go1.6. It may be worth implementing Ctz for them as well.

I consider this change low risk, since the functions it replaces are
simple, very well specified, and well tested.

For #16117.

Change-Id: Ic39d851d5aca91330134596effd2dab9689ba066
Reviewed-on: https://go-review.googlesource.com/24640
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-30 19:35:44 +00:00
Dmitry Vyukov bb337372fb runtime: fix race atomic operations on external memory
The assembly is broken: it does `MOVQ g(R12), R14` expecting that
R12 contains tls address, but it does not do get_tls(R12) before.
This magically works on linux: `MOVQ g(R12), R14` is compiled to
`mov %fs:0xfffffffffffffff8,%r14` which does not use R12.
But it crashes on windows.

Add explicit `get_tls(R12)`.

Fixes #16206

Change-Id: Ic1f21a6fef2473bcf9147de6646929781c9c1e98
Reviewed-on: https://go-review.googlesource.com/24590
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-29 15:30:54 +00:00
Ian Lance Taylor 25a609556a runtime: correct printing of blocked field in scheduler trace
When the blocked field was first introduced back in
https://golang.org/cl/61250043 the scheduler trace code incorrectly used
m->blocked instead of mp->blocked.  That has carried through the
conversion to Go.  This CL fixes it.

Change-Id: Id81907b625221895aa5c85b9853f7c185efd8f4b
Reviewed-on: https://go-review.googlesource.com/24571
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-06-29 01:38:39 +00:00
Ian Lance Taylor c7ae41e577 runtime: better error message for newosproc failure
If creating a new thread fails with EAGAIN, point the user at ulimit.

Fixes #15476.

Change-Id: Ib36519614b5c72776ea7f218a0c62df1dd91a8ea
Reviewed-on: https://go-review.googlesource.com/24570
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-06-29 01:37:19 +00:00
David Crawshaw ed9362f769 reflect, runtime: optimize Name method
Several minor changes that remove a good chunk of the overhead added
to the reflect Name method over the 1.7 cycle, as seen from the
non-SSA architectures.

In particular, there are ~20 fewer instructions in reflect.name.name
on 386, and the method now qualifies for inlining.

The simple JSON decoding benchmark on darwin/386:

	name           old time/op    new time/op    delta
	CodeDecoder-8    49.2ms ± 0%    48.9ms ± 1%  -0.77%  (p=0.000 n=10+9)

	name           old speed      new speed      delta
	CodeDecoder-8  39.4MB/s ± 0%  39.7MB/s ± 1%  +0.77%  (p=0.000 n=10+9)

On darwin/amd64 the effect is less pronounced:

	name           old time/op    new time/op    delta
	CodeDecoder-8    38.9ms ± 0%    38.7ms ± 1%  -0.38%  (p=0.005 n=10+10)

	name           old speed      new speed      delta
	CodeDecoder-8  49.9MB/s ± 0%  50.1MB/s ± 1%  +0.38%  (p=0.006 n=10+10)

Counterintuitively, I get much more useful benchmark data out of my
MacBook Pro than a linux workstation with more expensive Intel chips.
While the laptop has fewer cores and an active GUI, the single-threaded
performance is significantly better (nearly 1.5x decoding throughput)
so the differences are more pronounced.

For #16117.

Change-Id: I4e0cc1cc2d271d47d5127b1ee1ca926faf34cabf
Reviewed-on: https://go-review.googlesource.com/24510
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-06-28 12:28:05 +00:00
Lynn Boger b75b0630fe runtime/internal/atomic: Use power5 compatible instructions for ppc64
This modifies a recent performance improvement to the
And8 and Or8 atomic functions which required both ppc64le
and ppc64 to use power8 instructions. Since then it was
decided that ppc64 (BE) should work for power5 and later.
This change uses instructions compatible with power5 for
ppc64 and uses power8 for ppc64le.

Fixes #16004

Change-Id: I623c75e8e6fd1fa063a53d250d86cdc9d0890dc7
Reviewed-on: https://go-review.googlesource.com/24181
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Andrew Gerrand <adg@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-28 04:49:33 +00:00
Raul Silvera c0e5d44506 runtime/pprof: update comments to point to new pprof
In the comments for this file there is a reference to gperftools
for more info on pprof. pprof now live on its own repo on github,
and the version in gperftools is deprecated.

Change-Id: I8a188f129534f73edd132ef4e5a2d566e69df7e9
Reviewed-on: https://go-review.googlesource.com/24502
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-06-27 18:06:11 +00:00
David Crawshaw 797dc58457 cmd/compile, etc: use tflag to optimize Name()==""
Improves JSON decoding benchmark:

	name                  old time/op    new time/op    delta
	CodeDecoder-8           41.3ms ± 6%    39.8ms ± 1%  -3.61%  (p=0.000 n=10+10)

	name                  old speed      new speed      delta
	CodeDecoder-8         47.0MB/s ± 6%  48.7MB/s ± 1%  +3.66%  (p=0.000 n=10+10)

Change-Id: I524ee05c432fad5252e79b29222ec635c1dee4b4
Reviewed-on: https://go-review.googlesource.com/24452
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-24 20:05:34 +00:00
David Crawshaw e369490fb7 cmd/compile, etc: bring back ptrToThis
This was removed in CL 19695 but it slows down reflect.New, which ends
up on the hot path of things like JSON decoding.

There is no immediate cost in binary size, but it will make it harder to
further shrink run time type information in Go 1.8.

Before

	BenchmarkNew-40         30000000                36.3 ns/op

After

	BenchmarkNew-40         50000000                29.5 ns/op

Fixes #16161
Updates #16117

Change-Id: If7cb7f3e745d44678f3f5cf3a5338c59847529d2
Reviewed-on: https://go-review.googlesource.com/24400
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-06-23 17:39:38 +00:00
Ian Lance Taylor 252eda470a cmd/pprof: don't use offset if we don't have a start address
The test is in the runtime package because there are other tests of
pprof there. At some point we should probably move them all into a pprof
testsuite.

Fixes #16128.

Change-Id: Ieefa40c61cf3edde11fe0cf04da1debfd8b3d7c0
Reviewed-on: https://go-review.googlesource.com/24274
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
2016-06-21 01:44:38 +00:00
Ian Lance Taylor 09834d1c08 runtime: panic with the right error on iface conversion
A straight conversion from a type T to an interface type I, where T does
not implement I, should always panic with an interface conversion error
that shows the missing method.  This was not happening if the conversion
was done once using the comma-ok form (the result would not be OK) and
then again in a straight conversion.  Due to an error in the runtime
package the second conversion was failing with a nil pointer
dereference.

Fixes #16130.

Change-Id: I8b9fca0f1bb635a6181b8b76de8c2385bb7ac2d2
Reviewed-on: https://go-review.googlesource.com/24284
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-06-21 01:43:42 +00:00
Ian Lance Taylor 659b9a19aa runtime: set PPROF_TMPDIR before running pprof
Fixes #16121.

Change-Id: I7b838fb6fb9f098e6c348d67379fdc81fb0d69a4
Reviewed-on: https://go-review.googlesource.com/24270
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2016-06-20 23:58:59 +00:00
Austin Clements 9e8fa1e99c runtime: eliminate poisonStack checks
We haven't used poisonStack since we switched to 1-bit stack maps
(4d0f3a1), but the checks are still there. However, nothing prevents
us from genuinely allocating an object at this address on 32-bit and
causing the runtime to crash claiming that it's found a bad pointer.

Since we're not using poisonStack anyway, just pull it out.

Fixes #15831.

Change-Id: Ia6ef604675b8433f75045e369f5acd4644a5bb38
Reviewed-on: https://go-review.googlesource.com/24211
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2016-06-17 15:18:39 +00:00
Austin Clements fca9fc52c8 runtime: fix stale comment in lfstack
Change-Id: I6ef08f6078190dc9df0b2df4f26a76456602f5e8
Reviewed-on: https://go-review.googlesource.com/24176
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-06-16 19:45:33 +00:00
Ian Lance Taylor ea2ac3fe5f runtime: remove useless loop from CgoCCodeSIGPROF test program
I verified that the test fails if I undo the change that it tests for.

Updates #14732.

Change-Id: Ib30352580236adefae946450ddd6cd65a62b7cdf
Reviewed-on: https://go-review.googlesource.com/24151
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Mikio Hara <mikioh.mikioh@gmail.com>
2016-06-16 03:52:18 +00:00
Ian Lance Taylor 26d6dc6bf8 runtime: if the test program hangs, try to get a stack trace
This is an attempt to get more information for #14809, which seems to
occur rarely.

Updates #14809.

Change-Id: Idbeb136ceb57993644e03266622eb699d2685d02
Reviewed-on: https://go-review.googlesource.com/24110
Reviewed-by: Mikio Hara <mikioh.mikioh@gmail.com>
Reviewed-by: Austin Clements <austin@google.com>
2016-06-15 15:03:48 +00:00
David Crawshaw af0fc83985 cmd/compile, etc: handle many struct fields
This adds 8 bytes of binary size to every type that has methods. It is
the smallest change I could come up with for 1.7.

Fixes #16037

Change-Id: Ibe15c3165854a21768596967757864b880dbfeed
Reviewed-on: https://go-review.googlesource.com/24070
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-14 15:32:34 +00:00
Keith Randall 0393ed8201 [dev.ssa] Merge remote-tracking branch 'origin/master' into mergebranch
Change-Id: Idd150294aaeced0176b53d6b95852f5d21ff4fdc
2016-06-14 07:34:09 -07:00
Ian Lance Taylor 84d8aff94c runtime: collect stack trace if SIGPROF arrives on non-Go thread
Fixes #15994.

Change-Id: I5aca91ab53985ac7dcb07ce094ec15eb8ec341f8
Reviewed-on: https://go-review.googlesource.com/23891
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-13 21:43:19 +00:00
Keith Randall c83e6f50d9 runtime: aeshash, xor seed in earlier
Instead of doing:

x = input
one round of aes on x
x ^= seed
two rounds of aes on x

Do:

x = input
x ^= seed
three rounds of aes on x

This change provides some additional seed-dependent scrambling
which should help prevent collisions.

Change-Id: I02c774d09c2eb6917cf861513816a1024a9b65d7
Reviewed-on: https://go-review.googlesource.com/23577
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-11 00:35:47 +00:00
Cherry Zhang cbc26869b7 runtime: set $sp before $pc in gdb python script
When setting $pc, gdb does a backtrace using the current value of $sp,
and it may complain if $sp does not match that $pc (although the
assignment went through successfully).

This happens with ARM SSA backend: when setting $pc it prints
> Cannot access memory at address 0x0

As well as occasionally on MIPS64:
> warning: GDB can't find the start of the function at 0xc82003fe07.
> ...

Setting $sp before setting $pc makes it happy.

Change-Id: Idd96dbef3e9b698829da553c6d71d5b4c6d492db
Reviewed-on: https://go-review.googlesource.com/23940
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-06-09 20:02:59 +00:00
Michael Munday 0324a3f828 runtime/cgo: restore the g pointer correctly in crosscall_s390x
R13 needs to be set to g because C code may have clobbered R13.

Fixes #16006.

Change-Id: I66311fe28440e85e589a1695fa1c42416583b4c6
Reviewed-on: https://go-review.googlesource.com/23910
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-08 18:09:47 +00:00