Commit Graph

2283 Commits

Author SHA1 Message Date
Austin Clements 336dad2a07 runtime: fix check for vacuous page boundary rounding
sysUnused (e.g., madvise MADV_FREE) is only sensible to call on
physical page boundaries, so scavengelist rounds the bounds of the
region being released inward to the nearest physical page boundaries.
However, if the region is smaller than a physical page and neither the
start nor the end falls on a boundary, then rounding the start up to a
page boundary and the end down to a page boundary will result in end < start.
Currently, we only give up on the region if start == end, so if we
encounter end < start, we'll call madvise with a negative length and
the madvise will fail.

Issue #16644 gives a concrete example of this:

    start = 0x1285ac000
    end   = 0x1285ae000 (1 8K page)

This leads to the rounded values

    start = 0x1285b0000
    end   = 0x1285a0000

which leads to len = -65536.

Fix this by giving up on the region if end <= start, not just if
end == start.
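
For illustration, a minimal, hedged sketch of the failing rounding using the
numbers from issue #16644 above; this is not the actual scavengelist code,
and physPageSize here simply stands in for the 64KB physical page size from
that report:

    package main

    import "fmt"

    const physPageSize = 0x10000 // assumed 64KB physical pages, as in the issue

    func main() {
        start, end := uint64(0x1285ac000), uint64(0x1285ae000)
        rs := (start + physPageSize - 1) &^ (physPageSize - 1) // round start up
        re := end &^ (physPageSize - 1)                        // round end down
        if re <= rs {                                          // the fixed check (was re == rs)
            fmt.Println("sub-page region; skip the madvise call")
            return
        }
        fmt.Printf("madvise %d bytes\n", re-rs)
    }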

Fixes #16644.

Change-Id: I8300db492dbadc82ac1ad878318b36bcb7c39524
Reviewed-on: https://go-review.googlesource.com/27230
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-17 14:04:16 +00:00
Keith Randall e492d9f018 runtime: fix map iterator concurrent map check
We should check whether there is a concurrent writer at the
start of every mapiternext, not just in mapaccessK (which is
only called during certain map growth situations).
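
A hedged sketch of the misuse this check now catches (intentionally racy and
expected to crash on a fixed runtime with a concurrent map iteration and map
write error, rather than going undetected through the iterator):

    package main

    func main() {
        m := make(map[int]int)
        go func() {
            for i := 0; ; i++ {
                m[i] = i // concurrent writer
            }
        }()
        for {
            for range m { // iterate while the other goroutine writes
            }
        }
    }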

Tests turned off by default because they are inherently flaky.

Fixes #16278

Change-Id: I8b72cab1b8c59d1923bec6fa3eabc932e4e91542
Reviewed-on: https://go-review.googlesource.com/24749
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-08-16 21:52:44 +00:00
Josh Bleecher Snyder 562d06fc23 cmd/compile: inline _, ok = i.(T)
We already inlined

_, ok = e.(T)
_, ok = i.(E)
_, ok = e.(E)

The only ok-only variants not inlined are now

_, ok = i.(I)
_, ok = e.(I)

These call getitab, so are non-trivial.
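
For reference, a sketch of the assertion forms named above, using the
commit's shorthand (e: empty-interface value, i: non-empty interface value,
T: concrete type, E: empty interface type, I: non-empty interface type); the
types here are made up for illustration:

    package main

    import "fmt"

    type I interface{ M() }
    type T struct{}

    func (T) M() {}

    func main() {
        var e interface{} = T{}
        var i I = T{}

        _, ok1 := e.(T)           // _, ok = e.(T): already inlined
        _, ok2 := i.(T)           // _, ok = i.(T): inlined by this CL
        _, ok3 := i.(interface{}) // _, ok = i.(E): already inlined
        _, ok4 := e.(interface{}) // _, ok = e.(E): already inlined
        _, ok5 := i.(I)           // _, ok = i.(I): calls getitab, not inlined
        _, ok6 := e.(I)           // _, ok = e.(I): calls getitab, not inlined
        fmt.Println(ok1, ok2, ok3, ok4, ok5, ok6)
    }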

Change-Id: Ie45fd8933ee179a679b92ce925079b94cff0ee12
Reviewed-on: https://go-review.googlesource.com/26658
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-16 15:24:33 +00:00
Josh Bleecher Snyder 6f74c0774c runtime: move printing of extra newline
No functional changes, makes vet happy.

Updates #11041

Change-Id: I59f3aba46d19b86d605508978652d76a1fe7ac7b
Reviewed-on: https://go-review.googlesource.com/27125
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-16 14:37:17 +00:00
Keith Randall 88c8b7c7f9 Merge remote-tracking branch 'origin/dev.ssa' into merge
Merging from dev.ssa back into master.

Contains complete SSA backends for arm, arm64, 386, amd64p32.
Work in progress for PPC64.

Change-Id: Ifd7075e3ec6f88f776e29f8c7fd55830328897fd
2016-08-15 17:07:16 -07:00
Keith Randall c069bc4996 [dev.ssa] cmd/compile: implement GO386=387
Last part of the 386 SSA port.

Modify the x86 backend to simulate SSE registers and
instructions with 387 registers and instructions.
The simulation isn't terribly performant, but it works,
and the old implementation wasn't very performant either.
Leaving it to people who care about 387 to optimize if they want.

Turn on SSA backend for 386 by default.

Fixes #16358

Change-Id: I678fb59132620b2c47e993c1c10c4c21135f70c0
Reviewed-on: https://go-review.googlesource.com/25271
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-10 17:41:01 +00:00
Shenghou Ma 26015b9563 runtime: make stack 16-byte aligned for external code in _rt0_amd64_linux_lib
Fixes #16618.

Change-Id: Iffada12e8672bbdbcf2e787782c497e2c45701b1
Reviewed-on: https://go-review.googlesource.com/25550
Run-TryBot: Minux Ma <minux@golang.org>
Reviewed-by: Arjan Van De Ven <arjan.van.de.ven@intel.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-05 23:56:07 +00:00
Shenghou Ma 9fde86b012 runtime, syscall: fix kernel gettimeofday ABI change on iOS 10
Fixes #16570 on iOS.

Thanks Daniel Burhans for reporting the bug and testing the fix.

Change-Id: I43ae7b78c8f85a131ed3d93ea59da9f32a02cd8f
Reviewed-on: https://go-review.googlesource.com/25481
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-05 20:47:34 +00:00
Keith Randall 01dbfb81a0 [dev.ssa] Merge commit 'f135c326402aaa757aa96aad283a91873d4ae124' into mergebranch
Pick up shared library fix in dev.ssa.

Change-Id: I5bdd0e9e0f1d6f7c14b518343ee323ed9a894b9c
2016-08-04 10:52:24 -07:00
David Crawshaw f135c32640 runtime: initialize hash algs before typemap
When compiling with -buildmode=shared, a map[int32]*_type is created for
each extra module mapping duplicate types back to a canonical object.
This is done in the function typelinksinit, which is called before the
init function that sets up the hash functions for the map
implementation. The result is that typemap becomes unusable after
runtime initialization.

The fix in this CL is to move algorithm init before typelinksinit in
the runtime setup process. (For 1.8, we may want to turn typemap into
a sorted slice of types and use binary search.)
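
A hedged sketch of that parenthetical suggestion (a sorted slice searched
with sort.Search instead of a map that needs its hash functions initialized
first); the typ struct and its hash field are made up for illustration:

    package main

    import (
        "fmt"
        "sort"
    )

    type typ struct{ hash uint32 }

    // findType looks up a type by hash in a slice sorted by hash.
    func findType(sorted []*typ, h uint32) *typ {
        i := sort.Search(len(sorted), func(i int) bool { return sorted[i].hash >= h })
        if i < len(sorted) && sorted[i].hash == h {
            return sorted[i]
        }
        return nil
    }

    func main() {
        ts := []*typ{{hash: 1}, {hash: 5}, {hash: 9}}
        fmt.Println(findType(ts, 5) != nil, findType(ts, 7) != nil) // true false
    }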

Manually tested on GOOS=linux with:

	GOHOSTARCH=386 GOARCH=386 ./make.bash && \
		go install -buildmode=shared std && \
		cd ../test && \
		go run run.go -linkshared

Fixes #16590

Change-Id: Idc08c50cc70d20028276fbf564509d2cd5405210
Reviewed-on: https://go-review.googlesource.com/25469
Run-TryBot: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-04 17:39:05 +00:00
Keith Randall d2286ea284 [dev.ssa] Merge remote-tracking branch 'origin/master' into mergebranch
Semi-regular merge from tip into dev.ssa.

Change-Id: Iadb60e594ef65a99c0e1404b14205fa67c32a9e9
2016-08-04 10:08:20 -07:00
Brad Fitzpatrick 2da5633eb9 runtime: fix nanotime for macOS Sierra, again.
macOS Sierra beta4 changed the kernel interface for getting time.
DX now optionally points to an address for additional info.
Set it to zero to avoid corrupting memory.

Fixes #16570

Change-Id: I9f537e552682045325cdbb68b7d0b4ddafade14a
Reviewed-on: https://go-review.googlesource.com/25400
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Quentin Smith <quentin@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-02 20:17:50 +00:00
Rhys Hiltner ccca9c9cc0 runtime: reduce GC assist extra credit
Mutator goroutines that allocate memory during the concurrent mark
phase are required to spend some time assisting the garbage
collector. The magnitude of this mandatory assistance is proportional
to the goroutine's allocation debt and subject to the assistance
ratio as calculated by the pacer.

When assisting the garbage collector, a mutator goroutine will go
beyond paying off its allocation debt. It will build up extra credit
to amortize the overhead of the assist.

In fast-allocating applications with high assist ratios, building up
this credit can take the affected goroutine's entire time slice.
Reduce the penalty on each goroutine being selected to assist the GC
in two ways, to spread the responsibility more evenly.

First, do a consistent amount of extra scan work without regard for
the pacer's assistance ratio. Second, reduce the magnitude of the
extra scan work so it can be completed within a few hundred
microseconds.

Commentary on gcOverAssistWork is by Austin Clements, originally in
https://golang.org/cl/24704

Updates #14812
Fixes #16432

Change-Id: I436f899e778c20daa314f3e9f0e2a1bbd53b43e1
Reviewed-on: https://go-review.googlesource.com/25155
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Chris Broadfoot <cbro@golang.org>
2016-07-27 18:56:04 +00:00
Austin Clements b11fff3886 runtime/pprof: document use of pprof package
Currently the pprof package gives almost no guidance for how to use it
and, despite the standard boilerplate used to create CPU and memory
profiles, this boilerplate appears nowhere in the pprof documentation.

Update the pprof package documentation to give the standard
boilerplate in a form people can copy, paste, and tweak. This
boilerplate is based on rsc's 2011 blog post on profiling Go programs
at https://blog.golang.org/profiling-go-programs, which is where I
always go when I need to copy-paste the boilerplate.
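
Roughly the shape of the CPU-profiling boilerplate being documented (a
sketch; the exact code added to the package docs may differ):

    package main

    import (
        "flag"
        "log"
        "os"
        "runtime/pprof"
    )

    var cpuprofile = flag.String("cpuprofile", "", "write cpu profile to file")

    func main() {
        flag.Parse()
        if *cpuprofile != "" {
            f, err := os.Create(*cpuprofile)
            if err != nil {
                log.Fatal(err)
            }
            if err := pprof.StartCPUProfile(f); err != nil {
                log.Fatal(err)
            }
            defer pprof.StopCPUProfile()
        }
        // ... the program being profiled ...
    }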

Change-Id: I74021e494ea4dcc6b56d6fb5e59829ad4bb7b0be
Reviewed-on: https://go-review.googlesource.com/25182
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-07-26 22:16:55 +00:00
Keith Randall df2f813bd2 [dev.ssa] cmd/compile: 386 port now works
GOARCH=386 SSATEST=1 ./all.bash passes

Caveat: still needs changes to test/ files to use *_ssa.go versions.  I
won't check those changes in with this CL because the builders will
complain as they don't have SSATEST=1.

Mostly minor fixes.

Implement float <-> uint32 in assembly.  It seems the simplest option
for now.

GO386=387 does not work.  That's why I can't make SSA the default for
386 yet.

Change-Id: Ic4d4402104d32bcfb1fd612f5bb6539f9acb8ae0
Reviewed-on: https://go-review.googlesource.com/25119
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-07-21 20:41:18 +00:00
Ian Lance Taylor ff227b8a56 runtime: add explicit `INT $3` at end of Darwin amd64 sigtramp
The omission of this instruction could confuse the traceback code if a
SIGPROF occurred during a signal handler.  The traceback code would
trace up to sigtramp, but would then get confused because it would see a
PC address that did not appear to be in the function.

Fixes #16453.

Change-Id: I2b3d53e0b272fb01d9c2cb8add22bad879d3eebc
Reviewed-on: https://go-review.googlesource.com/25104
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-07-21 01:04:22 +00:00
Austin Clements f407ca9288 runtime: support smaller physical pages than PhysPageSize
Most operations need an upper bound on the physical page size, which
is what sys.PhysPageSize is for (this is checked at runtime init on
Linux). However, a few operations need a *lower* bound on the physical
page size. Introduce a "minPhysPageSize" constant to act as this lower
bound and use it where it makes sense:

1) In addrspace_free, we have to query each page in the given range.
   Currently we increment by the upper bound on the physical page
   size, which means we may skip over pages if the true size is
   smaller. Worse, we currently pass a result buffer that only has
   enough room for one page. If there are actually multiple pages in
   the range passed to mincore, the kernel will overflow this buffer.
   Fix these problems by incrementing by the lower-bound on the
   physical page size and by passing "1" for the length, which the
   kernel will round up to the true physical page size.

2) In the write barrier, the bad pointer check tests for pointers to
   the first physical page, which are presumably small integers
   masquerading as pointers. However, if physical pages are smaller
   than we think, we may have legitimate pointers below
   sys.PhysPageSize. Hence, use minPhysPageSize for this test since
   pointers should never fall below that.

In particular, this applies to ARM64 and MIPS. The runtime is
configured to use 64kB pages on ARM64, but by default Linux uses 4kB
pages. Similarly, the runtime assumes 16kB pages on MIPS, but both 4kB
and 16kB kernel configurations are common. This also applies to ARM on
systems where the runtime is recompiled to deal with a larger page
size. It is also a step toward making the runtime use only a
dynamically-queried page size.
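
A small, hedged illustration of the distinction being drawn (an upper bound
assumed at build time vs. the kernel's actual page size); the constants are
examples, not the runtime's definitions:

    package main

    import (
        "fmt"
        "os"
    )

    const physPageSize = 64 << 10   // example upper bound assumed by the runtime
    const minPhysPageSize = 4 << 10 // example lower bound, as introduced here

    func main() {
        actual := os.Getpagesize() // what the kernel actually uses
        fmt.Printf("kernel page size %d; probing loops must step by the lower bound %d, not %d\n",
            actual, minPhysPageSize, physPageSize)
    }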

Change-Id: I1fdfd18f6e7cbca170cc100354b9faa22fde8a69
Reviewed-on: https://go-review.googlesource.com/25020
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Austin Clements <austin@google.com>
2016-07-20 18:28:43 +00:00
Cherry Zhang 7b9873b9b9 [dev.ssa] cmd/internal/obj, etc.: add and use NEGF, NEGD instructions on ARM
Updates #15365.

Change-Id: I372a5617c2c7d91de545cac0464809b96711b63a
Reviewed-on: https://go-review.googlesource.com/24646
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2016-07-20 18:15:37 +00:00
Dmitry Vyukov d73ca5f4d8 runtime/race: fix memory leak
The leak was reported internally on a server canary that runs for days.
After one day the server consumes 5.6GB; after six days, 12.2GB.
The leak is exposed by the added benchmark.
The leak is fixed upstream in:
http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/tsan/rtl/tsan_rtl_thread.cc?view=diff&r1=276102&r2=276103&pathrev=276103

Fixes #16441

Change-Id: I9d4f0adef48ca6cf2cd781b9a6990ad4661ba49b
Reviewed-on: https://go-review.googlesource.com/25091
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
2016-07-20 14:17:44 +00:00
Ian Lance Taylor 50048a4e8e runtime: add as many extra M's as needed
When a non-Go thread calls into Go, the runtime needs an M to run the Go
code. The runtime keeps a list of extra M's available. When the last
extra M is allocated, the needextram field is set to tell it to allocate
a new extra M as soon as it is running in Go. This ensures that an extra
M will always be available for the next thread.

However, if many threads need an extra M at the same time, this
serializes them all. One thread will get an extra M with the needextram
field set. All the other threads will see that there is no M available
and will go to sleep. The one thread that succeeded will create a new
extra M. One lucky thread will get it. All the other threads will see
that there is no M available and will go to sleep. The effect is
thundering herd, as all the threads looking for an extra M go through
the process one by one. This seems to have a particularly bad effect on
the FreeBSD scheduler for some reason.

With this change, we track the number of threads waiting for an M, and
create all of them as soon as one thread gets through. This still means
that all the threads will fight for the lock to pick up the next M. But
at least each thread that gets the lock will succeed, instead of going
to sleep only to fight again.

This smooths out the performance greatly on FreeBSD, reducing the
average wall time of `testprogcgo CgoCallbackGC` by 74%.  On GNU/Linux
the average wall time goes down by 9%.

Fixes #13926
Fixes #16396

Change-Id: I6dc42a4156085a7ed4e5334c60b39db8f8ef8fea
Reviewed-on: https://go-review.googlesource.com/25047
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2016-07-20 13:31:55 +00:00
Cherry Zhang 7d70f84f54 [dev.ssa] cmd/compile: add floating point optimizations in SSA for ARM
Add some simplification rules for floating point ops.

cmd/internal/obj/arm supports instructions that compare FP register
to 0, but runtime softfloat simulator does not. This CL adds these
instructions to softfloat simulator as well.

Updates #15365.

Change-Id: I29405b2bfcb4c8cf106cb7a1a811409fec91b170
Reviewed-on: https://go-review.googlesource.com/24790
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-07-16 03:13:22 +00:00
Josh Bleecher Snyder 4054769a31 runtime/internal/atomic: fix assembly arg sizes
Change-Id: I80ccf40cd3930aff908ee64f6dcbe5f5255198d3
Reviewed-on: https://go-review.googlesource.com/24914
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-07-14 16:35:37 +00:00
Ian Lance Taylor 29ed5da5f2 runtime/pprof: don't print extraneous 0 after goexit
This fixes erroneous handling of the more result parameter of
runtime.Frames.Next.
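
The corrected pattern for the more result, sketched as a standalone program
(not the pprof code itself):

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        pc := make([]uintptr, 16)
        n := runtime.Callers(0, pc)
        frames := runtime.CallersFrames(pc[:n])
        for {
            frame, more := frames.Next()
            fmt.Println(frame.Function)
            if !more { // stop here; don't emit an extra zero frame after goexit
                break
            }
        }
    }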

Fixes #16349.

Change-Id: I4f1c0263dafbb883294b31dbb8922b9d3e650200
Reviewed-on: https://go-review.googlesource.com/24911
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-07-13 21:18:19 +00:00
Keith Randall efefd11725 [dev.ssa] Merge remote-tracking branch 'origin/master' into mergebranch
Semi-regular merge of tip into dev.ssa.

Change-Id: I855817c4746237792a2dab6eaf471087a3646be4
2016-07-13 11:12:44 -07:00
Ian Lance Taylor b30814bbd6 runtime: add ctxt parameter to cgocallback called from Go
The cgocallback function picked up a ctxt parameter in CL 22508.
That CL updated the assembler implementation, but there are a few
mentions in Go code that were not updated. This CL fixes that.

Fixes #16326

Change-Id: I5f68e23565c6a0b11057aff476d13990bff54a66
Reviewed-on: https://go-review.googlesource.com/24848
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
2016-07-12 16:39:00 +00:00
Ian Lance Taylor 12f2b4ff0e runtime: fix case in KeepAlive comment
Fixes #16299.

Change-Id: I76f541c7f11edb625df566f2f1035147b8bcd9dd
Reviewed-on: https://go-review.googlesource.com/24830
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-07-08 16:50:26 +00:00
Ian Lance Taylor fad2bbdc6a runtime: fix nanotime for macOS Sierra
In the beta version of the macOS Sierra (10.12) release, the
gettimeofday system call changed on x86. Previously it always returned
the time in the AX/DX registers. Now, if AX is returned as 0, it means
that the system call has stored the values into the memory pointed to by
the first argument, just as the libc gettimeofday function does. The
libc function handles both cases, and we need to do so as well.

Fixes #16272.

Change-Id: Ibe5ad50a2c5b125e92b5a4e787db4b5179f6b723
Reviewed-on: https://go-review.googlesource.com/24812
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-07-08 03:17:18 +00:00
Ian Lance Taylor 84bb9e62f0 runtime: handle selects with duplicate channels in shrinkstack
The shrinkstack code locks all the channels a goroutine is waiting for,
but didn't handle the case of the same channel appearing in the list
multiple times. This led to a deadlock. The channels are sorted so it's
easy to avoid locking the same channel twice.
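
A hedged sketch of the select shape in question: the same channel appears in
more than one case, which is legal Go and previously could deadlock if the
stack was shrunk while the goroutine was blocked here:

    package main

    import "fmt"

    func main() {
        c := make(chan int)
        go func() { c <- 1 }()
        select {
        case v := <-c:
            fmt.Println("first case received", v)
        case v := <-c: // duplicate channel within a single select
            fmt.Println("second case received", v)
        }
    }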

Fixes #16286.

Change-Id: Ie514805d0532f61c942e85af5b7b8ac405e2ff65
Reviewed-on: https://go-review.googlesource.com/24815
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-07-08 02:05:40 +00:00
Austin Clements 9c8809f82a runtime/internal/sys: implement Ctz and Bswap in assembly for 386
Ctz is a hot-spot in the Go 1.7 memory manager. In SSA it's
implemented as an intrinsic that compiles to a few instructions, but
on the old backend (all architectures other than amd64), it's
implemented as a fairly complex Go function. As a result, switching to
bitmap-based allocation was a significant hit to allocation-heavy
workloads like BinaryTree17 on non-SSA platforms.

For unknown reasons, this hit 386 particularly hard. We can regain a
lot of the lost performance by implementing Ctz in assembly on the
386. This isn't as good as an intrinsic, since it still generates a
function call and prevents useful inlining, but it's much better than
the pure Go implementation:

name                      old time/op    new time/op    delta
BinaryTree17-12              3.59s ± 1%     3.06s ± 1%  -14.74%  (p=0.000 n=19+20)
Fannkuch11-12                3.72s ± 1%     3.64s ± 1%   -2.09%  (p=0.000 n=17+19)
FmtFprintfEmpty-12          52.3ns ± 3%    52.3ns ± 3%     ~     (p=0.829 n=20+19)
FmtFprintfString-12          156ns ± 1%     148ns ± 3%   -5.20%  (p=0.000 n=18+19)
FmtFprintfInt-12             137ns ± 1%     136ns ± 1%   -0.56%  (p=0.000 n=19+13)
FmtFprintfIntInt-12          227ns ± 2%     225ns ± 2%   -0.93%  (p=0.000 n=19+17)
FmtFprintfPrefixedInt-12     210ns ± 1%     208ns ± 1%   -0.91%  (p=0.000 n=19+17)
FmtFprintfFloat-12           375ns ± 1%     371ns ± 1%   -1.06%  (p=0.000 n=19+18)
FmtManyArgs-12               995ns ± 2%     978ns ± 1%   -1.63%  (p=0.000 n=17+17)
GobDecode-12                9.33ms ± 1%    9.19ms ± 0%   -1.59%  (p=0.000 n=20+17)
GobEncode-12                7.73ms ± 1%    7.73ms ± 1%     ~     (p=0.771 n=19+20)
Gzip-12                      375ms ± 1%     374ms ± 1%     ~     (p=0.141 n=20+18)
Gunzip-12                   61.8ms ± 1%    61.8ms ± 1%     ~     (p=0.602 n=20+20)
HTTPClientServer-12         87.7µs ± 2%    86.9µs ± 3%   -0.87%  (p=0.024 n=19+20)
JSONEncode-12               20.2ms ± 1%    20.4ms ± 0%   +0.53%  (p=0.000 n=18+19)
JSONDecode-12               65.3ms ± 0%    65.4ms ± 1%     ~     (p=0.385 n=16+19)
Mandelbrot200-12            4.11ms ± 1%    4.12ms ± 0%   +0.29%  (p=0.020 n=19+19)
GoParse-12                  3.75ms ± 1%    3.61ms ± 2%   -3.90%  (p=0.000 n=20+20)
RegexpMatchEasy0_32-12       104ns ± 0%     103ns ± 0%   -0.96%  (p=0.000 n=13+16)
RegexpMatchEasy0_1K-12       805ns ± 1%     803ns ± 1%     ~     (p=0.189 n=18+18)
RegexpMatchEasy1_32-12       111ns ± 0%     111ns ± 3%     ~     (p=1.000 n=14+19)
RegexpMatchEasy1_1K-12      1.00µs ± 1%    1.00µs ± 1%   +0.50%  (p=0.003 n=19+19)
RegexpMatchMedium_32-12      133ns ± 2%     133ns ± 2%     ~     (p=0.218 n=20+20)
RegexpMatchMedium_1K-12     41.2µs ± 1%    42.2µs ± 1%   +2.52%  (p=0.000 n=18+16)
RegexpMatchHard_32-12       2.35µs ± 1%    2.38µs ± 1%   +1.53%  (p=0.000 n=18+18)
RegexpMatchHard_1K-12       70.9µs ± 2%    72.0µs ± 1%   +1.42%  (p=0.000 n=19+17)
Revcomp-12                   1.06s ± 0%     1.05s ± 0%   -1.36%  (p=0.000 n=20+18)
Template-12                 86.2ms ± 1%    84.6ms ± 0%   -1.89%  (p=0.000 n=20+18)
TimeParse-12                 425ns ± 2%     428ns ± 1%   +0.77%  (p=0.000 n=18+19)
TimeFormat-12                517ns ± 1%     519ns ± 1%   +0.43%  (p=0.001 n=20+19)
[Geo mean]                  74.3µs         73.5µs        -1.05%

Prior to this commit, BinaryTree17-12 on 386 was 33% slower than at
the go1.6 tag. With this commit, it's 13% slower.

On arm and arm64, BinaryTree17-12 is only ~5% slower than it was at
go1.6. It may be worth implementing Ctz for them as well.

I consider this change low risk, since the functions it replaces are
simple, very well specified, and well tested.
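
For reference, a pure-Go count-trailing-zeros of the sort being replaced on
386 (a sketch, not the runtime's actual implementation):

    package main

    import "fmt"

    func ctz32(x uint32) int {
        if x == 0 {
            return 32
        }
        n := 0
        for x&1 == 0 { // shift out trailing zero bits one at a time
            x >>= 1
            n++
        }
        return n
    }

    func main() {
        fmt.Println(ctz32(40)) // 40 = 0b101000, so 3
    }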

For #16117.

Change-Id: Ic39d851d5aca91330134596effd2dab9689ba066
Reviewed-on: https://go-review.googlesource.com/24640
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-30 19:35:44 +00:00
Dmitry Vyukov bb337372fb runtime: fix race atomic operations on external memory
The assembly is broken: it does `MOVQ g(R12), R14` expecting that
R12 contains the TLS address, but it does not do get_tls(R12) beforehand.
This magically works on Linux: `MOVQ g(R12), R14` is compiled to
`mov %fs:0xfffffffffffffff8,%r14`, which does not use R12.
But it crashes on Windows.

Add explicit `get_tls(R12)`.

Fixes #16206

Change-Id: Ic1f21a6fef2473bcf9147de6646929781c9c1e98
Reviewed-on: https://go-review.googlesource.com/24590
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-29 15:30:54 +00:00
Ian Lance Taylor 25a609556a runtime: correct printing of blocked field in scheduler trace
When the blocked field was first introduced back in
https://golang.org/cl/61250043 the scheduler trace code incorrectly used
m->blocked instead of mp->blocked.  That has carried through the
conversion to Go.  This CL fixes it.

Change-Id: Id81907b625221895aa5c85b9853f7c185efd8f4b
Reviewed-on: https://go-review.googlesource.com/24571
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-06-29 01:38:39 +00:00
Ian Lance Taylor c7ae41e577 runtime: better error message for newosproc failure
If creating a new thread fails with EAGAIN, point the user at ulimit.

Fixes #15476.

Change-Id: Ib36519614b5c72776ea7f218a0c62df1dd91a8ea
Reviewed-on: https://go-review.googlesource.com/24570
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-06-29 01:37:19 +00:00
David Crawshaw ed9362f769 reflect, runtime: optimize Name method
Several minor changes that remove a good chunk of the overhead added
to the reflect Name method over the 1.7 cycle, as seen from the
non-SSA architectures.

In particular, there are ~20 fewer instructions in reflect.name.name
on 386, and the method now qualifies for inlining.

The simple JSON decoding benchmark on darwin/386:

	name           old time/op    new time/op    delta
	CodeDecoder-8    49.2ms ± 0%    48.9ms ± 1%  -0.77%  (p=0.000 n=10+9)

	name           old speed      new speed      delta
	CodeDecoder-8  39.4MB/s ± 0%  39.7MB/s ± 1%  +0.77%  (p=0.000 n=10+9)

On darwin/amd64 the effect is less pronounced:

	name           old time/op    new time/op    delta
	CodeDecoder-8    38.9ms ± 0%    38.7ms ± 1%  -0.38%  (p=0.005 n=10+10)

	name           old speed      new speed      delta
	CodeDecoder-8  49.9MB/s ± 0%  50.1MB/s ± 1%  +0.38%  (p=0.006 n=10+10)

Counterintuitively, I get much more useful benchmark data out of my
MacBook Pro than from a Linux workstation with more expensive Intel chips.
While the laptop has fewer cores and an active GUI, the single-threaded
performance is significantly better (nearly 1.5x decoding throughput)
so the differences are more pronounced.

For #16117.

Change-Id: I4e0cc1cc2d271d47d5127b1ee1ca926faf34cabf
Reviewed-on: https://go-review.googlesource.com/24510
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-06-28 12:28:05 +00:00
Lynn Boger b75b0630fe runtime/internal/atomic: Use power5 compatible instructions for ppc64
This modifies a recent performance improvement to the
And8 and Or8 atomic functions which required both ppc64le
and ppc64 to use power8 instructions. Since then it was
decided that ppc64 (BE) should work for power5 and later.
This change uses instructions compatible with power5 for
ppc64 and uses power8 for ppc64le.

Fixes #16004

Change-Id: I623c75e8e6fd1fa063a53d250d86cdc9d0890dc7
Reviewed-on: https://go-review.googlesource.com/24181
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Andrew Gerrand <adg@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-28 04:49:33 +00:00
Raul Silvera c0e5d44506 runtime/pprof: update comments to point to new pprof
In the comments for this file there is a reference to gperftools
for more info on pprof. pprof now lives in its own repository on GitHub,
and the version in gperftools is deprecated.

Change-Id: I8a188f129534f73edd132ef4e5a2d566e69df7e9
Reviewed-on: https://go-review.googlesource.com/24502
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-06-27 18:06:11 +00:00
David Crawshaw 797dc58457 cmd/compile, etc: use tflag to optimize Name()==""
Improves JSON decoding benchmark:

	name                  old time/op    new time/op    delta
	CodeDecoder-8           41.3ms ± 6%    39.8ms ± 1%  -3.61%  (p=0.000 n=10+10)

	name                  old speed      new speed      delta
	CodeDecoder-8         47.0MB/s ± 6%  48.7MB/s ± 1%  +3.66%  (p=0.000 n=10+10)

Change-Id: I524ee05c432fad5252e79b29222ec635c1dee4b4
Reviewed-on: https://go-review.googlesource.com/24452
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-24 20:05:34 +00:00
David Crawshaw e369490fb7 cmd/compile, etc: bring back ptrToThis
This was removed in CL 19695 but it slows down reflect.New, which ends
up on the hot path of things like JSON decoding.

There is no immediate cost in binary size, but it will make it harder to
further shrink run time type information in Go 1.8.

Before

	BenchmarkNew-40         30000000                36.3 ns/op

After

	BenchmarkNew-40         50000000                29.5 ns/op
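
For context, a sketch of the reflect.New hot path the benchmark exercises,
as a decoder might use it (the payload type is made up for illustration):

    package main

    import (
        "fmt"
        "reflect"
    )

    type payload struct{ A int }

    func main() {
        t := reflect.TypeOf(payload{})
        v := reflect.New(t)                  // allocate a *payload via reflection
        v.Elem().FieldByName("A").SetInt(42) // fill it in, as a decoder would
        fmt.Println(v.Interface().(*payload).A)
    }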

Fixes #16161
Updates #16117

Change-Id: If7cb7f3e745d44678f3f5cf3a5338c59847529d2
Reviewed-on: https://go-review.googlesource.com/24400
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-06-23 17:39:38 +00:00
Ian Lance Taylor 252eda470a cmd/pprof: don't use offset if we don't have a start address
The test is in the runtime package because there are other tests of
pprof there. At some point we should probably move them all into a pprof
testsuite.

Fixes #16128.

Change-Id: Ieefa40c61cf3edde11fe0cf04da1debfd8b3d7c0
Reviewed-on: https://go-review.googlesource.com/24274
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
2016-06-21 01:44:38 +00:00
Ian Lance Taylor 09834d1c08 runtime: panic with the right error on iface conversion
A straight conversion from a type T to an interface type I, where T does
not implement I, should always panic with an interface conversion error
that shows the missing method.  This was not happening if the conversion
was done once using the comma-ok form (the result would not be OK) and
then again in a straight conversion.  Due to an error in the runtime
package the second conversion was failing with a nil pointer
dereference.
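
A hedged sketch of the two conversion forms described above; T deliberately
does not implement I, and the straight conversion is expected to panic with
an interface conversion error naming the missing method:

    package main

    import "fmt"

    type I interface{ M() }
    type T struct{}

    func main() {
        var x interface{} = T{}
        _, ok := x.(I) // comma-ok form: ok is false, no panic
        fmt.Println("comma-ok reports:", ok)
        // Straight form: panics with the interface conversion error
        // (naming the missing method M), not a nil pointer dereference.
        _ = x.(I)
    }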

Fixes #16130.

Change-Id: I8b9fca0f1bb635a6181b8b76de8c2385bb7ac2d2
Reviewed-on: https://go-review.googlesource.com/24284
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-06-21 01:43:42 +00:00
Ian Lance Taylor 659b9a19aa runtime: set PPROF_TMPDIR before running pprof
Fixes #16121.

Change-Id: I7b838fb6fb9f098e6c348d67379fdc81fb0d69a4
Reviewed-on: https://go-review.googlesource.com/24270
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2016-06-20 23:58:59 +00:00
Austin Clements 9e8fa1e99c runtime: eliminate poisonStack checks
We haven't used poisonStack since we switched to 1-bit stack maps
(4d0f3a1), but the checks are still there. However, nothing prevents
us from genuinely allocating an object at this address on 32-bit and
causing the runtime to crash claiming that it's found a bad pointer.

Since we're not using poisonStack anyway, just pull it out.

Fixes #15831.

Change-Id: Ia6ef604675b8433f75045e369f5acd4644a5bb38
Reviewed-on: https://go-review.googlesource.com/24211
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2016-06-17 15:18:39 +00:00
Austin Clements fca9fc52c8 runtime: fix stale comment in lfstack
Change-Id: I6ef08f6078190dc9df0b2df4f26a76456602f5e8
Reviewed-on: https://go-review.googlesource.com/24176
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-06-16 19:45:33 +00:00
Ian Lance Taylor ea2ac3fe5f runtime: remove useless loop from CgoCCodeSIGPROF test program
I verified that the test fails if I undo the change that it tests for.

Updates #14732.

Change-Id: Ib30352580236adefae946450ddd6cd65a62b7cdf
Reviewed-on: https://go-review.googlesource.com/24151
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Mikio Hara <mikioh.mikioh@gmail.com>
2016-06-16 03:52:18 +00:00
Ian Lance Taylor 26d6dc6bf8 runtime: if the test program hangs, try to get a stack trace
This is an attempt to get more information for #14809, which seems to
occur rarely.

Updates #14809.

Change-Id: Idbeb136ceb57993644e03266622eb699d2685d02
Reviewed-on: https://go-review.googlesource.com/24110
Reviewed-by: Mikio Hara <mikioh.mikioh@gmail.com>
Reviewed-by: Austin Clements <austin@google.com>
2016-06-15 15:03:48 +00:00
David Crawshaw af0fc83985 cmd/compile, etc: handle many struct fields
This adds 8 bytes of binary size to every type that has methods. It is
the smallest change I could come up with for 1.7.

Fixes #16037

Change-Id: Ibe15c3165854a21768596967757864b880dbfeed
Reviewed-on: https://go-review.googlesource.com/24070
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-14 15:32:34 +00:00
Keith Randall 0393ed8201 [dev.ssa] Merge remote-tracking branch 'origin/master' into mergebranch
Change-Id: Idd150294aaeced0176b53d6b95852f5d21ff4fdc
2016-06-14 07:34:09 -07:00
Ian Lance Taylor 84d8aff94c runtime: collect stack trace if SIGPROF arrives on non-Go thread
Fixes #15994.

Change-Id: I5aca91ab53985ac7dcb07ce094ec15eb8ec341f8
Reviewed-on: https://go-review.googlesource.com/23891
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-13 21:43:19 +00:00
Keith Randall c83e6f50d9 runtime: aeshash, xor seed in earlier
Instead of doing:

x = input
one round of aes on x
x ^= seed
two rounds of aes on x

Do:

x = input
x ^= seed
three rounds of aes on x

This change provides some additional seed-dependent scrambling
which should help prevent collisions.

Change-Id: I02c774d09c2eb6917cf861513816a1024a9b65d7
Reviewed-on: https://go-review.googlesource.com/23577
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-11 00:35:47 +00:00
Cherry Zhang cbc26869b7 runtime: set $sp before $pc in gdb python script
When setting $pc, gdb does a backtrace using the current value of $sp,
and it may complain if $sp does not match that $pc (although the
assignment went through successfully).

This happens with ARM SSA backend: when setting $pc it prints
> Cannot access memory at address 0x0

As well as occasionally on MIPS64:
> warning: GDB can't find the start of the function at 0xc82003fe07.
> ...

Setting $sp before setting $pc makes it happy.

Change-Id: Idd96dbef3e9b698829da553c6d71d5b4c6d492db
Reviewed-on: https://go-review.googlesource.com/23940
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-06-09 20:02:59 +00:00
Michael Munday 0324a3f828 runtime/cgo: restore the g pointer correctly in crosscall_s390x
R13 needs to be set to g because C code may have clobbered R13.

Fixes #16006.

Change-Id: I66311fe28440e85e589a1695fa1c42416583b4c6
Reviewed-on: https://go-review.googlesource.com/23910
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-08 18:09:47 +00:00
Keith Randall 41dd1696ab cmd/compile: fix heap dump test on android
go_android_exec is looking for "exitcode=" to decide the result
of running a test.  The heap dump test nondeterministically prints
"finalized" right at the end of the test.  When the timing is just
right, we print "finalizedexitcode=0" and confuse go_android_exec.

This failure happens occasionally on the android builders.

Change-Id: I4f73a4db05d8f40047ecd3ef3a881a4ae3741e26
Reviewed-on: https://go-review.googlesource.com/23861
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-06-07 17:34:48 +00:00
Keith Randall a871464e5a runtime: fix typo
Fixes #15962

Change-Id: I1949e0787f6c2b1e19b9f9d3af2f712606a6d4cf
Reviewed-on: https://go-review.googlesource.com/23786
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-05 18:10:01 +00:00
Ian Lance Taylor cf862478c8 runtime/cgo: add TSAN locks around mmap call
Change-Id: I806cc5523b7b5e3278d01074bc89900d78700e0c
Reviewed-on: https://go-review.googlesource.com/23736
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2016-06-03 18:26:01 +00:00
Michael Hudson-Doyle 26849746c9 cmd/internal/obj, runtime: fixes for defer in 386 shared libraries
Any defer in a shared object crashed when GOARCH=386. This turns out to be two
bugs:

 1) Calls to morestack were not processed to be PIC safe (must have been
    possible to trigger this another way too)
 2) jmpdefer needs to rewind the return address of the deferred function past
    the instructions that load the GOT pointer into BX, not just past the call

Bug 2) requires re-introducing a way for .s files to know when they are
being compiled for dynamic linking, but I've tried to do that in as minimal
a way as possible.

Fixes #15916

Change-Id: Ia0d09b69ec272a176934176b8eaef5f3bfcacf04
Reviewed-on: https://go-review.googlesource.com/23623
Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-06-03 02:50:27 +00:00
Ian Lance Taylor 03abde4971 runtime: only permit SetCgoTraceback to be called once
Accept a duplicate call, but nothing else.

Change-Id: Iec24bf5ddc3b0f0c559ad2158339aca698601743
Reviewed-on: https://go-review.googlesource.com/23692
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-02 19:24:55 +00:00
Ian Lance Taylor 88e0ec2979 runtime/cgo: avoid races on cgo_context_function
Change-Id: Ie9e6fda675e560234e90b9022526fd689d770818
Reviewed-on: https://go-review.googlesource.com/23610
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-02 18:47:48 +00:00
Dmitry Vyukov ba22172832 runtime: fix typo in comment
Change-Id: I82e35770b45ccd1433dfae0af423073c312c0859
Reviewed-on: https://go-review.googlesource.com/23680
Reviewed-by: Andrew Gerrand <adg@golang.org>
2016-06-02 06:02:01 +00:00
Emmanuel Odeke 77026ef902 runtime: document heap scavenger memory summary
Fixes #15212.

Change-Id: I2628ec8333330721cddc5145af1ffda6f3e0c63f
Reviewed-on: https://go-review.googlesource.com/23319
Reviewed-by: Austin Clements <austin@google.com>
2016-06-01 19:06:43 +00:00
Ian Lance Taylor 690de51ffa runtime: fix restoring PC in ARM version of cgocallback_gofunc
Fixes #15856.

Change-Id: Ia8def161642087e4bd92a87298c77a0f9f83dc86
Reviewed-on: https://go-review.googlesource.com/23586
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-05-31 22:14:39 +00:00
Ian Lance Taylor 3d037cfaf8 runtime: pass signal context to cgo traceback function
When doing a backtrace from a signal that occurs in C code compiled
without using -fasynchronous-unwind-tables, we have to rely on frame
pointers. In order to do that, the traceback function needs the signal
context to reliably pick up the frame pointer.

Change-Id: I7b45930fced01685c337d108e0f146057928f876
Reviewed-on: https://go-review.googlesource.com/23494
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-31 21:17:40 +00:00
Ian Lance Taylor 2256e38978 runtime: update pprof binary header URL
The code has moved from code.google.com to github.com.

Change-Id: I0cc9eb69b3fedc9e916417bc7695759632f2391f
Reviewed-on: https://go-review.googlesource.com/23523
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-05-31 21:10:20 +00:00
Ian Lance Taylor 66736880ca runtime/cgo: add TSAN acquire/release calls
Add TSAN acquire/release calls to runtime/cgo to match the ones
generated by cgo.  This avoids a false positive race around the malloc
memory used in runtime/cgo when other goroutines are simultaneously
calling malloc and free from cgo.

These new calls will only be used when building with CGO_CFLAGS and
CGO_LDFLAGS set to -fsanitize=thread, which becomes a requirement to
avoid all false positives when using TSAN.  These are needed not just
for runtime/cgo, but also for any runtime package that uses cgo (such as
net and os/user).

Add an unused attribute to the _cgo_tsan_acquire and _cgo_tsan_release
functions, in case there are no actual cgo function calls.

Add a test that checks that setting CGO_CFLAGS/CGO_LDFLAGS avoids a
false positive report when using os/user.

Change-Id: I0905c644ff7f003b6718aac782393fa219514c48
Reviewed-on: https://go-review.googlesource.com/23492
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2016-05-31 20:53:16 +00:00
Ian Lance Taylor 4223294eab runtime/pprof, cmd/pprof: fix profiling for PIE
In order to support pprof for position independent executables, pprof
needs to adjust the PC addresses stored in the profile by the address at
which the program is loaded. The legacy profiling support which we use
already supports recording the GNU/Linux /proc/self/maps data
immediately after the CPU samples, so do that. Also change the pprof
symbolizer to use the information, if available, when looking up
addresses in the Go pcline data.

Fixes #15714.

Change-Id: I4bf679210ef7c51d85cf873c968ce82db8898e3e
Reviewed-on: https://go-review.googlesource.com/23525
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2016-05-31 13:02:09 +00:00
Ilya Tocar 429bbf3312 strings: fix and reenable amd64 Index for 17-31 byte strings
Fixes #15689

Change-Id: I56d0103738cc35cd5bc5e77a0e0341c0dd55530e
Reviewed-on: https://go-review.googlesource.com/23440
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2016-05-27 22:57:32 +00:00
David Chase 31e13c83c2 [dev.ssa] Merge branch 'master' into dev.ssa
Change-Id: Iabc80b6e0734efbd234d998271e110d2eaad41dd
2016-05-27 15:19:33 -04:00
Mikio Hara c340f4867b runtime: skip TestGdbBacktrace on netbsd
Also adds missing copyright notice.

Updates #15603.

Change-Id: Icf4bb45ba5edec891491fe5f0039a8a25125d168
Reviewed-on: https://go-review.googlesource.com/23501
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-27 18:47:08 +00:00
Austin Clements 6a86dbe75f runtime: always call stackfree on the system stack
Currently when the garbage collector frees stacks of dead goroutines
in markrootFreeGStacks, it calls stackfree on a regular user stack.
This is a problem, since stackfree manipulates the stack cache in the
per-P mcache, so if it grows the stack or gets preempted in the middle
of manipulating the stack cache (which are both possible since it's on
a user stack), it can easily corrupt the stack cache.

Fix this by calling markrootFreeGStacks on the system stack, so that
all calls to stackfree happen on the system stack. To prevent this bug
in the future, mark stack functions that manipulate the mcache as
go:systemstack.

Fixes #15853.

Change-Id: Ic0d1c181efb342f134285a152560c3a074f14a3d
Reviewed-on: https://go-review.googlesource.com/23511
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-27 17:53:21 +00:00
Austin Clements 966baedfea runtime: record Python stack on TestGdbPython failure
For #15599.

Change-Id: Icc2e58a3f314b7a098d78fe164ba36f5b2897de6
Reviewed-on: https://go-review.googlesource.com/23481
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-05-27 16:46:05 +00:00
Russ Cox 7fdec6216c build: enable framepointer mode by default
This has a minor performance cost, but far less than is being gained by SSA.
As an experiment, enable it during the Go 1.7 beta.
Having frame pointers on by default makes Linux's perf, Intel VTune,
and other profilers much more useful, because it lets them gather a
stack trace efficiently on profiling events.
(It doesn't help us that much, since when we walk the stack we usually
need to look up PC-specific information as well.)

Fixes #15840.

Change-Id: I4efd38412a0de4a9c87b1b6e5d11c301e63f1a2a
Reviewed-on: https://go-review.googlesource.com/23451
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-26 19:02:00 +00:00
David Crawshaw 56e5e0b69c runtime: tell race detector about reflectOffs.lock
Fixes #15832

Change-Id: I6f3f45e3c21edd0e093ecb1d8a067907863478f5
Reviewed-on: https://go-review.googlesource.com/23441
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2016-05-26 14:43:27 +00:00
Austin Clements b92f423879 runtime: unwind BP in jmpdefer to match SP unwind
The irregular calling convention for defers currently incorrectly
manages the BP if frame pointers are enabled. Specifically, jmpdefer
manipulates the SP as if its own caller, deferreturn, had returned.
However, it does not manipulate the BP to match. As a result, when a
BP-based traceback happens during a deferred function call, it unwinds
to the function that performed the defer and then thinks that function
called itself in an infinite regress.

Fix this by making jmpdefer manipulate the BP as if deferreturn had
actually returned.

Fixes #12968.

Updates #15840.

Change-Id: Ic9cc7c863baeaf977883ed0c25a7e80e592cf066
Reviewed-on: https://go-review.googlesource.com/23457
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-26 13:54:05 +00:00
Russ Cox d9557523c2 runtime: make framepointer mode safe for Windows
A few other architectures have already defined a NOFRAME flag.
Use it to disable frame pointer code on a few very low-level functions
that must behave like Windows code.

Makes the failing os/signal test pass on a Windows gomote.

Change-Id: I982365f2c59a0aa302b4428c970846c61027cf3e
Reviewed-on: https://go-review.googlesource.com/23456
Reviewed-by: Austin Clements <austin@google.com>
2016-05-26 13:53:01 +00:00
Russ Cox 8a1dc32447 runtime: add library startup support for ppc64le
I have been running this patch inside Google against Go 1.6 for the last month.

The new tests will probably break the builders but let's see
exactly how they break.

Change-Id: Ia65cf7d3faecffeeb4b06e9b80875c0e57d86d9e
Reviewed-on: https://go-review.googlesource.com/23452
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-05-26 03:31:59 +00:00
Ian Lance Taylor a5d1a72a40 cmd/cgo, runtime, runtime/cgo: TSAN support for malloc
Acquire and release the TSAN synchronization point when calling malloc,
just as we do when calling any other C function. If we don't do this,
TSAN will report false positive errors about races calling malloc and
free.

We used to have a special code path for malloc and free, going through
the runtime functions cmalloc and cfree. The special code path for cfree
was no longer used even before this CL. This CL stops using the special
code path for malloc, because there is no place along that path where we
could conditionally insert the TSAN synchronization. This CL removes
the support for the special code path for both functions.

Instead, cgo now automatically generates the malloc function as though
it were referenced as C.malloc.  We need to automatically generate it
even if C.malloc is not called, even if malloc and size_t are not
declared, to support cgo-provided functions like C.CString.

Change-Id: I829854ec0787a80f33fa0a8a0dc2ee1d617830e2
Reviewed-on: https://go-review.googlesource.com/23260
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-05-25 23:22:24 +00:00
Russ Cox 10c8b2374f runtime: align C library startup calls on amd64
This makes GOEXPERIMENT=framepointer, GOOS=darwin, and buildmode=carchive coexist.

Change-Id: I9f6fb2f0f06f27df683e5b51f2fa55cd21872453
Reviewed-on: https://go-review.googlesource.com/23454
Reviewed-by: Austin Clements <austin@google.com>
2016-05-25 23:16:46 +00:00
Austin Clements 3be48b4dc8 runtime: pass gcWork to scanstack
Currently scanstack obtains its own gcWork from the P for the duration
of the stack scan and then, if called during mark termination,
disposes the gcWork.

However, this means that the number of workbufs allocated will be at
least the number of stacks scanned during mark termination, which may
be very high (especially during a STW GC). This happens because, in
steady state, each scanstack will obtain a fresh workbuf (either from
the empty list or by allocating it), fill it with the scan results,
and then dispose it to the full list. Nothing is consuming from the
full list during this (and hence nothing is recycling them to the
empty list), so the length of the full list by the time mark
termination starts draining it is at least the number of stacks
scanned.

Fix this by pushing the gcWork acquisition up the stack to either the
gcDrain that calls markroot that calls scanstack (which batches across
many stack scans and is the path taken during STW GC) or to newstack
(which is still a single scanstack call, but this is roughly bounded
by the number of Ps).

This fix reduces the workbuf allocation for the test program from
issue #15319 from 213 MB (roughly 2KB * 1e5 goroutines) to 10 MB.

Fixes #15319.

Note that there's potentially a similar issue in write barriers during
mark 2. Fixing that will be more difficult since there's no broader
non-preemptible context, but it should also be less of a problem since
the full list is being drained during mark 2.

Some overall improvements in the go1 benchmarks, plus the usual noise.
No significant change in the garbage benchmark (time/op or GC memory).

name                      old time/op    new time/op    delta
BinaryTree17-12              2.54s ± 1%     2.51s ± 1%  -1.09%  (p=0.000 n=20+19)
Fannkuch11-12                2.12s ± 0%     2.17s ± 0%  +2.18%  (p=0.000 n=19+18)
FmtFprintfEmpty-12          45.1ns ± 1%    45.2ns ± 0%    ~     (p=0.078 n=19+18)
FmtFprintfString-12          127ns ± 0%     128ns ± 0%  +1.08%  (p=0.000 n=19+16)
FmtFprintfInt-12             125ns ± 0%     122ns ± 1%  -2.71%  (p=0.000 n=14+18)
FmtFprintfIntInt-12          196ns ± 0%     190ns ± 1%  -2.91%  (p=0.000 n=12+20)
FmtFprintfPrefixedInt-12     196ns ± 0%     194ns ± 1%  -0.94%  (p=0.000 n=13+18)
FmtFprintfFloat-12           253ns ± 1%     251ns ± 1%  -0.86%  (p=0.000 n=19+20)
FmtManyArgs-12               807ns ± 1%     784ns ± 1%  -2.85%  (p=0.000 n=20+20)
GobDecode-12                7.13ms ± 1%    7.12ms ± 1%    ~     (p=0.351 n=19+20)
GobEncode-12                5.89ms ± 0%    5.95ms ± 0%  +0.94%  (p=0.000 n=19+19)
Gzip-12                      219ms ± 1%     221ms ± 1%  +1.35%  (p=0.000 n=18+20)
Gunzip-12                   37.5ms ± 1%    37.4ms ± 0%    ~     (p=0.057 n=20+19)
HTTPClientServer-12         81.4µs ± 4%    81.9µs ± 3%    ~     (p=0.118 n=17+18)
JSONEncode-12               15.7ms ± 1%    15.8ms ± 1%  +0.73%  (p=0.000 n=17+18)
JSONDecode-12               57.9ms ± 1%    57.2ms ± 1%  -1.34%  (p=0.000 n=19+19)
Mandelbrot200-12            4.12ms ± 1%    4.10ms ± 0%  -0.33%  (p=0.000 n=19+17)
GoParse-12                  3.22ms ± 2%    3.25ms ± 1%  +0.72%  (p=0.000 n=18+20)
RegexpMatchEasy0_32-12      70.6ns ± 1%    71.1ns ± 2%  +0.63%  (p=0.005 n=19+20)
RegexpMatchEasy0_1K-12       240ns ± 0%     239ns ± 1%  -0.59%  (p=0.000 n=19+20)
RegexpMatchEasy1_32-12      71.3ns ± 1%    71.3ns ± 1%    ~     (p=0.844 n=17+17)
RegexpMatchEasy1_1K-12       384ns ± 2%     371ns ± 1%  -3.45%  (p=0.000 n=19+20)
RegexpMatchMedium_32-12      109ns ± 1%     108ns ± 2%  -0.48%  (p=0.029 n=19+19)
RegexpMatchMedium_1K-12     34.3µs ± 1%    34.5µs ± 2%    ~     (p=0.160 n=18+20)
RegexpMatchHard_32-12       1.79µs ± 9%    1.72µs ± 2%  -3.83%  (p=0.000 n=19+19)
RegexpMatchHard_1K-12       53.3µs ± 4%    51.8µs ± 1%  -2.82%  (p=0.000 n=19+20)
Revcomp-12                   386ms ± 0%     388ms ± 0%  +0.72%  (p=0.000 n=17+20)
Template-12                 62.9ms ± 1%    62.5ms ± 1%  -0.57%  (p=0.010 n=18+19)
TimeParse-12                 325ns ± 0%     331ns ± 0%  +1.84%  (p=0.000 n=18+19)
TimeFormat-12                338ns ± 0%     343ns ± 0%  +1.34%  (p=0.000 n=18+20)
[Geo mean]                  52.7µs         52.5µs       -0.42%

Change-Id: Ib2d34736c4ae2ec329605b0fbc44636038d8d018
Reviewed-on: https://go-review.googlesource.com/23391
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-05-25 21:11:47 +00:00
Austin Clements a1f7db88f8 runtime: document scanstack
Also mark it go:systemstack and explain why.

Change-Id: I88baf22741c04012ba2588d8e03dd3801d19b5c0
Reviewed-on: https://go-review.googlesource.com/23390
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-05-25 21:11:44 +00:00
Marcel van Lohuizen 23cb8864b5 runtime: use Run for more benchmarks
Names for Append?Bytes are slightly changed in addition to adding a slash.
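
A sketch of the b.Run style being adopted: sub-benchmark names gain a
slash-separated suffix (e.g. BenchmarkAppendBytes/32). The benchmark body
and sizes here are illustrative, not the runtime's actual benchmarks:

    package demo

    import (
        "fmt"
        "testing"
    )

    func BenchmarkAppendBytes(b *testing.B) {
        for _, n := range []int{1, 8, 32} {
            b.Run(fmt.Sprint(n), func(b *testing.B) {
                src := make([]byte, n)
                var dst []byte
                for i := 0; i < b.N; i++ {
                    dst = append(dst[:0], src...)
                }
                _ = dst
            })
        }
    }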

Change-Id: I0291aa29c693f9040fd01368eaad9766259677df
Reviewed-on: https://go-review.googlesource.com/23426
Run-TryBot: Marcel van Lohuizen <mpvl@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-25 17:46:50 +00:00
Marcel van Lohuizen 095fbdcc91 runtime: use of Run for some benchmarks
Names of sub-benchmarks are preserved, short of the additional slash.

Change-Id: I9b3f82964f9a44b0d28724413320afd091ed3106
Reviewed-on: https://go-review.googlesource.com/23425
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Marcel van Lohuizen <mpvl@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-25 16:49:02 +00:00
Elias Naur 72eb46c5a0 runtime,runtime/cgo: save callee-saved FP register on arm
Other GOARCHs already handle their callee-saved FP registers, but
arm was missing. Without this change, code using Cgo and floating
point code might fail in mysterious and hard to debug ways.

There are no floating point registers when GOARM=5, so skip the
registers when runtime.goarm < 6.

darwin/arm doesn't support GOARM=5, so the check is left out of
rt0_darwin_arm.s.

Fixes #14876

Change-Id: I6bcb90a76df3664d8ba1f33123a74b1eb2c9f8b2
Reviewed-on: https://go-review.googlesource.com/23140
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
2016-05-25 06:54:28 +00:00
Robert Griesemer 93e8e70499 all: fixed a handful of typos
Change-Id: Ib0683f27b44e2f107cca7a8dcc01d230cbcd5700
Reviewed-on: https://go-review.googlesource.com/23404
Reviewed-by: Alan Donovan <adonovan@google.com>
2016-05-24 21:18:03 +00:00
Austin Clements a640d95172 runtime: update SP when jumping stacks in traceback
When gentraceback starts on a system stack in sigprof, it is
configured to jump to the user stack when it reaches the end of the
system stack. Currently this updates the current frame's FP, but not
its SP. This is okay on non-LR machines (x86) because frame.sp is only
used to find defers, which the bottom-most frame of the user stack
will never have.

However, on LR machines, we use frame.sp to find the saved LR. We then
use that to resolve the function of the next frame, which is used to
resolve the size of the next frame. Since we're not updating frame.sp
on a stack jump, we read the saved LR from the system stack instead of
the user stack and wind up resolving the wrong function and hence the
wrong frame size for the next frame.

This has had remarkably few ill effects (though the resulting profiles
must be wrong). We noticed it because of a bad interaction with stack
barriers. Specifically, once we get the next frame size wrong, we also
get the location of its LR wrong. If we happen to get a stack slot
that contains a stale stack barrier LR (for a stack barrier we already
hit) and hasn't been overwritten with something else as we re-grew the
stack, gentraceback will fail with a "found next stack barrier at ..."
error, pointing at the slot that it thinks is an LR, but isn't.

Fixes #15138.

Updates #15313 (might fix it).

Change-Id: I13cfa322b44c0c2f23ac2b3d03e12631e4a6406b
Reviewed-on: https://go-review.googlesource.com/23291
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-05-24 21:07:24 +00:00
Austin Clements 44497ebacb runtime: fix goroutine priority elevation
Currently it's possible for user code to exploit the high scheduler
priority of the GC worker in conjunction with the runnext optimization
to elevate a user goroutine to high priority so it will always run
even if there are other runnable goroutines.

For example, if a goroutine is in a tight allocation loop, the
following can happen:

1. Goroutine 1 allocates, triggering a GC.
2. G 1 attempts an assist, but fails and blocks.
3. The scheduler runs the GC worker, since it is high priority.
   Note that this also starts a new scheduler quantum.
4. The GC worker does enough work to satisfy the assist.
5. The GC worker readies G 1, putting it in runnext.
6. GC finishes and the scheduler runs G 1 from runnext, giving it
   the rest of the GC worker's quantum.
7. Go to 1.

Even if there are other goroutines on the run queue, they never get a
chance to run in the above sequence. This requires a confluence of
circumstances that make it unlikely, though not impossible, that it
would happen in "real" code. In the test added by this commit, we
force this confluence by setting GOMAXPROCS to 1 and GOGC to 1 so it's
easy for the test to repeatedly trigger GC and wake from a blocked
assist.

We fix this by making GC always put user goroutines at the end of the
run queue, instead of in runnext. This makes it so user code can't
piggy-back on the GC's high priority to make a user goroutine act like
it has high priority. The only other situation where GC wakes user
goroutines is waking all blocked assists at the end, but this uses the
global run queue and hence doesn't have this problem.
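
A toy sketch of the policy change (illustrative types only, not the scheduler's actual per-P ring buffer):

```go
package main

import "fmt"

// task stands in for a runnable goroutine in this toy model.
type task struct{ name string }

// runQueue models a per-P queue with a runnext slot plus a FIFO tail.
type runQueue struct {
	runnext *task   // runs next and inherits the current quantum
	fifo    []*task // everyone else, in order
}

// readyFromGC places a goroutine woken by the GC at the *end* of the queue,
// so it cannot inherit the GC worker's scheduling priority.
func (q *runQueue) readyFromGC(t *task) { q.fifo = append(q.fifo, t) }

// readyNormal keeps the usual runnext fast path for ordinary wakeups.
func (q *runQueue) readyNormal(t *task) {
	if q.runnext != nil {
		q.fifo = append(q.fifo, q.runnext)
	}
	q.runnext = t
}

func main() {
	q := &runQueue{fifo: []*task{{"other-1"}, {"other-2"}}}
	q.readyFromGC(&task{"assist-blocked"})
	fmt.Println(len(q.fifo), "tasks queued; assist-blocked runs after the others")
}
```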

Fixes #15706.

Change-Id: I1589dee4b7b7d0c9c8575ed3472226084dfce8bc
Reviewed-on: https://go-review.googlesource.com/23172
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-05-19 18:18:13 +00:00
Austin Clements 91740582c3 runtime: add 'next' flag to ready
Currently ready always puts the readied goroutine in runnext. We're
going to have to change this for some uses, so add a flag for whether
or not to use runnext.

For now we always pass true so this is a no-op change.

For #15706.

Change-Id: Iaa66d8355ccfe4bbe347570cc1b1878c70fa25df
Reviewed-on: https://go-review.googlesource.com/23171
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-05-19 18:17:58 +00:00
Joel Sing 0dcd330bc8 runtime/cgo: make cgo work with openbsd ABI changes
OpenBSD 6.0 (due out November 2016) will support PT_TLS, which will
allow for the OpenBSD cgo pthread_create() workaround to be removed.

However, in order for Go to continue working on supported OpenBSD
releases (the current release and the previous release - 5.9 and 6.0,
once 6.0 is released), we cannot enable PT_TLS immediately. Instead,
adjust the existing code so that it works with the previous TCB
allocation and the new TIB allocation. This allows the same Go
runtime to work on 5.8, 5.9 and later 6.0.

Once OpenBSD 5.9 is no longer supported (May 2017, when 6.1 is
released), PT_TLS can be enabled and the additional cgo runtime
code removed.

Change-Id: I3eed5ec593d80eea78c6656cb12557004b2c0c9a
Reviewed-on: https://go-review.googlesource.com/23197
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-19 15:43:37 +00:00
Ian Lance Taylor 1f7a0d4b5e runtime: don't do a plain throw when throwsplit == true
The test case in #15639 somehow causes an invalid syscall frame. The
failure is obscured because the throw occurs when throwsplit == true,
which causes a "stack split at bad time" error when trying to print the
throw message.

This CL fixes the "stack split at bad time" by using systemstack. No
test because there shouldn't be any way to trigger this error anyhow.

Update #15639.

Change-Id: I4240f3fd01bdc3c112f3ffd1316b68504222d9e1
Reviewed-on: https://go-review.googlesource.com/23153
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-05-19 04:37:45 +00:00
Ian Lance Taylor c08436d1c8 runtime: print PC, not the counter, for a cgo traceback
Change-Id: I54ed7a26a753afb2d6a72080e1f50ce9fba7c183
Reviewed-on: https://go-review.googlesource.com/23228
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-18 23:47:13 +00:00
Ian Lance Taylor 538537a28d runtime: check only up to ptrdata bytes for pointers
Fixes #14508.

Change-Id: I237d0c5a79a73e6c97bdb2077d8ede613128b978
Reviewed-on: https://go-review.googlesource.com/23224
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-05-18 23:39:06 +00:00
Ian Lance Taylor 6ab45c09f6 runtime: add KeepAlive function
Fixes #13347.

Change-Id: I591a80a1566ce70efb5f68e3ad69e7e3ab98cd9b
Reviewed-on: https://go-review.googlesource.com/23102
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-18 20:42:37 +00:00
Cuihtlauac ALVARADO 2380a039c0 runtime: in tests, make sure gdb does not start with a shell
On some systems, gdb is set to: "startup-with-shell on". This
breaks runtime_test. This just makes sure gdb does not start by
spawning a shell.

Fixes #15354

Change-Id: Ia040931c61dea22f4fdd79665ab9f84835ecaa70
Reviewed-on: https://go-review.googlesource.com/23142
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-18 14:03:22 +00:00
Ian Lance Taylor 23a59ba17c runtime: deflake TestSignalExitStatus
The signal might get delivered to a different thread, and that thread
might not run again before the currently running thread returns and
exits. Sleep to give the other thread time to pick up the signal and
crash.

Not tested for all cases, but, optimistically:
Fixes #14063.

Change-Id: Iff58669ac6185ad91cce85e0e86f17497a3659fd
Reviewed-on: https://go-review.googlesource.com/23203
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Mikio Hara <mikioh.mikioh@gmail.com>
2016-05-18 04:08:08 +00:00
James Chacon 733162fd6c runtime: prevent racefini from being invoked more than once
racefini calls __tsan_fini, which is C code and at the end
invokes the standard C library exit(3) call. This has undefined
behavior if invoked more than once. Specifically, in C++ programs
it caused static destructors to run twice. At least on glibc
implementations it also means the atexit handler list (where those are
stored) frees a list entry as each handler completes, so invoking it
twice results in a double free at exit, which trips debug memory
allocation tracking.

Fix all of this by using an atomic as a boolean barrier so that
the body of racefini runs at most once.
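
A minimal standalone sketch of that barrier (names like finiDone are illustrative; the real code guards the __tsan_fini call inside racefini):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

var finiDone uint32 // 0 = finalizer not yet run, 1 = already claimed

// fini runs its body at most once, no matter how many goroutines call it.
func fini() {
	if !atomic.CompareAndSwapUint32(&finiDone, 0, 1) {
		return // someone else already ran (or is running) the finalizer
	}
	fmt.Println("running the once-only finalizer")
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() { defer wg.Done(); fini() }()
	}
	wg.Wait()
}
```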

Fixes #15578

Change-Id: I49222aa9b8ded77160931f46434c61a8379570fc
Reviewed-on: https://go-review.googlesource.com/22882
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-18 01:04:55 +00:00
Ian Lance Taylor c1b32acefb runtime: yield after raising signal that should kill process
Issue #15613 points out that the darwin builders have been getting
regular failures in which a process that should exit with a SIGPIPE
signal is instead exiting with exit status 2. The code calls
runtime.raise. On most systems runtime.raise is the equivalent of
pthread_kill(gettid(), sig); that is, it kills the thread with the
signal, which should ensure that the program does not keep going. On
darwin, however, runtime.raise is actually kill(getpid(), sig); that is,
it sends a signal to the entire process. If the process decides to
deliver the signal to a different thread, then it is possible that in
some cases the thread that calls raise is able to execute the next
system call before the signal is actually delivered. That would cause
the observed error.

I have not been able to recreate the problem myself, so I don't know
whether this actually fixes it. But, optimistically:

Fixes #15613.

Change-Id: I60c0a9912aae2f46143ca1388fd85e9c3fa9df1f
Reviewed-on: https://go-review.googlesource.com/23152
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-05-16 22:19:05 +00:00
Austin Clements 466cae6ca9 runtime: use GOTRACEBACK=system for TestStackBarrierProfiling
This should help with debugging failures.

For #15138 and #15477.

Change-Id: I77db2b6375d8b4403d3edf5527899d076291e02c
Reviewed-on: https://go-review.googlesource.com/23134
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-05-16 20:16:50 +00:00
Austin Clements 64770f642f runtime: use conventional shift style for gcBitsChunkBytes
The convention for writing something like "64 kB" is 64<<10, since
this is easier to read than 1<<16. Update gcBitsChunkBytes to follow
this convention.
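
For example (a sketch only; the value is whatever the runtime defines, written in the shifted form):

```go
package sketch

// The shifted-kilobyte spelling reads as "64 KB" at a glance.
const gcBitsChunkBytes = 64 << 10 // same value as 1 << 16
```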

Change-Id: I5b5a3f726dcf482051ba5b1814db247ff3b8bb2f
Reviewed-on: https://go-review.googlesource.com/23132
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-05-16 18:28:38 +00:00
Austin Clements 30ded16596 runtime: remove obsolete comment from scanobject
Change-Id: I5ebf93b60213c0138754fc20888ae5ce60237b8c
Reviewed-on: https://go-review.googlesource.com/23131
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-05-16 18:28:35 +00:00
Keith Randall 0bc14f57ec strings: fix Contains on amd64
The 17-31 byte code is broken.  Disabled it.

Added a bunch of tests to at least cover the cases
in indexShortStr.  I'll channel Brad and wonder why
this CL ever got in without any tests.

Fixes #15679

Change-Id: I84a7b283a74107db865b9586c955dcf5f2d60161
Reviewed-on: https://go-review.googlesource.com/23106
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-15 05:21:03 +00:00
Austin Clements 6181db53db runtime: improve heapBitsSetType documentation
Currently the heapBitsSetType documentation says that there are no
races on the heap bitmap, but that isn't exactly true. There are no
*write-write* races, but there are read-write races. Expand the
documentation to explain this and why it's okay.

Change-Id: Ibd92b69bcd6524a40a9dd4ec82422b50831071ed
Reviewed-on: https://go-review.googlesource.com/23092
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-05-14 13:49:57 +00:00
Austin Clements d8b08c3aa4 runtime: perform publication barrier even for noscan objects
Currently we only execute a publication barrier for scan objects (and
skip it for noscan objects). This used to be okay because GC would
never consult the object itself (so it wouldn't observe uninitialized
memory even if it found a pointer to a noscan object), and the heap
bitmap was pre-initialized to noscan.

However, now we explicitly initialize the heap bitmap for noscan
objects when we allocate them. While the GC will still never consult
the contents of a noscan object, it does need to see the initialized
heap bitmap. Hence, we need to execute a publication barrier to make
the bitmap visible before user code can expose a pointer to the newly
allocated object even for noscan objects.
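
A rough analogy in ordinary Go, using sync/atomic's ordering in place of the runtime's publicationBarrier (names and types here are illustrative):

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"unsafe"
)

type obj struct{ x int }

var published unsafe.Pointer // *obj; nil until the producer publishes it

func producer() {
	o := &obj{}
	o.x = 42 // initialize contents (and, by analogy, the heap bitmap) first
	// The atomic store is the "publication barrier": all writes above become
	// visible before any reader can observe the pointer itself.
	atomic.StorePointer(&published, unsafe.Pointer(o))
}

func main() {
	go producer()
	for {
		if p := (*obj)(atomic.LoadPointer(&published)); p != nil {
			fmt.Println("observed x =", p.x) // always 42, never uninitialized
			return
		}
		runtime.Gosched() // let the producer run
	}
}
```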

Change-Id: Ie4133c638db0d9055b4f7a8061a634d970627153
Reviewed-on: https://go-review.googlesource.com/23043
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-05-14 13:49:51 +00:00
Joel Sing 5bd37b8e78 runtime: stop using sigreturn on openbsd/386
In future releases of OpenBSD, the sigreturn syscall will no longer
exist. As such, stop using sigreturn on openbsd/386 and just return
from the signal trampoline (as we already do for openbsd/amd64 and
openbsd/arm).

Change-Id: Ic4de1795bbfbfb062a685832aea0d597988c6985
Reviewed-on: https://go-review.googlesource.com/23024
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-12 07:32:01 +00:00
Brad Fitzpatrick 9628e6fd1d runtime/testdata/testprogcgo: fix Windows C compiler warning
Noticed and fixed by Alex Brainman.

Tested in https://golang.org/cl/23005 (which makes all compiler
warnings fatal during development)

Fixes #15623

Change-Id: Ic19999fce8bb8640d963965cc328574efadd7855
Reviewed-on: https://go-review.googlesource.com/23010
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
2016-05-10 23:11:44 +00:00
Cherry Zhang fdc4a964d2 [dev.ssa] cmd/compile/internal/gc, runtime: use 32-bit load for writeBarrier check
Use 32-bit load for writeBarrier check on all architectures.
Padding added to runtime structure.

Updates #15365, #15492.

Change-Id: I5d3dadf8609923fe0fe4fcb384a418b7b9624998
Reviewed-on: https://go-review.googlesource.com/22855
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-10 17:34:30 +00:00
Austin Clements 256a9670cc runtime: fix some out of date comments in bitmap code
Change-Id: I4613aa6d62baba01686bbab10738a7de23daae30
Reviewed-on: https://go-review.googlesource.com/22971
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-05-09 19:24:48 +00:00
Dmitry Vyukov aeecee8ce4 runtime/race: deflake test
The test sometimes fails on builders.
The test uses sleeps to establish the necessary goroutine
execution order. If sleeps undersleep/oversleep
the race is still reported, but it can be reported when the
main test goroutine returns. In such case test driver
can't match the race with the test and reports failure.

Wait for both test goroutines to ensure that the race
is reported in the test scope.

Fixes #15579

Change-Id: I0b9bec0ebfb0c127d83eb5325a7fe19ef9545050
Reviewed-on: https://go-review.googlesource.com/22951
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-05-09 14:50:18 +00:00
Elias Naur e6ec82067a runtime: use entire address space on 32 bit
In issue #13992, Russ mentioned that the heap bitmap footprint was
halved but that the bitmap size calculation hadn't been updated. This
presents the opportunity to either halve the bitmap size or double
the addressable virtual space. This CL doubles the addressable virtual
space. On 32 bit this can be tweaked further to allow the bitmap to
cover the entire 4GB virtual address space, removing a failure mode
if the kernel hands out memory with a too low address.

First, fix the calculation and double _MaxArena32 to cover 4GB virtual
memory space with the same bitmap size (256 MB).

Then, allow the fallback mode for the initial memory reservation
on 32 bit (or 64 bit with too little available virtual memory) to not
include space for the arena. mheap.sysAlloc will automatically reserve
additional space when the existing arena is full.

Finally, set arena_start to 0 in 32 bit mode, so that any address is
acceptable for subsequent (additional) reservations.

Before, the bitmap was always located just before arena_start, so
fix the two places relying on that assumption: Point the otherwise unused
mheap.bitmap to one byte after the end of the bitmap, and use it for
bitmap addressing instead of arena_start.
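
A toy illustration of indexing backward from a one-past-the-end pointer, as described above (the 4-words-per-byte packing is an assumption for the sketch, not necessarily the runtime's exact layout):

```go
package main

import "fmt"

func main() {
	const wordsPerByte = 4 // assumed packing of bitmap entries per byte
	bitmap := make([]byte, 16)
	end := len(bitmap) // plays the role of mheap.bitmap: one byte past the end

	// The byte for a given heap word is found by walking backward from the
	// end, so the bitmap no longer has to sit directly below arena_start.
	wordOffset := 23 // offset of the address within the arena, in words
	byteIdx := end - wordOffset/wordsPerByte - 1
	bitmap[byteIdx] = 1 // touch that byte, just to use the index
	fmt.Println("bitmap byte index:", byteIdx) // 16 - 5 - 1 = 10
}
```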

With arena_start set to 0 on 32 bit, the cgoInRange check is no longer a
sufficient check for Go pointers. Introduce and call inHeapOrStack to
check whether a pointer is to the Go heap or stack.

While we're here, remove sysReserveHigh which seems to be unused.

Fixes #13992

Change-Id: I592b513148a50b9d3967b5c5d94b86b3ec39acc2
Reviewed-on: https://go-review.googlesource.com/20471
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-07 03:04:39 +00:00
Brad Fitzpatrick 131231b8db os: rename remaining four os1_*.go files to os_*.go
Change-Id: Ice9c234960adc7857c8370b777a0b18e29d59281
Reviewed-on: https://go-review.googlesource.com/22853
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-06 17:47:44 +00:00
Brad Fitzpatrick 61602b0e9e runtime: delete empty files
I meant to delete these in CL 22850, actually.

Change-Id: I0c286efd2b9f1caf0221aa88e3bcc03649c89517
Reviewed-on: https://go-review.googlesource.com/22851
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-06 17:04:11 +00:00
Brad Fitzpatrick 2dc680007e runtime: merge the last four os-vs-os1 files together
Change-Id: Ib0ba691c4657fe18a4659753e70d97c623cb9c1d
Reviewed-on: https://go-review.googlesource.com/22850
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-05-06 16:03:25 +00:00
Shenghou Ma 2e32efc44a runtime: get randomness from AT_RANDOM AUXV on linux/mips64x
Fixes #15148.

Change-Id: If3b628f30521adeec1625689dbc98aaf4a9ec858
Reviewed-on: https://go-review.googlesource.com/22811
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-06 05:50:02 +00:00
Russ Cox 88d3db0a5b runtime: stop traceback at foreign function
This can only happen when profiling and there is foreign code
at the top of the g0 stack but we're not in cgo.
That in turn only happens with the race detector.

Fixes #13568.

Change-Id: I23775132c9c1a3a3aaae191b318539f368adf25e
Reviewed-on: https://go-review.googlesource.com/18322
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-06 00:54:25 +00:00
Joe Tsai acc757f678 all: use SeekStart, SeekCurrent, SeekEnd
CL/19862 (f79b50b8d5) recently introduced the constants
SeekStart, SeekCurrent, and SeekEnd to the io package. We should use these constants
consistently throughout the code base.

Updates #15269

Change-Id: If7fcaca7676e4a51f588528f5ced28220d9639a2
Reviewed-on: https://go-review.googlesource.com/22097
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Joe Tsai <joetsai@digital-static.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-06 00:10:41 +00:00
David Chase 6db98a3c51 cmd/compile: repair MININT conversion bug in arm softfloat
Negative-case conversion code was wrong for minimum int32,
used negate-then-widen instead of widen-then-negate.

Test already exists; this fixes the failure.

Fixes #15563.

Change-Id: I4b0b3ae8f2c9714bdcc405d4d0b1502ccfba2b40
Reviewed-on: https://go-review.googlesource.com/22830
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-05 22:29:25 +00:00
Keith Randall 802966f7b3 [dev.ssa] Merge remote-tracking branch 'origin/master' into mergebranch
Merge from tip into ssa.

Change-Id: Icbc1c46d9f4721e4a0f99a24dd708044407ee9f7
2016-05-05 14:24:52 -07:00
Keith Randall ab150e1ac9 [dev.ssa] all: merge from tip to get dev.ssa current
So we can start working on other architectures here.

Change is a dummy to keep git happy.

Change-Id: I1caa62a242790601810a1ff72af7ea9773d4da76
Reviewed-on: https://go-review.googlesource.com/22822
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-05-05 20:47:51 +00:00
Emmanuel Odeke 1a7fc2357b runtime: print signal name in panic, if name is known
Adds a small function signame that infers a signal name
from the signal table, otherwise falling back to using
hex(sig) as before. No signal table is present for
Windows, hence it will always print the hex value.

Sample code and new result:
```go
package main

import (
  "fmt"
  "time"
)

func main() {
  defer func() {
    if err := recover(); err != nil {
      fmt.Printf("err=%v\n", err)
    }
  }()

  ticker := time.Tick(1e9)
  for {
    <-ticker
  }
}
```

```shell
$ go run main.go &
$ kill -11 <pid>
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0xb01dfacedebac1e
pc=0xc71db]
...
```

Fixes #13969

Change-Id: Ie6be312eb766661f1cea9afec352b73270f27f9d
Reviewed-on: https://go-review.googlesource.com/22753
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-05 19:58:00 +00:00
Lynn Boger eeca3ba92f sync/atomic, runtime/internal/atomic: improve ppc64x atomics
The following performance improvements have been made to the
low-level atomic functions for ppc64le & ppc64:

- For those cases containing a lwarx and stwcx (or other sizes):
sync, lwarx, maybe something, stwcx, loop to sync, sync, isync
The sync is moved before (outside) the lwarx/stwcx loop, and the
sync after is removed, so it becomes:
sync, lwarx, maybe something, stwcx, loop to lwarx, isync

- For the Or8 and And8, the shifting and manipulation of the
address to the word aligned version were removed and the
instructions were changed to use lbarx, stbcx instead of
register shifting, xor, then lwarx, stwcx.

- New instructions LWSYNC, LBAR, STBCC were tested and added.
runtime/atomic_ppc64x.s was changed to use the LWSYNC opcode
instead of the WORD encoding.

Fixes #15469

Ran some of the benchmarks in the runtime and sync directories.
Some results varied from run to run but the trend was improvement
based on best times for base and new:

runtime.test:
BenchmarkChanNonblocking-128         0.88          0.89          +1.14%
BenchmarkChanUncontended-128         569           511           -10.19%
BenchmarkChanContended-128           63110         53231         -15.65%
BenchmarkChanSync-128                691           598           -13.46%
BenchmarkChanSyncWork-128            11355         11649         +2.59%
BenchmarkChanProdCons0-128           2402          2090          -12.99%
BenchmarkChanProdCons10-128          1348          1363          +1.11%
BenchmarkChanProdCons100-128         1002          746           -25.55%
BenchmarkChanProdConsWork0-128       2554          2720          +6.50%
BenchmarkChanProdConsWork10-128      1909          1804          -5.50%
BenchmarkChanProdConsWork100-128     1624          1580          -2.71%
BenchmarkChanCreation-128            237           212           -10.55%
BenchmarkChanSem-128                 705           667           -5.39%
BenchmarkChanPopular-128             5081190       4497566       -11.49%

BenchmarkCreateGoroutines-128             532           473           -11.09%
BenchmarkCreateGoroutinesParallel-128     35.0          34.7          -0.86%
BenchmarkCreateGoroutinesCapture-128      4923          4200          -14.69%

sync.test:
BenchmarkUncontendedSemaphore-128      112           94.2          -15.89%
BenchmarkContendedSemaphore-128        133           128           -3.76%
BenchmarkMutexUncontended-128          1.90          1.67          -12.11%
BenchmarkMutex-128                     353           310           -12.18%
BenchmarkMutexSlack-128                304           283           -6.91%
BenchmarkMutexWork-128                 554           541           -2.35%
BenchmarkMutexWorkSlack-128            567           556           -1.94%
BenchmarkMutexNoSpin-128               275           242           -12.00%
BenchmarkMutexSpin-128                 1129          1030          -8.77%
BenchmarkOnce-128                      1.08          0.96          -11.11%
BenchmarkPool-128                      29.8          27.4          -8.05%
BenchmarkPoolOverflow-128              40564         36583         -9.81%
BenchmarkSemaUncontended-128           3.14          2.63          -16.24%
BenchmarkSemaSyntNonblock-128          1087          1069          -1.66%
BenchmarkSemaSyntBlock-128             897           893           -0.45%
BenchmarkSemaWorkNonblock-128          1034          1028          -0.58%
BenchmarkSemaWorkBlock-128             949           886           -6.64%

Change-Id: I4403fb29d3cd5254b7b1ce87a216bd11b391079e
Reviewed-on: https://go-review.googlesource.com/22549
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Minux Ma <minux@golang.org>
2016-05-05 18:52:28 +00:00
Cherry Zhang bcd4b84bc5 runtime: skip TestCgoCallbackGC on linux/mips64x
Builder is too slow. This test passed on builder machines but took
15+ min.

Change-Id: Ief9d67ea47671a57e954e402751043bc1ce09451
Reviewed-on: https://go-review.googlesource.com/22798
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-05 14:58:14 +00:00
Ian Lance Taylor 34f97d28d2 runtime: put tracebackctxt C functions in .c file
Since tracebackctxt.go uses //export functions, the C functions can't be
externally visible in the C comment. The code was using attributes to
work around that, but that failed on Windows.

Change-Id: If4449fd8209a8998b4f6855ea89e5db1471b2981
Reviewed-on: https://go-review.googlesource.com/22786
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-05 01:19:18 +00:00
Mohit Agarwal 4d6788ecae runtime: clean up profiling data files produced by TestCgoPprof
Fixes #15541

Change-Id: I9b6835157db0eb86de13591e785f971ffe754baa
Reviewed-on: https://go-review.googlesource.com/22783
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-04 22:12:26 +00:00
Shenghou Ma 43d2a10e26 runtime/internal/atomic: fix vet warnings
Change-Id: Ib29cf7abbbdaed81e918e5e41bca4e9b8da24621
Reviewed-on: https://go-review.googlesource.com/22503
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-05-04 16:50:22 +00:00
Cherry Zhang b6687c8933 runtime: add linux/mips64x cgo support
Change-Id: Id40dd05b7b264f3b779fdf9ccc2421ba4bc70589
Reviewed-on: https://go-review.googlesource.com/19806
Reviewed-by: Minux Ma <minux@golang.org>
2016-05-04 16:41:10 +00:00
Cherry Zhang 6e90432342 runtime/cgo: add context argument to crosscall2 on mips64
Change-Id: Id018516075842afd8af12fbf207763a851d5a851
Reviewed-on: https://go-review.googlesource.com/22754
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-05-04 16:40:44 +00:00
Ian Lance Taylor 84e808043f runtime: use cgo traceback for SIGPROF
If we collected a cgo traceback when entering the SIGPROF signal
handler, record it as part of the profiling stack trace.

This serves as the promised test for https://golang.org/cl/21055 .

Change-Id: I5f60cd6cea1d9b7c3932211483a6bfab60ed21d2
Reviewed-on: https://go-review.googlesource.com/22650
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-05-04 00:08:19 +00:00
Dmitry Vyukov caa2147532 runtime: per-P contexts for race detector
Race runtime also needs local malloc caches and currently uses
a mix of per-OS-thread and per-goroutine caches. This leads to
increased memory consumption. But more importantly, the cache of
synchronization objects is per-goroutine and we don't always
have goroutine context when freeing memory in GC. As a result,
synchronization object descriptors leak (more precisely, they
can be reused if another synchronization object is recreated
at the same address, but it does not always help). For example,
the added BenchmarkSyncLeak has effectively runaway memory
consumption (based on a real long running server).

This change updates race runtime with support for per-P contexts.
BenchmarkSyncLeak now stabilizes at ~1GB memory consumption.

Long term, this will allow us to remove race runtime dependency
on glibc (as malloc is the main cornerstone).

I've also implemented a different scheme to pass P context to
race runtime: scheduler notified race runtime about association
between G and P by calling procwire(g, p)/procunwire(g, p).
But it turned out to be very messy as we have lots of places
where the association changes (e.g. syscalls). So I dropped it
in favor of the current scheme: race runtime asks scheduler
about the current P.

Fixes #14533

Change-Id: Iad10d2f816a44affae1b9fed446b3580eafd8c69
Reviewed-on: https://go-review.googlesource.com/19970
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-03 11:00:43 +00:00
Dmitry Vyukov fcd7c02c70 runtime: fix CPU underutilization
Runqempty is a critical predicate for scheduler. If runqempty spuriously
returns true, then scheduler can fail to schedule arbitrary number of
runnable goroutines on idle Ps for arbitrary long time. With the addition
of runnext runqempty predicate become broken (can spuriously return true).
Consider that runnext is not nil and the main array is empty. Runqempty
observes that the array is empty, then it is descheduled for some time.
Then queue owner pushes another element to the queue evicting runnext
into the array. Then queue owner pops runnext. Then runqempty resumes
and observes runnext is nil and returns true. But there were no point
in time when the queue was empty.

Fix runqempty predicate to not return true spuriously.
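
A sketch of a check along those lines (simplified types; not the runtime's actual code):

```go
package sketch

import "sync/atomic"

// runq is a toy per-P queue: a ring described by head/tail plus a runnext slot.
type runq struct {
	head, tail uint32  // ring indices, written atomically by the owner
	runnext    uintptr // pending next goroutine, 0 when empty
}

// empty reports whether the queue is empty without ever spuriously returning
// true: it only trusts a snapshot in which tail did not move while head and
// runnext were being read.
func (q *runq) empty() bool {
	for {
		head := atomic.LoadUint32(&q.head)
		tail := atomic.LoadUint32(&q.tail)
		next := atomic.LoadUintptr(&q.runnext)
		if tail == atomic.LoadUint32(&q.tail) {
			return head == tail && next == 0
		}
		// tail moved underneath us; an element may have migrated between
		// runnext and the ring, so take a fresh snapshot.
	}
}
```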

Change-Id: Ifb7d75a699101f3ff753c4ce7c983cf08befd31e
Reviewed-on: https://go-review.googlesource.com/20858
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-03 10:06:32 +00:00
Emmanuel Odeke 53fd522c0d all: make copyright headers consistent with one space after period
Follows suit with https://go-review.googlesource.com/#/c/20111.

Generated by running
$ grep -R 'Go Authors.  All' * | cut -d":" -f1 | \
    while read F; do perl -pi -e 's/Go Authors.  All/Go Authors. All/g' $F; done

The code in cmd/internal/unvendor wasn't changed.

Fixes #15213

Change-Id: I4f235cee0a62ec435f9e8540a1ec08ae03b1a75f
Reviewed-on: https://go-review.googlesource.com/21819
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-02 13:43:18 +00:00
Austin Clements 77c7f12438 runtime: update some comments
This updates some comments that became out of date when we moved the
mark bit out of the heap bitmap and started using the high bit for the
first word as a scan/dead bit.

Change-Id: I4a572d16db6114cadff006825466c1f18359f2db
Reviewed-on: https://go-review.googlesource.com/22662
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-05-01 03:31:50 +00:00
Cherry Zhang 5d002dbc21 runtime/cgo: add linux/mips64x cgo support
MIPS N64 ABI passes arguments in registers R4-R11, return value in R2.
R16-R23, R28, R30 and F24-F31 are callee-save. gcc PIC code expects
to be called with indirect call through R25.

Change-Id: I24f582b4b58e1891ba9fd606509990f95cca8051
Reviewed-on: https://go-review.googlesource.com/19805
Reviewed-by: Minux Ma <minux@golang.org>
2016-05-01 02:39:50 +00:00
Cherry Zhang 073d292c45 cmd/link, runtime: add external linking support for linux/mips64x
Fixes #12560

Change-Id: Ic2004fc7b09f2dbbf83c41f8c6307757c0e1676d
Reviewed-on: https://go-review.googlesource.com/19803
Reviewed-by: Minux Ma <minux@golang.org>
2016-05-01 02:38:37 +00:00
Cherry Zhang 981395103e cmd/internal/obj/mips et al.: introduce SB register on mips64x
The SB register (R28) is introduced for accessing external addresses with shorter
instruction sequences. It is loaded at entry points. External data within
2G of SB can be accessed this way.

cmd/internal/obj: relocation R_ADDRMIPS is split into two relocations
R_ADDRMIPS and R_ADDRMIPSU, handling the low 16 bits and the "upper" 16
bits of external addresses, respectively, since the instructions may not
be adjacent. It might be better if relocation Variant could be used.

cmd/link/internal/mips64: support new relocations.

cmd/compile/internal/mips64: reserve SB register.

runtime: initialize SB register at entry points.

Change-Id: I5f34868f88c5a9698c042a8a1f12f76806c187b9
Reviewed-on: https://go-review.googlesource.com/19802
Reviewed-by: Minux Ma <minux@golang.org>
2016-05-01 02:36:46 +00:00
Cherry Zhang a409fb80b0 cmd/internal/obj/mips, runtime: change REGTMP to R23
Leave R28 to SB register, which will be introduced in CL 19802.

Change-Id: I1cf7a789695c5de664267ec8086bfb0b043ebc14
Reviewed-on: https://go-review.googlesource.com/19863
Reviewed-by: Minux Ma <minux@golang.org>
2016-05-01 02:36:28 +00:00
Austin Clements a20fd1f6ba runtime: reclaim scan/dead bit in first word
With the switch to separate mark bitmaps, the scan/dead bit for the
first word of each object is now unused. Reclaim this bit and use it
as a scan/dead bit, just like words three and on. The second word is
still used for checkmark.

This dramatically simplifies heapBitsSetTypeNoScan and hasPointers,
since they no longer need different cases for 1, 2, and 3+ word
objects. They can instead just manipulate the heap bitmap for the
first word and be done with it.

In order to enable this, we change heapBitsSetType and runGCProg to
always set the scan/dead bit to scan for the first word on every code
path. Since these functions only apply to types that have pointers,
there's no need to do this conditionally: it's *always* necessary to
set the scan bit in the first word.

We also change every place that scans an object and checks if there
are more pointers. Rather than only checking morePointers if the word
is >= 2, we now check morePointers if word != 1 (since that's the
checkmark word).

Looking forward, we should probably reclaim the checkmark bit, too,
but that's going to be quite a bit more work.

Tested by setting doubleCheck in heapBitsSetType and running all.bash
on both linux/amd64 and linux/386, and by running GOGC=10 all.bash.

This particularly improves the FmtFprintf* go1 benchmarks, since they
do a large amount of noscan allocation.

name                      old time/op    new time/op    delta
BinaryTree17-12              2.34s ± 1%     2.38s ± 1%  +1.70%  (p=0.000 n=17+19)
Fannkuch11-12                2.09s ± 0%     2.09s ± 1%    ~     (p=0.276 n=17+16)
FmtFprintfEmpty-12          44.9ns ± 2%    44.8ns ± 2%    ~     (p=0.340 n=19+18)
FmtFprintfString-12          127ns ± 0%     125ns ± 0%  -1.57%  (p=0.000 n=16+15)
FmtFprintfInt-12             128ns ± 0%     122ns ± 1%  -4.45%  (p=0.000 n=15+20)
FmtFprintfIntInt-12          207ns ± 1%     193ns ± 0%  -6.55%  (p=0.000 n=19+14)
FmtFprintfPrefixedInt-12     197ns ± 1%     191ns ± 0%  -2.93%  (p=0.000 n=17+18)
FmtFprintfFloat-12           263ns ± 0%     248ns ± 1%  -5.88%  (p=0.000 n=15+19)
FmtManyArgs-12               794ns ± 0%     779ns ± 1%  -1.90%  (p=0.000 n=18+18)
GobDecode-12                7.14ms ± 2%    7.11ms ± 1%    ~     (p=0.072 n=20+20)
GobEncode-12                5.85ms ± 1%    5.82ms ± 1%  -0.49%  (p=0.000 n=20+20)
Gzip-12                      218ms ± 1%     215ms ± 1%  -1.22%  (p=0.000 n=19+19)
Gunzip-12                   36.8ms ± 0%    36.7ms ± 0%  -0.18%  (p=0.006 n=18+20)
HTTPClientServer-12         77.1µs ± 4%    77.1µs ± 3%    ~     (p=0.945 n=19+20)
JSONEncode-12               15.6ms ± 1%    15.9ms ± 1%  +1.68%  (p=0.000 n=18+20)
JSONDecode-12               55.2ms ± 1%    53.6ms ± 1%  -2.93%  (p=0.000 n=17+19)
Mandelbrot200-12            4.05ms ± 1%    4.05ms ± 0%    ~     (p=0.306 n=17+17)
GoParse-12                  3.14ms ± 1%    3.10ms ± 1%  -1.31%  (p=0.000 n=19+18)
RegexpMatchEasy0_32-12      69.3ns ± 1%    70.0ns ± 0%  +0.89%  (p=0.000 n=19+17)
RegexpMatchEasy0_1K-12       237ns ± 1%     236ns ± 0%  -0.62%  (p=0.000 n=19+16)
RegexpMatchEasy1_32-12      69.5ns ± 1%    70.3ns ± 1%  +1.14%  (p=0.000 n=18+17)
RegexpMatchEasy1_1K-12       377ns ± 1%     366ns ± 1%  -3.03%  (p=0.000 n=15+19)
RegexpMatchMedium_32-12      107ns ± 1%     107ns ± 2%    ~     (p=0.318 n=20+19)
RegexpMatchMedium_1K-12     33.8µs ± 3%    33.5µs ± 1%  -1.04%  (p=0.001 n=20+19)
RegexpMatchHard_32-12       1.68µs ± 1%    1.73µs ± 0%  +2.50%  (p=0.000 n=20+18)
RegexpMatchHard_1K-12       50.8µs ± 1%    52.0µs ± 1%  +2.50%  (p=0.000 n=19+18)
Revcomp-12                   381ms ± 1%     385ms ± 1%  +1.00%  (p=0.000 n=17+18)
Template-12                 64.9ms ± 3%    62.6ms ± 1%  -3.55%  (p=0.000 n=19+18)
TimeParse-12                 324ns ± 0%     328ns ± 1%  +1.25%  (p=0.000 n=18+18)
TimeFormat-12                345ns ± 0%     334ns ± 0%  -3.31%  (p=0.000 n=15+17)
[Geo mean]                  52.1µs         51.5µs       -1.00%

Change-Id: I13e74da3193a7f80794c654f944d1f0d60817049
Reviewed-on: https://go-review.googlesource.com/22632
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-30 16:49:54 +00:00
Austin Clements d5e3d08b3a runtime: use morePointers and isPointer in more places
This makes this code better self-documenting and makes it easier to
find these places in the future.

Change-Id: I31dc5598ae67f937fb9ef26df92fd41d01e983c3
Reviewed-on: https://go-review.googlesource.com/22631
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-30 16:49:50 +00:00
Austin Clements a5d3f7ece9 runtime: avoid conditional execution in morePointers and isPointer
heapBits.bits is carefully written to produce good machine code. Use
it in heapBits.morePointers and heapBits.isPointer to get good machine
code there, too.

Change-Id: I208c7d0d38697e7a22cad67f692162589b75f1e2
Reviewed-on: https://go-review.googlesource.com/22630
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-30 16:49:47 +00:00
Michael Munday 58f52cbb79 runtime: fix cgocallback_gofunc on ppc64x
Fix issues introduced in 5f9a870.

Change-Id: Ia75945ef563956613bf88bbe57800a96455c265d
Reviewed-on: https://go-review.googlesource.com/22661
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-04-30 03:49:22 +00:00
Ian Lance Taylor 9fe572e509 runtime: fix cgocallback_gofunc argument passing on arm64
Change-Id: I4b34bcd5cde71ecfbb352b39c4231de6168cc7f3
Reviewed-on: https://go-review.googlesource.com/22651
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
2016-04-29 23:10:52 +00:00
Ian Lance Taylor 5f9a870bf1 cmd/cgo, runtime, runtime/cgo: use cgo context function
Add support for the context function set by runtime.SetCgoTraceback.
The context function was added in CL 17761, without support.
This CL is the support.

This CL has not been tested for real C code, as a working context
function for C code requires unwind support that does not seem to exist.
I wanted to get the CL out before the freeze.

I apologize for the length of this CL.  It's mostly plumbing, but
unfortunately the plumbing is processor-specific.

Change-Id: I8ce11a0de9b3dafcc29efd2649d776e93bff0e90
Reviewed-on: https://go-review.googlesource.com/22508
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-29 22:07:36 +00:00
Rick Hudson 56b5491262 Merge remote-tracking branch 'origin/dev.garbage'
This commit moves the GC from free list allocation to
bit mark allocation. Instead of using the bitmaps
generated during the mark phases to generate free
list and then using the free lists for allocation we
allocate directly from the bitmaps.

The change in the garbage benchmark

name              old time/op  new time/op  delta
XBenchGarbage-12  2.22ms ± 1%  2.13ms ± 1%  -3.90%  (p=0.000 n=18+18)

Change-Id: I17f57233336f0ca5ef5404c3be4ecb443ab622aa
2016-04-29 13:56:44 -04:00
Rick Hudson e9eaa181fc [dev.garbage] runtime: simplify nextFreeFast so it is inlined
nextFreeFast is currently not inlined by the compiler due
to its size and complexity. This CL simplifies
nextFreeFast by letting the slow path (nextFree)
handle the corner cases.

Change-Id: Ia9c5d1a7912bcb4bec072f5fd240f0e0bafb20e4
Reviewed-on: https://go-review.googlesource.com/22598
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
2016-04-29 16:47:11 +00:00
Austin Clements b3579c095e [dev.garbage] runtime: revive sweep fast path
sweep used to skip mcental.freeSpan (and its locking) if it didn't
find any new free objects. We lost that optimization when the
freed-object counting changed in dad83f7 to count total free objects
instead of newly freed objects.

The previous commit brings back counting of newly freed objects, so we
can easily revive this optimization by checking that count (like we
used to) instead of the total free objects count.

Change-Id: I43658707a1c61674d0366124d5976b00d98741a9
Reviewed-on: https://go-review.googlesource.com/22596
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-04-29 15:25:28 +00:00
Austin Clements d97625ae9e [dev.garbage] runtime: fix nfree accounting
Commit 8dda1c4 changed the meaning of "nfree" in sweep from the number
of newly freed objects to the total number of free objects in the
span, but didn't update where sweep added nfree to c.local_nsmallfree.
Hence, we're over-accounting the number of frees. This is causing
TestArrayHash to fail with "too many allocs NNN - hash not balanced".

Fix this by computing the number of newly freed objects and adding
that to c.local_nsmallfree, so it behaves like it used to. Computing
this requires a small tweak to mallocgc: apparently we've never set
s.allocCount when allocating a large object; fix this by setting it to
1 so sweep doesn't get confused.
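
In toy form, the accounting now looks roughly like this (field names are illustrative, not the runtime's):

```go
package main

import "fmt"

// span is a toy with just the counter the accounting cares about.
type span struct {
	allocCount int // objects allocated as of the previous sweep
}

// sweep is given the number of objects still marked (live) and returns how
// many were *newly* freed this cycle, which is what gets added to
// local_nsmallfree instead of the total free count.
func sweep(s *span, marked int) (newlyFreed int) {
	newlyFreed = s.allocCount - marked
	s.allocCount = marked // survivors become the new allocation count
	return newlyFreed
}

func main() {
	s := &span{allocCount: 10}
	fmt.Println("newly freed:", sweep(s, 7)) // prints: newly freed: 3
}
```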

Change-Id: I31902ffd310110da4ffd807c5c06f1117b872dc8
Reviewed-on: https://go-review.googlesource.com/22595
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
2016-04-29 15:25:26 +00:00
Austin Clements 6d11490539 [dev.garbage] runtime: fix allocfreetrace
We broke tracing of freed objects in GODEBUG=allocfreetrace=1 mode
when we removed the sweep over the mark bitmap. Fix it by
re-introducing the sweep over the bitmap specifically if we're in
allocfreetrace mode. This doesn't have to be even remotely efficient,
since the overhead of allocfreetrace is huge anyway, so we can keep
the code for this down to just a few lines.

Change-Id: I9e176b3b04c73608a0ea3068d5d0cd30760ebd40
Reviewed-on: https://go-review.googlesource.com/22592
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-04-29 15:08:21 +00:00
Austin Clements 38f674687a [dev.garbage] runtime: reintroduce no-zeroing optimization
Currently we always zero objects when we allocate them. We used to
have an optimization that would not zero objects that had not been
allocated since the whole span was last zeroed (either by getting it
from the system or by getting it from the heap, which does a bulk
zero), but this depended on the sweeper clobbering the first two words
of each object. Hence, we lost this optimization when the bitmap
sweeper went away.

Re-introduce this optimization using a different mechanism. Each span
already keeps a flag indicating that it just came from the OS or was
just bulk zeroed by the mheap. We can simply use this flag to know
when we don't need to zero an object. This is slightly less efficient
than the old optimization: if a span gets allocated and partially
used, then GC happens and the span gets returned to the mcentral, then
the span gets re-acquired, then the old optimization knew that it only had
to re-zero the objects that had been reclaimed, whereas this
optimization will re-zero everything. However, in this case, you're
already paying for the garbage collection, and you've only wasted one
zeroing of the span, so in practice there seems to be little
difference. (If we did want to revive the full optimization, each span
could keep track of a frontier beyond which all free slots are zeroed.
I prototyped this and it didn't obviously do any better than the much
simpler approach in this commit.)
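
A toy span illustrating the flag described above (names like needZero are illustrative; the real runtime tracks this per span when the memory comes from the OS or is bulk zeroed by the heap):

```go
package main

import "fmt"

// span hands out bytes from a fixed block of memory.
type span struct {
	needZero bool // false right after the memory was freshly zeroed in bulk
	mem      []byte
	next     int
}

// alloc hands out the next size bytes, zeroing them only when necessary.
func (s *span) alloc(size int) []byte {
	p := s.mem[s.next : s.next+size]
	if s.needZero {
		for i := range p {
			p[i] = 0 // pay for zeroing only when stale data may be present
		}
	}
	s.next += size
	return p
}

func main() {
	s := &span{needZero: false, mem: make([]byte, 64)}
	fmt.Println(len(s.alloc(16)), "bytes handed out without re-zeroing")
}
```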

This significantly improves BinaryTree17, which is allocation-heavy
(and runs first, so most pages are already zeroed), and slightly
improves everything else.

name              old time/op  new time/op  delta
XBenchGarbage-12  2.15ms ± 1%  2.14ms ± 1%  -0.80%  (p=0.000 n=17+17)

name                      old time/op    new time/op    delta
BinaryTree17-12              2.71s ± 1%     2.56s ± 1%  -5.73%        (p=0.000 n=18+19)
DivconstI64-12              1.70ns ± 1%    1.70ns ± 1%    ~           (p=0.562 n=18+18)
DivconstU64-12              1.74ns ± 2%    1.74ns ± 1%    ~           (p=0.394 n=20+20)
DivconstI32-12              1.74ns ± 0%    1.74ns ± 0%    ~     (all samples are equal)
DivconstU32-12              1.66ns ± 1%    1.66ns ± 0%    ~           (p=0.516 n=15+16)
DivconstI16-12              1.84ns ± 0%    1.84ns ± 0%    ~     (all samples are equal)
DivconstU16-12              1.82ns ± 0%    1.82ns ± 0%    ~     (all samples are equal)
DivconstI8-12               1.79ns ± 0%    1.79ns ± 0%    ~     (all samples are equal)
DivconstU8-12               1.60ns ± 0%    1.60ns ± 1%    ~           (p=0.603 n=17+19)
Fannkuch11-12                2.11s ± 1%     2.11s ± 0%    ~           (p=0.333 n=16+19)
FmtFprintfEmpty-12          45.1ns ± 4%    45.4ns ± 5%    ~           (p=0.111 n=20+20)
FmtFprintfString-12          134ns ± 0%     129ns ± 0%  -3.45%        (p=0.000 n=18+16)
FmtFprintfInt-12             131ns ± 1%     129ns ± 1%  -1.54%        (p=0.000 n=16+18)
FmtFprintfIntInt-12          205ns ± 2%     203ns ± 0%  -0.56%        (p=0.014 n=20+18)
FmtFprintfPrefixedInt-12     200ns ± 2%     197ns ± 1%  -1.48%        (p=0.000 n=20+18)
FmtFprintfFloat-12           256ns ± 1%     256ns ± 0%  -0.21%        (p=0.008 n=18+20)
FmtManyArgs-12               805ns ± 0%     804ns ± 0%  -0.19%        (p=0.001 n=18+18)
GobDecode-12                7.21ms ± 1%    7.14ms ± 1%  -0.92%        (p=0.000 n=19+20)
GobEncode-12                5.88ms ± 1%    5.88ms ± 1%    ~           (p=0.641 n=18+19)
Gzip-12                      218ms ± 1%     218ms ± 1%    ~           (p=0.271 n=19+18)
Gunzip-12                   37.1ms ± 0%    36.9ms ± 0%  -0.29%        (p=0.000 n=18+17)
HTTPClientServer-12         78.1µs ± 2%    77.4µs ± 2%    ~           (p=0.070 n=19+19)
JSONEncode-12               15.5ms ± 1%    15.5ms ± 0%    ~           (p=0.063 n=20+18)
JSONDecode-12               56.1ms ± 0%    55.4ms ± 1%  -1.18%        (p=0.000 n=19+18)
Mandelbrot200-12            4.05ms ± 0%    4.06ms ± 0%  +0.29%        (p=0.001 n=18+18)
GoParse-12                  3.28ms ± 1%    3.21ms ± 1%  -2.30%        (p=0.000 n=20+20)
RegexpMatchEasy0_32-12      69.4ns ± 2%    69.3ns ± 1%    ~           (p=0.205 n=18+16)
RegexpMatchEasy0_1K-12       239ns ± 0%     239ns ± 0%    ~     (all samples are equal)
RegexpMatchEasy1_32-12      69.4ns ± 1%    69.4ns ± 1%    ~           (p=0.620 n=15+18)
RegexpMatchEasy1_1K-12       370ns ± 1%     369ns ± 2%    ~           (p=0.088 n=20+20)
RegexpMatchMedium_32-12      108ns ± 0%     108ns ± 0%    ~     (all samples are equal)
RegexpMatchMedium_1K-12     33.6µs ± 3%    33.5µs ± 3%    ~           (p=0.718 n=20+20)
RegexpMatchHard_32-12       1.68µs ± 1%    1.67µs ± 2%    ~           (p=0.316 n=20+20)
RegexpMatchHard_1K-12       50.5µs ± 3%    50.4µs ± 3%    ~           (p=0.659 n=20+20)
Revcomp-12                   381ms ± 1%     381ms ± 1%    ~           (p=0.916 n=19+18)
Template-12                 66.5ms ± 1%    65.8ms ± 2%  -1.08%        (p=0.000 n=20+20)
TimeParse-12                 317ns ± 0%     319ns ± 0%  +0.48%        (p=0.000 n=19+12)
TimeFormat-12                338ns ± 0%     338ns ± 0%    ~           (p=0.124 n=19+18)
[Geo mean]                  5.99µs         5.96µs       -0.54%

Change-Id: I638ffd9d9f178835bbfa499bac20bd7224f1a907
Reviewed-on: https://go-review.googlesource.com/22591
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-04-29 15:08:13 +00:00
Austin Clements 3e2462387f [dev.garbage] runtime: eliminate mspan.start
This converts all remaining uses of mspan.start to instead use
mspan.base(). In many cases, this actually reduces the complexity of
the code.

Change-Id: If113840e00d3345a6cf979637f6a152e6344aee7
Reviewed-on: https://go-review.googlesource.com/22590
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
2016-04-29 03:53:17 +00:00
Austin Clements b7adc41fba [dev.garbage] runtime: use s.base() everywhere it makes sense
Currently we have lots of (s.start << _PageShift) and variants. We now
have an s.base() function that returns this. It's faster and more
readable, so use it.

Change-Id: I888060a9dae15ea75ca8cc1c2b31c905e71b452b
Reviewed-on: https://go-review.googlesource.com/22559
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
2016-04-29 03:53:14 +00:00
Austin Clements 2e8b74b695 [dev.garbage] runtime: document sysAlloc
In particular, it always returns an aligned pointer.

Change-Id: I763789a539a4bfd8b0efb36a39a80be1a479d3e2
Reviewed-on: https://go-review.googlesource.com/22558
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-29 03:53:12 +00:00
Austin Clements 15744c92de [dev.garbage] runtime: remove unused head/end arguments from freeSpan
These used to be used for the list of newly freed objects, but that's
no longer a thing.

Change-Id: I5a4503137b74ec0eae5372ca271b1aa0b32df074
Reviewed-on: https://go-review.googlesource.com/22557
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-29 03:53:08 +00:00
Rick Hudson 2fb75ea6c6 [dev.garbage] runtime: use sys.Ctz64 intrinsic
Our compiler now provides intrinsics including
sys.Ctz64 that support CTZ (count trailing zero)
instructions. This CL replaces the Go versions
of CTZ with the compiler intrinsic.

Count trailing zeros CTZ finds the least
significant 1 in a word and returns the number
of less significant 0s in the word.

Allocation uses the bitmap created by the garbage
collector to locate an unmarked object. The logic
takes a word of the bitmap, complements, and then
caches it. It then uses CTZ to locate an available
unmarked object. It then shifts marked bits out of
the bitmap word preparing it for the next search.
Once all the unmarked objects in the cached word
have been used, another bitmap word is fetched and
the process repeats.
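
A runnable toy of that scheme, using math/bits rather than sys.Ctz64 (names and layout are illustrative):

```go
package main

import (
	"fmt"
	"math/bits"
)

// nextFree scans a cached word in which a 1 bit means "free" (the complement
// of the mark bits). It returns the next free object index and shifts the
// consumed bits out so the following call continues the scan.
func nextFree(cache *uint64, freeindex *uint64) (idx uint64, ok bool) {
	if *cache == 0 {
		return 0, false // cache exhausted; the real code refills from the bitmap
	}
	tz := uint64(bits.TrailingZeros64(*cache)) // count trailing zeros
	idx = *freeindex + tz
	*cache >>= tz + 1 // drop the bit we just consumed
	*freeindex = idx + 1
	return idx, true
}

func main() {
	marks := uint64(0xB5)  // 1011_0101: 1 = marked (in use)
	cache := ^marks & 0xFF // complement within an 8-object span: 1 = free
	var freeindex uint64
	for {
		idx, ok := nextFree(&cache, &freeindex)
		if !ok {
			break
		}
		fmt.Println("allocate object", idx) // prints 1, 3, 6
	}
}
```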

Change-Id: Id2fc42d1d4b9893efaa2e1bd01896985b7e42f82
Reviewed-on: https://go-review.googlesource.com/21366
Reviewed-by: Austin Clements <austin@google.com>
2016-04-29 00:00:50 +00:00
Rick Hudson 2063d5d903 [dev.garbage] runtime: restructure alloc and mark bits
Two changes are included here that are dependent on each other.
The first is that allocBits and gcmarkBits are changed to
a *uint8 which points to the first byte of that span's
mark and alloc bits. Several places were altered to
perform pointer arithmetic to locate the byte corresponding
to an object in the span. The actual bit corresponding
to an object is indexed in the byte by using the lower three
bits of the object's index.
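
A toy of the byte/bit addressing just described (a byte slice stands in for the raw *uint8 base pointer):

```go
package main

import "fmt"

// markBit maps an object index to the byte holding its bit and the mask for
// that bit, mirroring the "byte pointer plus lower three bits" scheme above.
func markBit(objIndex uint) (byteIdx uint, mask uint8) {
	return objIndex / 8, 1 << (objIndex % 8)
}

func main() {
	bits := make([]uint8, 4) // room for 32 objects
	b, m := markBit(11)
	bits[b] |= m                             // mark object 11
	fmt.Printf("byte %d, mask %#x\n", b, m) // byte 1, mask 0x8
}
```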

The second change avoids the redundant calculation of an
object's index. The index is returned from heapBitsForObject
and then used by the functions indexing allocBits
and gcmarkBits.

Finally we no longer allocate the gc bits in the span
structures. Instead we use an arena based allocation scheme
that allows for a more compact bit map as well as recycling
and bulk clearing of the mark bits.

Change-Id: If4d04b2021c092ec39a4caef5937a8182c64dfef
Reviewed-on: https://go-review.googlesource.com/20705
Reviewed-by: Austin Clements <austin@google.com>
2016-04-29 00:00:47 +00:00
Mikio Hara be730b49ca runtime: drop _SigUnblock for SIGSYS on Linux
The _SigUnblock flag was appended to SIGSYS slot of runtime signal table
for Linux in https://go-review.googlesource.com/22202, but there is
still no concrete opinion on whether SIGSYS must be an unblocked signal
for runtime.

This change removes _SigUnblock flag from SIGSYS on Linux for
consistency in runtime signal handling and adds a reference to #15204 to
runtime signal table for FreeBSD.

Updates #15204.

Change-Id: I42992b1d852c2ab5dd37d6dbb481dba46929f665
Reviewed-on: https://go-review.googlesource.com/22537
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-04-28 21:48:44 +00:00
Rick Hudson 23aeb34df1 [dev.garbage] Merge remote-tracking branch 'origin/master' into HEAD
Change-Id: I282fd9ce9db435dfd35e882a9502ab1abc185297
2016-04-27 18:46:52 -04:00
Brad Fitzpatrick 06d639e075 runtime: fix SetCgoTraceback doc indentation
It wasn't rendering as HTML nicely.

Change-Id: I5408ec22932a05e85c210c0faa434bd19dce5650
Reviewed-on: https://go-review.googlesource.com/22532
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-04-27 22:12:01 +00:00
Rick Hudson 1354b32cd7 [dev.garbage] runtime: add gc work buffer tryGet and put fast paths
The complexity of the GC work buffers put and tryGet
prevented them from being inlined. This CL simplifies
the fast path, thus enabling inlining. If the fast
path does not succeed, the previous put and tryGet
functions are called.
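
The shape of the change, in toy form (not the runtime's actual work-buffer types):

```go
package sketch

// workbuf is a toy stand-in for a GC work buffer.
type workbuf struct{ items []uintptr }

// tryGetFast is deliberately tiny so the compiler can inline it at call sites;
// callers fall back to a slower, non-inlined tryGet when it returns ok=false.
func (w *workbuf) tryGetFast() (p uintptr, ok bool) {
	n := len(w.items)
	if n == 0 {
		return 0, false
	}
	p = w.items[n-1]
	w.items = w.items[:n-1]
	return p, true
}
```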

Change-Id: I6da6495d0dadf42bd0377c110b502274cc01acf5
Reviewed-on: https://go-review.googlesource.com/20704
Reviewed-by: Austin Clements <austin@google.com>
2016-04-27 21:55:02 +00:00
Rick Hudson f8d0d4fd59 [dev.garbage] runtime: cleanup and optimize span.base()
Prior to this CL the base of a span was calculated in various
places using shifts or calls to base(). This CL now
always calls base() which has been optimized to calculate the
base of the span when the span is initialized and store that
value in the span structure.
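
Roughly, as sketched from the description (field names and the page shift are assumptions):

```go
package sketch

const pageShift = 13 // assumed page-size shift for the sketch

// mspan is a toy span holding only what the sketch needs.
type mspan struct {
	startAddr uintptr // base address, computed once at init
}

// initSpan computes the base a single time and caches it in the span.
func (s *mspan) initSpan(startPage uintptr) {
	s.startAddr = startPage << pageShift
}

// base is now a trivial field read instead of a shift at every call site.
func (s *mspan) base() uintptr { return s.startAddr }
```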

Change-Id: I661f2bfa21e3748a249cdf049ef9062db6e78100
Reviewed-on: https://go-review.googlesource.com/20703
Reviewed-by: Austin Clements <austin@google.com>
2016-04-27 21:54:59 +00:00
Rick Hudson 8dda1c4c08 [dev.garbage] runtime: remove heapBitsSweepSpan
Prior to this CL the sweep phase was responsible for locating
all objects that were about to be freed and calling a function
to process the object. This was done by the function
heapBitsSweepSpan. Part of processing included calls to
tracefree and msanfree as well as counting how many objects
were freed.

The calls to tracefree and msanfree have been moved into the
mallocgc routine and called when the object is about to be
reallocated. The counting of free objects has been optimized
using an array based popcnt algorithm and if all the objects
in a span are free then the span is freed.

Similarly the code to locate the next free object has been
optimized to use an array based ctz (count trailing zero).
Various hot paths in the allocation logic have been optimized.

At this point the garbage benchmark is within 3% of the 1.6
release.

Change-Id: I00643c442e2ada1685c010c3447e4ea8537d2dfa
Reviewed-on: https://go-review.googlesource.com/20201
Reviewed-by: Austin Clements <austin@google.com>
2016-04-27 21:54:57 +00:00
Rick Hudson 4093481523 [dev.garbage] runtime: add bit and cache ctz64 (count trailing zero)
Add to each span a 64 bit cache (allocCache) of the allocBits
at freeindex. allocCache is shifted such that the lowest bit
corresponds to the bit at freeindex. allocBits uses a 0 to
indicate an object is free, on the other hand allocCache
uses a 1 to indicate an object is free. This facilitates
ctz64 (count trailing zero) which counts the number of 0s
trailing the least significant 1. This is also the index of
the least significant 1.

Each span maintains a freeindex indicating the boundary
between allocated objects and unallocated objects. allocCache
is shifted as freeindex is incremented such that the low bit
in allocCache corresponds to the bit at freeindex in the
allocBits array.

Currently ctz64 is written in Go using a for loop so it is
not very efficient. Use of the hardware instruction will
follow. With this in mind comparisons of the garbage
benchmark are as follows.

1.6 release        2.8 seconds
dev:garbage branch 3.1 seconds.

Profiling shows the Go implementation of ctz64 takes up
1% of the total time.
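
The for-loop ctz64 and the way a span might consume allocCache look
roughly like this sketch (field names are illustrative; the real refill
logic is omitted):

    // ctz64 is the simple loop form described above; a hardware instruction
    // will replace it later.
    func ctz64(x uint64) int {
        if x == 0 {
            return 64
        }
        n := 0
        for x&1 == 0 {
            x >>= 1
            n++
        }
        return n
    }

    type span struct {
        freeindex  uintptr
        nelems     uintptr
        allocCache uint64 // shifted so bit 0 corresponds to freeindex; 1 means free
    }

    // nextFreeIndex finds the next free object at or after freeindex.
    func (s *span) nextFreeIndex() uintptr {
        if s.allocCache == 0 {
            return s.nelems // cache exhausted; the real code refills from allocBits
        }
        bit := uintptr(ctz64(s.allocCache))
        idx := s.freeindex + bit
        if idx >= s.nelems {
            return s.nelems // span is full
        }
        s.allocCache >>= uint(bit + 1)
        s.freeindex = idx + 1
        return idx
    }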

Change-Id: If084ed9c3b1eda9f3c6ab2e794625cb870b8167f
Reviewed-on: https://go-review.googlesource.com/20200
Reviewed-by: Austin Clements <austin@google.com>
2016-04-27 21:54:54 +00:00
Rick Hudson 44fe90d0b3 [dev.garbage] runtime: logic that uses count trailing zero (ctz)
Most (all?) processors that Go supports supply a hardware
instruction that takes a byte and returns the number
of zeros trailing the first 1 encountered, or 8
if no ones are found. This is the index within the
byte of the first 1 encountered. CTZ should improve the
performance of the nextFreeIndex function.

Since nextFreeIndex wants the next unmarked (0) bit,
a bit-wise complement is needed before calling ctz.
Furthermore, unmarked bits associated with previously
allocated objects need to be ignored. Instead of writing
a 1 as we allocate, the code masks off all bits less than
freeindex after loading the byte.

While this CL does not actually execute a CTZ instruction,
it supplies a ctz function with the appropriate signature
along with the logic to use it.
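
At the byte level, the complement-and-mask step might look like this
sketch (names invented for illustration):

    // nextFreeInByte returns the index of the first free (unmarked) object at
    // or after freeindexInByte within one byte of mark bits, or 8 if none.
    func nextFreeInByte(markByte uint8, freeindexInByte uint) int {
        free := ^markByte                    // complement: 1 now means "free"
        free &= ^uint8(0) << freeindexInByte // ignore bits below freeindex
        if free == 0 {
            return 8
        }
        return ctz8(free)
    }

    // ctz8 is a placeholder for the eventual hardware CTZ.
    func ctz8(x uint8) int {
        n := 0
        for x&1 == 0 {
            x >>= 1
            n++
        }
        return n
    }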

Change-Id: I5c55ce0ed48ca22c21c4dd9f969b0819b4eadaa7
Reviewed-on: https://go-review.googlesource.com/20169
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-04-27 21:54:52 +00:00
Rick Hudson e4ac2d4acc [dev.garbage] runtime: replace ref with allocCount
This is a renaming of the field ref to the
more appropriate allocCount. The field
holds the number of objects in the span
that are currently allocated. Some throw
strings were adjusted to more accurately
convey the meaning of allocCount.

Change-Id: I10daf44e3e9cc24a10912638c7de3c1984ef8efe
Reviewed-on: https://go-review.googlesource.com/19518
Reviewed-by: Austin Clements <austin@google.com>
2016-04-27 21:54:49 +00:00
Rick Hudson 3479b065d4 [dev.garbage] runtime: allocate directly from GC mark bits
Instead of building a freelist from the mark bits generated
by the GC, this CL allocates directly from the mark bits.

The approach moves the mark bits from the pointer/no pointer
heap structures into their own per span data structures. The
mark/allocation vectors consist of a single mark bit per
object. Two vectors are maintained, one for allocation and
one for the GC's mark phase. During the GC cycle's sweep
phase the interpretation of the vectors is swapped. The
mark vector becomes the allocation vector and the old
allocation vector is cleared and becomes the mark vector that
the next GC cycle will use.

Marked entries in the allocation vector indicate that the
object is not free. Each allocation vector maintains a boundary
between areas of the span already allocated from and areas
not yet allocated from. As objects are allocated this boundary
is moved until it reaches the end of the span. At this point
further allocations will be done from another span.

Since we no longer sweep a span inspecting each freed object,
maintaining the pointer/scalar bits in the heap bitmap is now
the responsibility of the routines doing the actual allocation.

This CL is functionally complete and ready for performance
tuning.
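
The vector swap at the heart of this change can be pictured with a sketch
like the following (field names assumed):

    type span struct {
        allocBits  []uint8 // 1 = object in use; consulted by the allocator
        gcmarkBits []uint8 // set by the GC mark phase
        freeindex  uintptr
    }

    // finishSweep swaps the roles of the vectors: objects marked live this
    // cycle are exactly the allocated objects for the next cycle.
    func (s *span) finishSweep() {
        s.allocBits = s.gcmarkBits
        s.gcmarkBits = make([]uint8, len(s.allocBits)) // cleared mark vector for the next cycle
        s.freeindex = 0
    }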

Change-Id: I336e0fc21eef1066e0b68c7067cc71b9f3d50e04
Reviewed-on: https://go-review.googlesource.com/19470
Reviewed-by: Austin Clements <austin@google.com>
2016-04-27 21:54:47 +00:00
Rick Hudson dc65a82eff [dev.garbage] runtime: mark/allocation helper functions
gcmarkBits is a bit vector used by the GC to mark
reachable objects. Once a GC cycle is complete, gcmarkBits
swaps places with allocBits. allocBits is then used directly
by malloc to locate free objects, thus avoiding the
construction of a linked free list. This CL introduces a set
of helper functions for manipulating gcmarkBits and allocBits
that will be used by later CLs to realize the actual
algorithm. Minimal attempts have been made to optimize these
helper routines.
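
The helpers are small bit-addressing utilities; a representative sketch
(not the exact runtime signatures):

    type markBits struct {
        bytep *uint8
        mask  uint8
    }

    // markBitsForIndex addresses the bit for object i in a mark/alloc vector.
    func markBitsForIndex(bits []uint8, i uintptr) markBits {
        return markBits{&bits[i/8], 1 << (i % 8)}
    }

    func (m markBits) isMarked() bool { return *m.bytep&m.mask != 0 }
    func (m markBits) setMarked()     { *m.bytep |= m.mask }
    func (m markBits) clearMarked()   { *m.bytep &^= m.mask }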

Change-Id: I55ad6240ca32cd456e8ed4973c6970b3b882dd34
Reviewed-on: https://go-review.googlesource.com/19420
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Rick Hudson <rlh@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-27 21:54:44 +00:00
Rick Hudson e1c4e9a754 [dev.garbage] runtime: refactor next free object
In preparation for changing how the next free object is chosen
refactor and consolidate code into a single function.

Change-Id: I6836cd88ed7cbf0b2df87abd7c1c3b9fabc1cbd8
Reviewed-on: https://go-review.googlesource.com/19317
Reviewed-by: Austin Clements <austin@google.com>
2016-04-27 21:54:41 +00:00
Rick Hudson aed861038f [dev.garbage] runtime: add stackfreelist
The freelist for normal objects and the freelist
for stacks share the same mspan field for holding
the list head but are operated on by different code
sequences. This overloading complicates the use of bit
vectors for allocation of normal objects. This change
moves the stack free list into its own field, stackfreelist,
separate from freelist.

Change-Id: I5b155b5b8a1fcd8e24c12ee1eb0800ad9b6b4fa0
Reviewed-on: https://go-review.googlesource.com/19315
Reviewed-by: Austin Clements <austin@google.com>
2016-04-27 21:54:39 +00:00
Rick Hudson 2ac8bdc52a [dev.garbage] runtime: bitmap allocation data structs
This CL adds prototypes of the bitmap allocation data
structures. Before this is released, these underlying data
structures need to be made more performant, but the signatures
of the helper functions utilizing them will remain stable.

Change-Id: I5ace12f2fb512a7038a52bbde2bfb7e98783bcbe
Reviewed-on: https://go-review.googlesource.com/19221
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-27 21:54:35 +00:00
Austin Clements b49b71ae19 runtime: don't rescan globals
Currently the runtime rescans globals during mark 2 and mark
termination. This costs as much as 500µs/MB in STW time, which is
enough to surpass the 10ms STW limit with only 20MB of globals.

It's also basically unnecessary. The compiler already generates write
barriers for global -> heap pointer updates and the regular write
barrier doesn't check whether the slot is a global or in the heap.
Some less common write barriers do cause problems.
heapBitsBulkBarrier, which is used by typedmemmove and related
functions, currently depends on having access to the pointer bitmap
and as a result ignores writes to globals. Likewise, the
reflect-related write barriers reflect_typedmemmovepartial and
callwritebarrier ignore non-heap destinations; though it appears they
can never be called with global pointers anyway.

This commit makes heapBitsBulkBarrier issue write barriers for writes
to global pointers using the data and BSS pointer bitmaps, removes the
inheap checks from the reflection write barriers, and eliminates the
rescans during mark 2 and mark termination. It also adds a test that
writes to globals have write barriers.

Programs with large data+BSS segments (with pointers) aren't common,
but for programs that do have large data+BSS segments, this
significantly reduces pause time:

name \ 95%ile-time/markTerm              old         new  delta
LargeBSS/bss:1GB/gomaxprocs:4  148200µs ± 6%  302µs ±52%  -99.80% (p=0.008 n=5+5)

This very slightly improves the go1 benchmarks:

name                      old time/op    new time/op    delta
BinaryTree17-12              2.62s ± 3%     2.62s ± 4%    ~     (p=0.904 n=20+20)
Fannkuch11-12                2.15s ± 1%     2.13s ± 0%  -1.29%  (p=0.000 n=18+20)
FmtFprintfEmpty-12          48.3ns ± 2%    47.6ns ± 1%  -1.52%  (p=0.000 n=20+16)
FmtFprintfString-12          152ns ± 0%     152ns ± 1%    ~     (p=0.725 n=18+18)
FmtFprintfInt-12             150ns ± 1%     149ns ± 1%  -1.14%  (p=0.000 n=19+20)
FmtFprintfIntInt-12          250ns ± 0%     244ns ± 1%  -2.12%  (p=0.000 n=20+18)
FmtFprintfPrefixedInt-12     219ns ± 1%     217ns ± 1%  -1.20%  (p=0.000 n=19+20)
FmtFprintfFloat-12           280ns ± 0%     281ns ± 1%  +0.47%  (p=0.000 n=19+19)
FmtManyArgs-12               928ns ± 0%     923ns ± 1%  -0.53%  (p=0.000 n=19+18)
GobDecode-12                7.21ms ± 1%    7.24ms ± 2%    ~     (p=0.091 n=19+19)
GobEncode-12                6.07ms ± 1%    6.05ms ± 1%  -0.36%  (p=0.002 n=20+17)
Gzip-12                      265ms ± 1%     265ms ± 1%    ~     (p=0.496 n=20+19)
Gunzip-12                   39.6ms ± 1%    39.3ms ± 1%  -0.85%  (p=0.000 n=19+19)
HTTPClientServer-12         74.0µs ± 2%    73.8µs ± 1%    ~     (p=0.569 n=20+19)
JSONEncode-12               15.4ms ± 1%    15.3ms ± 1%  -0.25%  (p=0.049 n=17+17)
JSONDecode-12               53.7ms ± 2%    53.0ms ± 1%  -1.29%  (p=0.000 n=18+17)
Mandelbrot200-12            3.97ms ± 1%    3.97ms ± 0%    ~     (p=0.072 n=17+18)
GoParse-12                  3.35ms ± 2%    3.36ms ± 1%  +0.51%  (p=0.005 n=18+20)
RegexpMatchEasy0_32-12      72.7ns ± 2%    72.2ns ± 1%  -0.70%  (p=0.005 n=19+19)
RegexpMatchEasy0_1K-12       246ns ± 1%     245ns ± 0%  -0.60%  (p=0.000 n=18+16)
RegexpMatchEasy1_32-12      72.8ns ± 1%    72.5ns ± 1%  -0.37%  (p=0.011 n=18+18)
RegexpMatchEasy1_1K-12       380ns ± 1%     385ns ± 1%  +1.34%  (p=0.000 n=20+19)
RegexpMatchMedium_32-12      115ns ± 2%     115ns ± 1%  +0.44%  (p=0.047 n=20+20)
RegexpMatchMedium_1K-12     35.4µs ± 1%    35.5µs ± 1%    ~     (p=0.079 n=18+19)
RegexpMatchHard_32-12       1.83µs ± 0%    1.80µs ± 1%  -1.76%  (p=0.000 n=18+18)
RegexpMatchHard_1K-12       55.1µs ± 0%    54.3µs ± 1%  -1.42%  (p=0.000 n=18+19)
Revcomp-12                   386ms ± 1%     381ms ± 1%  -1.14%  (p=0.000 n=18+18)
Template-12                 61.5ms ± 2%    61.5ms ± 2%    ~     (p=0.647 n=19+20)
TimeParse-12                 338ns ± 0%     336ns ± 1%  -0.72%  (p=0.000 n=14+19)
TimeFormat-12                350ns ± 0%     357ns ± 0%  +2.05%  (p=0.000 n=19+18)
[Geo mean]                  55.3µs         55.0µs       -0.41%

Change-Id: I57e8720385a1b991aeebd111b6874354308e2a6b
Reviewed-on: https://go-review.googlesource.com/20829
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-04-27 18:48:16 +00:00
Austin Clements 30172f1811 runtime: make {add,subtract}{b,1} nosplit
These are used at the bottom level of various GC operations that must
not be preempted. To be on the safe side, mark them all nosplit.

Change-Id: I8f7360e79c9852bd044df71413b8581ad764380c
Reviewed-on: https://go-review.googlesource.com/22504
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-04-27 18:46:00 +00:00
David Crawshaw 217be5b35d reflect: unnamed interface types have no name
Fixes #15468

Change-Id: I8723171f87774a98d5e80e7832ebb96dd1fbea74
Reviewed-on: https://go-review.googlesource.com/22524
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
2016-04-27 18:06:20 +00:00
Zhongwei Yao 74a9bad638 cmd/compile: enable const division for arm64
performance:
benchmark                   old ns/op     new ns/op     delta
BenchmarkDivconstI64-8      8.28          2.70          -67.39%
BenchmarkDivconstU64-8      8.28          4.69          -43.36%
BenchmarkDivconstI32-8      8.28          6.39          -22.83%
BenchmarkDivconstU32-8      8.28          4.43          -46.50%
BenchmarkDivconstI16-8      5.17          5.17          +0.00%
BenchmarkDivconstU16-8      5.33          5.34          +0.19%
BenchmarkDivconstI8-8       3.50          3.50          +0.00%
BenchmarkDivconstU8-8       3.51          3.50          -0.28%

Fixes #15382

Change-Id: Ibce7b28f0586d593b33c4d4ecc5d5e7e7c905d13
Reviewed-on: https://go-review.googlesource.com/22292
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
2016-04-27 17:47:49 +00:00
Cherry Zhang 9629f55fbb cmd/link: remove absolute address for c-archive on darwin/arm
Now it is possible to build a c-archive as PIC on darwin/arm (this is
now the default). Then the system linker can link the binary using
the archive as PIE.

Fixes #12896.

Change-Id: Iad84131572422190f5fa036e7d71910dc155f155
Reviewed-on: https://go-review.googlesource.com/22461
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-04-27 16:22:06 +00:00
Dmitry Vyukov 6dfba5c7ce runtime/race: improve TestNoRaceIOHttp test
TestNoRaceIOHttp does all kinds of bad things:
1. Binds to a fixed port, so concurrent tests fail.
2. Registers HTTP handler multiple times, so repeated tests fail.
3. Relies on sleep to wait for listen.

Fix all of that.

Change-Id: I1210b7797ef5e92465b37dc407246d92a2a24fe8
Reviewed-on: https://go-review.googlesource.com/19953
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-27 08:08:18 +00:00
Austin Clements 2a889b9d93 runtime: make stack re-scan O(# dirty stacks)
Currently the stack re-scan during mark termination is O(# stacks)
because we enqueue a root marking job for every goroutine. It takes
~34ns to process this root marking job for a valid (clean) stack, so
at around 300k goroutines we exceed the 10ms pause goal. A non-trivial
portion of this time is spent simply taking the cache miss to check
the gcscanvalid flag, so simply optimizing the path that handles clean
stacks can only improve this so much.

Fix this by keeping an explicit list of goroutines with dirty stacks
that need to be rescanned. When a goroutine first transitions to
running after a stack scan and marks its stack dirty, it adds itself
to this list. We enqueue root marking jobs only for the goroutines in
this list, so this improves stack re-scanning asymptotically by
completely eliminating time spent on clean goroutines.

This reduces mark termination time for 500k idle goroutines from 15ms
to 238µs. Overall performance effect is negligible.
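
A schematic of the dirty-stack list (types, fields, and locking elided or
invented for illustration):

    type g struct {
        stackDirty bool
        nextDirty  *g
    }

    var dirtyStacks *g // goroutines whose stacks must be re-scanned

    // markStackDirty is called when a scanned goroutine starts running again.
    func markStackDirty(gp *g) {
        if gp.stackDirty {
            return
        }
        gp.stackDirty = true
        gp.nextDirty = dirtyStacks
        dirtyStacks = gp
    }

    // rescanDirtyStacks enqueues re-scan work only for dirty goroutines, so
    // clean goroutines cost nothing during mark termination.
    func rescanDirtyStacks(scan func(*g)) {
        for gp := dirtyStacks; gp != nil; gp = gp.nextDirty {
            scan(gp)
            gp.stackDirty = false
        }
        dirtyStacks = nil
    }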

name \ 95%ile-time/markTerm     old           new         delta
IdleGs/gs:500000/gomaxprocs:12  15000µs ± 0%  238µs ± 5%  -98.41% (p=0.000 n=10+10)

name              old time/op  new time/op  delta
XBenchGarbage-12  2.30ms ± 3%  2.29ms ± 1%  -0.43%  (p=0.049 n=17+18)

name                      old time/op    new time/op    delta
BinaryTree17-12              2.57s ± 3%     2.59s ± 2%    ~     (p=0.141 n=19+20)
Fannkuch11-12                2.09s ± 0%     2.10s ± 1%  +0.53%  (p=0.000 n=19+19)
FmtFprintfEmpty-12          45.3ns ± 3%    45.2ns ± 2%    ~     (p=0.845 n=20+20)
FmtFprintfString-12          129ns ± 0%     127ns ± 0%  -1.55%  (p=0.000 n=16+16)
FmtFprintfInt-12             123ns ± 0%     119ns ± 1%  -3.24%  (p=0.000 n=19+19)
FmtFprintfIntInt-12          195ns ± 1%     189ns ± 1%  -3.11%  (p=0.000 n=17+17)
FmtFprintfPrefixedInt-12     193ns ± 1%     187ns ± 1%  -3.06%  (p=0.000 n=19+19)
FmtFprintfFloat-12           254ns ± 0%     255ns ± 1%  +0.35%  (p=0.001 n=14+17)
FmtManyArgs-12               781ns ± 0%     770ns ± 0%  -1.48%  (p=0.000 n=16+19)
GobDecode-12                7.00ms ± 1%    6.98ms ± 1%    ~     (p=0.563 n=19+19)
GobEncode-12                5.91ms ± 1%    5.92ms ± 0%    ~     (p=0.118 n=19+18)
Gzip-12                      219ms ± 1%     215ms ± 1%  -1.81%  (p=0.000 n=18+18)
Gunzip-12                   37.2ms ± 0%    37.4ms ± 0%  +0.45%  (p=0.000 n=17+19)
HTTPClientServer-12         76.9µs ± 3%    77.5µs ± 2%  +0.81%  (p=0.030 n=20+19)
JSONEncode-12               15.0ms ± 0%    14.8ms ± 1%  -0.88%  (p=0.001 n=15+19)
JSONDecode-12               50.6ms ± 0%    53.2ms ± 2%  +5.07%  (p=0.000 n=17+19)
Mandelbrot200-12            4.05ms ± 0%    4.05ms ± 1%    ~     (p=0.581 n=16+17)
GoParse-12                  3.34ms ± 1%    3.30ms ± 1%  -1.21%  (p=0.000 n=15+20)
RegexpMatchEasy0_32-12      69.6ns ± 1%    69.8ns ± 2%    ~     (p=0.566 n=19+19)
RegexpMatchEasy0_1K-12       238ns ± 1%     236ns ± 0%  -0.91%  (p=0.000 n=17+13)
RegexpMatchEasy1_32-12      69.8ns ± 1%    70.0ns ± 1%  +0.23%  (p=0.026 n=17+16)
RegexpMatchEasy1_1K-12       371ns ± 1%     363ns ± 1%  -2.07%  (p=0.000 n=19+19)
RegexpMatchMedium_32-12      107ns ± 2%     106ns ± 1%  -0.51%  (p=0.031 n=18+20)
RegexpMatchMedium_1K-12     33.0µs ± 0%    32.9µs ± 0%  -0.30%  (p=0.004 n=16+16)
RegexpMatchHard_32-12       1.70µs ± 0%    1.70µs ± 0%  +0.45%  (p=0.000 n=16+17)
RegexpMatchHard_1K-12       51.1µs ± 2%    51.4µs ± 1%  +0.53%  (p=0.000 n=17+19)
Revcomp-12                   378ms ± 1%     385ms ± 1%  +1.92%  (p=0.000 n=19+18)
Template-12                 64.3ms ± 2%    65.0ms ± 2%  +1.09%  (p=0.001 n=19+19)
TimeParse-12                 315ns ± 1%     317ns ± 2%    ~     (p=0.108 n=18+20)
TimeFormat-12                360ns ± 1%     337ns ± 0%  -6.30%  (p=0.000 n=18+13)
[Geo mean]                  51.8µs         51.6µs       -0.48%

Change-Id: Icf8994671476840e3998236e15407a505d4c760c
Reviewed-on: https://go-review.googlesource.com/20700
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-26 23:40:13 +00:00
Austin Clements 5b765ce310 runtime: don't clear gcscanvalid in casfrom_Gscanstatus
Currently we clear gcscanvalid in both casgstatus and
casfrom_Gscanstatus if the new status is _Grunning. This is very
important to do in casgstatus. However, this is potentially wrong in
casfrom_Gscanstatus because in this case the caller doesn't own gp and
hence the write is racy. Unlike the other _Gscan statuses, during
_Gscanrunning, the G is still running. This does not indicate that
it's transitioning into a running state. The scan simply hasn't
happened yet, so it's neither valid nor invalid.

Conveniently, this also means clearing gcscanvalid is unnecessary in
this case because the G was already in _Grunning, so we can simply
remove this code. What will happen instead is that the G will be
preempted to scan itself, that scan will set gcscanvalid to true, and
then the G will return to _Grunning via casgstatus, clearing
gcscanvalid.

This fix will become necessary shortly when we start keeping track of
the set of G's with dirty stacks, since it will no longer be
idempotent to simply set gcscanvalid to false.

Change-Id: I688c82e6fbf00d5dbbbff49efa66acb99ee86785
Reviewed-on: https://go-review.googlesource.com/20669
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-26 23:40:10 +00:00
Austin Clements c707d83856 runtime: fix typos in comment about gcscanvalid
Change-Id: Id4ad7ebf88a21eba2bc5714b96570ed5cfaed757
Reviewed-on: https://go-review.googlesource.com/22210
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-26 23:40:07 +00:00
Austin Clements 9f263c14ed runtime: remove stack barriers during sweep
This adds a best-effort pass to remove stack barriers immediately
after the end of mark termination. This isn't necessary for the Go
runtime, but should help external tools that perform stack walks but
aren't aware of Go's stack barriers such as GDB, perf, and VTune.
(Though clearly they'll still have trouble unwinding stacks during
mark.)

Change-Id: I66600fae1f03ee36b5459d2b00dcc376269af18e
Reviewed-on: https://go-review.googlesource.com/20668
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-26 23:40:04 +00:00
Austin Clements 269c969c81 runtime: remove stack barriers during concurrent mark
Currently we remove stack barriers during STW mark termination, which
has a non-trivial per-goroutine cost and means that we have to touch
even clean stacks during mark termination. However, there's no problem
with leaving them in during the sweep phase. They just have to be out
by the time we install new stack barriers immediately prior to
scanning the stack such as during the mark phase of the next GC cycle
or during mark termination in a STW GC.

Hence, move the gcRemoveStackBarriers from STW mark termination to
just before we install new stack barriers during concurrent mark. This
removes the cost from STW. Furthermore, this combined with concurrent
stack shrinking means that the mark termination scan of a clean stack
is a complete no-op, which will make it possible to skip clean stacks
entirely during mark termination.

This has the downside that it will mess up anything outside of Go that
tries to walk Go stacks all the time instead of just some of the time.
This includes tools like GDB, perf, and VTune. We'll improve the
situation shortly.

Change-Id: Ia40baad8f8c16aeefac05425e00b0cf478137097
Reviewed-on: https://go-review.googlesource.com/20667
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-26 23:40:01 +00:00
Austin Clements efb0c55407 runtime: avoid span root marking entirely during mark termination
Currently we enqueue span root mark jobs during both concurrent mark
and mark termination, but we make the job a no-op during mark
termination.

This is silly. Instead of queueing them up just to not do them, don't
queue them up in the first place.

Change-Id: Ie1d36de884abfb17dd0db6f0449a2b7c997affab
Reviewed-on: https://go-review.googlesource.com/20666
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-26 23:39:58 +00:00
Austin Clements e8337491aa runtime: free dead G stacks concurrently
Currently we free cached stacks of dead Gs during STW stack root
marking. We do this during STW because there's no way to take
ownership of a particular dead G, so attempting to free a dead G's
stack during concurrent stack root marking could race with reusing
that G.

However, we can do this concurrently if we take a completely different
approach. One way to prevent reuse of a dead G is to remove it from
the free G list. Hence, this adds a new fixed root marking task that
simply removes all Gs from the list of dead Gs with cached stacks,
frees their stacks, and then adds them to the list of dead Gs without
cached stacks.

This is also a necessary step toward rescanning only dirty stacks,
since it eliminates another task from STW stack marking.

Change-Id: Iefbad03078b284a2e7bf30fba397da4ca87fe095
Reviewed-on: https://go-review.googlesource.com/20665
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-26 23:39:55 +00:00
Austin Clements 1a2cf91f5e runtime: split gfree list into with-stacks and without-stacks
Currently all free Gs are added to one list. Split this into two
lists: one for free Gs with cached stacks and one for Gs without
cached stacks.

This lets us preferentially allocate Gs that already have a stack, but
more importantly, it sets us up to free cached G stacks concurrently.
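
Roughly (fields invented; the real lists live under the scheduler lock),
the split and the allocation preference look like:

    type g struct {
        stack     []byte // nil once the cached stack has been freed
        schedlink *g
    }

    type gList struct{ head *g }

    func (l *gList) push(gp *g) { gp.schedlink = l.head; l.head = gp }

    func (l *gList) pop() *g {
        gp := l.head
        if gp != nil {
            l.head = gp.schedlink
        }
        return gp
    }

    var gFreeStack, gFreeNoStack gList

    // getFreeG prefers a dead G that still has a cached stack.
    func getFreeG() *g {
        if gp := gFreeStack.pop(); gp != nil {
            return gp
        }
        return gFreeNoStack.pop()
    }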

Change-Id: Idbe486f708997e1c9d166662995283f02d1eeb3c
Reviewed-on: https://go-review.googlesource.com/20664
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-26 23:39:51 +00:00
Michael Munday 55154cf0b2 cmd/link: fix gdb backtrace on architectures using a link register
Also adds TestGdbBacktrace to the runtime package.

Dwarf modifications written by Bryan Chan (@bryanpkc) who is also
at IBM and covered by the same CLA.

Fixes #14628

Change-Id: I106a1f704c3745a31f29cdadb0032e3905829850
Reviewed-on: https://go-review.googlesource.com/20193
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-26 18:35:47 +00:00
Ilya Tocar 6b02a19247 strings: use SSE4.2 in strings.Index on AMD64
Use PCMPESTRI instruction if available.

Index-4              21.1ns ± 0%  21.1ns ± 0%     ~     (all samples are equal)
IndexHard1-4          395µs ± 0%   105µs ± 0%  -73.53%        (p=0.000 n=19+20)
IndexHard2-4          300µs ± 0%   147µs ± 0%  -51.11%        (p=0.000 n=19+20)
IndexHard3-4          665µs ± 0%   665µs ± 0%     ~           (p=0.942 n=16+19)

Change-Id: I4f66794164740a2b939eb1c78934e2390b489064
Reviewed-on: https://go-review.googlesource.com/22337
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2016-04-26 10:14:26 +00:00
Keith Randall 9cb79e9536 runtime: arm5, fix large-offset floating-point stores
The code sequence for large-offset floating-point stores
includes adding the base pointer to r11.  Make sure we
can interpret that instruction correctly.

Fixes build.

Fixes #15440

Change-Id: I7fe5a4a57e08682967052bf77c54e0ec47fcb53e
Reviewed-on: https://go-review.googlesource.com/22440
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2016-04-25 22:33:33 +00:00
Keith Randall 6f3f02f80d runtime: zero tmpbuf between len and cap
Zero the entire buffer so we don't need to
lower its capacity upon return.  This lets callers
do some appending without allocation.

Zeroing is cheap; the byte buffer requires only
4 extra instructions.

Fixes #14235

Change-Id: I970d7badcef047dafac75ac17130030181f18fe2
Reviewed-on: https://go-review.googlesource.com/22424
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-04-25 21:16:52 +00:00
Dmitry Vyukov 75b844f0d2 runtime/trace: test detection of broken timestamps
On some processors cputicks (used to generate trace timestamps)
produces non-monotonic timestamps. It is important that the parser
distinguishes logically inconsistent traces (e.g. missing, excessive
or misordered events) from broken timestamps. The former is a bug
in the tracer; the latter is a machine issue.

Test that (1) the parser does not return a logical error in case of
broken timestamps and (2) broken timestamps are eventually detected
and reported.

Change-Id: Ib4b1eb43ce128b268e754400ed8b5e8def04bd78
Reviewed-on: https://go-review.googlesource.com/21608
Reviewed-by: Austin Clements <austin@google.com>
2016-04-24 09:11:37 +00:00
Dmitry Vyukov a3703618ea runtime: use per-goroutine sequence numbers in tracer
Currently the tracer uses a global sequencer, which introduces
a significant slowdown on parallel machines (up to 10x).
Replace the global sequencer with a per-goroutine sequencer.

If we assign per-goroutine sequence numbers to only 3 types
of events (start, unblock and syscall exit), it is enough to
restore consistent partial ordering of all events. Even these
events don't need sequence numbers all the time (if goroutine
starts on the same P where it was unblocked, then start does
not need sequence number).
The burden of restoring the order is put on trace parser.
Details of the algorithm are described in the comments.

On http benchmark with GOMAXPROCS=48:
no tracing: 5026 ns/op
tracing: 27803 ns/op (+453%)
with this change: 6369 ns/op (+26%, mostly for traceback)

Also trace size is reduced by ~22%. Average event size before: 4.63
bytes/event, after: 3.62 bytes/event.

Besides running trace tests, I've also tested with manually broken
cputicks (random skew for each event, per-P skew and episodic random skew).
In all cases broken timestamps were detected and there were no test failures.

Change-Id: I078bde421ccc386a66f6c2051ab207bcd5613efa
Reviewed-on: https://go-review.googlesource.com/21512
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-23 15:57:05 +00:00
Dmitry Vyukov 2d342fba78 runtime: fix description of trace events
Change-Id: I037101b1921fe151695d32e9874b50dd64982298
Reviewed-on: https://go-review.googlesource.com/22314
Reviewed-by: Austin Clements <austin@google.com>
2016-04-22 21:32:37 +00:00
Ian Lance Taylor 32302d6289 runtime/cgo: use normal libinit on PPC GNU/Linux
The special case was because PPC did not support external linking, but
now it does.

Fixes #10410.

Change-Id: I9b024686e0f03da7a44c1c59b41c529802f16ab0
Reviewed-on: https://go-review.googlesource.com/22372
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-22 14:30:27 +00:00
David Crawshaw c165988360 cmd/compile, etc: use nameOff in uncommonType
linux/amd64 PIE:
	cmd/go:  -62KB (0.5%)
	jujud:  -550KB (0.7%)

For #6853.

Change-Id: Ieb67982abce5832e24b997506f0ae7108f747108
Reviewed-on: https://go-review.googlesource.com/22371
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-04-22 13:51:29 +00:00
David Crawshaw 1492e7db05 cmd/compile, etc: use nameOff for rtype string
linux/amd64:
	cmd/go:   -8KB (basically nothing)

linux/amd64 PIE:
	cmd/go: -191KB (1.6%)
	jujud:  -1.5MB (1.9%)

Updates #6853
Fixes #15064

Change-Id: I0adbb95685e28be92e8548741df0e11daa0a9b5f
Reviewed-on: https://go-review.googlesource.com/21777
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-04-22 10:08:05 +00:00
Austin Clements c8bd293e56 runtime: eliminate floating garbage estimate
Currently when we compute the trigger for the next GC, we do it based
on an estimate of the reachable heap size at the start of the GC
cycle, which is itself based on an estimate of the floating garbage.
This was introduced by 4655aad to fix a bad feedback loop that allowed
the heap to grow to many times the true reachable size.

However, this estimate is easily confused by rapidly allocating
applications, and, worse, it's different from the heap size the trigger
controller uses to compute the trigger itself. This results in the
trigger controller often thinking that GC finished before it started.
Since this would be a pretty great outcome from its perspective, it
sets the trigger for the next cycle as close to the next goal as
possible (which is limited to 95% of the goal).

Furthermore, the bad feedback loop this estimate originally fixed
seems not to happen any more, suggesting it was fixed more correctly
by some other change in the mean time. Finally, with the change to
allocate black, it shouldn't even be theoretically possible for this
bad feedback loop to occur.

Hence, eliminate the floating garbage estimate and simply consider the
reachable heap to be the marked heap. This harms overall throughput
slightly for allocation-heavy benchmarks, but significantly improves
mutator availability.

Fixes #12204. This brings the average trigger in this benchmark from
0.95 (the cap) to 0.7 and the active GC utilization from ~90% to ~45%.

Updates #14951. This makes the trigger controller much better behaved,
so it pulls the trigger lower if assists are consuming a lot of CPU
like it's supposed to, increasing mutator availability.
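
In simplified form (names and the exact trigger interpolation are
illustrative, not the controller's real code), the policy becomes: derive
both the heap goal and the trigger directly from the marked heap.

    type gcController struct {
        heapMarked   uint64  // bytes marked (reachable) in the last cycle
        triggerRatio float64 // fraction of the way to the goal at which to start GC
    }

    // nextGoalAndTrigger bases both values directly on the marked heap,
    // with no separate floating-garbage estimate.
    func (c *gcController) nextGoalAndTrigger(gogc int) (goal, trigger uint64) {
        goal = c.heapMarked + c.heapMarked*uint64(gogc)/100
        trigger = c.heapMarked + uint64(float64(goal-c.heapMarked)*c.triggerRatio)
        return goal, trigger
    }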

name              old time/op  new time/op  delta
XBenchGarbage-12  2.21ms ± 1%  2.28ms ± 3%  +3.29%  (p=0.000 n=17+17)

Some of this slow down we paid for in earlier commits. Relative to the
start of the series to switch to allocate-black (the parent of "count
black allocations toward scan work"), the garbage benchmark is 2.62%
slower.

name                      old time/op    new time/op    delta
BinaryTree17-12              2.53s ± 3%     2.53s ± 3%    ~     (p=0.708 n=20+19)
Fannkuch11-12                2.08s ± 0%     2.08s ± 0%  -0.22%  (p=0.002 n=19+18)
FmtFprintfEmpty-12          45.3ns ± 2%    45.2ns ± 3%    ~     (p=0.505 n=20+20)
FmtFprintfString-12          129ns ± 0%     131ns ± 2%  +1.80%  (p=0.000 n=16+19)
FmtFprintfInt-12             121ns ± 2%     121ns ± 2%    ~     (p=0.768 n=19+19)
FmtFprintfIntInt-12          186ns ± 1%     188ns ± 3%  +0.99%  (p=0.000 n=19+19)
FmtFprintfPrefixedInt-12     188ns ± 1%     188ns ± 1%    ~     (p=0.947 n=18+16)
FmtFprintfFloat-12           254ns ± 1%     255ns ± 1%  +0.30%  (p=0.002 n=19+17)
FmtManyArgs-12               763ns ± 0%     770ns ± 0%  +0.92%  (p=0.000 n=18+18)
GobDecode-12                7.00ms ± 1%    7.04ms ± 1%  +0.61%  (p=0.049 n=20+20)
GobEncode-12                5.88ms ± 1%    5.88ms ± 0%    ~     (p=0.641 n=18+19)
Gzip-12                      214ms ± 1%     215ms ± 1%  +0.43%  (p=0.002 n=18+19)
Gunzip-12                   37.6ms ± 0%    37.6ms ± 0%  +0.11%  (p=0.015 n=17+18)
HTTPClientServer-12         76.9µs ± 2%    78.1µs ± 2%  +1.44%  (p=0.000 n=20+18)
JSONEncode-12               15.2ms ± 2%    15.1ms ± 1%    ~     (p=0.271 n=19+18)
JSONDecode-12               53.1ms ± 1%    53.3ms ± 0%  +0.49%  (p=0.000 n=18+19)
Mandelbrot200-12            4.04ms ± 1%    4.03ms ± 0%  -0.33%  (p=0.005 n=18+18)
GoParse-12                  3.29ms ± 1%    3.28ms ± 1%    ~     (p=0.146 n=16+17)
RegexpMatchEasy0_32-12      69.9ns ± 3%    69.5ns ± 1%    ~     (p=0.785 n=20+19)
RegexpMatchEasy0_1K-12       237ns ± 0%     237ns ± 0%    ~     (p=1.000 n=18+18)
RegexpMatchEasy1_32-12      69.5ns ± 1%    69.2ns ± 1%  -0.44%  (p=0.020 n=16+19)
RegexpMatchEasy1_1K-12       372ns ± 1%     371ns ± 2%    ~     (p=0.086 n=20+19)
RegexpMatchMedium_32-12      108ns ± 3%     107ns ± 1%  -1.00%  (p=0.004 n=19+14)
RegexpMatchMedium_1K-12     34.2µs ± 4%    34.0µs ± 2%    ~     (p=0.380 n=19+20)
RegexpMatchHard_32-12       1.77µs ± 4%    1.76µs ± 3%    ~     (p=0.558 n=18+20)
RegexpMatchHard_1K-12       53.4µs ± 4%    52.8µs ± 2%  -1.10%  (p=0.020 n=18+20)
Revcomp-12                   359ms ± 4%     377ms ± 0%  +5.19%  (p=0.000 n=20+18)
Template-12                 63.7ms ± 2%    62.9ms ± 2%  -1.27%  (p=0.005 n=18+20)
TimeParse-12                 316ns ± 2%     313ns ± 1%    ~     (p=0.059 n=20+16)
TimeFormat-12                329ns ± 0%     331ns ± 0%  +0.39%  (p=0.000 n=16+18)
[Geo mean]                  51.6µs         51.7µs       +0.18%

Change-Id: I1dce4640c8205d41717943b021039fffea863c57
Reviewed-on: https://go-review.googlesource.com/21324
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-21 20:07:25 +00:00
Austin Clements 6002e01e34 runtime: allocate black during GC
Currently we allocate white for most of concurrent marking. This is
based on the classical argument that it produces less floating
garbage, since allocations during GC may not get linked into the heap
and allocating white lets us reclaim these. However, it's not clear
how often this actually happens, especially since our write barrier
shades any pointer as soon as it's installed in the heap regardless of
the color of the slot.

On the other hand, allocating black has several advantages that seem
to significantly outweigh this downside.

1) It naturally bounds the total scan work to the live heap size at
the start of a GC cycle. Allocating white does not, and thus depends
entirely on assists to prevent the heap from growing faster than it
can be scanned.

2) It reduces the total amount of scan work per GC cycle by the size
of newly allocated objects that are linked into the heap graph, since
objects allocated black never need to be scanned.

3) It reduces total write barrier work since more objects will already
be black when they are linked into the heap graph.

This gives a slight overall improvement in benchmarks.

name              old time/op  new time/op  delta
XBenchGarbage-12  2.24ms ± 0%  2.21ms ± 1%  -1.32%  (p=0.000 n=18+17)

name                      old time/op    new time/op    delta
BinaryTree17-12              2.60s ± 3%     2.53s ± 3%  -2.56%  (p=0.000 n=20+20)
Fannkuch11-12                2.08s ± 1%     2.08s ± 0%    ~     (p=0.452 n=19+19)
FmtFprintfEmpty-12          45.1ns ± 2%    45.3ns ± 2%    ~     (p=0.367 n=19+20)
FmtFprintfString-12          131ns ± 3%     129ns ± 0%  -1.60%  (p=0.000 n=20+16)
FmtFprintfInt-12             122ns ± 0%     121ns ± 2%  -0.86%  (p=0.000 n=16+19)
FmtFprintfIntInt-12          187ns ± 1%     186ns ± 1%    ~     (p=0.514 n=18+19)
FmtFprintfPrefixedInt-12     189ns ± 0%     188ns ± 1%  -0.54%  (p=0.000 n=16+18)
FmtFprintfFloat-12           256ns ± 0%     254ns ± 1%  -0.43%  (p=0.000 n=17+19)
FmtManyArgs-12               769ns ± 0%     763ns ± 0%  -0.72%  (p=0.000 n=18+18)
GobDecode-12                7.08ms ± 2%    7.00ms ± 1%  -1.22%  (p=0.000 n=20+20)
GobEncode-12                5.88ms ± 0%    5.88ms ± 1%    ~     (p=0.406 n=18+18)
Gzip-12                      214ms ± 0%     214ms ± 1%    ~     (p=0.103 n=17+18)
Gunzip-12                   37.6ms ± 0%    37.6ms ± 0%    ~     (p=0.563 n=17+17)
HTTPClientServer-12         77.2µs ± 3%    76.9µs ± 2%    ~     (p=0.606 n=20+20)
JSONEncode-12               15.1ms ± 1%    15.2ms ± 2%    ~     (p=0.138 n=19+19)
JSONDecode-12               53.3ms ± 1%    53.1ms ± 1%  -0.33%  (p=0.000 n=19+18)
Mandelbrot200-12            4.04ms ± 1%    4.04ms ± 1%    ~     (p=0.075 n=19+18)
GoParse-12                  3.30ms ± 1%    3.29ms ± 1%  -0.57%  (p=0.000 n=18+16)
RegexpMatchEasy0_32-12      69.5ns ± 1%    69.9ns ± 3%    ~     (p=0.822 n=18+20)
RegexpMatchEasy0_1K-12       237ns ± 1%     237ns ± 0%    ~     (p=0.398 n=19+18)
RegexpMatchEasy1_32-12      69.8ns ± 2%    69.5ns ± 1%    ~     (p=0.090 n=20+16)
RegexpMatchEasy1_1K-12       371ns ± 1%     372ns ± 1%    ~     (p=0.178 n=19+20)
RegexpMatchMedium_32-12      108ns ± 2%     108ns ± 3%    ~     (p=0.124 n=20+19)
RegexpMatchMedium_1K-12     33.9µs ± 2%    34.2µs ± 4%    ~     (p=0.309 n=20+19)
RegexpMatchHard_32-12       1.75µs ± 2%    1.77µs ± 4%  +1.28%  (p=0.018 n=19+18)
RegexpMatchHard_1K-12       52.7µs ± 1%    53.4µs ± 4%  +1.23%  (p=0.013 n=15+18)
Revcomp-12                   354ms ± 1%     359ms ± 4%  +1.27%  (p=0.043 n=20+20)
Template-12                 63.6ms ± 2%    63.7ms ± 2%    ~     (p=0.654 n=20+18)
TimeParse-12                 313ns ± 1%     316ns ± 2%  +0.80%  (p=0.014 n=17+20)
TimeFormat-12                332ns ± 0%     329ns ± 0%  -0.66%  (p=0.000 n=16+16)
[Geo mean]                  51.7µs         51.6µs       -0.09%

Change-Id: I2214a6a0e4f544699ea166073249a8efdf080dc0
Reviewed-on: https://go-review.googlesource.com/21323
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-21 20:07:22 +00:00
Austin Clements 64a26b79ac runtime: simplify/optimize allocate-black a bit
Currently allocating black switches to the system stack (which is
probably a historical accident) and atomically updates the global
bytes marked stat. Since we're about to depend on this much more,
optimize it a bit by putting it back on the regular stack and updating
the per-P bytes marked stat, which gets lazily folded into the global
bytes marked stat.

Change-Id: Ibbe16e5382d3fd2256e4381f88af342bf7020b04
Reviewed-on: https://go-review.googlesource.com/22170
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-21 20:07:20 +00:00
Austin Clements 479501c14c runtime: count black allocations toward scan work
Currently we count black allocations toward the scannable heap size,
but not toward the scan work we've done so far. This is clearly
inconsistent (we have, in effect, scanned these allocations and since
they're already black, we're not going to scan them again). Worse, it
means we don't count black allocations toward the scannable heap size
as of the *next* GC because this is based on the amount of scan work
we did in this cycle.

Fix this by counting black allocations as scan work. Currently the GC
spends very little time in allocate-black mode, so this probably
hasn't been a problem, but this will become important when we switch
to always allocating black.

Change-Id: If6ff693b070c385b65b6ecbbbbf76283a0f9d990
Reviewed-on: https://go-review.googlesource.com/22119
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-21 20:07:17 +00:00
Martin Möhrmann 7e460e70d9 runtime: use type int to specify size for newarray
Consistently use type int for the size argument of
runtime.newarray, runtime.reflect_unsafe_NewArray
and reflect.unsafe_NewArray.

Change-Id: Ic77bf2dde216c92ca8c49462f8eedc0385b6314e
Reviewed-on: https://go-review.googlesource.com/22311
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-21 04:15:14 +00:00
Keith Randall 60fd32a47f cmd/compile: change the way we handle large map values
mapaccess{1,2} returns a pointer to the value.  When the key
is not in the map, it returns a pointer to zeroed memory.
Currently, for large map values we have a complicated scheme which
dynamically allocates zeroed memory for this purpose.  It is ugly
code and requires an atomic.Load in a bunch of places we'd rather
not have it.

Switch to a scheme where callsites of mapaccess{1,2} which expect
large return values pass in a pointer to zeroed memory that
mapaccess can return if the key is not found.  This avoids the
atomic.Load on all map accesses with a few extra instructions only
for the large value accesses, plus a bit of bss space.

There was a time (1.4 & 1.5?) when we did something like this, but
all the tricks to make the right-size zero value were done by the
linker.  That scheme broke in the presence of dynamic linking.
The scheme in this CL works even with dynamic linking.
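
The shape of the scheme, reduced to a toy lookup (the real map internals
are omitted; names are invented):

    // zeroBlock stands in for the shared zeroed region the compiler points
    // callsites at when the value type is large.
    var zeroBlock [1024]byte

    type entry struct {
        key string
        val *byte
    }

    // lookupFat returns a pointer to the stored value, or the caller-supplied
    // pointer into zeroed memory on a miss, so no allocation or atomic load
    // is needed to produce a zero value.
    func lookupFat(entries []entry, key string, zero *byte) *byte {
        for i := range entries {
            if entries[i].key == key {
                return entries[i].val
            }
        }
        return zero
    }

A callsite expecting a large value would pass something like &zeroBlock[0]
as zero.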

Fixes #12337

Change-Id: Ic2d0319944af33bbb59785938d9ab80958d1b4b1
Reviewed-on: https://go-review.googlesource.com/22221
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2016-04-20 21:15:31 +00:00
Keith Randall 001e8e8070 runtime: simplify mallocgc flag argument
mallocgc can calculate noscan itself.  The only remaining
flag argument is needzero, so we just make that a boolean arg.

Fixes #15379

Change-Id: I839a70790b2a0c9dbcee2600052bfbd6c8148e20
Reviewed-on: https://go-review.googlesource.com/22290
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-20 14:02:22 +00:00
Keith Randall bfe0cbdc50 cmd/compile,runtime: pass elem type to {make,grow}slice
No point in passing the slice type to these functions.
All they need is the element type.  One less indirection,
maybe a few fewer []T type descriptors in the binary.

Change-Id: Ib0b83b5f14ca21d995ecc199ce8ac00c4eb375e6
Reviewed-on: https://go-review.googlesource.com/22275
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-04-20 00:31:16 +00:00
Josh Bleecher Snyder 0150f15a92 runtime: call mallocgc directly from makeslice and growslice
The extra checks provided by newarray are
redundant in these cases.

This shrinks the call stack expected by the pprof
test by one frame.

name                      old time/op  new time/op  delta
MakeSlice-8               34.3ns ± 2%  30.5ns ± 3%  -11.03%  (p=0.000 n=24+22)
GrowSlicePtr-8             134ns ± 2%   129ns ± 3%   -3.25%  (p=0.000 n=25+24)

Change-Id: Icd828655906b921c732701fd9d61da3fa217b0af
Reviewed-on: https://go-review.googlesource.com/22276
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-04-20 00:05:36 +00:00
Julia Hansbrough 58012ea785 runtime: updated SIGSYS to cause a panic + stacktrace
On GNU/Linux, SIGSYS is specified to cause the process to terminate
without a core dump. In https://codereview.appspot.com/3749041, it
appears that Go accidentally introduced incorrect behavior for
this signal, which caused Go processes to keep running after
receiving SIGSYS. This change reverts it to the old, correct behavior.

Updates #15204

Change-Id: I3aa48a9499c1bc36fa5d3f40c088fdd7599e0db5
Reviewed-on: https://go-review.googlesource.com/22202
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-04-19 22:48:31 +00:00
Keith Randall 998c8e034c cmd/compile: convT2{I,E} don't handle direct interfaces
We now inline type-to-interface conversions when the type
is pointer-shaped.  No need to keep code to handle that in
convT2{I,E}.

Change-Id: I3a6668259556077cbb2986a9e8fe42a625d506c9
Reviewed-on: https://go-review.googlesource.com/22249
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
2016-04-19 22:27:08 +00:00
Josh Bleecher Snyder a4dd6ea152 runtime: add maxSliceCap
This avoids expensive division calculations
for many common slice element sizes.
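
A sketch of the idea, assuming a 64-bit platform (the limit and table size
are illustrative): precompute the maximum element count for small element
sizes so the common case is a table lookup rather than a divide.

    const maxAlloc = 1 << 47 // illustrative cap on a single allocation, in bytes

    // maxElems[i] is the largest capacity for element size i that cannot overflow.
    var maxElems = [...]uintptr{
        ^uintptr(0), maxAlloc, maxAlloc / 2, maxAlloc / 3, maxAlloc / 4,
        maxAlloc / 5, maxAlloc / 6, maxAlloc / 7, maxAlloc / 8,
    }

    // maxSliceCap divides only for uncommon element sizes.
    func maxSliceCap(elemsize uintptr) uintptr {
        if elemsize < uintptr(len(maxElems)) {
            return maxElems[elemsize]
        }
        return maxAlloc / elemsize
    }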

name                      old time/op  new time/op  delta
MakeSlice-8               51.9ns ± 3%  35.1ns ± 2%  -32.41%  (p=0.000 n=10+10)
GrowSliceBytes-8          44.1ns ± 2%  44.1ns ± 1%     ~     (p=0.984 n=10+10)
GrowSliceInts-8           60.9ns ± 3%  60.9ns ± 3%     ~     (p=0.698 n=10+10)
GrowSlicePtr-8             131ns ± 1%   120ns ± 2%   -8.41%   (p=0.000 n=8+10)
GrowSliceStruct24Bytes-8   111ns ± 2%   103ns ± 3%   -7.23%    (p=0.000 n=8+8)

Change-Id: I2630eb3d73c814db030cad16e620ea7fecbbd312
Reviewed-on: https://go-review.googlesource.com/22223
Reviewed-by: Keith Randall <khr@golang.org>
2016-04-19 21:38:52 +00:00
Josh Bleecher Snyder 411a0adc9b runtime: add benchmarks for in-place append
Change-Id: I2b43cc976d2efbf8b41170be536fdd10364b65e5
Reviewed-on: https://go-review.googlesource.com/22190
Reviewed-by: Keith Randall <khr@golang.org>
2016-04-18 19:08:39 +00:00
David Crawshaw 95df0c6ab9 cmd/compile, etc: use name offset in method tables
Introduce and start using nameOff for two encoded names. This pair
of changes is best done together because the linker's method decoder
expects the method layouts to match.

Precursor to converting all existing name and *string fields to
nameOff.

linux/amd64:
	cmd/go:  -45KB (0.5%)
	jujud:  -389KB (0.6%)

linux/amd64 PIE:
	cmd/go: -170KB (1.4%)
	jujud:  -1.5MB (1.8%)

For #6853.

Change-Id: Ia044423f010fb987ce070b94c46a16fc78666ff6
Reviewed-on: https://go-review.googlesource.com/21396
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-04-18 09:12:41 +00:00
Austin Clements 2cdcb6f829 runtime: scavenge memory on physical page-aligned boundaries
Currently the scavenger marks memory unused in multiples of the
allocator page size (8K). This is safe as long as the true physical
page size is 4K (or 8K), as it is on many platforms. However, on
ARM64, PPC64x, and MIPS64, the physical page size is larger than 8K,
so if we attempt to mark memory unused, the kernel will round the
boundaries of the region *out* to all pages covered by the requested
region, and we'll release a larger region of memory than intended. As
a result, the scavenger is currently disabled on these platforms.

Fix this by first rounding the region to be marked unused *in* to
multiples of the physical page size, so that when we ask the kernel to
mark it unused, it releases exactly the requested region.
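
The inward rounding can be sketched as follows (a simplified stand-in for
the scavenger's actual code):

    // roundInward shrinks [start, end) to physical page boundaries so the
    // kernel never releases memory outside the requested region. The ok
    // result is false when no full physical page is covered.
    func roundInward(start, end, physPageSize uintptr) (s, e uintptr, ok bool) {
        s = (start + physPageSize - 1) &^ (physPageSize - 1) // round start up
        e = end &^ (physPageSize - 1)                        // round end down
        if e <= s {
            return 0, 0, false
        }
        return s, e, true
    }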

Fixes #9993.

Change-Id: I96d5fdc2f77f9d69abadcea29bcfe55e68288cb1
Reviewed-on: https://go-review.googlesource.com/22066
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-04-16 21:42:43 +00:00
Austin Clements 1151473077 runtime: check that sysUnused is always physical-page aligned
If sysUnused is passed an address or length that is not aligned to the
physical page boundary, the kernel will unmap more memory than the
caller wanted. Add a check for this.

For #9993.

Change-Id: I68ff03032e7b65cf0a853fe706ce21dc7f2aaaf8
Reviewed-on: https://go-review.googlesource.com/22065
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2016-04-16 21:42:40 +00:00
Austin Clements 8ce844e88e runtime: check kernel physical page size during init
The runtime hard-codes an assumed physical page size. If this is
smaller than the kernel's page size or not a multiple of it, sysUnused
may incorrectly release more memory to the system than intended.

Add a runtime startup check that the runtime's assumed physical page size
is compatible with the kernel's physical page size.
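
A minimal sketch of such a check (names and the assumed size are
illustrative):

    var physPageSize uintptr // kernel physical page size, e.g. from the auxv

    const assumedPhysPageSize = 64 << 10 // what the runtime was built to assume

    // checkPhysPageSize fails fast if the assumption cannot be reconciled
    // with the kernel: the assumed size must be a multiple of the real size.
    func checkPhysPageSize() {
        if physPageSize == 0 {
            return // not reported by the kernel; nothing to check
        }
        if assumedPhysPageSize < physPageSize || assumedPhysPageSize%physPageSize != 0 {
            panic("runtime: assumed physical page size incompatible with kernel page size")
        }
    }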

For #9993.

Change-Id: Ida9d07f93c00ca9a95dd55fc59bf0d8a607f6728
Reviewed-on: https://go-review.googlesource.com/22064
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-04-16 21:42:37 +00:00
Austin Clements d6b177d1eb runtime: remove empty 386 archauxv
archauxv no longer does anything on 386, so remove it.

Change-Id: I94545238e40fa6a6832a7c3b40aedfc6c1f6a97b
Reviewed-on: https://go-review.googlesource.com/22063
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-16 21:42:34 +00:00
Austin Clements 90addd3d41 runtime: common handling of _AT_RANDOM auxv
The Linux kernel provides 16 bytes of random data via the auxv vector
at startup. Currently we consume this separately on 386, amd64, arm,
and arm64. Now that we have a common auxv parser, handle _AT_RANDOM in
the common path.
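
A schematic of the shared parser with _AT_RANDOM handled once (the tag
constants are the Linux values; everything else is simplified):

    const (
        _AT_NULL   = 0  // end of the auxiliary vector
        _AT_RANDOM = 25 // address of 16 random bytes
    )

    var startupRandomData uintptr // address of the kernel-supplied random bytes

    // sysauxv walks tag/value pairs: common tags are handled here, and every
    // pair is also passed to a small per-architecture hook.
    func sysauxv(auxv []uintptr) {
        for i := 0; auxv[i] != _AT_NULL; i += 2 {
            tag, val := auxv[i], auxv[i+1]
            switch tag {
            case _AT_RANDOM:
                startupRandomData = val
            }
            archauxv(tag, val)
        }
    }

    // archauxv is the per-arch hook; empty where there is nothing to do.
    func archauxv(tag, val uintptr) {}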

Change-Id: Ib69549a1d37e2d07a351cf0f44007bcd24f0d20d
Reviewed-on: https://go-review.googlesource.com/22062
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-16 21:42:31 +00:00
Austin Clements c955bb2040 runtime: common auxv parser
Currently several different Linux architectures have separate copies
of the auxv parser. Bring these all together into a single copy of the
parser that calls out to a per-arch handler for each tag/value pair.
This is in preparation for handling common auxv tags in one place.

For #9993.

Change-Id: Iceebc3afad6b4133b70fca7003561ae370445c10
Reviewed-on: https://go-review.googlesource.com/22061
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2016-04-16 21:42:27 +00:00
Mikio Hara 6f59ccb052 runtime: don't always unblock all signals on dragonfly, freebsd and openbsd
https://golang.org/cl/10173 introduced msigsave, ensureSigM and
_SigUnblock but didn't enable the new signal save/restore mechanism for
SIG{HUP,INT,QUIT,ABRT,TERM} on DragonFly BSD, FreeBSD and OpenBSD.

At present, it looks like they have the necessary implementation. This
change enables the new mechanism on DragonFly BSD, FreeBSD and OpenBSD,
the same as on Darwin and NetBSD.

Change-Id: Ifb4b4743b3b4f50bfcdc7cf1fe1b59c377fa2a41
Reviewed-on: https://go-review.googlesource.com/18657
Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-04-15 21:20:45 +00:00
Austin Clements 7c7081f514 sync/atomic: don't atomically write pointers twice
sync/atomic.StorePointer (which is implemented in
runtime/atomic_pointer.go) writes the pointer twice (through two
completely different code paths, no less). Fix it to only write once.

Change-Id: Id3b2aef9aa9081c2cf096833e001b93d3dd1f5da
Reviewed-on: https://go-review.googlesource.com/21999
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-04-14 21:13:26 +00:00
Austin Clements 8f6c35de2f runtime: make sync_atomic_SwapPointer signature match sync/atomic
SwapPointer is declared as

  func SwapPointer(addr *unsafe.Pointer, new unsafe.Pointer) (old unsafe.Pointer)

in sync/atomic, but defined in the runtime (where it's actually
implemented) as

  func sync_atomic_SwapPointer(ptr unsafe.Pointer, new unsafe.Pointer) unsafe.Pointer

Make ptr a *unsafe.Pointer in the runtime definition to match the type
in sync/atomic.

Change-Id: I99bab651b995001bbe54f9e790fdef2417ef0e9e
Reviewed-on: https://go-review.googlesource.com/21998
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
2016-04-14 21:13:23 +00:00
Keith Randall 98b6febcef runtime/internal/sys: better fallback algorithms for intrinsics
Use deBruijn sequences to count low-order zeros.
Reorg bswap to not use &^; it takes another instruction on x86.
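
The de Bruijn fallback can be sketched like this; the constant below is a
commonly used 64-bit de Bruijn sequence (an assumption, checked at init),
not necessarily the exact one this CL uses:

    const deBruijn64 uint64 = 0x03f79d71b4ca8b09

    var deBruijnIdx64 [64]byte

    func init() {
        // For a de Bruijn sequence, the top 6 bits of deBruijn64<<i are
        // distinct for every i, so they can index a lookup table.
        var seen [64]bool
        for i := uint(0); i < 64; i++ {
            idx := deBruijn64 << i >> 58
            if seen[idx] {
                panic("deBruijn64 is not a de Bruijn sequence")
            }
            seen[idx] = true
            deBruijnIdx64[idx] = byte(i)
        }
    }

    // ctz64 isolates the low-order set bit, multiplies by the constant, and
    // uses the top 6 bits of the product as the table index.
    func ctz64(x uint64) int {
        if x == 0 {
            return 64
        }
        return int(deBruijnIdx64[(x&-x)*deBruijn64>>58])
    }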

Change-Id: I4a5ed9fd16ee6a279d88c067e8a2ba11de821156
Reviewed-on: https://go-review.googlesource.com/22084
Reviewed-by: David Chase <drchase@google.com>
2016-04-14 21:09:03 +00:00
Jeremy Jackins 02b8e6978a runtime: find a home for orphaned comments
These comments were left behind after runtime.h was converted
from C to Go. I examined the original code and tried to move these
to the places that made the most sense.

Change-Id: I8769d60234c0113d682f9de3bd8d6c34c450c188
Reviewed-on: https://go-review.googlesource.com/21969
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-14 18:34:09 +00:00
David Crawshaw f120936dff cmd/compile, etc: use name for type pkgPath
By replacing the *string used to represent pkgPath with a
reflect.name everywhere, the embedded *string for package paths
inside the reflect.name can be replaced by an offset, nameOff.
This reduces the number of pointers in the type information.

This also moves all reflect.name types into the same section, making
it possible to use nameOff more widely in later CLs.

No significant binary size change for normal binaries, but:

linux/amd64 PIE:
	cmd/go: -440KB (3.7%)
	jujud:  -2.6MB (3.2%)

For #6853.

Change-Id: I3890b132a784a1090b1b72b32febfe0bea77eaee
Reviewed-on: https://go-review.googlesource.com/21395
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-04-13 20:48:26 +00:00
Brad Fitzpatrick 73e2ad2022 runtime: rename os1_darwin.go to os_darwin.go
Change-Id: If0e0bc5a85101db1e70faaab168fc2d12024eb93
Reviewed-on: https://go-review.googlesource.com/22005
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-04-13 20:37:12 +00:00
Brad Fitzpatrick d9712aa82a runtime: merge the darwin os*.go files together
Merge them together into os1_darwin.go. A future CL will rename it.

Change-Id: Ia4380d3296ebd5ce210908ce3582ff184566f692
Reviewed-on: https://go-review.googlesource.com/22004
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-04-13 20:35:09 +00:00
Austin Clements 4721ea6abc runtime/internal/atomic: rename Storep1 to StorepNoWB
Make it clear that the point of this function is to store a pointer
*without* a write barrier.

sed -i -e 's/Storep1/StorepNoWB/' $(git grep -l Storep1)

Updates #15270.

Change-Id: Ifad7e17815e51a738070655fe3b178afdadaecf6
Reviewed-on: https://go-review.googlesource.com/21994
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
2016-04-13 19:17:25 +00:00
Austin Clements d8e8fc292a runtime/internal/atomic: remove write barrier from Storep1 on s390x
atomic.Storep1 is not supposed to invoke a write barrier (that's what
atomicstorep is for), but currently does on s390x. This causes a panic
in runtime.mapzero when it tries to use atomic.Storep1 to store what's
actually a scalar.

Fix this by eliminating the write barrier from atomic.Storep1 on
s390x. Also add some documentation to atomicstorep to explain the
difference between these.

Fixes #15270.

Change-Id: I291846732d82f090a218df3ef6351180aff54e81
Reviewed-on: https://go-review.googlesource.com/21993
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
2016-04-13 16:06:51 +00:00
Lynn Boger c4807d4cc7 runtime: improve memmove performance ppc64,ppc64le
This change improves the performance of memmove
on ppc64 & ppc64le mainly for moves >=32 bytes.
In addition, the test to detect backward moves
was enhanced to avoid backward moves if the source
and dest were in different types of storage, since
backward moves might not always be efficient.

Fixes #14507

The following shows some of the improvements from the test
in the runtime package:

BenchmarkMemmove32                   4229.56      4717.13      1.12x
BenchmarkMemmove64                   6156.03      7810.42      1.27x
BenchmarkMemmove128                  7521.69      12468.54     1.66x
BenchmarkMemmove256                  6729.90      18260.33     2.71x
BenchmarkMemmove512                  8521.59      18033.81     2.12x
BenchmarkMemmove1024                 9760.92      25762.61     2.64x
BenchmarkMemmove2048                 10241.00     29584.94     2.89x
BenchmarkMemmove4096                 10399.37     31882.31     3.07x

BenchmarkMemmoveUnalignedDst16       1943.69      2258.33      1.16x
BenchmarkMemmoveUnalignedDst32       3885.08      3965.81      1.02x
BenchmarkMemmoveUnalignedDst64       5121.63      6965.54      1.36x
BenchmarkMemmoveUnalignedDst128      7212.34      11372.68     1.58x
BenchmarkMemmoveUnalignedDst256      6564.52      16913.59     2.58x
BenchmarkMemmoveUnalignedDst512      8364.35      17782.57     2.13x
BenchmarkMemmoveUnalignedDst1024     9539.87      24914.72     2.61x
BenchmarkMemmoveUnalignedDst2048     9199.23      21235.11     2.31x
BenchmarkMemmoveUnalignedDst4096     10077.39     25231.99     2.50x

BenchmarkMemmoveUnalignedSrc32       3249.83      3742.52      1.15x
BenchmarkMemmoveUnalignedSrc64       5562.35      6627.96      1.19x
BenchmarkMemmoveUnalignedSrc128      6023.98      10200.84     1.69x
BenchmarkMemmoveUnalignedSrc256      6921.83      15258.43     2.20x
BenchmarkMemmoveUnalignedSrc512      8593.13      16541.97     1.93x
BenchmarkMemmoveUnalignedSrc1024     9730.95      22927.84     2.36x
BenchmarkMemmoveUnalignedSrc2048     9793.28      21537.73     2.20x
BenchmarkMemmoveUnalignedSrc4096     10132.96     26295.06     2.60x
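
For readers who want to reproduce numbers of this shape, a minimal
benchmark in the same spirit (name and size illustrative) is just a copy
loop, since copy on byte slices goes through runtime.memmove:

package memmove_test

import "testing"

func BenchmarkMemmove256(b *testing.B) {
	src := make([]byte, 256)
	dst := make([]byte, 256)
	b.SetBytes(256)
	for i := 0; i < b.N; i++ {
		copy(dst, src) // lowered to a runtime.memmove call
	}
}

Running the runtime package's own Memmove benchmarks before and after
the change gives the comparison above.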

Change-Id: I73af59970d4c97c728deabb9708b31ec7e01bdf2
Reviewed-on: https://go-review.googlesource.com/21990
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-13 15:27:59 +00:00
David Crawshaw 7d469179e6 cmd/compile, etc: store method tables as offsets
This CL introduces the typeOff type and a lookup method of the same
name that can turn a typeOff offset into an *rtype.

In a typical Go binary (built with buildmode=exe, pie, c-archive, or
c-shared), there is one moduledata and all typeOff values are offsets
relative to firstmoduledata.types. This makes computing the pointer
cheap in typical programs.

With buildmode=shared (and one day, buildmode=plugin) there are
multiple modules whose relative offset is determined at runtime.
We identify a type in the general case by the pair of the original
*rtype that references it and its typeOff value. We determine
the module from the original pointer, and then use the typeOff from
there to compute the final *rtype.

To ensure there is only one *rtype representing each type, the
runtime initializes a typemap for each module, using any identical
type from an earlier module when resolving that offset. This means
that types computed from an offset match the type mapped by the
pointer dynamic relocations.

A series of followup CLs will replace other *rtype values with typeOff
(and name/*string with nameOff).

For types created at runtime by reflect, type offsets are treated as
global IDs and reference into a reflect offset map kept by the runtime.
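
As a toy illustration of the offset scheme (the real moduledata,
typemap, and *rtype handling are much richer than this sketch):

package sketch

// typeOff stands in for the real typeOff: a 4-byte offset into a
// module's type data instead of an 8-byte *rtype pointer.
type typeOff int32

type moduledata struct {
	types []byte // this module's read-only type metadata
}

// typeAt resolves an offset relative to the module's base; in shared
// builds the module itself must first be found from the referencing
// *rtype.
func (md *moduledata) typeAt(off typeOff) []byte {
	return md.types[off:]
}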

darwin/amd64:
	cmd/go:  -57KB (0.6%)
	jujud:  -557KB (0.8%)

linux/amd64 PIE:
	cmd/go: -361KB (3.0%)
	jujud:  -3.5MB (4.2%)

For #6853.

Change-Id: Icf096fd884a0a0cb9f280f46f7a26c70a9006c96
Reviewed-on: https://go-review.googlesource.com/21285
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-13 13:03:11 +00:00
Matthew Dempsky 6af4e996e2 runtime: simplify setPanicOnFault slightly
There is no need to acquire the M just to change the G's paniconfault
flag, and the original C implementation of SetPanicOnFault did not do
so. The M acquisition logic is an artifact of golang.org/cl/131010044,
which was started before golang.org/cl/123640043 (which introduced the
current "getg" function) was submitted.

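The simplified shape of the function after this change, with a stub type
standing in for the runtime's g (field name approximate):

package sketch

type g struct{ paniconfault bool }

// getg stands in for the compiler intrinsic returning the current goroutine.
func getg() *g { return new(g) }

// setPanicOnFault just writes a field of the current G; no
// acquirem/releasem pair is needed for that.
func setPanicOnFault(new bool) (old bool) {
	gp := getg()
	old = gp.paniconfault
	gp.paniconfault = new
	return old
}
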
Change-Id: I6d1939008660210be46904395cf5f5bbc2c8f754
Reviewed-on: https://go-review.googlesource.com/21935
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-04-13 06:14:06 +00:00
Keith Randall 260b7daf0a cmd/compile: fix arg to getcallerpc
getcallerpc's arg needs to point to the first argument slot.
I believe this bug was introduced by Michel's itab changes
(specifically https://go-review.googlesource.com/c/20902).
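
A compilable sketch of the convention (getcallerpc here is a placeholder
for the runtime intrinsic of that era, which took the address of the
enclosing function's first argument slot; the function name is made up):

package sketch

import "unsafe"

// getcallerpc is a placeholder; the real intrinsic derives the caller's
// PC from the stack, relative to the enclosing function's first argument.
func getcallerpc(argp unsafe.Pointer) uintptr { return 0 }

func typeAssert(typ, iface unsafe.Pointer) uintptr {
	// Correct: pass the address of the FIRST argument slot.
	return getcallerpc(unsafe.Pointer(&typ))
	// The bug had the equivalent of &iface (a later argument) here.
}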

Fixes #15145

Change-Id: Ifb2e17f3658e2136c7950dfc789b4d5706320683
Reviewed-on: https://go-review.googlesource.com/21931
Reviewed-by: Michel Lespinasse <walken@google.com>
2016-04-13 00:24:38 +00:00
David Crawshaw f028b9f9e2 cmd/link, etc: store typelinks as offsets
This is the first in a series of CLs to replace the use of pointers
in binary read-only data with offsets.

In standard Go binaries these CLs have a small effect, shrinking
8-byte pointers to 4-bytes. In position-independent code, it also
saves the dynamic relocation for the pointer. This has a significant
effect on the binary size when building as PIE, c-archive, or
c-shared.

darwin/amd64:
	cmd/go: -12KB (0.1%)
	jujud:  -82KB (0.1%)

linux/amd64 PIE:
	cmd/go:  -86KB (0.7%)
	jujud:  -569KB (0.7%)

For #6853.

Change-Id: Iad5625bbeba58dabfd4d334dbee3fcbfe04b2dcf
Reviewed-on: https://go-review.googlesource.com/21284
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-12 20:32:41 +00:00
Michael Munday 78ecd61f62 runtime/cgo: add s390x support
Change-Id: I64ada9fe34c3cfc4bd514ec5d8c8f4d4c99074fb
Reviewed-on: https://go-review.googlesource.com/20950
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-12 15:42:36 +00:00
Michael Munday 7cbe7b1e86 runtime/internal/atomic: add s390x atomic operations
Load and store instructions are atomic on the s390x.

Change-Id: I0031ed2fba43f33863bca114d0fdec2e7d1ce807
Reviewed-on: https://go-review.googlesource.com/20938
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-12 15:40:48 +00:00
Dmitry Vyukov 3fafe2e888 internal/trace: support parsing of 1.5 traces
1. Parse out version from trace header.
2. Restore handling of 1.5 traces.
3. Restore optional symbolization of traces.
4. Add some canned 1.5 traces for regression testing
   (http benchmark trace, runtime/trace stress traces,
    plus one with broken timestamps).

Change-Id: Idb18a001d03ded8e13c2730eeeb37c5836e31256
Reviewed-on: https://go-review.googlesource.com/21803
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-04-11 17:56:44 +00:00
Dominik Honnef 683917a721 all: use bytes.Equal, bytes.Contains and strings.Contains, again
The previous cleanup was done with a buggy tool, missing some potential
rewrites.
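
The kinds of rewrites involved look like this (illustrative, not lifted
from the CL):

package sketch

import (
	"bytes"
	"strings"
)

func cleanedUp(a, b, s []byte, sub string) bool {
	// Before: bytes.Compare(a, b) == 0 && bytes.Index(s, b) != -1 &&
	//         strings.Index(sub, "x") >= 0
	return bytes.Equal(a, b) && bytes.Contains(s, b) && strings.Contains(sub, "x")
}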

Change-Id: I333467036e355f999a6a493e8de87e084f374e26
Reviewed-on: https://go-review.googlesource.com/21378
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-11 15:16:54 +00:00
Dave Cheney 720c4c016c runtime: merge lfstack_amd64.go into lfstack_64bit.go
Merge the amd64 lfstack implementation into the general 64 bit
implementation.

Change-Id: Id9ed61b90d2e3bc3b0246294c03eb2c92803b6ca
Reviewed-on: https://go-review.googlesource.com/21707
Run-TryBot: Dave Cheney <dave@cheney.net>
Reviewed-by: Minux Ma <minux@golang.org>
2016-04-11 06:18:52 +00:00
Jeremy Jackins ba09d06e16 runtime: remove remaining references to TheChar
After mdempsky's recent changes, these are the only references to
"TheChar" left in the Go tree. Without the context, and without
knowing the history, this is confusing.

Also rename sys.TheGoos and sys.TheGoarch to sys.GOOS
and sys.GOARCH.

Also change the heap dump format to include sys.GOARCH
rather than TheChar, which is no longer a concept.

Updates #15169 (changes heapdump format)

Change-Id: I3e99eeeae00ed55d7d01e6ed503d958c6e931dca
Reviewed-on: https://go-review.googlesource.com/21647
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-04-11 04:32:07 +00:00
Martin Möhrmann ad7448fe98 runtime: speed up makeslice by avoiding divisions
Only compute the number of maximum allowed elements per slice once.

name         old time/op  new time/op  delta
MakeSlice-2  55.5ns ± 1%  45.6ns ± 2%  -17.88%  (p=0.000 n=99+100)
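
The shape of the change, sketched (constants, names, and error handling
are simplified stand-ins for the real makeslice):

package sketch

import "errors"

const maxAlloc = 1 << 30 // illustrative allocation cap

var errRange = errors.New("makeslice: len or cap out of range")

// Divide once, then reuse the result for both the len and cap checks.
func checkMakeSlice(elemSize uintptr, len, cap int) error {
	maxElems := maxAlloc / elemSize
	if len < 0 || uintptr(len) > maxElems {
		return errRange
	}
	if cap < len || uintptr(cap) > maxElems {
		return errRange
	}
	return nil
}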

Change-Id: I951feffda5d11910a75e55d7e978d306d14da2c5
Reviewed-on: https://go-review.googlesource.com/21801
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-04-10 23:39:46 +00:00
Josh Bleecher Snyder 6b33b0e98e cmd/compile: avoid a spill in append fast path
Instead of spilling newlen, recalculate it.
This removes a spill from the fast path,
at the cost of a cheap recalculation
on the (rare) growth path.
This uses 8 bytes less of stack space.
It generates two more bytes of code,
but that is due to suboptimal register allocation;
see far below.

Runtime append microbenchmarks are all over the map,
presumably due to incidental code movement.

Sample code:

func s(b []byte) []byte {
	b = append(b, 1, 2, 3)
	return b
}

Before:

"".s t=1 size=160 args=0x30 locals=0x48
	0x0000 00000 (append.go:8)	TEXT	"".s(SB), $72-48
	0x0000 00000 (append.go:8)	MOVQ	(TLS), CX
	0x0009 00009 (append.go:8)	CMPQ	SP, 16(CX)
	0x000d 00013 (append.go:8)	JLS	149
	0x0013 00019 (append.go:8)	SUBQ	$72, SP
	0x0017 00023 (append.go:8)	FUNCDATA	$0, gclocals·6432f8c6a0d23fa7bee6c5d96f21a92a(SB)
	0x0017 00023 (append.go:8)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0017 00023 (append.go:9)	MOVQ	"".b+88(FP), CX
	0x001c 00028 (append.go:9)	LEAQ	3(CX), DX
	0x0020 00032 (append.go:9)	MOVQ	DX, "".autotmp_0+64(SP)
	0x0025 00037 (append.go:9)	MOVQ	"".b+96(FP), BX
	0x002a 00042 (append.go:9)	CMPQ	DX, BX
	0x002d 00045 (append.go:9)	JGT	$0, 86
	0x002f 00047 (append.go:8)	MOVQ	"".b+80(FP), AX
	0x0034 00052 (append.go:9)	MOVB	$1, (AX)(CX*1)
	0x0038 00056 (append.go:9)	MOVB	$2, 1(AX)(CX*1)
	0x003d 00061 (append.go:9)	MOVB	$3, 2(AX)(CX*1)
	0x0042 00066 (append.go:10)	MOVQ	AX, "".~r1+104(FP)
	0x0047 00071 (append.go:10)	MOVQ	DX, "".~r1+112(FP)
	0x004c 00076 (append.go:10)	MOVQ	BX, "".~r1+120(FP)
	0x0051 00081 (append.go:10)	ADDQ	$72, SP
	0x0055 00085 (append.go:10)	RET
	0x0056 00086 (append.go:9)	LEAQ	type.[]uint8(SB), AX
	0x005d 00093 (append.go:9)	MOVQ	AX, (SP)
	0x0061 00097 (append.go:9)	MOVQ	"".b+80(FP), BP
	0x0066 00102 (append.go:9)	MOVQ	BP, 8(SP)
	0x006b 00107 (append.go:9)	MOVQ	CX, 16(SP)
	0x0070 00112 (append.go:9)	MOVQ	BX, 24(SP)
	0x0075 00117 (append.go:9)	MOVQ	DX, 32(SP)
	0x007a 00122 (append.go:9)	PCDATA	$0, $0
	0x007a 00122 (append.go:9)	CALL	runtime.growslice(SB)
	0x007f 00127 (append.go:9)	MOVQ	40(SP), AX
	0x0084 00132 (append.go:9)	MOVQ	56(SP), BX
	0x0089 00137 (append.go:8)	MOVQ	"".b+88(FP), CX
	0x008e 00142 (append.go:9)	MOVQ	"".autotmp_0+64(SP), DX
	0x0093 00147 (append.go:9)	JMP	52
	0x0095 00149 (append.go:9)	NOP
	0x0095 00149 (append.go:8)	CALL	runtime.morestack_noctxt(SB)
	0x009a 00154 (append.go:8)	JMP	0

After:

"".s t=1 size=176 args=0x30 locals=0x40
	0x0000 00000 (append.go:8)	TEXT	"".s(SB), $64-48
	0x0000 00000 (append.go:8)	MOVQ	(TLS), CX
	0x0009 00009 (append.go:8)	CMPQ	SP, 16(CX)
	0x000d 00013 (append.go:8)	JLS	151
	0x0013 00019 (append.go:8)	SUBQ	$64, SP
	0x0017 00023 (append.go:8)	FUNCDATA	$0, gclocals·6432f8c6a0d23fa7bee6c5d96f21a92a(SB)
	0x0017 00023 (append.go:8)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0017 00023 (append.go:9)	MOVQ	"".b+80(FP), CX
	0x001c 00028 (append.go:9)	LEAQ	3(CX), DX
	0x0020 00032 (append.go:9)	MOVQ	"".b+88(FP), BX
	0x0025 00037 (append.go:9)	CMPQ	DX, BX
	0x0028 00040 (append.go:9)	JGT	$0, 81
	0x002a 00042 (append.go:8)	MOVQ	"".b+72(FP), AX
	0x002f 00047 (append.go:9)	MOVB	$1, (AX)(CX*1)
	0x0033 00051 (append.go:9)	MOVB	$2, 1(AX)(CX*1)
	0x0038 00056 (append.go:9)	MOVB	$3, 2(AX)(CX*1)
	0x003d 00061 (append.go:10)	MOVQ	AX, "".~r1+96(FP)
	0x0042 00066 (append.go:10)	MOVQ	DX, "".~r1+104(FP)
	0x0047 00071 (append.go:10)	MOVQ	BX, "".~r1+112(FP)
	0x004c 00076 (append.go:10)	ADDQ	$64, SP
	0x0050 00080 (append.go:10)	RET
	0x0051 00081 (append.go:9)	LEAQ	type.[]uint8(SB), AX
	0x0058 00088 (append.go:9)	MOVQ	AX, (SP)
	0x005c 00092 (append.go:9)	MOVQ	"".b+72(FP), BP
	0x0061 00097 (append.go:9)	MOVQ	BP, 8(SP)
	0x0066 00102 (append.go:9)	MOVQ	CX, 16(SP)
	0x006b 00107 (append.go:9)	MOVQ	BX, 24(SP)
	0x0070 00112 (append.go:9)	MOVQ	DX, 32(SP)
	0x0075 00117 (append.go:9)	PCDATA	$0, $0
	0x0075 00117 (append.go:9)	CALL	runtime.growslice(SB)
	0x007a 00122 (append.go:9)	MOVQ	40(SP), AX
	0x007f 00127 (append.go:9)	MOVQ	48(SP), CX
	0x0084 00132 (append.go:9)	MOVQ	56(SP), BX
	0x0089 00137 (append.go:9)	ADDQ	$3, CX
	0x008d 00141 (append.go:9)	MOVQ	CX, DX
	0x0090 00144 (append.go:8)	MOVQ	"".b+80(FP), CX
	0x0095 00149 (append.go:9)	JMP	47
	0x0097 00151 (append.go:9)	NOP
	0x0097 00151 (append.go:8)	CALL	runtime.morestack_noctxt(SB)
	0x009c 00156 (append.go:8)	JMP	0

Observe that in the following sequence,
we should use DX directly instead of using
CX as a temporary register, which would make
the new code a strict improvement on the old:

	0x007f 00127 (append.go:9)	MOVQ	48(SP), CX
	0x0084 00132 (append.go:9)	MOVQ	56(SP), BX
	0x0089 00137 (append.go:9)	ADDQ	$3, CX
	0x008d 00141 (append.go:9)	MOVQ	CX, DX
	0x0090 00144 (append.go:8)	MOVQ	"".b+80(FP), CX

Change-Id: I4ee50b18fa53865901d2d7f86c2cbb54c6fa6924
Reviewed-on: https://go-review.googlesource.com/21812
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-04-10 20:02:04 +00:00
Josh Bleecher Snyder 974c201f74 runtime: avoid unnecessary map iteration write barrier
Update #14921

Change-Id: I5c5816d0193757bf7465b1e09c27ca06897df4bf
Reviewed-on: https://go-review.googlesource.com/21814
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-04-10 20:01:47 +00:00
Emmanuel Odeke e4f1d9cf2e runtime: make execution error panic values implement the Error interface
Make execution panics implement Error, as
mandated by https://golang.org/ref/spec#Run_time_panics,
instead of panicking with plain strings.
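
Code that inspects recovered values can then treat these panics as
errors; for instance (assuming close of a nil channel is among the
panics converted by this CL):

package main

import "fmt"

func main() {
	defer func() {
		if err, ok := recover().(error); ok {
			fmt.Println("recovered an error value:", err) // previously a bare string
		}
	}()
	var ch chan int
	close(ch) // run-time panic: "close of nil channel"
}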

Fixes #14965

Change-Id: I7827f898b9b9c08af541db922cc24fa0800ff18a
Reviewed-on: https://go-review.googlesource.com/21214
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-10 01:16:30 +00:00
Dmitry Vyukov 0435e88a11 runtime: revert "do not call timeBeginPeriod on windows"
This reverts commit ab4c9298b8.

Sysmon critically depends on system timer resolution for retaking
of Ps blocked in system calls. See #14790 for an example
of a program where execution time goes from 2ms to 30ms if
timeBeginPeriod(1) is not used.

We can remove timeBeginPeriod(1) when we support UMS (#7876).

Update #14790

Change-Id: I362b56154359b2c52d47f9f2468fe012b481cf6d
Reviewed-on: https://go-review.googlesource.com/20834
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
2016-04-09 16:11:41 +00:00
Dmitry Vyukov 0fb7b4cccd runtime: emit file:line info into traces
This makes traces self-contained and simplifies trace workflow
in modern cloud environments where it is simpler to reach
a service via HTTP than to obtain the binary.

Change-Id: I6ff3ca694dc698270f1e29da37d5efaf4e843a0d
Reviewed-on: https://go-review.googlesource.com/21732
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2016-04-08 20:52:30 +00:00
Michael Munday e6f36f0cd5 runtime: add s390x support (new files and lfstack_64bit.go modifications)
Change-Id: I51c0a332e3cbdab348564e5dcd27583e75e4b881
Reviewed-on: https://go-review.googlesource.com/20946
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-07 18:56:54 +00:00
Dave Cheney 9cc9e95b28 Revert "runtime: merge lfstack{Pack,Unpack} into one file"
This broke solaris, which apparently does use the upper 17 bits of the address space.

This reverts commit 3b02c5b1b6.

Change-Id: Iedfe54abd0384960845468205f20191a97751c0b
Reviewed-on: https://go-review.googlesource.com/21652
Reviewed-by: Dave Cheney <dave@cheney.net>
2016-04-07 14:05:24 +00:00
Richard Miller 121c434f7a runtime/pprof: make TestBlockProfile less timing dependent
The test for profiling of channel blocking is timing dependent,
and in particular the blockSelectRecvAsync case can fail on a
slow builder (plan9_arm) when many tests are run in parallel.
The child goroutine sleeps for a fixed period so the parent
can be observed to block in a select call reading from the
child; but if the OS process running the parent goroutine is
delayed long enough, the child may wake again before the
parent has reached the blocking point.  By repeating the test
three times, the likelihood of a blocking event is increased.
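
In outline, the blocking operation is simply attempted several times
(durations and names illustrative, not the test's exact code):

package sketch

import "time"

// Even if one iteration happens not to block (the child already sent),
// the remaining iterations almost certainly will, so the block profile
// records an event.
func blockSelectRecvAsync() {
	const attempts = 3
	c := make(chan bool, 1)
	for i := 0; i < attempts; i++ {
		go func() {
			time.Sleep(20 * time.Millisecond) // let the parent reach the select
			c <- true
		}()
		select {
		case <-c:
		}
	}
}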

Fixes #15096

Change-Id: I2ddb9576a83408d06b51ded682bf8e71e53ce59e
Reviewed-on: https://go-review.googlesource.com/21604
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-07 09:57:06 +00:00
Dave Cheney 3b02c5b1b6 runtime: merge lfstack{Pack,Unpack} into one file
Merge the remaining lfstack{Pack,Unpack} implementations into one file.

unsafe.Sizeof(uintptr(0)) == 4 is a constant comparison so this branch
folds away at compile time.

Dmitry confirmed that the upper 17 bits of an address will be zero for a
user mode pointer, so there is no need to sign extend on amd64 during
unpack and we can reuse the same implementation as all other 64 bit
archs.
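
The shared packing scheme, sketched (bit counts simplified; the real
code also exploits pointer alignment for extra counter bits):

package sketch

import "unsafe"

type lfnode struct {
	next    uint64
	pushcnt uintptr
}

const cntBits = 17 // illustrative: the zero upper bits of a user-mode address

// Pack shifts the pointer up into the (known-zero) high bits and keeps
// an ABA counter in the low bits; unpack reverses the shift.
func lfstackPack(node *lfnode, cnt uintptr) uint64 {
	return uint64(uintptr(unsafe.Pointer(node)))<<cntBits | uint64(cnt&(1<<cntBits-1))
}

func lfstackUnpack(val uint64) *lfnode {
	return (*lfnode)(unsafe.Pointer(uintptr(val >> cntBits)))
}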

Change-Id: I99f589416d8b181ccde5364c9c2e78e4a5efc7f1
Reviewed-on: https://go-review.googlesource.com/21597
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
2016-04-07 09:17:22 +00:00
Michael Hudson-Doyle 31cf1c1779 runtime: clamp OS-reported number of processors to _MaxGomaxprocs
So that Go processes do not die on startup on a system with >256 CPUs.

I tested this by hacking osinit to set ncpu to 1000.
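
The fix itself is just a clamp of this shape (a sketch of the idea, not
the exact code or constant):

package sketch

const _MaxGomaxprocs = 1 << 8 // illustrative stand-in for the runtime's limit

// Derive the default number of Ps from the OS-reported CPU count, but
// never exceed the fixed upper bound.
func defaultProcs(ncpu int32) int32 {
	procs := ncpu
	if procs > _MaxGomaxprocs {
		procs = _MaxGomaxprocs
	}
	return procs
}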

Updates #15131

Change-Id: I52e061a0de97be41d684dd8b748fa9087d6f1aef
Reviewed-on: https://go-review.googlesource.com/21599
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-07 00:11:25 +00:00
Dave Cheney 0c81248bf4 runtime: remove unused return value from lfstackUnpack
Neither of the two places that call lfstackUnpack uses the second argument.
This simplifies a followup CL that merges the lfstack{Pack,Unpack}
implementations.

Change-Id: I3c93f6259da99e113d94f8c8027584da79c1ac2c
Reviewed-on: https://go-review.googlesource.com/21595
Run-TryBot: Dave Cheney <dave@cheney.net>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-06 21:04:43 +00:00
Brad Fitzpatrick 2cefd12a1b net, runtime: skip flaky tests on OpenBSD
Flaky tests are a distraction and cover up real problems.

File bugs instead and mark them as flaky.

This moves the net/http flaky test flagging mechanism to internal/testenv.
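
Standard-library tests can then opt out on a known-flaky configuration
by pointing at the tracking bug, along these lines (assuming the moved
helper is exposed as testenv.SkipFlaky; the issue number is reused from
this CL only for illustration):

package example_test

import (
	"internal/testenv"
	"runtime"
	"testing"
)

func TestSomethingFlakyOnOpenBSD(t *testing.T) {
	if runtime.GOOS == "openbsd" {
		testenv.SkipFlaky(t, 15157) // skipped unless the flaky-test flag is set
	}
	// ... the actual test body ...
}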

Updates #15156
Updates #15157
Updates #15158

Change-Id: I0e561cd2a09c0dec369cd4ed93bc5a2b40233dfe
Reviewed-on: https://go-review.googlesource.com/21614
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-06 19:28:24 +00:00
Dave Cheney 5c7ae10f66 runtime: merge 64bit lfstack impls
Merge all the 64bit lfstack impls into one file and adjust the build
tags to match.

Merge all the comments on the various lfstack implementations for
posterity.

lfstack_amd64.go can probably be merged, but it is slightly different so
that will happen in a followup.

Change-Id: I5362d5e127daa81c9cb9d4fa8a0cc5c5e5c2707c
Reviewed-on: https://go-review.googlesource.com/21591
Run-TryBot: Dave Cheney <dave@cheney.net>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
2016-04-06 06:47:31 +00:00
Brad Fitzpatrick 8455f3a3d5 os: consolidate os{1,2}_*.go files
Change-Id: I463ca59f486b2842f67f151a55f530ee10663830
Reviewed-on: https://go-review.googlesource.com/21568
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-06 05:04:47 +00:00
Brad Fitzpatrick fd2bb1e30a runtime: rename os1_windows.go to os_windows.go
Change-Id: I11172f3d0e28f17b812e67a4db9cfe513b8e1974
Reviewed-on: https://go-review.googlesource.com/21565
Reviewed-by: Minux Ma <minux@golang.org>
2016-04-06 04:26:01 +00:00
Brad Fitzpatrick e095f53e9b runtime: merge os{,2}_windows.go into os1_windows.go.
A future CL will rename os1_windows.go to os_windows.go.

Change-Id: I223e76002dd1e9c9d1798fb0beac02c7d3bf4812
Reviewed-on: https://go-review.googlesource.com/21564
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
2016-04-06 04:25:44 +00:00
Michael Munday 0f08dd2183 runtime: add s390x support (modified files only)
Change-Id: Ib79ad4a890994ad64edb1feb79bd242d26b5b08a
Reviewed-on: https://go-review.googlesource.com/20945
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-06 04:25:06 +00:00
Shenghou Ma a2eded3421 runtime: get randomness from AT_RANDOM AUXV on linux/arm64
Fixes #15147.

Change-Id: Ibfe46c747dea987787a51eb0c95ccd8c5f24f366
Reviewed-on: https://go-review.googlesource.com/21580
Run-TryBot: Minux Ma <minux@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-06 04:23:37 +00:00
Brad Fitzpatrick 34c58065e5 runtime: rename os1_linux.go to os_linux.go
Change-Id: I938f61763c3256a876d62aeb54ef8c25cc4fc90e
Reviewed-on: https://go-review.googlesource.com/21567
Reviewed-by: Minux Ma <minux@golang.org>
2016-04-06 03:55:38 +00:00
Brad Fitzpatrick 5103fbfdb2 runtime: merge os_linux.go into os1_linux.go
Change-Id: I791c47014fe69e8529c7b2f0b9a554e47902d46c
Reviewed-on: https://go-review.googlesource.com/21566
Reviewed-by: Minux Ma <minux@golang.org>
2016-04-06 03:54:47 +00:00
Brad Fitzpatrick 8556c76f88 runtime: minor Windows cleanup
Change-Id: I9a8081ef1109469e9577c642156aa635188d8954
Reviewed-on: https://go-review.googlesource.com/21538
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
2016-04-06 02:23:29 +00:00