Commit Graph

2318 Commits

Author SHA1 Message Date
Martin Möhrmann b679665a18 cmd/compile: move stringtoslicebytetmp to the backend
- removes the runtime function stringtoslicebytetmp
- removes the generation of calls to stringtoslicebytetmp from the frontend
- adds handling of OSTRARRAYBYTETMP in the backend

This reduces binary sizes and avoids function call overhead.
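
For illustration, a source-level pattern this temporary conversion can
serve (a hedged sketch, not from the CL; whether a site qualifies is up
to the compiler's analysis):

  func sum(s string) int {
    n := 0
    for _, b := range []byte(s) { // result neither escapes nor is mutated
      n += int(b)
    }
    return n
  }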

Change-Id: Ib9988d48549cee663b685b4897a483f94727b940
Reviewed-on: https://go-review.googlesource.com/32158
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-28 07:58:47 +00:00
Alex Brainman f595848e9a runtime/cgo: do not link threads lib by default on windows
I do not know why it is included. All tests pass without it.

Change-Id: I839076ee131816dfd177570a902c69fe8fba5022
Reviewed-on: https://go-review.googlesource.com/32144
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-10-28 04:36:31 +00:00
Michael Hudson-Doyle 8b07ec20f7 cmd/compile, runtime: make the go.itab.* symbols module-local
Otherwise, the way the ELF dynamic linker works means that you can end up with
the same itab being passed to additab twice, leading to the itab linked list
having a cycle in it. Add a test to additab in runtime to catch this when it
happens, not some arbitrary and surprising time later.
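
A minimal sketch of such a check on a singly linked list (generic
illustration, not the runtime's actual itab code):

  type node struct{ next *node }

  func add(head **node, n *node) {
    for p := *head; p != nil; p = p.next {
      if p == n {
        panic("node added twice") // catch the cycle at insertion time
      }
    }
    n.next = *head
    *head = n
  }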

Fixes #17594

Change-Id: I6c82edcc9ac88ac188d1185370242dc92f46b1ad
Reviewed-on: https://go-review.googlesource.com/32131
Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-27 19:13:35 +00:00
Russ Cox 5594074dcd runtime: use clock_gettime(CLOCK_REALTIME) for nanosecond-precision time.now on arm64, mips64x
Assembly copied from the clock_gettime(CLOCK_MONOTONIC)
call in runtime.nanotime in these files and then modified to use
CLOCK_REALTIME.

Also comment system call numbers in a few other files.

Fixes #11222.

Change-Id: Ie132086de7386f865908183aac2713f90fc73e0d
Reviewed-on: https://go-review.googlesource.com/32177
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-10-27 17:53:13 +00:00
Russ Cox 09692182fa runtime: print sigcode on signal crash
For #17496.

Change-Id: I671a59581c54d17bc272767eeb7b2742b54eca38
Reviewed-on: https://go-review.googlesource.com/32183
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-27 17:46:01 +00:00
Than McIntosh d80e8de54e cmd/compile: avoid truncating fieldname var locations
Don't include package path when creating LSyms for auto and param
variables during Prog generation, and update the DWARF emit routine
accordingly (remove the code that chops off package path from names in
DWARF var location expressions). Implementation suggested by mdempsky@.

The intent of this change is to have saner location expressions in cases
where the variable corresponds to a structure field. For example, the
SSA compiler's "decompose" phase can take a slice value and break it
apart into three scalar variables corresponding to the fields (slice "X"
gets split into "X.len", "X.cap", "X.ptr"). In such cases we want the
name in the location expression to omit the package path but preserve
the original variable name (e.g. "X").

Fixes #16338

Change-Id: Ibc444e7f3454b70fc500a33f0397e669d127daa1
Reviewed-on: https://go-review.googlesource.com/31819
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-10-26 21:14:46 +00:00
Austin Clements d6625caf53 runtime: scan mark worker stacks like normal
Currently, markroot delays scanning mark worker stacks until mark
termination by putting the mark worker G directly on the rescan list
when it encounters one during the mark phase. Without this, since mark
workers are non-preemptible, two mark workers that attempt to scan
each other's stacks can deadlock.

However, this is annoyingly asymmetric and causes some real problems.
First, markroot does not own the G at that point, so it's not
technically safe to add it to the rescan list. I haven't been able to
find a specific problem this could cause, but I suspect it's the root
cause of issue #17099. Second, this will interfere with the hybrid
barrier, since there is no stack rescanning during mark termination
with the hybrid barrier.

This commit switches to a different approach. We move the mark
worker's call to gcDrain to the system stack and set the mark worker's
status to _Gwaiting for the duration of the drain to indicate that
it's preemptible. This lets another mark worker scan its G stack while
the drain is running on the system stack. We don't return to the G
stack until we can switch back to _Grunning, which ensures we don't
race with a stack scan. This lets us eliminate the special case for
mark worker stack scans and scan them just like any other goroutine.
The only subtlety to this approach is that we have to disable stack
shrinking for mark workers; they could be referring to captured
variables from the G stack, so it's not safe to move their stacks.
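
A hedged sketch of the worker's drain after this change (gcDrain,
casgstatus, and systemstack are real runtime identifiers; gcw and flags
are elided details):

  systemstack(func() {
    // The worker's own G stack is inactive while we run here.
    casgstatus(gp, _Grunning, _Gwaiting) // scannable by other workers
    gcDrain(gcw, flags)
    casgstatus(gp, _Gwaiting, _Grunning) // stack scans can no longer race
  })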

Updates #17099 and #17503.

Change-Id: Ia5213949ec470af63e24dfce01df357c12adbbea
Reviewed-on: https://go-review.googlesource.com/31820
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-26 18:13:16 +00:00
Austin Clements 3193c71c5b runtime: fix bad pointer with 0 stack barriers
Currently, if the number of stack barriers for a stack is 0, we'll
create a zero-length slice that points just past the end of the stack
allocation. This bad pointer causes GC panics.

Fix this by creating a nil slice if the stack barrier count is 0.

In practice, the only way this can happen is if
GODEBUG=gcstackbarrieroff=1 is set because even the minimum size stack
reserves space for two stack barriers.
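
A sketch of the idea behind the fix (names illustrative): a zero-value
slice header is the nil slice, so only build a real header when there are
barrier slots.

  var stkbarSlice slice
  if nstkbar > 0 {
    stkbarSlice = slice{stkbarBase, 0, nstkbar}
  }
  gp.stkbar = *(*[]stkbar)(unsafe.Pointer(&stkbarSlice))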

Change-Id: I3527c9a504c445b64b81170ee285a28594e7983d
Reviewed-on: https://go-review.googlesource.com/31762
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-26 15:46:25 +00:00
Austin Clements d1cc83472d runtime: debug code to panic when marking a free object
This adds debug code enabled in gccheckmark mode that panics if we
attempt to mark an unallocated object. This is a common issue with the
hybrid barrier when we're manipulating uninitialized memory that
contains stale pointers. This also tends to catch bugs that will lead
to "sweep increased allocation count" crashes closer to the source of
the bug.

Change-Id: I443ead3eac6f316a46f50b106078b524cac317f4
Reviewed-on: https://go-review.googlesource.com/31761
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-26 15:46:00 +00:00
Austin Clements 79561a84ce runtime: simplify reflectcall write barriers
Currently reflectcall has a subtle dance with write barriers where the
assembly code copies the result values from the stack to the in-heap
argument frame without write barriers and then calls into the runtime
after the fact to invoke the necessary write barriers.

For the hybrid barrier (and for ROC), we need to switch to a
*pre*-write write barrier, which is very difficult to do with the
current setup. We could tie ourselves in knots of subtle reasoning
about why it's okay in this particular case to have a post-write write
barrier, but this commit instead takes a different approach. Rather
than making things more complex, this simplifies reflection calls so
that the argument copy is done in Go using normal bulk write barriers.

The one difficulty with this approach is that calling into Go requires
putting arguments on the stack, but the call* functions "donate" their
entire stack frame to the called function. We can get away with this
now because the copy avoids using the stack and has copied the results
out before we clobber the stack frame to call into the write barrier.
The solution in this CL is to call another function, passing arguments
in registers instead of on the stack, and let that other function
reserve more stack space and set up the arguments for the runtime.

This approach seemed to work out the best. I also tried making the
call* functions reserve 32 extra bytes of frame for the write barrier
arguments and adjust SP up by 32 bytes around the call. However, even
with the necessary changes to the assembler to correct the spdelta
table, the runtime was still having trouble with the frame layout (and
the changes to the assembler caused many other things that do strange
things with the SP to fail to assemble). The approach I took doesn't
require any funny business with the SP.
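
A hedged sketch of the Go-side copy (hypothetical helper; the point is
that an ordinary Go copy brings the normal bulk write barriers with it):

  func copyResults(typ *_type, dst, src unsafe.Pointer) {
    typedmemmove(typ, dst, src) // normal Go copy => write barriers included
  }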

Updates #17503.

Change-Id: Ie2bb0084b24d6cff38b5afb218b9e0534ad2119e
Reviewed-on: https://go-review.googlesource.com/31655
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-10-26 15:44:44 +00:00
Alexander Morozov 2481481ff7 runtime: fix comments in time.go
Change-Id: I5c501f598f41241e6d7b21d98a126827a3c3ad9a
Reviewed-on: https://go-review.googlesource.com/32018
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-10-26 03:51:33 +00:00
Keith Randall c78d072c8e Revert "Revert "cmd/compile: inline convI2E""
This reverts commit 7dd9c385f6.

Reason for revert: Reverting the revert, which will re-enable the convI2E
optimization. We originally reverted the convI2E optimization because it
was making the builder fail, but the underlying cause was later determined
to be unrelated.

Original CL: https://go-review.googlesource.com/31260
Revert CL: https://go-review.googlesource.com/31310
Real bug: https://go-review.googlesource.com/c/25159
Real fix: https://go-review.googlesource.com/c/31316

Change-Id: I17237bb577a23a7675a5caab970ccda71a4124f2
Reviewed-on: https://go-review.googlesource.com/32023
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-10-25 23:51:34 +00:00
Austin Clements 575b1dda4e runtime: eliminate allspans snapshot
Now that sweeping and span marking use the sweep list, there's no need
for the work.spans snapshot of the allspans list. This change
eliminates the few remaining uses of it, which are either dead code or
can use allspans directly, and removes work.spans and its support
functions.

Change-Id: Id5388b42b1e68e8baee853d8eafb8bb4ff95bb43
Reviewed-on: https://go-review.googlesource.com/30537
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-25 22:33:02 +00:00
Austin Clements c95a8e458f runtime: make markrootSpans time proportional to in-use spans
Currently markrootSpans iterates over all spans ever allocated to find
the in-use spans. Since we now have a list of in-use spans, change it
to iterate over that instead.

This, combined with the previous change, fixes #9265. Before these two
changes, blowing up the heap to 8GB and then shrinking it to a 0MB
live set caused the small-heap portion of the test to run 60x slower
than without the initial blowup. With these two changes, the time is
indistinguishable.

No significant effect on other benchmarks.

Change-Id: I4a27e533efecfb5d18cba3a87c0181a81d0ddc1e
Reviewed-on: https://go-review.googlesource.com/30536
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-25 22:32:59 +00:00
Austin Clements f9497a6747 runtime: make sweep time proportional to in-use spans
Currently sweeping walks the list of all spans, which means the work
in sweeping is proportional to the maximum number of spans ever used.
If the heap was once large but is now small, this causes an
amortization failure: on a small heap, GCs happen frequently, but a
full sweep still has to happen in each GC cycle, which means we spent
a lot of time in sweeping.

Fix this by creating a separate list consisting of just the in-use
spans to be swept, so sweeping is proportional to the number of in-use
spans (which is proportional to the live heap). Specifically, we
create two lists: a list of unswept in-use spans and a list of swept
in-use spans. At the start of the sweep cycle, the swept list becomes
the unswept list and the new swept list is empty. Allocating a new
in-use span adds it to the swept list. Sweeping moves spans from the
unswept list to the swept list.

This fixes the amortization problem because a shrinking heap moves
spans off the unswept list without adding them to the swept list,
reducing the time required by the next sweep cycle.
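
A minimal sketch of the two-list protocol (illustrative types; sweep() is
a stand-in for the real sweeper):

  type sweepState struct {
    unswept, swept []*span
  }

  func (s *sweepState) startCycle() {
    // Everything swept last cycle must be swept again this cycle.
    s.unswept, s.swept = s.swept, nil
  }

  func (s *sweepState) alloc(sp *span) {
    s.swept = append(s.swept, sp) // new spans start out swept
  }

  func (s *sweepState) sweepOne() bool {
    n := len(s.unswept)
    if n == 0 {
      return false // sweeping is done
    }
    sp := s.unswept[n-1]
    s.unswept = s.unswept[:n-1]
    if sweep(sp) { // still in use after sweeping
      s.swept = append(s.swept, sp)
    } // freed spans simply drop off, shrinking future cycles
    return true
  }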

Updates #9265. This fix eliminates almost all of the time spent in
sweepone; however, markrootSpans has essentially the same bug, so now
the test program from this issue spends all of its time in
markrootSpans.

No significant effect on other benchmarks.

Change-Id: Ib382e82790aad907da1c127e62b3ab45d7a4ac1e
Reviewed-on: https://go-review.googlesource.com/30535
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-25 22:32:57 +00:00
Austin Clements 45baff61e3 runtime: expand comment on work.spans
Change-Id: I4b8a6f5d9bc5aba16026d17f99f3512dacde8d2d
Reviewed-on: https://go-review.googlesource.com/30534
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-25 22:32:54 +00:00
Austin Clements 5915ce6674 runtime: use len(h.spans) to indicate mapped region
Currently we set the len and cap of h.spans to the full reserved
region of the address space and track the actual mapped region
separately in h.spans_mapped. Since we have both the len and cap at
our disposal, change things so len(h.spans) tracks how much of the
spans array is mapped and eliminate h.spans_mapped. This simplifies
mheap and means we'll get nice "index out of bounds" exceptions if we
do try to go off the end of the spans rather than a SIGSEGV.
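
The len/cap idea, sketched (names illustrative): cap spans the whole
reserved region, len only the mapped prefix, so a stray index panics
instead of faulting.

  // At reservation time: len 0, cap covers the reserved region.
  h.spans = (*[maxSpans]*mspan)(unsafe.Pointer(base))[:0:maxSpans]
  // After mapping room for n more entries:
  h.spans = h.spans[:len(h.spans)+n]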

Change-Id: I8ed9a1a9a844d90e9fd2e269add4704623dbdfe6
Reviewed-on: https://go-review.googlesource.com/30533
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-25 22:32:51 +00:00
Austin Clements 6b0f668044 runtime: consolidate h_spans and mheap_.spans
Like h_allspans and mheap_.allspans, these were two ways of referring
to the spans array from when the runtime was split between C and Go.
Clean this up by making mheap_.spans a slice and eliminating h_spans.

Change-Id: I3aa7038d53c3a4252050aa33e468c48dfed0b70e
Reviewed-on: https://go-review.googlesource.com/30532
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-25 22:32:48 +00:00
Austin Clements 66e849b168 runtime: eliminate mheap.nspan and use range loops
This was necessary in the C days when allspans was an mspan**, but now
that allspans is a Go slice, this is redundant with len(allspans) and
we can use range loops over allspans.

Change-Id: Ie1dc39611e574e29a896e01690582933f4c5be7e
Reviewed-on: https://go-review.googlesource.com/30531
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-25 22:32:45 +00:00
Austin Clements 4d6207790b runtime: consolidate h_allspans and mheap_.allspans
These are two ways to refer to the allspans array that hark back to
when the runtime was split between C and Go. Clean this up by making
mheap_.allspans a slice and eliminating h_allspans.

Change-Id: Ic9360d040cf3eb590b5dfbab0b82e8ace8525610
Reviewed-on: https://go-review.googlesource.com/30530
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-25 22:32:42 +00:00
Josh Bleecher Snyder d8f7e4fadb runtime, syscall: appease vet
No functional changes.

Change-Id: I0842b2560f4296abfc453410fdd79514132cab83
Reviewed-on: https://go-review.googlesource.com/31935
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Rob Pike <r@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-25 15:11:54 +00:00
Michael Munday 3ef07c412f cmd, runtime: remove s390x 3 operand immediate logical ops
These are emulated by the assembler and we don't need them.

Change-Id: I2b07c5315a5b642fdb5e50b468453260ae121164
Reviewed-on: https://go-review.googlesource.com/31758
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-25 12:36:06 +00:00
Russ Cox 71cf409dbd runtime: accept timeout from non-timeout semaphore wait on OS X
Looking at the kernel sources, I don't see how this is possible.
But obviously it is. Just try again.

Fixes #17161.

Change-Id: Iea7d53f7cf75944792d2f75a0d07129831c7bcdb
Reviewed-on: https://go-review.googlesource.com/31823
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-25 02:51:04 +00:00
Russ Cox 39690beb58 runtime: fix invariant comment in chan.go
Change-Id: Ic6317f186d0ee68ab1f2d15be9a966a152f61bfb
Reviewed-on: https://go-review.googlesource.com/31610
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-10-24 15:59:28 +00:00
Russ Cox 8419c85eaa runtime, cmd/link: fix netbsd/arm EABI support
Fixes reported by oshimaya (see #13806).

Fixes #13806.

Change-Id: I9b659ab918a34bc5f7c58f3d7f59058115b7f776
Reviewed-on: https://go-review.googlesource.com/31651
Reviewed-by: Minux Ma <minux@golang.org>
2016-10-24 15:23:13 +00:00
Alex Brainman 9ac60181e2 runtime/cgo: do not link math lib by default on windows
Makes windows same as others.

Change-Id: Ib4651e06d0bd37473ac345d36c91f39aa8f5e662
Reviewed-on: https://go-review.googlesource.com/31791
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-24 08:09:57 +00:00
Austin Clements 3cbfcaa4ba runtime: make mspan.isFree do what's on the tin
Currently mspan.isFree technically returns whether the object was not
allocated *during this cycle*. Fix it so it actually returns whether
or not the object is allocated so the method is more generally useful
(especially for debugging).

It has one caller, which is carefully written to be insensitive to
this distinction, but this lets us simplify this caller.

Change-Id: I9d79cf784a56015e434961733093c1d8d03fc091
Reviewed-on: https://go-review.googlesource.com/30145
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-24 02:33:39 +00:00
Austin Clements bf9c71cb43 runtime: make morestack less subtle
morestack writes the context pointer to gobuf.ctxt, but since
morestack is written in assembly (and has to be very careful with
state), it does *not* invoke the requisite write barrier for this
write. Instead, we patch this up later, in newstack, where we invoke
an explicit write barrier for ctxt.

This already requires some subtle reasoning, and it's going to get a
lot hairier with the hybrid barrier.

Fix this by simplifying the whole mechanism. Instead of writing
gobuf.ctxt in morestack, just pass the value of the context register
to newstack and let it write it to gobuf.ctxt. This is a normal Go
pointer write, so it gets the normal Go write barrier. No subtle
reasoning required.
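
A hedged sketch of the receiving side (details elided):

  func newstack(ctxt unsafe.Pointer) {
    gp := getg().m.curg
    gp.sched.ctxt = ctxt // ordinary Go pointer write => write barrier runs
    // ...copy the stack and resume the G...
  }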

Updates #17503.

Change-Id: Ia6bf8459bfefc6828f53682ade32c02412e4db63
Reviewed-on: https://go-review.googlesource.com/31550
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-10-24 02:23:16 +00:00
Josh Bleecher Snyder 448e1db103 runtime: skip TestLldbPython
The test is broken on macOS Sierra.

Updates #17463.

Change-Id: Ifbb2379c640b9353a01bc55a5cb26dfaad9b4bdc
Reviewed-on: https://go-review.googlesource.com/31725
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-22 19:17:15 +00:00
Austin Clements a73d68e75e runtime: fix call* signatures and deferArgs with siz=0
This commit fixes two bizarrely related bugs:

1. The signatures for the call* functions were wrong, indicating that
they had only two pointer arguments instead of three. We didn't notice
because the call* functions are defined by a macro expansion, which go
vet doesn't see.

2. deferArgs on a defer object with a zero-sized frame returned a
pointer just past the end of the allocated object, which is illegal in
Go (and can cause the "sweep increased allocation count" crashes).
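
A sketch of the fix for bug 2 (close to, but not verbatim, the CL):

  func deferArgs(d *_defer) unsafe.Pointer {
    if d.siz == 0 {
      // Avoid a pointer just past the end of the allocation.
      return nil
    }
    return add(unsafe.Pointer(d), unsafe.Sizeof(*d))
  }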

In a fascinating twist, these two bugs canceled each other out, which
is why I'm fixing them together. The pointer returned by deferArgs is
used in only two ways: as an argument to memmove and as an argument to
reflectcall. memmove is NOSPLIT, so the argument was unobservable.
reflectcall immediately tail calls one of the call* functions, which
are not NOSPLIT, but the deferArgs pointer just happened to be the
third argument that was accidentally marked as a scalar. Hence, when
the garbage collector scanned the stack, it didn't see the bad
pointer as a pointer.

I believe this was all ultimately benign. In principle, stack growth
during the reflectcall could fail to update the args pointer, but it
never points to the stack, so it never needs to be updated. Also in
principle, the garbage collector could fail to mark the args object
because of the incorrect call* signatures, but in all calls to
reflectcall (including the ones spelled "call" in the reflect package)
the args object is kept live by the calling stack.

Change-Id: Ic932c79d5f4382be23118fdd9dba9688e9169e28
Reviewed-on: https://go-review.googlesource.com/31654
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-10-21 16:01:32 +00:00
Austin Clements c242517866 runtime: replace *g with guintptr in trace
trace's reader *g is going to cause write barriers in unfortunate
places, so replace it with a guintptr.

Change-Id: Ie8fb13bb89a78238f9d2a77ec77da703e96df8af
Reviewed-on: https://go-review.googlesource.com/31469
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-21 16:00:20 +00:00
Alberto Donizetti 10560afb54 runtime/debug: avoid overflow in SetMaxThreads
Fixes #16076

Change-Id: I91fa87b642592ee4604537dd8c3197cd61ec8b31
Reviewed-on: https://go-review.googlesource.com/31516
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-10-20 21:08:18 +00:00
Michael Munday f1ad4863aa runtime: get s390x vector facility availability from AT_HWCAP
This is a more robust method for obtaining the availability of vx.
Since this variable may be checked frequently I've also now
padded it so that it will be in its own cache line.

I've kept the other check (in hash/crc32) the same for now until
I can figure out the best way to update it.

Updates #15403.

Change-Id: I74eed651afc6f6a9c5fa3b88fa6a2b0c9ecf5875
Reviewed-on: https://go-review.googlesource.com/31149
Reviewed-by: Austin Clements <austin@google.com>
2016-10-19 21:50:13 +00:00
Austin Clements 2be3ab4415 runtime: keep gcMarkRootCheck happy with spare Gs
oneNewExtraM creates a spare M and G for use with cgo callbacks. The G
doesn't run right away, but goes directly into syscall status. For the
garbage collector, it's marked as "scan valid" and not on the rescan
list, but I forgot to also mark it as "scan done". As a result,
gcMarkRootCheck thinks that the goroutine hasn't been scanned and
panics.

This only affects GODEBUG=gccheckmark=1 mode, since we otherwise skip
the gcMarkRootCheck.

Fixes #17473.

Change-Id: I94f5671c42eb44bd5ea7dc68fbf85f0c19e2e52c
Reviewed-on: https://go-review.googlesource.com/31139
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-19 21:36:53 +00:00
Austin Clements 7bc42a145a runtime: don't reserve space for stack barriers if they're off
Change-Id: I79ebccdaefc434c47b77bd545cc3c50723c18b61
Reviewed-on: https://go-review.googlesource.com/31135
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-19 21:36:37 +00:00
Austin Clements 5b7497f327 runtime: update heap profile stats after world is started
Updating the heap profile stats is one of the most expensive parts of
mark termination other than stack rescanning, but there's really no
need to do this with the world stopped. Move it to right after we've
started the world back up. This creates a *very* small window where
allocations from the next cycle can slip into the profile, but the
exact point where mark termination happens is so non-deterministic
already that a slight reordering here is unimportant.

Change-Id: I2f76f22c70329923ad6a594a2c26869f0736d34e
Reviewed-on: https://go-review.googlesource.com/31363
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-19 21:36:24 +00:00
Austin Clements 9429aab999 runtime: remove gcWork flushes in mark termination
The only reason these flushes are still necessary at all is that
gcmarknewobject doesn't flush its gcWork stats like it's supposed to.
By changing gcmarknewobject to follow the standard protocol, the
flushes become completely unnecessary because mark 2 ensures caches
are flushed (and stay flushed) before we ever enter mark termination.

In the garbage benchmark, this takes roughly 50 µs, which is
surprisingly long for doing nothing. We still double-check after
draining that they are in fact empty.

Change-Id: Ia1c7cf98a53f72baa513792eb33eca6a0b4a7128
Reviewed-on: https://go-review.googlesource.com/31134
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-19 21:36:09 +00:00
Ian Lance Taylor a16954b8a7 cmd/cgo: always use a function literal for pointer checking
The pointer checking code needs to know the exact type of the parameter
expected by the C function, so that it can use a type assertion to
convert the empty interface returned by cgoCheckPointer to the correct
type. Previously this was done by using a type conversion, but that
meant that the code accepted arguments that were convertible to the
parameter type, rather than arguments that were assignable as in a
normal function call. In other words, some code that should not have
passed type checking was accepted.

This CL changes cgo to always use a function literal for pointer
checking. Now the argument is passed to the function literal, which has
the correct argument type, so type checking is performed just as for a
function call as it should be.

Since we now always use a function literal, simplify the checking code
to run as a statement by itself. It now no longer needs to return a
value, and we no longer need a type assertion.

This does have the cost of introducing another function call into any
call to a C function that requires pointer checking, but the cost of the
additional call should be minimal compared to the cost of pointer
checking.
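
A hedged sketch of the generated shape for a checked call C.f(p) (the
temporary's name is illustrative, not cgo's verbatim output):

  func() {
    var _cgo0 *C.int = p    // assignment: type-checked like a normal call
    _cgoCheckPointer(_cgo0) // runs as a statement; no result, no assertion
    C.f(_cgo0)
  }()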

Fixes #16591.

Change-Id: I220165564cf69db9fd5f746532d7f977a5b2c989
Reviewed-on: https://go-review.googlesource.com/31233
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-10-19 21:20:50 +00:00
Russ Cox 40d81cf061 sync: throw, not panic, for unlock of unlocked mutex
The panic leaves the lock in an unusable state.
Trying to panic with a usable state makes the lock significantly
less efficient and scalable (see early CL patch sets and discussion).

Instead, use runtime.throw, which will crash the program directly.

In general throw is reserved for when the runtime detects truly
serious, unrecoverable problems. This problem is certainly serious,
and, without a significant performance hit, is unrecoverable.
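
Roughly the shape of the change (sketch; throw is supplied to sync by the
runtime):

  func (m *Mutex) Unlock() {
    new := atomic.AddInt32(&m.state, -mutexLocked)
    if (new+mutexLocked)&mutexLocked == 0 {
      throw("sync: unlock of unlocked mutex") // crash; state is unusable
    }
    // ...wake a waiter if needed...
  }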

Fixes #13879.

Change-Id: I41920d9e2317270c6f909957d195bd8b68177f8d
Reviewed-on: https://go-review.googlesource.com/31359
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-19 17:46:27 +00:00
Matthew Dempsky 7dd9c385f6 Revert "cmd/compile: inline convI2E"
This reverts commit 395d36a67d.

Appears to be responsible for builder failures.

Change-Id: Ic6c6307f662767e529060b88704a9f074785d99e
Reviewed-on: https://go-review.googlesource.com/31310
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-10-17 20:37:19 +00:00
Lynn Boger 2190f771d8 bytes: fix typo in ppc64le asm for Compare
Correcting a line in asm_ppc64x.s in the cmpbodyLE function
that originally was R14 but accidentally changed to R4.

Fixes #17488

Change-Id: Id4ca6fb2e0cd81251557a0627e17b5e734c39e01
Reviewed-on: https://go-review.googlesource.com/31266
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Run-TryBot: Michael Munday <munday@ca.ibm.com>
2016-10-17 20:01:17 +00:00
Austin Clements 984753b665 runtime: fix GC assist retry path
GC assists retry if preempted or if they fail to park. However, on the
retry path they currently use stale statistics. In particular, the
retry can use "debtBytes", but debtBytes isn't updated when the debt
changes (since other than retries it is only used once). Also, though
less of a problem, if the assist ratio has changed while the
assist was blocked, the retry will still use the old assist ratio.

Fix all of this by simply making the retry jump back to where we
compute these statistics, rather than just after.

Change-Id: I2ed8b4f0fc9f008ff060aa926f4334b662ac7d3f
Reviewed-on: https://go-review.googlesource.com/30701
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-17 19:09:37 +00:00
Austin Clements 81c431a537 runtime: abstract out assist queue management
This puts all of the assist queue-related code together and makes it
easier to modify how the assist queue works.

Change-Id: Id54e06702bdd5a5dd3fef2ce2c14cd7ca215303c
Reviewed-on: https://go-review.googlesource.com/30700
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-17 19:09:30 +00:00
Keith Randall 395d36a67d cmd/compile: inline convI2E
It's pretty simple.  For:
  e = (interface{})(i)
Do:
  tmp = i.itab
  if tmp != nil {
    tmp = tmp.typ_ // load type from itab
  }
  e = eface{tmp, i.data}

It is smaller and faster than calling the runtime.
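
For reference, the source-level pattern this lowers (illustrative):

  var r io.Reader = f   // non-empty interface value
  var e interface{} = r // previously a runtime call, now inlined as above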

Change-Id: I0ad27f62f4ec0b6cd53bc8530e4da0eae3e67a6c
Reviewed-on: https://go-review.googlesource.com/31260
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-10-17 19:09:07 +00:00
Austin Clements d5bd797ee5 runtime: fix getArgInfo for deferred reflection calls
getArgInfo for reflect.makeFuncStub and reflect.methodValueCall is
necessarily special. These have dynamically determined argument maps
that are stored in their context (that is, their *funcval). These
functions are written to store this context at 0(SP) when called, and
getArgInfo retrieves it from there.

This technique works if getArgInfo is passed an active call frame for
one of these functions. However, getArgInfo is also used in
tracebackdefers, where the "call" is not a true call with an active
stack frame, but a deferred call. In this situation, getArgInfo
currently crashes because tracebackdefers passes a frame with sp set
to 0. However, the entire approach used by getArgInfo is flawed in
this situation because the wrapper has not actually executed, and
hence hasn't saved this metadata to any stack frame.

In the defer case, we know the *funcval from the _defer itself, so we
can fix this by teaching getArgInfo to use the *funcval context
directly when it's available, and otherwise get it from the active call
frame.

While we're here, this commit simplifies getArgInfo a bit by making it
play more nicely with the type system. Rather than decoding the
*reflect.methodValue that is the wrapper's context as a *[2]uintptr,
just write out a copy of the reflect.methodValue type in the runtime.
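
The mirror type, as described (sketch; kept in sync with reflect):

  type reflectMethodValue struct {
    fn    uintptr
    stack *bitvector // argument bitmap
  }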

Fixes #16331. Fixes #17471.

Change-Id: I81db4d985179b4a81c68c490cceeccbfc675456a
Reviewed-on: https://go-review.googlesource.com/31138
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-10-17 18:57:01 +00:00
Austin Clements 687d9d5d78 runtime: print a message on bad morestack
If morestack runs on the g0 or gsignal stack, it currently performs
some abort operation that typically produces a signal (e.g., it does
an INT $3 on x86). This is useful if you're running in a debugger, but
if you're not, the runtime tries to trap this signal, which is likely
to send the program into a deeper spiral of collapse and lead to very
confusing diagnostic output.

Help out people trying to debug without a debugger by making morestack
print an informative message before blowing up.

Change-Id: I2814c64509b137bfe20a00091d8551d18c2c4749
Reviewed-on: https://go-review.googlesource.com/31133
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-10-17 18:56:09 +00:00
Lynn Boger 1e28dce80a bytes: improve performance for bytes.Compare on ppc64x
This improves the performance of bytes.Compare by rewriting
the cmpbody function in runtime/asm_ppc64x.s.  The previous code
had a simple loop which loaded a pair of bytes and compared them,
which is inefficient for long buffers.  The updated function checks
for 8 or 32 byte chunks and then loads and compares double words where
possible.

Because the bytes.Compare result indicates greater or less than,
the doubleword loads must take endianness into account, using a
byte reversed load in the little endian case.
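
A Go-level sketch of the chunked comparison (illustrative; the CL itself
is ppc64x assembly, with encoding/binary standing in for the loads):

  func compareChunks(a, b []byte) int {
    for len(a) >= 8 && len(b) >= 8 {
      x := binary.BigEndian.Uint64(a) // numeric order == byte order
      y := binary.BigEndian.Uint64(b)
      switch {
      case x < y:
        return -1
      case x > y:
        return 1
      }
      a, b = a[8:], b[8:]
    }
    return bytes.Compare(a, b) // tail handled byte by byte
  }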

Fixes #17433

benchmark                                   old ns/op     new ns/op     delta
BenchmarkBytesCompare/8-16                  13.6          7.16          -47.35%
BenchmarkBytesCompare/16-16                 25.7          7.83          -69.53%
BenchmarkBytesCompare/32-16                 38.1          7.78          -79.58%
BenchmarkBytesCompare/64-16                 63.0          10.6          -83.17%
BenchmarkBytesCompare/128-16                112           13.0          -88.39%
BenchmarkBytesCompare/256-16                211           28.1          -86.68%
BenchmarkBytesCompare/512-16                410           38.6          -90.59%
BenchmarkBytesCompare/1024-16               807           60.2          -92.54%
BenchmarkBytesCompare/2048-16               1601          103           -93.57%

Change-Id: I121acc74fcd27c430797647b8d682eb0607c63eb
Reviewed-on: https://go-review.googlesource.com/30949
Reviewed-by: David Chase <drchase@google.com>
2016-10-17 18:46:20 +00:00
Martin Möhrmann d295174030 runtime: speed up non-ASCII rune decoding
Copies utf8 constants and EncodeRune implementation from unicode/utf8.

Adds a new decoderune implementation that is used by the compiler
in code generated for ranging over strings. It does not handle
ASCII runes since these are handled directly before calls to decoderune.

The DecodeRuneInString implementation from unicode/utf8 is not used
since it uses a lookup table that would increase the use of cpu caches.

Adds more tests that check decoding of valid and invalid utf8 sequences.
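
The construct being served, for illustration:

  for i, r := range "héllo" { // ASCII inline; decoderune for the rest
    println(i, r)
  }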

name                              old time/op  new time/op  delta
RuneIterate/range2/ASCII-4        7.45ns ± 2%  7.45ns ± 1%     ~     (p=0.634 n=16+16)
RuneIterate/range2/Japanese-4     53.5ns ± 1%  49.2ns ± 2%   -8.03%  (p=0.000 n=20+20)
RuneIterate/range2/MixedLength-4  46.3ns ± 1%  41.0ns ± 2%  -11.57%  (p=0.000 n=20+20)

new:
"".decoderune t=1 size=423 args=0x28 locals=0x0
old:
"".charntorune t=1 size=666 args=0x28 locals=0x0

Change-Id: I1df1fdb385bb9ea5e5e71b8818ea2bf5ce62de52
Reviewed-on: https://go-review.googlesource.com/28490
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-17 11:25:22 +00:00
Austin Clements 9897e40811 runtime: use more go:nowritebarrierrec in proc.go
Currently we use go:nowritebarrier in many places in proc.go.
go:notinheap and go:yeswritebarrierrec now let us use
go:nowritebarrierrec (the recursive form of the go:nowritebarrier
pragma) more liberally. Do so in proc.go.

Change-Id: Ia7fcbc12ce6c51cb24730bf835fb7634ad53462f
Reviewed-on: https://go-review.googlesource.com/30942
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-15 17:58:23 +00:00
Austin Clements 1bc6be6423 runtime: mark several types go:notinheap
This covers basically all sysAlloc'd, persistentalloc'd, and
fixalloc'd types.

Change-Id: I0487c887c2a0ade5e33d4c4c12d837e97468e66b
Reviewed-on: https://go-review.googlesource.com/30941
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-15 17:58:20 +00:00
Austin Clements 991a85c889 runtime: make mSpanList more go:notinheap-friendly
Currently mspan links to its previous mspan using a **mspan field that
points to the previous span's next field. This simplifies some of the
list manipulation code, but is going to make it very hard to convince
the compiler that mspan list manipulations don't need write barriers.

Fix this by using a more traditional ("boring") linked list that uses
a simple *mspan pointer to the previous mspan. This complicates some
of the list manipulation slightly, but it will let us eliminate all
write barriers from the mspan list manipulation code by marking mspan
go:notinheap.

Change-Id: I0d0b212db5f20002435d2a0ed2efc8aa0364b905
Reviewed-on: https://go-review.googlesource.com/30940
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-15 17:58:17 +00:00
Austin Clements 77527a316b cmd/compile: add go:notinheap type pragma
This adds a //go:notinheap pragma for declarations of types that must
not be heap allocated. We ensure these rules by disallowing new(T),
make([]T), append([]T), or implicit allocation of T, by disallowing
conversions to notinheap types, and by propagating notinheap to any
struct or array that contains notinheap elements.
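
A sketch of usage (hypothetical type):

  //go:notinheap
  type mnode struct {
    next *mnode // a write to this field compiles to a plain store
  }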

The utility of this pragma is that we can eliminate write barriers for
writes to pointers to go:notinheap types, since the write barrier is
guaranteed to be a no-op. This will let us mark several scheduler and
memory allocator structures as go:notinheap, which will let us
disallow write barriers in the scheduler and memory allocator much
more thoroughly and also eliminate some problematic hybrid write
barriers.

This also makes go:nowritebarrierrec and go:yeswritebarrierrec much
more powerful. Currently we use go:nowritebarrier all over the place,
but it's almost never what you actually want: when write barriers are
illegal, they're typically illegal for a whole dynamic scope. Partly
this is because go:nowritebarrier has been around longer, but it's
also because go:nowritebarrierrec couldn't be used in situations that
had no-op write barriers or where some nested scope did allow write
barriers. go:notinheap eliminates many no-op write barriers and
go:yeswritebarrierrec makes it possible to opt back in to write
barriers, so these two changes will let us use go:nowritebarrierrec
far more liberally.

This updates #13386, which is about controlling pointers from non-GC'd
memory to GC'd memory. That would require some additional pragma (or
pragmas), but could build on this pragma.

Change-Id: I6314f8f4181535dd166887c9ec239977b54940bd
Reviewed-on: https://go-review.googlesource.com/30939
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-10-15 17:58:14 +00:00
Austin Clements a9e6cebde2 cmd/compile, runtime: add go:yeswritebarrierrec pragma
This pragma cancels the effect of go:nowritebarrierrec. This is useful
in the scheduler because there are places where we enter a function
without a valid P (and hence cannot have write barriers), but then
obtain a P. This allows us to annotate the function with
go:nowritebarrierrec and split out the part after we've obtained a P
into a go:yeswritebarrierrec function.
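
A sketch of the intended split (function and field names illustrative):

  //go:nowritebarrierrec
  func schedule(gp *g) {
    // No P yet: write barriers forbidden here and in callees.
    pp := acquireSomeP()
    execute1(pp, gp)
  }

  //go:yeswritebarrierrec
  func execute1(pp *p, gp *g) {
    // A P is held from here on; write barriers are allowed again.
  }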

Change-Id: Ic8ce4b6d3c074a1ecd8280ad90eaf39f0ffbcc2a
Reviewed-on: https://go-review.googlesource.com/30938
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2016-10-15 17:58:11 +00:00
Keith Randall 442de98c14 cmd/compile,runtime: redo how map assignments work
To compile:
  m[k] = v
instead of:
  mapassign(maptype, m, &k, &v)
do:
  *mapassign(maptype, m, &k) = v

mapassign returns a pointer to the value slot in the map.  It is just
like mapaccess except that it will allocate a new slot if k is not
already present in the map.

This makes map accesses faster but potentially larger (codewise).

It is faster because the write into the map is done when the compiler
knows the concrete type, so it can be done with a few store
instructions instead of calling typedmemmove.  We also potentially
avoid stack temporaries to hold v.

The code can be larger when the map has pointers in its value type,
since there is a write barrier call in addition to the mapassign call.
That makes the code at the callsite a bit bigger (go binary is 0.3%
bigger).

This CL is in preparation for doing operations like m[k] += v with
only a single runtime call.  That will roughly double the speed of
such operations.
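
The lowering, sketched at the Go level (V stands for the map's value
type):

  p := mapassign(maptype, m, unsafe.Pointer(&k)) // pointer to value slot
  *(*V)(p) = v // direct store; concrete type known at compile time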

Update #17133
Update #5147

Change-Id: Ia435f032090a2ed905dac9234e693972fe8c2dc5
Reviewed-on: https://go-review.googlesource.com/30815
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-12 20:41:23 +00:00
Emmanuel Odeke 898ca6ba0a runtime: update mkduff legacy comments
Update comments for duffzero and duffcopy
which referred to legacy locations:
+ cmd/?g/cgen.go
+ cmd/?g/ggen.go

Remnants of the old days when we had 5g, 6g etc.

Those locations have since moved to:
+ cmd/compile/internal/<arch>/cgen.go
+ cmd/compile/internal/<arch>/ggen.go

Change-Id: Ie2ea668559d52d42b747260ea69a6d5b3d70e859
Reviewed-on: https://go-review.googlesource.com/29073
Reviewed-by: Russ Cox <rsc@golang.org>
2016-10-12 14:51:50 +00:00
Joshua Boelter 1a3b739b26 runtime: check for errors returned by windows sema calls
Add checks for failure of CreateEvent, SetEvent or
WaitForSingleObject. Any failures are considered fatal and
will throw() after printing an informative message.
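
Roughly the shape of one such check (close to, but not verbatim, the CL;
stdcall4, _CreateEventA, and getlasterror are existing runtime
identifiers on windows):

  mp.waitsema = stdcall4(_CreateEventA, 0, 0, 0, 0)
  if mp.waitsema == 0 {
    print("runtime: CreateEvent failed; errno=", getlasterror(), "\n")
    throw("runtime.semacreate")
  }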

Updates #16646

Change-Id: I3bacf9001d2abfa8667cc3aff163ff2de1c99915
Reviewed-on: https://go-review.googlesource.com/26655
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-12 13:39:43 +00:00
Gyu-Ho Lee 456b7f5a97 runtime/pprof: preallocate slice in pprof.go
To prevent slice growth when appending.

Change-Id: I2cdb9b09bc33f63188b19573c8b9a77601e63801
Reviewed-on: https://go-review.googlesource.com/23783
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2016-10-12 06:33:34 +00:00
Gleb Stepanov 460d112f6c runtime: fix typo in comments
Fix typo in word synchronization in comments.

Change-Id: I453b4e799301e758799c93df1e32f5244ca2fb84
Reviewed-on: https://go-review.googlesource.com/25174
Reviewed-by: Russ Cox <rsc@golang.org>
2016-10-12 02:48:31 +00:00
Michael Munday 8728df645c runtime: remove canBackTrace variable from TestGdbPython
The canBackTrace variable is true for all of the architectures
Go supports and this is likely to remain the case as new
architectures are added.

Change-Id: I73900c018eb4b2e5c02fccd8d3e89853b2ba9d90
Reviewed-on: https://go-review.googlesource.com/22423
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-12 02:16:11 +00:00
Matthew Dempsky 303b69feb7 cmd/compile, runtime: stop padding stackmaps to 4 bytes
Shrinks cmd/go's text segment by 0.9%.

   text	   data	    bss	    dec	    hex	filename
6447148	 231643	 146328	6825119	 68249f	go.before
6387404	 231643	 146328	6765375	 673b3f	go.after

Change-Id: I431e8482dbb11a7c1c77f2196cada43d5dad2981
Reviewed-on: https://go-review.googlesource.com/30817
Reviewed-by: Austin Clements <austin@google.com>
2016-10-11 20:23:30 +00:00
Cherry Zhang 7c431cb7f9 cmd/link: insert trampolines for too-far jumps on ARM
ARM direct CALL/JMP instruction has 24 bit offset, which can only
encode jumps within +/-32M. When the target is too far, the top
bits get truncated and the program jumps wild.
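
For illustration, the reachability test for a 24-bit branch (helper is
hypothetical; offsets are in words, and ARM reads PC 8 bytes ahead):

  func inRange(pc, target int64) bool {
    off := (target - pc - 8) / 4
    return off >= -1<<23 && off < 1<<23
  }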

This CL detects too-far jumps and automatically inserts trampolines,
currently only for internal linking on ARM.

It is necessary to make the following changes to the linker:
- Resolve direct jump relocs when assigning addresses to functions.
  This allows trampoline insertion without moving code that has
  already been laid down.
- Lay down packages in dependency order, so that when resolving an
  inter-package direct jump reloc, the target address is already
  known. Intra-package jumps are assumed never too far.
- a linker flag -debugtramp is added for debugging trampolines:
    "-debugtramp=1 -v" prints trampoline debug message
    "-debugtramp=2"    forces all inter-package jump to use
                       trampolines (currently ARM only)
    "-debugtramp=2 -v" does both
- Some data structures are changed for bookkeeping.

On ARM, pseudo DIV/DIVU/MOD/MODU instructions now clobber R8
(unfortunate). In the standard library there is no ARM assembly
code that uses these instructions, and the compiler no longer emits
them (CL 29390).

all.bash passes with -debugtramp=2, except a disassembly test (this
is unavoidable as we changed the instruction).

TBD: debug info of trampolines?

Fixes #17028.

Change-Id: Idcce347ea7e0af77c4079041a160b2f6e114b474
Reviewed-on: https://go-review.googlesource.com/29397
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-11 13:35:33 +00:00
Ian Lance Taylor d03e8b226c runtime: record current PC for SIGPROF on non-Go thread
If we get a SIGPROF on a non-Go thread, and the program has not called
runtime.SetCgoTraceback so we have no way to collect a stack trace, then
record a profile that is just the PC where the signal occurred. That
will at least point the user to the right area.

Retrieving the PC from the sigctxt in a signal handler on a non-G thread
required marking a number of trivial sigctxt methods as nosplit, and,
for extra safety, nowritebarrierrec.

The test shows that the existing test CgoPprofThread test does not test
the stack trace, just the profile signal. Leaving that for later.

Change-Id: I8f8f3ff09ac099fc9d9df94b5a9d210ffc20c4ab
Reviewed-on: https://go-review.googlesource.com/30252
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2016-10-11 12:56:15 +00:00
Alex Brainman dd307da10c runtime/cgo: do not explicitly link msvcrt.dll
CL 14472 solved issue #12030 by explicitly linking msvcrt.dll
to every cgo executable we build. This CL achieves the same
by manually loading msvcrt.dll during startup.

Updates #12030

Change-Id: I5d9cd925ef65cc34c5d4031c750f0f97794529b2
Reviewed-on: https://go-review.googlesource.com/30737
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
2016-10-11 05:45:06 +00:00
Austin Clements 94589054d3 cmd/trace: label mark termination spans as such
Currently these are labeled "MARK", which was accurate in the STW
collector, but these really indicate mark termination now, since
marking happens for the full duration of the concurrent GC. Re-label
them as "MARK TERMINATION" to clarify this.

Change-Id: Ie98bd961195acde49598b4fa3f9e7d90d757c0a6
Reviewed-on: https://go-review.googlesource.com/30018
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2016-10-07 18:33:23 +00:00
Austin Clements fa9b57bb1d runtime: make next_gc ^0 when GC is disabled
When GC is disabled, we set gcpercent to -1. However, we still use
gcpercent to compute several values, such as next_gc and gc_trigger.
These calculations are meaningless when gcpercent is -1 and result in
meaningless values. This is okay in a sense because we also never use
these values if gcpercent is -1, but they're confusing when exposed to
the user, for example via MemStats or the execution trace. It's
particularly unfortunate in the execution trace because it attempts to
plot the underflowed value of next_gc, which scales all useful
information in the heap row into oblivion.

Fix this by making next_gc ^0 when gcpercent < 0. This has the
advantage of being true in a way: next_gc is effectively infinite when
gcpercent < 0. We can also detect this special value when updating the
execution trace and report next_gc as 0 so it doesn't blow up the
display of the heap line.
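
Sketched (memstats.next_gc is the runtime's field; surrounding code
elided):

  if gcpercent < 0 {
    memstats.next_gc = ^uint64(0) // effectively infinite: GC is off
  }
  // When emitting the trace:
  nextGC := memstats.next_gc
  if nextGC == ^uint64(0) {
    nextGC = 0 // keep the heap row readable
  }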

Change-Id: I4f366e4451f8892a4908da7b2b6086bdc67ca9a9
Reviewed-on: https://go-review.googlesource.com/30016
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-07 18:32:51 +00:00
Ian Lance Taylor 15937ccb89 runtime: fix sigset type for ppc64 big-endian GNU/Linux
On 64-bit big-endian GNU/Linux machines we need to treat sigset as a
single uint64, not as a pair of uint32 values. This fix was already made
for s390x, but not for ppc64 (which is big-endian--the little endian
version is known as ppc64le). So copy os_linux_s390x.go to
os_linux_be64.go, and use build constraints as needed.
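
The shape of the shared file, sketched from the description:

  // +build linux
  // +build ppc64 s390x

  type sigset uint64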

Fixes #17361

Change-Id: Ia0eb18221a8f5056bf17675fcfeb010407a13fb0
Reviewed-on: https://go-review.googlesource.com/30602
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-06 22:24:40 +00:00
Brad Fitzpatrick 4103fedf19 runtime: skip gdb tests on linux/ppc64 for now
Updates #17366

Change-Id: Ia4bd3c74c48b85f186586184a7c2b66d3b80fc9c
Reviewed-on: https://go-review.googlesource.com/30596
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-10-06 19:52:09 +00:00
Denis Nagorny d7507e9d11 runtime: improve memmove for amd64
Use AVX if available on 4th generation of Intel(TM) Core(TM) processors.

    (collected on E5 2609v3 @1.9GHz)
    name                        old speed      new speed       delta
    Memmove/1-6                  158MB/s ± 0%    172MB/s ± 0%    +9.09% (p=0.000 n=16+16)
    Memmove/2-6                  316MB/s ± 0%    345MB/s ± 0%    +9.09% (p=0.000 n=18+16)
    Memmove/3-6                  517MB/s ± 0%    517MB/s ± 0%      ~ (p=0.445 n=16+16)
    Memmove/4-6                  687MB/s ± 1%    690MB/s ± 0%    +0.35% (p=0.000 n=20+17)
    Memmove/5-6                  729MB/s ± 0%    729MB/s ± 0%    +0.01% (p=0.000 n=16+18)
    Memmove/6-6                  875MB/s ± 0%    875MB/s ± 0%    +0.01% (p=0.000 n=18+18)
    Memmove/7-6                 1.02GB/s ± 0%   1.02GB/s ± 1%      ~ (p=0.139 n=19+20)
    Memmove/8-6                 1.26GB/s ± 0%   1.26GB/s ± 0%    +0.00% (p=0.000 n=18+18)
    Memmove/9-6                 1.42GB/s ± 0%   1.42GB/s ± 0%    +0.00% (p=0.000 n=17+18)
    Memmove/10-6                1.58GB/s ± 0%   1.58GB/s ± 0%    +0.00% (p=0.000 n=19+19)
    Memmove/11-6                1.74GB/s ± 0%   1.74GB/s ± 0%    +0.00% (p=0.001 n=18+17)
    Memmove/12-6                1.90GB/s ± 0%   1.90GB/s ± 0%    +0.00% (p=0.000 n=19+19)
    Memmove/13-6                2.05GB/s ± 0%   2.05GB/s ± 0%    +0.00% (p=0.000 n=18+19)
    Memmove/14-6                2.21GB/s ± 0%   2.21GB/s ± 0%    +0.00% (p=0.000 n=16+20)
    Memmove/15-6                2.37GB/s ± 0%   2.37GB/s ± 0%    +0.00% (p=0.004 n=19+20)
    Memmove/16-6                2.53GB/s ± 0%   2.53GB/s ± 0%    +0.00% (p=0.000 n=16+16)
    Memmove/32-6                4.67GB/s ± 0%   4.67GB/s ± 0%    +0.00% (p=0.000 n=17+17)
    Memmove/64-6                8.67GB/s ± 0%   8.64GB/s ± 0%    -0.33% (p=0.000 n=18+17)
    Memmove/128-6               12.6GB/s ± 0%   11.6GB/s ± 0%    -8.05% (p=0.000 n=16+19)
    Memmove/256-6               16.3GB/s ± 0%   16.6GB/s ± 0%    +1.66% (p=0.000 n=20+18)
    Memmove/512-6               21.5GB/s ± 0%   24.4GB/s ± 0%   +13.35% (p=0.000 n=18+17)
    Memmove/1024-6              24.7GB/s ± 0%   33.7GB/s ± 0%   +36.12% (p=0.000 n=18+18)
    Memmove/2048-6              27.3GB/s ± 0%   43.3GB/s ± 0%   +58.77% (p=0.000 n=19+17)
    Memmove/4096-6              37.5GB/s ± 0%   50.5GB/s ± 0%   +34.56% (p=0.000 n=19+19)
    MemmoveUnalignedDst/1-6      135MB/s ± 0%    146MB/s ± 0%    +7.69% (p=0.000 n=16+14)
    MemmoveUnalignedDst/2-6      271MB/s ± 0%    292MB/s ± 0%    +7.69% (p=0.000 n=18+18)
    MemmoveUnalignedDst/3-6      438MB/s ± 0%    438MB/s ± 0%      ~ (p=0.352 n=16+19)
    MemmoveUnalignedDst/4-6      584MB/s ± 0%    584MB/s ± 0%      ~ (p=0.876 n=17+17)
    MemmoveUnalignedDst/5-6      631MB/s ± 1%    632MB/s ± 0%    +0.25% (p=0.000 n=20+17)
    MemmoveUnalignedDst/6-6      759MB/s ± 0%    759MB/s ± 0%    +0.00% (p=0.000 n=19+16)
    MemmoveUnalignedDst/7-6      885MB/s ± 0%    883MB/s ± 1%      ~ (p=0.647 n=18+20)
    MemmoveUnalignedDst/8-6     1.08GB/s ± 0%   1.08GB/s ± 0%    +0.00% (p=0.035 n=19+18)
    MemmoveUnalignedDst/9-6     1.22GB/s ± 0%   1.22GB/s ± 0%      ~ (p=0.251 n=18+17)
    MemmoveUnalignedDst/10-6    1.35GB/s ± 0%   1.35GB/s ± 0%      ~ (p=0.327 n=17+18)
    MemmoveUnalignedDst/11-6    1.49GB/s ± 0%   1.49GB/s ± 0%      ~ (p=0.531 n=18+19)
    MemmoveUnalignedDst/12-6    1.63GB/s ± 0%   1.63GB/s ± 0%      ~ (p=0.886 n=19+18)
    MemmoveUnalignedDst/13-6    1.76GB/s ± 0%   1.76GB/s ± 1%    -0.24% (p=0.006 n=18+20)
    MemmoveUnalignedDst/14-6    1.90GB/s ± 0%   1.90GB/s ± 0%      ~ (p=0.818 n=20+19)
    MemmoveUnalignedDst/15-6    2.03GB/s ± 0%   2.03GB/s ± 0%      ~ (p=0.294 n=17+16)
    MemmoveUnalignedDst/16-6    2.17GB/s ± 0%   2.17GB/s ± 0%      ~ (p=0.602 n=16+18)
    MemmoveUnalignedDst/32-6    4.05GB/s ± 0%   4.05GB/s ± 0%    +0.00% (p=0.010 n=18+17)
    MemmoveUnalignedDst/64-6    7.59GB/s ± 0%   7.59GB/s ± 0%    +0.00% (p=0.022 n=18+16)
    MemmoveUnalignedDst/128-6   11.1GB/s ± 0%   11.4GB/s ± 0%    +2.79% (p=0.000 n=18+17)
    MemmoveUnalignedDst/256-6   16.4GB/s ± 0%   16.7GB/s ± 0%    +1.59% (p=0.000 n=20+17)
    MemmoveUnalignedDst/512-6   15.7GB/s ± 0%   21.3GB/s ± 0%   +35.87% (p=0.000 n=18+20)
    MemmoveUnalignedDst/1024-6  16.0GB/s ±20%   31.5GB/s ± 0%   +96.93% (p=0.000 n=20+14)
    MemmoveUnalignedDst/2048-6  19.6GB/s ± 0%   42.1GB/s ± 0%  +115.16% (p=0.000 n=17+18)
    MemmoveUnalignedDst/4096-6  6.41GB/s ± 0%  33.18GB/s ± 0%  +417.56% (p=0.000 n=17+18)
    MemmoveUnalignedSrc/1-6      171MB/s ± 0%    166MB/s ± 0%    -3.33% (p=0.000 n=19+16)
    MemmoveUnalignedSrc/2-6      343MB/s ± 0%    342MB/s ± 1%    -0.41% (p=0.000 n=17+20)
    MemmoveUnalignedSrc/3-6      508MB/s ± 0%    493MB/s ± 1%    -2.90% (p=0.000 n=17+17)
    MemmoveUnalignedSrc/4-6      677MB/s ± 0%    660MB/s ± 2%    -2.55% (p=0.000 n=17+20)
    MemmoveUnalignedSrc/5-6      790MB/s ± 0%    790MB/s ± 0%      ~ (p=0.139 n=17+17)
    MemmoveUnalignedSrc/6-6      948MB/s ± 0%    946MB/s ± 1%      ~ (p=0.330 n=17+19)
    MemmoveUnalignedSrc/7-6     1.11GB/s ± 0%   1.11GB/s ± 0%    -0.05% (p=0.026 n=17+17)
    MemmoveUnalignedSrc/8-6     1.38GB/s ± 0%   1.38GB/s ± 0%      ~ (p=0.091 n=18+16)
    MemmoveUnalignedSrc/9-6     1.42GB/s ± 0%   1.40GB/s ± 1%    -1.04% (p=0.000 n=19+20)
    MemmoveUnalignedSrc/10-6    1.58GB/s ± 0%   1.56GB/s ± 1%    -1.15% (p=0.000 n=18+19)
    MemmoveUnalignedSrc/11-6    1.73GB/s ± 0%   1.71GB/s ± 1%    -1.30% (p=0.000 n=20+20)
    MemmoveUnalignedSrc/12-6    1.89GB/s ± 0%   1.87GB/s ± 1%    -1.18% (p=0.000 n=17+20)
    MemmoveUnalignedSrc/13-6    2.05GB/s ± 0%   2.02GB/s ± 1%    -1.18% (p=0.000 n=17+20)
    MemmoveUnalignedSrc/14-6    2.21GB/s ± 0%   2.18GB/s ± 1%    -1.14% (p=0.000 n=17+20)
    MemmoveUnalignedSrc/15-6    2.36GB/s ± 0%   2.34GB/s ± 1%    -1.04% (p=0.000 n=17+20)
    MemmoveUnalignedSrc/16-6    2.52GB/s ± 0%   2.49GB/s ± 1%    -1.26% (p=0.000 n=19+20)
    MemmoveUnalignedSrc/32-6    4.82GB/s ± 0%   4.61GB/s ± 0%    -4.40% (p=0.000 n=19+20)
    MemmoveUnalignedSrc/64-6    5.03GB/s ± 4%   7.97GB/s ± 0%   +58.55% (p=0.000 n=20+16)
    MemmoveUnalignedSrc/128-6   11.1GB/s ± 0%   11.2GB/s ± 0%    +0.52% (p=0.000 n=17+18)
    MemmoveUnalignedSrc/256-6   16.5GB/s ± 0%   16.4GB/s ± 0%    -0.10% (p=0.000 n=20+18)
    MemmoveUnalignedSrc/512-6   21.0GB/s ± 0%   22.1GB/s ± 0%    +5.48% (p=0.000 n=14+17)
    MemmoveUnalignedSrc/1024-6  24.9GB/s ± 0%   31.9GB/s ± 0%   +28.20% (p=0.000 n=19+20)
    MemmoveUnalignedSrc/2048-6  23.3GB/s ± 0%   33.8GB/s ± 0%   +45.22% (p=0.000 n=17+19)
    MemmoveUnalignedSrc/4096-6  37.3GB/s ± 0%   42.7GB/s ± 0%   +14.30% (p=0.000 n=17+17)

Change-Id: Id66aa3e499ccfb117cb99d623ef326b50d057b64
Reviewed-on: https://go-review.googlesource.com/29590
Run-TryBot: Denis Nagorny <denis.nagorny@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-10-06 10:21:58 +00:00
Ian Lance Taylor e5421e21ef runtime: add threadprof tag for test that starts busy thread
The CgoExternalThreadSIGPROF test starts a thread at constructor time
that does a busy loop. That can throw off some other tests. So only
build that code if testprogcgo is built with the tag threadprof, and
adjust the tests that use that code to pass that build tag.

This revealed that the CgoPprofThread test was not testing what it
should have, as it never actually started the cpuHog thread. It was
passing because of the busy loop thread. Fix it to start the thread as
intended.

Change-Id: I087a9e4fc734a86be16a287456441afac5676beb
Reviewed-on: https://go-review.googlesource.com/30362
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-06 01:23:09 +00:00
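A minimal sketch of the gating pattern described in the entry above
(file contents hypothetical; the real busy loop lives in C constructor
code inside testprogcgo). The file compiles only when the threadprof
tag is given, e.g. via go test -tags=threadprof:

    // +build threadprof

    package main

    // startBusyThread stands in for the constructor-time busy loop;
    // without -tags=threadprof this whole file is excluded from the build.
    func startBusyThread() {
        go func() {
            for {
            }
        }()
    }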
Lynn Boger 3107c91e2d runtime: memclr perf improvements on ppc64x
This updates runtime/memclr_ppc64x.s to improve performance,
by unrolling loops for larger clears.

Fixes #17348

benchmark                    old MB/s     new MB/s     speedup
BenchmarkMemclr/5-80         199.71       406.63       2.04x
BenchmarkMemclr/16-80        693.66       1817.41      2.62x
BenchmarkMemclr/64-80        2309.35      5793.34      2.51x
BenchmarkMemclr/256-80       5428.18      14765.81     2.72x
BenchmarkMemclr/4096-80      8611.65      27191.94     3.16x
BenchmarkMemclr/65536-80     8736.69      28604.23     3.27x
BenchmarkMemclr/1M-80        9304.94      27600.09     2.97x
BenchmarkMemclr/4M-80        8705.66      27589.64     3.17x
BenchmarkMemclr/8M-80        8575.74      23631.04     2.76x
BenchmarkMemclr/16M-80       8443.10      19240.68     2.28x
BenchmarkMemclr/64M-80       8390.40      9493.04      1.13x
BenchmarkGoMemclr/5-80       263.05       630.37       2.40x
BenchmarkGoMemclr/16-80      904.33       1148.49      1.27x
BenchmarkGoMemclr/64-80      2830.20      8756.70      3.09x
BenchmarkGoMemclr/256-80     6064.59      20299.46     3.35x

Change-Id: Ic76c9183c8b4129ba3df512ca8b0fe6bd424e088
Reviewed-on: https://go-review.googlesource.com/30373
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
2016-10-05 22:18:14 +00:00
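The change itself is ppc64x assembly, but the unrolling idea is easy to
state in Go. A rough sketch (not the actual runtime code): clear eight
words per loop iteration, then finish the tail one word at a time.

    func memclrUnrolled(b []uintptr) {
        i := 0
        for ; i+8 <= len(b); i += 8 {
            // one unrolled iteration: eight stores, one loop-condition check
            b[i], b[i+1], b[i+2], b[i+3] = 0, 0, 0, 0
            b[i+4], b[i+5], b[i+6], b[i+7] = 0, 0, 0, 0
        }
        for ; i < len(b); i++ { // tail
            b[i] = 0
        }
    }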
Cherry Zhang 4c9a372946 runtime, cmd/internal/obj: get rid of rewindmorestack
In the function prologue, we emit a jump to the beginning of
the function immediately after calling morestack. And in the
runtime stack growing code, it decodes and emulates that jump.
This emulation was necessary before we had per-PC SP deltas,
since the traceback code assumed that the frame size was fixed
for the whole function, except on the first instruction where
it was 0. Since we now have per-PC SP deltas and PCDATA, we
can correctly record that the frame size is 0. This makes the
emulation unnecessary.

This may be helpful for a registerized calling convention, where
there may be unspills of arguments after calling morestack. It
also simplifies the runtime.

Change-Id: I7ebee31eaee81795445b33f521ab6a79624c4ceb
Reviewed-on: https://go-review.googlesource.com/30138
Reviewed-by: David Chase <drchase@google.com>
2016-10-05 18:19:46 +00:00
Michael Munday f15f1ff46f runtime/testdata/testprogcgo: add explicit return value to signalThread
Should fix the clang builder.

Change-Id: I3ee34581b6a7ec902420de72a8a08a2426997782
Reviewed-on: https://go-review.googlesource.com/30363
Run-TryBot: Michael Munday <munday@ca.ibm.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-05 15:36:00 +00:00
Ian Lance Taylor 6c13a1db2e runtime: don't call cgocallback from signal handler
Calling cgocallback from a signal handler can fail when using the race
detector. Calling cgocallback will lead to a call to newextram which
will call oneNewExtraM which will call racegostart. The racegostart
function will set up some race detector data structures, and doing that
will sometimes call the C memory allocator. If we are running the signal
handler from a signal that interrupted the C memory allocator, we will
crash or hang.

Instead, change the signal handler code to call needm and dropm. The
needm function will grab allocated m and g structures and initialize the
g to use the current stack--the signal stack. That is all we need to
safely call code that allocates memory and checks whether it needs to
split the stack. This may temporarily leave us with no m available to
run a cgo callback, but that is OK in this case since the code we call
will quickly either crash or call dropm to return the m.

Implementing this required changing some of the setSignalstackSP
functions to avoid a write barrier. These functions never need a write
barrier but in some cases generated one anyhow because on some systems
the ss_sp field is a pointer.

Change-Id: I3893f47c3a66278f85eab7f94c1ab11d4f3be133
Reviewed-on: https://go-review.googlesource.com/30218
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2016-10-05 13:21:49 +00:00
Ian Lance Taylor 7faf702396 runtime: avoid endless loop if printing the panic value panics
Change-Id: I56de359a5ccdc0a10925cd372fa86534353c6ca0
Reviewed-on: https://go-review.googlesource.com/30358
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-05 13:13:27 +00:00
Carl Mastrangelo c1e267cc73 runtime: make append only clear uncopied memory
Also add a benchmark that shows off the new behavior.  The
existing benchmarks reuse the same slice, and thus don't ever have
to clear memory.  Running the Append|Grow benchmarks in runtime:

name                              old time/op  new time/op  delta
AppendSliceLarge/1024Bytes-12      265ns ± 1%   265ns ± 3%     ~     (p=0.524 n=17+20)
AppendSliceLarge/4096Bytes-12      807ns ± 3%   772ns ± 1%   -4.38%  (p=0.000 n=20+20)
AppendSliceLarge/16384Bytes-12    3.20µs ± 4%  2.82µs ± 4%  -11.93%  (p=0.000 n=19+20)
AppendSliceLarge/65536Bytes-12    13.0µs ± 4%  11.0µs ± 3%  -15.22%  (p=0.000 n=20+20)
AppendSliceLarge/262144Bytes-12   62.7µs ± 1%  51.6µs ± 1%  -17.67%  (p=0.000 n=19+20)
AppendSliceLarge/1048576Bytes-12   337µs ± 3%   289µs ± 3%  -14.36%  (p=0.000 n=20+20)
GrowSliceBytes-12                 31.2ns ± 4%  31.4ns ±11%     ~     (p=0.308 n=19+18)
GrowSliceInts-12                  53.4ns ±14%  45.0ns ± 6%  -15.74%  (p=0.000 n=20+19)
GrowSlicePtr-12                   87.0ns ± 3%  83.3ns ± 3%   -4.26%  (p=0.000 n=18+17)
GrowSliceStruct24Bytes-12         88.9ns ± 5%  77.8ns ± 2%  -12.45%  (p=0.000 n=20+19)
Append-12                         17.2ns ± 1%  17.3ns ± 2%     ~     (p=0.464 n=18+17)
AppendGrowByte-12                 2.28ms ± 1%  1.92ms ± 2%  -15.65%  (p=0.000 n=20+18)
AppendGrowString-12                255ms ± 3%   253ms ± 4%     ~     (p=0.065 n=19+19)
AppendSlice/1Bytes-12             3.13ns ± 0%  3.11ns ± 1%   -0.65%  (p=0.000 n=17+18)
AppendSlice/4Bytes-12             3.02ns ± 2%  3.11ns ± 1%   +3.27%  (p=0.000 n=18+17)
AppendSlice/7Bytes-12             4.14ns ± 3%  4.13ns ± 2%     ~     (p=0.380 n=19+18)
AppendSlice/8Bytes-12             3.74ns ± 3%  3.68ns ± 1%   -1.76%  (p=0.000 n=19+18)
AppendSlice/15Bytes-12            4.03ns ± 2%  4.04ns ± 2%     ~     (p=0.261 n=19+20)
AppendSlice/16Bytes-12            4.03ns ± 2%  4.03ns ± 0%     ~     (p=0.062 n=18+17)
AppendSlice/32Bytes-12            3.23ns ± 4%  3.43ns ± 1%   +6.10%  (p=0.000 n=17+18)
AppendStr/1Bytes-12               3.51ns ± 1%  3.52ns ± 1%     ~     (p=0.321 n=18+19)
AppendStr/4Bytes-12               3.46ns ± 1%  3.46ns ± 1%     ~     (p=0.977 n=18+20)
AppendStr/8Bytes-12               3.18ns ± 1%  3.19ns ± 1%     ~     (p=0.650 n=16+17)
AppendStr/16Bytes-12              6.08ns ±27%  5.52ns ± 3%   -9.16%  (p=0.002 n=18+19)
AppendStr/32Bytes-12              3.71ns ± 1%  3.53ns ± 1%   -4.73%  (p=0.000 n=20+19)
AppendSpecialCase-12              17.7ns ± 1%  17.8ns ± 3%   +0.86%  (p=0.045 n=17+18)
AppendInPlace/NoGrow/Byte-12       375ns ± 1%   376ns ± 1%   +0.35%  (p=0.021 n=20+18)
AppendInPlace/NoGrow/1Ptr-12      1.01µs ± 1%  1.10µs ± 1%   +9.28%  (p=0.000 n=18+20)
AppendInPlace/NoGrow/2Ptr-12      1.85µs ± 2%  1.71µs ± 1%   -7.51%  (p=0.000 n=19+18)
AppendInPlace/NoGrow/3Ptr-12      2.57µs ± 2%  2.44µs ± 1%   -5.08%  (p=0.000 n=19+19)
AppendInPlace/NoGrow/4Ptr-12      3.52µs ± 2%  3.35µs ± 2%   -4.70%  (p=0.000 n=20+19)
AppendInPlace/Grow/Byte-12         212ns ± 1%   217ns ± 8%   +2.57%  (p=0.000 n=20+20)
AppendInPlace/Grow/1Ptr-12         214ns ± 2%   217ns ± 3%   +1.23%  (p=0.001 n=18+19)
AppendInPlace/Grow/2Ptr-12         298ns ± 2%   300ns ± 2%   +0.55%  (p=0.038 n=19+20)
AppendInPlace/Grow/3Ptr-12         367ns ± 2%   366ns ± 2%     ~     (p=0.452 n=20+18)
AppendInPlace/Grow/4Ptr-12         416ns ± 2%   411ns ± 2%   -1.18%  (p=0.000 n=20+19)
StackGrowth-12                    43.4ns ± 1%  43.4ns ± 0%     ~     (p=1.000 n=16+16)
StackGrowthDeep-12                11.4µs ± 4%  10.3µs ± 4%   -9.65%  (p=0.000 n=20+19)

Change-Id: I69a8afbd942c787c591d95b9d9439bd6db4d1e49
Reviewed-on: https://go-review.googlesource.com/30192
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-04 22:40:20 +00:00
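The key point of the new benchmarks is that the destination must be
freshly allocated each iteration, so that growth (and hence clearing of
uncopied memory) is actually exercised. A simplified sketch of that
shape (hypothetical names, not the benchmark as committed):

    package append_test

    import "testing"

    func BenchmarkAppendFreshSlice(b *testing.B) {
        src := make([]byte, 64<<10)
        b.SetBytes(int64(len(src)))
        for i := 0; i < b.N; i++ {
            dst := make([]byte, 0, 16) // tiny cap: append must grow
            dst = append(dst, src...)
        }
    }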
Nick Craig-Wood 6b795e77df runtime: correct function name in throw message
Change-Id: I8fd271066925734c3f7196f64db04f27c4ce27cb
Reviewed-on: https://go-review.googlesource.com/30274
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-04 13:48:07 +00:00
Brad Fitzpatrick ad26bb5e30 all: use sort.Slice where applicable
I avoided the compiler, and anything that might be used by the
compiler in the future, since those still need to build with Go 1.4.

I also avoided places where there was no benefit to changing it.

I probably missed some.

Updates #16721

Change-Id: Ib3c895ff475c6dec2d4322393faaf8cb6a6d4956
Reviewed-on: https://go-review.googlesource.com/30250
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
2016-10-04 05:10:56 +00:00
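The rewrite applied throughout is mechanical: a three-method
sort.Interface implementation plus a sort.Sort call collapses to a
single sort.Slice call with a less closure. A small self-contained
example of the after state:

    package main

    import (
        "fmt"
        "sort"
    )

    func main() {
        files := []struct {
            Name string
            Size int
        }{{"b", 20}, {"a", 10}}
        sort.Slice(files, func(i, j int) bool { return files[i].Size < files[j].Size })
        fmt.Println(files)
    }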
Austin Clements 7aab88a31e runtime: fix missing space in error message
Change-Id: I422708d50c3c727246e7991039877660ca034dc9
Reviewed-on: https://go-review.googlesource.com/30144
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-03 22:00:13 +00:00
Austin Clements bf776a988b runtime: document bmap.tophash
In particular, it wasn't obvious that some values are special unless
you had also found those special values, so document that the field
doesn't necessarily hold a hash value.

Change-Id: Iff292822b44408239e26cd882dc07be6df2c1d38
Reviewed-on: https://go-review.googlesource.com/30143
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-03 22:00:06 +00:00
Austin Clements 38f1df66ff runtime: make gcDumpObject useful on stack frames
gcDumpObject is often used on a stack pointer (for example, when
checkmark finds an unmarked object on the stack), but since stack
spans don't have an elemsize, it doesn't print any of the memory from
the frame. Make it at least slightly more useful by printing
everything between obj and obj+off (inclusive). While we're here, also
print out the span state.

Change-Id: I51be064ea8791b4a365865bfdc7afa7b5aaecfbd
Reviewed-on: https://go-review.googlesource.com/30142
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-03 21:59:54 +00:00
Austin Clements 6879dbde4e runtime: introduce a type for span states
Currently span states are untyped constants and the field is just a
uint8. Make this more type-safe by introducing a type for the span
state.

Change-Id: I369bf59fe6e8234475f4921611424fceb7d0a6de
Reviewed-on: https://go-review.googlesource.com/30141
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-03 21:59:45 +00:00
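The change follows the usual typed-constant idiom; roughly (names
approximate, not the runtime's exact declarations):

    type mSpanState uint8

    const (
        mSpanDead mSpanState = iota
        mSpanInUse
        mSpanStack
        mSpanFree
    )

    // Assigning a value of a different integer type to the state field
    // now requires an explicit conversion, catching mix-ups at build time.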
Austin Clements 99339dd445 runtime: weaken claim about SetFinalizer panicking
Currently the SetFinalizer documentation makes a strong claim that
SetFinalizer will panic if the pointer is not to an object allocated
by calling new, to a composite literal, or to a local variable. This
is not true. For example, it doesn't panic when passed the address of
a package-level variable. Nor can we practically make it true. For
example, we can't distinguish between passing a pointer to a composite
literal and passing a pointer to its first field.

Hence, weaken the guarantee to say that it "may" panic.

Updates #17311. (Might fix it, depending on what we want to do with
package-level variables.)

Change-Id: I1c68ea9d0a5bbd3dd1b7ce329d92b0f05e2e0877
Reviewed-on: https://go-review.googlesource.com/30137
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-03 16:12:48 +00:00
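The case called out above, as a runnable example; this compiles and
runs without panicking, even though the old wording said it must panic:

    package main

    import "runtime"

    var global int

    func main() {
        // &global is not heap-allocated; SetFinalizer accepts it anyway
        // (the finalizer simply never runs).
        runtime.SetFinalizer(&global, func(*int) {})
    }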
Mike Appleby 360f2e43b7 runtime: sleep on CLOCK_MONOTONIC in futexsleep1 on freebsd
In FreeBSD 10.0, the _umtx_op syscall was changed to allow sleeping on
any supported clock, but the default clock was switched from a monotonic
clock to CLOCK_REALTIME.

Prior to 10.0, the __umtx_op_wait* functions ignored the fourth argument
to _umtx_op (uaddr1), expected the fifth argument (uaddr2) to be a
struct timespec pointer, and used a monotonic clock (nanouptime(9)) for
timeout calculations.

Since 10.0, if callers want a clock other than CLOCK_REALTIME, they must
call _umtx_op with uaddr1 set to a value greater than sizeof(struct
timespec), and with uaddr2 as pointer to a struct _umtx_time, rather
than a timespec. Callers can set the _clockid field of the struct
_umtx_time to request the clock they want.

The relevant FreeBSD commit:
    https://svnweb.freebsd.org/base?view=revision&revision=232144

Fixes #17168

Change-Id: I3dd7b32b683622b8d7b4a6a8f9eb56401bed6bdf
Reviewed-on: https://go-review.googlesource.com/30154
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-01 01:25:21 +00:00
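The shape this implies on the caller's side, sketched in Go (field
names and layout mirror FreeBSD's struct _umtx_time as described above;
treat this as an illustration, not the runtime's exact code):

    type timespec struct {
        tv_sec  int64
        tv_nsec int64
    }

    type umtxTime struct {
        timeout timespec // relative timeout by default
        flags   uint32
        clockid uint32 // e.g. CLOCK_MONOTONIC to keep pre-10.0 behavior
    }

    // _umtx_op(addr, UMTX_OP_WAIT_UINT_PRIVATE, val,
    //          uaddr1: sizeof(umtxTime),  // > sizeof(timespec) selects the new form
    //          uaddr2: &ut)               // *umtxTime instead of *timespec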
David Crawshaw 441502154f runtime: remove defer from standard cgo call
The endcgo function call is currently deferred in case a cgo
callback into Go panics and unwinds through cgocall. Typical cgo
calls do not make callbacks into Go, and of those that do, even fewer
panic, so we pay the cost of this defer for no benefit in the typical
case.

Amazingly, there is another defer on the cgocallback path also used
to clean up in case the Go code called by cgo panics. This CL folds
the first defer into the second, to reduce the cost of typical cgo
calls.

This reduces the overhead for a no-op cgo call significantly:

	name       old time/op  new time/op  delta
	CgoNoop-8  93.5ns ± 0%  51.1ns ± 1%  -45.34%  (p=0.016 n=4+5)

The total effect between Go 1.7 and 1.8 is even greater, as CL 29656
reduced the cost of defer recently. Hopefully a future Go release
will drop the cost of defer to nothing, making this optimization
unnecessary. But until then, this is nice.

Change-Id: Id1a5648f687a87001d95bec6842e4054bd20ee4f
Reviewed-on: https://go-review.googlesource.com/30080
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-09-30 13:31:07 +00:00
Matthew Dempsky 9828b7c468 runtime, syscall: use FP instead of SP for parameters
Consistently access function parameters using the FP pseudo-register
instead of SP (e.g., x+0(FP) instead of x+4(SP) or x+8(SP), depending
on register size). Two reasons: 1) doc/asm says the SP pseudo-register
should use negative offsets in the range [-framesize, 0), and 2)
cmd/vet only validates parameter offsets when indexed from the FP
pseudo-register.

No binary changes to the compiled object files for any of the affected
package/OS/arch combinations.

Change-Id: I0efc6079bc7519fcea588c114ec6a39b245d68b0
Reviewed-on: https://go-review.googlesource.com/30085
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-30 05:40:43 +00:00
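An illustration of the convention with a hypothetical amd64 function
declared in Go as func add(x, y int64) int64; every parameter reference
is a named offset from FP, which is exactly what cmd/vet can validate:

    #include "textflag.h"

    // func add(x, y int64) int64
    TEXT ·add(SB), NOSPLIT, $0-24
        MOVQ x+0(FP), AX
        ADDQ y+8(FP), AX
        MOVQ AX, ret+16(FP)
        RET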
Mikio Hara a0d330f1dd runtime, runtime/cgo: revert CL 18814; don't drop signal stack in new thread on dragonfly
This change reverts CL 18814, which was a workaround for older
DragonFly BSD kernels, and fixes #13945 and #13947 in a more general
way, the same as on other platforms except NetBSD.

This is a followup to CL 29491.

Updates #16329.

Change-Id: I771670bc672c827f2b3dbc7fd7417c49897cb991
Reviewed-on: https://go-review.googlesource.com/29971
Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-09-28 14:45:06 +00:00
Ian Lance Taylor eb268cb321 runtime: minor simplifications to signal code
Change setsig, setsigstack, getsig, raise, and raiseproc to take uint32
for the signal number parameter, as that is the type most commonly used
for signal numbers. Same for dieFromSignal, sigInstallGoHandler, and
raisebadsignal.

Remove setsig restart parameter, as it is always either true or
irrelevant.

Don't check the handler in setsigstack, as the only caller does that
anyhow.

Don't bother to convert the handler from sigtramp to sighandler in
getsig, as it will never be called when the handler is sigtramp or
sighandler.

Don't check the return value from rt_sigaction in the GNU/Linux version
of setsigstack; no other setsigstack checks it, and it never fails.

Change-Id: I6bbd677e048a77eddf974dd3d017bc3c560fbd48
Reviewed-on: https://go-review.googlesource.com/29953
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-28 13:12:47 +00:00
Alex Brainman db82cf4e50 runtime: use RtlGenRandom instead of CryptGenRandom
This change replaces the use of CryptGenRandom with RtlGenRandom in
Windows to generate cryptographically random numbers during process
startup. RtlGenRandom uses the same RNG as CryptGenRandom, but it has many
fewer DLL dependencies and so does not affect process startup time as
much.

This makes running a simple Go program on my computers faster.

Windows XP:
benchmark                      old ns/op     new ns/op     delta
BenchmarkRunningGoProgram-2     47408573      10784148      -77.25%

Windows 7 (VM):
benchmark                    old ns/op     new ns/op     delta
BenchmarkRunningGoProgram     16260390      12792150      -21.33%

Windows 7:
benchmark                      old ns/op     new ns/op     delta
BenchmarkRunningGoProgram-2     13600778      10050574      -26.10%

Fixes #15589

Change-Id: I2816239a2056e3d4a6dcd86a6fa2bb619c6008fe
Reviewed-on: https://go-review.googlesource.com/29700
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-28 05:21:53 +00:00
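RtlGenRandom is exported from advapi32.dll under the name
SystemFunction036, so user code can reach it the same way; a
Windows-only sketch:

    package main

    import (
        "fmt"
        "syscall"
        "unsafe"
    )

    var procRtlGenRandom = syscall.NewLazyDLL("advapi32.dll").NewProc("SystemFunction036")

    // rtlGenRandom fills buf with cryptographically random bytes.
    func rtlGenRandom(buf []byte) bool {
        r, _, _ := procRtlGenRandom.Call(
            uintptr(unsafe.Pointer(&buf[0])),
            uintptr(len(buf)),
        )
        return r != 0
    }

    func main() {
        b := make([]byte, 16)
        fmt.Println(rtlGenRandom(b), b)
    }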
Ian Lance Taylor fdc167164e runtime: remove sigmask type, use sigset instead
The OS-independent sigmask type was not pulling its weight. Replace it
with the OS-dependent sigset type. This requires adding an OS-specific
sigaddset function, but permits removing the OS-specific sigmaskToSigset
function.

Change-Id: I43307b512b0264ec291baadaea902f05ce212305
Reviewed-on: https://go-review.googlesource.com/29950
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-27 21:33:44 +00:00
Ian Lance Taylor 097a581dc0 runtime: simplify signalstack by dropping nil as argument
Change the two calls to signalstack(nil) to inline the code
instead (it's two lines).

Change-Id: Ie92a05494f924f279e40ac159f1b677fda18f281
Reviewed-on: https://go-review.googlesource.com/29854
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-27 17:08:29 +00:00
Elias Naur 60482d8a8b runtime: relax SetFinalizer documentation to allow &local
The SetFinalizer documentation states that

"The argument obj must be a pointer to an object allocated by calling
new or by taking the address of a composite literal."

which precludes pointers to local variables. According to a comment
on #6591, this case is expected to work. This CL updates the documentation
for SetFinalizer accordingly.

Fixes #6591

Change-Id: Id861b3436bc1c9521361ea2d51c1ce74a121c1af
Reviewed-on: https://go-review.googlesource.com/29592
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-09-27 16:38:47 +00:00
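The now-documented case, as a small contrived example; &x escapes to
the heap, so it behaves like new(int) as far as the collector is
concerned:

    package main

    import "runtime"

    func leak() *int {
        x := 42 // local variable whose address escapes
        runtime.SetFinalizer(&x, func(p *int) { println("finalized", *p) })
        return &x
    }

    func main() {
        _ = leak()
    }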
Michael Munday 3436f0776f reflect, runtime: optimize Value.Call on s390x and add benchmark
Use an MVC loop to copy arguments in runtime.call* rather than copying
bytes individually.

I've added the benchmark CallArgCopy to test the speed of Value.Call
for various argument sizes.

name                    old speed      new speed       delta
CallArgCopy/size=128     439MB/s ± 1%    582MB/s ± 1%   +32.41%  (p=0.000 n=10+10)
CallArgCopy/size=256     695MB/s ± 1%   1172MB/s ± 1%   +68.67%  (p=0.000 n=10+10)
CallArgCopy/size=1024    573MB/s ± 8%   4175MB/s ± 2%  +628.11%  (p=0.000 n=10+10)
CallArgCopy/size=4096   1.46GB/s ± 2%  10.19GB/s ± 1%  +600.52%  (p=0.000 n=10+10)
CallArgCopy/size=65536  1.51GB/s ± 0%  12.30GB/s ± 1%  +716.30%   (p=0.000 n=9+10)

Change-Id: I87dae4809330e7964f6cb4a9e40e5b3254dd519d
Reviewed-on: https://go-review.googlesource.com/28096
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-27 16:00:19 +00:00
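A sketch of the benchmark's shape (simplified; the committed version
parameterizes the size): reflect must copy the entire array argument on
every Call, which is the copy the MVC loop speeds up.

    package reflect_test

    import (
        "reflect"
        "testing"
    )

    func BenchmarkCallArgCopy1K(b *testing.B) {
        fn := reflect.ValueOf(func([1024]byte) {})
        args := []reflect.Value{reflect.ValueOf([1024]byte{})}
        b.SetBytes(1024)
        for i := 0; i < b.N; i++ {
            fn.Call(args)
        }
    }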
Cherry Zhang 9d4b40f55d runtime, cmd/compile: implement and use DUFFCOPY on ARM64
Change-Id: I8984eac30e5df78d4b94f19412135d3cc36969f8
Reviewed-on: https://go-review.googlesource.com/29910
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-09-27 15:07:31 +00:00
Austin Clements f8b2314c56 runtime: optimize defer code
This optimizes deferproc and deferreturn in various ways.

The most important optimization is that it more carefully arranges to
prevent preemption or stack growth. Currently we do this by switching
to the system stack on every deferproc and every deferreturn. While we
need to be on the system stack for the slow path of allocating and
freeing defers, in the common case we can fit in the nosplit stack.
Hence, this change pushes the system stack switch down into the slow
paths and makes everything now exposed to the user stack nosplit. This
also eliminates the need for various acquirem/releasem pairs, since we
are now preventing preemption by preventing stack split checks.

As another smaller optimization, we special case the common cases of
zero-sized and pointer-sized defer frames to respectively skip the
copy and perform the copy in line instead of calling memmove.

This speeds up the runtime defer benchmark by 42%:

name           old time/op  new time/op  delta
Defer-4        75.1ns ± 1%  43.3ns ± 1%  -42.31%   (p=0.000 n=8+10)

In reality, this speeds up defer by about 2.2X. The two benchmarks
below compare a Lock/defer Unlock pair (DeferLock) with a Lock/Unlock
pair (NoDeferLock). NoDeferLock establishes a baseline cost, so these
two benchmarks together show that this change reduces the overhead of
defer from 61.4ns to 27.9ns.

name           old time/op  new time/op  delta
DeferLock-4    77.4ns ± 1%  43.9ns ± 1%  -43.31%  (p=0.000 n=10+10)
NoDeferLock-4  16.0ns ± 0%  15.9ns ± 0%   -0.39%    (p=0.000 n=9+8)

This also shaves 34ns off cgo calls:

name       old time/op  new time/op  delta
CgoNoop-4   122ns ± 1%  88.3ns ± 1%  -27.72%  (p=0.000 n=8+9)

Updates #14939, #16051.

Change-Id: I2baa0dea378b7e4efebbee8fca919a97d5e15f38
Reviewed-on: https://go-review.googlesource.com/29656
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-26 22:01:35 +00:00
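The DeferLock/NoDeferLock pair described above boils down to this shape
(a sketch, not the committed benchmark source):

    package defer_test

    import (
        "sync"
        "testing"
    )

    var mu sync.Mutex

    func withDefer() {
        mu.Lock()
        defer mu.Unlock()
    }

    func withoutDefer() {
        mu.Lock()
        mu.Unlock()
    }

    func BenchmarkDeferLock(b *testing.B) {
        for i := 0; i < b.N; i++ {
            withDefer()
        }
    }

    func BenchmarkNoDeferLock(b *testing.B) {
        for i := 0; i < b.N; i++ {
            withoutDefer()
        }
    }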
Austin Clements d211c2d377 runtime: implement getcallersp in Go
This makes it possible to inline getcallersp. getcallersp is on the
hot path of defers, so this slightly speeds up defer:

name           old time/op  new time/op  delta
Defer-4        78.3ns ± 2%  75.1ns ± 1%  -4.00%   (p=0.000 n=9+8)

Updates #14939.

Change-Id: Icc1cc4cd2f0a81fc4c8344432d0b2e783accacdd
Reviewed-on: https://go-review.googlesource.com/29655
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-26 22:01:32 +00:00
Austin Clements aaf4099a5c runtime: update malloc.go documentation
The big documentation comment at the top of malloc.go has gotten
woefully out of date. Update it.

Change-Id: Ibdb1bdcfdd707a6dc9db79d0633a36a28882301b
Reviewed-on: https://go-review.googlesource.com/29731
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-09-26 22:00:53 +00:00
Austin Clements f67c9de656 runtime: document MemStats
This documents all fields in MemStats and more clearly documents where
mstats differs from MemStats.

Fixes #15849.

Change-Id: Ie09374bcdb3a5fdd2d25fe4bba836aaae92cb1dd
Reviewed-on: https://go-review.googlesource.com/28972
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2016-09-26 22:00:50 +00:00
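For reference, the documented fields are read from user code like this:

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        var ms runtime.MemStats
        runtime.ReadMemStats(&ms)
        fmt.Printf("HeapAlloc=%d NextGC=%d NumGC=%d\n",
            ms.HeapAlloc, ms.NextGC, ms.NumGC)
    }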
Austin Clements 2098e5d39a runtime: eliminate memstats.heap_reachable
We used to compute an estimate of the reachable heap size that was
different from the marked heap size. This ultimately caused more
problems than it solved, so we pulled it out, but memstats still has
both heap_reachable and heap_marked, and there are some leftover TODOs
about the problems with this estimate.

Clean this up by eliminating heap_reachable in favor of heap_marked
and deleting the stale TODOs.

Change-Id: I713bc20a7c90683d2b43ff63c0b21a440269cc4d
Reviewed-on: https://go-review.googlesource.com/29271
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-09-26 22:00:47 +00:00
Austin Clements ec9c84c884 runtime: disentangle next_gc from GC trigger
Back in Go 1.4, memstats.next_gc was both the heap size at which GC
would trigger, and the size GC kept the heap under. When we switched
to concurrent GC in Go 1.5, we got somewhat confused and made this
variable the trigger heap size, while gcController.heapGoal became the
goal heap size.

memstats.next_gc is exposed to the user via MemStats.NextGC, while
gcController.heapGoal is not. This is unfortunate because 1) the heap
goal is far more useful for diagnostics, and 2) the trigger heap size
is just part of the GC trigger heuristic, which means it wouldn't be
useful to an application even if it tried to use it.

We never noticed this mess because MemStats.NextGC is practically
undocumented. Now that we're trying to document MemStats, it became
clear that this field had diverged from its original usefulness.

Clean up this mess by shuffling things back around so that next_gc is
the goal heap size and the new (unexposed) memstats.gc_trigger field
is the trigger heap size. This eliminates gcController.heapGoal.

Updates #15849.

Change-Id: I2cbbd43b1d78bdf613cb43f53488bd63913189b7
Reviewed-on: https://go-review.googlesource.com/29270
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-09-26 22:00:44 +00:00
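With next_gc as the goal, the pacing relationship becomes visible to
users via MemStats.NextGC; roughly (an approximation of the heuristic,
not the runtime's exact code):

    // With the default GOGC=100 the goal is about twice the marked heap.
    func heapGoal(heapMarked, gogc uint64) uint64 {
        return heapMarked + heapMarked*gogc/100
    }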
Ian Lance Taylor e2e11f02a4 runtime: unify Unix implementations of unminit
Change-Id: I2cbb13eb85876ad05a52cbd498a9b86e7a28899c
Reviewed-on: https://go-review.googlesource.com/29772
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-26 19:33:26 +00:00