Commit Graph

2454 Commits

Author SHA1 Message Date
Keith Randall 5a75d6a08e cmd/compile: optimize non-empty-interface type conversions
When doing i.(T) for non-empty-interface i and concrete type T,
there's no need to read the type out of the itab. Just compare the
itab to the itab we expect for that interface/type pair.

Also optimize type switches by putting the type hash of the
concrete type in the itab. That way we don't need to load the
type pointer out of the itab.

Update #18492

Change-Id: I49e280a21e5687e771db5b8a56b685291ac168ce
Reviewed-on: https://go-review.googlesource.com/34810
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
2017-02-13 18:16:31 +00:00
Josh Bleecher Snyder 8da91a6297 runtime: add Frames example
Based on sample code from iant.

Fixes #18788.

Change-Id: I6bb33ed05af2538fbde42ddcac629280ef7c00a6
Reviewed-on: https://go-review.googlesource.com/36892
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-02-13 06:10:35 +00:00
Russ Cox 45c6f59e1f runtime: use two-level list for semaphore address search in semaRoot
If there are many goroutines contending for two different locks
and both locks hash to the same semaRoot, the scans to find the
goroutines for a particular lock can end up being O(n), making
n lock acquisitions quadratic.

As long as only one actively-used lock hashes to each semaRoot
there's no problem, since the list operations in that case are O(1).
But when the second actively-used lock hits the same semaRoot,
then scans for entries with for a given lock have to scan over the
entries for the other lock.

Fix this problem by changing the semaRoot to hold only one sudog
per unique address. In the running example, this drops the length of
that list from O(n) to 2. Then attach other goroutines waiting on the
same address to a separate list headed by the sudog in the semaRoot list.
Those "same address list" operations are still O(1), so now the
example from above works much better.

There is still an assumption here that in real programs you don't have
many many goroutines queueing up on many many distinct addresses.
If we end up with that problem, we can replace the top-level list with
a treap.

Fixes #17953.

Change-Id: I78c5b1a5053845275ab31686038aa4f6db5720b2
Reviewed-on: https://go-review.googlesource.com/36792
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-02-12 15:54:16 +00:00
Josh Bleecher Snyder 2c91bb4c8a cmd/compile: make panicwrap argument-free
When code defines a method on T,
the compiler generates a corresponding wrapper method on *T.
The first thing the wrapper does is check whether
the pointer is nil and if so, call panicwrap.
This is done to provide a useful error message.

The existing implementation gets its information
from arguments set up by the compiler.
However, with some trouble, this information can
be extracted from the name of the wrapper method itself.

Removing the arguments to panicwrap simplifies and
shrinks the wrapper method.
It also means that the call to panicwrap does not
require any stack space.
This enables a further optimization on amd64/x86,
which is to skip the function prologue if nothing
else in the method requires stack space.
This is frequently the case in simple, hot methods,
such as Less and Swap in sort.Interface implementations.

Fixes #19040.

Benchmarks for package sort on amd64:

name                  old time/op  new time/op  delta
SearchWrappers-8       104ns ± 1%   104ns ± 1%    ~     (p=0.286 n=27+27)
SortString1K-8         128µs ± 1%   128µs ± 1%  -0.44%  (p=0.004 n=30+30)
SortString1K_Slice-8   118µs ± 2%   117µs ± 1%    ~     (p=0.106 n=30+30)
StableString1K-8      18.6µs ± 1%  18.6µs ± 1%    ~     (p=0.446 n=28+26)
SortInt1K-8           65.9µs ± 1%  60.7µs ± 1%  -7.96%  (p=0.000 n=28+30)
StableInt1K-8         75.3µs ± 2%  72.8µs ± 1%  -3.41%  (p=0.000 n=30+30)
StableInt1K_Slice-8   57.7µs ± 1%  57.7µs ± 1%    ~     (p=0.515 n=30+30)
SortInt64K-8          6.28ms ± 1%  6.01ms ± 1%  -4.19%  (p=0.000 n=28+28)
SortInt64K_Slice-8    5.04ms ± 1%  5.04ms ± 1%    ~     (p=0.927 n=28+27)
StableInt64K-8        6.65ms ± 1%  6.38ms ± 1%  -3.97%  (p=0.000 n=26+30)
Sort1e2-8             37.9µs ± 1%  37.2µs ± 1%  -1.89%  (p=0.000 n=29+27)
Stable1e2-8           77.0µs ± 1%  74.7µs ± 1%  -3.06%  (p=0.000 n=27+30)
Sort1e4-8             8.21ms ± 2%  7.98ms ± 1%  -2.77%  (p=0.000 n=29+30)
Stable1e4-8           24.8ms ± 1%  24.3ms ± 1%  -2.31%  (p=0.000 n=28+30)
Sort1e6-8              1.27s ± 4%   1.22s ± 1%  -3.42%  (p=0.000 n=30+29)
Stable1e6-8            5.06s ± 1%   4.92s ± 1%  -2.77%  (p=0.000 n=25+29)
[Geo mean]             731µs        714µs       -2.29%

Before/after assembly for sort.(*intPairs).Less follows.
It can be optimized further, but that's for a follow-up CL.

Before:

"".(*intPairs).Less t=1 size=214 args=0x20 locals=0x38
	0x0000 00000 (<autogenerated>:1)	TEXT	"".(*intPairs).Less(SB), $56-32
	0x0000 00000 (<autogenerated>:1)	MOVQ	(TLS), CX
	0x0009 00009 (<autogenerated>:1)	CMPQ	SP, 16(CX)
	0x000d 00013 (<autogenerated>:1)	JLS	204
	0x0013 00019 (<autogenerated>:1)	SUBQ	$56, SP
	0x0017 00023 (<autogenerated>:1)	MOVQ	BP, 48(SP)
	0x001c 00028 (<autogenerated>:1)	LEAQ	48(SP), BP
	0x0021 00033 (<autogenerated>:1)	MOVQ	32(CX), BX
	0x0025 00037 (<autogenerated>:1)	TESTQ	BX, BX
	0x0028 00040 (<autogenerated>:1)	JEQ	55
	0x002a 00042 (<autogenerated>:1)	LEAQ	64(SP), DI
	0x002f 00047 (<autogenerated>:1)	CMPQ	(BX), DI
	0x0032 00050 (<autogenerated>:1)	JNE	55
	0x0034 00052 (<autogenerated>:1)	MOVQ	SP, (BX)
	0x0037 00055 (<autogenerated>:1)	NOP
	0x0037 00055 (<autogenerated>:1)	FUNCDATA	$0, gclocals·4032f753396f2012ad1784f398b170f4(SB)
	0x0037 00055 (<autogenerated>:1)	FUNCDATA	$1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
	0x0037 00055 (<autogenerated>:1)	MOVQ	""..this+64(FP), AX
	0x003c 00060 (<autogenerated>:1)	TESTQ	AX, AX
	0x003f 00063 (<autogenerated>:1)	JEQ	$0, 135
	0x0041 00065 (<autogenerated>:1)	MOVQ	(AX), CX
	0x0044 00068 (<autogenerated>:1)	MOVQ	8(AX), AX
	0x0048 00072 (<autogenerated>:1)	MOVQ	"".i+72(FP), DX
	0x004d 00077 (<autogenerated>:1)	CMPQ	DX, AX
	0x0050 00080 (<autogenerated>:1)	JCC	$0, 128
	0x0052 00082 (<autogenerated>:1)	SHLQ	$4, DX
	0x0056 00086 (<autogenerated>:1)	MOVQ	(CX)(DX*1), DX
	0x005a 00090 (<autogenerated>:1)	MOVQ	"".j+80(FP), BX
	0x005f 00095 (<autogenerated>:1)	CMPQ	BX, AX
	0x0062 00098 (<autogenerated>:1)	JCC	$0, 128
	0x0064 00100 (<autogenerated>:1)	SHLQ	$4, BX
	0x0068 00104 (<autogenerated>:1)	MOVQ	(CX)(BX*1), AX
	0x006c 00108 (<autogenerated>:1)	CMPQ	DX, AX
	0x006f 00111 (<autogenerated>:1)	SETLT	AL
	0x0072 00114 (<autogenerated>:1)	MOVB	AL, "".~r2+88(FP)
	0x0076 00118 (<autogenerated>:1)	MOVQ	48(SP), BP
	0x007b 00123 (<autogenerated>:1)	ADDQ	$56, SP
	0x007f 00127 (<autogenerated>:1)	RET
	0x0080 00128 (<autogenerated>:1)	PCDATA	$0, $1
	0x0080 00128 (<autogenerated>:1)	CALL	runtime.panicindex(SB)
	0x0085 00133 (<autogenerated>:1)	UNDEF
	0x0087 00135 (<autogenerated>:1)	LEAQ	go.string."sort_test"(SB), AX
	0x008e 00142 (<autogenerated>:1)	MOVQ	AX, (SP)
	0x0092 00146 (<autogenerated>:1)	MOVQ	$9, 8(SP)
	0x009b 00155 (<autogenerated>:1)	LEAQ	go.string."intPairs"(SB), AX
	0x00a2 00162 (<autogenerated>:1)	MOVQ	AX, 16(SP)
	0x00a7 00167 (<autogenerated>:1)	MOVQ	$8, 24(SP)
	0x00b0 00176 (<autogenerated>:1)	LEAQ	go.string."Less"(SB), AX
	0x00b7 00183 (<autogenerated>:1)	MOVQ	AX, 32(SP)
	0x00bc 00188 (<autogenerated>:1)	MOVQ	$4, 40(SP)
	0x00c5 00197 (<autogenerated>:1)	PCDATA	$0, $1
	0x00c5 00197 (<autogenerated>:1)	CALL	runtime.panicwrap(SB)
	0x00ca 00202 (<autogenerated>:1)	UNDEF
	0x00cc 00204 (<autogenerated>:1)	NOP
	0x00cc 00204 (<autogenerated>:1)	PCDATA	$0, $-1
	0x00cc 00204 (<autogenerated>:1)	CALL	runtime.morestack_noctxt(SB)
	0x00d1 00209 (<autogenerated>:1)	JMP	0

After:

"".(*intPairs).Swap t=1 size=147 args=0x18 locals=0x8
	0x0000 00000 (<autogenerated>:1)	TEXT	"".(*intPairs).Swap(SB), $8-24
	0x0000 00000 (<autogenerated>:1)	MOVQ	(TLS), CX
	0x0009 00009 (<autogenerated>:1)	SUBQ	$8, SP
	0x000d 00013 (<autogenerated>:1)	MOVQ	BP, (SP)
	0x0011 00017 (<autogenerated>:1)	LEAQ	(SP), BP
	0x0015 00021 (<autogenerated>:1)	MOVQ	32(CX), BX
	0x0019 00025 (<autogenerated>:1)	TESTQ	BX, BX
	0x001c 00028 (<autogenerated>:1)	JEQ	43
	0x001e 00030 (<autogenerated>:1)	LEAQ	16(SP), DI
	0x0023 00035 (<autogenerated>:1)	CMPQ	(BX), DI
	0x0026 00038 (<autogenerated>:1)	JNE	43
	0x0028 00040 (<autogenerated>:1)	MOVQ	SP, (BX)
	0x002b 00043 (<autogenerated>:1)	NOP
	0x002b 00043 (<autogenerated>:1)	FUNCDATA	$0, gclocals·e6397a44f8e1b6e77d0f200b4fba5269(SB)
	0x002b 00043 (<autogenerated>:1)	FUNCDATA	$1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
	0x002b 00043 (<autogenerated>:1)	MOVQ	""..this+16(FP), AX
	0x0030 00048 (<autogenerated>:1)	TESTQ	AX, AX
	0x0033 00051 (<autogenerated>:1)	JEQ	$0, 140
	0x0035 00053 (<autogenerated>:1)	MOVQ	(AX), CX
	0x0038 00056 (<autogenerated>:1)	MOVQ	8(AX), AX
	0x003c 00060 (<autogenerated>:1)	MOVQ	"".i+24(FP), DX
	0x0041 00065 (<autogenerated>:1)	CMPQ	DX, AX
	0x0044 00068 (<autogenerated>:1)	JCC	$0, 133
	0x0046 00070 (<autogenerated>:1)	SHLQ	$4, DX
	0x004a 00074 (<autogenerated>:1)	MOVQ	8(CX)(DX*1), BX
	0x004f 00079 (<autogenerated>:1)	MOVQ	(CX)(DX*1), SI
	0x0053 00083 (<autogenerated>:1)	MOVQ	"".j+32(FP), DI
	0x0058 00088 (<autogenerated>:1)	CMPQ	DI, AX
	0x005b 00091 (<autogenerated>:1)	JCC	$0, 133
	0x005d 00093 (<autogenerated>:1)	SHLQ	$4, DI
	0x0061 00097 (<autogenerated>:1)	MOVQ	8(CX)(DI*1), AX
	0x0066 00102 (<autogenerated>:1)	MOVQ	(CX)(DI*1), R8
	0x006a 00106 (<autogenerated>:1)	MOVQ	R8, (CX)(DX*1)
	0x006e 00110 (<autogenerated>:1)	MOVQ	AX, 8(CX)(DX*1)
	0x0073 00115 (<autogenerated>:1)	MOVQ	SI, (CX)(DI*1)
	0x0077 00119 (<autogenerated>:1)	MOVQ	BX, 8(CX)(DI*1)
	0x007c 00124 (<autogenerated>:1)	MOVQ	(SP), BP
	0x0080 00128 (<autogenerated>:1)	ADDQ	$8, SP
	0x0084 00132 (<autogenerated>:1)	RET
	0x0085 00133 (<autogenerated>:1)	PCDATA	$0, $1
	0x0085 00133 (<autogenerated>:1)	CALL	runtime.panicindex(SB)
	0x008a 00138 (<autogenerated>:1)	UNDEF
	0x008c 00140 (<autogenerated>:1)	PCDATA	$0, $1
	0x008c 00140 (<autogenerated>:1)	CALL	runtime.panicwrap(SB)
	0x0091 00145 (<autogenerated>:1)	UNDEF

Change-Id: I15bb8435f0690badb868799f313ed8817335efd3
Reviewed-on: https://go-review.googlesource.com/36809
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-11 23:27:35 +00:00
Sokolov Yura d03c124860 runtime: implement fastrand in go
So it could be inlined.

Using bit-tricks it could be implemented without condition
(improved trick version by Minux Ma).

Simple benchmark shows it is faster on i386 and x86_64, though
I don't know will it be faster on other architectures?

benchmark                       old ns/op     new ns/op     delta
BenchmarkFastrand-3             2.79          1.48          -46.95%
BenchmarkFastrandHashiter-3     25.9          24.9          -3.86%

Change-Id: Ie2eb6d0f598c0bb5fac7f6ad0f8b5e3eddaa361b
Reviewed-on: https://go-review.googlesource.com/34782
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-10 19:16:29 +00:00
Brad Fitzpatrick 9f75ecd5e1 runtime/debug: don't run a GC when setting SetGCPercent negative
If the user is calling SetGCPercent(-1), they intend to disable GC.
They probably don't intend to run one. If they do, they can call
runtime.GC themselves.

Change-Id: I40ef40dfc7e15193df9ff26159cd30e56b666f73
Reviewed-on: https://go-review.googlesource.com/34013
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-02-10 18:28:37 +00:00
Heschi Kreinick 2a74b9e814 cmd/trace: Record mark assists in execution traces
During the mark phase of garbage collection, goroutines that allocate
may be recruited to assist. This change creates trace events for mark
assists and displays them similarly to sweep assists in the trace
viewer.

Mark assists are different than sweeps in that they can be preempted, so
displaying them in the trace viewer is a little tricky -- we may need to
synthesize multiple slices for one mark assist. This could have been
done in the parser instead, but I thought it might be preferable to keep
the parser as true to the event stream as possible.

Change-Id: I381dcb1027a187a354b1858537851fa68a620ea7
Reviewed-on: https://go-review.googlesource.com/36015
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2017-02-10 18:03:42 +00:00
Russ Cox 9a7544395a runtime/pprof: merge internal/protopprof into pprof package
These are very tightly coupled, and internal/protopprof is small.
There's no point to having a separate package.

Change-Id: I2c8aa49c9e18a7128657bf2b05323860151b5606
Reviewed-on: https://go-review.googlesource.com/36711
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-10 13:09:19 +00:00
Robert Griesemer 3fd3171c2c cmd/compile/internal/syntax: removed gcCompat code needed to pass orig. tests
The gcCompat mode was introduced to match the new parser's node position
setup exactly with the positions used by the original parser. Some of the
gcCompat adjustments were required to satisfy syntax error test cases,
and the rest were required to make toolstash cmp pass.

This change removes the former gcCompat adjustments and instead adjusts
the respective test cases as necessary. In some cases this makes the error
lines consistent with the ones reported by gccgo.

Where it has changed, the position associated with a given syntactic construct
is the position (line/col number) of the left-most token belonging to the
construct.

Change-Id: I5b60c00c5999a895c4d6d6e9b383c6405ccf725c
Reviewed-on: https://go-review.googlesource.com/36695
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-02-10 01:22:30 +00:00
Ian Lance Taylor e24228af25 runtime: enable/disable SIGPROF if needed when profiling
This ensures that SIGPROF is handled correctly when using
runtime/pprof in a c-archive or c-shared library.

Separate profiler handling into pre-process changes and per-thread
changes. Simplify the Windows code slightly accordingly.

Fixes #18220.

Change-Id: I5060f7084c91ef0bbe797848978bdc527c312777
Reviewed-on: https://go-review.googlesource.com/34018
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
2017-02-09 18:53:34 +00:00
Russ Cox e4371fb179 time: optimize Now on darwin, windows
Fetch both monotonic and wall time together when possible.
Avoids skew and is cheaper.

Also shave a few ns off in conversion in package time.

Compared to current implementation (after monotonic changes):

name   old time/op  new time/op  delta
Now    19.6ns ± 1%   9.7ns ± 1%  -50.63%  (p=0.000 n=41+49) darwin/amd64
Now    23.5ns ± 4%  10.6ns ± 5%  -54.61%  (p=0.000 n=30+28) windows/amd64
Now    54.5ns ± 5%  29.8ns ± 9%  -45.40%  (p=0.000 n=27+29) windows/386

More importantly, compared to Go 1.8:

name   old time/op  new time/op  delta
Now     9.5ns ± 1%   9.7ns ± 1%   +1.94%  (p=0.000 n=41+49) darwin/amd64
Now    12.9ns ± 5%  10.6ns ± 5%  -17.73%  (p=0.000 n=30+28) windows/amd64
Now    15.3ns ± 5%  29.8ns ± 9%  +94.36%  (p=0.000 n=30+29) windows/386

This brings time.Now back in line with Go 1.8 on darwin/amd64 and windows/amd64.

It's not obvious why windows/386 is still noticeably worse than Go 1.8,
but it's better than before this CL. The windows/386 speed is not too
important; the changes just keep the two architectures similar.

Change-Id: If69b94970c8a1a57910a371ee91e0d4e82e46c5d
Reviewed-on: https://go-review.googlesource.com/36428
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-02-09 14:45:16 +00:00
Ian Lance Taylor 87ad863f35 runtime: use atomic ops for fwdSig, make sigtable immutable
The fwdSig array is accessed by the signal handler, which may run in
parallel with other threads manipulating it via the os/signal package.
Use atomic accesses to ensure that there are no problems.

Move the _SigHandling flag out of the sigtable array. This makes sigtable
immutable and safe to read from the signal handler.

Change-Id: Icfa407518c4ebe1da38580920ced764898dfc9ad
Reviewed-on: https://go-review.googlesource.com/36321
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-08 04:14:41 +00:00
David Crawshaw 14c2849c3e runtime: update android time_now call
This was broken in https://golang.org/cl/36255

Change-Id: Ib23323a745a650ac51b0ead161076f97efe6d7b7
Reviewed-on: https://go-review.googlesource.com/36543
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-08 02:56:25 +00:00
Sameer Ajmani 38cb9d28a9 runtime/pprof: document that profile names should not contain spaces.
Change-Id: I967d897e812bee63b32bc2a7dcf453861b89b7e3
Reviewed-on: https://go-review.googlesource.com/36533
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-07 22:00:48 +00:00
Jaana Burcu Dogan 6cf7918e73 runtime/pprof: clarify CPU profile's captured during the lifetime of the prog
Fixes #18504.

Change-Id: I3716fc58fc98472eea15ce3617aee3890670c276
Reviewed-on: https://go-review.googlesource.com/36430
Reviewed-by: Russ Cox <rsc@golang.org>
2017-02-07 19:46:15 +00:00
Austin Clements 4af6b81d41 runtime: fix confusion between _MaxMem and _MaxArena32
Currently both _MaxMem and _MaxArena32 represent the maximum arena
size on 32-bit hosts (except on MIPS32 where _MaxMem is confusingly
smaller than _MaxArena32).

Clean up sysAlloc so that it always uses _MaxMem, which is the maximum
arena size on both 32- and 64-bit architectures and is the arena size
we allocate auxiliary structures for. This lets us simplify and unify
some code paths and eliminate _MaxArena32.

Fixes #18651. mheap.sysAlloc currently assumes that if the arena is
small, we must be on a 32-bit machine and can therefore grow the arena
to _MaxArena32. This breaks down on darwin/arm64, where _MaxMem is
only 2 GB. As a result, on darwin/arm64, we only reserve spans and
bitmap space for a 2 GB heap, and if the application tries to allocate
beyond that, sysAlloc takes the 32-bit path, tries to grow the arena
beyond 2 GB, and panics when it tries to grow the spans array
allocation past its reserved size. This has probably been a problem
for several releases now, but was only noticed recently because
mapSpans didn't check the bounds on the span reservation until
recently. Most likely it corrupted the bitmap before. By using _MaxMem
consistently, we avoid thinking that we can grow the arena larger than
we have auxiliary structures for.

Change-Id: Ifef28cb746a3ead4b31c1d7348495c2242fef520
Reviewed-on: https://go-review.googlesource.com/35253
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Elias Naur <elias.naur@gmail.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-07 18:39:18 +00:00
Austin Clements 1cc24690b8 runtime: simplify and cleanup mallocinit
mallocinit has evolved organically. Make a pass to clean it up in
various ways:

1. Merge the computation of spansSize and bitmapSize. These were
   computed on every loop iteration of two different loops, but always
   have the same value, which can be derived directly from _MaxMem.
   This also avoids over-reserving these on MIPS, were _MaxArena32 is
   larger than _MaxMem.

2. Remove the ulimit -v logic. It's been disabled for many releases
   and the dead code paths to support it are even more wrong now than
   they were when it was first disabled, since now we *must* reserve
   spans and bitmaps for the full address space.

3. Make it clear that we're using a simple linear allocation to lay
   out the spans, bitmap, and arena spaces. Previously there were a
   lot of redundant pointer computations. Now we just bump p1 up as we
   reserve the spaces.

In preparation for #18651.

Updates #5049 (respect ulimit).

Change-Id: Icbe66570d3a7a17bea227dc54fb3c4978b52a3af
Reviewed-on: https://go-review.googlesource.com/35252
Reviewed-by: Russ Cox <rsc@golang.org>
2017-02-07 18:39:15 +00:00
Austin Clements efb5eae3cf runtime: make _MaxMem an untyped constant
Currently _MaxMem is a uintptr, which is going to complicate some
further changes. Make it untyped so we'll be able to do untyped math
on it before truncating it to a uintptr.

The runtime assembly is identical before and after this change on
{linux,windows}/{amd64,386}.

Updates #18651.

Change-Id: I0f64511faa9e0aa25179a556ab9f185ebf8c9cf8
Reviewed-on: https://go-review.googlesource.com/35251
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2017-02-07 18:39:12 +00:00
Cherry Zhang bed8129ee6 cmd/internal/obj: remove Follow pass
The Follow pass in the assembler backend reorders and copies
instructions. This even applies to hand-written assembly code,
which in many cases don't want to be reordered. Now that the
SSA compiler does a good job for laying out instructions, the
benefit of this pass is very little:

AMD64: (old = with Follow, new = without Follow)
name                      old time/op    new time/op    delta
BinaryTree17-12              2.78s ± 1%     2.79s ± 1%  +0.44%  (p=0.000 n=20+19)
Fannkuch11-12                3.11s ± 0%     3.31s ± 1%  +6.16%  (p=0.000 n=19+19)
FmtFprintfEmpty-12          50.9ns ± 1%    51.6ns ± 3%  +1.40%  (p=0.000 n=17+20)
FmtFprintfString-12          127ns ± 0%     128ns ± 1%  +0.88%  (p=0.000 n=17+17)
FmtFprintfInt-12             122ns ± 0%     123ns ± 1%  +0.76%  (p=0.000 n=20+19)
FmtFprintfIntInt-12          185ns ± 1%     186ns ± 1%  +0.65%  (p=0.000 n=20+19)
FmtFprintfPrefixedInt-12     192ns ± 1%     202ns ± 1%  +4.99%  (p=0.000 n=20+19)
FmtFprintfFloat-12           284ns ± 0%     288ns ± 0%  +1.33%  (p=0.000 n=15+19)
FmtManyArgs-12               807ns ± 0%     804ns ± 0%  -0.44%  (p=0.000 n=16+18)
GobDecode-12                7.23ms ± 1%    7.21ms ± 1%    ~     (p=0.052 n=20+20)
GobEncode-12                6.09ms ± 1%    6.12ms ± 1%  +0.41%  (p=0.002 n=19+19)
Gzip-12                      253ms ± 1%     255ms ± 1%  +0.95%  (p=0.000 n=18+20)
Gunzip-12                   38.4ms ± 0%    38.5ms ± 0%  +0.34%  (p=0.000 n=17+17)
HTTPClientServer-12         95.4µs ± 2%    96.1µs ± 1%  +0.78%  (p=0.002 n=19+19)
JSONEncode-12               16.5ms ± 1%    16.6ms ± 1%  +1.17%  (p=0.000 n=19+19)
JSONDecode-12               54.6ms ± 1%    55.3ms ± 1%  +1.23%  (p=0.000 n=18+18)
Mandelbrot200-12            4.47ms ± 0%    4.47ms ± 0%  +0.06%  (p=0.000 n=18+18)
GoParse-12                  3.47ms ± 1%    3.47ms ± 1%    ~     (p=0.583 n=20+20)
RegexpMatchEasy0_32-12      84.8ns ± 1%    85.2ns ± 2%  +0.51%  (p=0.022 n=20+20)
RegexpMatchEasy0_1K-12       206ns ± 1%     206ns ± 1%    ~     (p=0.770 n=20+20)
RegexpMatchEasy1_32-12      82.8ns ± 1%    83.4ns ± 1%  +0.64%  (p=0.000 n=20+19)
RegexpMatchEasy1_1K-12       363ns ± 1%     361ns ± 1%  -0.48%  (p=0.007 n=20+20)
RegexpMatchMedium_32-12      126ns ± 1%     126ns ± 0%  +0.72%  (p=0.000 n=20+20)
RegexpMatchMedium_1K-12     39.1µs ± 1%    39.8µs ± 0%  +1.73%  (p=0.000 n=19+19)
RegexpMatchHard_32-12       1.97µs ± 0%    1.98µs ± 1%  +0.29%  (p=0.005 n=18+20)
RegexpMatchHard_1K-12       59.5µs ± 1%    59.8µs ± 1%  +0.36%  (p=0.000 n=18+20)
Revcomp-12                   442ms ± 1%     445ms ± 2%  +0.67%  (p=0.000 n=19+20)
Template-12                 58.0ms ± 1%    57.5ms ± 1%  -0.85%  (p=0.000 n=19+19)
TimeParse-12                 311ns ± 0%     314ns ± 0%  +0.94%  (p=0.000 n=20+18)
TimeFormat-12                350ns ± 3%     346ns ± 0%    ~     (p=0.076 n=20+19)
[Geo mean]                  55.9µs         56.4µs       +0.80%

ARM32:
name                     old time/op    new time/op    delta
BinaryTree17-4              30.4s ± 0%     30.1s ± 0%  -1.14%  (p=0.000 n=10+8)
Fannkuch11-4                13.7s ± 0%     13.6s ± 0%  -0.75%  (p=0.000 n=10+10)
FmtFprintfEmpty-4           664ns ± 1%     651ns ± 1%  -1.96%  (p=0.000 n=7+8)
FmtFprintfString-4         1.83µs ± 2%    1.77µs ± 2%  -3.21%  (p=0.000 n=10+10)
FmtFprintfInt-4            1.57µs ± 2%    1.54µs ± 2%  -2.25%  (p=0.007 n=10+10)
FmtFprintfIntInt-4         2.37µs ± 2%    2.31µs ± 1%  -2.68%  (p=0.000 n=10+10)
FmtFprintfPrefixedInt-4    2.14µs ± 2%    2.10µs ± 1%  -1.83%  (p=0.006 n=10+10)
FmtFprintfFloat-4          3.69µs ± 2%    3.74µs ± 1%  +1.60%  (p=0.000 n=10+10)
FmtManyArgs-4              9.43µs ± 1%    9.17µs ± 1%  -2.70%  (p=0.000 n=10+10)
GobDecode-4                76.3ms ± 1%    75.5ms ± 1%  -1.14%  (p=0.003 n=10+10)
GobEncode-4                70.7ms ± 2%    69.0ms ± 1%  -2.36%  (p=0.000 n=10+10)
Gzip-4                      2.64s ± 1%     2.65s ± 0%  +0.59%  (p=0.002 n=10+10)
Gunzip-4                    402ms ± 0%     398ms ± 0%  -1.11%  (p=0.000 n=10+9)
HTTPClientServer-4          458µs ± 0%     457µs ± 0%    ~     (p=0.247 n=10+10)
JSONEncode-4                171ms ± 0%     172ms ± 0%  +0.56%  (p=0.000 n=10+10)
JSONDecode-4                672ms ± 1%     668ms ± 1%    ~     (p=0.105 n=10+10)
Mandelbrot200-4            33.5ms ± 0%    33.5ms ± 0%    ~     (p=0.156 n=9+10)
GoParse-4                  33.9ms ± 0%    34.0ms ± 0%  +0.36%  (p=0.031 n=9+9)
RegexpMatchEasy0_32-4       823ns ± 1%     835ns ± 1%  +1.49%  (p=0.000 n=8+8)
RegexpMatchEasy0_1K-4      3.99µs ± 0%    4.02µs ± 1%  +0.92%  (p=0.000 n=8+10)
RegexpMatchEasy1_32-4       877ns ± 3%     904ns ± 2%  +3.07%  (p=0.012 n=10+10)
RegexpMatchEasy1_1K-4      5.99µs ± 0%    5.97µs ± 1%  -0.38%  (p=0.023 n=8+8)
RegexpMatchMedium_32-4     1.40µs ± 2%    1.40µs ± 2%    ~     (p=0.590 n=10+9)
RegexpMatchMedium_1K-4      357µs ± 0%     355µs ± 1%  -0.72%  (p=0.000 n=7+8)
RegexpMatchHard_32-4       22.3µs ± 0%    22.1µs ± 0%  -0.49%  (p=0.000 n=8+7)
RegexpMatchHard_1K-4        661µs ± 0%     658µs ± 0%  -0.42%  (p=0.000 n=8+7)
Revcomp-4                  46.3ms ± 0%    46.3ms ± 0%    ~     (p=0.393 n=10+10)
Template-4                  753ms ± 1%     750ms ± 0%    ~     (p=0.211 n=10+9)
TimeParse-4                4.28µs ± 1%    4.22µs ± 1%  -1.34%  (p=0.000 n=8+10)
TimeFormat-4               9.00µs ± 0%    9.05µs ± 0%  +0.59%  (p=0.000 n=10+10)
[Geo mean]                  538µs          535µs       -0.55%

ARM64:
name                     old time/op    new time/op    delta
BinaryTree17-8              8.39s ± 0%     8.39s ± 0%    ~     (p=0.684 n=10+10)
Fannkuch11-8                5.95s ± 0%     5.99s ± 0%  +0.63%  (p=0.000 n=10+10)
FmtFprintfEmpty-8           116ns ± 0%     116ns ± 0%    ~     (all equal)
FmtFprintfString-8          361ns ± 0%     360ns ± 0%  -0.31%  (p=0.003 n=8+6)
FmtFprintfInt-8             290ns ± 0%     290ns ± 0%    ~     (p=0.620 n=9+9)
FmtFprintfIntInt-8          476ns ± 1%     469ns ± 0%  -1.47%  (p=0.000 n=10+6)
FmtFprintfPrefixedInt-8     412ns ± 2%     417ns ± 2%  +1.39%  (p=0.006 n=9+10)
FmtFprintfFloat-8           652ns ± 1%     652ns ± 0%    ~     (p=0.161 n=10+8)
FmtManyArgs-8              1.94µs ± 0%    1.94µs ± 2%    ~     (p=0.781 n=10+10)
GobDecode-8                17.7ms ± 1%    17.7ms ± 0%    ~     (p=0.962 n=10+7)
GobEncode-8                15.6ms ± 0%    15.6ms ± 1%    ~     (p=0.063 n=10+10)
Gzip-8                      786ms ± 0%     787ms ± 0%    ~     (p=0.356 n=10+9)
Gunzip-8                    127ms ± 0%     127ms ± 0%  +0.08%  (p=0.028 n=10+9)
HTTPClientServer-8          198µs ± 6%     198µs ± 7%    ~     (p=0.796 n=10+10)
JSONEncode-8               42.5ms ± 0%    42.2ms ± 0%  -0.73%  (p=0.000 n=9+8)
JSONDecode-8                158ms ± 1%     162ms ± 0%  +2.28%  (p=0.000 n=10+9)
Mandelbrot200-8            10.1ms ± 0%    10.1ms ± 0%  -0.01%  (p=0.000 n=10+9)
GoParse-8                  8.54ms ± 1%    8.63ms ± 1%  +1.06%  (p=0.000 n=10+9)
RegexpMatchEasy0_32-8       231ns ± 1%     225ns ± 0%  -2.52%  (p=0.000 n=9+10)
RegexpMatchEasy0_1K-8      1.63µs ± 0%    1.63µs ± 0%    ~     (p=0.170 n=10+10)
RegexpMatchEasy1_32-8       253ns ± 0%     249ns ± 0%  -1.41%  (p=0.000 n=9+10)
RegexpMatchEasy1_1K-8      2.08µs ± 0%    2.08µs ± 0%  -0.32%  (p=0.000 n=9+10)
RegexpMatchMedium_32-8      355ns ± 1%     351ns ± 0%  -1.04%  (p=0.007 n=10+7)
RegexpMatchMedium_1K-8      104µs ± 0%     104µs ± 0%    ~     (p=0.148 n=10+10)
RegexpMatchHard_32-8       5.79µs ± 0%    5.79µs ± 0%    ~     (p=0.578 n=10+10)
RegexpMatchHard_1K-8        176µs ± 0%     176µs ± 0%    ~     (p=0.137 n=10+10)
Revcomp-8                   1.37s ± 1%     1.36s ± 1%  -0.26%  (p=0.023 n=10+10)
Template-8                  151ms ± 1%     154ms ± 1%  +2.14%  (p=0.000 n=9+10)
TimeParse-8                 723ns ± 2%     721ns ± 1%    ~     (p=0.592 n=10+10)
TimeFormat-8                804ns ± 2%     798ns ± 3%    ~     (p=0.344 n=10+10)
[Geo mean]                  154µs          154µs       -0.02%

Therefore remove this pass. Also reduce text size by 0.5~2%.

Comment out some dead code in runtime/sys_nacl_amd64p32.s
which contains undefined symbols.

Change-Id: I1473986fe5b18b3d2554ce96cdc6f0999b8d955d
Reviewed-on: https://go-review.googlesource.com/36205
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-02-07 15:00:48 +00:00
Michael Matloob cbef450df7 runtime/pprof: symbolize proto profiles
When generating pprof profiles in proto format, symbolize the profiles.

Change-Id: I2471ed7f919483e5828868306418a63e41aff5c5
Reviewed-on: https://go-review.googlesource.com/34192
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-02-07 14:03:13 +00:00
Michael Matloob 62956897c1 runtime: add definitions for SetGoroutineLabels and Do
This change defines runtime/pprof.SetGoroutineLabels and runtime/pprof.Do, which
are used to set profiler labels on goroutines. The change defines functions
in the runtime for setting and getting profile labels, and sets and unsets
profile labels when goroutines are created and deleted. The change also adds
the package runtime/internal/proflabel, which defines the structure the runtime
uses to store profile labels.

Change-Id: I747a4400141f89b6e8160dab6aa94ca9f0d4c94d
Reviewed-on: https://go-review.googlesource.com/34198
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-on: https://go-review.googlesource.com/35010
2017-02-06 20:29:37 +00:00
Michael Matloob 91ad2a2194 runtime/pprof: add definitions of profile label types
This change defines WithLabels, Labels, Label, and ForLabels.
This is the first step of the profile labels implemention for go 1.9.

Updates #17280

Change-Id: I2dfc9aae90f7a4aa1ff7080d5747f0a1f0728e75
Reviewed-on: https://go-review.googlesource.com/34198
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-02-06 15:43:06 +00:00
Ian Lance Taylor 6aee6b895c runtime: remove markBits.clearMarkedNonAtomic
It's not used, it's never been used, and it doesn't do what its doc
comment says it does.

Fixes #18941.

Change-Id: Ia89d97fb87525f5b861d7701f919e0d6b7cbd376
Reviewed-on: https://go-review.googlesource.com/36322
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-06 04:45:55 +00:00
Elias Naur 78074f6850 runtime: handle SIGPIPE in c-archive and c-shared programs
Before this CL, Go programs in c-archive or c-shared buildmodes
would not handle SIGPIPE. That leads to surprising behaviour where
writes on a closed pipe or socket would raise SIGPIPE and terminate
the program. This CL changes the Go runtime to handle
SIGPIPE regardless of buildmode. In addition, SIGPIPE from non-Go
code is forwarded.

This is a refinement of CL 32796 that fixes the case where a non-default
handler for SIGPIPE is installed by the host C program.

Fixes #17393

Change-Id: Ia41186e52c1ac209d0a594bae9904166ae7df7de
Reviewed-on: https://go-review.googlesource.com/35960
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-02-03 20:07:36 +00:00
Russ Cox 0e3355903d time: record monotonic clock reading in time.Now, for more accurate comparisons
See https://golang.org/design/12914-monotonic for details.

Fixes #12914.

Change-Id: I80edc2e6c012b4ace7161c84cf067d444381a009
Reviewed-on: https://go-review.googlesource.com/36255
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Caleb Spare <cespare@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-02-03 19:04:52 +00:00
Keith Randall 69e1634985 runtime: darwin/amd64, don't depend on outarg slots being unmodified
sigtramp was calling sigtrampgo and depending on the fact that
the 3rd argument slot will not be modified on return.  Our calling
convention doesn't guarantee that.  Avoid that assumption.

There's no actual bug here, as sigtrampgo does not in fact modify its
argument slots.  But I found this while working on the dead stack slot
clobbering tool.  https://go-review.googlesource.com/c/23924/

Change-Id: Ia7e791a2b4c1c74fff24cba8169e7840b4b06ffc
Reviewed-on: https://go-review.googlesource.com/36216
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-02-03 05:08:54 +00:00
Cherry Zhang f69a6defd1 runtime: skip flaky TestGdbPythonCgo on MIPS
It seems the problem is on gdb and the dynamic linker. Skip the
test for now until we figure out what's going on with the system.

Updates #18784.

Change-Id: Ic9320ffd463f6c231b2c4192652263b1cf7f4231
Reviewed-on: https://go-review.googlesource.com/36250
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-02 21:45:42 +00:00
Lars Wiegman e546b295b8 runtime: use mach_absolute_time for runtime.nanotime
The existing darwin/amd64 implementation of runtime.nanotime returns the
wallclock time, which results in timers not functioning properly when
system time runs backwards. By implementing the algorithm used by the
darwin syscall mach_absolute_time, timers will function as expected.

The algorithm is described at
https://opensource.apple.com/source/xnu/xnu-3248.60.10/libsyscall/wrappers/mach_absolute_time.s

Fixes #17610

Change-Id: I9c8d35240d48249a6837dca1111b1406e2686f67
Reviewed-on: https://go-review.googlesource.com/35292
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-02 21:20:40 +00:00
Josh Bleecher Snyder 0358367576 cmd/compile, runtime: convert byte-sized values to interfaces without allocation
Based in part on khr's CL 2500.

Updates #17725
Updates #18121

Change-Id: I744e1f92fc2104e6c5bd883a898c30b2eea8cc31
Reviewed-on: https://go-review.googlesource.com/35555
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-02-02 21:04:34 +00:00
Russ Cox c47df7ae17 all: merge dev.typealias into master
For #18130.

f8b4123613 [dev.typealias] spec: use term 'embedded field' rather than 'anonymous field'
9ecc3ee252 [dev.typealias] cmd/compile: avoid false positive cycles from type aliases
49b7af8a30 [dev.typealias] reflect: add test for type aliases
9bbb07ddec [dev.typealias] cmd/compile, reflect: fix struct field names for embedded byte, rune
43c7094386 [dev.typealias] reflect: fix StructOf use of StructField to match StructField docs
9657e0b077 [dev.typealias] cmd/doc: update for type alias
de2e5459ae [dev.typealias] cmd/compile: declare methods after resolving receiver type
9259f3073a [dev.typealias] test: match gccgo error messages on alias2.go
5d92916770 [dev.typealias] cmd/compile: change Func.Shortname to *Sym
a7c884efc1 [dev.typealias] go/internal/gccgoimporter: support for type aliases
5802cfd900 [dev.typealias] cmd/compile: export/import test cases for type aliases
d7cabd40dd [dev.typealias] go/types: clarified doc string
cc2dcce3d7 [dev.typealias] cmd/compile: a few better comments related to alias types
5c160b28ba [dev.typealias] cmd/compile: improved error message for cyles involving type aliases
b2386dffa1 [dev.typealias] cmd/compile: type-check type alias declarations
ac8421f9a5 [dev.typealias] cmd/compile: various minor cleanups
f011e0c6c3 [dev.typealias] cmd/compile, go/types, go/importer: various alias related fixes
49de5f0351 [dev.typealias] cmd/compile, go/importer: define export format and implement importing of type aliases
5ceec42dc0 [dev.typealias] go/types: export TypeName.IsAlias so clients can use it
aa1f0681bc [dev.typealias] go/types: improved Object printing
c80748e389 [dev.typealias] go/types: remove some more vestiges of prior alias implementation
80d8b69e95 [dev.typealias] go/types: implement type aliases
a917097b5e [dev.typealias] go/build: add go1.9 build tag
3e11940437 [dev.typealias] cmd/compile: recognize type aliases but complain for now (not yet supported)
e0a05c274a [dev.typealias] cmd/gofmt: added test cases for alias type declarations
2e5116bd99 [dev.typealias] go/ast, go/parser, go/printer, go/types: initial type alias support

Change-Id: Ia65f2e011fd7195f18e1dce67d4d49b80a261203
2017-01-31 13:01:31 -05:00
Ian Lance Taylor 0949659952 runtime: add explicit (void) in C to avoid GCC 7 problem
This avoids errors like
    ./traceback.go:80:2: call of non-function C.f1

I filed https://gcc.gnu.org/PR79289 for the GCC problem. I think this
is a bug in GCC, and it may be fixed before the final GCC 7 release.
This CL is correct either way.

Fixes #18855.

Change-Id: I0785a7b7c5b1d0ca87b454b5eca9079f390fcbd4
Reviewed-on: https://go-review.googlesource.com/35919
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2017-01-30 19:27:49 +00:00
David Crawshaw b531eb3062 runtime: reorder modules so main.main comes first
Modules appear in the moduledata linked list in the order they are
loaded by the dynamic loader, with one exception: the
firstmoduledata itself the module that contains the runtime.
This is not always the first module (when using -buildmode=shared,
it is typically libstd.so, the second module).

The order matters for typelinksinit, so we swap the first module
with whatever module contains the main function.

Updates #18729

This fixes the test case extracted with -linkshared, and now

	go test -linkshared encoding/...

passes. However the original issue about a plugin failure is not
yet fixed.

Change-Id: I9f399ecc3518e22e6b0a350358e90b0baa44ac96
Reviewed-on: https://go-review.googlesource.com/35644
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-01-25 22:33:57 +00:00
Russ Cox 9bbb07ddec [dev.typealias] cmd/compile, reflect: fix struct field names for embedded byte, rune
Will also fix type aliases.

Fixes #17766.
For #18130.

Change-Id: I9e1584d47128782152e06abd0a30ef423d5c30d2
Reviewed-on: https://go-review.googlesource.com/35732
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-01-25 18:57:20 +00:00
Ian Lance Taylor aad06da2b9 cmd/link: mark DWARF function symbols as reachable
Otherwise we don't emit any required ELF relocations when doing an
external link, because elfrelocsect skips unreachable symbols.

Fixes #18745.

Change-Id: Ia3583c41bb6c5ebb7579abd26ed8689370311cd6
Reviewed-on: https://go-review.googlesource.com/35590
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2017-01-24 03:37:56 +00:00
Keith Randall a96e117a58 runtime: amd64, use 4-byte ops for memmove of 4 bytes
memmove used to use 2 2-byte load/store pairs to move 4 bytes.
When the result is loaded with a single 4-byte load, it caused
a store to load fowarding stall.  To avoid the stall,
special case memmove to use 4 byte ops for the 4 byte copy case.

We already have a special case for 8-byte copies.
386 already specializes 4-byte copies.
I'll do 2-byte copies also, but not for 1.8.

benchmark                 old ns/op     new ns/op     delta
BenchmarkIssue18740-8     7567          4799          -36.58%

3-byte copies get a bit slower.  Other copies are unchanged.
name         old time/op   new time/op   delta
Memmove/3-8   4.76ns ± 5%   5.26ns ± 3%  +10.50%  (p=0.000 n=10+10)

Fixes #18740

Change-Id: Iec82cbac0ecfee80fa3c8fc83828f9a1819c3c74
Reviewed-on: https://go-review.googlesource.com/35567
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-01-23 19:39:22 +00:00
Bryan C. Mills ea7d9e6a52 runtime: check for nil g and m in msanread
fixes #18707.

Change-Id: Ibc4efef01197799f66d10bfead22faf8ac00473c
Reviewed-on: https://go-review.googlesource.com/35452
Run-TryBot: Bryan Mills <bcmills@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-01-19 23:06:54 +00:00
Austin Clements c1730ae424 runtime: force workers out before checking mark roots
Currently we check that all roots are marked as soon as gcMarkDone
decides to transition from mark 1 to mark 2. However, issue #16083
indicates that there may be a race where we try to complete mark 1
while a worker is still scanning a stack, causing the root mark check
to fail.

We don't yet understand this race, but as a simple mitigation, move
the root check to after gcMarkDone performs a ragged barrier, which
will force any remaining workers to finish their current job.

Updates #16083. This may "fix" it, but it would be better to
understand and fix the underlying race.

Change-Id: I1af9ce67bd87ade7bc2a067295d79c28cd11abd2
Reviewed-on: https://go-review.googlesource.com/35353
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2017-01-18 15:40:33 +00:00
Keith Randall 81a61a96c9 runtime: for plugins, don't add duplicate itabs
We already do this for shared libraries. Do it for plugins also.
Suggestions on how to test this would be welcome.

I'd like to get this in for 1.8.  It could lead to mysterious
hangs when using plugins.

Fixes #18676

Change-Id: I03209b096149090b9ba171c834c5e59087ed0f92
Reviewed-on: https://go-review.googlesource.com/35117
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2017-01-17 22:37:19 +00:00
Bryan C. Mills fdde7ba2a2 runtime: avoid clobbering C callee-save register in cgoSigtramp
Use R11 (a caller-saved temp register) instead of RBX (a callee-saved
register).

I believe this only affects linux/amd64, since it is the only platform
with a non-trivial cgoSigtramp implementation.

Updates #18328.

Change-Id: I3d35c4512624184d5a8ece653fa09ddf50e079a2
Reviewed-on: https://go-review.googlesource.com/35068
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-01-12 00:06:32 +00:00
Austin Clements 2817e77024 runtime: debug prints for spanBytesAlloc underflow
Updates #18043.

Change-Id: I24e687fdd5521c48b672987f15f0d5de9f308884
Reviewed-on: https://go-review.googlesource.com/34612
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-01-10 15:59:39 +00:00
David Chase 7f1ff65c39 cmd/compile: insert scheduling checks on loop backedges
Loop breaking with a counter.  Benchmarked (see comments),
eyeball checked for sanity on popular loops.  This code
ought to handle loops in general, and properly inserts phi
functions in cases where the earlier version might not have.

Includes test, plus modifications to test/run.go to deal with
timeout and killing looping test.  Tests broken by the addition
of extra code (branch frequency and live vars) for added
checks turn the check insertion off.

If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule
checks on every backedge of every reducible loop.  Alternately,
specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will
enable it for a single compilation, but because the core Go
libraries contain some loops that may run long, this is less
likely to have the desired effect.

This is intended as a tool to help in the study and diagnosis
of GC and other latency problems, now that goal STW GC latency
is on the order of 100 microseconds or less.

Updates #17831.
Updates #10958.

Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639
Reviewed-on: https://go-review.googlesource.com/33910
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-01-09 21:01:29 +00:00
Austin Clements ffedff7e50 runtime: add table of size classes in a comment
Change-Id: I52fae67c9aeceaa23e70f2ef0468745b354f8c75
Reviewed-on: https://go-review.googlesource.com/34932
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-01-08 00:01:30 +00:00
shawnps 067bab00a8 all: fix misspellings
Change-Id: I429637ca91f7db4144f17621de851a548dc1ce76
Reviewed-on: https://go-review.googlesource.com/34923
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-01-07 16:53:25 +00:00
Russ Cox b902a63ade runtime: fix corruption crash/race between select and stack growth
To implement the blocking of a select, a goroutine builds a list of
offers to communicate (pseudo-g's, aka sudog), one for each case,
queues them on the corresponding channels, and waits for another
goroutine to complete one of those cases and wake it up. Obviously it
is not OK for two other goroutines to complete multiple cases and both
wake the goroutine blocked in select. To make sure that only one
branch of the select is chosen, all the sudogs contain a pointer to a
shared (single) 'done uint32', which is atomically cas'ed by any
interested goroutines. The goroutine that wins the cas race gets to
wake up the select. A complication is that 'done uint32' is stored on
the stack of the goroutine running the select, and that stack can move
during the select due to stack growth or stack shrinking.

The relevant ordering to block and unblock in select is:

	1. Lock all channels.
	2. Create list of sudogs and queue sudogs on all channels.
	3. Switch to system stack, mark goroutine as asleep,
	   unlock all channels.
	4. Sleep until woken.
	5. Wake up on goroutine stack.
	6. Lock all channels.
	7. Dequeue sudogs from all channels.
	8. Free list of sudogs.
	9. Unlock all channels.

There are two kinds of stack moves: stack growth and stack shrinking.
Stack growth happens while the original goroutine is running.
Stack shrinking happens asynchronously, during garbage collection.

While a channel listing a sudog is locked by select in this process,
no other goroutine can attempt to complete communication on that
channel, because that other goroutine doesn't hold the lock and can't
find the sudog. If the stack moves while all the channel locks are
held or when the sudogs are not yet or no longer queued in the
channels, no problem, because no goroutine can get to the sudogs and
therefore to selectdone. We only need to worry about the stack (and
'done uint32') moving with the sudogs queued in unlocked channels.

Stack shrinking can happen any time the goroutine is stopped.
That code already acquires all the channel locks before doing the
stack move, so it avoids this problem.

Stack growth can happen essentially any time the original goroutine is
running on its own stack (not the system stack). In the first half of
the select, all the channels are locked before any sudogs are queued,
and the channels are not unlocked until the goroutine has stopped
executing on its own stack and is asleep, so that part is OK. In the
second half of the select, the goroutine wakes up on its own goroutine
stack and immediately locks all channels. But the actual call to lock
might grow the stack, before acquiring any locks. In that case, the
stack is moving with the sudogs queued in unlocked channels. Not good.
One goroutine has already won a cas on the old stack (that goroutine
woke up the selecting goroutine, moving it out of step 4), and the
fact that done = 1 now should prevent any other goroutines from
completing any other select cases. During the stack move, however,
sudog.selectdone is moved from pointing to the old done variable on
the old stack to a new memory location on the new stack. Another
goroutine might observe the moved pointer before the new memory
location has been initialized. If the new memory word happens to be
zero, that goroutine might win a cas on the new location, thinking it
can now complete the select (again). It will then complete a second
communication (reading from or writing to the goroutine stack
incorrectly) and then attempt to wake up the selecting goroutine,
which is already awake.

The scribbling over the goroutine stack unexpectedly is already bad,
but likely to go unnoticed, at least immediately. As for the second
wakeup, there are a variety of ways it might play out.

* The goroutine might not be asleep.
That will produce a runtime crash (throw) like in #17007:

	runtime: gp: gp=0xc0422dcb60, goid=2299, gp->atomicstatus=8
	runtime:  g:  g=0xa5cfe0, goid=0,  g->atomicstatus=0
	fatal error: bad g->status in ready

Here, atomicstatus=8 is copystack; the second, incorrect wakeup is
observing that the selecting goroutine is in state "Gcopystack"
instead of "Gwaiting".

* The goroutine might be sleeping in a send on a nil chan.
If it wakes up, it will crash with 'fatal error: unreachable'.

* The goroutine might be sleeping in a send on a non-nil chan.
If it wakes up, it will crash with 'fatal error: chansend:
spurious wakeup'.

* The goroutine might be sleeping in a receive on a nil chan.
If it wakes up, it will crash with 'fatal error: unreachable'.

* The goroutine might be sleeping in a receive on a non-nil chan.
If it wakes up, it will silently (incorrectly!) continue as if it
received a zero value from a closed channel, leaving a sudog queued on
the channel pointing at that zero vaue on the goroutine's stack; that
space will be reused as the goroutine executes, and when some other
goroutine finally completes the receive, it will do a stray write into
the goroutine's stack memory, which may cause problems. Then it will
attempt the real wakeup of the goroutine, leading recursively to any
of the cases in this list.

* The goroutine might have been running a select in a finalizer
(I hope not!) and might now be sleeping waiting for more things to
finalize. If it wakes up, as long as it goes back to sleep quickly
(before the real GC code tries to wake it), the spurious wakeup does
no harm (but the stack was still scribbled on).

* The goroutine might be sleeping in gcParkAssist.
If it wakes up, that will let the goroutine continue executing a bit
earlier than we would have liked. Eventually the GC will attempt the
real wakeup of the goroutine, leading recursively to any of the cases
in this list.

* The goroutine cannot be sleeping in bgsweep, because the background
sweepers never use select.

* The goroutine might be sleeping in netpollblock.
If it wakes up, it will crash with 'fatal error: netpollblock:
corrupted state'.

* The goroutine might be sleeping in main as another thread crashes.
If it wakes up, it will exit(0) instead of letting the other thread
crash with a non-zero exit status.

* The goroutine cannot be sleeping in forcegchelper,
because forcegchelper never uses select.

* The goroutine might be sleeping in an empty select - select {}.
If it wakes up, it will return to the next line in the program!

* The goroutine might be sleeping in a non-empty select (again).
In this case, it will wake up spuriously, with gp.param == nil (no
reason for wakeup), but that was fortuitously overloaded for handling
wakeup due to a closing channel and the way it is handled is to rerun
the select, which (accidentally) handles the spurious wakeup
correctly:

	if cas == nil {
		// This can happen if we were woken up by a close().
		// TODO: figure that out explicitly so we don't need this loop.
		goto loop
	}

Before looping, it will dequeue all the sudogs on all the channels
involved, so that no other goroutine will attempt to wake it.
Since the goroutine was blocked in select before, being blocked in
select again when the spurious wakeup arrives may be quite likely.
In this case, the spurious wakeup does no harm (but the stack was
still scribbled on).

* The goroutine might be sleeping in semacquire (mutex slow path).
If it wakes up, that is taken as a signal to try for the semaphore
again, not a signal that the semaphore is now held, but the next
iteration around the loop will queue the sudog a second time, causing
a cycle in the wakeup list for the given address. If that sudog is the
only one in the list, when it is eventually dequeued, it will
(due to the precise way the code is written) leave the sudog on the
queue inactive with the sudog broken. But the sudog will also be in
the free list, and that will eventually cause confusion.

* The goroutine might be sleeping in notifyListWait, for sync.Cond.
If it wakes up, (*Cond).Wait returns. The docs say "Unlike in other
systems, Wait cannot return unless awoken by Broadcast or Signal,"
so the spurious wakeup is incorrect behavior, but most callers do not
depend on that fact. Eventually the condition will happen, attempting
the real wakeup of the goroutine and leading recursively to any of the
cases in this list.

* The goroutine might be sleeping in timeSleep aka time.Sleep.
If it wakes up, it will continue running, leaving a timer ticking.
When that time bomb goes off, it will try to ready the goroutine
again, leading to any one of the cases in this list.

* The goroutine cannot be sleeping in timerproc,
because timerproc never uses select.

* The goroutine might be sleeping in ReadTrace.
If it wakes up, it will print 'runtime: spurious wakeup of trace
reader' and return nil. All future calls to ReadTrace will print
'runtime: ReadTrace called from multiple goroutines simultaneously'.
Eventually, when trace data is available, a true wakeup will be
attempted, leading to any one of the cases in this list.

None of these fatal errors appear in any of the trybot or dashboard
logs. The 'bad g->status in ready' that happens if the goroutine is
running (the most likely scenario anyway) has happened once on the
dashboard and eight times in trybot logs. Of the eight, five were
atomicstatus=8 during net/http tests, so almost certainly this bug.
The other three were atomicstatus=2, all near code in select,
but in a draft CL by Dmitry that was rewriting select and may or may
not have had its own bugs.

This bug has existed since Go 1.4. Until then the select code was
implemented in C, 'done uint32' was a C stack variable 'uint32 done',
and C stacks never moved. I believe it has become more common recently
because of Brad's work to run more and more tests in net/http in
parallel, which lengthens race windows.

The fix is to run step 6 on the system stack,
avoiding possibility of stack growth.

Fixes #17007 and possibly other mysterious failures.

Change-Id: I9d6575a51ac96ae9d67ec24da670426a4a45a317
Reviewed-on: https://go-review.googlesource.com/34835
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-01-06 19:19:35 +00:00
Austin Clements 5dd978a283 runtime: expand HACKING.md
This adds high-level descriptions of the scheduler structures, the
user and system stacks, error handling, and synchronization.

Change-Id: I1eed97c6dd4a6e3d351279e967b11c6e64898356
Reviewed-on: https://go-review.googlesource.com/34290
Reviewed-by: Rick Hudson <rlh@golang.org>
2017-01-06 18:30:36 +00:00
Austin Clements 618c291544 runtime: update big mgc.go comment
The comment describing the overall GC algorithm at the top of mgc.go
has gotten woefully out-of-date (and was possibly never
correct/complete). Update it to reflect the current workings of the
GC and the set of phases that we now divide it into.

Change-Id: I02143c0ebefe9d4cd7753349dab8045f0973bf95
Reviewed-on: https://go-review.googlesource.com/34711
Reviewed-by: Rick Hudson <rlh@golang.org>
2017-01-06 18:22:35 +00:00
Austin Clements 7aefdfded0 runtime: use 4K as the boundary of legal pointers
Currently, the check for legal pointers in stack copying uses
_PageSize (8K) as the minimum legal pointer. By default, Linux won't
let you map under 64K, but

1) it's less clear what other OSes allow or will allow in the future;

2) while mapping the first page is a terrible idea, mapping anywhere
above that is arguably more justifiable;

3) the compiler only assumes the first physical page (4K) is never
mapped.

Make the runtime consistent with the compiler and more robust by
changing the bad pointer check to use 4K as the minimum legal pointer.

This came out of discussions on CLs 34663 and 34719.

Change-Id: Idf721a788bd9699fb348f47bdd083cf8fa8bd3e5
Reviewed-on: https://go-review.googlesource.com/34890
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-01-06 16:19:14 +00:00
Lion Yang a2b615d527 crypto: detect BMI usability on AMD64 for sha1 and sha256
The existing implementations on AMD64 only detects AVX2 usability,
when they also contains BMI (bit-manipulation instructions).
These instructions crash the running program as 'unknown instructions'
on the architecture, e.g. i3-4000M, which supports AVX2 but not
support BMI.

This change added the detections for BMI1 and BMI2 to AMD64 runtime with
two flags as the result, `support_bmi1` and `support_bmi2`,
in runtime/runtime2.go. It also completed the condition to run AVX2 version
in packages crypto/sha1 and crypto/sha256.

Fixes #18512

Change-Id: I917bf0de365237740999de3e049d2e8f2a4385ad
Reviewed-on: https://go-review.googlesource.com/34850
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-01-05 15:37:37 +00:00
Michael Marineau 6a1cac2700 runtime: check sched_getaffinity return value
Android on ChromeOS uses a restrictive seccomp filter that blocks
sched_getaffinity, leading this code to index a slice by -errno.

Change-Id: Iec09a4f79dfbc17884e24f39bcfdad305de75b37
Reviewed-on: https://go-review.googlesource.com/34794
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
2017-01-03 22:35:42 +00:00
Vladimir Stefanovic 155d314e50 runtime: fix SP alignment in mips{,le} sigfwd
Fixes misc/cgo/testsigfwd, enabled for mips{,le} with the next commit
(https://golang.org/cl/34646).

Change-Id: I2bec894b0492fd4d84dd73a4faa19eafca760107
Reviewed-on: https://go-review.googlesource.com/34645
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-01-03 15:42:06 +00:00