mirror/go - go - Git Fam. Sieh

Commit Graph

Author	SHA1	Message	Date
Keith Randall	41dd1696ab	cmd/compile: fix heap dump test on android go_android_exec is looking for "exitcode=" to decide the result of running a test. The heap dump test nondeterministically prints "finalized" right at the end of the test. When the timing is just right, we print "finalizedexitcode=0" and confuse go_android_exec. This failure happens occasionally on the android builders. Change-Id: I4f73a4db05d8f40047ecd3ef3a881a4ae3741e26 Reviewed-on: https://go-review.googlesource.com/23861 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org>	2016-06-07 17:34:48 +00:00
Keith Randall	a871464e5a	runtime: fix typo Fixes #15962 Change-Id: I1949e0787f6c2b1e19b9f9d3af2f712606a6d4cf Reviewed-on: https://go-review.googlesource.com/23786 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-06-05 18:10:01 +00:00
Ian Lance Taylor	cf862478c8	runtime/cgo: add TSAN locks around mmap call Change-Id: I806cc5523b7b5e3278d01074bc89900d78700e0c Reviewed-on: https://go-review.googlesource.com/23736 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2016-06-03 18:26:01 +00:00
Michael Hudson-Doyle	26849746c9	cmd/internal/obj, runtime: fixes for defer in 386 shared libraries Any defer in a shared object crashed when GOARCH=386. This turns out to be two bugs: 1) Calls to morestack were not processed to be PIC safe (must have been possible to trigger this another way too) 2) jmpdefer needs to rewind the return address of the deferred function past the instructions that load the GOT pointer into BX, not just past the call Bug 2) requires re-introducing the a way for .s files to know when they are being compiled for dynamic linking but I've tried to do that in as minimal a way as possible. Fixes #15916 Change-Id: Ia0d09b69ec272a176934176b8eaef5f3bfcacf04 Reviewed-on: https://go-review.googlesource.com/23623 Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-06-03 02:50:27 +00:00
Ian Lance Taylor	03abde4971	runtime: only permit SetCgoTraceback to be called once Accept a duplicate call, but nothing else. Change-Id: Iec24bf5ddc3b0f0c559ad2158339aca698601743 Reviewed-on: https://go-review.googlesource.com/23692 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-06-02 19:24:55 +00:00
Ian Lance Taylor	88e0ec2979	runtime/cgo: avoid races on cgo_context_function Change-Id: Ie9e6fda675e560234e90b9022526fd689d770818 Reviewed-on: https://go-review.googlesource.com/23610 Reviewed-by: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-06-02 18:47:48 +00:00
Dmitry Vyukov	ba22172832	runtime: fix typo in comment Change-Id: I82e35770b45ccd1433dfae0af423073c312c0859 Reviewed-on: https://go-review.googlesource.com/23680 Reviewed-by: Andrew Gerrand <adg@golang.org>	2016-06-02 06:02:01 +00:00
Emmanuel Odeke	77026ef902	runtime: document heap scavenger memory summary Fixes #15212. Change-Id: I2628ec8333330721cddc5145af1ffda6f3e0c63f Reviewed-on: https://go-review.googlesource.com/23319 Reviewed-by: Austin Clements <austin@google.com>	2016-06-01 19:06:43 +00:00
Ian Lance Taylor	690de51ffa	runtime: fix restoring PC in ARM version of cgocallback_gofunc Fixes #15856. Change-Id: Ia8def161642087e4bd92a87298c77a0f9f83dc86 Reviewed-on: https://go-review.googlesource.com/23586 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Elias Naur <elias.naur@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-05-31 22:14:39 +00:00
Ian Lance Taylor	3d037cfaf8	runtime: pass signal context to cgo traceback function When doing a backtrace from a signal that occurs in C code compiled without using -fasynchronous-unwind-tables, we have to rely on frame pointers. In order to do that, the traceback function needs the signal context to reliably pick up the frame pointer. Change-Id: I7b45930fced01685c337d108e0f146057928f876 Reviewed-on: https://go-review.googlesource.com/23494 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-31 21:17:40 +00:00
Ian Lance Taylor	2256e38978	runtime: update pprof binary header URL The code has moved from code.google.com to github.com. Change-Id: I0cc9eb69b3fedc9e916417bc7695759632f2391f Reviewed-on: https://go-review.googlesource.com/23523 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-05-31 21:10:20 +00:00
Ian Lance Taylor	66736880ca	runtime/cgo: add TSAN acquire/release calls Add TSAN acquire/release calls to runtime/cgo to match the ones generated by cgo. This avoids a false positive race around the malloc memory used in runtime/cgo when other goroutines are simultaneously calling malloc and free from cgo. These new calls will only be used when building with CGO_CFLAGS and CGO_LDFLAGS set to -fsanitize=thread, which becomes a requirement to avoid all false positives when using TSAN. These are needed not just for runtime/cgo, but also for any runtime package that uses cgo (such as net and os/user). Add an unused attribute to the _cgo_tsan_acquire and _cgo_tsan_release functions, in case there are no actual cgo function calls. Add a test that checks that setting CGO_CFLAGS/CGO_LDFLAGS avoids a false positive report when using os/user. Change-Id: I0905c644ff7f003b6718aac782393fa219514c48 Reviewed-on: https://go-review.googlesource.com/23492 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2016-05-31 20:53:16 +00:00
Ian Lance Taylor	4223294eab	runtime/pprof, cmd/pprof: fix profiling for PIE In order to support pprof for position independent executables, pprof needs to adjust the PC addresses stored in the profile by the address at which the program is loaded. The legacy profiling support which we use already supports recording the GNU/Linux /proc/self/maps data immediately after the CPU samples, so do that. Also change the pprof symbolizer to use the information, if available, when looking up addresses in the Go pcline data. Fixes #15714. Change-Id: I4bf679210ef7c51d85cf873c968ce82db8898e3e Reviewed-on: https://go-review.googlesource.com/23525 Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>	2016-05-31 13:02:09 +00:00
Ilya Tocar	429bbf3312	strings: fix and reenable amd64 Index for 17-31 byte strings Fixes #15689 Change-Id: I56d0103738cc35cd5bc5e77a0e0341c0dd55530e Reviewed-on: https://go-review.googlesource.com/23440 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Nigel Tao <nigeltao@golang.org>	2016-05-27 22:57:32 +00:00
David Chase	31e13c83c2	[dev.ssa] Merge branch 'master' into dev.ssa Change-Id: Iabc80b6e0734efbd234d998271e110d2eaad41dd	2016-05-27 15:19:33 -04:00
Mikio Hara	c340f4867b	runtime: skip TestGdbBacktrace on netbsd Also adds missing copyright notice. Updates #15603. Change-Id: Icf4bb45ba5edec891491fe5f0039a8a25125d168 Reviewed-on: https://go-review.googlesource.com/23501 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-27 18:47:08 +00:00
Austin Clements	6a86dbe75f	runtime: always call stackfree on the system stack Currently when the garbage collector frees stacks of dead goroutines in markrootFreeGStacks, it calls stackfree on a regular user stack. This is a problem, since stackfree manipulates the stack cache in the per-P mcache, so if it grows the stack or gets preempted in the middle of manipulating the stack cache (which are both possible since it's on a user stack), it can easily corrupt the stack cache. Fix this by calling markrootFreeGStacks on the system stack, so that all calls to stackfree happen on the system stack. To prevent this bug in the future, mark stack functions that manipulate the mcache as go:systemstack. Fixes #15853. Change-Id: Ic0d1c181efb342f134285a152560c3a074f14a3d Reviewed-on: https://go-review.googlesource.com/23511 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-27 17:53:21 +00:00
Austin Clements	966baedfea	runtime: record Python stack on TestGdbPython failure For #15599. Change-Id: Icc2e58a3f314b7a098d78fe164ba36f5b2897de6 Reviewed-on: https://go-review.googlesource.com/23481 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-05-27 16:46:05 +00:00
Russ Cox	7fdec6216c	build: enable framepointer mode by default This has a minor performance cost, but far less than is being gained by SSA. As an experiment, enable it during the Go 1.7 beta. Having frame pointers on by default makes Linux's perf, Intel VTune, and other profilers much more useful, because it lets them gather a stack trace efficiently on profiling events. (It doesn't help us that much, since when we walk the stack we usually need to look up PC-specific information as well.) Fixes #15840. Change-Id: I4efd38412a0de4a9c87b1b6e5d11c301e63f1a2a Reviewed-on: https://go-review.googlesource.com/23451 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-26 19:02:00 +00:00
David Crawshaw	56e5e0b69c	runtime: tell race detector about reflectOffs.lock Fixes #15832 Change-Id: I6f3f45e3c21edd0e093ecb1d8a067907863478f5 Reviewed-on: https://go-review.googlesource.com/23441 Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2016-05-26 14:43:27 +00:00
Austin Clements	b92f423879	runtime: unwind BP in jmpdefer to match SP unwind The irregular calling convention for defers currently incorrectly manages the BP if frame pointers are enabled. Specifically, jmpdefer manipulates the SP as if its own caller, deferreturn, had returned. However, it does not manipulate the BP to match. As a result, when a BP-based traceback happens during a deferred function call, it unwinds to the function that performed the defer and then thinks that function called itself in an infinite regress. Fix this by making jmpdefer manipulate the BP as if deferreturn had actually returned. Fixes #12968. Updates #15840. Change-Id: Ic9cc7c863baeaf977883ed0c25a7e80e592cf066 Reviewed-on: https://go-review.googlesource.com/23457 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-26 13:54:05 +00:00
Russ Cox	d9557523c2	runtime: make framepointer mode safe for Windows A few other architectures have already defined a NOFRAME flag. Use it to disable frame pointer code on a few very low-level functions that must behave like Windows code. Makes the failing os/signal test pass on a Windows gomote. Change-Id: I982365f2c59a0aa302b4428c970846c61027cf3e Reviewed-on: https://go-review.googlesource.com/23456 Reviewed-by: Austin Clements <austin@google.com>	2016-05-26 13:53:01 +00:00
Russ Cox	8a1dc32447	runtime: add library startup support for ppc64le I have been running this patch inside Google against Go 1.6 for the last month. The new tests will probably break the builders but let's see exactly how they break. Change-Id: Ia65cf7d3faecffeeb4b06e9b80875c0e57d86d9e Reviewed-on: https://go-review.googlesource.com/23452 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-05-26 03:31:59 +00:00
Ian Lance Taylor	a5d1a72a40	cmd/cgo, runtime, runtime/cgo: TSAN support for malloc Acquire and release the TSAN synchronization point when calling malloc, just as we do when calling any other C function. If we don't do this, TSAN will report false positive errors about races calling malloc and free. We used to have a special code path for malloc and free, going through the runtime functions cmalloc and cfree. The special code path for cfree was no longer used even before this CL. This CL stops using the special code path for malloc, because there is no place along that path where we could conditionally insert the TSAN synchronization. This CL removes the support for the special code path for both functions. Instead, cgo now automatically generates the malloc function as though it were referenced as C.malloc. We need to automatically generate it even if C.malloc is not called, even if malloc and size_t are not declared, to support cgo-provided functions like C.CString. Change-Id: I829854ec0787a80f33fa0a8a0dc2ee1d617830e2 Reviewed-on: https://go-review.googlesource.com/23260 Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-05-25 23:22:24 +00:00
Russ Cox	10c8b2374f	runtime: align C library startup calls on amd64 This makes GOEXPERIMENT=framepointer, GOOS=darwin, and buildmode=carchive coexist. Change-Id: I9f6fb2f0f06f27df683e5b51f2fa55cd21872453 Reviewed-on: https://go-review.googlesource.com/23454 Reviewed-by: Austin Clements <austin@google.com>	2016-05-25 23:16:46 +00:00
Austin Clements	3be48b4dc8	runtime: pass gcWork to scanstack Currently scanstack obtains its own gcWork from the P for the duration of the stack scan and then, if called during mark termination, disposes the gcWork. However, this means that the number of workbufs allocated will be at least the number of stacks scanned during mark termination, which may be very high (especially during a STW GC). This happens because, in steady state, each scanstack will obtain a fresh workbuf (either from the empty list or by allocating it), fill it with the scan results, and then dispose it to the full list. Nothing is consuming from the full list during this (and hence nothing is recycling them to the empty list), so the length of the full list by the time mark termination starts draining it is at least the number of stacks scanned. Fix this by pushing the gcWork acquisition up the stack to either the gcDrain that calls markroot that calls scanstack (which batches across many stack scans and is the path taken during STW GC) or to newstack (which is still a single scanstack call, but this is roughly bounded by the number of Ps). This fix reduces the workbuf allocation for the test program from issue #15319 from 213 MB (roughly 2KB * 1e5 goroutines) to 10 MB. Fixes #15319. Note that there's potentially a similar issue in write barriers during mark 2. Fixing that will be more difficult since there's no broader non-preemptible context, but it should also be less of a problem since the full list is being drained during mark 2. Some overall improvements in the go1 benchmarks, plus the usual noise. No significant change in the garbage benchmark (time/op or GC memory). name old time/op new time/op delta BinaryTree17-12 2.54s ± 1% 2.51s ± 1% -1.09% (p=0.000 n=20+19) Fannkuch11-12 2.12s ± 0% 2.17s ± 0% +2.18% (p=0.000 n=19+18) FmtFprintfEmpty-12 45.1ns ± 1% 45.2ns ± 0% ~ (p=0.078 n=19+18) FmtFprintfString-12 127ns ± 0% 128ns ± 0% +1.08% (p=0.000 n=19+16) FmtFprintfInt-12 125ns ± 0% 122ns ± 1% -2.71% (p=0.000 n=14+18) FmtFprintfIntInt-12 196ns ± 0% 190ns ± 1% -2.91% (p=0.000 n=12+20) FmtFprintfPrefixedInt-12 196ns ± 0% 194ns ± 1% -0.94% (p=0.000 n=13+18) FmtFprintfFloat-12 253ns ± 1% 251ns ± 1% -0.86% (p=0.000 n=19+20) FmtManyArgs-12 807ns ± 1% 784ns ± 1% -2.85% (p=0.000 n=20+20) GobDecode-12 7.13ms ± 1% 7.12ms ± 1% ~ (p=0.351 n=19+20) GobEncode-12 5.89ms ± 0% 5.95ms ± 0% +0.94% (p=0.000 n=19+19) Gzip-12 219ms ± 1% 221ms ± 1% +1.35% (p=0.000 n=18+20) Gunzip-12 37.5ms ± 1% 37.4ms ± 0% ~ (p=0.057 n=20+19) HTTPClientServer-12 81.4µs ± 4% 81.9µs ± 3% ~ (p=0.118 n=17+18) JSONEncode-12 15.7ms ± 1% 15.8ms ± 1% +0.73% (p=0.000 n=17+18) JSONDecode-12 57.9ms ± 1% 57.2ms ± 1% -1.34% (p=0.000 n=19+19) Mandelbrot200-12 4.12ms ± 1% 4.10ms ± 0% -0.33% (p=0.000 n=19+17) GoParse-12 3.22ms ± 2% 3.25ms ± 1% +0.72% (p=0.000 n=18+20) RegexpMatchEasy0_32-12 70.6ns ± 1% 71.1ns ± 2% +0.63% (p=0.005 n=19+20) RegexpMatchEasy0_1K-12 240ns ± 0% 239ns ± 1% -0.59% (p=0.000 n=19+20) RegexpMatchEasy1_32-12 71.3ns ± 1% 71.3ns ± 1% ~ (p=0.844 n=17+17) RegexpMatchEasy1_1K-12 384ns ± 2% 371ns ± 1% -3.45% (p=0.000 n=19+20) RegexpMatchMedium_32-12 109ns ± 1% 108ns ± 2% -0.48% (p=0.029 n=19+19) RegexpMatchMedium_1K-12 34.3µs ± 1% 34.5µs ± 2% ~ (p=0.160 n=18+20) RegexpMatchHard_32-12 1.79µs ± 9% 1.72µs ± 2% -3.83% (p=0.000 n=19+19) RegexpMatchHard_1K-12 53.3µs ± 4% 51.8µs ± 1% -2.82% (p=0.000 n=19+20) Revcomp-12 386ms ± 0% 388ms ± 0% +0.72% (p=0.000 n=17+20) Template-12 62.9ms ± 1% 62.5ms ± 1% -0.57% (p=0.010 n=18+19) TimeParse-12 325ns ± 0% 331ns ± 0% +1.84% (p=0.000 n=18+19) TimeFormat-12 338ns ± 0% 343ns ± 0% +1.34% (p=0.000 n=18+20) [Geo mean] 52.7µs 52.5µs -0.42% Change-Id: Ib2d34736c4ae2ec329605b0fbc44636038d8d018 Reviewed-on: https://go-review.googlesource.com/23391 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-25 21:11:47 +00:00
Austin Clements	a1f7db88f8	runtime: document scanstack Also mark it go:systemstack and explain why. Change-Id: I88baf22741c04012ba2588d8e03dd3801d19b5c0 Reviewed-on: https://go-review.googlesource.com/23390 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-25 21:11:44 +00:00
Marcel van Lohuizen	23cb8864b5	runtime: use Run for more benchmarks Names for Append?Bytes are slightly changed in addition to adding a slash. Change-Id: I0291aa29c693f9040fd01368eaad9766259677df Reviewed-on: https://go-review.googlesource.com/23426 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-25 17:46:50 +00:00
Marcel van Lohuizen	095fbdcc91	runtime: use of Run for some benchmarks Names of sub-benchmarks are preserved, short of the additional slash. Change-Id: I9b3f82964f9a44b0d28724413320afd091ed3106 Reviewed-on: https://go-review.googlesource.com/23425 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-25 16:49:02 +00:00
Elias Naur	72eb46c5a0	runtime,runtime/cgo: save callee-saved FP register on arm Other GOARCHs already handle their callee-saved FP registers, but arm was missing. Without this change, code using Cgo and floating point code might fail in mysterious and hard to debug ways. There are no floating point registers when GOARM=5, so skip the registers when runtime.goarm < 6. darwin/arm doesn't support GOARM=5, so the check is left out of rt0_darwin_arm.s. Fixes #14876 Change-Id: I6bcb90a76df3664d8ba1f33123a74b1eb2c9f8b2 Reviewed-on: https://go-review.googlesource.com/23140 Run-TryBot: Elias Naur <elias.naur@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Minux Ma <minux@golang.org>	2016-05-25 06:54:28 +00:00
Robert Griesemer	93e8e70499	all: fixed a handful of typos Change-Id: Ib0683f27b44e2f107cca7a8dcc01d230cbcd5700 Reviewed-on: https://go-review.googlesource.com/23404 Reviewed-by: Alan Donovan <adonovan@google.com>	2016-05-24 21:18:03 +00:00
Austin Clements	a640d95172	runtime: update SP when jumping stacks in traceback When gentraceback starts on a system stack in sigprof, it is configured to jump to the user stack when it reaches the end of the system stack. Currently this updates the current frame's FP, but not its SP. This is okay on non-LR machines (x86) because frame.sp is only used to find defers, which the bottom-most frame of the user stack will never have. However, on LR machines, we use frame.sp to find the saved LR. We then use to resolve the function of the next frame, which is used to resolved the size of the next frame. Since we're not updating frame.sp on a stack jump, we read the saved LR from the system stack instead of the user stack and wind up resolving the wrong function and hence the wrong frame size for the next frame. This has had remarkably few ill effects (though the resulting profiles must be wrong). We noticed it because of a bad interaction with stack barriers. Specifically, once we get the next frame size wrong, we also get the location of its LR wrong. If we happen to get a stack slot that contains a stale stack barrier LR (for a stack barrier we already hit) and hasn't been overwritten with something else as we re-grew the stack, gentraceback will fail with a "found next stack barrier at ..." error, pointing at the slot that it thinks is an LR, but isn't. Fixes #15138. Updates #15313 (might fix it). Change-Id: I13cfa322b44c0c2f23ac2b3d03e12631e4a6406b Reviewed-on: https://go-review.googlesource.com/23291 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2016-05-24 21:07:24 +00:00
Austin Clements	44497ebacb	runtime: fix goroutine priority elevation Currently it's possible for user code to exploit the high scheduler priority of the GC worker in conjunction with the runnext optimization to elevate a user goroutine to high priority so it will always run even if there are other runnable goroutines. For example, if a goroutine is in a tight allocation loop, the following can happen: 1. Goroutine 1 allocates, triggering a GC. 2. G 1 attempts an assist, but fails and blocks. 3. The scheduler runs the GC worker, since it is high priority. Note that this also starts a new scheduler quantum. 4. The GC worker does enough work to satisfy the assist. 5. The GC worker readies G 1, putting it in runnext. 6. GC finishes and the scheduler runs G 1 from runnext, giving it the rest of the GC worker's quantum. 7. Go to 1. Even if there are other goroutines on the run queue, they never get a chance to run in the above sequence. This requires a confluence of circumstances that make it unlikely, though not impossible, that it would happen in "real" code. In the test added by this commit, we force this confluence by setting GOMAXPROCS to 1 and GOGC to 1 so it's easy for the test to repeated trigger GC and wake from a blocked assist. We fix this by making GC always put user goroutines at the end of the run queue, instead of in runnext. This makes it so user code can't piggy-back on the GC's high priority to make a user goroutine act like it has high priority. The only other situation where GC wakes user goroutines is waking all blocked assists at the end, but this uses the global run queue and hence doesn't have this problem. Fixes #15706. Change-Id: I1589dee4b7b7d0c9c8575ed3472226084dfce8bc Reviewed-on: https://go-review.googlesource.com/23172 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-19 18:18:13 +00:00
Austin Clements	91740582c3	runtime: add 'next' flag to ready Currently ready always puts the readied goroutine in runnext. We're going to have to change this for some uses, so add a flag for whether or not to use runnext. For now we always pass true so this is a no-op change. For #15706. Change-Id: Iaa66d8355ccfe4bbe347570cc1b1878c70fa25df Reviewed-on: https://go-review.googlesource.com/23171 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-19 18:17:58 +00:00
Joel Sing	0dcd330bc8	runtime/cgo: make cgo work with openbsd ABI changes OpenBSD 6.0 (due out November 2016) will support PT_TLS, which will allow for the OpenBSD cgo pthread_create() workaround to be removed. However, in order for Go to continue working on supported OpenBSD releases (the current release and the previous release - 5.9 and 6.0, once 6.0 is released), we cannot enable PT_TLS immediately. Instead, adjust the existing code so that it works with the previous TCB allocation and the new TIB allocation. This allows the same Go runtime to work on 5.8, 5.9 and later 6.0. Once OpenBSD 5.9 is no longer supported (May 2017, when 6.1 is released), PT_TLS can be enabled and the additional cgo runtime code removed. Change-Id: I3eed5ec593d80eea78c6656cb12557004b2c0c9a Reviewed-on: https://go-review.googlesource.com/23197 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Joel Sing <joel@sing.id.au> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-19 15:43:37 +00:00
Ian Lance Taylor	1f7a0d4b5e	runtime: don't do a plain throw when throwsplit == true The test case in #15639 somehow causes an invalid syscall frame. The failure is obscured because the throw occurs when throwsplit == true, which causes a "stack split at bad time" error when trying to print the throw message. This CL fixes the "stack split at bad time" by using systemstack. No test because there shouldn't be any way to trigger this error anyhow. Update #15639. Change-Id: I4240f3fd01bdc3c112f3ffd1316b68504222d9e1 Reviewed-on: https://go-review.googlesource.com/23153 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-05-19 04:37:45 +00:00
Ian Lance Taylor	c08436d1c8	runtime: print PC, not the counter, for a cgo traceback Change-Id: I54ed7a26a753afb2d6a72080e1f50ce9fba7c183 Reviewed-on: https://go-review.googlesource.com/23228 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-18 23:47:13 +00:00
Ian Lance Taylor	538537a28d	runtime: check only up to ptrdata bytes for pointers Fixes #14508. Change-Id: I237d0c5a79a73e6c97bdb2077d8ede613128b978 Reviewed-on: https://go-review.googlesource.com/23224 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-05-18 23:39:06 +00:00
Ian Lance Taylor	6ab45c09f6	runtime: add KeepAlive function Fixes #13347. Change-Id: I591a80a1566ce70efb5f68e3ad69e7e3ab98cd9b Reviewed-on: https://go-review.googlesource.com/23102 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-18 20:42:37 +00:00
Cuihtlauac ALVARADO	2380a039c0	runtime: in tests, make sure gdb does not start with a shell On some systems, gdb is set to: "startup-with-shell on". This breaks runtime_test. This just make sure gdb does not start by spawning a shell. Fixes #15354 Change-Id: Ia040931c61dea22f4fdd79665ab9f84835ecaa70 Reviewed-on: https://go-review.googlesource.com/23142 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-18 14:03:22 +00:00
Ian Lance Taylor	23a59ba17c	runtime: deflake TestSignalExitStatus The signal might get delivered to a different thread, and that thread might not run again before the currently running thread returns and exits. Sleep to give the other thread time to pick up the signal and crash. Not tested for all cases, but, optimistically: Fixes #14063. Change-Id: Iff58669ac6185ad91cce85e0e86f17497a3659fd Reviewed-on: https://go-review.googlesource.com/23203 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Mikio Hara <mikioh.mikioh@gmail.com>	2016-05-18 04:08:08 +00:00
James Chacon	733162fd6c	runtime: prevent racefini from being invoked more than once racefini calls __tsan_fini which is C code and at the end of it invoked the standard C library exit(3) call. This has undefined behavior if invoked more than once. Specifically in C++ programs it caused static destructors to run twice. At least on glibc impls it also means the at_exit handlers list (where those are stored) also free's a list entry when it completes these. So invoking twice results in a double free at exit which trips debug memory allocation tracking. Fix all of this by using an atomic as a boolean barrier around calls to racefini being invoked > 1 time. Fixes #15578 Change-Id: I49222aa9b8ded77160931f46434c61a8379570fc Reviewed-on: https://go-review.googlesource.com/22882 Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-18 01:04:55 +00:00
Ian Lance Taylor	c1b32acefb	runtime: yield after raising signal that should kill process Issue #15613 points out that the darwin builders have been getting regular failures in which a process that should exit with a SIGPIPE signal is instead exiting with exit status 2. The code calls runtime.raise. On most systems runtime.raise is the equivalent of pthread_kill(gettid(), sig); that is, it kills the thread with the signal, which should ensure that the program does not keep going. On darwin, however, runtime.raise is actually kill(getpid(), sig); that is, it sends a signal to the entire process. If the process decides to deliver the signal to a different thread, then it is possible that in some cases the thread that calls raise is able to execute the next system call before the signal is actually delivered. That would cause the observed error. I have not been able to recreate the problem myself, so I don't know whether this actually fixes it. But, optimistically: Fixed #15613. Change-Id: I60c0a9912aae2f46143ca1388fd85e9c3fa9df1f Reviewed-on: https://go-review.googlesource.com/23152 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-05-16 22:19:05 +00:00
Austin Clements	466cae6ca9	runtime: use GOTRACEBACK=system for TestStackBarrierProfiling This should help with debugging failures. For #15138 and #15477. Change-Id: I77db2b6375d8b4403d3edf5527899d076291e02c Reviewed-on: https://go-review.googlesource.com/23134 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-16 20:16:50 +00:00
Austin Clements	64770f642f	runtime: use conventional shift style for gcBitsChunkBytes The convention for writing something like "64 kB" is 64<<10, since this is easier to read than 1<<16. Update gcBitsChunkBytes to follow this convention. Change-Id: I5b5a3f726dcf482051ba5b1814db247ff3b8bb2f Reviewed-on: https://go-review.googlesource.com/23132 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-16 18:28:38 +00:00
Austin Clements	30ded16596	runtime: remove obsolete comment from scanobject Change-Id: I5ebf93b60213c0138754fc20888ae5ce60237b8c Reviewed-on: https://go-review.googlesource.com/23131 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-16 18:28:35 +00:00
Keith Randall	0bc14f57ec	strings: fix Contains on amd64 The 17-31 byte code is broken. Disabled it. Added a bunch of tests to at least cover the cases in indexShortStr. I'll channel Brad and wonder why this CL ever got in without any tests. Fixes #15679 Change-Id: I84a7b283a74107db865b9586c955dcf5f2d60161 Reviewed-on: https://go-review.googlesource.com/23106 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-15 05:21:03 +00:00
Austin Clements	6181db53db	runtime: improve heapBitsSetType documentation Currently the heapBitsSetType documentation says that there are no races on the heap bitmap, but that isn't exactly true. There are no write-write races, but there are read-write races. Expand the documentation to explain this and why it's okay. Change-Id: Ibd92b69bcd6524a40a9dd4ec82422b50831071ed Reviewed-on: https://go-review.googlesource.com/23092 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-14 13:49:57 +00:00
Austin Clements	d8b08c3aa4	runtime: perform publication barrier even for noscan objects Currently we only execute a publication barrier for scan objects (and skip it for noscan objects). This used to be okay because GC would never consult the object itself (so it wouldn't observe uninitialized memory even if it found a pointer to a noscan object), and the heap bitmap was pre-initialized to noscan. However, now we explicitly initialize the heap bitmap for noscan objects when we allocate them. While the GC will still never consult the contents of a noscan object, it does need to see the initialized heap bitmap. Hence, we need to execute a publication barrier to make the bitmap visible before user code can expose a pointer to the newly allocated object even for noscan objects. Change-Id: Ie4133c638db0d9055b4f7a8061a634d970627153 Reviewed-on: https://go-review.googlesource.com/23043 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-14 13:49:51 +00:00
Joel Sing	5bd37b8e78	runtime: stop using sigreturn on openbsd/386 In future releases of OpenBSD, the sigreturn syscall will no longer exist. As such, stop using sigreturn on openbsd/386 and just return from the signal trampoline (as we already do for openbsd/amd64 and openbsd/arm). Change-Id: Ic4de1795bbfbfb062a685832aea0d597988c6985 Reviewed-on: https://go-review.googlesource.com/23024 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-12 07:32:01 +00:00
Brad Fitzpatrick	9628e6fd1d	runtime/testdata/testprogcgo: fix Windows C compiler warning Noticed and fix by Alex Brainman. Tested in https://golang.org/cl/23005 (which makes all compiler warnings fatal during development) Fixes #15623 Change-Id: Ic19999fce8bb8640d963965cc328574efadd7855 Reviewed-on: https://go-review.googlesource.com/23010 Reviewed-by: Alex Brainman <alex.brainman@gmail.com>	2016-05-10 23:11:44 +00:00
Cherry Zhang	fdc4a964d2	[dev.ssa] cmd/compile/internal/gc, runtime: use 32-bit load for writeBarrier check Use 32-bit load for writeBarrier check on all architectures. Padding added to runtime structure. Updates #15365, #15492. Change-Id: I5d3dadf8609923fe0fe4fcb384a418b7b9624998 Reviewed-on: https://go-review.googlesource.com/22855 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-10 17:34:30 +00:00
Austin Clements	256a9670cc	runtime: fix some out of date comments in bitmap code Change-Id: I4613aa6d62baba01686bbab10738a7de23daae30 Reviewed-on: https://go-review.googlesource.com/22971 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-09 19:24:48 +00:00
Dmitry Vyukov	aeecee8ce4	runtime/race: deflake test The test sometimes fails on builders. The test uses sleeps to establish the necessary goroutine execution order. If sleeps undersleep/oversleep the race is still reported, but it can be reported when the main test goroutine returns. In such case test driver can't match the race with the test and reports failure. Wait for both test goroutines to ensure that the race is reported in the test scope. Fixes #15579 Change-Id: I0b9bec0ebfb0c127d83eb5325a7fe19ef9545050 Reviewed-on: https://go-review.googlesource.com/22951 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-05-09 14:50:18 +00:00
Elias Naur	e6ec82067a	runtime: use entire address space on 32 bit In issue #13992, Russ mentioned that the heap bitmap footprint was halved but that the bitmap size calculation hadn't been updated. This presents the opportunity to either halve the bitmap size or double the addressable virtual space. This CL doubles the addressable virtual space. On 32 bit this can be tweaked further to allow the bitmap to cover the entire 4GB virtual address space, removing a failure mode if the kernel hands out memory with a too low address. First, fix the calculation and double _MaxArena32 to cover 4GB virtual memory space with the same bitmap size (256 MB). Then, allow the fallback mode for the initial memory reservation on 32 bit (or 64 bit with too little available virtual memory) to not include space for the arena. mheap.sysAlloc will automatically reserve additional space when the existing arena is full. Finally, set arena_start to 0 in 32 bit mode, so that any address is acceptable for subsequent (additional) reservations. Before, the bitmap was always located just before arena_start, so fix the two places relying on that assumption: Point the otherwise unused mheap.bitmap to one byte after the end of the bitmap, and use it for bitmap addressing instead of arena_start. With arena_start set to 0 on 32 bit, the cgoInRange check is no longer a sufficient check for Go pointers. Introduce and call inHeapOrStack to check whether a pointer is to the Go heap or stack. While we're here, remove sysReserveHigh which seems to be unused. Fixes #13992 Change-Id: I592b513148a50b9d3967b5c5d94b86b3ec39acc2 Reviewed-on: https://go-review.googlesource.com/20471 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-07 03:04:39 +00:00
Brad Fitzpatrick	131231b8db	os: rename remaining four os1_.go files to os_.go Change-Id: Ice9c234960adc7857c8370b777a0b18e29d59281 Reviewed-on: https://go-review.googlesource.com/22853 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-06 17:47:44 +00:00
Brad Fitzpatrick	61602b0e9e	runtime: delete empty files I meant to delete these in CL 22850, actually. Change-Id: I0c286efd2b9f1caf0221aa88e3bcc03649c89517 Reviewed-on: https://go-review.googlesource.com/22851 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-06 17:04:11 +00:00
Brad Fitzpatrick	2dc680007e	runtime: merge the last four os-vs-os1 files together Change-Id: Ib0ba691c4657fe18a4659753e70d97c623cb9c1d Reviewed-on: https://go-review.googlesource.com/22850 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-05-06 16:03:25 +00:00
Shenghou Ma	2e32efc44a	runtime: get randomness from AT_RANDOM AUXV on linux/mips64x Fixes #15148. Change-Id: If3b628f30521adeec1625689dbc98aaf4a9ec858 Reviewed-on: https://go-review.googlesource.com/22811 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Minux Ma <minux@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-06 05:50:02 +00:00
Russ Cox	88d3db0a5b	runtime: stop traceback at foreign function This can only happen when profiling and there is foreign code at the top of the g0 stack but we're not in cgo. That in turn only happens with the race detector. Fixes #13568. Change-Id: I23775132c9c1a3a3aaae191b318539f368adf25e Reviewed-on: https://go-review.googlesource.com/18322 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-06 00:54:25 +00:00
Joe Tsai	acc757f678	all: use SeekStart, SeekCurrent, SeekEnd CL/19862 (`f79b50b8d5`) recently introduced the constants SeekStart, SeekCurrent, and SeekEnd to the io package. We should use these constants consistently throughout the code base. Updates #15269 Change-Id: If7fcaca7676e4a51f588528f5ced28220d9639a2 Reviewed-on: https://go-review.googlesource.com/22097 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Joe Tsai <joetsai@digital-static.net> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-06 00:10:41 +00:00
David Chase	6db98a3c51	cmd/compile: repair MININT conversion bug in arm softfloat Negative-case conversion code was wrong for minimum int32, used negate-then-widen instead of widen-then-negate. Test already exists; this fixes the failure. Fixes #15563. Change-Id: I4b0b3ae8f2c9714bdcc405d4d0b1502ccfba2b40 Reviewed-on: https://go-review.googlesource.com/22830 Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-05 22:29:25 +00:00
Keith Randall	802966f7b3	[dev.ssa] Merge remote-tracking branch 'origin/master' into mergebranch Merge from tip into ssa. Change-Id: Icbc1c46d9f4721e4a0f99a24dd708044407ee9f7	2016-05-05 14:24:52 -07:00
Keith Randall	ab150e1ac9	[dev.ssa] all: merge from tip to get dev.ssa current So we can start working on other architectures here. Change is a dummy to keep git happy. Change-Id: I1caa62a242790601810a1ff72af7ea9773d4da76 Reviewed-on: https://go-review.googlesource.com/22822 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-05-05 20:47:51 +00:00
Emmanuel Odeke	1a7fc2357b	runtime: print signal name in panic, if name is known Adds a small function signame that infers a signal name from the signal table, otherwise will fallback to using hex(sig) as previously. No signal table is present for Windows hence it will always print the hex value. Sample code and new result: ```go package main import ( "fmt" "time" ) func main() { defer func() { if err := recover(); err != nil { fmt.Printf("err=%v\n", err) } }() ticker := time.Tick(1e9) for { <-ticker } } ``` ```shell $ go run main.go & $ kill -11 <pid> fatal error: unexpected signal during runtime execution [signal SIGSEGV: segmentation violation code=0x1 addr=0xb01dfacedebac1e pc=0xc71db] ... ``` Fixes #13969 Change-Id: Ie6be312eb766661f1cea9afec352b73270f27f9d Reviewed-on: https://go-review.googlesource.com/22753 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-05 19:58:00 +00:00
Lynn Boger	eeca3ba92f	sync/atomic, runtime/internal/atomic: improve ppc64x atomics The following performance improvements have been made to the low-level atomic functions for ppc64le & ppc64: - For those cases containing a lwarx and stwcx (or other sizes): sync, lwarx, maybe something, stwcx, loop to sync, sync, isync The sync is moved before (outside) the lwarx/stwcx loop, and the sync after is removed, so it becomes: sync, lwarx, maybe something, stwcx, loop to lwarx, isync - For the Or8 and And8, the shifting and manipulation of the address to the word aligned version were removed and the instructions were changed to use lbarx, stbcx instead of register shifting, xor, then lwarx, stwcx. - New instructions LWSYNC, LBAR, STBCC were tested and added. runtime/atomic_ppc64x.s was changed to use the LWSYNC opcode instead of the WORD encoding. Fixes #15469 Ran some of the benchmarks in the runtime and sync directories. Some results varied from run to run but the trend was improvement based on best times for base and new: runtime.test: BenchmarkChanNonblocking-128 0.88 0.89 +1.14% BenchmarkChanUncontended-128 569 511 -10.19% BenchmarkChanContended-128 63110 53231 -15.65% BenchmarkChanSync-128 691 598 -13.46% BenchmarkChanSyncWork-128 11355 11649 +2.59% BenchmarkChanProdCons0-128 2402 2090 -12.99% BenchmarkChanProdCons10-128 1348 1363 +1.11% BenchmarkChanProdCons100-128 1002 746 -25.55% BenchmarkChanProdConsWork0-128 2554 2720 +6.50% BenchmarkChanProdConsWork10-128 1909 1804 -5.50% BenchmarkChanProdConsWork100-128 1624 1580 -2.71% BenchmarkChanCreation-128 237 212 -10.55% BenchmarkChanSem-128 705 667 -5.39% BenchmarkChanPopular-128 5081190 4497566 -11.49% BenchmarkCreateGoroutines-128 532 473 -11.09% BenchmarkCreateGoroutinesParallel-128 35.0 34.7 -0.86% BenchmarkCreateGoroutinesCapture-128 4923 4200 -14.69% sync.test: BenchmarkUncontendedSemaphore-128 112 94.2 -15.89% BenchmarkContendedSemaphore-128 133 128 -3.76% BenchmarkMutexUncontended-128 1.90 1.67 -12.11% BenchmarkMutex-128 353 310 -12.18% BenchmarkMutexSlack-128 304 283 -6.91% BenchmarkMutexWork-128 554 541 -2.35% BenchmarkMutexWorkSlack-128 567 556 -1.94% BenchmarkMutexNoSpin-128 275 242 -12.00% BenchmarkMutexSpin-128 1129 1030 -8.77% BenchmarkOnce-128 1.08 0.96 -11.11% BenchmarkPool-128 29.8 27.4 -8.05% BenchmarkPoolOverflow-128 40564 36583 -9.81% BenchmarkSemaUncontended-128 3.14 2.63 -16.24% BenchmarkSemaSyntNonblock-128 1087 1069 -1.66% BenchmarkSemaSyntBlock-128 897 893 -0.45% BenchmarkSemaWorkNonblock-128 1034 1028 -0.58% BenchmarkSemaWorkBlock-128 949 886 -6.64% Change-Id: I4403fb29d3cd5254b7b1ce87a216bd11b391079e Reviewed-on: https://go-review.googlesource.com/22549 Reviewed-by: Michael Munday <munday@ca.ibm.com> Reviewed-by: Minux Ma <minux@golang.org>	2016-05-05 18:52:28 +00:00
Cherry Zhang	bcd4b84bc5	runtime: skip TestCgoCallbackGC on linux/mips64x Builder is too slow. This test passed on builder machines but took 15+ min. Change-Id: Ief9d67ea47671a57e954e402751043bc1ce09451 Reviewed-on: https://go-review.googlesource.com/22798 Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-05 14:58:14 +00:00
Ian Lance Taylor	34f97d28d2	runtime: put tracebackctxt C functions in .c file Since tracebackctxt.go uses //export functions, the C functions can't be externally visible in the C comment. The code was using attributes to work around that, but that failed on Windows. Change-Id: If4449fd8209a8998b4f6855ea89e5db1471b2981 Reviewed-on: https://go-review.googlesource.com/22786 Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-05 01:19:18 +00:00
Mohit Agarwal	4d6788ecae	runtime: clean up profiling data files produced by TestCgoPprof Fixes #15541 Change-Id: I9b6835157db0eb86de13591e785f971ffe754baa Reviewed-on: https://go-review.googlesource.com/22783 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-04 22:12:26 +00:00
Shenghou Ma	43d2a10e26	runtime/internal/atomic: fix vet warnings Change-Id: Ib29cf7abbbdaed81e918e5e41bca4e9b8da24621 Reviewed-on: https://go-review.googlesource.com/22503 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-05-04 16:50:22 +00:00
Cherry Zhang	b6687c8933	runtime: add linux/mips64x cgo support Change-Id: Id40dd05b7b264f3b779fdf9ccc2421ba4bc70589 Reviewed-on: https://go-review.googlesource.com/19806 Reviewed-by: Minux Ma <minux@golang.org>	2016-05-04 16:41:10 +00:00
Cherry Zhang	6e90432342	runtime/cgo: add context argument to crosscall2 on mips64 Change-Id: Id018516075842afd8af12fbf207763a851d5a851 Reviewed-on: https://go-review.googlesource.com/22754 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-05-04 16:40:44 +00:00
Ian Lance Taylor	84e808043f	runtime: use cgo traceback for SIGPROF If we collected a cgo traceback when entering the SIGPROF signal handler, record it as part of the profiling stack trace. This serves as the promised test for https://golang.org/cl/21055 . Change-Id: I5f60cd6cea1d9b7c3932211483a6bfab60ed21d2 Reviewed-on: https://go-review.googlesource.com/22650 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-05-04 00:08:19 +00:00
Dmitry Vyukov	caa2147532	runtime: per-P contexts for race detector Race runtime also needs local malloc caches and currently uses a mix of per-OS-thread and per-goroutine caches. This leads to increased memory consumption. But more importantly cache of synchronization objects is per-goroutine and we don't always have goroutine context when feeing memory in GC. As the result synchronization object descriptors leak (more precisely, they can be reused if another synchronization object is recreated at the same address, but it does not always help). For example, the added BenchmarkSyncLeak has effectively runaway memory consumption (based on a real long running server). This change updates race runtime with support for per-P contexts. BenchmarkSyncLeak now stabilizes at ~1GB memory consumption. Long term, this will allow us to remove race runtime dependency on glibc (as malloc is the main cornerstone). I've also implemented a different scheme to pass P context to race runtime: scheduler notified race runtime about association between G and P by calling procwire(g, p)/procunwire(g, p). But it turned out to be very messy as we have lots of places where the association changes (e.g. syscalls). So I dropped it in favor of the current scheme: race runtime asks scheduler about the current P. Fixes #14533 Change-Id: Iad10d2f816a44affae1b9fed446b3580eafd8c69 Reviewed-on: https://go-review.googlesource.com/19970 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-03 11:00:43 +00:00
Dmitry Vyukov	fcd7c02c70	runtime: fix CPU underutilization Runqempty is a critical predicate for scheduler. If runqempty spuriously returns true, then scheduler can fail to schedule arbitrary number of runnable goroutines on idle Ps for arbitrary long time. With the addition of runnext runqempty predicate become broken (can spuriously return true). Consider that runnext is not nil and the main array is empty. Runqempty observes that the array is empty, then it is descheduled for some time. Then queue owner pushes another element to the queue evicting runnext into the array. Then queue owner pops runnext. Then runqempty resumes and observes runnext is nil and returns true. But there were no point in time when the queue was empty. Fix runqempty predicate to not return true spuriously. Change-Id: Ifb7d75a699101f3ff753c4ce7c983cf08befd31e Reviewed-on: https://go-review.googlesource.com/20858 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-03 10:06:32 +00:00
Emmanuel Odeke	53fd522c0d	all: make copyright headers consistent with one space after period Follows suit with https://go-review.googlesource.com/#/c/20111. Generated by running $ grep -R 'Go Authors. All' * \| cut -d":" -f1 \| while read F;do perl -pi -e 's/Go Authors. All/Go Authors. All/g' $F;done The code in cmd/internal/unvendor wasn't changed. Fixes #15213 Change-Id: I4f235cee0a62ec435f9e8540a1ec08ae03b1a75f Reviewed-on: https://go-review.googlesource.com/21819 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-02 13:43:18 +00:00
Austin Clements	77c7f12438	runtime: update some comments This updates some comments that became out of date when we moved the mark bit out of the heap bitmap and started using the high bit for the first word as a scan/dead bit. Change-Id: I4a572d16db6114cadff006825466c1f18359f2db Reviewed-on: https://go-review.googlesource.com/22662 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-01 03:31:50 +00:00
Cherry Zhang	5d002dbc21	runtime/cgo: add linux/mips64x cgo support MIPS N64 ABI passes arguments in registers R4-R11, return value in R2. R16-R23, R28, R30 and F24-F31 are callee-save. gcc PIC code expects to be called with indirect call through R25. Change-Id: I24f582b4b58e1891ba9fd606509990f95cca8051 Reviewed-on: https://go-review.googlesource.com/19805 Reviewed-by: Minux Ma <minux@golang.org>	2016-05-01 02:39:50 +00:00
Cherry Zhang	073d292c45	cmd/link, runtime: add external linking support for linux/mips64x Fixes #12560 Change-Id: Ic2004fc7b09f2dbbf83c41f8c6307757c0e1676d Reviewed-on: https://go-review.googlesource.com/19803 Reviewed-by: Minux Ma <minux@golang.org>	2016-05-01 02:38:37 +00:00
Cherry Zhang	981395103e	cmd/internal/obj/mips et al.: introduce SB register on mips64x SB register (R28) is introduced for access external addresses with shorter instruction sequences. It is loaded at entry points. External data within 2G of SB can be accessed this way. cmd/internal/obj: relocaltion R_ADDRMIPS is split into two relocations R_ADDRMIPS and R_ADDRMIPSU, handling the low 16 bits and the "upper" 16 bits of external addresses, respectively, since the instructios may not be adjacent. It might be better if relocation Variant could be used. cmd/link/internal/mips64: support new relocations. cmd/compile/internal/mips64: reserve SB register. runtime: initialize SB register at entry points. Change-Id: I5f34868f88c5a9698c042a8a1f12f76806c187b9 Reviewed-on: https://go-review.googlesource.com/19802 Reviewed-by: Minux Ma <minux@golang.org>	2016-05-01 02:36:46 +00:00
Cherry Zhang	a409fb80b0	cmd/internal/obj/mips, runtime: change REGTMP to R23 Leave R28 to SB register, which will be introduced in CL 19802. Change-Id: I1cf7a789695c5de664267ec8086bfb0b043ebc14 Reviewed-on: https://go-review.googlesource.com/19863 Reviewed-by: Minux Ma <minux@golang.org>	2016-05-01 02:36:28 +00:00
Austin Clements	a20fd1f6ba	runtime: reclaim scan/dead bit in first word With the switch to separate mark bitmaps, the scan/dead bit for the first word of each object is now unused. Reclaim this bit and use it as a scan/dead bit, just like words three and on. The second word is still used for checkmark. This dramatically simplifies heapBitsSetTypeNoScan and hasPointers, since they no longer need different cases for 1, 2, and 3+ word objects. They can instead just manipulate the heap bitmap for the first word and be done with it. In order to enable this, we change heapBitsSetType and runGCProg to always set the scan/dead bit to scan for the first word on every code path. Since these functions only apply to types that have pointers, there's no need to do this conditionally: it's always necessary to set the scan bit in the first word. We also change every place that scans an object and checks if there are more pointers. Rather than only checking morePointers if the word is >= 2, we now check morePointers if word != 1 (since that's the checkmark word). Looking forward, we should probably reclaim the checkmark bit, too, but that's going to be quite a bit more work. Tested by setting doubleCheck in heapBitsSetType and running all.bash on both linux/amd64 and linux/386, and by running GOGC=10 all.bash. This particularly improves the FmtFprintf* go1 benchmarks, since they do a large amount of noscan allocation. name old time/op new time/op delta BinaryTree17-12 2.34s ± 1% 2.38s ± 1% +1.70% (p=0.000 n=17+19) Fannkuch11-12 2.09s ± 0% 2.09s ± 1% ~ (p=0.276 n=17+16) FmtFprintfEmpty-12 44.9ns ± 2% 44.8ns ± 2% ~ (p=0.340 n=19+18) FmtFprintfString-12 127ns ± 0% 125ns ± 0% -1.57% (p=0.000 n=16+15) FmtFprintfInt-12 128ns ± 0% 122ns ± 1% -4.45% (p=0.000 n=15+20) FmtFprintfIntInt-12 207ns ± 1% 193ns ± 0% -6.55% (p=0.000 n=19+14) FmtFprintfPrefixedInt-12 197ns ± 1% 191ns ± 0% -2.93% (p=0.000 n=17+18) FmtFprintfFloat-12 263ns ± 0% 248ns ± 1% -5.88% (p=0.000 n=15+19) FmtManyArgs-12 794ns ± 0% 779ns ± 1% -1.90% (p=0.000 n=18+18) GobDecode-12 7.14ms ± 2% 7.11ms ± 1% ~ (p=0.072 n=20+20) GobEncode-12 5.85ms ± 1% 5.82ms ± 1% -0.49% (p=0.000 n=20+20) Gzip-12 218ms ± 1% 215ms ± 1% -1.22% (p=0.000 n=19+19) Gunzip-12 36.8ms ± 0% 36.7ms ± 0% -0.18% (p=0.006 n=18+20) HTTPClientServer-12 77.1µs ± 4% 77.1µs ± 3% ~ (p=0.945 n=19+20) JSONEncode-12 15.6ms ± 1% 15.9ms ± 1% +1.68% (p=0.000 n=18+20) JSONDecode-12 55.2ms ± 1% 53.6ms ± 1% -2.93% (p=0.000 n=17+19) Mandelbrot200-12 4.05ms ± 1% 4.05ms ± 0% ~ (p=0.306 n=17+17) GoParse-12 3.14ms ± 1% 3.10ms ± 1% -1.31% (p=0.000 n=19+18) RegexpMatchEasy0_32-12 69.3ns ± 1% 70.0ns ± 0% +0.89% (p=0.000 n=19+17) RegexpMatchEasy0_1K-12 237ns ± 1% 236ns ± 0% -0.62% (p=0.000 n=19+16) RegexpMatchEasy1_32-12 69.5ns ± 1% 70.3ns ± 1% +1.14% (p=0.000 n=18+17) RegexpMatchEasy1_1K-12 377ns ± 1% 366ns ± 1% -3.03% (p=0.000 n=15+19) RegexpMatchMedium_32-12 107ns ± 1% 107ns ± 2% ~ (p=0.318 n=20+19) RegexpMatchMedium_1K-12 33.8µs ± 3% 33.5µs ± 1% -1.04% (p=0.001 n=20+19) RegexpMatchHard_32-12 1.68µs ± 1% 1.73µs ± 0% +2.50% (p=0.000 n=20+18) RegexpMatchHard_1K-12 50.8µs ± 1% 52.0µs ± 1% +2.50% (p=0.000 n=19+18) Revcomp-12 381ms ± 1% 385ms ± 1% +1.00% (p=0.000 n=17+18) Template-12 64.9ms ± 3% 62.6ms ± 1% -3.55% (p=0.000 n=19+18) TimeParse-12 324ns ± 0% 328ns ± 1% +1.25% (p=0.000 n=18+18) TimeFormat-12 345ns ± 0% 334ns ± 0% -3.31% (p=0.000 n=15+17) [Geo mean] 52.1µs 51.5µs -1.00% Change-Id: I13e74da3193a7f80794c654f944d1f0d60817049 Reviewed-on: https://go-review.googlesource.com/22632 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-30 16:49:54 +00:00
Austin Clements	d5e3d08b3a	runtime: use morePointers and isPointer in more places This makes this code better self-documenting and makes it easier to find these places in the future. Change-Id: I31dc5598ae67f937fb9ef26df92fd41d01e983c3 Reviewed-on: https://go-review.googlesource.com/22631 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-30 16:49:50 +00:00
Austin Clements	a5d3f7ece9	runtime: avoid conditional execution in morePointers and isPointer heapBits.bits is carefully written to produce good machine code. Use it in heapBits.morePointers and heapBits.isPointer to get good machine code there, too. Change-Id: I208c7d0d38697e7a22cad67f692162589b75f1e2 Reviewed-on: https://go-review.googlesource.com/22630 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-30 16:49:47 +00:00
Michael Munday	58f52cbb79	runtime: fix cgocallback_gofunc on ppc64x Fix issues introduced in `5f9a870`. Change-Id: Ia75945ef563956613bf88bbe57800a96455c265d Reviewed-on: https://go-review.googlesource.com/22661 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-30 03:49:22 +00:00
Ian Lance Taylor	9fe572e509	runtime: fix cgocallback_gofunc argument passing on arm64 Change-Id: I4b34bcd5cde71ecfbb352b39c4231de6168cc7f3 Reviewed-on: https://go-review.googlesource.com/22651 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Munday <munday@ca.ibm.com>	2016-04-29 23:10:52 +00:00
Ian Lance Taylor	5f9a870bf1	cmd/cgo, runtime, runtime/cgo: use cgo context function Add support for the context function set by runtime.SetCgoTraceback. The context function was added in CL 17761, without support. This CL is the support. This CL has not been tested for real C code, as a working context function for C code requires unwind support that does not seem to exist. I wanted to get the CL out before the freeze. I apologize for the length of this CL. It's mostly plumbing, but unfortunately the plumbing is processor-specific. Change-Id: I8ce11a0de9b3dafcc29efd2649d776e93bff0e90 Reviewed-on: https://go-review.googlesource.com/22508 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-29 22:07:36 +00:00
Rick Hudson	56b5491262	Merge remote-tracking branch 'origin/dev.garbage' This commit moves the GC from free list allocation to bit mark allocation. Instead of using the bitmaps generated during the mark phases to generate free list and then using the free lists for allocation we allocate directly from the bitmaps. The change in the garbage benchmark name old time/op new time/op delta XBenchGarbage-12 2.22ms ± 1% 2.13ms ± 1% -3.90% (p=0.000 n=18+18) Change-Id: I17f57233336f0ca5ef5404c3be4ecb443ab622aa	2016-04-29 13:56:44 -04:00
Rick Hudson	e9eaa181fc	[dev.garbage] runtime: simplify nextFreeFast so it is inlined nextFreeFast is currently not inlined by the compiler due to its size and complexity. This CL simplifies nextFreeFast by letting the slow path handle (nextFree) handle a corner cases. Change-Id: Ia9c5d1a7912bcb4bec072f5fd240f0e0bafb20e4 Reviewed-on: https://go-review.googlesource.com/22598 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com>	2016-04-29 16:47:11 +00:00
Austin Clements	b3579c095e	[dev.garbage] runtime: revive sweep fast path sweep used to skip mcental.freeSpan (and its locking) if it didn't find any new free objects. We lost that optimization when the freed-object counting changed in dad83f7 to count total free objects instead of newly freed objects. The previous commit brings back counting of newly freed objects, so we can easily revive this optimization by checking that count (like we used to) instead of the total free objects count. Change-Id: I43658707a1c61674d0366124d5976b00d98741a9 Reviewed-on: https://go-review.googlesource.com/22596 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-04-29 15:25:28 +00:00
Austin Clements	d97625ae9e	[dev.garbage] runtime: fix nfree accounting Commit `8dda1c4` changed the meaning of "nfree" in sweep from the number of newly freed objects to the total number of free objects in the span, but didn't update where sweep added nfree to c.local_nsmallfree. Hence, we're over-accounting the number of frees. This is causing TestArrayHash to fail with "too many allocs NNN - hash not balanced". Fix this by computing the number of newly freed objects and adding that to c.local_nsmallfree, so it behaves like it used to. Computing this requires a small tweak to mallocgc: apparently we've never set s.allocCount when allocating a large object; fix this by setting it to 1 so sweep doesn't get confused. Change-Id: I31902ffd310110da4ffd807c5c06f1117b872dc8 Reviewed-on: https://go-review.googlesource.com/22595 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2016-04-29 15:25:26 +00:00
Austin Clements	6d11490539	[dev.garbage] runtime: fix allocfreetrace We broke tracing of freed objects in GODEBUG=allocfreetrace=1 mode when we removed the sweep over the mark bitmap. Fix it by re-introducing the sweep over the bitmap specifically if we're in allocfreetrace mode. This doesn't have to be even remotely efficient, since the overhead of allocfreetrace is huge anyway, so we can keep the code for this down to just a few lines. Change-Id: I9e176b3b04c73608a0ea3068d5d0cd30760ebd40 Reviewed-on: https://go-review.googlesource.com/22592 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-04-29 15:08:21 +00:00
Austin Clements	38f674687a	[dev.garbage] runtime: reintroduce no-zeroing optimization Currently we always zero objects when we allocate them. We used to have an optimization that would not zero objects that had not been allocated since the whole span was last zeroed (either by getting it from the system or by getting it from the heap, which does a bulk zero), but this depended on the sweeper clobbering the first two words of each object. Hence, we lost this optimization when the bitmap sweeper went away. Re-introduce this optimization using a different mechanism. Each span already keeps a flag indicating that it just came from the OS or was just bulk zeroed by the mheap. We can simply use this flag to know when we don't need to zero an object. This is slightly less efficient than the old optimization: if a span gets allocated and partially used, then GC happens and the span gets returned to the mcentral, then the span gets re-acquired, the old optimization knew that it only had to re-zero the objects that had been reclaimed, whereas this optimization will re-zero everything. However, in this case, you're already paying for the garbage collection, and you've only wasted one zeroing of the span, so in practice there seems to be little difference. (If we did want to revive the full optimization, each span could keep track of a frontier beyond which all free slots are zeroed. I prototyped this and it didn't obvious do any better than the much simpler approach in this commit.) This significantly improves BinaryTree17, which is allocation-heavy (and runs first, so most pages are already zeroed), and slightly improves everything else. name old time/op new time/op delta XBenchGarbage-12 2.15ms ± 1% 2.14ms ± 1% -0.80% (p=0.000 n=17+17) name old time/op new time/op delta BinaryTree17-12 2.71s ± 1% 2.56s ± 1% -5.73% (p=0.000 n=18+19) DivconstI64-12 1.70ns ± 1% 1.70ns ± 1% ~ (p=0.562 n=18+18) DivconstU64-12 1.74ns ± 2% 1.74ns ± 1% ~ (p=0.394 n=20+20) DivconstI32-12 1.74ns ± 0% 1.74ns ± 0% ~ (all samples are equal) DivconstU32-12 1.66ns ± 1% 1.66ns ± 0% ~ (p=0.516 n=15+16) DivconstI16-12 1.84ns ± 0% 1.84ns ± 0% ~ (all samples are equal) DivconstU16-12 1.82ns ± 0% 1.82ns ± 0% ~ (all samples are equal) DivconstI8-12 1.79ns ± 0% 1.79ns ± 0% ~ (all samples are equal) DivconstU8-12 1.60ns ± 0% 1.60ns ± 1% ~ (p=0.603 n=17+19) Fannkuch11-12 2.11s ± 1% 2.11s ± 0% ~ (p=0.333 n=16+19) FmtFprintfEmpty-12 45.1ns ± 4% 45.4ns ± 5% ~ (p=0.111 n=20+20) FmtFprintfString-12 134ns ± 0% 129ns ± 0% -3.45% (p=0.000 n=18+16) FmtFprintfInt-12 131ns ± 1% 129ns ± 1% -1.54% (p=0.000 n=16+18) FmtFprintfIntInt-12 205ns ± 2% 203ns ± 0% -0.56% (p=0.014 n=20+18) FmtFprintfPrefixedInt-12 200ns ± 2% 197ns ± 1% -1.48% (p=0.000 n=20+18) FmtFprintfFloat-12 256ns ± 1% 256ns ± 0% -0.21% (p=0.008 n=18+20) FmtManyArgs-12 805ns ± 0% 804ns ± 0% -0.19% (p=0.001 n=18+18) GobDecode-12 7.21ms ± 1% 7.14ms ± 1% -0.92% (p=0.000 n=19+20) GobEncode-12 5.88ms ± 1% 5.88ms ± 1% ~ (p=0.641 n=18+19) Gzip-12 218ms ± 1% 218ms ± 1% ~ (p=0.271 n=19+18) Gunzip-12 37.1ms ± 0% 36.9ms ± 0% -0.29% (p=0.000 n=18+17) HTTPClientServer-12 78.1µs ± 2% 77.4µs ± 2% ~ (p=0.070 n=19+19) JSONEncode-12 15.5ms ± 1% 15.5ms ± 0% ~ (p=0.063 n=20+18) JSONDecode-12 56.1ms ± 0% 55.4ms ± 1% -1.18% (p=0.000 n=19+18) Mandelbrot200-12 4.05ms ± 0% 4.06ms ± 0% +0.29% (p=0.001 n=18+18) GoParse-12 3.28ms ± 1% 3.21ms ± 1% -2.30% (p=0.000 n=20+20) RegexpMatchEasy0_32-12 69.4ns ± 2% 69.3ns ± 1% ~ (p=0.205 n=18+16) RegexpMatchEasy0_1K-12 239ns ± 0% 239ns ± 0% ~ (all samples are equal) RegexpMatchEasy1_32-12 69.4ns ± 1% 69.4ns ± 1% ~ (p=0.620 n=15+18) RegexpMatchEasy1_1K-12 370ns ± 1% 369ns ± 2% ~ (p=0.088 n=20+20) RegexpMatchMedium_32-12 108ns ± 0% 108ns ± 0% ~ (all samples are equal) RegexpMatchMedium_1K-12 33.6µs ± 3% 33.5µs ± 3% ~ (p=0.718 n=20+20) RegexpMatchHard_32-12 1.68µs ± 1% 1.67µs ± 2% ~ (p=0.316 n=20+20) RegexpMatchHard_1K-12 50.5µs ± 3% 50.4µs ± 3% ~ (p=0.659 n=20+20) Revcomp-12 381ms ± 1% 381ms ± 1% ~ (p=0.916 n=19+18) Template-12 66.5ms ± 1% 65.8ms ± 2% -1.08% (p=0.000 n=20+20) TimeParse-12 317ns ± 0% 319ns ± 0% +0.48% (p=0.000 n=19+12) TimeFormat-12 338ns ± 0% 338ns ± 0% ~ (p=0.124 n=19+18) [Geo mean] 5.99µs 5.96µs -0.54% Change-Id: I638ffd9d9f178835bbfa499bac20bd7224f1a907 Reviewed-on: https://go-review.googlesource.com/22591 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-04-29 15:08:13 +00:00
Austin Clements	3e2462387f	[dev.garbage] runtime: eliminate mspan.start This converts all remaining uses of mspan.start to instead use mspan.base(). In many cases, this actually reduces the complexity of the code. Change-Id: If113840e00d3345a6cf979637f6a152e6344aee7 Reviewed-on: https://go-review.googlesource.com/22590 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2016-04-29 03:53:17 +00:00
Austin Clements	b7adc41fba	[dev.garbage] runtime: use s.base() everywhere it makes sense Currently we have lots of (s.start << _PageShift) and variants. We now have an s.base() function that returns this. It's faster and more readable, so use it. Change-Id: I888060a9dae15ea75ca8cc1c2b31c905e71b452b Reviewed-on: https://go-review.googlesource.com/22559 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2016-04-29 03:53:14 +00:00
Austin Clements	2e8b74b695	[dev.garbage] runtime: document sysAlloc In particular, it always returns an aligned pointer. Change-Id: I763789a539a4bfd8b0efb36a39a80be1a479d3e2 Reviewed-on: https://go-review.googlesource.com/22558 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-29 03:53:12 +00:00
Austin Clements	15744c92de	[dev.garbage] runtime: remove unused head/end arguments from freeSpan These used to be used for the list of newly freed objects, but that's no longer a thing. Change-Id: I5a4503137b74ec0eae5372ca271b1aa0b32df074 Reviewed-on: https://go-review.googlesource.com/22557 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-29 03:53:08 +00:00
Rick Hudson	2fb75ea6c6	[dev.garbage] runtime: use sys.Ctz64 intrinsic Our compilers now provides instrinsics including sys.Ctz64 that support CTZ (count trailing zero) instructions. This CL replaces the Go versions of CTZ with the compiler intrinsic. Count trailing zeros CTZ finds the least significant 1 in a word and returns the number of less significant 0s in the word. Allocation uses the bitmap created by the garbage collector to locate an unmarked object. The logic takes a word of the bitmap, complements, and then caches it. It then uses CTZ to locate an available unmarked object. It then shifts marked bits out of the bitmap word preparing it for the next search. Once all the unmarked objects are used in the cached work the bitmap gets another word and repeats the process. Change-Id: Id2fc42d1d4b9893efaa2e1bd01896985b7e42f82 Reviewed-on: https://go-review.googlesource.com/21366 Reviewed-by: Austin Clements <austin@google.com>	2016-04-29 00:00:50 +00:00
Rick Hudson	2063d5d903	[dev.garbage] runtime: restructure alloc and mark bits Two changes are included here that are dependent on the other. The first is that allocBits and gcamrkBits are changed to a *uint8 which points to the first byte of that span's mark and alloc bits. Several places were altered to perform pointer arithmetic to locate the byte corresponding to an object in the span. The actual bit corresponding to an object is indexed in the byte by using the lower three bits of the objects index. The second change avoids the redundant calculation of an object's index. The index is returned from heapBitsForObject and then used by the functions indexing allocBits and gcmarkBits. Finally we no longer allocate the gc bits in the span structures. Instead we use an arena based allocation scheme that allows for a more compact bit map as well as recycling and bulk clearing of the mark bits. Change-Id: If4d04b2021c092ec39a4caef5937a8182c64dfef Reviewed-on: https://go-review.googlesource.com/20705 Reviewed-by: Austin Clements <austin@google.com>	2016-04-29 00:00:47 +00:00
Mikio Hara	be730b49ca	runtime: drop _SigUnblock for SIGSYS on Linux The _SigUnblock flag was appended to SIGSYS slot of runtime signal table for Linux in https://go-review.googlesource.com/22202, but there is still no concrete opinion on whether SIGSYS must be an unblocked signal for runtime. This change removes _SigUnblock flag from SIGSYS on Linux for consistency in runtime signal handling and adds a reference to #15204 to runtime signal table for FreeBSD. Updates #15204. Change-Id: I42992b1d852c2ab5dd37d6dbb481dba46929f665 Reviewed-on: https://go-review.googlesource.com/22537 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-28 21:48:44 +00:00
Rick Hudson	23aeb34df1	[dev.garbage] Merge remote-tracking branch 'origin/master' into HEAD Change-Id: I282fd9ce9db435dfd35e882a9502ab1abc185297	2016-04-27 18:46:52 -04:00
Brad Fitzpatrick	06d639e075	runtime: fix SetCgoTraceback doc indentation It wasn't rendering as HTML nicely. Change-Id: I5408ec22932a05e85c210c0faa434bd19dce5650 Reviewed-on: https://go-review.googlesource.com/22532 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-27 22:12:01 +00:00
Rick Hudson	1354b32cd7	[dev.garbage] runtime: add gc work buffer tryGet and put fast paths The complexity of the GC work buffers put and tryGet prevented them from being inlined. This CL simplifies the fast path thus enabling inlining. If the fast path does not succeed the previous put and tryGet functions are called. Change-Id: I6da6495d0dadf42bd0377c110b502274cc01acf5 Reviewed-on: https://go-review.googlesource.com/20704 Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:55:02 +00:00
Rick Hudson	f8d0d4fd59	[dev.garbage] runtime: cleanup and optimize span.base() Prior to this CL the base of a span was calculated in various places using shifts or calls to base(). This CL now always calls base() which has been optimized to calculate the base of the span when the span is initialized and store that value in the span structure. Change-Id: I661f2bfa21e3748a249cdf049ef9062db6e78100 Reviewed-on: https://go-review.googlesource.com/20703 Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:54:59 +00:00
Rick Hudson	8dda1c4c08	[dev.garbage] runtime: remove heapBitsSweepSpan Prior to this CL the sweep phase was responsible for locating all objects that were about to be freed and calling a function to process the object. This was done by the function heapBitsSweepSpan. Part of processing included calls to tracefree and msanfree as well as counting how many objects were freed. The calls to tracefree and msanfree have been moved into the gcmalloc routine and called when the object is about to be reallocated. The counting of free objects has been optimized using an array based popcnt algorithm and if all the objects in a span are free then span is freed. Similarly the code to locate the next free object has been optimized to use an array based ctz (count trailing zero). Various hot paths in the allocation logic have been optimized. At this point the garbage benchmark is within 3% of the 1.6 release. Change-Id: I00643c442e2ada1685c010c3447e4ea8537d2dfa Reviewed-on: https://go-review.googlesource.com/20201 Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:54:57 +00:00
Rick Hudson	4093481523	[dev.garbage] runtime: add bit and cache ctz64 (count trailing zero) Add to each span a 64 bit cache (allocCache) of the allocBits at freeindex. allocCache is shifted such that the lowest bit corresponds to the bit freeindex. allocBits uses a 0 to indicate an object is free, on the other hand allocCache uses a 1 to indicate an object is free. This facilitates ctz64 (count trailing zero) which counts the number of 0s trailing the least significant 1. This is also the index of the least significant 1. Each span maintains a freeindex indicating the boundary between allocated objects and unallocated objects. allocCache is shifted as freeindex is incremented such that the low bit in allocCache corresponds to the bit a freeindex in the allocBits array. Currently ctz64 is written in Go using a for loop so it is not very efficient. Use of the hardware instruction will follow. With this in mind comparisons of the garbage benchmark are as follows. 1.6 release 2.8 seconds dev:garbage branch 3.1 seconds. Profiling shows the go implementation of ctz64 takes up 1% of the total time. Change-Id: If084ed9c3b1eda9f3c6ab2e794625cb870b8167f Reviewed-on: https://go-review.googlesource.com/20200 Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:54:54 +00:00
Rick Hudson	44fe90d0b3	[dev.garbage] runtime: logic that uses count trailing zero (ctz) Most (all?) processors that Go supports supply a hardware instruction that takes a byte and returns the number of zeros trailing the first 1 encountered, or 8 if no ones are found. This is the index within the byte of the first 1 encountered. CTZ should improve the performance of the nextFreeIndex function. Since nextFreeIndex wants the next unmarked (0) bit a bit-wise complement is needed before calling ctz. Furthermore unmarked bits associated with previously allocated objects need to be ignored. Instead of writing a 1 as we allocate the code masks all bits less than the freeindex after loading the byte. While this CL does not actual execute a CTZ instruction it supplies a ctz function with the appropiate signature along with the logic to execute it. Change-Id: I5c55ce0ed48ca22c21c4dd9f969b0819b4eadaa7 Reviewed-on: https://go-review.googlesource.com/20169 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:54:52 +00:00
Rick Hudson	e4ac2d4acc	[dev.garbage] runtime: replace ref with allocCount This is a renaming of the field ref to the more appropriate allocCount. The field holds the number of objects in the span that are currently allocated. Some throws strings were adjusted to more accurately convey the meaning of allocCount. Change-Id: I10daf44e3e9cc24a10912638c7de3c1984ef8efe Reviewed-on: https://go-review.googlesource.com/19518 Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:54:49 +00:00
Rick Hudson	3479b065d4	[dev.garbage] runtime: allocate directly from GC mark bits Instead of building a freelist from the mark bits generated by the GC this CL allocates directly from the mark bits. The approach moves the mark bits from the pointer/no pointer heap structures into their own per span data structures. The mark/allocation vectors consist of a single mark bit per object. Two vectors are maintained, one for allocation and one for the GC's mark phase. During the GC cycle's sweep phase the interpretation of the vectors is swapped. The mark vector becomes the allocation vector and the old allocation vector is cleared and becomes the mark vector that the next GC cycle will use. Marked entries in the allocation vector indicate that the object is not free. Each allocation vector maintains a boundary between areas of the span already allocated from and areas not yet allocated from. As objects are allocated this boundary is moved until it reaches the end of the span. At this point further allocations will be done from another span. Since we no longer sweep a span inspecting each freed object the responsibility for maintaining pointer/scalar bits in the heapBitMap containing is now the responsibility of the the routines doing the actual allocation. This CL is functionally complete and ready for performance tuning. Change-Id: I336e0fc21eef1066e0b68c7067cc71b9f3d50e04 Reviewed-on: https://go-review.googlesource.com/19470 Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:54:47 +00:00
Rick Hudson	dc65a82eff	[dev.garbage] runtime: mark/allocation helper functions The gcmarkBits is a bit vector used by the GC to mark reachable objects. Once a GC cycle is complete the gcmarkBits swap places with the allocBits. allocBits is then used directly by malloc to locate free objects, thus avoiding the construction of a linked free list. This CL introduces a set of helper functions for manipulating gcmarkBits and allocBits that will be used by later CLs to realize the actual algorithm. Minimal attempts have been made to optimize these helper routines. Change-Id: I55ad6240ca32cd456e8ed4973c6970b3b882dd34 Reviewed-on: https://go-review.googlesource.com/19420 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Rick Hudson <rlh@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-27 21:54:44 +00:00
Rick Hudson	e1c4e9a754	[dev.garbage] runtime: refactor next free object In preparation for changing how the next free object is chosen refactor and consolidate code into a single function. Change-Id: I6836cd88ed7cbf0b2df87abd7c1c3b9fabc1cbd8 Reviewed-on: https://go-review.googlesource.com/19317 Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:54:41 +00:00
Rick Hudson	aed861038f	[dev.garbage] runtime: add stackfreelist The freelist for normal objects and the freelist for stacks share the same mspan field for holding the list head but are operated on by different code sequences. This overloading complicates the use of bit vectors for allocation of normal objects. This change refactors the use of the stackfreelist out from the use of freelist. Change-Id: I5b155b5b8a1fcd8e24c12ee1eb0800ad9b6b4fa0 Reviewed-on: https://go-review.googlesource.com/19315 Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:54:39 +00:00
Rick Hudson	2ac8bdc52a	[dev.garbage] runtime: bitmap allocation data structs The bitmap allocation data structure prototypes. Before this is released these underlying data structures need to be more performant but the signatures of helper functions utilizing these structures will remain stable. Change-Id: I5ace12f2fb512a7038a52bbde2bfb7e98783bcbe Reviewed-on: https://go-review.googlesource.com/19221 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-27 21:54:35 +00:00
Austin Clements	b49b71ae19	runtime: don't rescan globals Currently the runtime rescans globals during mark 2 and mark termination. This costs as much as 500µs/MB in STW time, which is enough to surpass the 10ms STW limit with only 20MB of globals. It's also basically unnecessary. The compiler already generates write barriers for global -> heap pointer updates and the regular write barrier doesn't check whether the slot is a global or in the heap. Some less common write barriers do cause problems. heapBitsBulkBarrier, which is used by typedmemmove and related functions, currently depends on having access to the pointer bitmap and as a result ignores writes to globals. Likewise, the reflect-related write barriers reflect_typedmemmovepartial and callwritebarrier ignore non-heap destinations; though it appears they can never be called with global pointers anyway. This commit makes heapBitsBulkBarrier issue write barriers for writes to global pointers using the data and BSS pointer bitmaps, removes the inheap checks from the reflection write barriers, and eliminates the rescans during mark 2 and mark termination. It also adds a test that writes to globals have write barriers. Programs with large data+BSS segments (with pointers) aren't common, but for programs that do have large data+BSS segments, this significantly reduces pause time: name \ 95%ile-time/markTerm old new delta LargeBSS/bss:1GB/gomaxprocs:4 148200µs ± 6% 302µs ±52% -99.80% (p=0.008 n=5+5) This very slightly improves the go1 benchmarks: name old time/op new time/op delta BinaryTree17-12 2.62s ± 3% 2.62s ± 4% ~ (p=0.904 n=20+20) Fannkuch11-12 2.15s ± 1% 2.13s ± 0% -1.29% (p=0.000 n=18+20) FmtFprintfEmpty-12 48.3ns ± 2% 47.6ns ± 1% -1.52% (p=0.000 n=20+16) FmtFprintfString-12 152ns ± 0% 152ns ± 1% ~ (p=0.725 n=18+18) FmtFprintfInt-12 150ns ± 1% 149ns ± 1% -1.14% (p=0.000 n=19+20) FmtFprintfIntInt-12 250ns ± 0% 244ns ± 1% -2.12% (p=0.000 n=20+18) FmtFprintfPrefixedInt-12 219ns ± 1% 217ns ± 1% -1.20% (p=0.000 n=19+20) FmtFprintfFloat-12 280ns ± 0% 281ns ± 1% +0.47% (p=0.000 n=19+19) FmtManyArgs-12 928ns ± 0% 923ns ± 1% -0.53% (p=0.000 n=19+18) GobDecode-12 7.21ms ± 1% 7.24ms ± 2% ~ (p=0.091 n=19+19) GobEncode-12 6.07ms ± 1% 6.05ms ± 1% -0.36% (p=0.002 n=20+17) Gzip-12 265ms ± 1% 265ms ± 1% ~ (p=0.496 n=20+19) Gunzip-12 39.6ms ± 1% 39.3ms ± 1% -0.85% (p=0.000 n=19+19) HTTPClientServer-12 74.0µs ± 2% 73.8µs ± 1% ~ (p=0.569 n=20+19) JSONEncode-12 15.4ms ± 1% 15.3ms ± 1% -0.25% (p=0.049 n=17+17) JSONDecode-12 53.7ms ± 2% 53.0ms ± 1% -1.29% (p=0.000 n=18+17) Mandelbrot200-12 3.97ms ± 1% 3.97ms ± 0% ~ (p=0.072 n=17+18) GoParse-12 3.35ms ± 2% 3.36ms ± 1% +0.51% (p=0.005 n=18+20) RegexpMatchEasy0_32-12 72.7ns ± 2% 72.2ns ± 1% -0.70% (p=0.005 n=19+19) RegexpMatchEasy0_1K-12 246ns ± 1% 245ns ± 0% -0.60% (p=0.000 n=18+16) RegexpMatchEasy1_32-12 72.8ns ± 1% 72.5ns ± 1% -0.37% (p=0.011 n=18+18) RegexpMatchEasy1_1K-12 380ns ± 1% 385ns ± 1% +1.34% (p=0.000 n=20+19) RegexpMatchMedium_32-12 115ns ± 2% 115ns ± 1% +0.44% (p=0.047 n=20+20) RegexpMatchMedium_1K-12 35.4µs ± 1% 35.5µs ± 1% ~ (p=0.079 n=18+19) RegexpMatchHard_32-12 1.83µs ± 0% 1.80µs ± 1% -1.76% (p=0.000 n=18+18) RegexpMatchHard_1K-12 55.1µs ± 0% 54.3µs ± 1% -1.42% (p=0.000 n=18+19) Revcomp-12 386ms ± 1% 381ms ± 1% -1.14% (p=0.000 n=18+18) Template-12 61.5ms ± 2% 61.5ms ± 2% ~ (p=0.647 n=19+20) TimeParse-12 338ns ± 0% 336ns ± 1% -0.72% (p=0.000 n=14+19) TimeFormat-12 350ns ± 0% 357ns ± 0% +2.05% (p=0.000 n=19+18) [Geo mean] 55.3µs 55.0µs -0.41% Change-Id: I57e8720385a1b991aeebd111b6874354308e2a6b Reviewed-on: https://go-review.googlesource.com/20829 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-04-27 18:48:16 +00:00
Austin Clements	30172f1811	runtime: make {add,subtract}{b,1} nosplit These are used at the bottom level of various GC operations that must not be preempted. To be on the safe side, mark them all nosplit. Change-Id: I8f7360e79c9852bd044df71413b8581ad764380c Reviewed-on: https://go-review.googlesource.com/22504 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-04-27 18:46:00 +00:00
David Crawshaw	217be5b35d	reflect: unnamed interface types have no name Fixes #15468 Change-Id: I8723171f87774a98d5e80e7832ebb96dd1fbea74 Reviewed-on: https://go-review.googlesource.com/22524 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: David Crawshaw <crawshaw@golang.org>	2016-04-27 18:06:20 +00:00
Zhongwei Yao	74a9bad638	cmd/compile: enable const division for arm64 performance: benchmark old ns/op new ns/op delta BenchmarkDivconstI64-8 8.28 2.70 -67.39% BenchmarkDivconstU64-8 8.28 4.69 -43.36% BenchmarkDivconstI32-8 8.28 6.39 -22.83% BenchmarkDivconstU32-8 8.28 4.43 -46.50% BenchmarkDivconstI16-8 5.17 5.17 +0.00% BenchmarkDivconstU16-8 5.33 5.34 +0.19% BenchmarkDivconstI8-8 3.50 3.50 +0.00% BenchmarkDivconstU8-8 3.51 3.50 -0.28% Fixes #15382 Change-Id: Ibce7b28f0586d593b33c4d4ecc5d5e7e7c905d13 Reviewed-on: https://go-review.googlesource.com/22292 Reviewed-by: Michael Munday <munday@ca.ibm.com> Reviewed-by: David Chase <drchase@google.com>	2016-04-27 17:47:49 +00:00
Cherry Zhang	9629f55fbb	cmd/link: remove absolute address for c-archive on darwin/arm Now it is possible to build a c-archive as PIC on darwin/arm (this is now the default). Then the system linker can link the binary using the archive as PIE. Fixes #12896. Change-Id: Iad84131572422190f5fa036e7d71910dc155f155 Reviewed-on: https://go-review.googlesource.com/22461 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2016-04-27 16:22:06 +00:00
Dmitry Vyukov	6dfba5c7ce	runtime/race: improve TestNoRaceIOHttp test TestNoRaceIOHttp does all kinds of bad things: 1. Binds to a fixed port, so concurrent tests fail. 2. Registers HTTP handler multiple times, so repeated tests fail. 3. Relies on sleep to wait for listen. Fix all of that. Change-Id: I1210b7797ef5e92465b37dc407246d92a2a24fe8 Reviewed-on: https://go-review.googlesource.com/19953 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-27 08:08:18 +00:00
Austin Clements	2a889b9d93	runtime: make stack re-scan O(# dirty stacks) Currently the stack re-scan during mark termination is O(# stacks) because we enqueue a root marking job for every goroutine. It takes ~34ns to process this root marking job for a valid (clean) stack, so at around 300k goroutines we exceed the 10ms pause goal. A non-trivial portion of this time is spent simply taking the cache miss to check the gcscanvalid flag, so simply optimizing the path that handles clean stacks can only improve this so much. Fix this by keeping an explicit list of goroutines with dirty stacks that need to be rescanned. When a goroutine first transitions to running after a stack scan and marks its stack dirty, it adds itself to this list. We enqueue root marking jobs only for the goroutines in this list, so this improves stack re-scanning asymptotically by completely eliminating time spent on clean goroutines. This reduces mark termination time for 500k idle goroutines from 15ms to 238µs. Overall performance effect is negligible. name \ 95%ile-time/markTerm old new delta IdleGs/gs:500000/gomaxprocs:12 15000µs ± 0% 238µs ± 5% -98.41% (p=0.000 n=10+10) name old time/op new time/op delta XBenchGarbage-12 2.30ms ± 3% 2.29ms ± 1% -0.43% (p=0.049 n=17+18) name old time/op new time/op delta BinaryTree17-12 2.57s ± 3% 2.59s ± 2% ~ (p=0.141 n=19+20) Fannkuch11-12 2.09s ± 0% 2.10s ± 1% +0.53% (p=0.000 n=19+19) FmtFprintfEmpty-12 45.3ns ± 3% 45.2ns ± 2% ~ (p=0.845 n=20+20) FmtFprintfString-12 129ns ± 0% 127ns ± 0% -1.55% (p=0.000 n=16+16) FmtFprintfInt-12 123ns ± 0% 119ns ± 1% -3.24% (p=0.000 n=19+19) FmtFprintfIntInt-12 195ns ± 1% 189ns ± 1% -3.11% (p=0.000 n=17+17) FmtFprintfPrefixedInt-12 193ns ± 1% 187ns ± 1% -3.06% (p=0.000 n=19+19) FmtFprintfFloat-12 254ns ± 0% 255ns ± 1% +0.35% (p=0.001 n=14+17) FmtManyArgs-12 781ns ± 0% 770ns ± 0% -1.48% (p=0.000 n=16+19) GobDecode-12 7.00ms ± 1% 6.98ms ± 1% ~ (p=0.563 n=19+19) GobEncode-12 5.91ms ± 1% 5.92ms ± 0% ~ (p=0.118 n=19+18) Gzip-12 219ms ± 1% 215ms ± 1% -1.81% (p=0.000 n=18+18) Gunzip-12 37.2ms ± 0% 37.4ms ± 0% +0.45% (p=0.000 n=17+19) HTTPClientServer-12 76.9µs ± 3% 77.5µs ± 2% +0.81% (p=0.030 n=20+19) JSONEncode-12 15.0ms ± 0% 14.8ms ± 1% -0.88% (p=0.001 n=15+19) JSONDecode-12 50.6ms ± 0% 53.2ms ± 2% +5.07% (p=0.000 n=17+19) Mandelbrot200-12 4.05ms ± 0% 4.05ms ± 1% ~ (p=0.581 n=16+17) GoParse-12 3.34ms ± 1% 3.30ms ± 1% -1.21% (p=0.000 n=15+20) RegexpMatchEasy0_32-12 69.6ns ± 1% 69.8ns ± 2% ~ (p=0.566 n=19+19) RegexpMatchEasy0_1K-12 238ns ± 1% 236ns ± 0% -0.91% (p=0.000 n=17+13) RegexpMatchEasy1_32-12 69.8ns ± 1% 70.0ns ± 1% +0.23% (p=0.026 n=17+16) RegexpMatchEasy1_1K-12 371ns ± 1% 363ns ± 1% -2.07% (p=0.000 n=19+19) RegexpMatchMedium_32-12 107ns ± 2% 106ns ± 1% -0.51% (p=0.031 n=18+20) RegexpMatchMedium_1K-12 33.0µs ± 0% 32.9µs ± 0% -0.30% (p=0.004 n=16+16) RegexpMatchHard_32-12 1.70µs ± 0% 1.70µs ± 0% +0.45% (p=0.000 n=16+17) RegexpMatchHard_1K-12 51.1µs ± 2% 51.4µs ± 1% +0.53% (p=0.000 n=17+19) Revcomp-12 378ms ± 1% 385ms ± 1% +1.92% (p=0.000 n=19+18) Template-12 64.3ms ± 2% 65.0ms ± 2% +1.09% (p=0.001 n=19+19) TimeParse-12 315ns ± 1% 317ns ± 2% ~ (p=0.108 n=18+20) TimeFormat-12 360ns ± 1% 337ns ± 0% -6.30% (p=0.000 n=18+13) [Geo mean] 51.8µs 51.6µs -0.48% Change-Id: Icf8994671476840e3998236e15407a505d4c760c Reviewed-on: https://go-review.googlesource.com/20700 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-26 23:40:13 +00:00
Austin Clements	5b765ce310	runtime: don't clear gcscanvalid in casfrom_Gscanstatus Currently we clear gcscanvalid in both casgstatus and casfrom_Gscanstatus if the new status is _Grunning. This is very important to do in casgstatus. However, this is potentially wrong in casfrom_Gscanstatus because in this case the caller doesn't own gp and hence the write is racy. Unlike the other _Gscan statuses, during _Gscanrunning, the G is still running. This does not indicate that it's transitioning into a running state. The scan simply hasn't happened yet, so it's neither valid nor invalid. Conveniently, this also means clearing gcscanvalid is unnecessary in this case because the G was already in _Grunning, so we can simply remove this code. What will happen instead is that the G will be preempted to scan itself, that scan will set gcscanvalid to true, and then the G will return to _Grunning via casgstatus, clearing gcscanvalid. This fix will become necessary shortly when we start keeping track of the set of G's with dirty stacks, since it will no longer be idempotent to simply set gcscanvalid to false. Change-Id: I688c82e6fbf00d5dbbbff49efa66acb99ee86785 Reviewed-on: https://go-review.googlesource.com/20669 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-26 23:40:10 +00:00
Austin Clements	c707d83856	runtime: fix typos in comment about gcscanvalid Change-Id: Id4ad7ebf88a21eba2bc5714b96570ed5cfaed757 Reviewed-on: https://go-review.googlesource.com/22210 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-26 23:40:07 +00:00
Austin Clements	9f263c14ed	runtime: remove stack barriers during sweep This adds a best-effort pass to remove stack barriers immediately after the end of mark termination. This isn't necessary for the Go runtime, but should help external tools that perform stack walks but aren't aware of Go's stack barriers such as GDB, perf, and VTune. (Though clearly they'll still have trouble unwinding stacks during mark.) Change-Id: I66600fae1f03ee36b5459d2b00dcc376269af18e Reviewed-on: https://go-review.googlesource.com/20668 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-26 23:40:04 +00:00
Austin Clements	269c969c81	runtime: remove stack barriers during concurrent mark Currently we remove stack barriers during STW mark termination, which has a non-trivial per-goroutine cost and means that we have to touch even clean stacks during mark termination. However, there's no problem with leaving them in during the sweep phase. They just have to be out by the time we install new stack barriers immediately prior to scanning the stack such as during the mark phase of the next GC cycle or during mark termination in a STW GC. Hence, move the gcRemoveStackBarriers from STW mark termination to just before we install new stack barriers during concurrent mark. This removes the cost from STW. Furthermore, this combined with concurrent stack shrinking means that the mark termination scan of a clean stack is a complete no-op, which will make it possible to skip clean stacks entirely during mark termination. This has the downside that it will mess up anything outside of Go that tries to walk Go stacks all the time instead of just some of the time. This includes tools like GDB, perf, and VTune. We'll improve the situation shortly. Change-Id: Ia40baad8f8c16aeefac05425e00b0cf478137097 Reviewed-on: https://go-review.googlesource.com/20667 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-26 23:40:01 +00:00
Austin Clements	efb0c55407	runtime: avoid span root marking entirely during mark termination Currently we enqueue span root mark jobs during both concurrent mark and mark termination, but we make the job a no-op during mark termination. This is silly. Instead of queueing them up just to not do them, don't queue them up in the first place. Change-Id: Ie1d36de884abfb17dd0db6f0449a2b7c997affab Reviewed-on: https://go-review.googlesource.com/20666 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-26 23:39:58 +00:00
Austin Clements	e8337491aa	runtime: free dead G stacks concurrently Currently we free cached stacks of dead Gs during STW stack root marking. We do this during STW because there's no way to take ownership of a particular dead G, so attempting to free a dead G's stack during concurrent stack root marking could race with reusing that G. However, we can do this concurrently if we take a completely different approach. One way to prevent reuse of a dead G is to remove it from the free G list. Hence, this adds a new fixed root marking task that simply removes all Gs from the list of dead Gs with cached stacks, frees their stacks, and then adds them to the list of dead Gs without cached stacks. This is also a necessary step toward rescanning only dirty stacks, since it eliminates another task from STW stack marking. Change-Id: Iefbad03078b284a2e7bf30fba397da4ca87fe095 Reviewed-on: https://go-review.googlesource.com/20665 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-26 23:39:55 +00:00
Austin Clements	1a2cf91f5e	runtime: split gfree list into with-stacks and without-stacks Currently all free Gs are added to one list. Split this into two lists: one for free Gs with cached stacks and one for Gs without cached stacks. This lets us preferentially allocate Gs that already have a stack, but more importantly, it sets us up to free cached G stacks concurrently. Change-Id: Idbe486f708997e1c9d166662995283f02d1eeb3c Reviewed-on: https://go-review.googlesource.com/20664 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-26 23:39:51 +00:00
Michael Munday	55154cf0b2	cmd/link: fix gdb backtrace on architectures using a link register Also adds TestGdbBacktrace to the runtime package. Dwarf modifications written by Bryan Chan (@bryanpkc) who is also at IBM and covered by the same CLA. Fixes #14628 Change-Id: I106a1f704c3745a31f29cdadb0032e3905829850 Reviewed-on: https://go-review.googlesource.com/20193 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-26 18:35:47 +00:00
Ilya Tocar	6b02a19247	strings: use SSE4.2 in strings.Index on AMD64 Use PCMPESTRI instruction if available. Index-4 21.1ns ± 0% 21.1ns ± 0% ~ (all samples are equal) IndexHard1-4 395µs ± 0% 105µs ± 0% -73.53% (p=0.000 n=19+20) IndexHard2-4 300µs ± 0% 147µs ± 0% -51.11% (p=0.000 n=19+20) IndexHard3-4 665µs ± 0% 665µs ± 0% ~ (p=0.942 n=16+19) Change-Id: I4f66794164740a2b939eb1c78934e2390b489064 Reviewed-on: https://go-review.googlesource.com/22337 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2016-04-26 10:14:26 +00:00
Keith Randall	9cb79e9536	runtime: arm5, fix large-offset floating-point stores The code sequence for large-offset floating-point stores includes adding the base pointer to r11. Make sure we can interpret that instruction correctly. Fixes build. Fixes #15440 Change-Id: I7fe5a4a57e08682967052bf77c54e0ec47fcb53e Reviewed-on: https://go-review.googlesource.com/22440 Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>	2016-04-25 22:33:33 +00:00
Keith Randall	6f3f02f80d	runtime: zero tmpbuf between len and cap Zero the entire buffer so we don't need to lower its capacity upon return. This lets callers do some appending without allocation. Zeroing is cheap, the byte buffer requires only 4 extra instructions. Fixes #14235 Change-Id: I970d7badcef047dafac75ac17130030181f18fe2 Reviewed-on: https://go-review.googlesource.com/22424 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-25 21:16:52 +00:00
Dmitry Vyukov	75b844f0d2	runtime/trace: test detection of broken timestamps On some processors cputicks (used to generate trace timestamps) produce non-monotonic timestamps. It is important that the parser distinguishes logically inconsistent traces (e.g. missing, excessive or misordered events) from broken timestamps. The former is a bug in tracer, the latter is a machine issue. Test that (1) parser does not return a logical error in case of broken timestamps and (2) broken timestamps are eventually detected and reported. Change-Id: Ib4b1eb43ce128b268e754400ed8b5e8def04bd78 Reviewed-on: https://go-review.googlesource.com/21608 Reviewed-by: Austin Clements <austin@google.com>	2016-04-24 09:11:37 +00:00
Dmitry Vyukov	a3703618ea	runtime: use per-goroutine sequence numbers in tracer Currently tracer uses global sequencer and it introduces significant slowdown on parallel machines (up to 10x). Replace the global sequencer with per-goroutine sequencer. If we assign per-goroutine sequence numbers to only 3 types of events (start, unblock and syscall exit), it is enough to restore consistent partial ordering of all events. Even these events don't need sequence numbers all the time (if goroutine starts on the same P where it was unblocked, then start does not need sequence number). The burden of restoring the order is put on trace parser. Details of the algorithm are described in the comments. On http benchmark with GOMAXPROCS=48: no tracing: 5026 ns/op tracing: 27803 ns/op (+453%) with this change: 6369 ns/op (+26%, mostly for traceback) Also trace size is reduced by ~22%. Average event size before: 4.63 bytes/event, after: 3.62 bytes/event. Besides running trace tests, I've also tested with manually broken cputicks (random skew for each event, per-P skew and episodic random skew). In all cases broken timestamps were detected and no test failures. Change-Id: I078bde421ccc386a66f6c2051ab207bcd5613efa Reviewed-on: https://go-review.googlesource.com/21512 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-23 15:57:05 +00:00
Dmitry Vyukov	2d342fba78	runtime: fix description of trace events Change-Id: I037101b1921fe151695d32e9874b50dd64982298 Reviewed-on: https://go-review.googlesource.com/22314 Reviewed-by: Austin Clements <austin@google.com>	2016-04-22 21:32:37 +00:00
Ian Lance Taylor	32302d6289	runtime/cgo: use normal libinit on PPC GNU/Linux The special case was because PPC did not support external linking, but now it does. Fixes #10410. Change-Id: I9b024686e0f03da7a44c1c59b41c529802f16ab0 Reviewed-on: https://go-review.googlesource.com/22372 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-22 14:30:27 +00:00
David Crawshaw	c165988360	cmd/compile, etc: use nameOff in uncommonType linux/amd64 PIE: cmd/go: -62KB (0.5%) jujud: -550KB (0.7%) For #6853. Change-Id: Ieb67982abce5832e24b997506f0ae7108f747108 Reviewed-on: https://go-review.googlesource.com/22371 Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-22 13:51:29 +00:00
David Crawshaw	1492e7db05	cmd/compile, etc: use nameOff for rtype string linux/amd64: cmd/go: -8KB (basically nothing) linux/amd64 PIE: cmd/go: -191KB (1.6%) jujud: -1.5MB (1.9%) Updates #6853 Fixes #15064 Change-Id: I0adbb95685e28be92e8548741df0e11daa0a9b5f Reviewed-on: https://go-review.googlesource.com/21777 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-22 10:08:05 +00:00
Austin Clements	c8bd293e56	runtime: eliminate floating garbage estimate Currently when we compute the trigger for the next GC, we do it based on an estimate of the reachable heap size at the start of the GC cycle, which is itself based on an estimate of the floating garbage. This was introduced by `4655aad` to fix a bad feedback loop that allowed the heap to grow to many times the true reachable size. However, this estimate gets easily confused by rapidly allocating applications, and, worse it's different than the heap size the trigger controller uses to compute the trigger itself. This results in the trigger controller often thinking that GC finished before it started. Since this would be a pretty great outcome from it's perspective, it sets the trigger for the next cycle as close to the next goal as possible (which is limited to 95% of the goal). Furthermore, the bad feedback loop this estimate originally fixed seems not to happen any more, suggesting it was fixed more correctly by some other change in the mean time. Finally, with the change to allocate black, it shouldn't even be theoretically possible for this bad feedback loop to occur. Hence, eliminate the floating garbage estimate and simply consider the reachable heap to be the marked heap. This harms overall throughput slightly for allocation-heavy benchmarks, but significantly improves mutator availability. Fixes #12204. This brings the average trigger in this benchmark from 0.95 (the cap) to 0.7 and the active GC utilization from ~90% to ~45%. Updates #14951. This makes the trigger controller much better behaved, so it pulls the trigger lower if assists are consuming a lot of CPU like it's supposed to, increasing mutator availability. name old time/op new time/op delta XBenchGarbage-12 2.21ms ± 1% 2.28ms ± 3% +3.29% (p=0.000 n=17+17) Some of this slow down we paid for in earlier commits. Relative to the start of the series to switch to allocate-black (the parent of "count black allocations toward scan work"), the garbage benchmark is 2.62% slower. name old time/op new time/op delta BinaryTree17-12 2.53s ± 3% 2.53s ± 3% ~ (p=0.708 n=20+19) Fannkuch11-12 2.08s ± 0% 2.08s ± 0% -0.22% (p=0.002 n=19+18) FmtFprintfEmpty-12 45.3ns ± 2% 45.2ns ± 3% ~ (p=0.505 n=20+20) FmtFprintfString-12 129ns ± 0% 131ns ± 2% +1.80% (p=0.000 n=16+19) FmtFprintfInt-12 121ns ± 2% 121ns ± 2% ~ (p=0.768 n=19+19) FmtFprintfIntInt-12 186ns ± 1% 188ns ± 3% +0.99% (p=0.000 n=19+19) FmtFprintfPrefixedInt-12 188ns ± 1% 188ns ± 1% ~ (p=0.947 n=18+16) FmtFprintfFloat-12 254ns ± 1% 255ns ± 1% +0.30% (p=0.002 n=19+17) FmtManyArgs-12 763ns ± 0% 770ns ± 0% +0.92% (p=0.000 n=18+18) GobDecode-12 7.00ms ± 1% 7.04ms ± 1% +0.61% (p=0.049 n=20+20) GobEncode-12 5.88ms ± 1% 5.88ms ± 0% ~ (p=0.641 n=18+19) Gzip-12 214ms ± 1% 215ms ± 1% +0.43% (p=0.002 n=18+19) Gunzip-12 37.6ms ± 0% 37.6ms ± 0% +0.11% (p=0.015 n=17+18) HTTPClientServer-12 76.9µs ± 2% 78.1µs ± 2% +1.44% (p=0.000 n=20+18) JSONEncode-12 15.2ms ± 2% 15.1ms ± 1% ~ (p=0.271 n=19+18) JSONDecode-12 53.1ms ± 1% 53.3ms ± 0% +0.49% (p=0.000 n=18+19) Mandelbrot200-12 4.04ms ± 1% 4.03ms ± 0% -0.33% (p=0.005 n=18+18) GoParse-12 3.29ms ± 1% 3.28ms ± 1% ~ (p=0.146 n=16+17) RegexpMatchEasy0_32-12 69.9ns ± 3% 69.5ns ± 1% ~ (p=0.785 n=20+19) RegexpMatchEasy0_1K-12 237ns ± 0% 237ns ± 0% ~ (p=1.000 n=18+18) RegexpMatchEasy1_32-12 69.5ns ± 1% 69.2ns ± 1% -0.44% (p=0.020 n=16+19) RegexpMatchEasy1_1K-12 372ns ± 1% 371ns ± 2% ~ (p=0.086 n=20+19) RegexpMatchMedium_32-12 108ns ± 3% 107ns ± 1% -1.00% (p=0.004 n=19+14) RegexpMatchMedium_1K-12 34.2µs ± 4% 34.0µs ± 2% ~ (p=0.380 n=19+20) RegexpMatchHard_32-12 1.77µs ± 4% 1.76µs ± 3% ~ (p=0.558 n=18+20) RegexpMatchHard_1K-12 53.4µs ± 4% 52.8µs ± 2% -1.10% (p=0.020 n=18+20) Revcomp-12 359ms ± 4% 377ms ± 0% +5.19% (p=0.000 n=20+18) Template-12 63.7ms ± 2% 62.9ms ± 2% -1.27% (p=0.005 n=18+20) TimeParse-12 316ns ± 2% 313ns ± 1% ~ (p=0.059 n=20+16) TimeFormat-12 329ns ± 0% 331ns ± 0% +0.39% (p=0.000 n=16+18) [Geo mean] 51.6µs 51.7µs +0.18% Change-Id: I1dce4640c8205d41717943b021039fffea863c57 Reviewed-on: https://go-review.googlesource.com/21324 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-21 20:07:25 +00:00
Austin Clements	6002e01e34	runtime: allocate black during GC Currently we allocate white for most of concurrent marking. This is based on the classical argument that it produces less floating garbage, since allocations during GC may not get linked into the heap and allocating white lets us reclaim these. However, it's not clear how often this actually happens, especially since our write barrier shades any pointer as soon as it's installed in the heap regardless of the color of the slot. On the other hand, allocating black has several advantages that seem to significantly outweigh this downside. 1) It naturally bounds the total scan work to the live heap size at the start of a GC cycle. Allocating white does not, and thus depends entirely on assists to prevent the heap from growing faster than it can be scanned. 2) It reduces the total amount of scan work per GC cycle by the size of newly allocated objects that are linked into the heap graph, since objects allocated black never need to be scanned. 3) It reduces total write barrier work since more objects will already be black when they are linked into the heap graph. This gives a slight overall improvement in benchmarks. name old time/op new time/op delta XBenchGarbage-12 2.24ms ± 0% 2.21ms ± 1% -1.32% (p=0.000 n=18+17) name old time/op new time/op delta BinaryTree17-12 2.60s ± 3% 2.53s ± 3% -2.56% (p=0.000 n=20+20) Fannkuch11-12 2.08s ± 1% 2.08s ± 0% ~ (p=0.452 n=19+19) FmtFprintfEmpty-12 45.1ns ± 2% 45.3ns ± 2% ~ (p=0.367 n=19+20) FmtFprintfString-12 131ns ± 3% 129ns ± 0% -1.60% (p=0.000 n=20+16) FmtFprintfInt-12 122ns ± 0% 121ns ± 2% -0.86% (p=0.000 n=16+19) FmtFprintfIntInt-12 187ns ± 1% 186ns ± 1% ~ (p=0.514 n=18+19) FmtFprintfPrefixedInt-12 189ns ± 0% 188ns ± 1% -0.54% (p=0.000 n=16+18) FmtFprintfFloat-12 256ns ± 0% 254ns ± 1% -0.43% (p=0.000 n=17+19) FmtManyArgs-12 769ns ± 0% 763ns ± 0% -0.72% (p=0.000 n=18+18) GobDecode-12 7.08ms ± 2% 7.00ms ± 1% -1.22% (p=0.000 n=20+20) GobEncode-12 5.88ms ± 0% 5.88ms ± 1% ~ (p=0.406 n=18+18) Gzip-12 214ms ± 0% 214ms ± 1% ~ (p=0.103 n=17+18) Gunzip-12 37.6ms ± 0% 37.6ms ± 0% ~ (p=0.563 n=17+17) HTTPClientServer-12 77.2µs ± 3% 76.9µs ± 2% ~ (p=0.606 n=20+20) JSONEncode-12 15.1ms ± 1% 15.2ms ± 2% ~ (p=0.138 n=19+19) JSONDecode-12 53.3ms ± 1% 53.1ms ± 1% -0.33% (p=0.000 n=19+18) Mandelbrot200-12 4.04ms ± 1% 4.04ms ± 1% ~ (p=0.075 n=19+18) GoParse-12 3.30ms ± 1% 3.29ms ± 1% -0.57% (p=0.000 n=18+16) RegexpMatchEasy0_32-12 69.5ns ± 1% 69.9ns ± 3% ~ (p=0.822 n=18+20) RegexpMatchEasy0_1K-12 237ns ± 1% 237ns ± 0% ~ (p=0.398 n=19+18) RegexpMatchEasy1_32-12 69.8ns ± 2% 69.5ns ± 1% ~ (p=0.090 n=20+16) RegexpMatchEasy1_1K-12 371ns ± 1% 372ns ± 1% ~ (p=0.178 n=19+20) RegexpMatchMedium_32-12 108ns ± 2% 108ns ± 3% ~ (p=0.124 n=20+19) RegexpMatchMedium_1K-12 33.9µs ± 2% 34.2µs ± 4% ~ (p=0.309 n=20+19) RegexpMatchHard_32-12 1.75µs ± 2% 1.77µs ± 4% +1.28% (p=0.018 n=19+18) RegexpMatchHard_1K-12 52.7µs ± 1% 53.4µs ± 4% +1.23% (p=0.013 n=15+18) Revcomp-12 354ms ± 1% 359ms ± 4% +1.27% (p=0.043 n=20+20) Template-12 63.6ms ± 2% 63.7ms ± 2% ~ (p=0.654 n=20+18) TimeParse-12 313ns ± 1% 316ns ± 2% +0.80% (p=0.014 n=17+20) TimeFormat-12 332ns ± 0% 329ns ± 0% -0.66% (p=0.000 n=16+16) [Geo mean] 51.7µs 51.6µs -0.09% Change-Id: I2214a6a0e4f544699ea166073249a8efdf080dc0 Reviewed-on: https://go-review.googlesource.com/21323 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-21 20:07:22 +00:00
Austin Clements	64a26b79ac	runtime: simplify/optimize allocate-black a bit Currently allocating black switches to the system stack (which is probably a historical accident) and atomically updates the global bytes marked stat. Since we're about to depend on this much more, optimize it a bit by putting it back on the regular stack and updating the per-P bytes marked stat, which gets lazily folded into the global bytes marked stat. Change-Id: Ibbe16e5382d3fd2256e4381f88af342bf7020b04 Reviewed-on: https://go-review.googlesource.com/22170 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-21 20:07:20 +00:00
Austin Clements	479501c14c	runtime: count black allocations toward scan work Currently we count black allocations toward the scannable heap size, but not toward the scan work we've done so far. This is clearly inconsistent (we have, in effect, scanned these allocations and since they're already black, we're not going to scan them again). Worse, it means we don't count black allocations toward the scannable heap size as of the next GC because this is based on the amount of scan work we did in this cycle. Fix this by counting black allocations as scan work. Currently the GC spends very little time in allocate-black mode, so this probably hasn't been a problem, but this will become important when we switch to always allocating black. Change-Id: If6ff693b070c385b65b6ecbbbbf76283a0f9d990 Reviewed-on: https://go-review.googlesource.com/22119 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-21 20:07:17 +00:00
Martin Möhrmann	7e460e70d9	runtime: use type int to specify size for newarray Consistently use type int for the size argument of runtime.newarray, runtime.reflect_unsafe_NewArray and reflect.unsafe_NewArray. Change-Id: Ic77bf2dde216c92ca8c49462f8eedc0385b6314e Reviewed-on: https://go-review.googlesource.com/22311 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Martin Möhrmann <martisch@uos.de> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-21 04:15:14 +00:00
Keith Randall	60fd32a47f	cmd/compile: change the way we handle large map values mapaccess{1,2} returns a pointer to the value. When the key is not in the map, it returns a pointer to zeroed memory. Currently, for large map values we have a complicated scheme which dynamically allocates zeroed memory for this purpose. It is ugly code and requires an atomic.Load in a bunch of places we'd rather not have it. Switch to a scheme where callsites of mapaccess{1,2} which expect large return values pass in a pointer to zeroed memory that mapaccess can return if the key is not found. This avoids the atomic.Load on all map accesses with a few extra instructions only for the large value acccesses, plus a bit of bss space. There was a time (1.4 & 1.5?) where we did something like this but all the tricks to make the right size zero value were done by the linker. That scheme broke in the presence of dyamic linking. The scheme in this CL works even when dynamic linking. Fixes #12337 Change-Id: Ic2d0319944af33bbb59785938d9ab80958d1b4b1 Reviewed-on: https://go-review.googlesource.com/22221 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>	2016-04-20 21:15:31 +00:00
Keith Randall	001e8e8070	runtime: simplify mallocgc flag argument mallocgc can calculate noscan itself. The only remaining flag argument is needzero, so we just make that a boolean arg. Fixes #15379 Change-Id: I839a70790b2a0c9dbcee2600052bfbd6c8148e20 Reviewed-on: https://go-review.googlesource.com/22290 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-20 14:02:22 +00:00
Keith Randall	bfe0cbdc50	cmd/compile,runtime: pass elem type to {make,grow}slice No point in passing the slice type to these functions. All they need is the element type. One less indirection, maybe a few less []T type descriptors in the binary. Change-Id: Ib0b83b5f14ca21d995ecc199ce8ac00c4eb375e6 Reviewed-on: https://go-review.googlesource.com/22275 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2016-04-20 00:31:16 +00:00
Josh Bleecher Snyder	0150f15a92	runtime: call mallocgc directly from makeslice and growslice The extra checks provided by newarray are redundant in these cases. This shrinks by one frame the call stack expected by the pprof test. name old time/op new time/op delta MakeSlice-8 34.3ns ± 2% 30.5ns ± 3% -11.03% (p=0.000 n=24+22) GrowSlicePtr-8 134ns ± 2% 129ns ± 3% -3.25% (p=0.000 n=25+24) Change-Id: Icd828655906b921c732701fd9d61da3fa217b0af Reviewed-on: https://go-review.googlesource.com/22276 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2016-04-20 00:05:36 +00:00
Julia Hansbrough	58012ea785	runtime: updated SIGSYS to cause a panic + stacktrace On GNU/Linux, SIGSYS is specified to cause the process to terminate without a core dump. In https://codereview.appspot.com/3749041 , it appears that Golang accidentally introduced incorrect behavior for this signal, which caused Golang processes to keep running after receiving SIGSYS. This change reverts it to the old/correct behavior. Updates #15204 Change-Id: I3aa48a9499c1bc36fa5d3f40c088fdd7599e0db5 Reviewed-on: https://go-review.googlesource.com/22202 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-19 22:48:31 +00:00
Keith Randall	998c8e034c	cmd/compile: convT2{I,E} don't handle direct interfaces We now inline type to interface conversions when the type is pointer-shaped. No need to keep code to handle that in convT2{I,E}. Change-Id: I3a6668259556077cbb2986a9e8fe42a625d506c9 Reviewed-on: https://go-review.googlesource.com/22249 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michel Lespinasse <walken@google.com>	2016-04-19 22:27:08 +00:00
Josh Bleecher Snyder	a4dd6ea152	runtime: add maxSliceCap This avoids expensive division calculations for many common slice element sizes. name old time/op new time/op delta MakeSlice-8 51.9ns ± 3% 35.1ns ± 2% -32.41% (p=0.000 n=10+10) GrowSliceBytes-8 44.1ns ± 2% 44.1ns ± 1% ~ (p=0.984 n=10+10) GrowSliceInts-8 60.9ns ± 3% 60.9ns ± 3% ~ (p=0.698 n=10+10) GrowSlicePtr-8 131ns ± 1% 120ns ± 2% -8.41% (p=0.000 n=8+10) GrowSliceStruct24Bytes-8 111ns ± 2% 103ns ± 3% -7.23% (p=0.000 n=8+8) Change-Id: I2630eb3d73c814db030cad16e620ea7fecbbd312 Reviewed-on: https://go-review.googlesource.com/22223 Reviewed-by: Keith Randall <khr@golang.org>	2016-04-19 21:38:52 +00:00
Josh Bleecher Snyder	411a0adc9b	runtime: add benchmarks for in-place append Change-Id: I2b43cc976d2efbf8b41170be536fdd10364b65e5 Reviewed-on: https://go-review.googlesource.com/22190 Reviewed-by: Keith Randall <khr@golang.org>	2016-04-18 19:08:39 +00:00
David Crawshaw	95df0c6ab9	cmd/compile, etc: use name offset in method tables Introduce and start using nameOff for two encoded names. This pair of changes is best done together because the linker's method decoder expects the method layouts to match. Precursor to converting all existing name and *string fields to nameOff. linux/amd64: cmd/go: -45KB (0.5%) jujud: -389KB (0.6%) linux/amd64 PIE: cmd/go: -170KB (1.4%) jujud: -1.5MB (1.8%) For #6853. Change-Id: Ia044423f010fb987ce070b94c46a16fc78666ff6 Reviewed-on: https://go-review.googlesource.com/21396 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-18 09:12:41 +00:00
Austin Clements	2cdcb6f829	runtime: scavenge memory on physical page-aligned boundaries Currently the scavenger marks memory unused in multiples of the allocator page size (8K). This is safe as long as the true physical page size is 4K (or 8K), as it is on many platforms. However, on ARM64, PPC64x, and MIPS64, the physical page size is larger than 8K, so if we attempt to mark memory unused, the kernel will round the boundaries of the region out to all pages covered by the requested region, and we'll release a larger region of memory than intended. As a result, the scavenger is currently disabled on these platforms. Fix this by first rounding the region to be marked unused in to multiples of the physical page size, so that when we ask the kernel to mark it unused, it releases exactly the requested region. Fixes #9993. Change-Id: I96d5fdc2f77f9d69abadcea29bcfe55e68288cb1 Reviewed-on: https://go-review.googlesource.com/22066 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-04-16 21:42:43 +00:00
Austin Clements	1151473077	runtime: check that sysUnused is always physical-page aligned If sysUnused is passed an address or length that is not aligned to the physical page boundary, the kernel will unmap more memory than the caller wanted. Add a check for this. For #9993. Change-Id: I68ff03032e7b65cf0a853fe706ce21dc7f2aaaf8 Reviewed-on: https://go-review.googlesource.com/22065 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Dave Cheney <dave@cheney.net> Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>	2016-04-16 21:42:40 +00:00
Austin Clements	8ce844e88e	runtime: check kernel physical page size during init The runtime hard-codes an assumed physical page size. If this is smaller than the kernel's page size or not a multiple of it, sysUnused may incorrectly release more memory to the system than intended. Add a runtime startup check that the runtime's assumed physical page is compatible with the kernel's physical page size. For #9993. Change-Id: Ida9d07f93c00ca9a95dd55fc59bf0d8a607f6728 Reviewed-on: https://go-review.googlesource.com/22064 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-04-16 21:42:37 +00:00
Austin Clements	d6b177d1eb	runtime: remove empty 386 archauxv archauxv no longer does anything on 386, so remove it. Change-Id: I94545238e40fa6a6832a7c3b40aedfc6c1f6a97b Reviewed-on: https://go-review.googlesource.com/22063 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-16 21:42:34 +00:00
Austin Clements	90addd3d41	runtime: common handling of _AT_RANDOM auxv The Linux kernel provides 16 bytes of random data via the auxv vector at startup. Currently we consume this separately on 386, amd64, arm, and arm64. Now that we have a common auxv parser, handle _AT_RANDOM in the common path. Change-Id: Ib69549a1d37e2d07a351cf0f44007bcd24f0d20d Reviewed-on: https://go-review.googlesource.com/22062 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-16 21:42:31 +00:00
Austin Clements	c955bb2040	runtime: common auxv parser Currently several different Linux architectures have separate copies of the auxv parser. Bring these all together into a single copy of the parser that calls out to a per-arch handler for each tag/value pair. This is in preparation for handling common auxv tags in one place. For #9993. Change-Id: Iceebc3afad6b4133b70fca7003561ae370445c10 Reviewed-on: https://go-review.googlesource.com/22061 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>	2016-04-16 21:42:27 +00:00
Mikio Hara	6f59ccb052	runtime: don't always unblock all signals on dragonfly, freebsd and openbsd https://golang.org/cl/10173 intrduced msigsave, ensureSigM and _SigUnblock but didn't enable the new signal save/restore mechanism for SIG{HUP,INT,QUIT,ABRT,TERM} on DragonFly BSD, FreeBSD and OpenBSD. At present, it looks like they have the implementation. This change enables the new mechanism on DragonFly BSD, FreeBSD and OpenBSD the same as Darwin, NetBSD. Change-Id: Ifb4b4743b3b4f50bfcdc7cf1fe1b59c377fa2a41 Reviewed-on: https://go-review.googlesource.com/18657 Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-15 21:20:45 +00:00
Austin Clements	7c7081f514	sync/atomic: don't atomically write pointers twice sync/atomic.StorePointer (which is implemented in runtime/atomic_pointer.go) writes the pointer twice (through two completely different code paths, no less). Fix it to only write once. Change-Id: Id3b2aef9aa9081c2cf096833e001b93d3dd1f5da Reviewed-on: https://go-review.googlesource.com/21999 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-04-14 21:13:26 +00:00
Austin Clements	8f6c35de2f	runtime: make sync_atomic_SwapPointer signature match sync/atomic SwapPointer is declared as func SwapPointer(addr unsafe.Pointer, new unsafe.Pointer) (old unsafe.Pointer) in sync/atomic, but defined in the runtime (where it's actually implemented) as func sync_atomic_SwapPointer(ptr unsafe.Pointer, new unsafe.Pointer) unsafe.Pointer Make ptr a unsafe.Pointer in the runtime definition to match the type in sync/atomic. Change-Id: I99bab651b995001bbe54f9e790fdef2417ef0e9e Reviewed-on: https://go-review.googlesource.com/21998 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>	2016-04-14 21:13:23 +00:00
Keith Randall	98b6febcef	runtime/internal/sys: better fallback algorithms for intrinsics Use deBruijn sequences to count low-order zeros. Reorg bswap to not use &^, it takes another instruction on x86. Change-Id: I4a5ed9fd16ee6a279d88c067e8a2ba11de821156 Reviewed-on: https://go-review.googlesource.com/22084 Reviewed-by: David Chase <drchase@google.com>	2016-04-14 21:09:03 +00:00
Jeremy Jackins	02b8e6978a	runtime: find a home for orphaned comments These comments were left behind after runtime.h was converted from C to Go. I examined the original code and tried to move these to the places that the most sense. Change-Id: I8769d60234c0113d682f9de3bd8d6c34c450c188 Reviewed-on: https://go-review.googlesource.com/21969 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-14 18:34:09 +00:00
David Crawshaw	f120936dff	cmd/compile, etc: use name for type pkgPath By replacing the string used to represent pkgPath with a reflect.name everywhere, the embedded string for package paths inside the reflect.name can be replaced by an offset, nameOff. This reduces the number of pointers in the type information. This also moves all reflect.name types into the same section, making it possible to use nameOff more widely in later CLs. No significant binary size change for normal binaries, but: linux/amd64 PIE: cmd/go: -440KB (3.7%) jujud: -2.6MB (3.2%) For #6853. Change-Id: I3890b132a784a1090b1b72b32febfe0bea77eaee Reviewed-on: https://go-review.googlesource.com/21395 Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-13 20:48:26 +00:00
Brad Fitzpatrick	73e2ad2022	runtime: rename os1_darwin.go to os_darwin.go Change-Id: If0e0bc5a85101db1e70faaab168fc2d12024eb93 Reviewed-on: https://go-review.googlesource.com/22005 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-13 20:37:12 +00:00
Brad Fitzpatrick	d9712aa82a	runtime: merge the darwin os*.go files together Merge them together into os1_darwin.go. A future CL will rename it. Change-Id: Ia4380d3296ebd5ce210908ce3582ff184566f692 Reviewed-on: https://go-review.googlesource.com/22004 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-13 20:35:09 +00:00
Austin Clements	4721ea6abc	runtime/internal/atomic: rename Storep1 to StorepNoWB Make it clear that the point of this function stores a pointer without a write barrier. sed -i -e 's/Storep1/StorepNoWB/' $(git grep -l Storep1) Updates #15270. Change-Id: Ifad7e17815e51a738070655fe3b178afdadaecf6 Reviewed-on: https://go-review.googlesource.com/21994 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>	2016-04-13 19:17:25 +00:00
Austin Clements	d8e8fc292a	runtime/internal/atomic: remove write barrier from Storep1 on s390x atomic.Storep1 is not supposed to invoke a write barrier (that's what atomicstorep is for), but currently does on s390x. This causes a panic in runtime.mapzero when it tries to use atomic.Storep1 to store what's actually a scalar. Fix this by eliminating the write barrier from atomic.Storep1 on s390x. Also add some documentation to atomicstorep to explain the difference between these. Fixes #15270. Change-Id: I291846732d82f090a218df3ef6351180aff54e81 Reviewed-on: https://go-review.googlesource.com/21993 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Michael Munday <munday@ca.ibm.com>	2016-04-13 16:06:51 +00:00
Lynn Boger	c4807d4cc7	runtime: improve memmove performance ppc64,ppc64le This change improves the performance of memmove on ppc64 & ppc64le mainly for moves >=32 bytes. In addition, the test to detect backward moves was enhanced to avoid backward moves if source and dest were in different types of storage, since backward moves might not always be efficient. Fixes #14507 The following shows some of the improvements from the test in the runtime package: BenchmarkMemmove32 4229.56 4717.13 1.12x BenchmarkMemmove64 6156.03 7810.42 1.27x BenchmarkMemmove128 7521.69 12468.54 1.66x BenchmarkMemmove256 6729.90 18260.33 2.71x BenchmarkMemmove512 8521.59 18033.81 2.12x BenchmarkMemmove1024 9760.92 25762.61 2.64x BenchmarkMemmove2048 10241.00 29584.94 2.89x BenchmarkMemmove4096 10399.37 31882.31 3.07x BenchmarkMemmoveUnalignedDst16 1943.69 2258.33 1.16x BenchmarkMemmoveUnalignedDst32 3885.08 3965.81 1.02x BenchmarkMemmoveUnalignedDst64 5121.63 6965.54 1.36x BenchmarkMemmoveUnalignedDst128 7212.34 11372.68 1.58x BenchmarkMemmoveUnalignedDst256 6564.52 16913.59 2.58x BenchmarkMemmoveUnalignedDst512 8364.35 17782.57 2.13x BenchmarkMemmoveUnalignedDst1024 9539.87 24914.72 2.61x BenchmarkMemmoveUnalignedDst2048 9199.23 21235.11 2.31x BenchmarkMemmoveUnalignedDst4096 10077.39 25231.99 2.50x BenchmarkMemmoveUnalignedSrc32 3249.83 3742.52 1.15x BenchmarkMemmoveUnalignedSrc64 5562.35 6627.96 1.19x BenchmarkMemmoveUnalignedSrc128 6023.98 10200.84 1.69x BenchmarkMemmoveUnalignedSrc256 6921.83 15258.43 2.20x BenchmarkMemmoveUnalignedSrc512 8593.13 16541.97 1.93x BenchmarkMemmoveUnalignedSrc1024 9730.95 22927.84 2.36x BenchmarkMemmoveUnalignedSrc2048 9793.28 21537.73 2.20x BenchmarkMemmoveUnalignedSrc4096 10132.96 26295.06 2.60x Change-Id: I73af59970d4c97c728deabb9708b31ec7e01bdf2 Reviewed-on: https://go-review.googlesource.com/21990 Reviewed-by: Bill O'Farrell <billotosyr@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-13 15:27:59 +00:00
David Crawshaw	7d469179e6	cmd/compile, etc: store method tables as offsets This CL introduces the typeOff type and a lookup method of the same name that can turn a typeOff offset into an rtype. In a typical Go binary (built with buildmode=exe, pie, c-archive, or c-shared), there is one moduledata and all typeOff values are offsets relative to firstmoduledata.types. This makes computing the pointer cheap in typical programs. With buildmode=shared (and one day, buildmode=plugin) there are multiple modules whose relative offset is determined at runtime. We identify a type in the general case by the pair of the original rtype that references it and its typeOff value. We determine the module from the original pointer, and then use the typeOff from there to compute the final rtype. To ensure there is only one rtype representing each type, the runtime initializes a typemap for each module, using any identical type from an earlier module when resolving that offset. This means that types computed from an offset match the type mapped by the pointer dynamic relocations. A series of followup CLs will replace other rtype values with typeOff (and name/string with nameOff). For types created at runtime by reflect, type offsets are treated as global IDs and reference into a reflect offset map kept by the runtime. darwin/amd64: cmd/go: -57KB (0.6%) jujud: -557KB (0.8%) linux/amd64 PIE: cmd/go: -361KB (3.0%) jujud: -3.5MB (4.2%) For #6853. Change-Id: Icf096fd884a0a0cb9f280f46f7a26c70a9006c96 Reviewed-on: https://go-review.googlesource.com/21285 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-13 13:03:11 +00:00
Matthew Dempsky	6af4e996e2	runtime: simplify setPanicOnFault slightly No need to acquire the M just to change G's paniconfault flag, and the original C implementation of SetPanicOnFault did not. The M acquisition logic is an artifact of golang.org/cl/131010044, which was started before golang.org/cl/123640043 (which introduced the current "getg" function) was submitted. Change-Id: I6d1939008660210be46904395cf5f5bbc2c8f754 Reviewed-on: https://go-review.googlesource.com/21935 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-13 06:14:06 +00:00
Keith Randall	260b7daf0a	cmd/compile: fix arg to getcallerpc getcallerpc's arg needs to point to the first argument slot. I believe this bug was introduced by Michel's itab changes (specifically https://go-review.googlesource.com/c/20902). Fixes #15145 Change-Id: Ifb2e17f3658e2136c7950dfc789b4d5706320683 Reviewed-on: https://go-review.googlesource.com/21931 Reviewed-by: Michel Lespinasse <walken@google.com>	2016-04-13 00:24:38 +00:00
David Crawshaw	f028b9f9e2	cmd/link, etc: store typelinks as offsets This is the first in a series of CLs to replace the use of pointers in binary read-only data with offsets. In standard Go binaries these CLs have a small effect, shrinking 8-byte pointers to 4-bytes. In position-independent code, it also saves the dynamic relocation for the pointer. This has a significant effect on the binary size when building as PIE, c-archive, or c-shared. darwin/amd64: cmd/go: -12KB (0.1%) jujud: -82KB (0.1%) linux/amd64 PIE: cmd/go: -86KB (0.7%) jujud: -569KB (0.7%) For #6853. Change-Id: Iad5625bbeba58dabfd4d334dbee3fcbfe04b2dcf Reviewed-on: https://go-review.googlesource.com/21284 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-12 20:32:41 +00:00
Michael Munday	78ecd61f62	runtime/cgo: add s390x support Change-Id: I64ada9fe34c3cfc4bd514ec5d8c8f4d4c99074fb Reviewed-on: https://go-review.googlesource.com/20950 Reviewed-by: Bill O'Farrell <billotosyr@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-12 15:42:36 +00:00
Michael Munday	7cbe7b1e86	runtime/internal/atomic: add s390x atomic operations Load and store instructions are atomic on the s390x. Change-Id: I0031ed2fba43f33863bca114d0fdec2e7d1ce807 Reviewed-on: https://go-review.googlesource.com/20938 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-12 15:40:48 +00:00
Dmitry Vyukov	3fafe2e888	internal/trace: support parsing of 1.5 traces 1. Parse out version from trace header. 2. Restore handling of 1.5 traces. 3. Restore optional symbolization of traces. 4. Add some canned 1.5 traces for regression testing (http benchmark trace, runtime/trace stress traces, plus one with broken timestamps). Change-Id: Idb18a001d03ded8e13c2730eeeb37c5836e31256 Reviewed-on: https://go-review.googlesource.com/21803 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-04-11 17:56:44 +00:00
Dominik Honnef	683917a721	all: use bytes.Equal, bytes.Contains and strings.Contains, again The previous cleanup was done with a buggy tool, missing some potential rewrites. Change-Id: I333467036e355f999a6a493e8de87e084f374e26 Reviewed-on: https://go-review.googlesource.com/21378 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-11 15:16:54 +00:00
Dave Cheney	720c4c016c	runtime: merge lfstack_amd64.go into lfstack_64bit.go Merge the amd64 lfstack implementation into the general 64 bit implementation. Change-Id: Id9ed61b90d2e3bc3b0246294c03eb2c92803b6ca Reviewed-on: https://go-review.googlesource.com/21707 Run-TryBot: Dave Cheney <dave@cheney.net> Reviewed-by: Minux Ma <minux@golang.org>	2016-04-11 06:18:52 +00:00
Jeremy Jackins	ba09d06e16	runtime: remove remaining references to TheChar After mdempsky's recent changes, these are the only references to "TheChar" left in the Go tree. Without the context, and without knowing the history, this is confusing. Also rename sys.TheGoos and sys.TheGoarch to sys.GOOS and sys.GOARCH. Also change the heap dump format to include sys.GOARCH rather than TheChar, which is no longer a concept. Updates #15169 (changes heapdump format) Change-Id: I3e99eeeae00ed55d7d01e6ed503d958c6e931dca Reviewed-on: https://go-review.googlesource.com/21647 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2016-04-11 04:32:07 +00:00
Martin Möhrmann	ad7448fe98	runtime: speed up makeslice by avoiding divisions Only compute the number of maximum allowed elements per slice once. name old time/op new time/op delta MakeSlice-2 55.5ns ± 1% 45.6ns ± 2% -17.88% (p=0.000 n=99+100) Change-Id: I951feffda5d11910a75e55d7e978d306d14da2c5 Reviewed-on: https://go-review.googlesource.com/21801 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-10 23:39:46 +00:00
Josh Bleecher Snyder	6b33b0e98e	cmd/compile: avoid a spill in append fast path Instead of spilling newlen, recalculate it. This removes a spill from the fast path, at the cost of a cheap recalculation on the (rare) growth path. This uses 8 bytes less of stack space. It generates two more bytes of code, but that is due to suboptimal register allocation; see far below. Runtime append microbenchmarks are all over the map, presumably due to incidental code movement. Sample code: func s(b []byte) []byte { b = append(b, 1, 2, 3) return b } Before: "".s t=1 size=160 args=0x30 locals=0x48 0x0000 00000 (append.go:8) TEXT "".s(SB), $72-48 0x0000 00000 (append.go:8) MOVQ (TLS), CX 0x0009 00009 (append.go:8) CMPQ SP, 16(CX) 0x000d 00013 (append.go:8) JLS 149 0x0013 00019 (append.go:8) SUBQ $72, SP 0x0017 00023 (append.go:8) FUNCDATA $0, gclocals·6432f8c6a0d23fa7bee6c5d96f21a92a(SB) 0x0017 00023 (append.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 0x0017 00023 (append.go:9) MOVQ "".b+88(FP), CX 0x001c 00028 (append.go:9) LEAQ 3(CX), DX 0x0020 00032 (append.go:9) MOVQ DX, "".autotmp_0+64(SP) 0x0025 00037 (append.go:9) MOVQ "".b+96(FP), BX 0x002a 00042 (append.go:9) CMPQ DX, BX 0x002d 00045 (append.go:9) JGT $0, 86 0x002f 00047 (append.go:8) MOVQ "".b+80(FP), AX 0x0034 00052 (append.go:9) MOVB $1, (AX)(CX1) 0x0038 00056 (append.go:9) MOVB $2, 1(AX)(CX1) 0x003d 00061 (append.go:9) MOVB $3, 2(AX)(CX1) 0x0042 00066 (append.go:10) MOVQ AX, "".~r1+104(FP) 0x0047 00071 (append.go:10) MOVQ DX, "".~r1+112(FP) 0x004c 00076 (append.go:10) MOVQ BX, "".~r1+120(FP) 0x0051 00081 (append.go:10) ADDQ $72, SP 0x0055 00085 (append.go:10) RET 0x0056 00086 (append.go:9) LEAQ type.[]uint8(SB), AX 0x005d 00093 (append.go:9) MOVQ AX, (SP) 0x0061 00097 (append.go:9) MOVQ "".b+80(FP), BP 0x0066 00102 (append.go:9) MOVQ BP, 8(SP) 0x006b 00107 (append.go:9) MOVQ CX, 16(SP) 0x0070 00112 (append.go:9) MOVQ BX, 24(SP) 0x0075 00117 (append.go:9) MOVQ DX, 32(SP) 0x007a 00122 (append.go:9) PCDATA $0, $0 0x007a 00122 (append.go:9) CALL runtime.growslice(SB) 0x007f 00127 (append.go:9) MOVQ 40(SP), AX 0x0084 00132 (append.go:9) MOVQ 56(SP), BX 0x0089 00137 (append.go:8) MOVQ "".b+88(FP), CX 0x008e 00142 (append.go:9) MOVQ "".autotmp_0+64(SP), DX 0x0093 00147 (append.go:9) JMP 52 0x0095 00149 (append.go:9) NOP 0x0095 00149 (append.go:8) CALL runtime.morestack_noctxt(SB) 0x009a 00154 (append.go:8) JMP 0 After: "".s t=1 size=176 args=0x30 locals=0x40 0x0000 00000 (append.go:8) TEXT "".s(SB), $64-48 0x0000 00000 (append.go:8) MOVQ (TLS), CX 0x0009 00009 (append.go:8) CMPQ SP, 16(CX) 0x000d 00013 (append.go:8) JLS 151 0x0013 00019 (append.go:8) SUBQ $64, SP 0x0017 00023 (append.go:8) FUNCDATA $0, gclocals·6432f8c6a0d23fa7bee6c5d96f21a92a(SB) 0x0017 00023 (append.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 0x0017 00023 (append.go:9) MOVQ "".b+80(FP), CX 0x001c 00028 (append.go:9) LEAQ 3(CX), DX 0x0020 00032 (append.go:9) MOVQ "".b+88(FP), BX 0x0025 00037 (append.go:9) CMPQ DX, BX 0x0028 00040 (append.go:9) JGT $0, 81 0x002a 00042 (append.go:8) MOVQ "".b+72(FP), AX 0x002f 00047 (append.go:9) MOVB $1, (AX)(CX1) 0x0033 00051 (append.go:9) MOVB $2, 1(AX)(CX1) 0x0038 00056 (append.go:9) MOVB $3, 2(AX)(CX1) 0x003d 00061 (append.go:10) MOVQ AX, "".~r1+96(FP) 0x0042 00066 (append.go:10) MOVQ DX, "".~r1+104(FP) 0x0047 00071 (append.go:10) MOVQ BX, "".~r1+112(FP) 0x004c 00076 (append.go:10) ADDQ $64, SP 0x0050 00080 (append.go:10) RET 0x0051 00081 (append.go:9) LEAQ type.[]uint8(SB), AX 0x0058 00088 (append.go:9) MOVQ AX, (SP) 0x005c 00092 (append.go:9) MOVQ "".b+72(FP), BP 0x0061 00097 (append.go:9) MOVQ BP, 8(SP) 0x0066 00102 (append.go:9) MOVQ CX, 16(SP) 0x006b 00107 (append.go:9) MOVQ BX, 24(SP) 0x0070 00112 (append.go:9) MOVQ DX, 32(SP) 0x0075 00117 (append.go:9) PCDATA $0, $0 0x0075 00117 (append.go:9) CALL runtime.growslice(SB) 0x007a 00122 (append.go:9) MOVQ 40(SP), AX 0x007f 00127 (append.go:9) MOVQ 48(SP), CX 0x0084 00132 (append.go:9) MOVQ 56(SP), BX 0x0089 00137 (append.go:9) ADDQ $3, CX 0x008d 00141 (append.go:9) MOVQ CX, DX 0x0090 00144 (append.go:8) MOVQ "".b+80(FP), CX 0x0095 00149 (append.go:9) JMP 47 0x0097 00151 (append.go:9) NOP 0x0097 00151 (append.go:8) CALL runtime.morestack_noctxt(SB) 0x009c 00156 (append.go:8) JMP 0 Observe that in the following sequence, we should use DX directly instead of using CX as a temporary register, which would make the new code a strict improvement on the old: 0x007f 00127 (append.go:9) MOVQ 48(SP), CX 0x0084 00132 (append.go:9) MOVQ 56(SP), BX 0x0089 00137 (append.go:9) ADDQ $3, CX 0x008d 00141 (append.go:9) MOVQ CX, DX 0x0090 00144 (append.go:8) MOVQ "".b+80(FP), CX Change-Id: I4ee50b18fa53865901d2d7f86c2cbb54c6fa6924 Reviewed-on: https://go-review.googlesource.com/21812 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2016-04-10 20:02:04 +00:00
Josh Bleecher Snyder	974c201f74	runtime: avoid unnecessary map iteration write barrier Update #14921 Change-Id: I5c5816d0193757bf7465b1e09c27ca06897df4bf Reviewed-on: https://go-review.googlesource.com/21814 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2016-04-10 20:01:47 +00:00
Emmanuel Odeke	e4f1d9cf2e	runtime: make execution error panic values implement the Error interface Make execution panics implement Error as mandated by https://golang.org/ref/spec#Run_time_panics, instead of panics with strings. Fixes #14965 Change-Id: I7827f898b9b9c08af541db922cc24fa0800ff18a Reviewed-on: https://go-review.googlesource.com/21214 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-10 01:16:30 +00:00
Dmitry Vyukov	0435e88a11	runtime: revert "do not call timeBeginPeriod on windows" This reverts commit `ab4c9298b8`. Sysmon critically depends on system timer resolution for retaking of Ps blocked in system calls. See #14790 for an example of a program where execution time goes from 2ms to 30ms if timeBeginPeriod(1) is not used. We can remove timeBeginPeriod(1) when we support UMS (#7876). Update #14790 Change-Id: I362b56154359b2c52d47f9f2468fe012b481cf6d Reviewed-on: https://go-review.googlesource.com/20834 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>	2016-04-09 16:11:41 +00:00
Dmitry Vyukov	0fb7b4cccd	runtime: emit file:line info into traces This makes traces self-contained and simplifies trace workflow in modern cloud environments where it is simpler to reach a service via HTTP than to obtain the binary. Change-Id: I6ff3ca694dc698270f1e29da37d5efaf4e843a0d Reviewed-on: https://go-review.googlesource.com/21732 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>	2016-04-08 20:52:30 +00:00
Michael Munday	e6f36f0cd5	runtime: add s390x support (new files and lfstack_64bit.go modifications) Change-Id: I51c0a332e3cbdab348564e5dcd27583e75e4b881 Reviewed-on: https://go-review.googlesource.com/20946 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-07 18:56:54 +00:00
Dave Cheney	9cc9e95b28	Revert "runtime: merge lfstack{Pack,Unpack} into one file" This broke solaris, which apparently does use the upper 17 bits of the address space. This reverts commit `3b02c5b1b6`. Change-Id: Iedfe54abd0384960845468205f20191a97751c0b Reviewed-on: https://go-review.googlesource.com/21652 Reviewed-by: Dave Cheney <dave@cheney.net>	2016-04-07 14:05:24 +00:00
Richard Miller	121c434f7a	runtime/pprof: make TestBlockProfile less timing dependent The test for profiling of channel blocking is timing dependent, and in particular the blockSelectRecvAsync case can fail on a slow builder (plan9_arm) when many tests are run in parallel. The child goroutine sleeps for a fixed period so the parent can be observed to block in a select call reading from the child; but if the OS process running the parent goroutine is delayed long enough, the child may wake again before the parent has reached the blocking point. By repeating the test three times, the likelihood of a blocking event is increased. Fixes #15096 Change-Id: I2ddb9576a83408d06b51ded682bf8e71e53ce59e Reviewed-on: https://go-review.googlesource.com/21604 Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-07 09:57:06 +00:00
Dave Cheney	3b02c5b1b6	runtime: merge lfstack{Pack,Unpack} into one file Merge the remaining lfstack{Pack,Unpack} implemetations into one file. unsafe.Sizeof(uintptr(0)) == 4 is a constant comparison so this branch folds away at compile time. Dmitry confirmed that the upper 17 bits of an address will be zero for a user mode pointer, so there is no need to sign extend on amd64 during unpack, so we can reuse the same implementation as all othe 64 bit archs. Change-Id: I99f589416d8b181ccde5364c9c2e78e4a5efc7f1 Reviewed-on: https://go-review.googlesource.com/21597 Run-TryBot: Dave Cheney <dave@cheney.net> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Minux Ma <minux@golang.org>	2016-04-07 09:17:22 +00:00
Michael Hudson-Doyle	31cf1c1779	runtime: clamp OS-reported number of processors to _MaxGomaxprocs So that all Go processes do not die on startup on a system with >256 CPUs. I tested this by hacking osinit to set ncpu to 1000. Updates #15131 Change-Id: I52e061a0de97be41d684dd8b748fa9087d6f1aef Reviewed-on: https://go-review.googlesource.com/21599 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-07 00:11:25 +00:00
Dave Cheney	0c81248bf4	runtime: remove unused return value from lfstackUnpack None of the two places that call lfstackUnpack use the second argument. This simplifies a followup CL that merges the lfstack{Pack,Unpack} implementations. Change-Id: I3c93f6259da99e113d94f8c8027584da79c1ac2c Reviewed-on: https://go-review.googlesource.com/21595 Run-TryBot: Dave Cheney <dave@cheney.net> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-06 21:04:43 +00:00
Brad Fitzpatrick	2cefd12a1b	net, runtime: skip flaky tests on OpenBSD Flaky tests are a distraction and cover up real problems. File bugs instead and mark them as flaky. This moves the net/http flaky test flagging mechanism to internal/testenv. Updates #15156 Updates #15157 Updates #15158 Change-Id: I0e561cd2a09c0dec369cd4ed93bc5a2b40233dfe Reviewed-on: https://go-review.googlesource.com/21614 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-06 19:28:24 +00:00
Dave Cheney	5c7ae10f66	runtime: merge 64bit lfstack impls Merge all the 64bit lfstack impls into one file, adjust build tags to match. Merge all the comments on the various lfstack implementations for posterity. lfstack_amd64.go can probably be merged, but it is slightly different so that will happen in a followup. Change-Id: I5362d5e127daa81c9cb9d4fa8a0cc5c5e5c2707c Reviewed-on: https://go-review.googlesource.com/21591 Run-TryBot: Dave Cheney <dave@cheney.net> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Minux Ma <minux@golang.org>	2016-04-06 06:47:31 +00:00
Brad Fitzpatrick	8455f3a3d5	os: consolidate os{1,2}_*.go files Change-Id: I463ca59f486b2842f67f151a55f530ee10663830 Reviewed-on: https://go-review.googlesource.com/21568 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Dave Cheney <dave@cheney.net> Reviewed-by: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-06 05:04:47 +00:00
Brad Fitzpatrick	fd2bb1e30a	runtime: rename os1_windows.go to os_windows.go Change-Id: I11172f3d0e28f17b812e67a4db9cfe513b8e1974 Reviewed-on: https://go-review.googlesource.com/21565 Reviewed-by: Minux Ma <minux@golang.org>	2016-04-06 04:26:01 +00:00
Brad Fitzpatrick	e095f53e9b	runtime: merge os{,2}_windows.go into os1_windows.go. A future CL will rename os1_windows.go to os_windows.go. Change-Id: I223e76002dd1e9c9d1798fb0beac02c7d3bf4812 Reviewed-on: https://go-review.googlesource.com/21564 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Minux Ma <minux@golang.org>	2016-04-06 04:25:44 +00:00
Michael Munday	0f08dd2183	runtime: add s390x support (modified files only) Change-Id: Ib79ad4a890994ad64edb1feb79bd242d26b5b08a Reviewed-on: https://go-review.googlesource.com/20945 Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-06 04:25:06 +00:00
Shenghou Ma	a2eded3421	runtime: get randomness from AT_RANDOM AUXV on linux/arm64 Fixes #15147. Change-Id: Ibfe46c747dea987787a51eb0c95ccd8c5f24f366 Reviewed-on: https://go-review.googlesource.com/21580 Run-TryBot: Minux Ma <minux@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-06 04:23:37 +00:00
Brad Fitzpatrick	34c58065e5	runtime: rename os1_linux.go to os_linux.go Change-Id: I938f61763c3256a876d62aeb54ef8c25cc4fc90e Reviewed-on: https://go-review.googlesource.com/21567 Reviewed-by: Minux Ma <minux@golang.org>	2016-04-06 03:55:38 +00:00
Brad Fitzpatrick	5103fbfdb2	runtime: merge os_linux.go into os1_linux.go Change-Id: I791c47014fe69e8529c7b2f0b9a554e47902d46c Reviewed-on: https://go-review.googlesource.com/21566 Reviewed-by: Minux Ma <minux@golang.org>	2016-04-06 03:54:47 +00:00
Brad Fitzpatrick	8556c76f88	runtime: minor Windows cleanup Change-Id: I9a8081ef1109469e9577c642156aa635188d8954 Reviewed-on: https://go-review.googlesource.com/21538 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>	2016-04-06 02:23:29 +00:00
David Chase	d8c815d8b5	cmd/compile: note escape of parts of closured-capture vars Missed a case for closure calls (OCALLFUNC && indirect) in esc.go:esccall. Cleanup to runtime code for windows to more thoroughly hide a technical escape. Also made code pickier about failing to late non-optional kernel32.dll. Fixes #14409. Change-Id: Ie75486a2c8626c4583224e02e4872c2875f7bca5 Reviewed-on: https://go-review.googlesource.com/20102 Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-05 18:10:09 +00:00
Dmitry Vyukov	475d113b53	runtime: don't burn CPU unnecessarily Two GC-related functions, scang and casgstatus, wait in an active spin loop. Active spinning is never a good idea in user-space. Once we wait several times more than the expected wait time, something unexpected is happenning (e.g. the thread we are waiting for is descheduled or handling a page fault) and we need to yield to OS scheduler. Moreover, the expected wait time is very high for these functions: scang wait time can be tens of milliseconds, casgstatus can be hundreds of microseconds. It does not make sense to spin even for that time. go install -a std profile on a 4-core machine shows that 11% of time is spent in the active spin in scang: 6.12% compile compile [.] runtime.scang 3.27% compile compile [.] runtime.readgstatus 1.72% compile compile [.] runtime/internal/atomic.Load The active spin also increases tail latency in the case of the slightest oversubscription: GC goroutines spend whole quantum in the loop instead of executing user code. Here is scang wait time histogram during go install -a std: 13707.0000 - 1815442.7667 [ 118]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎... 1815442.7667 - 3617178.5333 [ 9]: ∎∎∎∎∎∎∎∎∎ 3617178.5333 - 5418914.3000 [ 11]: ∎∎∎∎∎∎∎∎∎∎∎ 5418914.3000 - 7220650.0667 [ 5]: ∎∎∎∎∎ 7220650.0667 - 9022385.8333 [ 12]: ∎∎∎∎∎∎∎∎∎∎∎∎ 9022385.8333 - 10824121.6000 [ 13]: ∎∎∎∎∎∎∎∎∎∎∎∎∎ 10824121.6000 - 12625857.3667 [ 15]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 12625857.3667 - 14427593.1333 [ 18]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 14427593.1333 - 16229328.9000 [ 18]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 16229328.9000 - 18031064.6667 [ 32]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 18031064.6667 - 19832800.4333 [ 28]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 19832800.4333 - 21634536.2000 [ 6]: ∎∎∎∎∎∎ 21634536.2000 - 23436271.9667 [ 15]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 23436271.9667 - 25238007.7333 [ 11]: ∎∎∎∎∎∎∎∎∎∎∎ 25238007.7333 - 27039743.5000 [ 27]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 27039743.5000 - 28841479.2667 [ 20]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 28841479.2667 - 30643215.0333 [ 10]: ∎∎∎∎∎∎∎∎∎∎ 30643215.0333 - 32444950.8000 [ 7]: ∎∎∎∎∎∎∎ 32444950.8000 - 34246686.5667 [ 4]: ∎∎∎∎ 34246686.5667 - 36048422.3333 [ 4]: ∎∎∎∎ 36048422.3333 - 37850158.1000 [ 1]: ∎ 37850158.1000 - 39651893.8667 [ 5]: ∎∎∎∎∎ 39651893.8667 - 41453629.6333 [ 2]: ∎∎ 41453629.6333 - 43255365.4000 [ 2]: ∎∎ 43255365.4000 - 45057101.1667 [ 2]: ∎∎ 45057101.1667 - 46858836.9333 [ 1]: ∎ 46858836.9333 - 48660572.7000 [ 2]: ∎∎ 48660572.7000 - 50462308.4667 [ 3]: ∎∎∎ 50462308.4667 - 52264044.2333 [ 2]: ∎∎ 52264044.2333 - 54065780.0000 [ 2]: ∎∎ and the zoomed-in first part: 13707.0000 - 19916.7667 [ 2]: ∎∎ 19916.7667 - 26126.5333 [ 2]: ∎∎ 26126.5333 - 32336.3000 [ 9]: ∎∎∎∎∎∎∎∎∎ 32336.3000 - 38546.0667 [ 8]: ∎∎∎∎∎∎∎∎ 38546.0667 - 44755.8333 [ 12]: ∎∎∎∎∎∎∎∎∎∎∎∎ 44755.8333 - 50965.6000 [ 10]: ∎∎∎∎∎∎∎∎∎∎ 50965.6000 - 57175.3667 [ 5]: ∎∎∎∎∎ 57175.3667 - 63385.1333 [ 6]: ∎∎∎∎∎∎ 63385.1333 - 69594.9000 [ 5]: ∎∎∎∎∎ 69594.9000 - 75804.6667 [ 6]: ∎∎∎∎∎∎ 75804.6667 - 82014.4333 [ 6]: ∎∎∎∎∎∎ 82014.4333 - 88224.2000 [ 4]: ∎∎∎∎ 88224.2000 - 94433.9667 [ 1]: ∎ 94433.9667 - 100643.7333 [ 1]: ∎ 100643.7333 - 106853.5000 [ 2]: ∎∎ 106853.5000 - 113063.2667 [ 0]: 113063.2667 - 119273.0333 [ 2]: ∎∎ 119273.0333 - 125482.8000 [ 2]: ∎∎ 125482.8000 - 131692.5667 [ 1]: ∎ 131692.5667 - 137902.3333 [ 1]: ∎ 137902.3333 - 144112.1000 [ 0]: 144112.1000 - 150321.8667 [ 2]: ∎∎ 150321.8667 - 156531.6333 [ 1]: ∎ 156531.6333 - 162741.4000 [ 1]: ∎ 162741.4000 - 168951.1667 [ 0]: 168951.1667 - 175160.9333 [ 0]: 175160.9333 - 181370.7000 [ 1]: ∎ 181370.7000 - 187580.4667 [ 1]: ∎ 187580.4667 - 193790.2333 [ 2]: ∎∎ 193790.2333 - 200000.0000 [ 0]: Here is casgstatus wait time histogram: 631.0000 - 5276.6333 [ 3]: ∎∎∎ 5276.6333 - 9922.2667 [ 5]: ∎∎∎∎∎ 9922.2667 - 14567.9000 [ 2]: ∎∎ 14567.9000 - 19213.5333 [ 6]: ∎∎∎∎∎∎ 19213.5333 - 23859.1667 [ 5]: ∎∎∎∎∎ 23859.1667 - 28504.8000 [ 6]: ∎∎∎∎∎∎ 28504.8000 - 33150.4333 [ 6]: ∎∎∎∎∎∎ 33150.4333 - 37796.0667 [ 2]: ∎∎ 37796.0667 - 42441.7000 [ 1]: ∎ 42441.7000 - 47087.3333 [ 3]: ∎∎∎ 47087.3333 - 51732.9667 [ 0]: 51732.9667 - 56378.6000 [ 1]: ∎ 56378.6000 - 61024.2333 [ 0]: 61024.2333 - 65669.8667 [ 0]: 65669.8667 - 70315.5000 [ 0]: 70315.5000 - 74961.1333 [ 1]: ∎ 74961.1333 - 79606.7667 [ 0]: 79606.7667 - 84252.4000 [ 0]: 84252.4000 - 88898.0333 [ 0]: 88898.0333 - 93543.6667 [ 0]: 93543.6667 - 98189.3000 [ 0]: 98189.3000 - 102834.9333 [ 0]: 102834.9333 - 107480.5667 [ 1]: ∎ 107480.5667 - 112126.2000 [ 0]: 112126.2000 - 116771.8333 [ 0]: 116771.8333 - 121417.4667 [ 0]: 121417.4667 - 126063.1000 [ 0]: 126063.1000 - 130708.7333 [ 0]: 130708.7333 - 135354.3667 [ 0]: 135354.3667 - 140000.0000 [ 1]: ∎ Ideally we eliminate the waiting by switching to async state machine for GC, but for now just yield to OS scheduler after a reasonable wait time. To choose yielding parameters I've measured golang.org/x/benchmarks/http tail latencies with different yield delays and oversubscription levels. With no oversubscription (to the degree possible): scang yield delay = 1, casgstatus yield delay = 1 Latency-50 1.41ms ±15% 1.41ms ± 5% ~ (p=0.611 n=13+12) Latency-95 5.21ms ± 2% 5.15ms ± 2% -1.15% (p=0.012 n=13+13) Latency-99 7.16ms ± 2% 7.05ms ± 2% -1.54% (p=0.002 n=13+13) Latency-999 10.7ms ± 9% 10.2ms ±10% -5.46% (p=0.004 n=12+13) scang yield delay = 5000, casgstatus yield delay = 3000 Latency-50 1.41ms ±15% 1.41ms ± 8% ~ (p=0.511 n=13+13) Latency-95 5.21ms ± 2% 5.14ms ± 2% -1.23% (p=0.006 n=13+13) Latency-99 7.16ms ± 2% 7.02ms ± 2% -1.94% (p=0.000 n=13+13) Latency-999 10.7ms ± 9% 10.1ms ± 8% -6.14% (p=0.000 n=12+13) scang yield delay = 10000, casgstatus yield delay = 5000 Latency-50 1.41ms ±15% 1.45ms ± 6% ~ (p=0.724 n=13+13) Latency-95 5.21ms ± 2% 5.18ms ± 1% ~ (p=0.287 n=13+13) Latency-99 7.16ms ± 2% 7.05ms ± 2% -1.64% (p=0.002 n=13+13) Latency-999 10.7ms ± 9% 10.0ms ± 5% -6.72% (p=0.000 n=12+13) scang yield delay = 30000, casgstatus yield delay = 10000 Latency-50 1.41ms ±15% 1.51ms ± 7% +6.57% (p=0.002 n=13+13) Latency-95 5.21ms ± 2% 5.21ms ± 2% ~ (p=0.960 n=13+13) Latency-99 7.16ms ± 2% 7.06ms ± 2% -1.50% (p=0.012 n=13+13) Latency-999 10.7ms ± 9% 10.0ms ± 6% -6.49% (p=0.000 n=12+13) scang yield delay = 100000, casgstatus yield delay = 50000 Latency-50 1.41ms ±15% 1.53ms ± 6% +8.48% (p=0.000 n=13+12) Latency-95 5.21ms ± 2% 5.23ms ± 2% ~ (p=0.287 n=13+13) Latency-99 7.16ms ± 2% 7.08ms ± 2% -1.21% (p=0.004 n=13+13) Latency-999 10.7ms ± 9% 9.9ms ± 3% -7.99% (p=0.000 n=12+12) scang yield delay = 200000, casgstatus yield delay = 100000 Latency-50 1.41ms ±15% 1.47ms ± 5% ~ (p=0.072 n=13+13) Latency-95 5.21ms ± 2% 5.17ms ± 2% ~ (p=0.091 n=13+13) Latency-99 7.16ms ± 2% 7.02ms ± 2% -1.99% (p=0.000 n=13+13) Latency-999 10.7ms ± 9% 9.9ms ± 5% -7.86% (p=0.000 n=12+13) With slight oversubscription (another instance of http benchmark was running in background with reduced GOMAXPROCS): scang yield delay = 1, casgstatus yield delay = 1 Latency-50 840µs ± 3% 804µs ± 3% -4.37% (p=0.000 n=15+18) Latency-95 6.52ms ± 4% 6.03ms ± 4% -7.51% (p=0.000 n=18+18) Latency-99 10.8ms ± 7% 10.0ms ± 4% -7.33% (p=0.000 n=18+14) Latency-999 18.0ms ± 9% 16.8ms ± 7% -6.84% (p=0.000 n=18+18) scang yield delay = 5000, casgstatus yield delay = 3000 Latency-50 840µs ± 3% 809µs ± 3% -3.71% (p=0.000 n=15+17) Latency-95 6.52ms ± 4% 6.11ms ± 4% -6.29% (p=0.000 n=18+18) Latency-99 10.8ms ± 7% 9.9ms ± 6% -7.55% (p=0.000 n=18+18) Latency-999 18.0ms ± 9% 16.5ms ±11% -8.49% (p=0.000 n=18+18) scang yield delay = 10000, casgstatus yield delay = 5000 Latency-50 840µs ± 3% 823µs ± 5% -2.06% (p=0.002 n=15+18) Latency-95 6.52ms ± 4% 6.32ms ± 3% -3.05% (p=0.000 n=18+18) Latency-99 10.8ms ± 7% 10.2ms ± 4% -5.22% (p=0.000 n=18+18) Latency-999 18.0ms ± 9% 16.7ms ±10% -7.09% (p=0.000 n=18+18) scang yield delay = 30000, casgstatus yield delay = 10000 Latency-50 840µs ± 3% 836µs ± 5% ~ (p=0.442 n=15+18) Latency-95 6.52ms ± 4% 6.39ms ± 3% -2.00% (p=0.000 n=18+18) Latency-99 10.8ms ± 7% 10.2ms ± 6% -5.15% (p=0.000 n=18+17) Latency-999 18.0ms ± 9% 16.6ms ± 8% -7.48% (p=0.000 n=18+18) scang yield delay = 100000, casgstatus yield delay = 50000 Latency-50 840µs ± 3% 836µs ± 6% ~ (p=0.401 n=15+18) Latency-95 6.52ms ± 4% 6.40ms ± 4% -1.79% (p=0.010 n=18+18) Latency-99 10.8ms ± 7% 10.2ms ± 5% -4.95% (p=0.000 n=18+18) Latency-999 18.0ms ± 9% 16.5ms ±14% -8.17% (p=0.000 n=18+18) scang yield delay = 200000, casgstatus yield delay = 100000 Latency-50 840µs ± 3% 828µs ± 2% -1.49% (p=0.001 n=15+17) Latency-95 6.52ms ± 4% 6.38ms ± 4% -2.04% (p=0.001 n=18+18) Latency-99 10.8ms ± 7% 10.2ms ± 4% -4.77% (p=0.000 n=18+18) Latency-999 18.0ms ± 9% 16.9ms ± 9% -6.23% (p=0.000 n=18+18) With significant oversubscription (background http benchmark was running with full GOMAXPROCS): scang yield delay = 1, casgstatus yield delay = 1 Latency-50 1.32ms ±12% 1.30ms ±13% ~ (p=0.454 n=14+14) Latency-95 16.3ms ±10% 15.3ms ± 7% -6.29% (p=0.001 n=14+14) Latency-99 29.4ms ±10% 27.9ms ± 5% -5.04% (p=0.001 n=14+12) Latency-999 49.9ms ±19% 45.9ms ± 5% -8.00% (p=0.008 n=14+13) scang yield delay = 5000, casgstatus yield delay = 3000 Latency-50 1.32ms ±12% 1.29ms ± 9% ~ (p=0.227 n=14+14) Latency-95 16.3ms ±10% 15.4ms ± 5% -5.27% (p=0.002 n=14+14) Latency-99 29.4ms ±10% 27.9ms ± 6% -5.16% (p=0.001 n=14+14) Latency-999 49.9ms ±19% 46.8ms ± 8% -6.21% (p=0.050 n=14+14) scang yield delay = 10000, casgstatus yield delay = 5000 Latency-50 1.32ms ±12% 1.35ms ± 9% ~ (p=0.401 n=14+14) Latency-95 16.3ms ±10% 15.0ms ± 4% -7.67% (p=0.000 n=14+14) Latency-99 29.4ms ±10% 27.4ms ± 5% -6.98% (p=0.000 n=14+14) Latency-999 49.9ms ±19% 44.7ms ± 5% -10.56% (p=0.000 n=14+11) scang yield delay = 30000, casgstatus yield delay = 10000 Latency-50 1.32ms ±12% 1.36ms ±10% ~ (p=0.246 n=14+14) Latency-95 16.3ms ±10% 14.9ms ± 5% -8.31% (p=0.000 n=14+14) Latency-99 29.4ms ±10% 27.4ms ± 7% -6.70% (p=0.000 n=14+14) Latency-999 49.9ms ±19% 44.9ms ±15% -10.13% (p=0.003 n=14+14) scang yield delay = 100000, casgstatus yield delay = 50000 Latency-50 1.32ms ±12% 1.41ms ± 9% +6.37% (p=0.008 n=14+13) Latency-95 16.3ms ±10% 15.1ms ± 8% -7.45% (p=0.000 n=14+14) Latency-99 29.4ms ±10% 27.5ms ±12% -6.67% (p=0.002 n=14+14) Latency-999 49.9ms ±19% 45.9ms ±16% -8.06% (p=0.019 n=14+14) scang yield delay = 200000, casgstatus yield delay = 100000 Latency-50 1.32ms ±12% 1.42ms ±10% +7.21% (p=0.003 n=14+14) Latency-95 16.3ms ±10% 15.0ms ± 7% -7.59% (p=0.000 n=14+14) Latency-99 29.4ms ±10% 27.3ms ± 8% -7.20% (p=0.000 n=14+14) Latency-999 49.9ms ±19% 44.8ms ± 8% -10.21% (p=0.001 n=14+13) All numbers are on 8 cores and with GOGC=10 (http benchmark has tiny heap, few goroutines and low allocation rate, so by default GC barely affects tail latency). 10us/5us yield delays seem to provide a reasonable compromise and give 5-10% tail latency reduction. That's what used in this change. go install -a std results on 4 core machine: name old time/op new time/op delta Time 8.39s ± 2% 7.94s ± 2% -5.34% (p=0.000 n=47+49) UserTime 24.6s ± 2% 22.9s ± 2% -6.76% (p=0.000 n=49+49) SysTime 1.77s ± 9% 1.89s ±11% +7.00% (p=0.000 n=49+49) CpuLoad 315ns ± 2% 313ns ± 1% -0.59% (p=0.000 n=49+48) # %CPU MaxRSS 97.1ms ± 4% 97.5ms ± 9% ~ (p=0.838 n=46+49) # bytes Update #14396 Update #14189 Change-Id: I3f4109bf8f7fd79b39c466576690a778232055a2 Reviewed-on: https://go-review.googlesource.com/21503 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-04-05 15:52:03 +00:00
Dmitry Vyukov	3b246fa863	runtime: sleep less when we can do work Usleep(100) in runqgrab negatively affects latency and throughput of parallel application. We are sleeping instead of doing useful work. This is effect is particularly visible on windows where minimal sleep duration is 1-15ms. Reduce sleep from 100us to 3us and use osyield on windows. Sync chan send/recv takes ~50ns, so 3us gives us ~50x overshoot. benchmark old ns/op new ns/op delta BenchmarkChanSync-12 216 217 +0.46% BenchmarkChanSyncWork-12 27213 25816 -5.13% CPU consumption goes up from 106% to 108% in the first case, and from 107% to 125% in the second case. Test case from #14790 on windows: BenchmarkDefaultResolution-8 4583372 29720 -99.35% Benchmark1ms-8 992056 30701 -96.91% 99-th latency percentile for HTTP request serving is improved by up to 15% (see http://golang.org/cl/20835 for details). The following benchmarks are from the change that originally added this sleep (see https://golang.org/s/go15gomaxprocs): name old time/op new time/op delta Chain 22.6µs ± 2% 22.7µs ± 6% ~ (p=0.905 n=9+10) ChainBuf 22.4µs ± 3% 22.5µs ± 4% ~ (p=0.780 n=9+10) Chain-2 23.5µs ± 4% 24.9µs ± 1% +5.66% (p=0.000 n=10+9) ChainBuf-2 23.7µs ± 1% 24.4µs ± 1% +3.31% (p=0.000 n=9+10) Chain-4 24.2µs ± 2% 25.1µs ± 3% +3.70% (p=0.000 n=9+10) ChainBuf-4 24.4µs ± 5% 25.0µs ± 2% +2.37% (p=0.023 n=10+10) Powser 2.37s ± 1% 2.37s ± 1% ~ (p=0.423 n=8+9) Powser-2 2.48s ± 2% 2.57s ± 2% +3.74% (p=0.000 n=10+9) Powser-4 2.66s ± 1% 2.75s ± 1% +3.40% (p=0.000 n=10+10) Sieve 13.3s ± 2% 13.3s ± 2% ~ (p=1.000 n=10+9) Sieve-2 7.00s ± 2% 7.44s ±16% ~ (p=0.408 n=8+10) Sieve-4 4.13s ±21% 3.85s ±22% ~ (p=0.113 n=9+9) Fixes #14790 Change-Id: Ie7c6a1c4f9c8eb2f5d65ab127a3845386d6f8b5d Reviewed-on: https://go-review.googlesource.com/20835 Reviewed-by: Austin Clements <austin@google.com>	2016-04-05 15:32:06 +00:00
Alex Brainman	ffeae198d0	runtime: leave directory before removing it in TestDLLPreloadMitigation Fixes #15120 Change-Id: I1d9a192ac163826bad8b46e8c0b0b9e218e69570 Reviewed-on: https://go-review.googlesource.com/21520 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-05 04:32:49 +00:00
Alex Brainman	fcac88098b	runtime: remove race out of BenchmarkChanToSyscallPing1ms Fixes #15119 Change-Id: I31445bf282a5e2a160ff4e66c5a592b989a5798f Reviewed-on: https://go-review.googlesource.com/21448 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-05 00:52:40 +00:00
Austin Clements	61f56e925e	runtime: fix pagesInUse accounting When we grow the heap, we create a temporary "in use" span for the memory acquired from the OS and then free that span to link it into the heap. Hence, we (1) increase pagesInUse when we make the temporary span so that (2) freeing the span will correctly decrease it. However, currently step (1) increases pagesInUse by the number of pages requested from the heap, while step (2) decreases it by the number of pages requested from the OS (the size of the temporary span). These aren't necessarily the same, since we round up the number of pages we request from the OS, so steps 1 and 2 don't necessarily cancel out like they're supposed to. Over time, this can add up and cause pagesInUse to underflow and wrap around to 2^64. The garbage collector computes the sweep ratio from this, so if this happens, the sweep ratio becomes effectively infinite, causing the first allocation on each P in a sweep cycle to sweep the entire heap. This makes sweeping effectively STW. Fix this by increasing pagesInUse in step 1 by the number of pages requested from the OS, so that the two steps correctly cancel out. We add a test that checks that the running total matches the actual state of the heap. Fixes #15022. For 1.6.x. Change-Id: Iefd9d6abe37d0d447cbdbdf9941662e4f18eeffc Reviewed-on: https://go-review.googlesource.com/21280 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2016-04-04 15:33:26 +00:00
Alex Brainman	1f5b1b2b66	runtime: change osyield to use Windows SwitchToThread It appears that windows osyield is just 15ms sleep on my computer (see benchmarks below). Replace NtWaitForSingleObject in osyield with SwitchToThread (as suggested by Dmitry). Also add issue #14790 related benchmarks, so we can track perfomance changes in CL 20834 and CL 20835 and beyond. Update #14790 benchmark old ns/op new ns/op delta BenchmarkChanToSyscallPing1ms 1953200 1953000 -0.01% BenchmarkChanToSyscallPing15ms 31562904 31248400 -1.00% BenchmarkSyscallToSyscallPing1ms 5247 4202 -19.92% BenchmarkSyscallToSyscallPing15ms 5260 4374 -16.84% BenchmarkChanToChanPing1ms 474 494 +4.22% BenchmarkChanToChanPing15ms 468 489 +4.49% BenchmarkOsYield1ms 980018 75.5 -99.99% BenchmarkOsYield15ms 15625200 75.8 -100.00% Change-Id: I1b4cc7caca784e2548ee3c846ca07ef152ebedce Reviewed-on: https://go-review.googlesource.com/21294 Run-TryBot: Alex Brainman <alex.brainman@gmail.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-04 10:05:05 +00:00
Christopher Nelson	ed8f0e5c33	cmd/go: fix -buildmode=c-archive should work on windows Add supporting code for runtime initialization, including both 32- and 64-bit x86 architectures. Add .ctors section on Windows to PE .o files, and INITENTRY to .ctors section to plug in to the GCC C/C++ startup initialization mechanism. This allows the Go runtime to initialize itself. Add .text section symbol for .ctor relocations. Note: This is unlikely to be useful for MSVC-based toolchains. Fixes #13494 Change-Id: I4286a96f70e5f5228acae88eef46e2bed95813f3 Reviewed-on: https://go-review.googlesource.com/18057 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>	2016-04-04 03:38:25 +00:00
Eric Engestrom	7a8caf7d43	all: fix spelling mistakes Signed-off-by: Eric Engestrom <eric@engestrom.ch> Change-Id: I91873aaebf79bdf1c00d38aacc1a1fb8d79656a7 Reviewed-on: https://go-review.googlesource.com/21433 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-03 17:03:15 +00:00
Brad Fitzpatrick	683448a304	runtime, syscall: only search for Windows DLLs in the System32 directory Make sure that for any DLL that Go uses itself, we only look for the DLL in the Windows System32 directory, guarding against DLL preloading attacks. (Unless the Windows version is ancient and LoadLibraryEx is unavailable, in which case the user probably has bigger security problems anyway.) This does not change the behavior of syscall.LoadLibrary or NewLazyDLL if the DLL name is something unused by Go itself. This change also intentionally does not add any new API surface. Instead, x/sys is updated with a LoadLibraryEx function and LazyDLL.Flags in: https://golang.org/cl/21388 Updates #14959 Change-Id: I8d29200559cc19edf8dcf41dbdd39a389cd6aeb9 Reviewed-on: https://go-review.googlesource.com/21140 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-01 22:55:36 +00:00
Ian Lance Taylor	59fc42b230	runtime: allocate mp.cgocallers earlier Fixes #15061. Change-Id: I71f69f398d1c5f3a884bbd044786f1a5600d0fae Reviewed-on: https://go-review.googlesource.com/21398 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-01 22:23:13 +00:00
Ian Lance Taylor	58394fd7d5	runtime/cgo: only build _cgo_callers if x_cgo_callers is defined Fixes a problem when using the external linker on Solaris. The Solaris external linker still doesn't work due to issue #14957. The problem is, for example, with `go test cmd/objdump`: objdump_test.go:71: go build fmthello.go: exit status 2 # command-line-arguments /var/gcc/iant/go/pkg/tool/solaris_amd64/link: running gcc failed: exit status 1 Undefined first referenced symbol in file x_cgo_callers /tmp/go-link-355600608/go.o ld: fatal: symbol referencing errors collect2: error: ld returned 1 exit status Change-Id: I54917cfd5c288ee77ea25c439489bd2c9124fe73 Reviewed-on: https://go-review.googlesource.com/21392 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org>	2016-04-01 16:13:45 +00:00
Ian Lance Taylor	ea306ae625	runtime: support symbolic backtrace of C code in a cgo crash The new function runtime.SetCgoTraceback may be used to register stack traceback and symbolizer functions, written in C, to do a stack traceback from cgo code. There is a sample implementation of runtime.SetCgoSymbolizer at github.com/ianlancetaylor/cgosymbolizer. Just importing that package is sufficient to get symbolic C backtraces. Currently only supported on linux/amd64. Change-Id: If96ee2eb41c6c7379d407b9561b87557bfe47341 Reviewed-on: https://go-review.googlesource.com/17761 Reviewed-by: Austin Clements <austin@google.com>	2016-04-01 04:13:44 +00:00
Matthew Dempsky	e6066711a0	cmd/compile, runtime: fix pedantic int->string conversions Previously, cmd/compile rejected constant int->string conversions if the integer value did not fit into an "int" value. Also, runtime incorrectly truncated 64-bit values to 32-bit before checking if they're a valid Unicode code point. According to the Go spec, both of these cases should instead yield "\uFFFD". Fixes #15039. Change-Id: I3c8a3ad9a0780c0a8dc1911386a523800fec9764 Reviewed-on: https://go-review.googlesource.com/21344 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-03-31 10:28:23 +00:00
Keith Randall	4b209dbf0b	runtime: don't use REP;MOVSB if CPUID doesn't say it is fast Only use REP;MOVSB if: 1) The CPUID flag says it is fast, and 2) The pointers are unaligned Otherwise, use REP;MOVSQ. Update #14630 Change-Id: I946b28b87880c08e5eed1ce2945016466c89db66 Reviewed-on: https://go-review.googlesource.com/21300 Reviewed-by: Nigel Tao <nigeltao@golang.org>	2016-03-31 02:54:10 +00:00
Austin Clements	17f6e5396b	runtime: print sweep ratio if gcpacertrace>0 Change-Id: I5217bf4b75e110ca2946e1abecac6310ed84dad5 Reviewed-on: https://go-review.googlesource.com/21205 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-03-30 02:27:58 +00:00
Michel Lespinasse	859b63cc09	cmd/compile: optimize remaining convT2I calls See #14874 Updates #6853 This change adds a compiler optimization for non pointer shaped convT2I. Since itab symbols are now emitted by the compiler, the itab address can be passed directly to convT2I instead of passing the iface type and a cache pointer argument. Compilebench results for the 5-commits series ending here: name old time/op new time/op delta Template 336ms ± 4% 344ms ± 4% +2.61% (p=0.027 n=9+8) Unicode 165ms ± 6% 173ms ± 7% +5.11% (p=0.014 n=9+9) GoTypes 1.09s ± 1% 1.06s ± 2% -3.29% (p=0.000 n=9+9) Compiler 5.09s ±10% 4.75s ±10% -6.64% (p=0.011 n=10+10) MakeBash 31.1s ± 5% 30.3s ± 3% ~ (p=0.089 n=10+10) name old text-bytes new text-bytes delta HelloSize 558k ± 0% 558k ± 0% +0.02% (p=0.000 n=10+10) CmdGoSize 6.24M ± 0% 6.11M ± 0% -2.11% (p=0.000 n=10+10) name old data-bytes new data-bytes delta HelloSize 3.66k ± 0% 3.74k ± 0% +2.41% (p=0.000 n=10+10) CmdGoSize 134k ± 0% 162k ± 0% +20.76% (p=0.000 n=10+10) name old bss-bytes new bss-bytes delta HelloSize 126k ± 0% 126k ± 0% ~ (all samples are equal) CmdGoSize 149k ± 0% 146k ± 0% -2.17% (p=0.000 n=10+10) name old exe-bytes new exe-bytes delta HelloSize 924k ± 0% 924k ± 0% +0.05% (p=0.000 n=10+10) CmdGoSize 9.77M ± 0% 9.62M ± 0% -1.47% (p=0.000 n=10+10) Change-Id: Ib230ddc04988824035c32287ae544a965fedd344 Reviewed-on: https://go-review.googlesource.com/20902 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org> Run-TryBot: Michel Lespinasse <walken@google.com>	2016-03-29 02:21:50 +00:00
Michel Lespinasse	79688ca58f	cmd/link: collect itablinks as a slice in moduledata See #14874 This change tells the linker to collect all the itablink symbols and collect them so that moduledata can have a slice of all compiler generated itabs. The logic is shamelessly adapted from what is done with typelink symbols. Change-Id: Ie93b59acf0fcba908a876d506afbf796f222dbac Reviewed-on: https://go-review.googlesource.com/20889 Reviewed-by: Keith Randall <khr@golang.org>	2016-03-29 02:18:56 +00:00
Michel Lespinasse	f00bbd5f81	cmd/compile: emit itabs and itablinks See #14874 This change tells the compiler to emit itab and itablink symbols in situations where they could be useful; however the compiled code does not actually make use of the new symbols yet. Change-Id: I0db3e6ec0cb1f3b7cebd4c60229e4a48372fe586 Reviewed-on: https://go-review.googlesource.com/20888 Reviewed-by: David Crawshaw <crawshaw@golang.org> Run-TryBot: Michel Lespinasse <walken@google.com>	2016-03-29 02:18:03 +00:00
Michel Lespinasse	7043d2bb5e	runtime: insert itabs into hash table during init See #14874 This change makes the runtime register all compiler generated itabs (as obtained from the moduledata) during init. Change-Id: I9969a0985b99b8bda820a631f7fe4c78f1174cdf Reviewed-on: https://go-review.googlesource.com/20900 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Michel Lespinasse <walken@google.com>	2016-03-29 02:14:49 +00:00
Matthew Dempsky	deb83d0639	cmd/compile: remove unused write barrier helpers These have been unused since CL 10316. Passes toolstash -cmp. Change-Id: Icc19f3fcc7275fbee1c665f704e10a110ecce2a5 Reviewed-on: https://go-review.googlesource.com/21242 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-03-29 00:16:55 +00:00
Shinji Tanaka	0f86d1edfb	runtime: use set_thread_area instead of modify_ldt on linux/386 linux/386 depends on modify_ldt system call, but recent Linux kernels can disable this system call. Any Go programs built as linux/386 crash with the message 'Trace/breakpoint trap'. The kernel config CONFIG_MODIFY_LDT_SYSCALL, which control enable/disable modify_ldt, is disabled on Amazon Linux 2016.03. This fixes this problem by using set_thread_area instead of modify_ldt on linux/386. Fixes #14795. Change-Id: I0cc5139e40e9e5591945164156a77b6bdff2c7f1 Reviewed-on: https://go-review.googlesource.com/21190 Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Minux Ma <minux@golang.org>	2016-03-28 16:56:38 +00:00
David Chase	8eec2bbfbc	cmd/compile: added some intrinsics to SSA back end One intrinsic was needed to help get the very best performance out of a future GC; as long as that one was being added, I also added Bswap since that is sometimes a handy thing to have. I had intended to fill out the bit-scan intrinsic family, but the mismatch between the "scan forward" instruction and "count leading zeroes" was large enough to cause me to leave it out -- it poses a dilemma that I'd rather dodge right now. These intrinsics are not exposed for general use. That's a separate issue requiring an API proposal change ( https://github.com/golang/proposal ) All intrinsics are tested, both that they are substituted on the appropriate architecture, and that they produce the expected result. Change-Id: I5848037cfd97de4f75bdc33bdd89bba00af4a8ee Reviewed-on: https://go-review.googlesource.com/20564 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-28 16:29:59 +00:00
Matthew Dempsky	995fb0319e	cmd/compile: fix stringtoslicebytetmp optimization Fixes #14973. Change-Id: Iea68c9deca9429bde465c9ae05639209fe0ccf72 Reviewed-on: https://go-review.googlesource.com/21175 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-03-27 19:12:37 +00:00
Shenghou Ma	a7f7a9cca7	runtime, runtime/cgo: save callee-saved FP registers on arm64 For #14876. Change-Id: I0992859264cbaf9c9b691fad53345bbb01b4cf3b Reviewed-on: https://go-review.googlesource.com/21085 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-25 23:04:44 +00:00
Shenghou Ma	71916437fb	runtime/cgo: save callee-saved xmm registers on windows/amd64 For #14876. Change-Id: I33947f74e8058437a784862f1f064974afc99250 Reviewed-on: https://go-review.googlesource.com/21084 Reviewed-by: Alex Brainman <alex.brainman@gmail.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-03-25 23:04:33 +00:00
Joe Sylve	562e38c0af	runtime: fix signal handling on Solaris This fixes the problems with signal handling that were inadvertently introduced in https://go-review.googlesource.com/21006. Fixes #14899 Change-Id: Ia746914dcb3146a52413d32c57b089af763f0810 Reviewed-on: https://go-review.googlesource.com/21145 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-03-25 21:38:47 +00:00
Marvin Stenger	6b0688f742	runtime: speed up growslice by avoiding divisions 2 This is a follow-up of https://go-review.googlesource.com/#/c/20653/ Special case computation for slices with elements of byte size or pointer size. name old time/op new time/op delta GrowSliceBytes-4 86.2ns ± 3% 75.4ns ± 2% -12.50% (p=0.000 n=20+20) GrowSliceInts-4 161ns ± 3% 136ns ± 3% -15.59% (p=0.000 n=19+19) GrowSlicePtr-4 239ns ± 2% 233ns ± 2% -2.52% (p=0.000 n=20+20) GrowSliceStruct24Bytes-4 258ns ± 3% 256ns ± 3% ~ (p=0.134 n=20+20) Change-Id: Ice5fa648058fe9d7fa89dee97ca359966f671128 Reviewed-on: https://go-review.googlesource.com/21101 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-25 16:47:56 +00:00
Richard Miller	967b9940b4	runtime: avoid fork/exit race in plan9 There's a race between runtime.goexitsall killing all OS processes of a go program in order to exit, and runtime.newosproc forking a new one. If the new process has been created but not yet stored its pid in m.procid, it will not be killed by goexitsall and deadlock results. This CL prevents the race by making the newly forked process check whether the program is exiting. It also prevents a potential "shoot-out" if multiple goroutines call Exit at the same time, which could possibly lead to two processes killing each other and leaving the rest deadlocked. Change-Id: I3170b4a62d2461f6b029b3d6aad70373714ed53e Reviewed-on: https://go-review.googlesource.com/21135 Run-TryBot: David du Colombier <0intro@gmail.com> Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David du Colombier <0intro@gmail.com>	2016-03-25 15:04:45 +00:00
Dmitry Vyukov	ea0386f85f	runtime: improve randomized stealing logic During random stealing we steal 4*GOMAXPROCS times from random procs. One would expect that most of the time we check all procs this way, but due to low quality PRNG we actually miss procs with frightening probability. Below are modelling experiment results for 1e6 tries: GOMAXPROCS = 2 : missed 1 procs 7944 times GOMAXPROCS = 3 : missed 1 procs 101620 times GOMAXPROCS = 3 : missed 2 procs 3571 times GOMAXPROCS = 4 : missed 1 procs 63916 times GOMAXPROCS = 4 : missed 2 procs 61 times GOMAXPROCS = 4 : missed 3 procs 16 times GOMAXPROCS = 5 : missed 1 procs 133136 times GOMAXPROCS = 5 : missed 2 procs 1025 times GOMAXPROCS = 5 : missed 3 procs 101 times GOMAXPROCS = 5 : missed 4 procs 15 times GOMAXPROCS = 8 : missed 1 procs 151765 times GOMAXPROCS = 8 : missed 2 procs 5057 times GOMAXPROCS = 8 : missed 3 procs 1726 times GOMAXPROCS = 8 : missed 4 procs 68 times GOMAXPROCS = 12 : missed 1 procs 199081 times GOMAXPROCS = 12 : missed 2 procs 27489 times GOMAXPROCS = 12 : missed 3 procs 3113 times GOMAXPROCS = 12 : missed 4 procs 233 times GOMAXPROCS = 12 : missed 5 procs 9 times GOMAXPROCS = 16 : missed 1 procs 237477 times GOMAXPROCS = 16 : missed 2 procs 30037 times GOMAXPROCS = 16 : missed 3 procs 9466 times GOMAXPROCS = 16 : missed 4 procs 1334 times GOMAXPROCS = 16 : missed 5 procs 192 times GOMAXPROCS = 16 : missed 6 procs 5 times GOMAXPROCS = 16 : missed 7 procs 1 times GOMAXPROCS = 16 : missed 8 procs 1 times A missed proc won't lead to underutilization because we check all procs again after dropping P. But it can lead to an unpleasant situation when we miss a proc, drop P, check all procs, discover work, acquire P, miss the proc again, repeat. Improve stealing logic to cover all procs. Also don't enter spinning mode and try to steal when there is nobody around. Change-Id: Ibb6b122cc7fb836991bad7d0639b77c807aab4c2 Reviewed-on: https://go-review.googlesource.com/20836 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>	2016-03-25 11:00:48 +00:00
David Crawshaw	24ce64d1a9	cmd/compile, runtime: new static name encoding Create a byte encoding designed for static Go names. It is intended to be a compact representation of a name and optional tag data that can be turned into a Go string without allocating, and describes whether or not it is exported without unicode table. The encoding is described in reflect/type.go: // The first byte is a bit field containing: // // 1<<0 the name is exported // 1<<1 tag data follows the name // 1<<2 pkgPath string follow the name and tag // // The next two bytes are the data length: // // l := uint16(data[1])<<8 \| uint16(data[2]) // // Bytes [3:3+l] are the string data. // // If tag data follows then bytes 3+l and 3+l+1 are the tag length, // with the data following. // // If the import path follows, then ptrSize bytes at the end of // the data form a string. The import path is only set for concrete // methods that are defined in a different package than their type. Shrinks binary sizes: cmd/go: 164KB (1.6%) jujud: 1.0MB (1.5%) For #6853. Change-Id: I46b6591015b17936a443c9efb5009de8dfe8b609 Reviewed-on: https://go-review.googlesource.com/20968 Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-03-25 00:13:49 +00:00
Joe Sylve	df2b2eb63d	runtime: improve last ditch signal forwarding for Unix libraries The current runtime attempts to forward signals generated by non-Go code to the original signal handler. If it can't call the original handler directly, it currently attempts to re-raise the signal after resetting the handler. In this case, the original context is lost. This fix prevents that problem by simply returning from the go signal handler after resetting the original handler. It only does this when the original handler is the system default handler, which in all cases is known to not recover. The signal is not reset, so it is retriggered and the original handler takes over with the proper context. Fixes #14899 Change-Id: Ib1c19dfa4b50d9732d7a453de3784c8141e1cbb3 Reviewed-on: https://go-review.googlesource.com/21006 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-03-24 19:34:17 +00:00
Marvin Stenger	44532f1a9d	runtime: fix inconsistency in slice.go Fixes #14938. Additionally some simplifications along the way. Change-Id: I2c5fb7e32dcc6fab68fff36a49cb72e715756abe Reviewed-on: https://go-review.googlesource.com/21046 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>	2016-03-24 18:17:28 +00:00
Elias Naur	be10c51500	runtime/cgo: block signals to the iOS mach exception handler For darwin/arm{,64} a non-Go thread is created to convert EXC_BAD_ACCESS to panics. However, the Go signal handler refuse to handle signals that would otherwise be ignored if they arrive at non-Go threads. Block all (posix) signals to that thread, making sure that no unexpected signals arrive to it. At least one test, TestStop in os/signal, depends on signals not arriving on any non-Go threads. For #14318 Change-Id: I901467fb53bdadb0d03b0f1a537116c7f4754423 Reviewed-on: https://go-review.googlesource.com/21047 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2016-03-24 14:12:12 +00:00
Lynn Boger	baec148767	bytes: Equal perf improvements on ppc64le/ppc64 The existing implementation for Equal and similar functions in the bytes package operate on one byte at at time. This performs poorly on ppc64/ppc64le especially when the byte buffers are large. This change improves those functions by loading and comparing double words where possible. The common code has been moved to a function that can be shared by the other functions in this file which perform the same type of comparison. Further optimizations are done for the case where >= 32 bytes are being compared. The new function memeqbody is used by memeq_varlen, Equal, and eqstring. When running the bytes test with -test.bench=Equal benchmark old MB/s new MB/s speedup BenchmarkEqual1 164.83 129.49 0.79x BenchmarkEqual6 563.51 445.47 0.79x BenchmarkEqual9 656.15 1099.00 1.67x BenchmarkEqual15 591.93 1024.30 1.73x BenchmarkEqual16 613.25 1914.12 3.12x BenchmarkEqual20 682.37 1687.04 2.47x BenchmarkEqual32 807.96 3843.29 4.76x BenchmarkEqual4K 1076.25 23280.51 21.63x BenchmarkEqual4M 1079.30 13120.14 12.16x BenchmarkEqual64M 1073.28 10876.92 10.13x It was determined that the degradation in the smaller byte tests were due to unfavorable code alignment of the single byte loop. Fixes #14368 Change-Id: I0dd87382c28887c70f4fbe80877a8ba03c31d7cd Reviewed-on: https://go-review.googlesource.com/20249 Reviewed-by: Minux Ma <minux@golang.org>	2016-03-23 14:21:15 +00:00
Keith Randall	6a33f7765f	runtime: use MOVSB instead of MOVSQ for unaligned moves MOVSB is quite a bit faster for unaligned moves. Possibly we should use MOVSB all of the time, but Intel folks say it might be a bit faster to use MOVSQ on some processors (but not any I have access to at the moment). benchmark old ns/op new ns/op delta BenchmarkMemmove4096-8 93.9 93.2 -0.75% BenchmarkMemmoveUnalignedDst4096-8 256 151 -41.02% BenchmarkMemmoveUnalignedSrc4096-8 175 90.5 -48.29% Fixes #14630 Change-Id: I568e6d6590eb3615e6a699fb474020596be665ff Reviewed-on: https://go-review.googlesource.com/20293 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-03-21 19:10:24 +00:00
Michael Munday	cc17d1ba76	runtime/internal/sys: add s390x support Change-Id: I928532b406a3457d2c5f75f4de7d46a3f795192e Reviewed-on: https://go-review.googlesource.com/20939 Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>	2016-03-21 08:04:38 +00:00
Keith Randall	377eaa7494	runtime: add space Missed this in review of 20812 Change-Id: I01e220499dcd58e1a7205e2a577dd9630a8b7174 Reviewed-on: https://go-review.googlesource.com/20819 Reviewed-by: Keith Randall <khr@golang.org>	2016-03-18 21:31:49 +00:00
Keith Randall	15ea61146e	runtime: use unaligned loads on ppc64 benchmark old ns/op new ns/op delta BenchmarkAlignedLoad-160 8.67 7.42 -14.42% BenchmarkUnalignedLoad-160 8.63 7.37 -14.60% Change-Id: Id4609d7b4038c4d2ec332efc4fe6f1adfb61b82b Reviewed-on: https://go-review.googlesource.com/20812 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-18 19:21:53 +00:00
Marcel van Lohuizen	872ca73cad	runtime: don't assume b.N > 0 Change-Id: I2e26717f2563d7633ffd15f4adf63c3d0ee3403f Reviewed-on: https://go-review.googlesource.com/20856 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2016-03-18 15:55:02 +00:00
Austin Clements	857d0b48db	runtime: document sudog Change-Id: I85c0bcf02842cc32dbc9bfdcea27efe871173574 Reviewed-on: https://go-review.googlesource.com/20774 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-03-17 18:57:10 +00:00
Matthew Dempsky	ac74e5debc	cmd/compile: stop constructing sudog type The compiler doesn't care about the runtime's sudog type. Stop constructing it. Change-Id: If1885fe30b2e215a08d17662eab5ea6d81fe58ab Reviewed-on: https://go-review.googlesource.com/20797 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-03-17 18:02:40 +00:00
Austin Clements	f11e4eb5cc	runtime: shrink stacks during concurrent mark Currently we shrink stacks during STW mark termination because it used to be unsafe to shrink them concurrently. For some programs, this significantly increases pause time: stack shrinking costs ~5ms/MB copied plus 2µs/shrink. Now that we've made it safe to shrink a stack without the world being stopped, shrink them during the concurrent mark phase. This reduces the STW time in the program from issue #12967 by an order of magnitude and brings it from over the 10ms goal to well under: name old 95%ile-markTerm-time new 95%ile-markTerm-time delta Stackshrink-4 23.8ms ±60% 1.80ms ±39% -92.44% (p=0.008 n=5+5) Fixes #12967. This slows down the go1 and garbage benchmarks overall by < 0.5%. name old time/op new time/op delta XBenchGarbage-12 2.48ms ± 1% 2.49ms ± 1% +0.45% (p=0.005 n=25+21) name old time/op new time/op delta BinaryTree17-12 2.93s ± 2% 2.97s ± 2% +1.34% (p=0.002 n=19+20) Fannkuch11-12 2.51s ± 1% 2.59s ± 0% +3.09% (p=0.000 n=18+18) FmtFprintfEmpty-12 51.1ns ± 2% 51.5ns ± 1% ~ (p=0.280 n=20+17) FmtFprintfString-12 175ns ± 1% 169ns ± 1% -3.01% (p=0.000 n=20+20) FmtFprintfInt-12 160ns ± 1% 160ns ± 0% +0.53% (p=0.000 n=20+20) FmtFprintfIntInt-12 265ns ± 0% 266ns ± 1% +0.59% (p=0.000 n=20+20) FmtFprintfPrefixedInt-12 237ns ± 1% 238ns ± 1% +0.44% (p=0.000 n=20+20) FmtFprintfFloat-12 326ns ± 1% 341ns ± 1% +4.55% (p=0.000 n=20+19) FmtManyArgs-12 1.01µs ± 0% 1.02µs ± 0% +0.43% (p=0.000 n=20+19) GobDecode-12 8.41ms ± 1% 8.30ms ± 2% -1.22% (p=0.000 n=20+19) GobEncode-12 6.66ms ± 1% 6.68ms ± 0% +0.30% (p=0.000 n=18+19) Gzip-12 322ms ± 1% 322ms ± 1% ~ (p=1.000 n=20+20) Gunzip-12 42.8ms ± 0% 42.9ms ± 0% ~ (p=0.174 n=20+20) HTTPClientServer-12 69.7µs ± 1% 70.6µs ± 1% +1.20% (p=0.000 n=20+20) JSONEncode-12 16.8ms ± 0% 16.8ms ± 1% ~ (p=0.154 n=19+19) JSONDecode-12 65.1ms ± 0% 65.3ms ± 1% +0.34% (p=0.003 n=20+20) Mandelbrot200-12 3.93ms ± 0% 3.92ms ± 0% ~ (p=0.396 n=19+20) GoParse-12 3.66ms ± 1% 3.65ms ± 1% ~ (p=0.117 n=16+18) RegexpMatchEasy0_32-12 85.0ns ± 2% 85.5ns ± 2% ~ (p=0.143 n=20+20) RegexpMatchEasy0_1K-12 267ns ± 1% 267ns ± 1% ~ (p=0.867 n=20+17) RegexpMatchEasy1_32-12 83.3ns ± 2% 83.8ns ± 1% ~ (p=0.068 n=20+20) RegexpMatchEasy1_1K-12 432ns ± 1% 432ns ± 1% ~ (p=0.804 n=20+19) RegexpMatchMedium_32-12 133ns ± 0% 133ns ± 0% ~ (p=1.000 n=20+20) RegexpMatchMedium_1K-12 40.3µs ± 1% 40.4µs ± 1% ~ (p=0.319 n=20+19) RegexpMatchHard_32-12 2.10µs ± 1% 2.10µs ± 1% ~ (p=0.723 n=20+18) RegexpMatchHard_1K-12 63.0µs ± 0% 63.0µs ± 0% ~ (p=0.158 n=19+17) Revcomp-12 461ms ± 1% 476ms ± 8% +3.29% (p=0.002 n=20+20) Template-12 80.1ms ± 1% 79.3ms ± 1% -1.00% (p=0.000 n=20+20) TimeParse-12 360ns ± 0% 360ns ± 0% ~ (p=0.802 n=18+19) TimeFormat-12 374ns ± 1% 372ns ± 0% -0.77% (p=0.000 n=20+19) [Geo mean] 61.8µs 62.0µs +0.40% Change-Id: Ib60cd46b7a4987e07670eb271d22f6cee5802842 Reviewed-on: https://go-review.googlesource.com/20044 Reviewed-by: Keith Randall <khr@golang.org>	2016-03-16 20:13:25 +00:00
Austin Clements	c14d25c648	runtime: generalize work.finalizersDone to work.markrootDone We're about to add another root marking job that needs to happen only during the first markroot pass (whether that's concurrent or STW), just like finalizer scanning. Rather than introducing another flag that has the same value as finalizersDone, just rename finalizersDone to markrootDone. Change-Id: I535356c6ea1f3734cb5b6add264cb7bf48de95e8 Reviewed-on: https://go-review.googlesource.com/20043 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-16 20:13:22 +00:00
Austin Clements	276b177771	runtime: make shrinkstack concurrent-safe Currently shinkstack is only safe during STW because it adjusts channel-related stack pointers and moves send/receive stack slots without synchronizing with the channel code. Make it safe to use when the world isn't stopped by: 1) Locking all channels the G is blocked on while adjusting the sudogs and copying the area of the stack that may contain send/receive slots. 2) For any stack frames that may contain send/receive slot, using an atomic CAS to adjust pointers to prevent races between adjusting a pointer in a receive slot and a concurrent send writing to that receive slot. In principle, the synchronization could be finer-grained. For example, we considered synchronizing around the sudogs, which would allow channel operations involving other Gs to continue if the G being shrunk was far enough down the send/receive queue. However, using the channel lock means no additional locks are necessary in the channel code. Furthermore, the stack shrinking code holds the channel lock for a very short time (much less than the time required to shrink the stack). This does not yet make stack shrinking concurrent; it merely makes doing so safe. This has negligible effect on the go1 and garbage benchmarks. For #12967. Change-Id: Ia49df3a8a7be4b36e365aac4155a2416b94b988c Reviewed-on: https://go-review.googlesource.com/20042 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2016-03-16 20:13:20 +00:00
Austin Clements	d45bf7228f	runtime: define lock order between G status and channel lock Currently, locking a G's stack by setting its status to _Gcopystack or _Gscan is unordered with respect to channel locks. However, when we make stack shrinking concurrent, stack shrinking will need to lock the G and then acquire channel locks, which imposes an order on these. Document this lock ordering and fix closechan to respect it. Everything else already happens to respect it. For #12967. Change-Id: I4dd02675efffb3e7daa5285cf75bf24f987d90d4 Reviewed-on: https://go-review.googlesource.com/20041 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-16 20:13:17 +00:00
Austin Clements	db72b41bcd	runtime: protect sudog.elem with hchan.lock Currently sudog.elem is never accessed concurrently, so in several cases we drop the channel lock just before reading/writing the sent/received value from/to sudog.elem. However, concurrent stack shrinking is going to have to adjust sudog.elem to point to the new stack, which means it needs a way to synchronize with accesses to sudog.elem. Hence, add sudog.elem to the fields protected by hchan.lock and scoot the unlocks down past the uses of sudog.elem. While we're here, better document the channel synchronization rules. For #12967. Change-Id: I3ad0ca71f0a74b0716c261aef21b2f7f13f74917 Reviewed-on: https://go-review.googlesource.com/20040 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-16 20:13:15 +00:00
Austin Clements	3c2a21ff13	runtime: fix transient _Gwaiting states in newstack With concurrent stack shrinking, the stack can move the instant after a G enters _Gwaiting. There are only two places that put a G into _Gwaiting: gopark and newstack. We fixed uses of gopark. This commit fixes newstack by simplifying its G transitions and, in particular, eliminating or narrowing the transient _Gwaiting states it passes through so it's clear nothing in the G is accessed while in _Gwaiting. For #12967. Change-Id: I2440ead411d2bc61beb1e2ab020ebe3cb3481af9 Reviewed-on: https://go-review.googlesource.com/20039 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-16 20:13:12 +00:00
Austin Clements	8fb182d020	runtime: never pass stack pointers to gopark gopark calls the unlock function after setting the G to _Gwaiting. This means it's generally unsafe to access the G's stack from the unlock function because the G may start running on another P. Once we start shrinking stacks concurrently, a stack shrink could also move the stack the moment after it enters _Gwaiting and before the unlock function is called. Document this restriction and fix the two places where we currently violate it. This is unlikely to be a problem in practice for these two places right now, but they're already skating on thin ice. For example, the following sequence could in principle cause corruption, deadlock, or a panic in the select code: On M1/P1: 1. G1 selects on channels A and B. 2. selectgoImpl calls gopark. 3. gopark puts G1 in _Gwaiting. 4. gopark calls selparkcommit. 5. selparkcommit releases the lock on channel A. On M2/P2: 6. G2 sends to channel A. 7. The send puts G1 in _Grunnable and puts it on P2's run queue. 8. The scheduler runs, selects G1, puts it in _Grunning, and resumes G1. 9. On G1, the sellock immediately following the gopark gets called. 10. sellock grows and moves the stack. On M1/P1: 11. selparkcommit continues to scan the lock order for the next channel to unlock, but it's now reading from a freed (and possibly reused) stack. This shouldn't happen in practice because step 10 isn't the first call to sellock, so the stack should already be big enough. However, once we start shrinking stacks concurrently, this reasoning won't work any more. For #12967. Change-Id: I3660c5be37e5be9f87433cb8141bdfdf37fadc4c Reviewed-on: https://go-review.googlesource.com/20038 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-16 20:13:10 +00:00
Austin Clements	005140a77e	runtime: put g.waiting list in lock order Currently the g.waiting list created by a select is in poll order. However, nothing depends on this, and we're going to need access to the channel lock order in other places shortly, so modify select to put the waiting list in channel lock order. For #12967. Change-Id: If0d38816216ecbb37a36624d9b25dd96e0a775ec Reviewed-on: https://go-review.googlesource.com/20037 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2016-03-16 20:13:07 +00:00
Austin Clements	26594c3dfd	runtime: use indexes for select lock order Currently the select lock order is a []*hchan. We're going to need to refer to things other than the channel itself in lock order shortly, so switch this to a []uint16 of indexes into the select cases. This parallels the existing representation for the poll order. Change-Id: I89262223fe20b4ddf5321592655ba9eac489cda1 Reviewed-on: https://go-review.googlesource.com/20036 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-16 20:13:05 +00:00
Austin Clements	e4a95b6343	runtime: record channel in sudog Given a G, there's currently no way to find the channel it's blocking on. We'll need this information to fix a (probably theoretical) bug in select and to implement concurrent stack shrinking, so record the channel in the sudog. For #12967. Change-Id: If8fb63a140f1d07175818824d08c0ebeec2bdf66 Reviewed-on: https://go-review.googlesource.com/20035 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-16 20:13:02 +00:00
Austin Clements	d7cedc4b74	runtime: perform gcMarkRootCheck during STW in checkmark mode gcMarkRootCheck is too expensive to do during mark termination. However, since it's a useful check and it complements checkmark mode nicely, enable it during mark termination is checkmark is enabled. Change-Id: Icd9039e85e6e9d22747454441b50f1cdd1412202 Reviewed-on: https://go-review.googlesource.com/20663 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-16 20:12:59 +00:00
Alan Donovan	ed73efbb74	runtime/debug: clarify WriteHeapDump STW behavior Change-Id: I049d2596fe8ce0e93391599f5c224779fd8e316f Reviewed-on: https://go-review.googlesource.com/20761 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-03-16 17:02:50 +00:00
Michael Matloob	7e3344f74e	runtime: update link to WriteHeapDump format The new link is https://golang.org/s/go15heapdump. Change-Id: Ifcaf8572bfe815ffaa78442a1991f6e20e990a50 Reviewed-on: https://go-review.googlesource.com/20740 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-03-15 22:09:43 +00:00
Wedson Almeida Filho	8e7072ca83	sync: new Cond implementation Change Cond implementation to use a notification list such that waiters can first register for a notification, release the lock, then actually wait. Signalers never have to park anymore. This is intended to address an issue in the previous implementation where Broadcast could fail to signal all waiters. Results of the existing benchmark are below. Original New Diff BenchmarkCond1-48 2000000 745 ns/op 755 +1.3% BenchmarkCond2-48 1000000 1545 ns/op 1532 -0.8% BenchmarkCond4-48 300000 3833 ns/op 3896 +1.6% BenchmarkCond8-48 200000 10049 ns/op 10257 +2.1% BenchmarkCond16-48 100000 21123 ns/op 21236 +0.5% BenchmarkCond32-48 30000 40393 ns/op 41097 +1.7% Fixes #14064 Change-Id: I083466d61593a791a034df61f5305adfb8f1c7f9 Reviewed-on: https://go-review.googlesource.com/18892 Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Caleb Spare <cespare@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-15 22:01:20 +00:00
David Crawshaw	f2772a4935	cmd/compile: compute second method type at runtime The type information for a method includes two variants: a func without the receiver, and a func with the receiver as the first parameter. The former is used as part of the dynamic interface checks, but the latter is only returned as a type in the reflect.Method struct. Instead of computing it at compile time, construct it at run time with reflect.FuncOf. Using cl/20701 as a baseline, cmd/go: -480KB, (4.4%) jujud: -5.6MB, (7.8%) For #6853. Change-Id: I1b8c73f3ab894735f53d00cb9c0b506d84d54e92 Reviewed-on: https://go-review.googlesource.com/20709 Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-03-15 19:57:40 +00:00
Elias Naur	ea4b785ae0	runtime: preserve darwin/arm{,64} callee-save registers CL 14603 attempted to preserve the callee-save registers for the darwin/arm runtime initialization routine, but I believe it wasn't sufficient and resulted in the crash reported in issue Saving and restoring the registers on the stack the same way linux/arm does seems more obvious and fixes #14778, so do that. Even though #14778 is not reproducible on darwin/arm64, I applied a similar change there, and to linux/arm64 which obeys the same calling convention. Finally, this CL is a candidate for a 1.6 minor release for the same reason CL 14603 was in a 1.5 minor release (as CL 16968). It is small and only touches the iOS platforms and gomobile on darwin/arm is currently useless without it. Fixes #14778 Fixes #12590 (again) Change-Id: I7401daf0bbd7c579a7e84761384a7b763651752a Reviewed-on: https://go-review.googlesource.com/20621 Reviewed-by: David Crawshaw <crawshaw@golang.org> Run-TryBot: Elias Naur <elias.naur@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-15 08:43:34 +00:00
Austin Clements	b1a6e07919	runtime: document the G states In particular, write down the rules for stack ownership because the details of this are about to get very important with concurrent stack shrinking. (Interestingly, the details don't actually change, but anything that's currently skating on thin ice is likely to fall through.) Fox #12967. Change-Id: I561e2610e864295e9faba07717a934aabefcaab9 Reviewed-on: https://go-review.googlesource.com/20034 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-03-14 16:30:16 +00:00
Austin Clements	da153354b2	runtime: copy stack before adjusting Currently copystack adjusts pointers in the old stack and then copies the adjusted stack to the new stack. In addition to being generally confusing, this is going to make concurrent stack shrinking harder. Switch this around so that we first copy the stack and then adjust pointers on the new stack (never writing to the old stack). This reprises CL 15996, but takes a different and simpler approach. CL 15996 still walked the old stack while adjusting pointers on the new stack. In this CL, we adjust auxiliary structures before walking the stack, so we can just walk the new stack. For #12967. Change-Id: I94fa86f823ba9ee478e73b2ba509eed3361c43df Reviewed-on: https://go-review.googlesource.com/20033 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-03-14 16:29:47 +00:00
Austin Clements	5a50e00306	runtime: improve comment on selectgo In particular, document that *sel is on the stack no matter what. Change-Id: I1c264215e026c27721b13eedae73ec845066cdec Reviewed-on: https://go-review.googlesource.com/20032 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-03-14 16:29:21 +00:00
Richard Miller	8a2d6e9f6f	runtime: fix a typo in asssembly macro GO_RESULTS_INITIALIZED Fixes #14772 Change-Id: I32f2b6b74de28be406b1306364bc07620a453962 Reviewed-on: https://go-review.googlesource.com/20680 Reviewed-by: David du Colombier <0intro@gmail.com> Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-14 14:53:29 +00:00
Martin Möhrmann	43ed65f869	runtime: speed up growslice by avoiding divisions Only compute the number of maximum allowed elements per slice once. Special case newcap computation for slices with byte sized elements. name old time/op new time/op delta GrowSliceBytes-2 61.1ns ± 1% 43.4ns ± 1% -29.00% (p=0.000 n=20+20) GrowSliceInts-2 85.9ns ± 1% 75.7ns ± 1% -11.80% (p=0.000 n=20+20) Change-Id: I5d9c0d5987cdd108ac29dc32e31912dcefa2324d Reviewed-on: https://go-review.googlesource.com/20653 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-14 00:45:57 +00:00
Matthew Dempsky	d1341d6cf3	cmd/compile, runtime: eliminate growslice_n Fixes #11419. Change-Id: I7935a253e3e96191a33f5041bab203ecc5f0c976 Reviewed-on: https://go-review.googlesource.com/20647 Reviewed-by: Keith Randall <khr@golang.org>	2016-03-13 23:34:31 +00:00
Emmanuel Odeke	6dfcc336c5	runtime: move testSchedLocalQueue* to export_test Move functions testSchedLocalQueueLocal and testSchedLocalQueueSteal from proc.go to export_test.go, the only site that they are used. Fixes #14796 Change-Id: I16b6fa4a13835eab33f66a2c2e87a5f5c79b7bd3 Reviewed-on: https://go-review.googlesource.com/20640 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-03-13 00:34:58 +00:00
David Crawshaw	cc158403d6	cmd/compile: track reflect.Type.Method in deadcode In addition to reflect.Value.Call, exported methods can be invoked by the Func value in the reflect.Method struct. This CL has the compiler track what functions get access to a legitimate reflect.Method struct by looking for interface calls to either of: Method(int) reflect.Method MethodByName(string) (reflect.Method, bool) This is a little overly conservative. If a user implements a type with one of these methods without using the underlying calls on reflect.Type, the linker will assume the worst and include all exported methods. But it's cheap. No change to any of the binary sizes reported in cl/20483. For #14740 Change-Id: Ie17786395d0453ce0384d8b240ecb043b7726137 Reviewed-on: https://go-review.googlesource.com/20489 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2016-03-11 21:19:20 +00:00
Ian Lance Taylor	b354f91417	runtime: limit TestCgoCCodeSIGPROF test to 1 second Still fails about 20% of the time on my laptop. Fixes #14766. Change-Id: I169ab728c6022dceeb91188f5ad466ed6413c062 Reviewed-on: https://go-review.googlesource.com/20590 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-11 06:43:01 +00:00
Ian Lance Taylor	1d809c5c14	runtime: fix names in SetFinalizer doc comment Fixes #14554. Change-Id: I37ab4e4dc1aee84ac448d437314f8eecbbc02994 Reviewed-on: https://go-review.googlesource.com/20021 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-03-10 18:20:44 +00:00
Richard Miller	6b59d61822	runtime: Plan 9 - prevent preemption by GC while exiting On Plan 9, there's no "kill all threads" system call, so exit is done by sending a "go: exit" note to each OS process. If concurrent GC occurs during this loop, deadlock sometimes results. Prevent this by incrementing m.locks before sending notes. Change-Id: I31aa15134ff6e42d9a82f9f8a308620b3ad1b1b1 Reviewed-on: https://go-review.googlesource.com/20477 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-09 16:48:00 +00:00
David Crawshaw	8df733bd22	cmd/compile: remove slices from rtype.funcType Alternative to golang.org/cl/19852. This memory layout doesn't have an easy type representation, but it is noticeably smaller than the current funcType, and saves significant extra space. Some notes on the layout are in reflect/type.go: // A rtype for each in and out parameter is stored in an array that // directly follows the funcType (and possibly its uncommonType). So // a function type with one method, one input, and one output is: // // struct { // funcType // uncommonType // [2]rtype // [0] is in, [1] is out // uncommonTypeSliceContents // } There are three arbitrary limits introduced by this CL: 1. No more than 65535 function input parameters. 2. No more than 32767 function output parameters. 3. reflect.FuncOf is limited to 128 parameters. I don't think these are limits in practice, but are worth noting. Reduces godoc binary size by 2.4%, 330KB. For #6853. Change-Id: I225c0a0516ebdbe92d41dfdf43f716da42dfe347 Reviewed-on: https://go-review.googlesource.com/19916 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-09 01:25:18 +00:00
David Crawshaw	a24b3ed753	cmd/compile: remove rtype *uncommonType field Instead of a pointer on every rtype, use a bit flag to indicate that the contents of uncommonType directly follows the rtype value when it is needed. This requires a bit of juggling in the compiler's rtype encoder. The backing arrays for fields in the rtype are presently encoded directly after the slice header. This packing requires separating the encoding of the uncommonType slice headers from their backing arrays. Reduces binary size of godoc by ~180KB (1.5%). No measurable change in all.bash time. For #6853. Change-Id: I60205948ceb5c0abba76fdf619652da9c465a597 Reviewed-on: https://go-review.googlesource.com/19790 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-08 23:23:13 +00:00
Russ Cox	0c7ccbf601	cmd/go: ignore C files when CGO_ENABLED=0 Before, those C files might have been intended for the Plan 9 C compiler, but that option was removed in Go 1.5. We can simplify the maintenance of cgo packages now if we assume C files (and C++ and M and SWIG files) should only be considered when cgo is enabled. Also remove newly unnecessary build tags in runtime/cgo's C files. Fixes #14123 Change-Id: Ia5a7fe62b9469965aa7c3547fe43c6c9292b8205 Reviewed-on: https://go-review.googlesource.com/19613 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-08 20:10:52 +00:00
Burcu Dogan	6df8038768	runtime: listen 127.0.0.1 instead of localhost on android Fixes #14486. Related to #14485. Change-Id: I2dd77b0337aebfe885ae828483deeaacb500b12a Reviewed-on: https://go-review.googlesource.com/20340 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-08 00:22:38 +00:00
Austin Clements	2b19b6e3f1	runtime: fix checkmark scanning of finalizers Currently work.finalizersDone is reset only at the beginning of gcStart. As a result, it will be set when checkmark runs, so checkmark will skip scanning finalizers. Hence, if there are any bugs that cause the regular scan of finalizers to miss pointers, checkmark will also miss them and fail to detect the missed pointer. Fix this by resetting finalizersDone in gcResetMarkState. This way it gets reset before any full mark, which is exactly what we want. Change-Id: I4ddb5eba5b3b97e52aaf3e08fd9aa692bda32b20 Reviewed-on: https://go-review.googlesource.com/20332 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-07 22:32:20 +00:00
Matthew Dempsky	a03bdc3e6b	runtime: eliminate unnecessary type conversions Automated refactoring produced using github.com/mdempsky/unconvert. Change-Id: Iacf871a4f221ef17f48999a464ab2858b2bbaa90 Reviewed-on: https://go-review.googlesource.com/20071 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-07 20:53:27 +00:00
Richard Miller	0de0cafb9f	runtime: new files for plan9_arm support Implementation more or less follows plan9_386 version. Revised 7 March to correct a bug in runtime.seek and tidy whitespace for 8-column tabs. Change-Id: I2e921558b5816502e8aafe330530c5a48a6c7537 Reviewed-on: https://go-review.googlesource.com/18966 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-03-07 16:25:48 +00:00
Richard Miller	d145456b16	runtime: signal handling support for plan9_arm Plan 9 trap/signal handling differs on ARM from other architectures because ARM has a link register. Also trap message syntax varies between different architectures (historical accident?). Revised 7 March to clarify a comment. Change-Id: Ib6485f82857a2f9a0d6b2c375cf0aaa230b83656 Reviewed-on: https://go-review.googlesource.com/18969 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-03-07 16:25:17 +00:00
Austin Clements	ff71ed86b6	runtime: merge {bgMark,assist}StartTime We used to start background mark workers and assists at different times, so we needed to keep track of these separately. They're now set to exactly the same time, so clean things up by merging them in to one value, markStartTime. Change-Id: I17c9843c3ed2d6f07b4c8cd0b2c438fc6de23b53 Reviewed-on: https://go-review.googlesource.com/20143 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-03-07 00:22:42 +00:00
Hitoshi Mitake	8c838192b8	runtime: don't print EnableGC flag in WriteHeapProfile() Current runtime.WriteHeapProfile() doesn't print correct EnableGC. Even if GOGC=off, the result file has below line: # EnableGC = true It is hard to print correct status of the variable because of corner cases e.g. initialization. For avoiding confusion, this commit removes the print. Change-Id: Ia792454a6c650bdc50a06fbaff4df7b6330ae08a Reviewed-on: https://go-review.googlesource.com/18600 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Minux Ma <minux@golang.org>	2016-03-05 08:14:23 +00:00
Austin Clements	b5481dd0a6	runtime: disable gcMarkRootCheck debugging check during STW gcMarkRootCheck takes ~10ns per goroutine. This is just a debugging check, so disable it (plus, if something is going to go wrong, it's more likely to go wrong during concurrent mark). We may be able to re-enable this later, or move it to after we've started the world again. (But not for 1.6.x.) For 1.6.x. Fixes #14419. name / 95%ile-time/markTerm old new delta 500kIdleGs-12 24.0ms ± 0% 18.9ms ± 6% -21.46% (p=0.000 n=15+20) Change-Id: Idb2a2b1771449de772c159ef95920d6df1090666 Reviewed-on: https://go-review.googlesource.com/20148 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-04 21:12:06 +00:00
Austin Clements	9ab9053344	runtime: reset mark state before stopping the world Currently we reset the mark state during STW sweep termination. This involves looping over all of the goroutines. Each iteration of this loop takes ~25ns, so at around 400k goroutines, we'll exceed our 10ms pause goal. However, it's safe to do this before we stop the world for sweep termination because nothing is consuming this state yet. Hence, move the reset to just before STW. This isn't perfect: a long reset can still delay allocating goroutines that block on GC starting. But it's certainly better to block some things eventually than to block everything immediately. For 1.6.x. Fixes #14420. name \ 95%ile-time/sweepTerm old new delta 500kIdleGs-12 11312µs ± 6% 18.9µs ± 6% -99.83% (p=0.000 n=16+20) Change-Id: I9815c4d8d9b0d3c3e94dfdab78049cefe0dcc93c Reviewed-on: https://go-review.googlesource.com/20147 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-04 21:12:03 +00:00
Ian Lance Taylor	1716162a9a	runtime: fix off-by-one error finding module for PC Also fix compiler-invoked panics to avoid a confusing "malloc deadlock" crash if they are invoked while executing the runtime. Fixes #14599. Change-Id: I89436abcbf3587901909abbdca1973301654a76e Reviewed-on: https://go-review.googlesource.com/20219 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2016-03-04 21:06:31 +00:00
David Crawshaw	69285a8b46	reflect: recognize unnamed directional channels go test github.com/onsi/gomega/gbytes now passes at tip, and tests added to the reflect package. Fixes #14645 Change-Id: I16216c1a86211a1103d913237fe6bca5000cf885 Reviewed-on: https://go-review.googlesource.com/20221 Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-03-04 20:34:30 +00:00
Mikio Hara	b0f4ee533a	net: deduplicate TCP socket code This change consolidates functions and methods related to TCPAddr, TCPConn and TCPListener for maintenance purpose, especially for documentation. Also refactors Dial error code paths. The followup changes will update comments and examples. Updates #10624. Change-Id: I3333ee218ebcd08928f9e2826cd1984d15ea153e Reviewed-on: https://go-review.googlesource.com/20009 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-03 04:23:59 +00:00
Russ Cox	6969d9bf03	runtime/pprof: sort counted profiles by count This is especially helpful in programs with very large numbers of goroutines: the bulk of the goroutines will show up at the top. Before: 1 @ 0x86ab8 0x86893 0x82164 0x8e7ce 0x7b798 0x5b871 # 0x86ab8 runtime/pprof.writeRuntimeProfile+0xb8 /Users/rsc/go/src/runtime/pprof/pprof.go:545 # 0x86893 runtime/pprof.writeGoroutine+0x93 /Users/rsc/go/src/runtime/pprof/pprof.go:507 # 0x82164 runtime/pprof.(Profile).WriteTo+0xd4 /Users/rsc/go/src/runtime/pprof/pprof.go:236 # 0x8e7ce runtime/pprof_test.TestGoroutineCounts+0x15e /Users/rsc/go/src/runtime/pprof/pprof_test.go:603 # 0x7b798 testing.tRunner+0x98 /Users/rsc/go/src/testing/testing.go:473 1 @ 0x2d373 0x2d434 0x560f 0x516b 0x7cd42 0x7b861 0x2297 0x2cf90 0x5b871 # 0x7cd42 testing.RunTests+0x8d2 /Users/rsc/go/src/testing/testing.go:583 # 0x7b861 testing.(M).Run+0x81 /Users/rsc/go/src/testing/testing.go:515 # 0x2297 main.main+0x117 runtime/pprof/_test/_testmain.go:72 # 0x2cf90 runtime.main+0x2b0 /Users/rsc/go/src/runtime/proc.go:188 10 @ 0x2d373 0x2d434 0x560f 0x516b 0x8e5b6 0x5b871 # 0x8e5b6 runtime/pprof_test.func1+0x36 /Users/rsc/go/src/runtime/pprof/pprof_test.go:582 50 @ 0x2d373 0x2d434 0x560f 0x516b 0x8e656 0x5b871 # 0x8e656 runtime/pprof_test.func3+0x36 /Users/rsc/go/src/runtime/pprof/pprof_test.go:584 40 @ 0x2d373 0x2d434 0x560f 0x516b 0x8e606 0x5b871 # 0x8e606 runtime/pprof_test.func2+0x36 /Users/rsc/go/src/runtime/pprof/pprof_test.go:583 After: 50 @ 0x2d373 0x2d434 0x560f 0x516b 0x8ecc6 0x5b871 # 0x8ecc6 runtime/pprof_test.func3+0x36 /Users/rsc/go/src/runtime/pprof/pprof_test.go:584 40 @ 0x2d373 0x2d434 0x560f 0x516b 0x8ec76 0x5b871 # 0x8ec76 runtime/pprof_test.func2+0x36 /Users/rsc/go/src/runtime/pprof/pprof_test.go:583 10 @ 0x2d373 0x2d434 0x560f 0x516b 0x8ec26 0x5b871 # 0x8ec26 runtime/pprof_test.func1+0x36 /Users/rsc/go/src/runtime/pprof/pprof_test.go:582 1 @ 0x2d373 0x2d434 0x560f 0x516b 0x7cd42 0x7b861 0x2297 0x2cf90 0x5b871 # 0x7cd42 testing.RunTests+0x8d2 /Users/rsc/go/src/testing/testing.go:583 # 0x7b861 testing.(M).Run+0x81 /Users/rsc/go/src/testing/testing.go:515 # 0x2297 main.main+0x117 runtime/pprof/_test/_testmain.go:72 # 0x2cf90 runtime.main+0x2b0 /Users/rsc/go/src/runtime/proc.go:188 1 @ 0x87128 0x86f03 0x82164 0x8ee30 0x7b798 0x5b871 # 0x87128 runtime/pprof.writeRuntimeProfile+0xb8 /Users/rsc/go/src/runtime/pprof/pprof.go:566 # 0x86f03 runtime/pprof.writeGoroutine+0x93 /Users/rsc/go/src/runtime/pprof/pprof.go:528 # 0x82164 runtime/pprof.(Profile).WriteTo+0xd4 /Users/rsc/go/src/runtime/pprof/pprof.go:236 # 0x8ee30 runtime/pprof_test.TestGoroutineCounts+0x150 /Users/rsc/go/src/runtime/pprof/pprof_test.go:603 # 0x7b798 testing.tRunner+0x98 /Users/rsc/go/src/testing/testing.go:473 Change-Id: I43de9eee2d96f9c46f7b0fbe099a0571164324f5 Reviewed-on: https://go-review.googlesource.com/20107 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-03-02 20:04:29 +00:00
Brad Fitzpatrick	5fea2ccc77	all: single space after period. The tree's pretty inconsistent about single space vs double space after a period in documentation. Make it consistently a single space, per earlier decisions. This means contributors won't be confused by misleading precedence. This CL doesn't use go/doc to parse. It only addresses // comments. It was generated with: $ perl -i -npe 's,^(\s// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s//(.+\.) +([A-Z])') $ go test go/doc -update Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7 Reviewed-on: https://go-review.googlesource.com/20022 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dave Day <djd@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-02 00:13:47 +00:00
Brad Fitzpatrick	519474451a	all: make copyright headers consistent with one space after period This is a subset of https://golang.org/cl/20022 with only the copyright header lines, so the next CL will be smaller and more reviewable. Go policy has been single space after periods in comments for some time. The copyright header template at: https://golang.org/doc/contribute.html#copyright also uses a single space. Make them all consistent. Change-Id: Icc26c6b8495c3820da6b171ca96a74701b4a01b0 Reviewed-on: https://go-review.googlesource.com/20111 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-01 23:34:33 +00:00
Keith Randall	9d854fd44a	Merge branch 'dev.ssa' into mergebranch Merge dev.ssa branch back into master. Change-Id: Ie6fac3f8d355ab164f934415fe4fc7fcb8c3db16	2016-03-01 12:50:17 -08:00
Gerrit Code Review	acdb0da47d	Merge "[dev.ssa] Merge remote-tracking branch 'origin/master' into ssamerge" into dev.ssa	2016-02-29 23:19:17 +00:00
Keith Randall	4fffd4569d	[dev.ssa] Merge remote-tracking branch 'origin/master' into ssamerge (Last?) Semi-regular merge from tip to dev.ssa. Conflicts: src/cmd/compile/internal/gc/closure.go src/cmd/compile/internal/gc/gsubr.go src/cmd/compile/internal/gc/lex.go src/cmd/compile/internal/gc/pgen.go src/cmd/compile/internal/gc/syntax.go src/cmd/compile/internal/gc/walk.go src/cmd/internal/obj/pass.go Change-Id: Ib5ea8bf74d420f4902a9c6208761be9f22371ae7	2016-02-29 13:32:20 -08:00
David Chase	8107b0012f	[dev.ssa] cmd/compile: use 32-bit load to read writebarrier Avoid targeting a partial register with load; ensure source of load (writebarrier) is aligned. Better yet would be "CMPB $1,writebarrier" but that requires wrestling with flagalloc (mem operand complicates moving instruction around). Didn't see a change in time for benchcmd -n 10 Build go build net/http Verified that we clean the code up properly: 0x20a8 <main.main+104>: mov 0xc30a2(%rip),%eax # 0xc5150 <runtime.writeBarrier> 0x20ae <main.main+110>: test %al,%al Change-Id: Id5fb8c260eaec27bd727cb0ae1476c60343b0986 Reviewed-on: https://go-review.googlesource.com/19998 Reviewed-by: Keith Randall <khr@golang.org>	2016-02-28 22:29:23 +00:00
Austin Clements	d62d831882	runtime: clean up adjustpointer and eliminate write barrier Commit `a5c3bbe` modified adjustpointers to use uintptrs instead of unsafe.Pointers for manipulating stack pointers for clarity and to eliminate the unnecessary write barrier when writing the updated stack pointer. This commit makes the equivalent change to adjustpointer. Change-Id: I6dc309590b298bdd86ecdc9737db848d6786c3f7 Reviewed-on: https://go-review.googlesource.com/17148 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-28 04:19:01 +00:00
Ian Lance Taylor	8d94b9b820	runtime: more deflaking of TestCgoCheckBytes Fixes #14519. Change-Id: I8f78f67a463e6467e09df90446f7ebd28789d6c9 Reviewed-on: https://go-review.googlesource.com/19933 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>	2016-02-26 19:20:47 +00:00
Dmitry Vyukov	bdc14698f8	runtime: unwire g/m in dropg always Currently dropg does not unwire locked g/m. This is unnecessary distiction between locked and non-locked g/m. We always restart goroutines with execute which re-wires g/m. First, this produces false sense that this distinction is necessary. Second, it can confuse some sanity and cross checks. For example, if we check that g/m are unwired before we wire them in execute, the check will fail for locked g/m. I've hit this while doing some race detector changes, When we deschedule a goroutine and run scheduler code, m.curg is generally nil, but not for locked ms. Remove the distinction. Change-Id: I3b87a28ff343baa1d564aab1f821b582a84dee07 Reviewed-on: https://go-review.googlesource.com/19950 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-26 15:45:45 +00:00
Austin Clements	3b3d58e119	runtime: remove workbuf logging Early in Go 1.5 we had bugs with ownership of workbufs, so we added a system for tracing their ownership to help debug these issues. However, this system has both CPU and space overhead even when disabled, it clutters up the workbuf API, the higher level gcWork abstraction makes it very difficult to mess up the ownership of workbufs in practice, and the tracing hasn't been enabled or needed since `5b66e5d` nine months ago. Hence, remove it. Benchmarks show the usual noise from changes at this level, but little overall movement. name old time/op new time/op delta XBenchGarbage-12 2.48ms ± 1% 2.47ms ± 0% -0.68% (p=0.000 n=21+21) name old time/op new time/op delta BinaryTree17-12 2.98s ± 7% 2.98s ± 6% ~ (p=0.799 n=20+20) Fannkuch11-12 2.61s ± 3% 2.55s ± 5% -2.55% (p=0.003 n=20+20) FmtFprintfEmpty-12 52.8ns ± 6% 53.6ns ± 6% ~ (p=0.228 n=20+20) FmtFprintfString-12 177ns ± 4% 177ns ± 4% ~ (p=0.280 n=20+20) FmtFprintfInt-12 162ns ± 5% 162ns ± 3% ~ (p=0.347 n=20+20) FmtFprintfIntInt-12 277ns ± 7% 273ns ± 4% -1.62% (p=0.005 n=20+20) FmtFprintfPrefixedInt-12 237ns ± 4% 242ns ± 4% +2.13% (p=0.005 n=20+20) FmtFprintfFloat-12 315ns ± 4% 312ns ± 4% -0.97% (p=0.001 n=20+20) FmtManyArgs-12 1.11µs ± 3% 1.15µs ± 4% +3.41% (p=0.004 n=20+20) GobDecode-12 8.50ms ± 7% 8.53ms ± 7% ~ (p=0.429 n=20+20) GobEncode-12 6.86ms ± 9% 6.93ms ± 7% +0.93% (p=0.030 n=20+20) Gzip-12 326ms ± 4% 329ms ± 4% +0.98% (p=0.020 n=20+20) Gunzip-12 43.3ms ± 3% 43.8ms ± 9% +1.25% (p=0.003 n=20+20) HTTPClientServer-12 72.0µs ± 3% 71.5µs ± 3% ~ (p=0.053 n=20+20) JSONEncode-12 17.0ms ± 6% 17.3ms ± 7% +1.32% (p=0.006 n=20+20) JSONDecode-12 64.2ms ± 4% 63.5ms ± 3% -1.05% (p=0.005 n=20+20) Mandelbrot200-12 4.00ms ± 3% 3.99ms ± 3% ~ (p=0.121 n=20+20) GoParse-12 3.74ms ± 5% 3.75ms ± 9% ~ (p=0.383 n=20+20) RegexpMatchEasy0_32-12 104ns ± 4% 104ns ± 6% ~ (p=0.392 n=20+20) RegexpMatchEasy0_1K-12 358ns ± 3% 361ns ± 4% +0.95% (p=0.003 n=20+20) RegexpMatchEasy1_32-12 86.3ns ± 5% 86.1ns ± 6% ~ (p=0.614 n=20+20) RegexpMatchEasy1_1K-12 523ns ± 4% 518ns ± 3% -1.14% (p=0.008 n=20+20) RegexpMatchMedium_32-12 137ns ± 3% 134ns ± 4% -1.90% (p=0.005 n=20+20) RegexpMatchMedium_1K-12 41.0µs ± 3% 40.6µs ± 4% -1.11% (p=0.004 n=20+20) RegexpMatchHard_32-12 2.13µs ± 4% 2.11µs ± 5% -1.31% (p=0.014 n=20+20) RegexpMatchHard_1K-12 64.1µs ± 3% 63.2µs ± 5% -1.38% (p=0.005 n=20+20) Revcomp-12 555ms ±10% 548ms ± 7% -1.17% (p=0.011 n=20+20) Template-12 84.2ms ± 5% 88.2ms ± 4% +4.73% (p=0.000 n=20+20) TimeParse-12 365ns ± 4% 371ns ± 5% +1.77% (p=0.002 n=20+20) TimeFormat-12 361ns ± 4% 365ns ± 3% +1.08% (p=0.002 n=20+20) [Geo mean] 64.7µs 64.8µs +0.19% Change-Id: Ib043a7a0d18b588b298873d60913d44cd19f3b44 Reviewed-on: https://go-review.googlesource.com/19887 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-02-26 15:14:32 +00:00
David Crawshaw	0231f5420f	cmd/compile: remove uncommonType.name Reduces binary size of cmd/go by 0.5%. For #6853. Change-Id: I5a4b814049580ab5098ad252d979f80b70d8a5f9 Reviewed-on: https://go-review.googlesource.com/19694 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-26 12:02:39 +00:00
Keith Randall	687abca1ea	runtime: avoid using REP prefix for IndexByte REP-prefixed instructions have a large startup cost. Avoid them like the plague. benchmark old ns/op new ns/op delta BenchmarkIndexByte10-8 22.4 5.34 -76.16% Fixes #13983 Change-Id: I857e956e240fc9681d053f2584ccf24c1b272bb3 Reviewed-on: https://go-review.googlesource.com/18703 Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-26 01:09:53 +00:00
Austin Clements	cbe849fc38	runtime: eliminate unused _Genqueue state _Genqueue and _Gscanenqueue were introduced as part of the GC quiesce code. The quiesce code was removed by `197aa9e`, but these states and some associated code stuck around. Remove them. Change-Id: I69df81881602d4a431556513dac2959668d27c20 Reviewed-on: https://go-review.googlesource.com/19638 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-25 23:37:32 +00:00
Austin Clements	4eb33f6b8d	runtime: eliminate a conditional branch from heapBits.bits Change-Id: I1fa5e629b2890a8509559ce4ea17b74f47d71925 Reviewed-on: https://go-review.googlesource.com/19637 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-25 23:37:29 +00:00
Austin Clements	0168c2676f	runtime: use only per-P gcWork Currently most uses of gcWork use the per-P gcWork, but there are two places that still use a stack-based gcWork. Simplify things by making these instead use the per-P gcWork. Change-Id: I712d012cce9dd5757c8541824e9641ac1c2a329c Reviewed-on: https://go-review.googlesource.com/19636 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-25 23:37:27 +00:00

... 4 5 6 7 8 ...

2283 Commits