mirror/go - go - Git Fam. Sieh

Commit Graph

Author	SHA1	Message	Date
Ian Lance Taylor	86cd9c1176	runtime: only call netpoll if netpollinited returns true This fixes a race on old Linux kernels, in which we might temporarily set epfd to an invalid value other than -1. It's also the right thing to do. No test because the problem only occurs on old kernels. Fixes #22606 Change-Id: Id84bdd6ae6d7c5d47c39e97b74da27576cb51a54 Reviewed-on: https://go-review.googlesource.com/76319 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2017-11-07 16:18:12 +00:00
Austin Clements	e9079a69f3	runtime: buffered write barrier implementation This implements runtime support for buffered write barriers on amd64. The buffered write barrier has a fast path that simply enqueues pointers in a per-P buffer. Unlike the current write barrier, this fast path is not a normal Go call and does not require the compiler to spill general-purpose registers or put arguments on the stack. When the buffer fills up, the write barrier takes the slow path, which spills all general purpose registers and flushes the buffer. We don't allow safe-points or stack splits while this frame is active, so it doesn't matter that we have no type information for the spilled registers in this frame. One minor complication is cgocheck=2 mode, which uses the write barrier to detect Go pointers being written to non-Go memory. We obviously can't buffer this, so instead we set the buffer to its minimum size, forcing the write barrier into the slow path on every call. For this specific case, we pass additional information as arguments to the flush function. This also requires enabling the cgo write barrier slightly later during runtime initialization, after Ps (and the per-P write barrier buffers) have been initialized. The code in this CL is not yet active. The next CL will modify the compiler to generate calls to the new write barrier. This reduces the average cost of the write barrier by roughly a factor of 4, which will pay for the cost of having it enabled more of the time after we make the GC pacer less aggressive. (Benchmarks will be in the next CL.) Updates #14951. Updates #22460. Change-Id: I396b5b0e2c5e5c4acfd761a3235fd15abadc6cb1 Reviewed-on: https://go-review.googlesource.com/73711 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-30 18:12:44 +00:00
Austin Clements	3526d8031a	runtime: allow write barriers in gchelper We're about to start tracking nowritebarrierrec through systemstack calls, which detects that we're calling markroot (which has write barriers) from gchelper, which is called from the scheduler during STW apparently without a P. But it turns out that func helpgc, which wakes up blocked Ms to run gchelper, installs a P for gchelper to use. This means there is a P when gchelper runs, so it is allowed to have write barriers. Tell the compiler this by marking gchelper go:yeswritebarrierrec. Also, document the call to gchelper so I don't have to spend another half a day puzzling over how on earth this could possibly work before discovering the spooky action-at-a-distance in helpgc. Updates #22384. For #22460. Change-Id: I7394c9b4871745575f87a2d4fbbc5b8e54d669f7 Reviewed-on: https://go-review.googlesource.com/72772 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-29 17:56:21 +00:00
Russ Cox	986582126a	runtime: avoid monotonic time zero on systems with low-res timers Otherwise low-res timers cause problems at call sites that expect to be able to use 0 as meaning "no time set" and therefore expect that nanotime never returns 0 itself. For example, sched.lastpoll == 0 means no last poll. Fixes #22394. Change-Id: Iea28acfddfff6f46bc90f041ec173e0fea591285 Reviewed-on: https://go-review.googlesource.com/73410 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-10-25 17:10:20 +00:00
David du Colombier	f4faca6013	runtime: don't terminate locked OS threads on Plan 9 CL 46037 and CL 46038 implemented termination of locked OS threads when the goroutine exits. However, this behavior leads to crashes of Go programs using runtime.LockOSThread on Plan 9. This is notably the case of the os/exec and net packages. This change disables termination of locked OS threads on Plan 9. Updates #22227. Change-Id: If9fa241bff1c0b68e7e9e321e06e5203b3923212 Reviewed-on: https://go-review.googlesource.com/71230 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-10-17 15:15:12 +00:00
David du Colombier	d155b32f8d	runtime: disable use of template thread on Plan 9 CL 46033 added a "template thread" mechanism to allow creation of thread with a known-good state from a thread of unknown state. However, we are experiencing issues on Plan 9 with programs using the os/exec and net package. These package are relying on runtime.LockOSThread. Updates #22227. Change-Id: I85b71580a41df9fe8b24bd8623c064b6773288b0 Reviewed-on: https://go-review.googlesource.com/70231 Run-TryBot: David du Colombier <0intro@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-10-17 15:15:07 +00:00
David du Colombier	926373ea79	runtime: fix crash on Plan 9 Since CL 46037, the runtime is crashing after calling exitThread on Plan 9. The exitThread function shouldn't be called on Plan 9, because the system manages thread stacks. Fixes #22221. Change-Id: I5d61c9660a87dc27e4cfcb3ca3ddcb4b752f2397 Reviewed-on: https://go-review.googlesource.com/70190 Run-TryBot: David du Colombier <0intro@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-10-12 00:11:33 +00:00
Austin Clements	44d9e96da9	runtime: don't try to free OS-created signal stacks Android's libc creates a signal stack for every thread it creates. In Go, minitSignalStack picks up this existing signal stack and puts it in m.gsignal.stack. However, if we later try to exit a thread (because a locked goroutine is exiting), we'll attempt to stackfree this libc-allocated signal stack and panic. Fix this by clearing gsignal.stack when we unminitSignals in such a situation. This should fix the Android build, which is currently broken. Change-Id: Ieea8d72ef063d22741c54c9daddd8bb84926a488 Reviewed-on: https://go-review.googlesource.com/70130 Reviewed-by: David Crawshaw <crawshaw@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-10-11 22:17:30 +00:00
Austin Clements	0aef82aa4a	runtime: make (Un)LockOSThread doc more prescriptive Right now users have to infer why they would want LockOSThread and when it may or may not be appropriate to call UnlockOSThread. This requires some understanding of Go's internal thread pool implementation, which is unfortunate. Improve the situation by making the documentation on these functions more prescriptive so users can figure out when to use them even if they don't know about the scheduler. Change-Id: Ide221791e37cb5106dd8a172f89fbc5b3b98fe32 Reviewed-on: https://go-review.googlesource.com/52871 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-10-11 17:47:23 +00:00
Austin Clements	4f34a52913	runtime: terminate locked OS thread if its goroutine exits runtime.LockOSThread is sometimes used when the caller intends to put the OS thread into an unusual state. In this case, we never want to return this thread to the runtime thread pool. However, currently exiting the goroutine implicitly unlocks its OS thread. Fix this by terminating the locked OS thread when its goroutine exits, rather than simply returning it to the pool. Fixes #20395. Change-Id: I3dcec63b200957709965f7240dc216fa84b62ad9 Reviewed-on: https://go-review.googlesource.com/46038 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-10-11 17:47:21 +00:00
Austin Clements	eff2b2620d	runtime: make it possible to exit Go-created threads Currently, threads created by the runtime exist until the whole program exits. For #14592 and #20395, we want to be able to exit and clean up threads created by the runtime. This commit implements that mechanism. The main difficulty is how to clean up the g0 stack. In cgo mode and on Solaris and Windows where the OS manages thread stacks, we simply arrange to return from mstart and let the system clean up the thread. If the runtime allocated the g0 stack, then we use a new exitThread syscall wrapper that arranges to clear a flag in the M once the stack can safely be reaped and call the thread termination syscall. exitThread is based on the existing exit1 wrapper, which was always meant to terminate the calling thread. However, exit1 has never been used since it was introduced 9 years ago, so it was broken on several platforms. exitThread also has the additional complication of having to flag that the stack is unused, which requires some tricks on platforms that use the stack for syscalls. This still leaves the problem of how to reap the unused g0 stacks. For this, we move the M from allm to a new freem list as part of the M exiting. Later, allocm scans the freem list, finds Ms that are marked as done with their stack, removes these from the list and frees their g0 stacks. This also allows these Ms to be garbage collected. This CL does not yet use any of this functionality. Follow-up CLs will. Likewise, there are no new tests in this CL because we'll need follow-up functionality to test it. Change-Id: Ic851ee74227b6d39c6fc1219fc71b45d3004bc63 Reviewed-on: https://go-review.googlesource.com/46037 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-10-11 17:47:18 +00:00
Austin Clements	6c7bea6309	runtime: replace sched.mcount int32 with sched.mnext int64 Currently, since Ms never exit, the number of Ms, the number of Ms ever created, and the ID of the next M are all the same and must be small. That's about to change, so rename sched.mcount to sched.mnext to make it clear it's the number of Ms ever created (and the ID of the next M), change its type to int64, and use mcount() for the number of Ms. In the next commit, mcount() will become slightly less trivial. For #20395. Change-Id: I9af34d36bd72416b5656555d16e8085076f1b196 Reviewed-on: https://go-review.googlesource.com/68750 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-10-11 17:47:16 +00:00
Austin Clements	a212083eea	runtime: mark mstart as nowritebarrierrec mstart is the entry point for new threads, so it certainly can't interact with GC enough to have write barriers. We move the one small piece that is allowed to have write barriers out into its own function. Change-Id: Id9c31d6ffac31d0051fab7db15eb428c11cadbad Reviewed-on: https://go-review.googlesource.com/46035 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-11 17:47:13 +00:00
Austin Clements	2595fe7fb6	runtime: don't start new threads from locked threads Applications that need to manipulate kernel thread state are currently on thin ice in Go: they can use LockOSThread to prevent other goroutines from running on the manipulated thread, but Go may clone this manipulated state into a new thread that's put into the runtime's thread pool along with other threads. Fix this by never starting a new thread from a locked thread or a thread that may have been started by C. Instead, the runtime starts a "template thread" with a known-good state. If it then needs to start a new thread but doesn't know that the current thread is in a good state, it forwards the thread creation to the template thread. Fixes #20676. Change-Id: I798137a56e04b7723d55997e9c5c085d1d910643 Reviewed-on: https://go-review.googlesource.com/46033 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-10-11 17:47:08 +00:00
Daniel Martí	6f5ede8bd5	runtime: remove a few unused params and results These have never had a use - not even going back to when they were added in C. Change-Id: I143b6902b3bacb1fa83c56c9070a8adb9f61a844 Reviewed-on: https://go-review.googlesource.com/69119 Reviewed-by: Dave Cheney <dave@cheney.net> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Dave Cheney <dave@cheney.net> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-10-09 20:14:50 +00:00
Gabriel Aszalos	a04adcaf35	runtime: remove the 'go:nosplit' directive from documentation The //go:nosplit directive was visible in GoDoc because the function that it preceeded (Gosched) is exported. This change moves the directive above the documentation, hiding it from the output. Change-Id: I281fd7573f11d977487809f74c9cc16b2af0dc88 Reviewed-on: https://go-review.googlesource.com/69120 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-09 17:02:18 +00:00
Austin Clements	c85b12b579	runtime: make LockOSThread/UnlockOSThread nested Currently, there is a single bit for LockOSThread, so two calls to LockOSThread followed by one call to UnlockOSThread will unlock the thread. There's evidence (#20458) that this is almost never what people want or expect and it makes these APIs very hard to use correctly or reliably. Change this so LockOSThread/UnlockOSThread can be nested and the calling goroutine will not be unwired until UnlockOSThread has been called as many times as LockOSThread has. This should fix the vast majority of incorrect uses while having no effect on the vast majority of correct uses. Fixes #20458. Change-Id: I1464e5e9a0ea4208fbb83638ee9847f929a2bacb Reviewed-on: https://go-review.googlesource.com/45752 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-10-05 19:50:23 +00:00
Austin Clements	e900e275e8	runtime: clean up loops over allp allp now has length gomaxprocs, which means none of allp[i] are nil or in state _Pdead. This lets replace several different styles of loops over allp with normal range loops. for i := 0; i < gomaxprocs; i++ { ... } loops can simply range over allp. Likewise, range loops over allp[:gomaxprocs] can just range over allp. Loops that check for p == nil \|\| p.state == _Pdead don't need to check this any more. Loops that check for p == nil don't have to check this if dead Ps don't affect them. I checked that all such loops are, in fact, unaffected by dead Ps. One loop was potentially affected, which this fixes by zeroing p.gcAssistTime in procresize. Updates #15131. Change-Id: Ifa1c2a86ed59892eca0610360a75bb613bc6dcee Reviewed-on: https://go-review.googlesource.com/45575 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-09-27 16:29:15 +00:00
Austin Clements	ee55000f6c	runtime: eliminate GOMAXPROCS limit Now that allp is dynamically allocated, there's no need for a hard cap on GOMAXPROCS. Fixes #15131. Change-Id: I53eee8e228a711a818f7ebce8d9fd915b3865eed Reviewed-on: https://go-review.googlesource.com/45574 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-09-27 16:29:12 +00:00
Austin Clements	84d2c7ea83	runtime: dynamically allocate allp This makes it possible to eliminate the hard cap on GOMAXPROCS. Updates #15131. Change-Id: I4c422b340791621584c118a6be1b38e8a44f8b70 Reviewed-on: https://go-review.googlesource.com/45573 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-09-27 16:29:09 +00:00
Austin Clements	229aaac19e	runtime: remove getcallerpc argument Now that getcallerpc is a compiler intrinsic on x86 and non-x86 platforms don't need the argument, we can drop it. Sadly, this doesn't let us remove any dummy arguments since all of those cases also use getcallersp, which still takes the argument pointer, but this is at least an improvement. Change-Id: I9c34a41cf2c18cba57f59938390bf9491efb22d2 Reviewed-on: https://go-review.googlesource.com/65474 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-09-22 22:17:15 +00:00
Ian Lance Taylor	332719f7ce	runtime: don't call lockOSThread for every cgo call For a trivial benchmark with a do-nothing cgo call: name old time/op new time/op delta Call-4 64.5ns ± 7% 63.0ns ± 6% -2.25% (p=0.027 n=20+16) Because Windows uses the cgocall mechanism to make system calls, and passes arguments in a struct held in the m, we need to do the lockOSThread/unlockOSThread in that code. Because deferreturn was getting a nosplit stack overflow error, change it to avoid calling typedmemmove. Updates #21827. Change-Id: I9b1d61434c44faeb29805b46b409c812c9acadc2 Reviewed-on: https://go-review.googlesource.com/64070 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: David Crawshaw <crawshaw@golang.org>	2017-09-22 18:17:13 +00:00
Giovanni Bajo	e7e4a4ffa3	runtime: improve fastrand with a better generator The current generator is a simple LSFR, which showed strong correlation in higher bits, as manifested by fastrandn(). Change it with xorshift64+, which is slightly more complex, has a larger state, but has a period of 2^64-1 and is much better at statistical tests. The version used here is capable of passing Diehard and even SmallCrush. Speed is slightly worse but is probably insignificant: name old time/op new time/op delta Fastrand-4 0.77ns ±12% 0.91ns ±21% +17.31% (p=0.048 n=5+5) FastrandHashiter-4 13.6ns ±21% 15.2ns ±17% ~ (p=0.160 n=6+5) Fastrandn/2-4 2.30ns ± 5% 2.45ns ±15% ~ (p=0.222 n=5+5) Fastrandn/3-4 2.36ns ± 7% 2.45ns ± 6% ~ (p=0.222 n=5+5) Fastrandn/4-4 2.33ns ± 8% 2.61ns ±30% ~ (p=0.126 n=6+5) Fastrandn/5-4 2.33ns ± 5% 2.48ns ± 9% ~ (p=0.052 n=6+5) Fixes #21806 Change-Id: I013bb37b463fdfc229a7f324df8fe2da8d286f33 Reviewed-on: https://go-review.googlesource.com/62530 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-09-16 10:17:26 +00:00
Ian Lance Taylor	165c15afa3	runtime: change lockedg/lockedm to guintptr/muintptr This change has no real effect in itself. This is to prepare for a followup change that will call lockOSThread during a cgo callback when there is no p assigned, and therefore when lockOSThread can not use a write barrier. Change-Id: Ia122d41acf54191864bcb68f393f2ed3b2f87abc Reviewed-on: https://go-review.googlesource.com/63630 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-09-15 17:29:51 +00:00
Aliaksandr Valialkin	76f4fd8a52	runtime: improve timers scalability on multi-CPU systems Use per-P timers, so each P may work with its own timers. This CL improves performance on multi-CPU systems in the following cases: - When serving high number of concurrent connections with read/write deadlines set (for instance, highly loaded net/http server). - When using high number of concurrent timers. These timers may be implicitly created via context.WithDeadline or context.WithTimeout. Production servers should usually set timeout on connections and external requests in order to prevent from resource leakage. See https://blog.cloudflare.com/the-complete-guide-to-golang-net-http-timeouts/ Below are relevant benchmark results for various GOMAXPROCS values on linux/amd64: context package: name old time/op new time/op delta WithTimeout/concurrency=40 4.92µs ± 0% 5.17µs ± 1% +5.07% (p=0.000 n=9+9) WithTimeout/concurrency=4000 6.03µs ± 1% 6.49µs ± 0% +7.63% (p=0.000 n=8+10) WithTimeout/concurrency=400000 8.58µs ± 7% 9.02µs ± 4% +5.02% (p=0.019 n=10+10) name old time/op new time/op delta WithTimeout/concurrency=40-2 3.70µs ± 1% 2.78µs ± 4% -24.90% (p=0.000 n=8+9) WithTimeout/concurrency=4000-2 4.49µs ± 4% 3.67µs ± 5% -18.26% (p=0.000 n=10+10) WithTimeout/concurrency=400000-2 6.16µs ±10% 5.15µs ±13% -16.30% (p=0.000 n=10+10) name old time/op new time/op delta WithTimeout/concurrency=40-4 3.58µs ± 1% 2.64µs ± 2% -26.13% (p=0.000 n=9+10) WithTimeout/concurrency=4000-4 4.17µs ± 0% 3.32µs ± 1% -20.36% (p=0.000 n=10+10) WithTimeout/concurrency=400000-4 5.57µs ± 9% 4.83µs ±10% -13.27% (p=0.001 n=10+10) time package: name old time/op new time/op delta AfterFunc 6.15ms ± 3% 6.07ms ± 2% ~ (p=0.133 n=10+9) AfterFunc-2 3.43ms ± 1% 3.56ms ± 1% +3.91% (p=0.000 n=10+9) AfterFunc-4 5.04ms ± 2% 2.36ms ± 0% -53.20% (p=0.000 n=10+9) After 6.54ms ± 2% 6.49ms ± 3% ~ (p=0.393 n=10+10) After-2 3.68ms ± 1% 3.87ms ± 0% +5.14% (p=0.000 n=9+9) After-4 6.66ms ± 1% 2.87ms ± 1% -56.89% (p=0.000 n=10+10) Stop 698µs ± 2% 689µs ± 1% -1.26% (p=0.011 n=10+10) Stop-2 729µs ± 2% 434µs ± 3% -40.49% (p=0.000 n=10+10) Stop-4 837µs ± 3% 333µs ± 2% -60.20% (p=0.000 n=10+10) SimultaneousAfterFunc 694µs ± 1% 692µs ± 7% ~ (p=0.481 n=10+10) SimultaneousAfterFunc-2 714µs ± 3% 569µs ± 2% -20.33% (p=0.000 n=10+10) SimultaneousAfterFunc-4 782µs ± 2% 386µs ± 2% -50.67% (p=0.000 n=10+10) StartStop 267µs ± 3% 274µs ± 0% +2.64% (p=0.000 n=8+9) StartStop-2 238µs ± 2% 140µs ± 3% -40.95% (p=0.000 n=10+8) StartStop-4 320µs ± 1% 125µs ± 1% -61.02% (p=0.000 n=9+9) Reset 75.0µs ± 1% 77.5µs ± 2% +3.38% (p=0.000 n=10+10) Reset-2 150µs ± 2% 40µs ± 5% -73.09% (p=0.000 n=10+9) Reset-4 226µs ± 1% 33µs ± 1% -85.42% (p=0.000 n=10+10) Sleep 857µs ± 6% 878µs ± 9% ~ (p=0.079 n=10+9) Sleep-2 617µs ± 4% 585µs ± 2% -5.21% (p=0.000 n=10+10) Sleep-4 689µs ± 3% 465µs ± 4% -32.53% (p=0.000 n=10+10) Ticker 55.9ms ± 2% 55.9ms ± 2% ~ (p=0.971 n=10+10) Ticker-2 28.7ms ± 2% 28.1ms ± 1% -2.06% (p=0.000 n=10+10) Ticker-4 14.6ms ± 0% 13.6ms ± 1% -6.80% (p=0.000 n=9+10) Fixes #15133 Change-Id: I6f4b09d2db8c5bec93146db6501b44dbfe5c0ac4 Reviewed-on: https://go-review.googlesource.com/34784 Reviewed-by: Austin Clements <austin@google.com>	2017-09-12 16:52:23 +00:00
Austin Clements	b0392159f6	runtime,cmd/trace: trace GC STW events Right now we only kind of sort of trace GC STW events. We emit events around mark termination, but those start well after stopping the world and end before starting it again, and we don't emit any events for sweep termination. Fix this by generalizing EvGCScanStart/EvGCScanDone. These were already re-purposed to indicate mark termination (despite the names). This commit renames them to EvGCSTWStart/EvGCSTWDone, adds an argument to indicate the STW reason, and shuffles the runtime to generate them right before stopping the world and right after starting the world, respectively. These events will make it possible to generate precise minimum mutator utilization (MMU) graphs and could be useful in detecting non-preemptible goroutines (e.g., #20792). Change-Id: If95783f370781d8ef66addd94886028103a7c26f Reviewed-on: https://go-review.googlesource.com/55411 Reviewed-by: Rick Hudson <rlh@golang.org>	2017-08-29 21:54:55 +00:00
Daniel Martí	fbc8973a6b	all: join some chained ifs to unindent code Found with mvdan.cc/unindent. It skipped the cases where parentheses would need to be added, where comments would have to be moved elsewhere, or where actions and simple logic would mix. One of them was of the form "err != nil && err == io.EOF", so the first part was removed. Change-Id: Ie504c2b03a2c87d10ecbca1b9270069be1171b91 Reviewed-on: https://go-review.googlesource.com/57690 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-08-29 20:57:41 +00:00
Austin Clements	9d17e175e0	runtime: capture runtimeInitTime after nanotime is initialized CL 36428 changed the way nanotime works so on Darwin and Windows it now depends on runtime.startNano, which is computed at runtime.init time. Unfortunately, the `runtimeInitTime = nanotime()` initialization happened before runtime.init, so on these platforms runtimeInitTime is set incorrectly. The one (and only) consequence of this is that the start time printed in gctrace lines is bogus: gc 1 18446653480.186s 0%: 0.092+0.47+0.038 ms clock, 0.37+0.15/0.81/1.8+0.15 ms cpu, 4->4->1 MB, 5 MB goal, 8 P To fix this, this commit moves the runtimeInitTime initialization to shortly after runtime.init, at which point nanotime is safe to use. This also requires changing the condition in newproc1 that currently uses runtimeInitTime != 0 simply to detect whether or not the main M has started. Since runtimeInitTime could genuinely be 0 now, this introduces a separate flag to newproc1. Fixes #21554. Change-Id: Id874a4b912d3fa3d22f58d01b31ffb3548266d3b Reviewed-on: https://go-review.googlesource.com/58690 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-08-25 16:02:17 +00:00
Daniel Morsing	32b94f13cf	runtime: move selectdone into g Writing to selectdone on the stack of another goroutine meant a pretty subtle dance between the select code and the stack copying code. Instead move the selectdone variable into the g struct. Change-Id: Id246aaf18077c625adef7ca2d62794afef1bdd1b Reviewed-on: https://go-review.googlesource.com/53390 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-08-15 19:18:00 +00:00
Austin Clements	250a9610a4	runtime: make STW duration more accurate Currently, GC captures the start-the-world time stamp after startTheWorldWithSema returns. This is problematic for two reasons: 1. It's possible to get preempted between startTheWorldWithSema starting the world and calling nanotime. 2. startTheWorldWithSema does several clean-up tasks after the world is up and running that on rare occasions can take upwards of 10ms. Since the runtime uses the start-the-world time stamp to compute the STW duration, both of these can significantly inflate the reported STW duration. Fix this by having startTheWorldWithSema itself call nanotime once the world is started. Change-Id: I114630234fb73c9dabae50a2ef1884661f2459db Reviewed-on: https://go-review.googlesource.com/55410 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-08-15 18:47:08 +00:00
Vladimir Stefanovic	835dfef939	runtime/pprof: prevent a deadlock that SIGPROF might create on mips{,le} 64bit atomics on mips/mipsle are implemented using spinlocks. If SIGPROF is received while the program is in the critical section, it will try to write the sample using the same spinlock, creating a deadloop. Prevent it by creating a counter of SIGPROFs during atomic64 and postpone writing the sample(s) until called from elsewhere, with pc set to _LostSIGPROFDuringAtomic64. Added a test case, per Cherry's suggestion. Works around #20146. Change-Id: Icff504180bae4ee83d78b19c0d9d6a80097087f9 Reviewed-on: https://go-review.googlesource.com/42652 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-07-26 13:29:59 +00:00
Ian Lance Taylor	28f650a2f7	runtime: don't call libc sigaction function in forked child If we are using vfork, and if something (such as TSAN) is intercepting the sigaction function, then we must call the system call, not the libc function. Otherwise the intercepted sigaction call in the child may trash the data structures in the parent. Change-Id: Id9588bfeaa934f32c920bf829c5839be5cacf243 Reviewed-on: https://go-review.googlesource.com/50251 Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> Reviewed-by: Austin Clements <austin@google.com>	2017-07-20 18:02:47 +00:00
Austin Clements	093adeef40	runtime: use next timer to decide whether to relax Currently, sysmon waits 60 ms during idle before relaxing. This is primarily to avoid reducing the precision of short-duration timers. Of course, if there are no short-duration timers, this wastes 60 ms running the timer at high resolution. Improve this by instead inspecting the time until the next timer fires and relaxing the timer resolution immediately if the next timer won't fire for a while. Updates #20937. Change-Id: If4ad0a565b65a9b3e8c4cdc2eff1486968c79f24 Reviewed-on: https://go-review.googlesource.com/47833 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-07-07 21:22:31 +00:00
Austin Clements	7a8f39fa14	runtime: delay before osRelaxing Currently, sysmon relaxes the Windows timer resolution as soon as the Go process becomes idle. However, if it's going idle because of a short sleep (< 15.6 ms), this can turn that short sleep into a long sleep (15.6 ms). To address this, wait for 60 ms of idleness before relaxing the timer resolution. It would be better to check the time until the next wakeup and relax immediately if it makes sense, but there's currently no interaction between sysmon and the timer subsystem, so adding this simple delay is a much simpler and safer change for late in the release cycle. Fixes #20937. Change-Id: I817db24c3bdfa06dba04b7bc197cfd554363c379 Reviewed-on: https://go-review.googlesource.com/47832 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-07-07 21:02:40 +00:00
Austin Clements	9745e88b22	runtime: use rwmutex for execLock Currently the execLock is a mutex, which has the unfortunate side-effect of serializing all thread creation. This replaces it with an rwmutex so threads can be created in parallel, but exec still blocks thread creation. Fixes #20738. Change-Id: Ia8f30a92053c3d28af460b0da71176abe5fd074b Reviewed-on: https://go-review.googlesource.com/47072 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-06-28 22:08:59 +00:00
John R. Lenton	91139b87f7	runtime, syscall: workaround for bug in Linux's execve Linux's execve has (at the time of writing, and since v2.6.30) a bug when it ran concurrently with clone, in that it would fail to set up some datastructures if the thread count before and after some steps differed. This is described better and in more detail by Colin King in Launchpad¹ and kernel² bugs. When a program written in Go runtime.Exec's a setuid binary, this issue may cause the resulting process to not have the expected uid. This patch works around the issue by using a mutex to serialize exec and clone. 1. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672819 2. https://bugzilla.kernel.org/show_bug.cgi?id=195453 Fixes #19546 Change-Id: I126e87d1d9ce3be5ea4ec9c7ffe13f92e087903d Reviewed-on: https://go-review.googlesource.com/43713 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-06-20 23:53:17 +00:00
Ian Lance Taylor	df0892cbf8	runtime, syscall: reset signal handlers to default in child Block all signals during a fork. In the parent process, after the fork, restore the signal mask. In the child process, reset all currently handled signals to the default handler, and then restore the signal mask. The effect of this is that the child will be operating using the same signal regime as the program it is about to exec, as exec resets all non-ignored signals to the default, and preserves the signal mask. We do this so that in the case of a signal sent to the process group, the child process will not try to run a signal handler while in the precarious state after a fork. Fixes #18600. Change-Id: I9f39aaa3884035908d687ee323c975f349d5faaa Reviewed-on: https://go-review.googlesource.com/45471 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-06-14 14:00:56 +00:00
Austin Clements	f4f018518d	runtime: move pdesc into p There are currently two arrays indexed by P ID: allp and pdesc. Consolidate these by moving the pdesc fields into type p so they can be indexed off allp along with all other per-P state. For #15131. Change-Id: Ib6c4e6e7612281a1171ba4a0d62e52fd59e960b4 Reviewed-on: https://go-review.googlesource.com/45572 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-06-14 00:43:30 +00:00
Austin Clements	200d0cc192	runtime: clean up some silly allp loops Back in the day, allp was just a pointer to an array. As a result, the runtime has a few loops of the form: for i := 0; ; i++ { p := allp[i] if p == nil { break } ... } This is silly now because it requires that allp be one longer than the maximum possible number of Ps, but now that allp is in Go it has a length. Replace these with range loops. Change-Id: I91ef4bc7bd3c9d4fda2264f4aa1b1d0271d7f578 Reviewed-on: https://go-review.googlesource.com/45571 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-06-13 18:57:48 +00:00
Austin Clements	4e7067cde4	runtime: mark extra M's G as dead when not in use Currently the extra Ms created for cgo callbacks have a corresponding G that's kept in syscall state with only a call to goexit on its stack. This leads to confusing output from runtime.NumGoroutines and in tracebacks: goroutine 17 [syscall, locked to thread]: runtime.goexit() .../src/runtime/asm_amd64.s:2197 +0x1 Fix this by putting this goroutine into state _Gdead when it's not in use instead of _Gsyscall. To keep the goroutine counts correct, we also add one to sched.ngsys while the goroutine is in _Gdead. The effect of this is as if the goroutine simply doesn't exist when it's not in use. Fixes #16631. Fixes #16714. Change-Id: Ieae08a2febd4b3d00bef5c23fd6ca88fb2bb0087 Reviewed-on: https://go-review.googlesource.com/45030 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-06-07 02:13:51 +00:00
Ian Lance Taylor	2d86f49428	runtime: delay exiting while panic is running deferred functions Try to avoid a race between the main goroutine exiting and a panic occurring. Don't try too hard, to avoid hanging. Updates #3934 Fixes #20018 Change-Id: I57a02b6d795d2a61f1cadd137ce097145280ece7 Reviewed-on: https://go-review.googlesource.com/41052 Reviewed-by: Austin Clements <austin@google.com>	2017-06-05 22:42:48 +00:00
Austin Clements	11eaf42886	runtime: reduce Windows timer resolution when idle Currently Go sets the system-wide timer resolution to 1ms the whole time it's running. This has negative affects on system performance and power consumption. Unfortunately, simply reducing the timer resolution to the default 15ms interferes with several sleeps in the runtime itself, including sysmon's ability to interrupt goroutines. This commit takes a hybrid approach: it only reduces the timer resolution when the Go process is entirely idle. When the process is idle, nothing needs a high resolution timer. When the process is non-idle, it's already consuming CPU so it doesn't really matter if the OS also takes timer interrupts more frequently. Updates #8687. Change-Id: I0652564b4a36d61a80e045040094a39c19da3b06 Reviewed-on: https://go-review.googlesource.com/38403 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Alex Brainman <alex.brainman@gmail.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2017-04-29 04:15:49 +00:00
Austin Clements	29be3f1999	runtime: generalize GC trigger Currently the GC triggering condition is an awkward combination of the gcMode (whether or not it's gcBackgroundMode) and a boolean "forceTrigger" flag. Replace this with a new gcTrigger type that represents the range of transition predicates we need. This has several advantages: 1. We can remove the awkward logic that affects the trigger behavior based on the gcMode. Now gcMode purely controls whether to run a STW GC or not and the gcTrigger controls whether this is a forced GC that cannot be consolidated with other GC cycles. 2. We can lift the time-based triggering logic in sysmon to just another type of GC trigger and move the logic to the trigger test. 3. This sets us up to have a cycle count-based trigger, which we'll use to make runtime.GC trigger concurrent GC with the desired consolidation properties. For #18216. Change-Id: If9cd49349579a548800f5022ae47b8128004bbfc Reviewed-on: https://go-review.googlesource.com/37516 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-31 01:15:06 +00:00
Austin Clements	640cd3b322	runtime: check transition condition before triggering periodic GC Currently sysmon triggers periodic GC if GC is not currently running and it's been long enough since the last GC. This misses some important conditions; for example, whether GC is enabled at all by GOGC. As a result, if GOGC is off, once we pass the timeout for periodic GC, sysmon will attempt to trigger a GC every 10ms. This GC will be a no-op because gcStart will check all of the appropriate conditions and do nothing, but it still goes through the motions of waking the forcegc goroutine and printing a gctrace line. Fix this by making sysmon call gcShouldStart to check all of the appropriate transition conditions before attempting to trigger a periodic GC. Fixes #19247. Change-Id: Icee5521ce175e8419f934723849853d53773af31 Reviewed-on: https://go-review.googlesource.com/37515 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-31 01:15:03 +00:00
Cherry Zhang	3a1ce1085a	runtime: access _cgo_yield indirectly The darwin linker for ARM does not allow PC-relative relocation of external symbol in text section. Work around it by accessing it indirectly: putting its address in a global variable (which is not external), and accessing through that variable. Fixes #19684. Change-Id: I41361bbb281b5dbdda0d100ae49d32c69ed85a81 Reviewed-on: https://go-review.googlesource.com/38596 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Elias Naur <elias.naur@gmail.com>	2017-03-24 15:37:56 +00:00
Daniel Martí	77b09b8b8d	runtime: remove unused g parameter Found by github.com/mvdan/unparam. Change-Id: I20145440ff1bcd27fcf15a740354c52f313e536c Reviewed-on: https://go-review.googlesource.com/37894 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-03-16 14:03:45 +00:00
Austin Clements	f2e87158f0	runtime: make GOTRACEBACK=crash crash promptly in cgo binaries GOTRACEBACK=crash works by bouncing a SIGQUIT around the process sched.mcount times. However, sched.mcount includes the extra Ms allocated by oneNewExtraM for cgo callbacks. Hence, if there are any extra Ms that don't have real OS threads, we'll try to send SIGQUIT more times than there are threads to catch it. Since nothing will catch these extra signals, we'll fall back to blocking for five seconds before aborting the process. Avoid this five second delay by subtracting out the number of extra Ms when sending SIGQUITs. Of course, in a cgo binary, it's still possible for the SIGQUIT to go to a cgo thread and cause some other failure mode. This does not fix that. Change-Id: I4fbf3c52dd721812796c4c1dcb2ab4cb7026d965 Reviewed-on: https://go-review.googlesource.com/38182 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-03-15 22:16:10 +00:00
Bryan C. Mills	e57350f4c0	runtime: fix _cgo_yield usage with sysmon and on BSD There are a few problems from change 35494, discovered during testing of change 37852. 1. I was confused about the usage of n.key in the sema variant, so we were looping on the wrong condition. The error was not caught by the TryBots (presumably due to missing TSAN coverage in the BSD and darwin builders?). 2. The sysmon goroutine sometimes skips notetsleep entirely, using direct usleep syscalls instead. In that case, we were not calling _cgo_yield, leading to missed signals under TSAN. 3. Some notetsleep calls have long finite timeouts. They should be broken up into smaller chunks with a yield at the end of each chunk. updates #18717 Change-Id: I91175af5dea3857deebc686f51a8a40f9d690bcc Reviewed-on: https://go-review.googlesource.com/37867 Run-TryBot: Bryan Mills <bcmills@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-03-09 18:36:49 +00:00
Bryan C. Mills	29edf0f9fe	runtime: poll libc to deliver signals under TSAN fixes #18717 Change-Id: I7244463d2e7489e0b0fe3b74c4b782e71210beb2 Reviewed-on: https://go-review.googlesource.com/35494 Run-TryBot: Bryan Mills <bcmills@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-03-08 18:58:30 +00:00
David Chase	d71f36b5aa	cmd/compile: check loop rescheduling with stack bound, not counter After benchmarking with a compiler modified to have better spill location, it became clear that this method of checking was actually faster on (at least) two different architectures (ppc64 and amd64) and it also provides more timely interruption of loops. This change adds a modified FOR loop node "FORUNTIL" that checks after executing the loop body instead of before (i.e., always at least once). This ensures that a pointer past the end of a slice or array is not made visible to the garbage collector. Without the rescheduling checks inserted, the restructured loop from this change apparently provides a 1% geomean improvement on PPC64 running the go1 benchmarks; the improvement on AMD64 is only 0.12%. Inserting the rescheduling check exposed some peculiar bug with the ssa test code for s390x; this was updated based on initial code actually generated for GOARCH=s390x to use appropriate OpArg, OpAddr, and OpVarDef. NaCl is disabled in testing. Change-Id: Ieafaa9a61d2a583ad00968110ef3e7a441abca50 Reviewed-on: https://go-review.googlesource.com/36206 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-03-08 18:52:12 +00:00
Austin Clements	0efc8b2188	runtime: avoid repeated findmoduledatap calls Currently almost every function that deals with a _func has to first look up the moduledata for the module containing the function's entry point. This means we almost always do at least two identical module lookups whenever we deal with a _func (one to get the _func and another to get something from its module data) and sometimes several more. Fix this by making findfunc return a new funcInfo type that embeds _func, but also includes the moduledata, and making all of the functions that currently take a _func instead take a funcInfo and use the already-found moduledata. This transformation is trivial for the most part, since the *_func type is usually inferred. The annoying part is that we can no longer use nil to indicate failure, so this introduces a funcInfo.valid() method and replaces nil checks with calls to valid. Change-Id: I9b8075ef1c31185c1943596d96dec45c7ab5100f Reviewed-on: https://go-review.googlesource.com/37331 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>	2017-03-06 19:17:24 +00:00
Russ Cox	1a680a902a	runtime/pprof: use new profile buffers for CPU profiling This doesn't change the functionality of the current code, but it sets us up for exporting the profiling labels into the profile. The old code had a hash table of profile samples maintained during the signal handler, with evictions going into a log. The new code just logs every sample directly, leaving the hash-based deduplication to an ordinary goroutine. The new code also avoids storing the entire profile in two forms in memory, an unfortunate regression introduced when binary profile support was added. After this CL the entire profile is only stored once in memory. We'd still like to get back down to storing it zero times (streaming it to the underlying io.Writer). Change-Id: I0893a1788267c564aa1af17970d47377b2a43457 Reviewed-on: https://go-review.googlesource.com/36712 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>	2017-02-24 17:01:47 +00:00
Russ Cox	a1261b8b0a	runtime: do not allocate on every time.Sleep It's common for some goroutines to loop calling time.Sleep. Allocate once per goroutine, not every time. This comes up in runtime/pprof's background reader. Change-Id: I89d17dc7379dca266d2c9cd3aefc2382f5bdbade Reviewed-on: https://go-review.googlesource.com/37162 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-02-24 15:34:01 +00:00
Dmitry Vyukov	0556e26273	sync: make Mutex more fair Add new starvation mode for Mutex. In starvation mode ownership is directly handed off from unlocking goroutine to the next waiter. New arriving goroutines don't compete for ownership. Unfair wait time is now limited to 1ms. Also fix a long standing bug that goroutines were requeued at the tail of the wait queue. That lead to even more unfair acquisition times with multiple waiters. Performance of normal mode is not considerably affected. Fixes #13086 On the provided in the issue lockskew program: done in 1.207853ms done in 1.177451ms done in 1.184168ms done in 1.198633ms done in 1.185797ms done in 1.182502ms done in 1.316485ms done in 1.211611ms done in 1.182418ms name old time/op new time/op delta MutexUncontended-48 0.65ns ± 0% 0.65ns ± 1% ~ (p=0.087 n=10+10) Mutex-48 112ns ± 1% 114ns ± 1% +1.69% (p=0.000 n=10+10) MutexSlack-48 113ns ± 0% 87ns ± 1% -22.65% (p=0.000 n=8+10) MutexWork-48 149ns ± 0% 145ns ± 0% -2.48% (p=0.000 n=9+10) MutexWorkSlack-48 149ns ± 0% 122ns ± 3% -18.26% (p=0.000 n=6+10) MutexNoSpin-48 103ns ± 4% 105ns ± 3% ~ (p=0.089 n=10+10) MutexSpin-48 490ns ± 4% 515ns ± 6% +5.08% (p=0.006 n=10+10) Cond32-48 13.4µs ± 6% 13.1µs ± 5% -2.75% (p=0.023 n=10+10) RWMutexWrite100-48 53.2ns ± 3% 41.2ns ± 3% -22.57% (p=0.000 n=10+10) RWMutexWrite10-48 45.9ns ± 2% 43.9ns ± 2% -4.38% (p=0.000 n=10+10) RWMutexWorkWrite100-48 122ns ± 2% 134ns ± 1% +9.92% (p=0.000 n=10+10) RWMutexWorkWrite10-48 206ns ± 1% 188ns ± 1% -8.52% (p=0.000 n=8+10) Cond32-24 12.1µs ± 3% 12.4µs ± 3% +1.98% (p=0.043 n=10+9) MutexUncontended-24 0.74ns ± 1% 0.75ns ± 1% ~ (p=0.650 n=10+10) Mutex-24 122ns ± 2% 124ns ± 1% +1.31% (p=0.007 n=10+10) MutexSlack-24 96.9ns ± 2% 102.8ns ± 2% +6.11% (p=0.000 n=10+10) MutexWork-24 146ns ± 1% 135ns ± 2% -7.70% (p=0.000 n=10+9) MutexWorkSlack-24 135ns ± 1% 128ns ± 2% -5.01% (p=0.000 n=10+9) MutexNoSpin-24 114ns ± 3% 110ns ± 4% -3.84% (p=0.000 n=10+10) MutexSpin-24 482ns ± 4% 475ns ± 8% ~ (p=0.286 n=10+10) RWMutexWrite100-24 43.0ns ± 3% 43.1ns ± 2% ~ (p=0.956 n=10+10) RWMutexWrite10-24 43.4ns ± 1% 43.2ns ± 1% ~ (p=0.085 n=10+9) RWMutexWorkWrite100-24 130ns ± 3% 131ns ± 3% ~ (p=0.747 n=10+10) RWMutexWorkWrite10-24 191ns ± 1% 192ns ± 1% ~ (p=0.210 n=10+10) Cond32-12 11.5µs ± 2% 11.7µs ± 2% +1.98% (p=0.002 n=10+10) MutexUncontended-12 1.48ns ± 0% 1.50ns ± 1% +1.08% (p=0.004 n=10+10) Mutex-12 141ns ± 1% 143ns ± 1% +1.63% (p=0.000 n=10+10) MutexSlack-12 121ns ± 0% 119ns ± 0% -1.65% (p=0.001 n=8+9) MutexWork-12 141ns ± 2% 150ns ± 3% +6.36% (p=0.000 n=9+10) MutexWorkSlack-12 131ns ± 0% 138ns ± 0% +5.73% (p=0.000 n=9+10) MutexNoSpin-12 87.0ns ± 1% 83.7ns ± 1% -3.80% (p=0.000 n=10+10) MutexSpin-12 364ns ± 1% 377ns ± 1% +3.77% (p=0.000 n=10+10) RWMutexWrite100-12 42.8ns ± 1% 43.9ns ± 1% +2.41% (p=0.000 n=8+10) RWMutexWrite10-12 39.8ns ± 4% 39.3ns ± 1% ~ (p=0.433 n=10+9) RWMutexWorkWrite100-12 131ns ± 1% 131ns ± 0% ~ (p=0.591 n=10+9) RWMutexWorkWrite10-12 173ns ± 1% 174ns ± 0% ~ (p=0.059 n=10+8) Cond32-6 10.9µs ± 2% 10.9µs ± 2% ~ (p=0.739 n=10+10) MutexUncontended-6 2.97ns ± 0% 2.97ns ± 0% ~ (all samples are equal) Mutex-6 122ns ± 6% 122ns ± 2% ~ (p=0.668 n=10+10) MutexSlack-6 149ns ± 3% 142ns ± 3% -4.63% (p=0.000 n=10+10) MutexWork-6 136ns ± 3% 140ns ± 5% ~ (p=0.077 n=10+10) MutexWorkSlack-6 152ns ± 0% 138ns ± 2% -9.21% (p=0.000 n=6+10) MutexNoSpin-6 150ns ± 1% 152ns ± 0% +1.50% (p=0.000 n=8+10) MutexSpin-6 726ns ± 0% 730ns ± 1% ~ (p=0.069 n=10+10) RWMutexWrite100-6 40.6ns ± 1% 40.9ns ± 1% +0.91% (p=0.001 n=8+10) RWMutexWrite10-6 37.1ns ± 0% 37.0ns ± 1% ~ (p=0.386 n=9+10) RWMutexWorkWrite100-6 133ns ± 1% 134ns ± 1% +1.01% (p=0.005 n=9+10) RWMutexWorkWrite10-6 152ns ± 0% 152ns ± 0% ~ (all samples are equal) Cond32-2 7.86µs ± 2% 7.95µs ± 2% +1.10% (p=0.023 n=10+10) MutexUncontended-2 8.10ns ± 0% 9.11ns ± 4% +12.44% (p=0.000 n=9+10) Mutex-2 32.9ns ± 9% 38.4ns ± 6% +16.58% (p=0.000 n=10+10) MutexSlack-2 93.4ns ± 1% 98.5ns ± 2% +5.39% (p=0.000 n=10+9) MutexWork-2 40.8ns ± 3% 43.8ns ± 7% +7.38% (p=0.000 n=10+9) MutexWorkSlack-2 98.6ns ± 5% 108.2ns ± 2% +9.80% (p=0.000 n=10+8) MutexNoSpin-2 399ns ± 1% 398ns ± 2% ~ (p=0.463 n=8+9) MutexSpin-2 1.99µs ± 3% 1.97µs ± 1% -0.81% (p=0.003 n=9+8) RWMutexWrite100-2 37.6ns ± 5% 46.0ns ± 4% +22.17% (p=0.000 n=10+8) RWMutexWrite10-2 50.1ns ± 6% 36.8ns ±12% -26.46% (p=0.000 n=9+10) RWMutexWorkWrite100-2 136ns ± 0% 134ns ± 2% -1.80% (p=0.001 n=7+9) RWMutexWorkWrite10-2 140ns ± 1% 138ns ± 1% -1.50% (p=0.000 n=10+10) Cond32 5.93µs ± 1% 5.91µs ± 0% ~ (p=0.411 n=9+10) MutexUncontended 15.9ns ± 0% 15.8ns ± 0% -0.63% (p=0.000 n=8+8) Mutex 15.9ns ± 0% 15.8ns ± 0% -0.44% (p=0.003 n=10+10) MutexSlack 26.9ns ± 3% 26.7ns ± 2% ~ (p=0.084 n=10+10) MutexWork 47.8ns ± 0% 47.9ns ± 0% +0.21% (p=0.014 n=9+8) MutexWorkSlack 54.9ns ± 3% 54.5ns ± 3% ~ (p=0.254 n=10+10) MutexNoSpin 786ns ± 2% 765ns ± 1% -2.66% (p=0.000 n=10+10) MutexSpin 3.87µs ± 1% 3.83µs ± 0% -0.85% (p=0.005 n=9+8) RWMutexWrite100 21.2ns ± 2% 21.0ns ± 1% -0.88% (p=0.018 n=10+9) RWMutexWrite10 22.6ns ± 1% 22.6ns ± 0% ~ (p=0.471 n=9+9) RWMutexWorkWrite100 132ns ± 0% 132ns ± 0% ~ (all samples are equal) RWMutexWorkWrite10 124ns ± 0% 123ns ± 0% ~ (p=0.656 n=10+10) Change-Id: I66412a3a0980df1233ad7a5a0cd9723b4274528b Reviewed-on: https://go-review.googlesource.com/34310 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-02-17 17:24:59 +00:00
Ian Lance Taylor	c05b06a12d	os: use poller for file I/O This changes the os package to use the runtime poller for file I/O where possible. When a system call blocks on a pollable descriptor, the goroutine will be blocked on the poller but the thread will be released to run other goroutines. When using a non-pollable descriptor, the os package will continue to use thread-blocking system calls as before. For example, on GNU/Linux, the runtime poller uses epoll. epoll does not support ordinary disk files, so they will continue to use blocking I/O as before. The poller will be used for pipes. Since this means that the poller is used for many more programs, this modifies the runtime to only block waiting for the poller if there is some goroutine that is waiting on the poller. Otherwise, there is no point, as the poller will never make any goroutine ready. This preserves the runtime's current simple deadlock detection. This seems to crash FreeBSD systems, so it is disabled on FreeBSD. This is issue 19093. Using the poller on Windows requires opening the file with FILE_FLAG_OVERLAPPED. We should only do that if we can remove that flag if the program calls the Fd method. This is issue 19098. Update #6817. Update #7903. Update #15021. Update #18507. Update #19093. Update #19098. Change-Id: Ia5197dcefa7c6fbcca97d19a6f8621b2abcbb1fe Reviewed-on: https://go-review.googlesource.com/36800 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-02-15 19:31:55 +00:00
Austin Clements	0993b2fd06	runtime: remove g.stackAlloc Since we're no longer stealing space for the stack barrier array from the stack allocation, the stack allocation is simply g.stack.hi-g.stack.lo. Updates #17503. Change-Id: Id9b450ae12c3df9ec59cfc4365481a0a16b7c601 Reviewed-on: https://go-review.googlesource.com/36621 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-02-14 15:52:56 +00:00
Austin Clements	d089a6c718	runtime: remove stack barriers Now that we don't rescan stacks, stack barriers are unnecessary. This removes all of the code and structures supporting them as well as tests that were specifically for stack barriers. Updates #17503. Change-Id: Ia29221730e0f2bbe7beab4fa757f31a032d9690c Reviewed-on: https://go-review.googlesource.com/36620 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-14 15:52:54 +00:00
Austin Clements	c5ebcd2c8a	runtime: remove rescan list With the hybrid barrier, rescanning stacks is no longer necessary so the rescan list is no longer necessary. Remove it. This leaves the gcrescanstacks GODEBUG variable, since it's useful for debugging, but changes it to simply walk all of the Gs to rescan stacks rather than using the rescan list. We could also remove g.gcscanvalid, which is effectively a distributed rescan list. However, it's still useful for gcrescanstacks mode and it adds little complexity, so we'll leave it in. Fixes #17099. Updates #17503. Change-Id: I776d43f0729567335ef1bfd145b75c74de2cc7a9 Reviewed-on: https://go-review.googlesource.com/36619 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-02-14 15:52:51 +00:00
Josh Bleecher Snyder	46a75870ad	runtime: speed up fastrand() % n This occurs a fair amount in the runtime for non-power-of-two n. Use an alternative, faster formulation. name old time/op new time/op delta Fastrandn/2-8 4.45ns ± 2% 2.09ns ± 3% -53.12% (p=0.000 n=14+14) Fastrandn/3-8 4.78ns ±11% 2.06ns ± 2% -56.94% (p=0.000 n=15+15) Fastrandn/4-8 4.76ns ± 9% 1.99ns ± 3% -58.28% (p=0.000 n=15+13) Fastrandn/5-8 4.96ns ±13% 2.03ns ± 6% -59.14% (p=0.000 n=15+15) name old time/op new time/op delta SelectUncontended-8 33.7ns ± 2% 33.9ns ± 2% +0.70% (p=0.000 n=49+50) SelectSyncContended-8 1.68µs ± 4% 1.65µs ± 4% -1.54% (p=0.000 n=50+45) SelectAsyncContended-8 282ns ± 1% 277ns ± 1% -1.50% (p=0.000 n=48+43) SelectNonblock-8 5.31ns ± 1% 5.32ns ± 1% ~ (p=0.275 n=45+44) SelectProdCons-8 585ns ± 3% 577ns ± 2% -1.35% (p=0.000 n=50+50) GoroutineSelect-8 1.59ms ± 2% 1.59ms ± 1% ~ (p=0.084 n=49+48) Updates #16213 Change-Id: Ib555a4d7da2042a25c3976f76a436b536487d5b7 Reviewed-on: https://go-review.googlesource.com/36932 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-14 00:01:22 +00:00
Ian Lance Taylor	e24228af25	runtime: enable/disable SIGPROF if needed when profiling This ensures that SIGPROF is handled correctly when using runtime/pprof in a c-archive or c-shared library. Separate profiler handling into pre-process changes and per-thread changes. Simplify the Windows code slightly accordingly. Fixes #18220. Change-Id: I5060f7084c91ef0bbe797848978bdc527c312777 Reviewed-on: https://go-review.googlesource.com/34018 TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com>	2017-02-09 18:53:34 +00:00
Russ Cox	e4371fb179	time: optimize Now on darwin, windows Fetch both monotonic and wall time together when possible. Avoids skew and is cheaper. Also shave a few ns off in conversion in package time. Compared to current implementation (after monotonic changes): name old time/op new time/op delta Now 19.6ns ± 1% 9.7ns ± 1% -50.63% (p=0.000 n=41+49) darwin/amd64 Now 23.5ns ± 4% 10.6ns ± 5% -54.61% (p=0.000 n=30+28) windows/amd64 Now 54.5ns ± 5% 29.8ns ± 9% -45.40% (p=0.000 n=27+29) windows/386 More importantly, compared to Go 1.8: name old time/op new time/op delta Now 9.5ns ± 1% 9.7ns ± 1% +1.94% (p=0.000 n=41+49) darwin/amd64 Now 12.9ns ± 5% 10.6ns ± 5% -17.73% (p=0.000 n=30+28) windows/amd64 Now 15.3ns ± 5% 29.8ns ± 9% +94.36% (p=0.000 n=30+29) windows/386 This brings time.Now back in line with Go 1.8 on darwin/amd64 and windows/amd64. It's not obvious why windows/386 is still noticeably worse than Go 1.8, but it's better than before this CL. The windows/386 speed is not too important; the changes just keep the two architectures similar. Change-Id: If69b94970c8a1a57910a371ee91e0d4e82e46c5d Reviewed-on: https://go-review.googlesource.com/36428 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-02-09 14:45:16 +00:00
Michael Matloob	62956897c1	runtime: add definitions for SetGoroutineLabels and Do This change defines runtime/pprof.SetGoroutineLabels and runtime/pprof.Do, which are used to set profiler labels on goroutines. The change defines functions in the runtime for setting and getting profile labels, and sets and unsets profile labels when goroutines are created and deleted. The change also adds the package runtime/internal/proflabel, which defines the structure the runtime uses to store profile labels. Change-Id: I747a4400141f89b6e8160dab6aa94ca9f0d4c94d Reviewed-on: https://go-review.googlesource.com/34198 Run-TryBot: Michael Matloob <matloob@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-on: https://go-review.googlesource.com/35010	2017-02-06 20:29:37 +00:00
David Chase	7f1ff65c39	cmd/compile: insert scheduling checks on loop backedges Loop breaking with a counter. Benchmarked (see comments), eyeball checked for sanity on popular loops. This code ought to handle loops in general, and properly inserts phi functions in cases where the earlier version might not have. Includes test, plus modifications to test/run.go to deal with timeout and killing looping test. Tests broken by the addition of extra code (branch frequency and live vars) for added checks turn the check insertion off. If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule checks on every backedge of every reducible loop. Alternately, specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will enable it for a single compilation, but because the core Go libraries contain some loops that may run long, this is less likely to have the desired effect. This is intended as a tool to help in the study and diagnosis of GC and other latency problems, now that goal STW GC latency is on the order of 100 microseconds or less. Updates #17831. Updates #10958. Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639 Reviewed-on: https://go-review.googlesource.com/33910 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-01-09 21:01:29 +00:00
Austin Clements	0c942e8f2c	runtime: avoid incorrect panic when a signal arrives during STW Stop-the-world and freeze-the-world (used for unhandled panics) are currently not safe to do at the same time. While a regular unhandled panic can't happen concurrently with STW (if the P hasn't been stopped, then the panic blocks the STW), a panic from a _SigThrow signal can happen on an already-stopped P, racing with STW. When this happens, freezetheworld sets sched.stopwait to 0x7fffffff and stopTheWorldWithSema panics because sched.stopwait != 0. Fix this by detecting when freeze-the-world happens before stop-the-world has completely stopped the world and freeze the STW operation rather than panicking. Fixes #17442. Change-Id: I646a7341221dd6d33ea21d818c2f7218e2cb7e20 Reviewed-on: https://go-review.googlesource.com/34611 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-12-20 17:27:47 +00:00
Austin Clements	0bae74e8c9	runtime: wake idle Ps when enqueuing GC work If the scheduler has no user work and there's no GC work visible, it puts the P to sleep (or blocks on the network). However, if we later enqueue more GC work, there's currently nothing that specifically wakes up the scheduler to let it start an idle GC worker. As a result, we can underutilize the CPU during GC if Ps have been put to sleep. Fix this by making GC wake idle Ps when work buffers are put on the full list. We already have a hook to do this, since we use this to preempt a random P if we need more dedicated workers. We expand this hook to instead wake an idle P if there is one. The logic we use for this is identical to the logic used to wake an idle P when we ready a goroutine. To make this really sound, we also fix the scheduler to re-check the idle GC worker condition after releasing its P. This closes a race where 1) the scheduler checks for idle work and finds none, 2) new work is enqueued but there are no idle Ps so none are woken, and 3) the scheduler releases its P. There is one subtlety here. Currently we call enlistWorker directly from putfull, but the gcWork is in an inconsistent state in the places that call putfull. This isn't a problem right now because nothing that enlistWorker does touches the gcWork, but with the added call to wakep, it's possible to get a recursive call into the gcWork (specifically, while write barriers are disallowed, this can do an allocation, which can dispose a gcWork, which can put a workbuf). To handle this, we lift the enlistWorker calls up a layer and delay them until the gcWork is in a consistent state. Fixes #14179. Change-Id: Ia2467a52e54c9688c3c1752e1fc00f5b37bbfeeb Reviewed-on: https://go-review.googlesource.com/32434 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2016-11-20 22:44:22 +00:00
Austin Clements	49ea9207b6	runtime: exit idle worker if there's higher-priority work Idle GC workers trigger whenever there's a GC running and the scheduler doesn't find any other work. However, they currently run for a full scheduler quantum (~10ms) once started. This is really bad for event-driven applications, where work may come in on the network hundreds of times during that window. In the go-gcbench rpc benchmark, this is bad enough to often cause effective STWs where all Ps are in the idle worker. When this happens, we don't even poll the network any more (except for the background 10ms poll in sysmon), so we don't even know there's more work to do. Fix this by making idle workers check with the scheduler roughly every 100 µs to see if there's any higher-priority work the P should be doing. This check includes polling the network for incoming work. Fixes #16528. Change-Id: I6f62ebf6d36a92368da9891bafbbfd609b9bd003 Reviewed-on: https://go-review.googlesource.com/32433 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-11-20 22:44:17 +00:00
Russ Cox	e6da64b6c0	runtime: fix Windows profiling crash I don't have any way to test or reproduce this problem, but the current code is clearly wrong for Windows. Make it better. As I said on #17165: But the borrowing of M's and the profiling of M's by the CPU profiler seem not synchronized enough. This code implements the CPU profiler on Windows: func profileloop1(param uintptr) uint32 { stdcall2(_SetThreadPriority, currentThread, _THREAD_PRIORITY_HIGHEST) for { stdcall2(_WaitForSingleObject, profiletimer, _INFINITE) first := (m)(atomic.Loadp(unsafe.Pointer(&allm))) for mp := first; mp != nil; mp = mp.alllink { thread := atomic.Loaduintptr(&mp.thread) // Do not profile threads blocked on Notes, // this includes idle worker threads, // idle timer thread, idle heap scavenger, etc. if thread == 0 \|\| mp.profilehz == 0 \|\| mp.blocked { continue } stdcall1(_SuspendThread, thread) if mp.profilehz != 0 && !mp.blocked { profilem(mp) } stdcall1(_ResumeThread, thread) } } } func profilem(mp m) { var r context rbuf := make([]byte, unsafe.Sizeof(r)+15) tls := &mp.tls[0] gp := ((g)(unsafe.Pointer(tls))) // align Context to 16 bytes r = (context)(unsafe.Pointer((uintptr(unsafe.Pointer(&rbuf[15]))) &^ 15)) r.contextflags = _CONTEXT_CONTROL stdcall2(_GetThreadContext, mp.thread, uintptr(unsafe.Pointer(r))) sigprof(r.ip(), r.sp(), 0, gp, mp) } func sigprof(pc, sp, lr uintptr, gp g, mp m) { if prof.hz == 0 { return } // Profiling runs concurrently with GC, so it must not allocate. mp.mallocing++ ... lots of code ... mp.mallocing-- } A borrowed M may migrate between threads. Between the atomic.Loaduintptr(&mp.thread) and the SuspendThread, mp may have moved to a new thread, so that it's in active use. In particular it might be calling malloc, as in the crash stack trace. If so, the mp.mallocing++ in sigprof would provoke the crash. Those lines are trying to guard against allocation during sigprof. But on Windows, mp is the thread being traced, not the current thread. Those lines should really be using getg().m.mallocing, which is the same on Unix but not on Windows. With that change, it's possible the race on the actual thread is not a problem: the traceback would get confused and eventually return an error, but that's fine. The code expects that possibility. Fixes #17165. Change-Id: If6619731910d65ca4b1a6e7de761fa2518ef339e Reviewed-on: https://go-review.googlesource.com/33132 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-11-11 20:50:08 +00:00
David Crawshaw	47c1715ba5	runtime: address comments from CL 32357 Change-Id: I174d7307bfdd8ec57bb4266dab8569fd2234abb4 Reviewed-on: https://go-review.googlesource.com/32610 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-11-02 13:49:25 +00:00
David Crawshaw	54ec7b072e	runtime: access modules via a slice The introduction of -buildmode=plugin means modules can be added to a Go program while it is running. This means there exists some time while the program is running with the module is on the moduledata linked list, but it has not been initialized to the satisfaction of other parts of the runtime. Notably, the GC. This CL adds a new way of access modules, an activeModules function. It returns a slice of modules that is built in the background and atomically swapped in. The parts of the runtime that need to wait on module initialization can use this slice instead of the linked list. Fixes #17455 Change-Id: I04790fd07e40c7295beb47cea202eb439206d33d Reviewed-on: https://go-review.googlesource.com/32357 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-11-01 16:04:12 +00:00
Martin Möhrmann	d7b34d5f29	runtime: improve atoi implementation - Adds overflow checks - Adds parsing of negative integers - Adds boolean return value to signal parsing errors - Adds atoi32 for parsing of integers that fit in an int32 - Adds tests Handling of errors to provide error messages at the call sites is left to future CLs. Updates #17718 Change-Id: I3cacd0ab1230b9efc5404c68edae7304d39bcbc0 Reviewed-on: https://go-review.googlesource.com/32390 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-11-01 14:04:39 +00:00
Austin Clements	8f81dfe8b4	runtime: perform write barrier before pointer write Currently, we perform write barriers after performing pointer writes. At the moment, it simply doesn't matter what order this happens in, as long as they appear atomic to GC. But both the hybrid barrier and ROC are going to require a pre-write write barrier. For the hybrid barrier, this is important because the barrier needs to observe both the current value of the slot and the value that will be written to it. (Alternatively, the caller could do the write and pass in the old value, but it seems easier and more useful to just swap the order of the barrier and the write.) For ROC, this is necessary because, if the pointer write is going to make the pointer reachable to some goroutine that it currently is not visible to, the garbage collector must take some special action before that pointer becomes more broadly visible. This commits swaps pointer writes around so the write barrier occurs before the pointer write. The main subtlety here is bulk memory writes. Currently, these copy to the destination first and then use the pointer bitmap of the destination to find the copied pointers and invoke the write barrier. This is necessary because the source may not have a pointer bitmap. To handle these, we pass both the source and the destination to the bulk memory barrier, which uses the pointer bitmap of the destination, but reads the pointer values from the source. Updates #17503. Change-Id: I78ecc0c5c94ee81c29019c305b3d232069294a55 Reviewed-on: https://go-review.googlesource.com/31763 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-28 20:47:52 +00:00
Austin Clements	c3163d23f0	runtime: eliminate write barriers from save As for dropg, save is writing a nil pointer that will generate a write barrier with the hybrid barrier. However, in this case, ctxt always should already be nil, so replace the write with an assertion that this is the case. At this point, we're ready to disable the write barrier elision optimizations that interfere with the hybrid barrier. Updates #17503. Change-Id: I83208e65aa33403d442401f355b2e013ab9a50e9 Reviewed-on: https://go-review.googlesource.com/31571 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-28 20:05:49 +00:00
Austin Clements	8044b77a57	runtime: eliminate write barriers from dropg Currently this contains no write barriers because it's writing nil pointers, but with the hybrid barrier, even these will produce write barriers. However, since these are gs and ms, they don't need write barriers, so we can simply eliminate them. Updates #17503. Change-Id: Ib188a60492c5cfb352814bf9b2bcb2941fb7d6c0 Reviewed-on: https://go-review.googlesource.com/31570 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-28 20:05:39 +00:00
Austin Clements	ee785f03a2	runtime: shade stack-to-stack copy when starting a goroutine The hybrid barrier requires barriers on stack-to-stack copies if either stack is grey. There are only two instances of this in the runtime: channel sends and starting a goroutine. Channel sends already use typedmemmove and hence have the necessary barriers. This commits adds barriers for the stack-to-stack copy when starting a goroutine. Updates #17503. Change-Id: Ibb55e08127ca4d021ac54be61cb96732efa5df5b Reviewed-on: https://go-review.googlesource.com/31455 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-28 20:05:18 +00:00
Austin Clements	88518e7dd6	runtime: zero-initialize LR on new stacks Currently we initialize LR on a new stack by writing nil to it. But this is an initializing write since the newly allocated stack is not zeroed, so this is unsafe with the hybrid barrier. Change this is a uintptr write to avoid a bad write barrier. Updates #17503. Change-Id: I062ac352e35df7da4644c1f2a5aaab87049d1f60 Reviewed-on: https://go-review.googlesource.com/32093 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-28 19:14:03 +00:00
Austin Clements	87e48c5afd	runtime, cmd/compile: rename memclr -> memclrNoHeapPointers Since barrier-less memclr is only safe in very narrow circumstances, this commit renames memclr to avoid accidentally calling memclr on typed memory. This can cause subtle, non-deterministic bugs, so it's worth some effort to prevent. In the near term, this will also prevent bugs creeping in from any concurrent CLs that add calls to memclr; if this happens, whichever patch hits master second will fail to compile. This also adds the other new memclr variants to the compiler's builtin.go to minimize the churn on that binary blob. We'll use these in future commits. Updates #17503. Change-Id: I00eead049f5bd35ca107ea525966831f3d1ed9ca Reviewed-on: https://go-review.googlesource.com/31369 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-28 18:20:33 +00:00
Peter Weinberger	ca922b6d36	runtime: Profile goroutines holding contended mutexes. runtime.SetMutexProfileFraction(n int) will capture 1/n-th of stack traces of goroutines holding contended mutexes if n > 0. From runtime/pprof, pprot.Lookup("mutex").WriteTo writes the accumulated stack traces to w (in essentially the same format that blocking profiling uses). Change-Id: Ie0b54fa4226853d99aa42c14cb529ae586a8335a Reviewed-on: https://go-review.googlesource.com/29650 Reviewed-by: Austin Clements <austin@google.com>	2016-10-28 11:47:16 +00:00
Alberto Donizetti	10560afb54	runtime/debug: avoid overflow in SetMaxThreads Fixes #16076 Change-Id: I91fa87b642592ee4604537dd8c3197cd61ec8b31 Reviewed-on: https://go-review.googlesource.com/31516 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-10-20 21:08:18 +00:00
Austin Clements	2be3ab4415	runtime: keep gcMarkRootCheck happy with spare Gs oneNewExtraM creates a spare M and G for use with cgo callbacks. The G doesn't run right away, but goes directly into syscall status. For the garbage collector, it's marked as "scan valid" and not on the rescan list, but I forgot to also mark it as "scan done". As a result, gcMarkRootCheck thinks that the goroutine hasn't been scanned and panics. This only affects GODEBUG=gccheckmark=1 mode, since we otherwise skip the gcMarkRootCheck. Fixes #17473. Change-Id: I94f5671c42eb44bd5ea7dc68fbf85f0c19e2e52c Reviewed-on: https://go-review.googlesource.com/31139 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-19 21:36:53 +00:00
Austin Clements	687d9d5d78	runtime: print a message on bad morestack If morestack runs on the g0 or gsignal stack, it currently performs some abort operation that typically produces a signal (e.g., it does an INT $3 on x86). This is useful if you're running in a debugger, but if you're not, the runtime tries to trap this signal, which is likely to send the program into a deeper spiral of collapse and lead to very confusing diagnostic output. Help out people trying to debug without a debugger by making morestack print an informative message before blowing up. Change-Id: I2814c64509b137bfe20a00091d8551d18c2c4749 Reviewed-on: https://go-review.googlesource.com/31133 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-10-17 18:56:09 +00:00
Austin Clements	9897e40811	runtime: use more go:nowritebarrierrec in proc.go Currently we use go:nowritebarrier in many places in proc.go. go:notinheap and go:yeswritebarrierrec now let us use go:nowritebarrierrec (the recursive form of the go:nowritebarrier pragma) more liberally. Do so in proc.go Change-Id: Ia7fcbc12ce6c51cb24730bf835fb7634ad53462f Reviewed-on: https://go-review.googlesource.com/30942 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-15 17:58:23 +00:00
Austin Clements	a9e6cebde2	cmd/compile, runtime: add go:yeswritebarrierrec pragma This pragma cancels the effect of go:nowritebarrierrec. This is useful in the scheduler because there are places where we enter a function without a valid P (and hence cannot have write barriers), but then obtain a P. This allows us to annotate the function with go:nowritebarrierrec and split out the part after we've obtained a P into a go:yeswritebarrierrec function. Change-Id: Ic8ce4b6d3c074a1ecd8280ad90eaf39f0ffbcc2a Reviewed-on: https://go-review.googlesource.com/30938 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Keith Randall <khr@golang.org>	2016-10-15 17:58:11 +00:00
Cherry Zhang	7c431cb7f9	cmd/link: insert trampolines for too-far jumps on ARM ARM direct CALL/JMP instruction has 24 bit offset, which can only encodes jumps within +/-32M. When the target is too far, the top bits get truncated and the program jumps wild. This CL detects too-far jumps and automatically insert trampolines, currently only internal linking on ARM. It is necessary to make the following changes to the linker: - Resolve direct jump relocs when assigning addresses to functions. this allows trampoline insertion without moving all code that already laid down. - Lay down packages in dependency order, so that when resolving a inter-package direct jump reloc, the target address is already known. Intra-package jumps are assumed never too far. - a linker flag -debugtramp is added for debugging trampolines: "-debugtramp=1 -v" prints trampoline debug message "-debugtramp=2" forces all inter-package jump to use trampolines (currently ARM only) "-debugtramp=2 -v" does both - Some data structures are changed for bookkeeping. On ARM, pseudo DIV/DIVU/MOD/MODU instructions now clobber R8 (unfortunate). In the standard library there is no ARM assembly code that uses these instructions, and the compiler no longer emits them (CL 29390). all.bash passes with -debugtramp=2, except a disassembly test (this is unavoidable as we changed the instruction). TBD: debug info of trampolines? Fixes #17028. Change-Id: Idcce347ea7e0af77c4079041a160b2f6e114b474 Reviewed-on: https://go-review.googlesource.com/29397 Reviewed-by: David Crawshaw <crawshaw@golang.org> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-10-11 13:35:33 +00:00
Ian Lance Taylor	d03e8b226c	runtime: record current PC for SIGPROF on non-Go thread If we get a SIGPROF on a non-Go thread, and the program has not called runtime.SetCgoTraceback so we have no way to collect a stack trace, then record a profile that is just the PC where the signal occurred. That will at least point the user to the right area. Retrieving the PC from the sigctxt in a signal handler on a non-G thread required marking a number of trivial sigctxt methods as nosplit, and, for extra safety, nowritebarrierrec. The test shows that the existing test CgoPprofThread test does not test the stack trace, just the profile signal. Leaving that for later. Change-Id: I8f8f3ff09ac099fc9d9df94b5a9d210ffc20c4ab Reviewed-on: https://go-review.googlesource.com/30252 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2016-10-11 12:56:15 +00:00
Dmitry Vyukov	c14050646f	runtime: fix newextram PC passed to race detector PC passed to racegostart is expected to be a return PC of the go statement. Race runtime will subtract 1 from the PC before symbolization. Passing start PC of a function is wrong. Add sys.PCQuantum to the function start PC. Update #17190 Change-Id: Ia504c49e79af84ed4ea360c2aea472b370ea8bf5 Reviewed-on: https://go-review.googlesource.com/29712 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-09-25 12:15:40 +00:00
David Crawshaw	8607bed744	runtime: avoid dependence on main symbol For -buildmode=plugin, this lets the linker drop the main.main symbol out of the binary while including most of the runtime. (In the future it should be possible to drop the entire runtime package from plugins.) Change-Id: I3e7a024ddf5cc945e3d8b84bf37a0b7cb2a00eb6 Reviewed-on: https://go-review.googlesource.com/27821 Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-09-16 14:49:27 +00:00
Josh Bleecher Snyder	2b74de3ed9	runtime: rename fastrand1 to fastrand Change-Id: I37706ff0a3486827c5b072c95ad890ea87ede847 Reviewed-on: https://go-review.googlesource.com/28210 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-08-30 23:59:21 +00:00
Keith Randall	0d7a2241cb	runtime: update a few comments noescape is now 0 instructions with the SSA backend. fast atomics are no longer a TODO (at least for amd64). Change-Id: Ib6e06f7471bef282a47ba236d8ce95404bb60a42 Reviewed-on: https://go-review.googlesource.com/28087 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-08-30 18:16:28 +00:00
David Crawshaw	f135c32640	runtime: initialize hash algs before typemap When compiling with -buildmode=shared, a map[int32]*_type is created for each extra module mapping duplicate types back to a canonical object. This is done in the function typelinksinit, which is called before the init function that sets up the hash functions for the map implementation. The result is typemap becomes unusable after runtime initialization. The fix in this CL is to move algorithm init before typelinksinit in the runtime setup process. (For 1.8, we may want to turn typemap into a sorted slice of types and use binary search.) Manually tested on GOOS=linux with: GOHOSTARCH=386 GOARCH=386 ./make.bash && \ go install -buildmode=shared std && \ cd ../test && \ go run run.go -linkshared Fixes #16590 Change-Id: Idc08c50cc70d20028276fbf564509d2cd5405210 Reviewed-on: https://go-review.googlesource.com/25469 Run-TryBot: David Crawshaw <crawshaw@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2016-08-04 17:39:05 +00:00
Ian Lance Taylor	50048a4e8e	runtime: add as many extra M's as needed When a non-Go thread calls into Go, the runtime needs an M to run the Go code. The runtime keeps a list of extra M's available. When the last extra M is allocated, the needextram field is set to tell it to allocate a new extra M as soon as it is running in Go. This ensures that an extra M will always be available for the next thread. However, if many threads need an extra M at the same time, this serializes them all. One thread will get an extra M with the needextram field set. All the other threads will see that there is no M available and will go to sleep. The one thread that succeeded will create a new extra M. One lucky thread will get it. All the other threads will see that there is no M available and will go to sleep. The effect is thundering herd, as all the threads looking for an extra M go through the process one by one. This seems to have a particularly bad effect on the FreeBSD scheduler for some reason. With this change, we track the number of threads waiting for an M, and create all of them as soon as one thread gets through. This still means that all the threads will fight for the lock to pick up the next M. But at least each thread that gets the lock will succeed, instead of going to sleep only to fight again. This smooths out the performance greatly on FreeBSD, reducing the average wall time of `testprogcgo CgoCallbackGC` by 74%. On GNU/Linux the average wall time goes down by 9%. Fixes #13926 Fixes #16396 Change-Id: I6dc42a4156085a7ed4e5334c60b39db8f8ef8fea Reviewed-on: https://go-review.googlesource.com/25047 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2016-07-20 13:31:55 +00:00
Ian Lance Taylor	25a609556a	runtime: correct printing of blocked field in scheduler trace When the blocked field was first introduced back in https://golang.org/cl/61250043 the scheduler trace code incorrectly used m->blocked instead of mp->blocked. That has carried through the conversion to Go. This CL fixes it. Change-Id: Id81907b625221895aa5c85b9853f7c185efd8f4b Reviewed-on: https://go-review.googlesource.com/24571 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-06-29 01:38:39 +00:00
Ian Lance Taylor	84d8aff94c	runtime: collect stack trace if SIGPROF arrives on non-Go thread Fixes #15994. Change-Id: I5aca91ab53985ac7dcb07ce094ec15eb8ec341f8 Reviewed-on: https://go-review.googlesource.com/23891 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-06-13 21:43:19 +00:00
Russ Cox	7fdec6216c	build: enable framepointer mode by default This has a minor performance cost, but far less than is being gained by SSA. As an experiment, enable it during the Go 1.7 beta. Having frame pointers on by default makes Linux's perf, Intel VTune, and other profilers much more useful, because it lets them gather a stack trace efficiently on profiling events. (It doesn't help us that much, since when we walk the stack we usually need to look up PC-specific information as well.) Fixes #15840. Change-Id: I4efd38412a0de4a9c87b1b6e5d11c301e63f1a2a Reviewed-on: https://go-review.googlesource.com/23451 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-26 19:02:00 +00:00
Ian Lance Taylor	a5d1a72a40	cmd/cgo, runtime, runtime/cgo: TSAN support for malloc Acquire and release the TSAN synchronization point when calling malloc, just as we do when calling any other C function. If we don't do this, TSAN will report false positive errors about races calling malloc and free. We used to have a special code path for malloc and free, going through the runtime functions cmalloc and cfree. The special code path for cfree was no longer used even before this CL. This CL stops using the special code path for malloc, because there is no place along that path where we could conditionally insert the TSAN synchronization. This CL removes the support for the special code path for both functions. Instead, cgo now automatically generates the malloc function as though it were referenced as C.malloc. We need to automatically generate it even if C.malloc is not called, even if malloc and size_t are not declared, to support cgo-provided functions like C.CString. Change-Id: I829854ec0787a80f33fa0a8a0dc2ee1d617830e2 Reviewed-on: https://go-review.googlesource.com/23260 Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-05-25 23:22:24 +00:00
Austin Clements	3be48b4dc8	runtime: pass gcWork to scanstack Currently scanstack obtains its own gcWork from the P for the duration of the stack scan and then, if called during mark termination, disposes the gcWork. However, this means that the number of workbufs allocated will be at least the number of stacks scanned during mark termination, which may be very high (especially during a STW GC). This happens because, in steady state, each scanstack will obtain a fresh workbuf (either from the empty list or by allocating it), fill it with the scan results, and then dispose it to the full list. Nothing is consuming from the full list during this (and hence nothing is recycling them to the empty list), so the length of the full list by the time mark termination starts draining it is at least the number of stacks scanned. Fix this by pushing the gcWork acquisition up the stack to either the gcDrain that calls markroot that calls scanstack (which batches across many stack scans and is the path taken during STW GC) or to newstack (which is still a single scanstack call, but this is roughly bounded by the number of Ps). This fix reduces the workbuf allocation for the test program from issue #15319 from 213 MB (roughly 2KB * 1e5 goroutines) to 10 MB. Fixes #15319. Note that there's potentially a similar issue in write barriers during mark 2. Fixing that will be more difficult since there's no broader non-preemptible context, but it should also be less of a problem since the full list is being drained during mark 2. Some overall improvements in the go1 benchmarks, plus the usual noise. No significant change in the garbage benchmark (time/op or GC memory). name old time/op new time/op delta BinaryTree17-12 2.54s ± 1% 2.51s ± 1% -1.09% (p=0.000 n=20+19) Fannkuch11-12 2.12s ± 0% 2.17s ± 0% +2.18% (p=0.000 n=19+18) FmtFprintfEmpty-12 45.1ns ± 1% 45.2ns ± 0% ~ (p=0.078 n=19+18) FmtFprintfString-12 127ns ± 0% 128ns ± 0% +1.08% (p=0.000 n=19+16) FmtFprintfInt-12 125ns ± 0% 122ns ± 1% -2.71% (p=0.000 n=14+18) FmtFprintfIntInt-12 196ns ± 0% 190ns ± 1% -2.91% (p=0.000 n=12+20) FmtFprintfPrefixedInt-12 196ns ± 0% 194ns ± 1% -0.94% (p=0.000 n=13+18) FmtFprintfFloat-12 253ns ± 1% 251ns ± 1% -0.86% (p=0.000 n=19+20) FmtManyArgs-12 807ns ± 1% 784ns ± 1% -2.85% (p=0.000 n=20+20) GobDecode-12 7.13ms ± 1% 7.12ms ± 1% ~ (p=0.351 n=19+20) GobEncode-12 5.89ms ± 0% 5.95ms ± 0% +0.94% (p=0.000 n=19+19) Gzip-12 219ms ± 1% 221ms ± 1% +1.35% (p=0.000 n=18+20) Gunzip-12 37.5ms ± 1% 37.4ms ± 0% ~ (p=0.057 n=20+19) HTTPClientServer-12 81.4µs ± 4% 81.9µs ± 3% ~ (p=0.118 n=17+18) JSONEncode-12 15.7ms ± 1% 15.8ms ± 1% +0.73% (p=0.000 n=17+18) JSONDecode-12 57.9ms ± 1% 57.2ms ± 1% -1.34% (p=0.000 n=19+19) Mandelbrot200-12 4.12ms ± 1% 4.10ms ± 0% -0.33% (p=0.000 n=19+17) GoParse-12 3.22ms ± 2% 3.25ms ± 1% +0.72% (p=0.000 n=18+20) RegexpMatchEasy0_32-12 70.6ns ± 1% 71.1ns ± 2% +0.63% (p=0.005 n=19+20) RegexpMatchEasy0_1K-12 240ns ± 0% 239ns ± 1% -0.59% (p=0.000 n=19+20) RegexpMatchEasy1_32-12 71.3ns ± 1% 71.3ns ± 1% ~ (p=0.844 n=17+17) RegexpMatchEasy1_1K-12 384ns ± 2% 371ns ± 1% -3.45% (p=0.000 n=19+20) RegexpMatchMedium_32-12 109ns ± 1% 108ns ± 2% -0.48% (p=0.029 n=19+19) RegexpMatchMedium_1K-12 34.3µs ± 1% 34.5µs ± 2% ~ (p=0.160 n=18+20) RegexpMatchHard_32-12 1.79µs ± 9% 1.72µs ± 2% -3.83% (p=0.000 n=19+19) RegexpMatchHard_1K-12 53.3µs ± 4% 51.8µs ± 1% -2.82% (p=0.000 n=19+20) Revcomp-12 386ms ± 0% 388ms ± 0% +0.72% (p=0.000 n=17+20) Template-12 62.9ms ± 1% 62.5ms ± 1% -0.57% (p=0.010 n=18+19) TimeParse-12 325ns ± 0% 331ns ± 0% +1.84% (p=0.000 n=18+19) TimeFormat-12 338ns ± 0% 343ns ± 0% +1.34% (p=0.000 n=18+20) [Geo mean] 52.7µs 52.5µs -0.42% Change-Id: Ib2d34736c4ae2ec329605b0fbc44636038d8d018 Reviewed-on: https://go-review.googlesource.com/23391 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-25 21:11:47 +00:00
Austin Clements	91740582c3	runtime: add 'next' flag to ready Currently ready always puts the readied goroutine in runnext. We're going to have to change this for some uses, so add a flag for whether or not to use runnext. For now we always pass true so this is a no-op change. For #15706. Change-Id: Iaa66d8355ccfe4bbe347570cc1b1878c70fa25df Reviewed-on: https://go-review.googlesource.com/23171 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-19 18:17:58 +00:00
Ian Lance Taylor	1f7a0d4b5e	runtime: don't do a plain throw when throwsplit == true The test case in #15639 somehow causes an invalid syscall frame. The failure is obscured because the throw occurs when throwsplit == true, which causes a "stack split at bad time" error when trying to print the throw message. This CL fixes the "stack split at bad time" by using systemstack. No test because there shouldn't be any way to trigger this error anyhow. Update #15639. Change-Id: I4240f3fd01bdc3c112f3ffd1316b68504222d9e1 Reviewed-on: https://go-review.googlesource.com/23153 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-05-19 04:37:45 +00:00
Ian Lance Taylor	84e808043f	runtime: use cgo traceback for SIGPROF If we collected a cgo traceback when entering the SIGPROF signal handler, record it as part of the profiling stack trace. This serves as the promised test for https://golang.org/cl/21055 . Change-Id: I5f60cd6cea1d9b7c3932211483a6bfab60ed21d2 Reviewed-on: https://go-review.googlesource.com/22650 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-05-04 00:08:19 +00:00
Dmitry Vyukov	caa2147532	runtime: per-P contexts for race detector Race runtime also needs local malloc caches and currently uses a mix of per-OS-thread and per-goroutine caches. This leads to increased memory consumption. But more importantly cache of synchronization objects is per-goroutine and we don't always have goroutine context when feeing memory in GC. As the result synchronization object descriptors leak (more precisely, they can be reused if another synchronization object is recreated at the same address, but it does not always help). For example, the added BenchmarkSyncLeak has effectively runaway memory consumption (based on a real long running server). This change updates race runtime with support for per-P contexts. BenchmarkSyncLeak now stabilizes at ~1GB memory consumption. Long term, this will allow us to remove race runtime dependency on glibc (as malloc is the main cornerstone). I've also implemented a different scheme to pass P context to race runtime: scheduler notified race runtime about association between G and P by calling procwire(g, p)/procunwire(g, p). But it turned out to be very messy as we have lots of places where the association changes (e.g. syscalls). So I dropped it in favor of the current scheme: race runtime asks scheduler about the current P. Fixes #14533 Change-Id: Iad10d2f816a44affae1b9fed446b3580eafd8c69 Reviewed-on: https://go-review.googlesource.com/19970 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-03 11:00:43 +00:00
Dmitry Vyukov	fcd7c02c70	runtime: fix CPU underutilization Runqempty is a critical predicate for scheduler. If runqempty spuriously returns true, then scheduler can fail to schedule arbitrary number of runnable goroutines on idle Ps for arbitrary long time. With the addition of runnext runqempty predicate become broken (can spuriously return true). Consider that runnext is not nil and the main array is empty. Runqempty observes that the array is empty, then it is descheduled for some time. Then queue owner pushes another element to the queue evicting runnext into the array. Then queue owner pops runnext. Then runqempty resumes and observes runnext is nil and returns true. But there were no point in time when the queue was empty. Fix runqempty predicate to not return true spuriously. Change-Id: Ifb7d75a699101f3ff753c4ce7c983cf08befd31e Reviewed-on: https://go-review.googlesource.com/20858 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-03 10:06:32 +00:00

1 2 3 4 5

249 Commits