Commit Graph

360 Commits

Author SHA1 Message Date
Michael Anthony Knyszek 42019613df runtime: make distributed/local malloc stats the source-of-truth
This change makes it so that various local malloc stats (excluding
heap_scan and local_tinyallocs) are no longer written first to mheap
fields but are instead accessed directly from each mcache.

This change is part of a move toward having stats be distributed, and
cleaning up some old code related to the stats.

Note that because there's no central source-of-truth, when an mcache
dies, it must donate its stats to another mcache. It's always safe to
donate to the mcache for the 0th P, so do that.
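
As a rough illustration of the donation idea only (the type and field
names below are invented, not the runtime's), merging a dying cache's
counters into a surviving cache looks like this:

    package main

    import "fmt"

    // cacheStats is an illustrative stand-in for per-mcache malloc counters.
    type cacheStats struct {
        smallAllocCount uint64
        largeAllocCount uint64
    }

    // donate merges a dying cache's counters into a surviving cache so no
    // counts are lost now that there is no central source-of-truth.
    func (dst *cacheStats) donate(src *cacheStats) {
        dst.smallAllocCount += src.smallAllocCount
        dst.largeAllocCount += src.largeAllocCount
        *src = cacheStats{}
    }

    func main() {
        survivor := &cacheStats{smallAllocCount: 10}
        dying := &cacheStats{smallAllocCount: 3, largeAllocCount: 1}
        survivor.donate(dying)
        fmt.Println(survivor.smallAllocCount, survivor.largeAllocCount) // 13 1
    }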

Change-Id: I2556093dbc27357cb9621c9b97671f3c00aa1173
Reviewed-on: https://go-review.googlesource.com/c/go/+/246964
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:26:08 +00:00
Michael Anthony Knyszek ce46f197b6 runtime: access the assist ratio atomically
This change makes it so that the GC assist ratio (the pair of
gcControllerState fields assistBytesPerWork and assistWorkPerByte) is
updated atomically. Note that the pair of fields are not updated
together atomically, but that's OK. The code here was already racy for
some time and in practice the assist ratio moves very slowly.
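
A minimal sketch of this pattern outside the runtime (the names below are
illustrative): publish each float64 ratio as its bit pattern so readers
can load it without a lock, while the pair is still not updated together
atomically.

    package main

    import (
        "fmt"
        "math"
        "sync/atomic"
    )

    // atomicFloat64 stores a float64 as its IEEE-754 bit pattern so it can
    // be read and written with atomic operations.
    type atomicFloat64 struct{ bits uint64 }

    func (f *atomicFloat64) store(v float64) { atomic.StoreUint64(&f.bits, math.Float64bits(v)) }
    func (f *atomicFloat64) load() float64   { return math.Float64frombits(atomic.LoadUint64(&f.bits)) }

    func main() {
        var assistWorkPerByte, assistBytesPerWork atomicFloat64
        assistWorkPerByte.store(0.5)
        assistBytesPerWork.store(2.0)
        fmt.Println(assistWorkPerByte.load(), assistBytesPerWork.load())
    }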

The purpose of this change is so that we can document
gcController.revise to be safe for concurrent use, which will be useful
in further changes.

Change-Id: Ie25d630207c88e4f85f2b8953f6a0051ebf1b4ea
Reviewed-on: https://go-review.googlesource.com/c/go/+/246963
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:26:01 +00:00
Austin Clements c91dffbc9a runtime: tidy cgocallback
On amd64 and 386, we have a very roundabout way of remembering that we
need to dropm on return that currently involves saving a zero to
needm's argument slot and later bringing it back. Just store the zero.

This also makes amd64 and 386 more consistent with cgocallback on all
other platforms: rather than saving the old M to the G stack, they now
save it to a named slot on the G0 stack.

The needm function no longer needs a dummy argument to get the SP, so
we drop that.

Change-Id: I7e84bb4a5ff9552de70dcf41d8accf02310535e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/263268
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-26 14:50:34 +00:00
Austin Clements 30c1887873 runtime,cmd/cgo: simplify C -> Go call path
This redesigns the way calls work from C to exported Go functions. It
removes several steps from the call path, makes cmd/cgo no longer
sensitive to the Go calling convention, and eliminates the use of
reflectcall from cgo.

In order to avoid generating a large amount of FFI glue between the C
and Go ABIs, the cgo tool has long depended on generating a C function
that marshals the arguments into a struct, and then the actual ABI
switch happens in functions with fixed signatures that simply take a
pointer to this struct. In a way, this CL simply pushes this idea
further.

Currently, the cgo tool generates this argument struct in the exact
layout of the Go stack frame and depends on reflectcall to unpack it
into the appropriate Go call (even though it's actually
reflectcall'ing a function generated by cgo).

In this CL, we decouple this struct from the Go stack layout. Instead,
cgo generates a Go function that takes the struct, unpacks it, and
calls the exported function. Since this generated function has a
generic signature (like the rest of the call path), we don't need
reflectcall and can instead depend on the Go compiler itself to
implement the call to the exported Go function.

One complication is that syscall.NewCallback on Windows, which
converts a Go function into a C function pointer, depends on
cgocallback's current dynamic calling approach since the signatures of
the callbacks aren't known statically. For this specific case, we
continue to depend on reflectcall. Really, the current approach makes
some overly simplistic assumptions about translating the C ABI to the
Go ABI. Now we're at least in a much better position to do a proper
ABI translation.

For comparison, the current cgo call path looks like:

    GoF (generated C function) ->
    crosscall2 (in cgo/asm_*.s) ->
    _cgoexp_GoF (generated Go function) ->
    cgocallback (in asm_*.s) ->
    cgocallback_gofunc (in asm_*.s) ->
    cgocallbackg (in cgocall.go) ->
    cgocallbackg1 (in cgocall.go) ->
    reflectcall (in asm_*.s) ->
    _cgoexpwrap_GoF (generated Go function) ->
    p.GoF

Now the call path looks like:

    GoF (generated C function) ->
    crosscall2 (in cgo/asm_*.s) ->
    cgocallback (in asm_*.s) ->
    cgocallbackg (in cgocall.go) ->
    cgocallbackg1 (in cgocall.go) ->
    _cgoexp_GoF (generated Go function) ->
    p.GoF

Notably:

1. We combine _cgoexp_GoF and _cgoexpwrap_GoF and move the combined
operation to the end of the sequence. This combined function also
handles reflectcall's previous role.

2. We combined cgocallback and cgocallback_gofunc since the only
purpose of having both was to convert a raw PC into a Go function
value. We instead construct the Go function value in cgocallbackg1.

3. cgocallbackg1 no longer reaches backwards through the stack to get
the arguments to cgocallback_gofunc. Instead, we just pass the
arguments down.

4. Currently, we need an explicit msanwrite to mark the results struct
as written because reflectcall doesn't do this. Now, the results are
written by regular Go assignments, so the Go compiler generates the
necessary MSAN annotations. This also means we no longer need to track
the size of the arguments frame.
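
For context, a minimal exported function of the kind served by this call
path might look like the sketch below (GoAdd is an illustrative name;
build it with a c-archive or c-shared buildmode and call GoAdd from C):

    package main

    import "C"

    //export GoAdd
    func GoAdd(a, b C.int) C.int {
        // A C caller reaches this function through the generated GoAdd C
        // wrapper, then crosscall2 -> cgocallback -> cgocallbackg ->
        // cgocallbackg1 -> _cgoexp_GoAdd, which finally calls GoAdd.
        return a + b
    }

    func main() {}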

Updates #40724, since now we don't need to teach cgo about the
register ABI or change how it uses reflectcall.

Change-Id: I7840489a2597962aeb670e0c1798a16a7359c94f
Reviewed-on: https://go-review.googlesource.com/c/go/+/258938
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-26 14:50:32 +00:00
Tiwei Bie bc0b198bd7 runtime: dump the status of lockedg on error
dumpgstatus() will dump the current g's status anyway. When lockedg's
status is bad, it is more helpful to also dump lockedg's status than to
dump the current g's status twice.

Change-Id: If5248cb94b9cdcbf4ceea07562237e1d6ee28489
GitHub-Last-Rev: da814c51ff
GitHub-Pull-Request: golang/go#40248
Reviewed-on: https://go-review.googlesource.com/c/go/+/243097
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Emmanuel Odeke <emm.odeke@gmail.com>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-10-24 03:15:23 +00:00
Andrew G. Morgan d1b1145cac syscall: support POSIX semantics for Linux syscalls
This change adds two new methods for invoking system calls
under Linux: syscall.AllThreadsSyscall() and
syscall.AllThreadsSyscall6().

These system call wrappers ensure that all OSThreads mirror
a common system call. The wrappers serialize execution of the
runtime to ensure no race conditions where any Go code observes
a non-atomic OS state change. As such, the syscalls have
higher runtime overhead than regular system calls, and only
need to be used where such thread (or 'm' in the parlance
of the runtime sources) consistency is required.

The new support is used to enable these functions under Linux:

  syscall.Setegid(), syscall.Seteuid(), syscall.Setgroups(),
  syscall.Setgid(), syscall.Setregid(), syscall.Setreuid(),
  syscall.Setresgid(), syscall.Setresuid() and syscall.Setuid().

They work identically to their glibc counterparts.
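
A minimal usage sketch (the UID value is illustrative and the process
needs the privilege to change it): after this change the call below
applies to every thread in the process rather than only the calling one.

    package main

    import (
        "fmt"
        "syscall"
    )

    func main() {
        // With POSIX semantics, the credential change is mirrored on all
        // runtime threads, so every goroutine observes the new UID.
        if err := syscall.Setuid(1000); err != nil {
            fmt.Println("Setuid:", err)
            return
        }
        fmt.Println("uid:", syscall.Getuid())
    }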

Extensive discussion of the background issue addressed in this
patch can be found here:

   https://github.com/golang/go/issues/1435

In the case where cgo is used, the C runtime can launch pthreads that
are not managed by the Go runtime. As such, the added
syscall.AllThreadsSyscall*() functions return ENOTSUP when cgo is enabled.
However, for the 9 syscall.Set*() functions listed above, when cgo is
active, these functions redirect to invoke their C.set*() equivalents
in glibc, which wraps the raw system calls with a nptl:setxid fixup
mechanism. This achieves POSIX semantics for these functions in the
combined Go and C runtime.

As a side note, the glibc/nptl:setxid support (as of 2019-11-30) does not
extend to all security-related system calls under Linux, so using native
Go (CGO_ENABLED=0) and these AllThreadsSyscall*()s where needed will
yield more well-defined and consistent behavior across all threads of a
Go program. That is, use the syscall.AllThreadsSyscall*() wrappers for
things like setting state through SYS_PRCTL, SYS_CAPSET, etc.

Fixes #1435

Change-Id: Ib1a3e16b9180f64223196a32fc0f9dce14d9105c
Reviewed-on: https://go-review.googlesource.com/c/go/+/210639
Trust: Emmanuel Odeke <emm.odeke@gmail.com>
Trust: Ian Lance Taylor <iant@golang.org>
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2020-10-23 20:53:14 +00:00
Michael Pratt 4a2cc73f87 runtime: don't attempt to steal from idle Ps
Work stealing is a scalability bottleneck in the scheduler. Since each P
has a work queue, work stealing must look at every P to determine if
there is any work. The number of Ps scales linearly with GOMAXPROCS
(i.e., the number of Ps _is_ GOMAXPROCS), thus this work scales linearly
with GOMAXPROCS.

Work stealing is a later attempt by a P to find work before it goes
idle. Since the P has no work of its own, extra costs here tend not to
directly affect application-level benchmarks. Where they show up is
extra CPU usage by the process as a whole. These costs get particularly
expensive for applications that transition between blocked and running
frequently.

Long term, we need a more scalable approach in general, but for now we
can make a simple observation: idle Ps ([1]) cannot possibly have
anything in their runq, so we need not bother checking at all.

We track idle Ps via a new global bitmap, updated in pidleput/pidleget.
This is already a slow path (requires sched.lock), so we don't expect
high contention there.

Using a single bitmap avoids the need to touch every P to read p.status.
Currently, the bitmap approach is not significantly better than reading
p.status. However, in a future CL I'd like to apply a similar
optimization to timers. Once done, findrunnable would not touch most Ps
at all (in mostly idle programs), which will avoid memory latency to
pull those Ps into cache.
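
A rough sketch of the bitmap idea (the type and sizes below are invented
for illustration, not the runtime's implementation): one bit per P, set
and cleared under sched.lock in pidleput/pidleget, and read without a
lock by would-be stealers.

    package main

    import "fmt"

    // idleMask holds one bit per P; bit i set means P i is idle and
    // therefore cannot have anything in its runq, so stealing skips it.
    type idleMask []uint64

    func (m idleMask) set(id int)       { m[id/64] |= 1 << (id % 64) }
    func (m idleMask) clear(id int)     { m[id/64] &^= 1 << (id % 64) }
    func (m idleMask) idle(id int) bool { return m[id/64]&(1<<(id%64)) != 0 }

    func main() {
        m := make(idleMask, 1) // room for 64 Ps
        m.set(3)
        fmt.Println(m.idle(3), m.idle(5)) // true false
    }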

When reading this bitmap, we are racing with Ps going in and out of
idle, so there are a few cases to consider:

1. _Prunning -> _Pidle: Running P goes idle after we check the bitmap.
In this case, we will try to steal (and find nothing) so there is no
harm.

2. _Pidle -> _Prunning while spinning: A P that starts running may queue
new work that we miss. This is OK: (a) that P cannot go back to sleep
without completing its work, and (b) more fundamentally, we will recheck
after we drop our P.

3. _Pidle -> _Prunning after spinning: After spinning, we really can
miss work from a newly woken P. (a) above still applies here as well,
but this is also the same delicate dance case described in findrunnable:
if nothing is spinning anymore, the other P will unpark a thread to run
the work it submits.

Benchmark results from WakeupParallel/syscall/pair/race/1ms (see
golang.org/cl/228577):

name                            old msec          new msec   delta
Perf-task-clock-8               250 ± 1%          247 ± 4%     ~     (p=0.690 n=5+5)
Perf-task-clock-16              258 ± 2%          259 ± 2%     ~     (p=0.841 n=5+5)
Perf-task-clock-32              284 ± 2%          270 ± 4%   -4.94%  (p=0.032 n=5+5)
Perf-task-clock-64              326 ± 3%          303 ± 2%   -6.92%  (p=0.008 n=5+5)
Perf-task-clock-128             407 ± 2%          363 ± 5%  -10.69%  (p=0.008 n=5+5)
Perf-task-clock-256             561 ± 1%          481 ± 1%  -14.20%  (p=0.016 n=4+5)
Perf-task-clock-512             840 ± 5%          683 ± 2%  -18.70%  (p=0.008 n=5+5)
Perf-task-clock-1024          1.38k ±14%        1.07k ± 2%  -21.85%  (p=0.008 n=5+5)

[1] "Idle Ps" here refers to _Pidle Ps in the sched.pidle list. In other
contexts, Ps may temporarily transition through _Pidle (e.g., in
handoffp); those Ps may have work.

Updates #28808
Updates #18237

Change-Id: Ieeb958bd72e7d8fb375b0b1f414e8d7378b14e29
Reviewed-on: https://go-review.googlesource.com/c/go/+/259578
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>
2020-10-23 14:18:27 +00:00
Ian Lance Taylor 05739d6f17 runtime: wait for preemption signals before syscall.Exec
Fixes #41702
Fixes #42023

Change-Id: If07f40b1d73b8f276ee28ffb8b7214175e56c24d
Reviewed-on: https://go-review.googlesource.com/c/go/+/262817
Trust: Ian Lance Taylor <iant@golang.org>
Trust: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-16 23:50:26 +00:00
Martin Möhrmann 7c58ef732e runtime: implement GODEBUG=inittrace=1 support
Setting inittrace=1 causes the runtime to emit a single line to standard error for
each package with init work, summarizing the execution time and memory allocation.

The emitted debug information for init functions can be used to find bottlenecks
or regressions in Go startup performance.

Packages with no init function work (user-defined or compiler-generated) are omitted.

Tracing plugin inits is not supported as they can execute concurrently. This would
make the implementation of tracing more complex while adding support for a very rare
use case. Plugin inits can be traced separately by testing a main package
that imports the plugin package's imports explicitly.

$ GODEBUG=inittrace=1 go test
init internal/bytealg @0.008 ms, 0 ms clock, 0 bytes, 0 allocs
init runtime @0.059 ms, 0.026 ms clock, 0 bytes, 0 allocs
init math @0.19 ms, 0.001 ms clock, 0 bytes, 0 allocs
init errors @0.22 ms, 0.004 ms clock, 0 bytes, 0 allocs
init strconv @0.24 ms, 0.002 ms clock, 32 bytes, 2 allocs
init sync @0.28 ms, 0.003 ms clock, 16 bytes, 1 allocs
init unicode @0.44 ms, 0.11 ms clock, 23328 bytes, 24 allocs
...

Inspired by stapelberg@google.com who instrumented doInit
in a prototype to measure init times with GDB.

Fixes #41378

Change-Id: Ic37c6a0cfc95488de9e737f5e346b8dbb39174e1
Reviewed-on: https://go-review.googlesource.com/c/go/+/254659
Trust: Martin Möhrmann <moehrmann@google.com>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-10-14 05:34:32 +00:00
Cherry Zhang a413908dd0 all: add GOOS=ios
Introduce GOOS=ios for iOS systems. GOOS=ios matches "darwin"
build tag, like GOOS=android matches "linux" and GOOS=illumos
matches "solaris". Only ios/arm64 is supported (ios/amd64 is
not).

GOOS=ios and GOOS=darwin remain essentially the same at this
point. They will diverge at later time, to differentiate macOS
and iOS.

Uses of GOOS=="darwin" are changed to (GOOS=="darwin" || GOOS=="ios"),
except if it clearly means macOS (e.g. GOOS=="darwin" && GOARCH=="amd64"),
it remains GOOS=="darwin".

Updates #38485.

Change-Id: I4faacdc1008f42434599efb3c3ad90763a83b67c
Reviewed-on: https://go-review.googlesource.com/c/go/+/254740
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2020-09-23 18:12:59 +00:00
Michael Pratt d42b32e321 runtime: add sched.lock assertions
Functions that require holding sched.lock now have an assertion.

A few places with missing locks have been fixed in this CL:

Additionally, locking is added around the call to procresize in
schedinit. This doesn't technically need a lock since the program is
still starting (thus no concurrency) when this is called, but lock held
checking doesn't know that.
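
An illustrative sketch of the assertion idea, not the runtime's actual
helper: the lock records whether it is held so functions that require it
can assert that precondition.

    package main

    import "sync"

    // checkedLock is a simplified stand-in: it tracks whether it is held so
    // code that requires the lock can assert it. The runtime's version is
    // integrated with its own mutex and lock-rank machinery.
    type checkedLock struct {
        mu   sync.Mutex
        held bool
    }

    func (l *checkedLock) lock()   { l.mu.Lock(); l.held = true }
    func (l *checkedLock) unlock() { l.held = false; l.mu.Unlock() }

    func (l *checkedLock) assertHeld() {
        if !l.held {
            panic("required lock is not held")
        }
    }

    func main() {
        var schedLock checkedLock
        schedLock.lock()
        schedLock.assertHeld() // ok: lock is held
        schedLock.unlock()
    }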

Updates #40677

Change-Id: I198d3cbaa727f7088e4d55ba8fa989cf1ee8f9cf
Reviewed-on: https://go-review.googlesource.com/c/go/+/250261
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2020-09-22 15:14:09 +00:00
Michael Pratt 02ff8b8ce4 runtime: expand gopark documentation
unlockf is called after the G is put into _Gwaiting, meaning another G
may have readied this one before unlockf is called.

This is implied by the current doc, but add additional notes to call out
this behavior, as it can be quite surprising.

Updates #40641

Change-Id: I60b1ccc6a4dd9ced8ad2aa1f729cb2e973100b59
Reviewed-on: https://go-review.googlesource.com/c/go/+/256058
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2020-09-21 15:08:44 +00:00
Paschalis Tsilias 331614c4da runtime: improve error messages after allocating a stack that is too big
In the current implementation, we can observe crashes after calling
debug.SetMaxStack and allocating a stack larger than 4GB since
stackalloc works with 32-bit sizes. To avoid this, we define an upper
limit as the largest feasible point we can grow a stack to and provide a
better error message when we get a stack overflow.
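
A minimal sketch of the API involved (the 8 GB figure is illustrative and
assumes a 64-bit platform):

    package main

    import (
        "fmt"
        "runtime/debug"
    )

    func main() {
        // SetMaxStack returns the previous per-goroutine stack limit.
        // Growing a stack beyond 4GB now reports a clear stack overflow
        // instead of crashing.
        old := debug.SetMaxStack(8 << 30) // 8 GB
        fmt.Println("previous stack limit:", old)
    }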

Fixes #41228

Change-Id: I55fb0a824f47ed9fb1fcc2445a4dfd57da9ef8d4
Reviewed-on: https://go-review.googlesource.com/c/go/+/255997
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Keith Randall <khr@golang.org>
2020-09-20 09:54:44 +00:00
Ian Lance Taylor 2556eb76c8 runtime: ignore SIGPROF if profiling disabled for thread
This avoids a deadlock on prof.signalLock between setcpuprofilerate
and cpuprof.add if a SIGPROF is delivered to the thread between the
call to setThreadCPUProfiler and acquiring prof.signalLock.

Fixes #41014

Change-Id: Ie825e8594f93a19fb1a6320ed640f4e631553596
Reviewed-on: https://go-review.googlesource.com/c/go/+/253758
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-09-09 18:04:10 +00:00
Keith Randall 5c2c6d3fbf runtime: framepointers are no longer an experiment - hard code them
I think they are no longer experimental. Might as well promote
them to permanent.

Change-Id: Id1259601b3dd2061dd60df86ee48080bfb575d2f
Reviewed-on: https://go-review.googlesource.com/c/go/+/249857
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2020-08-27 21:15:47 +00:00
Cholerae Hu 613388315e runtime: reduce critical path in injectglist
Change-Id: Ia3fb30ac9add39c803f11f69d967c6604fdeacf8
Reviewed-on: https://go-review.googlesource.com/c/go/+/233217
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-08-18 04:22:33 +00:00
liu-xuewen ba97be4b58 runtime: remove tracebackinit and unused skipPC
CL 152537 (https://go-review.googlesource.com/c/go/+/152537/) changed the way inlined frames are represented in tracebacks to no longer use skipPC.

Change-Id: I42386fdcc5cf72f3c122e789b6af9cbd0c6bed4b
GitHub-Last-Rev: 79c26dcd53
GitHub-Pull-Request: golang/go#39829
Reviewed-on: https://go-review.googlesource.com/c/go/+/239701
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-08-17 21:05:19 +00:00
Austin Clements 7bbd5ca5a6 runtime: replace index and contains with bytealg calls
The runtime has its own implementation of string indexing. To reduce
code duplication and cognitive load, replace this with calls to the
internal/bytealg package. We can't do this on Plan 9 because it needs
string indexing in a note handler (which isn't allowed to use the
optimized bytealg version because it uses SSE), so we can't just
eliminate the index function, but this CL does down-scope it to make
it clear it's only for note handlers on Plan 9.

Change-Id: Ie1a142678262048515c481e8c26313b80c5875df
Reviewed-on: https://go-review.googlesource.com/c/go/+/244537
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-08-17 13:20:03 +00:00
Michael Anthony Knyszek 6b4dcf19fa runtime: hold sched.lock over globrunqputbatch in runqputbatch
globrunqputbatch should never be called without sched.lock held.
runqputbatch's documentation even says it may acquire sched.lock in
order to call it.

Fixes #40457.

Change-Id: I5421b64f1da3a6087dfebbef7203db0c95d213a8
Reviewed-on: https://go-review.googlesource.com/c/go/+/245377
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-07-30 15:46:39 +00:00
Michael Pratt 85afa2eb19 runtime: ensure startm new M is consistently visible to checkdead
If no M is available, startm first grabs an idle P, then drops
sched.lock and calls newm to start a new M to run that P.

Unfortunately, that leaves a window in which a G (e.g., returning from a
syscall) may find no idle P, add itself to the global runq, and then in stopm
discover that there are no running M's, a condition that should be
impossible with runnable G's.

To avoid this condition, we pre-allocate the new M ID in startm before
dropping sched.lock. This ensures that checkdead will see the M as
running, and since that new M must eventually run the scheduler, it will
handle any pending work as necessary.

Outside of startm, most other calls to newm/allocm don't have a P at
all. The only exception is startTheWorldWithSema, which always has an M
if there is 1 P (i.e., the currently running M), and if there is >1 P
the findrunnable spinning dance ensures the problem never occurs.

This has been tested with strategically placed sleeps in the runtime to
help induce the correct race ordering, but the timing on this is too
narrow for a test that can be checked in.

Fixes #40368

Change-Id: If5e0293a430cc85154b7ed55bc6dadf9b340abe2
Reviewed-on: https://go-review.googlesource.com/c/go/+/245018
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2020-07-28 16:59:04 +00:00
Ian Lance Taylor 9b90491e4a runtime: steal timers from running P's
Previously we did not steal timers from running P's, because that P
should be responsible for running its own timers. However, if the P
is running a CPU-bound G, this can cause measurable delays in running
ready timers. Also, in CL 214185 we avoided taking the timer lock of a P
with no ready timers, which reduces the chances of timer lock contention.

So, if we can't find any ready timers on sleeping P's, try stealing
them from running P's.

Fixes #38860

Change-Id: I0bf1d5dc56258838bdacccbf89493524e23d7fed
Reviewed-on: https://go-review.googlesource.com/c/go/+/232199
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2020-06-03 05:33:54 +00:00
Richard Musiol 0452f9460f runtime: fix race condition between timer and event handler
This change fixes a race condition between beforeIdle waking up the
innermost event handler and a timer causing a different goroutine to
wake up at the exact same moment. This messes up the wasm event handling
and leads to memory corruption. The solution is to make beforeIdle
return the goroutine that must run next and have findrunnable pick
this goroutine without considering timers again.

Fixes #38093
Fixes #38574

Change-Id: Iffbe99411d25c2730953d1c8b0741fd892f8e540
Reviewed-on: https://go-review.googlesource.com/c/go/+/230178
Run-TryBot: Richard Musiol <neelance@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-05-31 18:35:04 +00:00
Michael Pratt 11b3730a02 runtime: disable preemption in startTemplateThread
When a locked M wants to start a new M, it hands off to the template
thread to actually call clone and start the thread. The template thread
is lazily created the first time a thread is locked (or if cgo is in
use).

stoplockedm will release the P (_Pidle), then call handoffp to give the
P to another M. In the case of a pending STW, one of two things can
happen:

1. handoffp starts an M, which does acquirep followed by schedule, which
will finally enter _Pgcstop.

2. handoffp immediately enters _Pgcstop. This only occurs if the P has
no local work, GC work, and no spinning M is required.

If handoffp starts an M, and must create a new M to do so, then newm
will simply queue the M on newmHandoff for the template thread to do the
clone.

When a stop-the-world is required, stopTheWorldWithSema will start the
stop and then wait for all Ps to enter _Pgcstop. If the template thread
is not fully created because startTemplateThread gets stopped, then
another stoplockedm may queue an M that will never get created, and the
handoff P will never leave _Pidle. Thus stopTheWorldWithSema will wait
forever.

A sequence to trigger this hang when STW occurs can be visualized with
two threads:

  T1                                 T2
-------------------------------   -----------------------------

LockOSThread                      LockOSThread
  haveTemplateThread == 0
  startTemplateThread
    haveTemplateThread = 1
    newm                            haveTemplateThread == 1
      preempt -> schedule           g.m.lockedExt++
        gcstopm -> _Pgcstop         g.m.lockedg = ...
        park                        g.lockedm = ...
                                    return

                                 ... (any code)
                                   preempt -> schedule
                                     stoplockedm
                                       releasep -> _Pidle
                                       handoffp
                                         startm (first 3 handoffp cases)
                                          newm
                                            g.m.lockedExt != 0
                                            Add to newmHandoff, return
                                       park

Note that the P in T2 is stuck sitting in _Pidle. Since the template
thread isn't running, the new M will not be started to complete the
transition to _Pgcstop.

To resolve this, we disable preemption around the assignment of
haveTemplateThread and the creation of the template thread in order to
guarantee that if haveTemplateThread is set then the template thread
will eventually exist, in the presence of stops.

Fixes #38931

Change-Id: I50535fbbe2f328f47b18e24d9030136719274191
Reviewed-on: https://go-review.googlesource.com/c/go/+/232978
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-05-21 21:01:39 +00:00
Michael Anthony Knyszek c847589ad0 runtime: synchronize StartTrace and StopTrace with sysmon
Currently sysmon is not stopped when the world is stopped, which is
in general a difficult thing to do. The result of this is that when
tracing starts and the value of trace.enabled changes, it's possible
for sysmon to fail to emit an event when it really should. This leads to
traces which the execution trace parser deems inconsistent.

Fix this by putting all of sysmon's work behind a new lock sysmonlock.
StartTrace and StopTrace both acquire this lock after stopping the world
but before performing any work in order to ensure sysmon sees the
required state change in tracing. This change is expected to slow down
StartTrace and StopTrace, but will help ensure consistent traces are
generated.

Updates #29707.
Fixes #38794.

Change-Id: I64c58e7c3fd173cd5281ffc208d6db24ff6c0284
Reviewed-on: https://go-review.googlesource.com/c/go/+/234617
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-05-21 14:48:50 +00:00
Dan Scales f9640b88c7 runtime: incorporate Gscan acquire/release into lock ranking order
I added routines that can acquire/release a particular rank without
acquiring/releasing an associated lock. I added lockRankGscan as a rank
for acquiring/releasing the Gscan bit.

castogscanstatus() and casGtoPreemptScan() are acquires of the Gscan
bit. casfrom_Gscanstatus() is a release of the Gscan bit. casgstatus()
is like an acquire and release of the Gscan bit, since it will wait if
Gscan bit is currently set.

We have a cycle between hchan and Gscan. The acquisition of Gscan and
then hchan only happens in syncadjustsudogs() when the G is suspended,
so the main normal ordering (get hchan, then get Gscan) can't be
happening. So, I added a new rank lockRankHchanLeaf that is used when
acquiring hchan locks in syncadjustsudogs. This ranking is set so no
other locks can be acquired except other hchan locks.

Fixes #38922

Change-Id: I58ce526a74ba856cb42078f7b9901f2832e1d45c
Reviewed-on: https://go-review.googlesource.com/c/go/+/228417
Run-TryBot: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2020-05-07 20:45:42 +00:00
徐志强 85162292af runtime: call osyield directly in lockextra
The `yield := osyield` line doesn't serve any purpose; it was committed in 2015, so it's time to delete that line. :)

Change-Id: I382d4d32cf320f054f011f3b6684c868cbcb0ff2
GitHub-Last-Rev: 7a0aa25e55
GitHub-Pull-Request: golang/go#36078
Reviewed-on: https://go-review.googlesource.com/c/go/+/210837
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-05-07 04:59:13 +00:00
geedchin 01a9cf8487 runtime: correct waitReasonForceGGIdle to waitReasonForceGCIdle
Change-Id: I211db915ce2e98555c58f4320ca58e91536f8f3d
GitHub-Last-Rev: 40a7430f88
GitHub-Pull-Request: golang/go#38852
Reviewed-on: https://go-review.googlesource.com/c/go/+/232037
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-05-05 02:38:39 +00:00
Michael Anthony Knyszek 2491c5fd24 runtime: wake scavenger and update address on sweep done
This change modifies the semantics of waking the scavenger: rather than
wake on any update to pacing, wake when we know we will have work to do,
that is, when the sweeper is done. The current scavenger runs over the
address space just once per GC cycle, and we want to maximize the chance
that the scavenger observes the most attractive scavengable memory in
that pass (i.e. free memory with the highest address), so the timing is
important. By having the scavenger awaken and reset its search space
when the sweeper is done, we increase the chance that the scavenger will
observe the most attractive scavengable memory, because no more memory
will be freed that GC cycle (so the highest scavengable address should
now be available).

Furthermore, in applications that go idle, this means the background
scavenger will be awoken even if another GC doesn't happen, which isn't
true today.

However, we're unable to wake the scavenger directly from within the
sweeper; waking the scavenger involves modifying timers and readying
goroutines, the latter of which may trigger an allocation today (and the
sweeper may run during allocation!). Instead, we do the following:

1. Set a flag which is checked by sysmon. sysmon will clear the flag and
   wake the scavenger.
2. Wake the scavenger unconditionally at sweep termination.

The idea behind this policy is that it gets us close enough to the state
above without having to deal with the complexity of waking the scavenger
in deep parts of the runtime. If the application goes idle and sweeping
finishes (so we don't reach sweep termination), then sysmon will wake
the scavenger. sysmon has a worst-case 20 ms delay in responding to this
signal, which is probably fine if the application is completely idle
anyway, but if the application is actively allocating, then the
proportional sweeper should help ensure that sweeping ends very close to
sweep termination, so sweep termination is a perfectly reasonable time
to wake up the scavenger.

Updates #35788.

Change-Id: I84289b37816a7d595d803c72a71b7f5c59d47e6b
Reviewed-on: https://go-review.googlesource.com/c/go/+/207998
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2020-04-30 18:12:03 +00:00
Austin Clements 4e00b4c366 runtime: move condition into wakep
All five calls to wakep are protected by the same check of nmidle and
nmspinning. Move this check into wakep.

Change-Id: I2094eec211ce551e462e87614578f37f1896ba38
Reviewed-on: https://go-review.googlesource.com/c/go/+/230757
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2020-04-30 00:42:35 +00:00
Austin Clements b3863fbbc2 runtime: make newproc1 not start the goroutine
Currently, newproc1 allocates, initializes, and schedules a new
goroutine. We're about to change debug call injection in a way that
will need to create a new goroutine without immediately scheduling it.
To prepare for that, make scheduling the responsibility of newproc1's
caller. Currently, there's exactly one caller (newproc), so this
simply shifts that responsibility.

For #36365.

Change-Id: Idacd06b63e738982e840fe995d891bfd377ce23b
Reviewed-on: https://go-review.googlesource.com/c/go/+/229298
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-04-29 21:29:08 +00:00
Dan Scales 0a820007e7 runtime: static lock ranking for the runtime (enabled by GOEXPERIMENT)
I took some of the infrastructure from Austin's lock logging CR
https://go-review.googlesource.com/c/go/+/192704 (with deadlock
detection from the logs), and developed a setup to give static lock
ranking for runtime locks.

Static lock ranking establishes a documented total ordering among locks,
and then reports an error if the total order is violated. This can
happen if a deadlock happens (by acquiring a sequence of locks in
different orders), or if just one side of a possible deadlock happens.
Lock ordering deadlocks cannot happen as long as the lock ordering is
followed.

Along the way, I found a deadlock involving the new timer code, which Ian fixed
via https://go-review.googlesource.com/c/go/+/207348, as well as two other
potential deadlocks.

See the constants at the top of runtime/lockrank.go to show the static
lock ranking that I ended up with, along with some comments. This is
great documentation of the current intended lock ordering when acquiring
multiple locks in the runtime.

I also added an array lockPartialOrder[] which shows and enforces the
current partial ordering among locks (which is embedded within the total
ordering). This is more specific about the dependencies among locks.
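
A toy sketch of the ranking check (the ranks and names below are
invented; the real ordering lives in runtime/lockrank.go): acquiring a
lock whose rank is not strictly greater than every rank already held is
reported as an ordering violation.

    package main

    import "fmt"

    type lockRank int

    const (
        rankSched lockRank = iota + 1
        rankTimers
        rankHchan
    )

    // held models the ranks currently held by one M (illustrative only).
    var held []lockRank

    func acquire(r lockRank) {
        for _, h := range held {
            if h >= r {
                panic(fmt.Sprintf("lock ordering violation: acquiring rank %d while holding rank %d", r, h))
            }
        }
        held = append(held, r)
    }

    func release(r lockRank) {
        for i := len(held) - 1; i >= 0; i-- {
            if held[i] == r {
                held = append(held[:i], held[i+1:]...)
                return
            }
        }
    }

    func main() {
        acquire(rankSched)
        acquire(rankTimers) // ok: ranks increase
        release(rankTimers)
        release(rankSched)
    }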

I don't try to check the ranking within a lock class with multiple locks
that can be acquired at the same time (i.e. check the ranking when
multiple hchan locks are acquired).

Currently, I am doing a lockInit() call to set the lock rank of most
locks. Any lock that is not otherwise initialized is assumed to be a
leaf lock (a very high rank lock), so that eliminates the need to do
anything for a bunch of locks (including all architecture-dependent
locks). For two locks, root.lock and notifyList.lock (only in the
runtime/sema.go file), it is not as easy to do lock initialization, so
instead, I am passing the lock rank with the lock calls.

For Windows compilation, I needed to increase the StackGuard size from
896 to 928 because of the new lock-rank checking functions.

Checking of the static lock ranking is enabled by setting
GOEXPERIMENT=staticlockranking before doing a run.

To make sure that the static lock ranking code has no overhead in memory
or CPU when not enabled by GOEXPERIMENT, I changed 'go build/install' so
that it defines a build tag (with the same name) whenever any experiment
has been baked into the toolchain (by checking Expstring()). This allows
me to avoid increasing the size of the 'mutex' type when static lock
ranking is not enabled.

Fixes #38029

Change-Id: I154217ff307c47051f8dae9c2a03b53081acd83a
Reviewed-on: https://go-review.googlesource.com/c/go/+/207619
Reviewed-by: Dan Scales <danscales@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-07 21:51:03 +00:00
Michael Anthony Knyszek f1f947af28 runtime: don't hold worldsema across mark phase
This change makes it so that worldsema isn't held across the mark phase.
This means that various operations like ReadMemStats may now stop the
world during the mark phase, reducing latency on such operations.
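
For reference, the latency-sensitive operation here is an ordinary
runtime.ReadMemStats call, as in this minimal sketch:

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        var ms runtime.MemStats
        // With worldsema no longer held across marking, this brief
        // stop-the-world can proceed even while the GC is in its mark phase.
        runtime.ReadMemStats(&ms)
        fmt.Println("heap in use:", ms.HeapInuse)
    }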

Only three such operations are still no longer allowed to occur during
marking: GOMAXPROCS, StartTrace, and StopTrace.

For the former it's because any change to GOMAXPROCS impacts GC mark
background worker scheduling and the details there are tricky.

For the latter two it's because tracing needs to observe consistent GC
start and GC end events, and if StartTrace or StopTrace may stop the
world during marking, then it's possible for it to see a GC end event
without a start or GC start event without an end, respectively.

To ensure that GOMAXPROCS and StartTrace/StopTrace cannot proceed until
marking is complete, the runtime now holds a new semaphore, gcsema,
across the mark phase just like it used to with worldsema.

This change is being landed once more after being reverted in the Go
1.14 release cycle, since CL 215157 allows it to have a positive
effect on system performance.

For the benchmark BenchmarkReadMemStatsLatency in the runtime, which
measures ReadMemStats latencies while the GC is exercised, the tail of
these latencies reduced dramatically on an 8-core machine:

name                   old 50%tile-ns  new 50%tile-ns  delta
ReadMemStatsLatency-8      4.40M ±74%      0.12M ± 2%  -97.35%  (p=0.008 n=5+5)

name                   old 90%tile-ns  new 90%tile-ns  delta
ReadMemStatsLatency-8       102M ± 6%         0M ±14%  -99.79%  (p=0.008 n=5+5)

name                   old 99%tile-ns  new 99%tile-ns  delta
ReadMemStatsLatency-8       147M ±18%         4M ±57%  -97.43%  (p=0.008 n=5+5)

Fixes #19812.

Change-Id: If66c3c97d171524ae29f0e7af4bd33509d9fd0bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/216557
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-03-18 19:13:50 +00:00
Ian Lance Taylor 2fbca94db7 runtime: add goroutines returned by poller to local run queue
In Go 1.13, when the network poller found a list of ready goroutines,
they were added to the global run queue. The timer goroutine would
typically sleep in a futex with a timeout, and when the timeout
expired the timer goroutine would either be handed off to an idle P
or added to the global run queue. The effect was that on a busy system
with no idle P's goroutines waiting for timeouts and goroutines waiting
for the network would start at the same priority.

That changed on tip with the new timer code. Now timer functions are
invoked directly from a P, and it happens that the functions used
by time.Sleep and time.After and time.Ticker add the newly ready
goroutines to the local run queue. When a P looks for work it will
prefer goroutines on the local run queue; in fact it will only
occasionally look at the global run queue, and even when it does it
will just pull one goroutine off. So on a busy system with both active
timers and active network connections the system can noticeably prefer
to run goroutines waiting for timers rather than goroutines waiting
for the network.

This CL undoes that change by, when possible, adding goroutines
waiting for the network to the local run queue of the P that checked.
This doesn't affect network poller checks done by sysmon, but it
does affect network poller checks done as each P enters the scheduler.

This CL also makes injecting a list into either the local or global run
queue more efficient, using bulk operations rather than individual ones.

Change-Id: I85a66ad74e4fc3b458256fb7ab395d06f0d2ffac
Reviewed-on: https://go-review.googlesource.com/c/go/+/216198
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2020-03-16 22:31:39 +00:00
Ian Lance Taylor 3093959ee1 runtime: remove mcache field from m
Having an mcache field in both m and p is confusing, so remove it from m.
Always use mcache field from p. Use new variable mcache0 during bootstrap.

Change-Id: If2cba9f8bb131d911d512b61fd883a86cf62cc98
Reviewed-on: https://go-review.googlesource.com/c/go/+/205239
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2020-02-24 16:39:52 +00:00
Michael Knyszek 64c22b70bf Revert "runtime: don't hold worldsema across mark phase"
This reverts commit 7b294cdd8d, CL 182657.

Reason for revert: This change may be causing latency problems
for applications which call ReadMemStats, because it may cause
all goroutines to stop until the GC completes.

https://golang.org/cl/215157 fixes this problem, but it's too
late in the cycle to land that.

Updates #19812.

Change-Id: Iaa26f4dec9b06b9db2a771a44e45f58d0aa8f26d
Reviewed-on: https://go-review.googlesource.com/c/go/+/216358
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-01-24 23:27:33 +00:00
Ian Lance Taylor 895b7c85ad runtime: don't skip checkTimers if we would clear deleted timers
The timers code used to have a problem: if code started and stopped a
lot of timers, as would happen with, for example, lots of calls to
context.WithTimeout, then it would steadily use memory holding timers
that had stopped but not been removed from the timer heap.
That problem was fixed by CL 214299, which would remove all deleted
timers whenever they got to be more than 1/4 of the total number of
timers on the heap.
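
A sketch of the kind of workload described above: each iteration starts a
timer and immediately stops it, leaving a deleted timer on the P's heap
until cleanup runs.

    package main

    import (
        "context"
        "time"
    )

    func main() {
        ctx := context.Background()
        for i := 0; i < 100000; i++ {
            // WithTimeout starts a timer; cancel stops it. The stopped
            // timer stays on the P's timer heap as "deleted" until the
            // runtime removes it.
            c, cancel := context.WithTimeout(ctx, time.Hour)
            _ = c
            cancel()
        }
    }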

The timers code had a different problem: if there were some idle P's,
the running P's would have lock contention trying to steal their timers.
That problem was fixed by CL 214185, which only acquired the timer lock
if the next timer was ready to run or there were some timers to adjust.

Unfortunately, CL 214185 partially undid 214299, in that we could now
accumulate an increasing number of deleted timers while there were no
timers ready to run. This CL restores the 214299 behavior, by checking
whether there are lots of deleted timers without acquiring the lock.

This is a performance issue to consider for the 1.14 release.

Change-Id: I13c980efdcc2a46eb84882750c39e3f7c5b2e7c3
Reviewed-on: https://go-review.googlesource.com/c/go/+/215722
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-01-22 18:10:42 +00:00
Ian Lance Taylor cfe3cd903f runtime: keep P's first timer when in new atomically accessed field
This reduces lock contention when only a few P's are running and
checking for whether they need to run timers on the sleeping P's.
Without this change the running P's would get lock contention
while looking at the sleeping P's timers. With this change a single
atomic load suffices to determine whether there are any ready timers.

Change-Id: Ie843782bd56df49867a01ecf19c47498ec827452
Reviewed-on: https://go-review.googlesource.com/c/go/+/214185
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
2020-01-14 19:54:20 +00:00
Ian Lance Taylor 641e61db57 runtime: don't let P's timer heap get clogged with deleted timers
Whenever more than 1/4 of the timers on a P's heap are deleted,
remove them from the heap.

Change-Id: Iff63ed3d04e6f33ffc5c834f77f645c52c007e52
Reviewed-on: https://go-review.googlesource.com/c/go/+/214299
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2020-01-10 23:03:06 +00:00
Rhys Hiltner a4c579e8f7 runtime: emit trace event in direct semaphore handoff
When a goroutine yields the remainder of its time to another goroutine
during direct semaphore handoff (as in an Unlock of a sync.Mutex in
starvation mode), it needs to signal that change to the execution
tracer. The discussion in CL 200577 didn't reach consensus on how best
to describe that, but pointed out that "traceEvGoSched / goroutine calls
Gosched" could be confusing.

Emit a "traceEvGoPreempt / goroutine is preempted" event in this case,
to allow the execution tracer to find a consistent event ordering
without being both specific and inaccurate about why the active
goroutine has changed.

Fixes #36186

Change-Id: Ic4ade19325126db2599aff6aba7cba028bb0bee9
Reviewed-on: https://go-review.googlesource.com/c/go/+/211797
Run-TryBot: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2020-01-02 20:13:03 +00:00
Ian Lance Taylor 580337e268 runtime, time: remove old timer code
Updates #6239
Updates #27707

Change-Id: I65e6471829c9de4677d3ac78ef6cd7aa0a1fc4cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/171884
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
2019-11-19 15:30:58 +00:00
Ian Lance Taylor 2d8c1995b9 runtime: release timersLock while running timer
Dan Scales pointed out a theoretical deadlock in the runtime.

The timer code runs timer functions while holding the timers lock for a P.
The scavenger queues up a timer function that calls wakeScavenger,
which acquires the scavenger lock.

The scavengeSleep function acquires the scavenger lock,
then calls resetTimer which can call addInitializedTimer
which acquires the timers lock for the current P.

So there is a potential deadlock, in that the scavenger lock and
the timers lock for some P may both be acquired in different order.
It's not clear to me whether this deadlock can ever actually occur.

Issue 35532 describes another possible deadlock.

The pollSetDeadline function acquires pd.lock for some poll descriptor,
and in some cases calls resettimer which can in some cases acquire
the timers lock for the current P.

The timer code runs timer functions while holding the timers lock for a P.
The timer function for poll descriptors winds up in netpolldeadlineimpl
which acquires pd.lock.

So again there is a potential deadlock, in that the pd lock for some
poll descriptor and the timers lock for some P may both be acquired in
different order. I think this can happen if we change the deadline
for a network connection exactly as the former deadline expires.

Looking at the code, I don't see any reason why we have to hold
the timers lock while running a timer function.
This CL implements that change.

Updates #6239
Updates #27707
Fixes #35532

Change-Id: I17792f5a0120e01ea07cf1b2de8434d5c10704dd
Reviewed-on: https://go-review.googlesource.com/c/go/+/207348
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2019-11-19 02:41:53 +00:00
Ian Lance Taylor e762378c42 runtime: acquire timersLocks around moveTimers
In the discussion of CL 171828 we decided that it was not necessary to
acquire timersLock around the call to moveTimers, because the world is
stopped. However, that is not correct, as sysmon runs even when the world
is stopped, and it calls timeSleepUntil which looks through the timers.
timeSleepUntil acquires timersLock, but that doesn't help if moveTimers
is running at the same time.

Updates #6239
Updates #27707
Updates #35462

Change-Id: I346c5bde594c4aff9955ae430b37c2b6fc71567f
Reviewed-on: https://go-review.googlesource.com/c/go/+/206938
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2019-11-13 18:03:37 +00:00
Carlo Alberto Ferraris 97d0505334 runtime: consistently seed fastrand state across archs
Some, but not all, architectures mix in OS-provided random seeds when
initializing the fastrand state. The others have TODOs saying we need
to do the same. Lift that logic up in the architecture-independent
part, and use memhash to mix the seed instead of a simple addition.
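
An illustrative sketch of the mixing idea, using a generic 64-bit integer
hash in place of the runtime's memhash: hashing the per-thread value
together with the OS-provided seed decorrelates the per-thread states in
a way a simple addition does not.

    package main

    import "fmt"

    // mix64 is a splitmix64-style finalizer standing in for memhash here.
    func mix64(x uint64) uint64 {
        x ^= x >> 30
        x *= 0xbf58476d1ce4e5b9
        x ^= x >> 27
        x *= 0x94d049bb133111eb
        x ^= x >> 31
        return x
    }

    func main() {
        osSeed := uint64(0x9e3779b97f4a7c15) // pretend OS-provided entropy
        for tid := uint64(0); tid < 4; tid++ {
            // Mixing rather than adding makes nearby thread IDs produce
            // uncorrelated initial states.
            fmt.Printf("thread %d: %#016x\n", tid, mix64(osSeed^tid))
        }
    }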

Previously, dumping the fastrand state at initialization would yield
something like the following on linux-amd64, where the values in the
first column do not change between runs (as thread IDs are sequential
and always start at 0), and the values in the second column, while
changing every run, are pretty correlated:

first run:

0x0 0x44d82f1c
0x5f356495 0x44f339de
0xbe6ac92a 0x44f91cd8
0x1da02dbf 0x44fd91bc
0x7cd59254 0x44fee8a4
0xdc0af6e9 0x4547a1e0
0x3b405b7e 0x474c76fc
0x9a75c013 0x475309dc
0xf9ab24a8 0x4bffd075

second run:

0x0 0xa63fc3eb
0x5f356495 0xa6648dc2
0xbe6ac92a 0xa66c1c59
0x1da02dbf 0xa671bce8
0x7cd59254 0xa70e8287
0xdc0af6e9 0xa7129d2e
0x3b405b7e 0xa7379e2d
0x9a75c013 0xa7e4c64c
0xf9ab24a8 0xa7ecce07

With this change, we get initial states that appear to be much more
unpredictable, both within the same run as well as between runs:

0x11bddad7 0x97241c63
0x553dacc6 0x2bcd8523
0x62c01085 0x16413d92
0x6f40e9e6 0x7a138de6
0xa4898053 0x70d816f0
0x5ca5b433 0x188a395b
0x62778ca9 0xd462c3b5
0xd6e160e4 0xac9b4bd
0xb9571d65 0x597a981d

Change-Id: Ib22c530157d74200df0083f830e0408fd4aaea58
Reviewed-on: https://go-review.googlesource.com/c/go/+/203439
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-11-12 21:40:12 +00:00
Rhys Hiltner 7148478f1b sync: yield to the waiter when unlocking a starving mutex
When we have already assigned the semaphore ticket to a specific
waiter, we want to get the waiter running as fast as possible since
no other G waiting on the semaphore can acquire it optimistically.

The net effect is that, when a sync.Mutex is contended, the code in
the critical section guarded by the Mutex gets a priority boost.

Fixes #33747

The original work was done in CL 200577 by Carlo Alberto Ferraris. The
change was reverted in CL 205817 because it broke the linux-arm64-packet
and solaris-amd64-oraclerel builders.

Change-Id: I76d79b1d63fd206ed1c57fe6900cb7ae9e4d46cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/206180
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-11-09 19:31:32 +00:00
Michael Anthony Knyszek a2cd2bd55d runtime: add per-p page allocation cache
This change adds a per-p free page cache which the page allocator may
allocate out of without a lock. The change also introduces a completely
lockless page allocator fast path.

Although the cache contains at most 64 pages (and usually less), the
vast majority (85%+) of page allocations are exactly 1 page in size.

Updates #35112.

Change-Id: I170bf0a9375873e7e3230845eb1df7e5cf741b78
Reviewed-on: https://go-review.googlesource.com/c/go/+/195701
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-08 18:00:54 +00:00
Michael Anthony Knyszek 4517c02f28 runtime: add per-p mspan cache
This change adds a per-p mspan object cache similar to the sudog cache.
Unfortunately this cache can't quite operate like the sudog cache, since
it is used in contexts where write barriers are disallowed (i.e.
allocation codepaths), so rather than managing an array and a slice,
it's just an array and a length. A little bit more unsafe, but avoids
any write barriers.

The purpose of this change is to reduce the number of operations which
require the heap lock in allocation, paving the way for a lockless fast
path.

Updates #35112.

Change-Id: I32cfdcd8528fb7be985640e4f3a13cb98ffb7865
Reviewed-on: https://go-review.googlesource.com/c/go/+/196642
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-08 17:01:32 +00:00
Bryan C. Mills 73d57bf80f Revert "sync: yield to the waiter when unlocking a starving mutex"
This reverts CL 200577.

Reason for revert: broke linux-arm64-packet and solaris-amd64-oraclerel builders

Fixes #35424
Updates #33747

Change-Id: I2575fd84d37995d458183caae54704f15d8b8426
Reviewed-on: https://go-review.googlesource.com/c/go/+/205817
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-11-07 15:04:03 +00:00
Carlo Alberto Ferraris a8f57f4ada sync: yield to the waiter when unlocking a starving mutex
When we have already assigned the semaphore ticket to a specific
waiter, we want to get the waiter running as fast as possible since
no other G waiting on the semaphore can acquire it optimistically.

The net effect is that, when a sync.Mutex is contended, the code in
the critical section guarded by the Mutex gets a priority boost.

Fixes #33747

Change-Id: I9967f0f763c25504010651bdd7f944ee0189cd45
Reviewed-on: https://go-review.googlesource.com/c/go/+/200577
Reviewed-by: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-11-07 05:59:33 +00:00
Ian Lance Taylor b50fcc88e9 runtime: don't hold scheduler lock when calling timeSleepUntil
Otherwise, we can get into a deadlock: sysmon takes the scheduler lock
and calls timeSleepUntil which takes each P's timer lock. Simultaneously,
some P calls runtimer (holding the P's own timer lock) which wakes up
the scavenger, calling goready, calling wakep, calling startm, getting
the scheduler lock. Now the sysmon thread is holding the scheduler lock
and trying to get a P's timer lock, while some other thread running on
that P is holding the P's timer lock and trying to get the scheduler lock.

So change sysmon to call timeSleepUntil without holding the scheduler
lock, and change timeSleepUntil to use allpLock, which is only held for
limited periods of time and should never compete with timer locks.

This hopefully

Fixes #35375

At least it should fix the linux-arm64-packet builder problems,
which occurred more reliably as that system has GOMAXPROCS == 96,
giving a lot more scope for this deadlock.

Change-Id: I7a7917daf7a4882e0b27ca416e4f6300cfaaa774
Reviewed-on: https://go-review.googlesource.com/c/go/+/205558
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2019-11-06 15:46:26 +00:00
Ian Lance Taylor d80ab3e85a runtime: wake netpoller when dropping P, don't sleep too long in sysmon
When dropping a P, if it has any timers, and if some thread is
sleeping in the netpoller, wake the netpoller to run the P's timers.
This mitigates races between the netpoller deciding how long to sleep
and a new timer being added.

In sysmon, if all P's are idle, check the timers to decide how long to sleep.
This avoids oversleeping if no thread is using the netpoller.
This can happen in particular if some threads use runtime.LockOSThread,
as those threads do not block in the netpoller.

Also, print the number of timers per P for GODEBUG=scheddetail=1.

Before this CL, TestLockedDeadlock2 would fail about 1% of the time.
With this CL, I ran it 150,000 times with no failures.

Updates #6239
Updates #27707
Fixes #35274
Fixes #35288

Change-Id: I7e5193e6c885e567f0b1ee023664aa3e2902fcd1
Reviewed-on: https://go-review.googlesource.com/c/go/+/204800
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2019-11-04 21:37:08 +00:00