mirror/go - go - Git Fam. Sieh

Commit Graph

Author	SHA1	Message	Date
Austin Clements	d19fedd180	runtime: move checkmarks to a separate bitmap Currently, the GC stores the object marks for checkmarks mode in the heap bitmap using a rather complex encoding: for one word objects, the checkmark is stored in the pointer/scalar bit since one word objects must be pointers; for larger objects, the checkmark is stored in what would be the scan/dead bit for the second word of the object. This encoding made more sense when the runtime used the first scan/dead bit as the regular mark bit, but we moved away from that long ago. This encoding and overloading of the heap bitmap bits causes a great deal of complexity in many parts of the allocator and garbage collector and leads to some subtle bugs like #15903. This CL moves the checkmarks mark bits into their own per-arena bitmap and reclaims the second scan/dead bit as a regular scan/dead bit. I tested this by enabling doubleCheck mode in heapBitsSetType and running in both regular and GODEBUG=gccheckmark=1 mode. Fixes #15903. No performance degradation. (Very slight improvement on a few benchmarks, but it's probably just noise.) 
name                                old time/op  new time/op  delta
BiogoIgor                           16.6s ± 1%   16.4s ± 1%   -0.94%  (p=0.000 n=25+24)
BiogoKrishna                        19.2s ± 3%   19.2s ± 3%   ~       (p=0.638 n=23+25)
BleveIndexBatch100                  6.12s ± 5%   6.17s ± 4%   ~       (p=0.170 n=25+25)
CompileTemplate                     206ms ± 1%   205ms ± 1%   -0.43%  (p=0.005 n=24+24)
CompileUnicode                      82.2ms ± 2%  81.5ms ± 2%  -0.95%  (p=0.001 n=22+22)
CompileGoTypes                      755ms ± 3%   754ms ± 4%   ~       (p=0.715 n=25+25)
CompileCompiler                     3.73s ± 1%   3.73s ± 1%   ~       (p=0.445 n=25+24)
CompileSSA                          8.67s ± 1%   8.66s ± 1%   ~       (p=0.836 n=24+22)
CompileFlate                        134ms ± 2%   133ms ± 1%   -0.66%  (p=0.001 n=24+23)
CompileGoParser                     164ms ± 1%   163ms ± 1%   -0.85%  (p=0.000 n=24+24)
CompileReflect                      466ms ± 5%   466ms ± 3%   ~       (p=0.863 n=25+25)
CompileTar                          182ms ± 1%   182ms ± 1%   -0.31%  (p=0.048 n=24+24)
CompileXML                          249ms ± 1%   248ms ± 1%   -0.32%  (p=0.031 n=21+25)
CompileStdCmd                       10.3s ± 1%   10.3s ± 1%   ~       (p=0.459 n=23+23)
FoglemanFauxGLRenderRotateBoat      8.66s ± 1%   8.62s ± 1%   -0.47%  (p=0.000 n=23+24)
FoglemanPathTraceRenderGopherIter1  20.3s ± 3%   20.2s ± 2%   ~       (p=0.893 n=25+25)
GopherLuaKNucleotide                29.7s ± 1%   29.8s ± 2%   ~       (p=0.421 n=24+25)
MarkdownRenderXHTML                 246ms ± 1%   247ms ± 1%   ~       (p=0.558 n=25+24)
Tile38WithinCircle100kmRequest      779µs ± 4%   779µs ± 3%   ~       (p=0.954 n=25+25)
Tile38IntersectsCircle100kmRequest  1.02ms ± 3%  1.01ms ± 4%  ~       (p=0.658 n=25+25)
Tile38KNearestLimit100Request       984µs ± 4%   986µs ± 4%   ~       (p=0.627 n=24+25)
[Geo mean]                          552ms        551ms        -0.19%
https://perf.golang.org/search?q=upload:20200723.6
Change-Id: Ic703f26a83fb034941dc6f4788fc997d56890dec Reviewed-on: https://go-review.googlesource.com/c/go/+/244539 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Martin Möhrmann <moehrmann@google.com>	2020-08-17 14:31:20 +00:00
Michael Anthony Knyszek	2491c5fd24	runtime: wake scavenger and update address on sweep done This change modifies the semantics of waking the scavenger: rather than wake on any update to pacing, wake when we know we will have work to do, that is, when the sweeper is done. The current scavenger runs over the address space just once per GC cycle, and we want to maximize the chance that the scavenger observes the most attractive scavengable memory in that pass (i.e. free memory with the highest address), so the timing is important. By having the scavenger awaken and reset its search space when the sweeper is done, we increase the chance that the scavenger will observe the most attractive scavengable memory, because no more memory will be freed that GC cycle (so the highest scavengable address should now be available). Furthermore, in applications that go idle, this means the background scavenger will be awoken even if another GC doesn't happen, which isn't true today. However, we're unable to wake the scavenger directly from within the sweeper; waking the scavenger involves modifying timers and readying goroutines, the latter of which may trigger an allocation today (and the sweeper may run during allocation!). Instead, we do the following: 1. Set a flag which is checked by sysmon. sysmon will clear the flag and wake the scavenger. 2. Wake the scavenger unconditionally at sweep termination. The idea behind this policy is that it gets us close enough to the state above without having to deal with the complexity of waking the scavenger in deep parts of the runtime. If the application goes idle and sweeping finishes (so we don't reach sweep termination), then sysmon will wake the scavenger. 
sysmon has a worst-case 20 ms delay in responding to this signal, which is probably fine if the application is completely idle anyway, but if the application is actively allocating, then the proportional sweeper should help ensure that sweeping ends very close to sweep termination, so sweep termination is a perfectly reasonable time to wake up the scavenger. Updates #35788. Change-Id: I84289b37816a7d595d803c72a71b7f5c59d47e6b Reviewed-on: https://go-review.googlesource.com/c/go/+/207998 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2020-04-30 18:12:03 +00:00
Michael Anthony Knyszek	a13691966a	runtime: add new mcentral implementation Currently mcentral is implemented as a couple of linked lists of spans protected by a lock. Unfortunately this design leads to significant lock contention. The span ownership model is also confusing and complicated. In-use spans jump between being owned by multiple sources, generally some combination of a gcSweepBuf, a concurrent sweeper, an mcentral or an mcache. So first to address contention, this change replaces those linked lists with gcSweepBufs which have an atomic fast path. Then, we change up the ownership model: a span may be simultaneously owned only by an mcentral and the page reclaimer. Otherwise, an mcentral (which now consists of sweep bufs), a sweeper, or an mcache are the sole owners of a span at any given time. This dramatically simplifies reasoning about span ownership in the runtime. As a result of this new ownership model, sweeping is now driven by walking over the mcentrals rather than having its own global list of spans. Because we no longer have a global list and we traditionally haven't used the mcentrals for large object spans, we no longer have anywhere to put large objects. So, this change also makes it so that we keep large object spans in the appropriate mcentral lists. In terms of the static lock ranking, we add the spanSet spine locks in pretty much the same place as the mcentral locks, since they have the potential to be manipulated both on the allocation and sweep paths, like the mcentral locks. This new implementation is turned on by default via a feature flag called go115NewMCentralImpl. 
Benchmark results for 1 KiB allocation throughput (5 runs each):
name \ MiB/s  go113       go114       gotip       gotip+this-patch
AllocKiB-1    1.71k ± 1%  1.68k ± 1%  1.59k ± 2%  1.71k ± 1%
AllocKiB-2    2.46k ± 1%  2.51k ± 1%  2.54k ± 1%  2.93k ± 1%
AllocKiB-4    4.27k ± 1%  4.41k ± 2%  4.33k ± 1%  5.01k ± 2%
AllocKiB-8    4.38k ± 3%  5.24k ± 1%  5.46k ± 1%  8.23k ± 1%
AllocKiB-12   4.38k ± 3%  4.49k ± 1%  5.10k ± 1%  10.04k ± 0%
AllocKiB-16   4.31k ± 1%  4.14k ± 3%  4.22k ± 0%  10.42k ± 0%
AllocKiB-20   4.26k ± 1%  3.98k ± 1%  4.09k ± 1%  10.46k ± 3%
AllocKiB-24   4.20k ± 1%  3.97k ± 1%  4.06k ± 1%  10.74k ± 1%
AllocKiB-28   4.15k ± 0%  4.00k ± 0%  4.20k ± 0%  10.76k ± 1%
Fixes #37487. Change-Id: I92d47355acacf9af2c41bf080c08a8c1638ba210 Reviewed-on: https://go-review.googlesource.com/c/go/+/221182 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2020-04-27 18:19:26 +00:00
Michael Anthony Knyszek	03ba6b070d	runtime: prevent preemption while releasing worldsema in gcStart Currently, as a result of us releasing worldsema now to allow STW events during a mark phase, we release worldsema between starting the world and having the goroutine block in STW mode. This inserts preemption points which, if followed through, could lead to a deadlock. Specifically, because user goroutine scheduling is disabled in STW mode, the goroutine will block before properly releasing worldsema. The fix here is to prevent preemption while releasing the worldsema. Fixes #38404. Updates #19812. Change-Id: I8ed5b3aa108ab2e4680c38e77b0584fb75690e3d Reviewed-on: https://go-review.googlesource.com/c/go/+/228337 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Austin Clements <austin@google.com>	2020-04-16 02:35:01 +00:00
Dan Scales	0a820007e7	runtime: static lock ranking for the runtime (enabled by GOEXPERIMENT) I took some of the infrastructure from Austin's lock logging CR https://go-review.googlesource.com/c/go/+/192704 (with deadlock detection from the logs), and developed a setup to give static lock ranking for runtime locks. Static lock ranking establishes a documented total ordering among locks, and then reports an error if the total order is violated. This can happen if a deadlock happens (by acquiring a sequence of locks in different orders), or if just one side of a possible deadlock happens. Lock ordering deadlocks cannot happen as long as the lock ordering is followed. Along the way, I found a deadlock involving the new timer code, which Ian fixed via https://go-review.googlesource.com/c/go/+/207348, as well as two other potential deadlocks. See the constants at the top of runtime/lockrank.go to show the static lock ranking that I ended up with, along with some comments. This is great documentation of the current intended lock ordering when acquiring multiple locks in the runtime. I also added an array lockPartialOrder[] which shows and enforces the current partial ordering among locks (which is embedded within the total ordering). This is more specific about the dependencies among locks. I don't try to check the ranking within a lock class with multiple locks that can be acquired at the same time (i.e. check the ranking when multiple hchan locks are acquired). Currently, I am doing a lockInit() call to set the lock rank of most locks. Any lock that is not otherwise initialized is assumed to be a leaf lock (a very high rank lock), so that eliminates the need to do anything for a bunch of locks (including all architecture-dependent locks). For two locks, root.lock and notifyList.lock (only in the runtime/sema.go file), it is not as easy to do lock initialization, so instead, I am passing the lock rank with the lock calls. 
For Windows compilation, I needed to increase the StackGuard size from 896 to 928 because of the new lock-rank checking functions. Checking of the static lock ranking is enabled by setting GOEXPERIMENT=staticlockranking before doing a run. To make sure that the static lock ranking code has no overhead in memory or CPU when not enabled by GOEXPERIMENT, I changed 'go build/install' so that it defines a build tag (with the same name) whenever any experiment has been baked into the toolchain (by checking Expstring()). This allows me to avoid increasing the size of the 'mutex' type when static lock ranking is not enabled. Fixes #38029 Change-Id: I154217ff307c47051f8dae9c2a03b53081acd83a Reviewed-on: https://go-review.googlesource.com/c/go/+/207619 Reviewed-by: Dan Scales <danscales@google.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Dan Scales <danscales@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-07 21:51:03 +00:00
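The static lock ranking idea in the commit above can be illustrated with a small user-level sketch. This is not the runtime's implementation: the `lockRank` values, the `rankedMutex` type, and the `heldLocks` tracker are all hypothetical names chosen for illustration, and the sketch only checks a total order (it does not model `lockPartialOrder[]`). Like the commit, it does not check ordering among locks of the same rank.

```go
package main

import (
	"fmt"
	"sync"
)

// lockRank is a hypothetical rank type: lower ranks must be acquired first.
type lockRank int

const (
	rankSched lockRank = iota // hypothetical example ranks, not the runtime's
	rankHeap
	rankLeaf
)

// rankedMutex pairs a mutex with its static rank.
type rankedMutex struct {
	mu   sync.Mutex
	rank lockRank
}

// heldLocks records the ranks currently held by one logical thread.
type heldLocks struct {
	ranks []lockRank
}

// lock acquires m, or returns an error if doing so would violate the
// total order (that is, a lock of strictly higher rank is already held).
func (h *heldLocks) lock(m *rankedMutex) error {
	for _, r := range h.ranks {
		if r > m.rank {
			return fmt.Errorf("lock ordering violation: holding rank %d, acquiring rank %d", r, m.rank)
		}
	}
	m.mu.Lock()
	h.ranks = append(h.ranks, m.rank)
	return nil
}

// unlock releases m and forgets the most recent record of its rank.
func (h *heldLocks) unlock(m *rankedMutex) {
	for i := len(h.ranks) - 1; i >= 0; i-- {
		if h.ranks[i] == m.rank {
			h.ranks = append(h.ranks[:i], h.ranks[i+1:]...)
			break
		}
	}
	m.mu.Unlock()
}

func main() {
	var h heldLocks
	sched := &rankedMutex{rank: rankSched}
	heap := &rankedMutex{rank: rankHeap}

	fmt.Println(h.lock(sched)) // in order: no error
	fmt.Println(h.lock(heap))  // in order: no error
	h.unlock(sched)
	fmt.Println(h.lock(sched)) // out of order: heap is still held
	h.unlock(heap)
}
```

The check reports a violation even when no deadlock actually occurs, which matches the commit's point that an error is raised if "just one side of a possible deadlock happens".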
Michael Anthony Knyszek	d1ecfcc1e8	runtime: ensure minTriggerRatio never exceeds maxTriggerRatio Currently, the capping logic for the GC trigger ratio is such that if gcpercent is low, we may end up setting the trigger ratio far too high, breaking the promise of SetGCPercent and GOGC as a trade-off knob (we won't start a GC early enough, and we will use more memory). This change modifies the capping logic for the trigger ratio by scaling the minTriggerRatio with gcpercent the same way we scale maxTriggerRatio. Fixes #37927. Change-Id: I2a048c1808fb67186333d3d5a6bee328be2f35da Reviewed-on: https://go-review.googlesource.com/c/go/+/223937 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2020-03-26 16:12:18 +00:00
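A minimal sketch of the capping logic described above, assuming both bounds are scaled by gcpercent/100. The 0.6 lower bound comes from the "place lower limit on trigger ratio" commit in this log; the 0.95 upper bound and the function name `clampTriggerRatio` are assumptions made for illustration, not taken from this page.

```go
package main

import "fmt"

// clampTriggerRatio caps triggerRatio between scaled lower and upper
// bounds. Because both bounds use the same scale factor, the lower bound
// can never exceed the upper bound, whatever gcpercent is.
func clampTriggerRatio(triggerRatio float64, gcpercent int) float64 {
	if gcpercent < 0 {
		// GC disabled: nothing to clamp in this sketch.
		return triggerRatio
	}
	scale := float64(gcpercent) / 100
	maxRatio := 0.95 * scale // assumed upper bound
	minRatio := 0.6 * scale  // lower bound from the commit further down this log
	if triggerRatio > maxRatio {
		triggerRatio = maxRatio
	}
	if triggerRatio < minRatio {
		triggerRatio = minRatio
	}
	return triggerRatio
}

func main() {
	fmt.Println(clampTriggerRatio(0.07, 100)) // raised to the lower bound
	fmt.Println(clampTriggerRatio(0.90, 10))  // lowered to the scaled upper bound
}
```

With an unscaled lower bound of 0.6 and gcpercent=10, the lower bound would sit above the scaled upper bound, which is the situation the commit title rules out.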
Michael Anthony Knyszek	f1f947af28	runtime: don't hold worldsema across mark phase This change makes it so that worldsema isn't held across the mark phase. This means that various operations like ReadMemStats may now stop the world during the mark phase, reducing latency on such operations. Only three such operations are still not allowed to occur during marking: GOMAXPROCS, StartTrace, and StopTrace. For the former it's because any change to GOMAXPROCS impacts GC mark background worker scheduling and the details there are tricky. For the latter two it's because tracing needs to observe consistent GC start and GC end events, and if StartTrace or StopTrace may stop the world during marking, then it's possible for it to see a GC end event without a start or GC start event without an end, respectively. To ensure that GOMAXPROCS and StartTrace/StopTrace cannot proceed until marking is complete, the runtime now holds a new semaphore, gcsema, across the mark phase just like it used to with worldsema. This change is being landed once more after being reverted in the Go 1.14 release cycle, since CL 215157 allows it to have a positive effect on system performance. For the benchmark BenchmarkReadMemStatsLatency in the runtime, which measures ReadMemStats latencies while the GC is exercised, the tail of these latencies reduced dramatically on an 8-core machine:
name                   old 50%tile-ns  new 50%tile-ns  delta
ReadMemStatsLatency-8  4.40M ±74%      0.12M ± 2%      -97.35%  (p=0.008 n=5+5)
name                   old 90%tile-ns  new 90%tile-ns  delta
ReadMemStatsLatency-8  102M ± 6%       0M ±14%         -99.79%  (p=0.008 n=5+5)
name                   old 99%tile-ns  new 99%tile-ns  delta
ReadMemStatsLatency-8  147M ±18%       4M ±57%         -97.43%  (p=0.008 n=5+5)
Fixes #19812. Change-Id: If66c3c97d171524ae29f0e7af4bd33509d9fd0bb Reviewed-on: https://go-review.googlesource.com/c/go/+/216557 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-03-18 19:13:50 +00:00
Michael Knyszek	64c22b70bf	Revert "runtime: don't hold worldsema across mark phase" This reverts commit `7b294cdd8d`, CL 182657. Reason for revert: This change may be causing latency problems for applications which call ReadMemStats, because it may cause all goroutines to stop until the GC completes. https://golang.org/cl/215157 fixes this problem, but it's too late in the cycle to land that. Updates #19812. Change-Id: Iaa26f4dec9b06b9db2a771a44e45f58d0aa8f26d Reviewed-on: https://go-review.googlesource.com/c/go/+/216358 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-01-24 23:27:33 +00:00
Michael Knyszek	ad3cef184e	Revert "runtime: release worldsema before Gosched in STW GC mode" This reverts commit `05511a5c0a`, CL 208379. Reason for revert: So that we can cleanly revert https://golang.org/cl/182657. Change-Id: I4fdf4f864a093db7866b3306f0f8f856b9f4d684 Reviewed-on: https://go-review.googlesource.com/c/go/+/216357 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-01-24 23:27:22 +00:00
Michael Anthony Knyszek	05511a5c0a	runtime: release worldsema before Gosched in STW GC mode After CL 182657 we no longer hold worldsema across the GC, we hold gcsema instead. However in STW GC mode we don't release worldsema before calling Gosched on the user goroutine (note that user goroutines are disabled during STW GC) so that user goroutine holds onto it. When the GC is done and the runtime inevitably wants to "stop the world" again (though there isn't much to stop) it'll sit there waiting for worldsema which won't be released until the aforementioned goroutine is scheduled, which it won't be until the GC is done! So, we have a deadlock. The fix is easy: just release worldsema before calling Gosched. Fixes #34736. Change-Id: Ia50db22ebed3176114e7e60a7edaf82f8535c1b4 Reviewed-on: https://go-review.googlesource.com/c/go/+/208379 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-11-22 17:33:48 +00:00
Michael Anthony Knyszek	dac936a4ab	runtime: make more page sweeper operations atomic This change makes it so that allocation and free related page sweeper metadata operations (e.g. pageInUse and pagesInUse) are atomic rather than protected by the heap lock. This will help in reducing the length of the critical path with the heap lock held in future changes. Updates #35112. Change-Id: Ie82bff024204dd17c4c671af63350a7a41add354 Reviewed-on: https://go-review.googlesource.com/c/go/+/196640 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2019-11-08 17:00:57 +00:00
Michael Knyszek	74af7fc603	runtime: place lower limit on trigger ratio This change makes it so that the GC pacer's trigger ratio can never fall below 0.6. Upcoming changes to the allocator make it significantly more scalable and thus much faster in certain cases, creating a large gap between the performance of allocation and scanning. The consequence of this is that the trigger ratio can drop very low (0.07 was observed) in order to drop GC utilization. A low trigger ratio like this results in a high amount of black allocations, which causes the live heap to appear larger, and thus the heap, and RSS, grows to a much higher stable point. This change alleviates the problem by placing a lower bound on the trigger ratio. The expected (and confirmed) effect of this is that utilization in certain scenarios will no longer converge to the expected 25%, and may go higher. As a result of this artificially high trigger ratio, more time will also be spent doing GC assists compared to dedicated mark workers, since the GC will be on for an artificially short fraction of time (artificial with respect to the pacer). The biggest concern of this change is that allocation latency will suffer as a result, since there will now be more assists. But, upcoming changes to the allocator reduce the latency enough to outweigh the expected increase in latency from this change, without the blowup in RSS observed from the changes to the allocator. Updates #35112. Change-Id: Idd7c94fa974d0de673304c4397e716e89bfbf09b Reviewed-on: https://go-review.googlesource.com/c/go/+/200439 Reviewed-by: Austin Clements <austin@google.com>	2019-11-04 22:52:25 +00:00
Austin Clements	d16ec13756	runtime: scan stacks conservatively at async safe points This adds support for scanning the stack when a goroutine is stopped at an async safe point. This is not yet lit up because asyncPreempt is not yet injected, but prepares us for that. This works by conservatively scanning the registers dumped in the frame of asyncPreempt and its parent frame, which was stopped at an asynchronous safe point. Conservative scanning works by only marking words that are pointers to valid, allocated heap objects. One complication is pointers to stack objects. In this case, we can't determine if the stack object is still "allocated" or if it was freed by an earlier GC. Hence, we need to propagate the conservative-ness of scanning stack objects: if all pointers found to a stack object were found via conservative scanning, then the stack object itself needs to be scanned conservatively, since its pointers may point to dead objects. For #10958, #24543. Change-Id: I7ff84b058c37cde3de8a982da07002eaba126fd6 Reviewed-on: https://go-review.googlesource.com/c/go/+/201761 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-11-02 21:51:16 +00:00
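The core rule of conservative scanning described above ("only marking words that are pointers to valid, allocated heap objects") can be sketched as follows. The `object` descriptor, the linear lookup in `findObject`, and the addresses used are hypothetical simplifications; the sketch does not model stack objects or the propagation of conservativeness the commit describes.

```go
package main

import "fmt"

// object is a hypothetical heap object descriptor.
type object struct {
	base, size uintptr
	allocated  bool
}

// findObject returns the index of the allocated object containing addr,
// or -1 if addr does not point into any allocated object.
func findObject(heap []object, addr uintptr) int {
	for i, o := range heap {
		if o.allocated && addr >= o.base && addr < o.base+o.size {
			return i
		}
	}
	return -1
}

// scanConservatively treats every word as a potential pointer, but marks
// an object only if the word lands inside a valid, allocated object.
func scanConservatively(words []uintptr, heap []object) []bool {
	marked := make([]bool, len(heap))
	for _, w := range words {
		if i := findObject(heap, w); i >= 0 {
			marked[i] = true
		}
	}
	return marked
}

func main() {
	heap := []object{
		{base: 0x1000, size: 64, allocated: true},
		{base: 0x2000, size: 64, allocated: false}, // freed slot
		{base: 0x3000, size: 32, allocated: true},
	}
	// Register-like words: an interior pointer, a pointer into a freed
	// slot, a small integer, and an address one past the last object.
	words := []uintptr{0x1010, 0x2008, 42, 0x3020}
	fmt.Println(scanConservatively(words, heap))
}
```

Words that merely look like addresses (the integer 42, or a stale pointer into the freed slot) are ignored, so a conservative scan can retain an object unnecessarily but never follows a word into unallocated memory.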
Austin Clements	8c5861576a	runtime: remove g.gcscanvalid Currently, gcscanvalid is used to resolve a race between attempts to scan a stack. Now that there's a clear owner of the stack scan operation, there's no longer any danger of racing or attempting to scan a stack more than once, so this CL eliminates gcscanvalid. I double-checked my reasoning by first adding a throw if gcscanvalid was set in scanstack and verifying that all.bash still passed. For #10958, #24543. Fixes #24363. Change-Id: I76794a5fcda325ed7cfc2b545e2a839b8b3bc713 Reviewed-on: https://go-review.googlesource.com/c/go/+/201139 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-10-25 23:25:32 +00:00
Michael Anthony Knyszek	62e4156552	runtime: fix lock acquire cycles related to scavenge.lock There are currently two edges in the lock cycle graph caused by scavenge.lock: with sched.lock and mheap_.lock. These edges appear because of the call to ready() and stack growths respectively. Furthermore, there's already an invariant in the code wherein mheap_.lock must be acquired before scavenge.lock, hence the cycle. The fix to this is to bring scavenge.lock higher in the lock cycle graph, such that sched.lock and mheap_.lock are only acquired once scavenge.lock is already held. To facilitate this change, we move scavenger waking outside of gcSetTriggerRatio such that it doesn't have to happen with the heap locked. Furthermore, we check scavenge generation numbers with the heap locked by using gopark instead of goparkunlock, and specify a function which aborts the park should there be any skew in generation count. Fixes #34047. Change-Id: I3519119214bac66375e2b1262b36ce376c820d12 Reviewed-on: https://go-review.googlesource.com/c/go/+/191977 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-09-26 15:44:49 +00:00
Michael Anthony Knyszek	9b30811280	runtime: redefine scavenge goal in terms of heap_inuse This change makes it so that the scavenge goal is defined primarily in terms of heap_inuse at the end of the last GC rather than next_gc. The reason behind this change is that next_gc doesn't take into account fragmentation, and we can fall into a situation where the scavenger thinks it should have work to do but there's no free and unscavenged memory available. In order to ensure the scavenge goal still tracks next_gc, we multiply heap_inuse by the ratio between the current heap goal and the last heap goal, which describes whether the heap is growing or shrinking, and by how much. Finally, this change updates the documentation for scavenging and elaborates on why the scavenge goal is defined the way it is. Fixes #34048. Updates #32828. Change-Id: I8deaf87620b5dc12a40ab8a90bf27932868610da Reviewed-on: https://go-review.googlesource.com/c/go/+/193040 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Keith Randall <khr@golang.org>	2019-09-25 22:15:39 +00:00
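The scavenge goal computation described in the commit above reduces to one multiplication. The following is a rough sketch under that reading; the function name `scavengeGoal` and its parameter names are hypothetical, and any extra slack or rounding the real runtime may apply is deliberately left out.

```go
package main

import "fmt"

// scavengeGoal scales heapInUse (measured at the end of the last GC) by
// the ratio of the current heap goal to the previous heap goal, so the
// result shrinks when the heap goal is falling and grows when it rises.
func scavengeGoal(heapInUse, heapGoal, lastHeapGoal uint64) uint64 {
	if lastHeapGoal == 0 {
		// No previous goal to compare against: leave heapInUse unscaled.
		return heapInUse
	}
	ratio := float64(heapGoal) / float64(lastHeapGoal)
	return uint64(float64(heapInUse) * ratio)
}

func main() {
	const MiB = 1 << 20
	// The heap goal halved since the last cycle, so the goal halves too.
	fmt.Println(scavengeGoal(100*MiB, 50*MiB, 100*MiB) / MiB)
}
```

Basing the goal on heap_inuse means fragmentation is already reflected in the starting figure, which is the motivation the commit gives for moving away from next_gc alone.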
Michael Knyszek	aae0b5b0b2	runtime: use hard heap goal if we've done more scan work than expected This change makes it so that if we're already finding ourselves in a situation where we've done more scan work than expected in the steady-state (that is, 50% of heap_scan for GOGC=100), then we fall back on the hard heap goal instead of continuing to assume the expected case. In some cases it's possible that we're already doing more scan work than expected, and if GC assists come in just at that window where we notice it, they might accumulate way too much assist credit, causing undue heap growths if GOMAXPROCS=1 (since the fractional background worker isn't guaranteed to fire). This case seems awfully specific, and that's because it's exactly the case for TestGcSys, which has been flaky for some time as a result. Fixes #28574, #27636, and #27156. Change-Id: I771f42bed34739dbb1b84ad82cfe247f70836031 Reviewed-on: https://go-review.googlesource.com/c/go/+/184097 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-09-04 21:52:18 +00:00
Michael Anthony Knyszek	7b294cdd8d	runtime: don't hold worldsema across mark phase This change makes it so that worldsema isn't held across the mark phase. This means that various operations like ReadMemStats may now stop the world during the mark phase, reducing latency on such operations. Only three such operations are still not allowed to occur during marking: GOMAXPROCS, StartTrace, and StopTrace. For the former it's because any change to GOMAXPROCS impacts GC mark background worker scheduling and the details there are tricky. For the latter two it's because tracing needs to observe consistent GC start and GC end events, and if StartTrace or StopTrace may stop the world during marking, then it's possible for it to see a GC end event without a start or GC start event without an end, respectively. To ensure that GOMAXPROCS and StartTrace/StopTrace cannot proceed until marking is complete, the runtime now holds a new semaphore, gcsema, across the mark phase just like it used to with worldsema. Fixes #19812. Change-Id: I15d43ed184f711b3d104e8f267fb86e335f86bf9 Reviewed-on: https://go-review.googlesource.com/c/go/+/182657 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-09-04 15:53:59 +00:00
Michael Anthony Knyszek	7ed7669c0d	runtime: ensure mheap lock stack growth invariant is maintained Currently there's an invariant in the runtime wherein the heap lock can only be acquired on the system stack, otherwise a self-deadlock could occur if the stack grows while the lock is held. This invariant is upheld and documented in a number of situations (e.g. allocManual, freeManual) but there are other places where the invariant is either not maintained at all which risks self-deadlock (e.g. setGCPercent, gcResetMarkState, allocmcache) or is maintained but undocumented (e.g. gcSweep, readGCStats_m). This change adds go:systemstack to any function that acquires the heap lock or adds a systemstack(func() { ... }) around the critical section, where appropriate. It also documents the invariant on (*mheap).lock directly and updates repetitive documentation to refer to that comment. Fixes #32105. Change-Id: I702b1290709c118b837389c78efde25c51a2cafb Reviewed-on: https://go-review.googlesource.com/c/go/+/177857 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Austin Clements <austin@google.com>	2019-05-24 15:34:57 +00:00
Tamir Duberstein	9c86eae384	runtime: resolve latent TODOs These were added in https://go-review.googlesource.com/1224; according to austin@google.com these annotations are not valuable - resolving by removing the TODOs. Change-Id: Icf3f21bc385cac9673ba29f0154680e970cf91f2 Reviewed-on: https://go-review.googlesource.com/c/go/+/176899 Reviewed-by: Austin Clements <austin@google.com>	2019-05-13 19:33:29 +00:00
Michael Anthony Knyszek	fe67ea32bf	runtime: add background scavenger This change adds a background scavenging goroutine whose pacing is determined when the heap goal changes. The scavenger is paced to use at most 1% of the mutator's time for most systems. Furthermore, the scavenger's pacing is computed based on the estimated number of scavengable huge pages to take advantage of optimizations provided by the OS. The purpose of this scavenger is to deal with a shrinking heap: if the heap goal is falling over time, the scavenger should kick in and start returning free pages from the heap to the OS. Also, now that we have a pacing system, the credit system used by scavengeLocked has become redundant. Replace it with a mechanism which only scavenges on the allocation path if it makes sense to do so with respect to the new pacing system. Fixes #30333. Change-Id: I6203f8dc84affb26c3ab04528889dd9663530edc Reviewed-on: https://go-review.googlesource.com/c/go/+/142960 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2019-05-09 16:21:43 +00:00
Austin Clements	5c22842cf2	runtime: introduce effective GOGC, eliminate heap_marked hacks Currently, the pacer assumes the goal growth ratio is always exactly GOGC/100. But sometimes this isn't the case, like when the heap is very small (limited by heapminimum). So to placate the pacer, we lie about the value of heap_marked in such situations. Right now, these two lies make a truth, but GOGC is about to get more complicated with the introduction of heap limits. Rather than introduce more lies into the system to handle this, introduce the concept of an "effective GOGC", which is the GOGC we're actually using for pacing (we'll need this concept anyway with heap limits). This commit changes the pacer to use the effective GOGC rather than the user-set GOGC. This way, we no longer need to lie about heap_marked because its true value is incorporated into the effective GOGC. Change-Id: I5b005258f937ab184ffcb5e76053abd798d542bd Reviewed-on: https://go-review.googlesource.com/c/go/+/66092 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2019-03-05 23:08:18 +00:00
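One way to picture the "effective GOGC" concept from the commit above: compute the heap goal from the user-set GOGC, apply the minimum heap size, and then derive the growth percentage that goal actually implies. This is a sketch under that interpretation only; `effectiveGOGC` and its parameters are hypothetical names, and the runtime's own computation may differ in detail.

```go
package main

import "fmt"

// effectiveGOGC returns the growth percentage actually used for pacing.
// heapMarked is the live heap after the last GC, gogc is the user-set
// GOGC value, and heapMinimum is the smallest heap goal allowed.
func effectiveGOGC(heapMarked uint64, gogc int, heapMinimum uint64) float64 {
	if heapMarked == 0 || gogc < 0 {
		// Nothing to derive: fall back to the user-set value.
		return float64(gogc)
	}
	goal := heapMarked + heapMarked*uint64(gogc)/100
	if goal < heapMinimum {
		goal = heapMinimum // a very small heap is limited by the minimum
	}
	return float64(goal-heapMarked) / float64(heapMarked) * 100
}

func main() {
	const MiB = 1 << 20
	// Small heap: the minimum dominates, so the effective value exceeds GOGC.
	fmt.Println(effectiveGOGC(1*MiB, 100, 4*MiB))
	// Large heap: the effective value equals the user-set GOGC.
	fmt.Println(effectiveGOGC(100*MiB, 100, 4*MiB))
}
```

Because the derived percentage already accounts for the minimum, the pacer can use the true value of heap_marked rather than an adjusted one, which is the simplification the commit describes.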
Austin Clements	4a7d5aa30b	runtime: don't use GOGC in minimum sweep distance Currently, the minimum sweep distance is 1MB * GOGC/100. It's been this way since it was introduced in CL 13043 with no justification. Since there seems to be no good reason to scale the minimum sweep distance by GOGC, make the minimum sweep distance just 1MB. Change-Id: I5320574a23c0eec641e346282aab08a3bbb057da Reviewed-on: https://go-review.googlesource.com/c/go/+/66091 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2019-03-05 23:08:16 +00:00
Austin Clements	7da03b9fbb	runtime: compute goal first in gcSetTriggerRatio This slightly rearranges gcSetTriggerRatio to compute the goal before computing the other controls. This will simplify implementing the heap limit, which needs to control the absolute goal and flow the rest of the control parameters from this. For #16843. Change-Id: I46b7c1f8b6e4edbee78930fb093b60bd1a03d75e Reviewed-on: https://go-review.googlesource.com/c/go/+/46750 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2019-03-05 23:08:15 +00:00
Austin Clements	7ac0a8bc39	runtime: remove unused gcTriggerAlways This was used during the implementation of concurrent runtime.GC() but now there's nothing that triggers GC unconditionally. Remove this trigger type and simplify (gcTrigger).test() accordingly. Change-Id: I17a893c2ed1f661b8146d7783d529f71735c9105 Reviewed-on: https://go-review.googlesource.com/c/go/+/66090 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2019-03-05 23:08:13 +00:00
Austin Clements	95a6f112c6	runtime: work around "P has cached GC work" failures We still don't understand what's causing there to be remaining GC work when we enter mark termination, but in order to move forward on this issue, this CL implements a work-around for the problem. If debugCachedWork is false, this CL does a second check for remaining GC work as soon as it stops the world for mark termination. If it finds any work, it starts the world again and re-enters concurrent mark. This will increase STW time by a small amount proportional to GOMAXPROCS, but fixes a serious correctness issue. This works around #27993. Change-Id: Ia23b85dd6c792ee8d623428bd1a3115631e387b8 Reviewed-on: https://go-review.googlesource.com/c/156140 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2019-01-04 01:24:58 +00:00
Austin Clements	8e6396115e	runtime: don't spin in checkPut if non-preemptible Currently it's possible for the runtime to deadlock if checkPut is called in a non-preemptible context. In this case, checkPut may spin, so it won't leave the non-preemptible context, but the thread running gcMarkDone needs to preempt all of the goroutines before it can release the checkPut spin loops. Fix this by returning from checkPut if it's called under any of the conditions that would prevent gcMarkDone from preempting it. In this case, it leaves a note behind that this happened; if the runtime does later detect left-over work it can at least indicate that it was unable to catch it in the act. For #27993. Updates #29385 (may fix it). Change-Id: Ic71c10701229febb4ddf8c104fb10e06d84b122e Reviewed-on: https://go-review.googlesource.com/c/156017 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2019-01-02 20:21:01 +00:00
Austin Clements	2d00007bdb	runtime: flush on every write barrier while debugging Currently, we flush the write barrier buffer on every write barrier once throwOnGCWork is set, but not during the mark completion algorithm itself. As seen in recent failures like https://build.golang.org/log/317369853b803b4ee762b27653f367e1aa445ac1 by the time we actually catch a late gcWork put, the write barrier buffer is full-size again. As a result, we're probably not catching the actual problematic write barrier, which is probably somewhere in the buffer. Fix this by using the gcWork pause generation to also keep the write barrier buffer small between when mark completion flushes it and when mark completion is done. For #27993. Change-Id: I77618169441d42a7d562fb2a998cfaa89891edb2 Reviewed-on: https://go-review.googlesource.com/c/154638 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>	2018-12-18 15:17:50 +00:00
Austin Clements	ccbca561ef	runtime: capture pause stack for late gcWork put debugging This captures the stack trace where mark completion observed that each P had no work, and then dumps this if that P later discovers more work. Hopefully this will help bound where the work was created. For #27993. Change-Id: I4f29202880d22c433482dc1463fb50ab693b6de6 Reviewed-on: https://go-review.googlesource.com/c/154599 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>	2018-12-17 21:24:19 +00:00
Michael Anthony Knyszek	578667f4b5	runtime: enable preemption of mark termination goroutine A mark worker goroutine may attempt to preempt the mark termination goroutine to scan its stack while the mark termination goroutine is trying to preempt that worker to flush its work buffer, in rare cases. This change makes it so that, like a worker goroutine, the mark termination goroutine stack is preemptible while it is on the system stack, attempting to preempt others. Fixes #28695. Change-Id: I23bbb191f4fdad293e8a70befd51c9175f8a1171 Reviewed-on: https://go-review.googlesource.com/c/153077 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-12-07 16:42:49 +00:00
Austin Clements	438b9544a0	runtime: check more work flushing races This adds several new checks to help debug #27993. It adds a mechanism for freezing write barriers and gcWork puts during the mark completion algorithm. This way, if we do detect mark completion, we can catch any puts that happened during the completion algorithm. Based on build dashboard failures, this seems to be the window of time when these are happening. This also double-checks that all work buffers are empty immediately upon entering mark termination (much earlier than the current check). This is unlikely to trigger based on the current failures, but is a good safety net. Change-Id: I03f56c48c4322069e28c50fbc3c15b2fee2130c2 Reviewed-on: https://go-review.googlesource.com/c/151797 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>	2018-11-29 22:08:05 +00:00
Austin Clements	9098d1d854	runtime: debug code to catch bad gcWork.puts This adds a debug check to throw immediately if any pointers are added to the gcWork buffer after the mark completion barrier. The intent is to catch the source of the cached GC work that occasionally produces "P has cached GC work at end of mark termination" failures. The result should be that we get "throwOnGCWork" throws instead of "P has cached GC work at end of mark termination" throws, but with useful stack traces. This should be reverted before the release. I've been unable to reproduce this issue locally, but this issue appears fairly regularly on the builders, so the intent is to catch it on the builders. This probably slows down the GC slightly. For #27993. Change-Id: I5035e14058ad313bfbd3d68c41ec05179147a85c Reviewed-on: https://go-review.googlesource.com/c/149969 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-11-21 16:28:17 +00:00
Austin Clements	b8fad4b33d	runtime: improve "P has cached GC work" debug info For #27993. Change-Id: I20127e8a9844c2c488f38e1ab1f8f5a27a5df03e Reviewed-on: https://go-review.googlesource.com/c/149968 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-11-21 16:20:19 +00:00
Austin Clements	5333550bdc	runtime: implement efficient page reclaimer When we attempt to allocate an N page span (either for a large allocation or when an mcentral runs dry), we first try to sweep spans to release N pages. Currently, this can be extremely expensive: sweeping a span to emptiness is the hardest thing to ask for and the sweeper generally doesn't know where to even look for potentially fruitful results. Since this is on the critical path of many allocations, this is unfortunate. This CL changes how we reclaim empty spans. Instead of trying lots of spans and hoping for the best, it uses the newly introduced span marks to efficiently find empty spans. The span marks (and in-use bits) are in a dense bitmap, so these spans can be found with an efficient sequential memory scan. This approach can scan for unmarked spans at about 300 GB/ms and can free unmarked spans at about 32 MB/ms. We could probably significantly improve the rate at which it can free unmarked spans, but that's a separate issue. Like the current reclaimer, this is still linear in the number of spans that are swept, but the constant factor is now so vanishingly small that it doesn't matter. The benchmark in #18155 demonstrates both significant page reclaiming delays, and object reclaiming delays. With "-retain-count=20000000 -preallocate=true -loop-count=3", the benchmark demonstrates several page reclaiming delays on the order of 40ms. After this change, the page reclaims are insignificant. The longest sweeps are still ~150ms, but are object reclaiming delays. We'll address those in the next several CLs. Updates #18155. Fixes #21378 by completely replacing the logic that had that bug. Change-Id: Iad80eec11d7fc262d02c8f0761ac6998425c4064 Reviewed-on: https://go-review.googlesource.com/c/138959 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-11-15 19:27:11 +00:00
Austin Clements	ba1698e963	runtime: mark span when marking any object on the span This adds a mark bit for each span that is set if any objects on the span are marked. This will be used for sweeping. For #18155. The impact of this is negligible for most benchmarks, and < 1% for GC-heavy benchmarks. name old time/op new time/op delta Garbage/benchmem-MB=64-12 2.18ms ± 0% 2.20ms ± 1% +0.88% (p=0.000 n=16+18) (https://perf.golang.org/search?q=upload:20180928.1) name old time/op new time/op delta BinaryTree17-12 2.68s ± 1% 2.68s ± 1% ~ (p=0.707 n=17+19) Fannkuch11-12 2.28s ± 0% 2.39s ± 0% +4.95% (p=0.000 n=19+18) FmtFprintfEmpty-12 40.3ns ± 4% 39.4ns ± 2% -2.27% (p=0.000 n=17+18) FmtFprintfString-12 67.9ns ± 1% 68.3ns ± 1% +0.55% (p=0.000 n=18+19) FmtFprintfInt-12 75.7ns ± 1% 76.1ns ± 1% +0.44% (p=0.005 n=18+19) FmtFprintfIntInt-12 123ns ± 1% 121ns ± 1% -1.00% (p=0.000 n=18+18) FmtFprintfPrefixedInt-12 150ns ± 0% 148ns ± 0% -1.33% (p=0.000 n=16+13) FmtFprintfFloat-12 208ns ± 0% 204ns ± 0% -1.92% (p=0.000 n=13+17) FmtManyArgs-12 501ns ± 1% 498ns ± 0% -0.55% (p=0.000 n=19+17) GobDecode-12 6.24ms ± 0% 6.25ms ± 1% ~ (p=0.113 n=20+19) GobEncode-12 5.33ms ± 0% 5.29ms ± 1% -0.72% (p=0.000 n=20+18) Gzip-12 220ms ± 1% 218ms ± 1% -1.02% (p=0.000 n=19+19) Gunzip-12 35.5ms ± 0% 35.7ms ± 0% +0.45% (p=0.000 n=16+18) HTTPClientServer-12 77.9µs ± 1% 77.7µs ± 1% -0.30% (p=0.047 n=20+19) JSONEncode-12 8.82ms ± 0% 8.93ms ± 0% +1.20% (p=0.000 n=18+17) JSONDecode-12 47.3ms ± 0% 47.0ms ± 0% -0.49% (p=0.000 n=17+18) Mandelbrot200-12 3.69ms ± 0% 3.68ms ± 0% -0.25% (p=0.000 n=19+18) GoParse-12 3.13ms ± 1% 3.13ms ± 1% ~ (p=0.640 n=20+20) RegexpMatchEasy0_32-12 76.2ns ± 1% 76.2ns ± 1% ~ (p=0.818 n=20+19) RegexpMatchEasy0_1K-12 226ns ± 0% 226ns ± 0% -0.22% (p=0.001 n=17+18) RegexpMatchEasy1_32-12 71.9ns ± 1% 72.0ns ± 1% ~ (p=0.653 n=18+18) RegexpMatchEasy1_1K-12 355ns ± 1% 356ns ± 1% ~ (p=0.160 n=18+19) RegexpMatchMedium_32-12 106ns ± 1% 106ns ± 1% ~ (p=0.325 n=17+20) RegexpMatchMedium_1K-12 31.1µs ± 2% 31.2µs ± 0% +0.59% (p=0.007 n=19+15) RegexpMatchHard_32-12 1.54µs ± 2% 1.53µs ± 2% -0.78% (p=0.021 n=17+18) RegexpMatchHard_1K-12 46.0µs ± 1% 45.9µs ± 1% -0.31% (p=0.025 n=17+19) Revcomp-12 391ms ± 1% 394ms ± 2% +0.80% (p=0.000 n=17+19) Template-12 59.9ms ± 1% 59.9ms ± 1% ~ (p=0.428 n=20+19) TimeParse-12 304ns ± 1% 312ns ± 0% +2.88% (p=0.000 n=20+17) TimeFormat-12 318ns ± 0% 326ns ± 0% +2.64% (p=0.000 n=20+17) (https://perf.golang.org/search?q=upload:20180928.2) Change-Id: I336b9bf054113580a24103192904c8c76593e90e Reviewed-on: https://go-review.googlesource.com/c/138958 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>	2018-11-15 19:27:09 +00:00
Brad Fitzpatrick	3813edf26e	all: use "reports whether" consistently in the few places that didn't Go documentation style for boolean funcs is to say: // Foo reports whether ... func Foo() bool (rather than "returns true if") This CL also replaces 4 uses of "iff" with the same "reports whether" wording, which doesn't lose any meaning, and will prevent people from sending typo fixes when they don't realize it's "if and only if". In the past I think we've had the typo CLs updated to just say "reports whether". So do them all at once. (Inspired by the addition of another "returns true if" in CL 146938 in fd_plan9.go) Created with: $ perl -i -npe 's/returns true iff/reports whether/' $(git grep -l "returns true iff" | grep -v vendor) $ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" | grep -v vendor) Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f Reviewed-on: https://go-review.googlesource.com/c/147037 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-11-02 22:47:58 +00:00
Austin Clements	007e8a2fbd	runtime: rename gosweepdone to isSweepDone and document better gosweepdone is another anachronism from the time when the sweeper was implemented in C. Rename it to "isSweepDone" for the modern era. Change-Id: I8472aa6f52478459c3f2edc8a4b2761e73c4c2dd Reviewed-on: https://go-review.googlesource.com/c/138658 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-10-09 18:32:08 +00:00
Austin Clements	f3bb4cbfd5	runtime: eliminate gosweepone gosweepone just switches to the system stack and calls sweepone. sweepone doesn't need to run on the system stack, so this is pretty pointless. Historically, this was necessary because the sweeper was written in C and hence needed to run on the system stack. gosweepone was the function that Go code (specifically, bgsweep) used to call into the C sweeper implementation. This probably became unnecessary in 2014 with CL golang.org/cl/167540043, which ported the sweeper to Go. This CL changes all callers of gosweepone to call sweepone and eliminates gosweepone. Change-Id: I26b8ef0c7d060b4c0c5dedbb25ecfc936acc7269 Reviewed-on: https://go-review.googlesource.com/c/138657 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-10-09 18:20:40 +00:00
Igor Zhilianin	f90e89e675	all: fix a bunch of misspellings Change-Id: If2954bdfc551515403706b2cd0dde94e45936e08 GitHub-Last-Rev: `d4cfc41a55` GitHub-Pull-Request: golang/go#28049 Reviewed-on: https://go-review.googlesource.com/c/140299 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-10-06 15:40:03 +00:00
Austin Clements	0906d648aa	runtime: eliminate gchelper mechanism Now that we do no mark work during mark termination, we no longer need the gchelper mechanism. Updates #26903. Updates #17503. Change-Id: Ie94e5c0f918cfa047e88cae1028fece106955c1b Reviewed-on: https://go-review.googlesource.com/c/134785 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-10-02 20:35:38 +00:00
Austin Clements	550dfc8ae1	runtime: eliminate work.markrootdone and second root marking pass Before STW and concurrent GC were unified, there could be either one or two root marking passes per GC cycle. There were several tasks we had to make sure happened once and only once (whether that was at the beginning of concurrent mark for concurrent GC or during mark termination for STW GC). We kept track of this in work.markrootdone. Now that STW and concurrent GC both use the concurrent marking code and we've eliminated all work done by the second root marking pass, we only ever need a single root marking pass. Hence, we can eliminate work.markrootdone and all of the code that's conditional on it. Updates #26903. Change-Id: I654a0f5e21b9322279525560a31e64b8d33b790f Reviewed-on: https://go-review.googlesource.com/c/134784 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-10-02 20:35:37 +00:00
Austin Clements	873bd47dfb	runtime: flush mcaches lazily Currently, all mcaches are flushed during STW mark termination as a root marking job. This is currently necessary because all spans must be out of these caches before sweeping begins to avoid races with allocation and to ensure the spans are in the state expected by sweeping. We do it as a root marking job because mcache flushing is somewhat expensive and O(GOMAXPROCS) and this parallelizes the work across the Ps. However, it's also the last remaining root marking job performed during mark termination. This CL moves mcache flushing out of mark termination and performs it lazily. We keep track of the last sweepgen at which each mcache was flushed and as each P is woken from STW, it observes that its mcache is out-of-date and flushes it. This introduces a complication for spans cached in stale mcaches. These may now be observed by background or proportional sweeping or when attempting to add a finalizer, but aren't in a stable state. For example, they are likely to be on the wrong mcentral list. To fix this, this CL extends the sweepgen protocol to also capture whether a span is cached and, if so, whether or not its cache is stale. This protocol blocks asynchronous sweeping from touching cached spans and makes it the responsibility of mcache flushing to sweep the flushed spans. This eliminates the last mark termination root marking job, which means we can now eliminate that entire infrastructure. Updates #26903. This implements lazy mcache flushing. Change-Id: Iadda7aabe540b2026cffc5195da7be37d5b4125e Reviewed-on: https://go-review.googlesource.com/c/134783 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-10-02 20:35:35 +00:00
Austin Clements	457c8f4fe9	runtime: eliminate blocking GC work drains Now work.helperDrainBlock is always false, so we can remove it and code paths that only ran when it was true. That means we no longer use the gcDrainBlock mode of gcDrain, so we can eliminate that. That means we no longer use gcWork.get, so we can eliminate that. That means we no longer use getfull, so we can eliminate that. Updates #26903. This is a follow-up to unifying STW GC and concurrent GC. Change-Id: I8dbcf8ce24861df0a6149e0b7c5cd0eadb5c13f6 Reviewed-on: https://go-review.googlesource.com/c/134782 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-10-02 20:35:34 +00:00
Austin Clements	143b13ae82	runtime: clean up remaining mark work check Now that STW GC marking is unified with concurrent marking, there should never be mark work remaining in mark termination. Hence, we can make that check unconditional. Updates #26903. This is a follow-up to unifying STW GC and concurrent GC. Change-Id: I43a21df5577635ab379c397a7405ada68d331e03 Reviewed-on: https://go-review.googlesource.com/c/134781 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-10-02 20:35:32 +00:00
Austin Clements	1678b2c580	runtime: implement STW GC in terms of concurrent GC Currently, STW GC works very differently from concurrent GC. The largest difference is that in concurrent GC, all marking work is done by background mark workers during the mark phase, while in STW GC, all marking work is done by gchelper during the mark termination phase. This is a consequence of the evolution of Go's GC from a STW GC by incrementally moving work from STW mark termination into concurrent mark. However, at this point, the STW code paths exist only as a debugging mode. Having separate code paths for this increases the maintenance burden and complexity of the garbage collector. At the same time, these code paths aren't tested nearly as well, making it far more likely that they will bit-rot. This CL reverses the relationship between STW GC and concurrent GC by re-implementing STW GC in terms of concurrent GC. This builds on the new scheduler support for disabling user goroutine scheduling. During sweep termination, it disables user scheduling, so when the GC starts the world again for concurrent mark, it's really only "concurrent" with itself. There are several code paths that were specific to STW GC that are now vestigial. We'll remove these in the follow-up CLs. Updates #26903. Change-Id: Ia3883d2fcf7ab1d89bdc9c8ee54bf9bffb32c096 Reviewed-on: https://go-review.googlesource.com/c/134780 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-10-02 20:35:31 +00:00
Austin Clements	198440cc3d	runtime: remove GODEBUG=gcrescanstacks=1 mode Currently, setting GODEBUG=gcrescanstacks=1 enables a debugging mode where the garbage collector re-scans goroutine stacks during mark termination. This was introduced in Go 1.8 to debug the hybrid write barrier, but I don't think we ever used it. Now it's one of the last sources of mark work during mark termination. This CL removes it. Updates #26903. This is preparation for unifying STW GC and concurrent GC. Updates #17503. Change-Id: I6ae04d3738aa9c448e6e206e21857a33ecd12acf Reviewed-on: https://go-review.googlesource.com/c/134777 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-10-02 20:35:27 +00:00
Austin Clements	ecc365960b	runtime: avoid using STW GC mechanism for checkmarks mode Currently, checkmarks mode uses the full STW GC infrastructure to perform mark checking. We're about to remove that infrastructure and, furthermore, since checkmarks is about doing the simplest thing possible to check concurrent GC, it's valuable for it to be simpler. Hence, this CL makes checkmarks even simpler by making it non-parallel and divorcing it from the STW GC infrastructure (including the gchelper mechanism). Updates #26903. This is preparation for unifying STW GC and concurrent GC. Change-Id: Iad21158123e025e3f97d7986d577315e994bd43e Reviewed-on: https://go-review.googlesource.com/c/134776 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-10-02 20:35:26 +00:00
Austin Clements	918ed88e47	runtime: remove gcStart's mode argument This argument is always gcBackgroundMode since only debug.gcstoptheworld can trigger a STW GC at this point. Remove the unnecessary argument. Updates #26903. This is preparation for unifying STW GC and concurrent GC. Change-Id: Icb4ba8f10f80c2b69cf51a21e04fa2c761b71c94 Reviewed-on: https://go-review.googlesource.com/c/134775 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-10-02 20:35:24 +00:00
Austin Clements	e25ef35254	runtime: don't disable GC work caching during mark termination Currently, we disable GC work caching during mark termination. This is no longer necessary with the new mark completion detection because 1. There's no way for any of the GC mark termination helpers to have any real work queued and, 2. Mark termination has to explicitly flush every P's buffers anyway in order to flush Ps that didn't run a GC mark termination helper. Hence, remove the code that disposes gcWork buffers during mark termination. Updates #26903. This is a follow-up to eliminating mark 2. Change-Id: I81f002ee25d5c10f42afd39767774636519007f9 Reviewed-on: https://go-review.googlesource.com/c/134320 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-10-02 20:35:23 +00:00
Austin Clements	d398dbdfc3	runtime: eliminate gcBlackenPromptly mode Now that there is no mark 2 phase, gcBlackenPromptly is no longer used. Updates #26903. This is a follow-up to eliminating mark 2. Change-Id: Ib9c534f21b36b8416fcf3cab667f186167b827f8 Reviewed-on: https://go-review.googlesource.com/c/134319 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-10-02 20:35:21 +00:00

363 Commits