Introduce GOOS=ios for iOS systems. GOOS=ios matches the "darwin"
build tag, just as GOOS=android matches "linux" and GOOS=illumos
matches "solaris". Only ios/arm64 is supported (ios/amd64 is
not).
GOOS=ios and GOOS=darwin remain essentially the same at this
point. They will diverge at a later time, to differentiate macOS
and iOS.
Uses of GOOS=="darwin" are changed to (GOOS=="darwin" || GOOS=="ios"),
except where the condition clearly means macOS (e.g. GOOS=="darwin" &&
GOARCH=="amd64"), in which case it remains GOOS=="darwin".
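As a rough sketch of the rewrite pattern (the helper names below are made up
for illustration and are not from this CL):

    // isApplePlatform reports whether goos names an Apple OS under the new scheme.
    func isApplePlatform(goos string) bool {
        return goos == "darwin" || goos == "ios"
    }

    // isMacOS keeps the darwin-only check where the intent is clearly macOS,
    // e.g. when paired with GOARCH == "amd64".
    func isMacOS(goos, goarch string) bool {
        return goos == "darwin" && goarch == "amd64"
    }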
Updates #38485.
Change-Id: I4faacdc1008f42434599efb3c3ad90763a83b67c
Reviewed-on: https://go-review.googlesource.com/c/go/+/254740
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change deletes the old mcentral implementation from the code base
and the newMCentralImpl feature flag along with it.
Updates #37487.
Change-Id: Ibca8f722665f0865051f649ffe699cbdbfdcfcf2
Reviewed-on: https://go-review.googlesource.com/c/go/+/221184
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
This constant does not make it into DWARF because it is an ideal
constant larger than maxint (1<<63-1). DWARF has no way to represent
signed values that large. Define a different typed constant that
is unsigned and so can represent this constant properly.
Viewcore needs this constant to interrogate the heap data structures.
In addition, the sign of arenaBaseOffset changed in 1.15, and providing
a new name lets viewcore detect the sign change easily.
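A minimal sketch of the idea; the value shown is the amd64 offset and both
names follow the description above, so treat the exact spelling as an
assumption:

    // As an ideal (untyped) constant, this exceeds maxint64, so it has no
    // DWARF representation.
    const arenaBaseOffset = 0xffff800000000000

    // A typed, unsigned twin of the same value that DWARF can represent
    // and tools like viewcore can read.
    const arenaBaseOffsetUintptr = uintptr(arenaBaseOffset)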
Change-Id: I4274a2f6e79ebbf1411e85d64758fac1672fb96b
Reviewed-on: https://go-review.googlesource.com/c/go/+/240198
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
The page sweeper depends on spans being marked if any object in the
span is marked, but currently only greyobject does this.
gcmarknewobject and wbBufFlush1 also mark objects, but neither set
span marks. As a result, if there are live objects on a span, but
they're all marked via allocation or write barriers, then the span
itself won't be marked and the page reclaimer will free the span,
ultimately leading to memory corruption when the memory for those live
allocations gets reused.
Fix this by making gcmarknewobject and wbBufFlush1 also mark pages.
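Roughly, the fix adds a span (page) mark next to the object mark in both
paths; a sketch, assuming the pageMarks bitmap and pageIndexOf helper
already used by greyobject and the page reclaimer:

    // After marking the object itself in gcmarknewobject or wbBufFlush1:
    arena, pageIdx, pageMask := pageIndexOf(span.base())
    if arena.pageMarks[pageIdx]&pageMask == 0 {
        atomic.Or8(&arena.pageMarks[pageIdx], pageMask)
    }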
No test because I have no idea how to reliably (or even unreliably)
trigger this.
Fixes #39432.
Performance is a wash or very slightly worse. I benchmarked the
gcmarknewobject and wbBufFlush1 changes independently and both showed
a slight performance improvement, so I'm going to call this noise.
name old time/op new time/op delta
BiogoIgor 15.9s ± 2% 15.9s ± 2% ~ (p=0.758 n=25+25)
BiogoKrishna 15.7s ± 3% 15.7s ± 3% ~ (p=0.382 n=21+21)
BleveIndexBatch100 4.94s ± 3% 5.07s ± 4% +2.63% (p=0.000 n=25+25)
CompileTemplate 204ms ± 1% 205ms ± 1% +0.43% (p=0.000 n=21+23)
CompileUnicode 77.8ms ± 1% 78.1ms ± 1% ~ (p=0.130 n=23+23)
CompileGoTypes 731ms ± 1% 733ms ± 1% +0.30% (p=0.006 n=22+22)
CompileCompiler 3.64s ± 2% 3.65s ± 3% ~ (p=0.179 n=24+25)
CompileSSA 8.44s ± 1% 8.46s ± 1% +0.30% (p=0.003 n=22+23)
CompileFlate 132ms ± 1% 133ms ± 1% ~ (p=0.098 n=22+22)
CompileGoParser 164ms ± 1% 164ms ± 1% +0.37% (p=0.000 n=21+23)
CompileReflect 455ms ± 1% 457ms ± 2% +0.50% (p=0.002 n=20+22)
CompileTar 182ms ± 2% 182ms ± 1% ~ (p=0.382 n=22+22)
CompileXML 245ms ± 3% 245ms ± 1% ~ (p=0.070 n=21+23)
CompileStdCmd 16.5s ± 2% 16.5s ± 3% ~ (p=0.486 n=23+23)
FoglemanFauxGLRenderRotateBoat 12.9s ± 1% 13.0s ± 1% +0.97% (p=0.000 n=21+24)
FoglemanPathTraceRenderGopherIter1 18.6s ± 1% 18.7s ± 0% ~ (p=0.083 n=23+24)
GopherLuaKNucleotide 28.4s ± 1% 29.3s ± 1% +2.84% (p=0.000 n=25+25)
MarkdownRenderXHTML 252ms ± 0% 251ms ± 1% -0.50% (p=0.000 n=23+24)
Tile38WithinCircle100kmRequest 516µs ± 2% 516µs ± 2% ~ (p=0.763 n=24+25)
Tile38IntersectsCircle100kmRequest 689µs ± 2% 689µs ± 2% ~ (p=0.617 n=24+24)
Tile38KNearestLimit100Request 608µs ± 1% 606µs ± 2% -0.35% (p=0.030 n=19+22)
[Geo mean] 522ms 524ms +0.41%
https://perf.golang.org/search?q=upload:20200606.4
Change-Id: I8b331f310dbfaba0468035f207467c8403005bf5
Reviewed-on: https://go-review.googlesource.com/c/go/+/236817
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Currently maxOffAddr is defined in terms of the whole 64-bit address
space, assuming that it's all supported, by using ^uintptr(0) as the
maximal address in the offset space. In reality, the maximal address in
the offset space is (1<<heapAddrBits)-1 because we don't have more than
that actually available to us on a given platform.
On most platforms this is fine, because arenaBaseOffset is just
connecting two segments of address space, but on AIX we use it as an
actual offset for the starting address of the available address space,
which is limited. This means using ^uintptr(0) as the maximal address in
the offset address space causes wrap-around, especially when we just
want to represent a range approximately like [addr, infinity), which
today we do by using maxOffAddr.
To fix this, we define maxOffAddr more appropriately, in terms of
(1<<heapAddrBits)-1.
This change also redefines arenaBaseOffset to not be the negation of the
virtual address corresponding to address zero in the virtual address
space, but instead directly as the virtual address corresponding to
zero. This matches the existing documentation more closely and makes the
logic around arenaBaseOffset decidedly simpler, especially when trying
to reason about its use on AIX.
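A sketch of the new definition in terms of the offset address space (names
as described above; treat the exact expression as an approximation):

    // maxOffAddr is the largest address in the offset address space:
    // (1<<heapAddrBits)-1 translated by arenaBaseOffset, truncated to uintptr.
    var maxOffAddr = offAddr{(((1 << heapAddrBits) - 1) + arenaBaseOffset) & uintptrMask}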
Fixes #38966.
Change-Id: I1336e5036a39de846f64cc2d253e8536dee57611
Reviewed-on: https://go-review.googlesource.com/c/go/+/233497
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Currently linearAlloc manages an exclusive "end" address for the top of
its reserved space. While unlikely for a linearAlloc to be allocated
with an "end" address hitting the top of the address space, it is
possible and could lead to overflow.
Avoid overflow by defensively chopping off the last byte of the
linearAlloc if it bumps up against the top of the address space. In
practice, this means that if 32-bit platforms map the top of the address
space and use the linearAlloc to acquire arenas, the top arena will not
be usable.
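A sketch of what the defensive check might look like in linearAlloc's init
(simplified; field names assumed):

    func (l *linearAlloc) init(base, size uintptr) {
        if base+size < base {
            // Chop off the last byte so the exclusive end address
            // cannot wrap around the top of the address space.
            size -= 1
        }
        l.next, l.mapped = base, base
        l.end = base + size
    }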
Fixes #35954.
Change-Id: I512cddcd34fd1ab15cb6ca92bbf899fc1ef22ff6
Reviewed-on: https://go-review.googlesource.com/c/go/+/231338
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Currently mcentral is implemented as a couple of linked lists of spans
protected by a lock. Unfortunately this design leads to significant lock
contention.
The span ownership model is also confusing and complicated. In-use spans
jump between being owned by multiple sources, generally some combination
of a gcSweepBuf, a concurrent sweeper, an mcentral or an mcache.
First, to address contention, this change replaces those linked lists
with gcSweepBufs, which have an atomic fast path. Then, we change the
ownership model: a span may be simultaneously owned only by an mcentral
and the page reclaimer. Otherwise, an mcentral (which now consists of
sweep bufs), a sweeper, or an mcache is the sole owner of a span at
any given time. This dramatically simplifies reasoning about span
ownership in the runtime.
As a result of this new ownership model, sweeping is now driven by
walking over the mcentrals rather than having its own global list of
spans. Because we no longer have a global list and we traditionally
haven't used the mcentrals for large object spans, we no longer have
anywhere to put large objects. So, this change also keeps large object
spans in the appropriate mcentral lists.
In terms of the static lock ranking, we add the spanSet spine locks in
pretty much the same place as the mcentral locks, since they have the
potential to be manipulated both on the allocation and sweep paths, like
the mcentral locks.
This new implementation is turned on by default via a feature flag
called go115NewMCentralImpl.
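A sketch of the new mcentral layout (the old linked lists remain alongside
it until the flag and old implementation are removed; the spanSet type is
the one referenced in the lock-ranking note above):

    type mcentral struct {
        spanclass spanClass

        // partial and full hold spans with and without free objects,
        // split into swept and unswept sets by sweepgen parity.
        partial [2]spanSet
        full    [2]spanSet
    }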
Benchmark results for 1 KiB allocation throughput (5 runs each):
name \ MiB/s go113 go114 gotip gotip+this-patch
AllocKiB-1 1.71k ± 1% 1.68k ± 1% 1.59k ± 2% 1.71k ± 1%
AllocKiB-2 2.46k ± 1% 2.51k ± 1% 2.54k ± 1% 2.93k ± 1%
AllocKiB-4 4.27k ± 1% 4.41k ± 2% 4.33k ± 1% 5.01k ± 2%
AllocKiB-8 4.38k ± 3% 5.24k ± 1% 5.46k ± 1% 8.23k ± 1%
AllocKiB-12 4.38k ± 3% 4.49k ± 1% 5.10k ± 1% 10.04k ± 0%
AllocKiB-16 4.31k ± 1% 4.14k ± 3% 4.22k ± 0% 10.42k ± 0%
AllocKiB-20 4.26k ± 1% 3.98k ± 1% 4.09k ± 1% 10.46k ± 3%
AllocKiB-24 4.20k ± 1% 3.97k ± 1% 4.06k ± 1% 10.74k ± 1%
AllocKiB-28 4.15k ± 0% 4.00k ± 0% 4.20k ± 0% 10.76k ± 1%
Fixes #37487.
Change-Id: I92d47355acacf9af2c41bf080c08a8c1638ba210
Reviewed-on: https://go-review.googlesource.com/c/go/+/221182
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Currently markrootSpans, the scanning routine which scans span specials
(particularly finalizers) as roots, uses sweepSpans to shard work and
find spans to mark.
However, as part of a future CL to change span ownership and how
mcentral works, we want to avoid having markrootSpans use the sweep bufs
to find specials, so in this change we introduce a new mechanism.
Much like for the page reclaimer, we set up a per-page bitmap where the
first page for a span is marked if the span contains any specials, and
unmarked if it has no specials. This bitmap is updated by addspecial,
removespecial, and during sweeping.
markrootSpans then shards this bitmap into mark work and markers iterate
over the bitmap looking for spans with specials to mark. Unlike the page
reclaimer, we don't need to use the pageInUse bits because having a
special implies that a span is in-use.
While in terms of computational complexity this design is technically
worse, because it needs to iterate over the mapped heap, in practice
this iteration is very fast (we can skip over large swathes of the heap
very quickly) and we only look at spans that have any specials at all,
rather than having to touch each span.
This new implementation of markrootSpans is behind a feature flag called
go115NewMarkrootSpans.
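A sketch of the bitmap update when a special is added (assuming a per-arena
pageSpecials bitmap analogous to the reclaimer's pageInUse/pageMarks):

    // Called from addspecial; cleared symmetrically from removespecial
    // and during sweeping.
    func spanHasSpecials(s *mspan) {
        arenaPage := (s.base() / pageSize) % pagesPerArena
        ai := arenaIndex(s.base())
        ha := mheap_.arenas[ai.l1()][ai.l2()]
        atomic.Or8(&ha.pageSpecials[arenaPage/8], uint8(1)<<(arenaPage%8))
    }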
Updates #37487.
Change-Id: I8ea07b6c11059f6d412fe419e0ab512d989377b8
Reviewed-on: https://go-review.googlesource.com/c/go/+/221178
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
A case that I missed in CL 205239: profilealloc can be called at
program startup if GOMAXPROCS is large enough.
Fixes #38474
Change-Id: I2f089fc6ec00c376680e1c0b8a2557b62789dd7f
Reviewed-on: https://go-review.googlesource.com/c/go/+/228420
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
I took some of the infrastructure from Austin's lock logging CR
https://go-review.googlesource.com/c/go/+/192704 (with deadlock
detection from the logs), and developed a setup to give static lock
ranking for runtime locks.
Static lock ranking establishes a documented total ordering among locks,
and then reports an error if the total order is violated. A violation can
occur when an actual deadlock happens (by acquiring a sequence of locks in
different orders), or when just one side of a possible deadlock occurs.
Lock ordering deadlocks cannot happen as long as the lock ordering is
followed.
Along the way, I found a deadlock involving the new timer code, which Ian fixed
via https://go-review.googlesource.com/c/go/+/207348, as well as two other
potential deadlocks.
See the constants at the top of runtime/lockrank.go to show the static
lock ranking that I ended up with, along with some comments. This is
great documentation of the current intended lock ordering when acquiring
multiple locks in the runtime.
I also added an array lockPartialOrder[] which shows and enforces the
current partial ordering among locks (which is embedded within the total
ordering). This is more specific about the dependencies among locks.
I don't try to check the ranking within a lock class with multiple locks
that can be acquired at the same time (e.g. the ranking when multiple
hchan locks are acquired).
Currently, I am doing a lockInit() call to set the lock rank of most
locks. Any lock that is not otherwise initialized is assumed to be a
leaf lock (a very high rank lock), so that eliminates the need to do
anything for a bunch of locks (including all architecture-dependent
locks). For two locks, root.lock and notifyList.lock (only in the
runtime/sema.go file), it is not as easy to do lock initialization, so
instead, I am passing the lock rank with the lock calls.
For Windows compilation, I needed to increase the StackGuard size from
896 to 928 because of the new lock-rank checking functions.
Checking of the static lock ranking is enabled by setting
GOEXPERIMENT=staticlockranking before doing a run.
To make sure that the static lock ranking code has no overhead in memory
or CPU when not enabled by GOEXPERIMENT, I changed 'go build/install' so
that it defines a build tag (with the same name) whenever any experiment
has been baked into the toolchain (by checking Expstring()). This allows
me to avoid increasing the size of the 'mutex' type when static lock
ranking is not enabled.
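A sketch of the two ways a rank gets attached to a lock, using names in the
spirit of runtime/lockrank.go (the exact rank identifiers are assumptions):

    // Most locks: rank assigned once at initialization.
    lockInit(&sched.lock, lockRankSched)

    // sema root and notifyList locks: rank passed at the call site instead.
    lockWithRank(&root.lock, lockRankRoot)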
Fixes #38029
Change-Id: I154217ff307c47051f8dae9c2a03b53081acd83a
Reviewed-on: https://go-review.googlesource.com/c/go/+/207619
Reviewed-by: Dan Scales <danscales@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
There are a handful of places where the runtime wants to round up the
result of a division. We just introduced a helper to do this. This CL
replaces all of the hand-coded round-ups (that I could find) with this
helper.
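The helper in question looks roughly like this:

    // divRoundUp returns ceil(n / a).
    func divRoundUp(n, a uintptr) uintptr {
        // a is generally a power of two, so the compiler can turn the
        // division into a shift.
        return (n + a - 1) / a
    }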
Change-Id: I465d152157ff0f3cad40c0aa57491e4f2de510ad
Reviewed-on: https://go-review.googlesource.com/c/go/+/224385
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Having an mcache field in both m and p is confusing, so remove it from m.
Always use the mcache field from p. Use the new variable mcache0 during
bootstrap.
Change-Id: If2cba9f8bb131d911d512b61fd883a86cf62cc98
Reviewed-on: https://go-review.googlesource.com/c/go/+/205239
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This commit changes the wording of a comment in malloc.go that describes
how span objects are zeroed, to make it clearer.
Change-Id: I07722df1e101af3cbf8680ad07437d4a230b0168
GitHub-Last-Rev: 0e909898c7
GitHub-Pull-Request: golang/go#37008
Reviewed-on: https://go-review.googlesource.com/c/go/+/217618
Reviewed-by: Austin Clements <austin@google.com>
Based on the riscv-go port.
Updates #27532
Change-Id: If522807a382130be3c8d40f4b4c1131d1de7c9e3
Reviewed-on: https://go-review.googlesource.com/c/go/+/204632
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
On AIX, addresses returned by mmap are between 0x0a00000000000000
and 0x0affffffffffffff. The previous solution to handle these large
addresses was to increase the arena size up to 60-bit addresses,
cf. CL 138736.
However, with the new page allocator, the 60-bit heap addresses are
causing huge memory allocations, especially by (s *pageAlloc).init. The
mmap and munmap syscalls dealing with these allocations reduce the
performance of every Go program.
In order to avoid these allocations, arenaBaseOffset is set to
0x0a00000000000000 and heap addresses are limited to 48 bits, as on other
operating systems.
Updates #35451
Change-Id: Ice916b8578f76703428ec12a82024147a7592bc0
Reviewed-on: https://go-review.googlesource.com/c/go/+/206841
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
mheap_.alloc currently accepts both a spanClass and a "large" parameter
indicating whether the allocation is large. These are redundant, since
spanClass.sizeclass() == 0 is an equivalent way to determine this and is
already used in mheap_.alloc. There are no places in the runtime where
the size class could be non-zero and large == true.
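In other words, callers can recover the flag from the span class itself,
e.g.:

    // A size class of 0 is reserved for large objects.
    large := spc.sizeclass() == 0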
Updates #35112.
Change-Id: Ie66facf8f0faca6f4cd3d20a8ac4bc259e11823d
Reviewed-on: https://go-review.googlesource.com/c/go/+/196639
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change defines a maximum supported physical page size and huge page
size in the runtime, based on the new page allocator's implementation, and
uses them where appropriate.
Furthermore, if the system exceeds the maximum supported huge page
size, we simply ignore it silently.
It also fixes a huge-page-related test that is only triggered by a
condition which is definitely wrong.
Finally, it adds a few TODOs related to code clean-up and supporting
larger huge page sizes.
Updates #35112.
Fixes #35431.
Change-Id: Ie4348afb6bf047cce2c1433576d1514720d8230f
Reviewed-on: https://go-review.googlesource.com/c/go/+/205937
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This change removes the old page allocator from the runtime.
Updates #35112.
Change-Id: Ib20e1c030f869b6318cd6f4288a9befdbae1b771
Reviewed-on: https://go-review.googlesource.com/c/go/+/195700
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change flips the oldPageAllocator constant, enabling the new page
allocator in the Go runtime.
Updates #35112.
Change-Id: I7fc8332af9fd0e43ce28dd5ebc1c1ce519ce6d0c
Reviewed-on: https://go-review.googlesource.com/c/go/+/201765
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change integrates all the bits and pieces of the new page allocator
into the runtime, behind a global constant.
Updates #35112.
Change-Id: I6696bde7bab098a498ab37ed2a2caad2a05d30ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/201764
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change adds a scavenger for the new page allocator along with
tests. The scavenger walks over the heap backwards once per GC, looking
for memory to scavenge. It walks across the heap without any lock held,
searching optimistically. If it finds what appears to be a scavenging
candidate it acquires the heap lock and attempts to verify it. Upon
verification it then scavenges.
Notably, unlike the old scavenger, it doesn't show any preference for
huge pages and instead follows a more strict last-page-first policy.
Updates #35112.
Change-Id: I0621ef73c999a471843eab2d1307ae5679dd18d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/195697
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
On iOS, the address space is not 48 bits as one might believe, since
it's arm64 hardware. In fact, all pointers are truncated to 33 bits, and
the OS only gives applications access to the range [1<<32, 2<<32).
While today this has no effect on the Go runtime, future changes which
care about address space size need this to be correct.
Updates #35112.
Change-Id: Id518a2298080f7e3d31cf7d909506a37748cc49a
Reviewed-on: https://go-review.googlesource.com/c/go/+/205758
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
This change renames the "round" function to the more appropriately named
"alignUp" which rounds an integer up to the next multiple of a power of
two.
This change also adds the alignDown function, which is almost like
alignUp but rounds down to the previous multiple of a power of two.
With these two functions, we also replace manual rounding code with
them where we can.
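The two helpers are small; roughly:

    // alignUp rounds n up to a multiple of a. a must be a power of 2.
    func alignUp(n, a uintptr) uintptr {
        return (n + a - 1) &^ (a - 1)
    }

    // alignDown rounds n down to a multiple of a. a must be a power of 2.
    func alignDown(n, a uintptr) uintptr {
        return n &^ (a - 1)
    }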
Change-Id: Ie1487366280484dcb2662972b01b4f7135f72fec
Reviewed-on: https://go-review.googlesource.com/c/go/+/190618
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
This change adds physHugePageShift which is defined such that
1 << physHugePageShift == physHugePageSize. The purpose of this variable
is to avoid doing expensive divisions in key functions, such as
(*mspan).hugePages.
This change also does a sweep of any place we might do a division or mod
operation with physHugePageSize and turns it into bit shifts and other
bitwise operations.
Finally, this change adds a check to mallocinit which ensures that
physHugePageSize is always a power of two. osinit might choose to ignore
non-powers-of-two for the value and replace it with zero, but mallocinit
will fail if it's not a power of two (or zero). It also derives
physHugePageShift from physHugePageSize.
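A sketch of the mallocinit derivation and of the kind of rewrite applied
elsewhere (exact call sites and messages vary):

    // In mallocinit: physHugePageSize must be 0 or a power of two, and
    // physHugePageShift is derived from it.
    if physHugePageSize&(physHugePageSize-1) != 0 {
        throw("physHugePageSize not a power of 2")
    }
    if physHugePageSize != 0 {
        for 1<<physHugePageShift != physHugePageSize {
            physHugePageShift++
        }
    }

    // Elsewhere, divisions and mods become shifts and masks:
    //   n / physHugePageSize  ->  n >> physHugePageShift
    //   n % physHugePageSize  ->  n & (physHugePageSize - 1)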
This change helps improve the performance of most applications because
of how often (*mspan).hugePages is called.
Updates #32828.
Change-Id: I1a6db113d52d563f59ae8fd4f0e130858859e68f
Reviewed-on: https://go-review.googlesource.com/c/go/+/186598
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Overflow of the comparison caused very large (>=1<<32) allocations to
sometimes not get sampled at all. Use uintptr so the comparison will
never overflow.
Fixes #33342
Tested on the example in #33342. I don't want to check in a test that
needs that much memory, however.
Change-Id: I51fe77a9117affed8094da93c0bc5f445ac2d3d3
Reviewed-on: https://go-review.googlesource.com/c/go/+/188017
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Currently, GODEBUG=sbrk=1 mode aligns allocations by their type's
alignment. You would think this would be the right thing to do, but
because 64-bit fields are only 4-byte aligned right now (see #599),
this can cause a 64-bit field of an allocated object to be 4-byte
aligned, but not 8-byte aligned. If there is an atomic access to that
unaligned 64-bit field, it will crash.
This doesn't happen in normal allocation mode because the
size-segregated allocation and the current size classes will cause any
types larger than 8 bytes to be 8-byte aligned.
We fix this by making sbrk=1 mode use alignment based on the type's
size rather than its declared alignment. This matches how the tiny
allocator aligns allocations.
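A sketch of the size-based alignment choice in the sbrk=1 path of mallocgc,
assuming the allocation is then handed to persistentalloc as before:

    if debug.sbrk != 0 {
        align := uintptr(16)
        if typ != nil {
            // Align by size, like the tiny allocator, rather than by
            // typ.align, which can under-align 64-bit fields (#599).
            if size&7 == 0 {
                align = 8
            } else if size&3 == 0 {
                align = 4
            } else if size&1 == 0 {
                align = 2
            } else {
                align = 1
            }
        }
        return persistentalloc(size, align, &memstats.other_sys)
    }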
This was tested with
GOARCH=386 GODEBUG=sbrk=1 go test sync/atomic
This crashes with an unaligned access before this change, and passes
with this change.
This should be reverted when/if we fix #599.
Fixes #33159.
Change-Id: Ifc52c72c6b99c5d370476685271baa43ad907565
Reviewed-on: https://go-review.googlesource.com/c/go/+/186919
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
On most platforms newly-mapped memory is untouched, meaning the pages
backing the region haven't been faulted in yet. However, we mark this
memory as unscavenged, which means the background scavenger
aggressively "returns" this memory to the OS if the heap is small.
The only platform where newly-mapped memory is actually unscavenged (and
counts toward the application's RSS) is Windows, since
(*mheap).sysAlloc commits the reservation. Instead of making a special
case for Windows, I change the requirements a bit for a sysReserve'd
region. It must now be both sysMap'd and sysUsed'd, with sysMap being a
no-op on Windows. Comments about memory allocation have been updated to
include a more up-to-date mental model of which states a region of memory
may be in (at a very low level) and how to transition between these
states.
Now this means we can correctly mark newly-mapped heap memory as
scavenged on every platform, reducing the load on the background
scavenger early on in the application for small heaps. As a result,
heap-growth scavenging is no longer necessary, since any actual RSS
growth will be accounted for on the allocation codepath.
Finally, this change also cleans up grow a little bit to avoid
pretending that it's freeing an in-use span and just does the necessary
operations directly.
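To summarize, the updated mental model of states and transitions looks
roughly like this (state names as described in the new comments; treat them
as the intended terminology rather than exact identifiers):

    // None     --sysReserve--> Reserved
    // Reserved --sysMap------> Prepared   (sysMap is a no-op on Windows)
    // Prepared --sysUsed-----> Ready
    // Ready    --sysUnused---> Prepared
    // any      --sysFree-----> None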
Fixes #32012.
Fixes #31966.
Updates #26473.
Change-Id: Ie06061eb638162e0560cdeb0b8993d94cfb4d290
Reviewed-on: https://go-review.googlesource.com/c/go/+/177097
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change adds the global physHugePageSize which is initialized in
osinit(). physHugePageSize contains the system's transparent huge page
(or superpage) size in bytes.
For #30333.
Change-Id: I2f0198c40729dbbe6e6f2676cef1d57dd107562c
Reviewed-on: https://go-review.googlesource.com/c/go/+/170858
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
WebAssembly's memory is contiguous. Allocating memory at a high address
also allocates all memory up to that address. This change reduces
the initial memory allocated on wasm from 1GB to 16MB by using multiple
heap arenas and reducing the size of a heap arena.
Fixes #27462.
Change-Id: Ic941e6edcadd411e65a14cb2f9fd6c8eae02fc7a
Reviewed-on: https://go-review.googlesource.com/c/go/+/170950
Run-TryBot: Richard Musiol <neelance@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
We already have the ptrdata field in a type, which encodes exactly
the same information that kindNoPointers does.
My problem with kindNoPointers is that it often leads to
double-negative code like:
t.kind & kindNoPointers != 0
Much clearer is:
t.ptrdata == 0
Updates #27167
Change-Id: I92307d7f018a6bbe3daca4a4abb4225e359349b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/169157
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
To be used by the errors package for checking assignability
and setting error values in As.
Updates #29934.
Change-Id: I8c1d02a2c6efa0919d54b286cfe8b4edc26da059
Reviewed-on: https://go-review.googlesource.com/c/161759
Run-TryBot: Marcel van Lohuizen <mpvl@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Remove an unnecessary check in the heap sampling code that forced sampling
of all heap allocations larger than the sampling rate. These need to follow
a Poisson process so that they can be correctly unsampled. Maintain a check
for MemProfileRate==1 to provide a mechanism for full sampling, as
documented in https://golang.org/pkg/runtime/#pkg-variables.
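A sketch of the resulting check on the allocation path (field names
approximate):

    if rate := MemProfileRate; rate > 0 {
        // Only rate == 1 forces every allocation to be sampled; all other
        // sizes, however large, go through the Poisson sampling point.
        if rate != 1 && size < c.nextSample {
            c.nextSample -= size
        } else {
            profilealloc(mp, x, size)
        }
    }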
Additional testing for this change is on cl/129117.
Fixes #26618
Change-Id: I7802bde2afc655cf42cffac34af9bafeb3361957
GitHub-Last-Rev: 471f747af8
GitHub-Pull-Request: golang/go#29791
Reviewed-on: https://go-review.googlesource.com/c/158337
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
This commit allows the runtime to handle 64-bit addresses returned by
the mmap syscall on AIX.
The mmap syscall returns 59-bit addresses on AIX, but the arena
implementation only allows addresses with fewer than 48 bits.
This commit increases the arena size up to 1<<60 for aix/ppc64.
Updates #25893
Change-Id: Iea72e8a944d10d4f00be915785e33ae82dd6329e
Reviewed-on: https://go-review.googlesource.com/c/138736
Reviewed-by: Austin Clements <austin@google.com>
Currently, there's no efficient way to iterate over the Go heap. We're
going to need this for fast free page sweeping, so this CL adds a
slice of all allocated heap arenas. This will also be useful for
generational GC.
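A sketch of the new field (assuming the allArenas name), alongside the
existing sparse arena map in mheap:

    // allArenas is the arenaIndex of every mapped arena. It can be used
    // to iterate over all mapped arenas without walking the sparse
    // arenas map.
    allArenas []arenaIdx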
For #18155.
Change-Id: I58d126cfb9c3f61b3125d80b74ccb1b2169efbcc
Reviewed-on: https://go-review.googlesource.com/c/138076
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This improves performance for e.g. maps with a bucket size
(key+value*8 bytes) larger than 32 bytes and removes loading
a value from the maxElems array for smaller bucket sizes.
name old time/op new time/op delta
MakeMap/[Byte]Byte 95.5ns ± 1% 94.7ns ± 1% -0.78% (p=0.013 n=9+9)
MakeMap/[Int]Int 128ns ± 0% 121ns ± 2% -5.63% (p=0.000 n=6+10)
Updates #21588
Change-Id: I7d9eb7d49150c399c15dcab675e24bc97ff97852
Reviewed-on: https://go-review.googlesource.com/c/143997
Reviewed-by: Keith Randall <khr@golang.org>
Currently the table of arena sizes mixes the number of entries in the
L1 with the size of the L2. While the size of the L2 is important,
this makes it hard to see what's actually going on because there's an
implicit factor of sys.PtrSize.
This changes the L2 column to give both the number of entries and the
resulting size. This should hopefully make the relations between
the columns of the table clearer, since they can now be plugged
directly into the given formula.
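For example, with the */64-bit row of the table, the columns now multiply
out directly (an illustrative check, not a new row):

    L1 entries * L2 entries * arena size = 2^(address bits)
    1 * 4M * 64MB = 2^22 * 2^26 = 2^48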
Change-Id: Ie677adaef763b893a2f620bd4fc3b8db314b3a1e
Reviewed-on: https://go-review.googlesource.com/c/139697
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This argument is always gcBackgroundMode since only
debug.gcstoptheworld can trigger a STW GC at this point. Remove the
unnecessary argument.
Updates #26903. This is preparation for unifying STW GC and concurrent
GC.
Change-Id: Icb4ba8f10f80c2b69cf51a21e04fa2c761b71c94
Reviewed-on: https://go-review.googlesource.com/c/134775
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
mcache.refill doesn't need to run on the system stack; it just needs
to be non-preemptible. Its only caller, mcache.nextFree, also needs to
be non-preemptible, so we can remove the unnecessary systemstack
switch.
Change-Id: Iba5b3f4444855f1dc134485ba588efff3b54c426
Reviewed-on: https://go-review.googlesource.com/138196
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
We already aliased mSpanInUse to _MSpanInUse. The dual constants are
getting annoying, so fix all of these to use the mSpan* naming
convention.
This was done automatically with:
sed -i -re 's/_?MSpan(Dead|InUse|Manual|Free)/mSpan\1/g' *.go
plus deleting the existing definition of mSpanInUse.
Change-Id: I09979d9d491d06c10689cea625dc57faa9cc6767
Reviewed-on: https://go-review.googlesource.com/137875
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
TSAN for Go only supports heap addresses in the range [0x00c000000000,
0x00e000000000). However, we currently create heap hints of the form
0xXXc000000000 for XX between 0x00 and 0x7f. Even for XX=0x01, this
hint is outside TSAN's supported heap address range.
Fix this by creating a slightly different set of hints in race mode,
all of which fall inside TSAN's heap address range.
This should fix TestArenaCollision flakes. That test forces the
runtime to use later heap hints. Currently, this always results in
TSAN "failed to allocate" failures on Windows (which happens to have a
slightly more constrained TSAN layout than non-Windows). Most of the
time we don't notice these failures, but sometimes it crashes TSAN,
leading to a test failure.
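A sketch of the race-mode hint selection in mallocinit (masking details
approximate):

    if raceenabled {
        // The TSAN runtime requires the heap to be in the range
        // [0x00c000000000, 0x00e000000000).
        p = uintptr(i)<<32 | uintptrMask&(0x00c0<<32)
        if p >= uintptrMask&0x00e000000000 {
            continue
        }
    }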
Fixes #25698.
Change-Id: I8926cd61f0ee5ee00efa77b283f7b809c555be46
Reviewed-on: https://go-review.googlesource.com/123780
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, the runtime falls back to asking for any address the OS can
offer for the heap when it runs out of hint addresses. However, the
race detector assumes the heap lives in [0x00c000000000,
0x00e000000000), and will fail in a non-obvious way if we go outside
this region.
Fix this by actively throwing a useful error if we run out of heap
hints in race mode.
This problem is currently being triggered by TestArenaCollision, which
intentionally triggers this fallback behavior. Fix the test to look
for the new panic message in race mode.
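Roughly, the new failure path is a plain throw when the hints run out in
race mode (message text approximate):

    if raceenabled {
        // The race detector assumes the heap lives in
        // [0x00c000000000, 0x00e000000000), but we just ran out of
        // hints in this region. Give a clear failure instead of an
        // obscure one later.
        throw("too many address space collisions for -race mode")
    }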
Fixes #24670.
Updates #24133.
Change-Id: I57de6d17a3495dc1f1f84afc382cd18a6efc2bf7
Reviewed-on: https://go-review.googlesource.com/104717
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently, we use 64MB heap arenas on 64-bit platforms. This works
well on UNIX-like OSes because they treat untouched pages as
essentially free. However, on Windows, committed memory is charged
against a process whether or not it has demand-faulted physical pages
in. Hence, on Windows, even a process with a tiny heap will commit
64MB for one heap arena, plus another 32MB for the arena map. Things
are much worse under the race detector, which increases the heap
commitment by a factor of 5.5X, leading to 384MB of committed memory
at runtime init.
Fix this by reducing the heap arena size to 4MB on Windows.
To counterbalance the effect of increasing the arena map size by a
factor of 16, and to further reduce the impact of the commitment for
the arena map, we switch from a single entry L1 arena map to a 64
entry L1 arena map.
Compared to the original arena design, this slows down the
x/benchmarks garbage benchmark by 0.49% (the slowdown of this commit
alone is 1.59%, but the previous commit bought us a 1% speed-up):
name old time/op new time/op delta
Garbage/benchmem-MB=64-12 2.28ms ± 1% 2.29ms ± 1% +0.49% (p=0.000 n=17+18)
(https://perf.golang.org/search?q=upload:20180223.1)
(This was measured on linux/amd64 by modifying its arena configuration
as above.)
Fixes #23900.
Change-Id: I6b7fa5ecebee2947bf20cfeb78c248809469c6b1
Reviewed-on: https://go-review.googlesource.com/96780
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, the heap arena map is a single, large array that covers
every possible arena frame in the entire address space. This is
practical up to about 48 bits of address space with 64 MB arenas.
However, there are two problems with this:
1. mips64, ppc64, and s390x support full 64-bit address spaces (though
on Linux only s390x has kernel support for 64-bit address spaces).
On these platforms, it would be good to support these larger
address spaces.
2. On Windows, processes are charged for untouched memory, so for
processes with small heaps, the mostly-untouched 32 MB arena map
plus a 64 MB arena are significant overhead. Hence, it would be
good to reduce both the arena map size and the arena size, but with
a single-level arena, these are inversely proportional.
This CL adds support for a two-level arena map. Arena frame numbers
are now divided into arenaL1Bits of L1 index and arenaL2Bits of L2
index.
At the moment, arenaL1Bits is always 0, so we effectively have a
single level map. We do a few things so that this has no cost beyond
the current single-level map:
1. We embed the L2 array directly in mheap, so if there's a single
entry in the L2 array, the representation is identical to the
current representation and there's no extra level of indirection.
2. Hot code that accesses the arena map is structured so that it
optimizes to nearly the same machine code as it does currently.
3. We make some small tweaks to hot code paths and to the inliner
itself to keep some important functions inlined despite their
now-larger ASTs. In particular, this is necessary for
heapBitsForAddr and heapBits.next.
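A sketch of how an arena index might split into the two levels (per point 1,
the single-entry case compiles down to the old direct lookup):

    type arenaIdx uint

    func (i arenaIdx) l1() uint {
        if arenaL1Bits == 0 {
            // With no L1 map, the compiler optimizes this away.
            return 0
        }
        return uint(i) >> arenaL1Shift
    }

    func (i arenaIdx) l2() uint {
        if arenaL1Bits == 0 {
            return uint(i)
        }
        return uint(i) & (1<<arenaL2Bits - 1)
    }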
Possibly as a result of some of the tweaks, this actually slightly
improves the performance of the x/benchmarks garbage benchmark:
name old time/op new time/op delta
Garbage/benchmem-MB=64-12 2.28ms ± 1% 2.26ms ± 1% -1.07% (p=0.000 n=17+19)
(https://perf.golang.org/search?q=upload:20180223.2)
For #23900.
Change-Id: If5164e0961754f97eb9eca58f837f36d759505ff
Reviewed-on: https://go-review.googlesource.com/96779
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
There are too many places where I want to talk about "indexing into
the arena index". Make this less awkward and ambiguous by calling it
the "arena map" instead.
Change-Id: I726b0667bb2139dbc006175a0ec09a871cdf73f9
Reviewed-on: https://go-review.googlesource.com/96777
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Now that we support the full non-contiguous virtual address space of
amd64 hardware, some of the comments and constants related to this are
out of date.
This renames memLimitBits to heapAddrBits because 1<<memLimitBits is
no longer the limit of the address space and rewrites the comment to
focus first on hardware limits (which span OSes) and then discuss
kernel limits.
Second, this eliminates the memLimit constant because there's no
longer a meaningful "highest possible heap pointer value" on amd64.
Updates #23862.
Change-Id: I44b32033d2deb6b69248fb8dda14fc0e65c47f11
Reviewed-on: https://go-review.googlesource.com/95498
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>