mirror of https://github.com/golang/go.git
5580 Commits
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
636c5f0208 |
runtime: enable vDSO support for s390x architecture
This change adds support for vDSO for s390x architecture. This avoids the use of system calls in nanotime and walltime and accelerates them by factor 4-5.
Benchmarks:
100,000,000 x time.Now():
syscall fallback 13923ms 139.23 ns/op
vDSO enabled 2640ms 26.40 ns/op
Change-Id: Ic679fe31048379e59ccf83b400140f13c9d49696
GitHub-Last-Rev:
|
|
|
|
4388faf964 |
runtime: use unsafe.Slice in getStackMap
CL 362934 added open code for unsafe.Slice, so using it now no longer negatively impacts the performance. Updates #48798 Change-Id: Ifbabe8bc1cc4349c5bcd11586a11fc99bcb388b1 Reviewed-on: https://go-review.googlesource.com/c/go/+/404974 Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
579902d0b1 |
cmd/compile,runtime: open code unsafe.Slice
So prevent heavy runtime call overhead, and the compiler will have a chance to optimize the bound check. With this optimization, changing runtime/stack.go to use unsafe.Slice no longer negatively impacts stack copying performance: name old time/op new time/op delta StackCopyWithStkobj-8 16.3ms ± 6% 16.5ms ± 5% ~ (p=0.382 n=8+8) name old alloc/op new alloc/op delta StackCopyWithStkobj-8 17.0B ± 0% 17.0B ± 0% ~ (all equal) name old allocs/op new allocs/op delta StackCopyWithStkobj-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fixes #48798 Change-Id: I731a9a4abd6dd6846f44eece7f86025b7bb1141b Reviewed-on: https://go-review.googlesource.com/c/go/+/362934 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com> |
|
|
|
ccb798741b |
runtime: change maxSearchAddr into a helper function
This avoids a dependency on the compiler statically initializing maxSearchAddr, which is necessary so we can disable the (overly aggressive and spec non-conforming) optimizations in cmd/compile and gccgo. Updates #51913. Change-Id: I424e62c81c722bb179ed8d2d8e188274a1aeb7b6 Reviewed-on: https://go-review.googlesource.com/c/go/+/396194 Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
e0f99775f2 |
runtime: store pointer-size words in memclr
GC requires the whole zeroed word to be visible for a memory subsystem. While the implementation of Enhanced REP STOSB tries to use as efficient stores as possible, e.g writing the whole cache line and not byte-after-byte, we should use REP STOSQ to guarantee the requirements of the GC. The performance is not affected. Change-Id: I1b0fd1444a40bfbb661541291ab96eba11bcc762 Reviewed-on: https://go-review.googlesource.com/c/go/+/405274 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> |
|
|
|
aeaf4b0e5b |
runtime: not mark save_g NOFRAME on ARM
On ARM, when GOARM<=6 the TLS pointer is fetched via a call to a kernel helper. This call clobbers LR, even just temporarily. If the function is NOFRAME, if a profiling signal lands right after the call returns, the unwinder will find the wrong LR. Not mark it NOFRAME, so the LR will be saved in the usual way and stack unwinding should work. May fix #52829. Change-Id: I419a31dcf4afbcff8d7ab8f179eec3c477589e60 Reviewed-on: https://go-review.googlesource.com/c/go/+/405482 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> |
|
|
|
482669db62 |
runtime: add lock partial order edge for trace and wbufSpans and mheap
This edge represents the case of executing a write barrier under the trace lock: we might use the wbufSpans lock to get a new trace buffer, or mheap to allocate a totally new one. Fixes #52794. Change-Id: Ia1ac2c744b8284ae29f4745373df3f9675ab1168 Reviewed-on: https://go-review.googlesource.com/c/go/+/405476 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> |
|
|
|
8fdd277fe6 |
runtime: profile finalizer G more carefully in goroutine profile
If SetFinalizer is never called, we might readgstatus on a nil fing variable, resulting in a crash. This change guards code that accesses fing by a nil check. Fixes #52821. Change-Id: I3e8e7004f97f073dc622e801a1d37003ea153a29 Reviewed-on: https://go-review.googlesource.com/c/go/+/405475 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Rhys Hiltner <rhys@justin.tv> |
|
|
|
dcfe57b8c2 |
runtime: use ERMS in memclr_amd64
This patch adds support for REP STOSB in memclr(). The current implementation uses REP STOSB when 1) ERMS is supported 2) size is bigger than 2kb and less than 32mb. The threshold of 2kb is chosen based on benchmark results and is close to what Intel mentioned in their comparison of ERMSB and AVX (Table 3-4. Relative Performance of Memcpy() Using ERMSB Vs. 128-bit AVX in the Intel Optimization Guide). While REP STOS uses a no-RFO write protocol, ERMS could show the same or slower performance comparing to Non-Temporal Stores when the size is bigger than LLC depending on hardware. Benchmarks (including MemclrRange from CL373362) goos: darwin goarch: amd64 pkg: runtime cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz name old time/op new time/op delta Memclr/5-12 1.90ns ± 2% 2.13ns ± 2% +11.72% (p=0.001 n=7+7) Memclr/16-12 2.33ns ± 4% 2.36ns ± 4% ~ (p=0.259 n=7+7) Memclr/64-12 2.58ns ± 2% 2.61ns ± 3% ~ (p=0.091 n=7+7) Memclr/256-12 4.89ns ± 4% 4.94ns ± 3% ~ (p=0.620 n=7+7) Memclr/4096-12 38.4ns ± 2% 39.5ns ± 5% ~ (p=0.078 n=7+7) Memclr/65536-12 929ns ± 2% 1040ns ±19% ~ (p=0.268 n=5+7) Memclr/1M-12 24.2µs ± 5% 19.0µs ± 9% -21.62% (p=0.001 n=7+7) Memclr/4M-12 93.3µs ± 3% 73.2µs ± 4% -21.50% (p=0.001 n=7+7) Memclr/8M-12 209µs ± 6% 164µs ± 3% -21.55% (p=0.001 n=7+7) Memclr/16M-12 731µs ± 4% 507µs ± 6% -30.71% (p=0.001 n=7+7) Memclr/64M-12 1.79ms ± 1% 1.83ms ± 3% +2.47% (p=0.041 n=6+6) MemclrRange/1_2_47K-12 873ns ± 3% 899ns ± 5% ~ (p=0.053 n=7+7) MemclrRange/2_8_166K-12 2.98µs ± 4% 2.90µs ± 5% ~ (p=0.165 n=7+7) MemclrRange/4_16_315K-12 6.81µs ± 4% 5.31µs ± 9% -22.01% (p=0.001 n=7+7) MemclrRange/128_256_1623K-12 37.5µs ± 4% 28.1µs ± 4% -25.19% (p=0.001 n=7+6) [Geo mean] 1.56µs 1.43µs -8.43% name old speed new speed delta Memclr/5-12 2.63GB/s ± 2% 2.35GB/s ± 2% -10.50% (p=0.001 n=7+7) Memclr/16-12 6.86GB/s ± 4% 6.79GB/s ± 4% ~ (p=0.259 n=7+7) Memclr/64-12 24.8GB/s ± 2% 24.5GB/s ± 3% ~ (p=0.097 n=7+7) Memclr/256-12 52.4GB/s ± 4% 51.9GB/s ± 3% ~ (p=0.620 n=7+7) Memclr/4096-12 107GB/s ± 2% 104GB/s ± 5% ~ (p=0.073 n=7+7) Memclr/65536-12 70.6GB/s ± 2% 64.2GB/s ±21% ~ (p=0.268 n=5+7) Memclr/1M-12 43.4GB/s ± 5% 55.5GB/s ±10% +28.04% (p=0.001 n=7+7) Memclr/4M-12 45.0GB/s ± 4% 57.3GB/s ± 4% +27.38% (p=0.001 n=7+7) Memclr/8M-12 40.1GB/s ± 5% 51.1GB/s ± 3% +27.37% (p=0.001 n=7+7) Memclr/16M-12 23.0GB/s ± 4% 33.1GB/s ± 6% +44.39% (p=0.001 n=7+7) Memclr/64M-12 37.6GB/s ± 1% 36.7GB/s ± 3% -2.38% (p=0.041 n=6+6) MemclrRange/1_2_47K-12 55.9GB/s ± 3% 54.3GB/s ± 5% ~ (p=0.053 n=7+7) MemclrRange/2_8_166K-12 57.4GB/s ± 5% 58.9GB/s ± 5% ~ (p=0.165 n=7+7) MemclrRange/4_16_315K-12 47.4GB/s ± 4% 60.9GB/s ± 9% +28.40% (p=0.001 n=7+7) MemclrRange/128_256_1623K-12 44.3GB/s ± 4% 58.4GB/s ± 9% +31.73% (p=0.001 n=7+7) [Geo mean] 33.6GB/s 36.8GB/s +9.27% goos: linux goarch: amd64 pkg: runtime cpu: Intel(R) Xeon(R) Gold 6230N CPU @ 2.30GHz name old time/op new time/op delta Memclr/5-2 2.53ns ± 0% 2.52ns ± 0% -0.25% (p=0.001 n=7+7) Memclr/16-2 2.77ns ± 0% 2.55ns ± 0% -7.97% (p=0.000 n=5+7) Memclr/64-2 3.16ns ± 0% 3.16ns ± 0% ~ (p=0.432 n=7+7) Memclr/256-2 7.26ns ± 0% 7.26ns ± 0% ~ (p=0.220 n=7+7) Memclr/4096-2 49.3ns ± 0% 43.5ns ± 0% -11.80% (p=0.001 n=7+7) Memclr/65536-2 1.32µs ± 1% 1.24µs ± 0% -6.31% (p=0.001 n=7+7) Memclr/1M-2 27.3µs ± 0% 26.6µs ± 5% ~ (p=0.195 n=7+7) Memclr/4M-2 195µs ± 0% 148µs ± 4% -24.22% (p=0.001 n=7+7) Memclr/8M-2 391µs ± 0% 308µs ± 0% -21.09% (p=0.001 n=7+6) Memclr/16M-2 782µs ± 0% 639µs ± 1% -18.31% (p=0.001 n=7+7) Memclr/64M-2 2.83ms ± 1% 2.84ms ± 1% ~ (p=0.620 n=7+7) MemclrRange/1K_2K-2 1.24µs ± 0% 1.24µs ± 0% ~ (p=1.000 n=7+6) MemclrRange/2K_8K-2 3.89µs ± 0% 3.11µs ± 0% -20.00% (p=0.001 n=6+7) MemclrRange/4K_16K-2 3.63µs ± 0% 2.37µs ± 0% -34.61% (p=0.001 n=7+7) MemclrRange/160K_228K-2 31.0µs ± 0% 30.6µs ± 1% -1.50% (p=0.001 n=7+7) [Geo mean] 1.97µs 1.76µs -10.59% name old speed new speed delta Memclr/5-2 1.98GB/s ± 0% 1.98GB/s ± 0% +0.27% (p=0.001 n=7+7) Memclr/16-2 5.78GB/s ± 0% 6.28GB/s ± 0% +8.67% (p=0.001 n=7+7) Memclr/64-2 20.2GB/s ± 0% 20.3GB/s ± 0% ~ (p=0.535 n=7+7) Memclr/256-2 35.3GB/s ± 0% 35.2GB/s ± 0% ~ (p=0.259 n=7+7) Memclr/4096-2 83.1GB/s ± 0% 94.2GB/s ± 0% +13.39% (p=0.001 n=7+7) Memclr/65536-2 49.7GB/s ± 1% 53.0GB/s ± 0% +6.73% (p=0.001 n=7+7) Memclr/1M-2 38.4GB/s ± 0% 39.4GB/s ± 4% ~ (p=0.209 n=7+7) Memclr/4M-2 21.5GB/s ± 0% 28.4GB/s ± 4% +32.02% (p=0.001 n=7+7) Memclr/8M-2 21.5GB/s ± 0% 27.2GB/s ± 0% +26.73% (p=0.001 n=7+6) Memclr/16M-2 21.4GB/s ± 0% 26.2GB/s ± 1% +22.42% (p=0.001 n=7+7) Memclr/64M-2 23.7GB/s ± 1% 23.7GB/s ± 1% ~ (p=0.620 n=7+7) MemclrRange/1K_2K-2 77.3GB/s ± 0% 77.3GB/s ± 0% ~ (p=0.710 n=7+7) MemclrRange/2K_8K-2 85.7GB/s ± 0% 107.1GB/s ± 0% +25.00% (p=0.001 n=6+7) MemclrRange/4K_16K-2 89.0GB/s ± 0% 136.1GB/s ± 0% +52.92% (p=0.001 n=7+7) MemclrRange/160K_228K-2 53.6GB/s ± 0% 54.4GB/s ± 1% +1.52% (p=0.001 n=7+7) [Geo mean] 29.2GB/s 32.7GB/s +11.86% Change-Id: I8f3533f88ebd303ae1666a77391fec304bea9724 Reviewed-on: https://go-review.googlesource.com/c/go/+/374396 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
f566fe3910 |
runtime: make racereadrange ABIinternal
CL 266638 marked racewriterange (and some other race functions) as ABIinternal but missed racereadrange. arm64 and ppc64le (the other two register ABI platforms at the moment) already have racereadrange marked as such. The other two instrumented calls are to racefuncenter/racefuncexit. Do you think they would need this treatment as well? arm64 already does, but amd64 and ppc64le do not. Fixes #51459 Change-Id: I3f54e1298433b6d67bfe18120d9f86205ff66a73 Reviewed-on: https://go-review.googlesource.com/c/go/+/393154 Reviewed-by: Than McIntosh <thanm@google.com> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com> |
|
|
|
dd8d425fed |
all: fix some lint issues
Make some code more simple.
Change-Id: I801adf0dba5f6c515681345c732dbb907f945419
GitHub-Last-Rev:
|
|
|
|
9c9090eb1d |
cmd/link: generate PPC64 ABI register save/restore functions if needed
They are usually needed when internally linking gcc code compiled with -Os. These are typically generated by ld or gold, but are missing when linking internally. The PPC64 ELF ABI describes a set of functions to save/restore non-volatile, callee-save registers using R1/R0/R12: _savegpr0_n: Save Rn-R31 relative to R1, save LR (in R0), return _restgpr0_n: Restore Rn-R31 from R1, and return to saved LR _savefpr_n: Save Fn-F31 based on R1, and save LR (in R0), return _restfpr_n: Restore Fn-F31 from R1, and return to 16(R1) _savegpr1_n: Save Rn-R31 based on R12, return _restgpr1_n: Restore Rn-R31 based on R12, return _savevr_m: Save VRm-VR31 based on R0, R12 is scratch, return _restvr_m: Restore VRm-VR31 based on R0, R12 is scratch, return m is a value 20<=m<=31 n is a value 14<=n<=31 Add several new functions similar to those suggested by the PPC64 ELFv2 ABI. And update the linker to scan external relocs for these calls, and redirect them to runtime.elf_<func>+offset in runtime/asm_ppc64x.go. Similarly, code which generates plt stubs is moved into a dedicated function. This avoids an extra scan of relocs. fixes #52336 Change-Id: I2f0f8b5b081a7b294dff5c92b4b1db8eba9a9400 Reviewed-on: https://go-review.googlesource.com/c/go/+/400796 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Paul Murphy <murp@ibm.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
7f71e1fc28 |
runtime/cgo: remove memset in _cgo_sys_thread_start on linux
pthread_attr_init in glibc and musl libc already explicitly clear the pthread_attr argument before setting it, see https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/pthread_attr_init.c and https://git.musl-libc.org/cgit/musl/log/src/thread/pthread_attr_init.c It looks like pthread_attr_init has been implemented like this for a long time in both libcs. The comment and memset in _cgo_sys_thread_start probably stem from a time where not all libcs did the explicit memset in pthread_attr_init. Also, the memset in _cgo_sys_thread_start is not performed on all linux platforms further indicating that this isn't an issue anymore. Change-Id: I920148b5d24751195ced7af5bb7c52a7f8293259 Reviewed-on: https://go-review.googlesource.com/c/go/+/404275 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> |
|
|
|
7b1f8b62be |
runtime: prefer curg for execution trace profile
The CPU profiler adds goroutine labels to its samples based on getg().m.curg. That allows the profile to correctly attribute work that the runtime does on behalf of that goroutine on the M's g0 stack via systemstack calls, such as using runtime.Callers to record the call stack. Those labels also cover work on the g0 stack via mcall. When the active goroutine calls runtime.Gosched, it will receive attribution of its share of the scheduler work necessary to find the next runnable goroutine. The execution tracer's attribution of CPU samples to specific goroutines should match. When curg is set, attribute the CPU samples to that goroutine's ID. Fixes #52693 Change-Id: Ic9af92e153abd8477559e48bc8ebaf3739527b94 Reviewed-on: https://go-review.googlesource.com/c/go/+/404055 Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Rhys Hiltner <rhys@justin.tv> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
1926fa5f84 |
runtime: use profile data before advancing index
Fixes #52704 Change-Id: Ia2104c62d7ea9d67469144948b2ceb5d9f1313b3 Reviewed-on: https://go-review.googlesource.com/c/go/+/404054 Run-TryBot: Rhys Hiltner <rhys@justin.tv> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> |
|
|
|
1b0f9fbb67 |
cmd/compile: enable Asan check for global variables
With this patch, -asan option can detect the error memory access to global variables. So this patch makes a few changes: 1. Add the asanregisterglobals runtime support function, which calls asan runtime function _asan_register_globals to register global variables. 2. Create a new initialization function for the package being compiled. This function initializes an array of instrumented global variables and pass it to function runtime.asanregisterglobals. An instrumented global variable has trailing redzone. 3. Writes the new size of instrumented global variables that have trailing redzones into object file. 4. Notice that the current implementation is only compatible with the ASan library from version v7 to v9. Therefore, using the -asan option requires that the gcc version is not less than 7 and the clang version is less than 4, otherwise a segmentation fault will occur. So this patch adds a check on whether the compiler being used is a supported version in cmd/go. (This is a redo of CL 401775 with a fix for a build break due to an intervening commit that removed the internal/execabs package.) Updates #44853. Change-Id: I719d4ef2b22cb2d5516e1494cd453c3efb47d6c7 Reviewed-on: https://go-review.googlesource.com/c/go/+/403851 Auto-Submit: Bryan Mills <bcmills@google.com> Run-TryBot: Bryan Mills <bcmills@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> |
|
|
|
265b5fd2f1 |
Revert "cmd/compile: enable Asan check for global variables"
This reverts CL 401775. Reason for revert: broke build. Change-Id: I4f6f2edff1e4afcf31cd90e26dacf303979eb10c Reviewed-on: https://go-review.googlesource.com/c/go/+/403981 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Bryan Mills <bcmills@google.com> Run-TryBot: Bryan Mills <bcmills@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> |
|
|
|
f52b4ec63d |
cmd/compile: enable Asan check for global variables
With this patch, -asan option can detect the error memory access to global variables. So this patch makes a few changes: 1. Add the asanregisterglobals runtime support function, which calls asan runtime function _asan_register_globals to register global variables. 2. Create a new initialization function for the package being compiled. This function initializes an array of instrumented global variables and pass it to function runtime.asanregisterglobals. An instrumented global variable has trailing redzone. 3. Writes the new size of instrumented global variables that have trailing redzones into object file. 4. Notice that the current implementation is only compatible with the ASan library from version v7 to v9. Therefore, using the -asan option requires that the gcc version is not less than 7 and the clang version is less than 4, otherwise a segmentation fault will occur. So this patch adds a check on whether the compiler being used is a supported version in cmd/go. Updates #44853. Change-Id: Ib877a817209ab2be68a8e22c418fe4a4a20880fc Reviewed-on: https://go-review.googlesource.com/c/go/+/401775 Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Bryan Mills <bcmills@google.com> Auto-Submit: Bryan Mills <bcmills@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
87cf92e7d5 |
cmd,runtime: enable race detector on s390x
LLVM has SystemZ ThreadSanitizer support now [1], this patch integrates it with golang. The biggest part is the glue code in race_s390x.s, which is derived from race_arm64.s, and then the support needs to be enabled in four places. [1] https://reviews.llvm.org/D105629 Change-Id: I1d4e51beb4042603b681e4aca9af6072879d54d6 Reviewed-on: https://go-review.googlesource.com/c/go/+/336549 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Keith Randall <khr@golang.org> |
|
|
|
4f898840d1 |
runtime: improve the annotation of debugCallV2 for arm64
This CL improves the annotation documentation of the debugCallV2 function for arm64. Change-Id: Icc2b52063cf4fe779071039d6a3bca1951108eb0 Reviewed-on: https://go-review.googlesource.com/c/go/+/402514 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Eric Fang <eric.fang@arm.com> Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> |
|
|
|
23f13255f0 |
runtime: re-add import in trace.go
CL 400795, which uses the runtime/internal/atomic package in trace.go, raced against CL 397014 removing that import. Re-add the import. Change-Id: If847ec23f9a0fdff91dab07e93d9fb1b2efed85b Reviewed-on: https://go-review.googlesource.com/c/go/+/403845 Run-TryBot: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Run-TryBot: Rhys Hiltner <rhys@justin.tv> Reviewed-by: Ian Lance Taylor <iant@google.com> |
|
|
|
89c0dd829f |
runtime: split mprof locks
The profiles for memory allocations, sync.Mutex contention, and general blocking store their data in a shared hash table. The bookkeeping work at the end of a garbage collection cycle involves maintenance on each memory allocation record. Previously, a single lock guarded access to the hash table and the contents of all records. When a program has allocated memory at a large number of unique call stacks, the maintenance following every garbage collection can hold that lock for several milliseconds. That can prevent progress on all other goroutines by delaying acquirep's call to mcache.prepareForSweep, which needs the lock in mProf_Free to report when a profiled allocation is no longer in use. With no user goroutines making progress, it is in effect a multi-millisecond GC-related stop-the-world pause. Split the lock so the call to mProf_Flush no longer delays each P's call to mProf_Free: mProf_Free uses a lock on the memory records' N+1 cycle, and mProf_Flush uses locks on the memory records' accumulator and their N cycle. mProf_Malloc also no longer competes with mProf_Flush, as it uses a lock on the memory records' N+2 cycle. The profiles for sync.Mutex contention and general blocking now share a separate lock, and another lock guards insertions to the shared hash table (uncommon in the steady-state). Consumers of each type of profile take the matching accumulator lock, so will observe consistent count and magnitude values for each record. For #45894 Change-Id: I615ff80618d10e71025423daa64b0b7f9dc57daa Reviewed-on: https://go-review.googlesource.com/c/go/+/399956 Reviewed-by: Carlos Amedee <carlos@golang.org> Run-TryBot: Rhys Hiltner <rhys@justin.tv> Reviewed-by: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
2c0a9884e0 |
runtime: add CPU samples to execution trace
When the CPU profiler and execution tracer are both active, report the CPU profile samples in the execution trace data stream. Include only samples that arrive on the threads known to the runtime, but include them even when running g0 (such as near the scheduler) or if there's no P (such as near syscalls). Render them in "go tool trace" as instantaneous events. For #16895 Change-Id: I0aa501a7b450c971e510961c0290838729033f7f Reviewed-on: https://go-review.googlesource.com/c/go/+/400795 Reviewed-by: Michael Knyszek <mknyszek@google.com> Run-TryBot: Rhys Hiltner <rhys@justin.tv> Reviewed-by: David Chase <drchase@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
52bd1c4d6c |
runtime: decrease STW pause for goroutine profile
The goroutine profile needs to stop the world to get a consistent
snapshot of all goroutines in the app. Leaving the world stopped while
iterating over allgs leads to a pause proportional to the number of
goroutines in the app (or its high-water mark).
Instead, do only a fixed amount of bookkeeping while the world is
stopped. Install a barrier so the scheduler confirms that a goroutine
appears in the profile, with its stack recorded exactly as it was during
the stop-the-world pause, before it allows that goroutine to execute.
Iterate over allgs while the app resumes normal operations, adding each
to the profile unless they've been scheduled in the meantime (and so
have profiled themselves). Stop the world a second time to remove the
barrier and do a fixed amount of cleanup work.
This increases both the fixed overhead and per-goroutine CPU-time cost
of GoroutineProfile. It also increases the wall-clock latency of the
call to GoroutineProfile, since the scheduler may interrupt it to
execute other goroutines.
name old time/op new time/op delta
GoroutineProfile/small/loaded-8 1.05ms ± 5% 4.99ms ±31% +376.85% (p=0.000 n=10+9)
GoroutineProfile/sparse/loaded-8 1.04ms ± 4% 3.61ms ±27% +246.61% (p=0.000 n=10+10)
GoroutineProfile/large/loaded-8 7.69ms ±17% 20.35ms ± 4% +164.50% (p=0.000 n=10+10)
GoroutineProfile/small/idle 958µs ± 0% 1820µs ±23% +89.91% (p=0.000 n=10+10)
GoroutineProfile/sparse/idle-8 1.00ms ± 3% 1.52ms ±17% +51.18% (p=0.000 n=10+10)
GoroutineProfile/small/idle-8 1.01ms ± 4% 1.47ms ± 7% +45.28% (p=0.000 n=9+9)
GoroutineProfile/sparse/idle 980µs ± 1% 1403µs ± 2% +43.22% (p=0.000 n=9+10)
GoroutineProfile/large/idle-8 7.19ms ± 8% 8.43ms ±21% +17.22% (p=0.011 n=10+10)
PingPongHog 511ns ± 8% 585ns ± 9% +14.39% (p=0.000 n=10+10)
GoroutineProfile/large/idle 6.71ms ± 0% 7.58ms ± 3% +13.08% (p=0.000 n=8+10)
PingPongHog-8 469ns ± 8% 509ns ±12% +8.62% (p=0.010 n=9+10)
WakeupParallelSyscall/5µs 216µs ± 4% 229µs ± 3% +6.06% (p=0.000 n=10+9)
WakeupParallelSyscall/5µs-8 147µs ± 1% 149µs ± 2% +1.12% (p=0.009 n=10+10)
WakeupParallelSyscall/2µs-8 140µs ± 0% 142µs ± 1% +1.11% (p=0.001 n=10+9)
WakeupParallelSyscall/50µs-8 236µs ± 0% 238µs ± 1% +1.08% (p=0.000 n=9+10)
WakeupParallelSyscall/1µs-8 138µs ± 0% 140µs ± 1% +1.05% (p=0.013 n=10+9)
Matmult 8.52ns ± 1% 8.61ns ± 0% +0.98% (p=0.002 n=10+8)
WakeupParallelSyscall/10µs-8 157µs ± 1% 158µs ± 1% +0.58% (p=0.003 n=10+8)
CreateGoroutinesSingle-8 328ns ± 0% 330ns ± 1% +0.57% (p=0.000 n=9+9)
WakeupParallelSpinning/100µs-8 343µs ± 0% 344µs ± 1% +0.30% (p=0.015 n=8+8)
WakeupParallelSyscall/20µs-8 178µs ± 0% 178µs ± 0% +0.18% (p=0.043 n=10+9)
StackGrowthDeep-8 22.8µs ± 0% 22.9µs ± 0% +0.12% (p=0.006 n=10+10)
StackGrowth 1.06µs ± 0% 1.06µs ± 0% +0.09% (p=0.000 n=8+9)
WakeupParallelSpinning/0s 10.7µs ± 0% 10.7µs ± 0% +0.08% (p=0.000 n=9+9)
WakeupParallelSpinning/5µs 30.7µs ± 0% 30.7µs ± 0% +0.04% (p=0.000 n=10+10)
WakeupParallelSpinning/100µs 411µs ± 0% 411µs ± 0% +0.03% (p=0.000 n=10+9)
WakeupParallelSpinning/2µs 18.7µs ± 0% 18.7µs ± 0% +0.02% (p=0.026 n=10+10)
WakeupParallelSpinning/20µs-8 93.0µs ± 0% 93.0µs ± 0% +0.01% (p=0.021 n=9+10)
StackGrowth-8 216ns ± 0% 216ns ± 0% ~ (p=0.209 n=10+10)
CreateGoroutinesParallel-8 49.5ns ± 2% 49.3ns ± 1% ~ (p=0.591 n=10+10)
CreateGoroutinesSingle 699ns ±20% 748ns ±19% ~ (p=0.353 n=10+10)
WakeupParallelSpinning/0s-8 15.9µs ± 2% 16.0µs ± 3% ~ (p=0.315 n=10+10)
WakeupParallelSpinning/1µs 14.6µs ± 0% 14.6µs ± 0% ~ (p=0.513 n=10+10)
WakeupParallelSpinning/2µs-8 24.2µs ± 3% 24.1µs ± 2% ~ (p=0.971 n=10+10)
WakeupParallelSpinning/10µs 50.7µs ± 0% 50.7µs ± 0% ~ (p=0.101 n=10+10)
WakeupParallelSpinning/20µs 90.7µs ± 0% 90.7µs ± 0% ~ (p=0.898 n=10+10)
WakeupParallelSpinning/50µs 211µs ± 0% 211µs ± 0% ~ (p=0.382 n=10+10)
WakeupParallelSyscall/0s-8 137µs ± 1% 138µs ± 0% ~ (p=0.075 n=10+10)
WakeupParallelSyscall/1µs 216µs ± 1% 219µs ± 3% ~ (p=0.065 n=10+9)
WakeupParallelSyscall/2µs 214µs ± 7% 219µs ± 1% ~ (p=0.101 n=10+8)
WakeupParallelSyscall/50µs 317µs ± 5% 326µs ± 4% ~ (p=0.123 n=10+10)
WakeupParallelSyscall/100µs 450µs ± 9% 459µs ± 8% ~ (p=0.247 n=10+10)
WakeupParallelSyscall/100µs-8 337µs ± 0% 338µs ± 1% ~ (p=0.089 n=10+10)
WakeupParallelSpinning/5µs-8 32.2µs ± 0% 32.2µs ± 0% -0.05% (p=0.026 n=9+10)
WakeupParallelSpinning/50µs-8 216µs ± 0% 216µs ± 0% -0.12% (p=0.004 n=10+10)
WakeupParallelSpinning/1µs-8 20.6µs ± 0% 20.5µs ± 0% -0.22% (p=0.014 n=10+10)
WakeupParallelSpinning/10µs-8 54.5µs ± 0% 54.2µs ± 1% -0.57% (p=0.000 n=10+10)
CreateGoroutines-8 213ns ± 1% 211ns ± 1% -0.86% (p=0.002 n=10+10)
CreateGoroutinesCapture 1.03µs ± 0% 1.02µs ± 0% -0.91% (p=0.000 n=10+10)
CreateGoroutinesCapture-8 1.32µs ± 1% 1.31µs ± 1% -1.06% (p=0.001 n=10+9)
CreateGoroutines 188ns ± 0% 186ns ± 0% -1.06% (p=0.000 n=9+10)
CreateGoroutinesParallel 188ns ± 0% 186ns ± 0% -1.27% (p=0.000 n=8+10)
WakeupParallelSyscall/0s 210µs ± 3% 207µs ± 3% -1.60% (p=0.043 n=10+10)
StackGrowthDeep 121µs ± 1% 119µs ± 1% -1.70% (p=0.000 n=9+10)
Matmult-8 1.82ns ± 3% 1.78ns ± 3% -2.16% (p=0.020 n=10+10)
WakeupParallelSyscall/20µs 281µs ± 3% 269µs ± 4% -4.44% (p=0.000 n=10+10)
WakeupParallelSyscall/10µs 239µs ± 3% 228µs ± 9% -4.70% (p=0.001 n=10+10)
GoroutineProfile/sparse-nil/idle-8 485µs ± 2% 12µs ± 4% -97.56% (p=0.000 n=10+10)
GoroutineProfile/small-nil/idle-8 484µs ± 2% 12µs ± 1% -97.60% (p=0.000 n=10+7)
GoroutineProfile/small-nil/loaded-8 487µs ± 2% 11µs ± 3% -97.68% (p=0.000 n=10+10)
GoroutineProfile/sparse-nil/loaded-8 507µs ± 4% 11µs ± 6% -97.78% (p=0.000 n=10+10)
GoroutineProfile/large-nil/idle-8 709µs ± 2% 11µs ± 4% -98.38% (p=0.000 n=10+10)
GoroutineProfile/large-nil/loaded-8 717µs ± 2% 11µs ± 3% -98.43% (p=0.000 n=10+10)
GoroutineProfile/sparse-nil/idle 465µs ± 3% 1µs ± 1% -99.84% (p=0.000 n=10+10)
GoroutineProfile/small-nil/idle 493µs ± 3% 1µs ± 0% -99.85% (p=0.000 n=10+9)
GoroutineProfile/large-nil/idle 716µs ± 1% 1µs ± 2% -99.89% (p=0.000 n=7+10)
name old alloc/op new alloc/op delta
CreateGoroutinesCapture 144B ± 0% 144B ± 0% ~ (all equal)
CreateGoroutinesCapture-8 144B ± 0% 144B ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
CreateGoroutinesCapture 5.00 ± 0% 5.00 ± 0% ~ (all equal)
CreateGoroutinesCapture-8 5.00 ± 0% 5.00 ± 0% ~ (all equal)
name old p50-ns new p50-ns delta
GoroutineProfile/small/loaded-8 1.01M ± 3% 3.87M ±45% +282.15% (p=0.000 n=10+10)
GoroutineProfile/sparse/loaded-8 1.02M ± 3% 2.43M ±41% +138.42% (p=0.000 n=10+10)
GoroutineProfile/large/loaded-8 7.43M ±16% 17.28M ± 2% +132.43% (p=0.000 n=10+10)
GoroutineProfile/small/idle 956k ± 0% 1559k ±16% +63.03% (p=0.000 n=10+10)
GoroutineProfile/small/idle-8 1.01M ± 3% 1.45M ± 7% +44.31% (p=0.000 n=10+9)
GoroutineProfile/sparse/idle 977k ± 1% 1399k ± 2% +43.20% (p=0.000 n=10+10)
GoroutineProfile/sparse/idle-8 1.00M ± 3% 1.41M ± 3% +40.47% (p=0.000 n=10+10)
GoroutineProfile/large/idle-8 6.97M ± 1% 8.41M ±25% +20.54% (p=0.003 n=8+10)
GoroutineProfile/large/idle 6.71M ± 1% 7.46M ± 4% +11.15% (p=0.000 n=10+10)
GoroutineProfile/sparse-nil/idle-8 483k ± 3% 13k ± 3% -97.41% (p=0.000 n=10+9)
GoroutineProfile/small-nil/idle-8 483k ± 2% 12k ± 1% -97.43% (p=0.000 n=10+8)
GoroutineProfile/small-nil/loaded-8 484k ± 3% 10k ± 2% -97.93% (p=0.000 n=10+8)
GoroutineProfile/sparse-nil/loaded-8 492k ± 2% 10k ± 4% -97.97% (p=0.000 n=10+8)
GoroutineProfile/large-nil/idle-8 708k ± 2% 12k ±15% -98.36% (p=0.000 n=10+10)
GoroutineProfile/large-nil/loaded-8 714k ± 2% 10k ± 2% -98.60% (p=0.000 n=10+8)
GoroutineProfile/sparse-nil/idle 459k ± 1% 1k ± 1% -99.85% (p=0.000 n=10+10)
GoroutineProfile/small-nil/idle 477k ± 1% 1k ± 0% -99.85% (p=0.000 n=10+9)
GoroutineProfile/large-nil/idle 712k ± 1% 1k ± 1% -99.90% (p=0.000 n=7+10)
name old p90-ns new p90-ns delta
GoroutineProfile/small/loaded-8 1.13M ±10% 7.49M ±35% +562.07% (p=0.000 n=10+10)
GoroutineProfile/sparse/loaded-8 1.10M ±12% 4.58M ±31% +318.02% (p=0.000 n=10+9)
GoroutineProfile/large/loaded-8 8.78M ±24% 27.83M ± 2% +217.00% (p=0.000 n=10+10)
GoroutineProfile/small/idle 967k ± 0% 2909k ±50% +200.91% (p=0.000 n=10+10)
GoroutineProfile/sparse/idle-8 1.02M ± 3% 1.96M ±76% +92.99% (p=0.000 n=10+10)
GoroutineProfile/small/idle-8 1.07M ±17% 1.55M ±12% +45.23% (p=0.000 n=10+10)
GoroutineProfile/sparse/idle 992k ± 1% 1417k ± 3% +42.79% (p=0.000 n=9+10)
GoroutineProfile/large/idle 6.73M ± 0% 7.99M ± 8% +18.80% (p=0.000 n=8+10)
GoroutineProfile/large/idle-8 8.20M ±25% 9.18M ±25% ~ (p=0.315 n=10+10)
GoroutineProfile/sparse-nil/idle-8 495k ± 3% 13k ± 1% -97.36% (p=0.000 n=10+9)
GoroutineProfile/small-nil/idle-8 494k ± 2% 13k ± 3% -97.36% (p=0.000 n=10+10)
GoroutineProfile/small-nil/loaded-8 496k ± 2% 13k ± 1% -97.41% (p=0.000 n=10+10)
GoroutineProfile/sparse-nil/loaded-8 544k ±11% 13k ± 1% -97.62% (p=0.000 n=10+9)
GoroutineProfile/large-nil/idle-8 724k ± 1% 13k ± 3% -98.20% (p=0.000 n=10+10)
GoroutineProfile/large-nil/loaded-8 729k ± 3% 13k ± 2% -98.23% (p=0.000 n=10+10)
GoroutineProfile/sparse-nil/idle 476k ± 4% 1k ± 1% -99.85% (p=0.000 n=9+10)
GoroutineProfile/small-nil/idle 537k ±10% 1k ± 0% -99.87% (p=0.000 n=10+9)
GoroutineProfile/large-nil/idle 729k ± 0% 1k ± 1% -99.90% (p=0.000 n=7+10)
name old p99-ns new p99-ns delta
GoroutineProfile/sparse/loaded-8 1.27M ±33% 20.49M ±17% +1514.61% (p=0.000 n=10+10)
GoroutineProfile/small/loaded-8 1.37M ±29% 20.48M ±23% +1399.35% (p=0.000 n=10+10)
GoroutineProfile/large/loaded-8 9.76M ±23% 39.98M ±22% +309.52% (p=0.000 n=10+8)
GoroutineProfile/small/idle 976k ± 1% 3367k ±55% +244.94% (p=0.000 n=10+10)
GoroutineProfile/sparse/idle-8 1.03M ± 3% 2.50M ±65% +142.30% (p=0.000 n=10+10)
GoroutineProfile/small/idle-8 1.17M ±34% 1.70M ±14% +45.15% (p=0.000 n=10+10)
GoroutineProfile/sparse/idle 1.02M ± 3% 1.45M ± 4% +42.64% (p=0.000 n=9+10)
GoroutineProfile/large/idle 6.92M ± 2% 9.00M ± 7% +29.98% (p=0.000 n=8+9)
GoroutineProfile/large/idle-8 8.74M ±23% 9.90M ±24% ~ (p=0.190 n=10+10)
GoroutineProfile/sparse-nil/idle-8 508k ± 4% 16k ± 2% -96.90% (p=0.000 n=10+9)
GoroutineProfile/small-nil/idle-8 508k ± 4% 16k ± 3% -96.91% (p=0.000 n=10+9)
GoroutineProfile/small-nil/loaded-8 542k ± 5% 15k ±15% -97.15% (p=0.000 n=10+10)
GoroutineProfile/sparse-nil/loaded-8 649k ±16% 15k ±18% -97.67% (p=0.000 n=10+10)
GoroutineProfile/large-nil/idle-8 738k ± 2% 16k ± 2% -97.86% (p=0.000 n=10+10)
GoroutineProfile/large-nil/loaded-8 765k ± 4% 15k ±17% -98.03% (p=0.000 n=10+10)
GoroutineProfile/sparse-nil/idle 539k ±26% 1k ±17% -99.84% (p=0.000 n=10+10)
GoroutineProfile/small-nil/idle 659k ±25% 1k ± 0% -99.84% (p=0.000 n=10+8)
GoroutineProfile/large-nil/idle 760k ± 2% 1k ±22% -99.88% (p=0.000 n=9+10)
Fixes #33250
For #50794
Change-Id: I862a2bc4e991cec485f21a6fce4fca84f2c6435b
Reviewed-on: https://go-review.googlesource.com/c/go/+/387415
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
|
|
b9dee7e59b |
runtime/pprof: stress test goroutine profiler
For #33250 Change-Id: Ic7aa74b1bb5da9c4319718bac96316b236cb40b2 Reviewed-on: https://go-review.googlesource.com/c/go/+/387414 Run-TryBot: Rhys Hiltner <rhys@justin.tv> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: David Chase <drchase@google.com> |
|
|
|
209942fa88 |
runtime/pprof: add race annotations for goroutine profiles
The race annotations for goroutine label maps covered the special type of read necessary to create CPU profiles. Extend that to include goroutine profiles. Annotate the copy involved in creating new goroutines. Fixes #50292 Change-Id: I10f69314e4f4eba85c506590fe4781f4d6b8ec2d Reviewed-on: https://go-review.googlesource.com/c/go/+/385660 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Austin Clements <austin@google.com> |
|
|
|
7c404d59db |
runtime: store consistent total allocation stats as uint64
Currently the consistent total allocation stats are managed as uintptrs, which means they can easily overflow on 32-bit systems. Fix this by storing these stats as uint64s. This will cause some minor performance degradation on 32-bit systems, but there really isn't a way around this, and it affects the correctness of the metrics we export. Fixes #52680. Change-Id: I7e6ca44047d46b4bd91c6f87c2d29f730e0d6191 Reviewed-on: https://go-review.googlesource.com/c/go/+/403758 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Austin Clements <austin@google.com> |
|
|
|
e7508598bb |
runtime: use Escape instead of escape in export_test.go
I landed the bottom CL of my stack without rebasing or retrying trybots, but in the rebase "escape" was removed in favor of "Escape." Change-Id: Icdc4d8de8b6ebc782215f2836cd191377cc211df Reviewed-on: https://go-review.googlesource.com/c/go/+/403755 Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> |
|
|
|
f01c20bf2b |
runtime/debug: export SetMemoryLimit
This change also adds an end-to-end test for SetMemoryLimit as a testprog. Fixes #48409. Change-Id: I102d64acf0f36a43ee17b7029e8dfdd1ee5f057d Reviewed-on: https://go-review.googlesource.com/c/go/+/397018 Reviewed-by: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
91f863013e |
runtime: redesign scavenging algorithm
Currently the runtime's scavenging algorithm involves running from the top of the heap address space to the bottom (or as far as it gets) once per GC cycle. Once it treads some ground, it doesn't tread it again until the next GC cycle. This works just fine for the background scavenger, for heap-growth scavenging, and for debug.FreeOSMemory. However, it breaks down in the face of a memory limit for small heaps in the tens of MiB. Basically, because the scavenger never retreads old ground, it's completely oblivious to new memory it could scavenge, and that it really *should* in the face of a memory limit. Also, every time some thread goes to scavenge in the runtime, it reserves what could be a considerable amount of address space, hiding it from other scavengers. This change modifies and simplifies the implementation overall. It's less code with complexities that are much better encapsulated. The current implementation iterates optimistically over the address space looking for memory to scavenge, keeping track of what it last saw. The new implementation does the same, but instead of directly iterating over pages, it iterates over chunks. It maintains an index of chunks (as a bitmap over the address space) that indicate which chunks may contain scavenge work. The page allocator populates this index, while scavengers consume it and iterate over it optimistically. This has a two key benefits: 1. Scavenging is much simpler: find a candidate chunk, and check it, essentially just using the scavengeOne fast path. There's no need for the complexity of iterating beyond one chunk, because the index is lock-free and already maintains that information. 2. If pages are freed to the page allocator (always guaranteed to be unscavenged), the page allocator immediately notifies all scavengers of the new source of work, avoiding the hiding issues of the old implementation. One downside of the new implementation, however, is that it's potentially more expensive to find pages to scavenge. In the past, if a single page would become free high up in the address space, the runtime's scavengers would ignore it. Now that scavengers won't, one or more scavengers may need to iterate potentially across the whole heap to find the next source of work. For the background scavenger, this just means a potentially less reactive scavenger -- overall it should still use the same amount of CPU. It means worse overheads for memory limit scavenging, but that's not exactly something with a baseline yet. In practice, this shouldn't be too bad, hopefully since the chunk index is extremely compact. For a 48-bit address space, the index is only 8 MiB in size at worst, but even just one physical page in the index is able to support up to 128 GiB heaps, provided they aren't terribly sparse. On 32-bit platforms, the index is only 128 bytes in size. For #48409. Change-Id: I72b7e74365046b18c64a6417224c5d85511194fb Reviewed-on: https://go-review.googlesource.com/c/go/+/399474 Reviewed-by: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
b4d81147d8 |
runtime: make the scavenger and allocator respect the memory limit
This change does everything necessary to make the memory allocator and the scavenger respect the memory limit. In particular, it: - Adds a second goal for the background scavenge that's based on the memory limit, setting a target 5% below the limit to make sure it's working hard when the application is close to it. - Makes span allocation assist the scavenger if the next allocation is about to put total memory use above the memory limit. - Measures any scavenge assist time and adds it to GC assist time for the sake of GC CPU limiting, to avoid a death spiral as a result of scavenging too much. All of these changes have a relatively small impact, but each is intimately related and thus benefit from being done together. For #48409. Change-Id: I35517a752f74dd12a151dd620f102c77e095d3e8 Reviewed-on: https://go-review.googlesource.com/c/go/+/397017 Reviewed-by: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
7e4bc74119 |
runtime: set the heap goal from the memory limit
This change makes the memory limit functional by including it in the heap goal calculation. Specifically, we derive a heap goal from the memory limit, and compare that to the GOGC-based goal. If the goal based on the memory limit is lower, we prefer that. To derive the memory limit goal, the heap goal calculation now takes a few additional parameters as input. As a result, the heap goal, in the presence of a memory limit, may change dynamically. The consequences of this are that different parts of the runtime can have different views of the heap goal; this is OK. What's important is that all of the runtime is able to observe the correct heap goal for the moment it's doing something that affects it, like anything that should trigger a GC cycle. On the topic of triggering a GC cycle, this change also allows any manually managed memory allocation from the page heap to trigger a GC. So, specifically workbufs, unrolled GC scan programs, and goroutine stacks. The reason for this is that now non-heap memory can effect the trigger or the heap goal. Most sources of non-heap memory only change slowly, like GC pointer bitmaps, or change in response to explicit function calls like GOMAXPROCS. Note also that unrolled GC scan programs and workbufs are really only relevant during a GC cycle anyway, so they won't actually ever trigger a GC. Our primary target here is goroutine stacks. Goroutine stacks can increase quickly, and this is currently totally independent of the GC cycle. Thus, if for example a goroutine begins to recurse suddenly and deeply, then even though the heap goal and trigger react, we might not notice until its too late. As a result, we need to trigger a GC cycle. We do this trigger in allocManual instead of in stackalloc because it's far more general. We ultimately care about memory that's mapped read/write and not returned to the OS, which is much more the domain of the page heap than the stack allocator. Furthermore, there may be new sources of memory manual allocation in the future (e.g. arenas) that need to trigger a GC if necessary. As such, I'm inclined to leave the trigger in allocManual as an extra defensive measure. It's worth noting that because goroutine stacks do not behave quite as predictably as other non-heap memory, there is the potential for the heap goal to swing wildly. Fortunately, goroutine stacks that haven't been set up to shrink by the last GC cycle will not shrink until after the next one. This reduces the amount of possible churn in the heap goal because it means that shrinkage only happens once per goroutine, per GC cycle. After all the goroutines that should shrink did, then goroutine stacks will only grow. The shrink mechanism is analagous to sweeping, which is incremental and thus tends toward a steady amount of heap memory used. As a result, in practice, I expect this to be a non-issue. Note that if the memory limit is not set, this change should be a no-op. For #48409. Change-Id: Ie06d10175e5e36f9fb6450e26ed8acd3d30c681c Reviewed-on: https://go-review.googlesource.com/c/go/+/394221 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> |
|
|
|
973dcbb87c |
runtime: remove float64 multiplication in heap trigger compute path
As of the last CL, the heap trigger is computed as-needed. This means that some of the niceties we assumed (that the float64 computations don't matter because we're doing this rarely anyway) are no longer true. While we're not exactly on a hot path right now, the trigger check still happens often enough that it's a little too hot for comfort. This change optimizes the computation by replacing the float64 multiplication with a shift and a constant integer multiplication. I ran an allocation microbenchmark for an allocation size that would hit this path often. CPU profiles seem to indicate this path was ~0.1% of cycles (dwarfed by other costs, e.g. zeroing memory) even if all we're doing is allocating, so the "optimization" here isn't particularly important. However, since the code here is executed significantly more frequently, and this change isn't particularly complicated, let's err on the size of efficiency if we can help it. Note that because of the way the constants are represented now, they're ever so slightly different from before, so this change technically isn't a total no-op. In practice however, it should be. These constants are fuzzy and hand-picked anyway, so having them shift a little is unlikely to make a significant change to the behavior of the GC. For #48409. Change-Id: Iabb2385920f7d891b25040226f35a3f31b7bf844 Reviewed-on: https://go-review.googlesource.com/c/go/+/397015 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> |
|
|
|
129dcb7226 |
runtime: check the heap goal and trigger dynamically
As it stands, the heap goal and the trigger are set once by gcController.commit, and then read out of gcController. However with the coming memory limit we need the GC to be able to respond to changes in non-heap memory. The simplest way of achieving this is to compute the heap goal and its associated trigger dynamically. In order to make this easier to implement, the GC trigger is now based on the heap goal, as opposed to the status quo of computing both simultaneously. In many cases we just want the heap goal anyway, not both, but we definitely need the goal to compute the trigger, because the trigger's bounds are entirely based on the goal (the initial runway is not). A consequence of this is that we can't rely on the trigger to enforce a minimum heap size anymore, and we need to lift that up directly to the goal. Specifically, we need to lift up any part of the calculation that *could* put the trigger ahead of the goal. Luckily this is just the heap minimum and minimum sweep distance. In the first case, the pacer may behave slightly differently, as the heap minimum is no longer the minimum trigger, but the actual minimum heap goal. In the second case it should be the same, as we ensure the additional runway for sweeping is added to both the goal *and* the trigger, as before, by computing that in gcControllerState.commit. There's also another place we update the heap goal: if a GC starts and we triggered beyond the goal, we always ensure there's some runway. That calculation uses the current trigger, which violates the rule of keeping the goal based on the trigger. Notice, however, that using the precomputed trigger for this isn't even quite correct: due to a bug, or something else, we might trigger a GC beyond the precomputed trigger. So this change also adds a "triggered" field to gcControllerState that tracks the point at which a GC actually triggered. This is independent of the precomputed trigger, so it's fine for the heap goal calculation to rely on it. It also turns out, there's more than just that one place where we really should be using the actual trigger point, so this change fixes those up too. Also, because the heap minimum is set by the goal and not the trigger, the maximum trigger calculation now happens *after* the goal is set, so the maximum trigger actually does what I originally intended (and what the comment says): at small heaps, the pacer picks 95% of the runway as the maximum trigger. Currently, the pacer picks a small trigger based on a not-yet-rounded-up heap goal, so the trigger gets rounded up to the goal, and as per the "ensure there's some runway" check, the runway ends up at always being 64 KiB. That check is supposed to be for exceptional circumstances, not the status quo. There's a test introduced in the last CL that needs to be updated to accomodate this slight change in behavior. So, this all sounds like a lot that changed, but what we're talking about here are really, really tight corner cases that arise from situations outside of our control, like pathologically bad behavior on the part of an OS or CPU. Even in these corner cases, it's very unlikely that users will notice any difference at all. What's more important, I think, is that the pacer behaves more closely to what all the comments describe, and what the original intent was. Another note: at first, one might think that computing the heap goal and trigger dynamically introduces some raciness, but not in this CL: the heap goal and trigger are completely static. Allocation outside of a GC cycle may now be a bit slower than before, as the GC trigger check is now significantly more complex. However, note that this executes basically just as often as gcController.revise, and that makes up for a vanishingly small part of any CPU profile. The next CL cleans up the floating point multiplications on this path nonetheless, just to be safe. For #48409. Change-Id: I280f5ad607a86756d33fb8449ad08555cbee93f9 Reviewed-on: https://go-review.googlesource.com/c/go/+/397014 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
473c99643f |
runtime: rewrite pacer max trigger calculation
Currently the maximum trigger calculation is totally incorrect with respect to the comment above it and its intent. This change rectifies this mistake. For #48409. Change-Id: Ifef647040a8bdd304dd327695f5f315796a61a74 Reviewed-on: https://go-review.googlesource.com/c/go/+/398834 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
375d696ddf |
runtime: move inconsistent memstats into gcController
Fundamentally, all of these memstats exist to serve the runtime in
managing memory. For the sake of simpler testing, couple these stats
more tightly with the GC.
This CL was mostly done automatically. The fields had to be moved
manually, but the references to the fields were updated via
gofmt -w -r 'memstats.<field> -> gcController.<field>' *.go
For #48409.
Change-Id: Ic036e875c98138d9a11e1c35f8c61b784c376134
Reviewed-on: https://go-review.googlesource.com/c/go/+/397678
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
|
|
d36d5bd3c1 |
runtime: clean up inconsistent heap stats
The inconsistent heaps stats in memstats are a bit messy. Primarily, heap_sys is non-orthogonal with heap_released and heap_inuse. In later CLs, we're going to want heap_sys-heap_released-heap_inuse, so clean this up by replacing heap_sys with an orthogonal metric: heapFree. heapFree represents page heap memory that is free but not released. I think this change also simplifies a lot of reasoning about these stats; it's much clearer what they mean, and to obtain HeapSys for memstats, we no longer need to do the strange subtraction from heap_sys when allocating specifically non-heap memory from the page heap. Because we're removing heap_sys, we need to replace it with a sysMemStat for mem.go functions. In this case, heap_released is the most appropriate because we increase it anyway (again, non-orthogonality). In which case, it makes sense for heap_inuse, heap_released, and heapFree to become more uniform, and to just represent them all as sysMemStats. While we're here and messing with the types of heap_inuse and heap_released, let's also fix their names (and last_heap_inuse's name) up to the more modern Go convention of camelCase. For #48409. Change-Id: I87fcbf143b3e36b065c7faf9aa888d86bd11710b Reviewed-on: https://go-review.googlesource.com/c/go/+/397677 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
4649a43903 |
runtime: track how much memory is mapped in the Ready state
This change adds a field to memstats called mappedReady that tracks how much memory is in the Ready state at any given time. In essence, it's the total memory usage by the Go runtime (with one exception which is documented). Essentially, all memory mapped read/write that has either been paged in or will soon. To make tracking this not involve the many different stats that track mapped memory, we track this statistic at a very low level. The downside of tracking this statistic at such a low level is that it managed to catch lots of situations where the runtime wasn't fully accounting for memory. This change rectifies these situations by always accounting for memory that's mapped in some way (i.e. always passing a sysMemStat to a mem.go function), with *two* exceptions. Rectifying these situations means also having the memory mapped during testing being accounted for, so that tests (i.e. ReadMemStats) that ultimately check mappedReady continue to work correctly without special exceptions. We choose to simply account for this memory in other_sys. Let's talk about the exceptions. The first is the arenas array for finding heap arena metadata from an address is mapped as read/write in one large chunk. It's tens of MiB in size. On systems with demand paging, we assume that the whole thing isn't paged in at once (after all, it maps to the whole address space, and it's exceedingly difficult with today's technology to even broach having as much physical memory as the total address space). On systems where we have to commit memory manually, we use a two-level structure. Now, the reason why this is an exception is because we have no mechanism to track what memory is paged in, and we can't just account for the entire thing, because that would *look* like an enormous overhead. Furthermore, this structure is on a few really, really critical paths in the runtime, so doing more explicit tracking isn't really an option. So, we explicitly don't and call sysAllocOS to map this memory. The second exception is that we call sysFree with no accounting to clean up address space reservations, or otherwise to throw out mappings we don't care about. In this case, also drop down to a lower level and call sysFreeOS to explicitly avoid accounting. The third exception is debuglog allocations. That is purely a debugging facility and ideally we want it to have as small an impact on the runtime as possible. If we include it in mappedReady calculations, it could cause GC pacing shifts in future CLs, especailly if one increases the debuglog buffer sizes as a one-off. As of this CL, these are the only three places in the runtime that would pass nil for a stat to any of the functions in mem.go. As a result, this CL makes sysMemStats mandatory to facilitate better accounting in the future. It's now much easier to grep and find out where accounting is explicitly elided, because one doesn't have to follow the trail of sysMemStat nil pointer values, and can just look at the function name. For #48409. Change-Id: I274eb467fc2603881717482214fddc47c9eaf218 Reviewed-on: https://go-review.googlesource.com/c/go/+/393402 Reviewed-by: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Michael Knyszek <mknyszek@google.com> |
|
|
|
85d466493d |
runtime: maintain a direct count of total allocs and frees
This will be used by the memory limit computation to determine overheads. For #48409. Change-Id: Iaa4e26e1e6e46f88d10ba8ebb6b001be876dc5cd Reviewed-on: https://go-review.googlesource.com/c/go/+/394220 Reviewed-by: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
|
|
|
986a31053d |
runtime: add a non-functional memory limit to the pacer
Nothing much to see here, just some plumbing to make latter CLs smaller and clearer. For #48409. Change-Id: Ide23812d5553e0b6eea5616c277d1a760afb4ed0 Reviewed-on: https://go-review.googlesource.com/c/go/+/393401 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> |
|
|
|
0feebe6eb5 |
runtime: add byte count parser for GOMEMLIMIT
This change adds a parser for the GOMEMLIMIT environment variable's input. This environment variable accepts a number followed by an optional prefix expressing the unit. Acceptable units include B, KiB, MiB, GiB, TiB, where *iB is a power-of-two byte unit. For #48409. Change-Id: I6a3b4c02b175bfcf9c4debee6118cf5dda93bb6f Reviewed-on: https://go-review.googlesource.com/c/go/+/393400 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> |
|
|
|
01359b4681 |
runtime: add GC CPU utilization limiter
This change adds a GC CPU utilization limiter to the GC. It disables assists to ensure GC CPU utilization remains under 50%. It uses a leaky bucket mechanism that will only fill if GC CPU utilization exceeds 50%. Once the bucket begins to overflow, GC assists are limited until the bucket empties, at the risk of GC overshoot. The limiter is primarily updated by assists. The scheduler may also update it, but only if the GC is on and a few milliseconds have passed since the last update. This second case exists to ensure that if the limiter is on, and no assists are happening, we're still updating the limiter regularly. The purpose of this limiter is to mitigate GC death spirals, opting to use more memory instead. This change turns the limiter on always. In practice, 50% overall GC CPU utilization is very difficult to hit unless you're trying; even the most allocation-heavy applications with complex heaps still need to do something with that memory. Note that small GOGC values (i.e. single-digit, or low teens) are more likely to trigger the limiter, which means the GOGC tradeoff may no longer be respected. Even so, it should still be relatively rare. This change also introduces the feature flag for code to support the memory limit feature. For #48409. Change-Id: Ia30f914e683e491a00900fd27868446c65e5d3c2 Reviewed-on: https://go-review.googlesource.com/c/go/+/353989 Reviewed-by: Michael Pratt <mpratt@google.com> |
|
|
|
3e00bd0ae4 |
internal/poll, net, syscall: use accept4 on solaris
Solaris supports accept4 since version 11.4, see https://docs.oracle.com/cd/E88353_01/html/E37843/accept4-3c.html Use it in internal/poll.accept like on other platforms. Change-Id: I3d9830a85e93bbbed60486247c2f91abc646371f Reviewed-on: https://go-review.googlesource.com/c/go/+/403394 Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Benny Siegert <bsiegert@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Ian Lance Taylor <iant@google.com> |
|
|
|
f771edd7f9 |
all: REVERSE MERGE dev.boringcrypto (cdcb4b6) into master
This commit is a REVERSE MERGE. It merges dev.boringcrypto back into its parent branch, master. This marks the end of development on dev.boringcrypto. Manual Changes: - git rm README.boringcrypto.md - git rm -r misc/boring - git rm src/cmd/internal/notsha256/sha256block_arm64.s - git cherry-pick -n 5856aa74 # remove GOEXPERIMENT=boringcrypto forcing in cmd/dist There are some minor cleanups like merging import statements that I will apply in a follow-up CL. Merge List: + 2022-04-29 |
|
|
|
e845f572ec |
[dev.boringcrypto] crypto/ecdsa, crypto/rsa: use boring.Cache
In the original BoringCrypto port, ecdsa and rsa's public and private keys added a 'boring unsafe.Pointer' field to cache the BoringCrypto form of the key. This led to problems with code that “knew” the layout of those structs and in particular that they had no unexported fields. In response, as an awful kludge, I changed the compiler to pretend that field did not exist when laying out reflect data. Because we want to merge BoringCrypto in the main tree, we need a different solution. Using boring.Cache is that solution. For #51940. Change-Id: Ideb2b40b599a1dc223082eda35a5ea9abcc01e30 Reviewed-on: https://go-review.googlesource.com/c/go/+/395883 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Roland Shoemaker <roland@golang.org> |
|
|
|
a840bf871e |
[dev.boringcrypto] crypto/internal/boring: add GC-aware cache
In the original BoringCrypto port, ecdsa and rsa's public and private keys added a 'boring unsafe.Pointer' field to cache the BoringCrypto form of the key. This led to problems with code that “knew” the layout of those structs and in particular that they had no unexported fields. In response, as an awful kludge, I changed the compiler to pretend that field did not exist when laying out reflect data. Because we want to merge BoringCrypto in the main tree, we need a different solution. The different solution is this CL's boring.Cache, which is a concurrent, GC-aware map from unsafe.Pointer to unsafe.Pointer (if generics were farther along we could use them nicely here, but I am afraid of breaking tools that aren't ready to see generics in the standard library yet). More complex approaches are possible, but a simple, fixed-size hash table is easy to make concurrent and should be fine. For #51940. Change-Id: I44062a8defbd87b705a787cffc64c6a9d0132785 Reviewed-on: https://go-review.googlesource.com/c/go/+/395882 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> |
|
|
|
cb6fc99b32 |
runtime: mark sigtramp as TOPFRAME on the rest of unix
This extends CL 402190 from Linux to the rest of the Unix OSes. Marking sigtramp as TOPFRAME allows gentraceback to stop tracebacks at the end of a signal handler, since there is not much beyond sigtramp. Change-Id: I8b7f5d55d41889f59c0a79c65351b9b0b2d77717 Reviewed-on: https://go-review.googlesource.com/c/go/+/402934 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Austin Clements <austin@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Michael Pratt <mpratt@google.com> |
|
|
|
74f0009422 |
runtime: use saved LR when unwinding through morestack
On LR machine, consider F calling G calling H, which grows stack. The stack looks like ... G's frame: ... locals ... saved LR = return PC in F <- SP points here at morestack H's frame (to be created) At morestack, we save gp.sched.pc = H's morestack call gp.sched.sp = H's entry SP (the arrow above) gp.sched.lr = return PC in G Currently, when unwinding through morestack (if _TraceJumpStack is set), we switch PC and SP but not LR. We then have frame.pc = H's morestack call frame.sp = H's entry SP (the arrow above) As LR is not set, we load it from stack at *sp, so frame.lr = return PC in F As the SP hasn't decremented at the morestack call, frame.fp = frame.sp = H's entry SP Unwinding a frame, we have frame.pc = old frame.lr = return PC in F frame.sp = old frame.fp = H's entry SP a.k.a. G's SP The PC and SP don't match. The unwinding will go off if F and G have different frame sizes. Fix this by preserving the LR when switching stack. Also add code to detect infinite loop in unwinding. TODO: add some test. I can reproduce the infinite loop (or throw with added check) but the frequency is low. May fix #52116. Change-Id: I6e1294f1c6e55f664c962767a1cf6c466a0c0eff Reviewed-on: https://go-review.googlesource.com/c/go/+/400575 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Cherry Mui <cherryyz@google.com> Reviewed-by: Eric Fang <eric.fang@arm.com> Reviewed-by: Benny Siegert <bsiegert@gmail.com> |
|
|
|
123e27170a |
runtime: clean up escaping in tests
There are several tests in the runtime that need to force various things to escape to the heap. This CL centralizes this functionality into runtime.Escape, defined in export_test. Change-Id: I2de2519661603ad46c372877a9c93efef8e7a857 Reviewed-on: https://go-review.googlesource.com/c/go/+/402178 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Austin Clements <austin@google.com> Auto-Submit: Austin Clements <austin@google.com> |
|
|
|
4289bd365c |
runtime: simply user throws, expand runtime throws
This gives explicit names to the possible states of throwing (-1, 0, 1). m.throwing is now one of: throwTypeOff: not throwing, previously == 0 throwTypeUser: user throw, previously == -1 throwTypeRuntime: runtime throw, previously == 1 For runtime throws, we now always include frame metadata and system goroutines regardless of GOTRACEBACK to aid in debugging the runtime. For user throws, we no longer include frame metadata or runtime frames, unless GOTRACEBACK=system or higher. For #51485. Change-Id: If252e2377a0b6385ce7756b937929be4273a56c0 Reviewed-on: https://go-review.googlesource.com/c/go/+/390421 Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Austin Clements <austin@google.com> |