Commit Graph

6021 Commits

Author SHA1 Message Date
Bryan C. Mills fbf14ae166 runtime: remove arbitrary GOARCH constraints in finalizer tests
These tests were only run on GOARCH=amd64, but the rationale given in
CL 11858043 was GC precision on 32-bit platforms. Today, we have far
more 64-bit platforms than just amd64, and I believe that GC precision
on 32-bit platforms has been substantially improved as well.
The GOARCH restriction seems unnecessary.

Updates #57166.
Updates #5368.

Change-Id: I45c608b6fa721012792c96d4ed94a6d772b90210
Reviewed-on: https://go-review.googlesource.com/c/go/+/456120
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2023-01-19 20:45:58 +00:00
Archana R 1c65b69bd1 runtime: fix performance regression in morestack_noctxt on ppc64
In the fix for 54332 the MOVD R1, R1 instruction was added to
morestack_noctxt function to set the SPWRITE bit. However, the
instruction MOVD R1, R1 results in or r1,r1,r1 which is a special
instruction on ppc64 architecture as it changes the thread priority
and can negatively impact performance in some cases.
More details on such similar nops can be found in Power ISA v3.1
Book II on Power ISA Virtual Environment architecture in the chapter
on Program Priority Registers and Or instructions.
Replacing this by OR R0, R1 has the same affect on setting SPWRITE as
needed by the first fix but does not affect thread priority and
hence does not cause the degradation in performance

Hash65536-64           2.81GB/s ±10%  16.69GB/s ± 0%  +494.44%
Fixes #57741

Change-Id: Ib912e3716c6afd277994d6c1c5b2891f82225d50
Reviewed-on: https://go-review.googlesource.com/c/go/+/461597
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Auto-Submit: Benny Siegert <bsiegert@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
2023-01-16 08:37:36 +00:00
zhengchaopu 18625d9bec runtime: fix incorrect comment
Fix incorrect comment for the runtime package.

Change-Id: Iab889eff0e9c622afbed959d32b8b5f0ed0bfebf
GitHub-Last-Rev: e9587868db
GitHub-Pull-Request: golang/go#57731
Reviewed-on: https://go-review.googlesource.com/c/go/+/461498
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2023-01-11 22:26:01 +00:00
Russ Cox 76d39ae349 cmd/link, runtime: Apple libc atfork workaround take 3
CL 451735 worked around bugs in Apple's atfork handlers by calling
notify_is_valid_token and xpc_atfork_child at startup, so that init
code that wouldn't be safe in the child process would be warmed up in
the parent process instead, but xpc_atfork_child broke use of the xpc
library in Go programs, and xpc is internally used by various macOS
frameworks (#57263).

CL 459175 reverted that change, and then CL 459176 tried a new
approach: use __fork, which doesn't call any of the atfork handlers at all.
That worked, but an Apple engineer reviewing the change in private
email suggests that since __fork is not public API, it should be avoided.
The same engineer (with access to the source code for the xpc library)
suggests that the breakage in #57263 is caused by xpc_atfork_child
marking the library as unusable, expecting an imminent call to exec,
and that calling xpc_date_create_from_current instead would do the
necessary initialization without marking xpc as unusable.

CL 460475 reverted that change, to prepare for this one.

This CL goes back to the original “call functions to warm things up”
approach, replacing xpc_atfork_child with xpc_date_create_from_current.

The CL also updates cmd/link to use OS and SDK version 10.13.0 for
x86 macOS binaries, up from 10.9.0, also suggested by the Apple engineer.
Combined with the two warmup calls, this makes the fork hangs go away.
The minimum macOS version has been 10.13 High Sierra since Go 1.17,
so there should be no problem with writing that in the binaries too.

Fixes #33565.
Fixes #56784.
Fixes #57263.
Fixes #57577.

Change-Id: I20769d9daa1fe9ea930f8009481335f8a14dc21b
Reviewed-on: https://go-review.googlesource.com/c/go/+/460476
Auto-Submit: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-01-10 20:34:22 +00:00
Ian Lance Taylor 376076f3c6 runtime: skip TestCgoPprofCallback in short mode, don't run in parallel
Fixes #54778

Change-Id: If9aef0c06b993ef2aedbeea9452297ee9f11fa06
Reviewed-on: https://go-review.googlesource.com/c/go/+/460461
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
2023-01-09 20:30:17 +00:00
Austin Clements 0bbd67e52f runtime/pprof: document possibility of empty stacks
I spent quite a while determining the cause of empty stacks in
profiles and reasoning out why this is okay. There isn't a great place
to record this knowledge, but a documentation comment on
appendLocsForStack is better than nothing.

Updates #51550.

Change-Id: I2eefc6ea31f1af885885c3d96199319f45edb4ce
Reviewed-on: https://go-review.googlesource.com/c/go/+/460695
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2023-01-09 20:25:19 +00:00
Austin Clements d9f23cfe78 runtime/pprof: improve output of TestLabelSystemstack
The current output of TestLabelSystemstack is a bit cryptic. This CL
improves various messages and hopefully simplifies the logic in the
test.

Simplifying the logic leads to three changes in possible outcomes,
which I verified by running the logic before and after this change
through all 2^4 possibilities (https://go.dev/play/p/bnfb-OQCT4j):

1. If a sample both must be labeled and must not be labeled, the test
now reports that explicitly rather than giving other confusing output.

2. If a sample must not be labeled but is, the current logic will
print two identical error messages. The new logic prints only one.

3. If the test finds no frames at all that it recognizes, but the
sample is labeled, it will currently print a confusing "Sample labeled
got true want false" message. The new logic prints nothing. We've seen
this triggered by empty stacks in profiles.

Fixes #51550. This bug was caused by case 3 above, where it was
triggered by a profile label on an empty stack. It's valid for empty
stacks to appear in a profile if we sample a goroutine just as it's
exiting (and that goroutine may have a profile label), so the test
shouldn't fail in this case.

Change-Id: I1593ec4ac33eced5bb89572a3ba7623e56f2fb3d
Reviewed-on: https://go-review.googlesource.com/c/go/+/460516
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-01-09 20:25:17 +00:00
Cherry Mui 1ba7341cb2 cmd/link, runtime: use a different section for Go libfuzzer counters
Currently in libfuzzer mode, we put our counters in section
__sancov_cntrs. When linking with C/C++ code that also has fuzzer
counters, apparently the C linker combines our counters and their
counters and registers them together. But in the Go runtime we
also have code to register our counters. So the Go counters ended
up registered twice, causing problems.

Since we already have code to register our counters, put them in
a Go-specific section so it won't be combined with the C counters.

Fixes #57449.

Change-Id: If3d41735124e7e301572d4b7aecf7d057ac134c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/459055
Reviewed-by: Nicolas Hillegeer <aktau@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
2022-12-23 01:12:02 +00:00
Russ Cox 6c9b661867 runtime: revert Apple libc atfork workaround
Revert CL 451735 (1f4394a0c9), which fixed #33565 and #56784
but also introduced #57263.

I have a different fix to apply instead. Since the first fix was
never backported, it will be easiest to backport the new fix
if the new fix is done in a separate CL from the revert.

Change-Id: I6c8ea3a46e542ee4702675bbc058e29ccd2723e0
Reviewed-on: https://go-review.googlesource.com/c/go/+/459175
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
2022-12-22 18:03:58 +00:00
Cherry Mui de6abd7889 runtime/internal/startlinetest: work around shared buildmode linking issue
The runtime/internal/startlinetest package contains a call to a
function defined in runtime_test. Generally this is fine as this
package is only linked in for runtime_test. Except that for "go
install -buildmode=shared std", which include all packages in std,
including this test-only internal package. In this mode, the
caller is included in the linking but the callee is not, causing
linking error. Work around it by calling
runtime_test.callerStartLine via a function pointer. The function
pointer is only set in runtime_test. In the shared std build, the
function pointer will not be set, and this is fine.

Fixes #57334.

Change-Id: I7d871c50ce6599c6ea2802cf6e14bb749deab220
Reviewed-on: https://go-review.googlesource.com/c/go/+/458696
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
2022-12-22 04:34:33 +00:00
Cherry Mui 18baca6765 runtime/race: add build tag to internal amd64vN packages
Only one of the runtime/race/internal/amd64vN packages should be
included in a build. Generally this is true because the
runtime/race package would import only one of them depending on
the build configuration. But for "go install -buildmode=shared std"
it includes all Go packages in std, which includes both, which
then causes link-time failure due to duplicated symbols. To avoid
this, we add build tags to the internal packages, so, depending on
the build configuation, only one package would contain buildable
go files therefore be included in the build.

For #57334.

Change-Id: I52ddc3a40e16c7d04b4dd861e9689918d27e8509
Reviewed-on: https://go-review.googlesource.com/c/go/+/458695
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-12-22 04:34:09 +00:00
Than McIntosh fadd77c05b runtime/coverage: add missing file close in test support helper
The processPod() helper (invoked by processCoverTestDir, which is in
turn called by _testmain.go) was opening and reading counter data
files, but never closing them. Add a call to close the files after
they have been read.

Fixes #57407.

Change-Id: If9a489f92e4bab72c5b2df8697e14420a6f7b8f5
Reviewed-on: https://go-review.googlesource.com/c/go/+/458835
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-12-21 20:18:10 +00:00
Guoqi Chen 0b2ad1d815 cmd/compile: sign-extend the 2nd argument of the LoweredAtomicCas32 on loong64,mips64x,riscv64
The function LoweredAtomicCas32 is implemented using the LL-SC instruction pair
on loong64, mips64x, riscv64. However,the LL instruction on loong64, mips64x,
riscv64 is sign-extended, so it is necessary to sign-extend the 2nd parameter
"old" of the LoweredAtomicCas32, so that the instruction BNE after LL can get
the desired result.

The function prototype of LoweredAtomicCas32 in golang:
    func Cas32(ptr *uint32, old, new uint32) bool

When using an intrinsify implementation:
    case 1: (*ptr) <= 0x80000000 && old < 0x80000000
        E.g: (*ptr) = 0x7FFFFFFF, old = Rarg1= 0x7FFFFFFF

        After run the instruction "LL (Rarg0), Rtmp": Rtmp = 0x7FFFFFFF
        Rtmp ! = Rarg1(old) is false, the result we expect

    case 2: (*ptr) >= 0x80000000 && old >= 0x80000000
        E.g: (*ptr) = 0x80000000, old = Rarg1= 0x80000000

        After run the instruction "LL (Rarg0), Rtmp": Rtmp = 0xFFFFFFFF_80000000
        Rtmp ! = Rarg1(old) is true, which we do not expect

When using an non-intrinsify implementation:
    Because Rarg1 is loaded from the stack using sign-extended instructions
    ld.w, the situation described in Case 2 above does not occur

Benchmarks on linux/loong64:
name     old time/op  new time/op  delta
Cas      50.0ns ± 0%  50.1ns ± 0%   ~     (p=1.000 n=1+1)
Cas64    50.0ns ± 0%  50.1ns ± 0%   ~     (p=1.000 n=1+1)
Cas-4    56.0ns ± 0%  56.0ns ± 0%   ~     (p=1.000 n=1+1)
Cas64-4  56.0ns ± 0%  56.0ns ± 0%   ~     (p=1.000 n=1+1)

Benchmarks on Loongson 3A4000 (GOARCH=mips64le, 1.8GHz)
name     old time/op  new time/op  delta
Cas      70.4ns ± 0%  70.3ns ± 0%   ~     (p=1.000 n=1+1)
Cas64    70.7ns ± 0%  70.6ns ± 0%   ~     (p=1.000 n=1+1)
Cas-4    81.1ns ± 0%  80.8ns ± 0%   ~     (p=1.000 n=1+1)
Cas64-4  80.9ns ± 0%  80.9ns ± 0%   ~     (p=1.000 n=1+1)

Fixes #57282

Change-Id: I190a7fc648023b15fa392f7fdda5ac18c1561bac
Reviewed-on: https://go-review.googlesource.com/c/go/+/457135
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: David Chase <drchase@google.com>
2022-12-17 01:12:22 +00:00
Oleg Zaytsev 61e2b8ec59 cmd/gc: test temp string comparison with all ops
The comment on `slicebytetostringtmp` mention that `==` operator does
not allocate []byte to string conversion, but the test was testing only
`==` and `!=` and the compiler actually optimizes all comparison
operators.

Also added a test for concatenation comparison, which also should not
allocate.

Change-Id: I6f4c5c4f238808138fa901732e1fd5b6ab25f725
Reviewed-on: https://go-review.googlesource.com/c/go/+/456415
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2022-12-13 14:05:23 +00:00
Than McIntosh 7973b0e508 cmd/{go,cover,covdata}: fix 'package main' inconsistent handling
Fix a buglet in cmd/cover in how we handle package name/path for the
"go build -o foo.exe *.go" and "go run *.go" cases.

The go command assigns a dummy import path of "command-line-arguments"
to the main package built in these cases; rather than expose this
dummy to the user in coverage reports, the cover tool had a special
case hack intended to rewrite such package paths to "main". The hack
was too general, however, and was rewriting the import path of all
packages with (p.name == "main") to an import path of "main". The hack
also produced unexpected results for cases such as

  go test -cover foo.go foo_test.go

This patch removes the hack entirely, leaving the package path for
such cases as "command-line-arguments".

Fixes #57169.

Change-Id: Ib6071db5e3485da3b8c26e16ef57f6fa1712402c
Reviewed-on: https://go-review.googlesource.com/c/go/+/456237
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
2022-12-08 18:29:31 +00:00
Bryan C. Mills eaf0e3d465 runtime: remove arbitrary timeouts in finalizer tests
These short timeouts can overrun due to system scheduling delay
(or GC latency) on a slow or heavily-loaded host.

Moreover, if the test deadlocks we will probably want to know what the
GC goroutines were doing at the time. With an arbitrary timeout, we
never get that information; however, if we allow the test to time out
completely we will get a goroutine dump (and, if GOTRACEBACK is
configured in the environment, that may even include GC goroutines).

Fixes #57166.

Change-Id: I136501883373c3ce4e250dc8340c60876b375f44
Reviewed-on: https://go-review.googlesource.com/c/go/+/456118
Auto-Submit: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2022-12-08 14:19:25 +00:00
Cherry Mui e535d6776c Revert "runtime/pprof: unskip TestTimeVDSO on Android"
This reverts CL 455358, commit 98da0fb43f.

Reason for revert: still failing https://build.golang.org/log/c9f13a76069f523b5b4a37a75ec52b30a1f3427a

Change-Id: I8246d233c4fb86781b882f19dea82065cc21bc26
Reviewed-on: https://go-review.googlesource.com/c/go/+/455696
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2022-12-06 22:07:25 +00:00
Cherry Mui 98da0fb43f runtime/pprof: unskip TestTimeVDSO on Android
It is possible that CL 455166 fixes this. Try unskipping the test
and see. If it fails again we can skip it again.

Fixes #48655.

Change-Id: Ia81b06cb7608f74adb276bc018e8fc840285bc11
Reviewed-on: https://go-review.googlesource.com/c/go/+/455358
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-12-06 18:11:17 +00:00
Cherry Mui 185e1a7b27 runtime: prioritize VDSO and libcall unwinding in profiler
In the profiler, when unwinding the stack, we have special
handling for VDSO calls. Currently, the special handling is only
used when the normal unwinding fails. If the signal lands in the
function that makes the VDSO call (e.g. nanotime1) and after the
stack switch, the normal unwinding doesn't fail but gets a stack
trace with exactly one frame (the nanotime1 frame). The stack
trace stops because of the stack switch. This 1-frame stack trace
is not as helpful. Instead, if vdsoSP is set, we know we are in
VDSO call or right before or after it, so use vdsoPC and vdsoSP
for unwinding. Do the same for libcall.

Also remove _TraceTrap for VDSO unwinding, as vdsoPC and vdsoSP
correspond to a call, not an interrupted instruction.

Fixes #56574.

Change-Id: I799aa7644d0c1e2715ab038a9eef49481dd3a7f5
Reviewed-on: https://go-review.googlesource.com/c/go/+/455166
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-12-05 22:01:50 +00:00
Cherry Mui 2986358969 runtime/cgo: fix typo in gcc_loong64.S
Fix typo in CL 454838.

Change-Id: I0e91d22cf09949f0bf924ebcf09f1ddac525bac4
Reviewed-on: https://go-review.googlesource.com/c/go/+/455161
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2022-12-05 18:57:04 +00:00
Cherry Mui ad55b878e7 runtime/cgo: add .file directive to GNU assembly files
Without it, at least on ARM64 with older BFD linker, it will
include the file of the object file (which is of a temporary path)
as a debug symbol into the binary, causing the build to be
nondeterministic. Adding a .file directive makes it to create a
STT_FILE symbol with deterministic input, and prevent the linker
creating one using the temporary object file path.

Fixes #57035.

Change-Id: I3ab716b240f60f7a891af2f7e10b467df67d1f31
Reviewed-on: https://go-review.googlesource.com/c/go/+/454838
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
2022-12-05 16:41:48 +00:00
Russ Cox c0497d1a81 runtime/debug: add missing period
Pointed out in review of CL 453602,
but it looks like I forgot to re-upload before submitting.

Change-Id: I8f4fac52ea0f904f6f9b06e13fc8ed2e778f2360
Reviewed-on: https://go-review.googlesource.com/c/go/+/454835
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Julie Qiu <julieqiu@google.com>
2022-12-02 23:40:37 +00:00
Russ Cox dadd80ae20 runtime/debug: more complete BuildInfo documentation
A potential user did not realize Deps included all transitive dependencies,
not just direct dependencies of the main module. Clarify that and add
various other useful information.

Change-Id: I5b8e1314bb26092edbcc090ba8eb9859f0a70662
Reviewed-on: https://go-review.googlesource.com/c/go/+/453602
Run-TryBot: Russ Cox <rsc@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Julie Qiu <julieqiu@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
2022-12-02 14:23:18 +00:00
Than McIntosh e24380b6fa runtime,hash/maphash: eliminate maphash torture test for -race
Disable the "torture" portion of the maphash tests if -race is in
effect (these tests can cause timeouts on the longtest -race builder).

Fixes #57030.

Change-Id: I23d7561dac3e81d979cad9e0efa6f5b7154aadd2
Reviewed-on: https://go-review.googlesource.com/c/go/+/454455
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2022-12-01 19:24:55 +00:00
Than McIntosh 0fd7be7ee5 testing: remove stale builder names from windows tests
A couple of the windows runtime tests were being gated by "if
testenv.Builder() == ..." guards that referred to builders that have
long since been obsoleted (e.g. "windows-amd64-gce"). Use a more
generic guard instead, checking for windows-<goarch> prefix.

Change-Id: Ibdb9ce2b0cfe10bba986bd210a5b8ce5c1b1d675
Reviewed-on: https://go-review.googlesource.com/c/go/+/453035
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-29 01:18:35 +00:00
Rongrong 2c7c98c3ad syscall, runtime/internal/syscall: zero r2 before mips linux syscalls
All mips variant perform syscalls similarly. R2 (v0) holds r1 and R3
(v1) holds r2 of a syscall. The latter is only used by 2-ret syscalls.
A 1-ret syscall would not touch R3 but keeps it as is, making r2 be a
random value. Always reset it to 0 before SYSCALL to fix the issue.

Fixes #56426

Change-Id: Ie49965c0c3c224c4a895703ac659205cd040ff56
Reviewed-on: https://go-review.googlesource.com/c/go/+/452975
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meng Zhuo <mzh@golangcn.org>
2022-11-24 04:12:23 +00:00
Than McIntosh d6859465e5 testing: skip TestVectoredHandlerExceptionInNonGoThread on windows-amd64-2012-*
Modify skip rule for TestVectoredHandlerExceptionInNonGoThread to
trigger on both the base builder (windows-amd64-2012) and the newcc
canary builder (windows-amd64-2012-newcc).

Updates #49681.

Change-Id: I58109fc2e861b943cb66be0feec348671be84ab3
Reviewed-on: https://go-review.googlesource.com/c/go/+/452436
Run-TryBot: Than McIntosh <thanm@google.com>
Auto-Submit: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-11-21 17:11:59 +00:00
Than McIntosh f0331c524e testing: skip flaky TestRaiseException on windows-amd64-2012-*
Modify skip rule for TestRaiseException to trigger on both the base
builder (windows-amd64-2012) and the newcc canary builder
(windows-amd64-2012-newcc).

Updates #49681.

Change-Id: I132f9ddd102666b68ad04cc661fdcc2cd841051a
Reviewed-on: https://go-review.googlesource.com/c/go/+/451294
Auto-Submit: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
2022-11-21 16:27:03 +00:00
Joel Sing e84ce0802d runtime: change tfork behaviour to unbreak openbsd/mips64
Currently, tfork on openbsd/mips64 returns the thread ID on success and
a negative error number on error. In CL#447175, newosproc was changed
to assume that a non-zero value is an error - return zero on success to
match this expectation.

Change-Id: I955efad49b149146165eba3d05fe40ba75caa098
Reviewed-on: https://go-review.googlesource.com/c/go/+/451257
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
2022-11-19 03:33:26 +00:00
cui fliter b2faff18ce all: add missing periods in comments
Change-Id: I69065f8adf101fdb28682c55997f503013a50e29
Reviewed-on: https://go-review.googlesource.com/c/go/+/449757
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joedian Reid <joedian@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-11-18 17:59:44 +00:00
Keith Randall 893964b972 runtime,cmd/link: increase stack guard space when building with -race
More stuff to do = more stack needed. Bump up the guard space when
building with the race detector.

Fixes #54291

Change-Id: I701bc8800507921bed568047d35b8f49c26e7df7
Reviewed-on: https://go-review.googlesource.com/c/go/+/451217
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-11-18 16:26:25 +00:00
Joel Sing e18d07ddc5 runtime: optimise memmove on riscv64
Implement a more optimised memmove on riscv64, where up to 64 bytes are moved
per loop after achieving alignment. In the unaligned case, memory is moved at
up to 8 bytes per loop.

This also avoids doing unaligned loads and stores, which results in kernel
traps and a significant performance penality.

Fixes #48248.

name                               old speed      new speed        delta
Memmove/1-4                        31.3MB/s _ 0%    26.6MB/s _ 0%    -14.95%  (p=0.000 n=3+3)
Memmove/2-4                        50.6MB/s _ 1%    42.6MB/s _ 0%    -15.75%  (p=0.000 n=3+3)
Memmove/3-4                        64.5MB/s _ 1%    53.4MB/s _ 2%    -17.11%  (p=0.001 n=3+3)
Memmove/4-4                        74.9MB/s _ 0%    99.2MB/s _ 0%    +32.55%  (p=0.000 n=3+3)
Memmove/5-4                        82.3MB/s _ 0%    99.0MB/s _ 1%    +20.29%  (p=0.000 n=3+3)
Memmove/6-4                        88.2MB/s _ 0%   102.3MB/s _ 1%    +15.87%  (p=0.000 n=3+3)
Memmove/7-4                        93.4MB/s _ 0%   102.0MB/s _ 0%     +9.18%  (p=0.000 n=3+3)
Memmove/8-4                         188MB/s _ 3%     188MB/s _ 6%       ~     (p=0.964 n=3+3)
Memmove/9-4                         182MB/s _ 6%     163MB/s _ 1%       ~     (p=0.069 n=3+3)
Memmove/10-4                        177MB/s _ 0%     149MB/s _ 4%    -15.93%  (p=0.012 n=3+3)
Memmove/11-4                        171MB/s _ 6%     148MB/s _ 0%    -13.65%  (p=0.045 n=3+3)
Memmove/12-4                        166MB/s _ 5%     209MB/s _ 0%    +26.12%  (p=0.009 n=3+3)
Memmove/13-4                        170MB/s _ 1%     188MB/s _ 4%    +10.76%  (p=0.039 n=3+3)
Memmove/14-4                        158MB/s _ 0%     185MB/s _ 0%    +17.13%  (p=0.000 n=3+3)
Memmove/15-4                        166MB/s _ 0%     175MB/s _ 0%     +5.38%  (p=0.000 n=3+3)
Memmove/16-4                        320MB/s _ 6%     343MB/s _ 0%       ~     (p=0.149 n=3+3)
Memmove/32-4                        493MB/s _ 5%     628MB/s _ 1%    +27.51%  (p=0.008 n=3+3)
Memmove/64-4                        706MB/s _ 0%    1132MB/s _ 0%    +60.32%  (p=0.000 n=3+3)
Memmove/128-4                       837MB/s _ 1%    1623MB/s _ 1%    +93.96%  (p=0.000 n=3+3)
Memmove/256-4                       960MB/s _ 0%    2070MB/s _ 6%   +115.68%  (p=0.003 n=3+3)
Memmove/512-4                      1.04GB/s _ 0%    2.55GB/s _ 0%   +146.05%  (p=0.000 n=3+3)
Memmove/1024-4                     1.08GB/s _ 0%    2.76GB/s _ 0%   +155.62%  (p=0.000 n=3+3)
Memmove/2048-4                     1.10GB/s _ 0%    2.90GB/s _ 1%   +164.31%  (p=0.000 n=3+3)
Memmove/4096-4                     1.11GB/s _ 0%    2.98GB/s _ 0%   +169.77%  (p=0.000 n=3+3)
MemmoveOverlap/32-4                 443MB/s _ 0%     500MB/s _ 0%    +12.81%  (p=0.000 n=3+3)
MemmoveOverlap/64-4                 635MB/s _ 0%     908MB/s _ 0%    +42.92%  (p=0.000 n=3+3)
MemmoveOverlap/128-4                789MB/s _ 0%    1423MB/s _ 0%    +80.28%  (p=0.000 n=3+3)
MemmoveOverlap/256-4                925MB/s _ 0%    1941MB/s _ 0%   +109.86%  (p=0.000 n=3+3)
MemmoveOverlap/512-4               1.01GB/s _ 2%    2.37GB/s _ 0%   +134.86%  (p=0.000 n=3+3)
MemmoveOverlap/1024-4              1.06GB/s _ 0%    2.68GB/s _ 1%   +151.67%  (p=0.000 n=3+3)
MemmoveOverlap/2048-4              1.09GB/s _ 0%    2.89GB/s _ 0%   +164.82%  (p=0.000 n=3+3)
MemmoveOverlap/4096-4              1.11GB/s _ 0%    3.01GB/s _ 0%   +171.30%  (p=0.000 n=3+3)
MemmoveUnalignedDst/1-4            24.1MB/s _ 1%    21.3MB/s _ 0%    -11.76%  (p=0.000 n=3+3)
MemmoveUnalignedDst/2-4            41.6MB/s _ 1%    35.9MB/s _ 0%    -13.72%  (p=0.000 n=3+3)
MemmoveUnalignedDst/3-4            54.0MB/s _ 0%    45.5MB/s _ 2%    -15.76%  (p=0.004 n=3+3)
MemmoveUnalignedDst/4-4            63.9MB/s _ 1%    81.6MB/s _ 0%    +27.70%  (p=0.000 n=3+3)
MemmoveUnalignedDst/5-4            69.4MB/s _ 6%    84.8MB/s _ 0%    +22.08%  (p=0.015 n=3+3)
MemmoveUnalignedDst/6-4            77.8MB/s _ 2%    89.0MB/s _ 0%    +14.53%  (p=0.004 n=3+3)
MemmoveUnalignedDst/7-4            83.0MB/s _ 0%    90.7MB/s _ 1%     +9.30%  (p=0.000 n=3+3)
MemmoveUnalignedDst/8-4            6.97MB/s _ 2%  127.73MB/s _ 0%  +1732.57%  (p=0.000 n=3+3)
MemmoveUnalignedDst/9-4            7.81MB/s _ 1%  125.41MB/s _ 0%  +1506.45%  (p=0.000 n=3+3)
MemmoveUnalignedDst/10-4           8.59MB/s _ 2%  123.52MB/s _ 0%  +1337.43%  (p=0.000 n=3+3)
MemmoveUnalignedDst/11-4           9.23MB/s _ 6%  119.81MB/s _ 4%  +1197.55%  (p=0.000 n=3+3)
MemmoveUnalignedDst/12-4           10.3MB/s _ 0%   155.9MB/s _ 7%  +1416.08%  (p=0.001 n=3+3)
MemmoveUnalignedDst/13-4           10.9MB/s _ 3%   155.1MB/s _ 0%  +1321.26%  (p=0.000 n=3+3)
MemmoveUnalignedDst/14-4           11.4MB/s _ 5%   151.0MB/s _ 0%  +1229.37%  (p=0.000 n=3+3)
MemmoveUnalignedDst/15-4           12.6MB/s _ 0%   147.0MB/s _ 0%  +1066.39%  (p=0.000 n=3+3)
MemmoveUnalignedDst/16-4           7.17MB/s _ 0%  184.33MB/s _ 5%  +2470.90%  (p=0.001 n=3+3)
MemmoveUnalignedDst/32-4           7.26MB/s _ 0%  252.00MB/s _ 2%  +3371.12%  (p=0.000 n=3+3)
MemmoveUnalignedDst/64-4           7.25MB/s _ 2%  306.37MB/s _ 1%  +4125.75%  (p=0.000 n=3+3)
MemmoveUnalignedDst/128-4          7.32MB/s _ 1%  338.03MB/s _ 1%  +4517.85%  (p=0.000 n=3+3)
MemmoveUnalignedDst/256-4          7.31MB/s _ 0%  361.06MB/s _ 0%  +4841.47%  (p=0.000 n=3+3)
MemmoveUnalignedDst/512-4          7.35MB/s _ 0%  373.55MB/s _ 0%  +4982.36%  (p=0.000 n=3+3)
MemmoveUnalignedDst/1024-4         7.33MB/s _ 0%  379.00MB/s _ 2%  +5068.18%  (p=0.000 n=3+3)
MemmoveUnalignedDst/2048-4         7.31MB/s _ 2%  383.05MB/s _ 0%  +5142.47%  (p=0.000 n=3+3)
MemmoveUnalignedDst/4096-4         7.35MB/s _ 1%  385.97MB/s _ 1%  +5151.25%  (p=0.000 n=3+3)
MemmoveUnalignedDstOverlap/32-4    9.43MB/s _ 0%  233.72MB/s _ 0%  +2377.56%  (p=0.000 n=3+3)
MemmoveUnalignedDstOverlap/64-4    8.13MB/s _ 3%  288.77MB/s _ 0%  +3451.91%  (p=0.000 n=3+3)
MemmoveUnalignedDstOverlap/128-4   7.77MB/s _ 0%  326.62MB/s _ 3%  +4103.65%  (p=0.000 n=3+3)
MemmoveUnalignedDstOverlap/256-4   7.28MB/s _ 6%  357.24MB/s _ 0%  +4804.85%  (p=0.000 n=3+3)
MemmoveUnalignedDstOverlap/512-4   7.44MB/s _ 0%  363.63MB/s _ 7%  +4787.54%  (p=0.001 n=3+3)
MemmoveUnalignedDstOverlap/1024-4  7.37MB/s _ 0%  383.17MB/s _ 0%  +5101.40%  (p=0.000 n=3+3)
MemmoveUnalignedDstOverlap/2048-4  7.29MB/s _ 2%  387.69MB/s _ 0%  +5215.68%  (p=0.000 n=3+3)
MemmoveUnalignedDstOverlap/4096-4  7.18MB/s _ 5%  389.22MB/s _ 0%  +5320.84%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/1-4            24.2MB/s _ 0%    21.4MB/s _ 1%    -11.70%  (p=0.001 n=3+3)
MemmoveUnalignedSrc/2-4            41.7MB/s _ 0%    36.0MB/s _ 0%    -13.71%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/3-4            52.1MB/s _ 6%    46.4MB/s _ 1%       ~     (p=0.074 n=3+3)
MemmoveUnalignedSrc/4-4            60.4MB/s _ 0%    76.4MB/s _ 0%    +26.39%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/5-4            71.2MB/s _ 1%    84.7MB/s _ 0%    +18.90%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/6-4            77.7MB/s _ 0%    88.7MB/s _ 0%    +14.06%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/7-4            82.9MB/s _ 1%    90.7MB/s _ 1%     +9.42%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/8-4            74.6MB/s _ 0%   120.6MB/s _ 0%    +61.62%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/9-4            78.7MB/s _ 1%   123.9MB/s _ 1%    +57.42%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/10-4           82.1MB/s _ 0%   121.7MB/s _ 0%    +48.21%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/11-4           83.7MB/s _ 5%   122.0MB/s _ 0%    +45.79%  (p=0.003 n=3+3)
MemmoveUnalignedSrc/12-4           88.6MB/s _ 0%   160.8MB/s _ 0%    +81.56%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/13-4           91.0MB/s _ 0%   155.0MB/s _ 0%    +70.29%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/14-4           92.0MB/s _ 2%   151.0MB/s _ 0%    +64.09%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/15-4           12.6MB/s _ 0%   146.6MB/s _ 0%  +1063.32%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/16-4           13.3MB/s _ 0%   188.8MB/s _ 2%  +1319.02%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/32-4           9.44MB/s _ 0%  254.24MB/s _ 1%  +2594.21%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/64-4           8.27MB/s _ 0%  302.33MB/s _ 2%  +3555.78%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/128-4          7.73MB/s _ 3%  338.82MB/s _ 0%  +4281.29%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/256-4          7.58MB/s _ 0%  362.19MB/s _ 0%  +4678.23%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/512-4          7.44MB/s _ 1%  374.49MB/s _ 0%  +4933.51%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/1024-4         7.30MB/s _ 2%  379.74MB/s _ 0%  +5099.54%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/2048-4         7.34MB/s _ 2%  385.50MB/s _ 0%  +5154.38%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/4096-4         7.35MB/s _ 1%  383.64MB/s _ 0%  +5119.59%  (p=0.000 n=3+3)
MemmoveUnalignedSrcOverlap/32-4    7.22MB/s _ 0%  254.94MB/s _ 0%  +3432.66%  (p=0.000 n=3+3)
MemmoveUnalignedSrcOverlap/64-4    7.29MB/s _ 1%  296.99MB/s _ 5%  +3973.89%  (p=0.001 n=3+3)
MemmoveUnalignedSrcOverlap/128-4   7.32MB/s _ 1%  336.73MB/s _ 1%  +4500.09%  (p=0.000 n=3+3)
MemmoveUnalignedSrcOverlap/256-4   7.30MB/s _ 1%  361.41MB/s _ 0%  +4850.82%  (p=0.000 n=3+3)
MemmoveUnalignedSrcOverlap/512-4   7.34MB/s _ 0%  374.92MB/s _ 0%  +5007.90%  (p=0.000 n=3+3)
MemmoveUnalignedSrcOverlap/1024-4  7.34MB/s _ 0%  380.15MB/s _ 0%  +5079.16%  (p=0.000 n=3+3)
MemmoveUnalignedSrcOverlap/2048-4  7.36MB/s _ 0%  383.78MB/s _ 0%  +5116.76%  (p=0.000 n=3+3)
MemmoveUnalignedSrcOverlap/4096-4  7.35MB/s _ 0%  386.32MB/s _ 0%  +5156.05%  (p=0.000 n=3+3)

Change-Id: Ibc13230af7b1e205ed95a6470e2cf64ff4251405
Reviewed-on: https://go-review.googlesource.com/c/go/+/426256
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Meng Zhuo <mzh@golangcn.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
2022-11-18 15:33:16 +00:00
Michael Knyszek e4435cb844 runtime: add page tracer
This change adds a new GODEBUG flag called pagetrace that writes a
low-overhead trace of how pages of memory are managed by the Go runtime.

The page tracer is kept behind a GOEXPERIMENT flag due to a potential
security risk for setuid binaries.

Change-Id: I6f4a2447d02693c25214400846a5d2832ad6e5c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/444157
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-18 03:45:30 +00:00
Keith Randall d6171c9be2 runtime: fix conflict between lfstack and checkptr
lfstack does very unsafe things. In particular, it will not
work with nodes that live on the heap. In normal use by the runtime,
that is the case (it is only used for gc work bufs). But the lfstack
test does use heap objects. It goes through some hoops to prevent
premature deallocation, but those hoops are not enough to convince
-d=checkptr that everything is ok.

Instead, allocate the test objects outside the heap, like the runtime
does for all of its lfstack usage. Remove the lifetime workaround
from the test.

Reported in https://groups.google.com/g/golang-nuts/c/psjrUV2ZKyI

Change-Id: If611105eab6c823a4d6c105938ce145ed731781d
Reviewed-on: https://go-review.googlesource.com/c/go/+/448899
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
2022-11-17 23:12:04 +00:00
Russ Cox 1f4394a0c9 runtime: work around Apple libc bugs to make exec stop hanging
For a while now, we've had intermittent reports about problems with
os/exec on macOS, but no clear way to reproduce them. Recent changes
in the os/exec package test seem to have aligned the stars just right,
at least on my two x86 and ARM MacBook Pro laptops, to make the
package test hang with roughly 50% probability. When it does hang, the
stacks I see in the hung process match the ones reported for the
Go-based hangs in #33565. (They do not match the ones reported in the
so-called C reproducer in that issue, but I think that reproducer is
actually reproducing a different race, between fork and exit.)

The stacks obtained from the hung child processes are in
libSystem_atfork_child, which is supposed to reinitialize various
parts of the C library in the new process.

One common stack dies in _notify_fork_child calling _notify_globals
(inlined) calling _os_alloc_once, because _os_alloc_once detects that
the once lock is held by the parent process and then calls
_os_once_gate_corruption_abort. The allocation is setting up the
globals for the notification subsystem. See the source code at [1].
To work around this, we can allocate the globals earlier in the Go
program's lifetime, before any execs are involved, by calling any
notify routine that is exported, calls _notify_globals, and doesn't do
anything too expensive otherwise. notify_is_valid_token(0) fits the bill.

The other common stack dies in xpc_atfork_child calling
_objc_msgSend_uncached which ends up in
WAITING_FOR_ANOTHER_THREAD_TO_FINISH_CALLING_+initialize. Of course,
whatever thread the child is waiting for is in the parent process and
is not going to finish anything in the child process. There is no
public source code for these routines, so it is unclear exactly what
the problem is. However, xpc_atfork_child turns out to be exported
(for use by libSystem_atfork_child, which is in a different library,
so xpc_atfork_child is unlikely to be unexported any time soon).
It also stands to reason that since xpc_atfork_child is called at the
start of any forked child process, it can't be too harmful to call at
the start of an ordinary Go process. And whatever caches it needs for
a non-deadlocking fast path during exec empirically do get initialized
by calling it at startup.

This CL introduces a function osinit_hack, called at osinit time,
which calls notify_is_valid_token(0) and xpc_atfork_child().
Doing so makes the os/exec test pass reliably on both my laptops -
I can run it successfully hundreds of times in a row when my previous
record was twice in a row.

Fixes #33565.
Fixes #56784.

[1] https://opensource.apple.com/source/Libnotify/Libnotify-241/notify_client.c.auto.html


Change-Id: I16a14a800893c40244678203532a3e8d6214b6bd
Reviewed-on: https://go-review.googlesource.com/c/go/+/451735
Run-TryBot: Russ Cox <rsc@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-17 21:15:36 +00:00
Russ Cox 5bf9aeba1c runtime/race: do not use cgo on macOS
The use of an empty import "C" to trigger cgo in runtime/race
serves two purposes:

1. Cause the runtime to use the C library to create system threads,
   because the race syso implementation expects things like
   thread-local storage to work correctly.

2. Derive the right set of //go:cgo_import_dynamic comments
   to pass to the Go linker, so that it doesn't diagnose them as
   undefined references.

On macOS, (1) is unnecessary because using the C library
(via DLL calls) is the only way the runtime ever creates threads.

We can accomplish (2) by writing those comments ourselves.

Having done that in this CL, cgo is no longer needed to run
the race detector on macOS, which means that having a
pre-compiled set of .a files is no longer necessary,
nor is having Xcode for use with cgo when rebuilding those .a files.

Change-Id: Iee24cc67900eb542141b32beaadafb2c94f5fe26
Reviewed-on: https://go-review.googlesource.com/c/go/+/451055
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-11-16 21:38:55 +00:00
Changkun Ou 80d487111b runtime: clarify finalizer semantics for tiny objects
This change clarifies that a finalizer is not guaranteed to run,
not only for zero bytes objects but also tiny objects (< 16bytes).

Fixes #46827

Change-Id: I193e77f6f024c79110604f86bcb1a28b16cf98ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/337391
Run-TryBot: Changkun Ou <mail@changkun.de>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-11-15 17:33:54 +00:00
Nick Ripley 30b1af00ff runtime/pprof: scale mutex profile samples when they are recorded
Samples in the mutex profile have their count and duration scaled
according to the probability they were sampled. This is done when the
profile is actually requested. The adjustment is done using to the
current configured sampling rate. However, if the sample rate is changed
after a specific sample is recorded, then the sample will be scaled
incorrectly. In particular, if the sampling rate is changed to 0, all of
the samples in the encoded profile will have 0 count and duration. This
means the profile will be "empty", even if it should have had samples.

This CL scales the samples in the profile when they are recorded, rather
than when the profile is requested. This matches what is currently done
for the block profile.

With this change, neither the block profile nor mutex profile are scaled
when they are encoded, so the logic for scaling the samples can be
removed.

Change-Id: If228cf39284385aa8fb9a2d62492d839e02f027f
Reviewed-on: https://go-review.googlesource.com/c/go/+/443056
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-11-15 17:33:07 +00:00
Cherry Mui febe7b8e2a runtime: make GC see object as allocated after it is initialized
When the GC is scanning some memory (possibly conservatively),
finding a pointer, while concurrently another goroutine is
allocating an object at the same address as the found pointer, the
GC may see the pointer before the object and/or the heap bits are
initialized. This may cause the GC to see bad pointers and
possibly crash.

To prevent this, we make it that the scanner can only see the
object as allocated after the object and the heap bits are
initialized. Currently the allocator uses freeindex to find the
next available slot, and that code is coupled with updating the
free index to a new slot past it. The scanner also uses the
freeindex to determine if an object is allocated. This is somewhat
racy. This CL makes the scanner use a different field, which is
only updated after the object initialization (and a memory
barrier).

Fixes #54596.

Change-Id: I2a57a226369926e7192c253dd0d21d3faf22297c
Reviewed-on: https://go-review.googlesource.com/c/go/+/449017
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-15 02:55:24 +00:00
Michael Knyszek 2b59307ac2 Revert "runtime: delay incrementing freeindex in malloc"
This reverts commit bed2b7cf41.

Reason for revert: I clicked submit by accident on the wrong CL.

Change-Id: Iddf128cb62f289d472510eb30466e515068271b2
Reviewed-on: https://go-review.googlesource.com/c/go/+/449501
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-11-14 22:09:16 +00:00
qmuntal da564d0006 runtime,cmd/internal/obj/x86: use TEB TLS slots on windows/amd64
This CL redesign how we get the TLS pointer on windows/amd64.

We were previously reading it from the [TEB] arbitrary data slot,
located at 0x28(GS), which can only hold 1 TLS pointer.

With this CL, we will read the TLS pointer from the TEB TLS slot array,
located at 0x1480(GS). The TLS slot array can hold multiple
TLS pointers, up to 64, so multiple Go runtimes running on the
same thread can coexists with different TLS.

Each new TLS slot has to be allocated via [TlsAlloc],
which returns the slot index. This index can then be used to get the
slot offset from GS with the following formula: 0x1480 + index*8

The slot index is fixed per Go runtime, so we can store it
in runtime.tls_g and use it latter on to read/update the TLS pointer.

Loading the TLS pointer requires the following asm instructions:

  MOVQ runtime.tls_g, AX
  MOVQ AX(GS), AX

Notice that this approach is also implemented on windows/arm64.

[TEB]: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block
[TlsAlloc]: https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsalloc

Updates #22192

Change-Id: Idea7119fd76a3cd083979a4d57ed64b552fa101b
Reviewed-on: https://go-review.googlesource.com/c/go/+/431775
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
2022-11-14 20:43:12 +00:00
Russ Cox ea4631cc0c internal/godebug: define more efficient API
We have been expanding our use of GODEBUG for compatibility,
and the current implementation forces a tradeoff between
freshness and efficiency. It parses the environment variable
in full each time it is called, which is expensive. But if clients
cache the result, they won't respond to run-time GODEBUG
changes, as happened with x509sha1 (#56436).

This CL changes the GODEBUG API to provide efficient,
up-to-date results. Instead of a single Get function,
New returns a *godebug.Setting that itself has a Get method.
Clients can save the result of New, which is no more expensive
than errors.New, in a global variable, and then call that
variable's Get method to get the value. Get costs only two
atomic loads in the case where the variable hasn't changed
since the last call.

Unfortunately, these changes do require importing sync
from godebug, which will mean that sync itself will never
be able to use a GODEBUG setting. That doesn't seem like
such a hardship. If it was really necessary, the runtime could
pass a setting to package sync itself at startup, with the
caveat that that setting, like the ones used by runtime itself,
would not respond to run-time GODEBUG changes.

Change-Id: I99a3acfa24fb2a692610af26a5d14bbc62c966ac
Reviewed-on: https://go-review.googlesource.com/c/go/+/449504
Run-TryBot: Russ Cox <rsc@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-14 15:19:57 +00:00
Ian Lance Taylor 122a22e0e9 internal/syscall/unix: use runtime.gostring for Gostring
Under the race detector, checkptr flags uses of unsafe.Slice that
result in slices that straddle multiple Go allocations.
Avoid that scenario by calling existing runtime code.

This fixes a failure on the darwin-.*-race builders introduced in
CL 446178.

Change-Id: I6e0fdb37e3c3f38d97939a8799bb4d10f519c5b9
Reviewed-on: https://go-review.googlesource.com/c/go/+/449936
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
2022-11-11 23:24:12 +00:00
cui fliter 5a3243e6b6 all: fix problematic comments
Change-Id: Ib6ea1bd04d9b06542ed2b0f453c718115417c62c
Reviewed-on: https://go-review.googlesource.com/c/go/+/449755
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2022-11-11 19:12:52 +00:00
Bryan C. Mills 46bed9d04c runtime/race: add missing copyright headers to syso import files
These were apparently missed in CL 424034.

Change-Id: I60fcdd8c16992177a23c0e701f4224b250cfabee
Reviewed-on: https://go-review.googlesource.com/c/go/+/449855
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Bryan Mills <bcmills@google.com>
2022-11-11 18:53:49 +00:00
Cherry Mui bed2b7cf41 runtime: delay incrementing freeindex in malloc
When the GC is scanning some memory (possibly conservatively),
finding a pointer, while concurrently another goroutine is
allocating an object at the same address as the found pointer, the
GC may see the pointer before the object and/or the heap bits are
initialized. This may cause the GC to see bad pointers and
possibly crash.

To prevent this, we make it that the scanner can only see the
object as allocated after the object and the heap bits are
initialized. As the scanner uses the freeindex to determine if an
object is allocated, we delay the increment of freeindex after the
initialization.

As currently in some code path finding the next free index and
updating the free index to a new slot past it is coupled, this
needs a small refactoring. In the new code mspan.nextFreeIndex
return the next free index but not update it (although allocCache
is updated). mallocgc will update it at a later time.

Fixes #54596.

Change-Id: I6dd5ccf743f2d2c46a1ed67c6a8237fe09a71260
Reviewed-on: https://go-review.googlesource.com/c/go/+/427619
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-11-11 18:33:54 +00:00
Ian Lance Taylor 14018c8bec runtime: retry thread creation on EAGAIN
This copies the logic we use in runtime/cgo, when calling pthread_create,
into runtime proper, when calling newosproc.

We only do this in newosproc, not newosproc0, because in newosproc0 we
need a nosplit function literal, and we need to pass arguments to it through
newosproc, which is a pain. Also newosproc0 is only called at process
startup, when thread creation is less likely to fail anyhow.

Fixes #49438

Change-Id: Ia26813952fdbae8aaad5904c9102269900a07ba9
Reviewed-on: https://go-review.googlesource.com/c/go/+/447175
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-11-10 20:44:45 +00:00
Ian Lance Taylor 79d9b395ad runtime: consolidate some low-level error reporting
Use a single writeErrStr function. Avoid using global variables.
Use a single version of some error messages rather than duplicating
the messages in OS-specific files.

Change-Id: If259fbe78faf797f0a21337d14472160ca03efa0
Reviewed-on: https://go-review.googlesource.com/c/go/+/447055
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-11-10 18:51:20 +00:00
Cherry Mui 7717ac151a runtime: make Malloc benchmarks actually benchmark malloc
The compiler is too clever so the allocations are currently
avoided. Rewrite to make them actually allocate.

Change-Id: I9542e1365120b2ace318360883b0b01ed5670da7
Reviewed-on: https://go-review.googlesource.com/c/go/+/449476
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-11-10 17:33:31 +00:00
Guoqi Chen 3a41094107 runtime: using wyrand for fastrand on linux/loong64
Benchmarks on linux/loong64:
name               old time/op    new time/op    delta
Fastrand           4.06ns ± 0%    3.60ns ± 0%    -11.29%  (p=0.000 n=32+27)
Fastrand64         7.20ns ± 0%    7.16ns ± 0%     -0.52%  (p=0.000 n=23+31)
FastrandHashiter   49.5ns ± 1%    48.9ns ± 2%     -1.24%  (p=0.000 n=32+32)
Fastrandn/2        4.45ns ± 0%    3.81ns ± 0%    -14.37%  (p=0.000 n=32+32)
Fastrandn/3        4.45ns ± 0%    3.81ns ± 0%    -14.32%  (p=0.000 n=32+32)
Fastrandn/4        4.45ns ± 0%    3.81ns ± 0%    -14.33%  (p=0.000 n=31+32)
Fastrandn/5        4.44ns ± 0%    3.81ns ± 0%    -14.26%  (p=0.000 n=31+30)

Change-Id: I0aba7d2331221426c44cc0d0dddecca3b585fda4
Reviewed-on: https://go-review.googlesource.com/c/go/+/446896
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2022-11-09 06:30:59 +00:00