This reverts CL 485500.
Reason for revert: This breaks internal tests at Google, see b/280861579 and b/280820455.
Change-Id: I426278d400f7611170918fc07c524cb059b9cc55
Reviewed-on: https://go-review.googlesource.com/c/go/+/492995
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Chressie Himpel <chressie@google.com>
The current mp.incgocallback() logic allows for trace events to be
recorded using frame pointer unwinding during cgocallbackg when they
shouldn't be. Specifically, mp.incgo will be false during the
reentersyscall call at the end. It's possible to crash with tracing
enabled because of this, if C code which uses the frame pointer register
for other purposes calls into Go. This can be seen, for example, by
forcing testprogcgo/trace_unix.c to write a garbage value to RBP prior
to calling into Go.
We can drop the mp.incgo check, and instead conservatively avoid doing
frame pointer unwinding if there is any C on the stack. This is the case
if mp.ncgo > 0, or if mp.isextra is true (meaning we're coming from a
thread created by C). Rename incgocallback to reflect that we're
checking if there's any C on the stack. We can also move the ncgo
increment in cgocall closer to where the transition to C happens, which
lets us use frame pointer unwinding for the entersyscall event during
the first Go-to-C call on a stack, when there isn't yet any C on the
stack.
Fixes#59830.
Change-Id: If178a705a9d38d0d2fb19589a9e669cd982d32cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/488755
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Nick Ripley <nick.ripley@datadoghq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This change addresses a `checkdead` panic caused by a race condition between
`sysmon->startm` and `checkdead` callers, due to prematurely releasing the
scheduler lock. The solution involves allowing a `startm` caller to acquire the
scheduler lock and call `startm` in this context. A new `lockheld` bool
argument is added to `startm`, which manages all lock and unlock calls within
the function. The`startIdle` function variable in `injectglist` is updated to
call `startm` with the lock held, ensuring proper lock handling in this
specific case. This refined lock handling resolves the observed race condition
issue.
Fixes#59600
Change-Id: I11663a15536c10c773fc2fde291d959099aa71be
Reviewed-on: https://go-review.googlesource.com/c/go/+/487316
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Pratt <mpratt@google.com>
This reapplies CL 481061, with the followup fixes in CL 482975, CL 485315, and
CL 485316 incorporated.
CL 481061, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 482975 is a followup fix to a C declaration in testprogcgo.
CL 485315 is a followup fix for x_cgo_getstackbound on Illumos.
CL 485316 is a followup cleanup for ppc64 assembly.
[Original CL 481061 description]
This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.
CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.
[Original CL 392854 description]
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
[CL 479915 description]
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.
[CL 485500 description]
CL 479915 passed the G to _cgo_getstackbound for direct updates to
gp.stack.lo. A G can be reused on a new thread after the previous thread
exited. This could trigger the C TSAN race detector because it couldn't
see the synchronization in Go (lockextra) preventing the same G from
being used on multiple threads at the same time.
We work around this by passing the address of a stack variable to
_cgo_getstackbound rather than the G. The stack is generally unique per
thread, so TSAN won't see the same address from multiple threads. Even
if stacks are reused across threads by pthread, C TSAN should see the
synchonization in the stack allocator.
A regression test is added to misc/cgo/testsanitizer.
Fixes#51676.
Fixes#59294.
Fixes#59678.
Change-Id: Ic62be31a06ee83568215e875a891df37084e08ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/485500
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Also preserve the PC/SP in reentersyscall when doing lock ranking.
The test is TestDestructorCallbackRace with the staticlockranking
experiment enabled.
For #59711
Change-Id: I87ac1d121ec0d399de369666834891ab9e7d11b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/487955
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
This change resolves an issue where checkdead could result in a double lock when shedtrace is enabled. This fix involves adding unlocks before all throws in the checkdead function to ensure the scheduler lock is properly released.
Fixes#59758
Change-Id: If3ddf9969f4582c3c88dee9b9ecc355a63958103
Reviewed-on: https://go-review.googlesource.com/c/go/+/487375
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
This reverts commit CL 486381.
Submitted out of order and breaks bootstrap.
Change-Id: Ia472111cb966e884a48f8ee3893b3bf4b4f4f875
Reviewed-on: https://go-review.googlesource.com/c/go/+/486915
Reviewed-by: David Chase <drchase@google.com>
TryBot-Bypass: Austin Clements <austin@google.com>
This reverts CL 481061.
Reason for revert: When built with C TSAN, x_cgo_getstackbound triggers
race detection on `g->stacklo` because the synchronization is in Go,
which isn't instrumented.
For #51676.
For #59294.
For #59678.
Change-Id: I38afcda9fcffd6537582a39a5214bc23dc147d47
Reviewed-on: https://go-review.googlesource.com/c/go/+/485275
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
(This is a retry of CL 462035 which was reverted at 474976.
The only change from that CL is the aix fix SRODATA->SNOPTRDATA
at inittask.go:141)
As described here:
https://github.com/golang/go/issues/31636#issuecomment-493271830
"Find the lexically earliest package that is not initialized yet,
but has had all its dependencies initialized, initialize that package,
and repeat."
Simplify the runtime a bit, by just computing the ordering required
in the linker and giving a list to the runtime.
Update #31636Fixes#57411
RELNOTE=yes
Change-Id: I28c09451d6aa677d7394c179d23c2c02c503fc56
Reviewed-on: https://go-review.googlesource.com/c/go/+/478916
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.
CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.
[Original CL 392854 description]
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
[CL 479915 description]
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.
Fixes#51676.
Fixes#59294.
Change-Id: I9bf1400106d5c08ce621d2ed1df3a2d9e3f55494
Reviewed-on: https://go-review.googlesource.com/c/go/+/481061
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: DeJiang Zhu (doujiang) <doujiang24@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This reverts CL 392854.
Reason for revert: caused #59294, which was derived from google
internal tests. The attempted fix of #59294 caused more breakage.
Change-Id: I5a061561ac2740856b7ecc09725ac28bd30f8bba
Reviewed-on: https://go-review.googlesource.com/c/go/+/481060
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This reverts CL 479915.
Reason for revert: breaks a lot google internal tests.
Change-Id: I13a9422e810af7ba58cbf4a7e6e55f4d8cc0ca51
Reviewed-on: https://go-review.googlesource.com/c/go/+/481055
Reviewed-by: Chressie Himpel <chressie@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.
For #59294.
Change-Id: Ie52a8f931e0648d8753e4c1dbe45468b8748b527
Reviewed-on: https://go-review.googlesource.com/c/go/+/479915
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Introduce a new m.incgocallback field that is true while C code calls
into Go code. Use it in the tracer in order to fallback to the default
unwinder instead of frame pointer unwinding for this scenario. The
existing fields (incgo, ncgo) were not sufficient to detect the case
where a thread created in C calls into Go code.
Motivation:
1. Take advantage of a cgo symbolizer, if registered, to unwind through
C stacks without frame pointers.
2. Reduce the chance of crashes. It seems unsafe to follow frame
pointers when there could be C code that was compiled without frame
pointers.
Removing the curgp.m.incgocallback check in traceStackID shows the
following minor differences between frame pointer unwinding and the
default unwinder when there is no cgo symbolizer involved.
trace_test.go:60: "goCalledFromCThread": got stack:
main.goCalledFromCThread
/src/runtime/testdata/testprogcgo/trace.go:58
_cgoexp_45c15a3efb3a_goCalledFromCThread
_cgo_gotypes.go:694
runtime.cgocallbackg1
/src/runtime/cgocall.go:318
runtime.cgocallbackg
/src/runtime/cgocall.go:236
runtime.cgocallback
/src/runtime/asm_amd64.s:998
crosscall2
/src/runtime/cgo/asm_amd64.s:30
want stack:
main.goCalledFromCThread
/src/runtime/testdata/testprogcgo/trace.go:58
_cgoexp_45c15a3efb3a_goCalledFromCThread
_cgo_gotypes.go:694
runtime.cgocallbackg1
/src/runtime/cgocall.go:318
runtime.cgocallbackg
/src/runtime/cgocall.go:236
runtime.cgocallback
/src/runtime/asm_amd64.s:998
trace_test.go:60: "goCalledFromC": got stack:
main.goCalledFromC
/src/runtime/testdata/testprogcgo/trace.go:51
_cgoexp_45c15a3efb3a_goCalledFromC
_cgo_gotypes.go:687
runtime.cgocallbackg1
/src/runtime/cgocall.go:318
runtime.cgocallbackg
/src/runtime/cgocall.go:236
runtime.cgocallback
/src/runtime/asm_amd64.s:998
crosscall2
/src/runtime/cgo/asm_amd64.s:30
runtime.asmcgocall
/src/runtime/asm_amd64.s:848
main._Cfunc_cCalledFromGo
_cgo_gotypes.go:263
main.goCalledFromGo
/src/runtime/testdata/testprogcgo/trace.go:46
main.Trace
/src/runtime/testdata/testprogcgo/trace.go:37
main.main
/src/runtime/testdata/testprogcgo/main.go:34
want stack:
main.goCalledFromC
/src/runtime/testdata/testprogcgo/trace.go:51
_cgoexp_45c15a3efb3a_goCalledFromC
_cgo_gotypes.go:687
runtime.cgocallbackg1
/src/runtime/cgocall.go:318
runtime.cgocallbackg
/src/runtime/cgocall.go:236
runtime.cgocallback
/src/runtime/asm_amd64.s:998
runtime.systemstack_switch
/src/runtime/asm_amd64.s:463
runtime.cgocall
/src/runtime/cgocall.go:168
main._Cfunc_cCalledFromGo
_cgo_gotypes.go:263
main.goCalledFromGo
/src/runtime/testdata/testprogcgo/trace.go:46
main.Trace
/src/runtime/testdata/testprogcgo/trace.go:37
main.main
/src/runtime/testdata/testprogcgo/main.go:34
For #16638
Change-Id: I95fa27a3170c5abd923afc6eadab4eae777ced31
Reviewed-on: https://go-review.googlesource.com/c/go/+/474916
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Fixes#51676
Change-Id: I380702fe2f9b6b401b2d6f04b0aba990f4b9ee6c
GitHub-Last-Rev: 93dc64ad98
GitHub-Pull-Request: golang/go#51679
Reviewed-on: https://go-review.googlesource.com/c/go/+/392854
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: thepudds <thepudds1460@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
This is relatively easy using the new traceback iterator.
Ancestor tracebacks are now limited to 50 frames. We could keep that
at 100, but the fact that it used 100 before seemed arbitrary and
unnecessary.
Fixes#7181
Updates #54466
Change-Id: If693045881d84848f17e568df275a5105b6f1cb0
Reviewed-on: https://go-review.googlesource.com/c/go/+/475960
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Currently, filling PC traceback buffers is one of the jobs of
gentraceback. This moves it into a new function, tracebackPCs, with a
simple API built around unwinder, and changes all callers to use this
new API.
Updates #54466.
Change-Id: Id2038bded81bf533a5a4e71178a7c014904d938c
Reviewed-on: https://go-review.googlesource.com/c/go/+/468300
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
This reverts commit ce2a609909.
aka CL 462035
Reason for revert: this CL is causing some problems in some internal Google programs.
Change-Id: I4476b8d8d2c3d7b5703d1d85c93baebb4b4e5d26
Reviewed-on: https://go-review.googlesource.com/c/go/+/474976
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
As described here:
https://github.com/golang/go/issues/31636#issuecomment-493271830
"Find the lexically earliest package that is not initialized yet,
but has had all its dependencies initialized, initialize that package,
and repeat."
Simplify the runtime a bit, by just computing the ordering required
in the linker and giving a list to the runtime.
Update #31636Fixes#57411
RELNOTE=yes
Change-Id: I1e4d3878ebe6e8953527aedb730824971d722cac
Reviewed-on: https://go-review.googlesource.com/c/go/+/462035
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
These directives affect the next declaration, so the existing form is
valid, but can be confusing because it is easy to miss. Move then
directly above the declaration for improved readability.
CL 69120 previously moved the Gosched nosplit away to hide it from
documentation. Since CL 224737, directives are automatically excluded
from documentation.
Change-Id: I8ebf2d47fbb5e77c6f40ed8afdf79eaa4f4e335e
Reviewed-on: https://go-review.googlesource.com/c/go/+/472957
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Move this knob from a binary-startup thing to a build-time thing.
This will enable followon optmizations to the write barrier.
Change-Id: Ic3323348621c76a7dc390c09ff55016b19c43018
Reviewed-on: https://go-review.googlesource.com/c/go/+/447778
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
- Fix typo in throw error message for arena.
- Correct typos in assembly and Go comments.
- Fix log message in TestTraceCPUProfile.
Change-Id: I874c9e8cd46394448b6717bc6021aa3ecf319d16
GitHub-Last-Rev: d27fad4d3c
GitHub-Pull-Request: golang/go#58375
Reviewed-on: https://go-review.googlesource.com/c/go/+/465975
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
In the profiler, when unwinding the stack, we have special
handling for VDSO calls. Currently, the special handling is only
used when the normal unwinding fails. If the signal lands in the
function that makes the VDSO call (e.g. nanotime1) and after the
stack switch, the normal unwinding doesn't fail but gets a stack
trace with exactly one frame (the nanotime1 frame). The stack
trace stops because of the stack switch. This 1-frame stack trace
is not as helpful. Instead, if vdsoSP is set, we know we are in
VDSO call or right before or after it, so use vdsoPC and vdsoSP
for unwinding. Do the same for libcall.
Also remove _TraceTrap for VDSO unwinding, as vdsoPC and vdsoSP
correspond to a call, not an interrupted instruction.
Fixes#56574.
Change-Id: I799aa7644d0c1e2715ab038a9eef49481dd3a7f5
Reviewed-on: https://go-review.googlesource.com/c/go/+/455166
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Change-Id: I69065f8adf101fdb28682c55997f503013a50e29
Reviewed-on: https://go-review.googlesource.com/c/go/+/449757
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joedian Reid <joedian@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
This change adds a new GODEBUG flag called pagetrace that writes a
low-overhead trace of how pages of memory are managed by the Go runtime.
The page tracer is kept behind a GOEXPERIMENT flag due to a potential
security risk for setuid binaries.
Change-Id: I6f4a2447d02693c25214400846a5d2832ad6e5c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/444157
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Use a single writeErrStr function. Avoid using global variables.
Use a single version of some error messages rather than duplicating
the messages in OS-specific files.
Change-Id: If259fbe78faf797f0a21337d14472160ca03efa0
Reviewed-on: https://go-review.googlesource.com/c/go/+/447055
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
runtime.bgsweep contains an infinite loop. With aggressive enough
inlining, it may not perform any CALLs on a typical iteration. If
the runtime trying to preempt this goroutine, the lack of CALLs may
prevent preemption for ever occurring.
bgsweep does happen to call goschedIfBusy. Add a preempt check there to
make sure we yield eventually.
For #55022.
Change-Id: If22eb86fd6a626094b3c56dc745c8e4243b0fb40
Reviewed-on: https://go-review.googlesource.com/c/go/+/447135
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Ms are allocated via standard heap allocation (`new(m)`), which means we
must keep them alive (i.e., reachable by the GC) until we are completely
done using them.
Ms are primarily reachable through runtime.allm. However, runtime.mexit
drops the M from allm fairly early, long before it is done using the M
structure. If that was the last reference to the M, it is now at risk of
being freed by the GC and used for some other allocation, leading to
memory corruption.
Ms with a Go-allocated stack coincidentally already keep a reference to
the M in sched.freem, so that the stack can be freed lazily. This
reference has the side effect of keeping this Ms reachable. However, Ms
with an OS stack skip this and are at risk of corruption.
Fix this lifetime by extending sched.freem use to all Ms, with the value
of mp.freeWait determining whether the stack needs to be freed or not.
Fixes#56243.
Change-Id: Ic0c01684775f5646970df507111c9abaac0ba52e
Reviewed-on: https://go-review.googlesource.com/c/go/+/443716
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Extra Ms may lead to the "no consistent ordering of events possible" error when parsing trace file with cgo enabled, since:
1. The gs in the extra Ms may be in `_Gdead` status while starting trace by invoking `runtime.StartTrace`,
2. and these gs will trigger `traceEvGoSysExit` events in `runtime.exitsyscall` when invoking go functions from c,
3. then, the events of those gs are under non-consistent ordering, due to missing the previous events.
Add two events, `traceEvGoCreate` and `traceEvGoInSyscall`, in `runtime.StartTrace`, will make the trace parser happy.
Fixes#29707
Change-Id: I2fd9d1713cda22f0ddb36efe1ab351f88da10881
GitHub-Last-Rev: 7bbfddb81b
GitHub-Pull-Request: golang/go#54974
Reviewed-on: https://go-review.googlesource.com/c/go/+/429858
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: xie cui <523516579@qq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Add a new API (not public/exported) for registering a function with
the runtime that should be called when program execution terminates,
to be used in the new code coverage re-implementation. The API looks
like
func addExitHook(f func(), runOnNonZeroExit bool)
The first argument is the function to be run, second argument controls
whether the function is invoked even if there is a call to os.Exit
with a non-zero status. Exit hooks are run in reverse order of
registration, e.g. the first hook to be registered will be the last to
run. Exit hook functions are not allowed to panic or to make calls to
os.Exit.
Updates #51430.
Change-Id: I906f8c5184b7c1666f05a62cfc7833bf1a4300c4
Reviewed-on: https://go-review.googlesource.com/c/go/+/354790
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Currently bgsweep attempts to be a low-priority background goroutine
that runs mainly when the application is mostly idle. To avoid
complicating the scheduler further, it achieves this with a simple
heuristic: call Gosched after each span swept. While this is somewhat
inefficient as there's scheduling overhead on each iteration, it's
mostly fine because it tends to just come out of idle time anyway. In a
busy system, the call to Gosched quickly puts bgsweep at the back of
scheduler queues.
However, what's problematic about this heuristic is the number of
tracing events it produces. Average span sweeping latencies have been
measured as low as 30 ns, so every 30 ns in the sweep phase, with
available idle time, there would be a few trace events emitted. This
could result in an overwhelming number, making traces much larger than
they need to be. It also pollutes other observability tools, like the
scheduling latencies runtime metric, because bgsweep stays runnable the
whole time.
This change fixes these problems with two modifications to the
heursitic:
1. Check if there are any idle Ps before yielding. If there are, don't
yield.
2. Sweep at least 10 spans before trying to yield.
(1) is doing most of the work here. This change assumes that the
presence of idle Ps means that there is available CPU time, so bgsweep
is already making use of idle time and there's no reason it should stop.
This will have the biggest impact on the aforementioned issues.
(2) is a mitigation for the case where GOMAXPROCS=1, because we won't
ever observe a zero idle P count. It does mean that bgsweep is a little
bit higher priority than before because it yields its time less often,
so it could interfere with goroutine scheduling latencies more. However,
by sweeping 10 spans before volunteering time, we directly reduce trace
event production by 90% in all cases. The impact on scheduling latencies
should be fairly minimal, as sweeping a span is already so fast, that
sweeping 10 is unlikely to make a dent in any meaningful end-to-end
latency. In fact, it may even improve application latencies overall by
freeing up spans and sweep work from goroutines allocating memory. It
may be worth considering pushing this number higher in the future.
Another reason to do (2) is to reduce contention on npidle, which will
be checked as part of (1), but this is a fairly minor concern. The main
reason is to capture the GOMAXPROCS=1 case.
Fixes#54767.
Change-Id: I4361400f17197b8ab84c01f56203f20575b29fc6
Reviewed-on: https://go-review.googlesource.com/c/go/+/429615
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
This change adds a metric to the runtime/metrics package which tracks
total mutex wait time for sync.Mutex and sync.RWMutex. The purpose of
this metric is to be able to quickly get an idea of the total mutex wait
time.
The implementation of this metric piggybacks off of the existing G
runnable tracking infrastructure, as well as the wait reason set on a G
when it goes into _Gwaiting.
Fixes#49881.
Change-Id: I4691abf64ac3574bec69b4d7d4428b1573130517
Reviewed-on: https://go-review.googlesource.com/c/go/+/427618
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Currently, wait reasons are set somewhat inconsistently. In a follow-up
CL, we're going to want to rely on the wait reason being there for
casgstatus, so the status quo isn't really going to work for that. Plus
this inconsistency means there are a whole bunch of cases where we could
be more specific about the G's status but aren't.
So, this change adds a new function, casGToWaiting which is like
casgstatus but also sets the wait reason. The goal is that by using this
API it'll be harder to forget to set a wait reason (or the lack thereof
will at least be explicit). This change then updates all casgstatus(gp,
..., _Gwaiting) calls to casGToWaiting(gp, ..., waitReasonX) instead.
For a number of these cases, we're missing a wait reason, and it
wouldn't hurt to add a wait reason for them, so this change also adds
those wait reasons.
For #49881.
Change-Id: Ia95e06ecb74ed17bb7bb94f1a362ebfe6bec1518
Reviewed-on: https://go-review.googlesource.com/c/go/+/427617
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Updates #54854
Change-Id: Ie18665e93e477b6f220acf4c6c070b2af4343064
Reviewed-on: https://go-review.googlesource.com/c/go/+/428157
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Use an atomic.Uint32 to represent the state of finalizer goroutine.
fingStatus will only be changed to fingWake in non fingWait state,
so it is safe to set fingRunningFinalizer status in runfinq.
name old time/op new time/op delta
Finalizer-8 592µs ± 4% 561µs ± 1% -5.22% (p=0.000 n=10+10)
FinalizerRun-8 694ns ± 6% 675ns ± 7% ~ (p=0.059 n=9+8)
Change-Id: I7e4da30cec98ce99f7d8cf4c97f933a8a2d1cae1
Reviewed-on: https://go-review.googlesource.com/c/go/+/400134
Reviewed-by: Joedian Reid <joedian@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Michael Pratt <mpratt@google.com>
Note that this changes some unsynchronized operations of g.atomicstatus to synchronized operations.
Updates #53821
Change-Id: If249d62420ea09fbec39b570942f96c63669c333
Reviewed-on: https://go-review.googlesource.com/c/go/+/425363
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Note that this changes the non-atomic operations in p.destroy() to atomic operations.
For #53821
Change-Id: I7bba77c9a2287ba697c87cce2c79293e4d1b3334
Reviewed-on: https://go-review.googlesource.com/c/go/+/425774
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Note that this changes a few unsynchronized operations of forcegcstate.idle to synchronized operations.
Updates #53821
Change-Id: I041654cc84a188fad45e2df7abce3a434f9a1f15
Reviewed-on: https://go-review.googlesource.com/c/go/+/425361
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
So they match with when/nextwhen fields of timer struct.
Updates #53821
Change-Id: Iad0cceb129796745774facfbbfe5756df3a320b4
Reviewed-on: https://go-review.googlesource.com/c/go/+/423117
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>