This change mostly implements the design described in #60773 and
includes a new scalable parser for the new trace format, available in
internal/trace/v2. I'll leave this commit message short because this is
clearly an enormous CL with a lot of detail.
This change does not hook up the new tracer into cmd/trace yet. A
follow-up CL will handle that.
For #60773.
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-linux-amd64-longtest-race
Change-Id: I5d2aca2cc07580ed3c76a9813ac48ec96b157de0
Reviewed-on: https://go-review.googlesource.com/c/go/+/494187
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Currently the user arena code writes heap bits to the (*mspan).heapBits
space with the platform-specific byte ordering (the heap bits are
written and managed as uintptrs). However, the compiler always emits GC
metadata for types in little endian.
Because the scanning part of the code that loads through the type
pointer in the allocation header expects little endian ordering, we end
up with the wrong byte ordering in GC when trying to scan arena memory.
Fix this by writing out the user arena heap bits in little endian on big
endian platforms.
This means that the space returned by (*mspan).heapBits has a different
meaning for user arenas and small object spans, which is a little odd,
so I documented it. To reduce the chance of misuse of the writeHeapBits
API, which now writes out heap bits in a different ordering than
writeSmallHeapBits on big endian platforms, this change also renames
writeHeapBits to writeUserArenaHeapBits.
Much of this can be avoided in the future if the compiler were to write
out the pointer/scalar bits as an array of uintptr values instead of
plain bytes. That's too big of a change for right now though.
This change is a no-op on little endian platforms. I confirmed it by
checking for any assembly code differences in the runtime test binary.
There were none. With this change, the arena tests pass on ppc64.
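For illustration, a minimal sketch of the byte-order-independent
encoding (putHeapBitsLE is a hypothetical helper, not the runtime's
writeUserArenaHeapBits):

    package main

    import (
        "encoding/binary"
        "fmt"
    )

    // putHeapBitsLE stores one word of pointer/scalar bits in
    // little-endian byte order regardless of the host's endianness,
    // matching the layout the compiler emits for GC metadata.
    func putHeapBitsLE(dst []byte, bits uint64) {
        binary.LittleEndian.PutUint64(dst, bits)
    }

    func main() {
        buf := make([]byte, 8)
        putHeapBitsLE(buf, 0b1011) // pointer bits for the first words
        fmt.Printf("% x\n", buf)   // identical on little- and big-endian hosts
    }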
Fixes #64048.
Change-Id: If077d003872fcccf5a154ff5d8441a58582061bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/541315
Run-TryBot: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Currently tickspersecond forces a 100 millisecond sleep the first time
it's called. This isn't great for profiling short-lived programs, since
both CPU profiling and block profiling might call into it.
100 milliseconds is a long time, but it's chosen to try to capture a
decent estimate of the conversion on platforms with coarse-granularity
clocks. If the granularity is 15 ms, it'll only be 15% off at worst.
Let's try a different strategy. First, let's require 5 milliseconds of
time to have elapsed at a minimum. This should be plenty on platforms
with nanosecond time granularity from the system clock, provided the
caller of tickspersecond intends to use it for calculating durations,
not timestamps. Next, grab a timestamp as close to process start as
possible, so that we can cover some of those 5 milliseconds during
runtime start.
Finally, this function is only ever called from normal goroutine
contexts. Let's do a regular goroutine sleep instead of a thread-level
sleep under a runtime lock, which has all sorts of nasty effects on
preemption.
While we're here, let's also rename tickspersecond to ticksPerSecond.
Also, let's write down some explicit rules of thumb on when to use this
function. Clocks are hard, and using this for timestamp conversion is
likely to make lining up those timestamps with other clocks on the
system difficult if not impossible.
Note that while this improves ticksPerSecond on platforms with good
clocks, we still end up with a pretty coarse sleep on platforms with
coarse clocks, and a pretty coarse result. On these platforms, keep the
minimum required elapsed time at 100 ms. There's not much we can do
about these platforms except spin and try to catch the clock boundary,
but at 10+ ms of granularity, that might be a lot of spinning.
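A rough sketch of the strategy, with stand-in clocks (the runtime uses
cputicks and nanotime, not time.Now; names here are illustrative):

    package main

    import (
        "fmt"
        "time"
    )

    func readTicks() int64 { return time.Now().UnixNano() } // stand-in for cputicks

    // Grab a ticks/nanos pair as early as possible (here: package
    // initialization), so part of the minimum window elapses for free
    // during runtime start.
    var startTicks = readTicks()
    var startNanos = time.Now().UnixNano()

    const minElapsed = 5 * time.Millisecond

    // ticksPerSecond sleeps with a regular, preemptible goroutine sleep
    // (not a thread-level sleep under a runtime lock) until at least
    // minElapsed has passed, then derives the conversion from the deltas.
    func ticksPerSecond() int64 {
        for {
            if d := time.Now().UnixNano() - startNanos; d >= int64(minElapsed) {
                return (readTicks() - startTicks) * int64(time.Second) / d
            }
            time.Sleep(minElapsed)
        }
    }

    func main() {
        fmt.Println(ticksPerSecond()) // ~1e9 with these stand-in clocks
    }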
Fixes #63103.
Fixes #63078.
Change-Id: Ic32a4ba70a03bdf5c13cb80c2669c4064aa4cca2
Reviewed-on: https://go-review.googlesource.com/c/go/+/538898
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Currently dedicated GC mark workers really try to avoid getting
preempted. The one exception is for a pending STW, indicated by
sched.gcwaiting. This is currently fine because other kinds of
preemptions don't matter to the mark workers: they're intentionally
bound to their P.
With the new execution tracer we're going to want to use forEachP to get
the attention of all Ps. We may want to do this during a GC cycle.
forEachP doesn't set sched.gcwaiting, so it may end up waiting the full
GC mark phase, burning a thread and a P in the meantime. This can mean
basically seconds of waiting and trying to preempt GC mark workers.
This change makes all mark workers yield if (*p).runSafePointFn != 0 so
that the workers actually yield somewhat promptly in response to a
forEachP attempt.
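The shape of the new yield condition, as a stand-alone sketch (atomics
stand in for sched.gcwaiting and (*p).runSafePointFn):

    package main

    import (
        "fmt"
        "runtime"
        "sync/atomic"
    )

    var gcWaiting, safePointPending atomic.Bool

    // markWorker yields not only for a pending STW (gcWaiting) but also
    // when a per-P safe-point function is pending, so a forEachP attempt
    // gets the P promptly instead of waiting out the mark phase.
    func markWorker(done chan<- struct{}) {
        for {
            if gcWaiting.Load() || safePointPending.Load() {
                done <- struct{}{}
                return // yield the P
            }
            runtime.Gosched() // stand-in for one unit of mark work
        }
    }

    func main() {
        done := make(chan struct{})
        go markWorker(done)
        safePointPending.Store(true) // a forEachP attempt
        <-done
        fmt.Println("worker yielded for the safe-point function")
    }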
Change-Id: I7430baf326886b9f7a868704482a224dae7c9bba
Reviewed-on: https://go-review.googlesource.com/c/go/+/537235
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Currently any thread that tries to get the attention of all Ps (e.g.
stopTheWorldWithSema and forEachP) ends up in a non-preemptible state
waiting to preempt another thread. Thing is, that other thread might
also be in a non-preemptible state, trying to preempt the first thread,
resulting in a deadlock.
This is a general problem, but in practice it only boils down to one
specific scenario: a thread in GC is blocked trying to preempt a
goroutine to scan its stack while that goroutine is blocked in a
non-preemptible state to get the attention of all Ps.
There's currently a hack in a few places in the runtime to move the
calling goroutine into _Gwaiting before it goes into a non-preemptible
state to preempt other threads. This lets the GC scan its stack because
the goroutine is trivially preemptible. The only restriction is that
forEachP and stopTheWorldWithSema absolutely cannot reference the
calling goroutine's stack. This is generally not necessary, so things
are good.
Anyway, to avoid exposing the details of this hack, this change creates
a safer wrapper around forEachP (and then renames it to forEachP and the
existing one to forEachPInternal) that performs the goroutine status
change, just like stopTheWorld does. We're going to need to use this
hack with forEachP in the new tracer, so this avoids propagating the
hack further and leaves it as an implementation detail.
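A stubbed sketch of the wrapper's shape (the real code flips the
goroutine between _Grunning and _Gwaiting; these stubs only trace the
transitions):

    package main

    import "fmt"

    type p struct{ id int }

    func casGToWaiting() { fmt.Println("g -> _Gwaiting: stack is scannable") }
    func casGToRunning() { fmt.Println("g -> _Grunning") }

    // forEachPInternal is the pre-existing primitive. It must not read
    // the calling goroutine's stack while it preempts the other Ps.
    func forEachPInternal(fn func(*p)) {
        for _, pp := range []*p{{0}, {1}} {
            fn(pp)
        }
    }

    // forEachP is the safe wrapper: it parks the caller's status around
    // the non-preemptible section, mirroring what stopTheWorld does.
    func forEachP(fn func(*p)) {
        casGToWaiting()
        defer casGToRunning()
        forEachPInternal(fn)
    }

    func main() {
        forEachP(func(pp *p) { fmt.Println("ran safe-point fn on P", pp.id) })
    }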
Change-Id: I51f02e8d8e0a3172334d23787e31abefb8a129ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/533455
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Currently the execution tracer synchronizes with itself using very
heavyweight operations. As a result, it's totally fine for most of the
tracer code to look like:
    if traceEnabled() {
        traceXXX(...)
    }
However, if we want to make that synchronization more lightweight (as
issue #60773 proposes), then this is insufficient. In particular, we
need to make sure the tracer can't observe an inconsistency between a
g's atomicstatus and the event that would be emitted for a particular
g transition. This means making the g status change appear to happen
atomically with the corresponding trace event being written out from the
perspective of the tracer.
This requires a change in API to something more like a lock. While we're
here, we might as well make sure that trace events can *only* be emitted
while this lock is held. This change introduces such an API:
traceAcquire, which returns a value that can emit events, and
traceRelease, which requires the value that was returned by
traceAcquire. In practice, this won't be a real lock, it'll be more like
a seqlock.
For the current tracer, this API is completely overkill and the value
returned by traceAcquire basically just checks trace.enabled. But it's
necessary for the tracer described in #60773 and we can implement that
more cleanly if we do this refactoring now instead of later.
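A minimal sketch of the API shape (the names traceAcquire/traceRelease
are from this change; the bodies are stand-ins, not runtime code):

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    var traceOn atomic.Bool

    // traceLocker is the value returned by traceAcquire; it is the only
    // way to emit events.
    type traceLocker struct{ ok bool }

    // traceAcquire is little more than an enabled check today; a future
    // tracer can turn it into a seqlock-style critical section.
    func traceAcquire() traceLocker { return traceLocker{ok: traceOn.Load()} }

    // GoSched stands in for an event-emitting method (traceXXX above).
    func (tl traceLocker) GoSched() { fmt.Println("emit GoSched event") }

    func traceRelease(tl traceLocker) {} // a real seqlock would publish here

    func main() {
        traceOn.Store(true)
        if tl := traceAcquire(); tl.ok {
            tl.GoSched() // status change + event look atomic to the tracer
            traceRelease(tl)
        }
    }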
For #60773.
Change-Id: Ibb9ff5958376339fafc2b5180aef65cf2ba18646
Reviewed-on: https://go-review.googlesource.com/c/go/+/515635
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
This change replaces the 1-bit-per-word heap bitmap for most size
classes with allocation headers for objects that contain pointers. The
header consists of a single pointer to a type. All allocations with
headers are treated as implicitly containing one or more instances of
the type in the header.
As the name implies, headers are usually stored as the first word of an
object. There are two additional exceptions to where headers are stored
and how they're used.
Objects smaller than 512 bytes do not have headers. Instead, a heap
bitmap is reserved at the end of spans for objects of this size. A full
word of overhead is too much for these small objects. The bitmap is of
the same format as the old bitmap, minus the noMorePtrs bits, which are
unnecessary. All the objects <512 bytes have a bitmap less than a
pointer-word in size, and that was the granularity at which noMorePtrs
could stop scanning early anyway.
Objects that are larger than 32 KiB (which have their own span) have
their headers stored directly in the span, to allow power-of-two-sized
allocations to not spill over into an extra page.
The full implementation is behind GOEXPERIMENT=allocheaders.
The purpose of this change is performance. First and foremost, with
headers we no longer have to unroll pointer/scalar data at allocation
time for most size classes. Small size classes still need some
unrolling, but their bitmaps are small so we can optimize that case
fairly well. Larger objects effectively have their pointer/scalar data
unrolled on-demand from type data, which is much more compactly
represented and results in less TLB pressure. Furthermore, since the
headers are usually right next to the object and where we're about to
start scanning, we get an additional temporal locality benefit in the
data cache when looking up type metadata. The pointer/scalar data is
now effectively unrolled on-demand, but it's also simpler to unroll than
before; that unrolled data is never written anywhere, and for arrays we
get the benefit of retreading the same data per element, as opposed to
looking it up from scratch for each pointer-word of bitmap. Lastly,
because we no longer have a heap bitmap that spans the entire heap,
there's a flat 1.5% memory use reduction. This is balanced slightly by
some objects possibly being bumped up a size class, but most objects are
not tightly optimized to size class sizes so there's some memory to
spare, making the header basically free in those cases.
See the follow-up CL which turns on this experiment by default for
benchmark results. (CL 538217.)
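For intuition, a hedged sketch of which representation applies at each
size (thresholds from the text above; the function is illustrative,
not runtime code):

    package main

    import "fmt"

    func headerPlacement(size uintptr, hasPointers bool) string {
        switch {
        case !hasPointers:
            return "no GC metadata (noscan)"
        case size < 512:
            return "heap bitmap at the end of the span"
        case size <= 32<<10:
            return "type pointer header in the object's first word"
        default:
            return "header stored directly in the (large object) span"
        }
    }

    func main() {
        for _, sz := range []uintptr{64, 4096, 64 << 10} {
            fmt.Println(sz, "=>", headerPlacement(sz, true))
        }
    }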
Change-Id: I4c9034ee200650d06d8bdecd579d5f7c1bbf1fc5
Reviewed-on: https://go-review.googlesource.com/c/go/+/437955
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
This change adds the allocation headers GOEXPERIMENT, which is a no-op.
It forks two runtime files temporarily to make the GOEXPERIMENT easier
to maintain. The forked files are mbitmap.go and msize.go.
Change-Id: I60202c00e614e4517de7dd000029cf80dd0121ef
Reviewed-on: https://go-review.googlesource.com/c/go/+/537980
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Temporarily skip to make the builder happy. Will work on a fix.
Updates #63938.
Change-Id: Ic9db771342108430c29774b2c3e50043791189a6
Reviewed-on: https://go-review.googlesource.com/c/go/+/541195
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Use ADD with constants instead of ADDI. Also use SUB with a positive constant
rather than ADD with a negative constant. The resulting assembly is still the
same.
Change-Id: Ife10bf5ae4122e525f0e7d41b5e463e748236a9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/540136
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: M Zhuo <mzh@golangcn.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
With the introduction of runtime.Pinner, returning a pointer to a pinned
struct that then points to an unpinned Go pointer is correctly caught.
However, the error message remained "cgo result has Go pointer";
update it to acknowledge that Go pointers to pinned memory are
allowed.
This also updates the comments for cgoCheckArg and cgoCheckResult
to similarly clarify.
Updates #46787
Change-Id: I147bb09e87dfb70a24d6d43e4cf84e8bcc2aff48
GitHub-Last-Rev: 706facb9f2
GitHub-Pull-Request: golang/go#62606
Reviewed-on: https://go-review.googlesource.com/c/go/+/527702
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
This CL continues adding support for And/Or primitives to
more architectures, this time for arm/arm64.
For #61395
Change-Id: Icc44ea65884c825698a345299d8f9511392aceb6
GitHub-Last-Rev: 8267665a03
GitHub-Pull-Request: golang/go#62674
Reviewed-on: https://go-review.googlesource.com/c/go/+/528797
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Mauri de Souza Meneguzzo <mauri870@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
The writeBarrier "needed" struct member has the exact same
value as "enabled", and the two are used interchangeably.
I'm not sure if we plan to make a distinction between the
two at some point, but today they are effectively the same,
so dedup it and keep only "enabled".
Change-Id: I65e596f174e1e820dc471a45ff70c0ef4efbc386
GitHub-Last-Rev: f8c805a916
GitHub-Pull-Request: golang/go#63814
Reviewed-on: https://go-review.googlesource.com/c/go/+/538495
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Mauri de Souza Meneguzzo <mauri870@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This CL adds the atomic primitives for the And/Or operators on x86-64.
It also includes missing benchmarks for the ops.
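The semantics of the new primitives, expressed as the CAS loop they
replace (a sketch using the long-standing sync/atomic API rather than
the new intrinsics):

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    // orUint32 atomically performs *addr |= mask and returns the old
    // value, which is the contract of the new Or primitives.
    func orUint32(addr *uint32, mask uint32) (old uint32) {
        for {
            old = atomic.LoadUint32(addr)
            if atomic.CompareAndSwapUint32(addr, old, old|mask) {
                return old
            }
        }
    }

    func main() {
        var flags uint32
        old := orUint32(&flags, 0b0100)
        fmt.Printf("old=%04b new=%04b\n", old, flags)
    }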
For #61395
Change-Id: I23ef5192866d21fc3a479d0159edeafc3aeb5c47
GitHub-Last-Rev: df800be192
GitHub-Pull-Request: golang/go#62621
Reviewed-on: https://go-review.googlesource.com/c/go/+/528315
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Previously, badmorestackg0 was never called since it was behind a g ==
R1 check, R1 holding g.m. This is clearly wrong, since we want to check
if g == g0. Fixed by using R2, which holds the value of g0.
Fixes #63953
Change-Id: I1e2a1c3be7ad9e7ae8dbf706ef6783e664a44764
GitHub-Last-Rev: b3e92cf286
GitHub-Pull-Request: golang/go#63954
Reviewed-on: https://go-review.googlesource.com/c/go/+/539840
Reviewed-by: Austin Clements <austin@google.com>
Auto-Submit: Austin Clements <austin@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
ReadMetricsSlow was updated to call the core of readMetrics on the
systemstack to prevent issues with stat skew if the stack gets moved
between readmemstats_m and readMetrics. However, readMetrics calls into
the map implementation, which has race instrumentation. The system stack
typically has no racectx set, resulting in crashes.
Donate racectx to g0 like the tracer does, so that these accesses don't
crash.
For #60607.
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-race
Change-Id: Ic0251af2d9b60361f071fe97084508223109480c
Reviewed-on: https://go-review.googlesource.com/c/go/+/539695
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Currently it's possible (and even probable, with mayMoreStackMove mode)
for a stack allocation to occur between readmemstats_m and readMetrics
in ReadMetricsSlow. This can cause tests to fail by producing metrics
that are inconsistent between the two sources.
Fix this by breaking out the critical section of readMetrics and calling
that from ReadMetricsSlow on the systemstack. Our main constraint in
calling readMetrics on the system stack is the fact that we can't
acquire the metrics semaphore from the system stack. But if we break out
the critical section, then we can acquire that semaphore before we go on
the system stack.
While we're here, add another readMetrics call before readmemstats_m.
Since we're being paranoid about ways that metrics could get skewed
between the two calls, let's eliminate all uncertainty. It's possible
for readMetrics to allocate new memory, for example for histograms, and
fail while it's reading metrics. I believe we're just getting lucky
today with the order in which the metrics are produced. Another call to
readMetrics will preallocate this data in the samples slice. One nice
thing about this second read is that now we effectively have a way to
check if readMetrics really will allocate if called a second time on the
same samples slice.
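The resulting ordering, as a stubbed sketch (sync.Mutex and a trivial
systemstack stand in for the runtime's semaphore and stack switch):

    package main

    import (
        "fmt"
        "sync"
    )

    var metricsSema sync.Mutex

    func systemstack(fn func()) { fn() }

    // ReadMetricsSlow takes the semaphore first, since it cannot be
    // acquired from the system stack, then runs both snapshots
    // back-to-back on the system stack so no stack move can skew them.
    func ReadMetricsSlow() {
        metricsSema.Lock()
        defer metricsSema.Unlock()
        systemstack(func() {
            fmt.Println("readmemstats_m")    // memstats snapshot
            fmt.Println("readMetricsLocked") // metrics critical section
        })
    }

    func main() { ReadMetricsSlow() }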
Fixes #60607.
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Change-Id: If6ce666530903239ef9f02dbbc3f1cb6be71e425
Reviewed-on: https://go-review.googlesource.com/c/go/+/539117
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
This was converted to a compiler intrinsic and no longer needs to exist
in assembly.
Change-Id: I7495c435d4642e0e71d8f7677d70af3a3ca2a6ba
Reviewed-on: https://go-review.googlesource.com/c/go/+/539195
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
This will make the upcoming GOEXPERIMENT easier to implement, since this
function relies on a lot of heap bitmap internals.
Change-Id: I2ab76e928e7bfd383dcdb5bfe72c9b23c2002a4e
Reviewed-on: https://go-review.googlesource.com/c/go/+/537979
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
We're going to want to fork this data in the near future for a
GOEXPERIMENT, so break it out now.
Change-Id: Ia7ded850bb693c443fe439c6b7279dcac638512c
Reviewed-on: https://go-review.googlesource.com/c/go/+/537978
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Go 1.21 and earlier do not understand this line, causing
"go mod vendor" of //go:build go1.22-tagged code that
uses this feature to fail.
The solution is to include the go/build change to skip over
the line in Go 1.22 (making "go mod vendor" from Go 1.22 onward
work with this change) and then wait to deploy the cgo change
until Go 1.23, at which point Go 1.21 and earlier will be unsupported.
For #56378.
Fixes #63293.
Change-Id: Ifa08b134eac5a6aa15d67dad0851f00e15e1e58b
Reviewed-on: https://go-review.googlesource.com/c/go/+/539235
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Change-Id: Ib89a71e20f9c6b86c97814c75cb427e9bd7075e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/538735
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
OpenBSD 6.3 is more than five years old and has not been supported for
the last four years (only 7.3 and 7.4 are currently supported). As such,
remove special handling of MAP_STACK for 6.3 and earlier.
Change-Id: I1086c910bbcade7fb3938bb1226813212794b587
Reviewed-on: https://go-review.googlesource.com/c/go/+/538458
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Aaron Bieber <aaron@bolddaemon.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
Allow up to 10 standard deviations from the mean, instead of the
~5 that the current test allows.
10 standard deviations allows up to a 4500/5500 split.
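A quick check of that arithmetic, assuming the test's 10000 samples (a
reconstruction, not the test's code):

    package main

    import (
        "fmt"
        "math"
    )

    func main() {
        // One bucket of n fair coin flips has mean n*p and standard
        // deviation sqrt(n*p*(1-p)).
        n, p := 10000.0, 0.5
        sd := math.Sqrt(n * p * (1 - p))        // 50
        fmt.Println(n*p-10*sd, "to", n*p+10*sd) // 4500 to 5500
    }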
Fixes #52465
Change-Id: Icb21c1d31fafbcf4723b75435ba5e98863e812c4
Reviewed-on: https://go-review.googlesource.com/c/go/+/538815
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Make the choice of using these instructions dynamic (triggered by cpu
feature detection) rather than static (triggered by the GOARM setting).
If GOARM>=7, we know we have them.
For GOARM=5/6, dynamically dispatch based on auxv information.
Updates #17082
Updates #61588
Change-Id: I8a50481d942f62cf36348998a99225d0d242f8af
Reviewed-on: https://go-review.googlesource.com/c/go/+/525637
TryBot-Result: Gopher Robot <gobot@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
CL 495415 weakened the avalanche test, widening the allowed range to
43%-57%. Since then, we have seen only one failure, at 58%, on the
linux-386-longtest builder, so let's give the test a bit more wiggle
room: 40% to 59%.
Fixes #60170
Change-Id: I9528ebc8601975b733c3d9fd464ce41429654273
Reviewed-on: https://go-review.googlesource.com/c/go/+/538655
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
On some platforms (notably OpenBSD), stacks must be specifically allocated
and marked as being stack memory. Allocate the crash stack using stackalloc,
which ensures these requirements are met, rather than using a global Go
variable.
Fixes #63794
Change-Id: I6513575797dd69ff0a36f3bfd4e5fc3bd95cbf50
Reviewed-on: https://go-review.googlesource.com/c/go/+/538457
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This is the beginning of the math/rand/v2 package from proposal #61716.
Start by copying old API. This CL copies math/rand/* to math/rand/v2
and updates references to math/rand to add v2 throughout.
Later CLs will make the v2 changes.
For #61716.
Change-Id: I1624ccffae3dfa442d4ba2461942decbd076e11b
Reviewed-on: https://go-review.googlesource.com/c/go/+/502495
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Errors like "morestack on g0" are among the hardest to debug, because
often no useful stack trace is printed. The runtime doesn't directly
print a stack trace because the stack is in too bad a state to call
print. Sometimes a SIGABRT may trigger a traceback, but sometimes not,
especially in a cgo binary. Even when it does trigger a traceback, the
traceback often does not include the stack trace of the bad stack.
This CL makes it explicitly print a stack trace and throw. The
idea is to have some space as an "emergency" crash stack. When the
stack is in a really bad state, we switch to the crash stack and
do a traceback.
Currently only implemented on AMD64 and ARM64.
TODO: also handle errors like "morestack on gsignal" and bad
systemstack. Also handle other architectures.
Change-Id: Ibfc397202f2bb0737c5cbe99f2763de83301c1c1
Reviewed-on: https://go-review.googlesource.com/c/go/+/419435
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
After CL 527715, needm uses callbackUpdateSystemStack to set the stack
bounds for g0 on an M from the extra M list. Since
callbackUpdateSystemStack is also used for recursive cgocallback, it
does nothing if the stack is already in bounds.
Currently, the stack bounds in an extra M may contain stale bounds from
a previous thread that used this M and then returned it to the extra
list in dropm.
Typically a new thread will not have an overlapping stack with an old
thread, but because the old thread has exited there is a small chance
that the C memory allocator will allocate the new thread's stack
partially or fully overlapping with the old thread's stack.
If this occurs, then callbackUpdateSystemStack will not update the stack
bounds. If in addition, the overlap is partial such that SP on
cgocallback is close to the recorded stack lower bound, then Go may
quickly "overflow" the stack and crash with "morestack on g0".
Fix this by clearing the stack bounds in dropm, which ensures that
callbackUpdateSystemStack will unconditionally update the bounds in
needm.
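The shape of the fix, as a stand-alone sketch with stubbed types (the
real code clears the bounds on the extra M's g0):

    package main

    import "fmt"

    type stack struct{ lo, hi uintptr }
    type g struct{ stack stack }
    type m struct{ g0 g }

    // dropm returns an M to the extra list. Clearing the g0 stack bounds
    // forces callbackUpdateSystemStack to recompute them on the next
    // needm, even if a new thread's stack overlaps the old thread's.
    func dropm(mp *m) {
        mp.g0.stack = stack{} // the old bounds are stale once the thread exits
    }

    func main() {
        mp := &m{g0: g{stack: stack{lo: 0xc000, hi: 0x10000}}}
        dropm(mp)
        fmt.Printf("%+v\n", mp.g0.stack) // {lo:0 hi:0}
    }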
For #62440.
Change-Id: Ic9e2052c2090dd679ed716d1a23a86d66cbcada7
Reviewed-on: https://go-review.googlesource.com/c/go/+/537695
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
TryBot-Bypass: Michael Pratt <mpratt@google.com>
As spotted by staticcheck, the body kept track of errors via a single
shared err variable, but its last value was never used, as the function
simply finished by returning nil.
To prevent postDecode from erroring on empty profiles,
which breaks TestEmptyProfile, add a check at the top of the function.
Update the runtime/pprof test accordingly,
since the default units didn't make sense for an empty profile anyway.
Change-Id: I188cd8337434adf9169651ab5c914731b8b20f39
Reviewed-on: https://go-review.googlesource.com/c/go/+/483137
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
These primitives will be used by the new And/Or sync/atomic apis.
For #61395
Change-Id: I4062d6317e01afd94d3588f5425237723ab15ade
GitHub-Last-Rev: c0a8d8f34d
GitHub-Pull-Request: golang/go#63272
Reviewed-on: https://go-review.googlesource.com/c/go/+/531575
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
This implements the approach I described in
https://go-review.git.corp.google.com/c/go/+/494057/1#message-5c9773bded2f89b4058848cb036b860aa6716de3.
Specifically:
- Each level of test atomically records the cumulative number of races
seen as of the last race-induced test failure.
- When a subtest fails, it logs the race error, and then updates its
parents' counters so that they will not log the same error.
- We check each test or benchmark for races before it starts running
each of its subtests or sub-benchmarks, before unblocking parallel
subtests, and after running any cleanup functions.
With this implementation, it should be the case that every test that
is running when a race is detected reports that race, and any race
reported for a subtest is not redundantly reported for its parent.
The regression tests are based on those added in CL 494057 and
CL 501895, with a few additions based on my own review of the code.
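A hedged sketch of the bookkeeping (hypothetical types; the real code
lives in package testing and reads the race detector's error count):

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    var totalRaces atomic.Int64 // stand-in for the detector's error count

    type test struct {
        parent    *test
        lastRaces atomic.Int64 // races accounted for as of the last race failure
    }

    // checkRaces reports whether new races appeared since this test last
    // accounted for them, and propagates the count to its parents so
    // they do not redundantly report the same race.
    func (t *test) checkRaces() bool {
        n := totalRaces.Load()
        if n <= t.lastRaces.Load() {
            return false
        }
        for p := t; p != nil; p = p.parent {
            p.lastRaces.Store(n)
        }
        return true
    }

    func main() {
        parent := &test{}
        child := &test{parent: parent}
        totalRaces.Add(1)                // a race is detected during child
        fmt.Println(child.checkRaces())  // true: child reports it
        fmt.Println(parent.checkRaces()) // false: already accounted for
    }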
Fixes #60083.
Change-Id: I578ae929f192a7a951b31b17ecb560cbbf1ef7a1
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-linux-amd64-longtest-race,gotip-windows-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/506300
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
The goroutine profile has effectively three code paths for adding a
goroutine record to the goroutine profile: one for the goroutine that
requested the profile, one for every other goroutine, plus some special
handling for the finalizer goroutine. The first of those captured the
goroutine stack, but neglected to include that goroutine's labels.
Update the tests to check for the inclusion of labels for all three
types of goroutines, and include labels for the creator of the goroutine
profile.
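For example, with the public API, labels set via pprof.Do should now
appear on the requesting goroutine's own record as well:

    package main

    import (
        "context"
        "os"
        "runtime/pprof"
    )

    func main() {
        pprof.Do(context.Background(), pprof.Labels("worker", "profiler"),
            func(ctx context.Context) {
                // This goroutine both carries labels and requests the
                // profile; its record should include the labels too.
                pprof.Lookup("goroutine").WriteTo(os.Stdout, 1)
            })
    }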
For #63712
Change-Id: Id5387a5f536d3c37268c240e0b6db3d329a3d632
Reviewed-on: https://go-review.googlesource.com/c/go/+/537515
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: David Chase <drchase@google.com>
Change-Id: I3f0b7209621b39cee69566a5cc95e4343b4f1f20
GitHub-Last-Rev: af9dbbe69a
GitHub-Pull-Request: golang/go#63321
Reviewed-on: https://go-review.googlesource.com/c/go/+/531916
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Add a memory barrier on the failure case of the
compare-and-swap for mips; this avoids potential
race conditions.
For #63506
Change-Id: I3df1479d1438ba72aa72567eb3dea76ff745e98d
GitHub-Last-Rev: 2101b9fd44
GitHub-Pull-Request: golang/go#63604
Reviewed-on: https://go-review.googlesource.com/c/go/+/536116
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
CL 473415 allowed 5 more threads in TestWindowsStackMemory, to cover
sysmon and any new threads in the future. However, during the go1.22
dev cycle, the test became flaky again, failing on the windows-386
builder a couple of times in CL 535975 and CL 536175 (and maybe others
that haven't been caught).
This CL increases the extra threads from 5 to 10, hopefully making the
test stable again for windows-386. The theory is that the Go process
loads a bunch of DLLs, which may start their own threads. We could
investigate more deeply if the test is still flaky with 10 extra
threads.
Fixes #58570
Change-Id: I255d0d31ed554859a5046fa76dfae1ba89a89aa3
Reviewed-on: https://go-review.googlesource.com/c/go/+/536058
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Fixes #59679
Change-Id: I1334b793825a2a57d239e3c98373bf4c93cc622a
Reviewed-on: https://go-review.googlesource.com/c/go/+/522215
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Andy Pan <panjf2000@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
RtlGenRandom is a semi-undocumented API, also known as
SystemFunction036, which we use to generate random data on Windows.
Its definition, in cryptbase.dll, is an opaque wrapper for the
documented API ProcessPrng. Instead of using RtlGenRandom, switch to
using ProcessPrng. Since the former is simply a wrapper for the latter,
there should be no practical change on the user side, other than a
minor change in the DLLs we load.
Change-Id: Ie6891bf97b1d47f5368cccbe92f374dba2c2672a
Reviewed-on: https://go-review.googlesource.com/c/go/+/536235
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
Auto-Submit: Roland Shoemaker <roland@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Running 'go fix' on the cmd+std packages handled much of this change.
Also update code generators to use only the new go:build lines,
not the old +build ones.
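For example, a file constrained to linux/amd64 previously carried both
lines:

    //go:build linux && amd64
    // +build linux,amd64

and now keeps only the go:build form.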
For #41184.
For #60268.
Change-Id: If35532abe3012e7357b02c79d5992ff5ac37ca23
Cq-Include-Trybots: luci.golang.try:gotip-linux-386-longtest,gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/536237
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
The documentation of readvarintUnsafe claims that it and readvarint are
duplicates. However, the two implementations are not in sync, since
readvarint got some minor improvements in CL 43150.
Update readvarintUnsafe to match the readvarint implementation to gain
a bit of speed. While at it, also update its documentation to clarify
the main difference.
name                    time/op
ReadvarintUnsafe/old-8  6.04ns ± 2%
ReadvarintUnsafe/new-8  5.31ns ± 3%
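For reference, the shared encoding is a standard unsigned varint
(LEB128); a hedged sketch over a byte slice (the runtime readers
operate on raw funcdata pointers, not slices):

    package main

    import "fmt"

    // readVarint decodes an unsigned little-endian base-128 value,
    // returning the value and the number of bytes consumed.
    func readVarint(b []byte) (value uint32, n int) {
        var shift uint
        for i, c := range b {
            value |= uint32(c&0x7f) << shift
            if c&0x80 == 0 {
                return value, i + 1
            }
            shift += 7
        }
        return 0, 0 // truncated input
    }

    func main() {
        v, n := readVarint([]byte{0xe5, 0x8e, 0x26})
        fmt.Println(v, n) // 624485 3
    }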
Change-Id: Ie1805d0747544f69de88f6ba9d1b3960f80f00e8
Reviewed-on: https://go-review.googlesource.com/c/go/+/535815
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
The function WriteTabs has been renamed WritePluginTable.
Change-Id: I5f04b99b91498c41121f898cb7774334a730d7b4
GitHub-Last-Rev: c98ab3f872
GitHub-Pull-Request: golang/go#63595
Reviewed-on: https://go-review.googlesource.com/c/go/+/535996
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
QueryPerformanceCounter has been available since Windows 2000 [1], so
there is no need to load it conditionally.
Even if the Go runtime doesn't eventually use it, it is still simpler
and faster to just tell the Windows loader to load it, instead of doing
it ourselves.
[1]: https://learn.microsoft.com/en-us/windows/win32/api/profileapi/nf-profileapi-queryperformancecounter
Change-Id: Ied3b54a6a8fe3b8d51aefab0fe483b3a193b5522
Reviewed-on: https://go-review.googlesource.com/c/go/+/532915
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@google.com>
AddVectoredContinueHandler has been available since Windows XP [1], so
there is no need to check whether it is available.
[1]: https://learn.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-addvectoredcontinuehandler
Change-Id: I1ddc3d58b3294d9876620cd46159d9692694b475
Reviewed-on: https://go-review.googlesource.com/c/go/+/532817
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
We were using the size stored in the map, which is the smaller
of the real type size and 128.
As of CL 61538 we don't use these functions, but we expect to
use them again in the future after #61626 is resolved.
Change-Id: I7bfb4af5f0e3a56361d4019a8ed7c1ec59ff31fd
Reviewed-on: https://go-review.googlesource.com/c/go/+/535215
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Keith Randall <khr@google.com>