Commit Graph

110 Commits

Author SHA1 Message Date
Michael Anthony Knyszek d4bf716793 runtime: reduce per-P memory footprint when greenteagc is disabled
There are two additional sources of memory overhead per P that come from
greenteagc. One is for ptrBuf, but on platforms other than Windows it
doesn't actually cost anything due to demand-paging (Windows also
demand-pages, but the memory is 'committed' so it still counts against
OS RSS metrics). The other is for per-sizeclass scan stats. However, when
greenteagc is disabled, most of these scan stats are completely unused.

The worst-case memory overhead from these two sources is relatively
small (about 10 KiB per P), but for programs with a small memory
footprint running on a machine with a lot of cores, this can be
significant (single-digit percent).

This change does two things. First, it puts ptrBuf initialization behind
the greenteagc experiment, so now that memory is never allocated by
default. Second, it abstracts the implementation details of scan stat
collection and emission, such that we can have two different
implementations depending on the build tag. This lets us remove all the
unused stats when the greenteagc experiment is disabled, reducing the
memory overhead of the stats from ~2.6 KiB per P to 536 bytes per P.
This is enough to make the difference no longer noticeable in our
benchmark suite.
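
As a rough, self-contained sketch of the gating idea (the runtime selects
between the two scan-stat implementations with goexperiment build tags;
the type, field names, and size-class count below are illustrative, not
the runtime's):

    package main

    import "fmt"

    // greenTeaGC stands in for internal/goexperiment.GreenTeaGC.
    const greenTeaGC = false

    // perPScanStats approximates the idea: a small, always-present
    // summary plus per-size-class counters that are only allocated
    // when the experiment is enabled.
    type perPScanStats struct {
        spansScanned   uint64
        objectsScanned uint64
        bySizeClass    []uint64 // nil unless greenTeaGC is on
    }

    func newPerPScanStats() *perPScanStats {
        s := &perPScanStats{}
        if greenTeaGC {
            s.bySizeClass = make([]uint64, 68) // illustrative size-class count
        }
        return s
    }

    func main() {
        s := newPerPScanStats()
        fmt.Println("per-size-class stats allocated:", s.bySizeClass != nil)
    }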

Fixes #73931.

Change-Id: I4351f1cbb3f6743d8f5922d757d73442c6d6ad3f
Reviewed-on: https://go-review.googlesource.com/c/go/+/678535
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-06-04 07:43:42 -07:00
khr@golang.org 60d3bcdec3 runtime: remove ptr/scalar bitmap metric
We no longer use this mechanism (since CL 616255), so the metric will
always be zero.

Update #73628

Change-Id: Ic179927a8bc24e6291876c218d88e8848b057c2a
Reviewed-on: https://go-review.googlesource.com/c/go/+/671096
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-08 11:21:51 -07:00
Michael Anthony Knyszek 1b40dbce1a runtime: mark and scan small objects in whole spans [green tea]
Our current parallel mark algorithm suffers from frequent stalls on
memory since its access pattern is essentially random. Small objects
are the worst offenders, since each one forces pulling in at least one
full cache line to access even when the amount to be scanned is far
smaller than that. Each object also requires an independent access to
per-object metadata.

The purpose of this change is to improve garbage collector performance
by scanning small objects in batches to obtain better cache locality
than our current approach. The core idea behind this change is to defer
marking and scanning small objects, and then scan them in batches
localized to a span.

This change adds scanned bits to each small object (<=512 bytes) span in
addition to mark bits. The scanned bits indicate that the object has
been scanned. (One way to think of them is "grey" bits and "black" bits
in the tri-color mark-sweep abstraction.) Each of these spans is always
8 KiB, and if it contains pointers, the pointer/scalar data is already
packed together at the end of the span, allowing us to further optimize
the mark algorithm for this specific case.

When the GC encounters a pointer, it first checks if it points into a
small object span. If so, it is first marked in the mark bits, and then
the object is queued on a work-stealing P-local queue. This object
represents the whole span, and we ensure that a span appears at most
once in any queue by maintaining an atomic ownership bit for each
span. Later, when the pointer is dequeued, we scan every object with a
set mark that doesn't have a corresponding scanned bit. If it turns out
that was the only object in the mark bits since the last time we scanned
the span, we scan just that object directly, essentially falling back to
the existing algorithm. noscan objects have no scan work, so they are
never queued.

Each span's mark and scanned bits are co-located together at the end of
the span. Since the span is always 8 KiB in size, it can be found with
simple pointer arithmetic. Next to the marks and scans we also store the
size class, eliminating the need to access the span's mspan altogether.
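
As a hedged illustration of this per-span bookkeeping (the names and
layout below are invented; the runtime keeps these bits in the span's
trailing metadata and updates them atomically):

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    // spanScanState sketches the per-span state: mark bits, scanned bits,
    // and an ownership flag ensuring the span sits in at most one queue.
    type spanScanState struct {
        marks   [8]uint64 // up to 512 objects in an 8 KiB span
        scanned [8]uint64
        queued  atomic.Bool
    }

    // markObject marks object i and reports whether the caller should
    // enqueue the span (i.e. it won the ownership flag).
    func (s *spanScanState) markObject(i uint) bool {
        word, bit := i/64, uint64(1)<<(i%64)
        s.marks[word] |= bit // the runtime does this atomically
        return s.queued.CompareAndSwap(false, true)
    }

    // scanSpan visits every object that is marked but not yet scanned.
    func (s *spanScanState) scanSpan() (n int) {
        for w := range s.marks {
            pending := s.marks[w] &^ s.scanned[w]
            s.scanned[w] |= pending
            for ; pending != 0; pending &= pending - 1 {
                n++ // a real scan would trace this object's pointers
            }
        }
        s.queued.Store(false)
        return n
    }

    func main() {
        var s spanScanState
        fmt.Println("enqueue?", s.markObject(3)) // true: first mark wins ownership
        fmt.Println("enqueue?", s.markObject(7)) // false: span already queued
        fmt.Println("objects scanned:", s.scanSpan())
    }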

The work-stealing P-local queue is a new source of GC work. If this
queue gets full, half of it is dumped to a global linked list of spans
to scan. The regular scan queues are always prioritized over this queue
to allow time for darts to accumulate. Stealing work from other Ps is a
last resort.

This change also adds a new debug mode under GODEBUG=gctrace=2 that
dumps whole-span scanning statistics by size class on every GC cycle.

A future extension to this CL is to use SIMD-accelerated scanning
kernels for scanning spans with high mark bit density.

For #19112. (Deadlock averted in GOEXPERIMENT.)
For #73581.

Change-Id: I4bbb4e36f376950a53e61aaaae157ce842c341bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/658036
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-02 10:28:07 -07:00
Michael Anthony Knyszek 528bafa049 runtime: move sizeclass defs to new package internal/runtime/gc
We will want to reference these definitions from new generator programs,
and this is a good opportunity to cleanup all these old C-style names.

Change-Id: Ifb06f0afc381e2697e7877f038eca786610c96de
Reviewed-on: https://go-review.googlesource.com/c/go/+/655275
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-04-23 08:00:33 -07:00
apocelipes c889004615 internal,runtime: use the builtin clear
To simplify the code.
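
For example (clear is the Go 1.21 builtin: it deletes all entries of a
map and zeroes the elements of a slice):

    package main

    import "fmt"

    func main() {
        m := map[string]int{"a": 1, "b": 2}
        // Previously written as: for k := range m { delete(m, k) }
        clear(m)

        s := []int{1, 2, 3}
        // Previously written as: for i := range s { s[i] = 0 }
        clear(s)

        fmt.Println(len(m), s) // 0 [0 0 0]
    }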

Change-Id: I023de705504c0b580718eec3c7c563b6cf2c8184
GitHub-Last-Rev: 026b32c799
GitHub-Pull-Request: golang/go#73412
Reviewed-on: https://go-review.googlesource.com/c/go/+/666118
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-04-18 04:21:17 -07:00
Michael Anthony Knyszek 017478a963 runtime: move GC pause time CPU metrics update into the STW
This change fixes a possible race with updating metrics and reading
them. The update is intended to be protected by the world being stopped,
but here, it clearly isn't.

Fixing this lets us lower the thresholds in the metrics tests by an
order of magnitude, because the only thing we have to worry about now is
floating point error (the tests were previously written assuming the
floating point error was much higher than it actually was; that turns
out not to be the case, and this bug was the problem instead). However,
this still isn't that tight of a bound; we still want to catch any and
all problems of exactness. For this purpose, this CL adds a test to
check the source-of-truth (in uint64 nanoseconds) that ensures the
totals exactly match.

This means we unfortunately have to take another time measurement, but
for now let's prioritize correctness. A few additional nanoseconds of
STW time won't be terribly noticeable.

Fixes #66212.

Change-Id: Id02c66e8a43c13b1f70e9b268b8a84cc72293bfd
Reviewed-on: https://go-review.googlesource.com/c/go/+/570257
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Nicolas Hillegeer <aktau@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-04-08 21:03:13 +00:00
Michael Anthony Knyszek ae0a08dee9 runtime: use maxprocs instead of stwprocs for GC CPU pause time metrics
Currently we use stwprocs as the multiplier for the STW CPU time
computation, but this isn't the same as GOMAXPROCS, which is used for
the total time in the CPU metrics. The two numbers need to be
comparable, so this change switches to using maxprocs to make it so.

Change-Id: I423e3c441d05b1bd656353368cb323289661e302
Reviewed-on: https://go-review.googlesource.com/c/go/+/570256
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Nicolas Hillegeer <aktau@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-04-08 15:39:41 +00:00
Michael Anthony Knyszek 08a7ab97ba runtime: factor out GC pause time CPU stats update
Currently this is done manually in two places. Replace these manual
updates with a method that also forces the caller to be mindful that the
number will be multiplied (and that it needs to be). This will make
follow-up changes simpler too.

Change-Id: I81ea844b47a40ff3470d23214b4b2fb5b71a4abe
Reviewed-on: https://go-review.googlesource.com/c/go/+/570255
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-04-08 15:39:02 +00:00
Andy Pan 4c2b1e0feb runtime: migrate internal/atomic to internal/runtime
For #65355

Change-Id: I65dd090fb99de9b231af2112c5ccb0eb635db2be
Reviewed-on: https://go-review.googlesource.com/c/go/+/560155
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ibrahim Bazoka <ibrahimbazoka729@gmail.com>
Auto-Submit: Emmanuel Odeke <emmanuel@orijtech.com>
2024-03-25 19:53:03 +00:00
Michael Anthony Knyszek b2efd1de97 runtime: put ReadMemStats debug assertions behind a double-check mode
ReadMemStats has a few assertions it makes about the consistency of the
stats it's about to produce. Specifically, how those stats line up with
runtime-internal stats. These checks are generally useful, but crashing
just because some stats are wrong is a heavy price to pay.

For a long time this wasn't a problem, but very recently it became a
real problem. It turns out that there's real benign skew that can happen
wherein sysmon (which doesn't synchronize with a STW) generates a trace
event when tracing is enabled, and may mutate some stats while
ReadMemStats is running its checks.

Fix this by synchronizing with both sysmon and the tracer. This is a bit
heavy-handed, but better that than false positives.

Also, put the checks behind a debug mode. We want to reduce the risk of
backporting this change, and again, it's not great to crash just because
user-facing stats are off. Still, enable this debug mode during the
runtime tests so we don't lose quite as much coverage from disabling
these checks by default.

Fixes #64401.

Change-Id: I9adb3e5c7161d207648d07373a11da8a5f0fda9a
Reviewed-on: https://go-review.googlesource.com/c/go/+/545277
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
2023-11-28 20:47:33 +00:00
Michael Pratt 6ef98ac87c runtime/metrics: add STW stopping and total time metrics
This CL adds four new time histogram metrics:

/sched/pauses/stopping/gc:seconds
/sched/pauses/stopping/other:seconds
/sched/pauses/total/gc:seconds
/sched/pauses/total/other:seconds

The "stopping" metrics measure the time taken to start a stop-the-world
pause. i.e., how long it takes stopTheWorldWithSema to stop all Ps.
This can be used to detect STW struggling to preempt Ps.

The "total" metrics measure the total duration of a stop-the-world
pause, from starting to stop-the-world until the world is started again.
This includes the time spent in the "start" phase.

The "gc" metrics are used for GC-related STW pauses. The "other" metrics
are used for all other STW pauses.

All of these metrics start timing in stopTheWorldWithSema only after
successfully acquiring sched.lock, thus excluding lock contention on
sched.lock. The reasoning behind this is that while waiting on
sched.lock the world is not stopped at all (all other Ps can run), so
the impact of this contention is primarily limited to the goroutine
attempting to stop-the-world. Additionally, we already have some
visibility into sched.lock contention via contention profiles (#57071).

/sched/pauses/total/gc:seconds is conceptually equivalent to
/gc/pauses:seconds, so the latter is marked as deprecated and returns
the same histogram as the former.

In the implementation, there are a few minor differences:

* For both mark and sweep termination stops, /gc/pauses:seconds started
  timing prior to calling startTheWorldWithSema, thus including lock
  contention.

These details are minor enough that I do not believe the slight change
in reporting will matter. For mark termination stops, moving timing stop
into startTheWorldWithSema does have the side effect of requiring moving
other GC metric calculations outside of the STW, as they depend on the
same end time.
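
For reference, on a toolchain that includes these metrics, the new
histograms can be read from application code like this (bucket handling
kept minimal):

    package main

    import (
        "fmt"
        "runtime/metrics"
    )

    func main() {
        samples := []metrics.Sample{
            {Name: "/sched/pauses/stopping/gc:seconds"},
            {Name: "/sched/pauses/total/gc:seconds"},
        }
        metrics.Read(samples)
        for _, s := range samples {
            h := s.Value.Float64Histogram()
            var n uint64
            for _, c := range h.Counts {
                n += c
            }
            fmt.Printf("%s: %d pauses recorded\n", s.Name, n)
        }
    }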

Fixes #63340

Change-Id: Iacd0bab11bedab85d3dcfb982361413a7d9c0d05
Reviewed-on: https://go-review.googlesource.com/c/go/+/534161
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-11-15 16:49:45 +00:00
Michael Anthony Knyszek b2215e49c7 runtime,runtime/metrics: clarify OS stack metrics
There are some subtle details here about measuring OS stacks in cgo
programs. There's also an expectation about magnitude in the MemStats
docs that isn't in the runtime/metrics docs. Fix both.

Fixes #54396.

Change-Id: I6b60a62a4a304e6688e7ab4d511d66193fc25321
Reviewed-on: https://go-review.googlesource.com/c/go/+/502156
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2023-07-05 15:21:49 +00:00
Keith Randall e126572f8a runtime: have ReadMemStats do a nil check before switching stacks
This gives the user a better stack trace experience. No need to
expose them to runtime.systemstack and friends.

Fixes #61158

Change-Id: I4f423f82e54b062773067c0ae64622e37cb3948b
Reviewed-on: https://go-review.googlesource.com/c/go/+/507755
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2023-07-04 16:54:39 +00:00
Michael Anthony Knyszek 3651d8e516 runtime/metrics: refactor CPU stats accumulation
Currently the CPU stats are only updated once every mark termination,
but for writing robust tests, it's often useful to force this update.
Refactor the CPU stats accumulation out of gcMarkTermination and into
its own function. This is also a step toward real-time CPU stats.

While we're here, fix some incorrect documentation about dedicated GC
CPU time.

For #59749.
For #60276.

Change-Id: I8c1a9aca45fcce6ce7999702ae4e082853a69711
Reviewed-on: https://go-review.googlesource.com/c/go/+/487215
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
2023-05-23 19:24:55 +00:00
Michael Anthony Knyszek b1aadd034c runtime: emit STW events for all pauses, not just those for the GC
Currently STW events are only emitted for GC STWs. There's little reason
why the trace can't contain events for every STW: they're rare so don't
take up much space in the trace, yet being able to see when the world
was stopped is often critical to debugging certain latency issues,
especially when they stem from user-level APIs.

This change adds new "kinds" to the EvGCSTWStart event, renames the
GCSTW events to just "STW," and lets the parser deal with unknown STW
kinds for future backwards compatibility.

But, this change must break trace compatibility, so it bumps the trace
version to Go 1.21.

This change also includes a small cleanup in the trace command, which
previously checked for STW events when deciding whether user tasks
overlapped with a GC. Looking at the source, I don't see a way for STW
events to ever enter the stream that that code looks at, so that
condition has been deleted.

Change-Id: I9a5dc144092c53e92eb6950e9a5504a790ac00cf
Reviewed-on: https://go-review.googlesource.com/c/go/+/494495
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2023-05-19 17:06:45 +00:00
Michael Anthony Knyszek b7c28f484d runtime/metrics: add CPU stats
This change adds a breakdown of estimated CPU usage by time. These
estimates are not based on real on-CPU counters, so each metric has a
disclaimer explaining so. They can, however, be more reasonably
compared to a total CPU time metric that this change also adds.
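
As a usage sketch, the estimates can be read and eyeballed against the
total like this (the metric names are the ones runtime/metrics documents
for these classes; enumerate metrics.All() on your toolchain to confirm
them):

    package main

    import (
        "fmt"
        "runtime/metrics"
    )

    func main() {
        samples := []metrics.Sample{
            {Name: "/cpu/classes/gc/total:cpu-seconds"},
            {Name: "/cpu/classes/user:cpu-seconds"},
            {Name: "/cpu/classes/idle:cpu-seconds"},
            {Name: "/cpu/classes/total:cpu-seconds"},
        }
        metrics.Read(samples)
        for _, s := range samples {
            if s.Value.Kind() == metrics.KindFloat64 {
                fmt.Printf("%-35s %.3f\n", s.Name, s.Value.Float64())
            }
        }
    }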

Fixes #47216.

Change-Id: I125006526be9f8e0d609200e193da5a78d9935be
Reviewed-on: https://go-review.googlesource.com/c/go/+/404307
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Josh MacDonald <jmacd@lightstep.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-09-16 16:32:20 +00:00
cuiweixie 52d9e7f543 runtime: convert consistentHeapStats.gen to atomic type
For #53821

Change-Id: I9f57b84f6a2c29d750fb20420daef903a9311a83
Reviewed-on: https://go-review.googlesource.com/c/go/+/425781
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-31 21:10:41 +00:00
Cuong Manh Le bcd1ac7120 runtime: drop padding alignment field for timeHistogram
After CL 419449, timeHistogram always has 8-byte alignment.

Change-Id: I93145502bcafa1712b811b1a6d62da5d54d0db42
Reviewed-on: https://go-review.googlesource.com/c/go/+/425777
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2022-08-26 18:01:35 +00:00
hopehook 2883af63c2 runtime: convert p.statsSeq to internal atomic type
For #53821.

Change-Id: I1cab3671a29c218b8a927aba9064e63b65900173
Reviewed-on: https://go-review.googlesource.com/c/go/+/425416
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: hopehook <hopehook@qq.com>
2022-08-26 15:35:13 +00:00
Michael Pratt b04e4637db runtime: convert timeHistogram to atomic types
I've dropped the note that sched.timeToRun is protected by sched.lock,
as it does not seem to be true.
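
In miniature, the conversion looks like this (shown with sync/atomic
rather than the runtime-internal atomic package; the field is
illustrative):

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    // Before: a plain field plus free-standing atomic calls.
    //
    //     type timeHistogram struct {
    //         underflow uint64
    //     }
    //     atomic.AddUint64(&h.underflow, 1)
    //
    // After: the field itself is an atomic type, so every access is
    // necessarily atomic and the intent is visible in the declaration.
    type timeHistogram struct {
        underflow atomic.Uint64
    }

    func main() {
        var h timeHistogram
        h.underflow.Add(1)
        fmt.Println(h.underflow.Load())
    }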

For #53821.

Change-Id: I03f8dc6ca0bcd4ccf3ec113010a0aa39c6f7d6ef
Reviewed-on: https://go-review.googlesource.com/c/go/+/419449
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Pratt <mpratt@google.com>
2022-08-12 01:53:11 +00:00
Michael Anthony Knyszek 7c404d59db runtime: store consistent total allocation stats as uint64
Currently the consistent total allocation stats are managed as uintptrs,
which means they can easily overflow on 32-bit systems. Fix this by
storing these stats as uint64s. This will cause some minor performance
degradation on 32-bit systems, but there really isn't a way around this,
and it affects the correctness of the metrics we export.

Fixes #52680.

Change-Id: I7e6ca44047d46b4bd91c6f87c2d29f730e0d6191
Reviewed-on: https://go-review.googlesource.com/c/go/+/403758
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2022-05-03 19:58:15 +00:00
Michael Anthony Knyszek 129dcb7226 runtime: check the heap goal and trigger dynamically
As it stands, the heap goal and the trigger are set once by
gcController.commit, and then read out of gcController. However with the
coming memory limit we need the GC to be able to respond to changes in
non-heap memory. The simplest way of achieving this is to compute the
heap goal and its associated trigger dynamically.

In order to make this easier to implement, the GC trigger is now based
on the heap goal, as opposed to the status quo of computing both
simultaneously. In many cases we just want the heap goal anyway, not
both, but we definitely need the goal to compute the trigger, because
the trigger's bounds are entirely based on the goal (the initial runway
is not). A consequence of this is that we can't rely on the trigger to
enforce a minimum heap size anymore, and we need to lift that up
directly to the goal. Specifically, we need to lift up any part of the
calculation that *could* put the trigger ahead of the goal. Luckily this
is just the heap minimum and minimum sweep distance. In the first case,
the pacer may behave slightly differently, as the heap minimum is no
longer the minimum trigger, but the actual minimum heap goal. In the
second case it should be the same, as we ensure the additional runway
for sweeping is added to both the goal *and* the trigger, as before, by
computing that in gcControllerState.commit.

There's also another place we update the heap goal: if a GC starts and
we triggered beyond the goal, we always ensure there's some runway.
That calculation uses the current trigger, which violates the rule of
keeping the goal based on the trigger. Notice, however, that using the
precomputed trigger for this isn't even quite correct: due to a bug, or
something else, we might trigger a GC beyond the precomputed trigger.

So this change also adds a "triggered" field to gcControllerState that
tracks the point at which a GC actually triggered. This is independent
of the precomputed trigger, so it's fine for the heap goal calculation
to rely on it. It also turns out, there's more than just that one place
where we really should be using the actual trigger point, so this change
fixes those up too.

Also, because the heap minimum is set by the goal and not the trigger,
the maximum trigger calculation now happens *after* the goal is set, so
the maximum trigger actually does what I originally intended (and what
the comment says): at small heaps, the pacer picks 95% of the runway as
the maximum trigger. Currently, the pacer picks a small trigger based
on a not-yet-rounded-up heap goal, so the trigger gets rounded up to the
goal, and as per the "ensure there's some runway" check, the runway ends
up at always being 64 KiB. That check is supposed to be for exceptional
circumstances, not the status quo. There's a test introduced in the last
CL that needs to be updated to accommodate this slight change in
behavior.

So, this all sounds like a lot that changed, but what we're talking about
here are really, really tight corner cases that arise from situations
outside of our control, like pathologically bad behavior on the part of
an OS or CPU. Even in these corner cases, it's very unlikely that users
will notice any difference at all. What's more important, I think, is
that the pacer behaves more closely to what all the comments describe,
and what the original intent was.

Another note: at first, one might think that computing the heap goal and
trigger dynamically introduces some raciness, but not in this CL: the heap
goal and trigger are completely static.

Allocation outside of a GC cycle may now be a bit slower than before, as
the GC trigger check is now significantly more complex. However, note
that this executes basically just as often as gcController.revise, and
that makes up a vanishingly small part of any CPU profile. The next
CL cleans up the floating point multiplications on this path
nonetheless, just to be safe.
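
A hedged sketch of the dependency this sets up, keeping the 95%-of-runway
cap described above; the lower bound and inputs are illustrative and the
real pacer's arithmetic differs:

    package main

    import "fmt"

    // heapTrigger derives the GC trigger from the heap goal rather than
    // computing the two together. heapMarked is the live heap after the
    // last cycle; triggerRatio is whatever fraction the pacer wants.
    func heapTrigger(heapMarked, heapGoal uint64, triggerRatio float64) uint64 {
        runway := heapGoal - heapMarked
        minTrigger := heapMarked + uint64(0.5*float64(runway))  // illustrative floor
        maxTrigger := heapMarked + uint64(0.95*float64(runway)) // "95% of the runway"
        trigger := heapMarked + uint64(triggerRatio*float64(runway))
        if trigger < minTrigger {
            trigger = minTrigger
        }
        if trigger > maxTrigger {
            trigger = maxTrigger
        }
        return trigger
    }

    func main() {
        fmt.Println(heapTrigger(64<<20, 128<<20, 0.7)) // 64 MiB marked, 128 MiB goal
    }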

For #48409.

Change-Id: I280f5ad607a86756d33fb8449ad08555cbee93f9
Reviewed-on: https://go-review.googlesource.com/c/go/+/397014
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:12:52 +00:00
Michael Anthony Knyszek 375d696ddf runtime: move inconsistent memstats into gcController
Fundamentally, all of these memstats exist to serve the runtime in
managing memory. For the sake of simpler testing, couple these stats
more tightly with the GC.

This CL was mostly done automatically. The fields had to be moved
manually, but the references to the fields were updated via

    gofmt -w -r 'memstats.<field> -> gcController.<field>' *.go

For #48409.

Change-Id: Ic036e875c98138d9a11e1c35f8c61b784c376134
Reviewed-on: https://go-review.googlesource.com/c/go/+/397678
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:12:38 +00:00
Michael Anthony Knyszek d36d5bd3c1 runtime: clean up inconsistent heap stats
The inconsistent heaps stats in memstats are a bit messy. Primarily,
heap_sys is non-orthogonal with heap_released and heap_inuse. In later
CLs, we're going to want heap_sys-heap_released-heap_inuse, so clean
this up by replacing heap_sys with an orthogonal metric: heapFree.
heapFree represents page heap memory that is free but not released.

I think this change also simplifies a lot of reasoning about these
stats; it's much clearer what they mean, and to obtain HeapSys for
memstats, we no longer need to do the strange subtraction from heap_sys
when allocating specifically non-heap memory from the page heap.

Because we're removing heap_sys, we need to replace it with a sysMemStat
for mem.go functions. In this case, heap_released is the most
appropriate because we increase it anyway (again, non-orthogonality). In
which case, it makes sense for heap_inuse, heap_released, and heapFree
to become more uniform, and to just represent them all as sysMemStats.

While we're here and messing with the types of heap_inuse and
heap_released, let's also fix their names (and last_heap_inuse's name)
up to the more modern Go convention of camelCase.

For #48409.

Change-Id: I87fcbf143b3e36b065c7faf9aa888d86bd11710b
Reviewed-on: https://go-review.googlesource.com/c/go/+/397677
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:12:31 +00:00
Michael Anthony Knyszek 4649a43903 runtime: track how much memory is mapped in the Ready state
This change adds a field to memstats called mappedReady that tracks how
much memory is in the Ready state at any given time. In essence, it's
the total memory usage by the Go runtime (with one exception which is
documented). Essentially, all memory mapped read/write that has either
been paged in or will soon.

To make tracking this not involve the many different stats that track
mapped memory, we track this statistic at a very low level. The downside
of tracking this statistic at such a low level is that it managed to
catch lots of situations where the runtime wasn't fully accounting for
memory. This change rectifies these situations by always accounting for
memory that's mapped in some way (i.e. always passing a sysMemStat to a
mem.go function), with *three* exceptions.

Rectifying these situations means also having the memory mapped during
testing being accounted for, so that tests (i.e. ReadMemStats) that
ultimately check mappedReady continue to work correctly without special
exceptions. We choose to simply account for this memory in other_sys.

Let's talk about the exceptions. The first is the arenas array for
finding heap arena metadata from an address is mapped as read/write in
one large chunk. It's tens of MiB in size. On systems with demand
paging, we assume that the whole thing isn't paged in at once (after
all, it maps to the whole address space, and it's exceedingly difficult
with today's technology to even approach having as much physical memory as
the total address space). On systems where we have to commit memory
manually, we use a two-level structure.

Now, the reason why this is an exception is because we have no mechanism
to track what memory is paged in, and we can't just account for the
entire thing, because that would *look* like an enormous overhead.
Furthermore, this structure is on a few really, really critical paths in
the runtime, so doing more explicit tracking isn't really an option. So,
we explicitly don't and call sysAllocOS to map this memory.

The second exception is that we call sysFree with no accounting to clean
up address space reservations, or otherwise to throw out mappings we
don't care about. In this case, also drop down to a lower level and call
sysFreeOS to explicitly avoid accounting.

The third exception is debuglog allocations. That is purely a debugging
facility and ideally we want it to have as small an impact on the
runtime as possible. If we include it in mappedReady calculations, it
could cause GC pacing shifts in future CLs, especially if one increases
the debuglog buffer sizes as a one-off.

As of this CL, these are the only three places in the runtime that would
pass nil for a stat to any of the functions in mem.go. As a result, this
CL makes sysMemStats mandatory to facilitate better accounting in the
future. It's now much easier to grep and find out where accounting is
explicitly elided, because one doesn't have to follow the trail of
sysMemStat nil pointer values, and can just look at the function name.

For #48409.

Change-Id: I274eb467fc2603881717482214fddc47c9eaf218
Reviewed-on: https://go-review.googlesource.com/c/go/+/393402
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-03 15:12:21 +00:00
Michael Anthony Knyszek 85d466493d runtime: maintain a direct count of total allocs and frees
This will be used by the memory limit computation to determine
overheads.

For #48409.

Change-Id: Iaa4e26e1e6e46f88d10ba8ebb6b001be876dc5cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/394220
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:12:12 +00:00
Russ Cox 9839668b56 all: separate doc comment from //go: directives
A future change to gofmt will rewrite

	// Doc comment.
	//go:foo

to

	// Doc comment.
	//
	//go:foo

Apply that change preemptively to all comments (not necessarily just doc comments).

For #51082.

Change-Id: Iffe0285418d1e79d34526af3520b415a12203ca9
Reviewed-on: https://go-review.googlesource.com/c/go/+/384260
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-05 17:54:15 +00:00
Michael Anthony Knyszek 4a56ba1c45 runtime: remove intermediate fields in memstats for ReadMemStats
Currently, the ReadMemStats (really this is all happening in
readmemstats_m, but that's just a direct call from ReadMemStats) call
chain first populates some fields in memstats, then copies those into
the final MemStats location. This used to make a lot of sense when
memstats' structure aligned with MemStats, and the values were just
copied from one to other. Sometime in the last few releases, we switched
to populating the MemStats manually because a lot of fields had diverged
from their internal representation. Now, we're left with a lot of fields
in memstats that pollute the structure: they only exist to be updated
for the sake of ReadMemStats. Since we're going to be adding more fields
to memstats in further CLs, this is a good opportunity to clean up.

As a result of this change, updatememstats, which used to just update
the aforementioned intermediate fields in memstats, is no longer
necessary, so it is removed.

Change-Id: Ifabfb3ac3002641105af62e9509a6351165dcd87
Reviewed-on: https://go-review.googlesource.com/c/go/+/393397
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2022-03-31 20:02:31 +00:00
Michael Anthony Knyszek df1837799d runtime: make consistentHeapStats acquire/release nosplit
consistentHeapStats is updated during a stack allocation, so a stack
growth during an acquire or release could cause another acquire to
happen before the operation completes fully. This may lead to an invalid
sequence number.

Fixes #49395.

Change-Id: I41ce3393dff80201793e053d4d6394d7b211a5b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/361158
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2021-11-05 17:52:30 +00:00
Michael Anthony Knyszek 9c58e399a4 [dev.typeparams] runtime: fix import sort order [generated]
[git-generate]
cd src/runtime
goimports -w *.go

Change-Id: I1387af0f2fd1a213dc2f4c122e83a8db0fcb15f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/329189
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2021-06-17 20:42:23 +00:00
Michael Anthony Knyszek 6d85891b29 [dev.typeparams] runtime: replace uses of runtime/internal/sys.PtrSize with internal/goarch.PtrSize [generated]
[git-generate]
cd src/runtime/internal/math
gofmt -w -r "sys.PtrSize -> goarch.PtrSize" .
goimports -w *.go
cd ../..
gofmt -w -r "sys.PtrSize -> goarch.PtrSize" .
goimports -w *.go

Change-Id: I43491cdd54d2e06d4d04152b3d213851b7d6d423
Reviewed-on: https://go-review.googlesource.com/c/go/+/328337
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2021-06-17 18:54:48 +00:00
Michael Anthony Knyszek 0b9ca4d907 runtime/metrics: add tiny allocs metric
Currently tiny allocations are not directly represented in either
MemStats or runtime/metrics; they only appear in MemStats (indirectly)
via Mallocs. Add them to runtime/metrics by first merging
memstats.tinyallocs into consistentHeapStats (just for simplicity; it's
monotonic so metrics would still be self-consistent if we just read it
atomically) and then adding /gc/heap/tiny/allocs:objects to the list of
supported metrics.
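
Once available, the new metric reads like any other uint64 counter:

    package main

    import (
        "fmt"
        "runtime/metrics"
    )

    func main() {
        s := []metrics.Sample{{Name: "/gc/heap/tiny/allocs:objects"}}
        metrics.Read(s)
        if s[0].Value.Kind() == metrics.KindUint64 {
            fmt.Println("tiny allocs:", s[0].Value.Uint64())
        }
    }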

Change-Id: Ie478006ab942a3e877b4a79065ffa43569722f3d
Reviewed-on: https://go-review.googlesource.com/c/go/+/312909
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-04-27 13:59:22 +00:00
Michael Anthony Knyszek 3eaf75c13a runtime: move next_gc and last_next_gc into gcControllerState
This change moves next_gc and last_next_gc into gcControllerState under
the names heapGoal and lastHeapGoal respectively. These are
fundamentally GC pacer related values, and so it makes sense for them to
live here.

Partially generated by

rf '
    ex . {
	memstats.next_gc -> gcController.heapGoal
	memstats.last_next_gc -> gcController.lastHeapGoal
    }
'

except for updates to comments and gcControllerState methods, where
they're accessed through the receiver, and trace-related renames of
NextGC -> HeapGoal, while we're here.

For #44167.

Change-Id: I1e871ad78a57b01be8d9f71bd662530c84853bed
Reviewed-on: https://go-review.googlesource.com/c/go/+/306603
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-04-14 14:03:30 +00:00
Michael Anthony Knyszek f2d5bd1ad3 runtime: move internal GC statistics from memstats to gcController
This change moves certain important but internal-only GC statistics from
memstats into gcController. These statistics are mainly used in pacing
the GC, so it makes sense to keep them in the pacer's state.

This CL was mostly generated via

rf '
    ex . {
	memstats.gc_trigger -> gcController.trigger
	memstats.triggerRatio -> gcController.triggerRatio
	memstats.heap_marked -> gcController.heapMarked
	memstats.heap_live -> gcController.heapLive
	memstats.heap_scan -> gcController.heapScan
    }
'

except for a few special cases, like updating names in comments and when
these fields are used within gcControllerState methods (at which point
they're accessed through the receiver).

For #44167.

Change-Id: I6bd1602585aeeb80818ded24c07d8e6fec992b93
Reviewed-on: https://go-review.googlesource.com/c/go/+/306598
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-04-13 23:42:29 +00:00
Michael Anthony Knyszek 39a5ee52b9 runtime: decouple consistent stats from mcache and allow P-less update
This change modifies the consistent stats implementation to keep the
per-P sequence counter on each P instead of each mcache. A valid mcache
is not available everywhere that we want to call e.g. allocSpan, as per
issue #42339. By decoupling these two, we can add a mechanism to allow
contexts without a P to update stats consistently.

In this CL, we achieve that with a mutex. In practice, it will be very
rare for an M to update these stats without a P. Furthermore, the stats
reader also only needs to hold the mutex across the update to "gen"
since once that changes, writers are free to continue updating the new
stats generation. Contention could thus only arise between writers
without a P, and as mentioned earlier, those should be rare.

A nice side-effect of this change is that the consistent stats acquire
and release API becomes simpler.

Fixes #42339.

Change-Id: Ied74ab256f69abd54b550394c8ad7c4c40a5fe34
Reviewed-on: https://go-review.googlesource.com/c/go/+/267158
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-11-02 21:21:46 +00:00
Michael Pratt 6abbfc17c2 runtime: add world-stopped assertions
Stopping the world is an implicit lock for many operations, so we should
assert the world is stopped in functions that require it.

This is enabled along with the rest of lock ranking, though it is a bit
orthogonal and likely cheap enough to enable all the time should we
choose.

Requiring a lock _or_ world stop is common, so that can be expressed as
well.

Updates #40677

Change-Id: If0a58544f4251d367f73c4120c9d39974c6cd091
Reviewed-on: https://go-review.googlesource.com/c/go/+/248577
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>
2020-10-30 20:20:58 +00:00
Michael Anthony Knyszek d39a89fd58 runtime,runtime/metrics: add metric for distribution of GC pauses
For #37112.

Change-Id: Ibb0425c9c582ae3da3b2662d5bbe830d7df9079c
Reviewed-on: https://go-review.googlesource.com/c/go/+/247047
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 21:47:54 +00:00
Michael Anthony Knyszek 22d2b984a6 runtime: make sysMemStats' methods nosplit
sysMemStats are updated early on in runtime initialization, so
triggering a stack growth would be bad. Mark them nosplit.

Thank you so much to cherryyz@google.com for finding this fix!

Fixes #42218.

Change-Id: Ic62db76e6a4f829355d7eaabed1727c51adfbd0f
Reviewed-on: https://go-review.googlesource.com/c/go/+/265157
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-10-26 20:10:27 +00:00
Michael Anthony Knyszek b08dfbaa43 runtime,runtime/metrics: add memory metrics
This change adds support for a variety of runtime memory metrics and
contains the base implementation of Read for the runtime/metrics
package, which lives in the runtime.

It also adds testing infrastructure for the metrics package, and a bunch
of format and documentation tests.
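
For orientation, the package is self-describing, so the full set of
supported metrics can be enumerated at runtime:

    package main

    import (
        "fmt"
        "runtime/metrics"
    )

    func main() {
        for _, d := range metrics.All() {
            fmt.Printf("%-45s cumulative=%v\n", d.Name, d.Cumulative)
        }
    }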

For #37112.

Change-Id: I16a2c4781eeeb2de0abcb045c15105f1210e2d8a
Reviewed-on: https://go-review.googlesource.com/c/go/+/247041
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
2020-10-26 18:29:05 +00:00
Michael Anthony Knyszek 79781e8dd3 runtime: move malloc stats into consistentHeapStats
This change moves the mcache-local malloc stats into the
consistentHeapStats structure so the malloc stats can be managed
consistently with the memory stats. The one exception here is
tinyAllocs for which moving that into the global stats would incur
several atomic writes on the fast path. Microbenchmarks for just one CPU
core have shown a 50% loss in throughput. Since the tiny allocation count
isn't exposed anyway and is always blindly added to both allocs and
frees, let that stay inconsistent and flush the tiny allocation count
every so often.
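
In sketch form, the "flush every so often" compromise looks like this
(the threshold and names are illustrative):

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    // tinyAllocCounter keeps a local count and only folds it into the
    // shared atomic total occasionally, keeping atomics off the
    // tiny-allocation fast path.
    type tinyAllocCounter struct {
        local  uint64
        global *atomic.Uint64
    }

    func (c *tinyAllocCounter) bump() {
        c.local++
        if c.local == 1024 { // illustrative flush threshold
            c.global.Add(c.local)
            c.local = 0
        }
    }

    // flush folds in whatever remains, e.g. when the stats are read.
    func (c *tinyAllocCounter) flush() {
        c.global.Add(c.local)
        c.local = 0
    }

    func main() {
        var total atomic.Uint64
        c := tinyAllocCounter{global: &total}
        for i := 0; i < 3000; i++ {
            c.bump()
        }
        c.flush()
        fmt.Println(total.Load()) // 3000
    }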

Change-Id: I2a4b75f209c0e659b9c0db081a3287bf227c10ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/247039
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:28:56 +00:00
Michael Anthony Knyszek f77a9025f1 runtime: replace some memstats with consistent stats
This change replaces stacks_inuse, gcWorkBufInUse and
gcProgPtrScalarBitsInUse with their corresponding consistent stats. It
also adds checks to make sure the rest of the sharded stats line up with
existing stats in updatememstats.

Change-Id: I17d0bd181aedb5c55e09c8dff18cef5b2a3a14e3
Reviewed-on: https://go-review.googlesource.com/c/go/+/247038
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:28:20 +00:00
Michael Anthony Knyszek fe7ff71185 runtime: add consistent heap statistics
This change adds a global set of heap statistics which are similar
to existing memory statistics. The purpose of these new statistics
is to be able to read them and get a consistent result without stopping
the world. The goal is to eventually replace as many of the existing
memstats statistics with the sharded ones as possible.

The consistent memory statistics use a tailor-made synchronization
mechanism to allow writers (allocators) to proceed with minimal
synchronization by using a sequence counter and a global generation
counter to determine which set of statistics to update. Readers
increment the global generation counter to effectively grab a snapshot
of the statistics, and then iterate over all Ps using the sequence
counter to ensure that they may safely read the snapshotted statistics.
To keep statistics fresh, the reader also has a responsibility to merge
sets of statistics.
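
A heavily simplified, single-writer sketch of that mechanism (the runtime
keeps a sequence counter per P, the reader checks all of them, and
readers serialize with each other; none of that is shown here):

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    // consistentStat keeps three generations of deltas, a global
    // generation counter, and a writer sequence counter
    // (even = idle, odd = update in progress).
    type consistentStat struct {
        gen    atomic.Uint32
        seq    atomic.Uint32
        deltas [3]atomic.Int64
    }

    // add is the writer fast path: pick the generation inside the seq
    // critical section so the reader can tell when reading is safe.
    func (c *consistentStat) add(v int64) {
        c.seq.Add(1) // odd: update in progress
        c.deltas[c.gen.Load()%3].Add(v)
        c.seq.Add(1) // even: update complete
    }

    // read flips the generation, waits for in-flight writers, then folds
    // the retired generation's delta into the running total.
    func (c *consistentStat) read(total *int64) {
        prev := c.gen.Load() % 3
        c.gen.Add(1) // writers now target the next generation
        for c.seq.Load()%2 != 0 {
            // spin: a writer may still be updating the old generation
        }
        *total += c.deltas[prev].Swap(0)
    }

    func main() {
        var c consistentStat
        var total int64
        c.add(5)
        c.add(7)
        c.read(&total)
        fmt.Println("total:", total) // 12
    }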

These consistent statistics are computed, but otherwise unused for now.
Upcoming changes will integrate them with the rest of the codebase and
will begin to phase out existing statistics.

Change-Id: I637a11f2439e2049d7dccb8650c5d82500733ca5
Reviewed-on: https://go-review.googlesource.com/c/go/+/247037
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:28:14 +00:00
Michael Anthony Knyszek ae585ee52c runtime: remove memstats.heap_alloc
memstats.heap_alloc is 100% a duplicate and unnecessary copy of
memstats.alloc which exists because MemStats used to be populated from
memstats via a memmove.

Change-Id: I995489f61be39786e573b8494a8ab6d4ea8bed9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/246975
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:20:05 +00:00
Michael Anthony Knyszek c5dea8f387 runtime: remove memstats.heap_idle
This statistic is updated in many places, but for MemStats it may be
computed from existing statistics. Specifically, by definition,
heap_idle = heap_sys - heap_inuse since heap_sys is all memory allocated
from the OS for use in the heap minus memory used for non-heap purposes.
heap_idle is almost the same (since it explicitly includes memory that
*could* be used for non-heap purposes) but also doesn't include memory
that's actually used to hold heap objects.
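
The same relationship is still visible in the user-facing MemStats
fields; the two lines below should agree, since ReadMemStats reports a
consistent snapshot:

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        var m runtime.MemStats
        runtime.ReadMemStats(&m)
        fmt.Println("HeapIdle:            ", m.HeapIdle)
        fmt.Println("HeapSys - HeapInuse: ", m.HeapSys-m.HeapInuse)
    }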

Although it has some utility as a sanity check, it complicates
accounting and we want fewer, orthogonal statistics for upcoming metrics
changes, so just drop it.

Change-Id: I40af54a38e335f43249f6e218f35088bfd4380d1
Reviewed-on: https://go-review.googlesource.com/c/go/+/246974
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:10:12 +00:00
Michael Anthony Knyszek ad863ba32a runtime: break down memstats.gc_sys
This change breaks apart gc_sys into three distinct pieces. Two of those
pieces are pieces which come from heap_sys since they're allocated from
the page heap. The rest comes from memory mapped from e.g.
persistentalloc which better fits the purpose of a sysMemStat. Also,
rename gc_sys to gcMiscSys.

Change-Id: I098789170052511e7b31edbcdc9a53e5c24573f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/246973
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:10:04 +00:00
Michael Anthony Knyszek 39e335ac06 runtime: copy in MemStats fields explicitly
Currently MemStats is populated via an unsafe memmove from memstats, but
this places unnecessary structural restrictions on memstats, is annoying
to reason about, and tightly couples the two. Instead, just populate the
fields of MemStats explicitly.

Change-Id: I96f6a64326b1a91d4084e7b30169a4bbe6a331f9
Reviewed-on: https://go-review.googlesource.com/c/go/+/246972
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:09:52 +00:00
Michael Anthony Knyszek 8ebc58452a runtime: delineate which memstats are system stats with a type
This change modifies the type of several mstats fields to be a new type:
sysMemStat. This type has the same structure as the fields used to have.

The purpose of this change is to make it very clear which stats may be
used in various functions for accounting (usually the platform-specific
sys* functions, but there are others). Currently there's an implicit
understanding that the *uint64 value passed to these functions is some
kind of statistic whose value is atomically managed. This understanding
isn't inherently problematic, but we're about to change how some stats
(which currently use mSysStatInc and mSysStatDec) work, so we want to
make it very clear what the various requirements are around "sysStat".

This change also removes mSysStatInc and mSysStatDec in favor of a
method on sysMemStat. Note that those two functions were originally
written the way they were because atomic 64-bit adds required a valid G
on ARM, but this hasn't been the case for a very long time (since
golang.org/cl/14204, but even before then it wasn't clear if mutexes
required a valid G anymore). Today we implement 64-bit adds on ARM with
a spinlock table.
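
A minimal sketch of the resulting type, using sync/atomic (the real
version also deals with nosplit constraints and 64-bit alignment, which
are not shown):

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    // sysMemStat: a memory statistic whose value is always managed
    // atomically, with signed deltas replacing the old
    // mSysStatInc/mSysStatDec pair.
    type sysMemStat struct{ v atomic.Int64 }

    func (s *sysMemStat) load() uint64 { return uint64(s.v.Load()) }

    func (s *sysMemStat) add(n int64) {
        if s.v.Add(n) < 0 {
            panic("sysMemStat underflow") // the runtime throws here
        }
    }

    func main() {
        var stacksSys sysMemStat // illustrative stat
        stacksSys.add(32 << 10)
        stacksSys.add(-8 << 10)
        fmt.Println(stacksSys.load()) // 24576
    }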

Change-Id: I4e9b37cf14afc2ae20cf736e874eb0064af086d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/246971
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:09:41 +00:00
Michael Anthony Knyszek c863849800 runtime: rename mcache fields to match Go style
This change renames a bunch of malloc statistics stored in the mcache
that are all named with the "local_" prefix. It also renames largeAlloc
to allocLarge to prevent a naming conflict, and next_sample because it
would be the last mcache field with the old C naming style.

Change-Id: I29695cb83b397a435ede7e9ad5c3c9be72767ea3
Reviewed-on: https://go-review.googlesource.com/c/go/+/246969
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:26:48 +00:00
Michael Anthony Knyszek d677899e90 runtime: flush local_scan directly and more often
Now that local_scan is the last mcache-based statistic that is flushed
by purgecachedstats, and heap_scan and gcController.revise may be
interacted with concurrently, we don't need to flush heap_scan at
arbitrary locations where the heap is locked, and we don't need
purgecachedstats and cachestats anymore. Instead, we can flush
local_scan at the same time we update heap_live in refill, so the two
updates may share the same revise call.

Clean up unused functions, remove code that would cause the heap to get
locked in the allocSpan when it didn't need to (other than to flush
local_scan), and flush local_scan explicitly in a few important places.
Notably we need to flush local_scan whenever we flush the other stats,
but it doesn't need to be donated anywhere, so have releaseAll do the
flushing. Also, we need to flush local_scan before we set heap_scan at
the end of a GC, which was previously handled by cachestats. Just do so
explicitly -- it's not much code and it becomes a lot more clear why we
need to do so.

Change-Id: I35ac081784df7744d515479896a41d530653692d
Reviewed-on: https://go-review.googlesource.com/c/go/+/246968
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:26:40 +00:00
Michael Anthony Knyszek cca3d1e553 runtime: don't flush local_tinyallocs
This change makes local_tinyallocs work like the rest of the malloc
stats and doesn't flush local_tinyallocs, instead making that the
source-of-truth.

Change-Id: I3e6cb5f1b3d086e432ce7d456895511a48e3617a
Reviewed-on: https://go-review.googlesource.com/c/go/+/246967
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:26:30 +00:00