7102 Commits

Author SHA1 Message Date
Michael Pratt 77e3d8cf13 internal/runtime/maps: small maps point directly to a group
If the map contains 8 or fewer entries, it is wasteful to have a
directory that points to a table that points to a group.

Add a special case that replaces the directory with a direct pointer to
a group.

We could theoretically do similar for single table maps (no directory,
just point directly to a table), but that is left for later.
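
As a rough illustration of the special case (not the actual internal/runtime/maps code), the sketch below keeps a direct pointer to one 8-slot group while the map is small and switches to a general structure only when a ninth entry arrives. All type and field names are hypothetical, and a built-in map stands in for the directory/table path.

```go
package main

import "fmt"

// groupSlots is the group capacity named in the commit message (8 entries).
const groupSlots = 8

// group is a hypothetical fixed-size bucket; real groups also carry
// control bytes, which are omitted here.
type group struct {
	keys [groupSlots]string
	vals [groupSlots]int
	used int
}

// sketchMap is a hypothetical map header. While small is non-nil, lookups
// go straight to that one group with no directory or table in between.
type sketchMap struct {
	small *group
	large map[string]int // stand-in for directory -> table -> group
}

func newSketchMap() *sketchMap { return &sketchMap{small: &group{}} }

func (m *sketchMap) put(k string, v int) {
	if m.small == nil {
		m.large[k] = v
		return
	}
	g := m.small
	for i := 0; i < g.used; i++ {
		if g.keys[i] == k {
			g.vals[i] = v
			return
		}
	}
	if g.used < groupSlots {
		g.keys[g.used], g.vals[g.used] = k, v
		g.used++
		return
	}
	// Ninth distinct key: move everything to the general representation.
	m.large = make(map[string]int, 2*groupSlots)
	for i := 0; i < g.used; i++ {
		m.large[g.keys[i]] = g.vals[i]
	}
	m.large[k] = v
	m.small = nil
}

func (m *sketchMap) get(k string) (int, bool) {
	if m.small == nil {
		v, ok := m.large[k]
		return v, ok
	}
	for i := 0; i < m.small.used; i++ {
		if m.small.keys[i] == k {
			return m.small.vals[i], true
		}
	}
	return 0, false
}

func (m *sketchMap) isSmall() bool { return m.small != nil }

func main() {
	m := newSketchMap()
	for i := 0; i < 9; i++ {
		m.put(fmt.Sprint("k", i), i)
		fmt.Println(i+1, "entries, small:", m.isSmall())
	}
}
```

The linear scan over at most eight slots is only a placeholder for whatever probing the real group uses.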

For #54766.

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap
Change-Id: I6fc04dfc11c31dadfe5b5d6481b4c4abd43d48ed
Reviewed-on: https://go-review.googlesource.com/c/go/+/611188
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2024-10-28 20:35:25 +00:00
Filippo Valsorda f0b51a2099 crypto/internal/fips: add service indicator mechanism
Placed the fipsIndicator field in some 64-bit alignment padding in the g
struct to avoid growing per-goroutine memory requirements on 64-bit
targets.

Fixes #69911
Updates #69536

Change-Id: I176419d0e3814574758cb88a47340a944f405604
Reviewed-on: https://go-review.googlesource.com/c/go/+/620795
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Derek Parker <parkerderek86@gmail.com>
2024-10-28 14:55:26 +00:00
Filippo Valsorda 0138c1abef Revert "crypto/rand: add randcrash=0 GODEBUG"
A GODEBUG is actually a security risk here: most programs will start to
ignore errors from Read because they can't happen (which is the intended
behavior), but then if a program is run with GODEBUG=randcrash=0 it will
use a partial buffer in case an error occurs, which may be catastrophic.

Note that the proposal was accepted without the GODEBUG, which was only
added later.

This (partially) reverts CL 608435. I kept the tests.

Updates #66821

Change-Id: I3fd20f9cae0d34115133fe935f0cfc7a741a2662
Reviewed-on: https://go-review.googlesource.com/c/go/+/622115
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
2024-10-28 14:46:33 +00:00
Michael Anthony Knyszek 4bf98186b5 runtime: fix mallocgc for asan
This change finally fully fixes mallocgc for asan after the recent
refactoring. Here is everything that changed:

Fix the accounting for the alloc header; large objects don't have them.

Mask out extra bits set from unrolling the bitmap for slice backing
stores in writeHeapBitsSmall. The redzone in asan mode makes it so that
dataSize is no longer an exact multiple of typ.Size_ in this case (a
new assumption I have recently discovered) but we didn't mask out any
extra bits, so we'd accidentally set bits in other allocations. Oops.
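
The masking step can be sketched in isolation. The function below is a hypothetical simplification, not writeHeapBitsSmall itself: it replicates a per-element pointer bitmap across a region whose length in words is not a multiple of the element size, then clears the bits that spilled past the region.

```go
package main

import "fmt"

// unrollMasked replicates a per-element pointer bitmap (one bit per word,
// elemWords bits wide) across dataWords words and then clears any bits at
// or above dataWords. dataWords must be below 64 in this sketch.
func unrollMasked(pattern uint64, elemWords, dataWords uint) uint64 {
	var bits uint64
	for i := uint(0); i < dataWords; i += elemWords {
		bits |= pattern << i
	}
	// Without this mask, the last partial element would set bits that
	// belong to whatever follows the region.
	return bits & (uint64(1)<<dataWords - 1)
}

func main() {
	// A 3-word element with pointers in words 0 and 2, unrolled over 7 words.
	fmt.Printf("%b\n", unrollMasked(0b101, 3, 7))
}
```

With these inputs the unmasked result would also have bit 8 set, which lies outside the 7-word region.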

Move the initHeapBits optimization for the 8-byte scan sizeclass on
64-bit platforms up to mallocgc, out from writeHeapBitsSmall. So, this
actually caused a problem with asan when the optimization first landed,
but we missed it. The issue was then masked once we started passing the
redzone down into writeHeapBitsSmall, since the optimization would no
longer erroneously fire on asan. What happened was that dataSize would
be 8 (because that was the user-provided alloc size) so we'd skip
writing heap bits, but it would turn out the redzone bumped the size
class, so we'd actually *have* to write the heap bits for that size
class. This is not really a problem now *but* it caused problems for me
when debugging, since I would try to remove the red zone from dataSize
and this would trigger this bug again. Ultimately, this whole situation
is confusing because the check in writeHeapBitsSmall is *not* the same
as the check in initHeapBits. By moving this check up to mallocgc, we
can make the checks align better by matching on the sizeclass, so this
should be less error-prone in the future.

Change-Id: I1e9819223be23f722f3bf21e63e812f5fb557194
Reviewed-on: https://go-review.googlesource.com/c/go/+/622041
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-25 20:35:44 +00:00
Cherry Mui 22664f33b7 runtime: reserve less memory for aligned reservation on sbrk systems
Sometimes the runtime needs to reserve some memory with a large
alignment, which the OS usually won't directly satisfy. So, it
asks for size+align bytes instead, and frees the unaligned portions.
On sbrk systems, this doesn't work that well, as freeing the tail
portion doesn't really free the memory to the OS. Instead, we
could simply round the current break up, then reserve the given
size, without wasting the tail portion.
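
The difference between the two strategies comes down to how far the break moves. The following sketch models the break as a plain integer; the function names are hypothetical and no real memory is reserved.

```go
package main

import "fmt"

// alignUp rounds p up to a multiple of align, which must be a power of two.
func alignUp(p, align uintptr) uintptr {
	return (p + align - 1) &^ (align - 1)
}

// overReserve models asking for size+align bytes. On an sbrk system the
// break stays past the whole over-sized request even if the tail is "freed".
func overReserve(brk, size, align uintptr) (start, newBrk uintptr) {
	return alignUp(brk, align), brk + size + align
}

// roundBreak models rounding the break up first and then reserving exactly
// size bytes, so nothing is wasted after the reservation.
func roundBreak(brk, size, align uintptr) (start, newBrk uintptr) {
	start = alignUp(brk, align)
	return start, start + size
}

func main() {
	const brk, size, align = 0x1000, 0x2000, 0x4000
	s1, b1 := overReserve(brk, size, align)
	s2, b2 := roundBreak(brk, size, align)
	fmt.Printf("over-reserve: start=%#x break=%#x\n", s1, b1)
	fmt.Printf("round break:  start=%#x break=%#x\n", s2, b2)
}
```

Both strategies return the same aligned start address; only the final break differs.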

Also, don't create heap arena hints on sbrk systems. We can only
grow the break sequentially, and reserving specific addresses
would not succeed anyway.

For #69018.

Change-Id: Iadc2c54d62b00ad7befa5bbf71146523483a8c47
Reviewed-on: https://go-review.googlesource.com/c/go/+/621715
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2024-10-25 17:13:47 +00:00
qmuntal 0addb2a4ea runtime: document that Caller and Frame.File always use forward slashes
Document that Caller and Frame.File always use forward slashes
as path separators, even on Windows.

Fixes #3335

Change-Id: Ic5bbf8a1f14af64277dca4783176cd8f70726b91
Reviewed-on: https://go-review.googlesource.com/c/go/+/603275
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-10-25 16:18:18 +00:00
Michael Anthony Knyszek 2a98a1849f runtime: uphold goroutine profile invariants in coroswitch
Goroutine profiles require checking in with the profiler before any
goroutine starts running. coroswitch is a place where a goroutine may
start running, but where we do not check in with the profiler, which
leads to crashes. Fix this by checking in with the profiler the same way
execute does.

Fixes #69998.

Change-Id: Idef6dd31b70a73dd1c967b56c307c7a46a26ba73
Reviewed-on: https://go-review.googlesource.com/c/go/+/622016
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-24 17:09:10 +00:00
Michael Anthony Knyszek fb2a7f8ce1 runtime: fix ASAN poison calculation in mallocgc
A previous CL broke the ASAN poisoning calculation in mallocgc by not
taking into account a possible allocation header, so the beginning of
the following allocation could have been poisoned.

This mostly isn't a problem, actually, since the following slot would
usually just have an allocation header in it that programs shouldn't be
touching anyway, but if we're going a word-past-the-end at the end of a
span, we could be poisoning a valid heap allocation.

Change-Id: I76a4f59bcef01af513a1640c4c212c0eb6be85b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/622295
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2024-10-24 16:55:08 +00:00
Ian Lance Taylor 76f3208367 runtime: support cgo index into pointer-to-array
We were missing a case for calling a C function with an index
into a pointer-to-array.

Fixes #70016

Change-Id: I9c74d629e58722813c1aaa0f0dc225a5a64d111b
Reviewed-on: https://go-review.googlesource.com/c/go/+/621576
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2024-10-24 14:34:01 +00:00
Shuo Wang 87a89fa451 runtime: add the checkPtraceScope to skip certain tests
When the kernel parameter ptrace_scope is set to 2 or 3,
certain test cases in runtime-gdb_test.go will fail.
We should skip these tests.
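
A check of this kind could look roughly like the sketch below. It is not the helper added by the commit: restrictsPtrace is a hypothetical name, and reading the value from the Yama sysctl file under /proc is an assumption about how the setting is obtained.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// ptraceScopePath is where Linux's Yama security module exposes the
// ptrace_scope setting (assumed location for this sketch).
const ptraceScopePath = "/proc/sys/kernel/yama/ptrace_scope"

// restrictsPtrace reports whether the file contents hold 2 or 3, the
// values the commit message says make the gdb tests fail.
func restrictsPtrace(contents string) bool {
	n, err := strconv.Atoi(strings.TrimSpace(contents))
	if err != nil {
		return false
	}
	return n == 2 || n == 3
}

func main() {
	data, err := os.ReadFile(ptraceScopePath)
	if err != nil {
		fmt.Println("ptrace_scope not readable; not skipping")
		return
	}
	fmt.Println("skip gdb tests:", restrictsPtrace(string(data)))
}
```

In a real test the result would typically feed a t.Skip call instead of a print.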

Fixes #69932

Change-Id: I685d1217f1521d7f8801680cf6b71d8e7a265188
GitHub-Last-Rev: 063759e04c
GitHub-Pull-Request: golang/go#69933
Reviewed-on: https://go-review.googlesource.com/c/go/+/620857
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-10-23 17:24:33 +00:00
Ian Lance Taylor d0631b90a3 runtime/debug: minor cleanups after CL 384154
Change some vars to consts, remove some unneeded string conversions.

Change-Id: Ib12eed11ef080c4b593c8369bb915117e7100045
Reviewed-on: https://go-review.googlesource.com/c/go/+/621838
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2024-10-23 04:48:55 +00:00
Ian Lance Taylor 0c460ad014 runtime/debug: document ParseBuildInfo and (*BuildInfo).String
For #51026
Fixes #69971

Change-Id: I47f2938d20cbe9462bf738a506baedad4a7006c3
Reviewed-on: https://go-review.googlesource.com/c/go/+/621837
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-23 04:48:53 +00:00
changwang ma 34e96356b7 runtime: fix typo in error message
Change-Id: I27bf98e84545746d90948dd06c4a7bd70782c49d
Reviewed-on: https://go-review.googlesource.com/c/go/+/621895
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2024-10-23 02:33:18 +00:00
Andrey Bokhanko 4e70258601 runtime: Check LSE support on ARM64 at runtime init
Check for LSE support on the ARM64 chip at runtime init if it was targeted at compile time.

Related to #69124
Update #60905

Change-Id: I6fe244decbb4982548982e1f88376847721a33c7
Reviewed-on: https://go-review.googlesource.com/c/go/+/610195
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Shu-Chun Weng <scw@google.com>
2024-10-22 16:16:32 +00:00
Michael Pratt 067c091564 cmd/link,runtime: DWARF/gdb support for swiss maps
For #54766.

Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-amd64-longtest-swissmap
Change-Id: I6695c0b143560d974b710e1d78e7a7d09278f7cc
Reviewed-on: https://go-review.googlesource.com/c/go/+/620215
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-21 21:59:06 +00:00
Cherry Mui 4510586f93 runtime: (re)use unused linear memory on Wasm
CL 476717 adopted the memory management mechanism on Plan 9 to
manage Wasm's linear memory. But the Plan 9 code uses global
variables bloc and blocMax to keep track of the runtime's and the
OS's sense of break, whereas the Wasm sbrk function doesn't use
those global variables, and directly goes to grow the linear
memory instead. As a result, if there is any unused portion at
the end of the linear memory, the runtime doesn't use it. This CL
fixes that by adopting the same mechanism as the Plan 9 code.

In particular, the runtime is not aware of any unused initial
memory at startup. Therefore, (most of) the extra initial memory
set by the linker is not actually used. This CL fixes this as
well.
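
The bookkeeping described above can be modeled without any real Wasm memory. In this sketch, linearMemory and growCalls are hypothetical; only the names bloc and blocMax come from the commit message.

```go
package main

import "fmt"

// wasmPageSize is the granularity at which WebAssembly linear memory grows.
const wasmPageSize = 64 << 10

// linearMemory is a hypothetical model: bloc is the runtime's idea of the
// break, blocMax is the end of the linear memory that already exists.
type linearMemory struct {
	bloc, blocMax uintptr
	growCalls     int
}

// sbrk hands out n bytes starting at bloc. It grows the memory only when
// the existing space between bloc and blocMax is too small.
func (m *linearMemory) sbrk(n uintptr) uintptr {
	base := m.bloc
	if end := base + n; end > m.blocMax {
		need := end - m.blocMax
		pages := (need + wasmPageSize - 1) / wasmPageSize
		m.blocMax += pages * wasmPageSize
		m.growCalls++
	}
	m.bloc = base + n
	return base
}

func main() {
	m := &linearMemory{bloc: 4096, blocMax: wasmPageSize}
	fmt.Println(m.sbrk(1000), m.growCalls)         // fits in existing memory
	fmt.Println(m.sbrk(wasmPageSize), m.growCalls) // needs one more page
}
```

Seeding blocMax with the initial memory size is what lets extra initial memory set by the linker get used before any growth happens.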

For #69018.

Change-Id: I2ea6a138310627eda5f19a1c76b1e1327362e5f2
Reviewed-on: https://go-review.googlesource.com/c/go/+/621635
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-21 21:52:53 +00:00
Michael Anthony Knyszek 6a49f81edc runtime,time: use atomic.Int32 for isSending
This change switches isSending to be an atomic.Int32 instead of an
atomic.Uint8. The Int32 version is managed as a counter, which is
something that we couldn't do with Uint8 without adding a new intrinsic
which may not be available on all architectures.

That is, instead of only being able to support 8 concurrent timer
firings on the same timer because we only have 8 independent bits to set
for each concurrent timer firing, we can now have 2^31-1 concurrent
timer firings before running into any issues. Like the fact that each
bit-set was matched with a clear, here we match increments with
decrements to indicate that we're in the "sending on a channel" critical
section in the timer code, so we can report the correct result back on
Stop or Reset.

We choose an Int32 instead of a Uint32 because it's easier to check for
obviously bad values (negative values are always bad) and 2^31-1
concurrent timer firings should be enough for anyone.
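
The counter discipline can be sketched with sync/atomic, which is an assumption for illustration (the runtime uses its own internal atomic package). sendTracker and its methods are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// sendTracker is a hypothetical stand-in for the timer's isSending field.
type sendTracker struct {
	isSending atomic.Int32
}

// begin marks entry into the "sending on a channel" critical section.
func (s *sendTracker) begin() { s.isSending.Add(1) }

// end marks exit; a negative count is always a bug, which is the easy
// sanity check a signed counter makes possible.
func (s *sendTracker) end() {
	if s.isSending.Add(-1) < 0 {
		panic("isSending went negative: end without begin")
	}
}

// sending reports whether any send is currently in flight.
func (s *sendTracker) sending() bool { return s.isSending.Load() > 0 }

func main() {
	var s sendTracker
	var wg sync.WaitGroup
	// Far more concurrent sections than the 8 a Uint8 bitmask could track.
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.begin()
			defer s.end()
		}()
	}
	wg.Wait()
	fmt.Println("still sending:", s.sending())
}
```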

Previously, we avoided anything bigger than a Uint8 because we could
pack it into some padding in the runtime.timer struct. But it turns out
that the type that actually matters, runtime.timeTimer, is exactly 96
bytes in size. This means it's in the next size class up, the 112-byte
size class, because of an allocation header. We thus have some free space
to work with. This change increases the size of this struct from 96
bytes to 104 bytes.

(I'm not sure if runtime.timer is often allocated directly, but if it
is, we get lucky in the same way too. It's exactly 80 bytes in size,
which means it's in the 96-byte size class, leaving us with some space to
work with.)
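
The size-class arithmetic in the two paragraphs above can be checked with a small sketch. It takes the commit message's figures at face value: the table is only an excerpt of size classes around the values mentioned, and the 8-byte header is an assumption implied by that arithmetic.

```go
package main

import "fmt"

// sizeClasses is an excerpt of small size classes, in bytes, covering the
// values mentioned in the commit message.
var sizeClasses = []uintptr{64, 80, 96, 112, 128}

// headerSize is the allocation header size assumed for this sketch.
const headerSize = 8

// classFor returns the smallest listed size class that fits size bytes,
// or 0 if size is larger than every class in the excerpt.
func classFor(size uintptr) uintptr {
	for _, c := range sizeClasses {
		if size <= c {
			return c
		}
	}
	return 0
}

func main() {
	fmt.Println("timeTimer before:", classFor(96+headerSize))
	fmt.Println("timeTimer after: ", classFor(104+headerSize))
	fmt.Println("timer before:    ", classFor(80+headerSize))
	fmt.Println("timer after:     ", classFor(88+headerSize))
}
```

Under these assumptions the 8 extra bytes leave both structs in the same size class as before.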

Fixes #69969.
Related to #69880 and #69312.

Change-Id: I9fd59cb6a69365c62971d1f225490a65c58f3e77
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/621616
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-21 21:48:17 +00:00
Alfonso Subiotto Marques 2e9ed44d39 runtime: remove linkname from memhash{32,64} functions
Remove linkname directives that are no longer necessary given
parquet-go/parquet-go#142 removes the dependency on the `memhash{32,64}`
functions.

This change also removes references to segmentio/parquet-go since that
repository was archived in favor of parquet-go/parquet-go.

Updates #67401

Change-Id: Ibafb0c41b39cdb86dac5531f62787fb5cb8d3f01
GitHub-Last-Rev: e14c4e4dfe
GitHub-Pull-Request: golang/go#67784
Reviewed-on: https://go-review.googlesource.com/c/go/+/589795
Auto-Submit: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2024-10-21 18:45:24 +00:00
Michael Anthony Knyszek acd072a078 runtime: execute publicationBarrier in noscan case for delayed zeroing
This is a peace-of-mind change to make sure that delayed-zeroed memory
(in the large alloc case) is globally visible from the moment the
allocation is published back to the caller.

The way it's written right now is good enough for the garbage collector
(we already have a publication barrier for a nil span.largeType, so the
GC will ignore the noscan span) but this might matter for user code on
weak memory architectures.

Change-Id: I06ac9b95863074e5f09382629083b19bfa87fdb8
Reviewed-on: https://go-review.googlesource.com/c/go/+/619036
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21 15:56:31 +00:00
Michael Anthony Knyszek a1c4fb4361 runtime: specialize heapSetType
Last CL we separated mallocgc into several specialized paths. Let's
split up heapSetType too. This will make the specialized heapSetType
functions inlineable and cut out some branches as well as a function
call.

Microbenchmark results at this point in the stack:

                   │ before.out  │            after-5.out             │
                   │   sec/op    │   sec/op     vs base               │
Malloc8-4            13.52n ± 3%   12.15n ± 2%  -10.13% (p=0.002 n=6)
Malloc16-4           21.49n ± 2%   18.32n ± 4%  -14.75% (p=0.002 n=6)
MallocTypeInfo8-4    27.12n ± 1%   18.64n ± 2%  -31.30% (p=0.002 n=6)
MallocTypeInfo16-4   28.71n ± 3%   21.63n ± 5%  -24.65% (p=0.002 n=6)
geomean              21.81n        17.31n       -20.64%

Change-Id: I5de9ac5089b9eb49bf563af2a74e6dc564420e05
Reviewed-on: https://go-review.googlesource.com/c/go/+/614795
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2024-10-21 15:56:28 +00:00
Michael Anthony Knyszek 8730fcf885 runtime: refactor mallocgc into several independent codepaths
Right now mallocgc is a monster of a function. In real programs, we see
that a substantial amount of time in mallocgc is spent in mallocgc
itself. It's very branch-y, holds a lot of state, and handles quite a few
disparate cases, trying to merge them together.

This change breaks apart mallocgc into separate, leaner functions.
There's some duplication now, but there are a lot of branches that can
be pruned as a result.

There's definitely still more we can do here. heapSetType can be inlined
and broken down for each case, since its internals roughly map to each
case anyway (done in a follow-up CL). We can probably also do more with
the size class lookups, since we know more about the size of the object
in each case than before.

Below are the savings for the full stack up until now.

                    │ after-3.out │              after-4.out              │
                    │   sec/op    │     sec/op      vs base               │
Malloc8-4             13.32n ± 2%   12.17n ±  1%     -8.63% (p=0.002 n=6)
Malloc16-4            21.64n ± 3%   19.38n ± 10%    -10.47% (p=0.002 n=6)
MallocTypeInfo8-4     23.15n ± 2%   19.91n ±  2%    -14.00% (p=0.002 n=6)
MallocTypeInfo16-4    25.86n ± 4%   22.48n ±  5%    -13.11% (p=0.002 n=6)
MallocLargeStruct-4                 270.0n ±   ∞ ¹
geomean               20.38n        30.97n          -11.58%

Change-Id: I681029c0b442f9221c4429950626f06299a5cfe4
Reviewed-on: https://go-review.googlesource.com/c/go/+/614257
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21 15:56:25 +00:00
Michael Anthony Knyszek 60ee99cf5d runtime: break out the debug.malloc codepaths into functions
This change breaks out the debug.malloc codepaths into dedicated
functions, both for making mallocgc easier to read, and to reduce the
function's size (currently all that code is inlined and really doesn't
need to be).

This is a microoptimization that on its own changes very little, but
together with other optimizations and a breaking up of the various
malloc paths will matter all together ("death by a thousand cuts").

Change-Id: I30b3ab4a1f349ba85b4a1b5b2c399abcdfe4844f
Reviewed-on: https://go-review.googlesource.com/c/go/+/617879
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-21 15:55:37 +00:00
Michael Anthony Knyszek 6686edc0e7 runtime: move debug checks behind constant flag in mallocgc
These debug checks are very occasionally helpful, but they do cost real
time. The biggest issue seems to be the bloat of mallocgc due to the
"throw" paths. Overall, after some follow-ups, this change cuts about
1ns off of the mallocgc fast path.

This is a microoptimization that on its own changes very little, but
together with other optimizations and a breaking up of the various
malloc paths will matter all together ("death by a thousand cuts").

Change-Id: I07c4547ad724b9f94281320846677fb558957721
Reviewed-on: https://go-review.googlesource.com/c/go/+/617878
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21 15:48:20 +00:00
Michael Anthony Knyszek e750a0cdb3 runtime: rename shouldhelpgc to checkGCTrigger in mallocgc
shouldhelpgc is a very unhelpful name, because it has nothing to do with
assists and solely to do with GC triggering. Name it checkGCTrigger
instead, which is much clearer.

Change-Id: Id38debd424ddb397376c0cea6e74b3fe94002f71
Reviewed-on: https://go-review.googlesource.com/c/go/+/617877
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21 15:48:17 +00:00
Michael Anthony Knyszek 8df6413e11 runtime: recompute assistG before and after malloc
This change stops tracking assistG across malloc to reduce the number of
slots the compiler must keep track of in mallocgc, which adds to
register pressure. It also makes the call to deductAssistCredit only
happen if the GC is running.

This is a microoptimization that on its own changes very little, but
together with other optimizations and a breaking up of the various
malloc paths will matter all together ("death by a thousand cuts").

Change-Id: I4cfac7f3e8e873ba66ff3b553072737a4707e2c2
Reviewed-on: https://go-review.googlesource.com/c/go/+/617876
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
2024-10-21 15:48:15 +00:00
Michael Anthony Knyszek d8997c8c1f runtime: use wb flag instead of gcphase for allocate-black check
This is an allocator microoptimization. There's no reason to check
gcphase in general, since it's mostly for debugging anyway.
writeBarrier.enabled is set in all the same cases here, and we force one
fewer cache line (probably) to be touched during malloc.

Conceptually, it also makes a bit more sense. The allocate-black policy
is partly informed by the write barrier design.

Change-Id: Ia5ff593d64c29cf7f4d1bced3204056566444a98
Reviewed-on: https://go-review.googlesource.com/c/go/+/617875
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21 15:46:44 +00:00
Michael Anthony Knyszek 31437f25f2 runtime: simplify mem profile checking in mallocgc
Checking whether the current allocation needs to be profiled is
currently branch-y and weirdly a lot of code. The branches are
generally predictable, but it's a surprising number of instructions.
Part of the problem is that MemProfileRate is just a global that can be
set at any time, so we need to load it and check certain settings
explicitly. In an ideal world, we would just always subtract from
nextSample and have a single branch to take the slow path if we
subtract below zero.

If MemProfileRate were a function, we could trash all the nextSample
values intentionally in each mcache. This would be slow, but
MemProfileRate changes rarely while the malloc hot path is, well, hot.
Unfortunate...

Although this ideal world is, AFAICT, impossible, we can still get
close. If we cache the value of MemProfileRate in each mcache, then we
can force malloc to take the slow path whenever MemProfileRate changes.
This does require two additional loads, but crucially, these loads are
independent of everything else in mallocgc. Furthermore, the branch
dependent on those loads is incredibly predictable in practice.
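
The caching idea can be sketched as follows. This is a simplified model, not the runtime's code: cache, takeSlowPath and the package-level memProfileRate are hypothetical stand-ins, the sample distance is deterministic here where the real allocator would randomize it, and the rate-of-zero case is ignored.

```go
package main

import "fmt"

// memProfileRate stands in for the global that can change at any time.
var memProfileRate = 512 * 1024

// cache is a hypothetical per-P cache holding a copy of the rate.
type cache struct {
	cachedRate int
	nextSample int
}

// takeSlowPath reports whether an allocation of size bytes must leave the
// fast path, either because the rate changed or because a sample is due.
func (c *cache) takeSlowPath(size int) bool {
	if c.cachedRate != memProfileRate {
		// The rate changed since we last looked: refresh the cached copy
		// and re-sample immediately.
		c.cachedRate = memProfileRate
		c.nextSample = memProfileRate
		return true
	}
	c.nextSample -= size
	if c.nextSample < 0 {
		c.nextSample = memProfileRate
		return true
	}
	return false
}

func main() {
	c := &cache{}
	fmt.Println(c.takeSlowPath(64)) // first call: cached rate is stale
	fmt.Println(c.takeSlowPath(64)) // steady state: fast path
	memProfileRate = 1024
	fmt.Println(c.takeSlowPath(64)) // rate changed: slow path again
}
```

The comparison against the cached rate is the extra, highly predictable branch the commit message refers to.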

This CL on its own has little-to-no impact on mallocgc. But this
codepath is going to be duplicated in several places in the next CL, so
it'll pay to simplify it. Also, we're very much trying to remedy a
death-by-a-thousand-cuts situation, and malloc is currently still kind
of a monster -- it will not help if mallocgc isn't really streamlined
itself.

Lastly, there's a nice property now that all nextSample values get
immediately re-sampled when MemProfileRate changes.

Change-Id: I6443d0cf9bd7861595584442b675ac1be8ea3455
Reviewed-on: https://go-review.googlesource.com/c/go/+/615815
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2024-10-21 15:46:39 +00:00
Michael Anthony Knyszek 721c04ae4e runtime: optimize 8-byte allocation pointer data writing
This change brings back a minor optimization lost in the Go 1.22 cycle
wherein the 8-byte pointer-ful span class spans would have the pointer
bitmap written ahead of time in bulk, because there's only one possible
pattern.

                  │   before    │               after               │
                  │   sec/op    │   sec/op     vs base              │
MallocTypeInfo8-4   25.13n ± 1%   23.59n ± 2%  -6.15% (p=0.002 n=6)

Change-Id: I135b84bb1d5b7e678b841b56430930bc73c0a038
Reviewed-on: https://go-review.googlesource.com/c/go/+/614256
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-21 14:47:08 +00:00
Michael Anthony Knyszek 56fb8350c8 runtime: don't call span.heapBits in writeHeapBitsSmall
For whatever reason, span.heapBits is kind of slow. It accounts for
about a quarter of the cost of writeHeapBitsSmall, which is absurd. We
get a nice speed improvement for small allocations by eliminating this
call.

                   │   before    │               after               │
                   │   sec/op    │   sec/op     vs base              │
MallocTypeInfo16-4   29.47n ± 1%   27.02n ± 1%  -8.31% (p=0.002 n=6)

Change-Id: I6270e26902e5a9254cf1503fac81c3c799c59d6a
Reviewed-on: https://go-review.googlesource.com/c/go/+/614255
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
2024-10-21 14:47:04 +00:00
Michael Pratt d94b7a1876 cmd/compile,internal/runtime/maps: add extendible hashing
Extendible hashing splits a swisstable map into many swisstables. This
keeps grow operations small.

For #54766.

Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-amd64-longtest-swissmap
Change-Id: Id91f34af9e686bf35eb8882ee479956ece89e821
Reviewed-on: https://go-review.googlesource.com/c/go/+/604936
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2024-10-21 14:16:20 +00:00
Shuo Wang c0a126b8dc runtime: revise the documentation comments for netpoll
Supplement to CL 511455.

Updates #61454

Change-Id: I111cbf297dd9159cffba333d610a7a4542915c55
GitHub-Last-Rev: fe8fa18486
GitHub-Pull-Request: golang/go#69900
Reviewed-on: https://go-review.googlesource.com/c/go/+/620495
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2024-10-21 13:40:40 +00:00
Michael Pratt 488e2d18d9 runtime: more thorough map benchmarks
Based on the benchmarks in github.com/cockroachdb/swiss.

For #54766.

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap
Change-Id: I9ad925d3272c671e21ec04eb2da5ebd8f0fc6a28
Reviewed-on: https://go-review.googlesource.com/c/go/+/596295
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
2024-10-18 23:13:43 +00:00
Joseph Myers 04f054d334 runtime/testdata: fix for C23 nullptr keyword
src/runtime/testdata/testprogcgo/threadprof.go contains C code with a
variable called nullptr.  This conflicts with the nullptr keyword in
the C23 revision of the C standard (showing up as gccgo test build
failures when updating GCC to use C23 by default when building C
code).

Rename that variable to nullpointer to avoid the clash with the
keyword (any other name that's not a keyword would work just as well).

Change-Id: Ida5ef371a3f856c611409884e185c3d5ded8e86c
GitHub-Last-Rev: 2ec464703b
GitHub-Pull-Request: golang/go#69927
Reviewed-on: https://go-review.googlesource.com/c/go/+/620955
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2024-10-18 22:35:39 +00:00
Austin Clements 89f29a772a runtime: clarify work.bytesMarked documentation
Change-Id: If5132400aac0ef00e467958beeaab5e64d053d10
Reviewed-on: https://go-review.googlesource.com/c/go/+/619099
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Austin Clements <austin@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-14 20:36:52 +00:00
Michael Pratt c39bc22c14 all: wire up swisstable maps
Use the new SwissTable-based map in internal/runtime/maps as the basis
for the runtime map when GOEXPERIMENT=swissmap.

Integration is complete enough to pass all.bash. Notable missing
features:

* Race integration / concurrent write detection
* Stack-allocated maps
* Specialized "fast" map variants
* Indirect key / elem

For #54766.

Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-amd64-longtest-swissmap
Change-Id: Ie97b656b6d8e05c0403311ae08fef9f51756a639
Reviewed-on: https://go-review.googlesource.com/c/go/+/594596
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-14 19:58:47 +00:00
Ian Lance Taylor 48849e0866 runtime: don't frob isSending for tickers
The Ticker Stop and Reset methods don't report a value,
so we don't need to track whether they are interrupting a send.
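
The API difference is visible from user code: Timer.Stop returns a bool, while Ticker.Stop returns nothing. A minimal sketch, with stopBoth as a hypothetical helper:

```go
package main

import (
	"fmt"
	"time"
)

// stopBoth stops a pending Timer and a Ticker. Only Timer.Stop has a
// result to report, so only timers need precise bookkeeping about
// in-flight sends.
func stopBoth() bool {
	timer := time.NewTimer(time.Hour)
	ticker := time.NewTicker(time.Hour)
	ticker.Stop()       // no return value
	return timer.Stop() // true: the timer was stopped before it fired
}

func main() {
	fmt.Println("timer stopped before firing:", stopBoth())
}
```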

This includes a test that used to fail about 2% of the time on
my laptop when run under x/tools/cmd/stress.

Change-Id: Ic6d14b344594149dd3c24b37bbe4e42e83f9a9ad
Reviewed-on: https://go-review.googlesource.com/c/go/+/620136
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-10-14 19:04:43 +00:00
qmuntal fa7343aca3 runtime: reduce syscall.SyscallX stack usage
The syscall.SyscallX functions consume a lot of stack space, which is a
problem because they are nosplit functions. They used to use less stack
space, but CL 563315, which landed in Go 1.23, increased the stack usage
considerably.

This CL reduces the stack usage back to the previous level.

Fixes #69813.

Change-Id: Iddedd28b693c66a258da687389768055c493fc2e
Reviewed-on: https://go-review.googlesource.com/c/go/+/618497
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-11 17:16:18 +00:00
Michael Pratt 0733682e5f internal/runtime/maps: initial swiss table map implementation
Add a new package that will contain a new "Swiss Table"
(https://abseil.io/about/design/swisstables) map implementation, which
is intended to eventually replace the existing runtime map
implementation.

This implementation is based on the fabulous
github.com/cockroachdb/swiss package contributed by Peter Mattis.

This CL adds a hash map implementation. It supports all the core
operations, but does not have incremental growth.

For #54766.

Change-Id: I52cf371448c3817d471ddb1f5a78f3513565db41
Reviewed-on: https://go-review.googlesource.com/c/go/+/582415
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-10-08 16:43:52 +00:00
Filippo Valsorda 9a44b8e15a runtime: overwrite startupRand instead of clearing it
AT_RANDOM is unfortunately used by libc before we run (so make sure it's
not cleared) but also is available to cgo programs after we did. It
would be unfortunate if a cgo program assumed it could use AT_RANDOM but
instead found all zeroes there.

Change-Id: I82eff34d8cf5a499b439052b7827b8ef7cabc21d
Reviewed-on: https://go-review.googlesource.com/c/go/+/608437
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
2024-10-07 15:35:02 +00:00
Filippo Valsorda 311372c53c runtime: use arc4random_buf() for readRandom
readRandom doesn't matter on Linux because of startupRand, but it does
on Windows and macOS. Windows already uses the same API as crypto/rand.
Switch macOS away from the /dev/urandom read.

Updates #68278

Cq-Include-Trybots: luci.golang.try:gotip-darwin-amd64_14
Change-Id: Ie8f105e35658a6f10ff68798d14883e3b212eb3e
Reviewed-on: https://go-review.googlesource.com/c/go/+/608436
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-07 15:34:53 +00:00
Filippo Valsorda 63cd5a39e9 crypto/rand: add randcrash=0 GODEBUG
For #66821

Change-Id: I525c308d6d6243a2bc805e819dcf40b67e52ade5
Reviewed-on: https://go-review.googlesource.com/c/go/+/608435
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
2024-10-07 15:34:42 +00:00
Filippo Valsorda c050d42e1a crypto/rand: crash program if Read would return an error
Fixes #66821
Fixes #54980
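
A sketch of what this means for callers, assuming a toolchain that
includes this change: the error result of Read no longer needs a handling
path, because a failure crashes the program instead of being returned. The
helper name newToken is hypothetical. On toolchains without this change
the error should still be checked.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newToken returns n random bytes, hex encoded.
func newToken(n int) string {
	buf := make([]byte, n)
	// Assumption: Read either fills buf completely or crashes the program,
	// so the returned error is not inspected here.
	rand.Read(buf)
	return hex.EncodeToString(buf)
}

func main() {
	fmt.Println(newToken(16))
}
```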

Change-Id: Ib081f4e4f75c7936fc3f5b31d3bd07cca1c2a55c
Reviewed-on: https://go-review.googlesource.com/c/go/+/602497
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
2024-10-07 15:33:28 +00:00
Tobias Klauser d39bfafee7 runtime: use stringslite.CutPrefix in isExportedRuntime
Change-Id: I7cbbe3b9a9f08ac98e3e76be7bda2f7df9c61fb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/617915
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-04 19:47:10 +00:00
Vasily Leonenko b4712ab055 runtime: memclrNoHeapPointers optimization for block alignment
goos: linux
goarch: arm64
pkg: runtime
               │  base.log   │               opt.log               │
               │   sec/op    │   sec/op     vs base                │
Memclr/5-4       3.378n ± 2%   3.376n ± 2%        ~ (p=0.128 n=10)
Memclr/16-4      2.749n ± 1%   2.776n ± 2%   +1.00% (p=0.001 n=10)
Memclr/64-4      4.588n ± 2%   4.184n ± 2%   -8.78% (p=0.000 n=10)
Memclr/256-4     8.758n ± 0%   7.103n ± 0%  -18.90% (p=0.000 n=10)
Memclr/4096-4    58.80n ± 0%   57.43n ± 0%   -2.33% (p=0.000 n=10)
Memclr/65536-4   868.7n ± 1%   861.7n ± 1%   -0.80% (p=0.004 n=10)
Memclr/1M-4      23.08µ ± 6%   23.55µ ± 6%        ~ (p=0.739 n=10)
Memclr/4M-4      219.6µ ± 3%   216.1µ ± 2%        ~ (p=0.123 n=10)
Memclr/8M-4      586.1µ ± 1%   586.4µ ± 2%        ~ (p=0.853 n=10)
Memclr/16M-4     1.312m ± 0%   1.311m ± 1%        ~ (p=0.481 n=10)
Memclr/64M-4     5.332m ± 1%   5.681m ± 0%   +6.55% (p=0.000 n=10)
geomean          1.723µ        1.683µ        -2.31%

Change-Id: Icad625065fb1f30b2a4094f3f1e58b4e9b3d841e
Reviewed-on: https://go-review.googlesource.com/c/go/+/616137
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-10-03 21:06:16 +00:00
Nick Ripley 1edb49a6eb Revert "runtime/pprof: make TestBlockMutexProfileInlineExpansion stricter"
This reverts commit 5b0f8596b7.

Reason for revert: This CL breaks gotip-linux-amd64-noopt builder.

Change-Id: I3950211f05c90e4955c0785409b796987741a9f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/617715
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
2024-10-03 16:44:53 +00:00
Ian Lance Taylor ce60f70374 runtime: clear isSending bit earlier
I've done some more testing of the new isSending field.
I'm not able to get more than 2 bits set. That said,
with this change it's significantly less likely to have even
2 bits set. The idea here is to clear the bit before possibly
locking the channel we are sending the value on, thus avoiding
some delay and some serialization.

For #69312

Change-Id: I8b5f167f162bbcbcbf7ea47305967f349b62b0f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/617497
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2024-10-02 22:11:29 +00:00
Nick Ripley 5b0f8596b7 runtime/pprof: make TestBlockMutexProfileInlineExpansion stricter
While working on CL 611241 and CL 616375, I introduced a bug that wasn't
caught by any test. CL 611241 added more inline expansion at sample time
for block/mutex profile stacks collected via frame pointer unwinding.
CL 616375 then changed how inline expansion for those stacks is done at
reporting time. So some frames passed through multiple rounds of inline
expansion, and this led to duplicate stack frames in some cases. The
stacks from TestBlockMutexProfileInlineExpansion looked like

	sync.(*Mutex).Unlock
	runtime/pprof.inlineF
	runtime/pprof.inlineE
	runtime/pprof.inlineD
	runtime/pprof.inlineD
	runtime.goexit

after those two CLs, and in particular after CL 616375. Note the extra
inlineD frame. The test didn't catch that since it was only looking for
a few frames in the stacks rather than checking the entire stacks.

This CL makes that test stricter by checking the entire expected stacks
rather than just a portion of the stacks.

Change-Id: I0acc739d826586e9a63a081bb98ef512d72cdc9a
Reviewed-on: https://go-review.googlesource.com/c/go/+/617235
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-10-02 19:57:33 +00:00
Jason A. Donenfeld 8c269479ed runtime: don't acquirem() in vgetrandom unless necessary
I noticed in pprof that acquirem() was a bit of a hotspot. It turns out
that we can use the same trick that runtime.rand() does, and only
acquirem if we're doing something non-nosplit -- in this case, getting a
new state -- but otherwise just do getg().m, which is safe because we're
inside runtime and don't call split functions.

cpu: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
                     │   sec/op    │   sec/op     vs base               │
ParallelGetRandom-16   2.651n ± 4%   2.416n ± 7%  -8.87% (p=0.001 n=10)
                     │     B/s      │     B/s       vs base               │
ParallelGetRandom-16   1.406Gi ± 4%   1.542Gi ± 6%  +9.72% (p=0.001 n=10)

Change-Id: Iae075f4e298b923e499cd01adfabacab725a8684
Reviewed-on: https://go-review.googlesource.com/c/go/+/616738
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-02 17:00:39 +00:00
Michael Pratt 89228ca439 runtime/pprof: add context to short stack panic
Over the years we've had various bugs in pprof stack handling resulting
in appendLocsForStack crashing because stk is too short for a cached
location, i.e., the cached location claims several inlined frames. Those
should always appear together in stk. If some frames are missing from
stk, appendLocsForStack panics with a slice out of bounds.

If we find this case, replace the slice out of bounds panic with an
explicit panic that contains more context.
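
The general pattern can be sketched as follows. This is a minimal
illustration, not the code in this CL: takeFrames and panicMessage are
hypothetical helpers, and the message format is an assumption.

```go
package main

import "fmt"

// takeFrames returns the first n entries of stk. Instead of letting the
// slice expression fail, it panics with the sizes and contents involved.
func takeFrames(stk []uintptr, n int) []uintptr {
	if n > len(stk) {
		panic(fmt.Sprintf("stack too short: need %d frames, have %d: %#x", n, len(stk), stk))
	}
	return stk[:n]
}

// panicMessage runs f and returns the panic value as text, or "" if f
// returned normally.
func panicMessage(f func()) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	f()
	return ""
}

func main() {
	msg := panicMessage(func() { takeFrames([]uintptr{0x10, 0x20}, 3) })
	fmt.Println(msg)
}
```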

Change-Id: I52725a689baf42b8db627ce3e1bc6c654ef245d4
Reviewed-on: https://go-review.googlesource.com/c/go/+/617135
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-01 17:34:53 +00:00
Mateusz Poliwczak ba10a38ed0 runtime, internal/syscall/unix: mark getrandom vDSO as non-escaping
Updates #66779
Updates #69577

Change-Id: I0dea5a30aab87aaa443e7e6646c1d07aa865ac1c
GitHub-Last-Rev: 1cea46deb3
GitHub-Pull-Request: golang/go#69719
Reviewed-on: https://go-review.googlesource.com/c/go/+/616696
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2024-09-30 18:25:48 +00:00