Commit Graph

151 Commits

Author SHA1 Message Date
Jeff Wentworth fdd6dfd507 [release-branch.go1.17] sync/atomic: fix documentation for CompareAndSwap
The documentation for CompareAndSwap atomic/value incorrectly labelled
the function as CompareAndSwapPointer. This PR fixes that.

Updates #47699.
Fixes #47703.

Change-Id: I6db08fdfe166570b775248fd24550f5d28e3434e
GitHub-Last-Rev: 41f7870792
GitHub-Pull-Request: golang/go#47700
Reviewed-on: https://go-review.googlesource.com/c/go/+/342210
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Trust: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Go Bot <gobot@golang.org>
(cherry picked from commit 0a0a160d4d)
Reviewed-on: https://go-review.googlesource.com/c/go/+/342329
Trust: Dmitri Shuralyov <dmitshur@golang.org>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-08-15 22:50:19 +00:00
Tobias Klauser 2c76a6f7f8 all: add //go:build lines to assembly files
Don't add them to files in vendor and cmd/vendor though. These will be
pulled in by updating the respective dependencies.

For #41184

Change-Id: Icc57458c9b3033c347124323f33084c85b224c70
Reviewed-on: https://go-review.googlesource.com/c/go/+/319389
Trust: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2021-05-13 09:12:17 +00:00
Colin Arnott 2422c5eae5 sync/atomic: add (*Value).Swap and (*Value).CompareAndSwap
The functions SwapPointer and CompareAndSwapPointer can be used to
interact with unsafe.Pointer, however generally it is prefered to work
with Value, due to its safer interface. As such, they have been added
along with glue logic to maintain invariants Value guarantees.

To meet these guarantees, the current implementation duplicates much of
the Store function. Some of this is due to inexperience with concurrency
and desire for correctness, but the lack of generic programming
functionality does not help.

Fixes #39351

Change-Id: I1aa394b1e70944736ac1e19de49fe861e1e46fba
Reviewed-on: https://go-review.googlesource.com/c/go/+/241678
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-05-04 00:15:27 +00:00
Bryan C. Mills 0d1280c685 Revert "sync: improve sync.Pool object stealing"
This reverts CL 303949.

Reason for revert: broke linux-arm-aws TryBots.

Change-Id: Ib44949df70520cdabff857846be0d2221403d2f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/313630
Trust: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2021-04-26 18:54:39 +00:00
Ruslan Andreev d5d24dbe41 sync: improve sync.Pool object stealing
This CL provide abilty to randomly select P to steal object from its
shared queue. In order to provide such ability randomOrder structure
was copied from runtime/proc.go.
It should reduce contention in firsts Ps and improve balance of object
stealing across all Ps. Also, the patch provides new benchmark
PoolStarvation which force Ps to steal objects.
Benchmarks:
name                old time/op     new time/op     delta
Pool-8                 2.16ns ±14%     2.14ns ±16%    ~     (p=0.425 n=10+10)
PoolOverflow-8          489ns ± 0%      489ns ± 0%    ~     (p=0.719 n=9+10)
PoolStarvation-8       7.00µs ± 4%     6.59µs ± 2%  -5.86%  (p=0.000 n=10+10)
PoolSTW-8              15.1µs ± 1%     15.2µs ± 1%  +0.99%  (p=0.001 n=10+10)
PoolExpensiveNew-8     1.25ms ±10%     1.31ms ± 9%    ~     (p=0.143 n=10+10)
[Geo mean]             2.68µs          2.68µs       -0.28%

name                old p50-ns/STW  new p50-ns/STW  delta
PoolSTW-8               15.0k ± 1%      15.1k ± 1%  +0.92%  (p=0.000 n=10+10)

name                old p95-ns/STW  new p95-ns/STW  delta
PoolSTW-8               16.2k ± 3%      16.4k ± 2%    ~     (p=0.143 n=10+10)

name                old GCs/op      new GCs/op      delta
PoolExpensiveNew-8       0.29 ± 2%       0.30 ± 1%  +2.84%  (p=0.000 n=8+10)

name                old New/op      new New/op      delta
PoolExpensiveNew-8       8.07 ±11%       8.49 ±10%    ~     (p=0.123 n=10+10)

Change-Id: I3ca1d0bf1f358b1148c58e64740fb2d5bfc0bc02
Reviewed-on: https://go-review.googlesource.com/c/go/+/303949
Reviewed-by: David Chase <drchase@google.com>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
2021-04-26 17:13:36 +00:00
panchenglong01 1749f3915e sync: update misleading comment in map.go about entry type
As discussed in: https://github.com/golang/go/issues/45429,  about entry
type comments, it is possible for p == nil when m.dirty != nil, so
update the commemt about it.

Fixes #45429

Change-Id: I7ef96ee5b6948df9ac736481d177a59ab66d7d4d
GitHub-Last-Rev: 202c598a0a
GitHub-Pull-Request: golang/go#45443
Reviewed-on: https://go-review.googlesource.com/c/go/+/308292
Reviewed-by: Changkun Ou <euryugasaki@gmail.com>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Trust: Robert Findley <rfindley@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2021-04-08 14:08:29 +00:00
Russ Cox d4b2638234 all: go fmt std cmd (but revert vendor)
Make all our package sources use Go 1.17 gofmt format
(adding //go:build lines).

Part of //go:build change (#41184).
See https://golang.org/design/draft-gobuild

Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/294430
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-02-20 03:54:50 +00:00
Martin Möhrmann d902791b50 sync: use 386 instead of x86-32 to refer to the 32 bit x86 architecture
This aligns the naming with GOARCH using 386 as a build target for
this architecture and makes it more easily found when searching
for documentation related to the build target.

Change-Id: I393bb89dd2f71e568124107b13e1b288fbd0c76a
Reviewed-on: https://go-review.googlesource.com/c/go/+/271988
Trust: Martin Möhrmann <moehrmann@google.com>
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-11-23 05:57:35 +00:00
Michael Pratt b4f3d52f6a sync: document RWMutex race semantics
RWMutex provides explicit acquire/release synchronization events to the
race detector to model the mutex. It disables sync events within the
methods to avoid e.g., the atomics from adding false synchronization
events, which could cause false negatives in the race detector.

Change-Id: I5126ce2efaab151811ac264864aab1fa025a4aaf
Reviewed-on: https://go-review.googlesource.com/c/go/+/270865
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>
2020-11-18 20:12:03 +00:00
Austin Clements c305e49e96 cmd/go,cmd/compile,sync: remove special import case in cmd/go
CL 253748 introduced a special case in cmd/go to allow sync to import
runtime/internal/atomic. Besides introducing unnecessary complexity
into cmd/go, this breaks other packages (like gopls) that understand
how imports work, but don't understand this special case.

Fix this by using the more standard linkname-based approach to pull
the necessary functions from runtime/internal/atomic into sync. Since
these are compiler intrinsics, we also have to tell the compiler that
the linknamed symbols are intrinsics to get this optimization in sync.

Fixes #42196.

Change-Id: I1f91498c255c91583950886a89c3c9adc39a32f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/265124
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Paul Murphy <murp@ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-10-26 20:12:53 +00:00
Dmitri Shuralyov f8376a55b0 sync: document that Once must not be copied
Fixes #42160.

Change-Id: I9bf8b6f0bf1eccd3ab32cbd94c812f768746d291
Reviewed-on: https://go-review.googlesource.com/c/go/+/264557
Trust: Dmitri Shuralyov <dmitshur@golang.org>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2020-10-23 19:57:55 +00:00
Paul E. Murphy 15ead857db cmd/compiler,cmd/go,sync: add internal {LoadAcq,StoreRel}64 on ppc64
Add an internal atomic intrinsic for load with acquire semantics
(extending LoadAcq to 64b) and add LoadAcquintptr for internal
use within the sync package.  For other arches, this remaps to the
appropriate atomic.Load{,64} intrinsic which should not alter code
generation.

Similarly, add StoreRel{uintptr,64} for consistency, and inline.

Finally, add an exception to allow sync to directly use the
runtime/internal/atomic package which avoids more convoluted
workarounds (contributed by Lynn Boger).

In an extreme example, sync.(*Pool).pin consumes 20% of wall time
during fmt tests.  This is reduced to 5% on ppc64le/power9.

From the fmt benchmarks on ppc64le:

name                           old time/op  new time/op  delta
SprintfPadding                  468ns ± 0%   451ns ± 0%   -3.63%
SprintfEmpty                   73.3ns ± 0%  51.9ns ± 0%  -29.20%
SprintfString                   135ns ± 0%   122ns ± 0%   -9.63%
SprintfTruncateString           232ns ± 0%   214ns ± 0%   -7.76%
SprintfTruncateBytes            216ns ± 0%   202ns ± 0%   -6.48%
SprintfSlowParsingPath          162ns ± 0%   142ns ± 0%  -12.35%
SprintfQuoteString             1.00µs ± 0%  0.99µs ± 0%   -1.39%
SprintfInt                      117ns ± 0%   104ns ± 0%  -11.11%
SprintfIntInt                   190ns ± 0%   175ns ± 0%   -7.89%
SprintfPrefixedInt              232ns ± 0%   212ns ± 0%   -8.62%
SprintfFloat                    270ns ± 0%   255ns ± 0%   -5.56%
SprintfComplex                 1.01µs ± 0%  0.99µs ± 0%   -1.68%
SprintfBoolean                  127ns ± 0%   111ns ± 0%  -12.60%
SprintfHexString                220ns ± 0%   198ns ± 0%  -10.00%
SprintfHexBytes                 261ns ± 0%   252ns ± 0%   -3.45%
SprintfBytes                    600ns ± 0%   590ns ± 0%   -1.67%
SprintfStringer                 684ns ± 0%   658ns ± 0%   -3.80%
SprintfStructure               2.57µs ± 0%  2.57µs ± 0%   -0.12%
ManyArgs                        669ns ± 0%   646ns ± 0%   -3.44%
FprintInt                       140ns ± 0%   136ns ± 0%   -2.86%
FprintfBytes                    184ns ± 0%   181ns ± 0%   -1.63%
FprintIntNoAlloc                140ns ± 0%   136ns ± 0%   -2.86%
ScanInts                        929µs ± 0%   921µs ± 0%   -0.79%
ScanRecursiveInt                122ms ± 0%   121ms ± 0%   -0.11%
ScanRecursiveIntReaderWrapper   122ms ± 0%   122ms ± 0%   -0.18%

Change-Id: I4d66780261b57b06ef600229e475462e7313f0d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/253748
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-10-21 14:34:44 +00:00
Austin Clements 83317d9e3c runtime/internal/atomic: panic nicely on unaligned 64-bit atomics
On 386 and arm, unaligned 64-bit atomics aren't safe, so we check for
this and panic. Currently, we panic by dereferencing nil, which may be
expedient but is pretty user-hostile since it gives no hint of what
the actual problem was.

This CL replaces this with an actual panic. The only subtlety here is
now the atomic assembly implementations are calling back into Go, so
they have to play nicely with stack maps and stack scanning. On 386,
this just requires declaring NO_LOCAL_POINTERS. On arm, this is
somewhat more complicated: first, we have to move the alignment check
into the functions that have Go signatures. Then we have to support
both the tail call from these functions to the underlying
implementation (which requires that they have no frame) and the call
into Go to panic (which requires that they have a frame). We resolve
this by forcing them to have no frame and setting up the frame
manually just before the panic call.

Change-Id: I19f1e860045df64088013db37a18acea47342c69
Reviewed-on: https://go-review.googlesource.com/c/go/+/262778
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2020-10-16 17:31:17 +00:00
ZhangYunHao a3bc52b786 sync: fix typo in pooldequeue
.

Change-Id: I26fa26d67d01bcd583a1efaaf9a38398cbf793f7
GitHub-Last-Rev: ded020d02c
GitHub-Pull-Request: golang/go#41932
Reviewed-on: https://go-review.googlesource.com/c/go/+/261477
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Austin Clements <austin@google.com>
2020-10-14 13:38:41 +00:00
Changkun Ou 94953d3e59 sync: delete dirty keys inside Map.LoadAndDelete
Fixes #40999

Change-Id: Ie32427e5cb5ed512b976b554850f50be156ce9f2
Reviewed-on: https://go-review.googlesource.com/c/go/+/250197
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2020-08-24 20:05:41 +00:00
Tobias Klauser dc12d5b0f5 all: add empty line between copyright header and package clause
Makes sure the copyright notice is not interpreted as the package level
godoc.

Change-Id: I2afce7c9d620f19d51ec1438b1d0db1774b57146
Reviewed-on: https://go-review.googlesource.com/c/go/+/248760
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
2020-08-17 09:45:44 +00:00
Gaurav Singh 5a18e0b58c sync: fix goroutine leak for when TestMutexFairness times out
If the timeout triggers before writing to the done channel, the
goroutine will be blocked waiting for a corresponding read that’s
no longer existent, thus a goroutine leak. This change fixes that by
using a buffered channel instead.

Change-Id: I9cf4067a58bc5a729ab31e4426edd78bd359e8e0
GitHub-Last-Rev: a7d811a7be
GitHub-Pull-Request: golang/go#40236
Reviewed-on: https://go-review.googlesource.com/c/go/+/242902
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-08-14 23:04:16 +00:00
Ian Lance Taylor 9591515f51 runtime, sync: add copyright headers to new files
For #38029

Change-Id: I71de2b66c1de617d32c46d4f2c1866f9ff1756ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/244631
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-07-25 03:26:17 +00:00
BurtonQin 4874835232 net/textproto, sync: unlock mutexes appropriately before panics
Ensure mutexes are unlocked right before panics, where defers aren’t easily usable.

Change-Id: I67c9870e7a626f590a8de8df6c8341c5483918dc
GitHub-Last-Rev: bb8ffe538b
GitHub-Pull-Request: golang/go#37143
Reviewed-on: https://go-review.googlesource.com/c/go/+/218717
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-08 16:19:51 +00:00
Dan Scales 0a820007e7 runtime: static lock ranking for the runtime (enabled by GOEXPERIMENT)
I took some of the infrastructure from Austin's lock logging CR
https://go-review.googlesource.com/c/go/+/192704 (with deadlock
detection from the logs), and developed a setup to give static lock
ranking for runtime locks.

Static lock ranking establishes a documented total ordering among locks,
and then reports an error if the total order is violated. This can
happen if a deadlock happens (by acquiring a sequence of locks in
different orders), or if just one side of a possible deadlock happens.
Lock ordering deadlocks cannot happen as long as the lock ordering is
followed.

Along the way, I found a deadlock involving the new timer code, which Ian fixed
via https://go-review.googlesource.com/c/go/+/207348, as well as two other
potential deadlocks.

See the constants at the top of runtime/lockrank.go to show the static
lock ranking that I ended up with, along with some comments. This is
great documentation of the current intended lock ordering when acquiring
multiple locks in the runtime.

I also added an array lockPartialOrder[] which shows and enforces the
current partial ordering among locks (which is embedded within the total
ordering). This is more specific about the dependencies among locks.

I don't try to check the ranking within a lock class with multiple locks
that can be acquired at the same time (i.e. check the ranking when
multiple hchan locks are acquired).

Currently, I am doing a lockInit() call to set the lock rank of most
locks. Any lock that is not otherwise initialized is assumed to be a
leaf lock (a very high rank lock), so that eliminates the need to do
anything for a bunch of locks (including all architecture-dependent
locks). For two locks, root.lock and notifyList.lock (only in the
runtime/sema.go file), it is not as easy to do lock initialization, so
instead, I am passing the lock rank with the lock calls.

For Windows compilation, I needed to increase the StackGuard size from
896 to 928 because of the new lock-rank checking functions.

Checking of the static lock ranking is enabled by setting
GOEXPERIMENT=staticlockranking before doing a run.

To make sure that the static lock ranking code has no overhead in memory
or CPU when not enabled by GOEXPERIMENT, I changed 'go build/install' so
that it defines a build tag (with the same name) whenever any experiment
has been baked into the toolchain (by checking Expstring()). This allows
me to avoid increasing the size of the 'mutex' type when static lock
ranking is not enabled.

Fixes #38029

Change-Id: I154217ff307c47051f8dae9c2a03b53081acd83a
Reviewed-on: https://go-review.googlesource.com/c/go/+/207619
Reviewed-by: Dan Scales <danscales@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-07 21:51:03 +00:00
Keith Randall 95773ab9b0 sync/atomic: fix TestSwapPointer test
It plays way too loose with unsafe.Pointer rules.
It runs afoul of the checkptr rules, so some race detector builds
were failing.

Fixes #38210

Change-Id: I5e1c78201d06295524fdedb3fe5b49d61446f443
Reviewed-on: https://go-review.googlesource.com/c/go/+/226880
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2020-04-02 03:47:13 +00:00
Daniel Martí a4c48d61f5 sync/atomic: remove panic64
The func has been unused since https://golang.org/cl/93637 in 2018.

Change-Id: I1cab6f265aa5058ac080fd7c7cbf0fe85370f073
Reviewed-on: https://go-review.googlesource.com/c/go/+/224077
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matt Layher <mdlayher@gmail.com>
Reviewed-by: Austin Clements <austin@google.com>
2020-03-19 19:46:54 +00:00
Ziheng Liu 42f8199290 all: fix incorrect channel and API usage in some unit tests
This CL changes some unit test functions, making sure that these tests (and goroutines spawned during test) won't block.
Since they are just test functions, I use one CL to fix them all. I hope this won't cause trouble to reviewers and can save time for us.
There are three main categories of incorrect logic fixed by this CL:
1. Use testing.Fatal()/Fatalf() in spawned goroutines, which is forbidden by Go's document.
2. Channels are used in such a way that, when errors or timeout happen, the test will be blocked and never return.
3. Channels are used in such a way that, when errors or timeout happen, the test can return but some spawned goroutines will be leaked, occupying resource until all other tests return and the process is killed.

Change-Id: I3df931ec380794a0cf1404e632c1dd57c65d63e8
Reviewed-on: https://go-review.googlesource.com/c/go/+/219380
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-02-27 19:04:17 +00:00
Changkun Ou 2e8dbae85c sync: add new Map method LoadAndDelete
This CL implements a LoadAndDelete method in sync.Map. Benchmark:

name                                              time/op
LoadAndDeleteBalanced/*sync_test.RWMutexMap-12    98.8ns ± 1%
LoadAndDeleteBalanced/*sync.Map-12                10.3ns ±11%
LoadAndDeleteUnique/*sync_test.RWMutexMap-12      99.2ns ± 2%
LoadAndDeleteUnique/*sync.Map-12                  6.63ns ±10%
LoadAndDeleteCollision/*sync_test.DeepCopyMap-12   140ns ± 0%
LoadAndDeleteCollision/*sync_test.RWMutexMap-12   75.2ns ± 2%
LoadAndDeleteCollision/*sync.Map-12               5.21ns ± 5%

In addition, Delete is bounded and more efficient if many collisions:

DeleteCollision/*sync_test.DeepCopyMap-12   120ns ± 2%   125ns ± 1%   +3.80%  (p=0.000 n=10+9)
DeleteCollision/*sync_test.RWMutexMap-12   73.5ns ± 3%  79.5ns ± 1%   +8.03%  (p=0.000 n=10+9)
DeleteCollision/*sync.Map-12               97.8ns ± 3%   5.9ns ± 4%  -94.00%  (p=0.000 n=10+10)

Fixes #33762

Change-Id: Ic8469a7861d27ab0edeface0078aad8af9b26c2f
Reviewed-on: https://go-review.googlesource.com/c/go/+/205899
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-02-25 14:31:55 +00:00
Ian Lance Taylor 0915a19a11 sync: deflake TestWaitGroupMisuse3
If one of the helper goroutine panics, the main goroutine call to Wait
may hang forever waiting for something to call Done. Put that call in
a goroutine like the others.

Fixes #35774

Change-Id: I8d2b58d8f473644a49a95338f70111d4e6ed4e12
Reviewed-on: https://go-review.googlesource.com/c/go/+/210218
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2019-12-06 05:12:15 +00:00
Rhys Hiltner 7148478f1b sync: yield to the waiter when unlocking a starving mutex
When we have already assigned the semaphore ticket to a specific
waiter, we want to get the waiter running as fast as possible since
no other G waiting on the semaphore can acquire it optimistically.

The net effect is that, when a sync.Mutex is contended, the code in
the critical section guarded by the Mutex gets a priority boost.

Fixes #33747

The original work was done in CL 200577 by Carlo Alberto Ferraris. The
change was reverted in CL 205817 because it broke the linux-arm64-packet
and solaris-amd64-oraclerel builders.

Change-Id: I76d79b1d63fd206ed1c57fe6900cb7ae9e4d46cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/206180
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-11-09 19:31:32 +00:00
Bryan C. Mills 73d57bf80f Revert "sync: yield to the waiter when unlocking a starving mutex"
This reverts CL 200577.

Reason for revert: broke linux-arm64-packet and solaris-amd64-oraclerel builders

Fixes #35424
Updates #33747

Change-Id: I2575fd84d37995d458183caae54704f15d8b8426
Reviewed-on: https://go-review.googlesource.com/c/go/+/205817
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-11-07 15:04:03 +00:00
Carlo Alberto Ferraris a8f57f4ada sync: yield to the waiter when unlocking a starving mutex
When we have already assigned the semaphore ticket to a specific
waiter, we want to get the waiter running as fast as possible since
no other G waiting on the semaphore can acquire it optimistically.

The net effect is that, when a sync.Mutex is contented, the code in
the critical section guarded by the Mutex gets a priority boost.

Fixes #33747

Change-Id: I9967f0f763c25504010651bdd7f944ee0189cd45
Reviewed-on: https://go-review.googlesource.com/c/go/+/200577
Reviewed-by: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-11-07 05:59:33 +00:00
Matthew Dempsky a97ccc8940 sync/atomic: suppress checkptr errors for hammerStoreLoadPointer
This test could be updated to use unsafe.Pointer arithmetic properly
(e.g., see discussion at #34972), but it doesn't seem worthwhile. The
test is just checking that LoadPointer and StorePointer are atomic.

Updates #34972.

Change-Id: I85a8d610c1766cd63136cae686aa8a240a362a18
Reviewed-on: https://go-review.googlesource.com/c/go/+/202597
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-10-22 18:09:03 +00:00
Brad Fitzpatrick 03ef105dae all: remove nacl (part 3, more amd64p32)
Part 1: CL 199499 (GOOS nacl)
Part 2: CL 200077 (amd64p32 files, toolchain)
Part 3: stuff that arguably should've been part of Part 2, but I forgot
        one of my grep patterns when splitting the original CL up into
        two parts.

This one might also have interesting stuff to resurrect for any future
x32 ABI support.

Updates #30439

Change-Id: I2b4143374a253a003666f3c69e776b7e456bdb9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/200318
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-10-10 22:38:38 +00:00
Russ Cox bc593eac2d sync: document implementation of Once.Do
It's not correct to use atomic.CompareAndSwap to implement Once.Do,
and we don't, but why we don't is a question that has come up
twice on golang-dev in the past few months.
Add a comment to help others with the same question.

Change-Id: Ia89ec9715cc5442c6e7f13e57a49c6cfe664d32c
Reviewed-on: https://go-review.googlesource.com/c/go/+/184261
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Ingo Oeser <nightlyone@googlemail.com>
2019-07-01 14:45:49 +00:00
Austin Clements 9caaac2c92 sync: only check for successful PopHeads in long mode
In TestPoolDequeue it's surprisingly common for the queue to stay
nearly empty the whole time and for a racing PopTail to happen in the
window between the producer doing a PushHead and doing a PopHead. In
short mode, there are only 100 PopTail attempts. On linux/amd64, it's
not uncommon for this to fail 50% of the time. On linux/arm64, it's
not uncommon for this to fail 100% of the time, causing the test to
fail.

This CL fixes this by only checking for a successful PopTail in long
mode. Long mode makes 200,000 PopTail attempts, and has never been
observed to fail.

Fixes #31422.

Change-Id: If464d55eb94fcb0b8d78fbc441d35be9f056a290
Reviewed-on: https://go-review.googlesource.com/c/go/+/183981
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2019-06-26 19:48:42 +00:00
Austin Clements c19b3a60da sync: make TestPoolDequeue termination condition more robust
TestPoolDequeue creates P-1 consumer goroutines and 1 producer
goroutine. Currently, if a consumer goroutine pops the last value from
the dequeue, it sets a flag that stops all consumers, but the producer
also periodically pops from the dequeue and doesn't set this flag.

Hence, if the producer were to pop the last element, the consumers
will continue to run and the test won't terminate. This CL fixes this
by also setting the termination flag in the producer.

I believe it's impossible for this to happen right now because the
producer only pops after pushing an element for which j%10==0 and the
last element is either 999 or 1999999, which means it should never try
to pop after pushing the last element. However, we shouldn't depend on
this reasoning.

Change-Id: Icd2bc8d7cb9a79ebfcec99e367c8a2ba76e027d8
Reviewed-on: https://go-review.googlesource.com/c/go/+/183980
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2019-06-26 19:48:40 +00:00
Austin Clements 93c05627f9 sync: fix pool wrap-around test
TestPoolDequeue in long mode does a little more than 1<<21 pushes.
This was originally because the head and tail indexes were 21 bits and
the intent was to test wrap-around behavior. However, in the final
version they were both 32 bits, so the test no longer tested
wrap-around.

It would take too long to reach 32-bit wrap around in a test, so
instead we initialize the poolDequeue with indexes that are already
nearly at their limit. This keeps the knowledge of the maximum index
in one place, and lets us test wrap-around even in short mode.

Change-Id: Ib9b8d85b6d9b5be235849c2b32e81c809e806579
Reviewed-on: https://go-review.googlesource.com/c/go/+/183979
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2019-06-26 19:48:39 +00:00
Russ Cox 06b0babf31 all: shorten some tests
Shorten some of the longest tests that run during all.bash.
Removes 7r 50u 21s from all.bash.

After this change, all.bash is under 5 minutes again on my laptop.

For #26473.

Change-Id: Ie0460aa935808d65460408feaed210fbaa1d5d79
Reviewed-on: https://go-review.googlesource.com/c/go/+/177559
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-05-22 12:54:00 +00:00
Kai Dong 5ccaf2c6ad sync: update comment
Comment update.

Change-Id: If0d054216f9953f42df04647b85c38008b85b026
GitHub-Last-Rev: 133b4670be
GitHub-Pull-Request: golang/go#31539
Reviewed-on: https://go-review.googlesource.com/c/go/+/172700
Reviewed-by: Austin Clements <austin@google.com>
2019-04-19 16:15:36 +00:00
Austin Clements 2dcbf8b369 sync: smooth out Pool behavior over GC with a victim cache
Currently, every Pool is cleared completely at the start of each GC.
This is a problem for heavy users of Pool because it causes an
allocation spike immediately after Pools are clear, which impacts both
throughput and latency.

This CL fixes this by introducing a victim cache mechanism. Instead of
clearing Pools, the victim cache is dropped and the primary cache is
moved to the victim cache. As a result, in steady-state, there are
(roughly) no new allocations, but if Pool usage drops, objects will
still be collected within two GCs (as opposed to one).

This victim cache approach also improves Pool's impact on GC dynamics.
The current approach causes all objects in Pools to be short lived.
However, if an application is in steady state and is just going to
repopulate its Pools, then these objects impact the live heap size *as
if* they were long lived. Since Pooled objects count as short lived
when computing the GC trigger and goal, but act as long lived objects
in the live heap, this causes GC to trigger too frequently. If Pooled
objects are a non-trivial portion of an application's heap, this
increases the CPU overhead of GC. The victim cache lets Pooled objects
affect the GC trigger and goal as long-lived objects.

This has no impact on Get/Put performance, but substantially reduces
the impact to the Pool user when a GC happens. PoolExpensiveNew
demonstrates this in the substantially reduction in the rate at which
the "New" function is called.

name                 old time/op     new time/op     delta
Pool-12                 2.21ns ±36%     2.00ns ± 0%     ~     (p=0.070 n=19+16)
PoolOverflow-12          587ns ± 1%      583ns ± 1%   -0.77%  (p=0.000 n=18+18)
PoolSTW-12              5.57µs ± 3%     4.52µs ± 4%  -18.82%  (p=0.000 n=20+19)
PoolExpensiveNew-12     3.69ms ± 7%     1.25ms ± 5%  -66.25%  (p=0.000 n=20+19)

name                 old p50-ns/STW  new p50-ns/STW  delta
PoolSTW-12               5.48k ± 2%      4.53k ± 2%  -17.32%  (p=0.000 n=20+20)

name                 old p95-ns/STW  new p95-ns/STW  delta
PoolSTW-12               6.69k ± 4%      5.13k ± 3%  -23.31%  (p=0.000 n=19+18)

name                 old GCs/op      new GCs/op      delta
PoolExpensiveNew-12       0.39 ± 1%       0.32 ± 2%  -17.95%  (p=0.000 n=18+20)

name                 old New/op      new New/op      delta
PoolExpensiveNew-12       40.0 ± 6%       12.4 ± 6%  -68.91%  (p=0.000 n=20+19)

(https://perf.golang.org/search?q=upload:20190311.2)

Fixes #22950.

Change-Id: If2e183d948c650417283076aacc20739682cdd70
Reviewed-on: https://go-review.googlesource.com/c/go/+/166961
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-04-05 18:49:08 +00:00
Austin Clements d5fd2dd6a1 sync: use lock-free structure for Pool stealing
Currently, Pool stores each per-P shard's overflow in a slice
protected by a Mutex. In order to store to the overflow or steal from
another shard, a P must lock that shard's Mutex. This allows for
simple synchronization between Put and Get, but has unfortunate
consequences for clearing pools.

Pools are cleared during STW sweep termination, and hence rely on
pinning a goroutine to its P to synchronize between Get/Put and
clearing. This makes the Get/Put fast path extremely fast because it
can rely on quiescence-style coordination, which doesn't even require
atomic writes, much less locking.

The catch is that a goroutine cannot acquire a Mutex while pinned to
its P (as this could deadlock). Hence, it must drop the pin on the
slow path. But this means the slow path is not synchronized with
clearing. As a result,

1) It's difficult to reason about races between clearing and the slow
path. Furthermore, this reasoning often depends on unspecified nuances
of where preemption points can occur.

2) Clearing must zero out the pointer to every object in every Pool to
prevent a concurrent slow path from causing all objects to be
retained. Since this happens during STW, this has an O(# objects in
Pools) effect on STW time.

3) We can't implement a victim cache without making clearing even
slower.

This CL solves these problems by replacing the locked overflow slice
with a lock-free structure. This allows Gets and Puts to be pinned the
whole time they're manipulating the shards slice (Pool.local), which
eliminates the races between Get/Put and clearing. This, in turn,
eliminates the need to zero all object pointers, reducing clearing to
O(# of Pools) during STW.

In addition to significantly reducing STW impact, this also happens to
speed up the Get/Put fast-path and the slow path. It somewhat
increases the cost of PoolExpensiveNew, but we'll fix that in the next
CL.

name                 old time/op     new time/op     delta
Pool-12                 3.00ns ± 0%     2.21ns ±36%  -26.32%  (p=0.000 n=18+19)
PoolOverflow-12          600ns ± 1%      587ns ± 1%   -2.21%  (p=0.000 n=16+18)
PoolSTW-12              71.0µs ± 2%      5.6µs ± 3%  -92.15%  (p=0.000 n=20+20)
PoolExpensiveNew-12     3.14ms ± 5%     3.69ms ± 7%  +17.67%  (p=0.000 n=19+20)

name                 old p50-ns/STW  new p50-ns/STW  delta
PoolSTW-12               70.7k ± 1%       5.5k ± 2%  -92.25%  (p=0.000 n=20+20)

name                 old p95-ns/STW  new p95-ns/STW  delta
PoolSTW-12               73.1k ± 2%       6.7k ± 4%  -90.86%  (p=0.000 n=18+19)

name                 old GCs/op      new GCs/op      delta
PoolExpensiveNew-12       0.38 ± 1%       0.39 ± 1%   +2.07%  (p=0.000 n=20+18)

name                 old New/op      new New/op      delta
PoolExpensiveNew-12       33.9 ± 6%       40.0 ± 6%  +17.97%  (p=0.000 n=19+20)

(https://perf.golang.org/search?q=upload:20190311.1)

Fixes #22331.
For #22950.

Change-Id: Ic5cd826e25e218f3f8256dbc4d22835c1fecb391
Reviewed-on: https://go-review.googlesource.com/c/go/+/166960
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-04-05 18:49:07 +00:00
Austin Clements 59f2704dab sync: add Pool benchmarks to stress STW and reuse
This adds two benchmarks that will highlight two problems in Pool that
we're about to address.

The first benchmark measures the impact of large Pools on GC STW time.
Currently, STW time is O(# of items in Pools), and this benchmark
demonstrates 70µs STW times.

The second benchmark measures the impact of fully clearing all Pools
on each GC. Typically this is a problem in heavily-loaded systems
because it causes a spike in allocation. This benchmark stresses this
by simulating an expensive "New" function, so the cost of creating new
objects is reflected in the ns/op of the benchmark.

For #22950, #22331.

Change-Id: I0c8853190d23144026fa11837b6bf42adc461722
Reviewed-on: https://go-review.googlesource.com/c/go/+/166959
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-04-05 18:49:06 +00:00
Austin Clements 57bb7be4b1 sync: internal dynamically sized lock-free queue for sync.Pool
This adds a dynamically sized, lock-free, single-producer,
multi-consumer queue that will be used in the new Pool stealing
implementation. It's built on top of the fixed-size queue added in the
previous CL.

For #22950, #22331.

Change-Id: Ifc0ca3895bec7e7f9289ba9fb7dd0332bf96ba5a
Reviewed-on: https://go-review.googlesource.com/c/go/+/166958
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-04-05 18:49:04 +00:00
Austin Clements 2b60567002 sync: internal fixed size lock-free queue for sync.Pool
This is the first step toward fixing multiple issues with sync.Pool.
This adds a fixed size, lock-free, single-producer, multi-consumer
queue that will be used in the new Pool stealing implementation.

For #22950, #22331.

Change-Id: I50e85e3cb83a2ee71f611ada88e7f55996504bb5
Reviewed-on: https://go-review.googlesource.com/c/go/+/166957
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-04-05 18:49:03 +00:00
Carlo Alberto Ferraris 05051b56a0 sync: allow inlining the RWMutex.RUnlock fast path
RWMutex.RLock is already inlineable, so add a test for it as well.

name                    old time/op  new time/op  delta
RWMutexUncontended      66.5ns ± 0%  60.3ns ± 1%  -9.38%  (p=0.000 n=12+20)
RWMutexUncontended-4    16.7ns ± 0%  15.3ns ± 1%  -8.49%  (p=0.000 n=17+20)
RWMutexUncontended-16   7.86ns ± 0%  7.69ns ± 0%  -2.08%  (p=0.000 n=18+15)
RWMutexWrite100         25.1ns ± 0%  24.0ns ± 1%  -4.28%  (p=0.000 n=20+18)
RWMutexWrite100-4       46.7ns ± 5%  44.1ns ± 4%  -5.53%  (p=0.000 n=20+20)
RWMutexWrite100-16      68.3ns ±11%  65.7ns ± 8%  -3.81%  (p=0.003 n=20+20)
RWMutexWrite10          26.7ns ± 1%  25.7ns ± 0%  -3.75%  (p=0.000 n=17+14)
RWMutexWrite10-4        34.9ns ± 2%  33.8ns ± 2%  -3.15%  (p=0.000 n=20+20)
RWMutexWrite10-16       37.4ns ± 2%  36.1ns ± 2%  -3.51%  (p=0.000 n=18+20)
RWMutexWorkWrite100      163ns ± 0%   162ns ± 0%  -0.89%  (p=0.000 n=18+20)
RWMutexWorkWrite100-4    189ns ± 4%   184ns ± 4%  -2.89%  (p=0.000 n=19+20)
RWMutexWorkWrite100-16   207ns ± 4%   200ns ± 2%  -3.07%  (p=0.000 n=19+20)
RWMutexWorkWrite10       153ns ± 0%   151ns ± 1%  -0.75%  (p=0.000 n=20+20)
RWMutexWorkWrite10-4     177ns ± 1%   176ns ± 2%  -0.63%  (p=0.004 n=17+20)
RWMutexWorkWrite10-16    191ns ± 2%   189ns ± 1%  -0.83%  (p=0.015 n=20+17)

linux/amd64 bin/go 14688201 (previous commit 14675861, +12340/+0.08%)

The cumulative effect of this and the previous 3 commits is:

name                    old time/op  new time/op  delta
MutexUncontended        19.3ns ± 1%  16.4ns ± 1%  -15.13%  (p=0.000 n=20+20)
MutexUncontended-4      5.24ns ± 0%  4.09ns ± 0%  -21.95%  (p=0.000 n=20+18)
MutexUncontended-16     2.10ns ± 0%  2.12ns ± 0%   +0.95%  (p=0.000 n=15+17)
Mutex                   19.6ns ± 0%  16.3ns ± 1%  -17.12%  (p=0.000 n=20+20)
Mutex-4                 54.6ns ± 5%  45.6ns ±10%  -16.51%  (p=0.000 n=20+19)
Mutex-16                 133ns ± 5%   130ns ± 3%   -1.99%  (p=0.002 n=20+20)
MutexSlack              33.4ns ± 2%  16.2ns ± 0%  -51.44%  (p=0.000 n=19+20)
MutexSlack-4             206ns ± 5%   209ns ± 9%     ~     (p=0.154 n=20+20)
MutexSlack-16           89.4ns ± 1%  90.9ns ± 2%   +1.70%  (p=0.000 n=18+17)
MutexWork               60.5ns ± 0%  55.3ns ± 1%   -8.59%  (p=0.000 n=12+20)
MutexWork-4              105ns ± 5%    97ns ±11%   -7.95%  (p=0.000 n=20+20)
MutexWork-16             157ns ± 1%   158ns ± 1%   +0.66%  (p=0.001 n=18+17)
MutexWorkSlack          70.2ns ± 5%  55.3ns ± 0%  -21.30%  (p=0.000 n=19+18)
MutexWorkSlack-4         277ns ±13%   260ns ±15%   -6.35%  (p=0.002 n=20+18)
MutexWorkSlack-16        156ns ± 0%   146ns ± 1%   -6.40%  (p=0.000 n=16+19)
MutexNoSpin              966ns ± 0%   976ns ± 1%   +0.97%  (p=0.000 n=15+17)
MutexNoSpin-4            269ns ± 4%   272ns ± 4%   +1.15%  (p=0.048 n=20+18)
MutexNoSpin-16           122ns ± 0%   119ns ± 1%   -2.63%  (p=0.000 n=19+15)
MutexSpin               3.13µs ± 0%  3.12µs ± 0%   -0.17%  (p=0.000 n=18+18)
MutexSpin-4              826ns ± 1%   833ns ± 1%   +0.84%  (p=0.000 n=19+17)
MutexSpin-16             397ns ± 1%   394ns ± 1%   -0.78%  (p=0.000 n=19+19)
Once                    5.67ns ± 0%  2.07ns ± 2%  -63.43%  (p=0.000 n=20+20)
Once-4                  1.47ns ± 2%  0.54ns ± 3%  -63.49%  (p=0.000 n=19+20)
Once-16                 0.58ns ± 0%  0.17ns ± 5%  -70.49%  (p=0.000 n=17+17)
RWMutexUncontended      71.4ns ± 0%  60.3ns ± 1%  -15.60%  (p=0.000 n=16+20)
RWMutexUncontended-4    18.4ns ± 4%  15.3ns ± 1%  -17.14%  (p=0.000 n=20+20)
RWMutexUncontended-16   8.01ns ± 0%  7.69ns ± 0%   -3.91%  (p=0.000 n=18+15)
RWMutexWrite100         24.9ns ± 0%  24.0ns ± 1%   -3.57%  (p=0.000 n=19+18)
RWMutexWrite100-4       46.5ns ± 3%  44.1ns ± 4%   -5.09%  (p=0.000 n=17+20)
RWMutexWrite100-16      68.9ns ± 3%  65.7ns ± 8%   -4.65%  (p=0.000 n=18+20)
RWMutexWrite10          27.1ns ± 0%  25.7ns ± 0%   -5.25%  (p=0.000 n=17+14)
RWMutexWrite10-4        34.8ns ± 1%  33.8ns ± 2%   -2.96%  (p=0.000 n=20+20)
RWMutexWrite10-16       37.5ns ± 2%  36.1ns ± 2%   -3.72%  (p=0.000 n=20+20)
RWMutexWorkWrite100      164ns ± 0%   162ns ± 0%   -1.49%  (p=0.000 n=12+20)
RWMutexWorkWrite100-4    186ns ± 3%   184ns ± 4%     ~     (p=0.097 n=20+20)
RWMutexWorkWrite100-16   204ns ± 2%   200ns ± 2%   -1.58%  (p=0.000 n=18+20)
RWMutexWorkWrite10       153ns ± 0%   151ns ± 1%   -1.21%  (p=0.000 n=20+20)
RWMutexWorkWrite10-4     179ns ± 1%   176ns ± 2%   -1.25%  (p=0.000 n=19+20)
RWMutexWorkWrite10-16    191ns ± 1%   189ns ± 1%   -0.94%  (p=0.000 n=15+17)

Change-Id: I9269bf2ac42a04c610624f707d3268dcb17390f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/152698
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-03-09 16:34:17 +00:00
Carlo Alberto Ferraris ca8354843e sync: allow inlining the Once.Do fast path
Using Once.Do is now extremely cheap because the fast path is just an inlined
atomic load of a variable that is written only once and a conditional jump.
This is very beneficial for Once.Do because, due to its nature, the fast path
will be used for every call after the first one.

In a attempt to mimize code size increase, reorder the fields so that the
pointer to Once is also the pointer to Once.done, that is the only field used
in the hot path. This allows to use more compact instruction encodings or less
instructions in the hot path (that is inlined at every callsite).

name     old time/op  new time/op  delta
Once     4.54ns ± 0%  2.06ns ± 0%  -54.59%  (p=0.000 n=19+16)
Once-4   1.18ns ± 0%  0.55ns ± 0%  -53.39%  (p=0.000 n=15+16)
Once-16  0.53ns ± 0%  0.17ns ± 0%  -67.92%  (p=0.000 n=18+17)

linux/amd64 bin/go 14675861 (previous commit 14663387, +12474/+0.09%)

Change-Id: Ie2708103ab473787875d66746d2f20f1d90a6916
Reviewed-on: https://go-review.googlesource.com/c/go/+/152697
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-03-09 05:58:31 +00:00
Carlo Alberto Ferraris 41cb0aedff sync: allow inlining the Mutex.Lock fast path
name                    old time/op  new time/op  delta
MutexUncontended        18.9ns ± 0%  16.2ns ± 0%  -14.29%  (p=0.000 n=19+19)
MutexUncontended-4      4.75ns ± 1%  4.08ns ± 0%  -14.20%  (p=0.000 n=20+19)
MutexUncontended-16     2.05ns ± 0%  2.11ns ± 0%   +2.93%  (p=0.000 n=19+16)
Mutex                   19.3ns ± 1%  16.2ns ± 0%  -15.86%  (p=0.000 n=17+19)
Mutex-4                 52.4ns ± 4%  48.6ns ± 9%   -7.22%  (p=0.000 n=20+20)
Mutex-16                 139ns ± 2%   140ns ± 3%   +1.03%  (p=0.011 n=16+20)
MutexSlack              18.9ns ± 1%  16.2ns ± 1%  -13.96%  (p=0.000 n=20+20)
MutexSlack-4             225ns ± 8%   211ns ±10%   -5.94%  (p=0.000 n=18+19)
MutexSlack-16           98.4ns ± 1%  90.9ns ± 1%   -7.60%  (p=0.000 n=17+18)
MutexWork               58.2ns ± 3%  55.4ns ± 0%   -4.82%  (p=0.000 n=20+17)
MutexWork-4              103ns ± 7%    95ns ±18%   -8.03%  (p=0.000 n=20+20)
MutexWork-16             163ns ± 2%   155ns ± 2%   -4.47%  (p=0.000 n=18+18)
MutexWorkSlack          57.7ns ± 1%  55.4ns ± 0%   -3.99%  (p=0.000 n=20+13)
MutexWorkSlack-4         276ns ±13%   260ns ±10%   -5.64%  (p=0.001 n=19+19)
MutexWorkSlack-16        147ns ± 0%   156ns ± 1%   +5.87%  (p=0.000 n=14+19)
MutexNoSpin              968ns ± 0%   900ns ± 1%   -6.98%  (p=0.000 n=20+18)
MutexNoSpin-4            270ns ± 2%   255ns ± 2%   -5.74%  (p=0.000 n=19+20)
MutexNoSpin-16           120ns ± 4%   112ns ± 0%   -6.99%  (p=0.000 n=19+14)
MutexSpin               3.13µs ± 1%  3.19µs ± 6%     ~     (p=0.401 n=20+20)
MutexSpin-4              832ns ± 2%   831ns ± 1%   -0.17%  (p=0.023 n=16+18)
MutexSpin-16             395ns ± 0%   399ns ± 0%   +0.94%  (p=0.000 n=17+19)
RWMutexUncontended      69.5ns ± 0%  68.4ns ± 0%   -1.59%  (p=0.000 n=20+20)
RWMutexUncontended-4    17.5ns ± 0%  16.7ns ± 0%   -4.30%  (p=0.000 n=18+17)
RWMutexUncontended-16   7.92ns ± 0%  7.87ns ± 0%   -0.61%  (p=0.000 n=18+17)
RWMutexWrite100         24.9ns ± 1%  25.0ns ± 1%   +0.32%  (p=0.000 n=20+20)
RWMutexWrite100-4       46.2ns ± 4%  46.2ns ± 5%     ~     (p=0.840 n=19+20)
RWMutexWrite100-16      69.9ns ± 5%  69.9ns ± 3%     ~     (p=0.545 n=20+19)
RWMutexWrite10          27.0ns ± 2%  26.8ns ± 2%   -0.98%  (p=0.001 n=20+20)
RWMutexWrite10-4        34.7ns ± 2%  35.0ns ± 4%     ~     (p=0.191 n=18+20)
RWMutexWrite10-16       37.2ns ± 4%  37.3ns ± 2%     ~     (p=0.438 n=20+19)
RWMutexWorkWrite100      164ns ± 0%   163ns ± 0%   -0.24%  (p=0.025 n=20+20)
RWMutexWorkWrite100-4    193ns ± 3%   191ns ± 2%   -1.06%  (p=0.027 n=20+20)
RWMutexWorkWrite100-16   210ns ± 3%   207ns ± 3%   -1.22%  (p=0.038 n=20+20)
RWMutexWorkWrite10       153ns ± 0%   153ns ± 0%     ~     (all equal)
RWMutexWorkWrite10-4     178ns ± 2%   179ns ± 2%     ~     (p=0.186 n=20+20)
RWMutexWorkWrite10-16    192ns ± 2%   192ns ± 2%     ~     (p=0.731 n=19+20)

linux/amd64 bin/go 14663387 (previous commit 14630572, +32815/+0.22%)

Change-Id: I98171006dce14069b1a62da07c3d165455a7906b
Reviewed-on: https://go-review.googlesource.com/c/go/+/148959
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-09 05:08:04 +00:00
Carlo Alberto Ferraris 4c3f26076b sync: allow inlining the Mutex.Unlock fast path
Make use of the newly-enabled limited midstack inlining.
Similar changes will be done in followup CLs.

name                    old time/op  new time/op  delta
MutexUncontended        19.3ns ± 1%  18.9ns ± 0%   -1.92%  (p=0.000 n=20+19)
MutexUncontended-4      5.24ns ± 0%  4.75ns ± 1%   -9.25%  (p=0.000 n=20+20)
MutexUncontended-16     2.10ns ± 0%  2.05ns ± 0%   -2.38%  (p=0.000 n=15+19)
Mutex                   19.6ns ± 0%  19.3ns ± 1%   -1.92%  (p=0.000 n=20+17)
Mutex-4                 54.6ns ± 5%  52.4ns ± 4%   -4.09%  (p=0.000 n=20+20)
Mutex-16                 133ns ± 5%   139ns ± 2%   +4.23%  (p=0.000 n=20+16)
MutexSlack              33.4ns ± 2%  18.9ns ± 1%  -43.56%  (p=0.000 n=19+20)
MutexSlack-4             206ns ± 5%   225ns ± 8%   +9.12%  (p=0.000 n=20+18)
MutexSlack-16           89.4ns ± 1%  98.4ns ± 1%  +10.10%  (p=0.000 n=18+17)
MutexWork               60.5ns ± 0%  58.2ns ± 3%   -3.75%  (p=0.000 n=12+20)
MutexWork-4              105ns ± 5%   103ns ± 7%   -1.68%  (p=0.007 n=20+20)
MutexWork-16             157ns ± 1%   163ns ± 2%   +3.90%  (p=0.000 n=18+18)
MutexWorkSlack          70.2ns ± 5%  57.7ns ± 1%  -17.81%  (p=0.000 n=19+20)
MutexWorkSlack-4         277ns ±13%   276ns ±13%     ~     (p=0.682 n=20+19)
MutexWorkSlack-16        156ns ± 0%   147ns ± 0%   -5.62%  (p=0.000 n=16+14)
MutexNoSpin              966ns ± 0%   968ns ± 0%   +0.11%  (p=0.029 n=15+20)
MutexNoSpin-4            269ns ± 4%   270ns ± 2%     ~     (p=0.807 n=20+19)
MutexNoSpin-16           122ns ± 0%   120ns ± 4%   -1.63%  (p=0.000 n=19+19)
MutexSpin               3.13µs ± 0%  3.13µs ± 1%   +0.16%  (p=0.004 n=18+20)
MutexSpin-4              826ns ± 1%   832ns ± 2%   +0.74%  (p=0.000 n=19+16)
MutexSpin-16             397ns ± 1%   395ns ± 0%   -0.50%  (p=0.000 n=19+17)
RWMutexUncontended      71.4ns ± 0%  69.5ns ± 0%   -2.72%  (p=0.000 n=16+20)
RWMutexUncontended-4    18.4ns ± 4%  17.5ns ± 0%   -4.92%  (p=0.000 n=20+18)
RWMutexUncontended-16   8.01ns ± 0%  7.92ns ± 0%   -1.15%  (p=0.000 n=18+18)
RWMutexWrite100         24.9ns ± 0%  24.9ns ± 1%     ~     (p=0.099 n=19+20)
RWMutexWrite100-4       46.5ns ± 3%  46.2ns ± 4%     ~     (p=0.253 n=17+19)
RWMutexWrite100-16      68.9ns ± 3%  69.9ns ± 5%   +1.46%  (p=0.012 n=18+20)
RWMutexWrite10          27.1ns ± 0%  27.0ns ± 2%     ~     (p=0.128 n=17+20)
RWMutexWrite10-4        34.8ns ± 1%  34.7ns ± 2%     ~     (p=0.180 n=20+18)
RWMutexWrite10-16       37.5ns ± 2%  37.2ns ± 4%   -0.89%  (p=0.023 n=20+20)
RWMutexWorkWrite100      164ns ± 0%   164ns ± 0%     ~     (p=0.106 n=12+20)
RWMutexWorkWrite100-4    186ns ± 3%   193ns ± 3%   +3.46%  (p=0.000 n=20+20)
RWMutexWorkWrite100-16   204ns ± 2%   210ns ± 3%   +2.96%  (p=0.000 n=18+20)
RWMutexWorkWrite10       153ns ± 0%   153ns ± 0%   -0.20%  (p=0.017 n=20+19)
RWMutexWorkWrite10-4     179ns ± 1%   178ns ± 2%     ~     (p=0.215 n=19+20)
RWMutexWorkWrite10-16    191ns ± 1%   192ns ± 2%     ~     (p=0.166 n=15+19)

linux/amd64 bin/go 14630572 (previous commit 14605947, +24625/+0.17%)

Change-Id: I3f9d1765801fe0b8deb1bc2728b8bba8a7508e23
Reviewed-on: https://go-review.googlesource.com/c/go/+/148958
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-05 14:59:31 +00:00
Ian Lance Taylor ca7c12d4c9 sync/atomic: add 32-bit MIPS to the 64-bit alignment requirement
runtime/internal/atomic/atomic_mipsx.go enforces 64-bit alignment.

Change-Id: Ifdc36e1c0322827711425054d10f1c52425a13fa
Reviewed-on: https://go-review.googlesource.com/c/161697
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-02-09 23:07:07 +00:00
Brad Fitzpatrick 3813edf26e all: use "reports whether" consistently in the few places that didn't
Go documentation style for boolean funcs is to say:

    // Foo reports whether ...
    func Foo() bool

(rather than "returns true if")

This CL also replaces 4 uses of "iff" with the same "reports whether"
wording, which doesn't lose any meaning, and will prevent people from
sending typo fixes when they don't realize it's "if and only if". In
the past I think we've had the typo CLs updated to just say "reports
whether". So do them all at once.

(Inspired by the addition of another "returns true if" in CL 146938
in fd_plan9.go)

Created with:

$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true iff" | grep -v vendor)
$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" | grep -v vendor)

Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f
Reviewed-on: https://go-review.googlesource.com/c/147037
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-11-02 22:47:58 +00:00
Roberto 963776e689 sync: fix typo in doc
Change-Id: Ie1f35c7598bd2549a048d64e1b1279bf4acaa103
GitHub-Last-Rev: c8cc7dfef9
GitHub-Pull-Request: golang/go#28051
Reviewed-on: https://go-review.googlesource.com/c/140302
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2018-10-06 12:04:57 +00:00
Ian Lance Taylor 756c352963 sync: simplify (*entry).tryStore
The only change to the go build -gcflags=-m=2 output was to remove
these two lines:

sync/map.go:178:26: &e.p escapes to heap
sync/map.go:178:26: 	from &e.p (passed to call[argument escapes]) at sync/map.go:178:25

Benchstat report for sync.Map benchmarks:

name                                            old time/op  new time/op  delta
LoadMostlyHits/*sync_test.DeepCopyMap-12        10.6ns ±11%  10.2ns ± 3%    ~     (p=0.299 n=10+8)
LoadMostlyHits/*sync_test.RWMutexMap-12         54.6ns ± 3%  54.6ns ± 2%    ~     (p=0.782 n=10+10)
LoadMostlyHits/*sync.Map-12                     10.1ns ± 1%  10.1ns ± 1%    ~     (p=1.127 n=10+8)
LoadMostlyMisses/*sync_test.DeepCopyMap-12      8.65ns ± 1%  8.77ns ± 5%  +1.39%  (p=0.017 n=9+10)
LoadMostlyMisses/*sync_test.RWMutexMap-12       53.6ns ± 2%  53.8ns ± 2%    ~     (p=0.408 n=10+9)
LoadMostlyMisses/*sync.Map-12                   7.37ns ± 1%  7.46ns ± 1%  +1.19%  (p=0.000 n=9+10)
LoadOrStoreBalanced/*sync_test.RWMutexMap-12     895ns ± 4%   906ns ± 3%    ~     (p=0.203 n=9+10)
LoadOrStoreBalanced/*sync.Map-12                 872ns ±10%   804ns ±12%  -7.75%  (p=0.014 n=10+10)
LoadOrStoreUnique/*sync_test.RWMutexMap-12      1.29µs ± 2%  1.28µs ± 1%    ~     (p=0.586 n=10+9)
LoadOrStoreUnique/*sync.Map-12                  1.30µs ± 7%  1.40µs ± 2%  +6.95%  (p=0.000 n=9+10)
LoadOrStoreCollision/*sync_test.DeepCopyMap-12  6.98ns ± 1%  6.91ns ± 1%  -1.10%  (p=0.000 n=10+10)
LoadOrStoreCollision/*sync_test.RWMutexMap-12    371ns ± 1%   372ns ± 2%    ~     (p=0.679 n=9+9)
LoadOrStoreCollision/*sync.Map-12               5.49ns ± 1%  5.49ns ± 1%    ~     (p=0.732 n=9+10)
Range/*sync_test.DeepCopyMap-12                 2.49µs ± 1%  2.50µs ± 0%    ~     (p=0.148 n=10+10)
Range/*sync_test.RWMutexMap-12                  54.7µs ± 1%  54.6µs ± 3%    ~     (p=0.549 n=9+10)
Range/*sync.Map-12                              2.74µs ± 1%  2.76µs ± 1%  +0.68%  (p=0.011 n=10+8)
AdversarialAlloc/*sync_test.DeepCopyMap-12      2.52µs ± 5%  2.54µs ± 7%    ~     (p=0.225 n=10+10)
AdversarialAlloc/*sync_test.RWMutexMap-12        108ns ± 1%   107ns ± 1%    ~     (p=0.101 n=10+9)
AdversarialAlloc/*sync.Map-12                    712ns ± 2%   714ns ± 3%    ~     (p=0.984 n=8+10)
AdversarialDelete/*sync_test.DeepCopyMap-12      581ns ± 3%   578ns ± 3%    ~     (p=0.781 n=9+9)
AdversarialDelete/*sync_test.RWMutexMap-12       126ns ± 2%   126ns ± 1%    ~     (p=0.883 n=10+10)
AdversarialDelete/*sync.Map-12                   155ns ± 8%   158ns ± 2%    ~     (p=0.158 n=10+9)

Change-Id: I1ed8e3109baca03087d0fad3df769fc7e38f6dbb
Reviewed-on: https://go-review.googlesource.com/137441
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-09-27 21:44:20 +00:00
Dan Kortschak c2eba53e7f cmd/vet,sync: check lock values more precisely
Fixes #26165

Change-Id: I1f3bd193af9b6f8461c736330952b6e50d3e00d9
Reviewed-on: https://go-review.googlesource.com/121876
Reviewed-by: Alan Donovan <adonovan@google.com>
Run-TryBot: Rob Pike <r@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-07-14 06:48:21 +00:00