20 Commits

Author SHA1 Message Date
Russ Cox a71ca3dfbd runtime, sync, sync/atomic: document happens-before guarantees
A few of these are copied from the memory model doc.
Many are entirely new, following discussion on #47141.
See https://research.swtch.com/gomm for background.

The rule we are establishing is that each type that is meant
to help synchronize a Go program should document its
happens-before guarantees.

For #50859.

Change-Id: I947c40639b263abe67499fa74f68711a97873a39
Reviewed-on: https://go-review.googlesource.com/c/go/+/381316
Auto-Submit: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
2022-06-06 20:48:03 +00:00
Jason7602 507a44dc22 sync: remove the redundant logic on sync.(*Pool).Put
When the procUnpin is placed after shared.pushHead, there is
no need for x as a flag to indicate the previous process.

This CL makes the logic clearer and, at the same time, removes
a redundant check.

Change-Id: I34ec9ba4cb5b5dbdf13a8f158b90481fed248cf5
Reviewed-on: https://go-review.googlesource.com/c/go/+/360059
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-08 17:23:05 +00:00
Russ Cox 2580d0e08d all: gofmt -w -r 'interface{} -> any' src
And then revert the bootstrap cmd directories and certain testdata.
And adjust tests as needed.

Not reverting the changes in std that are bootstrapped,
because some of those changes would appear in API docs,
and we want to use any consistently.
Instead, rewrite 'any' to 'interface{}' in cmd/dist for those directories
when preparing the bootstrap copy.

A few files changed as a result of running gofmt -w
not because of interface{} -> any but because they
hadn't been updated for the new //go:build lines.

Fixes #49884.

Change-Id: Ie8045cba995f65bd79c694ec77a1b3d1fe01bb09
Reviewed-on: https://go-review.googlesource.com/c/go/+/368254
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2021-12-13 18:45:54 +00:00
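
For readers who know Pool only through its API, here is a minimal usage sketch written with the `any` spelling that commit 2580d0e08d introduced across the tree. The names bufPool and render are hypothetical and chosen only for illustration; `any` is an alias for `interface{}`, so the behavior is the same either way.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool is a hypothetical pool of reusable buffers.
// New is declared with the 'any' spelling of interface{}.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// render borrows a buffer, uses it, and returns it to the pool.
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold old contents
	defer bufPool.Put(buf)
	fmt.Fprintf(buf, "hello, %s", name)
	return buf.String() // String copies, so the result outlives Put
}

func main() {
	fmt.Println(render("gopher"))
}
```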
Meng Zhuo ecb2f231fa runtime,sync: using fastrandn instead of modulo reduction
fastrandn is ~50% faster than fastrand() % n.
`ack -v 'fastrand\(\)\s?\%'` finds all uses of modulo on fastrand()

name              old time/op  new time/op  delta
Fastrandn/2       2.86ns ± 0%  1.59ns ± 0%  -44.35%  (p=0.000 n=9+10)
Fastrandn/3       2.87ns ± 1%  1.59ns ± 0%  -44.41%  (p=0.000 n=10+9)
Fastrandn/4       2.87ns ± 1%  1.58ns ± 1%  -45.10%  (p=0.000 n=10+10)
Fastrandn/5       2.86ns ± 1%  1.58ns ± 1%  -44.84%  (p=0.000 n=10+10)

Change-Id: Ic91f5ca9b9e3b65127bc34792b62fd64fbd13b5c
Reviewed-on: https://go-review.googlesource.com/c/go/+/353269
Trust: Meng Zhuo <mzh@golangcn.org>
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-10-07 14:01:52 +00:00
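
The idea behind commit ecb2f231fa, replacing `fastrand() % n` with a multiply-and-shift reduction, can be sketched outside the runtime as follows. This is an illustration only: reduce is a hypothetical name, and math/rand stands in for the runtime's internal random source, which user code cannot call.

```go
package main

import (
	"fmt"
	"math/rand"
)

// reduce maps a uniformly distributed 32-bit value x into [0, n)
// with one multiplication and one shift instead of a division.
func reduce(x, n uint32) uint32 {
	return uint32(uint64(x) * uint64(n) >> 32)
}

func main() {
	const n = 5
	counts := make([]int, n)
	for i := 0; i < 100000; i++ {
		counts[reduce(rand.Uint32(), n)]++
	}
	// Each bucket should receive roughly the same number of hits.
	fmt.Println(counts)
}
```

The result is always below n because x is below 2^32, so x*n/2^32 is strictly less than n.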
Bryan C. Mills 0d1280c685 Revert "sync: improve sync.Pool object stealing"
This reverts CL 303949.

Reason for revert: broke linux-arm-aws TryBots.

Change-Id: Ib44949df70520cdabff857846be0d2221403d2f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/313630
Trust: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2021-04-26 18:54:39 +00:00
Ruslan Andreev d5d24dbe41 sync: improve sync.Pool object stealing
This CL provides the ability to randomly select a P to steal objects from its
shared queue. To provide this ability, the randomOrder structure
was copied from runtime/proc.go.
It should reduce contention on the first Ps and improve the balance of object
stealing across all Ps. The patch also adds a new benchmark,
PoolStarvation, which forces Ps to steal objects.
Benchmarks:
name                old time/op     new time/op     delta
Pool-8                 2.16ns ±14%     2.14ns ±16%    ~     (p=0.425 n=10+10)
PoolOverflow-8          489ns ± 0%      489ns ± 0%    ~     (p=0.719 n=9+10)
PoolStarvation-8       7.00µs ± 4%     6.59µs ± 2%  -5.86%  (p=0.000 n=10+10)
PoolSTW-8              15.1µs ± 1%     15.2µs ± 1%  +0.99%  (p=0.001 n=10+10)
PoolExpensiveNew-8     1.25ms ±10%     1.31ms ± 9%    ~     (p=0.143 n=10+10)
[Geo mean]             2.68µs          2.68µs       -0.28%

name                old p50-ns/STW  new p50-ns/STW  delta
PoolSTW-8               15.0k ± 1%      15.1k ± 1%  +0.92%  (p=0.000 n=10+10)

name                old p95-ns/STW  new p95-ns/STW  delta
PoolSTW-8               16.2k ± 3%      16.4k ± 2%    ~     (p=0.143 n=10+10)

name                old GCs/op      new GCs/op      delta
PoolExpensiveNew-8       0.29 ± 2%       0.30 ± 1%  +2.84%  (p=0.000 n=8+10)

name                old New/op      new New/op      delta
PoolExpensiveNew-8       8.07 ±11%       8.49 ±10%    ~     (p=0.123 n=10+10)

Change-Id: I3ca1d0bf1f358b1148c58e64740fb2d5bfc0bc02
Reviewed-on: https://go-review.googlesource.com/c/go/+/303949
Reviewed-by: David Chase <drchase@google.com>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
2021-04-26 17:13:36 +00:00
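
Commit d5d24dbe41 does not show the randomOrder structure it copies, so the following is only a sketch of one way to visit every shard exactly once from a random starting point: step through the indices with a stride that is coprime with the shard count. The names gcd and visitOrder are hypothetical and this is not the code from the CL.

```go
package main

import "fmt"

// gcd returns the greatest common divisor of a and b.
func gcd(a, b uint32) uint32 {
	for b != 0 {
		a, b = b, a%b
	}
	return a
}

// visitOrder returns a permutation of 0..count-1 derived from seed.
// Because the stride is coprime with count, every index is visited
// exactly once before the sequence repeats.
func visitOrder(count, seed uint32) []uint32 {
	if count == 0 {
		return nil
	}
	var coprimes []uint32
	for i := uint32(1); i <= count; i++ {
		if gcd(i, count) == 1 {
			coprimes = append(coprimes, i)
		}
	}
	pos := seed % count
	inc := coprimes[(seed/count)%uint32(len(coprimes))]
	order := make([]uint32, 0, count)
	for i := uint32(0); i < count; i++ {
		order = append(order, pos)
		pos = (pos + inc) % count
	}
	return order
}

func main() {
	fmt.Println(visitOrder(8, 12345))
}
```

Different seeds give different starting shards and strides, which is what spreads stealing across shards instead of always starting at index 0.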
Austin Clements c305e49e96 cmd/go,cmd/compile,sync: remove special import case in cmd/go
CL 253748 introduced a special case in cmd/go to allow sync to import
runtime/internal/atomic. Besides introducing unnecessary complexity
into cmd/go, this breaks other packages (like gopls) that understand
how imports work, but don't understand this special case.

Fix this by using the more standard linkname-based approach to pull
the necessary functions from runtime/internal/atomic into sync. Since
these are compiler intrinsics, we also have to tell the compiler that
the linknamed symbols are intrinsics to get this optimization in sync.

Fixes #42196.

Change-Id: I1f91498c255c91583950886a89c3c9adc39a32f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/265124
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Paul Murphy <murp@ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-10-26 20:12:53 +00:00
Paul E. Murphy 15ead857db cmd/compiler,cmd/go,sync: add internal {LoadAcq,StoreRel}64 on ppc64
Add an internal atomic intrinsic for load with acquire semantics
(extending LoadAcq to 64b) and add LoadAcquintptr for internal
use within the sync package.  For other arches, this remaps to the
appropriate atomic.Load{,64} intrinsic which should not alter code
generation.

Similarly, add StoreRel{uintptr,64} for consistency, and inline.

Finally, add an exception to allow sync to directly use the
runtime/internal/atomic package which avoids more convoluted
workarounds (contributed by Lynn Boger).

In an extreme example, sync.(*Pool).pin consumes 20% of wall time
during fmt tests.  This is reduced to 5% on ppc64le/power9.

From the fmt benchmarks on ppc64le:

name                           old time/op  new time/op  delta
SprintfPadding                  468ns ± 0%   451ns ± 0%   -3.63%
SprintfEmpty                   73.3ns ± 0%  51.9ns ± 0%  -29.20%
SprintfString                   135ns ± 0%   122ns ± 0%   -9.63%
SprintfTruncateString           232ns ± 0%   214ns ± 0%   -7.76%
SprintfTruncateBytes            216ns ± 0%   202ns ± 0%   -6.48%
SprintfSlowParsingPath          162ns ± 0%   142ns ± 0%  -12.35%
SprintfQuoteString             1.00µs ± 0%  0.99µs ± 0%   -1.39%
SprintfInt                      117ns ± 0%   104ns ± 0%  -11.11%
SprintfIntInt                   190ns ± 0%   175ns ± 0%   -7.89%
SprintfPrefixedInt              232ns ± 0%   212ns ± 0%   -8.62%
SprintfFloat                    270ns ± 0%   255ns ± 0%   -5.56%
SprintfComplex                 1.01µs ± 0%  0.99µs ± 0%   -1.68%
SprintfBoolean                  127ns ± 0%   111ns ± 0%  -12.60%
SprintfHexString                220ns ± 0%   198ns ± 0%  -10.00%
SprintfHexBytes                 261ns ± 0%   252ns ± 0%   -3.45%
SprintfBytes                    600ns ± 0%   590ns ± 0%   -1.67%
SprintfStringer                 684ns ± 0%   658ns ± 0%   -3.80%
SprintfStructure               2.57µs ± 0%  2.57µs ± 0%   -0.12%
ManyArgs                        669ns ± 0%   646ns ± 0%   -3.44%
FprintInt                       140ns ± 0%   136ns ± 0%   -2.86%
FprintfBytes                    184ns ± 0%   181ns ± 0%   -1.63%
FprintIntNoAlloc                140ns ± 0%   136ns ± 0%   -2.86%
ScanInts                        929µs ± 0%   921µs ± 0%   -0.79%
ScanRecursiveInt                122ms ± 0%   121ms ± 0%   -0.11%
ScanRecursiveIntReaderWrapper   122ms ± 0%   122ms ± 0%   -0.18%

Change-Id: I4d66780261b57b06ef600229e475462e7313f0d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/253748
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-10-21 14:34:44 +00:00
Kai Dong 5ccaf2c6ad sync: update comment
Comment update.

Change-Id: If0d054216f9953f42df04647b85c38008b85b026
GitHub-Last-Rev: 133b4670be
GitHub-Pull-Request: golang/go#31539
Reviewed-on: https://go-review.googlesource.com/c/go/+/172700
Reviewed-by: Austin Clements <austin@google.com>
2019-04-19 16:15:36 +00:00
Austin Clements 2dcbf8b369 sync: smooth out Pool behavior over GC with a victim cache
Currently, every Pool is cleared completely at the start of each GC.
This is a problem for heavy users of Pool because it causes an
allocation spike immediately after Pools are cleared, which impacts both
throughput and latency.

This CL fixes this by introducing a victim cache mechanism. Instead of
clearing Pools, the victim cache is dropped and the primary cache is
moved to the victim cache. As a result, in steady-state, there are
(roughly) no new allocations, but if Pool usage drops, objects will
still be collected within two GCs (as opposed to one).

This victim cache approach also improves Pool's impact on GC dynamics.
The current approach causes all objects in Pools to be short lived.
However, if an application is in steady state and is just going to
repopulate its Pools, then these objects impact the live heap size *as
if* they were long lived. Since Pooled objects count as short lived
when computing the GC trigger and goal, but act as long lived objects
in the live heap, this causes GC to trigger too frequently. If Pooled
objects are a non-trivial portion of an application's heap, this
increases the CPU overhead of GC. The victim cache lets Pooled objects
affect the GC trigger and goal as long-lived objects.

This has no impact on Get/Put performance, but substantially reduces
the impact to the Pool user when a GC happens. PoolExpensiveNew
demonstrates this in the substantial reduction in the rate at which
the "New" function is called.

name                 old time/op     new time/op     delta
Pool-12                 2.21ns ±36%     2.00ns ± 0%     ~     (p=0.070 n=19+16)
PoolOverflow-12          587ns ± 1%      583ns ± 1%   -0.77%  (p=0.000 n=18+18)
PoolSTW-12              5.57µs ± 3%     4.52µs ± 4%  -18.82%  (p=0.000 n=20+19)
PoolExpensiveNew-12     3.69ms ± 7%     1.25ms ± 5%  -66.25%  (p=0.000 n=20+19)

name                 old p50-ns/STW  new p50-ns/STW  delta
PoolSTW-12               5.48k ± 2%      4.53k ± 2%  -17.32%  (p=0.000 n=20+20)

name                 old p95-ns/STW  new p95-ns/STW  delta
PoolSTW-12               6.69k ± 4%      5.13k ± 3%  -23.31%  (p=0.000 n=19+18)

name                 old GCs/op      new GCs/op      delta
PoolExpensiveNew-12       0.39 ± 1%       0.32 ± 2%  -17.95%  (p=0.000 n=18+20)

name                 old New/op      new New/op      delta
PoolExpensiveNew-12       40.0 ± 6%       12.4 ± 6%  -68.91%  (p=0.000 n=20+19)

(https://perf.golang.org/search?q=upload:20190311.2)

Fixes #22950.

Change-Id: If2e183d948c650417283076aacc20739682cdd70
Reviewed-on: https://go-review.googlesource.com/c/go/+/166961
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-04-05 18:49:08 +00:00
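
The victim cache mechanism described in commit 2dcbf8b369 can be illustrated with a deliberately simplified, single-goroutine model. The type twoGenCache and its cycle method are hypothetical: cycle stands in for a GC, and none of this is the actual Pool code.

```go
package main

import "fmt"

// twoGenCache is a hypothetical model of a primary cache plus a
// victim cache. It is not safe for concurrent use.
type twoGenCache struct {
	primary []any
	victim  []any
	newFn   func() any
}

// Put stores x in the primary cache.
func (c *twoGenCache) Put(x any) { c.primary = append(c.primary, x) }

// Get prefers the primary cache, then the victim cache, then newFn.
func (c *twoGenCache) Get() any {
	if n := len(c.primary); n > 0 {
		x := c.primary[n-1]
		c.primary = c.primary[:n-1]
		return x
	}
	if n := len(c.victim); n > 0 {
		x := c.victim[n-1]
		c.victim = c.victim[:n-1]
		return x
	}
	return c.newFn()
}

// cycle models one GC: the old victim cache is dropped and the
// primary cache becomes the new victim cache.
func (c *twoGenCache) cycle() {
	c.victim = c.primary
	c.primary = nil
}

func main() {
	allocs := 0
	c := &twoGenCache{newFn: func() any { allocs++; return new([64]byte) }}
	c.Put(c.Get()) // first Get allocates
	c.cycle()
	c.Put(c.Get()) // served from the victim cache, no allocation
	c.cycle()
	c.cycle() // unused for two cycles, so the object is dropped
	c.Get()   // allocates again
	fmt.Println("allocations:", allocs) // allocations: 2
}
```

In this model an object in steady use survives each cycle, while an unused object is released after two cycles, matching the "within two GCs" behavior the commit message describes.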
Austin Clements d5fd2dd6a1 sync: use lock-free structure for Pool stealing
Currently, Pool stores each per-P shard's overflow in a slice
protected by a Mutex. In order to store to the overflow or steal from
another shard, a P must lock that shard's Mutex. This allows for
simple synchronization between Put and Get, but has unfortunate
consequences for clearing pools.

Pools are cleared during STW sweep termination, and hence rely on
pinning a goroutine to its P to synchronize between Get/Put and
clearing. This makes the Get/Put fast path extremely fast because it
can rely on quiescence-style coordination, which doesn't even require
atomic writes, much less locking.

The catch is that a goroutine cannot acquire a Mutex while pinned to
its P (as this could deadlock). Hence, it must drop the pin on the
slow path. But this means the slow path is not synchronized with
clearing. As a result,

1) It's difficult to reason about races between clearing and the slow
path. Furthermore, this reasoning often depends on unspecified nuances
of where preemption points can occur.

2) Clearing must zero out the pointer to every object in every Pool to
prevent a concurrent slow path from causing all objects to be
retained. Since this happens during STW, this has an O(# objects in
Pools) effect on STW time.

3) We can't implement a victim cache without making clearing even
slower.

This CL solves these problems by replacing the locked overflow slice
with a lock-free structure. This allows Gets and Puts to be pinned the
whole time they're manipulating the shards slice (Pool.local), which
eliminates the races between Get/Put and clearing. This, in turn,
eliminates the need to zero all object pointers, reducing clearing to
O(# of Pools) during STW.

In addition to significantly reducing STW impact, this also happens to
speed up the Get/Put fast-path and the slow path. It somewhat
increases the cost of PoolExpensiveNew, but we'll fix that in the next
CL.

name                 old time/op     new time/op     delta
Pool-12                 3.00ns ± 0%     2.21ns ±36%  -26.32%  (p=0.000 n=18+19)
PoolOverflow-12          600ns ± 1%      587ns ± 1%   -2.21%  (p=0.000 n=16+18)
PoolSTW-12              71.0µs ± 2%      5.6µs ± 3%  -92.15%  (p=0.000 n=20+20)
PoolExpensiveNew-12     3.14ms ± 5%     3.69ms ± 7%  +17.67%  (p=0.000 n=19+20)

name                 old p50-ns/STW  new p50-ns/STW  delta
PoolSTW-12               70.7k ± 1%       5.5k ± 2%  -92.25%  (p=0.000 n=20+20)

name                 old p95-ns/STW  new p95-ns/STW  delta
PoolSTW-12               73.1k ± 2%       6.7k ± 4%  -90.86%  (p=0.000 n=18+19)

name                 old GCs/op      new GCs/op      delta
PoolExpensiveNew-12       0.38 ± 1%       0.39 ± 1%   +2.07%  (p=0.000 n=20+18)

name                 old New/op      new New/op      delta
PoolExpensiveNew-12       33.9 ± 6%       40.0 ± 6%  +17.97%  (p=0.000 n=19+20)

(https://perf.golang.org/search?q=upload:20190311.1)

Fixes #22331.
For #22950.

Change-Id: Ic5cd826e25e218f3f8256dbc4d22835c1fecb391
Reviewed-on: https://go-review.googlesource.com/c/go/+/166960
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-04-05 18:49:07 +00:00
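
Commit d5fd2dd6a1 does not describe the lock-free structure it introduces in detail. To show what replacing a Mutex-protected slice with compare-and-swap operations looks like in general, here is a sketch of a Treiber stack, a different and much simpler lock-free structure than whatever the CL uses. The names node and stack are hypothetical, and atomic.Pointer requires Go 1.19 or later.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// node is one element of the stack.
type node struct {
	val  any
	next *node
}

// stack is a Treiber stack: Push and Pop retry a compare-and-swap
// on the head pointer instead of taking a lock.
type stack struct {
	head atomic.Pointer[node]
}

// Push adds v to the top of the stack.
func (s *stack) Push(v any) {
	n := &node{val: v}
	for {
		old := s.head.Load()
		n.next = old
		if s.head.CompareAndSwap(old, n) {
			return
		}
	}
}

// Pop removes and returns the top value, or false if empty.
func (s *stack) Pop() (any, bool) {
	for {
		old := s.head.Load()
		if old == nil {
			return nil, false
		}
		if s.head.CompareAndSwap(old, old.next) {
			return old.val, true
		}
	}
}

func main() {
	var s stack
	var wg sync.WaitGroup
	for g := 0; g < 4; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				s.Push(i)
			}
		}()
	}
	wg.Wait()
	n := 0
	for {
		if _, ok := s.Pop(); !ok {
			break
		}
		n++
	}
	fmt.Println("popped:", n) // popped: 4000
}
```

Because no goroutine ever blocks while holding a lock, a caller could stay pinned for the whole operation, which is the property the commit message relies on.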
Aliaksandr Valialkin 8aa31d5dae sync: align poolLocal to CPU cache line size
Make the poolLocal size a multiple of 128, so it aligns to the CPU cache
line size on the most common architectures.

This also has the following benefits:

- It may help the compiler replace integer multiplication
  with a bit shift inside indexLocal.
- It shrinks poolLocal size from 176 bytes to 128 bytes on amd64,
  so now it fits two cache lines (or a single cache line on certain
  Intel CPUs - see https://software.intel.com/en-us/articles/optimizing-application-performance-on-intel-coret-microarchitecture-using-hardware-implemented-prefetchers).

No measurable performance changes on linux/amd64 and linux/386.

Change-Id: I11df0f064718a662e77a85d88b8a15a8919f25e9
Reviewed-on: https://go-review.googlesource.com/40918
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-20 22:36:07 +00:00
Aliaksandr Valialkin af5c95117b sync: improve Pool performance
Rewrite indexLocal to achieve higher performance.

Performance results on linux/amd64:

name            old time/op  new time/op  delta
Pool-4          19.1ns ± 2%  10.1ns ± 1%  -47.15%  (p=0.000 n=10+8)
PoolOverflow-4  3.11µs ± 1%  2.10µs ± 2%  -32.66%  (p=0.000 n=10+10)

Performance results on linux/386:

name            old time/op  new time/op  delta
Pool-4          20.0ns ± 2%  13.1ns ± 1%  -34.59%  (p=0.000 n=10+9)
PoolOverflow-4  3.51µs ± 1%  2.49µs ± 0%  -28.99%  (p=0.000 n=10+8)

Change-Id: I7d57a2d4cd47ec43d09ca1267bde2e3f05a9faa9
Reviewed-on: https://go-review.googlesource.com/40913
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-18 14:17:48 +00:00
Russ Cox ba048f7ce4 sync: enable Pool when using race detector
Disabled by https://golang.org/cl/53020044 due to false positives.
Reenable and model properly.

Fixes #17306.

Change-Id: I28405ddfcd17f58cf1427c300273212729154359
Reviewed-on: https://go-review.googlesource.com/31589
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2016-10-30 19:09:20 +00:00
Aliaksandr Valialkin c81a3532fe cmd/vet: check sync.* types' copying
Embed noLock struct into the following types, so `go vet -copylocks` also
catches copies of them, in addition to types containing sync.Mutex:
  - sync.Cond
  - sync.WaitGroup
  - sync.Pool
  - atomic.Value

Fixes #14582

Change-Id: Icb543ef5ad10524ad239a15eec8a9b334b0e0660
Reviewed-on: https://go-review.googlesource.com/22015
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-06 16:43:51 +00:00
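
The marker-type approach from commit c81a3532fe can be sketched in user code as follows. This assumes that vet's copylocks check treats any type with pointer-receiver Lock and Unlock methods as a lock; the names noCopy and counter are hypothetical here and are not taken from the CL.

```go
package main

import "fmt"

// noCopy is a zero-size marker type. Its no-op Lock and Unlock
// methods exist only so that vet's copylocks check (assumed
// behavior) treats any struct containing it as non-copyable.
type noCopy struct{}

func (*noCopy) Lock()   {}
func (*noCopy) Unlock() {}

// counter must not be copied after first use.
type counter struct {
	noCopy noCopy
	n      int
}

// Inc increments the counter in place.
func (c *counter) Inc() { c.n++ }

func main() {
	c := &counter{}
	c.Inc()
	fmt.Println(c.n) // 1

	// Uncommenting the next line is still valid Go, but go vet is
	// expected to flag it as copying a lock value:
	// bad := *c
}
```

The marker adds no size or runtime cost; the protection comes entirely from static analysis.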
Matthew Dempsky 0da4dbe232 all: remove unnecessary type conversions
cmd and runtime were handled separately, and I intentionally skipped
syscall. This is the rest of the standard library.

CL generated mechanically with github.com/mdempsky/unconvert.

Change-Id: I9e0eff886974dedc37adb93f602064b83e469122
Reviewed-on: https://go-review.googlesource.com/22104
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-15 07:31:45 +00:00
Josh Bleecher Snyder e43c74a0d8 all: use cannot instead of can not
You can not use cannot, but you cannot spell cannot can not.

Change-Id: I2f0971481a460804de96fd8c9e46a9cc62a3fc5b
Reviewed-on: https://go-review.googlesource.com/19772
Reviewed-by: Rob Pike <r@golang.org>
2016-02-21 15:35:50 +00:00
Dmitry Vyukov 7b767f4e52 internal/race: add package
Factor out duplicated race thunks from sync, syscall, net
and fmt packages into a separate package and use it.

Fixes #8593

Change-Id: I156869c50946277809f6b509463752e7f7d28cdb
Reviewed-on: https://go-review.googlesource.com/14870
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-11-26 16:50:31 +00:00
Dmitriy Vyukov af3868f187 sync: release Pool memory during second and later GCs
Pool memory was only being released during the first GC after the first Put.

Put assumes that p.local != nil means p is on the allPools list.
poolCleanup (called during each GC) removed each pool from allPools
but did not clear p.local, so each pool was cleared by exactly one GC
and then never cleared again.

This bug was introduced late in the Go 1.3 release cycle.

Fixes #8979.

LGTM=rsc
R=golang-codereviews, bradfitz, r, rsc
CC=golang-codereviews, khr
https://golang.org/cl/162980043
2014-10-22 20:23:49 +04:00
Russ Cox c007ce824d build: move package sources from src/pkg to src
Preparation was in CL 134570043.
This CL contains only the effect of 'hg mv src/pkg/* src'.
For more about the move, see golang.org/s/go14nopkg.
2014-09-08 00:08:51 -04:00