go/src
Lynn Boger eeca3ba92f sync/atomic, runtime/internal/atomic: improve ppc64x atomics
The following performance improvements have been made to the
low-level atomic functions for ppc64le & ppc64:

- For sequences containing a lwarx and stwcx (or the other sizes),
the old code was:
sync, lwarx, maybe something, stwcx, loop to sync, sync, isync
The leading sync is moved before (outside) the lwarx/stwcx loop, and
the sync after the loop is removed, so it becomes:
sync, lwarx, maybe something, stwcx, loop to lwarx, isync

- For Or8 and And8, the shifting and manipulation of the address
to get its word-aligned version were removed, and the code now
uses lbarx and stbcx directly instead of register shifting, xor,
then lwarx, stwcx.

- New instructions LWSYNC, LBAR, STBCC were tested and added.
runtime/atomic_ppc64x.s was changed to use the LWSYNC opcode
instead of the WORD encoding.

Fixes #15469

Ran some of the benchmarks in the runtime and sync directories.
Some results varied from run to run, but the trend was an improvement,
based on the best times for base and new:

runtime.test:
benchmark                            base ns/op    new ns/op     delta
BenchmarkChanNonblocking-128         0.88          0.89          +1.14%
BenchmarkChanUncontended-128         569           511           -10.19%
BenchmarkChanContended-128           63110         53231         -15.65%
BenchmarkChanSync-128                691           598           -13.46%
BenchmarkChanSyncWork-128            11355         11649         +2.59%
BenchmarkChanProdCons0-128           2402          2090          -12.99%
BenchmarkChanProdCons10-128          1348          1363          +1.11%
BenchmarkChanProdCons100-128         1002          746           -25.55%
BenchmarkChanProdConsWork0-128       2554          2720          +6.50%
BenchmarkChanProdConsWork10-128      1909          1804          -5.50%
BenchmarkChanProdConsWork100-128     1624          1580          -2.71%
BenchmarkChanCreation-128            237           212           -10.55%
BenchmarkChanSem-128                 705           667           -5.39%
BenchmarkChanPopular-128             5081190       4497566       -11.49%

BenchmarkCreateGoroutines-128             532           473           -11.09%
BenchmarkCreateGoroutinesParallel-128     35.0          34.7          -0.86%
BenchmarkCreateGoroutinesCapture-128      4923          4200          -14.69%

sync.test:
benchmark                              base ns/op    new ns/op     delta
BenchmarkUncontendedSemaphore-128      112           94.2          -15.89%
BenchmarkContendedSemaphore-128        133           128           -3.76%
BenchmarkMutexUncontended-128          1.90          1.67          -12.11%
BenchmarkMutex-128                     353           310           -12.18%
BenchmarkMutexSlack-128                304           283           -6.91%
BenchmarkMutexWork-128                 554           541           -2.35%
BenchmarkMutexWorkSlack-128            567           556           -1.94%
BenchmarkMutexNoSpin-128               275           242           -12.00%
BenchmarkMutexSpin-128                 1129          1030          -8.77%
BenchmarkOnce-128                      1.08          0.96          -11.11%
BenchmarkPool-128                      29.8          27.4          -8.05%
BenchmarkPoolOverflow-128              40564         36583         -9.81%
BenchmarkSemaUncontended-128           3.14          2.63          -16.24%
BenchmarkSemaSyntNonblock-128          1087          1069          -1.66%
BenchmarkSemaSyntBlock-128             897           893           -0.45%
BenchmarkSemaWorkNonblock-128          1034          1028          -0.58%
BenchmarkSemaWorkBlock-128             949           886           -6.64%
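
The last column in the tables above is the relative change from the base time to the new time. A minimal Go sketch of that arithmetic follows; `percentDelta` is a hypothetical helper written for this example, not part of any Go tool.

```go
package main

import "fmt"

// percentDelta returns the relative change from base to cur as a
// signed percentage string with two decimals. Hypothetical helper.
func percentDelta(base, cur float64) string {
	return fmt.Sprintf("%+.2f%%", (cur-base)/base*100)
}

func main() {
	fmt.Println(percentDelta(569, 511))   // BenchmarkChanUncontended-128: -10.19%
	fmt.Println(percentDelta(0.88, 0.89)) // BenchmarkChanNonblocking-128: +1.14%
}
```

A negative value means the new time is lower than the base time, i.e. an improvement.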

Change-Id: I4403fb29d3cd5254b7b1ce87a216bd11b391079e
Reviewed-on: https://go-review.googlesource.com/22549
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Minux Ma <minux@golang.org>
2016-05-05 18:52:28 +00:00
archive archive/zip: pool flate readers 2016-05-04 14:28:27 +00:00
bufio all: replace magic 0x80 with named constant utf8.RuneSelf 2016-04-10 15:15:57 +00:00
builtin
bytes strings, bytes: fix Reader 0 byte read at EOF 2016-05-03 21:01:13 +00:00
cmd sync/atomic, runtime/internal/atomic: improve ppc64x atomics 2016-05-05 18:52:28 +00:00
compress compress/flate: distinguish between base and min match length. 2016-05-05 00:16:39 +00:00
container container/heap: correct number of elements in BenchmarkDup 2016-04-20 15:26:05 +00:00
context context: use https in docs 2016-05-05 18:31:23 +00:00
crypto crypto/cipher, crypto/aes: add s390x implementation of AES-CTR 2016-04-29 21:17:31 +00:00
database/sql database/sql: clone data for named []byte types 2016-04-30 18:40:36 +00:00
debug debug/pe: unexport newly introduced identifiers 2016-05-05 00:20:45 +00:00
encoding encoding/json: add Encoder.DisableHTMLEscaping 2016-04-22 21:35:56 +00:00
errors
expvar expvar: Ensure strings are written as valid JSON. 2016-04-06 03:52:39 +00:00
flag flag: update test case (fix build) 2016-04-21 23:17:18 +00:00
fmt fmt: remove extra space in doc for compound objects 2016-04-17 20:07:32 +00:00
go cmd/compile: use correct packages when exporting/importing _ (blank) names 2016-05-03 14:57:06 +00:00
hash hash/crc32: use vector instructions on s390x 2016-04-22 18:07:15 +00:00
html html/template, text/template: clarify Parse{Files,Glob} semantics 2016-04-22 02:01:54 +00:00
image image/gif: accept an out-of-bounds transparent color index. 2016-04-29 00:01:22 +00:00
index/suffixarray
internal net/http, net/http/httptrace: new package for tracing HTTP client requests 2016-04-28 20:56:38 +00:00
io io: document WriteString calls Write exactly once 2016-04-12 01:03:51 +00:00
log
math all: make copyright headers consistent with one space after period 2016-05-02 13:43:18 +00:00
mime all: standardize RFC mention format 2016-04-12 21:07:52 +00:00
net net/http: correct RFC for MethodPatch 2016-05-04 22:11:56 +00:00
os os/exec: fix variable shadow, don't leak goroutine 2016-04-28 20:56:25 +00:00
path all: use bytes.Equal, bytes.Contains and strings.Contains, again 2016-04-11 15:16:54 +00:00
reflect runtime: reclaim scan/dead bit in first word 2016-04-30 16:49:54 +00:00
regexp all: make copyright headers consistent with one space after period 2016-05-02 13:43:18 +00:00
runtime sync/atomic, runtime/internal/atomic: improve ppc64x atomics 2016-05-05 18:52:28 +00:00
sort all: delete dead non-test code 2016-03-25 06:28:13 +00:00
strconv strconv: fix ParseFloat for special forms of zero values 2016-04-19 22:39:43 +00:00
strings strings, bytes: fix Reader 0 byte read at EOF 2016-05-03 21:01:13 +00:00
sync sync/atomic, runtime/internal/atomic: improve ppc64x atomics 2016-05-05 18:52:28 +00:00
syscall syscall: fix uint64->int cast of control message header 2016-04-27 20:10:09 +00:00
testing testing: add matching of subtest 2016-04-21 19:58:31 +00:00
text html/template, text/template: clarify Parse{Files,Glob} semantics 2016-04-22 02:01:54 +00:00
time time: print zero duration as 0s, not 0 2016-04-21 22:07:59 +00:00
unicode unicode: improve SimpleFold performance for ascii 2016-04-26 21:59:50 +00:00
unsafe
vendor/golang.org/x/net/http2/hpack all: fix spelling mistakes 2016-04-03 17:03:15 +00:00
Make.dist
all.bash
all.bat
all.rc
androidtest.bash all: make copyright headers consistent with one space after period 2016-05-02 13:43:18 +00:00
bootstrap.bash all: make copyright headers consistent with one space after period 2016-05-02 13:43:18 +00:00
buildall.bash
clean.bash
clean.bat
clean.rc
cmp.bash cmd/compile: switch to compact export format by default 2016-04-27 16:59:55 +00:00
iostest.bash
make.bash cmd/dist: redo flag-passing for bootstrap 2016-03-18 19:00:03 +00:00
make.bat
make.rc
naclmake.bash src: split nacltest.bash into naclmake.bash and keep nacltest.bash 2016-04-12 02:03:34 +00:00
nacltest.bash all: make copyright headers consistent with one space after period 2016-05-02 13:43:18 +00:00
race.bash
race.bat
run.bash
run.bat
run.rc