mirror of https://github.com/golang/go.git
725 Commits
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
1831437f19 |
math/big: better doc string for Float.Copy, add example test
Fixes #66358. Change-Id: Ic9bde88eabfb2a446d32e1dc5ac404a51ef49f11 Reviewed-on: https://go-review.googlesource.com/c/go/+/590635 Auto-Submit: Robert Griesemer <gri@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Robert Griesemer <gri@google.com> |
|
|
|
2a7ca156b8 |
all: document legacy //go:linkname for final round of modules
Add linknames for most modules with ≥50 dependents. Add linknames for a few other modules that we know are important but are below 50. Remove linknames from badlinkname.go that do not merit inclusion (very small number of dependents). We can add them back later if the need arises. Fixes #67401. (For now.) Change-Id: I1e49fec0292265256044d64b1841d366c4106002 Reviewed-on: https://go-review.googlesource.com/c/go/+/587756 Auto-Submit: Russ Cox <rsc@golang.org> TryBot-Bypass: Russ Cox <rsc@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> |
|
|
|
bf91eb3a8b |
std: fix calls to Printf(s) with non-constant s
In all cases the intent was not to interpret s as a format string. In one case (go/types), this was a latent bug in production. (These were uncovered by a new check in vet's printf analyzer.) Updates #60529 Change-Id: I3e17af7e589be9aec1580783a1b1011c52ec494b Reviewed-on: https://go-review.googlesource.com/c/go/+/587855 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Russ Cox <rsc@golang.org> |
|
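The pitfall fixed by this commit can be illustrated with a small sketch. The helper names `formatUnsafe` and `formatSafe` are hypothetical and do not appear in the CL; the indirect `sprintf` variable is only there so the deliberately wrong call is not itself rejected by vet.

```go
package main

import "fmt"

// sprintf is an indirect reference to fmt.Sprintf (an assumption of this
// sketch) so that the intentionally incorrect call below is not flagged.
var sprintf = fmt.Sprintf

// formatUnsafe passes s as the format string, so any '%' in s is
// interpreted as a formatting verb.
func formatUnsafe(s string) string {
	return sprintf(s)
}

// formatSafe passes s as an argument to a constant format string, so s is
// emitted verbatim.
func formatSafe(s string) string {
	return fmt.Sprintf("%s", s)
}

func main() {
	s := "progress: 100%d"
	fmt.Println(formatUnsafe(s)) // progress: 100%!d(MISSING)
	fmt.Println(formatSafe(s))   // progress: 100%d
}
```

When the string is only meant to be printed, `fmt.Print(s)` is an equally valid fix.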
|
|
9a3ef86173 |
all: document legacy //go:linkname for modules with ≥5,000 dependents
For #67401. Change-Id: Ifea84af92017b405466937f50fb8f28e6893c8cb Reviewed-on: https://go-review.googlesource.com/c/go/+/587220 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org> |
|
|
|
b0b1d42db3 |
all: change from sort functions to slices functions where feasible
Doing this because the slices functions are slightly faster and slightly easier to use. It also removes one dependency layer. This CL does not change packages that are used during bootstrap, as the bootstrap compiler does not have the required slices functions. It does not change the go/scanner package because the ErrorList Len, Swap, and Less methods are part of the Go 1 API. Change-Id: If52899be791c829198e11d2408727720b91ebe8a Reviewed-on: https://go-review.googlesource.com/c/go/+/587655 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Commit-Queue: Ian Lance Taylor <iant@google.com> Reviewed-by: Damien Neil <dneil@google.com> |
|
|
|
587c3847da |
math/rand/v2: add ChaCha8.Read
Fixes #67059 Closes #67452 Closes #67498 Change-Id: I84eba2ed787a17e9d6aaad2a8a78596e3944909a Reviewed-on: https://go-review.googlesource.com/c/go/+/587280 Reviewed-by: Roland Shoemaker <roland@golang.org> Auto-Submit: Filippo Valsorda <filippo@golang.org> Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
|
|
|
cc673d2ec5 |
all: convert PPC64 CMPx ...,R0,... to CMPx Rx,$0
Cleanup all remaining trivial compares against $0 in ppc64x assembly. In math, SRD ...,Rx; CMP Rx, $0 is further simplified to SRDCC. Change-Id: Ia2bc204953e32f08ee142bfd06a91965f30f99b6 Reviewed-on: https://go-review.googlesource.com/c/go/+/587016 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> Run-TryBot: Paul Murphy <murp@ibm.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
|
|
|
bf0bbd5360 |
math/rand/v2: drop pointer receiver on zero-width type
Just a cleanup. Change-Id: Ibeb2c7d447c793086280e612fe5f0f7eeb863f71 Reviewed-on: https://go-review.googlesource.com/c/go/+/582875 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Damien Neil <dneil@google.com> Reviewed-by: Jorropo <jorropo.pgm@gmail.com> |
|
|
|
f726c8d0fd |
ppc64x: code cleanup in assembly files
Replacing Branch Conditional (BC) with its extended mnemonic form of BDNZ and BDZ. - BC 16, 0, target can be replaced by BDNZ target - BC 18, 0, target can be replaced by BDZ target Change-Id: I1259e207f2a40d0b72780d5421f7449ddc006dc5 Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-ppc64_power8,gotip-linux-ppc64le_power8,gotip-linux-ppc64le_power9,gotip-linux-ppc64le_power10 Reviewed-on: https://go-review.googlesource.com/c/go/+/585077 Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> |
|
|
|
41aab30bd2 |
all: add push linknames to allow legacy pull linknames
CL 585358 adds restrictions to disallow pull-only linknames (currently off by default). Currently, there are quite a few pull-only linknames in user code in the wild. In order not to break those, we add push linknames to allow them to be pulled. This CL includes linknames found in a large code corpus (thanks Matthew Dempsky and Michael Pratt for the analysis!) that are not currently linknamed. Updates #67401. Change-Id: I32f5fc0c7a6abbd7a11359a025cfa2bf458fe767 Reviewed-on: https://go-review.googlesource.com/c/go/+/586137 Reviewed-by: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
|
|
|
63a0a905fa |
math/rand/v2: use max builtin in tests
Change-Id: I6d0050319c66fb62c817206e646e1a9449dc444c Reviewed-on: https://go-review.googlesource.com/c/go/+/585715 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Robert Griesemer <gri@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Robert Griesemer <gri@google.com> |
|
|
|
509bbeb407 |
math/rand/v2, math/big: use internal/byteorder
Change-Id: Id07f16d14133ee539bc2880b39641c42418fa6e2
GitHub-Last-Rev:
|
|
|
|
9c4849bf20 |
math/rand/v2: add Uint
Uint was part of the approved proposal but was inadvertently left out of Go 1.22. Add for Go 1.23. Change-Id: Ifaf24447bd70c8524c2fd299eefdf4aa29e49e66 Reviewed-on: https://go-review.googlesource.com/c/go/+/583455 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Robert Griesemer <gri@google.com> |
|
|
|
330bc95093 |
math/big: improve use of addze in mulAddVWW on ppc64x
Improve the use of addze to avoid unnecessary register
moves on ppc64x.
goos: linux
goarch: ppc64le
pkg: math/big
cpu: POWER10
│ old.out │ new.out │
│ sec/op │ sec/op vs base │
MulAddVWW/1 4.524n ± 3% 4.248n ± 0% -6.10% (p=0.002 n=6)
MulAddVWW/2 5.634n ± 0% 5.283n ± 0% -6.24% (p=0.002 n=6)
MulAddVWW/3 6.406n ± 0% 5.918n ± 0% -7.63% (p=0.002 n=6)
MulAddVWW/4 6.484n ± 0% 5.859n ± 0% -9.64% (p=0.002 n=6)
MulAddVWW/5 7.363n ± 0% 6.766n ± 0% -8.11% (p=0.002 n=6)
MulAddVWW/10 10.920n ± 0% 9.856n ± 0% -9.75% (p=0.002 n=6)
MulAddVWW/100 83.46n ± 0% 66.95n ± 0% -19.78% (p=0.002 n=6)
MulAddVWW/1000 856.0n ± 0% 681.6n ± 0% -20.38% (p=0.002 n=6)
MulAddVWW/10000 8.589µ ± 1% 6.774µ ± 0% -21.14% (p=0.002 n=6)
MulAddVWW/100000 86.22µ ± 0% 67.71µ ± 43% -21.48% (p=0.065 n=6)
geomean 73.34n 63.62n -13.26%
Change-Id: I95d6ac49ff6b64aa678e6896f57af9d85c923aad
Reviewed-on: https://go-review.googlesource.com/c/go/+/579235
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
|
|
1843464f01 |
all: consistently use "IEEE 754" over "IEEE-754"
There is no hyphen between the organization and the number. For example, https://standards.ieee.org/ieee/754/6210/ shows the string "IEEE 754-2019" and not "IEEE-754-2019". This assists in searching for "IEEE 754" in documentation and not missing those using "IEEE-754". Change-Id: I9a50ede807984ff1e2f17390bc1039f6a5d162e5 Reviewed-on: https://go-review.googlesource.com/c/go/+/575438 Run-TryBot: Joseph Tsai <joetsai@digital-static.net> Reviewed-by: Robert Griesemer <gri@google.com> Auto-Submit: Joseph Tsai <joetsai@digital-static.net> TryBot-Result: Gopher Robot <gobot@golang.org> TryBot-Bypass: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> |
|
|
|
c841ba3a3e |
math/big: use built-in clear to simplify code
Change-Id: I07c3a498ce1e462c3d1703d77e7d7824e9334651
GitHub-Last-Rev:
|
|
|
|
806ea41fce |
math/rand, math/rand/v2: rename receiver variables
Rename receiver variables according to https://go.dev/wiki/CodeReviewComments#receiver-names. Change-Id: Ib8bc57cf6a680e5c75d7346b74e77847945f6939 Reviewed-on: https://go-review.googlesource.com/c/go/+/568635 Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com> |
|
|
|
ad377e906a |
math: add round assembly implementations on riscv64
goos: linux
goarch: riscv64
pkg: math
│ floor_old.bench │ floor_new.bench │
│ sec/op │ sec/op vs base │
Ceil 54.12n ± 0% 22.05n ± 0% -59.26% (p=0.000 n=10)
Floor 40.80n ± 0% 22.05n ± 0% -45.96% (p=0.000 n=10)
Round 20.73n ± 0% 20.74n ± 0% ~ (p=0.441 n=10)
RoundToEven 24.07n ± 0% 24.07n ± 0% ~ (p=1.000 n=10)
Trunc 38.73n ± 0% 22.05n ± 0% -43.07% (p=0.000 n=10)
geomean 33.58n 22.17n -33.98%
Change-Id: I24fb9e3bbf8146da253b6791b21377bea1afbd16
Reviewed-on: https://go-review.googlesource.com/c/go/+/504737
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: M Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: M Zhuo <mengzhuo1203@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
|
|
|
|
5c92f43c51 |
math/rand/v2: use a doc link for crypto/rand
It's easier to go look at its documentation when there's a link. Change-Id: Iad6c1aa1a3f4b9127dc526b4db473239329780d6 Reviewed-on: https://go-review.googlesource.com/c/go/+/563255 Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Than McIntosh <thanm@google.com> |
|
|
|
4fde3ef2ac |
math/big,crypto/internal/bigmod: unroll loop in addMulVVW for ppc64x
This updates the assembly implementation of AddMulVVW to
unroll the main loop to do 64 bytes at a time.
The code for addMulVVWx is based on the same code and has
also been updated to improve performance.
goos: linux
goarch: ppc64le
pkg: crypto/internal/bigmod
cpu: POWER10
│ bg.orig.out │ bg.out │
│ sec/op │ sec/op vs base │
ModAdd 116.3n ± 0% 116.9n ± 0% +0.52% (p=0.002 n=6)
ModSub 111.5n ± 0% 111.5n ± 0% 0.00% (p=0.273 n=6)
MontgomeryRepr 2.195µ ± 0% 1.944µ ± 0% -11.44% (p=0.002 n=6)
MontgomeryMul 2.195µ ± 0% 1.943µ ± 0% -11.48% (p=0.002 n=6)
ModMul 4.418µ ± 0% 3.900µ ± 0% -11.72% (p=0.002 n=6)
ExpBig 5.736m ± 0% 5.117m ± 0% -10.78% (p=0.002 n=6)
Exp 5.891m ± 0% 5.237m ± 0% -11.11% (p=0.002 n=6)
geomean 9.901µ 9.094µ -8.15%
goos: linux
goarch: ppc64le
pkg: math/big
cpu: POWER10
│ am.orig.out │ am.out │
│ sec/op │ sec/op vs base │
AddMulVVW/1 4.456n ± 1% 3.565n ± 0% -20.00% (p=0.002 n=6)
AddMulVVW/2 4.875n ± 1% 5.938n ± 1% +21.79% (p=0.002 n=6)
AddMulVVW/3 5.484n ± 0% 5.693n ± 0% +3.80% (p=0.002 n=6)
AddMulVVW/4 6.370n ± 0% 6.065n ± 0% -4.79% (p=0.002 n=6)
AddMulVVW/5 7.321n ± 0% 7.188n ± 0% -1.82% (p=0.002 n=6)
AddMulVVW/10 12.26n ± 8% 11.41n ± 0% -6.97% (p=0.002 n=6)
AddMulVVW/100 100.70n ± 0% 93.58n ± 0% -7.08% (p=0.002 n=6)
AddMulVVW/1000 938.6n ± 0% 845.5n ± 0% -9.92% (p=0.002 n=6)
AddMulVVW/10000 9.459µ ± 0% 8.415µ ± 0% -11.04% (p=0.002 n=6)
AddMulVVW/100000 94.57µ ± 0% 84.01µ ± 0% -11.16% (p=0.002 n=6)
geomean 75.17n 71.21n -5.27%
Change-Id: Idd79f5f02387564f4c2cc28d50b1c12bcd9a400f
Reviewed-on: https://go-review.googlesource.com/c/go/+/557915
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Paul Murphy <murp@ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
|
|
|
|
aba18d5b67 |
math/big: fix uint64 overflow in nat.mulRange
Compute median as a + (b-a)/2 instead of (a + b)/2. Add additional test cases. Fixes #65025. Change-Id: Ib716a1036c17f8f33f51e33cedab13512eb7e0be Reviewed-on: https://go-review.googlesource.com/c/go/+/554617 Reviewed-by: Robert Griesemer <gri@google.com> Auto-Submit: Robert Griesemer <gri@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Robert Griesemer <gri@google.com> Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> |
|
|
|
527829a7cb |
all: remove newline characters after return statements
This commit is aimed at improving the readability and consistency of the code base. Extraneous newline characters were present after some return statements, creating unnecessary separation in the code. Fixes #64610 Change-Id: Ic1b05bf11761c4dff22691c2f1c3755f66d341f7 Reviewed-on: https://go-review.googlesource.com/c/go/+/548316 Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> |
|
|
|
c29444ef39 |
math/rand, math/rand/v2: use ChaCha8 for global rand
Move ChaCha8 code into internal/chacha8rand and use it to implement runtime.rand, which is used for the unseeded global source for both math/rand and math/rand/v2. This also affects the calculation of the start point for iteration over very very large maps (when the 32-bit fastrand is not big enough).
The benefit is that misuse of the global random number generators in math/rand and math/rand/v2 in contexts where non-predictable randomness is important for security reasons is no longer a security problem, removing a common mistake among programmers who are unaware of the different kinds of randomness.
The cost is an extra 304 bytes per thread stored in the m struct plus 2-3ns more per random uint64 due to the more sophisticated algorithm. Using PCG looks like it would cost about the same, although I haven't benchmarked that.
Before this, the math/rand and math/rand/v2 global generator was wyrand (https://github.com/wangyi-fudan/wyhash). For math/rand, using wyrand instead of the Mitchell/Reeds/Thompson ALFG was justifiable, since the latter was not any better. But for math/rand/v2, the global generator really should be at least as good as one of the well-studied, specific algorithms provided directly by the package, and it's not. (Wyrand is still reasonable for scheduling and cache decisions.)
Good randomness does have a cost: about twice wyrand.
Also rationalize the various runtime rand references.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.amd64 │ 5cf807d1ea.amd64 │
│ sec/op │ sec/op vs base │
ChaCha8-32 1.862n ± 2% 1.861n ± 2% ~ (p=0.825 n=20)
PCG_DXSM-32 1.471n ± 1% 1.460n ± 2% ~ (p=0.153 n=20)
SourceUint64-32 1.636n ± 2% 1.582n ± 1% -3.30% (p=0.000 n=20)
GlobalInt64-32 2.087n ± 1% 3.663n ± 1% +75.54% (p=0.000 n=20)
GlobalInt64Parallel-32 0.1042n ± 1% 0.2026n ± 1% +94.48% (p=0.000 n=20)
GlobalUint64-32 2.263n ± 2% 3.724n ± 1% +64.57% (p=0.000 n=20)
GlobalUint64Parallel-32 0.1019n ± 1% 0.1973n ± 1% +93.67% (p=0.000 n=20)
Int64-32 1.771n ± 1% 1.774n ± 1% ~ (p=0.449 n=20)
Uint64-32 1.863n ± 2% 1.866n ± 1% ~ (p=0.364 n=20)
GlobalIntN1000-32 3.134n ± 3% 4.730n ± 2% +50.95% (p=0.000 n=20)
IntN1000-32 2.489n ± 1% 2.489n ± 1% ~ (p=0.683 n=20)
Int64N1000-32 2.521n ± 1% 2.516n ± 1% ~ (p=0.394 n=20)
Int64N1e8-32 2.479n ± 1% 2.478n ± 2% ~ (p=0.743 n=20)
Int64N1e9-32 2.530n ± 2% 2.514n ± 2% ~ (p=0.193 n=20)
Int64N2e9-32 2.501n ± 1% 2.494n ± 1% ~ (p=0.616 n=20)
Int64N1e18-32 3.227n ± 1% 3.205n ± 1% ~ (p=0.101 n=20)
Int64N2e18-32 3.647n ± 1% 3.599n ± 1% ~ (p=0.019 n=20)
Int64N4e18-32 5.135n ± 1% 5.069n ± 2% ~ (p=0.034 n=20)
Int32N1000-32 2.657n ± 1% 2.637n ± 1% ~ (p=0.180 n=20)
Int32N1e8-32 2.636n ± 1% 2.636n ± 1% ~ (p=0.763 n=20)
Int32N1e9-32 2.660n ± 2% 2.638n ± 1% ~ (p=0.358 n=20)
Int32N2e9-32 2.662n ± 2% 2.618n ± 2% ~ (p=0.064 n=20)
Float32-32 2.272n ± 2% 2.239n ± 2% ~ (p=0.194 n=20)
Float64-32 2.272n ± 1% 2.286n ± 2% ~ (p=0.763 n=20)
ExpFloat64-32 3.762n ± 1% 3.744n ± 1% ~ (p=0.171 n=20)
NormFloat64-32 3.706n ± 1% 3.655n ± 2% ~ (p=0.066 n=20)
Perm3-32 32.93n ± 3% 34.62n ± 1% +5.13% (p=0.000 n=20)
Perm30-32 202.9n ± 1% 204.0n ± 1% ~ (p=0.482 n=20)
Perm30ViaShuffle-32 115.0n ± 1% 114.9n ± 1% ~ (p=0.358 n=20)
ShuffleOverhead-32 112.8n ± 1% 112.7n ± 1% ~ (p=0.692 n=20)
Concurrent-32 2.107n ± 0% 3.725n ± 1% +76.75% (p=0.000 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
│ bbb48afeb7.arm64 │ 5cf807d1ea.arm64 │
│ sec/op │ sec/op vs base │
ChaCha8-8 2.480n ± 0% 2.429n ± 0% -2.04% (p=0.000 n=20)
PCG_DXSM-8 2.531n ± 0% 2.530n ± 0% ~ (p=0.877 n=20)
SourceUint64-8 2.534n ± 0% 2.533n ± 0% ~ (p=0.732 n=20)
GlobalInt64-8 2.172n ± 1% 4.794n ± 0% +120.67% (p=0.000 n=20)
GlobalInt64Parallel-8 0.4320n ± 0% 0.9605n ± 0% +122.32% (p=0.000 n=20)
GlobalUint64-8 2.182n ± 0% 4.770n ± 0% +118.58% (p=0.000 n=20)
GlobalUint64Parallel-8 0.4307n ± 0% 0.9583n ± 0% +122.51% (p=0.000 n=20)
Int64-8 4.107n ± 0% 4.104n ± 0% ~ (p=0.416 n=20)
Uint64-8 4.080n ± 0% 4.080n ± 0% ~ (p=0.052 n=20)
GlobalIntN1000-8 2.814n ± 2% 5.643n ± 0% +100.50% (p=0.000 n=20)
IntN1000-8 4.141n ± 0% 4.139n ± 0% ~ (p=0.140 n=20)
Int64N1000-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.313 n=20)
Int64N1e8-8 4.140n ± 0% 4.139n ± 0% ~ (p=0.103 n=20)
Int64N1e9-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.761 n=20)
Int64N2e9-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.636 n=20)
Int64N1e18-8 5.266n ± 0% 5.326n ± 1% +1.14% (p=0.001 n=20)
Int64N2e18-8 6.052n ± 0% 6.167n ± 0% +1.90% (p=0.000 n=20)
Int64N4e18-8 8.826n ± 0% 9.051n ± 0% +2.55% (p=0.000 n=20)
Int32N1000-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N1e8-8 4.126n ± 0% 4.131n ± 0% +0.12% (p=0.000 n=20)
Int32N1e9-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N2e9-8 4.132n ± 0% 4.131n ± 0% ~ (p=0.017 n=20)
Float32-8 4.109n ± 0% 4.105n ± 0% ~ (p=0.379 n=20)
Float64-8 4.107n ± 0% 4.106n ± 0% ~ (p=0.867 n=20)
ExpFloat64-8 5.339n ± 0% 5.383n ± 0% +0.82% (p=0.000 n=20)
NormFloat64-8 5.735n ± 0% 5.737n ± 1% ~ (p=0.856 n=20)
Perm3-8 26.65n ± 0% 26.80n ± 1% +0.58% (p=0.000 n=20)
Perm30-8 194.8n ± 1% 197.0n ± 0% +1.18% (p=0.000 n=20)
Perm30ViaShuffle-8 156.6n ± 0% 157.6n ± 1% +0.61% (p=0.000 n=20)
ShuffleOverhead-8 124.9n ± 0% 125.5n ± 0% +0.52% (p=0.000 n=20)
Concurrent-8 2.434n ± 3% 5.066n ± 0% +108.09% (p=0.000 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.386 │ 5cf807d1ea.386 │
│ sec/op │ sec/op vs base │
ChaCha8-32 11.295n ± 1% 4.748n ± 2% -57.96% (p=0.000 n=20)
PCG_DXSM-32 7.693n ± 1% 7.738n ± 2% ~ (p=0.542 n=20)
SourceUint64-32 7.658n ± 2% 7.622n ± 2% ~ (p=0.344 n=20)
GlobalInt64-32 3.473n ± 2% 7.526n ± 2% +116.73% (p=0.000 n=20)
GlobalInt64Parallel-32 0.3198n ± 0% 0.5444n ± 0% +70.22% (p=0.000 n=20)
GlobalUint64-32 3.612n ± 0% 7.575n ± 1% +109.69% (p=0.000 n=20)
GlobalUint64Parallel-32 0.3168n ± 0% 0.5403n ± 0% +70.51% (p=0.000 n=20)
Int64-32 7.673n ± 2% 7.789n ± 1% ~ (p=0.122 n=20)
Uint64-32 7.773n ± 1% 7.827n ± 2% ~ (p=0.920 n=20)
GlobalIntN1000-32 6.268n ± 1% 9.581n ± 1% +52.87% (p=0.000 n=20)
IntN1000-32 10.33n ± 2% 10.45n ± 1% ~ (p=0.233 n=20)
Int64N1000-32 10.98n ± 2% 11.01n ± 1% ~ (p=0.401 n=20)
Int64N1e8-32 11.19n ± 2% 10.97n ± 1% ~ (p=0.033 n=20)
Int64N1e9-32 11.06n ± 1% 11.08n ± 1% ~ (p=0.498 n=20)
Int64N2e9-32 11.10n ± 1% 11.01n ± 2% ~ (p=0.995 n=20)
Int64N1e18-32 15.23n ± 2% 15.04n ± 1% ~ (p=0.973 n=20)
Int64N2e18-32 15.89n ± 1% 15.85n ± 1% ~ (p=0.409 n=20)
Int64N4e18-32 18.96n ± 2% 19.34n ± 2% ~ (p=0.048 n=20)
Int32N1000-32 10.46n ± 2% 10.44n ± 2% ~ (p=0.480 n=20)
Int32N1e8-32 10.46n ± 2% 10.49n ± 2% ~ (p=0.951 n=20)
Int32N1e9-32 10.28n ± 2% 10.26n ± 1% ~ (p=0.431 n=20)
Int32N2e9-32 10.50n ± 2% 10.44n ± 2% ~ (p=0.249 n=20)
Float32-32 13.80n ± 2% 13.80n ± 2% ~ (p=0.751 n=20)
Float64-32 23.55n ± 2% 23.87n ± 0% ~ (p=0.408 n=20)
ExpFloat64-32 15.36n ± 1% 15.29n ± 2% ~ (p=0.316 n=20)
NormFloat64-32 13.57n ± 1% 13.79n ± 1% +1.66% (p=0.005 n=20)
Perm3-32 45.70n ± 2% 46.99n ± 2% +2.81% (p=0.001 n=20)
Perm30-32 399.0n ± 1% 403.8n ± 1% +1.19% (p=0.006 n=20)
Perm30ViaShuffle-32 349.0n ± 1% 350.4n ± 1% ~ (p=0.909 n=20)
ShuffleOverhead-32 322.3n ± 1% 323.8n ± 1% ~ (p=0.410 n=20)
Concurrent-32 3.331n ± 1% 7.312n ± 1% +119.50% (p=0.000 n=20)
For #61716.
Change-Id: Ibdddeed85c34d9ae397289dc899e04d4845f9ed2 Reviewed-on: https://go-review.googlesource.com/c/go/+/516860 Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Filippo Valsorda <filippo@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
|
|
|
d92434935f |
math/rand/v2: add ChaCha8
This is a replay of CL 516859, after its rollback in CL 543895, with big-endian systems fixed and the tests disabled on RISC-V since the compiler is broken there (#64285). ChaCha8 provides a cryptographically strong generator alongside PCG, so that people who want stronger randomness have access to that. On systems with 128-bit vector math assembly (amd64 and arm64), ChaCha8 runs at about the same speed as PCG (25% slower on amd64, 2% faster on arm64). Fixes #64284. Change-Id: I6290bb8ace28e1aff9a61f805dbe380ccdf25b94 Reviewed-on: https://go-review.googlesource.com/c/go/+/546020 Reviewed-by: Filippo Valsorda <filippo@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
|
|
|
82fc03f9c9 |
Revert "math/rand/v2: add ChaCha8"
This reverts commit
|
|
|
|
ee6b34797b |
all: add floating point option for ARM targets
This change introduces new options to set the floating point mode on ARM targets. The GOARM version number can optionally be followed by ',hardfloat' or ',softfloat' to select whether to use hardware instructions or software emulation for floating point computations, respectively. For example, GOARM=7,softfloat. Previously, software floating point support was limited to GOARM=5. With these options, software floating point is now extended to all ARM versions, including GOARM=6 and 7. This change also extends hardware floating point to GOARM=5. GOARM=5 defaults to softfloat and GOARM=6 and 7 default to hardfloat. For #61588 Change-Id: I23dc86fbd0733b262004a2ed001e1032cf371e94 Reviewed-on: https://go-review.googlesource.com/c/go/+/514907 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> |
|
|
|
6382893890 |
math/rand/v2: add ChaCha8
ChaCha8 provides a cryptographically strong generator
alongside PCG, so that people who want stronger randomness
have access to that. On systems with 128-bit vector math
assembly (amd64 and arm64), ChaCha8 runs at about the same
speed as PCG (25% slower on amd64, 2% faster on arm64).
Obviously all the claimed benchmark variation other than the
new ChaCha8 benchmark is a lie.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ afa459a2f0.amd64 │ bbb48afeb7.amd64 │
│ sec/op │ sec/op vs base │
PCG_DXSM-32 1.488n ± 2% 1.492n ± 2% ~ (p=0.309 n=20)
ChaCha8-32 1.861n ± 2%
SourceUint64-32 1.450n ± 3% 1.590n ± 2% +9.69% (p=0.000 n=20)
GlobalInt64-32 2.067n ± 2% 2.061n ± 1% ~ (p=0.952 n=20)
GlobalInt64Parallel-32 0.1044n ± 2% 0.1041n ± 1% ~ (p=0.498 n=20)
GlobalUint64-32 2.085n ± 0% 2.256n ± 2% +8.23% (p=0.000 n=20)
GlobalUint64Parallel-32 0.1008n ± 1% 0.1018n ± 1% ~ (p=0.041 n=20)
Int64-32 1.779n ± 1% 1.779n ± 1% ~ (p=0.410 n=20)
Uint64-32 1.854n ± 2% 1.882n ± 1% ~ (p=0.044 n=20)
GlobalIntN1000-32 3.140n ± 3% 3.115n ± 3% ~ (p=0.673 n=20)
IntN1000-32 2.496n ± 1% 2.509n ± 1% ~ (p=0.171 n=20)
Int64N1000-32 2.510n ± 2% 2.493n ± 1% ~ (p=0.804 n=20)
Int64N1e8-32 2.471n ± 2% 2.521n ± 1% +1.98% (p=0.003 n=20)
Int64N1e9-32 2.488n ± 2% 2.506n ± 1% ~ (p=0.663 n=20)
Int64N2e9-32 2.478n ± 2% 2.482n ± 2% ~ (p=0.533 n=20)
Int64N1e18-32 3.088n ± 1% 3.216n ± 1% +4.15% (p=0.000 n=20)
Int64N2e18-32 3.493n ± 1% 3.635n ± 2% +4.05% (p=0.000 n=20)
Int64N4e18-32 5.060n ± 2% 5.122n ± 1% +1.22% (p=0.000 n=20)
Int32N1000-32 2.620n ± 1% 2.672n ± 1% +2.00% (p=0.002 n=20)
Int32N1e8-32 2.652n ± 0% 2.646n ± 1% ~ (p=0.743 n=20)
Int32N1e9-32 2.644n ± 1% 2.660n ± 2% ~ (p=0.163 n=20)
Int32N2e9-32 2.619n ± 2% 2.652n ± 1% ~ (p=0.132 n=20)
Float32-32 2.261n ± 1% 2.267n ± 1% ~ (p=0.516 n=20)
Float64-32 2.241n ± 2% 2.276n ± 1% ~ (p=0.080 n=20)
ExpFloat64-32 3.716n ± 1% 3.779n ± 1% +1.68% (p=0.007 n=20)
NormFloat64-32 3.718n ± 1% 3.747n ± 1% ~ (p=0.011 n=20)
Perm3-32 34.11n ± 2% 34.23n ± 2% ~ (p=0.779 n=20)
Perm30-32 200.6n ± 0% 202.3n ± 2% ~ (p=0.055 n=20)
Perm30ViaShuffle-32 109.7n ± 1% 115.5n ± 2% +5.34% (p=0.000 n=20)
ShuffleOverhead-32 107.2n ± 1% 113.3n ± 1% +5.74% (p=0.000 n=20)
Concurrent-32 2.108n ± 6% 2.107n ± 1% ~ (p=0.448 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
│ afa459a2f0.arm64 │ bbb48afeb7.arm64 │
│ sec/op │ sec/op vs base │
PCG_DXSM-8 2.531n ± 0% 2.529n ± 0% ~ (p=0.586 n=20)
ChaCha8-8 2.480n ± 0%
SourceUint64-8 2.531n ± 0% 2.534n ± 0% ~ (p=0.227 n=20)
GlobalInt64-8 2.177n ± 1% 2.173n ± 1% ~ (p=0.733 n=20)
GlobalInt64Parallel-8 0.4319n ± 0% 0.4304n ± 0% -0.32% (p=0.003 n=20)
GlobalUint64-8 2.185n ± 1% 2.185n ± 0% ~ (p=0.541 n=20)
GlobalUint64Parallel-8 0.4295n ± 1% 0.4294n ± 0% ~ (p=0.203 n=20)
Int64-8 4.104n ± 0% 4.107n ± 0% ~ (p=0.193 n=20)
Uint64-8 4.080n ± 0% 4.081n ± 0% ~ (p=0.053 n=20)
GlobalIntN1000-8 2.814n ± 1% 2.814n ± 0% ~ (p=0.879 n=20)
IntN1000-8 4.140n ± 0% 4.141n ± 0% ~ (p=0.428 n=20)
Int64N1000-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.114 n=20)
Int64N1e8-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.898 n=20)
Int64N1e9-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.593 n=20)
Int64N2e9-8 4.140n ± 0% 4.139n ± 0% ~ (p=0.158 n=20)
Int64N1e18-8 5.273n ± 0% 5.274n ± 0% ~ (p=0.308 n=20)
Int64N2e18-8 6.059n ± 0% 6.058n ± 0% ~ (p=0.053 n=20)
Int64N4e18-8 8.803n ± 0% 8.800n ± 0% ~ (p=0.673 n=20)
Int32N1000-8 4.131n ± 0% 4.131n ± 0% ~ (p=0.342 n=20)
Int32N1e8-8 4.131n ± 0% 4.131n ± 0% ~ (p=0.091 n=20)
Int32N1e9-8 4.131n ± 0% 4.131n ± 0% ~ (p=0.273 n=20)
Int32N2e9-8 4.131n ± 0% 4.131n ± 0% ~ (p=0.425 n=20)
Float32-8 4.110n ± 0% 4.112n ± 0% ~ (p=0.203 n=20)
Float64-8 4.104n ± 0% 4.106n ± 0% ~ (p=0.409 n=20)
ExpFloat64-8 5.338n ± 0% 5.339n ± 0% ~ (p=0.037 n=20)
NormFloat64-8 5.731n ± 0% 5.733n ± 0% ~ (p=0.692 n=20)
Perm3-8 26.62n ± 0% 26.65n ± 0% +0.09% (p=0.000 n=20)
Perm30-8 194.6n ± 2% 194.9n ± 0% ~ (p=0.141 n=20)
Perm30ViaShuffle-8 156.4n ± 0% 156.5n ± 0% +0.06% (p=0.000 n=20)
ShuffleOverhead-8 125.8n ± 0% 125.0n ± 0% -0.64% (p=0.000 n=20)
Concurrent-8 2.654n ± 6% 2.441n ± 6% -8.06% (p=0.009 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ afa459a2f0.386 │ bbb48afeb7.386 │
│ sec/op │ sec/op vs base │
PCG_DXSM-32 7.793n ± 2% 7.647n ± 1% ~ (p=0.021 n=20)
ChaCha8-32 11.48n ± 2%
SourceUint64-32 7.680n ± 1% 7.714n ± 1% ~ (p=0.713 n=20)
GlobalInt64-32 3.474n ± 3% 3.491n ± 28% ~ (p=0.337 n=20)
GlobalInt64Parallel-32 0.3253n ± 0% 0.3194n ± 0% -1.81% (p=0.000 n=20)
GlobalUint64-32 3.433n ± 2% 3.610n ± 2% +5.14% (p=0.000 n=20)
GlobalUint64Parallel-32 0.3156n ± 0% 0.3164n ± 0% ~ (p=0.073 n=20)
Int64-32 7.707n ± 1% 7.824n ± 0% +1.52% (p=0.005 n=20)
Uint64-32 7.714n ± 1% 7.732n ± 2% ~ (p=0.441 n=20)
GlobalIntN1000-32 6.236n ± 1% 6.176n ± 2% ~ (p=0.499 n=20)
IntN1000-32 10.41n ± 1% 10.31n ± 2% ~ (p=0.782 n=20)
Int64N1000-32 10.97n ± 2% 11.22n ± 2% +2.19% (p=0.002 n=20)
Int64N1e8-32 10.98n ± 1% 11.07n ± 1% ~ (p=0.056 n=20)
Int64N1e9-32 10.95n ± 0% 11.15n ± 2% ~ (p=0.016 n=20)
Int64N2e9-32 11.11n ± 1% 11.00n ± 1% ~ (p=0.654 n=20)
Int64N1e18-32 15.18n ± 2% 14.97n ± 2% ~ (p=0.387 n=20)
Int64N2e18-32 15.61n ± 1% 15.91n ± 1% +1.92% (p=0.003 n=20)
Int64N4e18-32 19.23n ± 2% 18.98n ± 1% ~ (p=1.000 n=20)
Int32N1000-32 10.35n ± 1% 10.31n ± 2% ~ (p=0.081 n=20)
Int32N1e8-32 10.33n ± 1% 10.38n ± 1% ~ (p=0.335 n=20)
Int32N1e9-32 10.35n ± 1% 10.37n ± 1% ~ (p=0.497 n=20)
Int32N2e9-32 10.35n ± 1% 10.41n ± 1% ~ (p=0.605 n=20)
Float32-32 13.57n ± 1% 13.78n ± 2% ~ (p=0.047 n=20)
Float64-32 22.95n ± 4% 23.43n ± 3% ~ (p=0.218 n=20)
ExpFloat64-32 15.23n ± 2% 15.46n ± 1% ~ (p=0.095 n=20)
NormFloat64-32 13.78n ± 1% 13.73n ± 2% ~ (p=0.031 n=20)
Perm3-32 46.62n ± 2% 47.46n ± 2% +1.82% (p=0.004 n=20)
Perm30-32 400.7n ± 1% 403.5n ± 1% ~ (p=0.098 n=20)
Perm30ViaShuffle-32 350.5n ± 1% 348.1n ± 2% ~ (p=0.703 n=20)
ShuffleOverhead-32 326.0n ± 2% 326.2n ± 2% ~ (p=0.440 n=20)
Concurrent-32 3.290n ± 0% 3.297n ± 4% ~ (p=0.189 n=20)
For #61716.
Change-Id: Id2a7e1c1db0beb81f563faaefba65fe292497269
Reviewed-on: https://go-review.googlesource.com/c/go/+/516859
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
|
|
|
|
22278e3835 |
math/big: faster FloatPrec implementation
Based on observations by Cherry Mui (see comments in CL 539299).
Add new benchmark FloatPrecMixed.
For #50489.
name old time/op new time/op delta
FloatPrecExact/1-12 129ns ± 0% 105ns ±11% -18.51% (p=0.008 n=5+5)
FloatPrecExact/10-12 317ns ± 2% 283ns ± 1% -10.65% (p=0.008 n=5+5)
FloatPrecExact/100-12 1.80µs ±15% 1.35µs ± 0% -25.09% (p=0.008 n=5+5)
FloatPrecExact/1000-12 9.48µs ±14% 8.32µs ± 1% -12.25% (p=0.008 n=5+5)
FloatPrecExact/10000-12 195µs ± 1% 191µs ± 0% -1.73% (p=0.008 n=5+5)
FloatPrecExact/100000-12 7.31ms ± 1% 7.24ms ± 1% -0.99% (p=0.032 n=5+5)
FloatPrecExact/1000000-12 301ms ± 3% 302ms ± 2% ~ (p=0.841 n=5+5)
FloatPrecMixed/1-12 141ns ± 0% 110ns ± 3% -21.88% (p=0.008 n=5+5)
FloatPrecMixed/10-12 767ns ± 0% 739ns ± 5% ~ (p=0.151 n=5+5)
FloatPrecMixed/100-12 4.93µs ± 2% 3.73µs ± 1% -24.33% (p=0.008 n=5+5)
FloatPrecMixed/1000-12 90.9µs ±11% 70.3µs ± 2% -22.66% (p=0.008 n=5+5)
FloatPrecMixed/10000-12 2.30ms ± 0% 1.92ms ± 1% -16.41% (p=0.008 n=5+5)
FloatPrecMixed/100000-12 87.1ms ± 1% 68.5ms ± 1% -21.42% (p=0.008 n=5+5)
FloatPrecMixed/1000000-12 4.09s ± 1% 3.58s ± 1% -12.35% (p=0.008 n=5+5)
FloatPrecInexact/1-12 92.4ns ± 0% 66.1ns ± 5% -28.41% (p=0.008 n=5+5)
FloatPrecInexact/10-12 118ns ± 0% 91ns ± 1% -23.14% (p=0.016 n=5+4)
FloatPrecInexact/100-12 310ns ±10% 244ns ± 1% -21.32% (p=0.008 n=5+5)
FloatPrecInexact/1000-12 952ns ± 1% 828ns ± 1% -12.96% (p=0.016 n=4+5)
FloatPrecInexact/10000-12 6.71µs ± 1% 6.25µs ± 3% -6.83% (p=0.008 n=5+5)
FloatPrecInexact/100000-12 66.1µs ± 1% 61.2µs ± 1% -7.45% (p=0.008 n=5+5)
FloatPrecInexact/1000000-12 635µs ± 2% 584µs ± 1% -7.97% (p=0.008 n=5+5)
Change-Id: I3aa67b49a042814a3286ee8306fbed36709cbb6e
Reviewed-on: https://go-review.googlesource.com/c/go/+/542756
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
|
|
|
|
e14b96cb51 |
math/big: update comment in the implementation of FloatPrec
Follow-up on CL 539299, which failed to incorporate the updated comment requested in feedback on that CL. For #50489. Change-Id: Ib035400038b1d11532f62055b5cdb382ab75654c Reviewed-on: https://go-review.googlesource.com/c/go/+/542115 Run-TryBot: Robert Griesemer <gri@google.com> Auto-Submit: Robert Griesemer <gri@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> |
|
|
|
dd88f23a20 |
math/big: implement Rat.FloatPrec
goos: darwin
goarch: amd64
pkg: math/big
cpu: Intel(R) Core(TM) i7-8700B CPU @ 3.20GHz
BenchmarkFloatPrecExact/1-12 9380685 125.0 ns/op
BenchmarkFloatPrecExact/10-12 3780493 321.2 ns/op
BenchmarkFloatPrecExact/100-12 698272 1679 ns/op
BenchmarkFloatPrecExact/1000-12 117975 9113 ns/op
BenchmarkFloatPrecExact/10000-12 5913 192768 ns/op
BenchmarkFloatPrecExact/100000-12 164 7401817 ns/op
BenchmarkFloatPrecExact/1000000-12 4 293568523 ns/op
BenchmarkFloatPrecInexact/1-12 12836612 91.26 ns/op
BenchmarkFloatPrecInexact/10-12 10144908 114.9 ns/op
BenchmarkFloatPrecInexact/100-12 4121931 297.3 ns/op
BenchmarkFloatPrecInexact/1000-12 1275886 927.7 ns/op
BenchmarkFloatPrecInexact/10000-12 170392 6546 ns/op
BenchmarkFloatPrecInexact/100000-12 18307 65232 ns/op
BenchmarkFloatPrecInexact/1000000-12 1701 621412 ns/op
Fixes #50489.
Change-Id: Ic952f00e35d42f2470ecab53df712721997eac94
Reviewed-on: https://go-review.googlesource.com/c/go/+/539299
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com> |
|
|
|
8abde68f19 |
math/rand/v2: delete Mitchell/Reeds source
These slowdowns are because we are now using PCG instead of the
Mitchell/Reeds LFSR for the benchmarks. PCG is in fact a bit slower
(but generates statistically far better random numbers).
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 01ff938549.amd64 │ afa459a2f0.amd64 │
│ sec/op │ sec/op vs base │
PCG_DXSM-32 1.490n ± 0% 1.488n ± 2% ~ (p=0.408 n=20)
SourceUint64-32 1.352n ± 1% 1.450n ± 3% +7.21% (p=0.000 n=20)
GlobalInt64-32 2.083n ± 0% 2.067n ± 2% ~ (p=0.223 n=20)
GlobalInt64Parallel-32 0.1035n ± 1% 0.1044n ± 2% ~ (p=0.010 n=20)
GlobalUint64-32 2.038n ± 1% 2.085n ± 0% +2.28% (p=0.000 n=20)
GlobalUint64Parallel-32 0.1006n ± 1% 0.1008n ± 1% ~ (p=0.733 n=20)
Int64-32 1.687n ± 2% 1.779n ± 1% +5.48% (p=0.000 n=20)
Uint64-32 1.674n ± 2% 1.854n ± 2% +10.69% (p=0.000 n=20)
GlobalIntN1000-32 3.135n ± 1% 3.140n ± 3% ~ (p=0.794 n=20)
IntN1000-32 2.478n ± 1% 2.496n ± 1% +0.73% (p=0.006 n=20)
Int64N1000-32 2.455n ± 1% 2.510n ± 2% +2.22% (p=0.000 n=20)
Int64N1e8-32 2.467n ± 2% 2.471n ± 2% ~ (p=0.050 n=20)
Int64N1e9-32 2.454n ± 1% 2.488n ± 2% +1.39% (p=0.000 n=20)
Int64N2e9-32 2.482n ± 1% 2.478n ± 2% ~ (p=0.066 n=20)
Int64N1e18-32 3.349n ± 2% 3.088n ± 1% -7.81% (p=0.000 n=20)
Int64N2e18-32 3.537n ± 1% 3.493n ± 1% -1.24% (p=0.002 n=20)
Int64N4e18-32 4.917n ± 0% 5.060n ± 2% +2.91% (p=0.000 n=20)
Int32N1000-32 2.386n ± 1% 2.620n ± 1% +9.76% (p=0.000 n=20)
Int32N1e8-32 2.366n ± 1% 2.652n ± 0% +12.11% (p=0.000 n=20)
Int32N1e9-32 2.355n ± 2% 2.644n ± 1% +12.32% (p=0.000 n=20)
Int32N2e9-32 2.371n ± 1% 2.619n ± 2% +10.48% (p=0.000 n=20)
Float32-32 2.245n ± 2% 2.261n ± 1% ~ (p=0.625 n=20)
Float64-32 2.235n ± 1% 2.241n ± 2% ~ (p=0.393 n=20)
ExpFloat64-32 3.813n ± 3% 3.716n ± 1% -2.53% (p=0.000 n=20)
NormFloat64-32 3.652n ± 2% 3.718n ± 1% +1.79% (p=0.006 n=20)
Perm3-32 33.12n ± 3% 34.11n ± 2% ~ (p=0.021 n=20)
Perm30-32 205.1n ± 1% 200.6n ± 0% -2.17% (p=0.000 n=20)
Perm30ViaShuffle-32 110.8n ± 1% 109.7n ± 1% -0.99% (p=0.002 n=20)
ShuffleOverhead-32 113.0n ± 1% 107.2n ± 1% -5.09% (p=0.000 n=20)
Concurrent-32 2.100n ± 0% 2.108n ± 6% ~ (p=0.103 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
│ 01ff938549.arm64 │ afa459a2f0.arm64 │
│ sec/op │ sec/op vs base │
PCG_DXSM-8 2.531n ± 0% 2.531n ± 0% ~ (p=0.763 n=20)
SourceUint64-8 2.258n ± 1% 2.531n ± 0% +12.09% (p=0.000 n=20)
GlobalInt64-8 2.167n ± 0% 2.177n ± 1% ~ (p=0.213 n=20)
GlobalInt64Parallel-8 0.4310n ± 0% 0.4319n ± 0% ~ (p=0.027 n=20)
GlobalUint64-8 2.182n ± 1% 2.185n ± 1% ~ (p=0.683 n=20)
GlobalUint64Parallel-8 0.4297n ± 0% 0.4295n ± 1% ~ (p=0.941 n=20)
Int64-8 2.472n ± 1% 4.104n ± 0% +66.00% (p=0.000 n=20)
Uint64-8 2.449n ± 1% 4.080n ± 0% +66.60% (p=0.000 n=20)
GlobalIntN1000-8 2.814n ± 2% 2.814n ± 1% ~ (p=0.972 n=20)
IntN1000-8 2.998n ± 2% 4.140n ± 0% +38.09% (p=0.000 n=20)
Int64N1000-8 2.949n ± 2% 4.139n ± 0% +40.35% (p=0.000 n=20)
Int64N1e8-8 2.953n ± 2% 4.140n ± 0% +40.22% (p=0.000 n=20)
Int64N1e9-8 2.950n ± 0% 4.139n ± 0% +40.32% (p=0.000 n=20)
Int64N2e9-8 2.946n ± 2% 4.140n ± 0% +40.53% (p=0.000 n=20)
Int64N1e18-8 3.779n ± 1% 5.273n ± 0% +39.52% (p=0.000 n=20)
Int64N2e18-8 4.370n ± 1% 6.059n ± 0% +38.65% (p=0.000 n=20)
Int64N4e18-8 6.544n ± 1% 8.803n ± 0% +34.52% (p=0.000 n=20)
Int32N1000-8 2.950n ± 0% 4.131n ± 0% +40.06% (p=0.000 n=20)
Int32N1e8-8 2.950n ± 2% 4.131n ± 0% +40.03% (p=0.000 n=20)
Int32N1e9-8 2.951n ± 2% 4.131n ± 0% +39.99% (p=0.000 n=20)
Int32N2e9-8 2.950n ± 2% 4.131n ± 0% +40.03% (p=0.000 n=20)
Float32-8 3.441n ± 0% 4.110n ± 0% +19.44% (p=0.000 n=20)
Float64-8 3.442n ± 0% 4.104n ± 0% +19.24% (p=0.000 n=20)
ExpFloat64-8 4.481n ± 0% 5.338n ± 0% +19.11% (p=0.000 n=20)
NormFloat64-8 4.725n ± 0% 5.731n ± 0% +21.28% (p=0.000 n=20)
Perm3-8 26.55n ± 0% 26.62n ± 0% +0.28% (p=0.000 n=20)
Perm30-8 181.9n ± 0% 194.6n ± 2% +6.98% (p=0.000 n=20)
Perm30ViaShuffle-8 142.9n ± 0% 156.4n ± 0% +9.45% (p=0.000 n=20)
ShuffleOverhead-8 120.8n ± 2% 125.8n ± 0% +4.10% (p=0.000 n=20)
Concurrent-8 2.421n ± 6% 2.654n ± 6% +9.67% (p=0.002 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 01ff938549.386 │ afa459a2f0.386 │
│ sec/op │ sec/op vs base │
PCG_DXSM-32 7.613n ± 1% 7.793n ± 2% +2.38% (p=0.000 n=20)
SourceUint64-32 2.069n ± 0% 7.680n ± 1% +271.19% (p=0.000 n=20)
GlobalInt64-32 3.456n ± 1% 3.474n ± 3% ~ (p=0.654 n=20)
GlobalInt64Parallel-32 0.3252n ± 0% 0.3253n ± 0% ~ (p=0.952 n=20)
GlobalUint64-32 3.573n ± 1% 3.433n ± 2% -3.92% (p=0.000 n=20)
GlobalUint64Parallel-32 0.3159n ± 0% 0.3156n ± 0% ~ (p=0.223 n=20)
Int64-32 2.562n ± 2% 7.707n ± 1% +200.74% (p=0.000 n=20)
Uint64-32 2.592n ± 0% 7.714n ± 1% +197.65% (p=0.000 n=20)
GlobalIntN1000-32 6.266n ± 2% 6.236n ± 1% ~ (p=0.039 n=20)
IntN1000-32 4.724n ± 2% 10.410n ± 1% +120.39% (p=0.000 n=20)
Int64N1000-32 5.490n ± 2% 10.975n ± 2% +99.89% (p=0.000 n=20)
Int64N1e8-32 5.513n ± 2% 10.980n ± 1% +99.15% (p=0.000 n=20)
Int64N1e9-32 5.476n ± 1% 10.950n ± 0% +99.96% (p=0.000 n=20)
Int64N2e9-32 5.501n ± 2% 11.110n ± 1% +101.96% (p=0.000 n=20)
Int64N1e18-32 9.043n ± 2% 15.180n ± 2% +67.86% (p=0.000 n=20)
Int64N2e18-32 9.601n ± 2% 15.610n ± 1% +62.60% (p=0.000 n=20)
Int64N4e18-32 12.00n ± 1% 19.23n ± 2% +60.14% (p=0.000 n=20)
Int32N1000-32 4.829n ± 2% 10.345n ± 1% +114.25% (p=0.000 n=20)
Int32N1e8-32 4.825n ± 2% 10.330n ± 1% +114.09% (p=0.000 n=20)
Int32N1e9-32 4.830n ± 2% 10.350n ± 1% +114.26% (p=0.000 n=20)
Int32N2e9-32 4.750n ± 2% 10.345n ± 1% +117.81% (p=0.000 n=20)
Float32-32 10.89n ± 4% 13.57n ± 1% +24.61% (p=0.000 n=20)
Float64-32 19.60n ± 4% 22.95n ± 4% +17.12% (p=0.000 n=20)
ExpFloat64-32 12.96n ± 3% 15.23n ± 2% +17.47% (p=0.000 n=20)
NormFloat64-32 7.516n ± 1% 13.780n ± 1% +83.34% (p=0.000 n=20)
Perm3-32 36.78n ± 2% 46.62n ± 2% +26.72% (p=0.000 n=20)
Perm30-32 238.9n ± 2% 400.7n ± 1% +67.73% (p=0.000 n=20)
Perm30ViaShuffle-32 189.7n ± 2% 350.5n ± 1% +84.79% (p=0.000 n=20)
ShuffleOverhead-32 159.8n ± 1% 326.0n ± 2% +104.01% (p=0.000 n=20)
Concurrent-32 3.286n ± 1% 3.290n ± 0% ~ (p=0.743 n=20)
On the other hand, compared to the original "update benchmarks" CL,
the cleanups we've made more than compensate for PCG being a bit
slower than LFSR, at least on 64-bit x86. ARM64 (Apple M1) is a bit
slower: perhaps the 64x64→128 multiply is slower there for some reason.
386 is noticeably slower, but it's also a non-SSA backend.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 220860f76f.amd64 │ afa459a2f0.amd64 │
│ sec/op │ sec/op vs base │
SourceUint64-32 1.555n ± 1% 1.450n ± 3% -6.78% (p=0.000 n=20)
GlobalInt64-32 2.071n ± 1% 2.067n ± 2% ~ (p=0.673 n=20)
GlobalInt63Parallel-32 0.1023n ± 1%
GlobalInt64Parallel-32 0.1044n ± 2%
GlobalUint64-32 5.193n ± 1% 2.085n ± 0% -59.86% (p=0.000 n=20)
GlobalUint64Parallel-32 0.2341n ± 0% 0.1008n ± 1% -56.93% (p=0.000 n=20)
Int64-32 2.056n ± 2% 1.779n ± 1% -13.47% (p=0.000 n=20)
Uint64-32 2.077n ± 2% 1.854n ± 2% -10.74% (p=0.000 n=20)
GlobalIntN1000-32 4.077n ± 2% 3.140n ± 3% -22.98% (p=0.000 n=20)
IntN1000-32 3.476n ± 2% 2.496n ± 1% -28.19% (p=0.000 n=20)
Int64N1000-32 3.059n ± 1% 2.510n ± 2% -17.96% (p=0.000 n=20)
Int64N1e8-32 2.942n ± 1% 2.471n ± 2% -15.98% (p=0.000 n=20)
Int64N1e9-32 2.932n ± 1% 2.488n ± 2% -15.14% (p=0.000 n=20)
Int64N2e9-32 2.925n ± 1% 2.478n ± 2% -15.30% (p=0.000 n=20)
Int64N1e18-32 3.116n ± 1% 3.088n ± 1% ~ (p=0.013 n=20)
Int64N2e18-32 4.067n ± 1% 3.493n ± 1% -14.11% (p=0.000 n=20)
Int64N4e18-32 4.054n ± 1% 5.060n ± 2% +24.80% (p=0.000 n=20)
Int32N1000-32 2.951n ± 1% 2.620n ± 1% -11.22% (p=0.000 n=20)
Int32N1e8-32 3.102n ± 1% 2.652n ± 0% -14.50% (p=0.000 n=20)
Int32N1e9-32 3.535n ± 1% 2.644n ± 1% -25.20% (p=0.000 n=20)
Int32N2e9-32 3.514n ± 1% 2.619n ± 2% -25.47% (p=0.000 n=20)
Float32-32 2.760n ± 1% 2.261n ± 1% -18.06% (p=0.000 n=20)
Float64-32 2.284n ± 1% 2.241n ± 2% ~ (p=0.016 n=20)
ExpFloat64-32 3.757n ± 1% 3.716n ± 1% ~ (p=0.034 n=20)
NormFloat64-32 3.837n ± 1% 3.718n ± 1% -3.09% (p=0.000 n=20)
Perm3-32 35.23n ± 2% 34.11n ± 2% -3.19% (p=0.000 n=20)
Perm30-32 208.8n ± 1% 200.6n ± 0% -3.93% (p=0.000 n=20)
Perm30ViaShuffle-32 111.7n ± 1% 109.7n ± 1% -1.84% (p=0.000 n=20)
ShuffleOverhead-32 101.1n ± 1% 107.2n ± 1% +6.03% (p=0.000 n=20)
Concurrent-32 2.108n ± 7% 2.108n ± 6% ~ (p=0.644 n=20)
PCG_DXSM-32 1.488n ± 2%
goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
│ 220860f76f.arm64 │ afa459a2f0.arm64 │
│ sec/op │ sec/op vs base │
SourceUint64-8 2.316n ± 1% 2.531n ± 0% +9.33% (p=0.000 n=20)
GlobalInt64-8 2.183n ± 1% 2.177n ± 1% ~ (p=0.533 n=20)
GlobalInt63Parallel-8 0.4331n ± 0%
GlobalInt64Parallel-8 0.4319n ± 0%
GlobalUint64-8 4.377n ± 2% 2.185n ± 1% -50.07% (p=0.000 n=20)
GlobalUint64Parallel-8 0.9237n ± 0% 0.4295n ± 1% -53.50% (p=0.000 n=20)
Int64-8 2.538n ± 1% 4.104n ± 0% +61.68% (p=0.000 n=20)
Uint64-8 2.604n ± 1% 4.080n ± 0% +56.68% (p=0.000 n=20)
GlobalIntN1000-8 3.857n ± 2% 2.814n ± 1% -27.04% (p=0.000 n=20)
IntN1000-8 3.822n ± 2% 4.140n ± 0% +8.32% (p=0.000 n=20)
Int64N1000-8 3.318n ± 0% 4.139n ± 0% +24.74% (p=0.000 n=20)
Int64N1e8-8 3.349n ± 1% 4.140n ± 0% +23.64% (p=0.000 n=20)
Int64N1e9-8 3.317n ± 2% 4.139n ± 0% +24.80% (p=0.000 n=20)
Int64N2e9-8 3.317n ± 2% 4.140n ± 0% +24.81% (p=0.000 n=20)
Int64N1e18-8 3.542n ± 1% 5.273n ± 0% +48.85% (p=0.000 n=20)
Int64N2e18-8 5.087n ± 0% 6.059n ± 0% +19.12% (p=0.000 n=20)
Int64N4e18-8 5.084n ± 0% 8.803n ± 0% +73.16% (p=0.000 n=20)
Int32N1000-8 3.208n ± 2% 4.131n ± 0% +28.79% (p=0.000 n=20)
Int32N1e8-8 3.610n ± 1% 4.131n ± 0% +14.43% (p=0.000 n=20)
Int32N1e9-8 4.235n ± 0% 4.131n ± 0% -2.44% (p=0.000 n=20)
Int32N2e9-8 4.229n ± 1% 4.131n ± 0% -2.33% (p=0.000 n=20)
Float32-8 3.468n ± 0% 4.110n ± 0% +18.50% (p=0.000 n=20)
Float64-8 3.447n ± 0% 4.104n ± 0% +19.05% (p=0.000 n=20)
ExpFloat64-8 4.567n ± 0% 5.338n ± 0% +16.86% (p=0.000 n=20)
NormFloat64-8 4.821n ± 0% 5.731n ± 0% +18.89% (p=0.000 n=20)
Perm3-8 28.89n ± 0% 26.62n ± 0% -7.84% (p=0.000 n=20)
Perm30-8 175.7n ± 0% 194.6n ± 2% +10.76% (p=0.000 n=20)
Perm30ViaShuffle-8 153.5n ± 0% 156.4n ± 0% +1.86% (p=0.000 n=20)
ShuffleOverhead-8 119.8n ± 1% 125.8n ± 0% +4.97% (p=0.000 n=20)
Concurrent-8 2.433n ± 3% 2.654n ± 6% +9.13% (p=0.001 n=20)
PCG_DXSM-8 2.531n ± 0%
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 220860f76f.386 │ afa459a2f0.386 │
│ sec/op │ sec/op vs base │
SourceUint64-32 2.370n ± 1% 7.680n ± 1% +224.05% (p=0.000 n=20)
GlobalInt64-32 3.569n ± 1% 3.474n ± 3% -2.66% (p=0.001 n=20)
GlobalInt63Parallel-32 0.3221n ± 1%
GlobalInt64Parallel-32 0.3253n ± 0%
GlobalUint64-32 8.797n ± 10% 3.433n ± 2% -60.98% (p=0.000 n=20)
GlobalUint64Parallel-32 0.6351n ± 0% 0.3156n ± 0% -50.31% (p=0.000 n=20)
Int64-32 2.612n ± 2% 7.707n ± 1% +195.04% (p=0.000 n=20)
Uint64-32 3.350n ± 1% 7.714n ± 1% +130.25% (p=0.000 n=20)
GlobalIntN1000-32 5.892n ± 1% 6.236n ± 1% +5.82% (p=0.000 n=20)
IntN1000-32 4.546n ± 1% 10.410n ± 1% +128.97% (p=0.000 n=20)
Int64N1000-32 14.59n ± 1% 10.97n ± 2% -24.75% (p=0.000 n=20)
Int64N1e8-32 14.76n ± 2% 10.98n ± 1% -25.58% (p=0.000 n=20)
Int64N1e9-32 16.57n ± 1% 10.95n ± 0% -33.90% (p=0.000 n=20)
Int64N2e9-32 14.54n ± 1% 11.11n ± 1% -23.62% (p=0.000 n=20)
Int64N1e18-32 16.14n ± 1% 15.18n ± 2% -5.95% (p=0.000 n=20)
Int64N2e18-32 18.10n ± 1% 15.61n ± 1% -13.73% (p=0.000 n=20)
Int64N4e18-32 18.65n ± 1% 19.23n ± 2% +3.08% (p=0.000 n=20)
Int32N1000-32 3.560n ± 1% 10.345n ± 1% +190.55% (p=0.000 n=20)
Int32N1e8-32 3.770n ± 2% 10.330n ± 1% +174.01% (p=0.000 n=20)
Int32N1e9-32 4.098n ± 0% 10.350n ± 1% +152.53% (p=0.000 n=20)
Int32N2e9-32 4.179n ± 1% 10.345n ± 1% +147.52% (p=0.000 n=20)
Float32-32 21.18n ± 4% 13.57n ± 1% -35.93% (p=0.000 n=20)
Float64-32 20.60n ± 2% 22.95n ± 4% +11.41% (p=0.000 n=20)
ExpFloat64-32 13.07n ± 0% 15.23n ± 2% +16.48% (p=0.000 n=20)
NormFloat64-32 7.738n ± 2% 13.780n ± 1% +78.08% (p=0.000 n=20)
Perm3-32 36.73n ± 1% 46.62n ± 2% +26.91% (p=0.000 n=20)
Perm30-32 211.9n ± 1% 400.7n ± 1% +89.05% (p=0.000 n=20)
Perm30ViaShuffle-32 165.2n ± 1% 350.5n ± 1% +112.20% (p=0.000 n=20)
ShuffleOverhead-32 133.9n ± 1% 326.0n ± 2% +143.37% (p=0.000 n=20)
Concurrent-32 3.287n ± 2% 3.290n ± 0% ~ (p=0.365 n=20)
PCG_DXSM-32 7.793n ± 2%
For #61716.
Change-Id: I4e9c0525b5f84a2ac46f23da9e365495e2d05777
Reviewed-on: https://go-review.googlesource.com/c/go/+/502506
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
|
|
8631fcbf31 |
math/rand/v2: add PCG-DXSM
For the original math/rand, we ported Plan 9's random number
generator, which was a refinement by Ken Thompson of an algorithm
by Don Mitchell and Jim Reeds, which Mitchell in turn recalls as
having been derived from an algorithm by Marsaglia. At its core,
it is an additive lagged Fibonacci generator (ALFG).
Whatever the details of the history, this generator is nowhere
near the current state of the art for simple, pseudo-random
generators.
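For readers unfamiliar with the term, an additive lagged Fibonacci generator produces each value as the sum of two earlier values at fixed lags. The following is a minimal sketch, not the math/rand source: the type `alfg`, the constructor `newALFG`, the lags 607 and 273, and the LCG used to fill the initial state are all assumptions made for illustration.

```go
package main

import "fmt"

// Lags assumed for this sketch: x[n] = x[n-607] + x[n-273] (mod 2^64).
const (
	lagLong  = 607
	lagShort = 273
)

// alfg is a hypothetical additive lagged Fibonacci generator.
// Its state is a ring buffer holding the last lagLong outputs.
type alfg struct {
	state [lagLong]uint64
	pos   int // index of the oldest value, x[n-607]
}

// newALFG fills the ring with values from a simple LCG.
// The seeding scheme is an assumption of this sketch.
func newALFG(seed uint64) *alfg {
	g := &alfg{}
	s := seed
	for i := range g.state {
		s = s*6364136223846793005 + 1442695040888963407
		g.state[i] = s
	}
	return g
}

// Uint64 returns the next value and overwrites the oldest slot with it.
func (g *alfg) Uint64() uint64 {
	// x[n-273] sits (607-273) slots after the oldest entry in the ring.
	tap := (g.pos + lagLong - lagShort) % lagLong
	x := g.state[g.pos] + g.state[tap] // one add per output, wrapping mod 2^64
	g.state[g.pos] = x
	g.pos = (g.pos + 1) % lagLong
	return x
}

func main() {
	g := newALFG(42)
	for i := 0; i < 3; i++ {
		fmt.Println(g.Uint64())
	}
}
```

Each output costs a single addition, but the generator has to carry hundreds of words of state to do so.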
This CL adds an implementation of Melissa O'Neill's PCG, specifically
the variant PCG-DXSM, which she defined after writing the PCG paper
and which is now the default in Numpy. The update is slightly slower
(a few multiplies and adds, instead of a few adds), but the state
is dramatically smaller (2 words instead of 607). The statistical
output properties are better too.
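To make "a few multiplies and adds" and "2 words" concrete, here is a rough sketch of a 128-bit linear congruential step followed by a DXSM-style output scramble. It is not the math/rand/v2 implementation: the type `pcgSketch` and the constants are assumptions chosen for illustration, and the real generator may differ in constants and details.

```go
package main

import (
	"fmt"
	"math/bits"
)

// pcgSketch is a hypothetical type for illustration: the whole generator
// state is two 64-bit words forming one 128-bit integer.
type pcgSketch struct {
	hi, lo uint64
}

// Illustrative constants, chosen for this sketch only.
const (
	sketchMul   = 0xda942042e4dd58b5  // odd 64-bit multiplier
	sketchIncHi = 6364136223846793005 // high word of the increment
	sketchIncLo = 1442695040888963407 // low word of the increment (odd)
)

// step advances the state as a 128-bit linear congruential generator:
// state = state*sketchMul + inc (mod 2^128).
func (p *pcgSketch) step() {
	hi, lo := bits.Mul64(p.lo, sketchMul) // full 64x64→128 product of the low word
	hi += p.hi * sketchMul                // the high word only affects the upper half
	lo, carry := bits.Add64(lo, sketchIncLo, 0)
	hi, _ = bits.Add64(hi, sketchIncHi, carry)
	p.hi, p.lo = hi, lo
}

// Uint64 advances the state and scrambles it with a DXSM-style
// ("double xorshift multiply") output function.
func (p *pcgSketch) Uint64() uint64 {
	p.step()
	hi, lo := p.hi, p.lo
	hi ^= hi >> 32
	hi *= sketchMul
	hi ^= hi >> 48
	hi *= lo | 1 // force the final multiplier to be odd
	return hi
}

func main() {
	a := pcgSketch{hi: 1, lo: 2}
	b := pcgSketch{hi: 1, lo: 2}
	for i := 0; i < 3; i++ {
		fmt.Println(a.Uint64() == b.Uint64()) // identical seeds give identical streams
	}
}
```

The 64x64→128 multiply in `step` is the operation whose cost varies by architecture, which is one plausible reason the benchmarks differ between amd64, arm64, and 386.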
A followup CL will delete the old generator.
PCG is the only change here, so no benchmarks should be affected.
Including them anyway as further evidence for caution.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 8993506f2f.amd64 │ 01ff938549.amd64 │
│ sec/op │ sec/op vs base │
SourceUint64-32 1.325n ± 1% 1.352n ± 1% +2.00% (p=0.000 n=20)
GlobalInt64-32 2.240n ± 1% 2.083n ± 0% -7.03% (p=0.000 n=20)
GlobalInt64Parallel-32 0.1041n ± 1% 0.1035n ± 1% ~ (p=0.064 n=20)
GlobalUint64-32 2.072n ± 3% 2.038n ± 1% ~ (p=0.089 n=20)
GlobalUint64Parallel-32 0.1008n ± 1% 0.1006n ± 1% ~ (p=0.804 n=20)
Int64-32 1.716n ± 1% 1.687n ± 2% ~ (p=0.045 n=20)
Uint64-32 1.665n ± 1% 1.674n ± 2% ~ (p=0.878 n=20)
GlobalIntN1000-32 3.335n ± 1% 3.135n ± 1% -6.00% (p=0.000 n=20)
IntN1000-32 2.484n ± 1% 2.478n ± 1% ~ (p=0.085 n=20)
Int64N1000-32 2.502n ± 2% 2.455n ± 1% -1.88% (p=0.002 n=20)
Int64N1e8-32 2.484n ± 2% 2.467n ± 2% ~ (p=0.048 n=20)
Int64N1e9-32 2.502n ± 0% 2.454n ± 1% -1.92% (p=0.000 n=20)
Int64N2e9-32 2.502n ± 0% 2.482n ± 1% -0.76% (p=0.000 n=20)
Int64N1e18-32 3.201n ± 1% 3.349n ± 2% +4.62% (p=0.000 n=20)
Int64N2e18-32 3.504n ± 1% 3.537n ± 1% ~ (p=0.185 n=20)
Int64N4e18-32 4.873n ± 1% 4.917n ± 0% +0.90% (p=0.000 n=20)
Int32N1000-32 2.639n ± 1% 2.386n ± 1% -9.57% (p=0.000 n=20)
Int32N1e8-32 2.686n ± 2% 2.366n ± 1% -11.91% (p=0.000 n=20)
Int32N1e9-32 2.636n ± 1% 2.355n ± 2% -10.70% (p=0.000 n=20)
Int32N2e9-32 2.660n ± 1% 2.371n ± 1% -10.88% (p=0.000 n=20)
Float32-32 2.261n ± 1% 2.245n ± 2% ~ (p=0.752 n=20)
Float64-32 2.280n ± 1% 2.235n ± 1% -1.97% (p=0.007 n=20)
ExpFloat64-32 3.891n ± 1% 3.813n ± 3% ~ (p=0.087 n=20)
NormFloat64-32 3.711n ± 1% 3.652n ± 2% ~ (p=0.021 n=20)
Perm3-32 32.60n ± 2% 33.12n ± 3% ~ (p=0.107 n=20)
Perm30-32 204.2n ± 0% 205.1n ± 1% ~ (p=0.358 n=20)
Perm30ViaShuffle-32 121.7n ± 2% 110.8n ± 1% -8.96% (p=0.000 n=20)
ShuffleOverhead-32 106.2n ± 2% 113.0n ± 1% +6.36% (p=0.000 n=20)
Concurrent-32 2.190n ± 5% 2.100n ± 0% -4.13% (p=0.001 n=20)
PCG_DXSM-32 1.490n ± 0%
goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
│ 8993506f2f.arm64 │ 01ff938549.arm64 │
│ sec/op │ sec/op vs base │
SourceUint64-8 2.271n ± 0% 2.258n ± 1% ~ (p=0.167 n=20)
GlobalInt64-8 2.161n ± 1% 2.167n ± 0% ~ (p=0.693 n=20)
GlobalInt64Parallel-8 0.4303n ± 0% 0.4310n ± 0% ~ (p=0.051 n=20)
GlobalUint64-8 2.164n ± 1% 2.182n ± 1% ~ (p=0.042 n=20)
GlobalUint64Parallel-8 0.4287n ± 0% 0.4297n ± 0% ~ (p=0.082 n=20)
Int64-8 2.478n ± 1% 2.472n ± 1% ~ (p=0.151 n=20)
Uint64-8 2.460n ± 1% 2.449n ± 1% ~ (p=0.013 n=20)
GlobalIntN1000-8 2.814n ± 2% 2.814n ± 2% ~ (p=0.821 n=20)
IntN1000-8 3.003n ± 2% 2.998n ± 2% ~ (p=0.024 n=20)
Int64N1000-8 2.954n ± 0% 2.949n ± 2% ~ (p=0.192 n=20)
Int64N1e8-8 2.956n ± 0% 2.953n ± 2% ~ (p=0.109 n=20)
Int64N1e9-8 3.325n ± 0% 2.950n ± 0% -11.26% (p=0.000 n=20)
Int64N2e9-8 2.956n ± 2% 2.946n ± 2% ~ (p=0.027 n=20)
Int64N1e18-8 3.780n ± 1% 3.779n ± 1% ~ (p=0.815 n=20)
Int64N2e18-8 4.385n ± 0% 4.370n ± 1% ~ (p=0.402 n=20)
Int64N4e18-8 6.527n ± 0% 6.544n ± 1% ~ (p=0.140 n=20)
Int32N1000-8 2.964n ± 1% 2.950n ± 0% -0.47% (p=0.002 n=20)
Int32N1e8-8 2.964n ± 1% 2.950n ± 2% ~ (p=0.013 n=20)
Int32N1e9-8 2.963n ± 2% 2.951n ± 2% ~ (p=0.062 n=20)
Int32N2e9-8 2.961n ± 2% 2.950n ± 2% -0.37% (p=0.002 n=20)
Float32-8 3.442n ± 0% 3.441n ± 0% ~ (p=0.211 n=20)
Float64-8 3.442n ± 0% 3.442n ± 0% ~ (p=0.067 n=20)
ExpFloat64-8 4.472n ± 0% 4.481n ± 0% +0.20% (p=0.000 n=20)
NormFloat64-8 4.734n ± 0% 4.725n ± 0% -0.19% (p=0.003 n=20)
Perm3-8 26.55n ± 0% 26.55n ± 0% ~ (p=0.833 n=20)
Perm30-8 181.9n ± 0% 181.9n ± 0% -0.03% (p=0.004 n=20)
Perm30ViaShuffle-8 143.1n ± 0% 142.9n ± 0% ~ (p=0.204 n=20)
ShuffleOverhead-8 120.6n ± 1% 120.8n ± 2% ~ (p=0.102 n=20)
Concurrent-8 2.357n ± 2% 2.421n ± 6% ~ (p=0.016 n=20)
PCG_DXSM-8 2.531n ± 0%
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 8993506f2f.386 │ 01ff938549.386 │
│ sec/op │ sec/op vs base │
SourceUint64-32 2.102n ± 2% 2.069n ± 0% ~ (p=0.021 n=20)
GlobalInt64-32 3.542n ± 2% 3.456n ± 1% -2.44% (p=0.001 n=20)
GlobalInt64Parallel-32 0.3202n ± 0% 0.3252n ± 0% +1.56% (p=0.000 n=20)
GlobalUint64-32 3.507n ± 1% 3.573n ± 1% +1.87% (p=0.000 n=20)
GlobalUint64Parallel-32 0.3170n ± 1% 0.3159n ± 0% ~ (p=0.167 n=20)
Int64-32 2.516n ± 1% 2.562n ± 2% ~ (p=0.016 n=20)
Uint64-32 2.544n ± 1% 2.592n ± 0% +1.85% (p=0.000 n=20)
GlobalIntN1000-32 6.237n ± 1% 6.266n ± 2% ~ (p=0.268 n=20)
IntN1000-32 4.670n ± 2% 4.724n ± 2% ~ (p=0.644 n=20)
Int64N1000-32 5.412n ± 1% 5.490n ± 2% ~ (p=0.159 n=20)
Int64N1e8-32 5.414n ± 2% 5.513n ± 2% ~ (p=0.129 n=20)
Int64N1e9-32 5.473n ± 1% 5.476n ± 1% ~ (p=0.723 n=20)
Int64N2e9-32 5.487n ± 1% 5.501n ± 2% ~ (p=0.481 n=20)
Int64N1e18-32 8.901n ± 2% 9.043n ± 2% ~ (p=0.330 n=20)
Int64N2e18-32 9.521n ± 1% 9.601n ± 2% ~ (p=0.703 n=20)
Int64N4e18-32 11.92n ± 1% 12.00n ± 1% ~ (p=0.489 n=20)
Int32N1000-32 4.785n ± 1% 4.829n ± 2% ~ (p=0.402 n=20)
Int32N1e8-32 4.748n ± 1% 4.825n ± 2% ~ (p=0.218 n=20)
Int32N1e9-32 4.810n ± 1% 4.830n ± 2% ~ (p=0.794 n=20)
Int32N2e9-32 4.812n ± 1% 4.750n ± 2% ~ (p=0.057 n=20)
Float32-32 10.48n ± 4% 10.89n ± 4% ~ (p=0.162 n=20)
Float64-32 19.79n ± 3% 19.60n ± 4% ~ (p=0.668 n=20)
ExpFloat64-32 12.91n ± 3% 12.96n ± 3% ~ (p=1.000 n=20)
NormFloat64-32 7.462n ± 1% 7.516n ± 1% ~ (p=0.051 n=20)
Perm3-32 35.98n ± 2% 36.78n ± 2% ~ (p=0.033 n=20)
Perm30-32 241.5n ± 1% 238.9n ± 2% ~ (p=0.126 n=20)
Perm30ViaShuffle-32 187.3n ± 2% 189.7n ± 2% ~ (p=0.387 n=20)
ShuffleOverhead-32 160.2n ± 1% 159.8n ± 1% ~ (p=0.256 n=20)
Concurrent-32 3.308n ± 3% 3.286n ± 1% ~ (p=0.038 n=20)
PCG_DXSM-32 7.613n ± 1%
For #61716.
Change-Id: Icb274ca1f782504d658305a40159b4ae6a2f3f1d
Reviewed-on: https://go-review.googlesource.com/c/go/+/502505
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Rob Pike <r@golang.org>
|
|
|
|
f2e2637227 |
math/rand/v2: simplify Perm
The compiler says Perm is being inlined into BenchmarkPerm,
and yet BenchmarkPerm30ViaShuffle, which you'd think is the
same code, still runs significantly faster.
The benchmarks are mystifying but this is clearly still a step in
the right direction, since BenchmarkPerm30ViaShuffle is still
the fastest and we avoid having two copies of that logic.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ e1bbe739fb.amd64 │ 8993506f2f.amd64 │
│ sec/op │ sec/op vs base │
SourceUint64-32 1.316n ± 2% 1.325n ± 1% ~ (p=0.208 n=20)
GlobalInt64-32 2.048n ± 1% 2.240n ± 1% +9.38% (p=0.000 n=20)
GlobalInt64Parallel-32 0.1037n ± 1% 0.1041n ± 1% ~ (p=0.774 n=20)
GlobalUint64-32 2.039n ± 2% 2.072n ± 3% ~ (p=0.115 n=20)
GlobalUint64Parallel-32 0.1013n ± 1% 0.1008n ± 1% ~ (p=0.417 n=20)
Int64-32 1.692n ± 2% 1.716n ± 1% ~ (p=0.122 n=20)
Uint64-32 1.643n ± 2% 1.665n ± 1% ~ (p=0.062 n=20)
GlobalIntN1000-32 3.287n ± 1% 3.335n ± 1% ~ (p=0.147 n=20)
IntN1000-32 2.678n ± 2% 2.484n ± 1% -7.24% (p=0.000 n=20)
Int64N1000-32 2.684n ± 2% 2.502n ± 2% -6.80% (p=0.000 n=20)
Int64N1e8-32 2.663n ± 2% 2.484n ± 2% -6.76% (p=0.000 n=20)
Int64N1e9-32 2.633n ± 1% 2.502n ± 0% -4.98% (p=0.000 n=20)
Int64N2e9-32 2.657n ± 1% 2.502n ± 0% -5.87% (p=0.000 n=20)
Int64N1e18-32 3.125n ± 2% 3.201n ± 1% +2.43% (p=0.000 n=20)
Int64N2e18-32 3.476n ± 1% 3.504n ± 1% +0.83% (p=0.009 n=20)
Int64N4e18-32 4.795n ± 1% 4.873n ± 1% ~ (p=0.106 n=20)
Int32N1000-32 2.485n ± 2% 2.639n ± 1% +6.20% (p=0.000 n=20)
Int32N1e8-32 2.457n ± 1% 2.686n ± 2% +9.34% (p=0.000 n=20)
Int32N1e9-32 2.452n ± 1% 2.636n ± 1% +7.52% (p=0.000 n=20)
Int32N2e9-32 2.453n ± 1% 2.660n ± 1% +8.44% (p=0.000 n=20)
Float32-32 2.254n ± 1% 2.261n ± 1% ~ (p=0.888 n=20)
Float64-32 2.262n ± 1% 2.280n ± 1% ~ (p=0.040 n=20)
ExpFloat64-32 3.777n ± 2% 3.891n ± 1% +3.03% (p=0.000 n=20)
NormFloat64-32 3.606n ± 1% 3.711n ± 1% +2.91% (p=0.000 n=20)
Perm3-32 33.12n ± 2% 32.60n ± 2% ~ (p=0.045 n=20)
Perm30-32 176.1n ± 1% 204.2n ± 0% +15.96% (p=0.000 n=20)
Perm30ViaShuffle-32 109.3n ± 1% 121.7n ± 2% +11.30% (p=0.000 n=20)
ShuffleOverhead-32 112.5n ± 1% 106.2n ± 2% -5.56% (p=0.000 n=20)
Concurrent-32 2.099n ± 0% 2.190n ± 5% +4.36% (p=0.001 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
│ e1bbe739fb.arm64 │ 8993506f2f.arm64 │
│ sec/op │ sec/op vs base │
SourceUint64-8 2.290n ± 1% 2.271n ± 0% ~ (p=0.015 n=20)
GlobalInt64-8 2.180n ± 1% 2.161n ± 1% ~ (p=0.180 n=20)
GlobalInt64Parallel-8 0.4294n ± 0% 0.4303n ± 0% +0.19% (p=0.001 n=20)
GlobalUint64-8 2.170n ± 1% 2.164n ± 1% ~ (p=0.673 n=20)
GlobalUint64Parallel-8 0.4283n ± 0% 0.4287n ± 0% ~ (p=0.128 n=20)
Int64-8 2.481n ± 1% 2.478n ± 1% ~ (p=0.867 n=20)
Uint64-8 2.464n ± 1% 2.460n ± 1% ~ (p=0.763 n=20)
GlobalIntN1000-8 2.814n ± 0% 2.814n ± 2% ~ (p=0.969 n=20)
IntN1000-8 2.934n ± 2% 3.003n ± 2% +2.35% (p=0.000 n=20)
Int64N1000-8 2.957n ± 1% 2.954n ± 0% ~ (p=0.285 n=20)
Int64N1e8-8 2.935n ± 2% 2.956n ± 0% +0.73% (p=0.002 n=20)
Int64N1e9-8 2.935n ± 2% 3.325n ± 0% +13.29% (p=0.000 n=20)
Int64N2e9-8 2.933n ± 4% 2.956n ± 2% ~ (p=0.163 n=20)
Int64N1e18-8 3.781n ± 1% 3.780n ± 1% ~ (p=0.805 n=20)
Int64N2e18-8 4.362n ± 0% 4.385n ± 0% ~ (p=0.077 n=20)
Int64N4e18-8 6.576n ± 1% 6.527n ± 0% ~ (p=0.024 n=20)
Int32N1000-8 2.942n ± 2% 2.964n ± 1% ~ (p=0.073 n=20)
Int32N1e8-8 2.941n ± 1% 2.964n ± 1% ~ (p=0.058 n=20)
Int32N1e9-8 2.938n ± 2% 2.963n ± 2% +0.87% (p=0.003 n=20)
Int32N2e9-8 2.982n ± 2% 2.961n ± 2% ~ (p=0.056 n=20)
Float32-8 3.441n ± 0% 3.442n ± 0% ~ (p=0.030 n=20)
Float64-8 3.441n ± 0% 3.442n ± 0% +0.03% (p=0.001 n=20)
ExpFloat64-8 4.472n ± 0% 4.472n ± 0% ~ (p=0.877 n=20)
NormFloat64-8 4.716n ± 0% 4.734n ± 0% +0.38% (p=0.000 n=20)
Perm3-8 26.66n ± 0% 26.55n ± 0% -0.39% (p=0.000 n=20)
Perm30-8 143.3n ± 0% 181.9n ± 0% +26.97% (p=0.000 n=20)
Perm30ViaShuffle-8 142.9n ± 0% 143.1n ± 0% ~ (p=0.669 n=20)
ShuffleOverhead-8 121.1n ± 1% 120.6n ± 1% -0.41% (p=0.004 n=20)
Concurrent-8 2.379n ± 2% 2.357n ± 2% ~ (p=0.337 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ e1bbe739fb.386 │ 8993506f2f.386 │
│ sec/op │ sec/op vs base │
SourceUint64-32 2.087n ± 1% 2.102n ± 2% ~ (p=0.507 n=20)
GlobalInt64-32 3.538n ± 2% 3.542n ± 2% ~ (p=0.425 n=20)
GlobalInt64Parallel-32 0.3207n ± 1% 0.3202n ± 0% ~ (p=0.963 n=20)
GlobalUint64-32 3.543n ± 1% 3.507n ± 1% ~ (p=0.034 n=20)
GlobalUint64Parallel-32 0.3170n ± 0% 0.3170n ± 1% ~ (p=0.920 n=20)
Int64-32 2.548n ± 1% 2.516n ± 1% ~ (p=0.139 n=20)
Uint64-32 2.565n ± 2% 2.544n ± 1% ~ (p=0.394 n=20)
GlobalIntN1000-32 6.300n ± 1% 6.237n ± 1% ~ (p=0.029 n=20)
IntN1000-32 4.750n ± 0% 4.670n ± 2% ~ (p=0.034 n=20)
Int64N1000-32 5.515n ± 2% 5.412n ± 1% -1.86% (p=0.009 n=20)
Int64N1e8-32 5.527n ± 0% 5.414n ± 2% -2.05% (p=0.002 n=20)
Int64N1e9-32 5.531n ± 2% 5.473n ± 1% ~ (p=0.047 n=20)
Int64N2e9-32 5.514n ± 2% 5.487n ± 1% ~ (p=0.298 n=20)
Int64N1e18-32 9.059n ± 1% 8.901n ± 2% ~ (p=0.037 n=20)
Int64N2e18-32 9.594n ± 1% 9.521n ± 1% ~ (p=0.051 n=20)
Int64N4e18-32 12.05n ± 2% 11.92n ± 1% ~ (p=0.357 n=20)
Int32N1000-32 4.840n ± 2% 4.785n ± 1% ~ (p=0.189 n=20)
Int32N1e8-32 4.832n ± 2% 4.748n ± 1% ~ (p=0.042 n=20)
Int32N1e9-32 4.815n ± 2% 4.810n ± 1% ~ (p=0.878 n=20)
Int32N2e9-32 4.813n ± 1% 4.812n ± 1% ~ (p=0.542 n=20)
Float32-32 10.90n ± 2% 10.48n ± 4% -3.85% (p=0.007 n=20)
Float64-32 20.32n ± 4% 19.79n ± 3% ~ (p=0.553 n=20)
ExpFloat64-32 12.95n ± 3% 12.91n ± 3% ~ (p=0.909 n=20)
NormFloat64-32 7.570n ± 1% 7.462n ± 1% -1.44% (p=0.004 n=20)
Perm3-32 37.80n ± 2% 35.98n ± 2% -4.79% (p=0.000 n=20)
Perm30-32 214.0n ± 1% 241.5n ± 1% +12.85% (p=0.000 n=20)
Perm30ViaShuffle-32 188.7n ± 2% 187.3n ± 2% ~ (p=0.029 n=20)
ShuffleOverhead-32 160.8n ± 1% 160.2n ± 1% ~ (p=0.180 n=20)
Concurrent-32 3.288n ± 0% 3.308n ± 3% ~ (p=0.037 n=20)
For #61716.
Change-Id: I342b611456c3569520d3c91c849d29eba325d87e
Reviewed-on: https://go-review.googlesource.com/c/go/+/502504
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Rob Pike <r@golang.org>
|
|
|
|
488e2a56b9 |
math/rand/v2: remove bias in ExpFloat64 and NormFloat64
The original implementation of the ziggurat algorithm was designed for
32-bit random integer inputs. This necessitated reusing some low-order
bits for the slice selection and the random coordinate, which introduces
statistical bias. The result is that PractRand consistently fails the
math/rand normal and exponential sequences (transformed to uniform)
within 2 GB of variates.
This change adjusts the ziggurat procedures to use 63-bit random inputs,
so that there is no need to reuse bits between the slice and coordinate.
This is sufficient for the normal sequence to survive to 256 GB of
PractRand testing.
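The bit-reuse problem can be illustrated with a small sketch. It does not reproduce the exact bit layout of this change; the function names and the choice of 128 slices with a 32-bit coordinate are assumptions made for illustration.

```go
package main

import "fmt"

// sliceMask selects one of 128 ziggurat slices (an assumption of this sketch).
const sliceMask = 0x7F

// splitReused mimics a 32-bit design: the low 7 bits pick the slice,
// but the same bits are also part of the coordinate.
func splitReused(u uint32) (slice uint32, coord int32) {
	return u & sliceMask, int32(u)
}

// splitDisjoint draws both fields from one wider input without overlap:
// the coordinate comes from the low 32 bits, the slice from bits 32..38.
func splitDisjoint(u uint64) (slice uint64, coord int32) {
	return (u >> 32) & sliceMask, int32(u)
}

func main() {
	s, c := splitReused(0xDEADBEEF)
	// The slice index is fully determined by the coordinate's low bits.
	fmt.Println(s == uint32(c)&sliceMask)

	s1, c1 := splitDisjoint(0x0000000500000001)
	s2, c2 := splitDisjoint(0x0000007F00000001)
	// Same coordinate, different slices: the two fields vary independently.
	fmt.Println(c1 == c2, s1 != s2)
}
```

In the reused layout the slice and the coordinate are correlated by construction, which is the kind of dependence a statistical test suite can eventually detect.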
An alternative technique is to recalculate the ziggurats to use 1024
rather than 128 or 256 slices to make full use of 64-bit inputs. This
improves the survival of the normal sequence to far beyond 256 GB and
additionally provides a 6% performance improvement due to the improved
rejection procedure efficiency. However, doing so increases the total
size of the ziggurat tables from 4.5 kB to 48 kB.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 2703446c2e.amd64 │ e1bbe739fb.amd64 │
│ sec/op │ sec/op vs base │
SourceUint64-32 1.337n ± 1% 1.316n ± 2% ~ (p=0.024 n=20)
GlobalInt64-32 2.225n ± 2% 2.048n ± 1% -7.93% (p=0.000 n=20)
GlobalInt64Parallel-32 0.1043n ± 2% 0.1037n ± 1% ~ (p=0.587 n=20)
GlobalUint64-32 2.058n ± 1% 2.039n ± 2% ~ (p=0.030 n=20)
GlobalUint64Parallel-32 0.1009n ± 1% 0.1013n ± 1% ~ (p=0.984 n=20)
Int64-32 1.719n ± 2% 1.692n ± 2% ~ (p=0.085 n=20)
Uint64-32 1.669n ± 1% 1.643n ± 2% ~ (p=0.049 n=20)
GlobalIntN1000-32 3.321n ± 2% 3.287n ± 1% ~ (p=0.298 n=20)
IntN1000-32 2.479n ± 1% 2.678n ± 2% +8.01% (p=0.000 n=20)
Int64N1000-32 2.477n ± 1% 2.684n ± 2% +8.38% (p=0.000 n=20)
Int64N1e8-32 2.490n ± 1% 2.663n ± 2% +6.99% (p=0.000 n=20)
Int64N1e9-32 2.458n ± 1% 2.633n ± 1% +7.12% (p=0.000 n=20)
Int64N2e9-32 2.486n ± 2% 2.657n ± 1% +6.90% (p=0.000 n=20)
Int64N1e18-32 3.215n ± 2% 3.125n ± 2% -2.78% (p=0.000 n=20)
Int64N2e18-32 3.588n ± 2% 3.476n ± 1% -3.15% (p=0.000 n=20)
Int64N4e18-32 4.938n ± 2% 4.795n ± 1% -2.91% (p=0.000 n=20)
Int32N1000-32 2.673n ± 2% 2.485n ± 2% -7.02% (p=0.000 n=20)
Int32N1e8-32 2.631n ± 2% 2.457n ± 1% -6.63% (p=0.000 n=20)
Int32N1e9-32 2.628n ± 2% 2.452n ± 1% -6.70% (p=0.000 n=20)
Int32N2e9-32 2.684n ± 2% 2.453n ± 1% -8.61% (p=0.000 n=20)
Float32-32 2.240n ± 2% 2.254n ± 1% ~ (p=0.878 n=20)
Float64-32 2.253n ± 1% 2.262n ± 1% ~ (p=0.963 n=20)
ExpFloat64-32 3.677n ± 1% 3.777n ± 2% +2.71% (p=0.004 n=20)
NormFloat64-32 3.761n ± 1% 3.606n ± 1% -4.15% (p=0.000 n=20)
Perm3-32 33.55n ± 2% 33.12n ± 2% ~ (p=0.402 n=20)
Perm30-32 173.2n ± 1% 176.1n ± 1% +1.67% (p=0.000 n=20)
Perm30ViaShuffle-32 115.9n ± 1% 109.3n ± 1% -5.69% (p=0.000 n=20)
ShuffleOverhead-32 101.9n ± 1% 112.5n ± 1% +10.35% (p=0.000 n=20)
Concurrent-32 2.107n ± 6% 2.099n ± 0% ~ (p=0.051 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
│ 2703446c2e.arm64 │ e1bbe739fb.arm64 │
│ sec/op │ sec/op vs base │
SourceUint64-8 2.275n ± 0% 2.290n ± 1% ~ (p=0.044 n=20)
GlobalInt64-8 2.154n ± 1% 2.180n ± 1% ~ (p=0.068 n=20)
GlobalInt64Parallel-8 0.4298n ± 0% 0.4294n ± 0% ~ (p=0.079 n=20)
GlobalUint64-8 2.160n ± 1% 2.170n ± 1% ~ (p=0.129 n=20)
GlobalUint64Parallel-8 0.4286n ± 0% 0.4283n ± 0% ~ (p=0.350 n=20)
Int64-8 2.491n ± 1% 2.481n ± 1% ~ (p=0.330 n=20)
Uint64-8 2.458n ± 0% 2.464n ± 1% ~ (p=0.351 n=20)
GlobalIntN1000-8 2.814n ± 2% 2.814n ± 0% ~ (p=0.325 n=20)
IntN1000-8 2.933n ± 0% 2.934n ± 2% ~ (p=0.079 n=20)
Int64N1000-8 2.962n ± 1% 2.957n ± 1% ~ (p=0.259 n=20)
Int64N1e8-8 2.960n ± 1% 2.935n ± 2% ~ (p=0.276 n=20)
Int64N1e9-8 2.935n ± 2% 2.935n ± 2% ~ (p=0.984 n=20)
Int64N2e9-8 2.934n ± 0% 2.933n ± 4% ~ (p=0.463 n=20)
Int64N1e18-8 3.777n ± 1% 3.781n ± 1% ~ (p=0.516 n=20)
Int64N2e18-8 4.359n ± 1% 4.362n ± 0% ~ (p=0.256 n=20)
Int64N4e18-8 6.536n ± 1% 6.576n ± 1% ~ (p=0.224 n=20)
Int32N1000-8 2.937n ± 0% 2.942n ± 2% ~ (p=0.312 n=20)
Int32N1e8-8 2.937n ± 1% 2.941n ± 1% ~ (p=0.463 n=20)
Int32N1e9-8 2.936n ± 0% 2.938n ± 2% ~ (p=0.044 n=20)
Int32N2e9-8 2.938n ± 2% 2.982n ± 2% ~ (p=0.174 n=20)
Float32-8 3.441n ± 0% 3.441n ± 0% ~ (p=0.064 n=20)
Float64-8 3.441n ± 0% 3.441n ± 0% ~ (p=0.826 n=20)
ExpFloat64-8 4.486n ± 0% 4.472n ± 0% -0.31% (p=0.000 n=20)
NormFloat64-8 4.721n ± 0% 4.716n ± 0% ~ (p=0.051 n=20)
Perm3-8 26.65n ± 0% 26.66n ± 0% ~ (p=0.080 n=20)
Perm30-8 143.2n ± 0% 143.3n ± 0% +0.10% (p=0.000 n=20)
Perm30ViaShuffle-8 143.0n ± 0% 142.9n ± 0% ~ (p=0.642 n=20)
ShuffleOverhead-8 120.6n ± 1% 121.1n ± 1% +0.41% (p=0.010 n=20)
Concurrent-8 2.399n ± 5% 2.379n ± 2% ~ (p=0.365 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 2703446c2e.386 │ e1bbe739fb.386 │
│ sec/op │ sec/op vs base │
SourceUint64-32 2.072n ± 2% 2.087n ± 1% ~ (p=0.440 n=20)
GlobalInt64-32 3.546n ± 27% 3.538n ± 2% ~ (p=0.101 n=20)
GlobalInt64Parallel-32 0.3211n ± 0% 0.3207n ± 1% ~ (p=0.753 n=20)
GlobalUint64-32 3.522n ± 2% 3.543n ± 1% ~ (p=0.071 n=20)
GlobalUint64Parallel-32 0.3172n ± 0% 0.3170n ± 0% ~ (p=0.507 n=20)
Int64-32 2.520n ± 2% 2.548n ± 1% ~ (p=0.267 n=20)
Uint64-32 2.581n ± 1% 2.565n ± 2% ~ (p=0.143 n=20)
GlobalIntN1000-32 6.171n ± 1% 6.300n ± 1% ~ (p=0.037 n=20)
IntN1000-32 4.752n ± 2% 4.750n ± 0% ~ (p=0.984 n=20)
Int64N1000-32 5.429n ± 1% 5.515n ± 2% ~ (p=0.292 n=20)
Int64N1e8-32 5.469n ± 2% 5.527n ± 0% ~ (p=0.013 n=20)
Int64N1e9-32 5.489n ± 2% 5.531n ± 2% ~ (p=0.256 n=20)
Int64N2e9-32 5.492n ± 2% 5.514n ± 2% ~ (p=0.606 n=20)
Int64N1e18-32 8.927n ± 1% 9.059n ± 1% ~ (p=0.229 n=20)
Int64N2e18-32 9.622n ± 1% 9.594n ± 1% ~ (p=0.703 n=20)
Int64N4e18-32 12.03n ± 1% 12.05n ± 2% ~ (p=0.733 n=20)
Int32N1000-32 4.817n ± 1% 4.840n ± 2% ~ (p=0.941 n=20)
Int32N1e8-32 4.801n ± 1% 4.832n ± 2% ~ (p=0.228 n=20)
Int32N1e9-32 4.798n ± 1% 4.815n ± 2% ~ (p=0.560 n=20)
Int32N2e9-32 4.840n ± 1% 4.813n ± 1% ~ (p=0.015 n=20)
Float32-32 10.51n ± 4% 10.90n ± 2% +3.71% (p=0.007 n=20)
Float64-32 20.33n ± 3% 20.32n ± 4% ~ (p=0.566 n=20)
ExpFloat64-32 12.59n ± 2% 12.95n ± 3% +2.86% (p=0.002 n=20)
NormFloat64-32 7.350n ± 2% 7.570n ± 1% +2.99% (p=0.007 n=20)
Perm3-32 39.29n ± 2% 37.80n ± 2% -3.79% (p=0.000 n=20)
Perm30-32 219.1n ± 2% 214.0n ± 1% -2.33% (p=0.002 n=20)
Perm30ViaShuffle-32 189.8n ± 2% 188.7n ± 2% ~ (p=0.147 n=20)
ShuffleOverhead-32 158.9n ± 2% 160.8n ± 1% ~ (p=0.176 n=20)
Concurrent-32 3.306n ± 3% 3.288n ± 0% -0.54% (p=0.005 n=20)
For #61716.
Change-Id: I4c5fe710b310dc075ae21c97d1805bcc20db5050
Reviewed-on: https://go-review.googlesource.com/c/go/+/516275
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Rob Pike <r@golang.org>
|
|
|
|
ecda959b99 |
math/rand/v2: optimize Float32, Float64
We realized too late after Go 1 that float64(r.Uint64())/(1<<64)
is not a correct implementation: it occasionally rounds to 1.
The correct implementation is float64(r.Uint64()&(1<<53-1))/(1<<53)
but we couldn't change the implementation for compatibility, so we
changed it to retry only in the "round to 1" cases.
The change to v2 lets us update the algorithm to the simpler,
faster one.
Note that this implementation cannot generate 2⁻⁵⁴, nor 2⁻¹⁰⁰,
nor any of the other numbers between 0 and 2⁻⁵³. A slower algorithm
could shift some of the probability of generating these two boundary
values over to the values in between, but that would be much slower
and not necessarily better. In particular, the current
implementation has the property that there are uniform gaps between
the possible returned floats, which might help stability. Also, the
result is often scaled and shifted, like Float64()*X+Y. Multiplying by
X>1 would open new gaps, and adding most Y would erase all the
distinctions that were introduced.
The only changes to benchmarks should be in Float32 and Float64.
The other changes remain a cautionary tale.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 4d84a369d1.amd64 │ 2703446c2e.amd64 │
│ sec/op │ sec/op vs base │
SourceUint64-32 1.348n ± 2% 1.337n ± 1% ~ (p=0.662 n=20)
GlobalInt64-32 2.082n ± 2% 2.225n ± 2% +6.87% (p=0.000 n=20)
GlobalInt64Parallel-32 0.1036n ± 1% 0.1043n ± 2% ~ (p=0.171 n=20)
GlobalUint64-32 2.077n ± 2% 2.058n ± 1% ~ (p=0.560 n=20)
GlobalUint64Parallel-32 0.1012n ± 1% 0.1009n ± 1% ~ (p=0.995 n=20)
Int64-32 1.750n ± 0% 1.719n ± 2% -1.74% (p=0.000 n=20)
Uint64-32 1.707n ± 2% 1.669n ± 1% -2.20% (p=0.000 n=20)
GlobalIntN1000-32 3.192n ± 1% 3.321n ± 2% +4.04% (p=0.000 n=20)
IntN1000-32 2.462n ± 2% 2.479n ± 1% ~ (p=0.417 n=20)
Int64N1000-32 2.470n ± 1% 2.477n ± 1% ~ (p=0.664 n=20)
Int64N1e8-32 2.503n ± 2% 2.490n ± 1% ~ (p=0.245 n=20)
Int64N1e9-32 2.487n ± 1% 2.458n ± 1% ~ (p=0.032 n=20)
Int64N2e9-32 2.487n ± 1% 2.486n ± 2% ~ (p=0.507 n=20)
Int64N1e18-32 3.006n ± 2% 3.215n ± 2% +6.94% (p=0.000 n=20)
Int64N2e18-32 3.368n ± 1% 3.588n ± 2% +6.55% (p=0.000 n=20)
Int64N4e18-32 4.763n ± 1% 4.938n ± 2% +3.69% (p=0.000 n=20)
Int32N1000-32 2.403n ± 1% 2.673n ± 2% +11.19% (p=0.000 n=20)
Int32N1e8-32 2.405n ± 1% 2.631n ± 2% +9.42% (p=0.000 n=20)
Int32N1e9-32 2.402n ± 2% 2.628n ± 2% +9.41% (p=0.000 n=20)
Int32N2e9-32 2.384n ± 1% 2.684n ± 2% +12.56% (p=0.000 n=20)
Float32-32 2.641n ± 2% 2.240n ± 2% -15.18% (p=0.000 n=20)
Float64-32 2.483n ± 1% 2.253n ± 1% -9.26% (p=0.000 n=20)
ExpFloat64-32 3.486n ± 2% 3.677n ± 1% +5.49% (p=0.000 n=20)
NormFloat64-32 3.648n ± 1% 3.761n ± 1% +3.11% (p=0.000 n=20)
Perm3-32 33.04n ± 1% 33.55n ± 2% ~ (p=0.180 n=20)
Perm30-32 171.9n ± 1% 173.2n ± 1% ~ (p=0.050 n=20)
Perm30ViaShuffle-32 100.3n ± 1% 115.9n ± 1% +15.55% (p=0.000 n=20)
ShuffleOverhead-32 102.5n ± 1% 101.9n ± 1% ~ (p=0.266 n=20)
Concurrent-32 2.101n ± 0% 2.107n ± 6% ~ (p=0.212 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
│ 4d84a369d1.arm64 │ 2703446c2e.arm64 │
│ sec/op │ sec/op vs base │
SourceUint64-8 2.261n ± 1% 2.275n ± 0% ~ (p=0.082 n=20)
GlobalInt64-8 2.160n ± 1% 2.154n ± 1% ~ (p=0.490 n=20)
GlobalInt64Parallel-8 0.4299n ± 0% 0.4298n ± 0% ~ (p=0.663 n=20)
GlobalUint64-8 2.169n ± 1% 2.160n ± 1% ~ (p=0.292 n=20)
GlobalUint64Parallel-8 0.4293n ± 1% 0.4286n ± 0% ~ (p=0.155 n=20)
Int64-8 2.473n ± 1% 2.491n ± 1% ~ (p=0.317 n=20)
Uint64-8 2.453n ± 1% 2.458n ± 0% ~ (p=0.941 n=20)
GlobalIntN1000-8 2.814n ± 2% 2.814n ± 2% ~ (p=0.972 n=20)
IntN1000-8 2.933n ± 2% 2.933n ± 0% ~ (p=0.287 n=20)
Int64N1000-8 2.934n ± 2% 2.962n ± 1% ~ (p=0.062 n=20)
Int64N1e8-8 2.935n ± 2% 2.960n ± 1% ~ (p=0.183 n=20)
Int64N1e9-8 2.934n ± 2% 2.935n ± 2% ~ (p=0.367 n=20)
Int64N2e9-8 2.935n ± 2% 2.934n ± 0% ~ (p=0.455 n=20)
Int64N1e18-8 3.778n ± 1% 3.777n ± 1% ~ (p=0.995 n=20)
Int64N2e18-8 4.359n ± 1% 4.359n ± 1% ~ (p=0.122 n=20)
Int64N4e18-8 6.546n ± 1% 6.536n ± 1% ~ (p=0.920 n=20)
Int32N1000-8 2.940n ± 2% 2.937n ± 0% ~ (p=0.149 n=20)
Int32N1e8-8 2.937n ± 2% 2.937n ± 1% ~ (p=0.620 n=20)
Int32N1e9-8 2.938n ± 0% 2.936n ± 0% ~ (p=0.046 n=20)
Int32N2e9-8 2.938n ± 2% 2.938n ± 2% ~ (p=0.455 n=20)
Float32-8 3.486n ± 0% 3.441n ± 0% -1.28% (p=0.000 n=20)
Float64-8 3.480n ± 0% 3.441n ± 0% -1.13% (p=0.000 n=20)
ExpFloat64-8 4.533n ± 0% 4.486n ± 0% -1.03% (p=0.000 n=20)
NormFloat64-8 4.764n ± 0% 4.721n ± 0% -0.90% (p=0.000 n=20)
Perm3-8 26.66n ± 0% 26.65n ± 0% ~ (p=0.019 n=20)
Perm30-8 143.4n ± 0% 143.2n ± 0% -0.17% (p=0.000 n=20)
Perm30ViaShuffle-8 142.9n ± 0% 143.0n ± 0% ~ (p=0.522 n=20)
ShuffleOverhead-8 120.7n ± 0% 120.6n ± 1% ~ (p=0.488 n=20)
Concurrent-8 2.360n ± 2% 2.399n ± 5% ~ (p=0.062 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 4d84a369d1.386 │ 2703446c2e.386 │
│ sec/op │ sec/op vs base │
SourceUint64-32 2.101n ± 2% 2.072n ± 2% ~ (p=0.273 n=20)
GlobalInt64-32 3.518n ± 2% 3.546n ± 27% +0.78% (p=0.007 n=20)
GlobalInt64Parallel-32 0.3206n ± 0% 0.3211n ± 0% ~ (p=0.386 n=20)
GlobalUint64-32 3.538n ± 1% 3.522n ± 2% ~ (p=0.331 n=20)
GlobalUint64Parallel-32 0.3231n ± 0% 0.3172n ± 0% -1.84% (p=0.000 n=20)
Int64-32 2.554n ± 2% 2.520n ± 2% ~ (p=0.465 n=20)
Uint64-32 2.575n ± 2% 2.581n ± 1% ~ (p=0.213 n=20)
GlobalIntN1000-32 6.292n ± 1% 6.171n ± 1% ~ (p=0.015 n=20)
IntN1000-32 4.735n ± 1% 4.752n ± 2% ~ (p=0.635 n=20)
Int64N1000-32 5.489n ± 2% 5.429n ± 1% ~ (p=0.324 n=20)
Int64N1e8-32 5.528n ± 2% 5.469n ± 2% ~ (p=0.013 n=20)
Int64N1e9-32 5.438n ± 2% 5.489n ± 2% ~ (p=0.984 n=20)
Int64N2e9-32 5.474n ± 1% 5.492n ± 2% ~ (p=0.616 n=20)
Int64N1e18-32 9.053n ± 1% 8.927n ± 1% ~ (p=0.037 n=20)
Int64N2e18-32 9.685n ± 2% 9.622n ± 1% ~ (p=0.449 n=20)
Int64N4e18-32 12.18n ± 1% 12.03n ± 1% ~ (p=0.013 n=20)
Int32N1000-32 4.862n ± 1% 4.817n ± 1% -0.94% (p=0.002 n=20)
Int32N1e8-32 4.758n ± 2% 4.801n ± 1% ~ (p=0.597 n=20)
Int32N1e9-32 4.772n ± 1% 4.798n ± 1% ~ (p=0.774 n=20)
Int32N2e9-32 4.847n ± 0% 4.840n ± 1% ~ (p=0.867 n=20)
Float32-32 22.18n ± 4% 10.51n ± 4% -52.61% (p=0.000 n=20)
Float64-32 21.21n ± 3% 20.33n ± 3% -4.17% (p=0.000 n=20)
ExpFloat64-32 12.39n ± 2% 12.59n ± 2% ~ (p=0.139 n=20)
NormFloat64-32 7.422n ± 1% 7.350n ± 2% ~ (p=0.208 n=20)
Perm3-32 38.00n ± 2% 39.29n ± 2% +3.38% (p=0.000 n=20)
Perm30-32 212.7n ± 1% 219.1n ± 2% +3.03% (p=0.001 n=20)
Perm30ViaShuffle-32 187.5n ± 2% 189.8n ± 2% ~ (p=0.457 n=20)
ShuffleOverhead-32 159.7n ± 1% 158.9n ± 2% ~ (p=0.920 n=20)
Concurrent-32 3.470n ± 0% 3.306n ± 3% -4.71% (p=0.000 n=20)
For #61716.
Change-Id: I1933f1f9efd7e6e832d83e7fa5d84398f67d41f5
Reviewed-on: https://go-review.googlesource.com/c/go/+/502503
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
|
|
c266587846 |
math/rand/v2: add, optimize N, UintN, Uint32N, Uint64N
Now that we can break the value stream, we can take advantage
of better algorithms that have been suggested since the original
code was written.
Also optimizes IntN, Int32N, Int64N, Perm (indirectly).
All the N variants (IntN, Int32N, Int64N, UintN, N, etc) now
return the same values given a Source and parameter n, so that
for example uint(r.IntN(10)) and r.UintN(10) and r.N(uint(10))
are completely interchangeable.
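The interchangeability claim can be spot-checked with identically seeded generators. This is a sketch that assumes the released math/rand/v2 API (Go 1.22 or later), including the NewPCG constructor, which is not part of this commit; sameDraws is a hypothetical helper. The top-level generic N draws from the global generator and cannot be seeded, so it is not compared here.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// sameDraws reports whether IntN, UintN, and Int64N return the same values
// for bound n when each reads from its own identically seeded generator.
func sameDraws(seed1, seed2 uint64, n, draws int) bool {
	a := rand.New(rand.NewPCG(seed1, seed2))
	b := rand.New(rand.NewPCG(seed1, seed2))
	c := rand.New(rand.NewPCG(seed1, seed2))
	for i := 0; i < draws; i++ {
		x := uint(a.IntN(n))
		y := b.UintN(uint(n))
		z := uint(c.Int64N(int64(n)))
		if x != y || y != z {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println("N variants agree:", sameDraws(1, 2, 10, 1000))
}
```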
Int64N4e18 gets slower but that is a near worst case for
the algorithm and is extremely unlikely in practice.
32-bit Int32N variants got slower too, by 15-30%, in exchange
for speeding up everything on 64-bit systems and consistency
across the N functions.
Also rename previously missed benchmark
GlobalInt63Parallel to GlobalInt64Parallel.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 11ad9fdddc.amd64 │ 4d84a369d1.amd64 │
│ sec/op │ sec/op vs base │
SourceUint64-32 1.335n ± 1% 1.348n ± 2% ~ (p=0.335 n=20)
GlobalInt64-32 2.046n ± 1% 2.082n ± 2% ~ (p=0.310 n=20)
GlobalInt63Parallel-32 0.1037n ± 1%
GlobalInt64Parallel-32 0.1036n ± 1%
GlobalUint64-32 2.075n ± 0% 2.077n ± 2% ~ (p=0.228 n=20)
GlobalUint64Parallel-32 0.1013n ± 1% 0.1012n ± 1% ~ (p=0.878 n=20)
Int64-32 1.726n ± 2% 1.750n ± 0% +1.39% (p=0.000 n=20)
Uint64-32 1.673n ± 1% 1.707n ± 2% +2.03% (p=0.002 n=20)
GlobalIntN1000-32 3.895n ± 2% 3.192n ± 1% -18.05% (p=0.000 n=20)
IntN1000-32 3.403n ± 1% 2.462n ± 2% -27.65% (p=0.000 n=20)
Int64N1000-32 3.053n ± 2% 2.470n ± 1% -19.11% (p=0.000 n=20)
Int64N1e8-32 2.718n ± 1% 2.503n ± 2% -7.91% (p=0.000 n=20)
Int64N1e9-32 2.712n ± 1% 2.487n ± 1% -8.31% (p=0.000 n=20)
Int64N2e9-32 2.690n ± 1% 2.487n ± 1% -7.57% (p=0.000 n=20)
Int64N1e18-32 3.084n ± 2% 3.006n ± 2% -2.53% (p=0.000 n=20)
Int64N2e18-32 4.026n ± 1% 3.368n ± 1% -16.33% (p=0.000 n=20)
Int64N4e18-32 4.049n ± 2% 4.763n ± 1% +17.62% (p=0.000 n=20)
Int32N1000-32 2.730n ± 0% 2.403n ± 1% -11.94% (p=0.000 n=20)
Int32N1e8-32 2.916n ± 2% 2.405n ± 1% -17.53% (p=0.000 n=20)
Int32N1e9-32 3.375n ± 1% 2.402n ± 2% -28.83% (p=0.000 n=20)
Int32N2e9-32 3.292n ± 1% 2.384n ± 1% -27.58% (p=0.000 n=20)
Float32-32 2.673n ± 1% 2.641n ± 2% ~ (p=0.147 n=20)
Float64-32 2.485n ± 1% 2.483n ± 1% ~ (p=0.804 n=20)
ExpFloat64-32 3.577n ± 2% 3.486n ± 2% -2.57% (p=0.000 n=20)
NormFloat64-32 3.797n ± 2% 3.648n ± 1% -3.92% (p=0.000 n=20)
Perm3-32 35.79n ± 2% 33.04n ± 1% -7.68% (p=0.000 n=20)
Perm30-32 205.1n ± 1% 171.9n ± 1% -16.14% (p=0.000 n=20)
Perm30ViaShuffle-32 111.2n ± 2% 100.3n ± 1% -9.76% (p=0.000 n=20)
ShuffleOverhead-32 100.5n ± 2% 102.5n ± 1% +1.99% (p=0.007 n=20)
Concurrent-32 2.188n ± 5% 2.101n ± 0% ~ (p=0.013 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
│ 11ad9fdddc.arm64 │ 4d84a369d1.arm64 │
│ sec/op │ sec/op vs base │
SourceUint64-8 2.272n ± 1% 2.261n ± 1% ~ (p=0.172 n=20)
GlobalInt64-8 2.155n ± 1% 2.160n ± 1% ~ (p=0.482 n=20)
GlobalInt63Parallel-8 0.4352n ± 0%
GlobalInt64Parallel-8 0.4299n ± 0%
GlobalUint64-8 2.173n ± 1% 2.169n ± 1% ~ (p=0.262 n=20)
GlobalUint64Parallel-8 0.4340n ± 0% 0.4293n ± 1% -1.08% (p=0.000 n=20)
Int64-8 2.544n ± 1% 2.473n ± 1% -2.83% (p=0.000 n=20)
Uint64-8 2.552n ± 1% 2.453n ± 1% -3.90% (p=0.000 n=20)
GlobalIntN1000-8 3.856n ± 0% 2.814n ± 2% -27.02% (p=0.000 n=20)
IntN1000-8 3.820n ± 0% 2.933n ± 2% -23.22% (p=0.000 n=20)
Int64N1000-8 3.219n ± 2% 2.934n ± 2% -8.85% (p=0.000 n=20)
Int64N1e8-8 3.221n ± 2% 2.935n ± 2% -8.91% (p=0.000 n=20)
Int64N1e9-8 3.276n ± 2% 2.934n ± 2% -10.44% (p=0.000 n=20)
Int64N2e9-8 3.217n ± 0% 2.935n ± 2% -8.78% (p=0.000 n=20)
Int64N1e18-8 3.502n ± 2% 3.778n ± 1% +7.91% (p=0.000 n=20)
Int64N2e18-8 4.968n ± 1% 4.359n ± 1% -12.26% (p=0.000 n=20)
Int64N4e18-8 4.963n ± 0% 6.546n ± 1% +31.92% (p=0.000 n=20)
Int32N1000-8 3.189n ± 1% 2.940n ± 2% -7.81% (p=0.000 n=20)
Int32N1e8-8 3.514n ± 1% 2.937n ± 2% -16.41% (p=0.000 n=20)
Int32N1e9-8 4.133n ± 0% 2.938n ± 0% -28.91% (p=0.000 n=20)
Int32N2e9-8 4.137n ± 0% 2.938n ± 2% -28.97% (p=0.000 n=20)
Float32-8 3.468n ± 1% 3.486n ± 0% +0.52% (p=0.000 n=20)
Float64-8 3.478n ± 0% 3.480n ± 0% ~ (p=0.063 n=20)
ExpFloat64-8 4.563n ± 0% 4.533n ± 0% -0.67% (p=0.000 n=20)
NormFloat64-8 4.768n ± 0% 4.764n ± 0% -0.07% (p=0.001 n=20)
Perm3-8 28.94n ± 0% 26.66n ± 0% -7.88% (p=0.000 n=20)
Perm30-8 175.9n ± 0% 143.4n ± 0% -18.50% (p=0.000 n=20)
Perm30ViaShuffle-8 152.6n ± 1% 142.9n ± 0% -6.29% (p=0.000 n=20)
ShuffleOverhead-8 119.6n ± 1% 120.7n ± 0% +0.96% (p=0.000 n=20)
Concurrent-8 2.452n ± 3% 2.360n ± 2% -3.73% (p=0.007 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 11ad9fdddc.386 │ 4d84a369d1.386 │
│ sec/op │ sec/op vs base │
SourceUint64-32 2.091n ± 1% 2.101n ± 2% ~ (p=0.672 n=20)
GlobalInt64-32 3.514n ± 2% 3.518n ± 2% ~ (p=0.723 n=20)
GlobalInt63Parallel-32 0.3197n ± 0%
GlobalInt64Parallel-32 0.3206n ± 0%
GlobalUint64-32 3.542n ± 1% 3.538n ± 1% ~ (p=0.304 n=20)
GlobalUint64Parallel-32 0.3218n ± 0% 0.3231n ± 0% ~ (p=0.071 n=20)
Int64-32 2.552n ± 2% 2.554n ± 2% ~ (p=0.693 n=20)
Uint64-32 2.566n ± 1% 2.575n ± 2% ~ (p=0.606 n=20)
GlobalIntN1000-32 5.965n ± 2% 6.292n ± 1% +5.46% (p=0.000 n=20)
IntN1000-32 4.652n ± 1% 4.735n ± 1% +1.77% (p=0.000 n=20)
Int64N1000-32 14.485n ± 1% 5.489n ± 2% -62.11% (p=0.000 n=20)
Int64N1e8-32 14.675n ± 1% 5.528n ± 2% -62.33% (p=0.000 n=20)
Int64N1e9-32 16.805n ± 2% 5.438n ± 2% -67.64% (p=0.000 n=20)
Int64N2e9-32 14.515n ± 1% 5.474n ± 1% -62.28% (p=0.000 n=20)
Int64N1e18-32 16.165n ± 1% 9.053n ± 1% -44.00% (p=0.000 n=20)
Int64N2e18-32 17.945n ± 2% 9.685n ± 2% -46.03% (p=0.000 n=20)
Int64N4e18-32 18.35n ± 2% 12.18n ± 1% -33.62% (p=0.000 n=20)
Int32N1000-32 3.608n ± 1% 4.862n ± 1% +34.77% (p=0.000 n=20)
Int32N1e8-32 3.767n ± 1% 4.758n ± 2% +26.31% (p=0.000 n=20)
Int32N1e9-32 4.130n ± 2% 4.772n ± 1% +15.54% (p=0.000 n=20)
Int32N2e9-32 4.206n ± 1% 4.847n ± 0% +15.24% (p=0.000 n=20)
Float32-32 22.18n ± 4% 22.18n ± 4% ~ (p=0.195 n=20)
Float64-32 20.75n ± 4% 21.21n ± 3% ~ (p=0.394 n=20)
ExpFloat64-32 12.58n ± 3% 12.39n ± 2% ~ (p=0.032 n=20)
NormFloat64-32 7.920n ± 3% 7.422n ± 1% -6.29% (p=0.000 n=20)
Perm3-32 40.27n ± 1% 38.00n ± 2% -5.65% (p=0.000 n=20)
Perm30-32 213.2n ± 2% 212.7n ± 1% ~ (p=0.995 n=20)
Perm30ViaShuffle-32 164.2n ± 2% 187.5n ± 2% +14.22% (p=0.000 n=20)
ShuffleOverhead-32 134.7n ± 2% 159.7n ± 1% +18.52% (p=0.000 n=20)
Concurrent-32 3.301n ± 2% 3.470n ± 0% +5.10% (p=0.000 n=20)
For #61716.
Change-Id: Id1481b04202883cd0b23e21bb58d1bca4e482bd3
Reviewed-on: https://go-review.googlesource.com/c/go/+/502500
Reviewed-by: Rob Pike <r@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
|
|
c7dddb02d3 |
math/rand/v2: change Source to use uint64
This should make Uint64-using functions faster and leave
other things alone. It is a mystery why so much got faster.
A good cautionary tale not to read too much into minor
jitter in the benchmarks.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 220860f76f.amd64 │ 11ad9fdddc.amd64 │
│ sec/op │ sec/op vs base │
SourceUint64-32 1.555n ± 1% 1.335n ± 1% -14.15% (p=0.000 n=20)
GlobalInt64-32 2.071n ± 1% 2.046n ± 1% ~ (p=0.016 n=20)
GlobalInt63Parallel-32 0.1023n ± 1% 0.1037n ± 1% +1.37% (p=0.002 n=20)
GlobalUint64-32 5.193n ± 1% 2.075n ± 0% -60.06% (p=0.000 n=20)
GlobalUint64Parallel-32 0.2341n ± 0% 0.1013n ± 1% -56.74% (p=0.000 n=20)
Int64-32 2.056n ± 2% 1.726n ± 2% -16.10% (p=0.000 n=20)
Uint64-32 2.077n ± 2% 1.673n ± 1% -19.46% (p=0.000 n=20)
GlobalIntN1000-32 4.077n ± 2% 3.895n ± 2% -4.45% (p=0.000 n=20)
IntN1000-32 3.476n ± 2% 3.403n ± 1% -2.10% (p=0.000 n=20)
Int64N1000-32 3.059n ± 1% 3.053n ± 2% ~ (p=0.131 n=20)
Int64N1e8-32 2.942n ± 1% 2.718n ± 1% -7.60% (p=0.000 n=20)
Int64N1e9-32 2.932n ± 1% 2.712n ± 1% -7.50% (p=0.000 n=20)
Int64N2e9-32 2.925n ± 1% 2.690n ± 1% -8.03% (p=0.000 n=20)
Int64N1e18-32 3.116n ± 1% 3.084n ± 2% ~ (p=0.425 n=20)
Int64N2e18-32 4.067n ± 1% 4.026n ± 1% -1.02% (p=0.007 n=20)
Int64N4e18-32 4.054n ± 1% 4.049n ± 2% ~ (p=0.204 n=20)
Int32N1000-32 2.951n ± 1% 2.730n ± 0% -7.49% (p=0.000 n=20)
Int32N1e8-32 3.102n ± 1% 2.916n ± 2% -6.03% (p=0.000 n=20)
Int32N1e9-32 3.535n ± 1% 3.375n ± 1% -4.54% (p=0.000 n=20)
Int32N2e9-32 3.514n ± 1% 3.292n ± 1% -6.30% (p=0.000 n=20)
Float32-32 2.760n ± 1% 2.673n ± 1% -3.13% (p=0.000 n=20)
Float64-32 2.284n ± 1% 2.485n ± 1% +8.80% (p=0.000 n=20)
ExpFloat64-32 3.757n ± 1% 3.577n ± 2% -4.78% (p=0.000 n=20)
NormFloat64-32 3.837n ± 1% 3.797n ± 2% ~ (p=0.204 n=20)
Perm3-32 35.23n ± 2% 35.79n ± 2% ~ (p=0.298 n=20)
Perm30-32 208.8n ± 1% 205.1n ± 1% -1.82% (p=0.000 n=20)
Perm30ViaShuffle-32 111.7n ± 1% 111.2n ± 2% ~ (p=0.273 n=20)
ShuffleOverhead-32 101.1n ± 1% 100.5n ± 2% ~ (p=0.878 n=20)
Concurrent-32 2.108n ± 7% 2.188n ± 5% ~ (p=0.417 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
│ 220860f76f.arm64 │ 11ad9fdddc.arm64 │
│ sec/op │ sec/op vs base │
SourceUint64-8 2.316n ± 1% 2.272n ± 1% -1.86% (p=0.000 n=20)
GlobalInt64-8 2.183n ± 1% 2.155n ± 1% ~ (p=0.122 n=20)
GlobalInt63Parallel-8 0.4331n ± 0% 0.4352n ± 0% +0.48% (p=0.000 n=20)
GlobalUint64-8 4.377n ± 2% 2.173n ± 1% -50.35% (p=0.000 n=20)
GlobalUint64Parallel-8 0.9237n ± 0% 0.4340n ± 0% -53.02% (p=0.000 n=20)
Int64-8 2.538n ± 1% 2.544n ± 1% ~ (p=0.189 n=20)
Uint64-8 2.604n ± 1% 2.552n ± 1% -1.98% (p=0.000 n=20)
GlobalIntN1000-8 3.857n ± 2% 3.856n ± 0% ~ (p=0.051 n=20)
IntN1000-8 3.822n ± 2% 3.820n ± 0% -0.05% (p=0.001 n=20)
Int64N1000-8 3.318n ± 0% 3.219n ± 2% -2.98% (p=0.000 n=20)
Int64N1e8-8 3.349n ± 1% 3.221n ± 2% -3.79% (p=0.000 n=20)
Int64N1e9-8 3.317n ± 2% 3.276n ± 2% -1.24% (p=0.001 n=20)
Int64N2e9-8 3.317n ± 2% 3.217n ± 0% -3.01% (p=0.000 n=20)
Int64N1e18-8 3.542n ± 1% 3.502n ± 2% -1.16% (p=0.001 n=20)
Int64N2e18-8 5.087n ± 0% 4.968n ± 1% -2.33% (p=0.000 n=20)
Int64N4e18-8 5.084n ± 0% 4.963n ± 0% -2.39% (p=0.000 n=20)
Int32N1000-8 3.208n ± 2% 3.189n ± 1% -0.58% (p=0.001 n=20)
Int32N1e8-8 3.610n ± 1% 3.514n ± 1% -2.67% (p=0.000 n=20)
Int32N1e9-8 4.235n ± 0% 4.133n ± 0% -2.40% (p=0.000 n=20)
Int32N2e9-8 4.229n ± 1% 4.137n ± 0% -2.19% (p=0.000 n=20)
Float32-8 3.468n ± 0% 3.468n ± 1% ~ (p=0.350 n=20)
Float64-8 3.447n ± 0% 3.478n ± 0% +0.90% (p=0.000 n=20)
ExpFloat64-8 4.567n ± 0% 4.563n ± 0% -0.10% (p=0.002 n=20)
NormFloat64-8 4.821n ± 0% 4.768n ± 0% -1.09% (p=0.000 n=20)
Perm3-8 28.89n ± 0% 28.94n ± 0% +0.17% (p=0.000 n=20)
Perm30-8 175.7n ± 0% 175.9n ± 0% +0.14% (p=0.000 n=20)
Perm30ViaShuffle-8 153.5n ± 0% 152.6n ± 1% ~ (p=0.010 n=20)
ShuffleOverhead-8 119.8n ± 1% 119.6n ± 1% ~ (p=0.147 n=20)
Concurrent-8 2.433n ± 3% 2.452n ± 3% ~ (p=0.616 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 220860f76f.386 │ 11ad9fdddc.386 │
│ sec/op │ sec/op vs base │
SourceUint64-32 2.370n ± 1% 2.091n ± 1% -11.75% (p=0.000 n=20)
GlobalInt64-32 3.569n ± 1% 3.514n ± 2% -1.56% (p=0.000 n=20)
GlobalInt63Parallel-32 0.3221n ± 1% 0.3197n ± 0% -0.76% (p=0.000 n=20)
GlobalUint64-32 8.797n ± 10% 3.542n ± 1% -59.74% (p=0.000 n=20)
GlobalUint64Parallel-32 0.6351n ± 0% 0.3218n ± 0% -49.33% (p=0.000 n=20)
Int64-32 2.612n ± 2% 2.552n ± 2% -2.30% (p=0.000 n=20)
Uint64-32 3.350n ± 1% 2.566n ± 1% -23.42% (p=0.000 n=20)
GlobalIntN1000-32 5.892n ± 1% 5.965n ± 2% ~ (p=0.082 n=20)
IntN1000-32 4.546n ± 1% 4.652n ± 1% +2.33% (p=0.000 n=20)
Int64N1000-32 14.59n ± 1% 14.48n ± 1% ~ (p=0.652 n=20)
Int64N1e8-32 14.76n ± 2% 14.67n ± 1% ~ (p=0.836 n=20)
Int64N1e9-32 16.57n ± 1% 16.80n ± 2% ~ (p=0.016 n=20)
Int64N2e9-32 14.54n ± 1% 14.52n ± 1% ~ (p=0.533 n=20)
Int64N1e18-32 16.14n ± 1% 16.16n ± 1% ~ (p=0.606 n=20)
Int64N2e18-32 18.10n ± 1% 17.95n ± 2% ~ (p=0.062 n=20)
Int64N4e18-32 18.65n ± 1% 18.35n ± 2% -1.61% (p=0.010 n=20)
Int32N1000-32 3.560n ± 1% 3.608n ± 1% +1.33% (p=0.001 n=20)
Int32N1e8-32 3.770n ± 2% 3.767n ± 1% ~ (p=0.155 n=20)
Int32N1e9-32 4.098n ± 0% 4.130n ± 2% ~ (p=0.016 n=20)
Int32N2e9-32 4.179n ± 1% 4.206n ± 1% ~ (p=0.011 n=20)
Float32-32 21.18n ± 4% 22.18n ± 4% +4.70% (p=0.003 n=20)
Float64-32 20.60n ± 2% 20.75n ± 4% +0.73% (p=0.000 n=20)
ExpFloat64-32 13.07n ± 0% 12.58n ± 3% -3.82% (p=0.000 n=20)
NormFloat64-32 7.738n ± 2% 7.920n ± 3% ~ (p=0.066 n=20)
Perm3-32 36.73n ± 1% 40.27n ± 1% +9.65% (p=0.000 n=20)
Perm30-32 211.9n ± 1% 213.2n ± 2% ~ (p=0.262 n=20)
Perm30ViaShuffle-32 165.2n ± 1% 164.2n ± 2% ~ (p=0.029 n=20)
ShuffleOverhead-32 133.9n ± 1% 134.7n ± 2% ~ (p=0.551 n=20)
Concurrent-32 3.287n ± 2% 3.301n ± 2% ~ (p=0.330 n=20)
For #61716.
Change-Id: I8d2f73f87dd3603a0c2ff069988938e0957b6904
Reviewed-on: https://go-review.googlesource.com/c/go/+/502499
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Rob Pike <r@golang.org>
|
|
|
|
1f4db9dbd6 |
math/rand/v2: update benchmarks
Change the benchmarks to use the result of the calls,
as I found that in certain cases inlining resulted in
discarding part of the computation in the benchmark loop.
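The pattern described here, accumulating each result and storing it in a package-level variable so the compiler cannot discard the work, might look like the sketch below. The names Sink and BenchmarkUint64 are illustrative, and NewPCG is assumed from the released math/rand/v2 API rather than taken from this commit.

```go
package main

import (
	"fmt"
	"math/rand/v2"
	"testing"
)

// Sink receives the accumulated result so that the calls in the
// benchmark loop have an observable effect and cannot be optimized away.
var Sink uint64

// BenchmarkUint64 uses every value it generates.
func BenchmarkUint64(b *testing.B) {
	r := rand.New(rand.NewPCG(1, 2))
	var t uint64
	for n := b.N; n > 0; n-- {
		t += r.Uint64()
	}
	Sink = t
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	res := testing.Benchmark(BenchmarkUint64)
	fmt.Println(res.N > 0, res.String())
}
```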
Add various benchmarks that will be relevant in future CLs.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 220860f76f.amd64 │
│ sec/op │
SourceUint64-32 1.555n ± 1%
GlobalInt64-32 2.071n ± 1%
GlobalInt63Parallel-32 0.1023n ± 1%
GlobalUint64-32 5.193n ± 1%
GlobalUint64Parallel-32 0.2341n ± 0%
Int64-32 2.056n ± 2%
Uint64-32 2.077n ± 2%
GlobalIntN1000-32 4.077n ± 2%
IntN1000-32 3.476n ± 2%
Int64N1000-32 3.059n ± 1%
Int64N1e8-32 2.942n ± 1%
Int64N1e9-32 2.932n ± 1%
Int64N2e9-32 2.925n ± 1%
Int64N1e18-32 3.116n ± 1%
Int64N2e18-32 4.067n ± 1%
Int64N4e18-32 4.054n ± 1%
Int32N1000-32 2.951n ± 1%
Int32N1e8-32 3.102n ± 1%
Int32N1e9-32 3.535n ± 1%
Int32N2e9-32 3.514n ± 1%
Float32-32 2.760n ± 1%
Float64-32 2.284n ± 1%
ExpFloat64-32 3.757n ± 1%
NormFloat64-32 3.837n ± 1%
Perm3-32 35.23n ± 2%
Perm30-32 208.8n ± 1%
Perm30ViaShuffle-32 111.7n ± 1%
ShuffleOverhead-32 101.1n ± 1%
Concurrent-32 2.108n ± 7%
goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
│ 220860f76f.arm64 │
│ sec/op │
SourceUint64-8 2.316n ± 1%
GlobalInt64-8 2.183n ± 1%
GlobalInt63Parallel-8 0.4331n ± 0%
GlobalUint64-8 4.377n ± 2%
GlobalUint64Parallel-8 0.9237n ± 0%
Int64-8 2.538n ± 1%
Uint64-8 2.604n ± 1%
GlobalIntN1000-8 3.857n ± 2%
IntN1000-8 3.822n ± 2%
Int64N1000-8 3.318n ± 0%
Int64N1e8-8 3.349n ± 1%
Int64N1e9-8 3.317n ± 2%
Int64N2e9-8 3.317n ± 2%
Int64N1e18-8 3.542n ± 1%
Int64N2e18-8 5.087n ± 0%
Int64N4e18-8 5.084n ± 0%
Int32N1000-8 3.208n ± 2%
Int32N1e8-8 3.610n ± 1%
Int32N1e9-8 4.235n ± 0%
Int32N2e9-8 4.229n ± 1%
Float32-8 3.468n ± 0%
Float64-8 3.447n ± 0%
ExpFloat64-8 4.567n ± 0%
NormFloat64-8 4.821n ± 0%
Perm3-8 28.89n ± 0%
Perm30-8 175.7n ± 0%
Perm30ViaShuffle-8 153.5n ± 0%
ShuffleOverhead-8 119.8n ± 1%
Concurrent-8 2.433n ± 3%
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 220860f76f.386 │
│ sec/op │
SourceUint64-32 2.370n ± 1%
GlobalInt64-32 3.569n ± 1%
GlobalInt63Parallel-32 0.3221n ± 1%
GlobalUint64-32 8.797n ± 10%
GlobalUint64Parallel-32 0.6351n ± 0%
Int64-32 2.612n ± 2%
Uint64-32 3.350n ± 1%
GlobalIntN1000-32 5.892n ± 1%
IntN1000-32 4.546n ± 1%
Int64N1000-32 14.59n ± 1%
Int64N1e8-32 14.76n ± 2%
Int64N1e9-32 16.57n ± 1%
Int64N2e9-32 14.54n ± 1%
Int64N1e18-32 16.14n ± 1%
Int64N2e18-32 18.10n ± 1%
Int64N4e18-32 18.65n ± 1%
Int32N1000-32 3.560n ± 1%
Int32N1e8-32 3.770n ± 2%
Int32N1e9-32 4.098n ± 0%
Int32N2e9-32 4.179n ± 1%
Float32-32 21.18n ± 4%
Float64-32 20.60n ± 2%
ExpFloat64-32 13.07n ± 0%
NormFloat64-32 7.738n ± 2%
Perm3-32 36.73n ± 1%
Perm30-32 211.9n ± 1%
Perm30ViaShuffle-32 165.2n ± 1%
ShuffleOverhead-32 133.9n ± 1%
Concurrent-32 3.287n ± 2%
For #61716.
Change-Id: I2f0938eae4b7bf736a8cd899a99783e731bf2179
Reviewed-on: https://go-review.googlesource.com/c/go/+/502496
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Rob Pike <r@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
|
|
1cc5b34d28 |
math/rand/v2: remove Rand.Seed
Removing Rand.Seed lets us remove lockedSource as well, along with the ambiguity in globalRand about which source to use. For #61716. Change-Id: Ibe150520dd1e7dd87165eacaebe9f0c2daeaedfd Reviewed-on: https://go-review.googlesource.com/c/go/+/502498 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Rob Pike <r@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org> |
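Without Rand.Seed, a reproducible stream comes from constructing a generator over an explicitly seeded Source. The sketch below assumes the NewPCG constructor from the released math/rand/v2 package, which is not part of this commit; seededValues is a hypothetical helper.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// seededValues returns the first n values of a generator built from the
// given seed pair. The seed is fixed at construction; there is no reseeding.
func seededValues(seed1, seed2 uint64, n int) []uint64 {
	r := rand.New(rand.NewPCG(seed1, seed2))
	out := make([]uint64, n)
	for i := range out {
		out[i] = r.Uint64()
	}
	return out
}

func main() {
	a := seededValues(1, 2, 3)
	b := seededValues(1, 2, 3)
	fmt.Println(a[0] == b[0] && a[1] == b[1] && a[2] == b[2])
}
```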
|
|
|
48bd1fc93b |
math/rand/v2: clean up regression test
Add more test cases. Replace -printgolden with -update, which rewrites the files for us. For #61716. Change-Id: I7c4c900ee896042429135a21971a56ebe16b6a66 Reviewed-on: https://go-review.googlesource.com/c/go/+/516858 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
|
|
|
d6c1ef52ad |
math/rand/v2: remove Read
In math/rand, Read is deprecated. Remove in v2. People should use crypto/rand if they need long strings. For #61716. Change-Id: Ib254b7e1844616e96db60a3a7abb572b0dcb1583 Reviewed-on: https://go-review.googlesource.com/c/go/+/502497 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
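As a sketch of the replacement this message recommends, random bytes can be read from crypto/rand instead. The helper name randomBytes is hypothetical.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomBytes fills a new slice of length n from the operating system's
// cryptographically secure random source.
func randomBytes(n int) ([]byte, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return nil, err
	}
	return b, nil
}

func main() {
	b, err := randomBytes(16)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(hex.EncodeToString(b))
}
```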
|
|
|
d42750b17c |
math/rand/v2: rename various functions
Int31 -> Int32 Int31n -> Int32N Int63 -> Int64 Int63n -> Int64N Intn -> IntN The 31 and 63 are pedantic and confusing: the functions should be named for the type they return, same as all the others. The lower-case n is inconsistent with Go's usual CamelCase and especially problematic because we plan to add 'func N'. Capitalize the n. For #61716. Change-Id: Idb1a005a82f353677450d47fb612ade7a41fde69 Reviewed-on: https://go-review.googlesource.com/c/go/+/516857 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Robert Griesemer <gri@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
|
|
|
59f0ab4036 |
math/rand/v2: start of new API
This is the beginning of the math/rand/v2 package from proposal #61716. Start by copying old API. This CL copies math/rand/* to math/rand/v2 and updates references to math/rand to add v2 throughout. Later CLs will make the v2 changes. For #61716. Change-Id: I1624ccffae3dfa442d4ba2461942decbd076e11b Reviewed-on: https://go-review.googlesource.com/c/go/+/502495 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Rob Pike <r@golang.org> |
|
|
|
bf97e724b5 |
all: drop old +build lines
Running 'go fix' on the cmd+std packages handled much of this change. Also update code generators to use only the new go:build lines, not the old +build ones. For #41184. For #60268. Change-Id: If35532abe3012e7357b02c79d5992ff5ac37ca23 Cq-Include-Trybots: luci.golang.try:gotip-linux-386-longtest,gotip-linux-amd64-longtest,gotip-windows-amd64-longtest Reviewed-on: https://go-review.googlesource.com/c/go/+/536237 Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
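To illustrate how an old +build line relates to its go:build equivalent, the sketch below parses both forms with the standard go/build/constraint package and evaluates them against the same tag sets. The example constraint lines and the evalLine helper are hypothetical and are not taken from the change itself.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// evalLine parses one build constraint line, in either syntax, and
// evaluates it against the set of enabled tags.
func evalLine(line string, tags map[string]bool) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	return expr.Eval(func(tag string) bool { return tags[tag] }), nil
}

func main() {
	oldLine := "// +build linux,amd64 darwin"
	newLine := "//go:build (linux && amd64) || darwin"
	tagSets := []map[string]bool{
		{"linux": true, "amd64": true},
		{"linux": true},
		{"darwin": true},
	}
	for _, tags := range tagSets {
		a, errA := evalLine(oldLine, tags)
		b, errB := evalLine(newLine, tags)
		fmt.Println(tags, a, b, errA, errB)
	}
}
```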
|
|
|
d57303e65f |
math: add available godoc link
Change-Id: I4a6c2ef6fd21355952ab7d8eaad883646a95d364 Reviewed-on: https://go-review.googlesource.com/c/go/+/535087 Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Than McIntosh <thanm@google.com> |
|
|
|
da8f406f06 |
all: simplify bool conditions
Change-Id: Id2079f7012392dea8dfe2386bb9fb1ea3f487a4a Reviewed-on: https://go-review.googlesource.com/c/go/+/526015 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: qiulaidongfeng <2645477756@qq.com> |
|
|
|
0dfb22ed70 |
all: use ^TestName$ regular pattern for invoking a single test
Use ^ and $ in the -run flag regular expression value when the intention is to invoke a single named test. This removes the reliance on there not being another similarly named test to achieve the intended result. In particular, package syscall has tests named TestUnshareMountNameSpace and TestUnshareMountNameSpaceChroot that both trigger themselves setting GO_WANT_HELPER_PROCESS=1 to run alternate code in a helper process. As a consequence of overlap in their test names, the former was inadvertently triggering one too many helpers. Spotted while reviewing CL 525196. Apply the same change in other places to make it easier for code readers to see that said tests aren't running extraneous tests. The unlikely cases of -run=TestSomething intentionally being used to run all tests that have the TestSomething substring in the name can be better written as -run=^.*TestSomething.*$ or with a comment so it is clear it wasn't an oversight. Change-Id: Iba208aba3998acdbf8c6708e5d23ab88938bfc1e Reviewed-on: https://go-review.googlesource.com/c/go/+/524948 Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Kirill Kolyshkin <kolyshkin@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
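The overlap described above can be reproduced with the regexp package, since an unanchored pattern matches any name that contains it. This is a sketch of the matching behaviour only, not of the go command's implementation; the matching helper is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// matching returns the names that the pattern matches anywhere in the string.
func matching(pattern string, names []string) []string {
	re := regexp.MustCompile(pattern)
	var out []string
	for _, name := range names {
		if re.MatchString(name) {
			out = append(out, name)
		}
	}
	return out
}

func main() {
	names := []string{
		"TestUnshareMountNameSpace",
		"TestUnshareMountNameSpaceChroot",
	}
	// Unanchored: both names contain the pattern.
	fmt.Println(matching("TestUnshareMountNameSpace", names))
	// Anchored: only the exact name matches.
	fmt.Println(matching("^TestUnshareMountNameSpace$", names))
}
```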
|
|
|
aaa384cf3a |
math/big, math/rand: use the built-in max function
Change-Id: I71a38dd20bfaf2b1aed18892d54eeb017d3d7d66
GitHub-Last-Rev:
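For context, the built-in min and max functions (available since Go 1.21) replace small hand-written helpers. The sketch below is illustrative only and is not code from this change; clamp is a hypothetical function.

```go
package main

import "fmt"

// clamp limits v to the range [lo, hi] using the built-in min and max,
// with no package-local helper functions.
func clamp(v, lo, hi int) int {
	return min(max(v, lo), hi)
}

func main() {
	fmt.Println(max(3, 7), min(3, 7), clamp(15, 0, 10))
}
```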
|
|
|
|
1d3a77e5e6 |
math/big: using the min built-in function
Change-Id: I9e95806116a8547ec782f66226d1b1382c6156de
GitHub-Last-Rev:
|
|
|
|
a37da52d75 |
math: enable huge argument tests on s390x
The new s390x assembly implementations of Sin/Cos/SinCos/Tan handle the huge argument tests. Updates #29240 Change-Id: I9f22d9714528ef2af52c749079f3727250089baf Reviewed-on: https://go-review.googlesource.com/c/go/+/509675 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |